What is data masking?

Data masking replaces sensitive data in databases or data stores with realistic and functional disguised values to strengthen data security practices.

Data masking often includes automated character or numeric manipulation techniques. Organizations rely on data masking because it veils or anonymizes sensitive, high-quality production data, including personally identifiable information (PII) and protected health information (PHI), for development and testing, analytics, customer service, and training purposes.

Data masking frees valuable staff resources from recreating data outside production environments to meet compliance requirements. It also improves cyber resilience by safeguarding valuable company and customer information from disasters such as ransomware attacks.

How does data masking work, and why is it needed?

Data masking works by manually or automatically replacing data values within the same database or data store with different numbers, letters, or characters, or by rearranging them. The practice has received more attention since the European Union General Data Protection Regulation (GDPR) went into effect. GDPR names pseudonymization as a safeguard: personal data is processed so that it can no longer be attributed to a specific person without additional information that is kept separately.

Data masking is becoming increasingly necessary because today’s businesses and government agencies depend on digital operations. These are among the reasons organizations adopt data masking.

  • Business acceleration — Data masking helps organizations storing sensitive information about employees, partners, customers, and transactions to continue using that data without interruption, fear of data theft and loss, or concerns about industry or government compliance. Masked data is often used in software development and testing but also for business intelligence, training, and analysis.
  • Business continuity — More recently, data masking has become a useful tool for helping organizations protect sensitive data from cyberattacks such as double-extortion ransomware schemes, because even if cybercriminals steal masked data, it is of little use to them.
  • Data protection — Data masking helps teams keep external bad actors (e.g., cybercriminals, contractors, and third parties) and internal bad actors (e.g., insider threats) from gaining access to valuable information, while letting more people work with company data to perform tasks that can help boost revenue.
  • Compliance — Organizations do not want to pay fines associated with non-compliance with privacy regulations such as GDPR and the California Consumer Privacy Act (CCPA).

What data requires masking?

Sensitive data that organizations keep in compliance with industry and government-defined privacy regulations can be subject to data masking requirements. Commonly, data masking is part of conversations about these types of data:

  • Personally identifiable information (PII) – This is data that, alone or with other details, can be used to establish a person’s identity—for example, a name, an address, a license number, a telephone number, a passport number, and more.
  • Protected health information (PHI) — This is any data in a medical record that was created or shared during a healthcare encounter and can be used to identify an individual, for example, a name, an address, a medical record number, dates of service, diagnoses, device numbers, and more.
  • Financial data — This is information that is part of a payment transaction, such as cardholder data, and is subject to Payment Card Industry Data Security Standard (PCI DSS) rules.
  • Intellectual property — This is information that is considered proprietary to an organization.

What are the different types of data masking?

These are some of the most widely used types of data masking and why organizations typically use them to safeguard their sensitive information:

  • Static data masking — An organization creates a new, sanitized copy of a database or data store. It makes a backup of production, moves it to another environment (or behind a virtual air gap), masks the data automatically or manually while it is offline, and then reloads it for dev/test, analysis, or training use.
  • Dynamic data masking — As the name implies, an organization masks data in real time as it moves from a production data store to another location, often a response queue that is part of an application such as a customer service tool. The data stored in production is not changed.
  • On-the-fly data masking — This approach also works in real time. Data is masked as it moves directly between the production server and another server, typically one in the engineering organization.
  • Deterministic data masking — Organizations that use this approach replace a given value (e.g., a certain name) with the same substitute value (e.g., 1) or characters (e.g., xyz) every time it appears in a data source outside of production. This keeps masked data consistent and speeds its use, often for dev/test.
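
To make two of these types more concrete, here is a minimal Python sketch of deterministic and dynamic masking. It is an illustration under stated assumptions, not any vendor's implementation: the key, the function names, the "admin" and "support" roles, and the record layout are all hypothetical. Deterministic masking is approximated with a keyed hash (HMAC-SHA256), one common way to map the same input to the same substitute every time.

```python
import hashlib
import hmac

# Hypothetical secret; a real deployment would keep this in a secrets manager.
MASKING_KEY = b"example-masking-key"


def mask_deterministic(value: str, prefix: str = "user") -> str:
    """Return the same pseudonym for the same input on every call."""
    digest = hmac.new(MASKING_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:10]}"


def mask_dynamic(record: dict, role: str) -> dict:
    """Build a masked view at read time; the stored record is not modified."""
    view = dict(record)
    if role == "admin":  # hypothetical privileged role sees the real values
        return view
    view["ssn"] = "***-**-" + record["ssn"][-4:]
    view["name"] = mask_deterministic(record["name"])
    return view


record = {"name": "Alice Example", "ssn": "123-45-6789"}
agent_view = mask_dynamic(record, role="support")

print(agent_view["ssn"])  # ***-**-6789
print(record["ssn"])      # 123-45-6789 (the source record is unchanged)
```

Because the pseudonym depends only on the input and the key, the same name masks to the same value in every copy of the data, as long as the same key is used.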

What are data masking methods and best practices?

The most common data masking methods are scrambling, which is randomly rearranging letters or numbers; nulling out, or simply removing the data from view; substitution, which is replacing values or real words with realistic alternatives; and shuffling, which is moving values around within database columns and rows or data sets. Variance, which shifts a number or date by a random amount, and date aging, which consistently shifts a date range, are recurring data masking techniques for transactional data.
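
The methods named above can each be sketched in a few lines of Python. This is one possible interpretation of each method using only the standard library. The function names, the fixed seed, the 10 percent variance band, and the 90-day offset are assumptions chosen for illustration.

```python
import random
from datetime import date, timedelta

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def scramble(value: str) -> str:
    """Scrambling: randomly rearrange the characters of a value."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


def null_out(value):
    """Nulling out: remove the value from view."""
    return None


def substitute(value: str, stand_ins: list) -> str:
    """Substitution: replace the real value with a realistic stand-in."""
    return rng.choice(stand_ins)


def shuffle_column(values: list) -> list:
    """Shuffling: move existing values between rows of the same column."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def vary_amount(amount: float, pct: float = 0.10) -> float:
    """Variance: shift a number by a random amount within +/- pct."""
    return round(amount * (1 + rng.uniform(-pct, pct)), 2)


def age_date(d: date, days: int = 90) -> date:
    """Date aging: move every date back by the same fixed offset."""
    return d - timedelta(days=days)


print(scramble("4111111111111111"))
print(substitute("Alice Example", ["Pat Sample", "Sam Placeholder"]))
print(shuffle_column([52000, 61000, 75000]))
print(vary_amount(100.0))
print(age_date(date(2024, 3, 31)))  # 2024-01-01
```

Scrambling and shuffling keep the original characters or values and change only their position, so they suit fields where the overall distribution matters more than any individual row.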

Organizations applying any of these methods will want to begin by:

  • Assessing their data — Before teams mask any data, they should understand their sensitive data, who is authorized to access it, what applications and individuals need access to it, and where it exists.
  • Synchronizing their teams — A single team is rarely responsible for all data masking tasks, making it imperative that departments coordinate to maintain referential integrity. Referential integrity means that when a value in one table of a relational database references a value in another, the referenced value actually exists, so a masked key must be masked the same way everywhere it appears.
  • Securing their processes — Ransomware is rampant, and insider threats are on the rise, putting the onus on organizations to strengthen security everywhere, including who and what systems are responsible for protecting their data and their data masking protocols.
  • Testing their approaches — Teams responsible for data masking should conduct quality assurance to confirm that masking delivers the security levels they anticipate and that the performance of systems and applications will meet expectations post-launch.
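
The referential integrity and testing steps above can be checked mechanically. The following Python sketch masks a shared key the same way in two tables, then runs two simple quality checks: no order points at a missing customer, and no original value survives in the masked copy. The table layouts, the key, and the helper names are hypothetical, and the keyed hash is just one way to get consistent masking.

```python
import hashlib
import hmac

KEY = b"example-masking-key"  # hypothetical secret shared by all masking jobs


def mask_id(customer_id: str) -> str:
    """Mask a key consistently so references between tables still match."""
    digest = hmac.new(KEY, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "cust_" + digest[:12]


customers = [
    {"customer_id": "C100", "name": "Alice Example"},
    {"customer_id": "C200", "name": "Bob Example"},
]
orders = [
    {"order_id": 1, "customer_id": "C100"},
    {"order_id": 2, "customer_id": "C200"},
    {"order_id": 3, "customer_id": "C100"},
]

masked_customers = [
    {"customer_id": mask_id(c["customer_id"]), "name": "REDACTED"} for c in customers
]
masked_orders = [{**o, "customer_id": mask_id(o["customer_id"])} for o in orders]


def orphaned_orders(customer_rows: list, order_rows: list) -> list:
    """Return orders whose customer_id has no matching customer row."""
    known = {c["customer_id"] for c in customer_rows}
    return [o for o in order_rows if o["customer_id"] not in known]


def leaked_values(original_rows: list, masked_rows: list, field: str) -> list:
    """Return masked values that still equal some original value."""
    originals = {r[field] for r in original_rows}
    return [r[field] for r in masked_rows if r[field] in originals]


print(orphaned_orders(masked_customers, masked_orders))              # []
print(leaked_values(customers, masked_customers, "customer_id"))     # []
```

If two teams masked the customer and order tables with different keys or methods, the first check would report every order as orphaned, which is the coordination problem the list describes.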

What is the difference between data masking and encryption?

Ease of use is a big difference between data masking and data encryption. Encryption converts data into text that humans and machines cannot read, and organizations typically apply it while the data is at rest. Restoring the data to usable form takes the corresponding decryption algorithm and key. In contrast, data masking can be applied to data at rest or in motion. Masked data can also be immediately accessed and used to power applications and perform regular business activities such as answering customer questions, testing or developing code, and conducting analysis on data sets.

Cohesity and data masking

Exponentially growing data, rising ransomware threats, and stricter compliance requirements challenge organizations to improve how they protect sensitive data while making it more available for business use. Cohesity Data Cloud boosts cyber resilience and operational efficiency by strengthening security and automating previously manual processes.

It’s purpose-built to simplify, scale, and strengthen data security and management. The Cohesity platform is also powered by artificial intelligence and machine learning (AI/ML) insights and Zero Trust security principles. These principles help organizations use third-party applications such as data masking to deliver multilayered defense.

The extensible Cohesity platform plus data masking vendors such as DataMasque work together to anonymize data with intelligent masking capabilities and clear it of sensitive information, including PII, before sharing it with other teams—all of which makes compliance easier. With the seamless-to-use solution, organizations improve data protection while building customer, employee, and partner trust.
