1- What is data masking?

Data masking, also known as de-identification, is a simple form of data anonymization that alters values to hide personally identifiable data while preserving the original format. It is the minimum that firms and corporations are expected to do to comply with privacy laws. With data masking, you create a fake but realistic version of the data. Techniques include data encryption, data scrambling, nulling out, value variance, data substitution, data shuffling and pseudonymization.
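As a rough illustration of some of the techniques listed above, here is a minimal Python sketch. The function names and sample values are hypothetical and are not taken from any particular masking product; a production tool would add format validation and a secure source of randomness.

```python
import random


def null_out(value):
    """Nulling out: discard the value entirely."""
    return None


def substitute(value, replacements, rng):
    """Data substitution: swap the value for a realistic fake one."""
    return rng.choice(replacements)


def scramble_digits(value, rng):
    """Data scrambling: replace every digit with a random digit.

    Non-digit characters (dashes, spaces) are kept, so the format survives.
    """
    return "".join(
        str(rng.randrange(10)) if ch.isdigit() else ch for ch in value
    )


def shuffle_column(values, rng):
    """Data shuffling: reorder a column so values no longer match their rows."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def vary(value, max_delta, rng):
    """Value variance: shift a number by a random amount within +/- max_delta."""
    return value + rng.randint(-max_delta, max_delta)


if __name__ == "__main__":
    rng = random.Random()
    print(scramble_digits("555-123-4567", rng))   # same layout, different digits
    print(substitute("Ana Ruiz", ["Eva Soler", "Luis Vidal"], rng))
    print(shuffle_column([31, 45, 27], rng))
    print(vary(40, 5, rng))
```

Each function hides the original value while keeping the result plausible for its field, which is what distinguishes masking from simply deleting the data.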

2- What is anonymization?

Anonymization is the removal or alteration of information so that no sensitive information can be attributed to a person. The anonymization procedure is not limited to removing the direct identifiers that might exist in a dataset (de-identification). A more aggressive approach also removes secondary information, such as family relations or a job title, that could be used to link certain content to a person.

3- Why anonymize data?

Data is being collected, stored and transferred at ever-increasing rates. It’s also being hacked, stolen, illegally shared and sold without permission. Fines are being levied, consumer confidence is being eroded and the flow of information between organizations is being constrained. By altering personally identifiable information, anonymized assets can be collected, processed and transmitted more safely across an organization or between an organization and third parties.

4- What are the different types of anonymization?

Anonymization with gaps automatically detects entities and replaces the personal data with gaps. Anonymization with placeholders replaces the data with symbols. In pseudonymization, the data is replaced with substitute values (fake identifiers or pseudonyms).
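The three replacement styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entity spans are supplied by hand rather than detected automatically, a gap is rendered as a run of underscores, a placeholder as a bracketed label, and the function name, labels and pseudonym table are all hypothetical.

```python
def anonymize(text, entities, mode, pseudonyms=None):
    """Replace each detected span in `text` according to `mode`.

    `entities` is a list of (start, end, label) tuples, assumed to come
    from some entity detector and assumed not to overlap.
    """
    out = []
    cursor = 0
    for start, end, label in sorted(entities):
        original = text[start:end]
        out.append(text[cursor:start])
        if mode == "gaps":
            out.append("_" * len(original))      # blank out the span
        elif mode == "placeholders":
            out.append(f"[{label}]")             # symbolic stand-in
        elif mode == "pseudonyms":
            out.append(pseudonyms[original])     # fake but readable value
        else:
            raise ValueError(f"unknown mode: {mode}")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


if __name__ == "__main__":
    text = "Ana Ruiz lives in Valencia."
    entities = [(0, 8, "PERSON"), (18, 26, "LOCATION")]
    fake = {"Ana Ruiz": "Eva Soler", "Valencia": "Bilbao"}
    for mode in ("gaps", "placeholders", "pseudonyms"):
        print(anonymize(text, entities, mode, fake))
```

Gaps and placeholders make it obvious that something was removed, while pseudonyms keep the text readable, which matters when the anonymized document is to be reused or translated.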

5- What is the difference between anonymization and pseudonymization?

Anonymous data must be de-identified in such a way that a specific individual can’t be identified. In pseudonymization, personal information such as names and addresses is removed, but other names, professions, locations and dates remain readable. This is ideal for legal documents.

6- What are the standard techniques for compliance with privacy regulations like GDPR and CCPA?

There is no standard solution for this new market. Our data masker helps you satisfy personal privacy legislation by removing personally identifiable entities, so that the anonymized documents can be used internally or shared with third parties.

Pangea Masker

1- What type of personal data can be detected and anonymized?

Categories of data like names, company names, addresses, telephone numbers and identification documents, to name a few. Secondary data that is also considered private, such as locations, jobs, functions and dates, can be identified as well.

2- What formats are supported?

Word processing documents, PDFs, presentations, emails, spreadsheets, social media and business applications like MS Office documents. Pangea Masker has a powerful API for product integration.

3- What languages are supported?

Pangea Masker is natively multilingual across named entities and available in 25 languages (coming in May 2021). Machine translation can be integrated into the same platform. It’s currently being used as the anonymization platform in the MAPA project with public administrators in Europe.

4- Is the Pangea Masker anonymization process reversible?

The anonymization process is normally one-way: once content is anonymized, there is usually no way to recover it. However, the Pangea Masker can be configured to generate an ‘index file’ that contains the information necessary to reconstruct the original data from the anonymized text, so the process can be reversed if desired.
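To make the idea of an index file concrete, here is a minimal Python sketch. It does not describe the actual Pangea Masker file format, which is not documented here: the token scheme, the JSON layout and the function names are all assumptions made for illustration.

```python
import json


def anonymize_with_index(text, entities):
    """Replace each (start, end, label) span with a numbered token.

    Returns the anonymized text plus an index mapping token -> original.
    Spans are assumed to be non-overlapping.
    """
    index = {}
    out = []
    cursor = 0
    for n, (start, end, label) in enumerate(sorted(entities), start=1):
        token = f"[{label}_{n}]"
        index[token] = text[start:end]
        out.append(text[cursor:start])
        out.append(token)
        cursor = end
    out.append(text[cursor:])
    return "".join(out), index


def restore(anonymized, index):
    """Rebuild the original text by putting every indexed value back."""
    for token, original in index.items():
        anonymized = anonymized.replace(token, original)
    return anonymized


if __name__ == "__main__":
    text = "Ana Ruiz lives in Valencia."
    entities = [(0, 8, "PERSON"), (18, 26, "LOCATION")]
    masked, index = anonymize_with_index(text, entities)
    index_file = json.dumps(index)            # what would be stored separately
    print(masked)
    print(restore(masked, json.loads(index_file)))
```

Whoever holds the index can undo the anonymization, so in practice it has to be stored and protected separately from the anonymized text.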

5- What does the sensitivity selector do in the Masker?

The sensitivity selector lets the user choose how aggressively the AI engine anonymizes a specific document. On one setting, the anonymization procedure removes only direct identifiers. On a more aggressive setting, secondary information like family relations or a job description is anonymized as well.
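Conceptually, a sensitivity setting decides which categories of detected entities get anonymized. The Python sketch below illustrates that idea only; the level names, entity labels and grouping into direct and secondary identifiers are hypothetical and do not reflect the Masker's real configuration.

```python
# Hypothetical grouping of entity labels by how directly they identify a person.
DIRECT = {"PERSON", "ADDRESS", "PHONE", "ID_DOCUMENT"}
SECONDARY = {"LOCATION", "JOB", "DATE", "FAMILY_RELATION"}


def labels_for(sensitivity):
    """Return the set of entity labels to anonymize at a given level."""
    if sensitivity == "low":
        return set(DIRECT)
    if sensitivity == "high":
        return DIRECT | SECONDARY
    raise ValueError(f"unknown sensitivity: {sensitivity}")


def select_entities(entities, sensitivity):
    """Keep only the (start, end, label) spans that this level anonymizes."""
    allowed = labels_for(sensitivity)
    return [entity for entity in entities if entity[2] in allowed]


if __name__ == "__main__":
    detected = [(0, 8, "PERSON"), (18, 26, "LOCATION"), (30, 36, "JOB")]
    print(select_entities(detected, "low"))
    print(select_entities(detected, "high"))
```

A lower setting leaves more context readable, while a higher one reduces the chance that someone can be re-identified from indirect details.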

6- Is the system trainable?

Parameters for API and on-premise offerings can be fine-tuned to your data set.  Industries, domains and companies often have data that is unique to their use case.

7- What backends does Pangea Masker integrate with?

The system can be configured as an API and therefore integrates with many back-ends including MySQL, SQL Server, Oracle, MongoDB and most databases. Custom systems can be integrated in on-premises or private SaaS deployments.

8- Is the Pangea Masker tied to a specific software system?

The Pangea Masker is portable and can be accessed in a secure cloud or deployed on-premise for additional privacy. Using a standard RESTful API, text assets (files, documents, sequences or segments) can be sent to the Masker, which returns them in the same format with the private data replaced. For subscriptions, Masker is deployed within Pangeanic’s ECO platform, providing full integration with Neural Machine Translation capabilities for automatically translating content into target languages.

9- What is the API?

The Masker API is a professional feature that integrates the Masker into the customer’s workflow (a DMS, a CMS,…). The Pangeanic Masker API offers the full functionality available through the Masker SaaS web app. It anonymizes plain text or files using the full range of parameters that can be chosen in the web app.


1- What are different regulations that govern data privacy?

Different countries and states have different regulations that govern how personally identifiable data can be collected, shared and stored. Some include CCPA, GDPR, HIPAA, APPI and LGPD.

2- What are the CCPA and CPRA regulations?

July 2020 marked the official start of legal enforcement of the CCPA, the California Consumer Privacy Act. It has national and even international implications, since it became the first major privacy law to give consumers in the United States control over their personal information, closely aligning with the GDPR (General Data Protection Regulation) in Europe.


CCPA regulates the personal information that businesses collect, how that data should be used, and how consumers can opt out of the data being sold. If everything proceeds as planned, the attorney general will have the authority to act on violations relating to employment data dating back to Jan. 1, 2020, considered the “look-back” period. Shortly after enforcement began, Californians approved a follow-up act, the CPRA.


CPRA isn’t a different law, but is an expansion of the current law, which strengthens protections for consumers and clarifies some of the more unclear compliance questions for organizations.

It also creates a new government agency dedicated to handling enforcement and compliance with the new privacy regulations. It is likely to further influence and strengthen models being formalized by federal and local laws around the country.

3- What is GDPR, the General Data Protection Regulation?

Europe’s data privacy and security law is often considered the most thorough privacy and security law in the world. It covers data collected in the European Union (EU), but imposes obligations on organizations anywhere. Since the regulation took effect in 2018, harsh fines have been levied under the GDPR against those who violate its privacy and security standards.

Find the guide to GDPR privacy compliance here. A general GDPR overview is here.

4- What is HIPAA - Health Insurance Portability and Accountability Act?

This US law provides privacy standards to protect patients’ medical records and other health information provided to health plans, doctors, hospitals and other health care providers.

Developed by the Department of Health and Human Services, these standards provide patients with access to their medical records and more control over how their personal health information is used and disclosed. HIPAA represents a uniform, federal floor of privacy protections for consumers across the country. State laws providing additional protections to consumers are not affected by this rule. The HIPAA privacy standards took effect on April 14, 2003. Find government published details here.

5- What is Japan’s APPI - Act on the Protection of Personal Information?

In 2017, Japan’s reformed privacy law took effect, replacing the former Act on Protection of Personal Information. The new law, “APPI Amendment 2017,” outlines basic data protection policies. See the current provisions of the APPI here.


Any business in Japan that holds personal data is required to abide by the APPI Amendment, with some minor exclusions. It includes provisions on third-party transfers, record-keeping, anonymity and breaches, and protects the rights of individuals in regard to their personal data.


The reformed law has helped to get Japan on the EU’s “white list” of countries with adequate data protection legislation.

6- What is LGPD - General Data Protection Law?

Brazil’s data protection legislation is a patchwork of several individual laws, codes and frameworks. These include Article 5 of Brazil’s Federal Constitution 1988, which contains general provisions relating to a person’s right to privacy, and the Consumer Protection Code 1990, which contains legislation regarding the collection, storage, processing and use of personal data. The Brazilian Internet Act 2014 also regulates the protection of privacy and personal data online.

In August 2018, the Brazilian President, Michel Temer, signed the General Data Protection Law (LGPD). Following in the EU’s steps, Brazil’s LGPD has many similarities to the GDPR. See the final LGPD law here.