Data Anonymizer Plugin

This plugin is a processor for visual data preparation that anonymizes data, making it suitable for export to people or organizations that may not have the right to view the original data.

Anonymization works on a single column: it replaces the content of each cell with an entry from an anonymization dictionary. The dictionary is either provided by the user or chosen from the built-in dictionaries.
This plugin can operate on regular content, on emails (preserving the domain), or on arrays (anonymizing each element). Anonymization can be done with or without possible collisions, that is, with or without the possibility that two different original values receive the same replacement.
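
To make the modes above concrete, here is a minimal Python sketch of dictionary-based anonymization. It is not the plugin's code (the source is not released): the function names, the hash-based choice of dictionary entry, the first-come assignment used for the collision-free mode, and the JSON encoding of array cells are all assumptions made for illustration.

```python
import hashlib
import json


def make_anonymizer(dictionary, allow_collisions=True):
    """Build a function that maps original values to dictionary entries.

    Hypothetical helper: illustrates the idea, not the plugin's actual logic.
    """
    assigned = {}  # original value -> replacement (collision-free mode only)

    def anonymize(value):
        if allow_collisions:
            # Hash the value to pick an entry; two distinct values may
            # land on the same entry, which is what a collision is.
            digest = hashlib.sha256(value.encode("utf-8")).digest()
            index = int.from_bytes(digest[:8], "big") % len(dictionary)
            return dictionary[index]
        if value not in assigned:
            # Hand out unused entries in order of first appearance.
            if len(assigned) >= len(dictionary):
                raise ValueError("dictionary too small for collision-free mode")
            assigned[value] = dictionary[len(assigned)]
        return assigned[value]

    return anonymize


def anonymize_email(email, anonymize):
    """Replace the local part of an email and keep the domain."""
    local, sep, domain = email.rpartition("@")
    if not sep:
        # Not an email address: treat it as regular content.
        return anonymize(email)
    return anonymize(local) + "@" + domain


def anonymize_array(cell, anonymize):
    """Anonymize each element of an array cell (assumed to be JSON here)."""
    return json.dumps([anonymize(str(item)) for item in json.loads(cell)])
```

With a three-entry dictionary `["Ann", "Bob", "Cid"]` and `allow_collisions=False`, the first distinct value seen is mapped to `"Ann"`, the second to `"Bob"`, and a repeated value always gets the replacement it was first given. The collision-free mode needs a dictionary at least as large as the number of distinct values in the column, which is why the sketch raises an error when the entries run out.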

Beware: designing an anonymization strategy that is resilient to determined uncloaking efforts is hard and cannot be achieved with a single processor (see the history of the AOL web logs for reference).

Plugin information

Author: Dataiku (Clément Stenac)
Last updated: 2015/11/13
License: Apache Software License
Source code: Not yet released
Reporting issues: GitHub

How to use

Open your dataset in an analysis or a prepare recipe and search for the "Anonymize data" processor.