This plugin is a processor for visual data preparation that anonymizes data, making it suitable for export to people or organizations that may not have the right to view the original data.
Anonymization works on a single column: the content of each cell is replaced with an entry from an anonymization dictionary. The dictionary is either provided by the user or chosen from the built-in dictionaries.
This plugin can operate on regular content, on emails (preserving the domain), or on arrays (anonymizing each element). Anonymization can be done with or without possible collisions.
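The behaviour described above can be illustrated with a minimal Python sketch. This is not the plugin's implementation: the `Anonymizer` class, its method names, and the five-word `DICTIONARY` are hypothetical. It also assumes that "collision" means two distinct original values receiving the same replacement. With collisions allowed, each value is hashed into the dictionary; without collisions, a mapping table is kept so that each distinct value gets its own replacement.

```python
import hashlib

# Hypothetical stand-in for an anonymization dictionary (not one of the plugin's built-ins).
DICTIONARY = ["apple", "birch", "cedar", "delta", "ember"]


class Anonymizer:
    """Illustrative dictionary-based anonymizer (assumed design, not the plugin's code)."""

    def __init__(self, dictionary, allow_collisions=True):
        self.dictionary = list(dictionary)
        self.allow_collisions = allow_collisions
        self.mapping = {}  # original value -> replacement, used only without collisions

    def _with_collisions(self, value):
        # Stateless: hash the value into the dictionary.
        # Two different values may end up with the same replacement.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.dictionary)
        return self.dictionary[index]

    def _without_collisions(self, value):
        # Stateful: hand out dictionary entries in order, adding a numeric
        # suffix once the dictionary has been used up, so replacements stay distinct.
        if value not in self.mapping:
            count = len(self.mapping)
            word = self.dictionary[count % len(self.dictionary)]
            cycle = count // len(self.dictionary)
            self.mapping[value] = word if cycle == 0 else f"{word}{cycle}"
        return self.mapping[value]

    def anonymize(self, value):
        """Anonymize regular content."""
        if self.allow_collisions:
            return self._with_collisions(value)
        return self._without_collisions(value)

    def anonymize_email(self, email):
        """Anonymize the local part of an email and keep the domain."""
        local, separator, domain = email.rpartition("@")
        if not separator:  # not an email: fall back to regular content
            return self.anonymize(email)
        return self.anonymize(local) + "@" + domain

    def anonymize_array(self, values):
        """Anonymize each element of an array."""
        return [self.anonymize(v) for v in values]


anonymizer = Anonymizer(DICTIONARY, allow_collisions=False)
print(anonymizer.anonymize("alice"))                     # apple
print(anonymizer.anonymize_email("bob@example.com"))     # birch@example.com
print(anonymizer.anonymize_array(["alice", "carol"]))    # ['apple', 'cedar']
```

In this sketch the collision-free mode has to remember every value it has seen, while the hashed mode needs no state but can map several originals to the same dictionary entry. Neither mode on its own amounts to a robust anonymization strategy.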
Beware: designing a robust anonymization strategy that resists determined uncloaking efforts is hard and cannot be achieved with a single processor alone (see the history of the AOL web logs for reference).
|Author|Dataiku (Clément Stenac)|
|License|Apache Software License|
|Source code|Not yet released|
How To Use
Open your dataset in an analysis or a prepare recipe and search for the Anonymize data processor.