Dataset Audit

This plugin provides a recipe that takes a SQL-based or HDFS-based dataset as input, and outputs an audit of the data in the input dataset.

The output is a dataset with one line per column in the input dataset. For each column, the recipe outputs:

The recipe uses in-database or in-Hadoop processing, as appropriate for the input dataset.
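To make the idea concrete, here is a minimal sketch of a per-column audit whose computation is pushed down to the SQL engine rather than done by pulling rows out first. It is not the plugin's code: the function name `audit_table`, the `customers` table, and the three metrics (row count, missing values, distinct values) are all assumptions chosen for illustration, and SQLite stands in for whatever database backs the input dataset.

```python
import sqlite3


def audit_table(conn, table):
    """Return one dict per column of `table` (hypothetical helper, not the plugin API)."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    audit = []
    for col in columns:
        # The aggregates run inside the database; only three numbers come back.
        total, non_null, distinct = conn.execute(
            f'SELECT COUNT(*), COUNT("{col}"), COUNT(DISTINCT "{col}") FROM "{table}"'
        ).fetchone()
        audit.append({
            "column": col,
            "rows": total,
            "missing": total - non_null,  # illustrative metric, assumed
            "distinct": distinct,         # illustrative metric, assumed
        })
    return audit


# Tiny in-memory example table (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "FR"), (2, None), (3, "FR")],
)
result = audit_table(conn, "customers")
for line in result:
    print(line)
```

The result has one entry per input column, which mirrors the one-line-per-column shape of the output dataset described above. The table and column names are interpolated directly into the SQL for brevity, so this sketch should only be run against trusted identifiers.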

Plugin Information

More information about the plugin is available in the GitHub repository.
