The Dataiku DSS Data Directory

July 03, 2015

You will often see references to the DSS Data Directory (or Data Dir for short).

For instance, when you contact our support team, we will ask for the full logs available in your current Data Dir.

What is the Data Dir?

The Data Dir is where DSS stores all its configuration and data files. Notably, you will find:

  • binaries and programs to run DSS
  • settings and definitions of your datasets
  • configuration of your analyses, recipes, and notebooks
  • the actual data of your Machine Learning models
  • some datasets' data files
  • logs...

How do I find my Data Dir?

In practice, this is the directory that:

  • you set during the installation of DSS on your server (the -d option):
    ./ -p PORT -d path/to/data_dir
  • has been set by default for Community Edition on Mac OS X:

If you use an Enterprise Edition of DSS and did not install it yourself, you can find the actual path to the Data Dir by following these steps:

  1. Create a new IPython Notebook
  2. Select a cell, type the following, and run it:
    !echo $DIP_HOME
  3. The path to the Data Dir will be displayed in the output
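
The same lookup can also be done in plain Python rather than through a shell escape. This is a minimal sketch that assumes, as in the steps above, that the DIP_HOME environment variable is set in the notebook's process; find_data_dir is a hypothetical helper name used for illustration, not part of DSS.

```python
import os


def find_data_dir(env=os.environ):
    """Return the Data Dir path stored in DIP_HOME, or None if it is not set.

    `env` defaults to the process environment; passing a plain dict
    makes the helper easy to try out without touching real variables.
    """
    return env.get("DIP_HOME")


# Prints the Data Dir path, or None when DIP_HOME is not defined.
print(find_data_dir())
```

Returning None instead of raising an error makes it easy to tell when the code runs outside of a DSS environment.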

Structure of the Data Dir

Assuming you have access to the Data Dir ("DATA_DIR" below), here is the list of directories you will find. The most useful ones are annotated with a short description:

      |__ analysis_data/
      |__ bin/                   Programs and binaries to manage DSS
      |__ caches/
      |__ config/                Global and per-project configuration files
      |__ exports/               Temporary location for exported files
      |__ jobs/                  Project- and job-specific log files
      |__ html-apps/
      |__ lib/                   JDBC drivers and custom Python files
      |__ managed_datasets/
      |__ nginx/
      |__ pyenv/
      |__ run/                   Main directory for DSS log files
      |__ saved_models/
      |__ tmp/
      |__ uploads/               Manually uploaded files, per project
      |__ user_data/

Your log files are thus in: DATA_DIR/run/*.log
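
To gather these files, for example before sending them to support, a short Python sketch can list everything matching DATA_DIR/run/*.log. It assumes you can read the Data Dir from where the code runs and that DIP_HOME points to it; list_log_files is a hypothetical helper name, not a DSS function.

```python
import glob
import os


def list_log_files(data_dir):
    """Return the sorted paths of all *.log files in <data_dir>/run."""
    pattern = os.path.join(data_dir, "run", "*.log")
    return sorted(glob.glob(pattern))


# Assumption: DIP_HOME holds the path to the Data Dir.
data_dir = os.environ.get("DIP_HOME")
if data_dir:
    for path in list_log_files(data_dir):
        print(path)
else:
    print("DIP_HOME is not set; pass the Data Dir path explicitly.")
```

The helper only looks at the top level of run/, which mirrors the DATA_DIR/run/*.log pattern above.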

If you need more info on DSS logs, please make sure to read this note.

Happy DSSing!