Dataiku Concepts Overview

Dataiku Data Science Studio Concepts

First time using Dataiku DSS?

From the Lab to the Flow

Drafting your work in the Lab

When beginning a data project, you should start by drafting your work in the Lab environment:

  • the Visual Analysis tool lets you draft data preparation steps, create charts, and build machine learning models
  • the code notebooks let you explore your data interactively.

Once your Lab work is complete, deploy it in the project Flow. Learn more about going from Lab to Flow.

Sampled vs. Complete Data

When working on a prepare recipe or in an analysis, you get live visual feedback for every preparation step that you add. This would not be possible on whole datasets, since large datasets do not fit in DSS’s memory, so the preview is computed on a sample of the data.
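The idea of previewing a step on a sample can be illustrated with a minimal Python sketch. This is not how DSS implements sampling. The helper `head_sample`, the preparation step `normalize_name`, and the generated rows are all hypothetical names introduced for illustration, and a simple "first N rows" sample is assumed.

```python
from itertools import islice


def head_sample(rows, limit):
    """Return at most `limit` rows taken from the start of an iterable."""
    return list(islice(rows, limit))


def normalize_name(row):
    """Hypothetical preparation step: trim and title-case the 'name' field."""
    return {**row, "name": row["name"].strip().title()}


def all_rows():
    # Stand-in for a dataset too large to hold in memory:
    # rows are produced lazily, one at a time.
    for i in range(1_000_000):
        yield {"id": i, "name": f"  user {i} "}


# Preview the step on a small sample only; the other rows are never read.
preview = [normalize_name(row) for row in head_sample(all_rows(), 5)]
print(preview[0])
```

Because only the first five rows are consumed, the preview returns immediately; the same step would be applied to every row when the complete dataset is built.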

Keeping Everything Up-to-date in the Flow

When you edit an existing recipe or when your data is updated, you will need to rebuild your downstream datasets to update their contents.

DSS provides tools to speed up this task. To learn how to use them, read our Rebuilding Data page.
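To see why a change requires rebuilding downstream datasets, it can help to picture the Flow as a dependency graph. The sketch below is a conceptual illustration only and does not use the DSS API: the `flow` dictionary, the dataset names, and the `downstream` function are hypothetical. It walks the graph breadth-first to list every dataset that depends, directly or indirectly, on the one that changed.

```python
from collections import deque


def downstream(flow, changed):
    """Return every dataset reachable from `changed`, in breadth-first order."""
    seen, order = set(), []
    queue = deque(flow.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already scheduled through another path
        seen.add(node)
        order.append(node)
        queue.extend(flow.get(node, []))
    return order


# Hypothetical Flow: each dataset maps to the datasets built from it.
flow = {
    "raw_orders": ["orders_prepared"],
    "orders_prepared": ["orders_joined", "orders_stats"],
    "customers": ["orders_joined"],
    "orders_joined": ["report"],
}

# Datasets that are out of date after the recipe producing orders_prepared is edited.
stale = downstream(flow, "orders_prepared")
print(stale)
```

In this example, editing the recipe that produces `orders_prepared` leaves `orders_joined`, `orders_stats`, and `report` out of date, while `customers` is unaffected.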

Last but not least, when your development work is done, your Flow should be stable. It is then time to deploy your work to production.

Dataiku DSS greatly facilitates automation and monitoring tasks, so make sure you understand all production-related features by reading our portal on automation.

Collaboration

Dataiku DSS is all about making teams more efficient, so don’t forget to promote your work to others!