In DSS, teams or users organize their datasets and associated tasks into separate projects:
Projects help manage:
Within a project, the DSS items that the user manipulates are accessed through six "universes" that map to the main concepts of DSS:
A Dataset (in the DSS sense) is a series of rows with the same data structure. The underlying data:
The DSS Dataset Abstraction Layer allows users to access, visualize and write the data in a unified way, whatever the underlying storage system.
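To illustrate the idea of a unified access layer, here is a minimal Python sketch. It is not the DSS API: the names `Dataset`, `CsvDataset`, `SqlDataset`, `read_rows`, `write_rows` and `copy_dataset` are hypothetical, and the two backends (CSV text held in memory and an in-memory SQLite table) are stand-ins chosen only to keep the example self-contained.

```python
import csv
import io
import sqlite3
from abc import ABC, abstractmethod


class Dataset(ABC):
    """Hypothetical unified interface: rows are dicts sharing one schema."""

    @abstractmethod
    def read_rows(self):
        """Return all rows as a list of dicts."""

    @abstractmethod
    def write_rows(self, rows):
        """Replace the stored rows with the given list of dicts."""


class CsvDataset(Dataset):
    """Rows stored as CSV text (an in-memory stand-in for a file)."""

    def __init__(self, columns):
        self.columns = columns
        self.text = ""

    def read_rows(self):
        return [dict(row) for row in csv.DictReader(io.StringIO(self.text))]

    def write_rows(self, rows):
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=self.columns)
        writer.writeheader()
        writer.writerows(rows)
        self.text = buffer.getvalue()


class SqlDataset(Dataset):
    """Rows stored in a SQL table (SQLite here, every column typed TEXT)."""

    def __init__(self, connection, table, columns):
        self.connection = connection
        self.table = table
        self.columns = columns
        column_defs = ", ".join(f"{name} TEXT" for name in columns)
        connection.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_defs})")

    def read_rows(self):
        query = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        return [dict(zip(self.columns, row)) for row in self.connection.execute(query)]

    def write_rows(self, rows):
        self.connection.execute(f"DELETE FROM {self.table}")
        placeholders = ", ".join("?" for _ in self.columns)
        self.connection.executemany(
            f"INSERT INTO {self.table} VALUES ({placeholders})",
            [tuple(row[name] for name in self.columns) for row in rows],
        )


def copy_dataset(source, target):
    """Works for any pair of backends because it only uses the shared interface."""
    target.write_rows(source.read_rows())


columns = ["id", "city"]
csv_data = CsvDataset(columns)
csv_data.write_rows([{"id": "1", "city": "Paris"}, {"id": "2", "city": "Lyon"}])

sql_data = SqlDataset(sqlite3.connect(":memory:"), "cities", columns)
copy_dataset(csv_data, sql_data)
```

The calling code (`copy_dataset`) never needs to know where the rows live, which is the property the abstraction layer is meant to provide.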
Creating your first DSS dataset and learning how to cleanse it is the subject of the DSS 101 Tutorial.
In DSS, data manipulations (cleansing, aggregation, joining, etc.) are performed within Recipes, which take datasets as inputs and produce datasets as outputs.
The lineage of a dataset (or a model) is thus defined by the inputs and outputs of its ancestor recipes. The overall view of the dependency structure of a project is accessible in the Flow tab:
Knowing these dependencies helps the DSS engine minimize the number of data processes that must be launched when (re)building a dataset.
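As an illustration of how dependency information can limit the work done on a rebuild, here is a minimal Python sketch. It is not the DSS engine's actual algorithm: the function `plan_build`, the `flow` mapping (each computed dataset mapped to the inputs of the recipe that produces it) and the dataset names are all hypothetical, and the flow is assumed to contain no cycles.

```python
def plan_build(flow, target, changed):
    """Return the datasets to rebuild, ancestors first, to refresh `target`.

    flow    -- dict mapping each computed dataset to the list of its recipe inputs;
               datasets absent from the dict are treated as source datasets
    changed -- set of datasets whose data changed since the last build
    """
    plan = []
    status = {}  # dataset name -> True if it is out of date

    def is_dirty(name):
        if name not in status:
            inputs = flow.get(name, [])
            if not inputs:
                # Source dataset: out of date only if explicitly marked as changed.
                status[name] = name in changed
            else:
                # Visit every input (no short-circuit) so all stale ancestors are planned.
                dirty_inputs = [is_dirty(parent) for parent in inputs]
                status[name] = name in changed or any(dirty_inputs)
                if status[name]:
                    plan.append(name)  # appended after its ancestors
        return status[name]

    is_dirty(target)
    return plan


# Hypothetical flow: two cleaning recipes, one join, one aggregation.
flow = {
    "orders_clean": ["orders_raw"],
    "customers_clean": ["customers_raw"],
    "orders_enriched": ["orders_clean", "customers_clean"],
    "revenue_report": ["orders_enriched"],
}

plan = plan_build(flow, "revenue_report", changed={"orders_raw"})
print(plan)  # ['orders_clean', 'orders_enriched', 'revenue_report']
```

Only the branch downstream of `orders_raw` is rebuilt; `customers_clean` is left alone because none of its ancestors changed.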
Data Science in real life is full of dirty data. Data tacklers' daily tasks include:
Within the Lab, DSS provides a dedicated module called Visual Analysis to quickly iterate over these processes. This module lets data workers experiment visually with various constructions before deploying them to the Flow, so that they can build their data-driven applications efficiently.
We strongly invite you to follow the DSS 101 Tutorial to discover how fluid tackling data problems can be within DSS Analysis.
Some people prefer to carry out this iterative process in code. DSS ships with interactive development environments called Notebooks.
These can be used either
The secret to efficiently taking advantage of your data assets lies in the ability to (re)play the full pipeline of an analysis and to always have up-to-date predictive scoring.
The associated tasks are monitored in the Jobs tab. Every time the reconstruction of a dataset is requested, DSS creates a new Job with all the build dependency information defined in the Flow.
Scenarios help you automate these reconstruction tasks, for example running daily updates to your models. Reports on scenarios that ran previously and their results are shown in Monitoring.
The DSS Dashboard is a communication tool for organizing, sharing and delivering Insights on your data (charts, datasets, static reports, etc.).
On the Dashboard, the team structures its findings, and the final data consumers get an up-to-date summary.
Users with Web coding skills can create advanced custom Web applications using our dedicated editor and REST API. Templates and code samples are provided to help you get started. Head to the dedicated howtos to learn more!