Take your first steps with Dataiku DSS in this tutorial!
Tutorials
Dataiku DSS Basics
Tutorial: Basics
Tutorial: From Lab to Flow
In this tutorial, you will learn how to enrich your data, combine data from multiple datasets, and create a new dataset based on recipe transformations.
Tutorial: Machine Learning
In this two-part tutorial, you will create your first machine learning model and then use it to score new records.
Automation and Production
Tutorial: Automation
In this tutorial, you will learn the basics of scheduling jobs using scenarios and monitoring jobs using metrics and checks.
Tutorial: Deploying to Production
In this tutorial, you will learn how to package flows for deployment, version flows, and deploy packages in a production environment.
Tutorial: Deploying to Real-Time Scoring
In this tutorial, you will learn how to package an API service that includes a model, deploy the service to the real-time scoring environment, and version service packages.
Advanced Dataiku DSS
Using window recipes
If you are trying to:
- filter rows by order of appearance within a group,
- compute moving averages or cumulative sums,
- or perform data manipulation similar to what SQL window functions do,
you can do all of this with a window recipe!
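To make the three use cases above concrete, here is a minimal plain-Python sketch of the kind of computation a window does. It is not the Dataiku window recipe or its API; the `orders` data, the column names, and the `window_by_customer` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical sample data: (customer, order_date, amount)
orders = [
    ("alice", "2024-01-03", 20.0),
    ("bob",   "2024-01-01", 15.0),
    ("alice", "2024-01-01", 10.0),
    ("bob",   "2024-01-05", 5.0),
    ("alice", "2024-01-07", 30.0),
]


def window_by_customer(rows, ma_size=2):
    """Partition rows by customer, order them by date, and add window columns.

    Roughly what SQL expresses as
    ... OVER (PARTITION BY customer ORDER BY order_date).
    """
    groups = defaultdict(list)
    for customer, order_date, amount in rows:
        groups[customer].append((order_date, amount))

    result = []
    for customer in sorted(groups):
        # ISO-formatted dates sort chronologically as plain strings.
        ordered = sorted(groups[customer])
        amounts = [amount for _, amount in ordered]
        cumulative = list(accumulate(amounts))
        for i, (order_date, amount) in enumerate(ordered):
            # Trailing frame: the current row plus up to ma_size - 1 previous rows.
            frame = amounts[max(0, i - ma_size + 1): i + 1]
            result.append({
                "customer": customer,
                "order_date": order_date,
                "amount": amount,
                "rank": i + 1,                       # order of appearance in the group
                "cumulative_amount": cumulative[i],  # cumulative sum
                "moving_avg": sum(frame) / len(frame),
            })
    return result


windowed = window_by_customer(orders)

# Filter rows by order of appearance within a group: keep each customer's first order.
first_orders = [row for row in windowed if row["rank"] == 1]
```

The `rank` column covers the "filter by order of appearance" case, while `cumulative_amount` and `moving_avg` cover cumulative sums and moving averages over a trailing frame.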
Partitioning datasets
Partitioning is a powerful tool for working with incremental data. It helps reduce computation time when you update your data science workflow.
- If you are trying to partition a file based on the values of one of its columns, you will need to read our tutorial on repartitioning file-based datasets.
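As a rough illustration of the idea, the sketch below splits a small CSV into partitions keyed by the values of one column, then recomputes a single partition without touching the others. It uses only the Python standard library and does not reflect how Dataiku implements partitioning; the sample data and the helpers `repartition_rows` and `rebuild_partition` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample file contents.
SAMPLE = """day,store,sales
2024-01-01,paris,100
2024-01-01,lyon,80
2024-01-02,paris,120
"""


def repartition_rows(csv_text, partition_column):
    """Group CSV rows into partitions keyed by the value of one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    partitions = defaultdict(list)
    for row in reader:
        partitions[row[partition_column]].append(row)
    return dict(partitions)


def rebuild_partition(partitions, key, transform):
    """Recompute a single partition; rows in other partitions are not read."""
    return [transform(row) for row in partitions.get(key, [])]


partitions = repartition_rows(SAMPLE, "day")

# When new data arrives for one day, only that day's partition is processed.
latest = rebuild_partition(
    partitions,
    "2024-01-02",
    lambda row: {**row, "sales": int(row["sales"])},
)
```

Processing only the partition that changed is where the reduction in computation time comes from: the cost of an update scales with the size of that partition rather than with the whole dataset.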
Creating web apps
Web apps are a great way to share your insights. You can build beautiful and interactive data visualizations, or even dashboards for sophisticated reporting! See the web app section of the data visualization portal for tutorials on standard, Bokeh, and Shiny web apps.
Deep Learning
You can define Deep Learning architectures in Dataiku’s Visual Machine Learning for a variety of applications, such as image processing, text analysis, and time series, in addition to models for structured data.
Use Case Samples
Churn prediction
Churn prediction is one of the best-known applications of data science in the Customer Relationship Management (CRM) and Marketing fields. Simply put, a churner is a user or customer who stops using a company’s products or services.
In this tutorial, you will create a complete data science workflow to predict whether a customer is going to churn. It covers the whole process, from data preparation to machine learning, and includes a good amount of feature engineering.
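Before a model can be trained, the definition of a churner has to be turned into a label. The sketch below shows one possible way to do that, assuming a churner is a customer with no activity in the 90 days before a reference date. The 90-day threshold, the sample data, and the `label_churners` helper are assumptions made for illustration and are not taken from the tutorial.

```python
from datetime import date, timedelta


def label_churners(last_activity, reference_date, inactivity_days=90):
    """Flag customers whose last activity is older than the inactivity window.

    The 90-day default is an arbitrary, hypothetical threshold.
    """
    cutoff = reference_date - timedelta(days=inactivity_days)
    return {
        customer: last_seen < cutoff
        for customer, last_seen in last_activity.items()
    }


# Hypothetical last-activity dates per customer.
last_activity = {
    "c1": date(2024, 1, 5),
    "c2": date(2024, 5, 20),
    "c3": date(2023, 11, 30),
}

labels = label_churners(last_activity, reference_date=date(2024, 6, 1))
```

In practice the right inactivity window depends on the business: a weekly subscription service and an annual one would choose very different thresholds.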