Guided Machine Learning
The interface of Dataiku DSS was designed to make creating potent machine learning models easy. Clicking through the interface is enough for most use cases, whether you are an expert Data Scientist or a beginner! Discover below what you can do in the visual interface:
- Learn how to save a model to the flow and use it to score another dataset.
- A Dataiku model consists of a whole pipeline that combines data preparation, feature handling and an ML model. This means that you can directly score raw data with a Dataiku model, without reimplementing data cleaning or feature preprocessing.
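To make the pipeline idea concrete, here is a minimal plain-Python sketch. It is an analogy only, not Dataiku's implementation: the `ScoringPipeline` class, the three step functions, the column names and the weights are all hypothetical. It shows why bundling preparation and feature handling with the predictor lets a caller pass raw records straight to scoring.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class ScoringPipeline:
    """Hypothetical stand-in for a saved model: preparation + features + predictor."""
    prepare: Callable[[Record], Record]
    featurize: Callable[[Record], List[float]]
    predict: Callable[[List[float]], float]

    def score(self, raw: Record) -> float:
        # The caller only ever supplies raw data; every step runs inside the pipeline.
        return self.predict(self.featurize(self.prepare(raw)))


def prepare(raw: Record) -> Record:
    # Data cleaning: normalize the city string and fill a missing age with a default.
    city = str(raw.get("city", "")).strip().lower()
    age = raw.get("age")
    return {"city": city, "age": 40 if age is None else age}


def featurize(rec: Record) -> List[float]:
    # Feature handling: rescale the numeric column, one-hot encode a single category.
    return [float(rec["age"]) / 100.0, 1.0 if rec["city"] == "paris" else 0.0]


def predict(x: List[float]) -> float:
    # Toy linear model with made-up weights, standing in for a trained algorithm.
    return 0.5 * x[0] + 0.25 * x[1]


pipeline = ScoringPipeline(prepare, featurize, predict)
score = pipeline.score({"city": "  Paris ", "age": None})
```

Because the cleaning and encoding steps travel with the model, the raw record above (untrimmed text, missing age) can be scored without any separate preprocessing code.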
The lifecycle of a model in production
The models saved in the flow can be retrained, versioned and monitored. These capabilities are critical to all predictive applications used in production. See here how to handle the whole lifecycle of a model.
Machine learning engines
Dataiku DSS lets you use multiple machine learning engines within its guided machine learning framework.
Deep learning offers extremely flexible modeling of the relationships between a target and its input features, and is used in a variety of challenging applications, such as image processing, text analysis, and time series, in addition to models for structured data.
The Python and MLlib machine learning engines allow you to define custom models by adding your own code while still taking advantage of the Dataiku DSS visual interface for machine learning.
Score in real time through a REST API
A saved model can be deployed to a Dataiku DSS API node, which can then be queried for predictions on new data.
The API node provides all the necessary features for scoring in production:
- High availability and scalability for scoring new records.
- Model versioning and rollback using model packages.
- The ability to score in real time, even with models trained using a distributed engine.
- More advanced capabilities, such as enriching queries in real time or handling custom models.
Refer to the Automation and real-time scoring portal for more information.
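As a rough illustration of what querying a deployed model over REST can look like, the sketch below builds a prediction request using only the Python standard library. The URL layout (`/public/api/v1/<service>/<endpoint>/predict`) and the `{"features": ...}` payload shape are assumptions to check against the API node documentation for your DSS version. The host, port, service ID, endpoint ID and feature names are hypothetical placeholders.

```python
import json
from urllib import request


def build_predict_request(base_url, service_id, endpoint_id, features):
    """Build (but do not send) a POST request asking an API node for one prediction.

    Assumption: the node exposes <base>/public/api/v1/<service>/<endpoint>/predict
    and expects a JSON body of the form {"features": {...}}.
    """
    url = f"{base_url.rstrip('/')}/public/api/v1/{service_id}/{endpoint_id}/predict"
    body = json.dumps({"features": features}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical service, endpoint and features, for illustration only.
req = build_predict_request(
    "http://localhost:12000/",
    "energy",
    "peak_model",
    {"temperature": 21.5, "hour": 18},
)

# Sending is left commented out because it needs a running API node:
# with request.urlopen(req, timeout=10) as resp:
#     prediction = json.load(resp)
```

Separating request construction from sending keeps the sketch runnable without a server and makes the URL and payload easy to inspect before pointing it at a real node.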
Energy & environment
SAMPLE PROJECT: Model energy consumption and predict power peaks
SAMPLE PROJECT: Geographic clustering based on POIs