Get data products in production faster with code environments and advanced capabilities for experimentation, modeling, and deployment.
To expedite the feature engineering process, data scientists of all types, from citizen data scientists to experts, can discover reference feature sets in Dataiku’s feature store and import them into their projects.
AutoML in Dataiku provides automatic feature generation and reduction techniques and applies handling strategies for feature selection, missing values, variable encoding, and rescaling based on data type. Accept the default settings or easily modify any part for your specific objectives.
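To make the idea of type-based handling concrete, here is a minimal sketch in plain Python. It is not Dataiku's implementation; the function name `prepare_features`, the mean-imputation choice, and the sample columns are assumptions made for illustration. Numeric columns get missing values filled with the column mean and are then standardized, while categorical columns are one-hot encoded.

```python
from statistics import mean, pstdev


def prepare_features(rows, numeric_cols, categorical_cols):
    """Impute, rescale, and one-hot encode a list of dict rows (illustrative only)."""
    prepared = [dict() for _ in rows]

    for col in numeric_cols:
        observed = [r[col] for r in rows if r.get(col) is not None]
        col_mean = mean(observed)
        # Fall back to 1.0 so constant columns do not cause a division by zero.
        col_std = pstdev(observed) or 1.0
        for src, dst in zip(rows, prepared):
            value = src.get(col)
            if value is None:
                value = col_mean  # mean imputation for missing numeric values
            dst[col] = (value - col_mean) / col_std  # standard rescaling

    for col in categorical_cols:
        categories = sorted({r[col] for r in rows if r.get(col) is not None})
        for src, dst in zip(rows, prepared):
            for cat in categories:
                # A missing category becomes all zeros across the dummy columns.
                dst[f"{col}={cat}"] = 1 if src.get(col) == cat else 0

    return prepared


# Hypothetical sample data with one missing value per column.
rows = [
    {"age": 20, "plan": "basic"},
    {"age": None, "plan": "pro"},
    {"age": 40, "plan": None},
]
features = prepare_features(rows, ["age"], ["plan"])
```

Other strategies (median imputation, min-max rescaling, target encoding) would slot into the same structure; which one suits a given column depends on the data and the objective.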
Delivering More Models with AutoML
Dataiku augments the model development process with a guided methodology, built-in guardrails, and white-box explainability so data scientists and analysts alike can build and compare multiple production-ready models.
Dataiku AutoML offers algorithms from leading frameworks for prediction, clustering, time series forecasting, and computer vision tasks to help people across the business generate the best results, all in an easy-to-use interface.
Advanced data scientists can extend the visual ML interface by adding a custom Python algorithm, or programmatically develop models using Python, R, Scala, Julia, PySpark, and other languages. To ensure external efforts are captured and interpretable to the rest of the team, Dataiku captures the details of these experiments and automatically provides model comparisons and explainability reports.
Regardless of where a model is developed, Dataiku remains the central platform for deployment, monitoring, and governance.
Model Validation and Evaluation
Dataiku AutoML provides numerous features for validating and evaluating models, from design to deployment. Data scientists can take advantage of k-fold cross tests, automatic diagnostics, and model assertions for sanity checks during the experimentation phase.
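For readers unfamiliar with k-fold testing, the sketch below shows the general idea in plain Python: the data is split into k folds, and each fold serves once as the test set while the remaining folds are used for training. The function name `k_fold_indices` is hypothetical, and this is a generic illustration rather than a description of how Dataiku performs the split.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(f"train={train} test={test}")
```

This version keeps rows in their original order. In practice the rows are usually shuffled first (or split by time for ordered data), and the model's metric is averaged across the k test folds.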
An extensive battery of interactive performance and interpretation reports including fairness analysis, what-if analysis, and stress tests provides the tools teams need to explain results and responsibly deliver reliable, accurate models.
Time Series Analysis and Forecasting
Dataiku provides a suite of tools for time-series exploration and statistical analysis, along with preparation tasks such as resampling, imputations, and extrema & interval extraction.
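As an illustration of what resampling and imputation mean for a time series, the following plain-Python sketch places irregularly spaced observations onto a regular grid and fills the gaps by linear interpolation. The function name `resample_and_interpolate` and the numeric timestamps are assumptions for this example; it does not reproduce Dataiku's preparation steps.

```python
def resample_and_interpolate(points, step):
    """Resample (timestamp, value) pairs onto a regular grid.

    Values at grid timestamps with no observation are imputed by linear
    interpolation between the two surrounding observations.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    points = sorted(points)
    if len(points) < 2:
        raise ValueError("need at least two observations")
    timestamps = [t for t, _ in points]
    if len(set(timestamps)) != len(timestamps):
        raise ValueError("timestamps must be unique")

    result = []
    t = points[0][0]
    i = 0
    while t <= points[-1][0]:
        # Advance to the segment [points[i], points[i + 1]] that contains t.
        while points[i + 1][0] < t:
            i += 1
        (t0, v0), (t1, v1) = points[i], points[i + 1]
        weight = (t - t0) / (t1 - t0)
        result.append((t, v0 + weight * (v1 - v0)))
        t += step
    return result


# Hypothetical readings at t = 0, 2, and 6; the value at t = 4 is missing.
series = [(0, 10.0), (2, 20.0), (6, 40.0)]
regular = resample_and_interpolate(series, step=2)
```

Linear interpolation is only one imputation choice; carrying the previous value forward or filling with a constant are common alternatives, and the right one depends on how the signal behaves between observations.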
Business specialists and data scientists can easily develop, deploy, and maintain statistical or deep learning forecasting models using Dataiku’s visual ML interface.
Visual and Code-Based Deep Learning
Dataiku’s familiar model design, deployment, and governance experience makes it easy to include deep learning as part of data projects and business applications.
Define custom deep learning architectures with Keras and TensorFlow, or take advantage of pretrained models, transfer learning, and no-code interfaces for computer vision tasks such as image classification and object detection.
Scale with Managed Spark on Kubernetes
For large computation or model training jobs, teams can automatically and efficiently scale workloads with on-demand, elastic resources powered by Spark and Kubernetes on your cloud of choice.
Pre-configured and fully managed clusters abstract away the complexity of containerized infrastructure from data scientists, so you spend more time doing what you love, and less time setting up backend resources.
Get Started with Dataiku
Start an online hosted trial or download the free edition.