Explore Machine Learning Features
Deep learning, Time Series, MLlib, Partitioned Models and Custom.
Build advanced machine learning models using the latest techniques.
To aid in the feature engineering process, Dataiku AutoML automatically fills missing values and converts non-numeric data into numerical values using well-established encoding techniques.
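As a rough illustration of what filling missing values and encoding non-numeric data can look like, here is a minimal pure-Python sketch. It assumes mean imputation and one-hot encoding, which are two common techniques; the source does not say which strategies Dataiku AutoML applies, and the helper names `impute_mean` and `one_hot` and the sample columns are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Encode each category as a 0/1 indicator vector (sorted category order)."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


# Hypothetical columns: a numeric one with a gap and a categorical one.
ages = [34, None, 52, 40]
plans = ["basic", "pro", "basic", "team"]

filled_ages = impute_mean(ages)
encoded_plans, plan_categories = one_hot(plans)

print(filled_ages)
print(plan_categories)
print(encoded_plans)
```

Whatever strategy is used, the fill value and the category list learned on the training data need to be reused unchanged at scoring time so that new rows are transformed the same way.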
Users can also create new features using formulas, code, or built-in visual recipes to provide additional signals to improve model accuracy. Once created, Dataiku stores feature engineering steps in recipes for reuse in scoring and model retraining.
Automating model training with best-practice techniques and built-in guardrails allows business analysts to build and compare multiple production-ready models.
Dataiku AutoML uses leading algorithms and frameworks such as Scikit-Learn and XGBoost to find the best modeling results, in an easy-to-use interface for users across the business.
Dataiku supports a variety of Jupyter-based notebooks for code-based experimentation and model development in Python, R, and Scala.
Dataiku also includes eight prebuilt notebooks for data analysis, covering statistics, dimensionality reduction, time series, and topic modeling.
Dataiku supports time-series data preparation, including resampling, windowing, extrema extraction, and interval extraction. Time series visualization creates line charts to display time-series data for analysis.
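The preparation steps named above (resampling, windowing, extrema extraction) can be sketched in plain Python. This is an illustrative example under stated assumptions, not Dataiku's implementation: the function names, the hourly bucket size, the trailing-window definition, and the sample readings are all hypothetical.

```python
from datetime import datetime, timedelta


def resample_hourly(points):
    """Average (timestamp, value) readings into hourly buckets."""
    buckets = {}
    for ts, value in points:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets.setdefault(hour, []).append(value)
    return [(hour, sum(vals) / len(vals)) for hour, vals in sorted(buckets.items())]


def rolling_mean(values, window):
    """Trailing moving average; only positions with a full window are returned."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def global_maximum(series):
    """Return the (timestamp, value) pair with the largest value."""
    return max(series, key=lambda point: point[1])


# Hypothetical readings taken every 30 minutes.
start = datetime(2024, 1, 1, 0, 0)
raw_values = [10, 14, 20, 22, 18, 12]
readings = [
    (start + timedelta(minutes=30 * i), float(v)) for i, v in enumerate(raw_values)
]

hourly = resample_hourly(readings)                      # resampling
smoothed = rolling_mean([v for _, v in hourly], 2)      # windowing
peak = global_maximum(hourly)                           # extrema extraction

print(hourly)
print(smoothed)
print(peak)
```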
Data scientists can develop forecasting models using the forecasting plugin or using custom code and notebooks combined with data preparation and visualization in a project to ensure their forecast model is ready for production use.
Dataiku fully supports deep learning with Keras and TensorFlow, including training and deployment on CPUs and GPUs.
Deep learning models are created and managed like any other model in Dataiku, which makes them easy to deploy as part of projects and business applications.
Dataiku does not restrict you to the algorithms included in its AutoML capabilities; users can also write custom models in Python or Scala. Custom models are first-class citizens in Dataiku.
Once deployed in a project, custom models are handled like any other model. This powerful capability to use custom-coded models opens up various use cases that may not be easily modeled by other methods (such as AutoML).
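To make the idea of a custom-coded model concrete, here is a minimal sketch of a hand-written classifier in Python. The source does not describe the interface a custom model must expose, so the `fit`/`predict` shape below is an assumption borrowed from scikit-learn conventions; the class name `NearestCentroidClassifier` and the sample data are hypothetical.

```python
class NearestCentroidClassifier:
    """Toy classifier: predict the label whose training centroid is closest."""

    def fit(self, X, y):
        # Accumulate per-label feature sums and row counts.
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] = counts.get(label, 0) + 1
        # Centroid = per-feature mean of the rows carrying that label.
        self.centroids_ = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def squared_distance(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(
                self.centroids_,
                key=lambda label: squared_distance(row, self.centroids_[label]),
            )
            for row in X
        ]


# Hypothetical training data with two well-separated classes.
X_train = [[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]]
y_train = ["low", "low", "high", "high"]

clf = NearestCentroidClassifier().fit(X_train, y_train)
predictions = clf.predict([[1.2, 1.4], [8.5, 9.0]])
print(predictions)
```

Keeping training in `fit` and scoring in `predict` is what lets a hand-written model be trained, evaluated, and scored through the same steps as a built-in one.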
Dataiku supports model training on large datasets that don’t fit into memory using Spark MLlib or H2O Sparkling Water.
Once Spark is configured, it becomes available to users for model training. Depending on the configuration, users can train models with MLlib algorithms such as regression and decision trees, or use H2O Sparkling Water, which supports deep learning, GBM, GLM, random forest, and more.