AutoML and Augmented Analytics as the Future of AI

Companies that successfully scale their AI efforts in the next five years will undoubtedly leverage end-to-end AutoML (otherwise known as augmented analytics), covering everything from data ingestion to model monitoring.

At a very high level, AutoML uses machine learning techniques to automate the process of applying machine learning. Early on, it was used almost exclusively to select the best-performing algorithms for a given task and to tune the hyperparameters of those algorithms.
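
As a rough illustration of that early scope (algorithm selection plus hyperparameter tuning), the sketch below tries a few candidate models on a holdout set and keeps the one with the lowest error. It uses only the Python standard library and a synthetic dataset; the function names (`fit_constant`, `fit_knn`, `auto_select`) and the candidate grid are assumptions made for this example, not part of any AutoML product.

```python
import random
from statistics import mean


def make_data(n=200, seed=0):
    """Synthetic 1-D regression data: y = 2x + 1 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys


def fit_constant(xs, ys):
    """Baseline 'algorithm': always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_knn(xs, ys, k):
    """k-nearest-neighbors regressor; k is the hyperparameter to tune."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)

    return predict


def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def auto_select(xs, ys, holdout=0.25):
    """Fit every candidate on the training part, score it on the holdout
    part, and return (error, algorithm name, hyperparameters) of the best."""
    split = int(len(xs) * (1 - holdout))
    x_tr, y_tr = xs[:split], ys[:split]
    x_va, y_va = xs[split:], ys[split:]

    candidates = [("constant", {}, fit_constant)]
    candidates += [("knn", {"k": k}, fit_knn) for k in (1, 3, 5, 10, 20)]

    results = []
    for name, params, fit in candidates:
        model = fit(x_tr, y_tr, **params)
        results.append((mse(model, x_va, y_va), name, params))
    return min(results, key=lambda r: r[0])


best_error, best_name, best_params = auto_select(*make_data())
print(best_name, best_params, round(best_error, 3))
```

Real AutoML tools search far larger spaces and typically use cross-validation rather than a single holdout split, but the selection loop has the same shape.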

Yet AutoML can have a broader scope, and with later versions of tools such as auto-sklearn and TPOT, it does. Its development has spurred the application of automation to the whole data-to-insights pipeline: cleaning the data, selecting and creating features, tuning algorithms, and even operationalization. At this larger scale, it’s no longer just AutoML, but augmented analytics. Today, augmented analytics can add efficiency to large swaths of the data pipeline, with the potential to affect the entire process and influence the structure of data teams in the long term.

“By 2025, 50% of data scientist activities will be automated by AI, easing the acute talent shortage.” 

Gartner, How Augmented Machine Learning Is Democratizing Data Science; Jim Hare, Carlie Idoine, Peter Krensky, 29 August 2019 (Report available to Gartner subscribers)

AutoML in Dataiku

Since the early days of the product (all the way back to 2014), Dataiku has offered a visual AutoML suite that guides the user through all of the machine learning steps (train-test split, feature handling, metrics to optimize, and different templates of pre-set algorithms). 

The interface offers a one-button option, simply called “Train.” It automatically infers the feature handling, pre-selects a collection of algorithms, and returns the best-performing one. Of course, it’s still up to the user to tune those settings and select the best possible configuration based on their experience.
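
To make "inferring the feature handling" concrete, here is a minimal sketch of how a tool might guess a handling strategy for each column from its values. It is not Dataiku's implementation; the function `infer_handling`, the three-word threshold for text, and the sample rows are all assumptions made for illustration.

```python
from statistics import mean


def infer_handling(rows):
    """Guess 'numeric', 'text', or 'categorical' for each column.

    rows is a list of dicts sharing the same keys; None marks a missing value.
    """
    handling = {}
    for col in rows[0]:
        values = [row[col] for row in rows if row[col] is not None]
        is_number = lambda v: isinstance(v, (int, float)) and not isinstance(v, bool)
        if all(is_number(v) for v in values):
            # Also reached for an all-missing column (all() of nothing is True).
            handling[col] = "numeric"
        elif mean(len(str(v).split()) for v in values) > 3:
            # Hypothetical heuristic: long free-form strings are treated as text.
            handling[col] = "text"
        else:
            handling[col] = "categorical"
    return handling


rows = [
    {"age": 34, "plan": "basic", "review": "Great product, would buy it again"},
    {"age": None, "plan": "premium", "review": "Too expensive for what it does"},
    {"age": 51, "plan": "basic", "review": "Works fine"},
]
handling = infer_handling(rows)
print(handling)
```

A user who disagrees with a guess (for example, a numeric ID column that should be ignored) would override it, which is the manual tuning step described above.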

More specifically, Dataiku offers AutoML features for:

  • Model Deployment and Monitoring, including self-service deployment of models by data scientists and/or analysts; model lifecycle management (challenger, dev, test, prod); and scalable, highly available deployment of machine learning API services on container fleets running on Docker/Kubernetes (either on premises or using cloud services).
  • Feature Engineering, including text, numeric, and categorical handling; imputation and rescaling; and feature selection and PCA.
  • Model Training, including support for Python, MLlib, H2O, etc., and plugin models; regression, classification, clustering, and time series; interruptible and resumable grid search; support for random search and time-boundable search; and Kubernetes distributed training (per algorithm).
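
The idea of a time-boundable random search, mentioned in the list above, can be sketched in a few lines: sample hyperparameters at random and stop when either a trial limit or a time budget is reached. This is a generic illustration using only the Python standard library, not Dataiku's search code; `random_search`, `toy_loss`, and the search space are hypothetical names and values chosen for the example.

```python
import random
import time


def random_search(objective, space, time_budget_s=0.2, max_trials=1000, seed=0):
    """Minimize objective over uniformly sampled parameters.

    Stops at max_trials or when time_budget_s seconds have elapsed,
    whichever comes first. Returns (best_params, best_score, trials_run).
    """
    rng = random.Random(seed)
    deadline = time.monotonic() + time_budget_s
    best_score, best_params, trials = float("inf"), None, 0

    while trials < max_trials and time.monotonic() < deadline:
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        trials += 1
        if score < best_score:
            best_score, best_params = score, params
    return best_params, best_score, trials


def toy_loss(params):
    """Stand-in for a validation loss; lowest at learning_rate=0.1, reg=1.0."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["reg"] - 1.0) ** 2


space = {"learning_rate": (0.0, 1.0), "reg": (0.0, 5.0)}
best_params, best_score, trials = random_search(toy_loss, space)
print(trials, best_params, best_score)
```

In practice the objective would train and validate a model, so each trial is expensive and the time budget, rather than the trial count, is usually what ends the search.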

Data Scientists and the AI Revolution

Data scientists are a critical piece of the AI puzzle, and they know what tools and technologies work best for any given task.

Go Further

Workflows, Automation, And Monitoring

Dataiku lets you deploy workflows easily with safe versioning and rollback. Then automate your deployments as part of a larger production strategy.

Deploying ML Models in Production

Deploy your ML models in production in one click, with high-performance, scalable scoring that is deployable on the cloud with Kubernetes.

Visual Machine Learning and Modeling

Dataiku makes it easy to leverage machine learning technologies and get instant visual and statistical feedback on model performance.

Data Exploration and Visualisation

Get instant insights from your data, whatever its size or format, and share them with your teams.
