DataOps with Dataiku

Automate data pipelines for clean and timely data across the enterprise.

 

Projects

Dataiku projects are the central place where users work and collaborate. Each project has a visual flow that shows its pipeline of datasets and recipes.

Users can view the project and associated assets (like dashboards), check the project’s overall status, and view recent activity.

 

Visual Flow

Organizing data pipelines to transform, prepare, and analyze data is critical for production-ready AI projects.

The Dataiku visual flow lets coders and non-coders alike build data pipelines from datasets, recipes that join and transform those datasets, and predictive models. The flow also supports code and reusable plugin elements for customization and advanced functions.

 

Data Quality and Checks

Checks in Dataiku automatically assess flow elements by comparing them against specified or previous values, helping ensure that automated flows run within expected timeframes and produce expected results. When a pipeline item fails a check, an error is returned, prompting investigation and quick resolution.
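
To make the idea concrete, here is a minimal conceptual sketch of the two comparisons described above: a metric checked against specified bounds, and a metric checked against its previous value. It is plain Python, not Dataiku's API; the names `CheckResult`, `check_range`, and `check_against_previous` and the outcome labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    outcome: str   # "OK" or "ERROR" (illustrative labels)
    message: str


def check_range(value: float,
                minimum: Optional[float] = None,
                maximum: Optional[float] = None) -> CheckResult:
    """Compare a metric with specified lower and upper bounds."""
    if minimum is not None and value < minimum:
        return CheckResult("ERROR", f"{value} is below the minimum {minimum}")
    if maximum is not None and value > maximum:
        return CheckResult("ERROR", f"{value} is above the maximum {maximum}")
    return CheckResult("OK", f"{value} is within bounds")


def check_against_previous(value: float,
                           previous: float,
                           max_relative_change: float) -> CheckResult:
    """Compare a metric with the value recorded on the previous run."""
    if previous == 0:
        # Relative change is undefined; only an unchanged value passes.
        changed = value != 0
    else:
        changed = abs(value - previous) / abs(previous) > max_relative_change
    if changed:
        return CheckResult("ERROR", f"{value} drifted too far from {previous}")
    return CheckResult("OK", f"{value} is close to {previous}")


# Hypothetical metric: row count of a dataset after a nightly build.
range_result = check_range(950, minimum=500, maximum=2000)
drift_result = check_against_previous(500, previous=1000, max_relative_change=0.2)
print(range_result.outcome, drift_result.outcome)
```

In this sketch the row count passes the bounds check but fails the comparison with the previous run, which is the kind of result that would prompt an investigation.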

 

Scenarios and Triggers for Automation

Operating AI projects requires repetitive tasks such as loading and processing data and running batch scoring jobs. With Dataiku, scenarios and triggers automate these processes, either by scheduling them for periodic execution or by triggering them when specified conditions are met.

With automation in place, production teams can manage more projects and scale up the number of AI projects they deliver to production.
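
The distinction between a time-based trigger and a condition-based trigger can be sketched in a few lines of plain Python. This is a conceptual illustration only, not Dataiku's scenario API: the functions `time_trigger_due`, `data_changed_trigger`, and `run_scenario`, and the step names, are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Optional


def time_trigger_due(last_run: Optional[datetime],
                     now: datetime,
                     interval: timedelta) -> bool:
    """Periodic trigger: fire if the scenario never ran or the interval elapsed."""
    return last_run is None or now - last_run >= interval


def data_changed_trigger(previous_row_count: int, current_row_count: int) -> bool:
    """Condition trigger: fire when the watched dataset has changed size."""
    return current_row_count != previous_row_count


def run_scenario(steps: List[Callable[[], None]]) -> List[str]:
    """Run each step in order and return the names of the steps executed."""
    executed = []
    for step in steps:
        step()
        executed.append(step.__name__)
    return executed


def load_data() -> None:
    """Placeholder for a data loading step."""


def score_batch() -> None:
    """Placeholder for a batch scoring step."""


now = datetime(2024, 1, 2, 6, 0)
last_run = datetime(2024, 1, 1, 6, 0)

# The scenario runs if either trigger fires.
if time_trigger_due(last_run, now, timedelta(hours=24)) or data_changed_trigger(1000, 1000):
    executed = run_scenario([load_data, score_batch])
else:
    executed = []
print(executed)
```

Here the dataset has not changed, but 24 hours have passed since the last run, so the periodic trigger alone is enough to start the scenario.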

 

Code Notebooks, Recipes, and Environments

Dataiku is for coders and non-coders alike. Developers and advanced data scientists who prefer tools like Python or R can incorporate code into projects via notebooks or directly with code recipes and plugins.

Dataiku supports code notebooks for SQL, Python, and R, and code recipes developed in Python, R, SQL, Hive, Pig, Impala, Spark-Scala, PySpark, Spark/R, SparkSQL, and Shell. Dataiku also supports code environments for Python, R, and Conda, and it has a complete API for R.

 

Git Integration

Version control with Git is essential for development projects. Dataiku integrates with Git for version control of projects, importing Python and R code, developing reusable plugins, importing plugins, and more.

 

APIs

Dataiku includes robust APIs for integrating with external systems to create and manage AI and analytics projects. The Dataiku public API allows authorized users to interact with Dataiku from an external system for tasks including administration, maintenance, and data access.

The public API is available via a Python API client or via HTTP REST API. Dataiku also includes a complete R API as well as APIs for JavaScript and Scala for specific functions.
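
As a rough sketch of what a call over the HTTP REST API might look like, the snippet below builds (but does not send) a request to list projects, using only the Python standard library. The host, port, and API key are placeholders, and the `/public/api/projects/` path and the Basic authentication scheme (API key as username, empty password) are assumptions that should be checked against the API reference for your Dataiku version.

```python
import base64
import urllib.request


def build_list_projects_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the (assumed) project listing endpoint."""
    url = base_url.rstrip("/") + "/public/api/projects/"
    # Assumption: the API key is sent as the Basic auth username with an empty password.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder host and key; replace with real values before sending.
request = build_list_projects_request("https://dss.example.com:11200", "my-secret-api-key")
print(request.get_method(), request.full_url)

# To actually send it against a reachable instance:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The Python API client wraps calls of this kind, so in practice it is usually the more convenient option when the external system is itself written in Python.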

Go Further

Explore DataOps Features

Automations, Checks, Metrics, Scenarios, and Triggers!

Features

Get a Demo

Watch our end-to-end demo to discover the platform.

On-Demand Dataiku Demo

Beyond Dataiku with Plugins

Extend the power of Dataiku with your own datasets, recipes, and processors!

Plugins

Discover how Dataiku enables Data Architects

From AI orchestration to smooth operationalization, explore how Dataiku helps data architects.

Discover

Get Started with Dataiku

Start an online hosted trial, download the free edition, or compare the features of the Lite, Team, and Enterprise editions.

Let's go