Automations, Checks, Metrics, Scenarios, and Triggers
DataOps with Dataiku
Automate data pipelines for clean and timely data across the enterprise.
A Dataiku project is the central place for users' work and collaboration. Each project has a visual flow that shows the pipeline of datasets and recipes associated with the project.
Users can view the project and associated assets (like dashboards), check the project’s overall status, and view recent activity.
Organizing data pipelines to transform, prepare, and analyze data is critical for production-ready AI projects.
The Dataiku visual flow lets coders and non-coders alike build data pipelines from datasets, recipes that join and transform those datasets, and predictive models. The flow also supports code and reusable plugin elements for customization and advanced functions.
Data Quality and Checks
Checks in Dataiku automatically assess flow elements against specified or previous values, helping ensure that automated flows run within expected timeframes and produce expected results. When a pipeline item fails a check, an error is returned, prompting investigation and quick resolution.
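To make the idea of comparing a metric with specified or previous values concrete, here is a minimal sketch in plain Python. It does not use Dataiku's API: `check_metric`, `CheckResult`, and every parameter name are hypothetical, and treating drift from the previous run as a warning rather than an error is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    outcome: str   # "OK", "WARNING", or "ERROR" (illustrative labels)
    message: str


def check_metric(value: float,
                 minimum: Optional[float] = None,
                 maximum: Optional[float] = None,
                 previous: Optional[float] = None,
                 max_relative_change: Optional[float] = None) -> CheckResult:
    """Compare a metric with fixed bounds and with its previous value."""
    # Specified values: anything outside the allowed range is an error.
    if minimum is not None and value < minimum:
        return CheckResult("ERROR", f"{value} is below the minimum {minimum}")
    if maximum is not None and value > maximum:
        return CheckResult("ERROR", f"{value} is above the maximum {maximum}")

    # Previous value: a large relative change is flagged as a warning
    # (an assumption of this sketch, not a statement about Dataiku).
    if previous is not None and max_relative_change is not None and previous != 0:
        change = abs(value - previous) / abs(previous)
        if change > max_relative_change:
            return CheckResult("WARNING", f"changed by {change:.0%} since the last run")

    return CheckResult("OK", "within the expected range")


# Example: a row-count metric that must stay above 1,000 rows and within
# 10% of the previous run.
result = check_metric(9_500, minimum=1_000, previous=10_000, max_relative_change=0.10)
print(result.outcome, "-", result.message)
```

In a real project the metric value would come from the dataset or model being monitored, and a failing check would stop or flag the automated run.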
Scenarios and Triggers for Automation
Operating AI projects requires repetitive tasks such as loading and processing data and running batch scoring jobs. In Dataiku, scenarios and triggers automate these tasks, either on a periodic schedule or when specified conditions are met.
With automation in place, production teams can manage more projects and deliver more AI projects to production.
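The relationship between a scenario (a sequence of steps) and its triggers (a schedule or a condition) can be sketched in a few lines of plain Python. This is a conceptual illustration only, not Dataiku's implementation or API: the `Scenario` class, the `tick` function, and the scenario names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class Scenario:
    name: str
    steps: List[Callable[[], None]]
    interval: Optional[timedelta] = None             # time-based trigger
    condition: Optional[Callable[[], bool]] = None   # condition-based trigger
    last_run: Optional[datetime] = None

    def should_run(self, now: datetime) -> bool:
        # Time-based trigger: fire on the first evaluation, then once per interval.
        if self.interval is not None:
            if self.last_run is None or now - self.last_run >= self.interval:
                return True
        # Condition-based trigger: fire whenever the condition holds.
        if self.condition is not None and self.condition():
            return True
        return False

    def run(self, now: datetime) -> None:
        # Execute the steps in order, then record when the scenario ran.
        for step in self.steps:
            step()
        self.last_run = now


def tick(scenarios: List[Scenario], now: datetime) -> List[str]:
    """Run every scenario whose trigger fires; return the names that ran."""
    ran = []
    for scenario in scenarios:
        if scenario.should_run(now):
            scenario.run(now)
            ran.append(scenario.name)
    return ran
```

A scheduler would call `tick` periodically. A nightly batch-scoring scenario would set `interval=timedelta(days=1)`, while a reload scenario could pass a `condition` that returns `True` when new data has arrived.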
Code Notebooks, Recipes, and Environments
Dataiku is for coders and non-coders alike. Developers and advanced data scientists who prefer tools like Python or R can incorporate code into projects via notebooks or directly with code recipes and plugins.
Dataiku supports code notebooks for SQL, Python, and R, and code recipes developed in Python, R, SQL, Hive, Pig, Impala, Spark-Scala, PySpark, Spark/R, SparkSQL, and Shell. Dataiku also supports code environments for Python, R, and Conda, and it has a complete API for R.
Integrating with Git for code version management is required for development projects. Dataiku provides integration with Git, including version control of projects, importing Python and R code, developing reusable plugins, importing plugins, and more.
Dataiku includes robust APIs to integrate with external systems to create and manage AI and analytics projects. The Dataiku public API allows authorized users to interact via an external system, including administration, maintenance, and data access.
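As a rough sketch of how an external system might call the public API, the snippet below builds (but does not send) an authenticated HTTP request using only the Python standard library. The endpoint path `/public/api/projects/` and the use of the API key as the HTTP Basic username with an empty password are assumptions to verify against the API reference for your Dataiku version. The function name, host, and key are hypothetical.

```python
import base64
import urllib.request


def build_list_projects_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the project list (assumed endpoint path)."""
    url = base_url.rstrip("/") + "/public/api/projects/"
    # Assumed auth scheme: API key as the Basic-auth username, empty password.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical host and key, for illustration only.
req = build_list_projects_request("https://dss.example.com:11200/", "my-api-key")
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would return the response for an authorized key. In practice, a dedicated client library is usually more convenient than hand-built requests.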
Get Started with Dataiku
Start an online hosted trial, download the free edition, or compare the features of the Lite, Team, and Enterprise editions.