Self-Contained, Deployable Projects
Dataiku projects are the central place for all work and collaboration, and where teams create and maintain related data products. Each Dataiku project has a visual flow that represents the pipeline of data transformations and movement from start to finish.
A timeline of recent activity, automatic flow documentation, and project bundles make it easy to track changes and manage data pipeline versions in production.
Batch or Real-Time Deployments
Project bundles snapshot the data, logic, and dependencies needed to recreate and execute pipelines in QA or production environments. Run scheduled jobs, or expose elements as REST APIs to support real-time applications.
Dataiku’s central deployer oversees both types of deployments, while event logs and dashboards let data operators continuously monitor systems and detect issues.
Data Quality Metrics and Checks
Metrics in Dataiku automatically assess data or model elements for changes in quality or validity, and checks ensure that scheduled flows run within expected timeframes and that metrics deliver the expected results.
Configurable alerts and warnings give teams the control they need to safely manage production pipelines, without the tedium of constant manual monitoring.
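As a rough illustration of how a metric and a check relate, the sketch below computes a null-ratio metric over a few rows and then evaluates it against warning and error limits. It is plain Python, not Dataiku's API: the function `null_ratio`, the class `RangeCheck`, the thresholds, and the sample rows are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

Row = dict  # one record of a dataset, as a plain dict (assumption for this sketch)


def null_ratio(rows: list[Row], column: str) -> float:
    """Metric: fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)


@dataclass
class RangeCheck:
    """Check: compare a metric value against warning and error limits."""
    name: str
    warn_above: float
    error_above: float

    def evaluate(self, value: float) -> str:
        if value > self.error_above:
            return "ERROR"
        if value > self.warn_above:
            return "WARNING"
        return "OK"


# Hypothetical sample data: one of four rows has no email.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": "d@example.com"},
]

metric = null_ratio(rows, "email")
check = RangeCheck("email_null_ratio", warn_above=0.10, error_above=0.50)
status = check.evaluate(metric)
print(f"{check.name}={metric:.2f} -> {status}")  # email_null_ratio=0.25 -> WARNING
```

Keeping the metric (a measured value) separate from the check (a rule applied to that value) means the same measurement can feed several rules with different thresholds.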
Automation Scenarios and Triggers
With scenarios, Dataiku’s built-in scheduler, teams automate repetitive sequential tasks like loading and processing data, running batch scoring jobs, retraining models, updating documentation, and much more.
Operators may use the visual interface or execute scenarios programmatically using APIs, flexibly configuring partial or full pipeline execution based on time- and condition-dependent triggers.
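The idea of triggers gating a sequence of steps can be sketched in a few lines of plain Python. This is a conceptual illustration only and does not use Dataiku's scheduler or API: `time_trigger_due`, `data_trigger_due`, `run_scenario`, and the step names are hypothetical.

```python
from datetime import datetime, timedelta


def time_trigger_due(last_run: datetime, now: datetime, every: timedelta) -> bool:
    """Time-based trigger: fire once the interval since the last run has elapsed."""
    return now - last_run >= every


def data_trigger_due(last_seen_rows: int, current_rows: int) -> bool:
    """Condition-based trigger: fire when the source row count has changed."""
    return current_rows != last_seen_rows


def run_scenario(steps, should_run: bool) -> list[str]:
    """Run steps in order when triggered; stop at the first failing step."""
    log: list[str] = []
    if not should_run:
        return log
    for name, step in steps:
        ok = step()
        log.append(f"{name}: {'done' if ok else 'failed'}")
        if not ok:
            break
    return log


# Hypothetical steps; each callable returns True on success.
steps = [
    ("load_data", lambda: True),
    ("score_batch", lambda: True),
    ("retrain_model", lambda: False),  # simulated failure
    ("update_docs", lambda: True),
]

last_run = datetime(2024, 1, 1, 6, 0)
now = datetime(2024, 1, 2, 6, 0)
fire = time_trigger_due(last_run, now, timedelta(hours=24)) or data_trigger_due(1000, 1000)
log = run_scenario(steps, fire)
print(log)
```

In this run the time trigger fires because exactly 24 hours have elapsed, and execution stops at the simulated failure, so the documentation step is never reached.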
Smart Flow Operations
Interrupted connections, broken dependencies, and out-of-sync schemas are common pitfalls. Dataiku’s features for data operations and orchestration help teams avoid them.
Flow-aware tooling helps operators manage pipeline dependencies, check for schema consistency, and intelligently rebuild datasets and sub-flows to reflect recent updates.
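One way to picture a dependency-aware rebuild is as a graph problem: find every dataset downstream of a change, then rebuild those datasets in dependency order. The sketch below uses Python's standard `graphlib` module for this. It is an assumption-laden illustration, not a description of how Dataiku implements it, and the flow and dataset names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical flow: each dataset maps to the set of datasets it is built from.
flow = {
    "orders_clean": {"orders_raw"},
    "customers_clean": {"customers_raw"},
    "orders_enriched": {"orders_clean", "customers_clean"},
    "weekly_report": {"orders_enriched"},
}


def rebuild_plan(flow: dict[str, set[str]], changed: str) -> list[str]:
    """Return the datasets downstream of `changed`, in a valid build order."""
    # Collect everything that depends on `changed`, directly or indirectly.
    stale: set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for dataset, inputs in flow.items():
            if current in inputs and dataset not in stale:
                stale.add(dataset)
                frontier.append(dataset)
    # Order the whole flow topologically, then keep only the stale datasets.
    order = TopologicalSorter(flow).static_order()
    return [d for d in order if d in stale]


plan = rebuild_plan(flow, "customers_raw")
print(plan)  # ['customers_clean', 'orders_enriched', 'weekly_report']
```

Datasets on the `orders_raw` branch that do not depend on the change, such as `orders_clean`, are left out of the plan, which is the point of rebuilding only the affected sub-flow.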
APIs and Git Integration
Dataiku includes robust APIs so you can programmatically interact with and operate data projects from external systems and IDEs.
Git integration delivers project version control and traceability and enables teams to easily incorporate external libraries, notebooks, and repositories for both code development and CI/CD purposes.
Get Started with Dataiku
Start an online hosted trial or download the free edition.