The best choice usually depends on your setup, needs, and use cases; that said, many of our customers switched to Dataiku because of the depth of our integration into big data environments, the ease with which Dataiku pipelines are managed and pushed to production, and the collaboration features that keep a data team working together in a single orchestrating tool. Please contact us to find out if Dataiku is the right fit for you.
We’re glad you’re interested! We’re all about bringing the ecosystem together to better serve our customers and partners alike. Please contact us to find out how to become one of Dataiku’s trusted partners.
We have a Dataiku Free Edition that lasts forever and is perfect for an individual who is looking for a “small data” development environment.
Our Dataiku Enterprise Edition includes connectors to Hadoop data sources and enterprise SQL databases, plus automation, scheduling, and collaboration features to supercharge the productivity of your data team. Enterprise Edition pricing is based on a few metrics, including the number of users, the size of your infrastructure, and your functional needs. Please contact us to learn more about Dataiku’s pricing structure.
Dataiku is a “portmanteau” word combining Data (information that is produced or stored by a computer) and Haïku (a very short and structured form of Japanese poetry). How we pronounce Dataiku varies somewhat by region, but the two primary pronunciations are:
Dataiku is an enterprise-ready solution that is intended to be installed on an x86-64 server running any of the most popular Linux distributions. Dataiku connects to your existing big data infrastructure to orchestrate jobs that run locally, in-database, or in-cluster (Hadoop MapReduce or Spark).
Sizing your environment depends on the amount of data you plan to work with, the available computing engines, the number of concurrent users, the type of workload you will run, and other factors. Please contact us for sizing recommendations when working with Dataiku.
Find out more about how Dataiku fits into a big data architecture through our Learning portal.
Dataiku supports connections to dozens of data storage systems:
Find out more about how to connect to data through our Learning portal.
Dataiku enables numerous data preparation and shaping options through visual recipes that don’t require coding skills.
| Join with... | Split | Top N |
| Push to editable | Export to folder |
The scope of data wrangling techniques is increased by code written in any of Dataiku’s supported programming languages (see below), and code can be surfaced as new visual recipes and data processors using plugins.
Find out more about visual data preparation through our Learning portal.
Dataiku allows you to integrate code into your analytic workflow with code recipes using your favorite languages:
| PySpark | SparkR | Spark SQL | Scala Spark |
Additionally, Dataiku features notebooks for exploratory / experimental work using SQL, Python, and R code.
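As a rough illustration of the kind of logic a Python code recipe or notebook might hold, the sketch below keeps the top rows per group from a small table. It uses only the Python standard library and deliberately does not call Dataiku's own API; the function name `top_n_per_group`, the column names, and the sample data are all hypothetical. In a real recipe, the rows would come from a dataset in your project rather than from an inline string.

```python
import csv
import io
from collections import defaultdict


def top_n_per_group(rows, group_key, sort_key, n):
    """Return the n rows with the largest sort_key value within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    result = []
    for key in sorted(groups):
        # Values read from CSV are strings, so convert before comparing.
        ranked = sorted(groups[key], key=lambda r: float(r[sort_key]), reverse=True)
        result.extend(ranked[:n])
    return result


# Hypothetical sample data standing in for an input dataset.
SAMPLE_CSV = """region,customer,revenue
east,a,120
east,b,300
east,c,50
west,d,75
west,e,500
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
top_rows = top_n_per_group(rows, group_key="region", sort_key="revenue", n=2)

for row in top_rows:
    print(row["region"], row["customer"], row["revenue"])
```

Keeping the transformation in a plain function, separate from the code that reads and writes data, makes it straightforward to try out in a notebook first and reuse in a recipe afterwards.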
Find out more about integrating code into Dataiku through our Learning portal.
Dataiku uses machine learning libraries from scikit-learn, Spark MLlib, H2O, and Vertica for guided ML analyses that give you instant visual and statistical feedback on model performance. You can also integrate any external ML library (Vowpal Wabbit, TensorFlow, etc.) through code APIs to build custom models.
Find out more about how to build machine learning models through our Learning portal.
Dataiku supports a number of tools for engaging stakeholders and team members, including:
Find out more about how to create data visualizations and reports through our Learning portal.
We have a comprehensive Learning portal that includes:
We also regularly host and appear at events to share use cases, tips, and tricks. These are great places to learn, meet and connect with other users.
Whether something isn’t working the way you expected or you’re not sure from the Learning material how to achieve an analytic goal, there are resources to help you get on track:
Currently, the Dataiku interface is English-only.