Frequently Asked Questions

How does Dataiku compare to competing products?

The best choice usually depends on your setup, needs, and use cases; that said, many of our customers switched to Dataiku because of the depth of our integration into big data environments, the ease with which Dataiku pipelines are managed and pushed to production, and the collaboration features that keep a data team working together in a single orchestrating tool. Please contact us to find out if Dataiku is the right fit for you.

How do I become a Dataiku partner?

We’re glad you’re interested! We’re all about bringing the ecosystem together to better serve our customers and partners alike. Please contact us to find out how to become one of Dataiku’s trusted partners.

How much does Dataiku cost?

We have a Dataiku Free Edition that lasts forever and is perfect for an individual who is looking for a “small data” development environment.

Our Dataiku Enterprise Edition includes connectors to Hadoop data sources and enterprise SQL databases, plus automation, scheduling, and collaboration features to supercharge the productivity of your data team. Enterprise Edition pricing is based on a few metrics, including the number of users, the size of your infrastructure, and your functional needs. Please contact us to learn more about Dataiku’s pricing structure.

How do I pronounce Dataiku?

Dataiku is a “portmanteau” word combining Data (information that is produced or stored by a computer) and Haïku (a very short and structured form of Japanese poetry). How we pronounce Dataiku varies somewhat by region, but the two primary pronunciations are:

  • /ˈdeɪ.tɑː.ˌaɪ.kuː/

  • /dɑː.ˈtaɪ.kuː/

What architectures do you support?

Dataiku is an enterprise-ready solution that is intended to be installed on an x86-64 server running any of the most popular Linux distributions. Dataiku connects to your existing big data infrastructure to orchestrate jobs that run locally, in-database, or in-cluster (Hadoop MapReduce or Spark).

Your environment can be on-premises or hosted on a cloud service of your choice. Note that Dataiku does not host enterprise cloud services, but is available on AWS and Microsoft Azure.

Sizing your environment depends on the amount of data you plan to work with, the available computing engines, the number of concurrent users, the type of workload you will run, and other factors. Please contact us for sizing recommendations when working with Dataiku.

Find out more about how Dataiku fits into a big data architecture through our Learning portal.

What data connections do you support?

Dataiku supports connections to dozens of data storage systems:

[Image: data sources supported by Dataiku]

Find out more about how to connect to data through our Learning portal.

What data wrangling techniques do you support?

Dataiku enables numerous data preparation and shaping options through visual recipes that don’t require coding skills.

  • Sync
  • Prepare
  • Sample/Filter
  • Group
  • Distinct
  • Window
  • Join with...
  • Split
  • Top N
  • Sort
  • Pivot
  • Stack
  • Push to editable
  • Export to folder

The scope of data wrangling techniques is increased by code written in any of Dataiku’s supported programming languages (see below), and code can be surfaced as new visual recipes and data processors using plugins.
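To give a feel for what a code-based wrangling step might look like, here is a minimal sketch in plain Python of logic comparable to a Group step followed by a Top N step. It does not use any Dataiku API; the sample rows, column names, and the helper functions `group_sum` and `top_n` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; in practice these would come from a dataset.
orders = [
    {"customer": "alice", "amount": 30.0},
    {"customer": "bob", "amount": 12.5},
    {"customer": "alice", "amount": 20.0},
    {"customer": "carol", "amount": 45.0},
    {"customer": "bob", "amount": 7.5},
]


def group_sum(rows, key, value):
    """Sum `value` for each distinct `key`, similar in spirit to a Group step."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return [{key: k, "total_" + value: v} for k, v in totals.items()]


def top_n(rows, column, n):
    """Keep the n rows with the largest `column`, similar in spirit to Top N."""
    return sorted(rows, key=lambda r: r[column], reverse=True)[:n]


totals = group_sum(orders, "customer", "amount")
best = top_n(totals, "total_amount", 2)
```

Keeping each step as a small function that takes rows and returns rows makes the steps easy to chain and to test outside the platform.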

Find out more about visual data preparation through our Learning portal.

What programming languages do you support?

Dataiku allows you to integrate code into your analytic workflow with code recipes using your favorite languages:

  • Python, R, SQL, Shell
  • PySpark, SparkR, Spark SQL, Scala Spark
  • Hive, Impala, Pig

Additionally, Dataiku features notebooks for exploratory / experimental work using SQL, Python, and R code.
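As an illustration of the code recipes described above, the following is a rough sketch of how a Python recipe could be organized. It assumes the `dataiku` Python package that is available inside a Dataiku environment, and uses `dataiku.Dataset(...).get_dataframe()` and `write_with_schema(...)` as we understand them from the Dataiku Python API; please check the product documentation before relying on it. The dataset names `orders` and `orders_banded`, the `band` column, and the helper `add_amount_band` are hypothetical.

```python
def add_amount_band(rows, threshold=100.0):
    """Return copies of the rows with a hypothetical 'band' column added."""
    return [
        {**row, "band": "high" if row["amount"] >= threshold else "low"}
        for row in rows
    ]


def run_recipe(input_name="orders", output_name="orders_banded"):
    """Read, transform, and write. Only meant to run inside Dataiku."""
    # Imported here so the pure transformation above can be tested anywhere.
    import dataiku  # assumption: provided by the Dataiku environment
    import pandas as pd

    # Read the input dataset as a pandas DataFrame.
    df = dataiku.Dataset(input_name).get_dataframe()

    # Apply the transformation on plain Python records.
    rows = add_amount_band(df.to_dict("records"))

    # Write the result, letting the schema follow the DataFrame columns.
    dataiku.Dataset(output_name).write_with_schema(pd.DataFrame(rows))
```

Separating the transformation from the dataset input and output is a design choice, not a requirement: it lets you unit test `add_amount_band` on a laptop and call `run_recipe()` only inside the platform.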

Find out more about integrating code into Dataiku through our Learning portal.

What machine learning algorithms do you support?

Dataiku leverages machine learning libraries from scikit-learn, Spark MLlib, H2O, and Vertica for guided ML analyses that give you instant visual and statistical feedback on model performance. You can also integrate any external ML library (Vowpal Wabbit, TensorFlow, etc.) through code APIs to build custom models.

Find out more about how to build machine learning models through our Learning portal.

What data visualization and reporting methods do you support?

Dataiku supports a number of tools for engaging stakeholders and team members, including:

  • 25 types of interactive charts
  • Dashboards that allow you to share insights from your project
  • Web apps that allow you to create custom HTML, JavaScript (d3.js, Leaflet, plot.ly, …), and Python-based web applications

Find out more about how to create data visualizations and reports through our Learning portal.

How do I learn more about using Dataiku?

We have a comprehensive Learning portal that includes:

We also regularly host and appear at events to share use cases, tips, and tricks. These are great places to learn, meet and connect with other users.

Where do I find help in times of trouble?

Whether something isn’t working the way you expected or you’re not sure from the Learning material how to achieve an analytic goal, there are resources to help you get on track:

  • You can find answers in the in-product help; click the question mark (?) in the upper right of the screen to access it. The in-product search bar only works if the server hosting Dataiku DSS can access the internet; if it cannot, head to our help center for the online version.
  • Q&A is a community-supported question-and-answer board. If you have a question, chances are someone else has asked it too, and you may find the answer already in Q&A!
  • Support is Dataiku’s official support portal.
  • Intercom can be useful if you have a quick question.
  • Your account executive or customer success manager can pass feature requests to development.

How do I change the language of the interface?

Currently, the Dataiku interface is English-only.