Working with Code

Code in Dataiku Data Science Studio

Available Languages

Dataiku DSS provides APIs and coding environments to access and process data in your favorite language:

  • Python & PySpark
  • R & SparkR
  • SQL & SparkSQL
  • Shell
  • Scala Spark
  • Hive, Impala, & Pig

Prototyping in notebooks

You can use notebooks to prototype your recipes in Python, R, SQL, and Scala. These sandbox environments help you develop iteratively and quickly. Once your code is ready, just copy it into a recipe in your flow!
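For example, here is the kind of transformation you might sketch in a Python notebook cell before promoting it to a recipe. This is plain Python with no DSS-specific calls (the read/write boilerplate DSS generates around a recipe is omitted), and the column names are made up for illustration:

```python
# Prototyped in a notebook cell, then pasted into a Python recipe.
# Works on any list of dicts, e.g. rows read from a dataset.
def clean_rows(rows):
    """Normalize column names and drop rows with no 'price' value."""
    cleaned = []
    for row in rows:
        row = {key.strip().lower(): value for key, value in row.items()}
        if row.get("price") not in (None, ""):
            cleaned.append(row)
    return cleaned

sample = [{" Price ": "9.99", "Item": "book"}, {"Price": "", "Item": "pen"}]
print(clean_rows(sample))  # only the first row survives
```

Because the logic is self-contained, the same function runs unchanged whether it is called from a notebook or from a recipe in the flow.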

Find out how to use notebooks in DSS here.

Code environments

Different projects (or even recipes within a project!) may require different versions of Python or R. You can create any number of code environments, each running a specific version of Python or R with its own set of packages. You can then choose which environment runs your code at the DSS instance, project, or individual recipe/notebook/web app level.

Web applications

If you know how to code, you can fully customize your insights by creating custom web apps! Read our tutorial on how to create web apps here, and find out more about web apps in the data visualization portal.
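At its core, a web app pairs an HTML/JS front end with a backend endpoint that serves data. The standard-library sketch below illustrates that idea only; it is framework-agnostic and is not the DSS web app API itself, and the JSON payload is invented for the example:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class InsightHandler(BaseHTTPRequestHandler):
    """Serves a small JSON payload, standing in for computed insight data."""

    def do_GET(self):
        body = json.dumps({"rows": 3}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

# Serve on an ephemeral port and fetch the payload once, like a
# front-end chart would.
server = HTTPServer(("127.0.0.1", 0), InsightHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
with urllib.request.urlopen("http://127.0.0.1:%d/" % server.server_port) as resp:
    payload = resp.read().decode()
server.shutdown()
print(payload)
```

In a real web app the front end would poll such an endpoint and render the result as a chart or table.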

When should you code?

Whether you are a seasoned coder or a newcomer, visual tools can tremendously accelerate your work. Take a look at what can be done through the visual interface, especially regarding visual machine learning and the visual recipes.

Use Cases

Using geospatial data
Deep learning with Keras and TensorFlow

You can define deep learning architectures using Keras and TensorFlow in Dataiku’s Visual Machine Learning for a variety of applications, such as image processing, text analysis, and time series, in addition to models for structured data.
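As a rough sketch of what such an architecture definition looks like in Keras (the function name, signature, and layer sizes here are illustrative assumptions for a structured-data classifier, not the exact hook DSS expects):

```python
from tensorflow import keras

def build_model(input_shape, n_classes):
    # A small feed-forward classifier; the architecture body is what
    # you would customize in a visual deep-learning model.
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    return model
```

For images or text you would swap the Dense stack for convolutional or embedding layers, respectively.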