Loading Shapefiles in DSS

Applies to DSS 2.0 and above | September 28, 2015

The shapefile format, originally created by ESRI, is a standard format for spatial and geographic data. Data Science Studio provides built-in support for it.

In this example, we’ll use a file from US Census TIGER that contains the borders of US counties, along with some other information.

Create a new Dataset by uploading your Shapefile to DSS:

Uploaded dataset showing the individual elements of the shapefile
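A shapefile is a set of sibling files, not a single file. The format itself requires a .shp (geometry), a .shx (index) and a .dbf (attributes) file, while .prj (projection) is optional. As a minimal sketch, independent of DSS, the following checks that a set of file names contains the required parts before you upload them. The function name and the file names are hypothetical and only serve the illustration.

```python
from pathlib import Path

# The three files every shapefile must include; .prj (projection) is optional.
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf"}


def missing_components(filenames):
    """Return the sorted list of required extensions absent from filenames."""
    present = {Path(name).suffix.lower() for name in filenames}
    return sorted(REQUIRED_EXTENSIONS - present)


# Hypothetical file names, for illustration only.
files = ["counties.shp", "counties.dbf", "counties.prj"]
print(missing_components(files))  # prints ['.shx']
```

An empty list means all required parts are present.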

It should be recognized automatically as a Shapefile:

Preview of the shapefile, automatically recognized by Dataiku

Create your dataset. You will be taken to the Explore view, where you can see your Shapefile in a tabular format, with the geometry of each geographic object in the first column and its attributes in the following columns.

You can now use geographic processors in a visual data preparation script to work with your data, for instance to blend it with your internal data sources using geo-joins :)
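To give an idea of what a geo-join does conceptually, here is a minimal sketch in plain Python: each point is matched to the polygon that contains it, using a ray-casting test. This is not how DSS implements its geographic processors. The function names, region names and coordinates are made up for the illustration, and real county geometries (multi-polygons, holes) need more care than this simple single-ring case.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: True if (lon, lat) lies inside the ring of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def geo_join(points, regions):
    """Add to each point the name of the first region containing it (None if no match)."""
    joined = []
    for point in points:
        match = None
        for name, ring in regions.items():
            if point_in_polygon(point["lon"], point["lat"], ring):
                match = name
                break
        joined.append({**point, "region": match})
    return joined


# Made-up regions (simple squares) and points, for illustration only.
regions = {
    "square_a": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "square_b": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
points = [
    {"id": 1, "lon": 5, "lat": 5},
    {"id": 2, "lon": 15, "lat": 5},
    {"id": 3, "lon": 25, "lat": 5},
]

for row in geo_join(points, regions):
    print(row["id"], row["region"])
```

With this sample data, point 1 falls in square_a, point 2 in square_b, and point 3 matches no region.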