Draw the San Francisco Crime Map (Web App tutorial)

San Francisco has a very progressive and efficient open data policy. The city provides clean and fascinating datasets on various subjects: SFPD crime incidents, business locations, and even film locations. This policy has paid off, leading to numerous fun and insightful uses of the data.

Note: the SFPD crime data is used under the Open Data Commons PDDL license.

By following this post, you will be able to draw a map of San Francisco showing the number of crimes by year and location.

First, you should familiarize yourself with the Data Science Studio Web app editor.

This Howto will give you a quick glance at how it works.


You can either start this tutorial from the DSS interface or create a new project.

To use the tutorial project with a preloaded dataset, from the Dataiku homepage, click +New Project, select DSS Tutorials from the list, select the Visualizations panel, and select SFPD Incidents.

You can download the full dataset from the San Francisco Open Data Portal.

Explore and prepare the dataset

On the map, we are going to display the data with year-based filters. In order to do that efficiently, we are going to start by creating a new enriched dataset with a “year” column.

  • Open the dataset (either the included sample or the one you downloaded from the Open Data portal)
  • Create a new analysis
  • On the “Date” column, click the header, select “Parse date”, and select the proper format
  • Click on the new “Date_parsed” column, and choose “Extract date components”
  • Only extract the year to a new column named “year” (empty the column names for month and day)
  • Click on deploy script and create a new dataset, named for example “sfpd_enriched”
  • Don’t forget to build the newly created dataset
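For readers who prefer code to the visual steps above, the same enrichment can be sketched in pandas. This is only an illustration, not what DSS runs internally: the sample rows are made up, the helper name add_year_column is hypothetical, and the month/day/year date format is an assumption to check against your copy of the data.

```python
import pandas as pd

# Made-up sample rows; the real dataset has many more columns.
sample = pd.DataFrame({
    "Date": ["01/15/2014", "12/31/2004", "07/04/2010"],
    "Category": ["ASSAULT", "BURGLARY", "VANDALISM"],
})


def add_year_column(df, date_format="%m/%d/%Y"):
    """Return a copy of df with a parsed date and an integer 'year' column."""
    out = df.copy()
    # Assumed format: month/day/year. Adjust date_format if your export differs.
    out["Date_parsed"] = pd.to_datetime(out["Date"], format=date_format)
    out["year"] = out["Date_parsed"].dt.year
    return out


enriched = add_year_column(sample)
print(enriched[["Date", "year"]])
```

Working on a copy leaves the input dataframe untouched, which mirrors how deploying the script creates a new dataset instead of modifying the original one.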

Set up your web app

Create a new empty web app:

  1. In the top navigation bar, select Lab - Notebooks > Web apps
  2. Click + New Web App
  3. Select Standard
  4. Choose Empty web app and type a name for the web app

Since we’ll be reading a dataset, you need to authorize it in the security settings (click the Settings button and check “Read data” for your dataset). Don’t forget to click on the “Add snippet” link to update your JS code automatically.

We are going to use Leaflet to draw the map, jQuery to add a slider to select the year and d3.js to tune colors. Import these three libraries via the DSS interface.

Reminder: this Howto will give you more details about this.

You will need to source the jQuery UI add-ons at the beginning of your HTML code:

<!-- sourcing add-ons for jQuery -->
<link rel="stylesheet" href="//code.jquery.com/ui/1.11.4/themes/smoothness/jquery-ui.css">
<script src="//code.jquery.com/ui/1.11.4/jquery-ui.js"></script>

Create a map with Leaflet

Leaflet is a great Javascript library for creating maps.

As a first step, we will create a map with no crime information. We are going to use Leaflet to create a map object centered on San Francisco. Add this to the Javascript part of your app:

// Create a Map centered on San Francisco
var map = L.map('map').setView([37.76, -122.4487], 11);
// Add an OpenStreetMap(c) based background
var cartodb = L.tileLayer(
    'http://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png', {
        attribution:
            '&copy; <a href="http://www.openstreetmap.org/copyright">OpenStreetMap</a>'
            + ' contributors, &copy; <a href="http://cartodb.com/attributions">CartoDB</a>'
    });
map.addLayer(cartodb);

If you created your web app with the “Starter code for map visualizations”, you only need to change the line containing the “setView” call to center your map on San Francisco.

You also need to add an anchor for this map in the HTML. We’ll also add a title.

<div id='map'></div>

At this stage, the map does not appear in the output. This is because the anchor has no height and is therefore invisible. Set the anchor height to 400 px in the CSS. We’ll also add a bit of styling for later:

#map { height: 400px; }

path {
    stroke-width: 0;
}

Click on the “Save” button to update the output. (NB: you can also use Ctrl+Enter, or Ctrl+S to save your work)

Perfect! A beautiful map is appearing on the output! Now, let’s add some data to it.

Note: the map that you are seeing is displaying data from OpenStreetMap, a free and open world database, and the tiles (the actual images) are provided courtesy of CartoDB.

Web App with map of San Francisco

Load data in the Python Backend

As you might remember, a web app can retrieve the contents of a dataset in Javascript to be able to visualize it. However, with this dataset, there is a lot of data, and it would not be realistic to have it all in the browser.

We are going to use a Python backend to load the dataset into a Pandas dataframe and filter it by year and aggregate it by area. You can start with this post to understand better how to interact with the Python backend.

  • Go to the Python tab
  • Enable the Python backend
  • Remove the automatically generated code sample.
  • Remove the automatically generated Javascript code sample from the JS tab (last lines)

Then paste the following code in the Python backend editor. (You can test it in a Python notebook to better understand what’s happening.)

It lays a square lattice over the city and counts the number of incidents inside each square. The result is returned as JSON that our Javascript code will use to add information to the map.

import json

import dataiku
import pandas as pd

# import dataset - NB: update this to fit your dataset name
sfpd = dataiku.Dataset("sfpd_enriched").get_dataframe()
# Only keep points with a valid location and only the criminal incidents
sfpd = sfpd[(sfpd.X != 0) & (sfpd.Y != 0) & (sfpd.Category != "NON-CRIMINAL")]

# "app" is the Flask application that DSS provides to the backend
@app.route('/Count_crime')
def count_crime():
    year = 2014
    # filter data for the chosen year (copy so that we can safely add columns)
    tab = sfpd.loc[sfpd.year == year, ['X', 'Y']].copy()

    # group incident locations into 25 x 25 bins;
    # with retbins=True, pd.cut returns (bin index per row, bin edges)
    X_B = pd.cut(tab.X, 25, retbins=True, labels=False)
    Y_B = pd.cut(tab.Y, 25, retbins=True, labels=False)

    tab['X'] = X_B[0]
    tab['Y'] = Y_B[0]
    tab['C'] = 1

    # group incidents by binned locations
    gp = tab.groupby(['X', 'Y'])
    # and count them
    df = gp['C'].count().reset_index()
    # convert to plain ints so that json.dumps can serialize them
    max_cr = int(df.C.max())
    min_cr = int(df.C.min())
    gp_js = df.to_json(orient='records')

    # return a JSON containing incident count by location and location limits
    return json.dumps({
        "bin_X": list(X_B[1]),
        "bin_Y": list(Y_B[1]),
        "NB_crime": json.loads(gp_js),
        "min_nb": min_cr, "max_nb": max_cr
    })

When you run the web app, the Python backend automatically starts. You can find the logs of the Python backend in the “Log” tab next to your Python code. You can click on the Refresh button to get up-to-date logs.
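If the binning step in count_crime is unclear, the following toy example may help. It is a minimal sketch on made-up values (not real coordinates) showing what pd.cut returns when called with retbins=True and labels=False, as in the backend code.

```python
import pandas as pd

# Made-up values, chosen so that the bins are easy to check by hand.
x = pd.Series([0.0, 1.0, 2.5, 9.0, 10.0])

# With retbins=True, pd.cut returns a tuple: (bin index per value, bin edges).
# labels=False makes the first element integer bin indices instead of intervals.
bin_index, bin_edges = pd.cut(x, 5, retbins=True, labels=False)

print(bin_index.tolist())  # one integer between 0 and 4 per input value
print(len(bin_edges))      # 5 bins are delimited by 6 edges
```

Bin index i designates the interval between bin_edges[i] and bin_edges[i + 1], which is why the response carries both the per-square indices (NB_crime) and the edges (bin_X, bin_Y).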

Query data from the backend and draw it on the map

Add the following sample to the Javascript part of your app to query the Python backend and draw the result on the map.

The function draw_map calls the Python backend to retrieve the lattice, then goes through each lattice square and draws it on the map with an appropriate color (the redder the square, the more crimes).

var draw_map = function() {
    //request python backend aggregated data
    $.getJSON(getWebAppBackendUrl('Count_crime')).done(function(data) {
        //then draw data on map

        //use d3 scale for color map
        var cmap = d3.scale.sqrt()
            .domain([data.min_nb, data.max_nb])
            //few crimes in blue, many crimes in red (the start color is a free choice)
            .range(["steelblue", "red"]);

        for(var i = 0; i < data.NB_crime.length; i++) {
            //retrieve corners of square as [latitude, longitude] pairs
            var C1 = [data.bin_Y[data.NB_crime[i].Y],data.bin_X[data.NB_crime[i].X]];
            var C2 = [data.bin_Y[data.NB_crime[i].Y],data.bin_X[data.NB_crime[i].X+1]];
            var C3 = [data.bin_Y[data.NB_crime[i].Y+1],data.bin_X[data.NB_crime[i].X+1]];
            var C4 = [data.bin_Y[data.NB_crime[i].Y+1],data.bin_X[data.NB_crime[i].X]];

            //draw square with color coding for the number of crimes
            var polygon = L.polygon([C1,C2,C3,C4], {
                    color: cmap(data.NB_crime[i].C)
            });
            polygon.addTo(map);
        }
    });
};

draw_map();

Great! We have the map of San Francisco with a transparent lattice representing the number of crimes by area for the year 2014.

However, apart from panning and zooming, our application is not very interactive. We are going to add a slider to select the year to display. Each time we move the slider, the backend is called to process the data for the selected year.

Web App displaying lattice of crime by area in San Francisco

Add a slider to dynamically select the year

Add an anchor for the slider to the HTML and an input to display the year selected.

    <label for="amount"> Year:</label>
    <input type="text" id="amount" readonly style="border:0; color:#f6931f; font-weight:bold;">
<div id ='slider'></div>

Now, we are going to slightly change the draw_map function to pass the selected year to the Python backend.

var draw_map = function(year) {
    //request python backend aggregated data
    $.getJSON(getWebAppBackendUrl('Count_crime'), {year:year})
        .done(function(data) {
            //... same drawing code as before ...
        });
};

We now pass the argument year to the function draw_map. Note that we added the object {year:year} to our request to the backend; jQuery sends it as a query-string parameter.

In the backend, we’ll retrieve the passed argument and modify the count_crime function.

The “routes” in the backend are made with Flask. Let’s import the functions to access the parameters.

from flask import request

And modify count_crime:

def count_crime():
    year = int(request.args.get("year"))
    # filter data for the chosen year
    tab = sfpd[['X','Y']][(sfpd.year == year)]
    # ... the rest of the function is unchanged ...
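One caveat: request.args.get("year") returns None when the parameter is missing, and int(None) raises a TypeError. A possible safeguard is sketched below. The helper parse_year is hypothetical (it is not part of Flask or DSS), and the 2004 to 2014 bounds are simply taken from the slider used in this tutorial.

```python
def parse_year(raw, default=2014, lo=2004, hi=2014):
    """Convert a raw query-string value to a year within [lo, hi].

    Falls back to `default` when the value is missing or not an integer.
    """
    try:
        year = int(raw)
    except (TypeError, ValueError):
        return default
    # Clamp out-of-range years to the nearest bound
    return min(max(year, lo), hi)


print(parse_year("2010"))  # valid value
print(parse_year(None))    # missing parameter
print(parse_year("abc"))   # not a number
print(parse_year("1999"))  # out of range
```

In the backend, this would be used as year = parse_year(request.args.get("year")), so a malformed request falls back to a default year instead of failing.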

Finally, we append this sample to the Javascript part. We add a slider and a function to clear the map each time we change the year.

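function clearMap() {
    for(i in map._layers) {
        //only vector layers (our squares) have a _path; tiles are kept
        if(map._layers[i]._path != undefined) {
            try {
                map.removeLayer(map._layers[i]);
            } catch(e) {
                console.log("problem with " + e + map._layers[i]);
            }
        }
    }
}

//Create a slider to select the year
$('#slider').slider({
    value: 2014,
    min: 2004,
    max: 2014,
    step: 1,
    create:function(event, ui ) {
        //draw the map for the initial year
        draw_map($(this).slider('value'));
    },
    change: function(event, ui) {
        $('#amount').val( ui.value );
        clearMap();
        draw_map(ui.value);
    }
});

$('#amount').val( $('#slider').slider('value') );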

You now have a beautiful interactive map of hot areas in San Francisco.

Web App of crime in San Francisco with a slider to choose the year

Go ahead and add more information or selectors to your app. You could try to correlate business areas with thefts or, why not, see whether trees have a calming effect on criminal activity.