
Use the webapp Python Backend

June 22, 2015

A webapp can retrieve the contents of a dataset in JavaScript in order to visualize it. However, with large datasets, it is often not realistic to do all the processing in the browser.

Furthermore, the JS API only provides simple dataset access. For example, it cannot perform SQL queries.

DSS custom webapps provide a Python backend that lets you process and aggregate datasets on the server rather than in the browser.

It enables you to work with data that would be too large to process in your browser, and gives you access to the pandas library to filter and aggregate your data, or to anything else you could do in Python.

This post explains how to query the Python backend from the client with JavaScript. We are going to dissect the automatic sample that DSS generates when you enable the Python backend for the first time.

Then we will go through a slightly more complex example: displaying the top film directors in San Francisco!

Python Backend Hello World

Open a new webapp, click on the Python tab, and then click on the Enable python backend link.



When you enable the Python backend, DSS automatically generates samples in the Python and JavaScript panels to illustrate how to interact with the backend:



The JavaScript uses the jQuery function $.getJSON() to call the function first_call() in the Python backend and retrieve its response.

$.getJSON calls the Python backend at a URL of the form:

   http://dss_host:dss_port/html-apps-backends/the_id_of_the_webapp/first_api_call
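
To make the URL pattern explicit, here is a small Python sketch that assembles it from its parts. The helper name backend_url is hypothetical and only illustrates the pattern shown above; the host, port and webapp id remain placeholders that you replace with your own values.

```python
def backend_url(host, port, webapp_id, route):
    """Assemble the URL of a backend route from its parts.

    backend_url is a hypothetical helper; all four arguments are
    placeholders supplied by the caller.
    """
    # Strip any leading slash so the route is joined exactly once.
    return "http://{}:{}/html-apps-backends/{}/{}".format(
        host, port, webapp_id, route.lstrip("/"))


# Using the placeholder names from the URL pattern above.
print(backend_url("dss_host", "dss_port", "the_id_of_the_webapp", "/first_api_call"))
```

The route part ("first_api_call" here) is the piece that the backend maps to a Python function.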

The URL is passed as the first argument of the $.getJSON function:

   $.getJSON("/html-apps-backends/mZA2nTM/first_api_call",

To link the URL to the Python function, we have to add the app.route decorator before the function declaration:

   @app.route('/first_api_call')

first_call() then returns a JSON dictionary, which $.getJSON() parses and passes to the function given as its second argument. Here we just print the result in the JavaScript console:

   , function(data) {
       console.info(data);
   });
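
To see how a route decorator can link a URL to a function, here is a self-contained sketch. MiniApp is a hypothetical stand-in written for this illustration, not the app object that DSS provides, and the payload returned by first_call() is invented; the sample generated by DSS may return something else.

```python
import json


class MiniApp(object):
    """Hypothetical stand-in for the backend's `app` object (illustration only)."""

    def __init__(self):
        self.routes = {}

    def route(self, path):
        # Return a decorator that records `func` as the handler for `path`.
        def decorator(func):
            self.routes[path] = func
            return func
        return decorator

    def dispatch(self, path):
        # Look up the handler registered for `path` and call it.
        return self.routes[path]()


app = MiniApp()


@app.route('/first_api_call')
def first_call():
    # Hypothetical payload, serialized to a JSON string for the client.
    return json.dumps({"status": "ok", "message": "hello from the backend"})


# Roughly what the client receives when it requests /first_api_call.
response = app.dispatch('/first_api_call')
print(json.loads(response))
```

In the real backend you only write the decorated function; the registration and dispatching are handled for you.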

Open the JavaScript console and observe the result.

Phew, this was a little tedious and verbose, but we now know how to call the Python backend from JavaScript.

We queried the backend and printed the answer, which is nice but not very useful yet. Let's do some data processing in the backend.

Let's build our first interactive app with a Python backend.

Top film directors

We are going to build a small application that shows directors and the number of films they shot in San Francisco. We will sort the directors by that number, thereby showing which film directors love San Francisco the most!

  • Retrieve the dataset on San Francisco film locations and upload it to the studio.

  • Create a new webapp. Then, import the Bootstrap library in your webapp:



  • First, we load the dataset in the backend and aggregate it by film title (to have one row per film):
import dataiku
from dataiku import pandasutils as pdu
import pandas as pd

# Retrieve the film locations
sf_films = dataiku.Dataset(
    "SFMAP.Film_Locations_in_San_Francisco").get_dataframe()

# Aggregate data by film; the parentheses let the chained calls span several lines
San_francisco_film = (sf_films.groupby(['Title'])
          .first()[['Release Year', 'Director', 'Actor 1', 'Actor 2', 'Actor 3']]
          .reset_index())
  • We then aggregate it by director to retrieve the number of films per director. We also sort the result by the number of films.
# Aggregate data by director and sort directors by the number of films
# shot in San Francisco
count_Director = (San_francisco_film
                .groupby(['Director']).Title.count())
# sort_values returns a new Series (recent pandas versions no longer provide Series.sort)
count_Director = count_Director.sort_values(ascending=False)
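
If you want to try the two aggregations outside DSS, the following sketch reproduces them on a tiny in-memory table. The rows are invented for the example and only three of the columns are kept; it also assumes a recent pandas version, where sorting a Series is done with sort_values.

```python
import pandas as pd

# Hypothetical rows: one row per filming location, so a film can appear several times.
sf_films = pd.DataFrame({
    "Title": ["Film A", "Film A", "Film B", "Film C", "Film C"],
    "Release Year": [2001, 2001, 1995, 2010, 2010],
    "Director": ["Director X", "Director X", "Director X",
                 "Director Y", "Director Y"],
})

# One row per film: keep the first row seen for each title.
films = (sf_films.groupby("Title")
         .first()[["Release Year", "Director"]]
         .reset_index())

# Number of films per director, largest count first.
count_by_director = (films.groupby("Director")["Title"]
                     .count()
                     .sort_values(ascending=False))

print(count_by_director)
```

Grouping by title first matters: without it, a director would be credited once per filming location rather than once per film.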

  • Then, we define a function to return our aggregated dataset to the client. We use the handy pandas to_json() method to automatically convert our aggregated data to JSON.

@app.route('/director_table')
def director_table():
    return count_Director.to_json(orient='index')
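
To get a feel for what the client receives, this sketch calls to_json(orient='index') on a small Series and decodes the result again. The counts are invented and stand in for count_Director.

```python
import json

import pandas as pd

# Hypothetical counts standing in for count_Director.
counts = pd.Series([3, 1], index=["Director X", "Director Y"], name="Title")

# to_json returns a JSON string mapping each index label to its value.
payload = counts.to_json(orient="index")

# Decoding it gives the dictionary that the client-side callback works with.
decoded = json.loads(payload)

for director, n_films in decoded.items():
    print(director, n_films)
```

Each director name becomes a key and each film count a value, which is the shape the JavaScript callback iterates over.
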
  • We call the Python function from JavaScript with $.getJSON and display the result in the JavaScript console.
$.getJSON("/html-apps-backends/mZA2nTM/director_table",
          function(data) {
              console.log(data);
          });

In the console, you can see that your pandas data has been properly returned in the form of a JSON dictionary:



  • Now, we would like to display the directors table in the output of our webapp. First we add a container for our table in the HTML part of our editor:
<div class="container">
  <button id='get_dir' class="btn btn-default" type="submit">Get directors</button>
  <button id='hide_dir' class="btn btn-default" type="submit">Hide directors</button>
  <h2>Top Film Directors Of San Francisco</h2>
  <table class="table table-hover">
    <tbody id="directors">
    </tbody>
  </table>
</div>
  • We use jQuery to add rows to the table body:
var getDirectors = function() {
    $.getJSON("/html-apps-backends/mZA2nTM/director_table",
        function(data) {
            // data maps each director to a number of films
            $.each(data, function(key, value) {
                console.log(key);

                var entry = "<td>" + key + "</td> <td>" + value + "</td>";
                var line = "<tr class='director'>" + entry + "</tr>";
                $("#directors").append(line);
            });
        });
};
getDirectors();
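
The same row-building logic can be mirrored in Python, for instance if you preferred to prepare the rows in the backend. This is only a sketch of an alternative, not something the webapp above requires; render_rows is a hypothetical helper, and the data is invented. Unlike the string concatenation above, it escapes the values so that a name containing HTML characters cannot break the table.

```python
from html import escape


def render_rows(counts):
    """Build one <tr> per (director, count) pair, escaping both values."""
    rows = []
    for director, n_films in counts.items():
        entry = "<td>{}</td> <td>{}</td>".format(
            escape(str(director)), escape(str(n_films)))
        rows.append("<tr class='director'>{}</tr>".format(entry))
    return "\n".join(rows)


# Hypothetical data, including a name that would break unescaped HTML.
print(render_rows({"Director X": 3, "A <b>& B": 1}))
```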

You should now see a classy table in the output of our insight:



Add interactivity

We would like to let the user show or hide the table on demand.

  • Add two buttons in the HTML:
<div class="container">
  <button id='get_dir' class="btn btn-default" type="submit">Show directors</button>
  <button id='hide_dir' class="btn btn-default" type="submit">Hide directors</button>
...
  • Link the click event of the two buttons to showing or hiding the table with jQuery:
$("#get_dir").on("click", function() {
    $("table").show();
});

$("#hide_dir").on("click", function() {
    $("table").hide();
});
  • Congratulations! You have built your first interactive app that calls the Python backend.