sample project

Forecasting Sales

April 01, 2016

This sample project is based on data from a Kaggle challenge.

Many retail businesses need accurate forecasts of the revenue produced by each of their stores. These forecasts support planning and staffing optimization, and help ensure that each store has the necessary supply. Without them, a business may waste money by overstocking a store or, worse yet, lose revenue because a store does not have enough stock to meet demand.

In this project, we use historical data from the Rossmann pharmacy chain to build a predictive model that forecasts the revenue of each of its stores. This model can be run weekly or monthly to give business stakeholders predictions of revenue for the coming days or weeks. This information can then be used to optimize business practices and streamline operations.

Business Goal

We want to build a project to answer the following questions:

  • What is the expected revenue for each store on each day?
  • What factors influence the revenue of a store most?

How do we do this?

We start with two different data sources:

  • two datasets with the revenue per store per day, split between our historical data (used to train the model) and our forecasting data (used to deploy our model)
  • a dataset with information about each store

Like many data projects, we then proceed with three steps:

  1. Data Cleaning: we clean our data and build our features
  2. Predictive Modeling: we build and deploy a predictive model
  3. Visualization: we create a useful visualization of our predicted data
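
To make the split between historical and forecasting data concrete, here is a minimal sketch in plain Python. It is not the model built in this project: it swaps in a naive baseline (the average revenue per store and weekday) purely to show how one dataset is used for training and the other for scoring. All row values and the names `historical`, `to_forecast`, `fit_baseline` and `predict` are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical historical rows: (store_id, day, revenue). Values are made up.
historical = [
    (1, date(2015, 7, 6), 5200.0),   # Monday
    (1, date(2015, 7, 13), 4800.0),  # Monday
    (1, date(2015, 7, 7), 4100.0),   # Tuesday
    (2, date(2015, 7, 6), 9000.0),   # Monday
]

# Hypothetical forecasting rows: same keys, revenue not yet known.
to_forecast = [
    (1, date(2015, 8, 3)),  # Monday
    (1, date(2015, 8, 4)),  # Tuesday
    (2, date(2015, 8, 4)),  # Tuesday, a weekday never observed for store 2
]


def fit_baseline(rows):
    """Average revenue per (store, weekday), plus a per-store fallback."""
    by_store_day = defaultdict(list)
    by_store = defaultdict(list)
    for store_id, day, revenue in rows:
        by_store_day[(store_id, day.weekday())].append(revenue)
        by_store[store_id].append(revenue)
    return (
        {key: mean(values) for key, values in by_store_day.items()},
        {key: mean(values) for key, values in by_store.items()},
    )


def predict(model, store_id, day):
    """Use the (store, weekday) average if available, else the store average."""
    by_store_day, by_store = model
    key = (store_id, day.weekday())
    if key in by_store_day:
        return by_store_day[key]
    return by_store[store_id]


# "Train" on the historical data, then score the forecasting data.
model = fit_baseline(historical)
forecasts = [(s, d, predict(model, s, d)) for s, d in to_forecast]

for store_id, day, revenue in forecasts:
    print(store_id, day.isoformat(), revenue)
```

A baseline like this is mainly useful as a reference point: a trained predictive model should do noticeably better than it to be worth deploying.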

Let's go through each one of those steps in more detail to see what we did.

Explore this sample project

  • Flow

    Start by looking at the flow and visualising the different steps of the project. You can see the preparation steps in yellow and the predictive modelling steps in green. 

  • Visual Preparation

    We used a preparation script to parse dates and engineer features from them. Dates are a data type common to many datasets, and the relevant information in a date column usually has to be extracted before it becomes useful.

  • Join recipe

    We then used a join recipe to *enrich our data* with metadata about each store. This gives us more features, which will be fundamental for the next step: predictive modelling.

  • Model

    We built a model to predict the revenue for each store as accurately as possible. This project can be used in production to regularly produce forecasts for the coming week or month. The business can then use these numbers to optimize staffing or stock at each store.

  • Feature Importance

    We can check the variable importance to see which factors matter most in predicting each store's revenue. Doing so shows that the most important predictors of revenue are:
    - The day of the week
    - Whether there's a sale or not
    - How far the store is from a competitor's store 

  • Dashboard

    To communicate our model's results, we built a dashboard with visualizations of the model's predictions. Compared with an Excel-style table, these visualizations let a team quickly get a feel for the data and the revenue forecasts.

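
The preparation and join steps described above can be sketched outside DSS in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's actual recipes: the column names (`store`, `date`, `promo`, `revenue`, `store_type`, `competition_distance`), the date format and all values are hypothetical, chosen to mirror the predictors mentioned above (day of the week, sales, distance to a competitor).

```python
from datetime import datetime

# Hypothetical daily revenue rows; column names and values are assumptions.
sales = [
    {"store": 1, "date": "2015-07-06", "promo": 1, "revenue": 5200.0},
    {"store": 2, "date": "2015-07-11", "promo": 0, "revenue": 9000.0},
]

# Hypothetical store metadata, keyed by store identifier.
stores = {
    1: {"store_type": "a", "competition_distance": 1270.0},
    2: {"store_type": "c", "competition_distance": 570.0},
}


def add_date_features(row):
    """Parse the date string and derive calendar features from it."""
    day = datetime.strptime(row["date"], "%Y-%m-%d").date()
    return {
        **row,
        "day_of_week": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
        "month": day.month,
        "year": day.year,
        "is_weekend": day.isoweekday() >= 6,
    }


def join_store_info(row, store_table):
    """Left join: keep the row even if the store has no metadata."""
    info = store_table.get(row["store"], {})
    return {**row, **info}


# Prepare each row, then enrich it with the store's metadata.
prepared = [join_store_info(add_date_features(r), stores) for r in sales]

for row in prepared:
    print(row)
```

Each prepared row now carries both the calendar features and the store attributes, which is the shape a predictive model expects as input.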

Ready to enter Dataiku DSS?

If you have never used DSS, it may be worth familiarizing yourself with DSS concepts first.


This sample is already available in your DSS!

From your DSS home page, click on "Sample projects".

If your DSS server doesn't have Internet access, you can download this sample and import it manually (click on "Import project")

Don't have Dataiku DSS yet? Try for free now

