Uplift modeling

How to do uplift modeling in Dataiku

We’re (again) a major telecom operator. Just like pretty much any company in the world, we are concerned with keeping our customers happy, so they won’t leave us. In other words, we want to reduce churn. To do this, we set up a task force of data analysts and people from our business teams who came up with several business goals to reduce churn.

So in a way described in the “Predicting churn project”, we created a model to predict churn. In conjunction with the marketing team, we decided on a few actions to take for likely churners. In order to measure the usefulness of these actions, we set up an A/B test. A random sample of around half of the customers were targeted. the result of this A/B test is what we analyze in this project.

Before starting, we recommend you have a look at the following link to be familiar with uplift modeling theory.

Business Goal

  • Target clients with more effective advertising based on their usage profiles
  • Retrieve customers with very high likeliness of churn so we could get in touch and offer them special deals before they even thought of leaving

How Did We Do This ?

  • The simplest approach consist in targeting likely churners. But likely churners may not be the people the more likely to react positively to our treatment. In our case we suppose there are 4 main types of people :
    • lost cause : highly likely to churn and unlikely to react positively to any marketing action. We should not bother targeting these people.
    • persuadable : highly likely to churn but likely to react positively to any marketing action. This is our real target !
    • sure thing : unlikely to churn whatever happens. Targeting them is a waste of money.
    • sleeping dog : unlikely to churn unless targeted by a marketing action. These are the people that had forgotten they are paying a subscription or hate to be targeted by marketing actions. This segmentation is not known before the A/B test. It is indeed a hidden variable that impacts our outcome : churn or not.
  • Contrary to churn modeling, uplift is about predicting the gain of probability when the user is targeted by a marketing action. This enable us to target the customers most likely to respond well to our actions.

Thus, if we denote X the covariates, T the treatment set (targeted by marketing) and C the control set, uplift is about estimating :

                 U = P(churn=0| X, T) - P(churn=0| X, C)
  • There are three common ways to do so :
    • model separately the two probabilities and do the difference
    • use a change of target variable
    • use specific machine learning models (out of the scope of this project)

Explore This Sample Project