
AB test calculator

Perform A/B testing in the most enjoyable way!

 

AB testing analysis web app

Plugin information

Version: 1.0.1
Author: Dataiku (Marine SOBAS, Du PHAN)
Released: 2021-01
Last updated: 2021-03-01
License: Apache Software License
Source code: GitHub
Reporting issues: GitHub

Sometimes, a standard A/B test is more relevant than a black-box model to address today’s challenges. For years, A/B testing has been a reliable method to compare two variants of the same ad, website, drug, or machine learning model. This A/B test calculator provides features to design your own A/B tests and analyse their outcomes inside DSS.

In its first version, the plugin focuses on A/B tests with rate metrics. So, it is a perfect match if you want to optimise a success rate such as a click-through rate, a conversion rate or a cure rate. 

How to set up

When you install this plugin, you will need to build its code environment. Note that Python version 3.6 is required.

Code environment creation

How to use

This plugin will assist you in setting up the experiment and interpreting its results. It relies on two main steps:

  • Experimental design: compute the minimum sample size (1) and split your experiment population into an A and a B group (2).
  • Results analysis: retrieve a statistical summary of your experiment (3) and analyse the outcome of the test (4).
A/B testing workflow

 

Design the experiment

During this first step, you will estimate the minimum sample size required by the experiment with the A/B test sample size calculator and split your population into two groups using the Population split recipe.

1. A/B test sample size calculator

A statistically relevant A/B test requires a minimum sample size. This web app computes it and saves the experiment parameters into a managed folder.

Access the sample size calculator
  • Open the </> tab and select Webapps.
  • Create a new visual web app
Visual web app creation
  • Choose the AB test sample size calculator.
Choose the A/B test sample size calculator
Setting up the A/B test sample size calculator web app

From the settings tab of the web app, please specify the following parameter:

  • Parameters folder: a managed folder in which to store the parameters and sample sizes. You can create the folder from the drop-down menu.
Sample size calculator settings

Once you set up the web app, click on “Save and view webapp”.

Computing the sample size

The minimum sample size depends on the input parameters of the web app, namely:

  • Baseline success rate (%): success rate of the baseline variant. If this value is difficult to estimate, set it to 50%, or to your average success rate.
  • Minimal detectable effect (%): the smallest change in the baseline success rate that you want the test to detect as significant. This use-case-dependent value has a strong impact on the sample size. You may base it on the minimum revenue expected from the new variant B compared to A. For instance, suppose variant B costs $X more than A but might increase A’s baseline purchase rate by D%. For B to be profitable, it must increase the purchase rate by at least X divided by the baseline revenue. This is your minimum detectable effect.
  • Daily number of people exposed
  • Percentage of the traffic affected

From these values, a minimum sample size is computed and illustrated by a chart of the distributions. The sample size computation relies on [1].
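To give an intuition of how these inputs interact, here is a minimal sketch of a common sample size formula for comparing two proportions. It is not the plugin's code and may differ from the computation in [1]. The function names, the default `alpha` and `power` values, the reading of the minimal detectable effect as an absolute change, and the traffic figures are all assumptions made for illustration. It uses `statistics.NormalDist`, which requires Python 3.8 or later.

```python
import math
from statistics import NormalDist


def min_sample_size(baseline_rate, mde, alpha=0.05, power=0.8, two_tailed=True):
    """Per-group sample size to detect an absolute change `mde` in a success rate.

    Assumption: classic normal-approximation formula for two proportions,
    with equally sized groups.
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde
    p_bar = (p1 + p2) / 2
    # Critical values of the standard normal distribution
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2 if two_tailed else 1 - alpha)
    z_power = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / mde ** 2)


def experiment_duration_days(n_per_group, daily_exposed, traffic_share):
    """Days needed to enrol both groups, given the daily number of people
    exposed and the share of that traffic included in the experiment."""
    return math.ceil(2 * n_per_group / (daily_exposed * traffic_share))


# 40% baseline and 7-point effect; the traffic figures are hypothetical
n = min_sample_size(baseline_rate=0.40, mde=0.07)
days = experiment_duration_days(n, daily_exposed=1000, traffic_share=0.5)
print(n, days)
```

A smaller minimal detectable effect or a lower share of traffic both lengthen the experiment, which is why these two inputs deserve the most care.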

For instance, a user might want to optimise the open rate of her email campaigns. Therefore, she plans to compare two different sending times, 9 am and 2 pm, during an A/B test. The average open rate is 40%, so inside the A/B test sample size calculator, she sets a 40% baseline success rate. For her, an email campaign outperforms another if the open rate is 7% higher, so she defines a 7% minimum detectable effect. Finally, she sets a 90% statistical significance. Hence, she is confident that 90% of the time, if there is an actual difference between the two variants, the test will detect it. This leads to the following input parameters:

An example of input parameters
Saving the parameters

When you click on the Save parameters button, the parameters and the sample sizes are saved in the Parameters folder.

Save size in the input folder

When you open this folder, you can see your parameters stored in JSON files:

Managed folder parameters

2. Population split

This recipe splits the users enrolled in the experiment into two groups, usually based on the sample sizes which were previously computed in the AB test sample size calculator.

Access the population split recipe

To create your first recipe, navigate to the Flow, click on the + RECIPE button and access the AB test calculator menu.

Create a population split recipe

If your input dataset and/or your input folder is selected, you can find the plugin directly in the right panel.

Select recipe in the right panel

Then, select the population split recipe.

The population split recipe
Input dataset
  • Population dataset: dataset with the references of the users involved in the experiment (IDs, emails…) stored in one of the columns. If your experiments run on Hubspot, check out the Hubspot plugin to retrieve your contact list in a dataset format.
Input dataset for the population split recipe
  • Parameters folder (optional): folder containing the parameters computed in the AB test sample size calculator mentioned previously. You can also define the sizes manually within the settings of the recipe.
Output dataset
  • Experiment dataset: the input dataset with an extra column containing the group indicator used for the AB test (A or B).
Output dataset for the population split recipe
Settings

Review the recipe parameters:

  • User reference: column containing the user reference (user ID, email…). Each user should have a unique reference.
  • Sample size definition: do you want to retrieve the sample sizes from the web app or edit them manually?

If you input a Parameters folder, choose the option “Retrieve values from web app”. Otherwise, you may define the sample sizes manually.

If you want to retrieve the sample sizes from the parameters folder, choose the following parameter:

  • Parameters (computed in the web app): choose which JSON file contains the right parameters and sample sizes. Please make sure that your input dataset contains enough users given the sample sizes specified in the file.
Select sizes from the managed folder Parameters

If you want to edit sizes manually, specify the following parameters:

  • Sample size for variation A: minimum sample size for the A group
  • Sample size for variation B: minimum sample size for the B group

Please make sure that your input dataset contains enough users given the sample sizes specified in the settings.

Edit sizes manually
  • Deal with leftover users: if the population is larger than the requested sample sizes, this field specifies which group the leftover users should go to.
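The splitting logic described above can be sketched in a few lines of plain Python. This is an illustration only, not the recipe's implementation: the function name, the `leftover` argument and its accepted values, and the fixed random seed are assumptions.

```python
import random


def split_population(user_ids, size_a, size_b, leftover="A", seed=0):
    """Randomly assign each user reference to group A or B.

    The first `size_a` shuffled users go to A, the next `size_b` to B,
    and any remaining users go to the `leftover` group (hypothetical option).
    """
    if size_a + size_b > len(user_ids):
        raise ValueError("Not enough users for the requested sample sizes")
    shuffled = list(user_ids)
    # A seeded generator keeps the split reproducible
    random.Random(seed).shuffle(shuffled)
    groups = {}
    for position, user_id in enumerate(shuffled):
        if position < size_a:
            groups[user_id] = "A"
        elif position < size_a + size_b:
            groups[user_id] = "B"
        else:
            groups[user_id] = leftover
    return groups


# Ten hypothetical users, four per group, leftovers sent to B
users = ["user_{}".format(i) for i in range(10)]
assignment = split_population(users, size_a=4, size_b=4, leftover="B")
print(assignment)
```

Shuffling before assigning is what makes the two groups comparable; assigning users in their original order could introduce a bias if the dataset is sorted.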

Analyse the results of the experiment

Once the experiment is complete, you may upload the results back to DSS. With the Experiment summary recipe, you compute the resulting statistics. For instance, if your success event is a click, these will be the click-through rates of each group. With the results analysis web app, you can analyse these results and determine the outcome of the statistical test.

3. Experiment summary

Access the experiment summary recipe

To create your first recipe, navigate to the Flow, click on the + RECIPE button and access the AB test calculator menu.

Create an experiment summary recipe

If your input dataset is selected, you can find the plugin directly in the right panel.

Select recipe in the right panel

Then, select the experiment summary recipe.

The experiment summary recipe
Input dataset
  • experiment_results: this dataset should contain the experiment’s results at user level. It should have a group column and a success column. The group column should contain only two values, such as A and B, or group_A and group_B. The success column should contain only zeros and ones: 1 represents a successful event and 0 a failure.
Input dataset for the experiment summary recipe
Output dataset
  • AB testing statistics: statistics required to run the statistical test. For each group, you get the sample size and the success rate.
Output dataset for the experiment summary recipe

 

Settings
  • User reference: column containing the user reference (user ID, email…). Each user should have a unique reference. If you previously used the Population split recipe, it should be the same column.
  • Conversion column: column indicating whether a user converted or not (binary values)
  • AB group column: column indicating which group a user belongs to. This column should contain only two distinct values (0-1, A-B, group_A-group_B)
Settings of the experiment summary recipe
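As an illustration of what this recipe produces, the following sketch aggregates user-level results into a sample size and a success rate per group. It is a plain Python approximation, not the recipe's code; the function name, the `group` and `success` keys, and the sample rows are hypothetical.

```python
from collections import defaultdict


def summarise(rows, group_key="group", success_key="success"):
    """Return the sample size and success rate of each group.

    `rows` is an iterable of dicts, one per user, with a group label
    and a 0/1 success indicator.
    """
    counts = defaultdict(lambda: {"sample_size": 0, "successes": 0})
    for row in rows:
        stats = counts[row[group_key]]
        stats["sample_size"] += 1
        stats["successes"] += int(row[success_key])
    return {
        group: {
            "sample_size": stats["sample_size"],
            "success_rate": stats["successes"] / stats["sample_size"],
        }
        for group, stats in counts.items()
    }


# Hypothetical user-level results
rows = [
    {"user": "u1", "group": "A", "success": 1},
    {"user": "u2", "group": "A", "success": 0},
    {"user": "u3", "group": "B", "success": 1},
    {"user": "u4", "group": "B", "success": 1},
    {"user": "u5", "group": "B", "success": 0},
    {"user": "u6", "group": "B", "success": 1},
]
summary = summarise(rows)
print(summary)
```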

4. Results analysis

This web app analyses the outcome of the A/B test and saves the results in an output folder.

Accessing the results analysis web app
  • Open the </> tab and select Webapps.
  • Create a new visual web app
Create a visual web app
  • Choose the AB test results analysis.
Setting the result analysis web app
  • AB statistics entry from: do you want to retrieve statistics from the AB testing statistics dataset or enter the values manually?

Select “an input dataset” if you have already run the Experiment summary recipe and want to analyse the computed statistics. Otherwise, use the manual mode.

  • Dataset: the AB testing statistics dataset produced by the Experiment summary recipe of this plugin.
  • AB group column: column indicating which group a user belongs to (A or B)
Setting the result analysis web app from the statistics dataset
  • Output folder for results: where do you want to save the results of the experiment?
Computing the results of the experiment

The results of the experiments are computed based on the following input parameters:

  • The sample size for each group
  • Their success rates
  • Desired statistical significance: the probability of concluding that the two groups have the same success rate when this is actually the case. It is therefore one minus the false positive rate that you accept. Its most common values are 95% and 90%.
  • Two tailed test: do you want to test for an increase in the success rate, a decrease, or both? If you only want to determine whether there is any difference between the two groups, use a two-tailed test, which tests for both positive and negative differences. If you only test in one direction, for instance to find out whether the success rate is higher for B, you may want to use a one-tailed test. For example, if you test a new email template, your major concern is whether it leads to more conversions. A two-tailed test is not necessary since you are only interested in positive changes.
    The parameters of the results analysis web app – Note that these sizes and success rates were retrieved from the AB_testing_statistics dataset.

The results are displayed in the results box below. On the left-hand side, a sentence explains the outcome of the A/B test.

The test outcome

On the right-hand side, a table displays some indicators about the test, namely:

  • the uplift: the difference in success rate between the two variants (%)
  • the Z score: how many standard deviations below or above the population mean a raw score is [2]
  • the p value: the probability of obtaining results at least as extreme as those observed if there were no actual difference between the success rates
Statistical results
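For readers who want to check these indicators by hand, here is a minimal sketch of a pooled two-proportion z-test. Whether the web app uses exactly this test is an assumption, as are the function name, the output keys and the reading of the uplift as an absolute difference in percentage points. The input figures are hypothetical. The sketch uses `statistics.NormalDist`, which requires Python 3.8 or later.

```python
import math
from statistics import NormalDist


def ab_test(n_a, rate_a, n_b, rate_b, significance=0.95, two_tailed=True):
    """Compare two success rates with a pooled two-proportion z-test."""
    # Success rate under the hypothesis that both groups behave the same
    pooled = (n_a * rate_a + n_b * rate_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z_score = (rate_b - rate_a) / std_err
    if two_tailed:
        # Any difference, positive or negative
        p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))
    else:
        # One direction only: is the success rate higher for B?
        p_value = 1 - NormalDist().cdf(z_score)
    return {
        "uplift_pct": 100 * (rate_b - rate_a),
        "z_score": z_score,
        "p_value": p_value,
        "significant": p_value < 1 - significance,
    }


# Hypothetical statistics: 1000 users per group, 10% vs 13% success rate
result = ab_test(n_a=1000, rate_a=0.10, n_b=1000, rate_b=0.13)
print(result)
```

With the same data, a one-tailed test in the observed direction yields half the two-tailed p value, which is why the choice between the two should be made before looking at the results.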
Saving the results in the output folder

When you click on the save parameters button, the parameters and the sample sizes are saved in the Results folder.

When you open this folder, you can see your results stored in JSON files:

Results folder

References

[1] S. Holmes. Power and Sample Size. Introduction to Statistics for Biology and Biostatistics (2004).

[2] Stephanie Glen. “Welcome to Statistics How To!” From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/
