Sample A/B Experiment for Strava

Context

The Recommended Routes feature of Strava provides users with pre-defined route options based on user input of distance, elevation, and activity type. However, especially with cycling routes, construction, detours, increased traffic, or other factors may affect the current usability of a route. The goal of this experiment is to assess whether adding a feature that shows the user the date on which the majority of the route's segments were last used will increase the probability of the user saving the route.

Variables

Research Question: Does adding a “Last time route used: [date]” feature to the recommended route options increase the probability of saving the route?

Assumptions

Test Version A is the control group, which shows the existing features of recommended routes.
Test Version B is the experimental group, which shows the new version of recommended routes with the date feature, to see whether it increases saves of the route (conversions).
Converted: Based on the given hypothetical dataset, this binary variable defines two categories:

Converted = 1 when user saves a route
Converted = 0 when user visits the recommended route page but does not save a route

A/B Test Hypothesis

Null Hypothesis

Versions A and B have the same probability of driving user conversion. PExp_B = Pcont_A

Alternative Hypothesis

Versions A and B have different probabilities of driving user conversion. PExp_B != Pcont_A. The expectation motivating the experiment is that version B is better than A at driving user route saves, but the test itself is two-sided and checks for a difference in either direction.

Sample Size Calculations

To determine sample size for the experiment, the following inputs are used:

Statistical test - logistic regression with binary predictor
Baseline value - conversion rate for the control condition, assumed in this hypothetical to be 10% of users who visit the recommended routes page saving a route
Desired value - conversion rate for the test condition, assumed in this hypothetical to be 15% of users who visit the recommended routes page saving a route after seeing the “Last time route used: [date]” feature
Proportion of data from test condition - ideally 0.5
Significance - 0.05
Power - probability of correctly rejecting null hypothesis, generally 0.80

# SSizeLogisticBin() comes from the powerMediation package
library(powerMediation)
sample_size <- SSizeLogisticBin(p1 = 0.10,
                                p2 = 0.15,
                                B = 0.5,
                                alpha = 0.05,
                                power = 0.80)
sample_size

## [1] 1372

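Results show that a total sample of at least 1,372 users (686 per condition, given the 0.5 split) is needed to detect a difference in conversion proportions of 0.05 (10% versus 15%) between the control and experimental groups, at a significance level of 0.05 with 80% power.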
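
As a rough cross-check, the same inputs can be passed to base R's power.prop.test(), which sizes a two-sample comparison of proportions rather than a logistic regression. This is only a sketch, on the assumption that the two approaches are comparable for a single binary predictor. The function reports the size of each group, so the figure is doubled before comparing it with the total above. The names check, n_per_group, and n_total are illustrative.

```r
# Cross-check with base R (stats package, loaded by default).
# power.prop.test() returns n as the required size of EACH group.
check <- power.prop.test(p1 = 0.10,
                         p2 = 0.15,
                         sig.level = 0.05,
                         power = 0.80)
n_per_group <- ceiling(check$n)   # round up to whole users
n_total <- 2 * n_per_group        # two equally sized conditions
n_total
```

If the total lands near 1,372, the two methods agree on the scale of the experiment.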

Results

Calculate relative uplift

## [1] 63.63636
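
The code for this step is not shown. A minimal sketch that reproduces the printed value, assuming the conversion rates were rounded to two decimals before the uplift was computed (conv_a, conv_b, uplift_rounded, and uplift_exact are illustrative names, not taken from the original analysis):

```r
# Counts taken from the contingency table in the Hypothesis Testing section
conv_a <- 75 / 686    # control conversion rate, about 0.109
conv_b <- 125 / 686   # experimental conversion rate, about 0.182

# Uplift from rates rounded to two decimals (0.11 and 0.18)
uplift_rounded <- (round(conv_b, 2) - round(conv_a, 2)) / round(conv_a, 2) * 100
uplift_rounded

# Uplift from the unrounded rates, which simplifies to (125 - 75) / 75
uplift_exact <- (conv_b - conv_a) / conv_a * 100
uplift_exact
```

The rounded version gives 63.64, matching the printed output, while the unrounded rates give 66.67.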

Hypothesis Testing

# Create and view contingency table
strava_test <- table(full_df$condition, full_df$conversion)
strava_test

##    
##       0   1
##   0 611  75
##   1 561 125
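
Because full_df itself is not included here, the table can be rebuilt from its counts by anyone who wants to rerun the tests. The sketch below assumes a layout of one row per user with 0/1 columns named as in the code above. rebuilt_df and rebuilt_test are hypothetical names.

```r
# One row per user: 686 users in each condition (0 = control, 1 = experimental)
rebuilt_df <- data.frame(
  condition  = rep(c(0, 1), each = 686),
  conversion = c(rep(c(0, 1), times = c(611, 75)),    # control: 611 no save, 75 saves
                 rep(c(0, 1), times = c(561, 125)))   # experimental: 561 no save, 125 saves
)

# Should reproduce the contingency table shown above
rebuilt_test <- table(rebuilt_df$condition, rebuilt_df$conversion)
rebuilt_test
```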

# Confirm expected counts assumption
chisq.test(strava_test)$expected

##    
##       0   1
##   0 586 100
##   1 586 100

# Run chi-squared test of independence
chisq.test(strava_test, correct=FALSE)

## 
##  Pearson's Chi-squared test
## 
## data:  strava_test
## X-squared = 14.633, df = 1, p-value = 0.0001306
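
For a 2 x 2 table, the same comparison can also be run as a two-sample test of proportions, which additionally reports a confidence interval for the difference in conversion rates. A hedged sketch using the counts from the contingency table (saves, visitors, and prop_result are illustrative names):

```r
# Saves and visitors per condition, experimental first so the
# estimated difference is B minus A
saves    <- c(125, 75)
visitors <- c(686, 686)

# Without continuity correction this is equivalent to the Pearson
# chi-squared test above, so X-squared and the p-value should match
prop_result <- prop.test(x = saves, n = visitors, correct = FALSE)
prop_result$statistic   # X-squared
prop_result$p.value
prop_result$conf.int    # 95% CI for the difference in proportions
```

An interval that excludes 0 is consistent with rejecting the null hypothesis.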

Conclusions Drawn from Data

There were 75 saved routes (conversions) for test version A (control) and 125 saved routes for test version B (experimental).
The relative uplift was 63.64%, based on the rounded conversion proportions for A = 0.11 and B = 0.18; using the unrounded counts (75/686 and 125/686), the relative uplift is 66.67%.
The p-value computed for this analysis was 0.0001, below the 0.05 significance level, indicating that the proportion of saved routes differed significantly between conditions, with the experimental condition higher than the control condition.

Future Research

Given the significant increase in routes saved using the experimental feature, I recommend the following:

Repeat the experiment to confirm the findings over time
Segment by activity type (run, ride, walk, etc.) to assess whether the association between the added feature and saving routes varies by activity type