A/B testing is a powerful way to try out a new design or program change before making a final decision.
It gives you a framework for testing different ideas for improving an existing design, often a website.
You want to be constantly updating your website or app to maximize things like conversion rate or usage time.
library(tidyverse)
library(lubridate)
library(scales)
library(powerMediation)
library(broom)
# Read in data
click_data <- read_csv("click_data.csv")
click_data
## # A tibble: 3,650 x 2
## visit_date clicked_adopt_today
## <date> <dbl>
## 1 2017-01-01 1
## 2 2017-01-02 1
## 3 2017-01-03 0
## 4 2017-01-04 1
## 5 2017-01-05 1
## 6 2017-01-06 0
## 7 2017-01-07 0
## 8 2017-01-08 0
## 9 2017-01-09 0
## 10 2017-01-10 0
## # ... with 3,640 more rows
# Find oldest and most recent date
min(click_data$visit_date)
## [1] "2017-01-01"
max(click_data$visit_date)
## [1] "2017-12-31"
Which days and weeks bring more clicks?
# Calculate the mean conversion rate by day of the week
click_data %>%
  group_by(wday(visit_date)) %>%
  summarize(conversion_rate = mean(clicked_adopt_today))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 7 x 2
## `wday(visit_date)` conversion_rate
## <dbl> <dbl>
## 1 1 0.3
## 2 2 0.277
## 3 3 0.271
## 4 4 0.298
## 5 5 0.271
## 6 6 0.267
## 7 7 0.256
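The day numbers in the table above follow lubridate's default numbering, where 1 is Sunday and 7 is Saturday (assuming the `lubridate.week.start` option has not been changed). A minimal sketch, using three dates from the start of the data rather than the full `click_data`, shows the mapping and the `label = TRUE` option, which returns day names instead of numbers. The abbreviations depend on your locale.

```r
library(lubridate)

# Three dates from the first week of the data set
example_dates <- as.Date(c("2017-01-01", "2017-01-02", "2017-01-07"))

# Default numbering: 1 = Sunday, ..., 7 = Saturday
day_numbers <- wday(example_dates)
day_numbers

# Ordered factor of day names, easier to read in a summary table
day_labels <- wday(example_dates, label = TRUE)
day_labels
```

Grouping by `wday(visit_date, label = TRUE)` instead of `wday(visit_date)` would give the same rates with named rows.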
# Calculate the mean conversion rate by week of the year
click_data %>%
  group_by(week(visit_date)) %>%
  summarize(conversion_rate = mean(clicked_adopt_today))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 53 x 2
## `week(visit_date)` conversion_rate
## <dbl> <dbl>
## 1 1 0.229
## 2 2 0.243
## 3 3 0.171
## 4 4 0.129
## 5 5 0.157
## 6 6 0.186
## 7 7 0.257
## 8 8 0.171
## 9 9 0.186
## 10 10 0.2
## # ... with 43 more rows
# Compute conversion rate by week of the year
click_data_sum <- click_data %>%
  group_by(week(visit_date)) %>%
  summarize(conversion_rate = mean(clicked_adopt_today))
## `summarise()` ungrouping output (override with `.groups` argument)
# Build plot
ggplot(click_data_sum, aes(x = `week(visit_date)`,
                           y = conversion_rate)) +
  geom_point() +
  geom_line() +
  scale_y_continuous(limits = c(0, 1),
                     labels = percent)
Seasonality is important for the experiment. Conversion rates may differ over the course of the year, so the baseline should come from the same time of year in which the experiment will run.
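One way to look for seasonality is to summarize the conversion rate by month. The sketch below does this in base R on simulated data, since it is meant to run without `click_data.csv`. The data frame `sim_clicks`, its seasonal pattern, and the click probabilities are all made up for illustration; only the column names mirror `click_data`.

```r
# Hypothetical data: one row per visit, same column names as click_data,
# simulated here so the sketch runs without the CSV file
set.seed(1)
visit_date <- rep(seq(as.Date("2017-01-01"), as.Date("2017-12-31"), by = "day"),
                  each = 10)
month_num <- as.integer(format(visit_date, "%m"))

# Assumed seasonal pattern: click probability peaks mid-year (illustration only)
click_prob <- 0.2 + 0.3 * sin(pi * (month_num - 1) / 11)

sim_clicks <- data.frame(
  visit_date = visit_date,
  month = month_num,
  clicked_adopt_today = rbinom(length(visit_date), size = 1, prob = click_prob)
)

# Mean conversion rate by month
monthly_rates <- aggregate(clicked_adopt_today ~ month, data = sim_clicks, FUN = mean)
names(monthly_rates)[2] <- "conversion_rate"
monthly_rates
```

With the real data, the same `aggregate()` call on `click_data` (after adding a `month` column) would show which months to use as the baseline.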
Experiment length is one of the big questions in A/B testing. If you stop too soon, you may not get enough data to see an effect. If you run too long, you may waste valuable resources on a failed experiment. One way to guard against both is a power analysis.
A power analysis tells you how many data points (the sample size) you need in order to detect an effect of a given size at a chosen significance level and statistical power.
# Compute and look at sample size for experiment in August
total_sample_size <- SSizeLogisticBin(p1 = 0.54,    # Baseline conversion rate (control)
                                      p2 = 0.64,    # Expected conversion rate (test)
                                      B = 0.5,      # Proportion of the sample in the test condition
                                      alpha = 0.05, # Significance level
                                      power = 0.8)  # Statistical power
total_sample_size
## [1] 758
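As a rough cross-check on the figure above, base R's `power.prop.test()` answers a similar question for a two-sided test of two proportions. This is a different method from `SSizeLogisticBin()`, so the numbers need not match exactly, and it reports the sample size per group rather than in total. The alternative expected rates (0.59 and 0.69) are assumptions added here to show how the requirement changes with the size of the expected lift.

```r
# Per-group sample size for a two-sided two-proportion test (stats package)
check <- power.prop.test(p1 = 0.54, p2 = 0.64, sig.level = 0.05, power = 0.8)
per_group_n <- ceiling(check$n)
total_n <- 2 * per_group_n
total_n

# How the requirement changes with the expected conversion rate in the test group
expected_rates <- c(0.59, 0.64, 0.69)  # 0.59 and 0.69 are hypothetical alternatives
n_by_rate <- sapply(expected_rates, function(p2) {
  ceiling(power.prop.test(p1 = 0.54, p2 = p2, sig.level = 0.05, power = 0.8)$n)
})
data.frame(expected_rate = expected_rates, per_group_n = n_by_rate)
```

The smaller the expected difference from the baseline, the more data points the experiment needs, which is why an optimistic guess for `p2` can leave an experiment underpowered.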
experiment_data_clean <- read_csv("experiment_data.csv")
##
## -- Column specification --------------------------------------------
## cols(
## visit_date = col_date(format = ""),
## condition = col_character(),
## clicked_adopt_today = col_double()
## )
# Group and summarize data
experiment_data_clean_sum <- experiment_data_clean %>%
  group_by(visit_date, condition) %>%
  summarize(conversion_rate = mean(clicked_adopt_today))
## `summarise()` regrouping output by 'visit_date' (override with `.groups` argument)
# Make plot of conversion rates over time
ggplot(experiment_data_clean_sum,
       aes(x = visit_date,
           y = conversion_rate,
           color = condition,
           group = condition)) +
  geom_point() +
  geom_line()
# View summary of results
experiment_data_clean %>%
  group_by(condition) %>%
  summarize(conversion_rate = mean(clicked_adopt_today))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 2 x 2
## condition conversion_rate
## <chr> <dbl>
## 1 control 0.167
## 2 test 0.384
# Run logistic regression
experiment_results <- glm(clicked_adopt_today ~ condition,
                          family = "binomial",
                          data = experiment_data_clean) %>%
  tidy()
experiment_results
## # A tibble: 2 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -1.61 0.156 -10.3 8.28e-25
## 2 conditiontest 1.14 0.197 5.77 7.73e- 9
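The estimates above are on the log-odds scale, which makes them hard to read directly. A small sketch, using the rounded coefficients copied from the output above, converts them back to conversion rates and an odds ratio. The results should agree with the per-condition summary (0.167 and 0.384) up to rounding of the coefficients.

```r
# Coefficients as printed in the tidy() output above (rounded)
intercept <- -1.61    # log-odds of clicking in the control condition
test_effect <- 1.14   # change in log-odds for the test condition

# plogis() is the inverse logit: it maps log-odds to probabilities
control_rate <- plogis(intercept)
test_rate <- plogis(intercept + test_effect)

# Exponentiating the coefficient gives the odds ratio (test vs. control)
odds_ratio <- exp(test_effect)

round(c(control = control_rate, test = test_rate, odds_ratio = odds_ratio), 3)
```

The small p-value for `conditiontest` says the difference between conditions is unlikely to be due to chance alone, and the back-transformed rates show how large that difference is in terms of conversion rate.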
A/B testing is the use of experimental design and statistics to compare two or more variants of a design.
viz_website_2017 <- read_csv("data_viz_website_2018_04.csv")
##
## -- Column specification --------------------------------------------
## cols(
## visit_date = col_date(format = ""),
## condition = col_character(),
## time_spent_homepage_sec = col_double(),
## clicked_article = col_double(),
## clicked_like = col_double(),
## clicked_share = col_double()
## )