Bayesian Analysis

The role of data is to re-allocate credibility across possibilities.

Possibilities are parameter values in a model, such as the mean of a normal distribution.

we reallocate credibility to parameter values that are consistent with the data.

Why Bayesian statistics?

Frequentist vs. Bayesian

In the field of statistical inference, there are two very different, yet mainstream, schools of thought: the frequentist approach, under which the framework of Hypothesis Testing was developed, and the Bayesian approach.

For Bayesians, the concept of probability is extended to cover degrees of certainty about any given statement on reality. However, in a strict frequentist view, it is meaningless to talk about the probability of the true conversion rate. For frequentists, the true conversion rate is by definition a single fixed number, and to talk about a probability distribution for a fixed number is mathematically nonsensical.

Steps of Bayesian analysis

  1. Define a generative model for describing the data
  2. Specify the prior, e.g. uniform distribution, binomal
  3. collect data
  4. Compute posterior distribution of parameter, Markov chain Monte Carlo (MCMC) sampling simulation.

Bayesian A/B test

  • Binomial distribution

  • Poisson distribution

An example of conversion A/B test

name_A <- "Test A"
name_B <- "Test B"

success_A <- 200
success_B <- 230

total_A <- 1000
total_B <- 1000
conv_A <- success_A/total_A
conv_B <- success_B/total_B
alpha_A <- success_A + 1
alpha_B <- success_B + 1
beta_A <- total_A - success_A + 1
beta_B <- total_B - success_B + 1

hdi_A <- hdi_of_icdf(qbeta, shape1 = alpha_A, shape2 = beta_A)
hdi_B <- hdi_of_icdf(qbeta, shape1 = alpha_B, shape2 = beta_B)

diff_post_sample <- create_sample_beta_diff(alpha_A, 
                                            beta_A, 
                                            alpha_B, 
                                            beta_B, 
                                            size = 10000)
      
hdi_diff <- hdi_of_sample(diff_post_sample)


prob_B_beats_A(alpha_A, beta_A, alpha_B, beta_B)*100
## [1] 94.86183
expected_loss_B_over_A(alpha_B, beta_B, alpha_A, beta_A)*100
## [1] 3.03363
x_lim <- {
        a <- min(hdi_A, hdi_B)
        b <- max(hdi_A, hdi_B)
        c(1.2*a - 0.2*b, 1.2*b - 0.2*a)
      }

x_lim_diff <- {
  a <- hdi_diff[1]
  b <- hdi_diff[2]
  c(1.2*a - 0.2*b, 1.2*b - 0.2*a)
}
plot_post_beta(
        alphaA = alpha_A, betaA = beta_A, alphaB = alpha_B, betaB = beta_B,
        hdiA = hdi_A, hdiB = hdi_B, xlim = x_lim,
        paste('', name_A), paste('', name_B)
      )
## Warning: `arrange_()` is deprecated as of dplyr 0.7.0.
## Please use `arrange()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
plot_sample_density(
        sample = diff_post_sample,
        hdi = hdi_diff
      )

An example of A/B test via bayesAB package

library(bayesAB)
## 
## Attaching package: 'bayesAB'
## The following object is masked from 'package:plotly':
## 
##     rename
## The following objects are masked from 'package:dplyr':
## 
##     combine, rename
A_binom <- rbinom(1000, 1, .23)
B_binom <- rbinom(1000, 1, .2)

plotBeta(65, 200)

AB1 <- bayesTest(A_binom, B_binom, priors = c('alpha' = 65, 'beta' = 200), n_samples = 10000, distribution = 'bernoulli')

print(AB1)
## --------------------------------------------
## Distribution used: bernoulli 
## --------------------------------------------
## Using data with the following properties: 
##             A     B
## Min.    0.000 0.000
## 1st Qu. 0.000 0.000
## Median  0.000 0.000
## Mean    0.259 0.181
## 3rd Qu. 1.000 0.000
## Max.    1.000 1.000
## --------------------------------------------
## Conjugate Prior Distribution: Beta 
## Conjugate Prior Parameters: 
## $alpha
## [1] 65
## 
## $beta
## [1] 200
## 
## --------------------------------------------
## Calculated posteriors for the following parameters: 
## Probability 
## --------------------------------------------
## Monte Carlo samples generated per posterior: 
## [1] 10000
summary(AB1)
## Quantiles of posteriors for A and B:
## 
## $Probability
## $Probability$A
##        0%       25%       50%       75%      100% 
## 0.2112158 0.2478564 0.2559201 0.2644819 0.2994380 
## 
## $Probability$B
##        0%       25%       50%       75%      100% 
## 0.1504326 0.1868761 0.1943721 0.2018655 0.2349470 
## 
## 
## --------------------------------------------
## 
## P(A > B) by (0)%: 
## 
## $Probability
## [1] 0.9999
## 
## --------------------------------------------
## 
## Credible Interval on (A - B) / B for interval length(s) (0.9) : 
## 
## $Probability
##        5%       95% 
## 0.1656444 0.4963547 
## 
## --------------------------------------------
## 
## Posterior Expected Loss for choosing B over A:
## 
## $Probability
## [1] 7.165274e-07
plot(AB1)

A bayesian tutorial

Exercise 1: Bayesian A testing

Bayesian tutorial exercises

The marketing department have done a pilot study and tried the following marketing method:

A: Sending a mail with a colorful brochure that invites people to sign up for a one year salmon subscription.

The marketing department sent out 16 mails of type A. Six Danes that received a mail signed up for one year of salmon and marketing now wants to know, how good is method A?

  • Question I: Build a Bayesian model that answers the question: What would the rate of sign-up be if method A was used on a larger number of people?
n_draw <- 10000

# Defining and drawing from the prior distribution
prior_rate <- runif(n_draw, 0, 1)

# Defining the generative model
gen_model <- function(rate) {
  subscribers <- rbinom(1, size = 16, prob = rate)
  subscribers
}

# Simulating the data
subscribers <- rep(NA, n_draw)
for(i in 1:n_draw) {
  subscribers[i] <- gen_model(prior_rate[i])
}

# Filtering out those parameter values that didn't result in the
# data that we actually observed
post_rate <- prior_rate[subscribers == 6]

# Checking that there enough samples left
length(post_rate)
## [1] 591
# Plotting and summarising the posterior.
hist(post_rate, xlim = c(0, 1))

mean(post_rate)
## [1] 0.3858367
quantile(post_rate, c(0.025, 0.975))
##      2.5%     97.5% 
## 0.1908802 0.5962105
  • Question II: What’s the probability that method A is better than telemarketing?

So marketing just told us that the rate of sign-up would be 20% if salmon subscribers were snared by a telemarketing campaign instead (to us it’s very unclear where marketing got this very precise number from). So given the model and the data that we developed in the last question, what’s the probability that method A has a higher rate of sign-up than telemarketing?

sum(post_rate > 0.2) / length(post_rate)
## [1] 0.9712352
  • Question III: If method A was used on 100 people what would be number of sign-ups?
# This can be done with a for loop
singnups <- rep(NA, length(post_rate))
for(i in 1:length(post_rate)) {
  singnups[i] <- rbinom(n = 1, size = 100, prob = post_rate[i])
}

# But since rbinom is vectorized we can simply write it like this:
signups <- rbinom(n = length(post_rate), size = 100, prob = post_rate)

hist(signups, xlim = c(0, 100))

quantile(signups, c(0.025, 0.975))
##  2.5% 97.5% 
## 18.00 61.25

So a decent guess is that is would be between 20 and 60 sign-ups.

Exercise 2: Bayesian A/B testing

Exercise 2: Bayesian A/B testing

The marketing department have done a pilot study and tried two different marketing methods:

A: Sending a mail with a colorful brochure that invites people to sign up for a one year salmon subscription.

B: Sending a colorful brochure that invites people to sign up for a one year salmon subscription and that includes a free salmon.

The marketing department sent out 16 mails of type A and 16 mails of type B. Six Danes that received a mail of type A signed up for one year of salmon, and ten Danes that received a mail of type B signed up!

The marketing department now wants to know, which method should we use, A or B?

  • Question I: Build a Bayesian model in Stan that answers the question: What is the probability that method B is better than method A?
# The Stan model as a string.
model_string <- "
data {
  # Number of trials
  int nA;
  int nB;
  # Number of successes
  int sA;
  int sB;
}

parameters {
  real<lower=0, upper=1> rateA;
  real<lower=0, upper=1> rateB;
}

model {
  rateA ~ uniform(0, 1);
  rateB ~ uniform(0, 1);
  sA ~ binomial(nA, rateA);
  sB ~ binomial(nB, rateB); 
}

generated quantities {
  real rate_diff;
  rate_diff = rateB - rateA;
}
"

data_list <- list(nA = 16, nB = 16, sA = 6, sB = 10)

# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)
## DIAGNOSTIC(S) FROM PARSER:
## Info: Comments beginning with # are deprecated.  Please use // in place of # for line comments.
## Info: Comments beginning with # are deprecated.  Please use // in place of # for line comments.
## 
## 
## SAMPLING FOR MODEL '664dda01545ecafd215b2a9e2b8c9335' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.015857 seconds (Warm-up)
## Chain 1:                0.015208 seconds (Sampling)
## Chain 1:                0.031065 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL '664dda01545ecafd215b2a9e2b8c9335' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 4e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.014666 seconds (Warm-up)
## Chain 2:                0.015099 seconds (Sampling)
## Chain 2:                0.029765 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL '664dda01545ecafd215b2a9e2b8c9335' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 4e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.014247 seconds (Warm-up)
## Chain 3:                0.016087 seconds (Sampling)
## Chain 3:                0.030334 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL '664dda01545ecafd215b2a9e2b8c9335' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 3e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.015136 seconds (Warm-up)
## Chain 4:                0.013909 seconds (Sampling)
## Chain 4:                0.029045 seconds (Total)
## Chain 4:
# Plotting and summarizing the posterior distribution
stan_samples
## Inference for Stan model: 664dda01545ecafd215b2a9e2b8c9335.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##             mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
## rateA       0.39    0.00 0.11   0.19   0.31   0.38   0.46   0.61  3478    1
## rateB       0.61    0.00 0.11   0.40   0.53   0.62   0.69   0.81  3641    1
## rate_diff   0.22    0.00 0.16  -0.08   0.12   0.23   0.34   0.52  3496    1
## lp__      -25.07    0.02 0.99 -27.85 -25.45 -24.78 -24.36 -24.08  1907    1
## 
## Samples were drawn using NUTS(diag_e) at Fri Jul  3 16:50:58 2020.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).
traceplot(stan_samples)

plot(stan_samples)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)

# So, which rate is likely higher? A or B?

# Export the samples to a data.frame for easier handling.
posterior <- as.data.frame(stan_samples)
sum(posterior$rate_diff > 0) / length(posterior$rate_diff)
## [1] 0.92475

So with around 90% probability rate B is higher than rate A.

  • Question II: Change the model so that it uses a more informative prior. What is now the probability that method B is better than method A?

The marketing department are starting to believe that it was a fluke that such a large proportion of the Danes signed up. In all other European markets the proportion that signs up for a year of salmon is around 5% to 15%, even when given a free salmon. Use this information and make the priors in your model more informative.

# We will represent the background knowledge using the following beta distribution which is mostly focused on the region 0.05-0.15.

hist(rbeta(9999, shape1 = 3, shape2 = 25), xlim=c(0, 1), 30)
lines(c(0.05, 0.15), c(0,0), col="red", lwd = 3)

#Except for the prior, the model below is exactly the same as in question I.

# The Stan model as a string.
model_string <- "
data {
  # Number of trials
  int nA;
  int nB;
  # Number of successes
  int sA;
  int sB;
}

parameters {
  real<lower=0, upper=1> rateA;
  real<lower=0, upper=1> rateB;
}

model {  
  rateA ~ beta(3, 25);
  rateB ~ beta(3, 25);
  sA ~ binomial(nA, rateA);
  sB ~ binomial(nB, rateB); 
}

generated quantities {
  real rate_diff;
  rate_diff = rateB - rateA;
}
"

data_list <- list(nA = 16, nB = 16, sA = 6, sB = 10)

# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)
## DIAGNOSTIC(S) FROM PARSER:
## Info: Comments beginning with # are deprecated.  Please use // in place of # for line comments.
## Info: Comments beginning with # are deprecated.  Please use // in place of # for line comments.
## 
## 
## SAMPLING FOR MODEL '1a05c479674d0cb861bf78256f65f24e' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 1.2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.013537 seconds (Warm-up)
## Chain 1:                0.015299 seconds (Sampling)
## Chain 1:                0.028836 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL '1a05c479674d0cb861bf78256f65f24e' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 3e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.014941 seconds (Warm-up)
## Chain 2:                0.016485 seconds (Sampling)
## Chain 2:                0.031426 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL '1a05c479674d0cb861bf78256f65f24e' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 4e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.013649 seconds (Warm-up)
## Chain 3:                0.011938 seconds (Sampling)
## Chain 3:                0.025587 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL '1a05c479674d0cb861bf78256f65f24e' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 4e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.014785 seconds (Warm-up)
## Chain 4:                0.013568 seconds (Sampling)
## Chain 4:                0.028353 seconds (Total)
## Chain 4:
# Plotting and summarizing the posterior distribution
plot(stan_samples)
## ci_level: 0.8 (80% intervals)
## outer_level: 0.95 (95% intervals)

posterior <- as.data.frame(stan_samples)
sum(posterior$rate_diff > 0) / length(posterior$rate_diff)
## [1] 0.83375

So rate B is still estimated to be higher than A with around 85% probability, but both rates are estimated to be much lower.

  • Question III: So what should we do? Make a simple decision analysis. The economy department gives you the following information:

A mail of type A costs 30 kr to send out. A mail of type B costs 300 kr to send out (due to the cost of the free salmon). A salmon subscription brings in 1000 kr in revenue. Which method, A or B, is most likely to make Swedish Fish Incorporated the most money?

posterior <- as.data.frame(stan_samples)
# calculating the estimated posterior profit using method A (or B)
# a cost of 30 kr + the average profit per sent out add
profitA <- -30 + posterior$rateA * 1000 
profitB <- -300 + posterior$rateB * 1000 
hist(profitA)

hist(profitB)

hist(profitA - profitB)
expected_profit_diff <- mean(profitA - profitB)
abline(v = expected_profit_diff, col = "red", lwd =2)

The expected profit when using method A is around 190 kr higher than for method B (which actually has a negative expected profit). So I guess sending free salmon to people isn’t the best idea. But note that we got this result after having made the decision analysis based on the model with the informative priors. If we use the non-informative priors we get a different result, and it’s up to you, the analyst, to decide which version of the model you decide to use.

References:

  1. Formulas for Bayesian A/B Testing

  2. Bayes AB testing Git

  3. r bayesAB package

  4. Bayesian Model to Calculate Whether My Wife is Pregnant or Not

  5. Rstan package

  6. Bayesian First Aid package

  7. Doing Bayesian Data Analysis Intro - Youtube