Bayesian analysis in R Exercise.

This exercise will use the bayesAB package.

The package can be downloaded using the following command: install.packages(“bayesAB”).

URL: https://www.r-bloggers.com/bayesian-ab-testing-made-easy/

Exercise created by: BC Mullins.

Date: 21 August 2017.

library(bayesAB)
set.seed(1)

The setup

Beta probability density function will have:

alpha = 10

beta = 10

The control will have prior distribution of 50%

The treatment will have a prior distribution of 30%

Essentially, we are testing the hypothesis that the treatment group is has a lower probability of having an event occur compared to the control (30% versus 50%, respectively). Therefore, the control group can have a prior that follows a Bernoulli distribution.

First collection – Start with 20 observations each

In the first collection, we assume that the control follows a prior Bernoulli distribution of 50%. Imagine that this is the probablity of having an event. Therefore, a 50% probability of having an event is like a coin flip. However, with treatment, the probability of having an event is lower, 30%. We start with a sample size of 20 for each group.

control_1 <- rbinom(20, 1, 0.5)
treatment_1 <- rbinom(20, 1, 0.3)

First Analysis

In the first analysis, the distance and size of the treatment and control are slightly different. However, the probabilty that the treatment is better than control is 91.3%.

test1 <- bayesTest(treatment_1, control_1, distribution = "bernoulli", priors = c("alpha" = 10, "beta" = 10))
print(test1)
## --------------------------------------------
## Distribution used: bernoulli 
## --------------------------------------------
## Using data with the following properties: 
##            A    B
## Min.    0.00 0.00
## 1st Qu. 0.00 0.00
## Median  0.00 1.00
## Mean    0.25 0.55
## 3rd Qu. 0.25 1.00
## Max.    1.00 1.00
## --------------------------------------------
## Priors used for the calculation: 
## $alpha
## [1] 10
## 
## $beta
## [1] 10
## 
## --------------------------------------------
## Calculated posteriors for the following parameters: 
## Probability 
## --------------------------------------------
## Monte Carlo samples generated per posterior: 
## [1] 1e+05
summary(test1)
## Quantiles of posteriors for A and B:
## 
## $Probability
## $Probability$A
##         0%        25%        50%        75%       100% 
## 0.09783095 0.32214060 0.37321051 0.42561620 0.69819657 
## 
## $Probability$B
##        0%       25%       50%       75%      100% 
## 0.2028917 0.4717612 0.5250980 0.5787883 0.8255518 
## 
## 
## --------------------------------------------
## 
## P(A > B) by (0)%: 
## 
## $Probability
## [1] 0.08677
## 
## --------------------------------------------
## 
## Credible Interval on (A - B) / B for interval length(s) (0.9) : 
## 
## $Probability
##          5%         95% 
## -0.54229787  0.07151629 
## 
## --------------------------------------------
## 
## Posterior Expected Loss for choosing B over A:
## 
## $Probability
## [1] 0.4880871
plot(test1)

Second Collection – Now, we add 20 more observations

In the second collection, 20 additional observations are added to increase the sample size to 40. The expectation is that the more data that are available, the better the precision of the Bayesian posterior distribution. Again, maintaining that they control follows a prior Bernoulli distribution of 50% and the treatment follows a prior distribution of 30%, we obtain the following:

control_2 <- rbind(control_1, rbinom(20, 1, 0.5))
treatment_2 <- rbind(treatment_1, rbinom(20, 1, 0.3))

Second Analysis

In the second analysis, the distance and size between the two distrubitions have increased. You can now see that the probability that treatment is better than the control is 96.9%.

test2 <- bayesTest(treatment_2, control_2, distribution = "bernoulli", priors = c("alpha" = 10, "beta" = 10))
print(test2)
## --------------------------------------------
## Distribution used: bernoulli 
## --------------------------------------------
## Using data with the following properties: 
##        A                B               
##   [1,] "Min.   :0.00  " "Min.   :0.00  "
##   [2,] "1st Qu.:0.25  " "1st Qu.:0.25  "
##   [3,] "Median :0.50  " "Median :0.50  "
##   [4,] "Mean   :0.50  " "Mean   :0.50  "
##   [5,] "3rd Qu.:0.75  " "3rd Qu.:0.75  "
##   [6,] "Max.   :1.00  " "Max.   :1.00  "
##   [7,] "Min.   :0  "    "Min.   :0.00  "
##   [8,] "1st Qu.:0  "    "1st Qu.:0.25  "
##   [9,] "Median :0  "    "Median :0.50  "
##  [10,] "Mean   :0  "    "Mean   :0.50  "
##  [11,] "3rd Qu.:0  "    "3rd Qu.:0.75  "
##  [12,] "Max.   :0  "    "Max.   :1.00  "
##  [13,] "Min.   :0.00  " "Min.   :0.00  "
##  [14,] "1st Qu.:0.25  " "1st Qu.:0.25  "
##  [15,] "Median :0.50  " "Median :0.50  "
##  [16,] "Mean   :0.50  " "Mean   :0.50  "
##  [17,] "3rd Qu.:0.75  " "3rd Qu.:0.75  "
##  [18,] "Max.   :1.00  " "Max.   :1.00  "
##  [19,] "Min.   :0.00  " "Min.   :0.00  "
##  [20,] "1st Qu.:0.25  " "1st Qu.:0.25  "
##  [21,] "Median :0.50  " "Median :0.50  "
##  [22,] "Mean   :0.50  " "Mean   :0.50  "
##  [23,] "3rd Qu.:0.75  " "3rd Qu.:0.75  "
##  [24,] "Max.   :1.00  " "Max.   :1.00  "
##  [25,] "Min.   :0  "    "Min.   :0.00  "
##  [26,] "1st Qu.:0  "    "1st Qu.:0.25  "
##  [27,] "Median :0  "    "Median :0.50  "
##  [28,] "Mean   :0  "    "Mean   :0.50  "
##  [29,] "3rd Qu.:0  "    "3rd Qu.:0.75  "
##  [30,] "Max.   :0  "    "Max.   :1.00  "
##  [31,] "Min.   :0  "    "Min.   :1  "   
##  [32,] "1st Qu.:0  "    "1st Qu.:1  "   
##  [33,] "Median :0  "    "Median :1  "   
##  [34,] "Mean   :0  "    "Mean   :1  "   
##  [35,] "3rd Qu.:0  "    "3rd Qu.:1  "   
##  [36,] "Max.   :0  "    "Max.   :1  "   
##  [37,] "Min.   :0.00  " "Min.   :0.00  "
##  [38,] "1st Qu.:0.25  " "1st Qu.:0.25  "
##  [39,] "Median :0.50  " "Median :0.50  "
##  [40,] "Mean   :0.50  " "Mean   :0.50  "
##  [41,] "3rd Qu.:0.75  " "3rd Qu.:0.75  "
##  [42,] "Max.   :1.00  " "Max.   :1.00  "
##  [43,] "Min.   :0  "    "Min.   :1  "   
##  [44,] "1st Qu.:0  "    "1st Qu.:1  "   
##  [45,] "Median :0  "    "Median :1  "   
##  [46,] "Mean   :0  "    "Mean   :1  "   
##  [47,] "3rd Qu.:0  "    "3rd Qu.:1  "   
##  [48,] "Max.   :0  "    "Max.   :1  "   
##  [49,] "Min.   :0.00  " "Min.   :0.00  "
##  [50,] "1st Qu.:0.25  " "1st Qu.:0.25  "
##  [51,] "Median :0.50  " "Median :0.50  "
##  [52,] "Mean   :0.50  " "Mean   :0.50  "
##  [53,] "3rd Qu.:0.75  " "3rd Qu.:0.75  "
##  [54,] "Max.   :1.00  " "Max.   :1.00  "
##  [55,] "Min.   :0  "    "Min.   :0.00  "
##  [56,] "1st Qu.:0  "    "1st Qu.:0.25  "
##  [57,] "Median :0  "    "Median :0.50  "
##  [58,] "Mean   :0  "    "Mean   :0.50  "
##  [59,] "3rd Qu.:0  "    "3rd Qu.:0.75  "
##  [60,] "Max.   :0  "    "Max.   :1.00  "
##  [61,] "Min.   :0.00  " "Min.   :0  "   
##  [62,] "1st Qu.:0.25  " "1st Qu.:0  "   
##  [63,] "Median :0.50  " "Median :0  "   
##  [64,] "Mean   :0.50  " "Mean   :0  "   
##  [65,] "3rd Qu.:0.75  " "3rd Qu.:0  "   
##  [66,] "Max.   :1.00  " "Max.   :0  "   
##  [67,] "Min.   :0  "    "Min.   :0.00  "
##  [68,] "1st Qu.:0  "    "1st Qu.:0.25  "
##  [69,] "Median :0  "    "Median :0.50  "
##  [70,] "Mean   :0  "    "Mean   :0.50  "
##  [71,] "3rd Qu.:0  "    "3rd Qu.:0.75  "
##  [72,] "Max.   :0  "    "Max.   :1.00  "
##  [73,] "Min.   :0.00  " "Min.   :1  "   
##  [74,] "1st Qu.:0.25  " "1st Qu.:1  "   
##  [75,] "Median :0.50  " "Median :1  "   
##  [76,] "Mean   :0.50  " "Mean   :1  "   
##  [77,] "3rd Qu.:0.75  "