Introduction

This experiment recruits participants to a chatbot survey using Facebook ads. After clicking on the Facebook ad, a participant is randomized into one of the treatment arms. I consider 5 treatment arms. The current experimental design calls for 5 treatment arms, but we may consider adding 2 more arms in the appendix.

The power analysis varies along two dimensions:

  • pairwise comparison of all treatments or more structured hypotheses
  • number of observations (clicks)
    • in addition to looking at 50k, 75k, 100k, 125k, and 150k in each case, I look at some edge cases

There are three outcomes, which are correlated: Pr(unvax complete|click), Pr(unvax website click|complete), and Pr(unvax useful free text|complete). We have an estimate of ~14% for Pr(unvax complete|click) from the pilots, but we do not have an estimate for Pr(unvax website click|complete). I have used 20% as the starting point for the best hypothesized treatment. The hypothesized treatment effect sizes are 2 to 4 percentage points between each treatment, so Pr(unvax complete|click) ranges from 4 to 14% and Pr(unvax website click|complete) ranges from 10 to 20% based on treatment. Add sentence about mean r(unvax useful free text|complete) .

Key Takeaways

  • We are well-powered for the outcome Pr(unvax complete|click) starting at between 30k and 50k clicks, depending on number of treatment arms and hypotheses structure, which would be an estimated cost of ~4-7k at 0.146 per click estimated by the most recent pilot series.
  • We are not well-powered for the outcome Pr(unvax website click|complete) even at 500k clicks, which would be an estimated cost of ~73k.
  • Add sentence about key takeway for Pr(unvax useful free text|complete) .

Question: are the baseline and hypothesized treatment effect sizes for Pr(unvax website click|complete) reasonable? If so, do we need to drop this outcome?

5 Treatment arms

To assess the power for testing multiple correlated hypotheses, I simulate data, test the hypotheses, correct the p-values, and repeat for r repetitions, then calculate the proportion of repetitions in which I reject each hypothesis. The variables needed in the simulated data are the treatment assignment and the two outcomes: completion and clicking on the website link. In particular, we are interested in unvaccinated completes and unvaccinated website clicks. The sample for completion are clicks, while the sample for clicking on the website link are completes.

The data generation process is as follows:

  • Assign each observation to a treatment group. (treatment variable)
  • Take a draw from a binomial distribution with probability equal to the hypothesized unvax completion probability for the observation’s treatment group. (completion variable)
  • For observations who completed the survey, take a draw from a binomial distribution with probability equal to the hypothesized unvax website click probability for the observation’s treatment group. (webclick variable)

Power is calculated using the following parameters:

  • \(\alpha=0.05\)
  • replications \(r=1000\)
  • number of observations (clicks) \(n=\{50k, 75k,100k,125k,150k\}\) for a cost of \(\{11k,15k,22k,29k\}\) at $0.146 per click.

Results are reported for uncorrected p-values and for p-values corrected using Bonferonni, Holm, and Hochberg.

Hypothesis

Hypothesis 1. Participants in Treatment 4 provide more information, higher-quality information, and are more receptive to new information about the vaccine compared to participants in Treatment 1.

Hypothesis 2. Participants in Treatment 3 provide more information, higher-quality information, and are more receptive to new information about the vaccine compared to participants in Treatment 2.

Hypothesis 3. Participants in Treatment 4 provide more information, higher-quality information, and are more receptive to new information about the vaccine compared to participants in Treatment 3.

Hypothesis 4. Participants in Treatment 4 provide more information, higher-quality information, and are more receptive to new information about the vaccine compared to participants in Treatment 5.

Hypothesis 5. (A) Participants in Treatment 2 provide more information, higher-quality information, and are more receptive to new information about the vaccine compared to participants in Treatment 1. (B) Moreover, the difference between participants in Treatment 2 and participants in Treatment 1 is greater than the difference between participants in Treatment 4 and participants in Treatment 5.

set.seed(94305)

### DEFINE PARAMETERS ###

n <- seq(50000, 150000, 25000)

# outcomes
outcomes <- matrix(c(   0.08,   0.12,   0.14,   0.16,   0.14, # Pr(unvax complete|click)
                 0.14,  0.19,   0.22,   0.25,   0.22, # Pr(unvax web click|complete)
                 0.44,  0.48,   0.5,    0.52,   0.5), # Pr(useful free text| complete)
                  nrow = 3, byrow = T)


outcomes2 <-  matrix(c( 0.04,   0.12,   0.16,   0.2,    0.16, # Pr(unvax complete|click)
                 0.05,  0.15,   0.2,    0.25,   0.2, # Pr(unvax web click|complete)
                 0.35,  0.45,   0.5,    0.55,   0.5), # Pr(useful free text| complete)
                  nrow = 3, byrow = T)

# example second outcomes matrix
outcomes3 <-  matrix(c( 0.04,   0.12,   0.16,   0.2,    0.14, # Pr(unvax complete|click)
                 0.22,  0.29,   0.32,   0.35,   0.32,
                    0.3,    0.4,    0.45,   0.55,   0.5), # Pr(useful free text| complete)
                  nrow = 3, byrow = T)


outcomes_list <- list(outcomes, outcomes2, outcomes3)

# treatments
t_list <- lapply(outcomes_list, FUN=function(x){c(1:ncol(x))})

# alpha
alpha <- 0.05

# number of replications
r <- 1000

### DEFINE FUNCTION ###

trial <- function(N, outcomes, t){
  
  ### Simulate Data
  
  # create dataframe
  d <- data.frame(matrix(nrow = N, ncol = 0))
  
  # treatment assignment
  d$treat <- sample(t, N, replace=T, prob = c(3/14,3/14,2/14,4/14,2/14)) 

  
  # generate data with completion outcome
  d <- d %>% 
    mutate(complete = rbinom(n(),1,outcomes[1,treat])) 
  
  d_complete <- d %>% 
    group_by(treat) %>% 
    summarise(
      complete = sum(complete),
      clicks = n()
    )
  
  # generate data with website click outcome
  d_website <- d %>% 
    filter(complete == 1) %>% 
    mutate(webclick = rbinom(n(),1,outcomes[2,treat])) %>% 
    group_by(treat) %>% 
    summarise(
      webclick = sum(webclick),
      completes = n()
    )
  
  # generate data with useful information outcome
  d_useful <- d %>% 
    filter(complete == 1) %>% 
    mutate(useful = rbinom(n(),1,outcomes[3,treat]))  %>% 
    group_by(treat) %>% 
    summarise(
      useful = sum(useful), 
      completes = n()
    )
  
  
  ### Calculate p-values
  
  # create matrix for p-values
  p_values <- matrix(nrow = nrow(outcomes),ncol = length(t) * (length(t) - 1)/2)
  
  # calculate for completions outcome
  d_subset <- cbind(d_complete[,2], d_complete[,3])
  p_values[1,] <- na.omit(as.numeric(pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                                                  p.adj = "none")$p.value))
  
  # calculate for completions outcome
  d_subset <- cbind(d_website[,2], d_website[,3])
  p_values[2,] <- na.omit(as.numeric(pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                                                  p.adj = "none")$p.value))
  
   # calculate for useful outcome
  d_subset <- cbind(d_useful[,2], d_useful[,3])
  p_values[3,] <- na.omit(as.numeric(pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                                                  p.adj = "none")$p.value))
  
  ### MHT corrections
  
  return(rbind(p.adjust(p_values[1,], method = "none"),
               p.adjust(p_values[1,], method = "bonferroni"),
               p.adjust(p_values[1,], method = "holm"),
               p.adjust(p_values[1,], method = "hochberg"),
               p.adjust(p_values[2,], method = "none"),
               p.adjust(p_values[2,], method = "bonferroni"),
               p.adjust(p_values[2,], method = "holm"),
               p.adjust(p_values[2,], method = "hochberg"),
               p.adjust(p_values[3,], method = "none"),
               p.adjust(p_values[3,], method = "bonferroni"),
               p.adjust(p_values[3,], method = "holm"),
               p.adjust(p_values[3,], method = "hochberg")))
}


### POWER CALCULATIONS ###

for (i in n){
  for(o in 1:length(outcomes_list)){
    treatments <- t_list[[o]]
    outcome_matrix  <- outcomes_list[[o]]
    
    # complete r replications of n trials
    results <- replicate(r,  trial(i, outcomes = outcome_matrix, t = treatments))
    
    # create matrix to store power calculations
    power_calc <- matrix(NA, 
                         nrow = nrow(outcome_matrix)*4, 
                         ncol = (length(treatments) * (length(treatments) - 1)/2)) 
    
    # for each pairwise treatment comparison
    for(j in 1:(length(treatments) * (length(treatments) - 1)/2)){
      
      # for each mht correction
      for(k in 1:4){
        
        # calculate power for completions outcome
        power_calc[k, j] <- sum(results[k, j, ] < alpha)/r
        
        # calculate power for website clicks outcome
        power_calc[4+k,j] <- sum(results[k+4,j,]<alpha)/r
        
        # calculate power for useful free text
        power_calc[8+k,j] <- sum(results[k+8,j,]<alpha)/r
        
      }
    }
    
    power_calc <- data.frame(outcome = c(rep('Unvax completes|Click',4),
                           rep('Unvax web clicks|Complete', 4),
                           rep('Unvax useful freetext|Complete', 4)), 
                         test = rep(c("None","Bonferonni", "Holm", "Hochberg"), 3),
                         power_calc)
   
    colnames(power_calc) <- c("Outcome","Correction",  'T1 v. T2', 'T1 v. T3', 'T1 v. T4','T1 v. T5','T2 v. T3','T2 v. T4','T2 v. T5','T3 v. T4','T3 v. T5','T4 v. T5')
    
    kable(power_calc, format = "html",  caption = paste("Power Calculations: ",i/1000," Thousand Clicks", "Matrix", o), digits = 3) %>%
  kable_styling()%>%
  collapse_rows(columns = 1)%>%
      scroll_box(height = "100%")%>%
      print()
  }  
}
set.seed(94305)

### DEFINE PARAMETERS ###
# parameters defined in previous code chunk

### DEFINE FUNCTION ###

trial_subset <- function(N, outcomes, t){
  
  ### Simulate Data
  
  # create dataframe
  d <- data.frame(matrix(nrow = N, ncol = 0))
  
  # treatment assignment
  d$treat <- sample(t, N, replace=T) 
  
  # generate data with completion outcome
  d <- d %>% 
    mutate(complete = rbinom(n(),1,outcomes[1,treat])) 
  
  d_complete <- d %>% 
    group_by(treat) %>% 
    summarise(
      complete = sum(complete),
      clicks = n()
    )
  
  # generate data with website click outcome
  d_website <- d %>% 
    filter(complete == 1) %>% 
    mutate(webclick = rbinom(n(),1,outcomes[2,treat])) %>% 
    group_by(treat) %>% 
    summarise(
      webclick = sum(webclick),
      completes = n()
    )

    # generate data with useful information outcome
  d_useful <- d %>% 
    filter(complete == 1) %>% 
    mutate(useful = rbinom(n(),1,outcomes[3,treat]))  %>% 
    group_by(treat) %>% 
    summarise(
      useful = sum(useful), 
      completes = n()
    )
  
  
  ### Calculate p-values
  
  # create matrix for p-values
  p_values <- matrix(nrow = nrow(outcomes),ncol = length(t) - 1)
  
  # calculate for completions outcome
  d_subset <- cbind(d_complete[,2], d_complete[,3])
  p <- pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                      p.adj = "none")$p.value
  p_values[1,] <-  diag(p)
  
  # calculate for webclicks outcome
  d_subset <- cbind(d_website[,2], d_website[,3])
  p <- pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                      p.adj = "none")$p.value
  p_values[2,] <- diag(p)
  
  # calculate for useful information outcome
  d_subset <- cbind(d_useful[,2], d_useful[,3])
  p <- pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                      p.adj = "none")$p.value
  p_values[3,] <- diag(p)
  
  
  ### MHT corrections
  
  return(rbind(p.adjust(p_values[1,], method = "none"),
               p.adjust(p_values[1,], method = "bonferroni"),
               p.adjust(p_values[1,], method = "holm"),
               p.adjust(p_values[1,], method = "hochberg"),
               p.adjust(p_values[2,], method = "none"),
               p.adjust(p_values[2,], method = "bonferroni"),
               p.adjust(p_values[2,], method = "holm"),
               p.adjust(p_values[2,], method = "hochberg"),
               p.adjust(p_values[3,], method = "none"),
               p.adjust(p_values[3,], method = "bonferroni"),
               p.adjust(p_values[3,], method = "holm"),
               p.adjust(p_values[3,], method = "hochberg")))
}

### POWER CALCULATIONS ###

for (i in n){
  for(o in 1:length(outcomes_list)){
    treatments <- t_list[[o]]
    outcome_matrix  <- outcomes_list[[o]]
    
    # complete r replications of n trials
    results <- replicate(r,  trial_subset(i, outcomes = outcome_matrix, t = treatments))
    
    # create matrix to store power calculations
    power_calc <- matrix(NA, 
                         nrow = nrow(outcome_matrix)*4, 
                         ncol = length(treatments)-1) 
    
    # for each pairwise treatment comparison
    # for each pairwise treatment comparison
    for(j in 1:(length(treatments) - 1)){
      
      # for each mht correction
      for(k in 1:4){
        
        # calculate power for completions outcome
        power_calc[k, j] <- sum(results[k, j, ] < alpha)/r
        
        # calculate power for website clicks outcome
        power_calc[4+k, j] <- sum(results[k+4, j, ] < alpha)/r
        
        # calculate power for useful informaton  outcome
        power_calc[8+k, j] <- sum(results[k+8, j, ] < alpha)/r
        
      }
    }
    
    # Generate Graph
  
    
        power_calc <- data.frame(outcome = c(rep('Unvax completes|Click',4),
                           rep('Unvax web clicks|Complete', 4),
                           rep('Unvax useful freetext|Complete', 4)), 
                         test = rep(c("None","Bonferonni", "Holm", "Hochberg"), 3),
                         power_calc)
   
    colnames(power_calc) <- c("Outcome","Correction", 
                              'T1 v. T2', 'T2 v. T3','T3 v. T4','T4 v. T5')
 
      
    kable(power_calc,  format = "html", caption = paste("Power Calculations: ",i/1000," Thousand Clicks", "Matrix", o), digits = 3)%>%
  kable_styling()%>%
  collapse_rows(columns = 1)%>%
      scroll_box(height = "100%")%>%
      print()
  }  
}
set.seed(94305)
### DEFINE PARAMETERS ###

n <- seq(50000, 150000, 25000)

# outcomes
outcomes <- matrix(c(   0.08,   0.12,   0.14,   0.16,   0.14, # Pr(unvax complete|click)
                 0.14,  0.19,   0.22,   0.25,   0.22, # Pr(unvax web click|complete)
                 0.44,  0.48,   0.5,    0.52,   0.5), # Pr(useful free text| complete)
                  nrow = 3, byrow = T)


outcomes2 <-  matrix(c( 0.04,   0.12,   0.16,   0.2,    0.16, # Pr(unvax complete|click)
                 0.05,  0.15,   0.2,    0.25,   0.2, # Pr(unvax web click|complete)
                 0.35,  0.45,   0.5,    0.55,   0.5), # Pr(useful free text| complete)
                  nrow = 3, byrow = T)

outcomes3 <-  matrix(c( 0.04,   0.12,   0.16,   0.2,    0.14, # Pr(unvax complete|click)
                 0.22,  0.29,   0.32,   0.35,   0.32,
                    0.3,    0.4,    0.45,   0.55,   0.5), # Pr(useful free text| complete)
                  nrow = 3, byrow = T)

outcomes_list <- list(outcomes, outcomes2, outcomes3)

# treatments
t_list <- lapply(outcomes_list, FUN=function(x){c(1:ncol(x))})

# alpha
alpha <- 0.05

# number of replications
r <- 1000

### DEFINE FUNCTION ###

trial_subset <- function(N, outcomes, t){
  
  ### Simulate Data
  
  # create dataframe
  d <- data.frame(matrix(nrow = N, ncol = 0))
  
  # treatment assignment
  d$treat <- sample(t, N, replace=T, prob = c(3/14,3/14,2/14,4/14,2/14)) 
  
  # generate data with completion outcome
  d <- d %>% 
    mutate(complete = rbinom(n(),1,outcomes[1,treat])) 
  
  d_complete <- d %>% 
    group_by(treat) %>% 
    summarise(
      complete = sum(complete),
      clicks = n()
    )

#  d_complete2 <- d %>%
#    filter(treat != 3) %>%
#    mutate(t_1 = ifelse(treat == 1, 1, 0), 
#           t_2 = ifelse(treat == 2, 1, 0),
#           #t_3 = ifelse(treat == 3, 1, 0),
#           t_4 = ifelse(treat == 4, 1, 0),
#           t_5 = ifelse(treat == 5, 1, 0)) %>%
 #   select(!treat)
  
#  d_lm <- lm(complete~ . -1, d_complete2)
  
#  summary(d_lm)
 # anova(d_lm)
 
  
  # generate data with website click outcome
  d_website <- d %>% 
    filter(complete == 1) %>% 
    mutate(webclick = rbinom(n(),1,outcomes[2,treat])) %>% 
    group_by(treat) %>% 
    summarise(
      webclick = sum(webclick),
      completes = n()
    )

    # generate data with useful information outcome
  d_useful <- d %>% 
    filter(complete == 1) %>% 
    mutate(useful = rbinom(n(),1,outcomes[3,treat]))  %>% 
    group_by(treat) %>% 
    summarise(
      useful = sum(useful), 
      completes = n()
    )
  
  
  ### Calculate p-values
  
  # create matrix for p-values
  p_values <- matrix(nrow = nrow(outcomes),ncol = length(t))
  
  # calculate for completions outcome
  d_subset <- cbind(d_complete[,2], d_complete[,3])
  p <- pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                      p.adj = "none")$p.value
  
  
  p_values[1,] <- c(p["4", "1"], 
                    p["3", "2"], 
                    p["4", "3"], 
                    p["5", "4"], 
                    p["2", "1"])
  
  # calculate for webclicks outcome
  d_subset <- cbind(d_website[,2], d_website[,3])
  p <- pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                      p.adj = "none")$p.value
  p_values[2,] <- c(p["4", "1"], 
                    p["3", "2"], 
                    p["4", "3"], 
                    p["5", "4"], 
                    p["2", "1"])
  
  # calculate for useful information outcome
  d_subset <- cbind(d_useful[,2], d_useful[,3])
  p <- pairwise.prop.test(d_subset[,1], d_subset[,2], 
                                      p.adj = "none")$p.value
  
  p_values[3,] <- c(p["4", "1"], 
                    p["3", "2"], 
                    p["4", "3"], 
                    p["5", "4"], 
                    p["2", "1"])
  
  
  ### MHT corrections
  
  return(rbind(p.adjust(p_values[1,], method = "none"),
               p.adjust(p_values[1,], method = "bonferroni"),
               p.adjust(p_values[1,], method = "holm"),
               p.adjust(p_values[1,], method = "hochberg"),
               p.adjust(p_values[2,], method = "none"),
               p.adjust(p_values[2,], method = "bonferroni"),
               p.adjust(p_values[2,], method = "holm"),
               p.adjust(p_values[2,], method = "hochberg"),
               p.adjust(p_values[3,], method = "none"),
               p.adjust(p_values[3,], method = "bonferroni"),
               p.adjust(p_values[3,], method = "holm"),
               p.adjust(p_values[3,], method = "hochberg")))
}

### POWER CALCULATIONS ###

for (i in n){
  for(o in 1:length(outcomes_list)){
    treatments <- t_list[[o]]
    outcome_matrix  <- outcomes_list[[o]]
    
    # complete r replications of n trials
    results <- replicate(r,  trial_subset(i, outcomes = outcome_matrix, t = treatments))
    
    # create matrix to store power calculations
    power_calc <- matrix(NA, 
                         nrow = nrow(outcome_matrix)*4, 
                         ncol = length(treatments)) 
    
    # for each pairwise treatment comparison
    # for each pairwise treatment comparison
    for(j in 1:(length(treatments))){
      
      # for each mht correction
      for(k in 1:4){
        
        # calculate power for completions outcome
        power_calc[k, j] <- sum(results[k, j, ] < alpha)/r
        
        # calculate power for website clicks outcome
        power_calc[4+k, j] <- sum(results[k+4, j, ] < alpha)/r
        
        # calculate power for useful informaton  outcome
        power_calc[8+k, j] <- sum(results[k+8, j, ] < alpha)/r
        
      }
    }
    
    # Generate Graph
  
    
        power_calc <- data.frame(outcome = c(rep('Unvax completes|Click',4),
                           rep('Unvax web clicks|Complete', 4),
                           rep('Unvax useful freetext|Complete', 4)), 
                         test = rep(c("None","Bonferonni", "Holm", "Hochberg"), 3),
                         power_calc)
   
    colnames(power_calc) <- c("Outcome","Correction", 
                              'T1 v. T4', 'T2 v. T3','T3 v. T4','T4 v. T5', 'T1 vs T2')
 
      
    kable(power_calc,  format = "html", caption = paste("Power Calculations: ",i/1000," Thousand Clicks", "Matrix", o), digits = 3)%>%
  kable_styling()%>%
  collapse_rows(columns = 1)%>%
      scroll_box(height = "100%")%>%
      print()
  }  
}
Power Calculations: 50 Thousand Clicks Matrix 1
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1.000 0.972 0.966 0.961 1.000
Bonferonni 1.000 0.910 0.896 0.889 1.000
Holm 1.000 0.971 0.962 0.955 1.000
Hochberg 1.000 0.971 0.964 0.957 1.000
Unvax web clicks|Complete None 1.000 0.393 0.470 0.444 0.852
Bonferonni 1.000 0.193 0.236 0.230 0.671
Holm 1.000 0.255 0.313 0.301 0.722
Hochberg 1.000 0.261 0.322 0.309 0.730
Unvax useful freetext|Complete None 0.978 0.142 0.176 0.154 0.435
Bonferonni 0.926 0.042 0.054 0.057 0.216
Holm 0.926 0.048 0.069 0.073 0.240
Hochberg 0.927 0.051 0.071 0.076 0.243
Power Calculations: 50 Thousand Clicks Matrix 2
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.907 0.927 0.921 1.000
Bonferonni 1 0.770 0.765 0.765 1.000
Holm 1 0.896 0.917 0.901 1.000
Hochberg 1 0.901 0.920 0.909 1.000
Unvax useful freetext|Complete None 1 0.675 0.801 0.808 0.943
Bonferonni 1 0.417 0.600 0.607 0.851
Holm 1 0.612 0.737 0.732 0.919
Hochberg 1 0.631 0.752 0.751 0.925
Power Calculations: 50 Thousand Clicks Matrix 3
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.351 0.452 0.398 0.815
Bonferonni 1 0.146 0.216 0.189 0.578
Holm 1 0.201 0.280 0.259 0.648
Hochberg 1 0.215 0.299 0.280 0.666
Unvax useful freetext|Complete None 1 0.677 1.000 0.772 0.952
Bonferonni 1 0.437 1.000 0.543 0.861
Holm 1 0.640 1.000 0.724 0.938
Hochberg 1 0.648 1.000 0.735 0.945
Power Calculations: 75 Thousand Clicks Matrix 1
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1.000 0.998 0.995 0.997 1.000
Bonferonni 1.000 0.986 0.977 0.983 1.000
Holm 1.000 0.998 0.995 0.997 1.000
Hochberg 1.000 0.998 0.995 0.997 1.000
Unvax web clicks|Complete None 1.000 0.557 0.624 0.636 0.954
Bonferonni 1.000 0.328 0.376 0.380 0.868
Holm 1.000 0.450 0.508 0.511 0.924
Hochberg 1.000 0.483 0.545 0.538 0.931
Unvax useful freetext|Complete None 0.997 0.213 0.233 0.265 0.599
Bonferonni 0.987 0.070 0.096 0.110 0.325
Holm 0.987 0.086 0.113 0.134 0.367
Hochberg 0.987 0.092 0.116 0.139 0.369
Power Calculations: 75 Thousand Clicks Matrix 2
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.969 0.984 0.985 1.000
Bonferonni 1 0.913 0.940 0.954 1.000
Holm 1 0.969 0.984 0.985 1.000
Hochberg 1 0.969 0.984 0.985 1.000
Unvax useful freetext|Complete None 1 0.833 0.954 0.947 0.996
Bonferonni 1 0.655 0.837 0.813 0.966
Holm 1 0.824 0.945 0.932 0.993
Hochberg 1 0.826 0.948 0.937 0.993
Power Calculations: 75 Thousand Clicks Matrix 3
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.480 0.589 0.540 0.935
Bonferonni 1 0.231 0.350 0.300 0.801
Holm 1 0.343 0.453 0.408 0.855
Hochberg 1 0.366 0.479 0.430 0.867
Unvax useful freetext|Complete None 1 0.865 1.000 0.919 0.995
Bonferonni 1 0.675 1.000 0.783 0.976
Holm 1 0.860 1.000 0.907 0.994
Hochberg 1 0.863 1.000 0.910 0.994
Power Calculations: 100 Thousand Clicks Matrix 1
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1.000 1.000 0.998 1.000 1.000
Bonferonni 1.000 0.997 0.997 0.997 1.000
Holm 1.000 1.000 0.998 1.000 1.000
Hochberg 1.000 1.000 0.998 1.000 1.000
Unvax web clicks|Complete None 1.000 0.669 0.755 0.744 0.990
Bonferonni 1.000 0.445 0.532 0.508 0.956
Holm 1.000 0.607 0.677 0.663 0.977
Hochberg 1.000 0.629 0.699 0.690 0.983
Unvax useful freetext|Complete None 0.999 0.265 0.305 0.299 0.732
Bonferonni 0.999 0.102 0.132 0.134 0.506
Holm 0.999 0.126 0.170 0.167 0.553
Hochberg 0.999 0.128 0.173 0.169 0.554
Power Calculations: 100 Thousand Clicks Matrix 2
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.993 0.997 0.999 1.000
Bonferonni 1 0.980 0.987 0.988 1.000
Holm 1 0.992 0.997 0.998 1.000
Hochberg 1 0.993 0.997 0.999 1.000
Unvax useful freetext|Complete None 1 0.931 0.978 0.977 0.999
Bonferonni 1 0.813 0.931 0.923 0.994
Holm 1 0.929 0.977 0.975 0.999
Hochberg 1 0.929 0.978 0.976 0.999
Power Calculations: 100 Thousand Clicks Matrix 3
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.591 0.706 0.657 0.978
Bonferonni 1 0.369 0.476 0.414 0.918
Holm 1 0.501 0.618 0.564 0.956
Hochberg 1 0.526 0.638 0.586 0.964
Unvax useful freetext|Complete None 1 0.946 1.000 0.966 1.000
Bonferonni 1 0.847 1.000 0.894 0.997
Holm 1 0.944 1.000 0.962 0.999
Hochberg 1 0.946 1.000 0.965 1.000
Power Calculations: 125 Thousand Clicks Matrix 1
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.805 0.832 0.820 0.999
Bonferonni 1 0.590 0.645 0.633 0.992
Holm 1 0.766 0.799 0.776 0.998
Hochberg 1 0.775 0.812 0.791 0.998
Unvax useful freetext|Complete None 1 0.302 0.380 0.369 0.820
Bonferonni 1 0.130 0.190 0.174 0.633
Holm 1 0.157 0.243 0.234 0.681
Hochberg 1 0.163 0.252 0.243 0.687
Power Calculations: 125 Thousand Clicks Matrix 2
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1
Bonferonni 1 1.000 1.000 1.000 1
Holm 1 1.000 1.000 1.000 1
Hochberg 1 1.000 1.000 1.000 1
Unvax web clicks|Complete None 1 1.000 1.000 0.999 1
Bonferonni 1 0.995 0.997 0.996 1
Holm 1 1.000 1.000 0.999 1
Hochberg 1 1.000 1.000 0.999 1
Unvax useful freetext|Complete None 1 0.982 0.997 0.990 1
Bonferonni 1 0.916 0.966 0.965 1
Holm 1 0.982 0.997 0.990 1
Hochberg 1 0.982 0.997 0.990 1
Power Calculations: 125 Thousand Clicks Matrix 3
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.668 0.824 0.762 0.997
Bonferonni 1 0.447 0.617 0.543 0.977
Holm 1 0.612 0.763 0.693 0.995
Hochberg 1 0.628 0.777 0.712 0.995
Unvax useful freetext|Complete None 1 0.976 1.000 0.989 1.000
Bonferonni 1 0.903 1.000 0.951 1.000
Holm 1 0.975 1.000 0.988 1.000
Hochberg 1 0.976 1.000 0.989 1.000
Power Calculations: 150 Thousand Clicks Matrix 1
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.847 0.892 0.896 1.000
Bonferonni 1 0.685 0.726 0.739 0.997
Holm 1 0.830 0.871 0.869 1.000
Hochberg 1 0.836 0.875 0.878 1.000
Unvax useful freetext|Complete None 1 0.346 0.449 0.438 0.879
Bonferonni 1 0.148 0.219 0.215 0.721
Holm 1 0.215 0.285 0.288 0.764
Hochberg 1 0.229 0.295 0.298 0.775
Power Calculations: 150 Thousand Clicks Matrix 2
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1
Bonferonni 1 1.000 1.000 1.000 1
Holm 1 1.000 1.000 1.000 1
Hochberg 1 1.000 1.000 1.000 1
Unvax web clicks|Complete None 1 1.000 1.000 1.000 1
Bonferonni 1 1.000 1.000 1.000 1
Holm 1 1.000 1.000 1.000 1
Hochberg 1 1.000 1.000 1.000 1
Unvax useful freetext|Complete None 1 0.991 1.000 0.997 1
Bonferonni 1 0.955 0.996 0.991 1
Holm 1 0.991 1.000 0.997 1
Hochberg 1 0.991 1.000 0.997 1
Power Calculations: 150 Thousand Clicks Matrix 3
Outcome Correction T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5 T1 vs T2
Unvax completes|Click None 1 1.000 1.000 1.000 1.000
Bonferonni 1 1.000 1.000 1.000 1.000
Holm 1 1.000 1.000 1.000 1.000
Hochberg 1 1.000 1.000 1.000 1.000
Unvax web clicks|Complete None 1 0.817 0.880 0.836 0.999
Bonferonni 1 0.591 0.697 0.627 0.989
Holm 1 0.784 0.853 0.790 0.999
Hochberg 1 0.798 0.870 0.813 0.999
Unvax useful freetext|Complete None 1 0.992 1.000 0.998 1.000
Bonferonni 1 0.961 1.000 0.990 1.000
Holm 1 0.992 1.000 0.998 1.000
Hochberg 1 0.992 1.000 0.998 1.000

Romano-Wolf Correction

We run power calculations on a specific set of outcomes using the Romano-Wolf correction. We will use the Romano-Wolf correction later in experimental analysis to correct for multiple hypothesis. The specific outcomes we are testing are:

  • List specific tests for Romano-Wolf

N is the number of observations in the sample

B is the number of bootstrap subsamples to draw

n defines the size of the bootstrap subsamples

r is the number of times we replicate this process to get r p-values for each hypotheses test.

In this output, N = 50000, B = 100,n=3000, and r = 100. We can adjust these later, this is just to keep run time low.

The function is divided into three parts:

  1. compute t statistic/p estimates for each combination of treatment and outcome comparisons (ie treatment A and B for outcome y1) from the full data set

  2. get B bootstrapped samples of size n from the full data set and compute a test statistic for each outcome-treatment comparisons

  3. The actual Romano-Wolf correction

  4. Now we replicate this function r times to get r estimates of pvalues

### romano wolf
romano_wolf <- function(N, B, n){
# number of observations for simulated data set
 # N = 50000
  
  # number of bootstrap subsamples to draw
  #B= 5000
  
  # size of bootstrap subsamples to draw
  #n = 5000

# outcomes
  outcomes <- matrix(c(0.04, 0.08, 0.10, 0.14, 0.12, # Pr(unvax complete|click)
                       0.12, 0.16, 0.18, 0.20, 0.18, # Pr(unvax web click|complete)
                       1- 0.13, 1- 0.15, 1- 0.20, 1-0.17, 1-0.21), # Pr(useful free text| complete)
                     nrow = 3, byrow = T)
  # treatments
  t <- 1:ncol(outcomes)
  
  # make data frame
  df <- data.frame(matrix(nrow = N, ncol = 0))
  
  # treatment assignment
  df$treat <- sample(t, N, replace=T, c(3/14,3/14,2/14,4/14,2/14)) 
  
  # generate data with completion outcome
  df <- df %>% 
    mutate(complete = rbinom(n(),1,outcomes[1,treat])) 
  
  completes <- df %>%
    filter(complete ==1) %>%
    mutate(webclick = rbinom(n(),1,outcomes[2,treat]), 
           useful = rbinom(n(),1,outcomes[3,treat]))

  noncompletes <- df %>%
    filter(complete ==0) %>%
    mutate(webclick = NA,
           useful = NA)
  
  
  d <-rbind(completes, noncompletes) %>% 
    .[sample(1:nrow(.)), ] #random order
  

  
  ### Romano Wolf Starts Here
  
  # Say that we are interested in testing B vs A, C vs A, D vs A, C vs B, D vs B, and D vs C for each of the outcomes.
  # That makes a total of 18 hypotheses.
  # We will need the estimates and t-stats (or z-stats) of each parameter about which we are conducting an hypothesis test.
  # Eventually, we will also need the standard errors #
  
  
  comparisons = combn(t,2)[, c(3, 5, 8, 10, 1)]
  num_hyp = nrow(outcomes) * ncol(comparisons)
  
  orig_results = data.frame(matrix(NA,num_hyp,3))
  colnames(orig_results) = c("estimates","ts","p_val")
  
  
  ### step 1: compute t statistic/p estimates for each combination of treatment and outcome comparisons (ie treatment A and B for outcome y1)  from the full data set
  
  i = 1
  for (y in c("complete", "webclick", "useful")){
    for (comp in 1:ncol(comparisons)){
      # We create the name of the hypothesis we are testing, for example y1_A_vs_B #
      rownames(orig_results)[i] = paste(y,"_",comparisons[1,comp],"_vs_",comparisons[2,comp],sep="")
      # Run the t.test and extract the estimate and t-statistic #
      
      data <- d %>%
        select(all_of(c("treat", y))) %>%
        drop_na()
      
      test = t.test(data[data$treat==comparisons[1,comp],str_remove(string  = y, pattern = "d_")],
                    data[data$treat==comparisons[2,comp],str_remove(string  = y, pattern = "d_")])
      
      orig_results[i,"estimates"] = test$estimate[1] - test$estimate[2] 
      # We use absolute values for the t-statistic since we want to test two-sided hypothesis # 
      orig_results[i,"ts"] = abs(test$statistic)
      # Also get the p-values to compare
      orig_results[i,"p_val"] = test$p.value
      i = i+1}}
  
  # We sort the hypotheses by the size of the original t-statistic #
  sorted_orig = orig_results[order(-orig_results$ts),]
  # We check if we reject the null hypothesis using the unadjusted p-values
  alpha = 0.05
  sorted_orig$reject = sorted_orig$p_val < alpha
  sorted_orig
  
  
  ### bootstrapping
  
  bootstrapped_statistics = data.frame(matrix(NA,B,nrow(sorted_orig)))
  # Important, we keep the order of the sorted set of hypotheses #
  colnames(bootstrapped_statistics) = rownames(sorted_orig)
  
  #set.seed(777)
  # In each of the bootstrap rounds we do the following:
  for (b in 1:B){

      data <-  d
      
      # We sample with replacement the indices which will make up our bootstrap sample
      idx = sample(1:nrow(data),n, replace = TRUE)
  
      data_boot = data[idx,]
      
    # We repeat exactly the same procedure as before to get the bootstrap estimates and standard errors 
    for (y in c("complete","webclick","useful")){
      
      # filter to whichever outcome we are looking at 
      data_boot_temp <- data_boot %>%
        select(all_of(c("treat", y))) %>%
        drop_na
      
      
      for (comp in 1:ncol(comparisons)){
        # By getting the name of the hypothesis we are testing we don't need to worry about order, we match by name
        hyp = paste(y, "_", comparisons[1,comp],"_vs_",comparisons[2,comp],sep="")
        # Run the t.test on the bootstrap sample
        test = t.test(data_boot_temp[data_boot_temp$treat==comparisons[2,comp],str_remove(y, "d_")],data_boot_temp[data_boot_temp$treat==comparisons[1,comp],str_remove(y, "d_")])
        # The null statistic is computed by subtracting the original estimate from the bootstrap sample and dividing
        # by the standard error of the bootstraped estimate
        # Remember the absolute value since we are testing a two-sided hypothesis
        bootstrapped_statistics[b,hyp] = abs((test$estimate[1] - test$estimate[2] - sorted_orig[hyp,"estimates"])/test$stderr)}
    }
      
  }

  ### The actual Romano-Wolf Correction Step
  
  # Now we compute the "max" statistics and the empirical quantiles #
  max_stats = data.frame(matrix(NA,B,num_hyp))
  colnames(max_stats) = colnames(bootstrapped_statistics)
  for (h in 1:(num_hyp-1)){
    max_stats[,h] = apply(bootstrapped_statistics[,h:num_hyp], MARGIN=1, FUN=max)}
  max_stats[,num_hyp] = bootstrapped_statistics[,num_hyp]
  
  # And the empirical quantiles given the significance level #
  Cs = unname(apply(max_stats, MARGIN=2, FUN=quantile, probs=c(1-alpha)))
  
  # And finally we run the rejection algorithm
  sorted_orig$adjust_rejected = NA
  # R will index the hypothesis we need to test at each round
  R = 1
  # r will be the counter of rejected hypotheses at each round
  r = Inf
  # The while loop stops if no hypotheses are rejected in a round or if we reject all.
  while (r > 0 & R <= num_hyp){
    # We compare the original t-statistics of the hypotheses we haven't tested at each round with the quantiles
    sorted_orig$adjust_rejected[R:num_hyp] = (sorted_orig$ts[R:num_hyp] > Cs[R])
    r = sum(sorted_orig$adjust_rejected[R:num_hyp])
    R = r + R}

  sorted_orig[c("reject","adjust_rejected")]
  
  
  sorted_orig$adj_p_val = NA
  sorted_orig$adj_p_val[1] = (sum(max_stats[,1]>=sorted_orig$ts[1])+1)/(B+1)
  for (h in 2:num_hyp){
    sorted_orig$adj_p_val[h] = max((sum(max_stats[,h]>=sorted_orig$ts[h])+1)/(B+1), sorted_orig$adj_p_val[h-1])}
  
  
  round(sorted_orig[c("p_val","adj_p_val")],5)
  
  romano_pvalues <-  as.data.frame(sorted_orig) %>% 
    #rownames_to_column(var = "comparison") %>% 
    select( p_val, adj_p_val) %>% 
    arrange(rownames(.)) %>% as.matrix()

  return(romano_pvalues)
}
r=100
results <- replicate(r,  romano_wolf(N = 50000,B = 100, n=3000))
 # create matrix to store power calculations
    power_calc <- matrix(NA, 
                         nrow = dim(results)[1], 
                         ncol = dim(results)[2]) 
    
    # for each pairwise treatment comparison
    # for each pairwise treatment comparison
    for(j in 1:(ncol(power_calc))){
      
      # for each mht correction
      for(k in 1:nrow(power_calc)){
        
        # calculate power for completions outcome
        power_calc[k, j] <- sum(results[k, j, ] < alpha)/r
        
        # calculate power for website clicks outcome
       # power_calc[4+k, j] <- sum(results[k+4, j, ] < alpha)/r
        
        # calculate power for useful informaton  outcome
        #power_calc[8+k, j] <- sum(results[k+8, j, ] < alpha)/r
        
      }
    }
    
    # Generate Graph
  

    rownames(power_calc) <- rownames(results[, , 1])
    colnames(power_calc) <- colnames(results[, , 1])
    
    power_calc <- power_calc %>%
      as.data.frame() %>%
      rownames_to_column(var = "comparison") %>%
      mutate(treatments = str_extract(string = comparison, pattern = "[0-9]\\_vs\\_[0-9]"), 
             outcomes = str_remove(string = comparison, pattern = paste("_", treatments, sep= ""))) %>%
      select(outcomes, p_val, adj_p_val, treatments) %>% 
      pivot_longer(cols = c("p_val", "adj_p_val"), names_to = "p_val_type")%>%
      pivot_wider(names_from = treatments, values_from  = value) %>%
      mutate(outcomes = factor(outcomes, levels = c("complete", "webclick", "useful"))) %>% 
      arrange(outcomes) %>%
      mutate(outcomes = c(rep('Unvax completes|Click',2), rep('Unvax web clicks|Complete', 2),
                           rep('Unvax useful freetext|Complete', 2)), 
             p_val_type = ifelse(p_val_type == "p_val", "Romano-Wolf", "Adj. Romano-Wolf")) %>%
      rename(correction = p_val_type)
     
    

   
      colnames(power_calc) <- c("Outcome","Correction",  'T1 v. T2',  'T1 v. T4', 'T2 v. T3','T3 v. T4', 'T4 v. T5')
 
      i = 50000
      o = 1
    kable(power_calc,  format = "html", caption = paste("Power Calculations: ",i/1000," Thousand Clicks", "Matrix", o), digits = 3)%>%
  kable_styling()%>%
  collapse_rows(columns = 1)%>%
      scroll_box(height = "100%")
Power Calculations: 50 Thousand Clicks Matrix 1
Outcome Correction T1 v. T2 T1 v. T4 T2 v. T3 T3 v. T4 T4 v. T5
Unvax completes|Click Romano-Wolf 1.00 1.00 1.00 1.00 0.98
Adj. Romano-Wolf 1.00 1.00 0.22 0.97 0.12
Unvax web clicks|Complete Romano-Wolf 0.42 0.94 0.21 0.31 0.27
Adj. Romano-Wolf 0.00 0.02 0.00 0.00 0.00
Unvax useful freetext|Complete Romano-Wolf 0.20 0.47 0.81 0.45 0.76
Adj. Romano-Wolf 0.00 0.00 0.00 0.00 0.01

Appendix

Sample size calculations – Bonferonni – pr(unvax complete|click)

The below figure uses the below equation to calculate the sample size needed for each treatment arm, given the parameters.

  • The outcome is the probability of an unvaccinated person completing the survey, conditional on clicking on the Facebook ad.
  • The sample for this outcome is the number of clicks.
  • The baseline of this outcome from the pilot is ~14%.
  • I consider treatment effects ranging from 10% to 30% of baseline (i.e. ~1.4% to 4.2%)
  • I correct for multiple hypothesis testing using Bonferroni.
  • For the Bonferroni correction, I assume the number of hypotheses = number of outcomes (3) x number of treatment comparisons (factorial(t)/2) for 5 or 7 treatment arms.

\[ n \geq \dfrac{4(1-p)}{\lambda^2p} \times (z_\alpha + \Phi^{-1}(1-\beta))^2 \]

### DEFINE PARAMETERS ###

# outcome 
p <- 0.14365 #from pilot 5 descriptive analytics analysis 595/4142

# standard deviation
sd <- sqrt(p * (1 - p))

# define the possible treatment effect sizes in *percent*
lambda <- seq(from = 0.1, to = 0.3, length.out = 100 )

# alpha
alpha <- 0.05

# beta
beta <- 0.2

# number of treatments
treat <- cbind(5,7)

### DEFINE FUNCTION ###

vax_power <- function(t, lambda) {
  4 * (qnorm(1-alpha/2/(3*factorial(t)/2))+(qnorm(1 - beta)))^2 * ((1 - p)/((lambda * lambda) * p))}

### CREATE PLOT ###

# create vector to hold number of observations for each value of lambda and number of treatments
n_obs <- matrix(NA, nrow = length(lambda), ncol = length(treat))

# calculate number of observations for each value of lambda
for (i in 1:length(lambda)) {
  for (j in 1:length(treat)){
    n_obs[i,j] <- vax_power(treat[j], lambda[i])
    }
  }

# create data frame that combines data to graph
df <- data.frame(cbind(n_obs,lambda))

# create the graph
ggplot(df, aes(x = lambda)) +
  geom_line(aes(y = df[,1], color = "5")) +
  geom_line(aes(y = df[,2], color = "7")) +
  labs(x = 'Treatment effect size (in percentage terms)',
       y = 'Number of clicks needed per treatment arm') +
  scale_color_manual(name = "Number of treatment arms", values = c("5" = "darkblue", "7" = "red"))

Then, we can calculate total costs based on $0.155 per click.

Let’s say instead that we only compare each treatment arm to one other treatment arm. This approach is what we have outlined in the experimental design document. Then, the below figure contains the same information as the above figure, but we are conducting fewer hypothesis tests.

### DEFINE PARAMETERS ###

# outcome 
p <- 0.14365 #from pilot 5 descriptive analytics analysis 595/4142

# standard deviation
sd <- sqrt(p * (1 - p))

# define the possible treatment effect sizes in *percent*
lambda <- seq(from = 0.1, to = 0.3, length.out = 100 )

# alpha
alpha <- 0.05

# beta
beta <- 0.2

# number of treatments
treat <- cbind(5,7)

### DEFINE FUNCTION ###

vax_power <- function(t, lambda) {
  4 * (qnorm(1-alpha/2/(3*(t-1)))+(qnorm(1 - beta)))^2 * ((1 - p)/((lambda * lambda) * p))}

### CREATE PLOT ###

# create vector to hold number of observations for each value of lambda and number of treatments
n_obs <- matrix(NA, nrow = length(lambda), ncol = length(treat))

# calculate number of observations for each value of lambda
for (i in 1:length(lambda)) {
  for (j in 1:length(treat)){
    n_obs[i,j] <- vax_power(treat[j], lambda[i])
    }
  }

# create data frame that combines data to graph
df <- data.frame(cbind(n_obs,lambda))

# create the graph
ggplot(df, aes(x = lambda)) +
  geom_line(aes(y = df[,1], color = "5")) +
  geom_line(aes(y = df[,2], color = "7")) +
  labs(x = 'Treatment effect size (in percentage terms)',
       y = 'Number of clicks needed per treatment arm') +
  scale_color_manual(name = "Number of treatment arms", values = c("5" = "darkblue", "7" = "red"))

Then, we can again calculate total costs based on $0.155 per click.

print(paste("For treatment effect of 15% over baseline, the treatment effect would need to be",round(0.15*p,4)," and you would need:"))
## [1] "For treatment effect of 15% over baseline, the treatment effect would need to be 0.0215  and you would need:"
print(paste(round(vax_power(treat,0.15),0),"clicks per treatment arm for", treat,"treatment arms, for a total number of ", round(treat*vax_power(treat,0.15),0),"clicks at a total cost of $",round(0.155*treat*vax_power(treat,0.15),0)))
## [1] "14563 clicks per treatment arm for 5 treatment arms, for a total number of  72813 clicks at a total cost of $ 11286" 
## [2] "15570 clicks per treatment arm for 7 treatment arms, for a total number of  108989 clicks at a total cost of $ 16893"

Sample size calculations – Bonferonni – pr(unvax web click|complete)

The below figure uses the previous equation to calculate the sample size needed for each treatment arm, given the parameters.

  • The outcome is the probability of an unvaccinated person clicking the website link, conditional on completing the survey.
  • So, the sample for this outcome is the number of completes.
  • We have not piloted this outcome, so I am guessing the baseline to be ~20% (this is probably a bit high, but it is also the only way to get costs in a reasonable range – pilot will inform if this is reasonable).
  • I consider treatment effects ranging from 10% to 30% of baseline (i.e. ~1% to 4%)
  • I correct for multiple hypothesis testing using Bonferroni.
  • For the Bonferroni correction, I assume the number of hypotheses = number of outcomes (3) x number of treatment comparisons (factorial(t)/2) for 5 or 7 treatment arms.
### DEFINE PARAMETERS ###

# outcome 
p <- 0.2

# standard deviation
sd <- sqrt(p * (1 - p))

# define the possible treatment effect sizes in *percent*
lambda <- seq(from = 0.1, to = 0.3, length.out = 100 )

# alpha
alpha <- 0.05

# beta
beta <- 0.2

# number of treatments
treat <- cbind(5,7)

### DEFINE FUNCTION ###

vax_power <- function(t, lambda) {
  4 * (qnorm(1-alpha/2/(3*factorial(t)/2))+(qnorm(1 - beta)))^2 * ((1 - p)/((lambda * lambda) * p))}

### CREATE PLOT ###

# create vector to hold number of observations for each value of lambda and number of treatments
n_obs <- matrix(NA, nrow = length(lambda), ncol = length(treat))

# calculate number of observations for each value of lambda
for (i in 1:length(lambda)) {
  for (j in 1:length(treat)){
    n_obs[i,j] <- vax_power(treat[j], lambda[i])
    }
  }

# create data frame that combines data to graph
df <- data.frame(cbind(n_obs,lambda))

# create the graph
ggplot(df, aes(x = lambda)) +
  geom_line(aes(y = df[,1], color = "5")) +
  geom_line(aes(y = df[,2], color = "7")) +
  labs(x = 'Treatment effect size (in percentage terms)',
       y = 'Number of completes needed per treatment arm') +
  scale_color_manual(name = "Number of treatment arms", values = c("5" = "darkblue", "7" = "red"))

Then, we can calculate total costs based on $1.30 per unvaccinated complete.

print(paste("For treatment effect of 25% over baseline, the treatment effect would need to be",round(0.25*p,4)," and you would need:"))
## [1] "For treatment effect of 25% over baseline, the treatment effect would need to be 0.05  and you would need:"
print(paste(round(vax_power(treat,0.25),0),"completes per treatment arm for", treat,"treatment arms, for a total number of ", round(treat*vax_power(treat,0.25),0),"clicks at a total cost of $",round(1.3*treat*vax_power(treat,0.25),0)))
## [1] "5131 completes per treatment arm for 5 treatment arms, for a total number of  25654 clicks at a total cost of $ 33350"
## [2] "7320 completes per treatment arm for 7 treatment arms, for a total number of  51241 clicks at a total cost of $ 66614"

Let’s say instead that we only compare each treatment arm to one other treatment arm. This approach is what we have outlined in the experimental design document. Then, the below figure contains the same information as the above figure, but we are conducting fewer hypothesis tests.

### DEFINE PARAMETERS ###

# outcome 
p <- 0.2 

# standard deviation
sd <- sqrt(p * (1 - p))

# define the possible treatment effect sizes in *percent*
lambda <- seq(from = 0.1, to = 0.3, length.out = 100 )

# alpha
alpha <- 0.05

# beta
beta <- 0.2

# number of treatments
treat <- cbind(5,7)

### DEFINE FUNCTION ###

vax_power <- function(t, lambda) {
  4 * (qnorm(1-alpha/2/(3*(t-1)))+(qnorm(1 - beta)))^2 * ((1 - p)/((lambda * lambda) * p))}

### CREATE PLOT ###

# create vector to hold number of observations for each value of lambda and number of treatments
n_obs <- matrix(NA, nrow = length(lambda), ncol = length(treat))

# calculate number of observations for each value of lambda
for (i in 1:length(lambda)) {
  for (j in 1:length(treat)){
    n_obs[i,j] <- vax_power(treat[j], lambda[i])
    }
  }

# create data frame that combines data to graph
df <- data.frame(cbind(n_obs,lambda))

# create the graph
ggplot(df, aes(x = lambda)) +
  geom_line(aes(y = df[,1], color = "5")) +
  geom_line(aes(y = df[,2], color = "7")) +
  labs(x = 'Treatment effect size (in percentage terms)',
       y = 'Number of completes needed per treatment arm') +
  scale_color_manual(name = "Number of treatment arms", values = c("5" = "darkblue", "7" = "red"))

Then, we can again calculate total costs based on $1.30 per unvaccinated complete.

print(paste("For treatment effect of 25% over baseline, the treatment effect would need to be",round(0.25*p,4)," and you would need:"))
## [1] "For treatment effect of 25% over baseline, the treatment effect would need to be 0.05  and you would need:"
print(paste(round(vax_power(treat,0.25),0),"completes per treatment arm for", treat,"treatment arms, for a total number of ", round(treat*vax_power(treat,0.25),0),"completes at a total cost of $",round(1.3*treat*vax_power(treat,0.25),0)))
## [1] "3518 completes per treatment arm for 5 treatment arms, for a total number of  17588 completes at a total cost of $ 22865"
## [2] "3761 completes per treatment arm for 7 treatment arms, for a total number of  26327 completes at a total cost of $ 34225"

Power calculations – Bonferonni – pr(unvax complete|click)

The below figure uses the below equation to calculate the power needed for each treatment arm, given the parameters.

  • The outcome is the probability of an unvaccinated person completing the survey, conditional on clicking on the Facebook ad.
  • The sample for this outcome is the number of clicks.
  • The baseline of this outcome from the pilot is ~14%.
  • I consider treatment effects ranging from 10% to 20% of baseline (i.e. ~1.4% to 2.8%)
  • I correct for multiple hypothesis testing using Bonferroni.
  • For the Bonferroni correction, I assume the number of hypotheses = number of outcomes (3) x number of treatment comparisons (factorial(t)/2) for 5 or 7 treatment arms.
  • I used the number of observations that seemed in a reasonable range cost-wise from the sample size analysis to set the scale for the power analysis.

\[ \boxed{1-\beta = \Phi\left( \frac{\tau}{2\sigma/\sqrt{n}} - z_{1-\alpha} \right)} \]

## DEFINE PARAMETERS ###

# define the possible total number of clicks
n <- seq(from = 50000, to = 300000, length.out = 100 )

# outcome 
p <- 0.14365 #from pilot 5 descriptive analytics analysis 595/4142

# standard deviation
sd <- sqrt(p * (1 - p))

# define the possible treatment effect sizes in *percent*
lambda <- cbind(0.1, 0.15, 0.2)

# alpha
alpha <- 0.05

# beta
beta <- 0.2

# number of treatments
treat <- cbind(5,7)

### DEFINE FUNCTION ###

vax_power2 <- function(lambda, n, t) {
  pnorm(
    ( lambda*p / (2 * sd / sqrt(n / t))) - qnorm(1-alpha/2/(3*factorial(t)/2))
  )
}

### CREATE PLOT ###

# create vector to hold number of observations for each value of lambda, number of observations, and number of treatments
power <- matrix(NA, nrow = length(n), ncol = length(treat)*length(lambda))

# calculate number of observations for each value of lambda
for (i in 1:length(lambda)) {
  for (j in 1:length(treat)){
    for (k in 1:length(n)){
      power[k,j+2*i-2] <- vax_power2(lambda[i], n[k], treat[j])
      }
    }
  }

# create data frame that combines data to graph
df <- data.frame(cbind(power,n))

# create the graph
ggplot(df, aes(x = n)) +
  geom_line(aes(y = df[,1], color = "lambda = 0.1, treat = 5")) +
  geom_line(aes(y = df[,2], color = "lambda = 0.1, treat = 7")) +
  geom_line(aes(y = df[,3], color = "lambda = 0.15, treat = 5")) +
  geom_line(aes(y = df[,4], color = "lambda = 0.15, treat = 7")) +
  geom_line(aes(y = df[,5], color = "lambda = 0.2, treat = 5")) +
  geom_line(aes(y = df[,6], color = "lambda = 0.2, treat = 7")) +
  labs(x = 'Number of Total Clicks',
       y = 'Power') +
  scale_color_manual(name = "Number of treatment arms", values = c("lambda = 0.1, treat = 5" = "darkblue", "lambda = 0.1, treat = 7" = "red", "lambda = 0.15, treat = 5" = "darkgreen", "lambda = 0.15, treat = 7" = "turquoise4", "lambda = 0.2, treat = 5" = "darkorchid4", "lambda = 0.2, treat = 7" = "springgreen2"))

Let’s say instead that we only compare each treatment arm to one other treatment arm. This approach is what we have outlined in the experimental design document. Then, the below figure contains the same information as the above figure, but we are conducting fewer hypothesis tests.

## DEFINE PARAMETERS ###

# define the possible total number of clicks
n <- seq(from = 50000, to = 300000, length.out = 100 )

# outcome 
p <- 0.14365 #from pilot 5 descriptive analytics analysis 595/4142

# standard deviation
sd <- sqrt(p * (1 - p))

# define the possible treatment effect sizes in *percent*
lambda <- cbind(0.1, 0.15, 0.2)

# alpha
alpha <- 0.05

# beta
beta <- 0.2

# number of treatments
treat <- cbind(5,7)

### DEFINE FUNCTION ###

vax_power2 <- function(lambda, n, t) {
  pnorm(
    ( lambda*p / (2 * sd / sqrt(n / t))) - qnorm(1-alpha/2/(3*(t-1)))
  )
}

### CREATE PLOT ###

# create vector to hold number of observations for each value of lambda, number of observations, and number of treatments
power <- matrix(NA, nrow = length(n), ncol = length(treat)*length(lambda))

# calculate number of observations for each value of lambda
for (i in 1:length(lambda)) {
  for (j in 1:length(treat)){
    for (k in 1:length(n)){
      power[k,j+2*i-2] <- vax_power2(lambda[i], n[k], treat[j])
      }
    }
  }

# create data frame that combines data to graph
df <- data.frame(cbind(power,n))

# create the graph
ggplot(df, aes(x = n)) +
  geom_line(aes(y = df[,1], color = "lambda = 0.1, treat = 5")) +
  geom_line(aes(y = df[,2], color = "lambda = 0.1, treat = 7")) +
  geom_line(aes(y = df[,3], color = "lambda = 0.15, treat = 5")) +
  geom_line(aes(y = df[,4], color = "lambda = 0.15, treat = 7")) +
  geom_line(aes(y = df[,5], color = "lambda = 0.2, treat = 5")) +
  geom_line(aes(y = df[,6], color = "lambda = 0.2, treat = 7")) +
  labs(x = 'Number of Total Clicks',
       y = 'Power') +
  scale_color_manual(name = "Number of treatment arms", values = c("lambda = 0.1, treat = 5" = "darkblue", "lambda = 0.1, treat = 7" = "red", "lambda = 0.15, treat = 5" = "darkgreen", "lambda = 0.15, treat = 7" = "turquoise4", "lambda = 0.2, treat = 5" = "darkorchid4", "lambda = 0.2, treat = 7" = "springgreen2"))

Power calculations – Bonferonni – pr(unvax web click|complete)

The below figure uses the previous equation to calculate the power needed for each treatment arm, given the parameters.

  • The outcome is the probability of an unvaccinated person clicking the website link, conditional on completing the survey.
  • So, the sample for this outcome is the number of completes.
  • We have not piloted this outcome, so I am guessing the baseline to be ~20% (this is probably a bit high, but it is also the only way to get costs in a reasonable range – pilot will inform if this is reasonable).
  • I correct for multiple hypothesis testing using Bonferroni.
  • For the Bonferroni correction, I assume the number of hypotheses = number of outcomes (3) x number of treatment comparisons (factorial(t)/2) for 5 or 7 treatment arms.
  • I used the number of observations that seemed in a reasonable range cost-wise from the sample size analysis to set the scale for the power analysis.

\[ \boxed{1-\beta = \Phi\left( \frac{\tau}{2\sigma/\sqrt{n}} - z_{1-\alpha} \right)} \]

## DEFINE PARAMETERS ###

# define the possible total number of completes
n <- seq(from = 10000, to = 100000, length.out = 100 )

# outcome 
p <- 0.2 #from pilot 5 descriptive analytics analysis 595/4142

# standard deviation
sd <- sqrt(p * (1 - p))

# define the possible treatment effect sizes in *percent*
lambda <- cbind(0.2, 0.25, 0.3)

# alpha
alpha <- 0.05

# beta
beta <- 0.2

# number of treatments
treat <- cbind(5,7)

### DEFINE FUNCTION ###

vax_power2 <- function(lambda, n, t) {
  pnorm(
    ( lambda*p / (2 * sd / sqrt(n / t))) - qnorm(1-alpha/2/(3*factorial(t)/2))
  )
}

### CREATE PLOT ###

# create vector to hold number of observations for each value of lambda, number of observations, and number of treatments
power <- matrix(NA, nrow = length(n), ncol = length(treat)*length(lambda))

# calculate number of observations for each value of lambda
for (i in 1:length(lambda)) {
  for (j in 1:length(treat)){
    for (k in 1:length(n)){
      power[k,j+2*i-2] <- vax_power2(lambda[i], n[k], treat[j])
      }
    }
  }

# create data frame that combines data to graph
df <- data.frame(cbind(power,n))

# create the graph
ggplot(df, aes(x = n)) +
  geom_line(aes(y = df[,1], color = "lambda = 0.2, treat = 5")) +
  geom_line(aes(y = df[,2], color = "lambda = 0.2, treat = 7")) +
  geom_line(aes(y = df[,3], color = "lambda = 0.25, treat = 5")) +
  geom_line(aes(y = df[,4], color = "lambda = 0.25, treat = 7")) +
  geom_line(aes(y = df[,5], color = "lambda = 0.3, treat = 5")) +
  geom_line(aes(y = df[,6], color = "lambda = 0.3, treat = 7")) +
  labs(x = 'Number of Total Completes',
       y = 'Power') +
  scale_color_manual(name = "Number of treatment arms", values = c("lambda = 0.2, treat = 5" = "darkblue", "lambda = 0.2, treat = 7" = "red", "lambda = 0.25, treat = 5" = "darkgreen", "lambda = 0.25, treat = 7" = "turquoise4", "lambda = 0.3, treat = 5" = "darkorchid4", "lambda = 0.3, treat = 7" = "springgreen2"))

Let’s say instead that we only compare each treatment arm to one other treatment arm. This approach is what we have outlined in the experimental design document. Then, the below figure contains the same information as the above figure, but we are conducting fewer hypothesis tests.

## DEFINE PARAMETERS ###

# define the possible total number of completes
n <- seq(from = 10000, to = 75000, length.out = 100 )

# outcome 
p <- 0.2 #from pilot 5 descriptive analytics analysis 595/4142

# standard deviation
sd <- sqrt(p * (1 - p))

# define the possible treatment effect sizes in *percent*
lambda <- cbind(0.2, 0.25, 0.3)

# alpha
alpha <- 0.05

# beta
beta <- 0.2

# number of treatments
treat <- cbind(5,7)

### DEFINE FUNCTION ###

vax_power2 <- function(lambda, n, t) {
  pnorm(
    ( lambda*p / (2 * sd / sqrt(n / t))) - qnorm(1-alpha/2/(3*(t-1)))
  )
}

### CREATE PLOT ###

# create vector to hold number of observations for each value of lambda, number of observations, and number of treatments
power <- matrix(NA, nrow = length(n), ncol = length(treat)*length(lambda))

# calculate number of observations for each value of lambda
for (i in 1:length(lambda)) {
  for (j in 1:length(treat)){
    for (k in 1:length(n)){
      power[k,j+2*i-2] <- vax_power2(lambda[i], n[k], treat[j])
      }
    }
  }

# create data frame that combines data to graph
df <- data.frame(cbind(power,n))

# create the graph
ggplot(df, aes(x = n)) +
  geom_line(aes(y = df[,1], color = "lambda = 0.2, treat = 5")) +
  geom_line(aes(y = df[,2], color = "lambda = 0.2, treat = 7")) +
  geom_line(aes(y = df[,3], color = "lambda = 0.25, treat = 5")) +
  geom_line(aes(y = df[,4], color = "lambda = 0.25, treat = 7")) +
  geom_line(aes(y = df[,5], color = "lambda = 0.3, treat = 5")) +
  geom_line(aes(y = df[,6], color = "lambda = 0.3, treat = 7")) +
  labs(x = 'Number of Total Completes',
       y = 'Power') +
  scale_color_manual(name = "Number of treatment arms", values = c("lambda = 0.2, treat = 5" = "darkblue", "lambda = 0.2, treat = 7" = "red", "lambda = 0.25, treat = 5" = "darkgreen", "lambda = 0.25, treat = 7" = "turquoise4", "lambda = 0.3, treat = 5" = "darkorchid4", "lambda = 0.3, treat = 7" = "springgreen2"))

7 Treatment arms

To assess the power for testing multiple correlated hypotheses, I simulate data, test the hypotheses, correct the p-values, and repeat for r repetitions, then calculate the proportion of repetitions in which I reject each hypothesis. The variables needed in the simulated data are the treatment assignment and the two outcomes: completion and clicking on the website link. In particular, we are interested in unvaccinated completes and unvaccinated website clicks. The sample for completion are clicks, while the sample for clicking on the website link are completes.

The data generation process is as follows:

  • Assign each observation to a treatment group. (treatment variable)
  • Take a draw from a binomial distribution with probability equal to the hypothesized unvax completion probability for the observation’s treatment group. (completion variable)
  • For observations who completed the survey, take a draw from a binomial distribution with probability equal to the hypothesized unvax website click probability for the observation’s treatment group. (webclick variable)

Power is calculated using the following parameters:

  • \(\alpha=0.05\)
  • replications \(r=1000\)
  • number of observations (clicks) \(n=\{75k,100k,150k,200k\}\) for a cost of \(\{11k,15k,22k,29k\}\) at $0.146 per click.

Results are reported for uncorrected p-values and for p-values corrected using Bonferonni, Holm, and Hochberg.

All pairwise comparison of treatments

First, we consider multiple hypothesis correction for evaluating all pairwise comparisons of treatments.

set.seed(94305)

### DEFINE PARAMETERS ###


outcomes1 <- matrix(c(0.04, 0.06, 0.08, 0.10, 0.12, 0.14, 0.12, # Pr(unvax complete|click)
                 0.10, 0.12, 0.14, 0.16, 0.18, 0.20, 0.18, # Pr(unvax web click|complete)
                 1- 0.13, 1- 0.15, 1- 0.20, 1-0.17, 1-0.21, 1-.2, 1-.16), # Pr(useful free text| complete) 
                 nrow = 3, byrow = T)

outcomes2 <- outcomes1 + 0.05

outcomes_list <- list(outcomes1, outcomes2)
t_list <- lapply(outcomes_list, FUN=function(x){c(1:ncol(x))})

### POWER CALCULATIONS ###

for (i in n){
  for(o in 1:length(outcomes_list)){
    treatments <- t_list[[o]]
    outcome_matrix  <- outcomes_list[[o]]
    
    # complete r replications of n trials
    results <- replicate(r,  trial(i, outcomes = outcome_matrix, t = treatments))
    
    # create matrix to store power calculations
    power_calc <- matrix(NA, 
                         nrow = nrow(outcome_matrix)*4, 
                         ncol = (length(treatments) * (length(treatments) - 1)/2)) 
    
    # for each pairwise treatment comparison
    for(j in 1:(length(treatments) * (length(treatments) - 1)/2)){
      
      # for each mht correction
      for(k in 1:4){
        
        # calculate power for completions outcome
        power_calc[k, j] <- sum(results[k, j, ] < alpha)/r
        
        # calculate power for website clicks outcome
        power_calc[4+k,j] <- sum(results[k+4,j,]<alpha)/r
        
        # calculate power for useful information  outcome
        power_calc[8+k,j] <- sum(results[k+8,j,]<alpha)/r
        
      }
    }
    
    # Generate Graph
        power_calc <- data.frame(outcome = c(rep('Unvax completes|Click',4),
                           rep('Unvax web clicks|Complete', 4),
                           rep('Unvax useful freetext|Complete', 4)), 
                         test = rep(c("None","Bonferonni", "Holm", "Hochberg"), 3),
                         power_calc)
   
    colnames(power_calc) <- c("Outcome","Correction", 
                              'T1 v. T2', 'T1 v. T3', 'T1 v. T4','T1 v. T5','T1 v. T6','T1 v. T7','T2 v. T3','T2 v. T4','T2 v. T5','T2 v. T6','T2 v. T7','T3 v. T4','T3 v. T5','T3 v. T6','T3 v. T7','T4 v. T5','T4 v. T6','T4 v. T7','T5 v. T6','T5 v. T7','T6 v. T7')
 
    
    kable(power_calc, format = "html",  caption = paste("Power Calculations: ",i/1000," Thousand Clicks", "Matrix", o), digits = 3) %>%
      collapse_rows(columns = 1) %>%
      print()
  }  
}

Only comparisons of treatment \(t\) to treatment \(t-1\)

Then, I evaluate how adding structure to the hypothesis testing increases power. Specifically, each treatment \(t\) is designed to be compared to the treatment below it and the treatment above it. So, for five treatments, we make 4 comparisons: T1 v. T2, T2 v. T3, T3 v. T4, and T4 v. T5.

set.seed(94305)

### DEFINE PARAMETERS ###
# parameters defined in previous code chunk

for (i in n){
  for(o in 1:length(outcomes_list)){
    treatments <- t_list[[o]]
    outcome_matrix  <- outcomes_list[[o]]
    
    # complete r replications of n trials
    results <- replicate(r,  trial_subset(i, outcomes = outcome_matrix, t = treatments))
    
    # create matrix to store power calculations
    power_calc <- matrix(NA, 
                         nrow = nrow(outcome_matrix)*4, 
                         ncol = length(treatments)-1) 
    
    # for each pairwise treatment comparison
    for(j in 1:(length(treatments) - 1)){
      
      # for each mht correction
      for(k in 1:4){
        
        # calculate power for completions outcome
        power_calc[k, j] <- sum(results[k, j, ] < alpha)/r
        
        # calculate power for website clicks outcome
        power_calc[4+k,j] <- sum(results[k+4,j,]<alpha)/r
        
        # calculate power for useful information outcome
        power_calc[8+k,j] <- sum(results[k+8,j,]<alpha)/r
        
      }
    }
    
    # Generate Graph
        power_calc <- data.frame(outcome = c(rep('Unvax completes|Click',4),
                           rep('Unvax web clicks|Complete', 4),
                           rep('Unvax useful freetext|Complete', 4)), 
                         test = rep(c("None","Bonferonni", "Holm", "Hochberg"), 3),
                         power_calc)
   
    colnames(power_calc) <- c("Outcome","Correction", 
                              'T1 v. T2', 'T2 v. T3','T3 v. T4','T4 v. T5','T5 v. T6','T6 v. T7')
 
    kable(power_calc, format = "html", caption = paste("Power Calculations: ",i/1000," Thousand Clicks", "Matrix", o), digits = 3) %>%
      collapse_rows(columns = 1) %>%
      print()
  }  
}