Bayesian Predictive Power for Paired T-test using STAN

Introduction

Pilot data has been used to estimate the standard deviation of the differences for a future paired study. The pilot data is only 10 paired samples. A single point estimate of power does not reflect the uncertainty of the quantities used to calculate the power. A Bayesian approach using STAN is used to generate the predictive power distribution which will give a comprehensive illustration of what it is reasonable to believe about the power of the planned future study. The Bayesian predictive power distribution completely summarizes what is known about the power, based on the historical data available.

Pilot data

d <- read.table(text="ACqT    BCqT   CCqT        AqR             BCqR               CCqR
                -1.8437877297   -1.5592599217   -2.3139273849   -1.8341663748   -1.5425044964   -2.3745490226
                -0.5991266957   0.2629266851     -2.106288456      -0.4884675753   -0.052387544      -2.4636144784
                -2.4926691161   -2.3929118472   -3.7191617809   -2.1901884446   -2.1522001914   -3.3927693859
                -1.0465323886   -0.1179741618   -2.7931909355   -0.6064499177   0.1065362241     -2.8892182024
                -3.0905527786   -1.8075111746   -3.3981659717   -3.2018646424   -2.3039604743   -3.5905536786
                -0.8902361232   -2.3709100524   -2.9002937307   -0.9797577744   -2.1357309027   -2.8360027835
                -2.9059931807   -1.7863903728   -4.0336715002   -2.8450771467   -1.8677429945   -4.0297507463
                -1.6608943324   -1.1153663741   -4.7331067271   -1.975007893      -1.672985582      -5.2174619408
                -0.1460556763   2.5721730518     -2.1463799439   -0.0868255756   2.4229107601     -2.1160801
                ",
                header=TRUE, stringsAsFactors=FALSE)
 
        # select relevant data
        dd <- d[,c(2,5)]
        print(dd)

        BCqT        BCqR
1 -1.5592599 -1.54250450
2  0.2629267 -0.05238754
3 -2.3929118 -2.15220019
4 -0.1179742  0.10653622
5 -1.8075112 -2.30396047
6 -2.3709101 -2.13573090
7 -1.7863904 -1.86774299
8 -1.1153664 -1.67298558
9  2.5721731  2.42291076

        (sdd<-sd(dd[,1]-dd[,2]))    # sd of differences

[1] 0.3088401

        (diff <- dd[,1]-dd[,2])     # vector of differences

[1] -0.01675543  0.31531423 -0.24071166 -0.22451039  0.49644930 -0.23517915  0.08135262  0.55761921  0.14926229

Bundle data together

# bundle data and some instructions
data_list <- list(resp1 = diff, start = list(mu1 = mean(diff), sigma = sd(diff)))

Model

        f2 <- alist (
               
                resp1 ~ dnorm(mu1,sigma),
               
                mu1 ~ dnorm(0,5),
                sigma ~ dcauchy(0,1)
               
        )

Execute

i <- 500  # iterations per chain
w <- 200  # warm up per chain
ch <- 4  # chains
mcmcs <- i * ch - w * ch  # mcmcs used in estimation

# execute the analysis

result <- map2stan(f2, data = data_list, iter = i, warmup = w, chains = ch)

Inspect joint distribution

Inspect MCMC chains

The chains mix well after warm up

Print posterior summary and extract MCMC samples

print(precis(result, depth = 2, prob = 0.95, digits = 4))

        Mean StdDev lower 0.95 upper 0.95 n_eff   Rhat
mu1   0.1078 0.1228    -0.1542     0.3432   341 1.0038
sigma 0.3620 0.1165     0.1906     0.5799   300 1.0059

posts <- extract.samples(result)

mu1 <- posts$mu1  # store the posterior density
std <- posts$sigma  # store the posterior density

Inputs

n <- 10  # assess this sample size
delta <- 0.5  # assess this effect

A frequentist may report the power as this…

# enusure you used standard deviation of the differences
power.t.test(delta = delta, sd = sdd, n = n, type = "one.sample")


     One-sample t test power calculation 

              n = 10
          delta = 0.5
             sd = 0.3088401
      sig.level = 0.05
          power = 0.9947832
    alternative = two.sided

Or a frequentist simulation could be reported

# enusure you used standard deviation of the differences
mean(replicate(1000, t.test(x = rnorm(n, delta, sdd))$p.value) < 0.05)

[1] 0.996

Frequentist the inner working of powering t-test, for information

   # manipulated code (see reference) 
   # Note the SD not SD of diff is used here because original code
   # is for 2 sample power
 
   n1 <- n
   n2 <- n                                  # EoB manipulated for same size in both grps
   mu1 <- 0
   mu2 <- delta
   sd1 <- sd2 <- sdd
   sd1 <- sd2 <- sdd/sqrt(2)                # EOB added this for paired!
   alpha   <- 0.05
   dfGP    <- n1+n2 - 2                     # degrees of freedom (used by G*Power)
   dfGP    <- (n - 1) * 1                   # EOB added this for paired!
   cvGP    <- qt(1-alpha/2, dfGP)           # crit. value 2-sided test (under the null)
   muDiff  <- mu2-mu1                       # true difference in means
   sigDiff <- sqrt((sd1^2/n1) + (sd2^2/n2)) # true SD for difference in empirical means
   ncp     <- muDiff / sigDiff              # noncentrality parameter (under alternative)
   1-pt(cvGP, dfGP, ncp)                    # power

[1] 0.9947832

Bayesian posterior distribution of power, for the given sample size and effect

res <- power.t.test(delta = delta, sd = std, n = n, type = "one.sample")  #
summary(res$power)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2070  0.9248  0.9847  0.9352  0.9985  1.0000

# interogate the R code power.t.test
nu <- (n - 1)
qu <- qt(0.025, nu, lower.tail = FALSE)
power <- pt(qu, nu, ncp = sqrt(n/1) * delta/std, lower.tail = FALSE) + pt(-qu, nu, ncp = sqrt(n/1) * delta/std, lower.tail = TRUE)

summary(power)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2079  0.9248  0.9847  0.9352  0.9985  1.0000

Repeat of above calculation

   n1 <- n
   n2 <- n                                  # EoB manipulated for same size in both grps
   mu1 <- 0
   mu2 <- delta
   sd1 <- sd2 <- std
   sd1 <- sd2 <- std/sqrt(2)                # EOB added this for paired!
   alpha   <- 0.05
   dfGP    <- n1+n2 - 2                     # degrees of freedom (used by G*Power)
   dfGP    <- (n - 1) * 1                   # EOB added this for paired!
   cvGP    <- qt(1-alpha/2, dfGP)           # crit. value 2-sided test (under the null)
   muDiff  <- mu2-mu1                       # true difference in means
   sigDiff <- sqrt((sd1^2/n1) + (sd2^2/n2)) # true SD for difference in empirical means
   ncp     <- muDiff / sigDiff              # noncentrality parameter (under alternative)
   summary(1-pt(cvGP, dfGP, ncp))           # power

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2070  0.9248  0.9847  0.9352  0.9985  1.0000

Visualisation posterior power distribution, PDF

Most of the distribution is in the region of high power, therefore it is highly likely the sample size will have high power to meet the objective.

Visualisation posterior power distribution, CDF

The cumulative probability rises after 0.8 power. Therefore there are far less chance power is less than 0.80

Posterior distribution of power, for the given sample size, how to communicate?

[1] "There is a 94.8% chance the power of the planned study is 0.7 or more"

[1] "There is a 90.2% chance the power of the planned study is 0.8 or more"

[1] "There is a 80% chance the power of the planned study is 0.9 or more"

[1] "There is a 50% chance the power of the planned study is 0.98 or more"

References

    http://stats.stackexchange.com/questions/17042/power-for-two-sample-t-test

Computing Environment

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rethinking_1.58 rstan_2.9.0-3   ggplot2_2.1.0   knitr_1.12.3   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.4        magrittr_1.5       MASS_7.3-45        munsell_0.4.3      colorspace_1.2-6   lattice_0.20-33   
 [7] stringr_1.0.0      plyr_1.8.3         tools_3.2.2        grid_3.2.2         gtable_0.2.0       loo_0.1.6         
[13] KernSmooth_2.23-15 coda_0.18-1        matrixStats_0.50.2 htmltools_0.3.5    yaml_2.1.13        digest_0.6.9      
[19] gridExtra_2.2.1    formatR_1.3        codetools_0.2-14   inline_0.3.14      evaluate_0.9       rmarkdown_0.9.6   
[25] stringi_1.0-1      scales_0.4.0       stats4_3.2.2       mvtnorm_1.0-5

[1] "~/X/DEVELOPMENT\\BAYESIAN BOOTSTRAP POWER\\PAIRED TTEST USING STAN"

This took 4.53 seconds to execute.