Part I: Further Reading

In the article entitled, “Power failure: why small sample size undermines the reliability of neuroscience”, Button et al. explain concerns with stastical power, particularly when it’s low. Statistical power shows investigators how likely it is that there are false negative comparisons for two groups. This involves taking a method of measurement (such as the mean), and quantitatively comparing these methods to get the effect size. Sometimes, the rate of false negatives is high, which means there is low statistical power. This can be due to a small sample size, a small effect size, or a combination of the two. It has been shown that studies with low statistical power produce more false negatives than those with high power. This means that there is a lowered discovery rate, because these measurements mightnot meet the statistically significant threshold (usually p < 0.05). These low-power studies might also only be adequate for when there is a large difference between these groups. As such, this can inflate the results. The authors show instances of these in the field of neuroscience, to show how publication bias can affect results. Using meta-analysis, which is becoming increasingly popular, the authors highlight that neuroimaging studies use a small sample size usually, and that these sample sizes have egregiously low statistical power (usually less than 50%). In light of this, there is a high chance of overinflating resutls, which begs the question if these studies are really ethical: why perform experiments on animals, or even humans, if the power isn’t strong enough to detect more subtle changes.

Part II: Statistical Power

2.1 Contingency Data with Grandpa Bernie

Answer 2.1

Figure 1 shows the power curve for a decrease of relative risk by 90% (25% to 2.5% risk) with an \(\alpha\) of 0.05. By examining the figure, one can see that a sample size of 13 is required in both the control and experimental birth control group to achieve 80% power; therefore, 26 subjects are required. To see a statistical power of 95%, 26 human subjects are required for each group, which totals 52 subjects.

2.1 Continuous Data with “the Donald”

Answer 2.2

The power curve generated for an effect size of 50 and an \(\alpha\) = 0.05 is depicted above. According to Figure 2, to achieve 80% power, one would need to have a control and experimental group each containing 10 raptors, totalling 20 animals. To achieve 95% power, one would need 18 animals in each group, totalling 36 subjects.

Formulae Appendix

Contingency Data

#function for all of the stuff
power.n.fun <- function (n.sizes, risk.old, risk.new, deals = 10000){
    RR.b = rep(NA,deals)
    power = rep(NA, length(n.sizes))
    deals = 10000
    alpha = 0.05
    for (j in 1:length(n.sizes)){
      n.treat = c(n.sizes[j],n.sizes[j])
      for(i in 1:deals){
        new.b = sample(c("bad","good"),n.treat[1], replace = TRUE, 
                       prob = c(risk.old, 1 - risk.old))
        old.b = sample(c("bad","good"), n.treat[2], replace = TRUE, 
                       prob = c(risk.old, 1 - risk.old))
        n.bad.b = c(sum(new.b == "bad"),sum(old.b  == "bad"))
        risk.b = n.bad.b/n.treat
        RR.b[i] = risk.b[1]/risk.b[2]
    }
      RR.null = RR.b
      RR.crit = sort(RR.null)[alpha*deals]
    
    for (i in 1:deals){
      new.b = sample(c("bad","good"), n.treat[1], replace = TRUE, 
                     prob=c(risk.new, 1-risk.new))
      old.b = sample(c("bad","good"), n.treat[2], replace = TRUE, 
                     prob=c(risk.old, 1-risk.old))
      n.bad.b = c(sum(new.b == "bad"),sum(old.b == "bad"))
      risk.b = n.bad.b/n.treat
      RR.b[i] = risk.b[1]/risk.b[2]
    }
    RR.effect = RR.b
    power[j] = sum(RR.effect <= RR.crit,na.rm = TRUE)/deals
  }
  return(power)
}

n.sizes = seq(1,100, 1)
bern.bc <-power.n.fun(n.sizes = n.sizes, risk.old = 0.25, risk.new = 0.025)

matplot(n.sizes, bern.bc, type = "l", lwd = 2, col = "navy",
        xlab = "Sample Sizes", ylab = "Power", main = "Figure 1",
        axes = F)
axis(side=2, at=seq(0, 1, by=0.1))
axis(side=1, at=seq(0, 100, by=5))
abline(h=.8, lty = 3)
abline(h=.95, lty = 3)
text(x= 100,y=0.8, labels=paste("80%"), pos = 1)
text(x= 100,y=0.95, labels=paste("95%"), pos = 1)
abline(v = 13)
text(x= 13, y= 0.5, label=paste("n=13"), pos = 4)
abline(v = 26)
text(x= 26, y= 0.5, label=paste("n=26"), pos = 4)

Continuous Data

effectsizes <- 50
samplesizes <- seq(1,100, by=1)
xaxis <- samplesizes[seq(5, length(samplesizes), 5)]

X <- c(250, 276, 277, 278, 301, 311, 331, 336, 338, 340)
Y <- c(250, 276, 277, 278, 301, 311, 331, 336, 338, 340)

nboot = 10000
deals = 10000

dstat<- function(x1,x2, center=mean){center(x1) - center(x2)}
d0 <- dstat(X,Y)
dnull <- array(NA, c(length(samplesizes), nboot));
for (j in 1:length(samplesizes)){
  for (k in 1:nboot){
    x <- sample(X-d0, samplesizes[j]/2, replace=TRUE)
    y <- sample(Y, samplesizes[j]/2, replace=TRUE)
    dnull[j,k]<- dstat(x,y)
  }
}
crit <- apply(dnull,1,quantile, 0.95, na.rm =T )

deffect <- array(NA, c(length(effectsizes), length(samplesizes), nboot));
  for (i in 1:length(effectsizes)){
    for (j in 1:length(samplesizes)){
    for (k in 1:nboot){
      x <- sample (X-d0+effectsizes[i], samplesizes[j]/2,replace = TRUE)
      y <- sample (Y, samplesizes[j]/2, replace= TRUE)
      deffect[i,j,k] <- dstat(x,y)
    }
  }
}

power.array <- array(NA, c(length(effectsizes), length(samplesizes))); 
for (i in 1:length(effectsizes)){
    for (j in 1:length(samplesizes)){
        power.array[i,j] <- sum(deffect[i,j,] >= crit[j])/ nboot 
}
}

power.rep =rep(NA, c(length(samplesizes)));
  for(i in 1:length(samplesizes)) {
  power.rep[i]= sum(deffect[1,i,]>=crit[i])/10000
}