Hi Jane, this is a plot of how far the lower limit calculated with the mean method is from the one calculated with the exact method. That is, the y-axis is (lower CI limit from the exact binomial method) - (lower CI limit from the mean method), plotted as a function of N for a fixed phat of 0.5.

It does seem to converge, though I do agree that calculating the sample standard deviation on a string of 1s and 0s doesn't feel right. For small n the mean method is way off.

I think the nuance is that \(\Large\frac{\hat{p}\hat{q}}{n}\) is not the variance of \(\Large \hat{p}\) but an estimator of it, and a slightly biased one: the unbiased version divides by \(n-1\) instead of \(n\), which is exactly what the S formula gives for 0/1 data. So I think that is why for small n, say 1,1,0, the estimated variance of \(\Large \hat{p}\) using \(\Large\frac{\hat{p}\hat{q}}{n}\) and the one using the S formula are way off from each other. But that is to be expected, because they are both estimators of the TRUE variance, and as long as they converge for reasonable n, we are OK.
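To make the 1,1,0 case concrete, here is a small sketch that computes both variance estimates for that sample. The variable names (`var.plugin`, `var.s`) are mine and purely illustrative.

```r
# Sample from the text: two successes, one failure
x <- c(1, 1, 0)
n <- length(x)
phat <- mean(x)

# Estimate of Var(phat) from the phat*qhat/n formula
var.plugin <- phat * (1 - phat) / n

# Estimate of Var(phat) from the S formula: sample variance (n - 1 denominator) over n
var.s <- var(x) / n

# For 0/1 data the two differ by the factor n / (n - 1)
ratio <- var.s / var.plugin
print(c(plugin = var.plugin, s.formula = var.s, ratio = ratio))
```

The ratio is n/(n-1), so the two estimates differ by 50% at n = 3 but only by about 2.6% at n = 40, which is consistent with the convergence seen in the plot.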

A better simulation would be to build 10,000 CIs using the exact method and 10,000 CIs using the mean method, and see which one comes closer to capturing the true proportion 95% of the time.
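A rough sketch of what that coverage simulation could look like is below. The true proportion (0.5), the sample size (20), and the seed are assumptions I picked for illustration, and samples that come out all 0s or all 1s are treated as a zero-width interval for the mean method because `t.test()` cannot produce a usable interval for constant data.

```r
set.seed(1)        # assumed seed, only so the run is reproducible
p.true <- 0.5      # assumed true proportion
n      <- 20       # assumed sample size per simulated experiment
n.sim  <- 10000    # number of simulated CIs per method

# TRUE if the interval ci = c(lower, upper) contains p
covers <- function(ci, p) ci[1] <= p && p <= ci[2]

coverage <- replicate(n.sim, {
  x <- rbinom(n, size = 1, prob = p.true)

  # Exact (Clopper-Pearson) interval from binom.test()
  ci.exact <- binom.test(sum(x), n, conf.level = 0.95)$conf.int

  # Mean method: t interval on the raw 0/1 data.
  # Constant data has zero variance, so fall back to a zero-width interval.
  ci.mean <- if (var(x) == 0) {
    c(mean(x), mean(x))
  } else {
    t.test(x, conf.level = 0.95)$conf.int
  }

  c(exact = covers(ci.exact, p.true), mean = covers(ci.mean, p.true))
})

# Fraction of simulated intervals that captured the true proportion
coverage.rates <- rowMeans(coverage)
print(coverage.rates)
```

Rerunning this with other values of `p.true` and `n` (small n, or p near 0 or 1) is where I would expect the two methods to separate, if they do.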

rm(list = ls())
library(ggplot2)  # provides qplot()

# Lower 95% CI limit from the exact binomial method minus the lower limit
# from the mean (t-based) method, for n trials with n/2 successes (phat = 0.5).
# n must be even.
low.limit.diff <- function(n) {
  successes <- n / 2
  ci.bm <- binom.test(successes, n, conf.level = 0.95)$conf.int
  data <- c(rep(1, successes), rep(0, n - successes))
  ci.mm <- t.test(data, conf.level = 0.95)$conf.int
  ci.bm[1] - ci.mm[1]
}

nrange <- seq(from = 8, to = 40, by = 2)
qplot(nrange, sapply(nrange, low.limit.diff),
      geom = 'point',
      main = 'Difference in Low Limit CI',
      ylim = c(0, .12),
      xlab = 'N',
      ylab = 'Difference')

[Plot: Difference in Low Limit CI (exact binomial minus mean method) versus N]