Write a computer program to simulate 10,000 Bernoulli trials with probability .3 for success on each trial. Have the program compute the 95 percent confidence interval for the probability of success based on the proportion of successes. Repeat the experiment 100 times and see how many times the true value of .3 is included within the confidence limits.
Bernoulli trials have two possible outcomes success or failure. Here we will use a uniform distribution with \(p = 0.3\). Note the default interval for runif() is 0 to 1.
clt_p <- runif(10000)
head(clt_p)
## [1] 0.8848114 0.5238375 0.3631770 0.1095180 0.4058682 0.5154214
length(clt_p)
## [1] 10000
min(clt_p)
## [1] 8.929288e-06
max(clt_p)
## [1] 0.9999734
We can see by the statistics above that this gives the correct interval for the conditions of the exercise.
success = 0
for(trial in clt_p){
if(trial <= 0.3){
success = success + 1
}#if
}#for
success/length(clt_p)
## [1] 0.3039
Now we need to do this 100 times:
clt_p_100 <- function(){
p_vec = vector()
for(i in 1:100){
clt_p <- runif(10000)
success = 0
for(trial in clt_p){
if(trial <= 0.3){
success = success + 1
}#if
}#for2
p_vec[i] = success/length(clt_p)
}#for1
return(p_vec)
}#function
hist(clt_p_100())
We can see the CLT demonstrated here in that the data was taken from a uniform distribution, however the proportion of successes are normally distributed about the expected value of 0.3.
We will now use the clt_p_100 function to test against the 95% confidence intervals \(p \pm 1.96*\sqrt{\frac{pq}{n}}\):
p = 0.3
q = 0.7
n = 10000
trials <- clt_p_100()
lower_bound = p - 1.96*(p*q/n)^0.5
lower_bound
## [1] 0.2910182
upper_bound = p + 1.96*(p*q/n)^0.5
upper_bound
## [1] 0.3089818
ci_test <- trials <= upper_bound & trials >= lower_bound
table(ci_test)
## ci_test
## FALSE TRUE
## 5 95
barplot(table(ci_test))
We would expect 95 out of 100 to be within the 95% confidence interval.