DATA 605 Discussion 9

11 Computer Program Testing CLT for a Proportion.

Write a computer program to simulate 10,000 Bernoulli trials with probability .3 for success on each trial. Have the program compute the 95 percent confidence interval for the probability of success based on the proportion of successes. Repeat the experiment 100 times and see how many times the true value of .3 is included within the confidence limits.

Bernoulli trials have two possible outcomes success or failure. Here we will use a uniform distribution with \(p = 0.3\). Note the default interval for runif() is 0 to 1.

clt_p <- runif(10000)
head(clt_p)

## [1] 0.8848114 0.5238375 0.3631770 0.1095180 0.4058682 0.5154214

length(clt_p)

## [1] 10000

min(clt_p)

## [1] 8.929288e-06

max(clt_p)

## [1] 0.9999734

We can see by the statistics above that this gives the correct interval for the conditions of the exercise.

success = 0
for(trial in clt_p){
  if(trial <= 0.3){
    success = success + 1
  }#if
}#for
success/length(clt_p)

## [1] 0.3039

Now we need to do this 100 times:

clt_p_100 <- function(){
  p_vec = vector()
  for(i in 1:100){
    clt_p <- runif(10000)
    success = 0
    for(trial in clt_p){
      if(trial <= 0.3){
      success = success + 1
      }#if
    }#for2
    p_vec[i] = success/length(clt_p)
  }#for1
return(p_vec)
}#function
hist(clt_p_100())

We can see the CLT demonstrated here in that the data was taken from a uniform distribution, however the proportion of successes are normally distributed about the expected value of 0.3.

We will now use the clt_p_100 function to test against the 95% confidence intervals \(p \pm 1.96*\sqrt{\frac{pq}{n}}\):

p = 0.3
q = 0.7
n = 10000
trials <- clt_p_100()
lower_bound = p - 1.96*(p*q/n)^0.5
lower_bound

## [1] 0.2910182

upper_bound = p + 1.96*(p*q/n)^0.5
upper_bound

## [1] 0.3089818

ci_test <- trials <= upper_bound & trials >= lower_bound
table(ci_test)

## ci_test
## FALSE  TRUE 
##     5    95

barplot(table(ci_test))

We would expect 95 out of 100 to be within the 95% confidence interval.

DATA 605 Discussion 9

Nathan Cooper

October 23, 2017

11 Computer Program Testing CLT for a Proportion.