Question 1

Using the following function (which was downloaded with the data set), plot all intervals. What proportion of your confidence intervals include the true population mean? Is this proportion exactly equal to the confidence level? If not, explain why.

population <- ames$Gr.Liv.Area
samp <- sample(population, 60)
mean(samp)
## [1] 1513.083
samp_mean <- rep(NA, 50)
samp_sd <- rep(NA, 50)
n <- 60
set.seed(123)
for(i in 1:50){
  samp <- sample(population, n) # obtain a sample of size n = 60 from the population
  samp_mean[i] <- mean(samp) # save sample mean in ith element of samp_mean
  samp_sd[i] <- sd(samp) # save sample sd in ith element of samp_sd
}
lower_vector <- samp_mean - 1.96 * samp_sd / sqrt(n)
upper_vector <- samp_mean + 1.96 * samp_sd / sqrt(n)
par(mfrow = c(1, 1))
plot_ci(lower_vector, upper_vector, mean(population))

# proportion of intervals that do NOT contain the true population mean
sum((upper_vector < mean(population)) | (mean(population) < lower_vector)) / length(lower_vector)
## [1] 0.04

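Only 4% of the intervals miss the true population mean, so 96% of them (48 of 50) contain it. No, this proportion is not exactly equal to the 95% confidence level. Each interval either captures the mean or does not, so with 50 intervals the proportion can only be a multiple of 0.02, and it changes from one set of samples to the next. The 95% level describes the long-run capture rate over many repeated samples, not the result of any particular 50.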
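To get a rough sense of how far the observed proportion can drift from 95%, the number of covering intervals can be treated as a binomial count. This is only a sketch: it assumes the 50 intervals are independent and that each one covers the true mean with probability exactly 0.95, which is an approximation for these data. The names `n_intervals`, `conf_level`, `k`, and `prob_k` are introduced here for illustration.

```r
n_intervals <- 50
conf_level <- 0.95

# Assumption: independent intervals, each covering with probability 0.95
k <- 0:n_intervals
prob_k <- dbinom(k, size = n_intervals, prob = conf_level)

# The five most likely observed proportions and their probabilities
top <- order(prob_k, decreasing = TRUE)[1:5]
data.frame(proportion = k[top] / n_intervals,
           probability = round(prob_k[top], 3))

# Exactly 95% would need this many covering intervals
conf_level * n_intervals  # 47.5, not a whole number
```

Under these assumptions, 48 covering intervals out of 50 (a proportion of 0.96) is the single most likely outcome.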

Question 2

Pick a confidence level of your choosing, provided it is not 95%. What is the appropriate critical value?

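A two-sided interval splits the excluded probability evenly between the two tails. The call below leaves 10% in the upper tail, so it gives the critical value for an 80% confidence level (a 90% level would instead need qnorm(0.05, lower.tail = FALSE), about 1.645):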

crit = qnorm(0.1, lower.tail = FALSE)
crit
## [1] 1.281552
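The link between a two-sided confidence level and its critical value can be sketched with a small helper. The function name `crit_value` and the vector `conf_levels` are hypothetical and not part of the lab materials; the sketch assumes normal-based intervals, as used throughout this lab.

```r
# Two-sided critical value: split the excluded probability evenly between the tails
crit_value <- function(conf_level) {
  alpha <- 1 - conf_level
  qnorm(alpha / 2, lower.tail = FALSE)
}

conf_levels <- c(0.80, 0.90, 0.95, 0.99)
data.frame(conf_level = conf_levels,
           critical_value = round(crit_value(conf_levels), 3))
```

The rounded critical values are 1.282, 1.645, 1.960, and 2.576, so the familiar 1.96 corresponds to the 95% level.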

Question 3

Calculate 50 confidence intervals at the confidence level you chose in the previous question. You do not need to obtain new samples, simply calculate new intervals based on the sample means and standard deviations you have already collected. Using the plot_ci function, plot all intervals and calculate the proportion of intervals that include the true population mean. How does this percentage compare to the confidence level selected for the intervals?

lower_vector <- samp_mean - crit * samp_sd / sqrt(n)
upper_vector <- samp_mean + crit * samp_sd / sqrt(n)
plot_ci(lower_vector, upper_vector, mean(population))

1 - sum((upper_vector < mean(population)) | (mean(population) < lower_vector)) / length(lower_vector)
## [1] 0.8

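The proportion of intervals that contain the true mean is 80% (40 of 50). A critical value of about 1.28 leaves 10% in each tail, so these are 80% intervals, and the observed coverage matches that nominal level. It falls short of 90%, which would require a critical value of about 1.645.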
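With many more than 50 intervals, the observed proportion should settle close to the nominal level. The sketch below illustrates this on a synthetic, right-skewed population, since the ames data are not reproduced here. `fake_population`, `coverage`, and the `sim_*` names are hypothetical, and the gamma parameters are arbitrary choices, not estimates from the real data.

```r
# Hypothetical stand-in population (NOT the ames data)
set.seed(1)
fake_population <- rgamma(3000, shape = 9, rate = 0.006)

# Proportion of intervals [lower, upper] that contain the true value
coverage <- function(lower, upper, truth) {
  mean(lower <= truth & truth <= upper)
}

n <- 60
n_rep <- 5000
conf_level <- 0.80
crit <- qnorm((1 - conf_level) / 2, lower.tail = FALSE)

sim_mean <- numeric(n_rep)
sim_sd <- numeric(n_rep)
for (i in seq_len(n_rep)) {
  samp <- sample(fake_population, n)  # sample of size n without replacement
  sim_mean[i] <- mean(samp)
  sim_sd[i] <- sd(samp)
}

sim_lower <- sim_mean - crit * sim_sd / sqrt(n)
sim_upper <- sim_mean + crit * sim_sd / sqrt(n)
sim_coverage <- coverage(sim_lower, sim_upper, mean(fake_population))
sim_coverage
```

Changing `conf_level` and rerunning gives a quick check that the long-run coverage tracks whichever level is chosen.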