Question 1:
k <- length(which(ACS$HealthInsurance == "1"))
n <- length(na.omit(ACS$HealthInsurance))
p.hat <- k/n
p.hat
Sample proportion = 0.861
Question 2:
Question 3:
Suppose we want to construct a confidence interval. Are the conditions met to assume the sampling distribution of sample proportions is approximately normal (i.e., the CLT is valid)? Explain.
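Random Sample: Yes, it's a random sample pulled from 3.5 million households.
n < 10%: Yes, the sample is far less than 10% of the population.
np >= 10 and n(1-p) >= 10: Yes. Using p.hat in place of p, n*p.hat = k is the number of respondents with health insurance and n*(1-p.hat) = n - k is the number without. With n = 1000 (implied by p.hat and the standard error in Question 4), these counts are 861 and 139, both well above 10.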
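The success-failure condition can be checked directly from the counts. This is a minimal sketch, assuming k = 861 and n = 1000; those values are inferred from the reported p.hat and standard error rather than read from the ACS data, and the names `successes`, `failures`, and `conditions.met` are illustrative.

```r
# Assumed counts, inferred from p.hat = 0.861 and SE = 0.01093979
n <- 1000
k <- 861
p.hat <- k / n

successes <- n * p.hat        # expected number with health insurance
failures  <- n * (1 - p.hat)  # expected number without health insurance

# Both counts should be at least 10 for the normal approximation to be reasonable
conditions.met <- successes >= 10 && failures >= 10
conditions.met
```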
Question 4:
What is the value of the estimated standard error? Use the formula from the Week 5 slides and estimate the standard error using the normal distribution.
se <- sqrt((p.hat*(1-p.hat))/n)
se
Standard Error = 0.01093979
Question 5:
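# Note: the third positional argument of prop.test() is the null proportion p, not the confidence level.
# The interval below is still a 95% interval because conf.level defaults to 0.95.
prop.test(k, n, .95)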
##
## 1-sample proportions test with continuity correction
##
## data: k out of n, null probability 0.95
## X-squared = 164.89, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is not equal to 0.95
## 95 percent confidence interval:
## 0.8376433 0.8815296
## sample estimates:
## p
## 0.861
z = qnorm(.975, 0, 1)
z
## [1] 1.959964
qnorm(.025,0, 1)
## [1] -1.959964
x = z * se
upper = p.hat + x
lower = p.hat - x
lower
## [1] 0.8395584
upper
## [1] 0.8824416
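The manual calculation above is the normal-approximation (Wald) interval, which is why it differs slightly from the prop.test() interval: prop.test() reports a Wilson score interval with continuity correction. As a sketch, the manual steps can be wrapped in a small function. The name `wald.ci` is hypothetical, and the counts 861 and 1000 are assumed from the reported p.hat and standard error.

```r
# Hypothetical helper: Wald (normal-approximation) interval for a proportion
wald.ci <- function(k, n, conf.level = 0.95) {
  p.hat <- k / n
  se <- sqrt(p.hat * (1 - p.hat) / n)      # estimated standard error
  z <- qnorm(1 - (1 - conf.level) / 2)     # two-sided critical value
  c(lower = p.hat - z * se, upper = p.hat + z * se)
}

ci <- wald.ci(861, 1000)  # assumed counts: 861 insured out of 1000
ci
```

Changing `conf.level` (for example to 0.90 or 0.99) narrows or widens the interval without touching the rest of the calculation.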
Question 6:
What is the value of the estimated standard error? Use bootstrap simulations like in HW 4 to find the standard error.
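SE ≈ 0.011 (the exact value changes slightly from run to run because no seed is set; the run shown below gave 0.01101536)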
# Bootstrap the sample proportion: resample n observations with replacement, 10,000 times
boot.phats <- numeric(10000)  # preallocate instead of growing the vector in the loop
for (i in 1:10000) {
  boot.samp <- sample(ACS$HealthInsurance, n, replace = TRUE)
  boot.k <- length(which(boot.samp == 1))  # number insured in this resample
  boot.phats[i] <- boot.k / n
}
hist(boot.phats)
mean(boot.phats)
## [1] 0.8611106
SE <-sd(boot.phats)
SE
## [1] 0.01101536
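The same bootstrap can be written without an explicit loop, and a seed makes the reported standard error reproducible. This is only a sketch: the vector `insurance` is a stand-in for ACS$HealthInsurance built from the assumed counts (861 coded 1, 139 coded 0), the 0 coding for the uninsured is an assumption, and the seed value is arbitrary.

```r
set.seed(1)  # arbitrary seed so repeated runs give the same result

# Stand-in data (assumption): 861 insured coded 1, 139 uninsured coded 0
insurance <- c(rep(1, 861), rep(0, 139))
n <- length(insurance)

# replicate() returns one bootstrap sample proportion per resample
boot.phats <- replicate(10000, mean(sample(insurance, n, replace = TRUE) == 1))

boot.se <- sd(boot.phats)  # bootstrap estimate of the standard error
boot.se
```

The result should land close to the formula-based standard error from Question 4, since both estimate the spread of the same sampling distribution.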
Question 7:
Find a confidence interval for the true proportion of US residents who have health insurance based on a confidence level that you choose and the standard error you calculated in question 6. Your confidence interval should be very similar to question 5.
95% Confidence Interval = (0.839, 0.882), taken from the 2.5th and 97.5th percentiles of the 10,000 bootstrap proportions
CI.lb <- sort(boot.phats)[250]
CI.ub <- sort(boot.phats)[9750]
CI.lb
## [1] 0.839
CI.ub
## [1] 0.882
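The question also asks for an interval built from the Question 6 standard error. A sketch of that version, assuming a 95% confidence level and plugging in the values printed earlier (p.hat = 0.861 and the bootstrap SE of 0.01101536); the name `CI.se` is illustrative.

```r
p.hat <- 0.861     # sample proportion reported in Question 1
SE <- 0.01101536   # bootstrap standard error printed in Question 6

z <- qnorm(0.975)  # critical value for a 95% confidence level
CI.se <- c(lower = p.hat - z * SE, upper = p.hat + z * SE)
round(CI.se, 3)
```

This interval should be very close to the percentile interval above. For the percentile approach, `quantile(boot.phats, c(0.025, 0.975))` gives similar endpoints without sorting by hand.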