A survey on 1,509 high school seniors who took the SAT and who completed an optional web survey between April 25 and April 30, 2007 shows that 55% of high school seniors are fairly certain that they will participate in a study abroad program in college.
n <- 1509 #number of respondents
p <- 0.55 # sample proportion fairly certain they will participate in study abroad
(a) Is this sample a representative sample from the population of all high school seniors in the US? Explain your reasoning.
No.
Respondents include only students who took the SAT and who chose to complete an optional web survey between April 25 and April 30, 2007. Several groups are therefore excluded:
those who were not willing to respond (for example, students with no interest in the topic), those who did not take the SAT (for example, students who took the ACT or no admissions exam at all),
and those who did not complete the survey during that six-day window. Because participation was voluntary and limited to SAT takers, the sample is likely biased rather than representative of all US high school seniors.
(b) Let’s suppose the conditions for inference are met.
(Observations are independent; the success-failure condition is met \(\Rightarrow\) the sampling distribution of the sample proportion is nearly normal.)
Even if your answer to part (a) indicated that this approach would not be reliable, this analysis may still be interesting to carry out (though not report). Construct a 90% confidence interval for the proportion of high school seniors (of those who took the SAT) who are fairly certain they will participate in a study abroad program in college, and interpret this interval in context.
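Before building the interval, the success-failure condition assumed in part (b) can be checked numerically. The sketch below is a minimal check, assuming the usual rule-of-thumb threshold of at least 10 expected successes and 10 expected failures; the names `expected_successes`, `expected_failures`, and `conditions_met` are illustrative and not part of the original analysis.

```r
n <- 1509  # number of respondents (from the problem statement)
p <- 0.55  # sample proportion (from the problem statement)

# Expected counts of "successes" and "failures" under the sample proportion
expected_successes <- n * p
expected_failures  <- n * (1 - p)

# Threshold of 10 is the common textbook rule of thumb (an assumption here)
conditions_met <- expected_successes >= 10 && expected_failures >= 10

c(successes = expected_successes, failures = expected_failures)
conditions_met
```

Both expected counts are far above 10, so the normal approximation is reasonable as far as sample size is concerned. Independence is a separate question that this check does not address.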
Confidence interval: point estimate \(\pm\ z^{*} \times SE\)
Standard error: \(SE = \sqrt{\frac{p\,(1-p)}{n}}\)
SE <- sqrt(p * (1 - p) / n) # estimated standard error of the sample proportion
z <- qnorm(0.95) # critical value for a two-sided 90% interval: 5% in each tail, so use the 0.95 quantile
confidence_interval_upper <- p + z * SE
confidence_interval_lower <- p - z * SE
c(confidence_interval_lower, confidence_interval_upper)
## [1] 0.5289346 0.5710654
We are 90% confident that the true proportion of high school seniors (of those who took the SAT) who are fairly certain they will participate in a study abroad program in college is between 52.89% and 57.11%.
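The same calculation can be wrapped in a small function so the confidence level becomes a parameter. This is a sketch only: `prop_ci` is a hypothetical helper, not a base R function, and it assumes the normal approximation from part (b). For a two-sided interval at level `conf_level`, the critical value is the quantile that leaves `(1 - conf_level) / 2` in each tail.

```r
# Hypothetical helper (not part of the original analysis): Wald-style
# confidence interval for a single proportion using the normal approximation.
prop_ci <- function(p_hat, n, conf_level = 0.90) {
  alpha  <- 1 - conf_level
  z_star <- qnorm(1 - alpha / 2)           # two-sided critical value
  se     <- sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
  c(lower = p_hat - z_star * se, upper = p_hat + z_star * se)
}

ci_90 <- prop_ci(0.55, 1509, conf_level = 0.90)
ci_95 <- prop_ci(0.55, 1509, conf_level = 0.95)

ci_90
ci_95
```

Raising the confidence level widens the interval, because a larger critical value is needed to capture the true proportion more often.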
(c) What does “90% confidence” mean?
By definition, a confidence interval is a plausible range of values for the population parameter. “90% confidence” describes how reliable the method is, not any single interval.
If we repeatedly drew random samples of the same size from the population and built a 90% confidence interval from each, about 90% of those intervals would contain the true proportion (the percentage of high school seniors who are fairly certain that they will participate in a study abroad program in college). It does not mean that 90% of the population, or 90% of sample proportions, fall within this particular interval.
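The repeated-sampling meaning of “90% confidence” can be illustrated with a short simulation. The sketch below assumes a hypothetical true proportion of 0.55 (`true_p`), which is unknown in practice; the seed and the number of repetitions are arbitrary choices. It draws many samples of size 1,509, builds a 90% interval from each, and records how often the interval contains `true_p`.

```r
set.seed(2007)    # arbitrary seed, only for reproducibility
true_p <- 0.55    # hypothetical true proportion (assumed for the simulation)
n      <- 1509    # sample size from the survey
reps   <- 10000   # number of simulated samples (arbitrary)
z      <- qnorm(0.95)  # critical value for a two-sided 90% interval

# Simulate sample proportions and build one interval per simulated sample
p_hat <- rbinom(reps, size = n, prob = true_p) / n
se    <- sqrt(p_hat * (1 - p_hat) / n)
lower <- p_hat - z * se
upper <- p_hat + z * se

# Fraction of simulated intervals that contain the assumed true proportion
coverage <- mean(lower <= true_p & true_p <= upper)
coverage
```

The resulting coverage should come out close to 0.90, which is what the confidence level promises about the procedure over many samples.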
(d) Based on this interval, would it be appropriate to claim that the majority of high school seniors are fairly certain that they will participate in a study abroad program in college?
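Yes, with a caveat. Since both ends of the 90% confidence interval are above 50%, the interval supports the claim that the majority of high school seniors are fairly certain that they will participate in a study abroad program in college. However, as noted in part (a), the sample covers only SAT takers who chose to respond, so the claim is appropriate for that group rather than for all high school seniors in the US.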