Interval estimation for a population proportion

Exercise 1. An automobile club pays for emergency road services (ERS) requested by its members. Upon examining a sample of 2927 ERS calls from the club members, the club finds that 1499 calls related to starting problems, 849 calls involved serious mechanical failures requiring towing, 498 calls involved flat tires or lockouts, and 81 calls were for other reasons.

  1. Construct a \(98\%\) confidence interval “by hand” for the proportion of all ERS calls from club members that are serious mechanical problems requiring towing services (after checking that necessary assumptions are well met).
qnorm(0.99, 0, 1)
## [1] 2.326348
qnorm(0.025, 0, 1)
## [1] -1.959964
  1. The rate the automobile club currently pays under its policy assumes that \(20\%\) of service requests will be serious mechanical problems requiring towing. However, the insurance company claims that the club's true rate of such requests is higher. Using your confidence interval in part (a), respond to the insurance company’s claim.
  1. The club wants to construct a \(95\%\) confidence interval for the proportion of members who want a chocolate fountain at the annual picnic. They want the margin of error to be less than 0.01. How large a random sample of club members should they contact if they start with the assumption that \(50\%\) are in favor of a chocolate fountain at the picnic? (Hint: write out the formula for the margin of error, then solve for \(n\).)
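The "by hand" interval and the sample-size calculation in Exercise 1 can be sketched in R as follows. This is only a sketch under stated assumptions: it uses the usual large-sample (Wald) interval \(\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}\), and the "at least 10 successes and 10 failures" check is an assumed rule of thumb that may differ from the one used in your course. All variable names are illustrative.

```r
# Counts taken from the exercise statement
n_calls <- 2927   # total ERS calls in the sample
n_tow   <- 849    # calls with serious mechanical failures requiring towing

p_hat <- n_tow / n_calls

# Assumed large-sample condition: at least 10 successes and 10 failures
stopifnot(n_tow >= 10, n_calls - n_tow >= 10)

# 98% interval: 1% in each tail, so the critical value is qnorm(0.99)
z_98  <- qnorm(0.99)
se    <- sqrt(p_hat * (1 - p_hat) / n_calls)
ci_98 <- c(lower = p_hat - z_98 * se, upper = p_hat + z_98 * se)
print(ci_98)

# Sample size for a 95% interval with margin of error below 0.01:
# ME = z * sqrt(p(1 - p) / n)  =>  n = (z / ME)^2 * p * (1 - p), rounded up
z_95      <- qnorm(0.975)
p_guess   <- 0.50
target_me <- 0.01
n_needed  <- ceiling((z_95 / target_me)^2 * p_guess * (1 - p_guess))
print(n_needed)
```

Rounding up with `ceiling()` rather than to the nearest integer keeps the margin of error under the target.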

T test for a single population mean

Exercise 2. Recall the cherry tree data set in R, trees. Note that the diameter (in inches) is labelled Girth in the data.

  1. Consider the hypothesis test of \(H_0: \mu_D=12\) vs \(H_A: \mu_D \ne 12\) where \(\mu_D\) is the mean diameter of cherry trees from which this sample was collected. Use an alpha level of \(\alpha=0.10\).
mean(trees$Girth)
## [1] 13.24839
sd(trees$Girth)
## [1] 3.138139
qt(0.05, 30)
## [1] -1.697261
qt(0.95, 30)
## [1] 1.697261
pt(2.2149, 30, lower.tail = FALSE)
## [1] 0.01725232
t.test(trees$Girth, mu=12)
## 
##  One Sample t-test
## 
## data:  trees$Girth
## t = 2.2149, df = 30, p-value = 0.0345
## alternative hypothesis: true mean is not equal to 12
## 95 percent confidence interval:
##  12.09731 14.39947
## sample estimates:
## mean of x 
##  13.24839
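The pieces printed above (sample mean, sample standard deviation, and the \(t\) distribution with 30 degrees of freedom) combine into the test statistic \(t = (\bar{x} - \mu_0)/(s/\sqrt{n})\). A minimal sketch of that calculation is shown here; `one_sample_t` is a hypothetical helper written for illustration, not a function from R or from the course materials.

```r
# `trees` ships with base R (datasets package), so no extra packages are needed.

# Hypothetical helper: two-sided one-sample t test computed from first principles
one_sample_t <- function(x, mu0) {
  n      <- length(x)
  t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
  # Two-sided p-value: twice the tail area beyond |t|
  p_val  <- 2 * pt(-abs(t_stat), df = n - 1)
  c(t = t_stat, df = n - 1, p_value = p_val)
}

by_hand <- one_sample_t(trees$Girth, mu0 = 12)
print(by_hand)

# Confirm against the built-in test
check <- t.test(trees$Girth, mu = 12)
print(all.equal(by_hand[["t"]], unname(check$statistic)))
print(all.equal(by_hand[["p_value"]], check$p.value))
```

The same helper can be reused for the height test by passing `trees$Height` and `mu0 = 77`.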
  1. Compute the t test statistic and p-value by hand (not using t.test), then confirm the values using t.test.
  1. Use the p-value to draw a conclusion about the hypotheses \(H_0: \mu_D=12\) vs \(H_A: \mu_D \ne 12\) in the context of the question.
  1. Compare the conclusions drawn from the 90% confidence interval for \(\mu_D\) in homework 5, exercise 2(b) and the hypothesis test in the previous question.
  1. Consider the hypothesis test of \(H_0: \mu_H=77\) vs \(H_A: \mu_H \ne 77\) where \(\mu_H\) is the mean height of cherry trees from which this sample was collected. Use an alpha level of \(\alpha=0.10\).
mean(trees$Height)
## [1] 76
sd(trees$Height)
## [1] 6.371813
qt(0.05, 30)
## [1] -1.697261
qt(0.95, 30)
## [1] 1.697261
pt(-0.8738, 30)
## [1] 0.1945842
t.test(trees$Height, mu= 77)
## 
##  One Sample t-test
## 
## data:  trees$Height
## t = -0.87381, df = 30, p-value = 0.3892
## alternative hypothesis: true mean is not equal to 77
## 95 percent confidence interval:
##  73.6628 78.3372
## sample estimates:
## mean of x 
##        76
  1. Compute the t test statistic and p-value by hand (not using t.test), then confirm the values using t.test.
  1. Use the p-value to draw a conclusion about the hypotheses \(H_0: \mu_H=77\) vs \(H_A: \mu_H \ne 77\) in the context of the question.
  1. Compare the conclusions drawn from the 90% confidence interval for \(\mu_H\) in homework 5, exercise 2(b) and the hypothesis test in the previous question.
  1. The code below calculates the lower and upper critical values needed for a 90% bootstrap confidence interval for \(\mu_D\) (mean diameter). Do not edit this code - just run the chunk and read off the output.
n <- 31
x_bar <- mean(trees$Girth)

t_hat <- numeric(1000)

set.seed(371)
# Bootstrap loop
for(i in 1:1000){
  # 2. Draw a SRS of size n from data
  x_star <- sample(trees$Girth, size = n, replace = TRUE)
  
  # 3. Calculate resampled mean and sd
  x_bar_star <- mean(x_star)
  s_star <- sd(x_star)
  
  # 4. Calculate t_hat, and store it in vector
  t_hat[i] <- (x_bar_star - x_bar) / (s_star/sqrt(n))
}

# Find left and right critical values of approx. distribution
quantile(t_hat, probs = 0.05, names = FALSE)
## [1] -1.690054
quantile(t_hat, probs = 0.95, names = FALSE)
## [1] 1.523721
mean(trees$Girth)
## [1] 13.24839
sd(trees$Girth)
## [1] 3.138139
hist(t_hat)

Use these critical values to construct a 90% bootstrap t confidence interval for \(\mu_D\) (mean diameter) from the sample data in the trees data set. Compare this confidence interval to the regular t CI constructed in homework 5, exercise 2(b), and brainstorm possible reasons for any differences you notice.
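One way to assemble the interval from the printed critical values is sketched below. It assumes the common bootstrap t convention in which the upper critical value sets the lower endpoint and the lower critical value sets the upper endpoint, \(\left(\bar{x} - t^*_{0.95}\, s/\sqrt{n},\ \bar{x} - t^*_{0.05}\, s/\sqrt{n}\right)\); if your course notes define the interval differently, follow those instead. The critical values are copied from the output above rather than recomputed, and the variable names are illustrative.

```r
# `trees` ships with base R (datasets package)
x_bar <- mean(trees$Girth)
s     <- sd(trees$Girth)
n     <- length(trees$Girth)
se    <- s / sqrt(n)

# Bootstrap critical values copied from the printed output above
t_lower <- -1.690054   # 5th percentile of t_hat
t_upper <-  1.523721   # 95th percentile of t_hat

# Assumed convention: the critical values swap roles when forming the endpoints
boot_ci <- c(lower = x_bar - t_upper * se, upper = x_bar - t_lower * se)
print(boot_ci)

# Regular 90% t interval, for comparison
t_ci <- x_bar + c(lower = -1, upper = 1) * qt(0.95, df = n - 1) * se
print(t_ci)
```

Because the bootstrap critical values are not symmetric about zero, the resulting interval need not be centered on the sample mean, which is one thing to look for when comparing it with the regular t interval.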