Submit your homework to Canvas by the due date and time. Email your lecturer if you have extenuating circumstances and need to request an extension.
If an exercise asks you to use R, include a copy of all relevant code and output in your submitted homework file. You can copy/paste your code, take screenshots, or compile your work in an R Markdown document.
If a problem does not specify how to compute the answer, you may use any appropriate method. I may ask you to use R or manual calculations on your exams, so practice accordingly.
You must include an explanation and/or intermediate calculations for an exercise to be complete.
Be sure to submit the HWK6 Autograde Quiz, which accounts for about 20 of your 40 accuracy points.
50 points total: 40 points for accuracy and 10 points for completion.
Exercise 1. An automobile club pays for emergency road services (ERS) requested by its members. Upon examining a sample of 2927 ERS calls from the club members, the club finds that 1499 calls related to starting problems, 849 calls involved serious mechanical failures requiring towing, 498 calls involved flat tires or lockouts, and 81 calls were for other reasons.
- Construct a \(98\%\) confidence interval “by hand” for the proportion of all ERS calls from club members that are serious mechanical problems requiring towing services (after checking that necessary assumptions are well met).
qnorm(0.99, 0, 1)
## [1] 2.326348
qnorm(0.025, 0, 1)
## [1] -1.959964
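One way to check a by-hand interval is to reproduce the arithmetic in R. The sketch below assumes the usual large-sample interval \(\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}\) and the common rule of thumb of at least 10 successes and 10 failures. If lecture used a different condition or formula, follow that instead. The variable names are illustrative and not part of the assignment.

```r
# Counts taken from the exercise statement
n <- 2927          # total ERS calls sampled
x <- 849           # calls involving serious mechanical failures requiring towing

p_hat <- x / n

# Success/failure check (assumed rule of thumb: both counts at least 10)
c(successes = x, failures = n - x)

# 98% confidence leaves 1% in each tail, so use the 0.99 quantile
z_star <- qnorm(0.99)
se <- sqrt(p_hat * (1 - p_hat) / n)

ci <- c(lower = p_hat - z_star * se, upper = p_hat + z_star * se)
ci
```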
- The current policy rate the automobile club pays is based on the assumption that \(20\%\) of services requested will be serious mechanical problems requiring towing. However, the insurance company claims that the auto club has a higher rate of serious mechanical problems requiring towing services. Using your confidence interval from part (a), respond to the insurance company’s claim.
- The club wants to construct a \(95\%\) confidence interval for the proportion of members who want a chocolate fountain at the annual picnic. They want the margin of error to be less than 0.01. How large a random sample of club members should they contact if they start with the assumption that \(50\%\) are in favor of a chocolate fountain at the picnic? (Hint: write out the formula for the margin of error, then solve for \(n\).)
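Following the hint, the margin of error for a proportion is \(ME = z^* \sqrt{p(1-p)/n}\), which rearranges to \(n = (z^*/ME)^2 \, p(1-p)\). A minimal sketch of that rearrangement in R, assuming the same large-sample formula and rounding up to the next whole person (variable names are illustrative):

```r
z_star <- qnorm(0.975)   # 95% confidence leaves 2.5% in each tail
p_0    <- 0.50           # planning value from the exercise
me     <- 0.01           # target margin of error

# Solve ME = z * sqrt(p(1-p)/n) for n
n_exact <- (z_star / me)^2 * p_0 * (1 - p_0)

# Round up so the margin of error stays below the target
n_needed <- ceiling(n_exact)
n_needed
```

Rounding up rather than to the nearest integer matters here: rounding down would leave the margin of error slightly above the target.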
Exercise 2. Recall the cherry tree data set in R, trees. Note that the diameter (in inches) is labelled Girth in the data.
- Consider the hypothesis test of \(H_0: \mu_D=12\) vs \(H_A: \mu_D \ne 12\) where \(\mu_D\) is the mean diameter of cherry trees from which this sample was collected. Use an alpha level of \(\alpha=0.10\).
mean(trees$Girth)
## [1] 13.24839
sd(trees$Girth)
## [1] 3.138139
qt(0.05, 30)
## [1] -1.697261
qt(0.95, 30)
## [1] 1.697261
pt(2.2149, 30, lower.tail = FALSE)
## [1] 0.01725232
t.test(trees$Girth, mu=12)
##
## One Sample t-test
##
## data: trees$Girth
## t = 2.2149, df = 30, p-value = 0.0345
## alternative hypothesis: true mean is not equal to 12
## 95 percent confidence interval:
## 12.09731 14.39947
## sample estimates:
## mean of x
## 13.24839
- Compute the t test statistic and p-value by hand (not using t.test) and then confirm the values using t.test.
- Use the p-value to draw a conclusion about the hypotheses: \(H_0: \mu_D=12\) vs \(H_A: \mu_D \ne 12\) in the context of the question.
- Compare the conclusions drawn from the 90% confidence interval for \(\mu_D\) in homework 5, exercise 2(b) and the hypothesis test in the previous question.
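The by-hand calculation for the diameter test can be mirrored in R without calling t.test until the final check. This is only a sketch, assuming the standard one-sample statistic \(t = (\bar{x} - \mu_0)/(s/\sqrt{n})\) with \(n-1\) degrees of freedom; the variable names are illustrative.

```r
x   <- trees$Girth       # diameter in inches
mu0 <- 12                # null value from H_0
n   <- length(x)

# Test statistic: (sample mean - null value) / standard error
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value: double the tail area beyond |t| with n - 1 df
p_val <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

c(t = t_stat, p_value = p_val)

# Cross-check against the built-in test
check <- t.test(x, mu = mu0)
all.equal(unname(check$statistic), t_stat)
all.equal(check$p.value, p_val)
```

Note that a single pt call gives only one tail area, so it has to be doubled to match the two-sided p-value that t.test reports.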
- Consider the hypothesis test of \(H_0: \mu_H=77\) vs \(H_A: \mu_H \ne 77\) where \(\mu_H\) is the mean height of cherry trees from which this sample was collected. Use an alpha level of \(\alpha=0.10\).
mean(trees$Height)
## [1] 76
sd(trees$Height)
## [1] 6.371813
qt(0.05, 30)
## [1] -1.697261
qt(0.95, 30)
## [1] 1.697261
pt(-0.8738, 30)
## [1] 0.1945842
t.test(trees$Height, mu= 77)
##
## One Sample t-test
##
## data: trees$Height
## t = -0.87381, df = 30, p-value = 0.3892
## alternative hypothesis: true mean is not equal to 77
## 95 percent confidence interval:
## 73.6628 78.3372
## sample estimates:
## mean of x
## 76
- Compute the t test statistic and p-value by hand (not using t.test) and then confirm the values using t.test.
- Use the p-value to draw a conclusion about the hypotheses: \(H_0: \mu_H=77\) vs \(H_A: \mu_H \ne 77\) in the context of the question.
- Compare the conclusions drawn from the 90% confidence interval for \(\mu_H\) in homework 5, exercise 2(b) and the hypothesis test in the previous question.
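A two-sided test at level \(\alpha = 0.10\) and a 90% t confidence interval are built from the same critical value, so they should agree: the test rejects \(H_0\) when the null value falls outside the interval. A short sketch of that check for the height data, recomputing the 90% interval here rather than taking it from homework 5 (variable names are illustrative):

```r
x     <- trees$Height
mu0   <- 77
alpha <- 0.10
n     <- length(x)

# 90% t interval: mean +/- t* times the standard error
t_star <- qt(1 - alpha / 2, df = n - 1)
se     <- sd(x) / sqrt(n)
ci     <- mean(x) + c(-1, 1) * t_star * se

# Two-sided p-value for H_0: mu_H = 77
p_val <- t.test(x, mu = mu0)$p.value

# These two logical values should agree
in_interval    <- ci[1] <= mu0 && mu0 <= ci[2]
fail_to_reject <- p_val > alpha
c(in_interval = in_interval, fail_to_reject = fail_to_reject)
```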
- The code below calculates the lower and upper critical values needed for a 90% bootstrap confidence interval for \(\mu_D\) (mean diameter). Do not edit this code - just run the chunk and read off the output.
n <- 31
x_bar <- mean(trees$Girth)
t_hat <- numeric(1000)
set.seed(371)
# Bootstrap loop
for(i in 1:1000){
# 2. Draw a SRS of size n from data
x_star <- sample(trees$Girth, size = n, replace = T)
# 3. Calculate resampled mean and sd
x_bar_star <- mean(x_star)
s_star <- sd(x_star)
# 4. Calculate t_hat, and store it in vector
t_hat[i] <- (x_bar_star - x_bar) / (s_star/sqrt(n))
}
# Find left and right critical values of approx. distribution
quantile(t_hat, probs = 0.05, names = F)
## [1] -1.690054
quantile(t_hat, probs = 0.95, names = F)
## [1] 1.523721
mean(trees$Girth)
## [1] 13.24839
sd(trees$Girth)
## [1] 3.138139
hist(t_hat)
Use these critical values to construct a 90% bootstrap t confidence interval for \(\mu_D\) (mean diameter) from the sample data in the trees data set. Compare this confidence interval to the regular t CI constructed in homework 5, 2(b) and brainstorm possible reasons for any differences you notice.
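A sketch of how the two critical values can be turned into an interval, assuming the usual bootstrap t convention in which the upper critical value sets the lower endpoint and the lower critical value sets the upper endpoint: \(\left(\bar{x} - t^*_{0.95}\, s/\sqrt{n},\ \bar{x} - t^*_{0.05}\, s/\sqrt{n}\right)\). Check this against the convention used in lecture. The critical values are copied from the bootstrap output above rather than recomputed, and the variable names are illustrative.

```r
x  <- trees$Girth
n  <- length(x)
se <- sd(x) / sqrt(n)

# Critical values read off the bootstrap output above
t_lower <- -1.690054
t_upper <-  1.523721

# Bootstrap t interval (assumed convention: the quantiles swap sides)
boot_ci <- c(lower = mean(x) - t_upper * se,
             upper = mean(x) - t_lower * se)

# Regular 90% t interval for comparison
t_ci <- mean(x) + c(-1, 1) * qt(0.95, df = n - 1) * se

boot_ci
t_ci
```

Because the bootstrap critical values are not symmetric around zero, the resulting interval is not centered on the sample mean, unlike the regular t interval.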