Submit your homework to Canvas by the due date and time. Email your lecturer if you have extenuating circumstances and need to request an extension.
If an exercise asks you to use R, include a copy of all relevant code and output in your submitted homework file. You can copy/paste your code, take screenshots, or compile your work in an Rmarkdown document.
If a problem does not specify how to compute the answer, you may use any appropriate method. I may ask you to use R or manual calculations on your exams, so practice accordingly.
You must include an explanation and/or intermediate calculations for an exercise to be complete.
Be sure to submit the HWK7 Autograde Quiz, which accounts for about 20 of your 40 accuracy points.
50 points total: 40 points for accuracy and 10 points for completion.
Exercise 1. At the Hawaii Pineapple Company, managers are interested in the size of the pineapples grown in the company’s fields. Last year, the weight of the pineapples harvested from one large field was roughly normally distributed with a mean of 31 ounces and a standard deviation of 4 ounces.
A different irrigation system was installed in this field after the growing season. Managers wonder if the mean weight of pineapples grown in the field this year will be different from last year's.
- Write out the managers’ question in terms of null (\(H_0\)) and alternative (\(H_A\)) hypotheses about the population mean weight \(\mu\) of all pineapples grown this year.
- If the managers choose to use a significance level of 0.10 and assume \(\sigma=4\), identify the power of a Z test to detect a mean increase of 2 ounces (\(\mu_A=33\)). They plan to look at a sample of 30 pineapples this year and do a two-sided hypothesis test. Also identify the probability of making a Type II error for this test.
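# Critical z values for a two-sided test at the 10% level
qnorm(0.05, 0, 1)
## [1] -1.644854
qnorm(0.95, 0, 1)
## [1] 1.644854
# Power: probability that the sample mean lands in the rejection region when mu = 33
# (standard error = 4/sqrt(30) = 0.7302967)
pnorm(29.78, 33, 0.7302967) + pnorm(32.20, 33, 0.7302967, lower.tail = FALSE)
## [1] 0.8633444
# Probability of a Type II error = 1 - power
1-.8633444
## [1] 0.1366556
# z values used in the sample-size calculation (alpha/2 = 0.05, power = 0.90)
qnorm(0.95, 0, 1)
## [1] 1.644854
qnorm(.90, 0, 1)
## [1] 1.281552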
- Draw the sampling distributions of the sample mean (\(\bar{X}\)) under the null and alternative hypotheses, then shade and label the areas that correspond to (i) Type I error, (ii) Type II error, and (iii) power from part (b). You can draw these pictures by hand.
- Describe Type I and Type II errors of the test in context.
- What sample size should the managers use to ensure their \(10\%\) level 2-sided Z test has power of at least 0.9 to detect a true mean of 33 ounces (assuming \(\sigma=4\))?
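The power and sample-size questions in this exercise can be checked numerically. What follows is a minimal sketch, assuming a two-sided Z test with \(\sigma=4\) and a 10% significance level; the helper name z_power and the variable names are illustrative and not part of the assignment. It uses unrounded rejection cutoffs, so the result can differ slightly from a calculation based on rounded cutoffs.

```r
# Power of a two-sided Z test of H0: mu = mu0 when the true mean is mu_a.
# z_power is a hypothetical helper written for this sketch.
z_power <- function(n, mu0 = 31, mu_a = 33, sigma = 4, alpha = 0.10) {
  se <- sigma / sqrt(n)             # standard error of the sample mean
  z_crit <- qnorm(1 - alpha / 2)    # two-sided critical value
  lower <- mu0 - z_crit * se        # reject H0 if the sample mean is below this
  upper <- mu0 + z_crit * se        # reject H0 if the sample mean is above this
  pnorm(lower, mean = mu_a, sd = se) +
    pnorm(upper, mean = mu_a, sd = se, lower.tail = FALSE)
}

power_30 <- z_power(30)     # power with the planned sample of 30
beta_30 <- 1 - power_30     # probability of a Type II error

# Smallest n whose power reaches 0.90, found by direct search
candidates <- 2:200
n_needed <- candidates[which(sapply(candidates, z_power) >= 0.90)[1]]

# Common closed-form approximation, which ignores the far rejection tail
n_approx <- ceiling(((qnorm(0.95) + qnorm(0.90)) * 4 / (33 - 31))^2)

print(c(power = power_30, beta = beta_30))
print(c(n_needed = n_needed, n_approx = n_approx))
```

The direct search and the closed-form approximation are shown side by side because the approximation drops the small probability of rejecting in the wrong tail; when the two agree, either route is a reasonable way to justify the answer.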
Exercise 2. Suppose the managers from Exercise 1 collect a random sample of size 51 from their field this year to test the hypotheses: \(H_0: \mu = 31\) versus \(H_A: \mu \ne 31\) at a 10% significance level. They still believe it is reasonable to assume the distribution of weights for the pineapples is approximately normal, but they are not going to assume they know \(\sigma\). The recorded weights are given below.
pineapples <- c(33.61, 33.88, 33.11, 37.05, 31.41, 30.69,
33.94, 30.54, 26.04, 36.01, 39.70, 35.60,
34.96, 33.06, 33.63, 34.59, 32.99, 36.08,
32.33, 36.16, 25.06, 35.70, 30.74, 34.33,
40.02, 30.95, 32.68, 36.57, 32.44, 33.58,
26.44, 29.52, 35.39, 37.96, 31.56, 37.47,
34.22, 27.87, 26.08, 30.81, 36.86, 32.92,
34.29, 40.86, 33.30, 28.46, 34.10, 35.85,
37.39, 37.80, 35.36)
- Evaluate whether the assumptions of a t test are reasonably met. You should use visual tools to help assess the assumptions.
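# Visual checks of the normality assumption
qqnorm(pineapples)
hist(pineapples)
# Sample mean and standard deviation
mean(pineapples)
## [1] 33.56784
sd(pineapples)
## [1] 3.578308
# Two-sided p-value for the observed t statistic with n - 1 = 50 degrees of freedom
2*pt(5.126, 50, lower.tail=FALSE)
## [1] 4.803707e-06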
- Compute a t test statistic and p-value and draw a conclusion for the hypothesis test \(H_0: \mu = 31\) versus \(H_A: \mu \ne 31\) at a 10% level. (Do this problem “by hand”, but you can use t.test to check your work.)
- The code below runs the bootstrap procedure on the pineapples data 1000 times. It will produce a vector called t_hat that contains all of the bootstrap sampling distribution values. You do not need to edit this code.
Complete the bootstrap hypothesis test of \(H_0: \mu = 31\) versus \(H_A: \mu \ne 31\) at the 10% level. You’ll need to use the t_hat vector to calculate a two-sided bootstrap p-value.
# Calculate summary statistics
x_bar <- mean(pineapples)
n <- length(pineapples)
# Create a vector to store t_hat values
t_hat <- numeric(1000)
# Bootstrap loop
for(i in 1:1000){
  # 2. Draw a SRS of size n from data
  x_star <- sample(pineapples, size = n, replace = TRUE)
  # 3. Calculate resampled mean and sd
  x_bar_star <- mean(x_star)
  s_star <- sd(x_star)
  # 4. Calculate t_hat, and store it in vector
  t_hat[i] <- (x_bar_star - x_bar) / (s_star/sqrt(n))
}
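# Observed t statistic from the t test
t_obs <- 5.126
# Number of bootstrap t values above and below the observed statistic
sum(t_hat > t_obs)
## [1] 0
sum(t_hat < t_obs)
## [1] 1000
# Two-sided bootstrap p-value: twice the smaller tail proportion
(0/1000)*2
## [1] 0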
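The counting step for a two-sided bootstrap p-value can be wrapped in a small function. This is a sketch of one common convention (twice the smaller tail proportion, capped at 1), not necessarily the only acceptable one; boot_p_value is a hypothetical helper name, and the toy vector below only stands in for t_hat so the block runs on its own.

```r
# Two-sided bootstrap p-value: twice the smaller tail proportion, capped at 1.
# boot_p_value is a hypothetical helper; t_hat holds the bootstrap t values
# and t_obs is the observed t statistic.
boot_p_value <- function(t_hat, t_obs) {
  upper_tail <- mean(t_hat >= t_obs)   # share of bootstrap values at or above t_obs
  lower_tail <- mean(t_hat <= t_obs)   # share of bootstrap values at or below t_obs
  min(1, 2 * min(upper_tail, lower_tail))
}

# Toy values standing in for t_hat (these are not the pineapple results)
toy_t_hat <- c(-2, -1, 0, 1, 2)
p_toy <- boot_p_value(toy_t_hat, t_obs = 1.5)
print(p_toy)
```

With the real vector the call would be boot_p_value(t_hat, 5.126). When no bootstrap value is as extreme as the observed statistic, the computed p-value is 0; many analysts would report that as "less than 1/1000" because the resolution is limited by the number of resamples.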
- Explain why the t, z, and bootstrap tools are all reasonable choices in this scenario.
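One way to see how close the t and z tools are for a sample of 51 is to compute the test statistic from the summary statistics reported in this exercise and compare the two reference distributions. The sketch below assumes the sample mean 33.56784 and standard deviation 3.578308 shown in the output for this exercise; the variable names are illustrative.

```r
# Summary statistics reported for the pineapple sample
x_bar <- 33.56784
s <- 3.578308
n <- 51
mu0 <- 31

se <- s / sqrt(n)               # estimated standard error of the mean
t_stat <- (x_bar - mu0) / se    # observed test statistic

# Two-sided p-values under the t (df = n - 1) and standard normal references
p_t <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)
p_z <- 2 * pnorm(abs(t_stat), lower.tail = FALSE)

# Critical values for a 10% two-sided test under each reference
crit_t <- qt(0.95, df = n - 1)
crit_z <- qnorm(0.95)

print(c(t_stat = t_stat, p_t = p_t, p_z = p_z))
print(c(crit_t = crit_t, crit_z = crit_z))
```

Running t.test(pineapples, mu = 31) on the full data vector is a convenient check on the hand calculation.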
Exercise 3. The p-value for a two-sided t test of the hypotheses \(H_0: \mu=15\), \(H_A: \mu \ne 15\) is 0.03.
- Does the 99% [two-sided] confidence interval for \(\mu\) using this same sample include 15? Why or why not?
- Does the 95% [two-sided] confidence interval for \(\mu\) using this same sample include 15? Why or why not?
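The link between a two-sided p-value and a two-sided confidence interval built from the same sample and the same t procedure can be written as a one-line rule: the \((1-\alpha)\) interval contains the null value exactly when the p-value is greater than \(\alpha\). A minimal sketch, with ci_contains_null as a hypothetical helper name:

```r
# TRUE when a two-sided confidence interval at conf_level would contain the
# null value, given the two-sided p-value from the matching test.
# ci_contains_null is a hypothetical helper written for this sketch.
ci_contains_null <- function(p_value, conf_level) {
  alpha <- 1 - conf_level
  p_value > alpha
}

ci_99 <- ci_contains_null(0.03, conf_level = 0.99)
ci_95 <- ci_contains_null(0.03, conf_level = 0.95)
print(c(ci_99 = ci_99, ci_95 = ci_95))
```

The rule only gives the yes or no answer; the written explanation should still say why failing to reject at level \(\alpha\) corresponds to the null value lying inside the interval.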
Exercise 4. A farmer feels he will need to give his staff specific directions on how to sort and price the pumpkins from his jumbo patch if there is strong evidence that their median weight is different from 16 lb. Let M be the population median. Use the sign test to test: \(H_0: M=16\) vs \(H_A: M \ne 16\) at \(\alpha=0.05\).
- The farmer sampled 9 pumpkins from his jumbo patch and found weights of: \[ 12.6, 12.9, 14.8, 14.3, 16, 19.1, 10.2, 11.4, 9.3\] Compute the observed test statistic and p-value, and draw a conclusion in the context of the problem.
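pumpkins <- c(12.6, 12.9, 14.8, 14.3, 16, 19.1, 10.2, 11.4, 9.3)
hist(pumpkins, xlab = "weight")
# One weight equals 16 and is dropped, leaving n = 8; only 1 weight (19.1) is above 16
# P(X <= 1) for X ~ Binomial(8, 0.5)
sum(dbinom(0:1, size=8, prob=0.50))
## [1] 0.03515625
# P(X >= 1), shown for comparison
sum(dbinom(1:8, size=8, prob=0.50))
## [1] 0.9960938
# Two-sided p-value: double the smaller tail probability
0.03515625*2
## [1] 0.0703125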
- Discuss how the statistical decision made in (a) compares with your impression of how much evidence against the null the sample contains.
- Explain how, if at all, your observed test statistic and p-value for the sign test of your hypotheses would change if the 19.1 lb pumpkin had actually weighed 35 lb.
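The sign test in this exercise depends only on how many weights fall above and below 16, which can be made explicit in code. The sketch below assumes the usual convention of dropping observations equal to the null median and doubling the smaller binomial tail; sign_test_p is a hypothetical helper name, and the vector heavier is the sample with the 19.1 value replaced by 35.

```r
# Two-sided sign test p-value for H0: median = m0.
# sign_test_p is a hypothetical helper written for this sketch.
sign_test_p <- function(x, m0) {
  x <- x[x != m0]                   # drop observations equal to the null median
  n <- length(x)
  n_above <- sum(x > m0)            # test statistic: count of values above m0
  k <- min(n_above, n - n_above)    # smaller of the two counts
  min(1, 2 * pbinom(k, size = n, prob = 0.5))
}

pumpkins <- c(12.6, 12.9, 14.8, 14.3, 16, 19.1, 10.2, 11.4, 9.3)
heavier <- replace(pumpkins, pumpkins == 19.1, 35)   # 19.1 swapped for 35

p_original <- sign_test_p(pumpkins, m0 = 16)
p_heavier <- sign_test_p(heavier, m0 = 16)
print(c(original = p_original, heavier = p_heavier))
```

The built-in call binom.test(1, 8, p = 0.5) offers an independent check on the p-value for the original sample.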