Solutions to Homework from Day 7

Question 4.15:

(a) Let \( \mu \) be the true mean amount of sleep that New Yorkers get (in hours).

\( H_0 \): \( \mu = 8 \).

\( H_1 \): \( \mu < 8 \).

(b) Let \( \mu \) be the true mean of the daily amount of worker time spent on personal business during the month of March (in minutes).

\( H_0 \): \( \mu = 15 \).

\( H_1 \): \( \mu > 15 \).

Question 4.17:

There are several errors in the hypotheses as constructed:

\( H_0 \): \( \bar{x} < 2 \)

\( H_A \): \( \bar{x} > 3.5 \).

It seems to me that each hypothesis should really involve \( \mu \), the true mean of the time spent by college students communicating online each week (in hours), rather than \( \bar{x} \), which is an observed value. Also, the null hypothesis \( H_0 \) should involve an equality. I don't mind the alternative hypothesis being an inequality, but it should relate to the null hypothesis, so I suggest \( \mu > 2 \). Calling the alternative hypothesis \( H_A \) is fine, though I often call it \( H_1 \). So I would have written:

\( H_0 \): \( \mu = 2 \).

\( H_1 \): \( \mu > 2 \).

We can't determine the p-value using only \( \bar{x} = 3.5 \) and \( n=60 \). The test statistic \( (\bar{x} - 2)/(s/\sqrt{n}) \) also depends on the standard deviation \( s \) of the data, which we'd need to know. So we can't reach a conclusion about whether to reject the null hypothesis.
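
To see why \( s \) matters, here is a minimal sketch that computes the one-sided p-value for a few made-up values of \( s \). The values 1, 5 and 15 are assumptions chosen only for illustration; the exercise does not report a standard deviation.

```r
# Values reported in the exercise
xbar <- 3.5   # sample mean (hours per week)
mu0  <- 2     # null value from H0
n    <- 60    # sample size

# Hypothetical standard deviations (NOT given in the exercise)
s.values <- c(1, 5, 15)

# One-sided p-value for H1: mu > 2, using a t distribution with n - 1 df
test.stats <- (xbar - mu0) / (s.values / sqrt(n))
p.values <- pt(test.stats, df = n - 1, lower.tail = FALSE)

print(data.frame(s = s.values, test.stat = test.stats, p = p.values))
```

The p-value grows as \( s \) grows, so the same \( \bar{x} \) and \( n \) could lead to rejecting or failing to reject \( H_0 \), depending on \( s \).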

Question 4.24:

(a) Perform a hypothesis test to evaluate if these data provide convincing evidence that the average IQ of mothers of gifted children is different than the average IQ for the population at large, which is 100. Use a significance level of 0.10.

We let \( \mu \) stand for the true average IQ of mothers of gifted children. The hypotheses are:

\( H_0 \): \( \mu = 100 \)

\( H_1 \): \( \mu \neq 100 \).

Here \( \alpha = 0.10 \), \( n = 36 \), \( \bar{x} = 118.2 \) and \( s = 6.5 \). The test statistic is computed with reference to the proposed mean (100) in the null hypothesis. We then compute the two-sided p-value of the test statistic.

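# Summary statistics given in the exercise
n = 36
mean = 118.2
s = 6.5
# t statistic: distance of the sample mean from the null value (100), in standard errors
test.stat = (mean - 100)/(s/sqrt(n))
# Two-sided p-value: double the lower-tail area below -|test.stat|
p = 2 * pt(-abs(test.stat), df = n - 1)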
print(p)
## [1] 2.498e-18
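
As a check on the code above, the test statistic can be worked out by hand from the same summary statistics, writing \( \mu_0 = 100 \) for the null value (a symbol introduced here for convenience):

\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{118.2 - 100}{6.5/\sqrt{36}} = \frac{18.2}{1.0833} \approx 16.8 \]

With 35 degrees of freedom, a value this far from 0 leaves essentially no area in either tail, which is why the p-value is so small.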

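Since \( p < \alpha \), we reject the null hypothesis. The data provide convincing evidence that the average IQ of mothers of gifted children differs from 100. Since the sample mean (118.2) is above 100, the data favor the claim that mothers of gifted children have a higher mean IQ than the population at large.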

(b) Calculate a 90% confidence interval for the average IQ of mothers of gifted children.

tstar = qt(0.95, df = n - 1)
low = mean - tstar * s/sqrt(n)
high = mean + tstar * s/sqrt(n)
print(c(low, high))
## [1] 116.4 120.0

A 90% CI for the average IQ of mothers of gifted children is (116.4, 120.0).
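
The interval follows the usual form for a t-based confidence interval. Worked by hand, with \( t^* \approx 1.690 \) standing for the value returned by qt(0.95, df = 35):

\[ \bar{x} \pm t^* \frac{s}{\sqrt{n}} = 118.2 \pm 1.690 \times \frac{6.5}{\sqrt{36}} \approx 118.2 \pm 1.83 \]

This gives approximately (116.37, 120.03), which matches the printed output after rounding.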

(c) Do your results from the hypothesis test and the confidence interval agree? Explain.

The 90% CI (116.4, 120.0) does not contain the proposed mean (100) from the null hypothesis, so this is another way to see that we should reject the null hypothesis at the 10% level of significance. For a two-sided test, checking whether the \( 100(1-\alpha)\% \) CI contains the null value is equivalent to comparing the p-value with \( \alpha \), so the two results agree.
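
The equivalence can be checked numerically. The sketch below uses the summary statistics from this question and compares the two decision rules over a handful of null values; the grid of null values other than 100 is hypothetical and chosen only to probe both sides of the interval.

```r
# Summary statistics from Question 4.24
n <- 36
xbar <- 118.2
s <- 6.5
alpha <- 0.10
se <- s / sqrt(n)

# Decision based on the two-sided p-value
reject.by.p <- function(mu0) {
  t.stat <- (xbar - mu0) / se
  p <- 2 * pt(-abs(t.stat), df = n - 1)
  p < alpha
}

# Decision based on whether the (1 - alpha) CI excludes mu0
reject.by.ci <- function(mu0) {
  tstar <- qt(1 - alpha / 2, df = n - 1)
  mu0 < xbar - tstar * se || mu0 > xbar + tstar * se
}

# Null values to try (only 100 comes from the exercise)
mu0.grid <- c(100, 116, 117, 118.2, 120, 121)
by.p <- sapply(mu0.grid, reject.by.p)
by.ci <- sapply(mu0.grid, reject.by.ci)
print(data.frame(mu0 = mu0.grid, by.p, by.ci))
```

The two columns agree for every null value tried, including values just inside and just outside the interval.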

Question 4.28:

A food safety inspector is called upon to investigate a restaurant with a few customer reports of poor sanitation practices. The food safety inspector uses a hypothesis testing framework to evaluate whether regulations are not being met. If he decides the restaurant is in gross violation, its license to serve food will be revoked.

(a) Write the hypotheses in words.

\( H_0 \): The regulations are being met (e.g. the true mean of the number of violations is \( \leq \) some threshold)

\( H_1 \): The regulations are not being met (e.g. the true mean of the number of violations is \( > \) the threshold)

(b) What is a Type 1 error in this context?

The restaurant is meeting the regulations, but the evidence suggests otherwise (e.g. because of sampling variation), so the inspector wrongly rejects \( H_0 \).

(c) What is a Type 2 error in this context?

The restaurant is not meeting the regulations, but the evidence is insufficient to show this, so the inspector fails to reject \( H_0 \).

(d) Which error is more problematic for the restaurant owner? Why?

A Type I error is more problematic for the restaurant owner, since the owner is meeting the sanitation standards but is about to be punished for not meeting them.

(e) Which error is more problematic for the diners? Why?

A Type II error is more problematic for the diners since they will expect to be eating in a clean environment, but in fact the restaurant should be subject to health regulation enforcement.

(f) As a diner, would you prefer that the food safety inspector requires strong evidence or very strong evidence of health concerns before revoking a restaurant's license? Explain your reasoning.

As a diner, I'd prefer that the food safety inspector requires strong evidence of health concerns before revoking a restaurant's license (rather than very strong evidence). That way I can be confident about the cleanliness of the establishment when I learn that it passed inspection. On the other hand, if standards are too strict, many good restaurants (including clean ones!) will have to close when they get violation notices based on weak evidence. A proper balance is probably necessary. Perhaps a first test with very strong evidence can certify most restaurants as in compliance and a second test (with more observations) could be run for restaurants that fail the first test. This would give good, clean restaurants an opportunity to avoid punishment.
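
The trade-off between the two evidence standards can be illustrated with a rough sketch. Every number below is hypothetical (the exercise gives no data): it assumes the inspector takes 20 measurements, that the restaurant truly exceeds the threshold by 1 unit with standard deviation 2, and that "strong" and "very strong" evidence correspond to significance levels of 0.05 and 0.001. It uses power.t.test from R's stats package to get the Type II error rate under each standard.

```r
# Hypothetical setup (none of these numbers come from the exercise)
n     <- 20   # number of measurements taken by the inspector
delta <- 1    # true excess over the threshold
sigma <- 2    # standard deviation of the measurements

# Type II error rate = 1 - power of a one-sided, one-sample t test
type2.rate <- function(alpha) {
  pw <- power.t.test(n = n, delta = delta, sd = sigma, sig.level = alpha,
                     type = "one.sample", alternative = "one.sided")
  1 - pw$power
}

beta.strong      <- type2.rate(0.05)   # "strong evidence" (assumed alpha)
beta.very.strong <- type2.rate(0.001)  # "very strong evidence" (assumed alpha)
print(c(strong = beta.strong, very.strong = beta.very.strong))
```

Under these assumptions, demanding very strong evidence lowers the chance of a Type I error but raises the chance of a Type II error, which is the outcome diners care about most.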