Question 1

Let \(X_1,\dots,X_{25}\) be i.i.d. \(N(\mu, \sigma^2)\) with known \(\sigma^2 = 36\) and unknown \(\mu\).
So \(\bar X \sim N(\mu, \sigma^2/n) = N(\mu, 36/25)\), and \[ \sigma_{\bar X} = \frac{6}{\sqrt{25}} = 1.2. \]

We test
\(H_0: \mu = 0\) vs \(H_1: \mu = 1.5\) at \(\alpha = 0.1\).

(a) Rejection region

Use the Z-test statistic under \(H_0\): \[ Z = \frac{\bar X - 0}{1.2} \sim N(0,1) \text{ under } H_0. \]

Because the alternative mean is larger (\(1.5 > 0\)), we use a right-tailed test: \[ \text{Reject } H_0 \text{ if } Z > z_{0.9}. \]

alpha <- 0.10
z_crit <- qnorm(1 - alpha)  # z_{0.90}
z_crit
## [1] 1.282
sigma_bar <- 6 / sqrt(25)
c_bar <- z_crit * sigma_bar  # critical value for Xbar
c_bar
## [1] 1.538

Thus,

  • In terms of \(Z\): Reject \(H_0\) if \(Z > z_{0.9} \approx 1.282\).
  • In terms of \(\bar X\): Reject \(H_0\) if
    \[ \bar X > 1.2 \cdot z_{0.9} \approx 1.54. \]

(b) Power at \(\mu = 1.5\)

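Power is \[ \text{Power}(\mu=1.5) = P_{\mu=1.5}\bigl(\bar X > c_{\bar X}\bigr) = P\left( \frac{\bar X - 1.5}{1.2} > \frac{c_{\bar X} - 1.5}{1.2} \right) = 1 - \Phi\left( \frac{c_{\bar X} - 1.5}{1.2} \right), \] where \(\Phi\) is the standard normal CDF.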

mu1 <- 1.5
power_q1 <- 1 - pnorm((c_bar - mu1) / sigma_bar)
power_q1
## [1] 0.4874

So the power of the test at \(\mu = 1.5\) is approximately 0.487.
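As a sanity check on the analytic power, the rejection rate can be estimated by simulation. The sketch below is illustrative only: the seed and the number of simulated samples (`n_sim`) are arbitrary assumptions, not part of the original solution.

```r
# Monte Carlo check of the power at mu = 1.5 (illustrative sketch)
set.seed(1)                          # arbitrary seed, for reproducibility only
n <- 25; sigma <- 6; mu1 <- 1.5
alpha <- 0.10
sigma_bar <- sigma / sqrt(n)         # standard deviation of the sample mean
c_bar <- qnorm(1 - alpha) * sigma_bar  # same cutoff for Xbar as in part (a)

n_sim <- 20000                       # assumed number of simulated samples
xbars <- replicate(n_sim, mean(rnorm(n, mean = mu1, sd = sigma)))

power_sim   <- mean(xbars > c_bar)   # fraction of samples that reject H0
power_exact <- 1 - pnorm((c_bar - mu1) / sigma_bar)
c(simulated = power_sim, exact = power_exact)
```

The simulated rate should land close to the analytic value, with the gap shrinking as `n_sim` grows.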


Question 2

Vacuum cleaner lifetimes (months), normal with unknown \(\mu\) and unknown \(\sigma^2\):

(a) Test mean < warranty (critical value approach)

We test
\(H_0: \mu = 36\) vs \(H_1: \mu < 36\) at \(\alpha = 0.05\).

Use the one-sample t-test: \[ T = \frac{\bar X - \mu_0}{s/\sqrt{n}} \sim t_{n-1} \text{ under } H_0. \]

n <- 30
xbar <- 34
s <- 5
mu0 <- 36
alpha <- 0.05
df <- n - 1

t_stat <- (xbar - mu0) / (s / sqrt(n))
t_crit <- qt(alpha, df = df)  # left-tailed critical t

t_stat
## [1] -2.191
t_crit
## [1] -1.699

Decision rule (critical value approach):

  • Reject \(H_0\) if \(T \le t_{0.05, 29}\).

Since \(T \approx -2.191\) is less than the critical value \(t_{0.05, 29} \approx -1.699\), we reject \(H_0\) and conclude that the mean lifetime is significantly shorter than 36 months at the 5% level.

(b) Test mean < warranty (p-value approach)

The p-value for a left-tailed t-test is \[ p\text{-value} = P(T_{29} \le t_{\text{obs}}). \]

pval_q2b <- pt(t_stat, df = df)
pval_q2b
## [1] 0.01832

Because the p-value is less than 0.05, we again reject \(H_0\) and conclude that the mean lifetime is significantly shorter than 36 months.
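Parts (a) and (b) must always agree, because \(T \le t_{\alpha, df}\) holds exactly when \(P(T_{df} \le t_{\text{obs}}) \le \alpha\). A minimal sketch of both rules side by side, using a hypothetical helper `left_t_test()` that is not part of the original solution:

```r
# Hypothetical helper: left-tailed one-sample t-test from summary statistics.
left_t_test <- function(xbar, s, n, mu0, alpha = 0.05) {
  df     <- n - 1
  t_stat <- (xbar - mu0) / (s / sqrt(n))
  t_crit <- qt(alpha, df = df)       # lower-tail critical value
  p_val  <- pt(t_stat, df = df)      # P(T_df <= t_obs)
  list(t_stat = t_stat,
       p_value = p_val,
       reject_by_critical_value = t_stat <= t_crit,
       reject_by_p_value        = p_val <= alpha)
}

res_q2 <- left_t_test(xbar = 34, s = 5, n = 30, mu0 = 36)
res_q2
```

Both logical flags come out the same for any inputs, since they are two statements of one inequality.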

(c) Test \(\sigma^2 > 20\) at 5% level

We test
\(H_0: \sigma^2 = 20\) vs \(H_1: \sigma^2 > 20\).

Use the chi-square test: \[ \chi^2 = \frac{(n-1)s^2}{\sigma_0^2} \sim \chi^2_{n-1} \text{ under } H_0. \]

sigma0_sq <- 20
chi_sq_stat <- (n - 1) * s^2 / sigma0_sq
chi_sq_stat
## [1] 36.25
chi_sq_crit <- qchisq(1 - alpha, df = df)  # upper-tail critical value
chi_sq_crit
## [1] 42.56

Decision rule:

  • Reject \(H_0\) if \(\chi^2 \ge \chi^2_{0.95, 29}\).

Here \(\chi^2_{\text{obs}} = 36.25\) is less than the critical value 42.56, so we fail to reject \(H_0\). There is not enough evidence to conclude that \(\sigma^2 > 20\) at the 5% level.
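The same decision can be reached through an upper-tail p-value. This is a short sketch using only the summary statistics given in the question; the variable name `pval_chisq` is introduced here for illustration.

```r
# Upper-tail p-value for the variance test (illustrative sketch)
n <- 30; s <- 5; sigma0_sq <- 20
chi_sq_stat <- (n - 1) * s^2 / sigma0_sq

# P(chi^2_{29} >= observed statistic) under H0
pval_chisq <- pchisq(chi_sq_stat, df = n - 1, lower.tail = FALSE)
pval_chisq
```

Because the observed statistic is below the critical value, this p-value is necessarily greater than 0.05.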

(d) Power when true \(\sigma^2 = 30\)

The rejection region from part (c) is \(\chi^2 > c\), where \(c = \chi^2_{0.95,29}\).

Under the true variance \(\sigma^2 = 30\), the statistic \[ \frac{(n-1)s^2}{\sigma_0^2} = \frac{\sigma^2}{\sigma_0^2} \cdot \frac{(n-1)s^2}{\sigma^2} = \frac{\sigma^2}{\sigma_0^2} \cdot \chi^2_{29} \] has a scaled chi-square distribution. The power is \[ \text{Power} = P\left( \frac{(n-1)s^2}{\sigma_0^2} > c \mid \sigma^2 = 30 \right) = P\left( \chi^2_{29} > c \cdot \frac{\sigma_0^2}{\sigma^2} \right). \]

sigma_true_sq <- 30

c_val <- chi_sq_crit
threshold <- c_val * sigma0_sq / sigma_true_sq  # c * (sigma0^2 / sigma^2)

power_q2 <- 1 - pchisq(threshold, df = df)
threshold
## [1] 28.37
power_q2
## [1] 0.4981

So the power of this variance test when the true variance is 30 is approximately 0.498.
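The same formula gives the power at any true variance. The sketch below wraps it in a hypothetical function `power_var_test()`; the grid of variances is an arbitrary illustration, not taken from the question.

```r
# Hypothetical helper: power of the upper-tailed chi-square variance test.
power_var_test <- function(sigma_true_sq, sigma0_sq = 20, n = 30, alpha = 0.05) {
  df    <- n - 1
  c_val <- qchisq(1 - alpha, df = df)          # critical value under H0
  # P(chi^2_df > c * sigma0^2 / sigma^2)
  pchisq(c_val * sigma0_sq / sigma_true_sq, df = df, lower.tail = FALSE)
}

sigma_grid <- c(20, 25, 30, 40)                # illustrative true variances
power_grid <- power_var_test(sigma_grid)
round(power_grid, 4)
```

At \(\sigma^2 = 20\) the power equals the significance level 0.05, and it increases as the true variance moves further above \(\sigma_0^2\).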


Question 3

Soft drink preference:

We test
\(H_0: p = 0.5\) vs \(H_1: p > 0.5\) at \(\alpha = 0.05\).

Use the one-sample proportion Z-test: \[ Z = \frac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}}. \]

n <- 500
x <- 270
p_hat <- x / n
p0 <- 0.5
alpha <- 0.05

se_p <- sqrt(p0 * (1 - p0) / n)
z_stat <- (p_hat - p0) / se_p
z_crit_q3 <- qnorm(1 - alpha)  # right-tailed

z_stat
## [1] 1.789
z_crit_q3
## [1] 1.645
pval_q3 <- 1 - pnorm(z_stat)
pval_q3
## [1] 0.03682

Because \(Z_{\text{obs}} > z_{0.95}\) and the p-value is less than 0.05, we reject \(H_0\).

Conclusion: There is sufficient evidence at the 5% level to support the claim that a majority of adults prefer this soft drink maker’s beverage.
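The Z-test relies on the normal approximation to the binomial. As a cross-check, an exact binomial test can be run with `binom.test()` from base R's `stats` package. This is an additional check, not the method used in the solution above.

```r
# Exact one-sided binomial test as a cross-check (illustrative sketch)
exact_q3 <- binom.test(x = 270, n = 500, p = 0.5, alternative = "greater")
exact_q3$p.value   # exact P(X >= 270) under H0: p = 0.5
```

With \(n = 500\) the exact and approximate p-values should be close, and the exact p-value also falls below 0.05 here.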


Question 4

Darwin’s plant data (paired):

cross <- c(23.5, 12.0, 21.0, 22.0, 19.1, 21.5, 22.1)
self  <- c(17.4, 20.4, 20.0, 20.2, 18.4, 18.6, 18.8)

diff <- cross - self
diff
## [1]  6.1 -8.4  1.0  1.8  0.7  2.9  3.3
mean(diff); sd(diff)
## [1] 1.057
## [1] 4.546

We assume both populations are normal and test whether there is a difference in mean height:

\[ H_0: \mu_{\text{cross}} - \mu_{\text{self}} = 0 \quad \text{vs}\quad H_1: \mu_{\text{cross}} - \mu_{\text{self}} \ne 0. \]

Use a paired t-test at \(\alpha = 0.05\).

t_test_q4 <- t.test(cross, self,
                    paired = TRUE,
                    alternative = "two.sided",
                    conf.level = 0.95)
t_test_q4
## 
##  Paired t-test
## 
## data:  cross and self
## t = 0.62, df = 6, p-value = 0.6
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -3.147  5.261
## sample estimates:
## mean difference 
##           1.057

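From the output: \(t = 0.62\) on 6 degrees of freedom, with a p-value of about 0.6 (well above 0.05), and the 95% confidence interval \((-3.147,\ 5.261)\) contains 0.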

Conclusion: We fail to reject \(H_0\). The data do not provide sufficient evidence at the 5% level to conclude that the mean heights of cross-pollinated and self-pollinated plants differ.
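A paired t-test is a one-sample t-test on the differences, so the `t.test()` output can be reproduced by hand. A minimal sketch follows; the names `d`, `t_manual`, `p_manual`, and `ci_manual` are introduced here for illustration.

```r
# Reproduce the paired t-test from the differences (illustrative sketch)
cross <- c(23.5, 12.0, 21.0, 22.0, 19.1, 21.5, 22.1)
self  <- c(17.4, 20.4, 20.0, 20.2, 18.4, 18.6, 18.8)

d    <- cross - self
n_d  <- length(d)
se_d <- sd(d) / sqrt(n_d)                      # standard error of the mean difference

t_manual  <- mean(d) / se_d                    # test statistic under H0: mean difference = 0
p_manual  <- 2 * pt(abs(t_manual), df = n_d - 1, lower.tail = FALSE)
ci_manual <- mean(d) + c(-1, 1) * qt(0.975, df = n_d - 1) * se_d

c(t = t_manual, p = p_manual)
ci_manual
```

The statistic and interval should match the `t.test()` output up to rounding.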


Question 5

Thread tensile strength:

We test the manufacturer’s claim that the mean of A exceeds the mean of B by more than 7 kg:

\[ H_0: \mu_A - \mu_B = 7 \quad \text{vs}\quad H_1: \mu_A - \mu_B > 7. \]

Observed difference: \[ \hat d = \bar X_A - \bar X_B = 86.8 - 77.8 = 9. \]

Under \(H_0\), the standard error of \(\hat d\) is \[ \text{SE} = \sqrt{ \frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}. \]

nA <- 50; nB <- 50
xbarA <- 86.8; xbarB <- 77.8
sigmaA <- 6.3; sigmaB <- 5.6

d_hat <- xbarA - xbarB
d0 <- 7

se_d <- sqrt(sigmaA^2 / nA + sigmaB^2 / nB)
z_stat_q5 <- (d_hat - d0) / se_d
z_crit_q5 <- qnorm(0.95)  # right-tailed alpha=0.05

z_stat_q5
## [1] 1.678
z_crit_q5
## [1] 1.645
pval_q5 <- 1 - pnorm(z_stat_q5)
pval_q5
## [1] 0.0467

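Conclusion: Since \(Z_{\text{obs}} \approx 1.678 > z_{0.95} \approx 1.645\) (p-value \(\approx 0.0467 < 0.05\)), we reject \(H_0\) and conclude there is sufficient evidence at the 5% level that the average tensile strength of thread A exceeds that of thread B by more than 7 kg.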
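The arithmetic behind the test statistic can be written out as a check (intermediate values rounded): \[ \text{SE} = \sqrt{ \frac{6.3^2}{50} + \frac{5.6^2}{50}} = \sqrt{0.7938 + 0.6272} = \sqrt{1.421} \approx 1.192, \qquad Z = \frac{\hat d - 7}{\text{SE}} = \frac{9 - 7}{1.192} \approx 1.678. \]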


Question 6

Internet browsing time:

Assume: \[ X_1 \sim N(\mu_1, \sigma_1^2), \quad X_2 \sim N(\mu_2, \sigma_2^2), \] independent samples.

(a) Test \(H_0: \sigma_1^2 / \sigma_2^2 = 1\) vs \(H_1: \sigma_1^2 / \sigma_2^2 > 1\)

The test statistic for the ratio of variances is \[ F = \frac{s_1^2}{s_2^2}. \]

n1 <- 14; n2 <- 16
s1_sq <- 14; s2_sq <- 4

df1 <- n1 - 1
df2 <- n2 - 1

F_stat <- s1_sq / s2_sq
F_stat
## [1] 3.5
F_crit <- qf(0.95, df1, df2)  # upper-tail critical value
F_crit
## [1] 2.448

Decision rule:

  • Reject \(H_0\) if \(F \ge F_{0.95, 13, 15}\).

Since \(F_{\text{obs}} > F_{\text{crit}}\), we reject \(H_0\) and conclude that \(\sigma_1^2 > \sigma_2^2\); men’s browsing times are more variable than women’s at the 5% level.
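The decision can also be stated with an upper-tail p-value. Base R's `var.test()` needs the raw samples, which are not given here, so this sketch works from the summary statistics directly. The name `pval_F` is introduced for illustration.

```r
# Upper-tail p-value for the F test from summary statistics (illustrative sketch)
n1 <- 14; n2 <- 16
s1_sq <- 14; s2_sq <- 4

F_stat <- s1_sq / s2_sq
pval_F <- pf(F_stat, df1 = n1 - 1, df2 = n2 - 1, lower.tail = FALSE)
pval_F
```

Because the observed statistic exceeds the critical value, this p-value is necessarily below 0.05.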

(b) Power when \(\sigma_1 = 2\sigma_2\)

If \(\sigma_1 = 2\sigma_2\), then \(\sigma_1^2 / \sigma_2^2 = 4\). Let \[ F' = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \sim F_{df1, df2}. \] Our actual statistic is \[ F = \frac{S_1^2}{S_2^2} = \frac{\sigma_1^2}{\sigma_2^2} F' = 4F'. \]

We reject if \(F > F_{\text{crit}}\), so under the true ratio 4: \[ \text{Power} = P(F > F_{\text{crit}}) = P(F' > F_{\text{crit}} / 4) = 1 - F_{F_{df1,df2}}(F_{\text{crit}} / 4). \]

F_crit
## [1] 2.448
power_q6b <- 1 - pf(F_crit / 4, df1, df2)
power_q6b
## [1] 0.8099

The power for detecting the variance ratio when \(\sigma_1 = 2\sigma_2\) is approximately 0.81.
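The same argument works for any true variance ratio, replacing 4 by \(\sigma_1^2/\sigma_2^2\). A sketch with a hypothetical function `power_f_test()`; the grid of ratios is an arbitrary illustration.

```r
# Hypothetical helper: power of the upper-tailed F test at a given true variance ratio.
power_f_test <- function(ratio, n1 = 14, n2 = 16, alpha = 0.05) {
  df1 <- n1 - 1
  df2 <- n2 - 1
  F_crit <- qf(1 - alpha, df1, df2)            # critical value under H0
  # P(F' > F_crit / ratio), where F' ~ F(df1, df2)
  pf(F_crit / ratio, df1, df2, lower.tail = FALSE)
}

ratios <- c(1, 2, 4, 9)                        # illustrative values of sigma1^2 / sigma2^2
power_ratios <- power_f_test(ratios)
round(power_ratios, 4)
```

A ratio of 1 returns the significance level 0.05, and a ratio of 4 corresponds to the case \(\sigma_1 = 2\sigma_2\) treated above.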

(c) Test whether mean browsing times differ at 5% level

We now want to test \[ H_0: \mu_1 = \mu_2 \quad \text{vs}\quad H_1: \mu_1 \ne \mu_2 \] at \(\alpha = 0.05\).

From part (a), we found evidence that the variances are not equal, so we use the Welch two-sample t-test (unequal variances).

The test statistic is \[ T = \frac{\bar X_1 - \bar X_2} {\sqrt{ \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \] with approximate Welch degrees of freedom.
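For reference, the Welch (Satterthwaite) degrees of freedom computed in the code are, writing \(\nu\) for the approximate degrees of freedom and using \(s_1^2/n_1 = 14/14 = 1\) and \(s_2^2/n_2 = 4/16 = 0.25\): \[ \nu = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}} = \frac{(1 + 0.25)^2}{\frac{1^2}{13} + \frac{0.25^2}{15}} \approx \frac{1.5625}{0.0769 + 0.0042} \approx 19.27. \]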

m1 <- 17; m2 <- 12

se_mean <- sqrt(s1_sq / n1 + s2_sq / n2)
t_stat_q6 <- (m1 - m2) / se_mean

df_welch <- (s1_sq / n1 + s2_sq / n2)^2 /
  ((s1_sq / n1)^2 / (n1 - 1) + (s2_sq / n2)^2 / (n2 - 1))

t_stat_q6
## [1] 4.472
df_welch
## [1] 19.27
pval_q6c <- 2 * (1 - pt(abs(t_stat_q6), df = df_welch))
pval_q6c
## [1] 0.0002532

Decision:

  • Two-sided test at \(\alpha = 0.05\).
  • Since the p-value is much smaller than 0.05, we reject \(H_0\).

Conclusion: Using the Welch two-sample t-test, there is significant evidence at the 5% level that the mean browsing times for men and women are different.

(d) 95% CI for \(\mu_1 - \mu_2\)

A two-sided 95% confidence interval with unequal variances is \[ (\bar X_1 - \bar X_2) \pm t_{0.975, \, df_{\text{Welch}}} \cdot \sqrt{ \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}. \]

alpha <- 0.05
t_crit_q6 <- qt(1 - alpha/2, df = df_welch)
t_crit_q6
## [1] 2.091
diff_mean <- m1 - m2
lower <- diff_mean - t_crit_q6 * se_mean
upper <- diff_mean + t_crit_q6 * se_mean

c(lower, upper)
## [1] 2.662 7.338

So the 95% confidence interval for \(\mu_1 - \mu_2\) is approximately \[ (2.662,\ 7.338). \] The interval excludes 0, which agrees with the rejection of \(H_0\) in part (c).
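The interval can be packaged for other confidence levels. The sketch below uses a hypothetical function `welch_ci()` that works from summary statistics; the 99% level is shown only as an illustration and is not asked for in the question.

```r
# Hypothetical helper: Welch confidence interval for mu1 - mu2 from summary statistics.
welch_ci <- function(m1, m2, s1_sq, s2_sq, n1, n2, conf_level = 0.95) {
  v1 <- s1_sq / n1
  v2 <- s2_sq / n2
  se <- sqrt(v1 + v2)                                        # SE of the mean difference
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))    # Welch-Satterthwaite df
  t_crit <- qt(1 - (1 - conf_level) / 2, df = df)
  (m1 - m2) + c(-1, 1) * t_crit * se
}

ci_95 <- welch_ci(m1 = 17, m2 = 12, s1_sq = 14, s2_sq = 4, n1 = 14, n2 = 16)
ci_99 <- welch_ci(m1 = 17, m2 = 12, s1_sq = 14, s2_sq = 4, n1 = 14, n2 = 16,
                  conf_level = 0.99)
rbind(ci_95, ci_99)
```

The 95% row should reproduce the interval above, and the 99% interval is wider because it uses a larger t quantile.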