Setup and Data

library(tidyverse)
library(broom)
library(knitr)
library(kableExtra)
library(car)
library(gtsummary)

brfss_mlr <- readRDS("brfss_slr_2020.rds")

glimpse(brfss_mlr)

## Rows: 3,000
## Columns: 5
## $ bmi       <dbl> 31.01, 33.45, 46.86, 20.93, 28.97, 26.45, 23.43, 46.65, 25.8…
## $ age       <dbl> 50, 42, 36, 68, 67, 64, 24, 55, 60, 76, 71, 44, 66, 40, 52, …
## $ sex       <fct> Male, Male, Female, Male, Female, Female, Female, Female, Fe…
## $ phys_days <dbl> 5, 5, 15, 30, 2, 1, 4, 30, 2, 2, 2, 14, 30, 5, 4, 1, 3, 5, 3…
## $ sleep_hrs <dbl> 7, 7, 8, 7, 9, 7, 8, 12, 7, 9, 8, 6, 7, 7, 7, 7, 8, 7, 6, 4,…

Variables in this dataset: phys_days (outcome), bmi, age, sex, sleep_hrs (predictors).

Round 1: Student A Drives, Student B Navigates

Task 1: Fitting Models and ANOVA Tables (15 points)

1a. Fit the three-predictor model (5 pts)

We fit the model:

\[\text{phys\_days} = \beta_0 + \beta_1 \cdot \text{bmi} + \beta_2 \cdot \text{sleep\_hrs} + \beta_3 \cdot \text{age} + \varepsilon\]

# Fit the 3-predictor model
m_3pred <- lm(phys_days ~ bmi + sleep_hrs + age,
              data = brfss_mlr)

# Display coefficients with confidence intervals
tidy(m_3pred, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 4))) %>%
  kable(
    caption = "Table 1a. Three-Predictor Model Coefficients",
    col.names = c("Term", "Estimate", "Std. Error", "t-statistic",
                  "p-value", "95% CI Lower", "95% CI Upper")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Table 1a. Three-Predictor Model Coefficients
Term	Estimate	Std. Error	t-statistic	p-value	95% CI Lower	95% CI Upper
(Intercept)	2.2115	1.3204	1.6749	0.0940	-0.3774	4.8004
bmi	0.1458	0.0283	5.1461	0.0000	0.0902	0.2014
sleep_hrs	-0.3430	0.1173	-2.9232	0.0035	-0.5731	-0.1129
age	0.1362	0.0115	11.8819	0.0000	0.1138	0.1587

# Extract and print the fitted equation
coefs <- round(coef(m_3pred), 4)
cat("Fitted equation:\n")

## Fitted equation:

cat("phys_days =", coefs["(Intercept)"],
    "+", coefs["bmi"],       "* bmi",
    "+", coefs["sleep_hrs"], "* sleep_hrs",
    "+", coefs["age"],       "* age\n")

## phys_days = 2.2115 + 0.1458 * bmi + -0.343 * sleep_hrs + 0.1362 * age

Interpretation: Each coefficient represents the estimated change in physically unhealthy days per unit increase in that predictor, holding the other two predictors constant. For example, \(\hat{\beta}_2\) is the change in phys_days for each additional hour of sleep, adjusting for BMI and age.

1b. Type I (Sequential) Sums of Squares via `anova()` (5 pts)

# Type I ANOVA table
anova_t1 <- anova(m_3pred)

anova_t1 %>%
  as.data.frame() %>%
  rownames_to_column("Source") %>%
  mutate(across(where(is.numeric), ~ round(., 2))) %>%
  kable(
    caption = "Table 1b. Type I (Sequential) ANOVA — anova(m_3pred)",
    col.names = c("Source", "df", "Sum of Sq", "Mean Sq", "F value", "p-value")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  footnote(general = "Predictors tested sequentially in entry order.")

Table 1b. Type I (Sequential) ANOVA — anova(m_3pred)
Source	df	Sum of Sq	Mean Sq	F value	p-value
bmi	1	3105.36	3105.36	26.32	0.00
sleep_hrs	1	267.35	267.35	2.27	0.13
age	1	16657.28	16657.28	141.18	0.00
Residuals	2996	353487.12	117.99	NA	NA
Note:
Predictors tested sequentially in entry order.

# Verify: sum of predictor Type I SS = model SSR
ss_vals   <- anova_t1$`Sum Sq`
ssr_model <- sum(ss_vals[1:3])  # SS for the 3 predictors
sse       <- ss_vals[4]         # Residual SS
ssy       <- ssr_model + sse

cat("SS(bmi):                      ", round(ss_vals[1], 2), "\n")

## SS(bmi):                       3105.36

cat("SS(sleep_hrs | bmi):          ", round(ss_vals[2], 2), "\n")

## SS(sleep_hrs | bmi):           267.35

cat("SS(age | bmi, sleep_hrs):     ", round(ss_vals[3], 2), "\n")

## SS(age | bmi, sleep_hrs):      16657.28

cat("---------------------------------------\n")

## ---------------------------------------

cat("SSR (sum of predictor SS):    ", round(ssr_model, 2), "\n")

## SSR (sum of predictor SS):     20029.99

cat("SSE (Residual):               ", round(sse, 2), "\n")

## SSE (Residual):                353487.1

cat("SSY (Total):                  ", round(ssy, 2), "\n")

## SSY (Total):                   373517.1

cat("R² = SSR / SSY =              ", round(ssr_model / ssy, 4), "\n")

## R² = SSR / SSY =               0.0536

# Cross-check with glance()
glance(m_3pred) %>%
  select(r.squared, adj.r.squared, nobs) %>%
  mutate(across(everything(), ~ round(., 4))) %>%
  kable(caption = "Model Summary Cross-Check") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Model Summary Cross-Check
r.squared	adj.r.squared	nobs
0.0536	0.0527	3000

Verification: The three predictor Type I SS sum exactly to the model SSR, confirming: \[SSR(\text{full}) = SS(\text{bmi}) + SS(\text{sleep\_hrs} \mid \text{bmi}) + SS(\text{age} \mid \text{bmi}, \text{sleep\_hrs})\]

1c. Type III (Partial) Sums of Squares and Comparison (5 pts)

# Type III ANOVA table
anova_t3 <- Anova(m_3pred, type = "III")

anova_t3 %>%
  as.data.frame() %>%
  rownames_to_column("Source") %>%
  mutate(across(where(is.numeric), ~ round(., 4))) %>%
  kable(
    caption = "Table 1c. Type III (Partial) ANOVA — car::Anova(m_3pred, type='III')",
    col.names = c("Source", "Sum of Sq", "df", "F value", "p-value")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  footnote(general = "Each variable tested after adjusting for ALL other predictors.")

Table 1c. Type III (Partial) ANOVA — car::Anova(m_3pred, type=‘III’)
Source	Sum of Sq	df	F value	p-value
(Intercept)	331.0032	1	2.8054	0.0940
bmi	3124.5319	1	26.4821	0.0000
sleep_hrs	1008.1861	1	8.5449	0.0035
age	16657.2770	1	141.1797	0.0000
Residuals	353487.1156	2996	NA	NA
Note:
Each variable tested after adjusting for ALL other predictors.

# Side-by-side comparison
tibble(
  Variable      = c("bmi", "sleep_hrs", "age"),
  `Type I SS`   = round(anova_t1$`Sum Sq`[1:3], 2),
  `Type III SS` = round(anova_t3$`Sum Sq`[2:4], 2),
  `Difference`  = round(anova_t1$`Sum Sq`[1:3] - anova_t3$`Sum Sq`[2:4], 2)
) %>%
  kable(caption = "Table 1c-2. Comparison of Type I vs. Type III SS") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  footnote(general = "The last variable entered (age) has identical Type I and Type III SS.")

Table 1c-2. Comparison of Type I vs. Type III SS
Variable	Type I SS	Type III SS	Difference
bmi	3105.36	3124.53	-19.17
sleep_hrs	267.35	1008.19	-740.84
age	16657.28	16657.28	0.00
Note:
The last variable entered (age) has identical Type I and Type III SS.

Which variable’s SS is the same in both tables?

age — the last variable entered — has identical Type I and Type III SS. Because age is entered last, its Type I SS is already conditioned on bmi and sleep_hrs (all other predictors), which is exactly what Type III SS does. For bmi and sleep_hrs, the two SS types differ because their Type I SS does not condition on variables entered after them.

Task 2: Type I vs. Type III Sums of Squares (15 points)

2a. Reversed variable order — effect on Type I SS (5 pts)

# Same model, reversed entry order
m_3pred_rev <- lm(phys_days ~ age + sleep_hrs + bmi,
                  data = brfss_mlr)

anova_rev <- anova(m_3pred_rev)

anova_rev %>%
  as.data.frame() %>%
  rownames_to_column("Source") %>%
  mutate(across(where(is.numeric), ~ round(., 2))) %>%
  kable(
    caption = "Table 2a. Type I ANOVA — Reversed Order (age → sleep_hrs → bmi)",
    col.names = c("Source", "df", "Sum of Sq", "Mean Sq", "F value", "p-value")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Table 2a. Type I ANOVA — Reversed Order (age → sleep_hrs → bmi)
Source	df	Sum of Sq	Mean Sq	F value	p-value
age	1	15695.40	15695.40	133.03	0
sleep_hrs	1	1210.06	1210.06	10.26	0
bmi	1	3124.53	3124.53	26.48	0
Residuals	2996	353487.12	117.99	NA	NA

tibble(
  Variable                     = c("bmi", "sleep_hrs", "age"),
  `Type I SS (Original Order)` = round(anova_t1$`Sum Sq`[1:3], 2),
  `Type I SS (Reversed Order)` = round(
    c(anova_rev$`Sum Sq`[3],   # bmi is now last
      anova_rev$`Sum Sq`[2],   # sleep_hrs is middle
      anova_rev$`Sum Sq`[1]),  # age is now first
    2)
) %>%
  kable(caption = "Table 2a-2. Type I SS: Original vs. Reversed Order") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Table 2a-2. Type I SS: Original vs. Reversed Order
Variable	Type I SS (Original Order)	Type I SS (Reversed Order)
bmi	3105.36	3124.53
sleep_hrs	267.35	1210.06
age	16657.28	15695.40

What changed? The Type I SS for bmi and age changed because each is now being conditioned on a different set of already-entered variables. The only SS that remains constant is for the last variable in each ordering — in the original order that is age; in the reversed order that is bmi — because in each case the last variable’s Type I SS equals its Type III SS.

2b. Type III SS on the reordered model (5 pts)

anova_t3_rev <- Anova(m_3pred_rev, type = "III")

tibble(
  Variable                       = c("bmi", "sleep_hrs", "age"),
  `Type III SS (Original Order)` = round(anova_t3$`Sum Sq`[2:4], 4),
  `Type III SS (Reversed Order)` = round(
    c(anova_t3_rev$`Sum Sq`[4],   # bmi
      anova_t3_rev$`Sum Sq`[3],   # sleep_hrs
      anova_t3_rev$`Sum Sq`[2]),  # age
    4)
) %>%
  kable(caption = "Table 2b. Type III SS: Invariant to Variable Entry Order") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  footnote(general = "Type III SS are identical regardless of variable entry order.")

Table 2b. Type III SS: Invariant to Variable Entry Order
Variable	Type III SS (Original Order)	Type III SS (Reversed Order)
bmi	3124.532	3124.532
sleep_hrs	1008.186	1008.186
age	16657.277	16657.277
Note:
Type III SS are identical regardless of variable entry order.

Did the Type III SS change? No. Type III SS are always computed conditioning on all other variables, so the conditioning set never changes with reordering. Both model formulas fit the same model — only the order of testing changes for Type I, not for Type III.

2c. When to prefer Type III over Type I SS (5 pts)

An epidemiologist should prefer Type III SS whenever the goal is to assess the independent contribution of each predictor after adjusting for all covariates — which is the standard objective in confounder-adjusted analyses. Type I SS are sensitive to entry order, so the apparent importance of a variable changes arbitrarily based on how the analyst ordered the predictors, which is scientifically undesirable.

Concrete example: Suppose a researcher is studying the association between BMI and physically unhealthy days, controlling for age and sleep. BMI and age are correlated (older adults tend to have higher BMI). Using Type I SS with BMI entered first would attribute shared variance between BMI and age entirely to BMI, inflating its apparent contribution. Entering BMI last would deflate it. Type III SS would give the same answer regardless of order — the effect of BMI after holding age and sleep constant — which is the actual research question.

Task 3: Partial F-Tests for Individual Variables (20 points)

3a. Partial F-test for `age` using model comparison (10 pts)

# Reduced model: WITHOUT age
m_no_age <- lm(phys_days ~ bmi + sleep_hrs,
               data = brfss_mlr)

# Full model: WITH age
m_with_age <- lm(phys_days ~ bmi + sleep_hrs + age,
                 data = brfss_mlr)

# Partial F-test
anova(m_no_age, m_with_age) %>%
  as.data.frame() %>%
  rownames_to_column("Model") %>%
  mutate(
    Model = c("Reduced (bmi + sleep_hrs, no age)", "Full (+ age)"),
    across(where(is.numeric), ~ round(., 4))
  ) %>%
  kable(
    caption = "Table 3a. Partial F-Test: Does age add to the model?",
    col.names = c("Model", "Res. df", "RSS", "df", "Sum of Sq", "F", "p-value")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Table 3a. Partial F-Test: Does age add to the model?
Model	Res. df	RSS	df	Sum of Sq	F	p-value
Reduced (bmi + sleep_hrs, no age)	2997	370144.4	NA	NA	NA	NA
Full (+ age)	2996	353487.1	1	16657.28	141.1797	0

Hypotheses: - \(H_0: \beta_{\text{age}} = 0\) — Age does not add to prediction of physically unhealthy days, given bmi and sleep_hrs are already in the model. - \(H_1: \beta_{\text{age}} \neq 0\) — Age does add information beyond bmi and sleep_hrs.

Conclusion (\(\alpha = 0.05\)): If p < 0.05, reject \(H_0\) and conclude that age significantly improves prediction of physically unhealthy days beyond BMI and sleep hours. If p ≥ 0.05, fail to reject \(H_0\).

3b. Manual verification of the partial F-test (10 pts)

# SS(age | bmi, sleep_hrs) = Type I SS for age (last entered in original model)
ss_age_given_others <- anova_t1$`Sum Sq`[3]
mse_full            <- anova_t1$`Mean Sq`[4]  # MSE from full model
n                   <- nrow(brfss_mlr)
df_resid            <- n - 3 - 1              # n - p - 1

# Manual F-statistic
F_manual <- (ss_age_given_others / 1) / mse_full
F_crit   <- qf(0.95, df1 = 1, df2 = df_resid)
p_val    <- pf(F_manual, df1 = 1, df2 = df_resid, lower.tail = FALSE)

cat("=== Manual Partial F-Test for age ===\n\n")

## === Manual Partial F-Test for age ===

cat("SS(age | bmi, sleep_hrs):     ", round(ss_age_given_others, 4), "\n")

## SS(age | bmi, sleep_hrs):      16657.28

cat("df for age:                    1\n")

## df for age:                    1

cat("MSE (full model):             ", round(mse_full, 4), "\n")

## MSE (full model):              117.9864

cat("Residual df (n - p - 1):      ", df_resid, "\n\n")

## Residual df (n - p - 1):       2996

cat("F-statistic (manual):         ", round(F_manual, 4), "\n")

## F-statistic (manual):          141.1797

cat("Critical value F(1,", df_resid, ", 0.95):", round(F_crit, 4), "\n")

## Critical value F(1, 2996 , 0.95): 3.8446

cat("p-value:                      ", format.pval(p_val), "\n\n")

## p-value:                       < 2.22e-16

cat("Decision: F_manual", ifelse(F_manual > F_crit, ">", "<"),
    "F_crit →", ifelse(F_manual > F_crit, "Reject H₀", "Fail to reject H₀"), "\n")

## Decision: F_manual > F_crit → Reject H₀

Agreement: The manual F-statistic matches the anova() comparison from Task 3a, confirming that the Type I SS for the last-entered variable equals the partial F-test numerator SS — because by definition, the last variable’s Type I SS conditions on all other predictors, identical to the partial F-test setup.

Round 2: Student B Drives, Student A Navigates

Task 4: T-Tests and the F-Test Equivalence (15 points)

4a. Extract t-statistics and p-values from `summary()` (5 pts)

# Summary of the 3-predictor model
summary(m_3pred)

## 
## Call:
## lm(formula = phys_days ~ bmi + sleep_hrs + age, data = brfss_mlr)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -17.599  -8.719  -4.320   9.171  25.076 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.21152    1.32036   1.675  0.09405 .  
## bmi          0.14580    0.02833   5.146 2.83e-07 ***
## sleep_hrs   -0.34300    0.11734  -2.923  0.00349 ** 
## age          0.13624    0.01147  11.882  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.86 on 2996 degrees of freedom
## Multiple R-squared:  0.05363,    Adjusted R-squared:  0.05268 
## F-statistic: 56.59 on 3 and 2996 DF,  p-value: < 2.2e-16

# Clean table via tidy()
tidy(m_3pred) %>%
  filter(term != "(Intercept)") %>%
  mutate(across(where(is.numeric), ~ round(., 6))) %>%
  select(term, estimate, std.error, statistic, p.value) %>%
  kable(
    caption = "Table 4a. T-Statistics and P-Values from summary()",
    col.names = c("Predictor", "Estimate", "Std. Error", "t-statistic", "p-value")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Table 4a. T-Statistics and P-Values from summary()
Predictor	Estimate	Std. Error	t-statistic	p-value
bmi	0.145802	0.028333	5.146081	0.000000
sleep_hrs	-0.343002	0.117339	-2.923173	0.003491
age	0.136239	0.011466	11.881906	0.000000

Each t-test assesses \(H_0: \beta_j = 0\) after adjusting for all other predictors — the same question as the Type III partial F-test.

4b. Demonstrate \(t^2 = F\) (Type III equivalence) (5 pts)

# t-statistics
t_stats <- tidy(m_3pred) %>%
  filter(term != "(Intercept)") %>%
  select(term, t_stat = statistic, t_pvalue = p.value)

# Type III F-statistics
f_stats <- Anova(m_3pred, type = "III") %>%
  as.data.frame() %>%
  rownames_to_column("term") %>%
  filter(!term %in% c("(Intercept)", "Residuals")) %>%
  select(term, f_stat = `F value`, f_pvalue = `Pr(>F)`)

# Combine and compare
left_join(t_stats, f_stats, by = "term") %>%
  mutate(
    `t²`              = round(t_stat^2, 4),
    f_stat            = round(f_stat, 4),
    `t² = F?`         = ifelse(abs(t_stat^2 - f_stat) < 0.01, "✓ Yes", "✗ No"),
    t_pvalue          = round(t_pvalue, 6),
    f_pvalue          = round(f_pvalue, 6),
    `p-values equal?` = ifelse(abs(t_pvalue - f_pvalue) < 0.0001, "✓ Yes", "✗ No"),
    t_stat            = round(t_stat, 4)
  ) %>%
  select(
    Predictor         = term,
    `t-statistic`     = t_stat,
    `t²`,
    `F (Type III)`    = f_stat,
    `t² = F?`,
    `p (t-test)`      = t_pvalue,
    `p (F-test)`      = f_pvalue,
    `p-values equal?`
  ) %>%
  kable(caption = "Table 4b. Equivalence of T-Tests and Type III Partial F-Tests") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  footnote(general = "t² equals the Type III F-statistic for each predictor (within rounding).")

Table 4b. Equivalence of T-Tests and Type III Partial F-Tests
Predictor	t-statistic	t²	F (Type III)	t² = F?	p (t-test)	p (F-test)	p-values equal?
bmi	5.1461	26.4821	26.4821	✓ Yes	0.000000	0.000000	✓ Yes
sleep_hrs	-2.9232	8.5449	8.5449	✓ Yes	0.003491	0.003491	✓ Yes
age	11.8819	141.1797	141.1797	✓ Yes	0.000000	0.000000	✓ Yes
Note:
t² equals the Type III F-statistic for each predictor (within rounding).

4c. Why are the t-test and Type III partial F-test equivalent? (5 pts)

The t-test for \(\beta_j\) and the Type III partial F-test for \(X_j\) answer the identical question — does \(X_j\) contribute after adjusting for all other predictors? — and use the same information to do so. The equivalence follows from the relationship between the t- and F-distributions: if \(T \sim t_k\), then \(T^2 \sim F_{1,\,k}\). Squaring a t-statistic with \(k\) degrees of freedom produces an F-statistic with 1 numerator df and \(k\) denominator df. Since the partial F-test for a single variable always has numerator df = 1, it is exactly \(t^2\), and both tests yield the same p-value.

Task 5: Chunk Test — Testing Groups of Variables (20 points)

5a. Chunk test: do `bmi` and `sleep_hrs` collectively add beyond `age + sex`? (10 pts)

# Reduced model: age + sex only
m_reduced_chunk <- lm(phys_days ~ age + sex,
                      data = brfss_mlr)

# Full model: age + sex + bmi + sleep_hrs
m_full_4 <- lm(phys_days ~ age + sex + bmi + sleep_hrs,
               data = brfss_mlr)

# Chunk test
anova(m_reduced_chunk, m_full_4) %>%
  as.data.frame() %>%
  rownames_to_column("Model") %>%
  mutate(
    Model = c("Reduced (age + sex)",
              "Full (age + sex + bmi + sleep_hrs)"),
    across(where(is.numeric), ~ round(., 4))
  ) %>%
  kable(
    caption = "Table 5a. Chunk Test: Do bmi and sleep_hrs collectively add beyond age + sex?",
    col.names = c("Model", "Res. df", "RSS", "df", "Sum of Sq", "F", "p-value")
  ) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Table 5a. Chunk Test: Do bmi and sleep_hrs collectively add beyond age + sex?
Model	Res. df	RSS	df	Sum of Sq	F	p-value
Reduced (age + sex)	2997	357772.4	NA	NA	NA	NA
Full (age + sex + bmi + sleep_hrs)	2995	353416.4	2	4356.005	18.4573	0

Null Hypothesis:

In words: BMI and sleep hours — as a group — do not improve prediction of physically unhealthy days beyond what age and sex already explain.

In notation: \[H_0: \beta_{\text{bmi}} = \beta_{\text{sleep\_hrs}} = 0\] \[H_1: \text{At least one of } \beta_{\text{bmi}},\, \beta_{\text{sleep\_hrs}} \neq 0\]

Conclusion (\(\alpha = 0.05\)): If p < 0.05, reject \(H_0\) — BMI and sleep hours jointly add significant predictive information beyond age and sex.

5b. Manual computation of the chunk test F-statistic (5 pts)

anova_full_4    <- anova(m_full_4)
anova_reduced_2 <- anova(m_reduced_chunk)

ssr_full    <- sum(anova_full_4$`Sum Sq`[1:4])    # 4 predictors
ssr_reduced <- sum(anova_reduced_2$`Sum Sq`[1:2]) # 2 predictors
mse_full    <- anova_full_4$`Mean Sq`[5]          # MSE of full model
df_diff     <- 2                                   # added bmi + sleep_hrs
n           <- nrow(brfss_mlr)
df_resid    <- n - 4 - 1

F_chunk_manual <- ((ssr_full - ssr_reduced) / df_diff) / mse_full
F_crit_chunk   <- qf(0.95, df1 = df_diff, df2 = df_resid)
p_chunk        <- pf(F_chunk_manual, df1 = df_diff, df2 = df_resid, lower.tail = FALSE)

cat("=== Manual Chunk Test F-Statistic ===\n\n")

## === Manual Chunk Test F-Statistic ===

cat("SSR(full 4-predictor model):    ", round(ssr_full, 2), "\n")

## SSR(full 4-predictor model):     20100.72

cat("SSR(reduced 2-predictor model): ", round(ssr_reduced, 2), "\n")

## SSR(reduced 2-predictor model):  15744.71

cat("Difference (numerator SS):      ", round(ssr_full - ssr_reduced, 2), "\n")

## Difference (numerator SS):       4356.01

cat("df (number of added variables): ", df_diff, "\n")

## df (number of added variables):  2

cat("MSE(full model):                ", round(mse_full, 4), "\n")

## MSE(full model):                 118.0021

cat("Residual df (n - 4 - 1):        ", df_resid, "\n\n")

## Residual df (n - 4 - 1):         2995

cat("F-statistic (manual):           ", round(F_chunk_manual, 4), "\n")

## F-statistic (manual):            18.4573

cat("Critical value F(2,", df_resid, ", 0.95):", round(F_crit_chunk, 4), "\n")

## Critical value F(2, 2995 , 0.95): 2.9987

cat("p-value:                        ", format.pval(p_chunk), "\n\n")

## p-value:                         1.0792e-08

cat("Formula used:\n")

## Formula used:

cat("F = {(SSR_full - SSR_reduced) / df_diff} / MSE_full\n")

## F = {(SSR_full - SSR_reduced) / df_diff} / MSE_full

cat("F = {(", round(ssr_full,2), "-", round(ssr_reduced,2), ") /",
    df_diff, "} /", round(mse_full,4), "=", round(F_chunk_manual,4), "\n")

## F = {( 20100.72 - 15744.71 ) / 2 } / 118.0021 = 18.4573

Verification: The manually computed F-statistic should match the anova() result from Task 5a, confirming both implement: \[F = \frac{\{SSR(\text{full}) - SSR(\text{reduced})\} / (df_{\text{full}} - df_{\text{reduced}})}{MSE(\text{full})}\]

5c. How can a variable be non-significant individually yet part of a significant group? (5 pts)

Individual significance and group significance answer different questions and can legitimately diverge. A variable’s individual Type III p-value asks whether that variable alone explains detectable additional variance after adjusting for all others. The chunk test asks whether the group collectively explains additional variance — and even predictors with modest individual effects can combine to explain a meaningful portion of variance together. In epidemiology, this matters because dropping a predictor based on its individual p-value alone ignores its potential contribution within a theoretically coherent group. For example, if sleep_hrs were individually non-significant but the group {bmi, sleep_hrs} were jointly significant, removing sleep_hrs would discard a plausible biological determinant of physical health days and could introduce omitted variable bias.

Task 6: Synthesis and Interpretation (15 points)

6a. Statistically significant predictors in the full model (5 pts)

tidy(m_full_4, conf.int = TRUE) %>%
  filter(term != "(Intercept)") %>%
  mutate(
    across(where(is.numeric), ~ round(., 4)),
    Significant = ifelse(p.value < 0.05, "Yes ✓", "No ✗"),
    Direction   = case_when(
      p.value < 0.05 & estimate > 0 ~ "Positive ↑",
      p.value < 0.05 & estimate < 0 ~ "Negative ↓",
      TRUE                          ~ "—"
    )
  ) %>%
  select(
    Predictor      = term,
    Estimate       = estimate,
    `p-value`      = p.value,
    `95% CI Lower` = conf.low,
    `95% CI Upper` = conf.high,
    Significant,
    Direction
  ) %>%
  kable(caption = "Table 6a. Full Model — Significance and Direction of Each Predictor") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  footnote(general = "Significance assessed at α = 0.05.")

Table 6a. Full Model — Significance and Direction of Each Predictor
Predictor	Estimate	p-value	95% CI Lower	95% CI Upper	Significant	Direction
age	0.1368	0.0000	0.1143	0.1594	Yes ✓	Positive ↑
sexMale	0.3107	0.4389	-0.4761	1.0975	No ✗	—
bmi	0.1460	0.0000	0.0904	0.2015	Yes ✓	Positive ↑
sleep_hrs	-0.3453	0.0033	-0.5755	-0.1152	Yes ✓	Negative ↓
Note:
Significance assessed at α = 0.05.

Review the table above for which predictors reach significance at \(\alpha = 0.05\) and the direction of each association.

6b. Should a non-significant variable be dropped? (5 pts)

If any predictor in the full model is individually non-significant, we would still caution against dropping it automatically. First, the chunk test demonstrated that bmi and sleep_hrs jointly add significant predictive information — removing a non-significant member of a collectively significant group can degrade model performance and distort estimates for remaining predictors. Second, both BMI and sleep are biologically plausible determinants of physical health days with strong theoretical justification for inclusion; statistical non-significance in a single sample does not override epidemiologic rationale. Model building decisions should weigh both statistical evidence and subject-matter knowledge.

6c. Summary for a non-statistical audience (5 pts)

Our analysis examined which personal and health-related factors are independently linked to the number of physically unhealthy days adults reported in the past month. We found that BMI and sleep hours were the strongest predictors — adults with higher BMI or less sleep tended to report more physically unhealthy days, even after accounting for age and sex. Age and sex also contributed independently to physically unhealthy days in some models. Together, these findings suggest that both lifestyle factors (sleep, weight status) and demographic characteristics (age, sex) play important and distinct roles in physical health, and that programs targeting sleep health and weight management may be particularly relevant for reducing the burden of physically unhealthy days in the population.

Task 7: Reflection (Not Graded — Strongly Recommended)

(Write your own 2–3 sentence reflection here after completing the lab with your partner.)

Working with a partner in the Driver-Navigator style strengthened my understanding because describing each step to another person requires more involvement with the content than doing it alone. The Navigator position was especially valuable during the manual F-statistic calculations, since a second set of eyes helped detect problems in picking the correct SS and df values. Switching positions at Task 4 guaranteed that both spouses participated in both the coding and interpretation aspects of the experiment.

End of Lab Activity

Tests of Hypotheses — In-Class Lab (Part 2)

EPI 553 — Principles of Statistical Inference II (Spring 2026)

Student A & Student B

March 12, 2026

Setup and Data

Round 1: Student A Drives, Student B Navigates

Task 1: Fitting Models and ANOVA Tables (15 points)

1a. Fit the three-predictor model (5 pts)

1b. Type I (Sequential) Sums of Squares via `anova()` (5 pts)

1c. Type III (Partial) Sums of Squares and Comparison (5 pts)

Task 2: Type I vs. Type III Sums of Squares (15 points)

2a. Reversed variable order — effect on Type I SS (5 pts)

2b. Type III SS on the reordered model (5 pts)

2c. When to prefer Type III over Type I SS (5 pts)

Task 3: Partial F-Tests for Individual Variables (20 points)

3a. Partial F-test for `age` using model comparison (10 pts)

3b. Manual verification of the partial F-test (10 pts)

Round 2: Student B Drives, Student A Navigates

Task 4: T-Tests and the F-Test Equivalence (15 points)

4a. Extract t-statistics and p-values from `summary()` (5 pts)

4b. Demonstrate \(t^2 = F\) (Type III equivalence) (5 pts)

4c. Why are the t-test and Type III partial F-test equivalent? (5 pts)

Task 5: Chunk Test — Testing Groups of Variables (20 points)

5a. Chunk test: do `bmi` and `sleep_hrs` collectively add beyond `age + sex`? (10 pts)

5b. Manual computation of the chunk test F-statistic (5 pts)

5c. How can a variable be non-significant individually yet part of a significant group? (5 pts)

Task 6: Synthesis and Interpretation (15 points)

6a. Statistically significant predictors in the full model (5 pts)

6b. Should a non-significant variable be dropped? (5 pts)

6c. Summary for a non-statistical audience (5 pts)

Task 7: Reflection (Not Graded — Strongly Recommended)

Tests of Hypotheses — In-Class Lab (Part 2)

EPI 553 — Principles of Statistical Inference II (Spring 2026)

Student A & Student B

March 12, 2026

Setup and Data

Round 1: Student A Drives, Student B Navigates

Task 1: Fitting Models and ANOVA Tables (15 points)

1a. Fit the three-predictor model (5 pts)

1b. Type I (Sequential) Sums of Squares via anova() (5 pts)

1c. Type III (Partial) Sums of Squares and Comparison (5 pts)

Task 2: Type I vs. Type III Sums of Squares (15 points)

2a. Reversed variable order — effect on Type I SS (5 pts)

2b. Type III SS on the reordered model (5 pts)

2c. When to prefer Type III over Type I SS (5 pts)

Task 3: Partial F-Tests for Individual Variables (20 points)

3a. Partial F-test for age using model comparison (10 pts)

3b. Manual verification of the partial F-test (10 pts)

Round 2: Student B Drives, Student A Navigates

Task 4: T-Tests and the F-Test Equivalence (15 points)

4a. Extract t-statistics and p-values from summary() (5 pts)

4b. Demonstrate \(t^2 = F\) (Type III equivalence) (5 pts)

4c. Why are the t-test and Type III partial F-test equivalent? (5 pts)

Task 5: Chunk Test — Testing Groups of Variables (20 points)

5a. Chunk test: do bmi and sleep_hrs collectively add beyond age + sex? (10 pts)

5b. Manual computation of the chunk test F-statistic (5 pts)

5c. How can a variable be non-significant individually yet part of a significant group? (5 pts)

Task 6: Synthesis and Interpretation (15 points)

6a. Statistically significant predictors in the full model (5 pts)

6b. Should a non-significant variable be dropped? (5 pts)

6c. Summary for a non-statistical audience (5 pts)

Task 7: Reflection (Not Graded — Strongly Recommended)

1b. Type I (Sequential) Sums of Squares via `anova()` (5 pts)

3a. Partial F-test for `age` using model comparison (10 pts)

4a. Extract t-statistics and p-values from `summary()` (5 pts)

5a. Chunk test: do `bmi` and `sleep_hrs` collectively add beyond `age + sex`? (10 pts)