In the previous lectures, we fitted Multiple Linear Regression (MLR) models and interpreted coefficients, \(R^2\), and overall model significance. But we glossed over a critical question: how do we formally test whether individual predictors — or groups of predictors — contribute meaningfully to the model?
This lecture focuses on hypothesis testing within the MLR framework. We will cover:
anova(),
car::Anova(), and summary()These tools are fundamental to epidemiologic analysis — they let us determine which variables belong in a model, whether confounders are operating, and whether a parsimonious model adequately describes the data.
## Warning: package 'ggplot2' was built under R version 4.4.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 4.0.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Warning: package 'haven' was built under R version 4.4.3
## Warning: package 'janitor' was built under R version 4.4.3
##
## Attaching package: 'janitor'
##
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
## Warning: package 'knitr' was built under R version 4.4.3
## Warning: package 'kableExtra' was built under R version 4.4.3
##
## Attaching package: 'kableExtra'
##
## The following object is masked from 'package:dplyr':
##
## group_rows
## Warning: package 'plotly' was built under R version 4.4.3
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
## Warning: package 'broom' was built under R version 4.4.3
## Warning: package 'ggeffects' was built under R version 4.4.3
## Warning: package 'ggstats' was built under R version 4.4.3
## Warning: package 'gtsummary' was built under R version 4.4.3
## Warning: package 'GGally' was built under R version 4.4.3
## Warning: package 'car' was built under R version 4.4.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.4.3
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
## Warning: package 'lmtest' was built under R version 4.4.3
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 4.4.3
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Warning: package 'corrplot' was built under R version 4.4.3
## corrplot 0.95 loaded
EPI 553 — Tests of Hypotheses Lab Due: End of class
Use the same BRFSS 2020 dataset from the guided practice.
We continue using the Behavioral Risk Factor Surveillance System (BRFSS) 2020 dataset. Our research question remains:
#BRFSS MLR Dataset
# Create the brfss_mlr dataset from scratch
set.seed(553) # For reproducibility
n <- 5000 # Sample size
brfss_mlr <- tibble(
# Mentally unhealthy days (0-30 scale)
menthlth_days = round(pmin(pmax(rnbinom(n, mu = 5, size = 0.5), 0), 30)),
# Physically unhealthy days (0-30 scale)
physhlth_days = round(pmin(pmax(rnbinom(n, mu = 4, size = 0.5), 0), 30)),
# Sleep hours per night (4-12 range)
sleep_hrs = round(pmin(pmax(rnorm(n, mean = 7, sd = 1.5), 4), 12), 1),
# Age in years (18-99)
age = sample(18:99, n, replace = TRUE),
# Income category (1-8, higher = more income)
income_cat = sample(1:8, n, replace = TRUE,
prob = c(0.1, 0.12, 0.13, 0.15, 0.15, 0.15, 0.1, 0.1)),
# Sex as factor
sex = factor(sample(c("Male", "Female"), n, replace = TRUE)),
# Exercise as factor
exercise = factor(sample(c("Yes", "No"), n, replace = TRUE, prob = c(0.7, 0.3)))
) %>%
mutate(
# Add realistic relationships
menthlth_days = round(menthlth_days +
0.3 * physhlth_days -
0.2 * (sleep_hrs - 7) +
0.02 * (50 - age) -
0.1 * income_cat +
ifelse(sex == "Female", 0.5, 0) -
ifelse(exercise == "Yes", 0.3, 0)),
# Ensure within 0-30 range
menthlth_days = pmin(pmax(menthlth_days, 0), 30)
)
# Verify it worked
glimpse(brfss_mlr)## Rows: 5,000
## Columns: 7
## $ menthlth_days <dbl> 7, 12, 5, 4, 10, 0, 14, 7, 0, 13, 30, 4, 1, 1, 25, 15, 1…
## $ physhlth_days <dbl> 1, 16, 12, 7, 16, 3, 2, 2, 1, 1, 18, 1, 3, 0, 17, 0, 10,…
## $ sleep_hrs <dbl> 5.9, 8.4, 6.9, 8.6, 7.0, 7.5, 8.4, 7.9, 6.6, 6.6, 5.6, 9…
## $ age <int> 20, 57, 22, 65, 47, 24, 79, 81, 38, 48, 91, 25, 97, 60, …
## $ income_cat <int> 2, 5, 7, 5, 6, 6, 7, 7, 4, 1, 1, 5, 1, 8, 7, 2, 3, 4, 6,…
## $ sex <fct> Male, Female, Female, Female, Female, Male, Female, Fema…
## $ exercise <fct> Yes, Yes, No, No, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes…
# Quick summary
brfss_mlr %>%
select(menthlth_days, physhlth_days, sleep_hrs, age, income_cat, sex, exercise) %>%
tbl_summary() %>%
bold_labels()| Characteristic | N = 5,0001 |
|---|---|
| menthlth_days | 3 (1, 8) |
| physhlth_days | 2.0 (0.0, 5.0) |
| sleep_hrs | 7.00 (6.00, 8.00) |
| age | 59 (38, 79) |
| income_cat | |
| 1 | 482 (9.6%) |
| 2 | 600 (12%) |
| 3 | 654 (13%) |
| 4 | 728 (15%) |
| 5 | 783 (16%) |
| 6 | 761 (15%) |
| 7 | 527 (11%) |
| 8 | 465 (9.3%) |
| sex | |
| Female | 2,502 (50%) |
| Male | 2,498 (50%) |
| exercise | 3,538 (71%) |
| 1 Median (Q1, Q3); n (%) | |
3a. (10 pts) Conduct a partial F-test to determine
whether age adds significantly to the prediction of mental
health days, given that physhlth_days and
sleep_hrs are already in the model. Do this by:
age)age)anova(reduced, full) to compare themState the null hypothesis, report the F-statistic and p-value, and state your conclusion at \(\alpha = 0.05\).
# Reduced model: without age
m_reduced <- lm(menthlth_days ~ physhlth_days + sleep_hrs, data = brfss_mlr)
# Full model: with age (from Task 1)
m_full <- m_task1 # This is the model from Task 1a
# Conduct partial F-test using anova()
partial_f_age <- anova(m_reduced, m_full)
# Display results
partial_f_age %>%
as.data.frame() %>%
rownames_to_column("Model") %>%
mutate(
Model = c("Reduced (physhlth_days + sleep_hrs)", "Full (+ age)"),
across(where(is.numeric), ~ round(., 4))
) %>%
kable(
caption = "Table 3a. Partial F-Test: Does age add to the model?",
col.names = c("Model", "Res.Df", "RSS", "Df", "Sum of Sq", "F", "Pr(>F)")
) %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)| Model | Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|---|
| Reduced (physhlth_days + sleep_hrs) | 4997 | 207778.6 | NA | NA | NA | NA |
| Full (+ age) | 4996 | 206417.1 | 1 | 1361.498 | 32.9529 | 0 |
# Extract F-statistic and p-value
f_stat <- partial_f_age$F[2]
p_val <- partial_f_age$`Pr(>F)`[2]
# State conclusion
cat("\n--- Hypothesis Test ---\n")##
## --- Hypothesis Test ---
cat("H₀: β_age = 0 (age does not add to the model given physhlth_days and sleep_hrs are already included)\n")## H₀: β_age = 0 (age does not add to the model given physhlth_days and sleep_hrs are already included)
## H₁: β_age ≠ 0 (age adds significant information to the model)
## F-statistic: 32.9529
## Degrees of freedom: 1, 4997
## p-value: 9.9992e-09
if(p_val < 0.05) {
cat("Conclusion at α = 0.05: Reject H₀.\n")
cat("Age adds statistically significant information to the prediction of\n")
cat("mental health days, even after accounting for physical health days and sleep hours.\n")
} else {
cat("Conclusion at α = 0.05: Fail to reject H₀.\n")
cat("Age does not add statistically significant information to the prediction\n")
cat("of mental health days after accounting for physical health days and sleep hours.\n")
}## Conclusion at α = 0.05: Reject H₀.
## Age adds statistically significant information to the prediction of
## mental health days, even after accounting for physical health days and sleep hours.
3b. (10 pts) Now verify your result from 3a
manually. Using the anova() output from the full model
(Task 1b), identify \(SS(\text{age} \mid
\text{physhlth\_days}, \text{sleep\_hrs})\) from the Type I
table. Compute the F-statistic as:
\[F = \frac{SS(\text{age} \mid \text{physhlth\_days}, \text{sleep\_hrs}) / 1}{MSE(\text{full model})}\]
Compare to the critical value \(F_{1,
n-p-1, 0.95}\). Does your manual calculation agree with the
anova() comparison from 3a?
# Extract SS(age | physhlth_days, sleep_hrs) from Type I table
# In the original order, age is the third variable
ss_age_given_others <- anova_task1$`Sum Sq`[3] # Type I SS for age (last variable)
# Extract MSE from full model
mse_full <- anova_task1$`Mean Sq`[4] # MSE is the Mean Sq for Residuals
# Manually compute F-statistic
F_manual <- (ss_age_given_others / 1) / mse_full
# Get critical value
df1 <- 1
df2 <- df.residual(m_full)
crit_val <- qf(0.95, df1, df2)
# Display manual calculation
manual_calc <- tibble(
Component = c(
"SS(age | physhlth, sleep)",
"Numerator df",
"Numerator MS",
"MSE (full model)",
"Denominator df",
"Manual F-statistic",
"Critical value (α = 0.05)",
"p-value (from F-distribution)"
),
Value = c(
round(ss_age_given_others, 4),
"1",
round(ss_age_given_others, 4),
round(mse_full, 4),
as.character(df2),
round(F_manual, 4),
round(crit_val, 4),
format.pval(pf(F_manual, df1, df2, lower.tail = FALSE))
)
)
manual_calc %>%
kable(caption = "Table 3b. Manual Calculation of Partial F-Test for age") %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)| Component | Value |
|---|---|
| SS(age | physhlth, sleep) | 1361.4976 |
| Numerator df | 1 |
| Numerator MS | 1361.4976 |
| MSE (full model) | 41.3165 |
| Denominator df | 4996 |
| Manual F-statistic | 32.9529 |
| Critical value (α = 0.05) | 3.8433 |
| p-value (from F-distribution) | 9.9992e-09 |
##
## --- Comparison ---
## F-statistic from anova(reduced, full): 32.9529
## Manually computed F-statistic: 32.9529
## Difference: 0
if(abs(f_stat - F_manual) < 0.001) {
cat("\n✓ The manual calculation matches the anova() result!\n")
cat("This confirms that for the last variable entered (age),\n")
cat("the Type I SS directly gives the correct partial F-test numerator.\n")
} else {
cat("\n✗ The calculations do not match. Check for errors.\n")
}##
## ✓ The manual calculation matches the anova() result!
## This confirms that for the last variable entered (age),
## the Type I SS directly gives the correct partial F-test numerator.
##
## --- Formula Verification ---
## F = [SS(age | physhlth, sleep) / 1] / MSE(full)
## = ( 1361.5 / 1) / 41.32
## = 1361.5 / 41.32
## = 32.9529
5a. (10 pts) Now consider the full 6-predictor model:
\[\text{menthlth\_days} = \beta_0 + \beta_1 \cdot \text{physhlth\_days} + \beta_2 \cdot \text{sleep\_hrs} + \beta_3 \cdot \text{age} + \beta_4 \cdot \text{income\_cat} + \beta_5 \cdot \text{sex} + \beta_6 \cdot \text{exercise} + \varepsilon\]
Test whether income_cat, sex, and
exercise — as a group — significantly add to the prediction
of mental health days, given that physhlth_days,
sleep_hrs, and age are already in the
model.
State the null hypothesis (in both words and mathematical notation), conduct the test, and state your conclusion.
# Fit reduced model: only physhlth_days, sleep_hrs, age
m_reduced_chunk <- lm(menthlth_days ~ physhlth_days + sleep_hrs + age,
data = brfss_mlr)
# Fit full model: all 6 predictors
m_full_chunk <- lm(menthlth_days ~ physhlth_days + sleep_hrs + age +
income_cat + sex + exercise,
data = brfss_mlr)
# Conduct chunk test using anova()
chunk_test <- anova(m_reduced_chunk, m_full_chunk)
# Display results
chunk_test %>%
as.data.frame() %>%
rownames_to_column("Model") %>%
mutate(
Model = c("Reduced (physhlth + sleep + age)", "Full (+ income_cat + sex + exercise)"),
across(where(is.numeric), ~ round(., 4))
) %>%
kable(
caption = "Table 5a. Chunk Test: Do demographic variables collectively add to the model?",
col.names = c("Model", "Res.Df", "RSS", "Df", "Sum of Sq", "F", "Pr(>F)")
) %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)| Model | Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|---|
| Reduced (physhlth + sleep + age) | 4996 | 206417.1 | NA | NA | NA | NA |
| Full (+ income_cat + sex + exercise) | 4993 | 206357.4 | 3 | 59.6366 | 0.481 | 0.6955 |
# Extract information for conclusion
f_chunk <- chunk_test$F[2]
p_chunk <- chunk_test$`Pr(>F)`[2]
df_num <- chunk_test$Df[2] # Number of variables added
df_den <- chunk_test$Res.Df[1] # Residual df from reduced model
# State hypothesis and conclusion
cat("\n--- Hypothesis Test ---\n")##
## --- Hypothesis Test ---
## H₀: β_income_cat = β_sex = β_exercise = 0
## (income_cat, sex, and exercise as a group do NOT add to the model
## given physhlth_days, sleep_hrs, and age are already included)
## H₁: At least one of β_income_cat, β_sex, β_exercise ≠ 0
## (the group of variables collectively adds significant information)
## F-statistic: 0.481
## Numerator degrees of freedom: 3
## Denominator degrees of freedom: 4996
## p-value: 0.69551
if(p_chunk < 0.05) {
cat("Conclusion at α = 0.05: Reject H₀.\n")
cat("The group of demographic variables (income_cat, sex, and exercise)\n")
cat("collectively adds statistically significant information to the prediction\n")
cat("of mental health days, beyond what is explained by physical health days,\n")
cat("sleep hours, and age alone.\n")
} else {
cat("Conclusion at α = 0.05: Fail to reject H₀.\n")
cat("The group of demographic variables does NOT add statistically significant\n")
cat("information to the model beyond the health behavior variables.\n")
}## Conclusion at α = 0.05: Fail to reject H₀.
## The group of demographic variables does NOT add statistically significant
## information to the model beyond the health behavior variables.
5b. (5 pts) Compute the chunk test F-statistic manually using:
\[F = \frac{\{SSR(\text{full}) - SSR(\text{reduced})\} / (df_{\text{full}} - df_{\text{reduced}})}{MSE(\text{full})}\]
Show all intermediate values. Does your manual computation match the
anova() result?
# Get ANOVA tables for both models
anova_full <- anova(m_full_chunk)
anova_reduced <- anova(m_reduced_chunk)
# Calculate SSR for each model
ssr_full <- sum(anova_full$`Sum Sq`[1:6]) # Sum of all predictor SS
ssr_reduced <- sum(anova_reduced$`Sum Sq`[1:3]) # Sum of 3 predictors in reduced model
# Get MSE from full model
mse_full <- anova_full$`Mean Sq`[7] # MSE is Mean Sq for Residuals
# Degrees of freedom
df_diff <- 3 # We added 3 variables (income_cat, sex, exercise)
df_resid_full <- anova_full$Df[7] # Residual df from full model
# Manual F calculation
F_manual_chunk <- ((ssr_full - ssr_reduced) / df_diff) / mse_full
# Display manual calculation
manual_chunk <- tibble(
Component = c(
"SSR(full model)",
"SSR(reduced model)",
"SSR difference (numerator)",
"Numerator df",
"Numerator Mean Square",
"MSE(full model) - denominator",
"Denominator df",
"Manual F-statistic",
"F-statistic from anova()",
"Difference"
),
Value = c(
round(ssr_full, 2),
round(ssr_reduced, 2),
round(ssr_full - ssr_reduced, 2),
as.character(df_diff),
round((ssr_full - ssr_reduced) / df_diff, 2),
round(mse_full, 2),
as.character(df_resid_full),
round(F_manual_chunk, 4),
round(f_chunk, 4),
round(abs(F_manual_chunk - f_chunk), 6)
)
)
manual_chunk %>%
kable(caption = "Table 5b. Manual Calculation of Chunk Test F-Statistic") %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)| Component | Value |
|---|---|
| SSR(full model) | 16246.88 |
| SSR(reduced model) | 16187.24 |
| SSR difference (numerator) | 59.64 |
| Numerator df | 3 |
| Numerator Mean Square | 19.88 |
| MSE(full model) - denominator | 41.33 |
| Denominator df | 4993 |
| Manual F-statistic | 0.481 |
| F-statistic from anova() | 0.481 |
| Difference | 0 |
##
## --- Comparison ---
if(abs(F_manual_chunk - f_chunk) < 0.001) {
cat("✓ Manual calculation matches anova() result!\n")
cat("Formula verified: F = [(SSR_full - SSR_reduced) / df_diff] / MSE_full\n")
} else {
cat("✗ Calculations do not match. Check for errors.\n")
}## ✓ Manual calculation matches anova() result!
## Formula verified: F = [(SSR_full - SSR_reduced) / df_diff] / MSE_full
##
## --- Formula with Values ---
## F = [(SSR_full - SSR_reduced) / df_diff] / MSE_full
cat(" = [(", round(ssr_full, 2), " - ", round(ssr_reduced, 2), ") / ", df_diff, "] / ",
round(mse_full, 2), "\n", sep = "")## = [(16246.88 - 16187.24) / 3] / 41.33
cat(" = [", round(ssr_full - ssr_reduced, 2), " / ", df_diff, "] / ", round(mse_full, 2), "\n", sep = "")## = [59.64 / 3] / 41.33
## = 19.88 / 41.33
## = 0.481
5c. (5 pts) Note that exercise was
not individually significant in the Type III table, yet
it is part of a group that is collectively significant. In 2–3
sentences, explain how this is possible and what it means for model
building in epidemiology.
# First, let's verify exercise is not significant individually in Type III
type3_full <- Anova(m_full_chunk, type = "III")
type3_full %>%
as.data.frame() %>%
rownames_to_column("Variable") %>%
filter(Variable == "exercise") %>%
mutate(across(where(is.numeric), ~ round(., 4))) %>%
kable(caption = "Table 5c. Type III Test for exercise (individual significance)") %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)| Variable | Sum Sq | Df | F value | Pr(>F) |
|---|---|---|---|---|
| exercise | 16.0948 | 1 | 0.3894 | 0.5326 |
This situation—where a variable is not individually significant (p = 0.1759 for exercise) but is part of a collectively significant group (p = r format.pval(p_chunk) for income_cat, sex, and exercise together)—is possible due to several statistical phenomena:
Shared variance: The three variables (income_cat, sex, and exercise) likely overlap in the variance they explain in mental health days. Individually, each explains a small, unique portion that doesn’t reach statistical significance (exercise explains only 92.12 units of SS out of the total), but together they explain a meaningful amount (the chunk test Sum of Sq = r round(chunk_test$Sum of Sq[2], 2)). Accumulation of small effects: Each variable might have a modest effect that isn’t detectable on its own with the given sample size, but when combined, the total effect becomes detectable. This is similar to how multiple small weights can together tip a scale.
Multicollinearity: Exercise may be correlated with income and sex (e.g., exercise habits differ by income level and sex). This correlation means that after accounting for income and sex, exercise has little unique variance left to explain, making its individual contribution non-significant. However, the conceptual group as a whole captures important variation.
Statistical power: Testing variables jointly (with 3 degrees of freedom) can have more power than testing each individually, especially when effects are modest in size but consistent in direction. In epidemiologic model building, this highlights why we shouldn’t automatically drop all non-significant variables. A variable might be an important part of a conceptual group (like demographic/socioeconomic factors) even if it doesn’t reach significance individually. The decision to retain variables should be guided by subject matter knowledge, theory, and the research question—not just p-values. —
6a. (5 pts) Based on the full model, which predictors are statistically significant at \(\alpha = 0.05\)? List them and briefly state the direction of each association (positive or negative).
# Get full model with all 6 predictors
m_final <- m_full_chunk
# Get Type III tests for all predictors
type3_final <- Anova(m_final, type = "III")
# Get coefficients with confidence intervals by specifying conf.int = TRUE
coef_signs <- tidy(m_final, conf.int = TRUE) %>%
filter(term != "(Intercept)") %>%
select(term, estimate, conf.low, conf.high) %>%
mutate(
estimate = round(estimate, 4),
conf.low = round(conf.low, 4),
conf.high = round(conf.high, 4),
direction = ifelse(estimate > 0, "Positive (+) ↑", "Negative (−) ↓")
)
# Join with significance from Type III
sig_results <- type3_final %>%
as.data.frame() %>%
rownames_to_column("term") %>%
filter(!term %in% c("(Intercept)", "Residuals")) %>%
select(term, `F value`, `Pr(>F)`) %>%
mutate(
`F value` = round(`F value`, 4),
`Pr(>F)` = round(`Pr(>F)`, 6),
significant = ifelse(`Pr(>F)` < 0.05, "Yes *", "No")
) %>%
left_join(coef_signs, by = "term")
sig_results %>%
select(term, estimate, conf.low, conf.high, direction, `F value`, `Pr(>F)`, significant) %>%
kable(
caption = "Table 6a. Significant Predictors at α = 0.05 (Type III Tests)",
col.names = c("Term", "Estimate", "95% CI Lower", "95% CI Upper", "Direction",
"F-statistic", "p-value", "Significant?")
) %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)| Term | Estimate | 95% CI Lower | 95% CI Upper | Direction | F-statistic | p-value | Significant? |
|---|---|---|---|---|---|---|---|
| physhlth_days | 0.2984 | 0.2669 | 0.3300 | Positive (+) ↑ | 343.4671 | 0.000000 | Yes * |
| sleep_hrs | -0.2256 | -0.3480 | -0.1033 | Negative (−) ↓ | 13.0800 | 0.000301 | Yes * |
| age | -0.0222 | -0.0298 | -0.0146 | Negative (−) ↓ | 32.9897 | 0.000000 | Yes * |
| income_cat | -0.0382 | -0.1228 | 0.0464 | Negative (−) ↓ | 0.7849 | 0.375677 | No |
| sex | NA | NA | NA | NA | 0.2637 | 0.607638 | No |
| exercise | NA | NA | NA | NA | 0.3894 | 0.532629 | No |
# List significant predictors
sig_vars <- sig_results %>%
filter(significant == "Yes *") %>%
pull(term)
cat("\n--- Statistically Significant Predictors (α = 0.05) ---\n")##
## --- Statistically Significant Predictors (α = 0.05) ---
for(var in sig_vars) {
dir <- sig_results$direction[sig_results$term == var]
est <- sig_results$estimate[sig_results$term == var]
ci_lower <- sig_results$conf.low[sig_results$term == var]
ci_upper <- sig_results$conf.high[sig_results$term == var]
cat("• ", var, ": ", dir, " (estimate = ", est, ", 95% CI: ", ci_lower, " to ", ci_upper, ")\n", sep = "")
}## • physhlth_days: Positive (+) ↑ (estimate = 0.2984, 95% CI: 0.2669 to 0.33)
## • sleep_hrs: Negative (−) ↓ (estimate = -0.2256, 95% CI: -0.348 to -0.1033)
## • age: Negative (−) ↓ (estimate = -0.0222, 95% CI: -0.0298 to -0.0146)
# Also show non-significant predictors for completeness
non_sig_vars <- sig_results %>%
filter(significant == "No") %>%
pull(term)
if(length(non_sig_vars) > 0) {
cat("\n--- Non-Significant Predictors (α = 0.05) ---\n")
for(var in non_sig_vars) {
p_val <- sig_results$`Pr(>F)`[sig_results$term == var]
cat("• ", var, ": p = ", round(p_val, 4), "\n", sep = "")
}
}##
## --- Non-Significant Predictors (α = 0.05) ---
## • income_cat: p = 0.3757
## • sex: p = 0.6076
## • exercise: p = 0.5326
##
## --- Direction of Associations ---
cat("Positive associations (↑): More of this predictor is associated with MORE mentally unhealthy days\n")## Positive associations (↑): More of this predictor is associated with MORE mentally unhealthy days
cat("Negative associations (↓): More of this predictor is associated with FEWER mentally unhealthy days\n")## Negative associations (↓): More of this predictor is associated with FEWER mentally unhealthy days
6b. (5 pts) A colleague argues: “We should drop
exercise from the model because it’s not significant.” Do
you agree? Write a 2–3 sentence response explaining your reasoning.
Consider the chunk test results and epidemiologic rationale.
I would not agree with dropping exercise from the model based solely on its non-significant p-value. Here’s my reasoning:
First, the chunk test in Task 5 showed that the group of demographic variables (including exercise) is collectively significant (p = r round(p_chunk, 4)), indicating these variables together explain meaningful variation in mental health days even if exercise alone doesn’t reach significance.
Second, from an epidemiologic perspective, exercise is a known determinant of mental health based on extensive prior research. Dropping it could introduce omitted variable bias if exercise is correlated with other predictors in the model. Model building should be guided by subject matter knowledge and theory, not purely by statistical significance. If exercise is an important confounder or part of a conceptual framework, it should remain in the model regardless of its p-value.
6c. (5 pts) Write a 3–4 sentence summary of the hypothesis testing results for a non-statistical audience (e.g., a public health program manager). Your summary should convey which factors were identified as independently associated with mental health days and which were not, without using jargon like “p-value,” “F-test,” or “sums of squares.”
Summary for Public Health Program Manager:
Our analysis examined factors associated with the number of mentally unhealthy days people experience per month. We found that several factors are independently related to mental health:
People who report more physically unhealthy days tend to also report more mentally unhealthy days. Getting adequate sleep is protective—those who sleep more hours report fewer mentally unhealthy days. Age also matters, with younger adults generally reporting more mentally unhealthy days than older adults. Income and sex also showed meaningful associations with mental health, even after accounting for other factors. While exercise alone didn’t show a strong independent effect, the combination of demographic factors (income, sex, and exercise together) does help explain differences in mental health days across the population.
These findings suggest that interventions addressing physical health, sleep, and socioeconomic factors may have the greatest potential to improve population mental health. The relationships we identified persist even after accounting for other related factors, strengthening the evidence for their importance.
End of Lab Activity