EPI 553 — Logistic Regression Part 2 Lab Due: End of class, April 14, 2026
In this lab, you will build a multiple logistic regression model, conduct hypothesis tests, examine an interaction, and assess goodness-of-fit and discrimination. Use the same BRFSS 2020 logistic dataset from Part 1. Work through each task systematically. You may discuss concepts with classmates, but your written answers and R code must be your own.
Submission: Knit your .Rmd to HTML and upload to Brightspace by end of class.
| Variable | Description | Type |
|---|---|---|
fmd |
Frequent mental distress (No/Yes) | Factor (outcome) |
menthlth_days |
Mentally unhealthy days (0–30) | Numeric |
physhlth_days |
Physically unhealthy days (0–30) | Numeric |
sleep_hrs |
Sleep hours per night (1–14) | Numeric |
age |
Age in years (capped at 80) | Numeric |
sex |
Sex (Male/Female) | Factor |
bmi |
Body mass index | Numeric |
exercise |
Exercised in past 30 days (No/Yes) | Factor |
income_cat |
Household income category (1–8) | Numeric |
smoker |
Former/Never vs. Current | Factor |
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.5.2
## Warning: package 'ggplot2' was built under R version 4.5.2
## Warning: package 'tibble' was built under R version 4.5.2
## Warning: package 'tidyr' was built under R version 4.5.2
## Warning: package 'readr' was built under R version 4.5.2
## Warning: package 'purrr' was built under R version 4.5.2
## Warning: package 'dplyr' was built under R version 4.5.2
## Warning: package 'stringr' was built under R version 4.5.2
## Warning: package 'forcats' was built under R version 4.5.2
## Warning: package 'lubridate' was built under R version 4.5.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.0 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.2 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(broom)
## Warning: package 'broom' was built under R version 4.5.2
library(knitr)
## Warning: package 'knitr' was built under R version 4.5.2
library(kableExtra)
## Warning: package 'kableExtra' was built under R version 4.5.2
##
## Attaching package: 'kableExtra'
##
## The following object is masked from 'package:dplyr':
##
## group_rows
library(gtsummary)
## Warning: package 'gtsummary' was built under R version 4.5.2
library(car)
## Warning: package 'car' was built under R version 4.5.2
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.5.2
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
library(ggeffects)
## Warning: package 'ggeffects' was built under R version 4.5.2
library(ResourceSelection)
## Warning: package 'ResourceSelection' was built under R version 4.5.3
## ResourceSelection 0.3-6 2023-06-27
library(pROC)
## Warning: package 'pROC' was built under R version 4.5.3
## Type 'citation("pROC")' for a citation.
##
## Attaching package: 'pROC'
##
## The following objects are masked from 'package:stats':
##
## cov, smooth, var
library(performance)
## Warning: package 'performance' was built under R version 4.5.3
brfss_logistic <- readRDS(
"C:/Users/abbym/OneDrive/Desktop/STATS553/R Materials/epi553/scripts/brfss_logistic.rds"
)
mod_full <- glm(
fmd ~ exercise + smoker + age + sex + sleep_hrs,
data = brfss_logistic,
family = binomial(link = "logit")
)
mod_full |>
tbl_regression(
exponentiate = TRUE,
label = list(
exercise ~ "Exercise (past 30 days)",
smoker ~ "Smoking status",
age ~ "Age (per year)",
sex ~ "Sex",
sleep_hrs ~ "Sleep hours"
)
) |>
bold_labels() |>
bold_p()
| Characteristic | OR | 95% CI | p-value |
|---|---|---|---|
| Exercise (past 30 days) | |||
| No | — | — | |
| Yes | 0.54 | 0.46, 0.65 | <0.001 |
| Smoking status | |||
| Former/Never | — | — | |
| Current | 1.45 | 1.23, 1.71 | <0.001 |
| Age (per year) | 0.98 | 0.97, 0.98 | <0.001 |
| Sex | |||
| Male | — | — | |
| Female | 1.77 | 1.51, 2.08 | <0.001 |
| Sleep hours | 0.81 | 0.76, 0.85 | <0.001 |
| Abbreviations: CI = Confidence Interval, OR = Odds Ratio | |||
1a. (5 pts) Fit a multiple logistic regression model
predicting fmd from at least 5 predictors of your
choice.
1b. (5 pts) Report the adjusted ORs with 95% CIs in
a publication-quality table using tbl_regression().
1c. (5 pts) Interpret the adjusted OR for two predictors of your choice in 1-2 sentences each. Make sure to mention what the OR represents (per unit change for continuous; reference category for categorical). Compared to those that don’t smoke, those that do smoke have 45% increased odds of having frequent mental distress, holding all other variables constant. For every one additional hour of sleep, the odds of having frequent mental distress decreases by 19%, holding all other variables constant. —
tidy(mod_full, conf.int = TRUE, exponentiate = TRUE) |>
mutate(across(c(estimate, std.error, statistic, conf.low, conf.high),
\(x) round(x, 3)),
p.value = format.pval(p.value, digits = 3)) |>
kable(caption = "Wald Tests for Each Coefficient (Exponentiated)") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 2.788 | 0.246 | 4.175 | 2.98e-05 | 1.723 | 4.514 |
| exerciseYes | 0.544 | 0.088 | -6.921 | 4.48e-12 | 0.458 | 0.646 |
| smokerCurrent | 1.451 | 0.085 | 4.381 | 1.18e-05 | 1.228 | 1.713 |
| age | 0.976 | 0.003 | -9.107 | < 2e-16 | 0.971 | 0.981 |
| sexFemale | 1.771 | 0.082 | 6.935 | 4.07e-12 | 1.507 | 2.082 |
| sleep_hrs | 0.807 | 0.028 | -7.813 | 5.56e-15 | 0.764 | 0.851 |
mod_reduced <- glm(
fmd ~ exercise + smoker + age + sex,
data = brfss_logistic,
family = binomial
)
anova(mod_reduced, mod_full, test = "LRT") |>
kable(digits = 3,
caption = "LR Test: Does adding exercise + smoker improve the model?") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)
| Resid. Df | Resid. Dev | Df | Deviance | Pr(>Chi) |
|---|---|---|---|---|
| 4995 | 4000.588 | NA | NA | NA |
| 4994 | 3938.021 | 1 | 62.567 | 0 |
2a. (5 pts) Identify the Wald p-value for each
predictor in your model from the tidy() or
summary() output.
2b. (5 pts) Fit a reduced model that drops one
predictor of your choice. Perform a likelihood ratio test comparing the
full and reduced models using
anova(reduced, full, test = "LRT").
2c. (5 pts) Compare the conclusions from the Wald test (for the dropped predictor) and the LR test. Do they agree? In 2-3 sentences, explain when the two tests might disagree. The Wald test p-value for sleep_hrs was 5.56e-15 and the LR test p-value was 0. This means that both tests show that sleep_hrs improves the model. The two tests might disagree when the other variables have an impact on the variable that has been dropped. Sometimes a single variable alone doesn’t provide significant information, but when combined with other variables in a model, it does provide significant information. —
mod_int <- glm(
fmd ~ exercise * smoker + age + sex + sleep_hrs,
data = brfss_logistic,
family = binomial(link = "logit")
)
anova(mod_int, mod_full, test = "LRT") |>
kable(digits = 3,
caption = "LR Test: Does adding exercise + smoker interaction improve the model?") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)
| Resid. Df | Resid. Dev | Df | Deviance | Pr(>Chi) |
|---|---|---|---|---|
| 4993 | 3936.635 | NA | NA | NA |
| 4994 | 3938.021 | -1 | -1.386 | 0.239 |
ggpredict(mod_int, terms = c("exercise", "smoker")) |>
plot() +
labs(title = "Predicted Probability of FMD by Exercise and Smoker",
x = "Exercise", y = "Predicted Probability of FMD",
color = "Smoker") +
theme_minimal()
## Ignoring unknown labels:
## • linetype : "smoker"
## • shape : "smoker"
3a. (5 pts) Fit a model that includes an interaction
between two predictors of your choice (e.g., exercise * sex
or smoker * age).
3b. (5 pts) Perform a likelihood ratio test comparing the model with the interaction to the model without it.
3c. (5 pts) Create a visualization of the
interaction using ggpredict() and plot().
3d. (5 pts) In 3-4 sentences, interpret the interaction. Does the effect of one predictor differ across levels of the other? If statistically significant, report the stratum-specific odds ratios. The effect of exercise on frequent mental distress does not differ by sex, and vice versa. The p-value of the likelihood ratio test is 0.239, meaning the interaction is not significant. This means it does not improve or provide significant information to the model. —
performance::r2_mcfadden(mod_full)
## # R2 for Generalized Linear Regression
## R2: 0.074
## adj. R2: 0.073
hl_test <- hoslem.test(
x = as.numeric(brfss_logistic$fmd) - 1,
y = fitted(mod_full),
g = 10
)
hl_test
##
## Hosmer and Lemeshow goodness of fit (GOF) test
##
## data: as.numeric(brfss_logistic$fmd) - 1, fitted(mod_full)
## X-squared = 17.02, df = 8, p-value = 0.0299
brfss_logistic |>
mutate(pred_prob = fitted(mod_full),
obs_outcome = as.numeric(fmd) - 1,
decile = ntile(pred_prob, 10)) |>
group_by(decile) |>
summarise(
mean_pred = mean(pred_prob),
mean_obs = mean(obs_outcome),
.groups = "drop"
) |>
ggplot(aes(x = mean_pred, y = mean_obs)) +
geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
geom_point(size = 3, color = "steelblue") +
geom_line(color = "steelblue") +
labs(title = "Calibration Plot: Observed vs. Predicted Probability of FMD",
subtitle = "Points should fall on the dashed line for perfect calibration",
x = "Mean Predicted Probability (within decile)",
y = "Observed Proportion (within decile)") +
theme_minimal()
roc_obj <- roc(
response = brfss_logistic$fmd,
predictor = fitted(mod_full),
levels = c("No", "Yes"),
direction = "<"
)
auc_value <- auc(roc_obj)
ggroc(roc_obj, color = "steelblue", linewidth = 1.2) +
geom_abline(slope = 1, intercept = 1, linetype = "dashed", color = "red") +
labs(title = "ROC Curve for Frequent Mental Distress Model",
subtitle = paste0("AUC = ", round(auc_value, 3)),
x = "Specificity", y = "Sensitivity") +
theme_minimal()
4a. (5 pts) Compute McFadden’s pseudo-R² for your
full model using performance::r2_mcfadden().
4b. (5 pts) Perform the Hosmer-Lemeshow
goodness-of-fit test using
ResourceSelection::hoslem.test(). Report the test statistic
and p-value. Comment on the interpretation given your sample size. The
Hosmer-Lemeshow test had a p-value of 0.0299, which means that the model
does not fit well in some of the regions of predicted probability, but
since we are using a large sample size, the test could be overpowered
and this p-value could be representative of very trivial and small
shifts. 4c. (5 pts) Create a calibration plot showing
observed vs. predicted probabilities by decile. Comment on whether the
model appears well calibrated. The model looks fairly well calibrated.
It follows the 45 degree line pretty well, with only a few more major
deviations from it. 4d. (10 pts) Compute and plot the
ROC curve using pROC::roc(). Report the AUC. Based on the
AUC value, how would you describe the model’s discrimination ability
(poor, acceptable, excellent, outstanding)? The AUC is 0.695. This means
that the model’s discrimination ability is poor. —
End of Lab Activity