EPI 553 — Logistic Regression Part 2 Lab Due: End of class, April 14, 2026
In this lab, you will build a multiple logistic regression model, conduct hypothesis tests, examine an interaction, and assess goodness-of-fit and discrimination. Use the same BRFSS 2020 logistic dataset from Part 1. Work through each task systematically. You may discuss concepts with classmates, but your written answers and R code must be your own.
Submission: Knit your .Rmd to HTML and upload to Brightspace by end of class.
| Variable | Description | Type |
|---|---|---|
fmd |
Frequent mental distress (No/Yes) | Factor (outcome) |
menthlth_days |
Mentally unhealthy days (0–30) | Numeric |
physhlth_days |
Physically unhealthy days (0–30) | Numeric |
sleep_hrs |
Sleep hours per night (1–14) | Numeric |
age |
Age in years (capped at 80) | Numeric |
sex |
Sex (Male/Female) | Factor |
bmi |
Body mass index | Numeric |
exercise |
Exercised in past 30 days (No/Yes) | Factor |
income_cat |
Household income category (1–8) | Numeric |
smoker |
Former/Never vs. Current | Factor |
library(tidyverse)
library(broom)
library(knitr)
library(kableExtra)
library(gtsummary)
library(car)
library(ggeffects)
library(ResourceSelection)
library(pROC)
library(performance)
brfss_logistic <- readRDS("data/brfss_logistic_2020.rds"
)1a. (5 pts) Fit a multiple logistic regression model
predicting fmd from at least 5 predictors of your
choice.
mod_full <- glm(
fmd ~ exercise + age + sex + sleep_hrs + income_cat,
data = brfss_logistic,
family = binomial(link = "logit")
)1b. (5 pts) Report the adjusted ORs with 95% CIs in
a publication-quality table using tbl_regression().
mod_full |>
tbl_regression(
exponentiate = TRUE,
label = list(
exercise ~ "Exercise (past 30 days)",
age ~ "Age (per year)",
sex ~ "Sex",
sleep_hrs ~ "Sleep hours",
income_cat ~ "Income category (per unit)"
)
) |>
bold_labels() |>
bold_p()| Characteristic | OR | 95% CI | p-value |
|---|---|---|---|
| Exercise (past 30 days) | |||
| No | — | — | |
| Yes | 0.60 | 0.51, 0.72 | <0.001 |
| Age (per year) | 0.97 | 0.97, 0.98 | <0.001 |
| Sex | |||
| Male | — | — | |
| Female | 1.67 | 1.42, 1.96 | <0.001 |
| Sleep hours | 0.81 | 0.77, 0.86 | <0.001 |
| Income category (per unit) | 0.84 | 0.81, 0.88 | <0.001 |
| Abbreviations: CI = Confidence Interval, OR = Odds Ratio | |||
1c. (5 pts) Interpret the adjusted OR for two predictors of your choice in 1-2 sentences each. Make sure to mention what the OR represents (per unit change for continuous; reference category for categorical).
2a. (5 pts) Identify the Wald p-value for each
predictor in your model from the tidy() or
summary() output.
tidy(mod_full, conf.int = TRUE, exponentiate = TRUE) |>
mutate(across(c(estimate, std.error, statistic, conf.low, conf.high),
\(x) round(x, 3)),
p.value = format.pval(p.value, digits = 3)) |>
kable(caption = "Wald Tests for Each Coefficient (Exponentiated)") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 8.967 | 0.247 | 8.890 | < 2e-16 | 5.534 | 14.564 |
| exerciseYes | 0.603 | 0.090 | -5.635 | 1.75e-08 | 0.506 | 0.720 |
| age | 0.973 | 0.003 | -10.551 | < 2e-16 | 0.968 | 0.978 |
| sexFemale | 1.668 | 0.083 | 6.138 | 8.37e-10 | 1.417 | 1.965 |
| sleep_hrs | 0.811 | 0.027 | -7.705 | 1.31e-14 | 0.769 | 0.855 |
| income_cat | 0.844 | 0.019 | -9.042 | < 2e-16 | 0.814 | 0.876 |
2b. (5 pts) Fit a reduced model that drops one
predictor of your choice. Perform a likelihood ratio test comparing the
full and reduced models using
anova(reduced, full, test = "LRT").
mod_reduced <- glm(
fmd ~ exercise + age + sex + sleep_hrs,
data = brfss_logistic,
family = binomial
)
anova(mod_reduced, mod_full, test = "LRT") |>
kable(digits = 3,
caption = "LR Test: Does adding income improve the model?") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)| Resid. Df | Resid. Dev | Df | Deviance | Pr(>Chi) |
|---|---|---|---|---|
| 4995 | 3957.029 | NA | NA | NA |
| 4994 | 3876.520 | 1 | 80.509 | 0 |
2c. (5 pts) Compare the conclusions from the Wald test (for the dropped predictor) and the LR test. Do they agree? In 2-3 sentences, explain when the two tests might disagree. The Wald test shows that income is significantly associated with frequent mental distress. The likelihood ratio test which dropped the income variable showed that the full model (with income variable) is significantly better at predicting mental distress; therefore, these two tests agree. These test may not agree when there is a small sample size.
3a. (5 pts) Fit a model that includes an interaction
between two predictors of your choice (e.g., exercise * sex
or smoker * age).
mod_interact <- glm(
fmd ~ exercise*sleep_hrs + sex + age + income_cat,
data = brfss_logistic,
family = binomial
)
mod_interact |>
tbl_regression(exponentiate = TRUE) |>
bold_labels() |>
bold_p()| Characteristic | OR | 95% CI | p-value |
|---|---|---|---|
| exercise | |||
| No | — | — | |
| Yes | 1.09 | 0.52, 2.27 | 0.8 |
| sleep_hrs | 0.86 | 0.79, 0.93 | <0.001 |
| sex | |||
| Male | — | — | |
| Female | 1.67 | 1.42, 1.97 | <0.001 |
| IMPUTED AGE VALUE COLLAPSED ABOVE 80 | 0.97 | 0.97, 0.98 | <0.001 |
| income_cat | 0.85 | 0.81, 0.88 | <0.001 |
| exercise * sleep_hrs | |||
| Yes * sleep_hrs | 0.91 | 0.82, 1.02 | 0.10 |
| Abbreviations: CI = Confidence Interval, OR = Odds Ratio | |||
3b. (5 pts) Perform a likelihood ratio test comparing the model with the interaction to the model without it.
mod_no_interact <- glm(
fmd ~ exercise + sleep_hrs + sex + age + income_cat,
data = brfss_logistic,
family = binomial
)
anova(mod_no_interact, mod_interact, test = "LRT") |>
kable(digits = 3,
caption = "LR Test for Exercise × Sleep Interaction") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)| Resid. Df | Resid. Dev | Df | Deviance | Pr(>Chi) |
|---|---|---|---|---|
| 4994 | 3876.520 | NA | NA | NA |
| 4993 | 3873.862 | 1 | 2.657 | 0.103 |
3c. (5 pts) Create a visualization of the
interaction using ggpredict() and plot().
ggpredict(mod_interact, terms = c("exercise", "sleep_hrs")) |>
plot() +
labs(title = "Predicted Probability of FMD by Exercise and Sleep Hours",
x = "exercise", y = "Predicted Probability of FMD",
color = "sleep_hrs") +
theme_minimal()3d. (5 pts) In 3-4 sentences, interpret the interaction. Does the effect of one predictor differ across levels of the other? If statistically significant, report the stratum-specific odds ratios. Exercise does not differ across sleep hours. The interaction was not found to be statistically significant. The likelihood ratio test shows that the model without the interaction is better at predicting mental distress. The plot shows that the relationship between sleep and mental distress does not change depending on exercise. —
4a. (5 pts) Compute McFadden’s pseudo-R² for your
full model using performance::r2_mcfadden().
## # R2 for Generalized Linear Regression
## R2: 0.088
## adj. R2: 0.088
4b. (5 pts) Perform the Hosmer-Lemeshow
goodness-of-fit test using
ResourceSelection::hoslem.test(). Report the test statistic
and p-value. Comment on the interpretation given your sample size.
hl_test <- hoslem.test(
x = as.numeric(brfss_logistic$fmd) - 1,
y = fitted(mod_full),
g = 10
)
hl_test##
## Hosmer and Lemeshow goodness of fit (GOF) test
##
## data: as.numeric(brfss_logistic$fmd) - 1, fitted(mod_full)
## X-squared = 18.281, df = 8, p-value = 0.01922
The Hosmer-Lemeshow test resulted in a test statistic of 18.28 and a p-value = 0.019. This low and significant p-value suggests a poor fit. Larger sample sizes like this study can be over-powered and result in detection of misfit.
4c. (5 pts) Create a calibration plot showing observed vs. predicted probabilities by decile. Comment on whether the model appears well calibrated.
brfss_logistic |>
mutate(pred_prob = fitted(mod_full),
obs_outcome = as.numeric(fmd) - 1,
decile = ntile(pred_prob, 10)) |>
group_by(decile) |>
summarise(
mean_pred = mean(pred_prob),
mean_obs = mean(obs_outcome),
.groups = "drop"
) |>
ggplot(aes(x = mean_pred, y = mean_obs)) +
geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
geom_point(size = 3, color = "steelblue") +
geom_line(color = "steelblue") +
labs(title = "Calibration Plot: Observed vs. Predicted Probability of FMD",
subtitle = "Points should fall on the dashed line for perfect calibration",
x = "Mean Predicted Probability (within decile)",
y = "Observed Proportion (within decile)") +
theme_minimal()
The model appears to be well calibrated given that the observed
proportion align closely with the predicted probabilities. There are two
instances of largers over and under prediction
4d. (10 pts) Compute and plot the ROC curve using
pROC::roc(). Report the AUC. Based on the AUC value, how
would you describe the model’s discrimination ability (poor, acceptable,
excellent, outstanding)?
roc_obj <- roc(
response = brfss_logistic$fmd,
predictor = fitted(mod_full),
levels = c("No", "Yes"),
direction = "<"
)
auc_value <- auc(roc_obj)
ggroc(roc_obj, color = "steelblue", linewidth = 1.2) +
geom_abline(slope = 1, intercept = 1, linetype = "dashed", color = "red") +
labs(title = "ROC Curve for Frequent Mental Distress Model",
subtitle = paste0("AUC = ", round(auc_value, 3)),
x = "Specificity", y = "Sensitivity") +
theme_minimal()
The AUC is 0.717. This indicates that the model’s disrimination ability
is acceptable. —
End of Lab Activity