EPI 553 — Logistic Regression Part 2 Lab Due: End of class, April 14, 2026
In this lab, you will build a multiple logistic regression model, conduct hypothesis tests, examine an interaction, and assess goodness-of-fit and discrimination. Use the same BRFSS 2020 logistic dataset from Part 1. Work through each task systematically. You may discuss concepts with classmates, but your written answers and R code must be your own.
Submission: Knit your .Rmd to HTML and upload to Brightspace by end of class.
| Variable | Description | Type |
|---|---|---|
fmd |
Frequent mental distress (No/Yes) | Factor (outcome) |
menthlth_days |
Mentally unhealthy days (0–30) | Numeric |
physhlth_days |
Physically unhealthy days (0–30) | Numeric |
sleep_hrs |
Sleep hours per night (1–14) | Numeric |
age |
Age in years (capped at 80) | Numeric |
sex |
Sex (Male/Female) | Factor |
bmi |
Body mass index | Numeric |
exercise |
Exercised in past 30 days (No/Yes) | Factor |
income_cat |
Household income category (1–8) | Numeric |
smoker |
Former/Never vs. Current | Factor |
library(tidyverse)
library(broom)
library(knitr)
library(kableExtra)
library(gtsummary)
library(car)
library(ggeffects)
library(ResourceSelection)
library(pROC)
library(performance)
brfss_logistic <- readRDS(
"~/Desktop/EPI553/data/brfss_logistic_2020 copy.rds"
)1a. (5 pts) Fit a multiple logistic regression model
predicting fmd from at least 5 predictors of your
choice.
mod_fmd <- glm(fmd ~ exercise + sex + physhlth_days + age + sleep_hrs,
data= brfss_logistic,
family= binomial(link= "logit"))1b. (5 pts) Report the adjusted ORs with 95% CIs in
a publication-quality table using tbl_regression().
mod_fmd |>
tbl_regression(
exponentiate = TRUE,
label = list(
exercise ~ "Exercise (past 30 days)",
sex ~ "Sex",
physhlth_days ~ "Physically unhealthy days",
age ~ "Age (per year)",
sleep_hrs ~ "Sleep hours"
)
) |>
bold_labels() |>
bold_p()| Characteristic | OR | 95% CI | p-value |
|---|---|---|---|
| Exercise (past 30 days) | |||
| No | — | — | |
| Yes | 0.75 | 0.62, 0.90 | 0.002 |
| Sex | |||
| Male | — | — | |
| Female | 1.75 | 1.48, 2.07 | <0.001 |
| Physically unhealthy days | 1.07 | 1.06, 1.08 | <0.001 |
| Age (per year) | 0.97 | 0.96, 0.97 | <0.001 |
| Sleep hours | 0.85 | 0.80, 0.90 | <0.001 |
| Abbreviations: CI = Confidence Interval, OR = Odds Ratio | |||
1c. (5 pts) Interpret the adjusted OR for two predictors of your choice in 1-2 sentences each. Make sure to mention what the OR represents (per unit change for continuous; reference category for categorical).
In comparison to those who did not exercise in the past 30 days (reference category), those who did exercise have 0.75 times the odds of frequent mental distress (aOR= 0.75, 95% CI: 0.62-0.90). In other words, there is 25% lower odds of fmd among those who exercised, after adjusting for sex, physical health, age, and sleep. Since the aOR is less than 1, it indicates that exercise is a protective factor.
For each additional hour of sleep, the odds of frequent mental distress decrease by 15% (aOR= 0.85, 95% CI= 0.80-0.90), after adjusting for exercise, sex, physical health, and age. Since aOR is less than 1, it indicates sleep is a protective factor.
2a. (5 pts) Identify the Wald p-value for each
predictor in your model from the tidy() or
summary() output.
tidy(mod_fmd, conf.int = TRUE, exponentiate = TRUE) |>
mutate(across(c(estimate, std.error, statistic, conf.low, conf.high),
\(x) round(x, 3)),
p.value = format.pval(p.value, digits = 3)) |>
kable(caption = "Wald Tests for Each Coefficient (Exponentiated)") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| (Intercept) | 2.088 | 0.238 | 3.088 | 0.00202 | 1.309 | 3.332 |
| exerciseYes | 0.747 | 0.095 | -3.072 | 0.00213 | 0.620 | 0.901 |
| sexFemale | 1.745 | 0.086 | 6.508 | 7.62e-11 | 1.476 | 2.065 |
| physhlth_days | 1.070 | 0.004 | 17.276 | < 2e-16 | 1.062 | 1.078 |
| age | 0.966 | 0.003 | -12.447 | < 2e-16 | 0.961 | 0.971 |
| sleep_hrs | 0.849 | 0.027 | -5.955 | 2.60e-09 | 0.804 | 0.896 |
exerciseYes: p-value= 0.002 sexFemale: p-value < 0.001 physhlth_days: p-value < 0.001 age: p-value < 0.001 sleep_hrs: p-value < 0.001
2b. (5 pts) Fit a reduced model that drops one
predictor of your choice. Perform a likelihood ratio test comparing the
full and reduced models using
anova(reduced, full, test = "LRT").
mod_reduced <- glm(
fmd ~ sex + physhlth_days + age + sleep_hrs,
data = brfss_logistic,
family = binomial
)
anova(mod_reduced, mod_fmd, test = "LRT") |>
kable(digits = 3,
caption = "LR Test: Does adding exercise improve the model?") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)| Resid. Df | Resid. Dev | Df | Deviance | Pr(>Chi) |
|---|---|---|---|---|
| 4995 | 3675.927 | NA | NA | NA |
| 4994 | 3666.679 | 1 | 9.248 | 0.002 |
2c. (5 pts) Compare the conclusions from the Wald test (for the dropped predictor) and the LR test. Do they agree? In 2-3 sentences, explain when the two tests might disagree.
3a. (5 pts) Fit a model that includes an interaction
between two predictors of your choice (e.g., exercise * sex
or smoker * age).
mod_interaction <- glm(fmd ~ exercise * sex + physhlth_days + age + sleep_hrs, data= brfss_logistic, family= binomial
)
mod_interaction |>
tbl_regression(exponentiate = TRUE) |>
bold_labels() |>
bold_p()| Characteristic | OR | 95% CI | p-value |
|---|---|---|---|
| exercise | |||
| No | — | — | |
| Yes | 0.74 | 0.56, 0.97 | 0.027 |
| sex | |||
| Male | — | — | |
| Female | 1.72 | 1.28, 2.31 | <0.001 |
| physhlth_days | 1.07 | 1.06, 1.08 | <0.001 |
| IMPUTED AGE VALUE COLLAPSED ABOVE 80 | 0.97 | 0.96, 0.97 | <0.001 |
| sleep_hrs | 0.85 | 0.80, 0.90 | <0.001 |
| exercise * sex | |||
| Yes * Female | 1.02 | 0.71, 1.46 | >0.9 |
| Abbreviations: CI = Confidence Interval, OR = Odds Ratio | |||
3b. (5 pts) Perform a likelihood ratio test comparing the model with the interaction to the model without it.
mod_no_interaction <- glm(
fmd ~ exercise + sex + age + sleep_hrs + physhlth_days,
data = brfss_logistic,
family = binomial
)
anova(mod_no_interaction, mod_interaction, test = "LRT") |>
kable(digits = 3,
caption = "LR Test for Exercise × Sex Interaction") |>
kable_styling(bootstrap_options = "striped", full_width = FALSE)| Resid. Df | Resid. Dev | Df | Deviance | Pr(>Chi) |
|---|---|---|---|---|
| 4994 | 3666.679 | NA | NA | NA |
| 4993 | 3666.664 | 1 | 0.016 | 0.9 |
3c. (5 pts) Create a visualization of the
interaction using ggpredict() and plot().
ggpredict(mod_interaction, terms = c("exercise", "sex")) |>
plot() +
labs(title = "Predicted Probability of FMD by Exercise and Sex",
x = "Exercise", y = "Predicted Probability of FMD",
color = "Sex") +
theme_minimal()3d. (5 pts) In 3-4 sentences, interpret the interaction. Does the effect of one predictor differ across levels of the other? If statistically significant, report the stratum-specific odds ratios.
The interaction between exercise and sex is not statistically significant, the p-value is greater than 0.9. The likelihood test supports this with a p-value= 0.9 therefore the effect of exercise on fmd does not differ by sex. Since the interaction is not statistically significant the effect of both groups can be interpreted by themselves. This means females have 1.72 times the odds of fmd compared to males, regardless of exercise status. Also, the odds of fmd are 26% lower for those who exercise compared to those who do not.
4a. (5 pts) Compute McFadden’s pseudo-R² for your
full model using performance::r2_mcfadden().
## # R2 for Generalized Linear Regression
## R2: 0.138
## adj. R2: 0.137
4b. (5 pts) Perform the Hosmer-Lemeshow
goodness-of-fit test using
ResourceSelection::hoslem.test(). Report the test statistic
and p-value. Comment on the interpretation given your sample size.
hl_test <- hoslem.test(
x = as.numeric(brfss_logistic$fmd) - 1,
y = fitted(mod_fmd),
g = 10
)
hl_test##
## Hosmer and Lemeshow goodness of fit (GOF) test
##
## data: as.numeric(brfss_logistic$fmd) - 1, fitted(mod_fmd)
## X-squared = 15.295, df = 8, p-value = 0.05366
test statistic (X-squared)= 15.295 p-value= 0.05366 Given the large sample size and a p-value just above 0.05, it is suggested that the model seems to fit the data but it should be interpreted with caution.
4c. (5 pts) Create a calibration plot showing observed vs. predicted probabilities by decile. Comment on whether the model appears well calibrated.
brfss_logistic |>
mutate(pred_prob = fitted(mod_fmd),
obs_outcome = as.numeric(fmd) - 1,
decile = ntile(pred_prob, 10)) |>
group_by(decile) |>
summarise(
mean_pred = mean(pred_prob),
mean_obs = mean(obs_outcome),
.groups = "drop"
) |>
ggplot(aes(x = mean_pred, y = mean_obs)) +
geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
geom_point(size = 3, color = "steelblue") +
geom_line(color = "steelblue") +
labs(title = "Calibration Plot: Observed vs. Predicted Probability of FMD",
subtitle = "Points should fall on the dashed line for perfect calibration",
x = "Mean Predicted Probability (within decile)",
y = "Observed Proportion (within decile)") +
theme_minimal()The model is not perfectly well calibrated. There are points that fall both above and under the 45-degree line but they are all relatively close.
4d. (10 pts) Compute and plot the ROC curve using
pROC::roc(). Report the AUC. Based on the AUC value, how
would you describe the model’s discrimination ability (poor, acceptable,
excellent, outstanding)?
roc_obj <- roc(
response = brfss_logistic$fmd,
predictor = fitted(mod_fmd),
levels = c("No", "Yes"),
direction = "<"
)
auc_value <- auc(roc_obj)
ggroc(roc_obj, color = "steelblue", linewidth = 1.2) +
geom_abline(slope = 1, intercept = 1, linetype = "dashed", color = "red") +
labs(title = "ROC Curve for Frequent Mental Distress Model",
subtitle = paste0("AUC = ", round(auc_value, 3)),
x = "Specificity", y = "Sensitivity") +
theme_minimal()The AUC= 0.763 Based on the AUC of 0.763, the models discrimination ability is acceptable. —
End of Lab Activity