Part 9: In-Class Lab Activity

EPI 553 — Logistic Regression Part 2 Lab Due: End of class, April 14, 2026


Instructions

In this lab, you will build a multiple logistic regression model, conduct hypothesis tests, examine an interaction, and assess goodness-of-fit and discrimination. Use the same BRFSS 2020 logistic dataset from Part 1. Work through each task systematically. You may discuss concepts with classmates, but your written answers and R code must be your own.

Submission: Knit your .Rmd to HTML and upload to Brightspace by end of class.


Data for the Lab

Variable Description Type
fmd Frequent mental distress (No/Yes) Factor (outcome)
menthlth_days Mentally unhealthy days (0–30) Numeric
physhlth_days Physically unhealthy days (0–30) Numeric
sleep_hrs Sleep hours per night (1–14) Numeric
age Age in years (capped at 80) Numeric
sex Sex (Male/Female) Factor
bmi Body mass index Numeric
exercise Exercised in past 30 days (No/Yes) Factor
income_cat Household income category (1–8) Numeric
smoker Former/Never vs. Current Factor
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.4.3
## Warning: package 'ggplot2' was built under R version 4.4.3
## Warning: package 'tibble' was built under R version 4.4.3
## Warning: package 'tidyr' was built under R version 4.4.3
## Warning: package 'readr' was built under R version 4.4.3
## Warning: package 'purrr' was built under R version 4.4.3
## Warning: package 'dplyr' was built under R version 4.4.3
## Warning: package 'stringr' was built under R version 4.4.3
## Warning: package 'forcats' was built under R version 4.4.3
## Warning: package 'lubridate' was built under R version 4.4.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.1     ✔ tibble    3.3.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(broom)
## Warning: package 'broom' was built under R version 4.4.3
library(knitr)
## Warning: package 'knitr' was built under R version 4.4.3
library(kableExtra)
## Warning: package 'kableExtra' was built under R version 4.4.3
## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows
library(gtsummary)
## Warning: package 'gtsummary' was built under R version 4.4.3
library(car)
## Warning: package 'car' was built under R version 4.4.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.4.3
## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## The following object is masked from 'package:purrr':
## 
##     some
library(ggeffects)
## Warning: package 'ggeffects' was built under R version 4.4.3
library(ResourceSelection)
## Warning: package 'ResourceSelection' was built under R version 4.4.3
## ResourceSelection 0.3-6   2023-06-27
library(pROC)
## Warning: package 'pROC' was built under R version 4.4.3
## Type 'citation("pROC")' for a citation.
## 
## Attaching package: 'pROC'
## 
## The following objects are masked from 'package:stats':
## 
##     cov, smooth, var
library(performance)
## Warning: package 'performance' was built under R version 4.4.3
brfss_logistic <- readRDS(
  "/Users/sarah/OneDrive/Documents/EPI 553/data/brfss_logistic_2020.rds"
)

Task 1: Build a Multiple Logistic Regression Model (15 points)

1a. (5 pts) Fit a multiple logistic regression model predicting fmd from at least 5 predictors of your choice.

mod_1a <- glm(fmd~ age + sex + sleep_hrs + exercise + smoker + income_cat, data = brfss_logistic, family = binomial(link = "logit"))

1b. (5 pts) Report the adjusted ORs with 95% CIs in a publication-quality table using tbl_regression().

mod_1a |>
  tbl_regression(
    exponentiate = TRUE, 
    label = list(
      age ~ "Age",
      sex ~ "Sex",
      sleep_hrs ~ "Hours of Sleep",
      exercise ~ "Exercise (Past 30 days)",
      smoker ~ "Smoking Status",
      income_cat ~ "Income category (per unit)")
  ) |>
  bold_labels()|>
  bold_p()
Characteristic OR 95% CI p-value
Age 0.97 0.97, 0.98 <0.001
Sex


    Male — —
    Female 1.67 1.42, 1.96 <0.001
Hours of Sleep 0.81 0.77, 0.86 <0.001
Exercise (Past 30 days)


    No — —
    Yes 0.61 0.51, 0.73 <0.001
Smoking Status


    Former/Never — —
    Current 1.25 1.05, 1.48 0.012
Income category (per unit) 0.85 0.82, 0.89 <0.001
Abbreviations: CI = Confidence Interval, OR = Odds Ratio

1c. (5 pts) Interpret the adjusted OR for two predictors of your choice in 1-2 sentences each. Make sure to mention what the OR represents (per unit change for continuous; reference category for categorical).

The adjusted OR for sleep hours represents the change in the odds of frequent mental distress for each additional hour of sleep per night. With the OR of 0.81, each extra hour of sleep is associated with lower odds of experiencing FMD, after adjusting for age, sex, exercise, smoking, and income.

The adjusted OR for smoking compares current smokers to the reference group of former/never smokers. With an OR of 1.25, current smokers have higher odds of FMD than non-smokers, controlling for all other predictors in the model.

Task 2: Wald and Likelihood Ratio Tests (15 points)

2a. (5 pts) Identify the Wald p-value for each predictor in your model from the tidy() or summary() output.

tidy(mod_1a, conf.int = TRUE, exponentiate = TRUE) |>
  mutate(across(c(estimate, std.error, statistic, conf.low, conf.high),
                \(x) round(x, 3)),
         p.value = format.pval(p.value, digits = 3)) |>
  kable(caption = "Wald Tests for Each Coefficient (Exponentiated)") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)
Wald Tests for Each Coefficient (Exponentiated)
term estimate std.error statistic p.value conf.low conf.high
(Intercept) 6.864 0.269 7.171 7.42e-13 4.057 11.632
age 0.975 0.003 -9.609 < 2e-16 0.970 0.980
sexFemale 1.667 0.083 6.129 8.85e-10 1.416 1.964
sleep_hrs 0.814 0.027 -7.572 3.69e-14 0.772 0.859
exerciseYes 0.613 0.090 -5.449 5.06e-08 0.514 0.731
smokerCurrent 1.247 0.088 2.523 0.0116 1.050 1.480
income_cat 0.853 0.019 -8.294 < 2e-16 0.822 0.886

2b. (5 pts) Fit a reduced model that drops one predictor of your choice. Perform a likelihood ratio test comparing the full and reduced models using anova(reduced, full, test = "LRT").

mod_2b <- glm(fmd~ age + sex + sleep_hrs + exercise + income_cat, data = brfss_logistic, family = binomial(link = "logit"))

anova(mod_2b, mod_1a, test = "LRT") |>
  kable(digits = 3, caption = "LR test: Does adding smoker improve the model?") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)
LR test: Does adding smoker improve the model?
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
4994 3876.520 NA NA NA
4993 3870.197 1 6.322 0.012

2c. (5 pts) Compare the conclusions from the Wald test (for the dropped predictor) and the LR test. Do they agree? In 2-3 sentences, explain when the two tests might disagree.

Yes the Wald test and LR test do agree. Adding smoker significantly improves the model fit (\(\chi^2\) = 6.322, p =0.012).

Task 3: Test an Interaction (20 points)

3a. (5 pts) Fit a model that includes an interaction between two predictors of your choice (e.g., exercise * sex or smoker * age).

mod_3a <- glm(fmd ~ smoker * age + sex + sleep_hrs + exercise + income_cat, data = brfss_logistic, family = binomial(link = "logit"))

3b. (5 pts) Perform a likelihood ratio test comparing the model with the interaction to the model without it.

anova(mod_3a, mod_1a, test = "LRT") |>
  kable(digits =3,
        caption = "LR Test: Does adding smoker *age improve the model?") |>
    kable_styling(bootstrap_options = "striped", full_width = FALSE)
LR Test: Does adding smoker *age improve the model?
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
4992 3865.763 NA NA NA
4993 3870.197 -1 -4.434 0.035

3c. (5 pts) Create a visualization of the interaction using ggpredict() and plot().

ggpredict(mod_3a, terms = c("smoker", "age")) |>
plot() +
  labs(
    x = "Smoking Status",
    y = "Predicted Probability of FMD",
    title = "Interaction Between Smoking Status and age on FMD"
  )
## Ignoring unknown labels:
## • linetype : "IMPUTED AGE VALUE COLLAPSED ABOVE 80"
## • shape : "IMPUTED AGE VALUE COLLAPSED ABOVE 80"

3d. (5 pts) In 3-4 sentences, interpret the interaction. Does the effect of one predictor differ across levels of the other? If statistically significant, report the stratum-specific odds ratios.

ggpredict(mod_3a, terms = c("smoker", "age [40,56,72]")) |>
  as_tibble() |>
  pivot_wider(
    id_cols = group,
    names_from = x,
    values_from = predicted
  ) |>
  mutate(
    OR_current_vs_former =
      (Current / (1 - Current)) /
      (`Former/Never` / (1 - `Former/Never`))
  ) |>
  select(
    Age = group,
    OR_current_vs_former
  ) |>
  kable(
    digits = 3,
    col.names = c("Age", "OR (Current vs Former/Never)"),
    caption = "Stratum-Specific Odds Ratios for Smoking at Selected Ages"
  ) |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)
Stratum-Specific Odds Ratios for Smoking at Selected Ages
Age OR (Current vs Former/Never)
40 1.106
56 1.322
72 1.580

The likelihood ratio test indicates that the smoker * age interaction is statistically significant (p = 0.035), meaning the association between smoking and FMD varies across age. The stratum-specific odds ratio demonstrate the impact of current smoking becomes stronger as age increases. At the age of 40, current smokers have 1.11 times the odds of FMD compared with former/never smokers. By the age of 72 the odds increased to 1.58. This pattern demonstrates that age modifies the effect of smoking, with smoking exerting a greater influence on FMD at older ages.


Task 4: Goodness-of-Fit and Discrimination (25 points)

4a. (5 pts) Compute McFadden’s pseudo-R² for your full model using performance::r2_mcfadden().

performance::r2_mcfadden(mod_3a)
## # R2 for Generalized Linear Regression
##        R2: 0.091
##   adj. R2: 0.090

4b. (5 pts) Perform the Hosmer-Lemeshow goodness-of-fit test using ResourceSelection::hoslem.test(). Report the test statistic and p-value. Comment on the interpretation given your sample size.

hoslem_test <- hoslem.test(
  x = as.numeric(brfss_logistic$fmd) - 1,
  y = fitted(mod_3a),
  g = 10
)

hoslem_test
## 
##  Hosmer and Lemeshow goodness of fit (GOF) test
## 
## data:  as.numeric(brfss_logistic$fmd) - 1, fitted(mod_3a)
## X-squared = 13.24, df = 8, p-value = 0.1038

The Hosmer-Lemeshow test reproted a test statistic of \(\chi^2\) = 13.24 and a p-value of 0.1038. Since the p-value is greater than 0.05, there is no evidencde of poor model fit. With a larger sample size, small deviations can produce statistically significant results. A non-significant Hosmer-Lemeshow test suggests the model fits adequately.

4c. (5 pts) Create a calibration plot showing observed vs. predicted probabilities by decile. Comment on whether the model appears well calibrated.

brfss_logistic |>
  mutate(pred_prob = fitted(mod_3a),
         obs_outcome = as.numeric(fmd) -1,
         decile = ntile(pred_prob, 10)) |>
  group_by(decile) |>
  summarise(
    mean_pred = mean(pred_prob),
    mean_obs  = mean(obs_outcome),
    .groups = "drop"
  ) |>
  ggplot(aes(x = mean_pred, y = mean_obs)) +
  geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
  geom_point(size = 3, color = "steelblue") +
  geom_line(color = "steelblue") +
  labs(title = "Calibration Plot: Observed vs. Predicted Probability of FMD",
       x = "Mean Predicted Probability (within decile)",
       y = "Observed Proportion (within decile)") +
  theme_minimal()

The calibration plot shows that hte observed proportions of FMD across deciles of predicted risk fall close to the reference line.

4d. (10 pts) Compute and plot the ROC curve using pROC::roc(). Report the AUC. Based on the AUC value, how would you describe the model’s discrimination ability (poor, acceptable, excellent, outstanding)?

roc_4d <- roc(brfss_logistic$fmd, fitted(mod_3a),
              levels = c("No", "Yes"),
              direction = "<")

auc_value4d <- auc(roc_4d)
auc_value4d
## Area under the curve: 0.7193
ggroc(roc_4d, color = "steelblue", linewidth = 1.2) +
  geom_abline(slope = 1, intercept = 1, linetype = "dashed", color = "red") +
  labs(
    title = "ROC Curve for FMD Model",
    x = "Specificity",
    y = "Sensitivity") +
  theme_minimal()

The AUC is 0.7193. This indicates the model demonstrates acceptable discrimination. The model can distinguish between individuals with and without FMD.

End of Lab Activity