Part 9: In-Class Lab Activity

EPI 553 — Logistic Regression Part 2 Lab Due: End of class, April 14, 2026

Instructions

In this lab, you will build a multiple logistic regression model, conduct hypothesis tests, examine an interaction, and assess goodness-of-fit and discrimination. Use the same BRFSS 2020 logistic dataset from Part 1. Work through each task systematically. You may discuss concepts with classmates, but your written answers and R code must be your own.

Submission: Knit your .Rmd to HTML and upload to Brightspace by end of class.

Data for the Lab

Variable	Description	Type
`fmd`	Frequent mental distress (No/Yes)	Factor (outcome)
`menthlth_days`	Mentally unhealthy days (0–30)	Numeric
`physhlth_days`	Physically unhealthy days (0–30)	Numeric
`sleep_hrs`	Sleep hours per night (1–14)	Numeric
`age`	Age in years (capped at 80)	Numeric
`sex`	Sex (Male/Female)	Factor
`bmi`	Body mass index	Numeric
`exercise`	Exercised in past 30 days (No/Yes)	Factor
`income_cat`	Household income category (1–8)	Numeric
`smoker`	Former/Never vs. Current	Factor

library(tidyverse)

## Warning: package 'tidyverse' was built under R version 4.5.2

## Warning: package 'ggplot2' was built under R version 4.5.2

## Warning: package 'tibble' was built under R version 4.5.2

## Warning: package 'tidyr' was built under R version 4.5.2

## Warning: package 'readr' was built under R version 4.5.2

## Warning: package 'purrr' was built under R version 4.5.2

## Warning: package 'dplyr' was built under R version 4.5.2

## Warning: package 'stringr' was built under R version 4.5.2

## Warning: package 'forcats' was built under R version 4.5.2

## Warning: package 'lubridate' was built under R version 4.5.2

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.2.0
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(broom)

## Warning: package 'broom' was built under R version 4.5.2

library(knitr)

## Warning: package 'knitr' was built under R version 4.5.2

library(kableExtra)

## Warning: package 'kableExtra' was built under R version 4.5.2

## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows

library(gtsummary)

## Warning: package 'gtsummary' was built under R version 4.5.2

library(car)

## Warning: package 'car' was built under R version 4.5.2

## Loading required package: carData

## Warning: package 'carData' was built under R version 4.5.2

## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## The following object is masked from 'package:purrr':
## 
##     some

library(ggeffects)

## Warning: package 'ggeffects' was built under R version 4.5.2

library(ResourceSelection)

## Warning: package 'ResourceSelection' was built under R version 4.5.3

## ResourceSelection 0.3-6   2023-06-27

library(pROC)

## Warning: package 'pROC' was built under R version 4.5.3

## Type 'citation("pROC")' for a citation.
## 
## Attaching package: 'pROC'
## 
## The following objects are masked from 'package:stats':
## 
##     cov, smooth, var

library(performance)

## Warning: package 'performance' was built under R version 4.5.3

brfss_logistic <- readRDS(
  "C:/Users/abbym/OneDrive/Desktop/STATS553/R Materials/epi553/scripts/brfss_logistic.rds"
)

Task 1: Build a Multiple Logistic Regression Model (15 points)

mod_full <- glm(
  fmd ~ exercise + smoker + age + sex + sleep_hrs,
  data = brfss_logistic,
  family = binomial(link = "logit")
)

mod_full |>
  tbl_regression(
    exponentiate = TRUE,
    label = list(
      exercise      ~ "Exercise (past 30 days)",
      smoker        ~ "Smoking status",
      age           ~ "Age (per year)",
      sex           ~ "Sex",
      sleep_hrs     ~ "Sleep hours"
    )
  ) |>
  bold_labels() |>
  bold_p()

Characteristic	OR	95% CI	p-value
Exercise (past 30 days)
No	—	—
Yes	0.54	0.46, 0.65	<0.001
Smoking status
Former/Never	—	—
Current	1.45	1.23, 1.71	<0.001
Age (per year)	0.98	0.97, 0.98	<0.001
Sex
Male	—	—
Female	1.77	1.51, 2.08	<0.001
Sleep hours	0.81	0.76, 0.85	<0.001
Abbreviations: CI = Confidence Interval, OR = Odds Ratio

1a. (5 pts) Fit a multiple logistic regression model predicting fmd from at least 5 predictors of your choice.

1b. (5 pts) Report the adjusted ORs with 95% CIs in a publication-quality table using tbl_regression().

1c. (5 pts) Interpret the adjusted OR for two predictors of your choice in 1-2 sentences each. Make sure to mention what the OR represents (per unit change for continuous; reference category for categorical). Compared to those that don’t smoke, those that do smoke have 45% increased odds of having frequent mental distress, holding all other variables constant. For every one additional hour of sleep, the odds of having frequent mental distress decreases by 19%, holding all other variables constant. —

Task 2: Wald and Likelihood Ratio Tests (15 points)

tidy(mod_full, conf.int = TRUE, exponentiate = TRUE) |>
  mutate(across(c(estimate, std.error, statistic, conf.low, conf.high),
                \(x) round(x, 3)),
         p.value = format.pval(p.value, digits = 3)) |>
  kable(caption = "Wald Tests for Each Coefficient (Exponentiated)") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

Wald Tests for Each Coefficient (Exponentiated)
term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	2.788	0.246	4.175	2.98e-05	1.723	4.514
exerciseYes	0.544	0.088	-6.921	4.48e-12	0.458	0.646
smokerCurrent	1.451	0.085	4.381	1.18e-05	1.228	1.713
age	0.976	0.003	-9.107	< 2e-16	0.971	0.981
sexFemale	1.771	0.082	6.935	4.07e-12	1.507	2.082
sleep_hrs	0.807	0.028	-7.813	5.56e-15	0.764	0.851

mod_reduced <- glm(
  fmd ~ exercise + smoker + age + sex,
  data = brfss_logistic,
  family = binomial
)

anova(mod_reduced, mod_full, test = "LRT") |>
  kable(digits = 3,
        caption = "LR Test: Does adding exercise + smoker improve the model?") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

LR Test: Does adding exercise + smoker improve the model?
Resid. Df	Resid. Dev	Df	Deviance	Pr(>Chi)
4995	4000.588	NA	NA	NA
4994	3938.021	1	62.567	0

2a. (5 pts) Identify the Wald p-value for each predictor in your model from the tidy() or summary() output.

2b. (5 pts) Fit a reduced model that drops one predictor of your choice. Perform a likelihood ratio test comparing the full and reduced models using anova(reduced, full, test = "LRT").

2c. (5 pts) Compare the conclusions from the Wald test (for the dropped predictor) and the LR test. Do they agree? In 2-3 sentences, explain when the two tests might disagree. The Wald test p-value for sleep_hrs was 5.56e-15 and the LR test p-value was 0. This means that both tests show that sleep_hrs improves the model. The two tests might disagree when the other variables have an impact on the variable that has been dropped. Sometimes a single variable alone doesn’t provide significant information, but when combined with other variables in a model, it does provide significant information. —

Task 3: Test an Interaction (20 points)

mod_int <- glm(
  fmd ~ exercise * smoker + age + sex + sleep_hrs,
  data = brfss_logistic,
  family = binomial(link = "logit")
)

anova(mod_int, mod_full, test = "LRT") |>
  kable(digits = 3,
        caption = "LR Test: Does adding exercise + smoker interaction improve the model?") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

LR Test: Does adding exercise + smoker interaction improve the model?
Resid. Df	Resid. Dev	Df	Deviance	Pr(>Chi)
4993	3936.635	NA	NA	NA
4994	3938.021	-1	-1.386	0.239

ggpredict(mod_int, terms = c("exercise", "smoker")) |>
  plot() +
  labs(title = "Predicted Probability of FMD by Exercise and Smoker",
       x = "Exercise", y = "Predicted Probability of FMD",
       color = "Smoker") +
  theme_minimal()

## Ignoring unknown labels:
## • linetype : "smoker"
## • shape : "smoker"

3a. (5 pts) Fit a model that includes an interaction between two predictors of your choice (e.g., exercise * sex or smoker * age).

3b. (5 pts) Perform a likelihood ratio test comparing the model with the interaction to the model without it.

3c. (5 pts) Create a visualization of the interaction using ggpredict() and plot().

3d. (5 pts) In 3-4 sentences, interpret the interaction. Does the effect of one predictor differ across levels of the other? If statistically significant, report the stratum-specific odds ratios. The effect of exercise on frequent mental distress does not differ by sex, and vice versa. The p-value of the likelihood ratio test is 0.239, meaning the interaction is not significant. This means it does not improve or provide significant information to the model. —

Task 4: Goodness-of-Fit and Discrimination (25 points)

performance::r2_mcfadden(mod_full)

## # R2 for Generalized Linear Regression
##        R2: 0.074
##   adj. R2: 0.073

hl_test <- hoslem.test(
  x = as.numeric(brfss_logistic$fmd) - 1,
  y = fitted(mod_full),
  g = 10
)

hl_test

## 
##  Hosmer and Lemeshow goodness of fit (GOF) test
## 
## data:  as.numeric(brfss_logistic$fmd) - 1, fitted(mod_full)
## X-squared = 17.02, df = 8, p-value = 0.0299

brfss_logistic |>
  mutate(pred_prob = fitted(mod_full),
         obs_outcome = as.numeric(fmd) - 1,
         decile = ntile(pred_prob, 10)) |>
  group_by(decile) |>
  summarise(
    mean_pred = mean(pred_prob),
    mean_obs  = mean(obs_outcome),
    .groups = "drop"
  ) |>
  ggplot(aes(x = mean_pred, y = mean_obs)) +
  geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
  geom_point(size = 3, color = "steelblue") +
  geom_line(color = "steelblue") +
  labs(title = "Calibration Plot: Observed vs. Predicted Probability of FMD",
       subtitle = "Points should fall on the dashed line for perfect calibration",
       x = "Mean Predicted Probability (within decile)",
       y = "Observed Proportion (within decile)") +
  theme_minimal()

roc_obj <- roc(
  response = brfss_logistic$fmd,
  predictor = fitted(mod_full),
  levels = c("No", "Yes"),
  direction = "<"
)

auc_value <- auc(roc_obj)

ggroc(roc_obj, color = "steelblue", linewidth = 1.2) +
  geom_abline(slope = 1, intercept = 1, linetype = "dashed", color = "red") +
  labs(title = "ROC Curve for Frequent Mental Distress Model",
       subtitle = paste0("AUC = ", round(auc_value, 3)),
       x = "Specificity", y = "Sensitivity") +
  theme_minimal()

4a. (5 pts) Compute McFadden’s pseudo-R² for your full model using performance::r2_mcfadden().

4b. (5 pts) Perform the Hosmer-Lemeshow goodness-of-fit test using ResourceSelection::hoslem.test(). Report the test statistic and p-value. Comment on the interpretation given your sample size. The Hosmer-Lemeshow test had a p-value of 0.0299, which means that the model does not fit well in some of the regions of predicted probability, but since we are using a large sample size, the test could be overpowered and this p-value could be representative of very trivial and small shifts. 4c. (5 pts) Create a calibration plot showing observed vs. predicted probabilities by decile. Comment on whether the model appears well calibrated. The model looks fairly well calibrated. It follows the 45 degree line pretty well, with only a few more major deviations from it. 4d. (10 pts) Compute and plot the ROC curve using pROC::roc(). Report the AUC. Based on the AUC value, how would you describe the model’s discrimination ability (poor, acceptable, excellent, outstanding)? The AUC is 0.695. This means that the model’s discrimination ability is poor. —

End of Lab Activity

Lab13_LogisticRegression2_AbigailMiller

Abigail Miller

2026-04-16