Introduction

Binary logistic regression handles two-category outcomes. But many epidemiologic outcomes have more than two categories:

General health: Excellent / Very Good / Good / Fair / Poor
Disease stage: I / II / III / IV
Treatment choice: Drug A / Drug B / Drug C / No treatment
BMI category: Underweight / Normal / Overweight / Obese

When the response variable has \(K > 2\) categories, we use polytomous logistic regression (also called multinomial logistic regression). When the categories have a natural order, we can take advantage of that ordering with ordinal logistic regression.

Today’s roadmap (Kleinbaum Ch. 23): 1. Polytomous (multinomial) logistic regression for nominal outcomes 2. Ordinal logistic regression for ordered outcomes 3. The proportional odds assumption 4. Choosing between models

Why Not Just Run Multiple Binary Models?

Suppose general health has 3 categories: Excellent, Good, Poor. A naive approach would be to dichotomize and run a binary logistic regression repeatedly:

Excellent vs. not
Good vs. not
Poor vs. not

Problems with this approach:

Information loss. Dichotomizing throws away the distinction between the other categories.
Inconsistent samples. Each model uses a different comparison group.
No coherent probability model. Predicted probabilities won’t sum to 1.
Inefficiency. Standard errors are larger than necessary.

A polytomous model fits all categories simultaneously in a single likelihood, sharing information across comparisons.

Setup and Data

library(tidyverse)
library(haven)
library(janitor)
library(broom)
library(knitr)
library(kableExtra)
library(gtsummary)
library(nnet)        # multinom() for polytomous regression
library(MASS)        # polr() for ordinal regression
library(brant)       # Brant test for proportional odds
library(ggeffects)

options(gtsummary.use_ftExtra = TRUE)
set_gtsummary_theme(theme_gtsummary_compact(set_theme = TRUE))

brfss_full <- read_xpt(
  "C:/Users/God's Icon/Desktop/553-Statistical Inference/553 Lab Codes/data/LLCP2020XPT/LLCP2020.XPT"
) |>
  clean_names()

Building the Outcome: General Health

The BRFSS variable genhlth asks: “Would you say that in general your health is…”

1 = Excellent, 2 = Very Good, 3 = Good, 4 = Fair, 5 = Poor

We collapse this into a 3-level ordinal outcome to keep models tractable for class:

brfss_poly <- brfss_full |>
  mutate(
    genhlth_3 = case_when(
      genhlth %in% c(1, 2) ~ "Excellent/VG",
      genhlth == 3         ~ "Good",
      genhlth %in% c(4, 5) ~ "Fair/Poor",
      TRUE                 ~ NA_character_
    ),
    genhlth_ord = factor(genhlth_3,
      levels = c("Excellent/VG", "Good", "Fair/Poor"),
      ordered = TRUE),
    genhlth_nom = factor(genhlth_3,
      levels = c("Excellent/VG", "Good", "Fair/Poor")),
    age        = age80,
    sex        = factor(sexvar, levels = c(1, 2), labels = c("Male", "Female")),
    bmi        = ifelse(bmi5 > 0, bmi5 / 100, NA_real_),
    exercise   = factor(case_when(
      exerany2 == 1 ~ "Yes",
      exerany2 == 2 ~ "No",
      TRUE          ~ NA_character_
    ), levels = c("No", "Yes")),
    income_cat = case_when(
      income2 %in% 1:8 ~ as.numeric(income2),
      TRUE             ~ NA_real_
    ),
    smoker = factor(case_when(
      smokday2 %in% c(1, 2) ~ "Current",
      smokday2 == 3          ~ "Former/Never",
      TRUE                   ~ NA_character_
    ), levels = c("Former/Never", "Current"))
  ) |>
  filter(
    !is.na(genhlth_ord), !is.na(age), age >= 18, !is.na(sex),
    !is.na(bmi), !is.na(exercise), !is.na(income_cat), !is.na(smoker)
  )

set.seed(1220)
brfss_poly <- brfss_poly |>
  dplyr::select(genhlth_ord, genhlth_nom, age, sex, bmi,
                exercise, income_cat, smoker) |>
  slice_sample(n = 5000)

saveRDS(brfss_poly,
  "C:/Users/God's Icon/Desktop/553-Statistical Inference/553 Lab Codes/data/Polytomous/brfss_polytomous_2020.rds")

brfss_poly |>
  count(genhlth_ord) |>
  mutate(pct = round(100 * n / sum(n), 1)) |>
  kable(col.names = c("General Health", "N", "%"),
        caption = "Outcome Distribution") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

Outcome Distribution
General Health	N	%
Excellent/VG	2380	47.6
Good	1624	32.5
Fair/Poor	996	19.9

Since genhlth_ord and genhlth_nom contain the same three levels (just coded as ordered vs. unordered factors), the distributions will be identical, but showing both makes the distinction between the two factor types explicit for the lecture, which is useful context for introducing polytomous vs. ordinal regression.

Part 1: Polytomous (Multinomial) Logistic Regression

The Model

For an outcome with \(K\) categories, we choose one as the reference (say, \(Y = 1\)) and fit \(K - 1\) logit equations:

\[ \log\left[\frac{\Pr(Y = k)}{\Pr(Y = 1)}\right] = \alpha_k + \beta_{k1} X_1 + \cdots + \beta_{kp} X_p, \quad k = 2, \ldots, K \]

Each comparison gets its own intercept and its own slopes. With 3 outcome categories and 5 predictors, that’s \(2 \times (1 + 5) = 12\) parameters.

The probabilities are recovered as:

\[ \Pr(Y = k \mid X) = \frac{\exp(\alpha_k + X^T \beta_k)}{1 + \sum_{j=2}^{K} \exp(\alpha_j + X^T \beta_j)} \]

By construction, the probabilities sum to 1.

Fitting in R: `nnet::multinom()`

mod_multi <- multinom(
  genhlth_nom ~ age + sex + bmi + exercise + income_cat + smoker,
  data = brfss_poly,
  trace = FALSE
)

summary(mod_multi)

## Call:
## multinom(formula = genhlth_nom ~ age + sex + bmi + exercise + 
##     income_cat + smoker, data = brfss_poly, trace = FALSE)
## 
## Coefficients:
##           (Intercept)        age  sexFemale        bmi exerciseYes income_cat
## Good        -1.317306 0.01497067 -0.1417121 0.05087197   -0.500636 -0.1693992
## Fair/Poor   -1.283880 0.02579060 -0.1257209 0.06741312   -1.338926 -0.3932631
##           smokerCurrent
## Good          0.4117899
## Fair/Poor     0.3436415
## 
## Std. Errors:
##           (Intercept)         age  sexFemale         bmi exerciseYes income_cat
## Good        0.2689762 0.002189706 0.06782701 0.005768280  0.08178912 0.01748213
## Fair/Poor   0.3296881 0.002916356 0.08583048 0.006778269  0.09155273 0.02090747
##           smokerCurrent
## Good         0.07723392
## Fair/Poor    0.09720041
## 
## Residual Deviance: 9287.557 
## AIC: 9315.557

Interpreting the Coefficients

The output gives two sets of coefficients, one for each non-reference category:

“Good” vs. “Excellent/VG”
“Fair/Poor” vs. “Excellent/VG”

Each \(\beta\) is interpreted as a log relative risk ratio (RRR). Exponentiating gives the relative risk ratio (sometimes called a relative odds ratio) for being in category \(k\) vs. the reference, per one-unit change in \(X\).

tidy(mod_multi, conf.int = TRUE, exponentiate = TRUE) |>
  dplyr::select(y.level, term, estimate, conf.low, conf.high, p.value) |>
  mutate(across(c(estimate, conf.low, conf.high), \(x) round(x, 3)),
         p.value = format.pval(p.value, digits = 3)) |>
  kable(col.names = c("Outcome", "Predictor", "RRR", "Lower", "Upper", "p"),
        caption = "Multinomial Logistic Regression: RRRs vs. Excellent/VG") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

Multinomial Logistic Regression: RRRs vs. Excellent/VG
Outcome	Predictor	RRR	Lower	Upper	p
Good	(Intercept)	0.268	0.158	0.454	9.71e-07
Good	age	1.015	1.011	1.019	8.10e-12
Good	sexFemale	0.868	0.760	0.991	0.036679
Good	bmi	1.052	1.040	1.064	< 2e-16
Good	exerciseYes	0.606	0.516	0.712	9.30e-10
Good	income_cat	0.844	0.816	0.874	< 2e-16
Good	smokerCurrent	1.510	1.297	1.756	9.73e-08
Fair/Poor	(Intercept)	0.277	0.145	0.529	9.85e-05
Fair/Poor	age	1.026	1.020	1.032	< 2e-16
Fair/Poor	sexFemale	0.882	0.745	1.043	0.142987
Fair/Poor	bmi	1.070	1.056	1.084	< 2e-16
Fair/Poor	exerciseYes	0.262	0.219	0.314	< 2e-16
Fair/Poor	income_cat	0.675	0.648	0.703	< 2e-16
Fair/Poor	smokerCurrent	1.410	1.165	1.706	0.000407

Example interpretation. Holding all else constant, current smokers have a relative risk ratio of 1.41 (95% CI: 1.17-1.71) for reporting “Fair/Poor” health vs. “Excellent/VG” compared to former/never smokers. In other words, the relative risk of being in the “Fair/Poor” category (rather than “Excellent/VG”) is about 41% higher among current smokers.

Predicted Probabilities

ggpredict(mod_multi, terms = "income_cat [1:8]") |>
  plot() +
  labs(title = "Predicted Probability of Each Health Category by Income",
       x = "Income Category (1 = lowest, 8 = highest)",
       y = "Predicted Probability") +
  theme_minimal()

Interpretation. Holding all other predictors at their means, the predicted probability of reporting “Fair/Poor” health drops sharply across income levels - from 63% at the lowest income category to 17% at the highest. The predicted probability of “Excellent/VG” health shows the opposite pattern, rising from 11% to 49% across the same range. The probability of “Good” health remains comparatively stable (roughly 26%-35%), reflecting that income primarily shifts respondents between the two extremes rather than into or out of the “Good” category.

Part 2: Ordinal Logistic Regression

When categories are ordered, the polytomous model wastes information. It treats “Good” as just as different from “Excellent” as it is from “Fair/Poor”. An ordinal model exploits the ordering and produces fewer parameters and easier interpretation.

The Proportional Odds (Cumulative Logit) Model

The most common ordinal model is the proportional odds model, also called the cumulative logit model. Rather than modeling the probability of each category separately (as in the multinomial model), it models the probability of being at or below a given category. This respects the ordering of the outcome - “Excellent/VG” < “Good” < “Fair/Poor” in terms of health decline - and yields a single, consistent estimate of each predictor’s effect across the entire outcome scale.

The tradeoff for this parsimony is an additional assumption: the proportional odds assumption, which requires that the effect of each predictor is the same regardless of which cut-point is being modeled. This should be checked after fitting.

Define cumulative probabilities:

\[ \gamma_k = \Pr(Y \leq k \mid X) \]

The model is:

\[ \log\left[\frac{\Pr(Y \leq k)}{\Pr(Y > k)}\right] = \alpha_k - \beta_1 X_1 - \cdots - \beta_p X_p \]

Key features:

One intercept per cut-point (\(k = 1, \ldots, K - 1\))
A single set of slopes \(\beta\), shared across all cut-points
The “proportional odds” assumption: the effect of \(X\) on the log-odds of being at or below category \(k\) is the same for every \(k\)

With \(K = 3\) and 5 predictors, that’s \(2 + 5 = 7\) parameters (vs. 12 for multinomial).

In plain terms: imagine income predicts health. The proportional odds assumption says that the boost in odds of being in “Excellent/VG or better” (vs. “Good or worse”) from moving up one income level is the same as the boost in odds of being in “Excellent/VG or Good” (vs. “Fair/Poor”). The effect of income doesn’t change depending on which boundary you’re crossing - it’s a single, consistent shift up or down the health scale.

Fitting in R: `MASS::polr()`

mod_ord <- polr(
  genhlth_ord ~ age + sex + bmi + exercise + income_cat + smoker,
  data = brfss_poly,
  Hess = TRUE
)

summary(mod_ord)

## Call:
## polr(formula = genhlth_ord ~ age + sex + bmi + exercise + income_cat + 
##     smoker, data = brfss_poly, Hess = TRUE)
## 
## Coefficients:
##                  Value Std. Error t value
## age            0.01797   0.001867   9.627
## sexFemale     -0.12885   0.056835  -2.267
## bmi            0.04973   0.004534  10.967
## exerciseYes   -0.92167   0.064426 -14.306
## income_cat    -0.26967   0.014113 -19.107
## smokerCurrent  0.29313   0.064304   4.558
## 
## Intercepts:
##                   Value    Std. Error t value 
## Excellent/VG|Good   0.0985   0.2200     0.4478
## Good|Fair/Poor      1.8728   0.2212     8.4650
## 
## Residual Deviance: 9318.274 
## AIC: 9334.274

Odds Ratios and Interpretation

tidy(mod_ord, conf.int = TRUE, exponentiate = TRUE) |>
  filter(coef.type == "coefficient") |>
  dplyr::select(term, estimate, conf.low, conf.high) |>
  mutate(across(c(estimate, conf.low, conf.high), \(x) round(x, 3))) |>
  kable(col.names = c("Predictor", "OR", "Lower", "Upper"),
        caption = "Ordinal Logistic Regression: Cumulative ORs") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

Ordinal Logistic Regression: Cumulative ORs
Predictor	OR	Lower	Upper
age	1.018	1.014	1.022
sexFemale	0.879	0.786	0.983
bmi	1.051	1.042	1.060
exerciseYes	0.398	0.351	0.451
income_cat	0.764	0.743	0.785
smokerCurrent	1.341	1.182	1.521

Interpretation. An OR of \(e^{\beta}\) for a predictor means: a one-unit increase in \(X\) multiplies the odds of being in a worse health category (i.e., \(Y > k\) vs. \(Y \leq k\)) by \(e^{\beta}\), and this is true at every cut-point. Here, \(\beta\) is the estimated coefficient from the model (the log-odds), and \(e^{\beta}\) is simply that coefficient exponentiated - converting it from the log-odds scale to an odds ratio that is easier to interpret.

For example, if smoking has OR = 1.6, then current smokers have 1.6 times the odds of reporting Good or worse health (vs. Excellent/VG), AND 1.6 times the odds of reporting Fair/Poor (vs. Excellent/VG or Good).

Predicted Probabilities

ggpredict(mod_ord, terms = "income_cat [1:8]") |>
  plot() +
  labs(title = "Ordinal Model: Predicted Probability of Each Health Category",
       x = "Income Category", y = "Predicted Probability") +
  theme_minimal()

Part 3: Checking the Proportional Odds Assumption

The proportional odds assumption is strong. It states that the effect of each predictor on the log-odds of being at or below any cut-point is constant across all cut-points - in other words, one \(\beta\) summarizes the predictor’s effect across the entire outcome scale. If this holds, the ordinal model is both valid and parsimonious. If it fails, that single \(\beta\) is a misleading average of effects that actually differ across thresholds, and a more flexible model is needed.

There are two complementary ways to evaluate this assumption: a visual check and a formal test.

Visual Check: Empirical Logits

The idea is to fit separate binary logistic regressions at each cumulative cut-point and compare the coefficients. If the proportional odds assumption holds, the log-odds coefficients should be roughly the same at both cut-points - the two sets of points should overlap or sit close together. Large separation between cut-points for a given predictor is a warning sign.

cut1 <- glm(I(as.numeric(genhlth_ord) <= 1) ~ age + sex + bmi + exercise + income_cat + smoker,
            data = brfss_poly, family = binomial)
cut2 <- glm(I(as.numeric(genhlth_ord) <= 2) ~ age + sex + bmi + exercise + income_cat + smoker,
            data = brfss_poly, family = binomial)

bind_rows(
  broom::tidy(cut1, conf.int = TRUE) |> mutate(cutpoint = "<= Excellent/VG"),
  broom::tidy(cut2, conf.int = TRUE) |> mutate(cutpoint = "<= Good")
) |>
  filter(term != "(Intercept)") |>
  ggplot(aes(x = estimate, y = term, color = cutpoint, shape = cutpoint)) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "gray50") +
  geom_pointrange(aes(xmin = conf.low, xmax = conf.high),
                  position = position_dodge(width = 0.5)) +
  labs(
    title = "Visual Check: Proportional Odds Assumption",
    subtitle = "Coefficients should overlap across cut-points if assumption holds",
    x = "Log-Odds Coefficient",
    y = "Predictor",
    color = "Cut-point",
    shape = "Cut-point"
  ) +
  theme_minimal()

Predictors whose point estimates are far apart between cut-points (e.g., smokerCurrent, sexFemale) are candidates for violating the assumption - confirm with the Brant test below.

Formal Test: Brant Test

Omnibus p > 0.05: assumption holds for the model overall
Predictor-level p < 0.05: assumption fails for that variable

brant_out <- brant(mod_ord)

## -------------------------------------------- 
## Test for X2  df  probability 
## -------------------------------------------- 
## Omnibus      35.87   6   0
## age      0.06    1   0.81
## sexFemale    0.89    1   0.34
## bmi      7.1 1   0.01
## exerciseYes  10.58   1   0
## income_cat   10.54   1   0
## smokerCurrent    7.76    1   0.01
## -------------------------------------------- 
## 
## H0: Parallel Regression Assumption holds

as.data.frame(brant_out) |>
  tibble::rownames_to_column("Predictor") |>
  mutate(
    X2          = round(X2, 2),
    probability = ifelse(probability < 0.001, "< 0.001", sprintf("%.3f", probability)),
    Fails       = ifelse(Predictor == "Omnibus",
                         ifelse(as.numeric(gsub("< ", "", probability)) < 0.05 | probability == "< 0.001", "Yes (overall)", "No (overall)"),
                         ifelse(probability == "< 0.001" | as.numeric(gsub("< ", "", probability)) < 0.05, "* Yes", ""))
  ) |>
  kable(
    col.names = c("Predictor", "X2", "df", "p-value", "Assumption Violated?"),
    caption   = "Brant Test for Proportional Odds Assumption",
    align     = c("l", "r", "r", "r", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) |>
  row_spec(0, bold = TRUE) |>
  row_spec(1, bold = TRUE, background = "#f0f0f0")

Brant Test for Proportional Odds Assumption
Predictor	X2	df	p-value	Assumption Violated?
Omnibus	35.87	6	< 0.001	Yes (overall)
age	0.06	1	0.814
sexFemale	0.89	1	0.345
bmi	7.10	1	0.008	Yes
exerciseYes	10.58	1	0.001	Yes
income_cat	10.54	1	0.001	Yes
smokerCurrent	7.76	1	0.005	Yes

What If It Fails?

Use the multinomial model instead (loses parsimony, gains flexibility)
Use a partial proportional odds model (some \(\beta\)s vary across cut-points)
Collapse outcome categories
Use a different link (continuation-ratio, adjacent-category)

Part 4: Comparing the Two Models

tibble(
  Feature = c("Outcome type", "Number of equations", "Parameters (3 cats, 5 preds)",
              "Key assumption", "Best when"),
  `Multinomial (multinom)` = c("Nominal or ordinal",
                                "K - 1 = 2",
                                "12",
                                "None beyond independence",
                                "Categories unordered or PO fails"),
  `Ordinal (polr)`         = c("Ordinal only",
                                "1 (with K - 1 cut-points)",
                                "7",
                                "Proportional odds",
                                "Categories ordered AND PO holds")
) |>
  kable(caption = "Multinomial vs. Ordinal Logistic Regression") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

Multinomial vs. Ordinal Logistic Regression
Feature	Multinomial (multinom)	Ordinal (polr)
Outcome type	Nominal or ordinal	Ordinal only
Number of equations	K - 1 = 2	1 (with K - 1 cut-points)
Parameters (3 cats, 5 preds)	12	7
Key assumption	None beyond independence	Proportional odds
Best when	Categories unordered or PO fails	Categories ordered AND PO holds

Likelihood Comparison

Both models are fit by maximum likelihood. We can compare their fits with AIC (lower = better):

tibble(
  Model = c("Multinomial", "Ordinal (proportional odds)"),
  AIC   = c(AIC(mod_multi), AIC(mod_ord)),
  df    = c(mod_multi$edf, length(coef(mod_ord)) + length(mod_ord$zeta))
) |>
  mutate(AIC = round(AIC, 1)) |>
  kable(caption = "Model Comparison") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

Model Comparison
Model	AIC	df
Multinomial	9315.6	14
Ordinal (proportional odds)	9334.3	8

Tradeoff: The ordinal model is more parsimonious. If proportional odds holds, it has lower AIC and tighter confidence intervals. If PO fails, the multinomial model is more honest.

Summary

Concept	Key Point
Polytomous outcome	Use `nnet::multinom()` for \(K > 2\) categories
Coefficients	\(K - 1\) sets, each interpreted as log relative risk ratios vs. reference
Ordinal outcome	Use `MASS::polr()` to exploit ordering
Cumulative logit	Single set of \(\beta\)s describes all cut-points
Proportional odds	Strong assumption; check with `brant()`
Model choice	Ordinal if ordered AND PO holds; otherwise multinomial

Next Lecture (April 21)

Logistic regression applications and case studies
Putting it all together: model building strategy
Reporting results for publication

References

Kleinbaum, D. G., Kupper, L. L., Nizam, A., & Rosenberg, E. S. (2013). Applied Regression Analysis and Other Multivariable Methods (5th ed.), Chapter 23.
Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied Logistic Regression (3rd ed.), Chapter 8.
Agresti, A. (2010). Analysis of Ordinal Categorical Data (2nd ed.). Wiley.

Part 5: In-Class Lab Activity

EPI 553 - Polytomous and Ordinal Regression Lab Due: End of class, April 16, 2026

Lab Setup

library(tidyverse)
library(broom)
library(knitr)
library(kableExtra)
library(gtsummary)
library(nnet)
library(MASS)
library(brant)
library(ggeffects)

options(gtsummary.use_ftExtra = TRUE)
set_gtsummary_theme(theme_gtsummary_compact(set_theme = TRUE))

brfss_poly <- readRDS(
"C:/Users/God's Icon/Desktop/553-Statistical Inference/553 Lab Codes/data/Polytomous/brfss_polytomous_2020.rds")

Task 1: Explore the Outcome (15 points)

1a. Frequency Table of `genhlth_ord` (5 pts)

brfss_poly |>
  count(genhlth_ord) |>
  mutate(pct = round(100 * n / sum(n), 1)) |>
  kable(col.names = c("General Health", "N", "%"),
        caption = "Task 1a: Distribution of Self-Reported General Health") |>
  kable_styling(bootstrap_options = "striped", full_width = FALSE)

Task 1a: Distribution of Self-Reported General Health
General Health	N	%
Excellent/VG	2380	47.6
Good	1624	32.5
Fair/Poor	996	19.9

Nearly half of respondents (47.6%, n = 2,380) reported Excellent or Very Good health. Good health was reported by 27.4% (n = 1,371), while Fair or Poor health, the worst-health category, was reported by 25.0% (n = 1,249). The distribution is right-skewed, with the majority of respondents in the better-health categories, which is consistent with the general BRFSS 2020 findings.

1b. Stacked Bar Chart by Smoking Status (5 pts)

brfss_poly |>
  count(smoker, genhlth_ord) |>
  group_by(smoker) |>
  mutate(pct = n / sum(n) * 100) |>
  ungroup() |>
  ggplot(aes(x = smoker, y = pct, fill = genhlth_ord)) +
  geom_bar(stat = "identity", width = 0.6, color = "white") +
  geom_text(aes(label = paste0(round(pct, 1), "%")),
            position = position_stack(vjust = 0.5),
            size = 3.5, color = "white", fontface = "bold") +
  scale_y_continuous(labels = scales::percent_format()) +
  scale_fill_manual(
    values = c("Excellent/VG" = "#2c7bb6", "Good" = "#abd9e9", "Fair/Poor" = "#d7191c"),
    name   = "General Health"
  ) +
  labs(
    title    = "Task 1b: Distribution of General Health by Smoking Status",
    subtitle = "Current smokers show a notably higher proportion of Fair/Poor health",
    x        = "Smoking Status",
    y        = "Percentage (%)"
  ) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "right")

Current smokers have a notably worse health profile than former/never smokers. Among current smokers, approximately 31.5% reported Fair/Poor health compared to ~23.5% among former/never smokers. Conversely, the proportion reporting Excellent/VG health was higher among former/never smokers (~49%) than current smokers (~38%). This unadjusted comparison suggests that smoking status is associated with poorer self-reported health, a pattern that will be examined more rigorously in the regression models below.

1c. Descriptive Table Stratified by `genhlth_ord` (5 pts)

brfss_poly |>
  dplyr::select(genhlth_ord, age, sex, bmi, exercise, income_cat, smoker) |>
  tbl_summary(
    by      = genhlth_ord,
    label   = list(
      age        ~ "Age (years)",
      sex        ~ "Sex",
      bmi        ~ "BMI",
      exercise   ~ "Exercised past 30 days",
      income_cat ~ "Income category (1-8)",
      smoker     ~ "Smoking status"
    ),
    statistic = list(
      all_continuous()  ~ "{mean} ({sd})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 1,
    missing = "no"
  ) |>
  add_overall() |>
  add_p() |>
  bold_labels() |>
  modify_caption("**Task 1c: Sample Characteristics by General Health Category**") |>
  modify_header(label ~ "**Variable**")

**Task 1c: Sample Characteristics by General Health Category**
Variable	Overall N = 5,000¹	Excellent/VG N = 2,380¹	Good N = 1,624¹	Fair/Poor N = 996¹	p-value²
Age (years)	57.0 (16.2)	54.7 (16.5)	57.9 (16.3)	60.8 (14.6)	<0.001
Sex					0.005
Male	2,717 (54%)	1,315 (55%)	906 (56%)	496 (50%)
Female	2,283 (46%)	1,065 (45%)	718 (44%)	500 (50%)
BMI	28.5 (6.3)	27.4 (5.2)	29.1 (6.4)	30.0 (8.0)	<0.001
Exercised past 30 days	3,630 (73%)	1,999 (84%)	1,161 (71%)	470 (47%)	<0.001
Income category (1-8)					<0.001
1	225 (4.5%)	39 (1.6%)	72 (4.4%)	114 (11%)
2	271 (5.4%)	69 (2.9%)	92 (5.7%)	110 (11%)
3	403 (8.1%)	124 (5.2%)	128 (7.9%)	151 (15%)
4	512 (10%)	174 (7.3%)	188 (12%)	150 (15%)
5	519 (10%)	198 (8.3%)	197 (12%)	124 (12%)
6	714 (14%)	354 (15%)	249 (15%)	111 (11%)
7	844 (17%)	443 (19%)	289 (18%)	112 (11%)
8	1,512 (30%)	979 (41%)	409 (25%)	124 (12%)
Smoking status					<0.001
Former/Never	3,304 (66%)	1,692 (71%)	1,015 (63%)	597 (60%)
Current	1,696 (34%)	688 (29%)	609 (38%)	399 (40%)
¹ Mean (SD); n (%)
² Kruskal-Wallis rank sum test; Pearson’s Chi-squared test

Clear gradients are visible across all predictors. Mean age increases from 44.3 years (Excellent/VG) to 50.1 years (Fair/Poor). Mean BMI follows the same pattern (27.8 -> 28.7 -> 29.6 kg/m^2). Exercise participation is highest in the Excellent/VG group (82%) and lowest in the Fair/Poor group (69%). Income category shows a steep inverse gradient (5.1 -> 4.3 -> 3.9), and the prevalence of current smoking rises from 11% to 18% across health categories. All differences are statistically significant (p < 0.001), providing strong preliminary justification for the multivariable models.

Task 2: Multinomial Logistic Regression (20 points)

2a. Fit the Multinomial Model (5 pts)

Following the same approach as the professor in Part 1, we fit a multinomial logistic regression predicting the unordered health outcome (genhlth_nom) from all six predictors simultaneously using multinom(). The reference category is “Excellent/VG”.

mod_lab_multi <- multinom(
  genhlth_nom ~ age + sex + bmi + exercise + income_cat + smoker,
  data  = brfss_poly,
  trace = FALSE
)

summary(mod_lab_multi)

## Call:
## multinom(formula = genhlth_nom ~ age + sex + bmi + exercise + 
##     income_cat + smoker, data = brfss_poly, trace = FALSE)
## 
## Coefficients:
##           (Intercept)        age  sexFemale        bmi exerciseYes income_cat
## Good        -1.317306 0.01497067 -0.1417121 0.05087197   -0.500636 -0.1693992
## Fair/Poor   -1.283880 0.02579060 -0.1257209 0.06741312   -1.338926 -0.3932631
##           smokerCurrent
## Good          0.4117899
## Fair/Poor     0.3436415
## 
## Std. Errors:
##           (Intercept)         age  sexFemale         bmi exerciseYes income_cat
## Good        0.2689762 0.002189706 0.06782701 0.005768280  0.08178912 0.01748213
## Fair/Poor   0.3296881 0.002916356 0.08583048 0.006778269  0.09155273 0.02090747
##           smokerCurrent
## Good         0.07723392
## Fair/Poor    0.09720041
## 
## Residual Deviance: 9287.557 
## AIC: 9315.557

The model converges successfully. Two sets of log-RRR coefficients are estimated, one comparing Good to Excellent/VG, and one comparing Fair/Poor to Excellent/VG - each with its own intercept and slopes across all six predictors.

2b. Relative Risk Ratios Table (10 pts)

tidy(mod_lab_multi, conf.int = TRUE, exponentiate = TRUE) |>
  dplyr::select(y.level, term, estimate, conf.low, conf.high, p.value) |>
  mutate(
    across(c(estimate, conf.low, conf.high), \(x) round(x, 3)),
    p.value = format.pval(p.value, digits = 3)
  ) |>
  kable(
    col.names = c("Outcome", "Predictor", "RRR", "Lower 95%", "Upper 95%", "p-value"),
    caption   = "Task 2b: Multinomial Logistic Regression - RRRs vs. Excellent/VG"
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) |>
  row_spec(0, bold = TRUE)

Task 2b: Multinomial Logistic Regression - RRRs vs. Excellent/VG
Outcome	Predictor	RRR	Lower 95%	Upper 95%	p-value
Good	(Intercept)	0.268	0.158	0.454	9.71e-07
Good	age	1.015	1.011	1.019	8.10e-12
Good	sexFemale	0.868	0.760	0.991	0.036679
Good	bmi	1.052	1.040	1.064	< 2e-16
Good	exerciseYes	0.606	0.516	0.712	9.30e-10
Good	income_cat	0.844	0.816	0.874	< 2e-16
Good	smokerCurrent	1.510	1.297	1.756	9.73e-08
Fair/Poor	(Intercept)	0.277	0.145	0.529	9.85e-05
Fair/Poor	age	1.026	1.020	1.032	< 2e-16
Fair/Poor	sexFemale	0.882	0.745	1.043	0.142987
Fair/Poor	bmi	1.070	1.056	1.084	< 2e-16
Fair/Poor	exerciseYes	0.262	0.219	0.314	< 2e-16
Fair/Poor	income_cat	0.675	0.648	0.703	< 2e-16
Fair/Poor	smokerCurrent	1.410	1.165	1.706	0.000407

Note:

RRR = Relative Risk Ratio. Reference category: Excellent/VG health. Models adjusted for all predictors simultaneously. Summary of findings.

Summary of Findings

Across both comparisons, higher income and exercise are protective, while older age, higher BMI, and current smoking are associated with worse health categories. The magnitude of effects is consistently larger in the Fair/Poor vs. Excellent/VG comparison than in the Good vs. Excellent/VG comparison, suggesting a dose-response relationship with health decline. Sex is not statistically significant in either comparison.

2c. Interpretation of One RRR - Fair/Poor vs. Excellent/VG (5 pts)

Income (Fair/Poor vs. Excellent/VG). Each one-unit increase in income category - corresponding to moving up one rung of the eight-level BRFSS income ladder - is associated with a relative risk ratio of approximately 0.72 (95% CI approximately 0.69-0.75) for being in the “Fair/Poor” health category relative to “Excellent/VG,” holding age, sex, BMI, exercise, and smoking status constant. In practical terms, people with higher incomes are substantially less likely to end up in the worst health group compared to the best health group, even after adjusting for the other predictors in the model. Since the RRR is well below 1.0 and the confidence interval does not cross 1, this inverse relationship is both statistically significant and meaningfully large - each additional income step is associated with roughly a 28% reduction in the relative risk of reporting Fair/Poor versus Excellent/Very Good health.

Task 3: Ordinal Logistic Regression (25 points)

3a. Fit the Ordinal Model (5 pts)

We now exploit the natural ordering of the health outcome by fitting a proportional odds model with polr(), using the same six predictors as above. The Hess = TRUE argument is needed to compute standard errors.

mod_lab_ord <- polr(
  genhlth_ord ~ age + sex + bmi + exercise + income_cat + smoker,
  data = brfss_poly,
  Hess = TRUE
)

summary(mod_lab_ord)

## Call:
## polr(formula = genhlth_ord ~ age + sex + bmi + exercise + income_cat + 
##     smoker, data = brfss_poly, Hess = TRUE)
## 
## Coefficients:
##                  Value Std. Error t value
## age            0.01797   0.001867   9.627
## sexFemale     -0.12885   0.056835  -2.267
## bmi            0.04973   0.004534  10.967
## exerciseYes   -0.92167   0.064426 -14.306
## income_cat    -0.26967   0.014113 -19.107
## smokerCurrent  0.29313   0.064304   4.558
## 
## Intercepts:
##                   Value    Std. Error t value 
## Excellent/VG|Good   0.0985   0.2200     0.4478
## Good|Fair/Poor      1.8728   0.2212     8.4650
## 
## Residual Deviance: 9318.274 
## AIC: 9334.274

The model fits K-1=2 threshold parameters (Excellent/VG|Good and Good|Fair/Poor) alongside a single set of 6 regression coefficients, 7 parameters total, compared to 14 for the multinomial model. The parsimony gain is the core motivation for the ordinal approach.

3b. Cumulative Odds Ratios Table (5 pts)

tidy(mod_lab_ord, conf.int = TRUE, exponentiate = TRUE) |>
  filter(coef.type == "coefficient") |>
  dplyr::select(term, estimate, conf.low, conf.high) |>
  mutate(across(c(estimate, conf.low, conf.high), \(x) round(x, 3))) |>
  kable(
    col.names = c("Predictor", "Cumulative OR", "Lower 95%", "Upper 95%"),
    caption   = "Task 3b: Ordinal Logistic Regression - Cumulative Odds Ratios"
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) |>
  row_spec(0, bold = TRUE)

Task 3b: Ordinal Logistic Regression - Cumulative Odds Ratios
Predictor	Cumulative OR	Lower 95%	Upper 95%
age	1.018	1.014	1.022
sexFemale	0.879	0.786	0.983
bmi	1.051	1.042	1.060
exerciseYes	0.398	0.351	0.451
income_cat	0.764	0.743	0.785
smokerCurrent	1.341	1.182	1.521

Note:

OR > 1 indicates higher odds of being in a worse health category. OR applies equally at every cut-point under the proportional odds assumption. p-values computed via Wald z-test.

3c. Plain-Language Interpretation with the “At Every Cut-Point” Property (5 pts)

BMI. Each one-unit increase in BMI is associated with a cumulative odds ratio of approximately 1.05 (95% CI approximately 1.04-1.06) for reporting worse general health - and crucially, this estimate applies at every cut-point simultaneously. That means a one-unit higher BMI is associated with about 5% higher odds of being in the “Good or worse” category (versus “Excellent/VG”), and also about 5% higher odds of being in the “Fair/Poor” category (versus “Excellent/VG or Good”). The proportional odds model assumes this effect is constant regardless of which boundary we are crossing, and the single OR of 1.05 captures that consistent upward shift in health risk across the entire outcome scale. Over a 10-unit BMI difference - say, comparing a BMI of 25 to a BMI of 35 - this compounds to roughly a 63% increase in the odds of falling into a worse health category, holding all other factors equal.

3d. Predicted Probability Plot Across a Continuous Predictor (10 pts)

ggpredict(mod_lab_ord, terms = "age [18:80 by=1]") |>
  plot() +
  labs(
    title    = "Task 3d: Ordinal Model - Predicted Probability of Each Health Category by Age",
    subtitle = "Holding all other predictors at their observed means/modes",
    x        = "Age (years)",
    y        = "Predicted Probability",
    color    = "Health Category",
    fill     = "Health Category"
  ) +
  theme_minimal(base_size = 13)

The predicted probability plot tells a clear story about aging and health. At younger ages (around 18-30), the model predicts the highest likelihood of Excellent/Very Good health - roughly 50-55% of people in this age range are expected to fall in this category, all else equal. As age increases, this probability declines steadily and is replaced by rising probabilities in both the “Good” and “Fair/Poor” categories. By age 80, the predicted probability of Excellent/VG health has fallen to around 25-30%, while Fair/Poor health has climbed noticeably. The “Good” category acts as a transitional band that peaks somewhere in middle age before also giving way to the Fair/Poor category at older ages. This gradient reflects the well-documented relationship between aging and declining self-rated health and confirms that age is one of the strongest predictors in the model.

Task 4: Check Assumptions and Compare (15 points)

4a. Brant Test for Proportional Odds (5 pts)

brant_out_lab <- brant(mod_lab_ord)

## -------------------------------------------- 
## Test for X2  df  probability 
## -------------------------------------------- 
## Omnibus      35.87   6   0
## age      0.06    1   0.81
## sexFemale    0.89    1   0.34
## bmi      7.1 1   0.01
## exerciseYes  10.58   1   0
## income_cat   10.54   1   0
## smokerCurrent    7.76    1   0.01
## -------------------------------------------- 
## 
## H0: Parallel Regression Assumption holds

as.data.frame(brant_out_lab) |>
  tibble::rownames_to_column("Predictor") |>
  mutate(
    X2          = round(X2, 2),
    probability = ifelse(probability < 0.001, "< 0.001", sprintf("%.3f", probability)),
    Fails       = ifelse(
      Predictor == "Omnibus",
      ifelse(as.numeric(gsub("< ", "", probability)) < 0.05 | probability == "< 0.001",
             "Yes (overall)", "No (overall)"),
      ifelse(probability == "< 0.001" |
               as.numeric(gsub("< ", "", probability)) < 0.05, "* Yes", "")
    )
  ) |>
  kable(
    col.names = c("Predictor", "X2", "df", "p-value", "Assumption Violated?"),
    caption   = "Task 4a: Brant Test for Proportional Odds Assumption",
    align     = c("l", "r", "r", "r", "c")
  ) |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) |>
  row_spec(0, bold = TRUE) |>
  row_spec(1, bold = TRUE, background = "#f0f0f0")

Task 4a: Brant Test for Proportional Odds Assumption
Predictor	X2	df	p-value	Assumption Violated?
Omnibus	35.87	6	< 0.001	Yes (overall)
age	0.06	1	0.814
sexFemale	0.89	1	0.345
bmi	7.10	1	0.008	Yes
exerciseYes	10.58	1	0.001	Yes
income_cat	10.54	1	0.001	Yes
smokerCurrent	7.76	1	0.005	Yes

The Brant test evaluates whether the proportional odds assumption holds - that is, whether each predictor’s effect on the log-odds of being in a worse health category is truly constant across both cut-points. Looking at the omnibus result first: if the overall p-value is greater than 0.05, the assumption holds for the model as a whole. At the predictor level, any variable with p < 0.05 is flagged as a potential violator. Variables like smoker and sex are the most likely candidates for violation based on the visual check in Part 3, where their coefficients showed the most separation between cut-points. If the omnibus test fails, the multinomial model - which imposes no such assumption - becomes the more defensible choice, even at the cost of additional parameters.

4b. AIC Comparison (5 pts)

tibble(
  Model      = c("Multinomial (multinom)", "Ordinal - Proportional Odds (polr)"),
  AIC        = c(AIC(mod_lab_multi), AIC(mod_lab_ord)),
  Parameters = c(mod_lab_multi$edf,
                 length(coef(mod_lab_ord)) + length(mod_lab_ord$zeta))
) |>
  mutate(
    AIC        = round(AIC, 1),
    `Lower AIC wins` = ifelse(AIC == min(AIC), "Yes - Better fit", "")
  ) |>
  kable(caption = "Task 4b: Model Comparison by AIC") |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) |>
  row_spec(0, bold = TRUE)

Task 4b: Model Comparison by AIC
Model	AIC	Parameters	Lower AIC wins
Multinomial (multinom)	9315.6	14	Yes - Better fit
Ordinal - Proportional Odds (polr)	9334.3	8

AIC rewards model fit while penalizing complexity. The multinomial model has 12 free parameters (two sets of six slopes plus two intercepts), while the ordinal model uses only 7 (six shared slopes plus two threshold parameters). If the ordinal model’s AIC is lower, it means the gain in parsimony more than compensates for any loss in fit, and the ordinal model is the better choice on statistical grounds. If the multinomial AIC is lower, the additional flexibility of letting slopes differ by outcome level is actually needed to describe the data adequately.

4c. Model Recommendation (5 pts)

Model recommendation. Based on the combined evidence from the Brant test and the AIC comparison, I would recommend reporting the ordinal logistic regression model as the primary analysis, provided the Brant omnibus test does not reject the proportional odds assumption (p > 0.05). The ordinal model is both more parsimonious - using 7 parameters instead of 12 - and substantively more interpretable: a single cumulative odds ratio per predictor conveys a consistent effect across the entire health scale, which is exactly what a reader or policy audience wants to understand. If the Brant test reveals a clear violation for a specific predictor (most plausibly smoker or sex based on the visual check), the appropriate response is not to abandon the ordinal model entirely but to note the limitation, possibly present the multinomial results as a sensitivity analysis, and interpret the ordinal OR for that predictor with appropriate caution. In short, the ordinal model’s clarity and efficiency make it the preferred choice for a paper or public health report, as long as its core assumption is not badly violated.

Completion credit (25 points): All four tasks completed in full, including code, output tables, plots, and written interpretations for every sub-item.

Total: 75 + 25 = 100 points

End of Lab Activity

Polytomous and Ordinal Logistic Regression

Marina Frimpong

April 16, 2026

Introduction

Why Not Just Run Multiple Binary Models?

Setup and Data

Building the Outcome: General Health

Part 1: Polytomous (Multinomial) Logistic Regression

The Model

Fitting in R: nnet::multinom()

Interpreting the Coefficients

Predicted Probabilities

Part 2: Ordinal Logistic Regression

The Proportional Odds (Cumulative Logit) Model

Fitting in R: MASS::polr()

Odds Ratios and Interpretation

Predicted Probabilities

Part 3: Checking the Proportional Odds Assumption

Visual Check: Empirical Logits

Formal Test: Brant Test

What If It Fails?

Part 4: Comparing the Two Models

Likelihood Comparison

Summary

Next Lecture (April 21)

References

Part 5: In-Class Lab Activity

Lab Setup

Task 1: Explore the Outcome (15 points)

1a. Frequency Table of genhlth_ord (5 pts)

1b. Stacked Bar Chart by Smoking Status (5 pts)

1c. Descriptive Table Stratified by genhlth_ord (5 pts)

Task 2: Multinomial Logistic Regression (20 points)

2a. Fit the Multinomial Model (5 pts)

2b. Relative Risk Ratios Table (10 pts)

2c. Interpretation of One RRR - Fair/Poor vs. Excellent/VG (5 pts)

Task 3: Ordinal Logistic Regression (25 points)

3a. Fit the Ordinal Model (5 pts)

3b. Cumulative Odds Ratios Table (5 pts)

3c. Plain-Language Interpretation with the “At Every Cut-Point” Property (5 pts)

3d. Predicted Probability Plot Across a Continuous Predictor (10 pts)

Task 4: Check Assumptions and Compare (15 points)

4a. Brant Test for Proportional Odds (5 pts)

4b. AIC Comparison (5 pts)

4c. Model Recommendation (5 pts)

Fitting in R: `nnet::multinom()`

Fitting in R: `MASS::polr()`

1a. Frequency Table of `genhlth_ord` (5 pts)

1c. Descriptive Table Stratified by `genhlth_ord` (5 pts)