Introduction

The multinomial logistic regression model is a simple extension of the binomial logistic regression model. Both multinomial and ordinal models are used for categorical outcomes with more than two categories.

Ordinal logistic regression is used to predict an ordinal dependent variable given one or more independent variables. It can be considered as either a generalisation of multiple linear regression or as a generalisation of binomial logistic regression. As with other types of regression, ordinal regression can also use interactions between independent variables to predict the dependent variable.

Data: Mental Health Data: The following data describe the mental health status, as an ordinal, categorical variable mental, of 40 subjects.

Mental: Mental status of person
SES: Socio-economic status (ses: 0=low, 1=high)
Life: Number of adverse life events that the participants experiences such as death in the family.

Objective: In this study, I applied the multinomial logistic regression with ordered dependent variables.models from complex(saturated) the simplest(null) model to understand the relationship between the SocioEconomic status and severe life events the participants experienced.

Methods to use:
- Proportional Odds Logistic Regression(Null to Saturated)
- Drop-in-deviance to compare the models
- Finding probabilities
- Comparing with Multinomial Logistic Regression(Ignoring order)

Load the data set

d <- read.table("Mental.txt", header = TRUE)
head(d)
##   mental ses life
## 1      1   1    1
## 2      1   1    9
## 3      1   1    4
## 4      1   1    3
## 5      1   0    2
## 6      1   1    0

#Plot data

boxplot(life ~ mental, data = d, col = "magenta", 
        ylab = "Number of adverse life events", 
        names = c("well", "mild", "moderate", "severe"))

Load MASS Library

library(MASS) # Get the proportional odds logistic regression `= polr` function from the `MASS` library: 
## Warning: package 'MASS' was built under R version 4.0.5

Create meaningful labels for the ordered levels:

d$mental2 <- ordered(d$mental, levels=1:4, 
                     labels=c("well", "mild", "moderate", "severe"))

negate original predictors

#we need to negate our actual x-covariates ses and life to get the polr z-covariates.
d$ses <- -d$ses
d$life <- -d$life
range(d$life)
## [1] -9  0
range(d$ses)
## [1] -1  0

Now fit a series of models:

fit_null <- polr(mental2 ~ 1,          data = d, method = "logistic") # null Model
fit_ses  <- polr(mental2 ~ ses,        data = d, method = "logistic")
fit_life <- polr(mental2 ~ life,       data = d, method = "logistic")
fit_add  <- polr(mental2 ~ ses + life, data = d, method = "logistic")
fit_2way <- polr(mental2 ~ ses * life, data = d, method = "logistic") # Full (saturated) model with interaction

Compare those models via deviances:

round(c(fit_null$deviance, fit_ses$deviance, fit_life$deviance,
        fit_add$deviance, fit_2way$deviance), 2)
## [1] 116.17 114.40 110.33 107.25 105.85

Some of our models are better than null.

Best among them looks like the additive model. It is slightly better than life only, and is as good as the interaction model, life * ses = life + ses + life : ses. So, no need to go with the interaction model because it is difficult to interpret.

summary(fit_add)
## 
## Re-fitting to get Hessian
## Call:
## polr(formula = mental2 ~ ses + life, data = d, method = "logistic")
## 
## Coefficients:
##        Value Std. Error t value
## ses   1.0223     0.5913   1.729
## life -0.2969     0.1167  -2.545
## 
## Intercepts:
##                 Value   Std. Error t value
## well|mild       -0.3067  0.6265    -0.4896
## mild|moderate    0.8457  0.6334     1.3352
## moderate|severe  2.0744  0.6950     2.9846
## 
## Residual Deviance: 107.2468 
## AIC: 117.2468

We can see that there are two regression coefficients; ses and life.

There are three intercepts, one corresponding to each break points for four categorical levels corresponding to 3 break points. So, it broke up the continues distribution in to the bins. This is how we got ordered the categorical variables: Well/mild, mild/moderate and moderate/severe.

We can use the additive model to compute probabilities.

# Again, we need to negate our actual x-covariates ses and life to get 
# the polr z-covariates.
Probs_SES1 <- predict(fit_add, 
                      newdata = data.frame(ses = rep(-1, 10), life = -c(0:9)), 
                      type = "probs")
Probs_SES1
##         well      mild   moderate     severe
## 1  0.6716413 0.1945910 0.09052618 0.04324145
## 2  0.6031742 0.2247718 0.11472128 0.05733271
## 3  0.5304122 0.2510538 0.14288112 0.07565284
## 4  0.4563340 0.2702423 0.17421281 0.09921089
## 5  0.3841384 0.2796922 0.20708904 0.12908042
## 6  0.3167108 0.2780067 0.23899975 0.16628275
## 7  0.2561953 0.2654394 0.26676322 0.21160210
## 8  0.2037941 0.2438197 0.28704427 0.26534192
## 9  0.1598080 0.2160360 0.29708803 0.32706792
## 10 0.1238391 0.1853025 0.29543399 0.39542438
Probs_SES0 <- predict(fit_add, 
                      newdata = data.frame(ses = rep(0, 10), life = -c(0:9)), 
                      type = "probs")
Probs_SES0
##          well       mild  moderate    severe
## 1  0.42391760 0.27575084 0.1887237 0.1116078
## 2  0.35351571 0.28034378 0.2215300 0.1446105
## 3  0.28894152 0.27370276 0.2520194 0.1853363
## 4  0.23193099 0.25681875 0.2768616 0.2343887
## 5  0.18326985 0.23207380 0.2928825 0.2917738
## 6  0.14291850 0.20259316 0.2978248 0.3566635
## 7  0.11025217 0.17151013 0.2909597 0.4272780
## 8  0.08431773 0.14140089 0.2732955 0.5009859
## 9  0.06404469 0.11401404 0.2472902 0.5746510
## 10 0.04838844 0.09027151 0.2161959 0.6451442

Plot those probabilities versus number of adverse life events:

#Sum moderate and severe levels
Probs_SES1_severe_moderate <- apply(Probs_SES1[, 3:4], MAR = 1, FUN = "sum")
Probs_SES0_severe_moderate <- apply(Probs_SES0[, 3:4], MAR = 1, FUN = "sum")
#Plot
plot(0:9, Probs_SES1_severe_moderate, axes = F, 
     lty = 2, type = "l", ylim = c(0, 1), bty = "L", 
     ylab = "P(moderate or severe)",
     xlab = "Number of Adverse Life Events",
     main= "Probs of Life events Vs. Moderate&Severe", lwd = 3, col = "blue")
axis(2)
axis(1, at = 0:9)
lines(0:9, Probs_SES0_severe_moderate, lty = 1, lwd = 3, col = "magenta")
text(2.5, .5, "SES = 0")
arrows(2.5, .48, 2.75, .42, length = .1)
text(3.5, .33, "SES = 1")
arrows(3.5, .31, 3.75, .25, length = .1)

Probs_SES0_severe_moderate <- apply(Probs_SES0[, 3:4], MAR = 1, FUN = "sum")
plot(c(0, 9), c(0, 1),  axes = F, 
     lty = 2, type = "n", ylim = c(0, 1), bty = "L", 
     ylab = "Probability",
     xlab = "Number of Adverse Life Events", lwd = 3, col = "blue")
axis(2)
axis(1, at = 0:9)
lines(0:9, Probs_SES0[, 4], lty = 1, lwd = 3, col = "magenta")
lines(0:9, Probs_SES0_severe_moderate, lty = 1, lwd = 3, col = "magenta")
lines(0:9, apply(Probs_SES0[, 2:4], MAR = 1, FUN = "sum"), lty = 1, lwd = 3, col = "magenta")
lines(0:9, apply(Probs_SES0[, 2:4], MAR = 1, FUN = "sum"), lty = 1, lwd = 3, col = "magenta")

title(main = "Cumulative Probabilities for SES = 0")
text(4, .2, "Severe")
text(4, .44, "<= Moderate")
text(4, 0.8, "<= Mild")

That’s the fit of the proportional odds.

Comparing to baseline category logistic regression

Another option is that we can ignore the ordinal data, we could just fit a baseline category logistic regression. One another option would be the collapsed category. We can stack Mild, Moderate and Severe life events versus to Well, to make it binary. But, obviously, we will lose some information.

In this case we ignore the order, so they are just a category (NOT ORDERED category).By doing so, we may lose some information.

library(nnet)
## Warning: package 'nnet' was built under R version 4.0.5
ignore <- multinom(mental2 ~ ses + life, data = d)
## # weights:  16 (9 variable)
## initial  value 58.224363 
## iter  10 value 52.644017
## final  value 52.643833 
## converged
summary(ignore)
## Call:
## multinom(formula = mental2 ~ ses + life, data = d)
## 
## Coefficients:
##          (Intercept)       ses       life
## mild      -0.9820086 0.6367219 -0.3158206
## moderate  -0.4763790 0.9433176 -0.2391921
## severe    -1.7510982 1.7726545 -0.5657221
## 
## Std. Errors:
##          (Intercept)       ses      life
## mild       0.9421288 0.9476731 0.1935069
## moderate   0.8823564 0.9254144 0.1940823
## severe     1.0797497 1.0462777 0.2166520
## 
## Residual Deviance: 105.2877 
## AIC: 123.2877

Note that as you can see Multinomial Logistic Regression model returns more parameter than the Proportional Odds Logistic Regression.

Instead of having intercepts corresponding to break points and coefficients for the covariates, this has different set a covariates for every category except baseline.

So this is a bigger model ignoring the order which contains the information.

Proportional Odds model only had 2 coefficients, it didnt have 3 for SES and 3 for Life.

Let’s compare the average coefficients for ses and life

Multinomial Logistic Regression:

apply(coef(ignore)[, 2:3], MAR = 2, FUN = "mean")
##        ses       life 
##  1.1175647 -0.3735783

For Proportional Odds Logistic Regression:

coef(fit_add)
##        ses       life 
##  1.0223295 -0.2969094

The two models are telling the same story about the effects of ses and life.

So, basically the higher SES(Socio-economic status) affects positive and the number of life events affects negative.

But we know that Proportional Odds Logistic Regression is better fit because it takes into account the order.

But the standard errors for the model that ignores order are

diag(vcov(ignore))[c(2, 5, 8)]
##     mild:ses moderate:ses   severe:ses 
##    0.8980842    0.8563919    1.0946970
diag(vcov(ignore))[c(3, 6, 9)]
##     mild:life moderate:life   severe:life 
##    0.03744494    0.03766793    0.04693811

These are much larger than the corresponding standard errors for polr:

diag(vcov(fit_add))[1:2]
## 
## Re-fitting to get Hessian
##        ses       life 
## 0.34958611 0.01361506

We can see that Proportional Odds Logistic Regression provides smaller standard error which is good.

References:

  1. Colorado Stat University - Generalized Liner Models Course Notes STAA552
  2. https://statistics.laerd.com/spss-tutorials/ordinal-regression-using-spss-statistics.php



***************