DEM 7283 - Example 3 - Ordinal & Multinomial Logit Models

This example will cover the use of R functions for fitting Ordinal and Multinomial logit models to complex survey data.

For this example I am using 2016 CDC Behavioral Risk Factor Surveillance System (BRFSS) SMART MSA data. Link

#load brfss
library(car)
library(stargazer)

## 
## Please cite as:

##  Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.

##  R package version 5.2.1. https://CRAN.R-project.org/package=stargazer

library(survey)

## Loading required package: grid

## Loading required package: Matrix

## Loading required package: survival

## 
## Attaching package: 'survey'

## The following object is masked from 'package:graphics':
## 
##     dotchart

load("~/Google Drive/classes/dem7283/class18/data/brfss16_mmsa.Rdata")

#The names in the data are very ugly, so I make them less ugly
nams<-names(brfss16m)
head(nams, n=10)

##  [1] "DISPCODE" "HHADULT"  "GENHLTH"  "PHYSHLTH" "MENTHLTH" "POORHLTH"
##  [7] "HLTHPLN1" "PERSDOC2" "MEDCOST"  "CHECKUP1"

#we see some names are lower case, some are upper and some have a little _ in the first position. This is a nightmare.
newnames<-tolower(gsub(pattern = "_",replacement =  "",x =  nams))
names(brfss16m)<-newnames
#Poor or fair self rated health
#brfss16m$badhealth<-ifelse(brfss16m$genhlth %in% c(4,5),1,0)
brfss16m$badhealth<-recode(brfss16m$genhlth, recodes="4:5=1; 1:3=0; else=NA")
#race/ethnicity
brfss16m$black<-recode(brfss16m$racegr3, recodes="2=1; 9=NA; else=0")
brfss16m$white<-recode(brfss16m$racegr3, recodes="1=1; 9=NA; else=0")
brfss16m$other<-recode(brfss16m$racegr3, recodes="3:4=1; 9=NA; else=0")
brfss16m$hispanic<-recode(brfss16m$racegr3, recodes="5=1; 9=NA; else=0")

brfss16m$race_eth<-recode(brfss16m$racegr3, recodes="1='nhwhite'; 2='nh black'; 3='nh other';4='nh multirace'; 5='hispanic'; else=NA", as.factor.result = T)
brfss16m$race_eth<-relevel(brfss16m$race_eth, ref = "nhwhite")

#insurance
brfss16m$ins<-recode(brfss16m$hlthpln1, recodes ="7:9=NA; 1=1;2=0")

#income grouping
brfss16m$inc<-ifelse(brfss16m$incomg==9, NA, brfss16m$incomg)

#education level
brfss16m$educ<-recode(brfss16m$educa, recodes="1:2='0Prim'; 3='1somehs'; 4='2hsgrad'; 5='3somecol'; 6='4colgrad';9=NA", as.factor.result=T)
brfss16m$educ<-relevel(brfss16m$educ, ref='2hsgrad')

#employment
brfss16m$employ<-recode(brfss16m$employ1, recodes="1:2='Employed'; 2:6='nilf'; 7='retired'; 8='unable'; else=NA", as.factor.result=T)
brfss16m$employ<-relevel(brfss16m$employ, ref='Employed')

#marital status
brfss16m$marst<-recode(brfss16m$marital, recodes="1='married'; 2='divorced'; 3='widowed'; 4='separated'; 5='nm';6='cohab'; else=NA", as.factor.result=T)
brfss16m$marst<-relevel(brfss16m$marst, ref='married')

#Age cut into intervals
brfss16m$agec<-cut(brfss16m$age80, breaks=c(0,24,39,59,79,99))

#BMI, in the brfss16ma the bmi variable has 2 implied decimal places, so we must divide by 100 to get real bmi's

brfss16m$bmi<-brfss16m$bmi5/100

#smoking currently
brfss16m$smoke<-recode(brfss16m$smoker3, recodes="1:2='Current'; 3='Former';4='NeverSmoked'; else=NA", as.factor.result=T)
brfss16m$smoke<-relevel(brfss16m$smoke, ref = "NeverSmoked")

Other logit model specifications

In the basic logistic regression model, we assumed a dichotomous outcome * y = 0 or 1 This is a very flexible type of model since we can write any number of outcomes as a dichtomous variable.

Although the logistic regression model is very flexible, we may often run into other kinds of outcomes that it is not applicable to

E.g. A multiple choice outcome: A behavioral decision to migrate to a set of several possible new residences.
E.g. An ordered outcome with ranked levels: Liberal to conservative: Very liberal, somewhat liberal, somewhat conservative, very conservative

These types of outcomes require other kinds of regression models to analyze them.

Logit model for ordinal outcomes

Suppose we have an outcome with J ordered outcomes, and each person, i, has a response: \(Y_{i}\), and we can write the probablity of a person having response level J as: \(Pr(Y_ij = j)\) for J= 1 … J

For an ordered response it is easier to work with the cumluative probablity function or

\(Pr(Y_ij \leqslant j)\)

So the probabilities increase in adjacent categories (ordered categories). I.e. \[Pr(Y_ij \leqslant 3)>Pr(Y_ij \leqslant 2)>Pr(Y_ij \leqslant 1)\] Since we are using cumulative probabilities, we do not acutally have to estimate all J probabilities. We need only estimate J-1 probabilites, due to the complementary rule of probability: \[Pr(y=0) = 1-Pr(y>0)\]

In order to construct a regression model, we have to link a linear function of covariates to these probabilities. A general depiction of this model can be thought of as a latent trait model, where a latent variable, z is not observable but continuous, but we can observe the discrete representation of it as the J categories, then we can write this as:

\[\pi_{j-1} < Z{i} \leqslant \pi{j}\] or graphically as:

The most common model for an ordinal outcome such as this is the Proportional Odds Model. In this model, for each level of J, there is an intercept term, \(\beta_{0j}\) which depends on the category, J. The other explantory variables x do not depend on the category J, however. This generates a model of the form:

\[\text {log} \frac{\pi_1+\cdots +\pi_j}{\pi_{j+1}+\cdots +\pi_j} = \beta_{0j}+x'\beta\] The base assumption for this model is that each x variable affects the probability of J increasing by 1 level in the same way, for all transitions. We can visualize this as:

This model is attractive because we have a direct and consistent interpretation of our \(\beta\)’s in the model. While this is nice, the assumption can be very weak in practice, with the \(\beta\)’s not being equal between categories. In other words, come covariates may affect the probability of certain transitions more than others.

We can evaluate the assumption generally by fitting a series of binomial logistic regressions and examining the \(\beta\)’s between them. This can be done by re-coding the outcome from a single ordered outcome to a series of binomial outcomes. For example:

#General ordinal coding, with 5 being the worst and 1 being the best health
brfss16m$generalhealth<-recode(brfss16m$genhlth, recodes="1:2=1;3=2;4:5=3; else=NA", as.factor.result = T)
brfss16m$generalhealth<-relevel(brfss16m$generalhealth, ref = "1")
brfss16m$healthnum<-car::recode(brfss16m$genhlth, recodes="1:2=1;3=2;4:5=3; else=NA", as.factor.result = F)

#First we tell R our survey design
options(survey.lonely.psu = "adjust")


library(dplyr)

## 
## Attaching package: 'dplyr'

## The following object is masked from 'package:car':
## 
##     recode

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

sub<-brfss16m%>%
  select(badhealth,healthnum,generalhealth, mmsaname, bmi, agec,race_eth, marst, educ,white, black, hispanic, other, smoke, ins, mmsawt, ststr) %>%
  filter( complete.cases(.))

#First we tell R our survey design
options(survey.lonely.psu = "adjust")
des<-svydesign(ids=~1, strata=~ststr, weights=~mmsawt, data =sub )

Ordinal Regression example

To fit an ordinal logit to survey data in R, we use the svyolr fucntion in the survey library.

#Here I fit three nested models for the health outcome
fit.solr1<-svyolr(generalhealth~race_eth+educ+agec,des)
summary(fit.solr1)

## Call:
## svyolr(generalhealth ~ race_eth + educ + agec, des)
## 
## Coefficients:
##                           Value Std. Error    t value
## race_ethhispanic      0.5155520 0.02846057  18.114602
## race_ethnh black      0.4170751 0.02639000  15.804286
## race_ethnh multirace  0.4521600 0.06608237   6.842369
## race_ethnh other      0.3030651 0.04198436   7.218523
## educ0Prim             1.0574528 0.05780852  18.292336
## educ1somehs           0.6620081 0.04072828  16.254261
## educ3somecol         -0.3089586 0.02299982 -13.433088
## educ4colgrad         -0.9845144 0.02150067 -45.789934
## agec(24,39]           0.3632629 0.03623319  10.025693
## agec(39,59]           0.7456899 0.03447668  21.628817
## agec(59,79]           1.1022439 0.03504340  31.453678
## agec(79,99]           1.3501218 0.04544723  29.707465
## 
## Intercepts:
##     Value    Std. Error t value 
## 1|2   0.6313   0.0341    18.5165
## 2|3   2.3040   0.0353    65.1897

#Calculate the AIC ourself
fit.solr1$deviance+2*length(fit.solr1$coefficients)

## [1] 402859

Which gives us the results from the proportional odds model. In this case, we see that non-Hispanic blacks have higher odds of reporting poorer health, compared to non-Hispanic whites. We also see that as education increases, the odds of reporting poorer health decreases, while for lower leves of education, the odds of having worse health are higher. As age increases, the odds of reporting poorer health increases, compared to those <25 years of age.

Now we can “examine” the proportional odds assumption by fitting logits for each change

ex1<-svyglm(I(healthnum>1)~race_eth+educ+agec,des, family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

ex2<-svyglm(I(healthnum>2)~race_eth+educ+agec,des, family="binomial")

## Warning in eval(family$initialize): non-integer #successes in a binomial
## glm!

This is NOT a statistical test, just a a rough guide. Here, I plot the coefficients of each model. If the proportional odds is correct, and the assumption is ok, then all these should be “approximately” the same values.

plot(coef(ex1)[-1], ylim=c(-3, 3), type="l",xaxt="n", ylab="Beta", main=c("Comparison of betas for", " proportional odds assumption"))
lines(coef(ex2)[-1], col=1, lty=2) 
axis(side=1, at=1:12, labels=F)
text(x=1:12, y=-4,  srt = 90, pos = 1, xpd = TRUE,cex=.8,
     labels = c( "Hispanic", "Black","MultRace" ,"Other",
                 "PrimarySch","SomeHS", "SomeColl", "CollGrad", "Age 24-39","Age 39-59" ,"Age 59-79", "Age 80+" ))
legend("bottomright",col=c(1,1),lty=c(1,2), legend=c(">1", ">2"))

lines(coef(fit.solr1)[c(-13:-16)], col=4, lwd=3)

Based on this, the effects appear to be pretty consistent across transitions.

#Just print the odds ratios, 
round(exp(rbind(coef(ex1)[-1], coef(ex2)[-1])),3)

##      race_ethhispanic race_ethnh black race_ethnh multirace
## [1,]            1.698            1.577                1.472
## [2,]            1.630            1.369                1.825
##      race_ethnh other educ0Prim educ1somehs educ3somecol educ4colgrad
## [1,]            1.407     2.519       1.828        0.737        0.385
## [2,]            1.089     2.957       2.060        0.733        0.315
##      agec(24,39] agec(39,59] agec(59,79] agec(79,99]
## [1,]       1.427       1.985       2.736       3.562
## [2,]       1.629       2.883       4.344       5.125

Which again shows us the effects are pretty consistent across transitions, just like the plot does. If anything the plot is easier to digest.

Non proportional assumption

We can also fit a cumulative logit model where the coefficients are not assumed to be the same across transitions. This is an alternative model where we don’t assume proportionality of the odds of the various transitions. We can’t use survey design to do this however.

library(VGAM)

## Loading required package: stats4

## Loading required package: splines

## 
## Attaching package: 'VGAM'

## The following object is masked from 'package:survey':
## 
##     calibrate

## The following object is masked from 'package:car':
## 
##     logit

#Proportional odds
fit.vgam<-vglm(as.ordered(generalhealth)~race_eth+educ+agec,brfss16m, weights =mmsawt/mean(mmsawt, na.rm=T), family=cumulative(parallel = T, reverse = T))  #<-parallel = T == proportional odds
summary(fit.vgam)

## 
## Call:
## vglm(formula = as.ordered(generalhealth) ~ race_eth + educ + 
##     agec, family = cumulative(parallel = T, reverse = T), data = brfss16m, 
##     weights = mmsawt/mean(mmsawt, na.rm = T))
## 
## 
## Pearson residuals:
##                    Min      1Q  Median       3Q    Max
## logit(P[Y>=2]) -10.937 -0.5576 -0.1609  0.51902  8.531
## logit(P[Y>=3])  -7.738 -0.3470 -0.1677 -0.07377 17.376
## 
## Coefficients: 
##                      Estimate Std. Error z value Pr(>|z|)    
## (Intercept):1        -0.62517    0.01405  -44.49   <2e-16 ***
## (Intercept):2        -2.32192    0.01489 -155.93   <2e-16 ***
## race_ethhispanic      0.52140    0.01145   45.52   <2e-16 ***
## race_ethnh black      0.40714    0.01209   33.69   <2e-16 ***
## race_ethnh multirace  0.47611    0.03457   13.77   <2e-16 ***
## race_ethnh other      0.33090    0.01561   21.20   <2e-16 ***
## educ0Prim             0.97749    0.02025   48.26   <2e-16 ***
## educ1somehs           0.66847    0.01554   43.01   <2e-16 ***
## educ3somecol         -0.30490    0.01042  -29.26   <2e-16 ***
## educ4colgrad         -0.97252    0.01125  -86.48   <2e-16 ***
## agec(24,39]           0.35346    0.01455   24.29   <2e-16 ***
## agec(39,59]           0.75487    0.01399   53.97   <2e-16 ***
## agec(59,79]           1.10589    0.01479   74.79   <2e-16 ***
## agec(79,99]           1.33419    0.02280   58.51   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Number of linear predictors:  2 
## 
## Names of linear predictors: logit(P[Y>=2]), logit(P[Y>=3])
## 
## Residual deviance: 452289.9 on 485742 degrees of freedom
## 
## Log-likelihood: -226144.9 on 485742 degrees of freedom
## 
## Number of iterations: 5 
## 
## No Hauck-Donner effect found in any of the estimates
## 
## Exponentiated coefficients:
##     race_ethhispanic     race_ethnh black race_ethnh multirace 
##            1.6843782            1.5025127            1.6098072 
##     race_ethnh other            educ0Prim          educ1somehs 
##            1.3922222            2.6577733            1.9512507 
##         educ3somecol         educ4colgrad          agec(24,39] 
##            0.7371934            0.3781280            1.4239798 
##          agec(39,59]          agec(59,79]          agec(79,99] 
##            2.1273290            3.0219149            3.7969261

#Non-proportional odds
fit.vgam2<-vglm(as.ordered(generalhealth)~race_eth+educ+agec,brfss16m, weights =mmsawt/mean(mmsawt, na.rm=T), family=cumulative(parallel = F, reverse = T))  #<-parallel = F == Nonproportional odds
summary(fit.vgam2)

## 
## Call:
## vglm(formula = as.ordered(generalhealth) ~ race_eth + educ + 
##     agec, family = cumulative(parallel = F, reverse = T), data = brfss16m, 
##     weights = mmsawt/mean(mmsawt, na.rm = T))
## 
## 
## Pearson residuals:
##                    Min      1Q  Median       3Q    Max
## logit(P[Y>=2]) -10.358 -0.5532 -0.1598  0.53470  8.177
## logit(P[Y>=3])  -8.507 -0.3466 -0.1665 -0.07256 19.839
## 
## Coefficients: 
##                        Estimate Std. Error  z value Pr(>|z|)    
## (Intercept):1          -0.59831    0.01453  -41.189  < 2e-16 ***
## (Intercept):2          -2.52039    0.02363 -106.640  < 2e-16 ***
## race_ethhispanic:1      0.54140    0.01246   43.453  < 2e-16 ***
## race_ethhispanic:2      0.46721    0.01559   29.969  < 2e-16 ***
## race_ethnh black:1      0.43848    0.01301   33.712  < 2e-16 ***
## race_ethnh black:2      0.30590    0.01689   18.112  < 2e-16 ***
## race_ethnh multirace:1  0.40835    0.03707   11.017  < 2e-16 ***
## race_ethnh multirace:2  0.61056    0.04643   13.151  < 2e-16 ***
## race_ethnh other:1      0.38058    0.01627   23.387  < 2e-16 ***
## race_ethnh other:2      0.10654    0.02510    4.245 2.19e-05 ***
## educ0Prim:1             0.89652    0.02591   34.602  < 2e-16 ***
## educ0Prim:2             0.99948    0.02307   43.318  < 2e-16 ***
## educ1somehs:1           0.61562    0.01816   33.898  < 2e-16 ***
## educ1somehs:2           0.70911    0.01875   37.820  < 2e-16 ***
## educ3somecol:1         -0.29973    0.01109  -27.036  < 2e-16 ***
## educ3somecol:2         -0.30658    0.01464  -20.941  < 2e-16 ***
## educ4colgrad:1         -0.93844    0.01171  -80.111  < 2e-16 ***
## educ4colgrad:2         -1.13193    0.01788  -63.323  < 2e-16 ***
## agec(24,39]:1           0.34321    0.01497   22.930  < 2e-16 ***
## agec(24,39]:2           0.45835    0.02493   18.384  < 2e-16 ***
## agec(39,59]:1           0.68875    0.01447   47.588  < 2e-16 ***
## agec(39,59]:2           1.05354    0.02340   45.016  < 2e-16 ***
## agec(59,79]:1           1.00645    0.01546   65.088  < 2e-16 ***
## agec(59,79]:2           1.45215    0.02396   60.611  < 2e-16 ***
## agec(79,99]:1           1.23550    0.02513   49.155  < 2e-16 ***
## agec(79,99]:2           1.64567    0.03185   51.676  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Number of linear predictors:  2 
## 
## Names of linear predictors: logit(P[Y>=2]), logit(P[Y>=3])
## 
## Residual deviance: 451015.4 on 485730 degrees of freedom
## 
## Log-likelihood: -225507.7 on 485730 degrees of freedom
## 
## Number of iterations: 5 
## 
## No Hauck-Donner effect found in any of the estimates
## 
## Exponentiated coefficients:
##     race_ethhispanic:1     race_ethhispanic:2     race_ethnh black:1 
##              1.7184147              1.5955431              1.5503519 
##     race_ethnh black:2 race_ethnh multirace:1 race_ethnh multirace:2 
##              1.3578505              1.5043379              1.8414533 
##     race_ethnh other:1     race_ethnh other:2            educ0Prim:1 
##              1.4631332              1.1124241              2.4510690 
##            educ0Prim:2          educ1somehs:1          educ1somehs:2 
##              2.7168723              1.8508036              2.0321802 
##         educ3somecol:1         educ3somecol:2         educ4colgrad:1 
##              0.7410215              0.7359574              0.3912369 
##         educ4colgrad:2          agec(24,39]:1          agec(24,39]:2 
##              0.3224093              1.4094587              1.5814612 
##          agec(39,59]:1          agec(39,59]:2          agec(59,79]:1 
##              1.9912267              2.8677712              2.7358603 
##          agec(59,79]:2          agec(79,99]:1          agec(79,99]:2 
##              4.2722948              3.4400973              5.1844637

AIC(fit.vgam)

## [1] 452317.9

AIC(fit.vgam2)

## [1] 451067.4

Based on the AIC, the nonproportioal odds model fits better than the proportional odds model.

There are other types of models for ordinal data. Another is the Adjacent categories model, which models:

\[\text {log} \frac{\pi_{ij}}{\pi_{i{(j-1)}}} = \beta_{ij} , i = 1 , \cdots, I; \text{ } j= 2, \cdots, J\]

#Adjacent categories
fit.acat<-vglm(as.ordered(generalhealth)~race_eth+educ+agec,brfss16m, weights = mmsawt/mean(mmsawt, na.rm=T), family=acat(parallel = F, reverse = T))
summary(fit.acat)

## 
## Call:
## vglm(formula = as.ordered(generalhealth) ~ race_eth + educ + 
##     agec, family = acat(parallel = F, reverse = T), data = brfss16m, 
##     weights = mmsawt/mean(mmsawt, na.rm = T))
## 
## 
## Pearson residuals:
##                         Min      1Q Median     3Q    Max
## loge(P[Y=1]/P[Y=2])  -7.684 -0.5706 0.1626 0.5636 10.543
## loge(P[Y=2]/P[Y=3]) -19.450  0.0422 0.1169 0.3534  7.733
## 
## Coefficients: 
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept):1           0.82801    0.01572  52.656  < 2e-16 ***
## (Intercept):2           1.37123    0.02581  53.137  < 2e-16 ***
## race_ethhispanic:1     -0.47882    0.01372 -34.897  < 2e-16 ***
## race_ethhispanic:2     -0.21507    0.01727 -12.454  < 2e-16 ***
## race_ethnh black:1     -0.41737    0.01426 -29.271  < 2e-16 ***
## race_ethnh black:2     -0.07686    0.01864  -4.123 3.74e-05 ***
## race_ethnh multirace:1 -0.25863    0.04202  -6.155 7.51e-10 ***
## race_ethnh multirace:2 -0.47722    0.05280  -9.038  < 2e-16 ***
## race_ethnh other:1     -0.40286    0.01747 -23.062  < 2e-16 ***
## race_ethnh other:2      0.14551    0.02737   5.316 1.06e-07 ***
## educ0Prim:1            -0.52185    0.02910 -17.933  < 2e-16 ***
## educ0Prim:2            -0.78445    0.02578 -30.433  < 2e-16 ***
## educ1somehs:1          -0.41110    0.02034 -20.215  < 2e-16 ***
## educ1somehs:2          -0.52389    0.02091 -25.051  < 2e-16 ***
## educ3somecol:1          0.24390    0.01228  19.859  < 2e-16 ***
## educ3somecol:2          0.17060    0.01631  10.463  < 2e-16 ***
## educ4colgrad:1          0.75251    0.01288  58.424  < 2e-16 ***
## educ4colgrad:2          0.66517    0.01971  33.749  < 2e-16 ***
## agec(24,39]:1          -0.26653    0.01608 -16.578  < 2e-16 ***
## agec(24,39]:2          -0.31121    0.02710 -11.485  < 2e-16 ***
## agec(39,59]:1          -0.46549    0.01567 -29.702  < 2e-16 ***
## agec(39,59]:2          -0.80641    0.02553 -31.586  < 2e-16 ***
## agec(59,79]:1          -0.68517    0.01688 -40.597  < 2e-16 ***
## agec(59,79]:2          -1.08443    0.02622 -41.358  < 2e-16 ***
## agec(79,99]:1          -0.89401    0.02794 -31.992  < 2e-16 ***
## agec(79,99]:2          -1.14086    0.03525 -32.363  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Number of linear predictors:  2 
## 
## Names of linear predictors: loge(P[Y=1]/P[Y=2]), loge(P[Y=2]/P[Y=3])
## 
## Residual deviance: 450954 on 485730 degrees of freedom
## 
## Log-likelihood: -225477 on 485730 degrees of freedom
## 
## Number of iterations: 5

Multinomial Model

We also commonly see a resonse variable with unordered categories for a response. Such as:

A person is deciding whether to migrate to a set of possible new labor markets
A person is deciding which of a set of possible child care arrangement to use
A person is deciding which of a possible set of means by which to get to work
The type of contraception a woman chooses

These are example of what generally is called an âalternative decision makingâ process.

A distribution commonly used for this type of outcome is the multivariate extension of the binomial distribution, known as the multinomial distribution.

If you have an outcome with J unordered classes, then \(\pi_1\), \(\pi_2\), …, \(\pi_J\) are the probabilities of observing the J classes. Remember, \(\sum_J \pi_j = 1\).

If we observe \(y_1\) outcomes in the first category and \(y_2\) outcomes in the second and so on, the let:

\[\mathbf{y}=\begin{bmatrix} y_1\\ y_2\\ \vdots \\ y_4 \end{bmatrix}, with \sum_{j=1}^Jy_i = n\]

In this case y follows the multinomial distribution:

\[f(y|n)=\frac{n!}{y_1!y_2!\cdots y_j!} \pi_1^{y1}\pi_2^{y2}\cdots\pi_J^{yJ}\]

When constructing a regression model for such a distrubution, we choose one level of the outcome as the reference level and compare the probabilty of chooseing other options to it. For example, if we modeled the probability of choosing option j to option 1 (as the reference level), the we would have the logistic regression model:

\[logit(\pi_{j}) =log \left ( \frac{\pi_j}{\pi_1} \right)=x'_j\beta_j\]

This makes J-1 equations, again with the reference category estimated by the complementary rule. We then have a total of J-1 regression equations to interpret, where we compare the odds of being in each of the other j-1 categories to the reference category. If we want to estimate the probability of being in each particular class of the outcome, for a response with a given set of x values, we can solve the equation for \(\pi_j\) as:

\[\pi_j = \frac{exp(x'_j\beta_j)}{1+\sum_{j=2}^J exp(x'_j\beta_j)}\]

This is a much more complicated model than the proportional odds model, and we will have many more effects to interpret, owing to the J-1 separate equations we are estimating.

Unfortunately, in R there is no multinomial survey design model, so we have to use VGAM again and feed the function our normalized weights.

Now I fit the model, this is screwy looking because I have to use the multinom() function from the nnet library to fit the model and since I want to extract the equations from the model I have to use the coef() function.

mfit<-vglm(generalhealth~race_eth+educ+agec,multinomial(refLevel = 1),brfss16m, weights=mmsawt/mean(mmsawt, na.rm=T))

summary(mfit)

## 
## Call:
## vglm(formula = generalhealth ~ race_eth + educ + agec, family = multinomial(refLevel = 1), 
##     data = brfss16m, weights = mmsawt/mean(mmsawt, na.rm = T))
## 
## 
## Pearson residuals:
##                       Min      1Q  Median       3Q    Max
## log(mu[,2]/mu[,1]) -7.150 -0.5176 -0.2604  0.52666  8.274
## log(mu[,3]/mu[,1]) -8.438 -0.3476 -0.1723 -0.06237 19.844
## 
## Coefficients: 
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept):1          -0.82801    0.01572 -52.656  < 2e-16 ***
## (Intercept):2          -2.19924    0.02469 -89.086  < 2e-16 ***
## race_ethhispanic:1      0.47882    0.01372  34.897  < 2e-16 ***
## race_ethhispanic:2      0.69390    0.01711  40.552  < 2e-16 ***
## race_ethnh black:1      0.41737    0.01426  29.271  < 2e-16 ***
## race_ethnh black:2      0.49423    0.01836  26.921  < 2e-16 ***
## race_ethnh multirace:1  0.25863    0.04202   6.155 7.51e-10 ***
## race_ethnh multirace:2  0.73585    0.05021  14.657  < 2e-16 ***
## race_ethnh other:1      0.40286    0.01747  23.062  < 2e-16 ***
## race_ethnh other:2      0.25735    0.02663   9.665  < 2e-16 ***
## educ0Prim:1             0.52185    0.02910  17.933  < 2e-16 ***
## educ0Prim:2             1.30630    0.02924  44.668  < 2e-16 ***
## educ1somehs:1           0.41110    0.02034  20.215  < 2e-16 ***
## educ1somehs:2           0.93499    0.02194  42.610  < 2e-16 ***
## educ3somecol:1         -0.24390    0.01228 -19.859  < 2e-16 ***
## educ3somecol:2         -0.41450    0.01568 -26.431  < 2e-16 ***
## educ4colgrad:1         -0.75251    0.01288 -58.424  < 2e-16 ***
## educ4colgrad:2         -1.41768    0.01868 -75.877  < 2e-16 ***
## agec(24,39]:1           0.26653    0.01608  16.578  < 2e-16 ***
## agec(24,39]:2           0.57774    0.02596  22.254  < 2e-16 ***
## agec(39,59]:1           0.46549    0.01567  29.702  < 2e-16 ***
## agec(39,59]:2           1.27189    0.02447  51.982  < 2e-16 ***
## agec(59,79]:1           0.68517    0.01688  40.597  < 2e-16 ***
## agec(59,79]:2           1.76960    0.02525  70.080  < 2e-16 ***
## agec(79,99]:1           0.89401    0.02794  31.992  < 2e-16 ***
## agec(79,99]:2           2.03487    0.03504  58.066  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Number of linear predictors:  2 
## 
## Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
## 
## Residual deviance: 450954 on 485730 degrees of freedom
## 
## Log-likelihood: -225477 on 485730 degrees of freedom
## 
## Number of iterations: 5 
## 
## Reference group is level  1  of the response

Calculate the odds ratios and confidence intervals

round(exp(coef(mfit)), 3)

##          (Intercept):1          (Intercept):2     race_ethhispanic:1 
##                  0.437                  0.111                  1.614 
##     race_ethhispanic:2     race_ethnh black:1     race_ethnh black:2 
##                  2.001                  1.518                  1.639 
## race_ethnh multirace:1 race_ethnh multirace:2     race_ethnh other:1 
##                  1.295                  2.087                  1.496 
##     race_ethnh other:2            educ0Prim:1            educ0Prim:2 
##                  1.293                  1.685                  3.692 
##          educ1somehs:1          educ1somehs:2         educ3somecol:1 
##                  1.508                  2.547                  0.784 
##         educ3somecol:2         educ4colgrad:1         educ4colgrad:2 
##                  0.661                  0.471                  0.242 
##          agec(24,39]:1          agec(24,39]:2          agec(39,59]:1 
##                  1.305                  1.782                  1.593 
##          agec(39,59]:2          agec(59,79]:1          agec(59,79]:2 
##                  3.568                  1.984                  5.869 
##          agec(79,99]:1          agec(79,99]:2 
##                  2.445                  7.651

round(exp(confint(mfit)), 3)

##                        2.5 % 97.5 %
## (Intercept):1          0.424  0.451
## (Intercept):2          0.106  0.116
## race_ethhispanic:1     1.571  1.658
## race_ethhispanic:2     1.935  2.070
## race_ethnh black:1     1.476  1.561
## race_ethnh black:2     1.581  1.699
## race_ethnh multirace:1 1.193  1.406
## race_ethnh multirace:2 1.892  2.303
## race_ethnh other:1     1.446  1.548
## race_ethnh other:2     1.228  1.363
## educ0Prim:1            1.592  1.784
## educ0Prim:2            3.487  3.910
## educ1somehs:1          1.450  1.570
## educ1somehs:2          2.440  2.659
## educ3somecol:1         0.765  0.803
## educ3somecol:2         0.641  0.681
## educ4colgrad:1         0.459  0.483
## educ4colgrad:2         0.234  0.251
## agec(24,39]:1          1.265  1.347
## agec(24,39]:2          1.694  1.875
## agec(39,59]:1          1.545  1.642
## agec(39,59]:2          3.401  3.743
## agec(59,79]:1          1.920  2.051
## agec(59,79]:2          5.585  6.166
## agec(79,99]:1          2.315  2.583
## agec(79,99]:2          7.143  8.195

So in these results, each row consists of an equation for a particular set of outcomes. For example, let’s interpret the first row, which corresponds to the model \(Pr(\text{y='good health'})/Pr(\text{y='excellent/vg health'})\) which corresponts to the coefficent for race_ethhispanic:1

In this case, hispanics are more likely to report good instead of excellent/vg health, compared to NH Whites. This is also the case for NH blacks,others and multirace respondents. In terms of education, those with primary school and some highschool are more likely to report good, compared to excellent/vg health, compared to those with high school, but those with college educatoin are less likely to report good vs excellent/vg health, compared to those with high school.

Let’s consider the second transition now. This corresponds to the model \(Pr(\text{y='poor/fair health'})/Pr(\text{y='excellent/vg health'})\) . In this case hispanics are more likely to report poor instead of excellent health, compared to NH Whites. This is also the case for NH blacks, Others and multirace respondents. In terms of education, those with less education(primary or some HS) are more likely to report poor, compared to excellent health, compared to those with high school education, while those with more education (some college or college grad) are less likely to report poor, compared to excellent health, compared to those with high school education.

Get Fitted Probabilities

#get a series of predicted probabilites for different "types" of people for each model
dat<-expand.grid(race_eth=levels(brfss16m$race_eth), educ=levels(brfss16m$educ), agec=levels(brfss16m$agec))

#generate the fitted values
#Unfortunately, the survey proportional odds model won't generate fitted values
#but here I use a weighted multinomial model, which fits the data better anyway
fitm<-predict(mfit, newdat=dat,type="response")
#add the values to the fake data
dat<-cbind(dat, round(fitm,3))

#Print the fitted probabilities
head(dat, n=20)

##        race_eth     educ   agec     1     2     3
## 1       nhwhite  2hsgrad (0,24] 0.646 0.282 0.072
## 2      hispanic  2hsgrad (0,24] 0.519 0.366 0.115
## 3      nh black  2hsgrad (0,24] 0.542 0.359 0.099
## 4  nh multirace  2hsgrad (0,24] 0.556 0.315 0.129
## 5      nh other  2hsgrad (0,24] 0.556 0.364 0.080
## 6       nhwhite    0Prim (0,24] 0.466 0.343 0.191
## 7      hispanic    0Prim (0,24] 0.332 0.395 0.272
## 8      nh black    0Prim (0,24] 0.359 0.401 0.241
## 9  nh multirace    0Prim (0,24] 0.356 0.340 0.304
## 10     nh other    0Prim (0,24] 0.380 0.419 0.201
## 11      nhwhite  1somehs (0,24] 0.515 0.339 0.145
## 12     hispanic  1somehs (0,24] 0.380 0.405 0.215
## 13     nh black  1somehs (0,24] 0.406 0.406 0.188
## 14 nh multirace  1somehs (0,24] 0.409 0.349 0.241
## 15     nh other  1somehs (0,24] 0.425 0.419 0.155
## 16      nhwhite 3somecol (0,24] 0.706 0.242 0.052
## 17     hispanic 3somecol (0,24] 0.588 0.325 0.086
## 18     nh black 3somecol (0,24] 0.610 0.317 0.073
## 19 nh multirace 3somecol (0,24] 0.626 0.278 0.096
## 20     nh other 3somecol (0,24] 0.622 0.319 0.059

This will use polr with survey weights to get fitted probabilities from the proportional odds model. The fitted values are right, but the standard errors won’t be.

fitted.ord<-round(predict(fit.vgam, newdat=dat[,1:3], type="response"), 3)
dat<-cbind(dat, fitted.ord)
names(dat)<-c(names(dat)[1:3], "mp1", "mp2", "mp3", "op1", "op2", "op3")
head(dat, n=20)

##        race_eth     educ   agec   mp1   mp2   mp3   op1   op2   op3
## 1       nhwhite  2hsgrad (0,24] 0.646 0.282 0.072 0.651 0.259 0.089
## 2      hispanic  2hsgrad (0,24] 0.519 0.366 0.115 0.526 0.332 0.142
## 3      nh black  2hsgrad (0,24] 0.542 0.359 0.099 0.554 0.317 0.128
## 4  nh multirace  2hsgrad (0,24] 0.556 0.315 0.129 0.537 0.326 0.136
## 5      nh other  2hsgrad (0,24] 0.556 0.364 0.080 0.573 0.307 0.120
## 6       nhwhite    0Prim (0,24] 0.466 0.343 0.191 0.413 0.380 0.207
## 7      hispanic    0Prim (0,24] 0.332 0.395 0.272 0.294 0.400 0.305
## 8      nh black    0Prim (0,24] 0.359 0.401 0.241 0.319 0.400 0.281
## 9  nh multirace    0Prim (0,24] 0.356 0.340 0.304 0.304 0.400 0.296
## 10     nh other    0Prim (0,24] 0.380 0.419 0.201 0.336 0.398 0.266
## 11      nhwhite  1somehs (0,24] 0.515 0.339 0.145 0.489 0.350 0.161
## 12     hispanic  1somehs (0,24] 0.380 0.405 0.215 0.362 0.394 0.244
## 13     nh black  1somehs (0,24] 0.406 0.406 0.188 0.389 0.387 0.223
## 14 nh multirace  1somehs (0,24] 0.409 0.349 0.241 0.373 0.391 0.236
## 15     nh other  1somehs (0,24] 0.425 0.419 0.155 0.408 0.382 0.210
## 16      nhwhite 3somecol (0,24] 0.706 0.242 0.052 0.717 0.215 0.067
## 17     hispanic 3somecol (0,24] 0.588 0.325 0.086 0.601 0.291 0.109
## 18     nh black 3somecol (0,24] 0.610 0.317 0.073 0.628 0.274 0.098
## 19 nh multirace 3somecol (0,24] 0.626 0.278 0.096 0.612 0.284 0.104
## 20     nh other 3somecol (0,24] 0.622 0.319 0.059 0.645 0.263 0.091

Let’s look at the relative model fits:

AIC(fit.vgam) #proportional odds

## [1] 452317.9

AIC(fit.vgam2) #cumulative logit, non proportional

## [1] 451067.4

AIC(mfit) #multinomial

## [1] 451006

Looks like the multinomial is the best fitting model, but it is very close to the nonproportional odds model using the cumlative logit.