Original source from https://stats.idre.ucla.edu/r/dae/multinomial-logistic-regression/
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
require(foreign)
## Loading required package: foreign
require(nnet)
## Loading required package: nnet
require(ggplot2)
## Loading required package: ggplot2
require(reshape2)
## Loading required package: reshape2
The data set contains variables on 200 students. The outcome variable is prog, program type. The predictor variables are social economic status, ses, a three-level categorical variable and writing score, write, a continuous variable. Let’s start with getting some descriptive statistics of the variables of interest.
ml <- read.dta("https://stats.idre.ucla.edu/stat/data/hsbdemo.dta")
head(ml)
## id female ses schtyp prog read write math science socst
## 1 45 female low public vocation 34 35 41 29 26
## 2 108 male middle public general 34 33 41 36 36
## 3 15 male high public vocation 39 39 44 26 42
## 4 67 male low public vocation 37 37 42 33 32
## 5 153 male middle public vocation 39 31 40 39 51
## 6 51 female high public general 42 36 42 31 39
## honors awards cid
## 1 not enrolled 0 1
## 2 not enrolled 0 1
## 3 not enrolled 0 1
## 4 not enrolled 0 1
## 5 not enrolled 0 1
## 6 not enrolled 0 1
with(ml, table(ses, prog))
## prog
## ses general academic vocation
## low 16 19 12
## middle 20 44 31
## high 9 42 7
with(ml, do.call(rbind, tapply(write, prog, function(x) c(M = mean(x), SD = sd(x)))))
## M SD
## general 51.33333 9.397775
## academic 56.25714 7.943343
## vocation 46.76000 9.318754
ml$prog2 <- relevel(ml$prog, ref = "academic")
head(with(ml,cbind(prog2,ses,write)))
## prog2 ses write
## [1,] 3 1 35
## [2,] 2 2 33
## [3,] 3 3 39
## [4,] 3 1 37
## [5,] 3 2 31
## [6,] 2 3 36
test <- multinom(prog2 ~ ses + write, data = ml)
## # weights: 15 (8 variable)
## initial value 219.722458
## iter 10 value 179.982880
## final value 179.981726
## converged
‘summary’ can show the information of the fitted model.
summary(test)
## Call:
## multinom(formula = prog2 ~ ses + write, data = ml)
##
## Coefficients:
## (Intercept) sesmiddle seshigh write
## general 2.852198 -0.5332810 -1.1628226 -0.0579287
## vocation 5.218260 0.2913859 -0.9826649 -0.1136037
##
## Std. Errors:
## (Intercept) sesmiddle seshigh write
## general 1.166441 0.4437323 0.5142196 0.02141097
## vocation 1.163552 0.4763739 0.5955665 0.02221996
##
## Residual Deviance: 359.9635
## AIC: 375.9635
‘coef’ can extract the coefficients of the fitted model.
coef(test)
## (Intercept) sesmiddle seshigh write
## general 2.852198 -0.5332810 -1.1628226 -0.0579287
## vocation 5.218260 0.2913859 -0.9826649 -0.1136037
The coefficients are corresponded to the following equation.
However, the coefficients can’t help to compute the real probabilities.
We can resort to the model matrix
M=model.matrix(prog2 ~ ses + write, data = ml)
head(M)
## (Intercept) sesmiddle seshigh write
## 1 1 0 0 35
## 2 1 1 0 33
## 3 1 0 1 39
## 4 1 0 0 37
## 5 1 1 0 31
## 6 1 0 1 36
M.coef=M%*%t(coef(test))
head(M.coef)
## general vocation
## 1 0.8246935 1.2421291
## 2 0.4072699 1.7607224
## 3 -0.5698439 -0.1949507
## 4 0.7088361 1.0149216
## 5 0.5231273 1.9879299
## 6 -0.3960578 0.1458605
Take exponential to get the odds between general, vocation, and academic.
ExpM.coef=exp(M.coef)
head(ExpM.coef)
## general vocation
## 1 2.2811814 3.4629786
## 2 1.5027096 5.8166379
## 3 0.5656137 0.8228752
## 4 2.0316252 2.7591471
## 5 1.6872960 7.3004054
## 6 0.6729678 1.1570347
Since the baseline category is ‘academic’, we can add the first column to be the ‘academic’ with value 1.
pred.odd=data.frame(academic=1,ExpM.coef)
head(pred.odd)
## academic general vocation
## 1 1 2.2811814 3.4629786
## 2 1 1.5027096 5.8166379
## 3 1 0.5656137 0.8228752
## 4 1 2.0316252 2.7591471
## 5 1 1.6872960 7.3004054
## 6 1 0.6729678 1.1570347
The probability can be obtained by
probmulti=pred.odd/apply(pred.odd,1,sum)
head(probmulti)
## academic general vocation
## 1 0.1482764 0.3382454 0.5134781
## 2 0.1202017 0.1806283 0.6991700
## 3 0.4186747 0.2368082 0.3445171
## 4 0.1726885 0.3508384 0.4764731
## 5 0.1001231 0.1689374 0.7309395
## 6 0.3533566 0.2377976 0.4088458
fitted(test)%>%head
## academic general vocation
## 1 0.1482764 0.3382454 0.5134781
## 2 0.1202017 0.1806283 0.6991700
## 3 0.4186747 0.2368082 0.3445171
## 4 0.1726885 0.3508384 0.4764731
## 5 0.1001231 0.1689374 0.7309395
## 6 0.3533566 0.2377976 0.4088458