Multinomial Logistic Regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes.
Multinomial logistic regression is known by a variety of other names, including polytomous LR, multiclass LR, softmax regression, multinomial logit (mod), the maximum entropy (MaxEnt) classifier, and the conditional maximum entropy model.
In multinomial logistic regression the objective is to model the probability \(\pi_j\) to belong to a specific group \(j=1, 2, \ldots, g\) using covariates \(\boldsymbol{x}\).
Usually we define a reference group, suppose that we choose the group number 1, in this case, we model the logit of \(\pi_j\) as:
\[ \text{logit}(\pi_j) = \log \left( \frac{\pi_j}{\pi_1} \right) = \boldsymbol{x}^\top \boldsymbol{\beta}_j, \quad \text{for} \, j=2, \ldots, g \]
After fitting the model we will obtain \(g-1\) estimated vectors \(\hat{\boldsymbol{\beta}}_j\), not \(g\) vector. Using thesde vectors we can express the estimated probability \(\hat{\pi}_j\) as:
\[ \hat{\pi}_1 = \frac{1}{1+\sum_{j=2}^{j=g} \exp(\boldsymbol{x}^\top \hat{\boldsymbol{\beta}}_j)}, \, \text{only for} \, j=1 \]
\[ \hat{\pi}_j = \frac{\exp(\boldsymbol{x}^\top \hat{\boldsymbol{\beta}}_j)}{1+\sum_{j=2}^{j=g} \exp(\boldsymbol{x}^\top \hat{\boldsymbol{\beta}}_j)}, \, \text{for} \, j=2,\ldots,g \]
Here we are goingo to use the titanic_train and titanic_test datasets from titanic package.
library(titanic)
d1 <- titanic_train
d1 <- subset(d1, d1$Embarked != "") # delete rows where Embarked didn't have any value
We are going to use Pclass as my dependent variable.
tabla <- prop.table(table(d1$Pclass))
barplot(tabla, las=1, col="white")
The objective is to model
\[P(Pclass) = f(Sex, \, Embarked, \, SibSp, \, Parch)\]
where
Some part of the data is given below:
d1[1:10, c("Pclass", "Sex", "Embarked", "SibSp", "Parch")]
## Pclass Sex Embarked SibSp Parch
## 1 3 male S 1 0
## 2 1 female C 1 0
## 3 3 female S 0 0
## 4 1 female S 1 0
## 5 3 male S 0 0
## 6 3 male Q 0 0
## 7 1 male S 0 0
## 8 3 male S 3 1
## 9 3 female S 0 2
## 10 2 female C 1 0
To explore the data structure we can use:
str(d1[, c("Pclass", "Sex", "Embarked", "SibSp", "Parch")])
## 'data.frame': 889 obs. of 5 variables:
## $ Pclass : int 3 1 3 1 3 3 1 3 3 2 ...
## $ Sex : chr "male" "female" "female" "female" ...
## $ Embarked: chr "S" "C" "S" "S" ...
## $ SibSp : int 1 1 0 1 0 0 0 3 0 1 ...
## $ Parch : int 0 0 0 0 0 0 0 1 2 0 ...
table(d1$Pclass)
##
## 1 2 3
## 214 184 491
table(d1$Sex)
##
## female male
## 312 577
table(d1$Embarked)
##
## C Q S
## 168 77 644
table(d1$Sex)
##
## female male
## 312 577
Some variables must be convert properly.
d1$Pclass <- factor(d1$Pclass, levels=c("1", "2", "3"))
d1$Sex <- factor(d1$Sex, levels=c("female", "male"))
d1$Embarked <- factor(d1$Embarked)
There are many R packages to fit this model but here we are going to use nnet package.
library(nnet)
formula <- "Pclass ~ Sex + Embarked + SibSp + Parch"
mod <- multinom(formula, data=d1)
## # weights: 21 (12 variable)
## initial value 976.666325
## iter 10 value 812.978209
## final value 806.250967
## converged
The summary can be found using:
output <- summary(mod)
print(output)
## Call:
## multinom(formula = formula, data = d1)
##
## Coefficients:
## (Intercept) Sexmale EmbarkedQ EmbarkedS SibSp Parch
## 2 -1.5642365 -0.0698696 2.004449 1.876297 -0.07211907 0.04548861
## 3 -0.8142428 0.7559771 3.978519 1.164160 0.19712980 0.11427709
##
## Std. Errors:
## (Intercept) Sexmale EmbarkedQ EmbarkedS SibSp Parch
## 2 0.3006577 0.2238023 0.9523768 0.2922098 0.1343888 0.1464731
## 3 0.2159193 0.1920332 0.7399698 0.1972837 0.1058155 0.1258582
##
## Residual Deviance: 1612.502
## AIC: 1636.502
From the last summary we can write the expression as:
\[\begin{aligned} \log \bigg(\frac{P(Pclass = 2)}{P(Pclass = 1)}\bigg) = &- 1.5642365 - 0.0698696\cdot Sexmale + 2.004449\cdot EmbarkedQ \\&+ 1.876297\cdot EmbarkedS - 0.07211907\cdot SibSp + 0.04548861\cdot Parch \end{aligned}\] \[\begin{aligned} \log \bigg(\frac{P(Pclass = 3)}{P(Pclass = 1)}\bigg) = &- 0.8142428 + 0.7559771\cdot Sexmale + 3.978519\cdot EmbarkedQ \\&+ 1.164160\cdot EmbarkedS + 0.19712980\cdot SibSp + 0.11427709\cdot Parch \end{aligned}\]We can obtain the estimated probability for each class as:
p_hat <- fitted(mod)
head(round(p_hat, digits=2))
## 1 2 3
## 1 0.17 0.20 0.63
## 2 0.58 0.11 0.31
## 3 0.26 0.36 0.37
## 4 0.25 0.32 0.43
## 5 0.19 0.24 0.57
## 6 0.02 0.03 0.95
What would be the Pclass for a female, embarked at Q with 2 iblings and 2 parents.
newdata <- data.frame(Sex="female", Embarked="Q", SibSp=2, Parch=2)
predict(object=mod, newdata=newdata, type="probs")
## 1 2 3
## 0.02145909 0.03159820 0.94694271
predict(object=mod, newdata=newdata, type="class")
## [1] 3
## Levels: 1 2 3
The confusion matrix can be obtained as:
confusion <- table(Predicted=predict(object=mod, type="class"),
True=d1$Pclass)
confusion
## True
## Predicted 1 2 3
## 1 66 11 56
## 2 0 0 0
## 3 148 173 435
Upps, something is wrong!!!
To do the homework you have a limited time, press the button below to start.
30:00