\[D^\star(\boldsymbol{y};\hat{\boldsymbol{\mu}}) = 2 (L(\boldsymbol{y};\boldsymbol{y})-L(\hat{\boldsymbol{\mu}};\boldsymbol{y}))\]
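In generalized linear models (GLMs), two deviances are defined in terms of three models: the null model (intercept only), the proposed model (the model of interest), and the saturated model (one parameter per observation).
The next figure shows the two types of deviance for a GLM between the three models defined above.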
The Null deviance and the residual deviance are defined as:
\[\text{Null Deviance} = 2\,(ll(\text{Saturated Model}) - ll(\text{Null Model}))\] with \(df = df_{Sat} - df_{Null}\).
\[\text{Residual Deviance} = 2\,(ll(\text{Saturated Model}) - ll(\text{Proposed Model}))\] with \(df = df_{Sat} - df_{Res}\).
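Both deviances are measured against the same saturated model, so subtracting one from the other cancels the saturated term. As a short worked step, using the same symbols as the two definitions:
\[\text{Null Deviance} - \text{Residual Deviance} = 2\,(ll(\text{Proposed Model}) - ll(\text{Null Model}))\] with \(df = (df_{Sat} - df_{Null}) - (df_{Sat} - df_{Res}) = df_{Res} - df_{Null}\).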
Using a toy dataset, we can fit a model to study how body weight relates to a cat's gender (y=0 for female and y=1 for male). The dataset is as follows:
y <- c(1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0,
1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0)
weight <- c(2.1, 2.5, 1.2, 1, 3, 2.1, 1.5, 2.2, 1.9, 2.7, 1.1, 2.9, 1.2, 2.1,
2.2, 2.5, 1.9, 1.2, 2, 2.9, 2.2, 1.5, 3, 2.4, 1.2, 1.6, 2.3, 2.1,
2.6, 2.4, 2.5, 2, 1, 1.4, 2.9, 1.5, 3, 2.9, 2.9, 2.1, 2.8, 2.7, 1,
2.9, 1.1, 2.2, 1.3, 1.7, 1.5, 1.7)
To explore the data, we can use the following code.
plot(x=weight, y, yaxt='n', pch=20)
axis(side=2, at=0:1, labels=0:1, las=1)
To fit the saturated and null models, we can use the following code.
mod_sat <- glm(y ~ as.factor(1:length(y)), family=binomial)
mod_nul <- glm(y ~ 1, family=binomial)
Now we are going to fit the proposed model.
mod <- glm(y ~ weight, family=binomial)
To obtain the deviances manually, we can apply the expressions given above. In R, the log-likelihood of each fitted model is available through logLik():
2*(logLik(mod_sat) - logLik(mod)) # Residual deviance
## 'log Lik.' 35.48677 (df=50)
2*(logLik(mod_sat) - logLik(mod_nul)) # Null deviance
## 'log Lik.' 67.30117 (df=50)
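The same residual deviance can also be recovered directly from the fitted probabilities. The following is a minimal sketch, assuming ungrouped 0/1 responses as in this toy dataset: in that case the saturated model reproduces every observation exactly, so its log-likelihood is 0 and the residual deviance reduces to minus twice the Bernoulli log-likelihood of the proposed model. The names `p_hat`, `ll_mod` and `dev_manual` are illustrative and not part of the original example.

```r
# Toy data copied from the example above.
y <- c(1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
       0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0,
       1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0)
weight <- c(2.1, 2.5, 1.2, 1, 3, 2.1, 1.5, 2.2, 1.9, 2.7, 1.1, 2.9, 1.2, 2.1,
            2.2, 2.5, 1.9, 1.2, 2, 2.9, 2.2, 1.5, 3, 2.4, 1.2, 1.6, 2.3, 2.1,
            2.6, 2.4, 2.5, 2, 1, 1.4, 2.9, 1.5, 3, 2.9, 2.9, 2.1, 2.8, 2.7, 1,
            2.9, 1.1, 2.2, 1.3, 1.7, 1.5, 1.7)

mod <- glm(y ~ weight, family = binomial)

# Fitted probabilities P(y = 1) under the proposed model.
p_hat <- fitted(mod)

# Bernoulli log-likelihood of the proposed model, written out by hand.
ll_mod <- sum(y * log(p_hat) + (1 - y) * log(1 - p_hat))

# Assumption: with 0/1 responses the saturated log-likelihood is 0,
# so the residual deviance is simply -2 * ll_mod.
dev_manual <- -2 * ll_mod

# Both comparisons should return TRUE.
all.equal(ll_mod, as.numeric(logLik(mod)))
all.equal(dev_manual, deviance(mod))
```

This shortcut applies only to binary data of this kind; for grouped binomial or Poisson responses the saturated log-likelihood is generally not zero and has to be kept in the formula.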
Another way to extract the deviances is to use summary(), as in the following code.
summary(mod)
##
## Call:
## glm(formula = y ~ weight, family = binomial)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6090 -0.4077 0.1892 0.4775 2.2489
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.0053 1.9512 -3.590 0.000330 ***
## weight 3.7997 0.9988 3.804 0.000142 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 67.301 on 49 degrees of freedom
## Residual deviance: 35.487 on 48 degrees of freedom
## AIC: 39.487
##
## Number of Fisher Scoring iterations: 5
The two deviances for the model can be obtained as:
summary(mod)$null.deviance
## [1] 67.30117
summary(mod)$deviance
## [1] 35.48677
The last way to obtain the residual deviance is the deviance() function:
deviance(mod)
## [1] 35.48677
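The deviances and their degrees of freedom are also stored as components of the fitted glm object. The sketch below reads them directly and checks that the drop from the null to the residual deviance equals twice the difference in log-likelihood between the proposed and null models, since the saturated log-likelihood cancels in the difference. The name `drop_in_dev` is illustrative and not part of the original example.

```r
# Toy data copied from the example above.
y <- c(1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
       0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0,
       1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0)
weight <- c(2.1, 2.5, 1.2, 1, 3, 2.1, 1.5, 2.2, 1.9, 2.7, 1.1, 2.9, 1.2, 2.1,
            2.2, 2.5, 1.9, 1.2, 2, 2.9, 2.2, 1.5, 3, 2.4, 1.2, 1.6, 2.3, 2.1,
            2.6, 2.4, 2.5, 2, 1, 1.4, 2.9, 1.5, 3, 2.9, 2.9, 2.1, 2.8, 2.7, 1,
            2.9, 1.1, 2.2, 1.3, 1.7, 1.5, 1.7)

mod     <- glm(y ~ weight, family = binomial)
mod_nul <- glm(y ~ 1, family = binomial)

# Components stored in the fitted object.
mod$null.deviance  # null deviance
mod$deviance       # residual deviance
mod$df.null        # 49 = 50 observations - 1 parameter (intercept)
mod$df.residual    # 48 = 50 observations - 2 parameters

# Drop in deviance from the null model to the proposed model.
drop_in_dev <- mod$null.deviance - mod$deviance

# Should return TRUE: the saturated term cancels in the difference.
all.equal(drop_in_dev, as.numeric(2 * (logLik(mod) - logLik(mod_nul))))
```

Reading the components from the object avoids refitting the saturated model, which needs one parameter per observation.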
To learn more about deviance, you can consult this resource.
Note: For other posts related to glm, visit this link.