1 Introduction

Model checking is a vital part of fitting a model, and the ordinary residuals are not adequate for examining the assumptions of a GLM. In this section we illustrate the behaviour of the Pearson and deviance residuals when we fit a GLM. We select two subclasses of GLMs for this purpose: the Poisson class (P) and the Gamma class (G).

Before computing anything, we would like to say a few words about these two types of residuals.

2 Pearson Residuals

The rationale behind this kind of residual is to scale the ordinary residual of each observation by the square root of its variance. The Pearson residual for observation \(i\) is defined as:

\[ \hat{\epsilon}_{i}^{\small P} = \frac{y_i - \hat{\mu}_i}{\sqrt{V(\hat{\mu}_i)}} \] where \(\hat{\mu}_i\) is the fitted value for the \(i\)-th observation and \(V(\cdot)\) is the variance function of the corresponding subclass of the exponential family.
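To make the definition concrete, here is a minimal sketch (with simulated data; the names x, y, fit and mu_hat are our own illustrative choices) that computes the Pearson residuals of a Poisson fit by hand and checks them against the residuals function:

# Pearson residuals by hand for a Poisson fit, where V(mu) = mu.
set.seed(123)
x   <- runif(50)
y   <- rpois(50, lambda = exp(1 + x))
fit <- glm(y ~ x, family = poisson)
mu_hat <- fitted(fit)
pearson_by_hand <- (y - mu_hat)/sqrt(mu_hat) # (y - mu)/sqrt(V(mu))
all.equal(pearson_by_hand, residuals(fit, type = "pearson"))
# should return TRUE (up to numerical tolerance)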

3 Deviance Residuals

The deviance residuals are a kind of generalization of the ordinary residuals, defined through the concept of deviance. For reference, we show below the deviance contribution \(D_i\) of an observation \(i\) for several subclasses:

\[
\begin{aligned}
\text{Normal:} \quad & D_i = (y_i - \hat{\mu}_i)^2 \\
\text{Poisson:} \quad & D_i = 2\Bigl\{ y_i \log\Bigl(\frac{y_i}{\hat{\mu}_i}\Bigr) - (y_i - \hat{\mu}_i) \Bigr\} \\
\text{Binomial:} \quad & D_i = 2\Bigl\{ y_i \log\Bigl(\frac{y_i}{\hat{\mu}_i}\Bigr) + (n - y_i)\log\Bigl(\frac{n - y_i}{n - \hat{\mu}_i}\Bigr) \Bigr\} \\
\text{Gamma:} \quad & D_i = 2\Bigl\{ \frac{y_i - \hat{\mu}_i}{\hat{\mu}_i} - \log\Bigl(\frac{y_i}{\hat{\mu}_i}\Bigr) \Bigr\} \\
\text{Inverse Gaussian:} \quad & D_i = \frac{(y_i - \hat{\mu}_i)^2}{\hat{\mu}_i^{2}\, y_i}
\end{aligned}
\] where, in the binomial case, \(n\) denotes the number of trials. Hence, the deviance residual for observation \(i\) is defined as: \[ \hat{\epsilon}_{i}^{\small D} = \operatorname{sign}(y_i - \hat{\mu}_i)\sqrt{D_i} \] Now we are prepared to study these particular residuals for GLMs.
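Continuing the small sketch above (reusing y, fit and mu_hat), we can verify the Poisson case of this definition against the residuals function, using the convention \(0 \cdot \log 0 = 0\):

# Deviance residuals by hand for the Poisson case.
D_i <- 2*(ifelse(y == 0, 0, y*log(y/mu_hat)) - (y - mu_hat))
deviance_by_hand <- sign(y - mu_hat)*sqrt(D_i)
all.equal(deviance_by_hand, residuals(fit, type = "deviance"))
# should return TRUE (up to numerical tolerance)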

4 Poisson class, P.

In this case we use the canonical link function \(g(\mu) = \log(\mu)\), where \(\mu\) is the mean of the Poisson distribution.

set.seed(1)
n <- 200
X <- round(cbind( runif(n), rnorm(n,mean=2.5), rnorm(n,mean=4,sd=0.7)),3)
colnames(X) <- c("X1","X2","X3")
dim(X)
## [1] 200   3
head(X)
##         X1    X2    X3
## [1,] 0.266 1.880 4.626
## [2,] 0.372 2.542 3.267
## [3,] 0.573 1.589 5.380
## [4,] 0.908 2.658 3.731
## [5,] 0.202 1.845 5.158
## [6,] 0.898 4.267 5.059
t_beta <- c(0.3,-0.85,1.4) # True vector of parameters
t_beta
## [1]  0.30 -0.85  1.40
mus <- exp(X%*%t_beta) # true means: inverse log link applied to the linear predictor

Now we will build the response variable, drawing each observation from a Poisson distribution with its own mean:

X <- as.data.frame(X)
y <- vector()
for(i in 1:n){
  y[i] <- rpois(1, lambda = mus[i])
}
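As a side note, rpois is vectorized over its mean argument, so the loop above can equivalently be written in one line (y_alt is just an illustrative name for this alternative):

# Equivalent, vectorized alternative to the loop above: rpois recycles
# the vector of means element-wise.
y_alt <- rpois(n, lambda = mus)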

A look at the simulated data:

head(y)
## [1] 143  11 545  26 309  41

We store the estimates in the object modelo:

modelo <- glm( y ~ X1 + X2 + X3 - 1, data = X, family=poisson(link="log"))

We can obtain the residuals of the fitted model as follows:

P_residual <- residuals(modelo,type="pearson")
D_residual <- residuals(modelo,type="deviance")

We can visualize the residuals versus the fitted values with a scatter plot.

par(mfrow=c(1,2))
plot(modelo$fitted.values,D_residual,xlab=" ",ylab=" ",main="Deviance Residuals")
abline(h=0,col=2)
plot(modelo$fitted.values,P_residual,xlab=" ",ylab=" ",main="Pearson Residuals")
abline(h=0,col=2)

We can see some symmetry around zero, without any particular tendency, in the scatter plots. But let's take a closer look at this behaviour.
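Another quick close-up (a common complement to these scatter plots) is a normal QQ-plot of each type of residual:

# Normal QQ-plots as an additional visual check of the residuals.
par(mfrow=c(1,2))
qqnorm(D_residual, main="Deviance Residuals"); qqline(D_residual, col=2)
qqnorm(P_residual, main="Pearson Residuals"); qqline(P_residual, col=2)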

4.1 Simulation I

At this point we would like to study the behaviour of the Pearson and deviance residuals over different samples of size 200. So, we will run a simulation with 2000 replicates.

R <- 2000
P_residuals <- matrix(0,R,length(y))
D_residuals <- matrix(0,R,length(y))
  
for(j in 1:R){
  for(i in 1:n){
    y[i] <- rpois(1,lambda=mus[i]) 
  }
  modelo <- glm( y ~ X1 + X2 + X3 - 1, data = X, family=poisson)  
  P_residuals[j,] <- residuals(modelo,type="pearson")
  D_residuals[j,] <- residuals(modelo,type="deviance")
}

Our next step is to compute the p-value of the Kolmogorov-Smirnov test for the null hypothesis that the residuals are normally distributed with mean equal to zero and variance equal to \(\phi = 1\).

P_km_test <- vector()
D_km_test <- vector()
for(i in 1:R){
  P_km_test[i] <- round(ks.test(P_residuals[i,],
                        "pnorm", mean = 0, sd = 1)$p.value, 2)
  D_km_test[i] <- round(ks.test(D_residuals[i,],
                        "pnorm", mean = 0, sd = 1)$p.value, 2)
}

Finally, we would like to know the proportion of replicates in which normality is rejected at the 5% level in each case.

round(mean(P_km_test < 0.05),3)
## [1] 0.006
round(mean(D_km_test < 0.05),3)
## [1] 0.01

There is a tiny difference between the two proportions. Nevertheless, in both cases the rejection rate is low, so we obtained evidence of the approximately normal behaviour of the residuals.
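A complementary check (an extra look at the same objects, not part of the computation above): if the residuals behave approximately as N(0, 1), the Kolmogorov-Smirnov p-values should look roughly uniform over the replicates.

# Under approximate normality, the KS p-values should be roughly uniform.
par(mfrow=c(1,2))
hist(P_km_test, main="Pearson", xlab="KS p-value")
hist(D_km_test, main="Deviance", xlab="KS p-value")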

5 Gamma class, G.

In this case we use the canonical link function \(g(\mu) = 1/\mu\), where \(\mu\) is the mean of the Gamma distribution.

n <- 200
X_G <- round( cbind(1,rbinom(n,2,prob=0.5)),3)
colnames(X_G) <- c("X0","X1")
head(X_G)
##      X0 X1
## [1,]  1  1
## [2,]  1  2
## [3,]  1  2
## [4,]  1  1
## [5,]  1  2
## [6,]  1  2
# True vector of parameters
t_beta <- c(0.7,0.27) 
class(X_G%*%t_beta)
## [1] "matrix"
mus <- 1/(X_G%*%t_beta) # true means via the inverse link
summary(mus)
##        V1        
##  Min.   :0.8065  
##  1st Qu.:1.0309  
##  Median :1.0309  
##  Mean   :1.0869  
##  3rd Qu.:1.4286  
##  Max.   :1.4286
X_G <- as.data.frame(X_G)

Now we will generate the response variable, with dispersion parameter \(\phi = 0.063\):

y <- vector()
phi <- 0.063
for(i in 1:n){
  y[i] <- rgamma(1,shape=mus[i]/phi,scale=phi) 
}
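As in the Poisson case, the loop can equivalently be written in vectorized form (y_alt_G is just an illustrative name for this alternative):

# Equivalent, vectorized alternative: rgamma recycles shape element-wise.
y_alt_G <- rgamma(n, shape = mus/phi, scale = phi)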

A look at the simulated data:

head(y)
## [1] 1.5566699 1.0597244 1.1169704 0.7493541 0.9251274 0.7688771

We save the estimates in the object modelo_G:

modelo_G <- glm( y ~ X1, data = X_G,family=Gamma(link="inverse"))

Remember that we can get the residuals of the fitted model with the function residuals:

P_residual_G <- residuals(modelo_G,type="pearson")
D_residual_G <- residuals(modelo_G,type="deviance")
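In the Gamma case the dispersion \(\phi\) is unknown and has to be estimated. A standard estimator, and the one reported by summary.glm, is the sum of the squared Pearson residuals divided by the residual degrees of freedom; a quick check against the true value \(\phi = 0.063\):

# Pearson-based estimate of the dispersion parameter phi.
phi_hat <- sum(P_residual_G^2)/modelo_G$df.residual
phi_hat
# summary(modelo_G)$dispersion computes the same quantity.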

Once again let’s visualize the residuals versus the fitted values.

par(mfrow=c(1,2))
plot(modelo_G$fitted.values,D_residual_G,xlab="",main="Deviance Residuals")
abline(h=0,col=2)
plot(modelo_G$fitted.values,P_residual_G,xlab="",main="Pearson Residuals")
abline(h=0,col=2)

Some symmetry around zero appears in both residual plots. Nevertheless, we want to check through a simulation study whether this behaviour is systematic.

5.1 Simulation II

As before, we would like to study the behaviour of the Pearson and deviance residuals over different samples of size 200. Hence, we will run a simulation with 2000 replicates.

R <- 2000
P_residuals_G <- matrix(0,R,length(y))
D_residuals_G <- matrix(0,R,length(y))

for(j in 1:R){
  for(i in 1:n){
    y[i] <- rgamma(1, shape = mus[i]/phi, scale = phi)
  }
  modelo_G <- glm( y ~ X1, data = X_G, family=Gamma(link="inverse"))
  P_residuals_G[j,] <- residuals(modelo_G,type="pearson")
  D_residuals_G[j,] <- residuals(modelo_G,type="deviance")
}
# Mean of the Pearson residuals in each replicate
round(summary(apply(P_residuals_G,1,mean)),3)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0       0       0       0       0       0
# Variance of the Pearson residuals in each replicate
round(summary(apply(P_residuals_G,1,var)),3)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.042   0.056   0.060   0.060   0.064   0.089
# Mean of the deviance residuals in each replicate
round(summary(apply(D_residuals_G,1,mean)),3)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -0.029  -0.021  -0.020  -0.020  -0.019  -0.014
# Variance of the deviance residuals in each replicate
round(summary(apply(D_residuals_G,1,var)),3)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.042   0.056   0.060   0.060   0.064   0.086

Our next step is to compute the p-value of the Kolmogorov-Smirnov test for the null hypothesis that the residuals are normally distributed with mean equal to zero and variance equal to \(\phi = 0.063\).

P_km_test_G <- vector()
D_km_test_G <- vector()
for(i in 1:R){
  P_km_test_G[i] <- round(ks.test(P_residuals_G[i,],
                          "pnorm", mean = 0, sd = phi^0.5)$p.value, 2)
  D_km_test_G[i] <- round(ks.test(D_residuals_G[i,],
                          "pnorm", mean = 0, sd = phi^0.5)$p.value, 2)
}

Finally, we would like to know the proportion of replicates in which normality is rejected at the 5% level in each case.

round(mean(P_km_test_G < 0.05),3)
## [1] 0.02
round(mean(D_km_test_G < 0.05),3)
## [1] 0.023

The two residuals perform similarly. Additionally, we obtained some evidence of the approximately normal behaviour of the residuals.

6 Final conclusions

I would like to invite you to try the other subclasses of GLMs, for instance the binomial class; a minimal starting point is sketched below.
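Here is a minimal sketch of what the binomial case could look like, following the same pattern as above (the sample size, number of trials, parameter values and names such as X_B and modelo_B are arbitrary choices of this sketch):

# Minimal sketch for the binomial class with the canonical logit link.
# With m > 1 trials per observation the normal approximation of the
# residuals is more plausible than in the purely binary (m = 1) case.
set.seed(1)
n   <- 200
m   <- 20                                 # trials per observation
X_B <- data.frame(X1 = runif(n), X2 = rnorm(n))
t_beta <- c(0.5, -1)                      # "true" parameters of the sketch
mus <- plogis(as.matrix(X_B) %*% t_beta)  # inverse logit of the linear predictor
y_B <- rbinom(n, size = m, prob = mus)
modelo_B <- glm(cbind(y_B, m - y_B) ~ X1 + X2 - 1,
                data = X_B, family = binomial(link = "logit"))
P_residual_B <- residuals(modelo_B, type = "pearson")
D_residual_B <- residuals(modelo_B, type = "deviance")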