Markdown Author: Jessie Bell, 2023

Libraries Used: none

GLM and GAM for count data

Likelihood Criterion In linear regression we use ordinary least squares to minimize the residual sum of squares, here we use maximum likelihood estimation because our data are not normal. We specify a joint likelihood criterion (L) for all observed data. We maximize this likelihood criterion as a function of hte unknown regression parameters. What are the values of the regression parameters such that he probability L of the observed data is highest?

\(L=Probability(Y_1 = y_1\) and \(Y_2=y_2\) and \(... Y_n = y_n)\)

Because we assume independence in the observations, we can use the basic probability rule P(A and B) = P(A) x P(B). To optimize a function you obtain the first order derivative and set it equal to 0 (see text pg. 214).

🐡

Mean and variance are equal
Poisson dist.
Log link

RK <- read.table("RoadKills.txt", header = T)

I. Amphibian Roadkill

The data set consists of roadkills of amphibian species at 52 sites along a road in Portugal. A scatterplot of the response variable roadkills against a possible explanatory variable ‘distance to the natural park’, denoted by D.PARK, is given in Fig. 9.3. The biological interpretation of ‘distance to the park’ is given in Chapter 16.

plot(RK$D.PARK, RK$TOT.N, xlab = "Distance to park", ylab = "Road kills", pch=20, col="#e76bf3")

Figure 1: Scatterplot of amphibian road kills versus distance (in metres) to a nearby Natural Park. The data are counts, and there seems to be a non-linear, perhaps exponential, relationship between roadkills and D.PARK. Also note that the variation is larger for larger values of roadkills. Taken together, this gives us all the ingredients for a Poisson GLM. Starting with only D.PARK as an explanatory variable, and ignoring the other 10 explanatory variables, is a pedagogical choice for presenting Poisson GLM in a textbook and is not a general recommendation for analysing these data

The following GLM was applied:

Yi, the number of killed animals at site i, is Poisson distributed with mean µi.
The systematic part is given by η(D.PARKi) = α + β × D.PARKi.
There is a logarithm link between the mean of Yi and the predictor function η(D.PARKi)

As a result of these three steps, we have

Yi ∼ p(µi)

E(Yi) = µi and var(Yi) = µi

log(µi) = α + β × D.PARKi or µi = eα+β×D.PARKi

II. Model Selection

You can use the glm for a linear regression as long as you apply the family= gaussian

#Make a model
M1 <- glm(TOT.N ~ D.PARK, family = poisson, data = RK)

summary(M1) #the first 2 lines tell which model has been fitted. Basic math on the residuals is next.

## 
## Call:
## glm(formula = TOT.N ~ D.PARK, family = poisson, data = RK)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  4.316e+00  4.322e-02   99.87   <2e-16 ***
## D.PARK      -1.059e-04  4.387e-06  -24.13   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 1071.4  on 51  degrees of freedom
## Residual deviance:  390.9  on 50  degrees of freedom
## AIC: 634.29
## 
## Number of Fisher Scoring iterations: 4

# The estimation for slope and intercept are 4.31 and -0.000106, respectively. We get a z stat (because we know the variance) and a corresponding p-value (for testing the null hypothesis that the slope is 0). We get an AIC that can help us with model selection. 

## Keep in mind that the distance to the park is expressed in meters. You should express these data in km instead to reduce the number of 0s.

Deviance

The null and residual deviance are essentially the maximum likelihood equivalents of the total sum of squares and the residual sum of squares, respectively. For the Poisson GLM, the residual deviance is defined as twice the difference between the log likelihood of a model that provides a perfect fit. AKA the saturated model. There is no \(R^2\) in GLM models, so you can determine the explained deviance like this:

\(\dfrac{nullDeviance-residualDeviance}{nullDeviance} * 100\)

#so if we use this with our mode: 

(M1$null.deviance - M1$deviance)/M1$null.deviance * 100

## [1] 63.51681

So the explanatory variable distance to the park explains 63.51% of the variation in road kills. Dobson (2002) called this proportional increase in explained deviance the pseudo R2. The smaller the residual deviance, the better the model. Some statistics programs also quote a p-value as it is supposedly Chi-square distributed with n – p degrees of freedom, where p is the number of regression parameters in the model and n the number of observations. However, using the residual deviance as a goodness of-fit measure is not without controversy; see McCullagh and Nelder (pg. 118–119, 1989). They argue that (at least for the binomial GLM) a large value of the residual deviance cannot always be seen as evidence of a poor fit. The residual deviance is also sometimes called the deviance (see above calculation).

Sketching the Residual Values

Calculate the predicted values and add them as a line.

MyData=data.frame(D.PARK=seq(from=0,to=25000,by=1000))

#predict function for model prediictions
G<-predict(M1,
           newdata=MyData,
           type="link",
           se=T)


F<-exp(G$fit) #creates an exponential function for M1 predicted fit values

FSEUP<-exp(G$fit+1.96*G$se.fit)

FSELOW<-exp(G$fit-1.96*G$se.fit)
plot(RK$D.PARK, RK$TOT.N, xlab = "Distance to park", ylab = "Road kills", pch=20, col="#e76bf3")
lines(MyData$D.PARK,F,lty=1)
lines(MyData$D.PARK,FSEUP,lty=2)
lines(MyData$D.PARK,FSELOW,lty=2)

Figure 2: Observed roadkills with a fitted Poisson GLM curve (solid line) and 95% confidence bands (dotted lines). Note the clear exponential shape of the curve. For smaller fitted values, there are groups of residuals above and below the fitted line. This is not good, and we need to deal with this in the model validation.

III. Model Selection in GLM

So far, we have only discussed the interpretation of the model in terms of an exponential curve fitted through a set of points; we now concentrate on things like model selection, hypothesis testing, and model validation. However, applying a model selection with only one explanatory variable is a bit unrealistic, so we now add a few more explanatory variables.

The amphibian roadkills data set contains 17 explanatory variables. A list of these variables and abbreviations is given in Table 16.1 above. Some of the explanatory variables were square root transformed because of large values. Using variance inflation factors (VIF, Appendix A), a sub-selection of nine variables is made in Chapter 16 and we use the same sub-selection here (see Table 16.2 above and the function code below).

#set up premade functions from Zuur
source("HighstatLibV10.R.txt") 

#The resulting VIF values are given in Table 16.2. As explained in Appendix A, a cut-off value of 5 or even 3 can be used to remove collinear variables; we used 3. To find a set of explanatory variables that does not contain collinearity, we removed one variable at a time, recalculated the VIF values, and repeated this process until all VIF values were smaller than 3. As a result, MONT, P.EDGE, N.PATCH, L.SDI, and SQ.URBAN were dropped. This means that we have 12 remaining explanatory variables. This is still a large number of variables!

#get the data you want
Z<-cbind(RK$OPEN.L, 
         RK$SQ.OLIVE, 
         RK$MONT.S,
         RK$MONT,
         RK$SQ.POLIC, 
         RK$SQ.SHRUB, 
         RK$SQ.URBAN,
         RK$SQ.WATRES, 
         RK$L.WAT.C, 
         RK$L.D.ROAD, 
         RK$SQ.LPROAD, 
         RK$D.WAT.RES, 
         RK$SQ.DWATCOUR, 
         RK$D.PARK, 
         RK$N.PATCH, 
         RK$P.EDGE, 
         RK$L.SDI)
#17 variables above
#use Zuur corvif() function
corvif(Z)

## 
## 
## Variance inflation factors
## 
##          GVIF
## V1  13.651024
## V2   1.245046
## V3  19.299391
## V4   1.763165
## V5   2.221564
## V6   1.570653
## V7   1.946437
## V8  13.796522
## V9  10.797232
## V10  3.261683

Note: this is still a relatively high number of explanatory variables for a data set with only 52 observations! A Poisson GLM for the roadkills data with nine variables is specified in a very similar way as in Equation (9.4), except that the systematic part now contains all nine explanatory variables (we have no biological reasons to believe there are interactions).

Model Code

The following R code implements the Poisson GLM with nine explanatory variables.

RK$SQ.POLIC<-sqrt(RK$POLIC)
RK$SQ.WATRES<-sqrt(RK$WAT.RES)
RK$SQ.URBAN<-sqrt(RK$URBAN)
RK$SQ.OLIVE<-sqrt(RK$OLIVE)
RK$SQ.LPROAD<-sqrt(RK$L.P.ROAD)
RK$SQ.SHRUB<-sqrt(RK$SHRUB)
RK$SQ.DWATCOUR<-sqrt(RK$D.WAT.COUR)

#The full model (read above for VIF on why these variables were kept)
M2<-glm(TOT.N~OPEN.L+MONT.S+SQ.POLIC+
         SQ.SHRUB+SQ.WATRES+L.WAT.C+SQ.LPROAD+
         SQ.DWATCOUR+D.PARK,family=poisson,data=RK) #9 variables chosen

summary(M2)

## 
## Call:
## glm(formula = TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + 
##     SQ.WATRES + L.WAT.C + SQ.LPROAD + SQ.DWATCOUR + D.PARK, family = poisson, 
##     data = RK)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  3.749e+00  1.567e-01  23.935  < 2e-16 ***
## OPEN.L      -3.025e-03  1.580e-03  -1.915 0.055531 .  
## MONT.S       8.697e-02  1.359e-02   6.398 1.57e-10 ***
## SQ.POLIC    -1.787e-01  4.676e-02  -3.822 0.000133 ***
## SQ.SHRUB    -6.112e-01  1.176e-01  -5.197 2.02e-07 ***
## SQ.WATRES    2.243e-01  7.050e-02   3.181 0.001468 ** 
## L.WAT.C      3.355e-01  4.127e-02   8.128 4.36e-16 ***
## SQ.LPROAD    4.517e-01  1.348e-01   3.351 0.000804 ***
## SQ.DWATCOUR  7.355e-03  4.879e-03   1.508 0.131629    
## D.PARK      -1.301e-04  5.936e-06 -21.923  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 1071.44  on 51  degrees of freedom
## Residual deviance:  270.23  on 42  degrees of freedom
## AIC: 529.62
## 
## Number of Fisher Scoring iterations: 5

The Optimal Model

We want to know which explanatory variables are important, and because some terms are not significant, it is time for a model selection. The process is similar to the one used for linear regression (Appendix A). We can use either a selection criterion like the AIC or use a hypothesis testing approach. Automatic forward, backward, and forward and backward selection can be applied with the command step(M2), this backward selection indicates that no term should be dropped.

step(M2)

## Start:  AIC=529.62
## TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + SQ.WATRES + L.WAT.C + 
##     SQ.LPROAD + SQ.DWATCOUR + D.PARK
## 
##               Df Deviance     AIC
## <none>             270.23  529.62
## - SQ.DWATCOUR  1   272.50  529.89
## - OPEN.L       1   273.93  531.32
## - SQ.WATRES    1   280.02  537.41
## - SQ.LPROAD    1   281.25  538.64
## - SQ.POLIC     1   285.53  542.92
## - SQ.SHRUB     1   298.31  555.70
## - MONT.S       1   306.89  564.28
## - L.WAT.C      1   335.47  592.86
## - D.PARK       1   838.09 1095.48

## 
## Call:  glm(formula = TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + 
##     SQ.WATRES + L.WAT.C + SQ.LPROAD + SQ.DWATCOUR + D.PARK, family = poisson, 
##     data = RK)
## 
## Coefficients:
## (Intercept)       OPEN.L       MONT.S     SQ.POLIC     SQ.SHRUB    SQ.WATRES  
##   3.7493885   -0.0030250    0.0869656   -0.1787178   -0.6111864    0.2242561  
##     L.WAT.C    SQ.LPROAD  SQ.DWATCOUR       D.PARK  
##   0.3354676    0.4517172    0.0073554   -0.0001301  
## 
## Degrees of Freedom: 51 Total (i.e. Null);  42 Residual
## Null Deviance:       1071 
## Residual Deviance: 270.2     AIC: 529.6

For the hypothesis testing approach, we have three options:

1. Test the null hypothesis H0: βi = 0 using the z-statistic.

This is the equivalent of the t-statistic in linear regression. This approach suggests to drop first SQ.DWATCOUR as it is the least significant term and then to refit the model and see whether there are still non-significant terms in the model.

summary(M2) #drop the least significant term

## 
## Call:
## glm(formula = TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + 
##     SQ.WATRES + L.WAT.C + SQ.LPROAD + SQ.DWATCOUR + D.PARK, family = poisson, 
##     data = RK)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  3.749e+00  1.567e-01  23.935  < 2e-16 ***
## OPEN.L      -3.025e-03  1.580e-03  -1.915 0.055531 .  
## MONT.S       8.697e-02  1.359e-02   6.398 1.57e-10 ***
## SQ.POLIC    -1.787e-01  4.676e-02  -3.822 0.000133 ***
## SQ.SHRUB    -6.112e-01  1.176e-01  -5.197 2.02e-07 ***
## SQ.WATRES    2.243e-01  7.050e-02   3.181 0.001468 ** 
## L.WAT.C      3.355e-01  4.127e-02   8.128 4.36e-16 ***
## SQ.LPROAD    4.517e-01  1.348e-01   3.351 0.000804 ***
## SQ.DWATCOUR  7.355e-03  4.879e-03   1.508 0.131629    
## D.PARK      -1.301e-04  5.936e-06 -21.923  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 1071.44  on 51  degrees of freedom
## Residual deviance:  270.23  on 42  degrees of freedom
## AIC: 529.62
## 
## Number of Fisher Scoring iterations: 5

M3 <- glm(TOT.N~OPEN.L+MONT.S+SQ.POLIC+SQ.SHRUB+SQ.WATRES+L.WAT.C+SQ.LPROAD+D.PARK,family=poisson,data=RK) #8 variables chosen

summary(M3) #everything is now significant! WOW

## 
## Call:
## glm(formula = TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + 
##     SQ.WATRES + L.WAT.C + SQ.LPROAD + D.PARK, family = poisson, 
##     data = RK)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  3.852e+00  1.405e-01  27.405  < 2e-16 ***
## OPEN.L      -3.464e-03  1.544e-03  -2.243 0.024890 *  
## MONT.S       8.927e-02  1.344e-02   6.643 3.08e-11 ***
## SQ.POLIC    -1.583e-01  4.492e-02  -3.525 0.000424 ***
## SQ.SHRUB    -6.160e-01  1.185e-01  -5.200 2.00e-07 ***
## SQ.WATRES    2.080e-01  6.917e-02   3.006 0.002645 ** 
## L.WAT.C      3.113e-01  3.817e-02   8.155 3.50e-16 ***
## SQ.LPROAD    4.936e-01  1.315e-01   3.753 0.000175 ***
## D.PARK      -1.286e-04  5.837e-06 -22.033  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 1071.4  on 51  degrees of freedom
## Residual deviance:  272.5  on 43  degrees of freedom
## AIC: 529.89
## 
## Number of Fisher Scoring iterations: 5

2. drop1()

Use the drop1(M2,test = “Chi”) command, which drops one explanatory variable, in turn, and each time applies an analysis of deviance test. We explain this process below.

The model containing all explanatory variables has a deviance of 270.3. If we drop OPEN.L, the deviance is 273.93: a difference of 3.69. The statistic X2 = 3.69 follows (approximately) a Chi-square distribution with 1 degree of freedom, which gives a p-value of 0.054. This can be double checked with the R command:

1 – pchisq (3.69 ,1)

1-pchisq(3.69 ,1)

## [1] 0.05473962

Note that the analysis of deviance does not give exactly the same p-value as the z-statistic. This is because both tests are approximate. If in doubt, use the analysis of deviance test. The advantage of using the analysis of deviance test is that it also gives a p-value for a nominal variable.

drop1(M2, test="Chi")

## Single term deletions
## 
## Model:
## TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + SQ.WATRES + L.WAT.C + 
##     SQ.LPROAD + SQ.DWATCOUR + D.PARK
##             Df Deviance     AIC    LRT  Pr(>Chi)    
## <none>           270.23  529.62                     
## OPEN.L       1   273.93  531.32   3.69 0.0546474 .  
## MONT.S       1   306.89  564.28  36.66 1.410e-09 ***
## SQ.POLIC     1   285.53  542.92  15.30 9.181e-05 ***
## SQ.SHRUB     1   298.31  555.70  28.08 1.167e-07 ***
## SQ.WATRES    1   280.02  537.41   9.79 0.0017539 ** 
## L.WAT.C      1   335.47  592.86  65.23 6.648e-16 ***
## SQ.LPROAD    1   281.25  538.64  11.02 0.0009009 ***
## SQ.DWATCOUR  1   272.50  529.89   2.27 0.1319862    
## D.PARK       1   838.09 1095.48 567.85 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

3. anova()

Use the anova(M2) command, which applies a series of analysis of deviance tests by removing each term sequential. We explain at the end of Subsection 9.6.5 how this process works.

The same p-value for OPEN.L can be obtained by fitting a model with all explanatory variables (which is M2), a model without OPEN.L, and then use the anova command to compare the two models with an analysis of deviance. This is done with the following R code:

M3 <- glm(TOT.N ~ MONT.S + SQ.POLIC + D.PARK +
SQ.SHRUB + SQ.WATRES + L.WAT.C + SQ.LPROAD +
SQ.DWATCOUR, family = poisson, data = RK)

anova(M2, M3, test = "Chi")

## Analysis of Deviance Table
## 
## Model 1: TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + SQ.WATRES + L.WAT.C + 
##     SQ.LPROAD + SQ.DWATCOUR + D.PARK
## Model 2: TOT.N ~ MONT.S + SQ.POLIC + D.PARK + SQ.SHRUB + SQ.WATRES + L.WAT.C + 
##     SQ.LPROAD + SQ.DWATCOUR
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)  
## 1        42     270.23                       
## 2        43     273.93 -1  -3.6928  0.05465 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

If you use this output in a paper or report, then you should write that the difference in deviance is 3.69 and approximately follows a Chi-square distribution with 1 degree of freedom. We have seen papers where a Chi-square distribution with 43 degrees of freedom was quoted from the output above, which is clearly wrong! Be careful when using the command anova(M2); it applies an analysis of deviance test, but now the terms are removed sequentially and the order depends on the order they were typed. This is useful if all explanatory variables are independent or if the last term is an interaction.

Steps 2 and 3 are similar to the anova and drop1 functions in linear regression, except that in linear regression we used an F test based on residual sum of squares of a full and a nested model. A nested model is defined as a model that is obtained from the full model by setting certain parameters equal to 0. We do not have residual sum of squares in Poisson GLM. Well, actually we do, but they are not used in these tests (residuals are discussed in Section 9.8). Instead, we use the residual deviance of two nested models.

Summary: Check dispersion before selecting model!

Using the drop1 function, we decided to remove the variable SQ.DWATCOUR. Refitting the model resulted in all explanatory variables being significant at the 5% level. This suggests that we are finished with the model selection process, and can proceed to the model validation process. However, things are never that easy. The results of the summary command presented above had a small sentence that said: ‘overdispersion parameter for Poisson family taken to be 1’. This does not mean that the overdispersion really is 1; it just says it was taken as 1.We promised more misery, and overdispersion is the next stage. In the next section, we show that all the results presented in this section can be put in the bin, because of overdispersion. If you analyse your own data, you should always first check for overdispersion, before doing any model selection or interpretation of the results. The reason why we did not start by looking at overdispersion was because we wanted to make sure you could read the output and judge whether there is overdispersion. For your own data, you should always start by checking for overdispersion and act accordingly. This is discussed in the next section.

IV. Checking Dispersion

Overdispersion: when the variance is larger than the mean.

You can quickly correct for it using the following quasi-poisson code:

M4<- glm(TOT.N ~ OPEN.L + MONT.S + SQ.POLIC+
         SQ.SHRUB + SQ.WATRES + L.WAT.C + SQ.LPROAD+
         SQ.DWATCOUR + D.PARK, family = quasipoisson, data = RK)

You can see the only difference is specifying the family option as quasipoisson instead of poisson. This gives the impression that there is a quasi-Poisson distribution, but there is no such thing! All we do here is specify the mean and variance relationship and an exponential link between the expected values and explanatory variables. It is a software issue to call this ‘quasipoisson’. Do not write in your report or paper that you used a quasi-Poisson distribution. Just say that you did a Poisson GLM, detected overdispersion, and corrected the standard errors using a quasi-GLM model where the variance is given by φ × µ, where µ is the mean and φ the dispersion parameter.

To get the numerical output for this model, use summary(M4), which gives the output below. Note that the ratio of the residual deviance and the degrees of freedom is still larger than 1, but that is no longer a problem as we now allow for overdispersion. The dispersion parameter φ is estimated as 5.93. This means that all standard errors have been multiplied by 2.43 (the square root of 5.93), and as a result, most parameters are no longer significant! We can move onto model selection

summary(M4)

## 
## Call:
## glm(formula = TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + 
##     SQ.WATRES + L.WAT.C + SQ.LPROAD + SQ.DWATCOUR + D.PARK, family = quasipoisson, 
##     data = RK)
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.749e+00  3.814e-01   9.830 1.86e-12 ***
## OPEN.L      -3.025e-03  3.847e-03  -0.786  0.43604    
## MONT.S       8.697e-02  3.309e-02   2.628  0.01194 *  
## SQ.POLIC    -1.787e-01  1.139e-01  -1.570  0.12400    
## SQ.SHRUB    -6.112e-01  2.863e-01  -2.135  0.03867 *  
## SQ.WATRES    2.243e-01  1.717e-01   1.306  0.19851    
## L.WAT.C      3.355e-01  1.005e-01   3.338  0.00177 ** 
## SQ.LPROAD    4.517e-01  3.282e-01   1.376  0.17597    
## SQ.DWATCOUR  7.355e-03  1.188e-02   0.619  0.53910    
## D.PARK      -1.301e-04  1.445e-05  -9.004 2.33e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasipoisson family taken to be 5.928003)
## 
##     Null deviance: 1071.44  on 51  degrees of freedom
## Residual deviance:  270.23  on 42  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 5

V. Modeling Quasi-Poisson

# begin with the quasi model you made in the last section to deal with overdispersion

drop1(M4,test="F") #drop SQ.DWATCOUR first from model

## Single term deletions
## 
## Model:
## TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + SQ.WATRES + L.WAT.C + 
##     SQ.LPROAD + SQ.DWATCOUR + D.PARK
##             Df Deviance F value   Pr(>F)    
## <none>           270.23                     
## OPEN.L       1   273.93  0.5739 0.452926    
## MONT.S       1   306.89  5.6970 0.021574 *  
## SQ.POLIC     1   285.53  2.3776 0.130585    
## SQ.SHRUB     1   298.31  4.3635 0.042814 *  
## SQ.WATRES    1   280.02  1.5217 0.224221    
## L.WAT.C      1   335.47 10.1389 0.002735 ** 
## SQ.LPROAD    1   281.25  1.7129 0.197727    
## SQ.DWATCOUR  1   272.50  0.3526 0.555802    
## D.PARK       1   838.09 88.2569    7e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M5<- glm(TOT.N ~ OPEN.L + MONT.S + SQ.POLIC+
         SQ.SHRUB + SQ.WATRES + L.WAT.C + SQ.LPROAD+
         D.PARK, family = quasipoisson, data = RK)

drop1(M5, test="F") #get rid of OPEN.L

## Single term deletions
## 
## Model:
## TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + SQ.WATRES + L.WAT.C + 
##     SQ.LPROAD + D.PARK
##           Df Deviance F value    Pr(>F)    
## <none>         272.50                      
## OPEN.L     1   277.60  0.8038   0.37496    
## MONT.S     1   311.91  6.2181   0.01657 *  
## SQ.POLIC   1   285.62  2.0707   0.15739    
## SQ.SHRUB   1   300.59  4.4319   0.04115 *  
## SQ.WATRES  1   281.22  1.3765   0.24716    
## L.WAT.C    1   339.08 10.5064   0.00230 ** 
## SQ.LPROAD  1   286.31  2.1794   0.14716    
## D.PARK     1   843.43 90.0904 4.105e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M6<- glm(TOT.N ~ MONT.S + SQ.POLIC+
         SQ.SHRUB + SQ.WATRES + L.WAT.C + SQ.LPROAD+
         D.PARK, family = quasipoisson, data = RK)
drop1(M6, test="F")

## Single term deletions
## 
## Model:
## TOT.N ~ MONT.S + SQ.POLIC + SQ.SHRUB + SQ.WATRES + L.WAT.C + 
##     SQ.LPROAD + D.PARK
##           Df Deviance  F value    Pr(>F)    
## <none>         277.60                       
## MONT.S     1   321.03   6.8845   0.01191 *  
## SQ.POLIC   1   298.88   3.3741   0.07299 .  
## SQ.SHRUB   1   310.04   5.1428   0.02831 *  
## SQ.WATRES  1   286.46   1.4056   0.24215    
## L.WAT.C    1   339.09   9.7473   0.00317 ** 
## SQ.LPROAD  1   292.85   2.4176   0.12715    
## D.PARK     1   950.86 106.7149 2.429e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#keep doing this until you find out all you need is D.Park in your model! WOW so long

drop1(M5, test="F") #get rid of OPEN.L

## Single term deletions
## 
## Model:
## TOT.N ~ OPEN.L + MONT.S + SQ.POLIC + SQ.SHRUB + SQ.WATRES + L.WAT.C + 
##     SQ.LPROAD + D.PARK
##           Df Deviance F value    Pr(>F)    
## <none>         272.50                      
## OPEN.L     1   277.60  0.8038   0.37496    
## MONT.S     1   311.91  6.2181   0.01657 *  
## SQ.POLIC   1   285.62  2.0707   0.15739    
## SQ.SHRUB   1   300.59  4.4319   0.04115 *  
## SQ.WATRES  1   281.22  1.3765   0.24716    
## L.WAT.C    1   339.08 10.5064   0.00230 ** 
## SQ.LPROAD  1   286.31  2.1794   0.14716    
## D.PARK     1   843.43 90.0904 4.105e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M7<- glm(TOT.N ~ MONT.S + SQ.POLIC+
         SQ.SHRUB + L.WAT.C + SQ.LPROAD+
         D.PARK, family = quasipoisson, data = RK)
drop1(M7, test="F") #SQ.LPROAD is a goner

## Single term deletions
## 
## Model:
## TOT.N ~ MONT.S + SQ.POLIC + SQ.SHRUB + L.WAT.C + SQ.LPROAD + 
##     D.PARK
##           Df Deviance  F value    Pr(>F)    
## <none>         286.46                       
## MONT.S     1   322.79   5.7065  0.021157 *  
## SQ.POLIC   1   308.82   3.5122  0.067421 .  
## SQ.SHRUB   1   310.05   3.7052  0.060578 .  
## L.WAT.C    1   339.10   8.2680  0.006144 ** 
## SQ.LPROAD  1   297.73   1.7691  0.190197    
## D.PARK     1   952.16 104.5733 2.569e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M8<- glm(TOT.N ~ MONT.S + SQ.POLIC+
         SQ.SHRUB + L.WAT.C + D.PARK, family = quasipoisson, data = RK)
drop1(M8, test="F") #SQ.SHRUB is a goner

## Single term deletions
## 
## Model:
## TOT.N ~ MONT.S + SQ.POLIC + SQ.SHRUB + L.WAT.C + D.PARK
##          Df Deviance  F value    Pr(>F)    
## <none>        297.73                       
## MONT.S    1   332.50   5.3731  0.024950 *  
## SQ.POLIC  1   314.89   2.6527  0.110205    
## SQ.SHRUB  1   314.85   2.6457  0.110664    
## L.WAT.C   1   350.38   8.1353  0.006483 ** 
## D.PARK    1   953.56 101.3296 3.298e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M9<- glm(TOT.N ~ MONT.S + SQ.POLIC+
         L.WAT.C + D.PARK, family = quasipoisson, data = RK)
drop1(M9, test="F") #SQ.POLIC is a goner

## Single term deletions
## 
## Model:
## TOT.N ~ MONT.S + SQ.POLIC + L.WAT.C + D.PARK
##          Df Deviance  F value    Pr(>F)    
## <none>        314.85                       
## MONT.S    1   341.41   3.9644   0.05230 .  
## SQ.POLIC  1   335.99   3.1556   0.08214 .  
## L.WAT.C   1   354.53   5.9233   0.01880 *  
## D.PARK    1   997.18 101.8565 2.378e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M10<- glm(TOT.N ~ MONT.S + L.WAT.C + D.PARK, family = quasipoisson, data = RK)
drop1(M10, test="F") #MONT.S is a goner

## Single term deletions
## 
## Model:
## TOT.N ~ MONT.S + L.WAT.C + D.PARK
##         Df Deviance  F value    Pr(>F)    
## <none>       335.99                       
## MONT.S   1   362.28   3.7559   0.05852 .  
## L.WAT.C  1   364.70   4.1012   0.04843 *  
## D.PARK   1  1067.39 104.4902 1.228e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M11<- glm(TOT.N ~ L.WAT.C + D.PARK, family = quasipoisson, data = RK)
drop1(M11, test="F") #L.WAT.C is a goner

## Single term deletions
## 
## Model:
## TOT.N ~ L.WAT.C + D.PARK
##         Df Deviance F value    Pr(>F)    
## <none>       362.28                      
## L.WAT.C  1   390.90  3.8708   0.05481 .  
## D.PARK   1  1070.90 95.8447 4.053e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

M12<- glm(TOT.N ~ D.PARK, family = quasipoisson, data = RK)
drop1(M12, test="F") #seems we have found the best model?

## Single term deletions
## 
## Model:
## TOT.N ~ D.PARK
##        Df Deviance F value    Pr(>F)    
## <none>       390.9                      
## D.PARK  1   1071.4  87.049 1.572e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#ignoring overdispersion can lead to a VERY DIFFERENT MODEL. YIKES!

MyData=data.frame(D.PARK=seq(from=0,to=25000,by=1000))

#predict function for model prediictions
G<-predict(M12,
           newdata=MyData,
           type="link",
           se=T)


F<-exp(G$fit) #creates an exponential function for M1 predicted fit values

FSEUP<-exp(G$fit+1.96*G$se.fit)

FSELOW<-exp(G$fit-1.96*G$se.fit)
plot(RK$D.PARK, RK$TOT.N, xlab = "Distance to park", ylab = "Road kills", pch=20, col="#e76bf3")
lines(MyData$D.PARK,F,lty=1)
lines(MyData$D.PARK,FSEUP,lty=2)
lines(MyData$D.PARK,FSELOW,lty=2)

Figure 3:Fitted line of the optimal quasi-Poisson model using only D.PARK as the explanatory variables. Its estimated parameters, standard errors, etc. are given below and the fitted line is presented in this figure. Note that the confidence intervals around the line are now larger than before due to the overdispersion correction.

summary(M12)

## 
## Call:
## glm(formula = TOT.N ~ D.PARK, family = quasipoisson, data = RK)
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.316e+00  1.194e-01  36.156  < 2e-16 ***
## D.PARK      -1.058e-04  1.212e-05  -8.735 1.24e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasipoisson family taken to be 7.630148)
## 
##     Null deviance: 1071.4  on 51  degrees of freedom
## Residual deviance:  390.9  on 50  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 4

VI. Model Validation

Just as in linear regression, we have to apply a model validation after we have decided on the optimal GLM, and the residuals are an important tool for this. The method is the same or both GLM and GAM but we focus just on GLM here.

There are 3 types of residuals (actually more, see McCullagh and Nelder 1989): 1. Pearson residuals, not looking for normality but for lack of fit. Should not show pattern. 2. Deviance residuals (R default), not looking for normality but for lack of fit. Should not show pattern. 3. Ordinary residuals, looking for normality

Illustrating Residuals

To explain model validation, we use the optimal quasi-Poisson GLM for the amphibian roadkills data. Recall that there was an overdispersion of 7.63 and that the only significant explanatory variable was D.PARK. Figure 9.6 shows the standard output from a plot command, and Fig. 9.7 contains the response residuals, Pearson residuals, scaled Pearson residuals (we divided the Pearson residuals by the square root of the overdispersion parameter), and the deviance residuals. Both figures indicate that there is a clear pattern in the residuals. Note that it is hard to detect any differences between Pearson and deviance residuals. Some additional exploration into the residuals against other explanatory variables and spatial locations is done in Chapter 16. As in linear regression, we can also use leverage and the Cook distance statistic. There are no influential observations. The following R code was used to produce Figs. 4 and 5.

Unfortunately, the function resid ignores the overdispersion; so we need to manually divide the Pearson residuals by the square root of 7.63 or calculate these residuals from scratch (as we did here). The rest of the code plots the residuals and should be self explanatory.

op <- par(mfrow = c(2, 2))
plot(M12)

Figure 4: Standard output from a GLM function applied on the amphibian roadkills data obtained by the plot command

M12<-glm(TOT.N~D.PARK,family=quasipoisson,data=RK)
EP=resid(M5,type="pearson")
ED=resid(M5,type="deviance")
mu=predict(M5,type="response")
E=RK$TOT.N-mu
EP2=E/sqrt(7.630148*mu) #Unfortunately, the function resid ignores the overdispersion; so we need to manually divide the Pearson residuals by the square root of 7.63
op <- par(mfrow=c(2,2))
plot(x=mu,y=E,main="Response residuals")
plot(x=mu,y=EP,main="Pearson residuals")
plot(x=mu,y=EP2,main="Pearson residuals scaled")
plot(x=mu,y=ED,main="Deviance residuals")

Figure 5:Response residuals (observed minus fitted values, also called ordinary residuals), Pearson residuals, scaled Pearson residuals (the overdispersion is taken into account) and the deviance residuals for the optimal quasi-Poisson model applied on the amphibian roadkills data

par(op)

Negative Binomial

GAM

References

[1] Zuur, A. F., Ieno, E. N., Walker, N., Saveliev, A. A. & Smith, G. M. Mixed Effects Models and Extensions in Ecology with R (Springer, 2010).

Zuur Ch. 9