1. The data below are the times to death \(y\) and the log of the white blood cell count \(x\) for 17 patients suffering from leukaemia.

y <- c(3.36,2.88,3.63,3.41,3.78,4.02,4.00,4.23,3.73,3.85,3.97,4.51,4.54,
    5.00,5.00,4.72,5.00)
x <- c(65,156,100,134,16,108,121,4,39,143,56,26,22,1,1,5,6)
plot(x, y)

plot(1/x, y)

Yes, and it becomes linear when we plot \(1/y\) vs \(x\).

Inverse link function.

g1 <- glm(y~x,family=Gamma(link="inverse"))
summary(g1,dispersion=1)
## 
## Call:
## glm(formula = y ~ x, family = Gamma(link = "inverse"))
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)   
## (Intercept) 0.2156853  0.0804951   2.679  0.00737 **
## x           0.0005382  0.0011642   0.462  0.64387   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Gamma family taken to be 1)
## 
##     Null deviance: 0.38385  on 16  degrees of freedom
## Residual deviance: 0.15971  on 15  degrees of freedom
## AIC: 22.477
## 
## Number of Fisher Scoring iterations: 4
summary(g1)
## 
## Call:
## glm(formula = y ~ x, family = Gamma(link = "inverse"))
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.2156853  0.0081804  26.366 5.57e-14 ***
## x           0.0005382  0.0001183   4.549 0.000384 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Gamma family taken to be 0.01032795)
## 
##     Null deviance: 0.38385  on 16  degrees of freedom
## Residual deviance: 0.15971  on 15  degrees of freedom
## AIC: 22.477
## 
## Number of Fisher Scoring iterations: 4

2. Analysis of the data on the number of plant species in Galapagos islands resulted in the following output.

library(GLMsData)
data("galapagos")
fit1 <- glm(Plants ~ Area + Elevation + StCruz + Adjacent + Nearest,family = poisson, data = galapagos)
summary(fit1)
## 
## Call:
## glm(formula = Plants ~ Area + Elevation + StCruz + Adjacent + 
##     Nearest, family = poisson, data = galapagos)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  3.328e+00  5.071e-02  65.623  < 2e-16 ***
## Area        -5.511e-04  2.579e-05 -21.369  < 2e-16 ***
## Elevation    3.357e-03  8.425e-05  39.846  < 2e-16 ***
## StCruz      -6.500e-03  6.359e-04 -10.221  < 2e-16 ***
## Adjacent    -6.304e-04  2.914e-05 -21.637  < 2e-16 ***
## Nearest      8.743e-03  1.806e-03   4.841 1.29e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 3447.59  on 28  degrees of freedom
## Residual deviance:  700.48  on 23  degrees of freedom
## AIC: 868.29
## 
## Number of Fisher Scoring iterations: 5
anova(fit1,test="LRT")
## Analysis of Deviance Table
## 
## Model: poisson, link: log
## 
## Response: Plants
## 
## Terms added sequentially (first to last)
## 
## 
##           Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
## NULL                         28     3447.6              
## Area       1   871.96        27     2575.6 < 2.2e-16 ***
## Elevation  1   789.47        26     1786.2 < 2.2e-16 ***
## StCruz     1   289.51        25     1496.6 < 2.2e-16 ***
## Adjacent   1   773.63        24      723.0 < 2.2e-16 ***
## Nearest    1    22.54        23      700.5 2.055e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
MSE <- 700.5/23

The significance of the predictors for quasipoisson models can be tested using \(F\)-test statistics. The \(F\)-test statistic asymptotically follows an \(F_{1,23}\) distribution. The test statistic is calculated as \(F = \frac{\mbox{Deviance}_{\rm Nearest}/df_{Nearrest}}{MSE}\) and the \(p\)-value is obtained by comparing the test statistic with the associated asymptotic distribution.

testStat <- 22.54/MSE

1 - pf(testStat, 1, 23)
## [1] 0.398517

The \(p\)-value suggests that this predictor is not significant. Alternatively use a critical value at 95% level

qf(.95, 1, 23)
## [1] 4.279344

Equivalently, we can obtain the same \(F\)-test statistic and \(p\)-value via the following

fit2 <- glm(Plants ~ Area + Elevation + StCruz + Adjacent + Nearest,family = quasipoisson, data = galapagos)
anova(fit2, test = "F")
## Analysis of Deviance Table
## 
## Model: quasipoisson, link: log
## 
## Response: Plants
## 
## Terms added sequentially (first to last)
## 
## 
##           Df Deviance Resid. Df Resid. Dev       F    Pr(>F)    
## NULL                         28     3447.6                      
## Area       1   871.96        27     2575.6 28.1714 2.182e-05 ***
## Elevation  1   789.47        26     1786.2 25.5065 4.117e-05 ***
## StCruz     1   289.51        25     1496.6  9.3535   0.00557 ** 
## Adjacent   1   773.63        24      723.0 24.9945 4.671e-05 ***
## Nearest    1    22.54        23      700.5  0.7283   0.40223    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
sum(residuals(fit1, type = "pearson")^2)/fit1$df.residual
## [1] 30.95181