y <- c(3.36,2.88,3.63,3.41,3.78,4.02,4.00,4.23,3.73,3.85,3.97,4.51,4.54,
5.00,5.00,4.72,5.00)
x <- c(65,156,100,134,16,108,121,4,39,143,56,26,22,1,1,5,6)
plot(x, y)
plot(1/x, y)
Yes, and it becomes linear when we plot \(1/y\) vs \(x\).
Inverse link function.
g1 <- glm(y~x,family=Gamma(link="inverse"))
summary(g1,dispersion=1)
##
## Call:
## glm(formula = y ~ x, family = Gamma(link = "inverse"))
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.2156853 0.0804951 2.679 0.00737 **
## x 0.0005382 0.0011642 0.462 0.64387
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Gamma family taken to be 1)
##
## Null deviance: 0.38385 on 16 degrees of freedom
## Residual deviance: 0.15971 on 15 degrees of freedom
## AIC: 22.477
##
## Number of Fisher Scoring iterations: 4
summary(g1)
##
## Call:
## glm(formula = y ~ x, family = Gamma(link = "inverse"))
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.2156853 0.0081804 26.366 5.57e-14 ***
## x 0.0005382 0.0001183 4.549 0.000384 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Gamma family taken to be 0.01032795)
##
## Null deviance: 0.38385 on 16 degrees of freedom
## Residual deviance: 0.15971 on 15 degrees of freedom
## AIC: 22.477
##
## Number of Fisher Scoring iterations: 4
library(GLMsData)
data("galapagos")
fit1 <- glm(Plants ~ Area + Elevation + StCruz + Adjacent + Nearest,family = poisson, data = galapagos)
summary(fit1)
##
## Call:
## glm(formula = Plants ~ Area + Elevation + StCruz + Adjacent +
## Nearest, family = poisson, data = galapagos)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.328e+00 5.071e-02 65.623 < 2e-16 ***
## Area -5.511e-04 2.579e-05 -21.369 < 2e-16 ***
## Elevation 3.357e-03 8.425e-05 39.846 < 2e-16 ***
## StCruz -6.500e-03 6.359e-04 -10.221 < 2e-16 ***
## Adjacent -6.304e-04 2.914e-05 -21.637 < 2e-16 ***
## Nearest 8.743e-03 1.806e-03 4.841 1.29e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 3447.59 on 28 degrees of freedom
## Residual deviance: 700.48 on 23 degrees of freedom
## AIC: 868.29
##
## Number of Fisher Scoring iterations: 5
anova(fit1,test="LRT")
## Analysis of Deviance Table
##
## Model: poisson, link: log
##
## Response: Plants
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 28 3447.6
## Area 1 871.96 27 2575.6 < 2.2e-16 ***
## Elevation 1 789.47 26 1786.2 < 2.2e-16 ***
## StCruz 1 289.51 25 1496.6 < 2.2e-16 ***
## Adjacent 1 773.63 24 723.0 < 2.2e-16 ***
## Nearest 1 22.54 23 700.5 2.055e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Is there evidence of overdispersion? Explain your answer. Yes, the residual deviance 700.5 is much larger than degrees of freedom 23.
Suppose the variance is proportional to the mean rather than equal to the mean. Estimate the proportionality parameter using residual deviance.
MSE <- 700.5/23
The significance of the predictors for quasipoisson models can be tested using \(F\)-test statistics. The \(F\)-test statistic asymptotically follows an \(F_{1,23}\) distribution. The test statistic is calculated as \(F = \frac{\mbox{Deviance}_{\rm Nearest}/df_{Nearrest}}{MSE}\) and the \(p\)-value is obtained by comparing the test statistic with the associated asymptotic distribution.
testStat <- 22.54/MSE
1 - pf(testStat, 1, 23)
## [1] 0.398517
The \(p\)-value suggests that this predictor is not significant. Alternatively use a critical value at 95% level
qf(.95, 1, 23)
## [1] 4.279344
Equivalently, we can obtain the same \(F\)-test statistic and \(p\)-value via the following
fit2 <- glm(Plants ~ Area + Elevation + StCruz + Adjacent + Nearest,family = quasipoisson, data = galapagos)
anova(fit2, test = "F")
## Analysis of Deviance Table
##
## Model: quasipoisson, link: log
##
## Response: Plants
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev F Pr(>F)
## NULL 28 3447.6
## Area 1 871.96 27 2575.6 28.1714 2.182e-05 ***
## Elevation 1 789.47 26 1786.2 25.5065 4.117e-05 ***
## StCruz 1 289.51 25 1496.6 9.3535 0.00557 **
## Adjacent 1 773.63 24 723.0 24.9945 4.671e-05 ***
## Nearest 1 22.54 23 700.5 0.7283 0.40223
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
sum(residuals(fit1, type = "pearson")^2)/fit1$df.residual
## [1] 30.95181