The probability mass function of the Poisson distribution follows
\[ p(y; \mu) = \frac{\mu^{y} \exp (-\mu)}{y !}, \quad \mu >0, \quad y = 0, 1, \ldots \]
with parameter \(\mu\).
The probability mass function of the Binomial distribution follows
\[ p(y; \mu, n) = \begin{pmatrix} n\\y \end{pmatrix} \left( \frac{\mu}{n} \right)^y \left(1- \frac{\mu}{n} \right)^{n-y}. \]
x1 <- c(3.7,3.5,1.25,0.75,.8,.7,.6,1.1,.9,.9,.8,.55,.6,1.4,0.75,2.3, 3.2,0.85,1.7,1.8,0.4,0.95,1.35,1.5,1.6,.6,1.8,0.95,1.9,1.6, 2.7,2.35,1.1,1.1,1.2,.8,.95,.75,1.3)
x2 <- c(.83,1.09,2.5,1.5,3.2,3.5,.75,1.7,.75,.45,.57,2.75,3,2.33,3.75, 1.64,1.6,1.42,1.06,1.8,2,1.36,1.35,1.36,1.78,1.5,1.5, 1.9,.95,.4,.75,.03,1.83,2.2,2,3.33,1.9,1.9,1.63)
y <- c(1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,1,0,1,0,1,0,1, 0,0,1,1,1,0,0,1)
y <- c(3.36,2.88,3.63,3.41,3.78,4.02,4.00,4.23,3.73,3.85,3.97,4.51,4.54, 5.00,5.00,4.72,5.00)
x <- c(65,156,100,134,16,108,121,4,39,143,56,26,22,1,1,5,6)
Load the dataset, briefly explore it, and provide a description.
Preprocess the response variable Type
and
suggest a suitable generalized linear model to classify if a glass type
is 1 or not.
# Glass$Type <- ifelse(Glass$Type == 1, 1, 0)
Give an expression of the linear predictor and discuss how it associates with the conditional mean \(E(Y \mid X)\).
Perform model diagnostics and identify problems with the above model.
Fit a quasi-likelihood model with the same mean structure and briefly discuss the difference.
Use the predict
function for prediction and
explain the output.
# index <- sample(dim(Glass)[1], 10, replace = TRUE)
# predict(fit, newdata = Glass[index, ], type = "response")
## fit refers to your fitted model
y <- c(3.36,2.88,3.63,3.41,3.78,4.02,4.00,4.23,3.73,3.85,3.97,4.51,4.54,
5.00,5.00,4.72,5.00)
x <- c(65,156,100,134,16,108,121,4,39,143,56,26,22,1,1,5,6)
library(GLMsData)
data("galapagos")
fit1 <- glm(Plants ~ Area + Elevation + Nearest + StCruz + Adjacent,family = poisson, data = galapagos)
summary(fit1)
##
## Call:
## glm(formula = Plants ~ Area + Elevation + Nearest + StCruz +
## Adjacent, family = poisson, data = galapagos)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.328e+00 5.071e-02 65.623 < 2e-16 ***
## Area -5.511e-04 2.579e-05 -21.369 < 2e-16 ***
## Elevation 3.357e-03 8.425e-05 39.846 < 2e-16 ***
## Nearest 8.743e-03 1.806e-03 4.841 1.29e-06 ***
## StCruz -6.500e-03 6.359e-04 -10.221 < 2e-16 ***
## Adjacent -6.304e-04 2.914e-05 -21.637 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 3447.59 on 28 degrees of freedom
## Residual deviance: 700.48 on 23 degrees of freedom
## AIC: 868.29
##
## Number of Fisher Scoring iterations: 5
anova(fit1,test="Chisq")
## Analysis of Deviance Table
##
## Model: poisson, link: log
##
## Response: Plants
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 28 3447.6
## Area 1 871.96 27 2575.6 < 2.2e-16 ***
## Elevation 1 789.47 26 1786.2 < 2.2e-16 ***
## Nearest 1 10.00 25 1776.2 0.001569 **
## StCruz 1 500.41 24 1275.8 < 2.2e-16 ***
## Adjacent 1 575.28 23 700.5 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1