This week we had no lectures or readings that covered new material. However, after completing the first quiz and the Poisson regression assignment, I have some additional information I want to document for future reference.
For the linear regression review, I want to revisit interaction terms.
lifedata <- read.csv("http://www.cknudson.com/data/LifeExp.csv")
attach(lifedata)
model1 <- lm(MaleLife ~ Birth*Death)
summary(model1)
##
## Call:
## lm(formula = MaleLife ~ Birth * Death)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -9.5160 -1.8130  0.1159  2.3615  8.1887
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 76.09897    2.92098  26.053  < 2e-16 ***
## Birth       -0.22479    0.08676  -2.591  0.01112 *
## Death       -0.03466    0.27949  -0.124  0.90157
## Birth:Death -0.02209    0.00728  -3.035  0.00312 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.249 on 93 degrees of freedom
## Multiple R-squared: 0.8894, Adjusted R-squared: 0.8859
## F-statistic: 249.3 on 3 and 93 DF, p-value: < 2.2e-16
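When the model formula contains just one interaction written with * (here Birth*Death), R expands it to Birth + Death + Birth:Death. Both main effects therefore appear in the model summary alongside the interaction, and each one uses a degree of freedom: with an intercept and three slopes, the 97 observations leave 93 residual degrees of freedom.
In terms of new Poisson regression material, I want to document how to check, fit, interpret, and evaluate a model.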
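As a quick check of how the * operator expands, here is a minimal sketch. It uses R's built-in mtcars data rather than the life expectancy data, and the variables mpg, wt, and hp are stand-ins chosen only for illustration.

```r
# Illustration on the built-in mtcars data (not the LifeExp data above).
fit_star <- lm(mpg ~ wt * hp, data = mtcars)          # shorthand form
fit_long <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)  # written out in full

# Both formulas estimate the same four coefficients:
# intercept, two main effects, and the interaction.
names(coef(fit_star))
all.equal(coef(fit_star), coef(fit_long))

# Residual df = number of observations minus number of estimated coefficients.
nrow(mtcars) - length(coef(fit_star))
df.residual(fit_star)
```

The same bookkeeping applies to model1: four estimated coefficients and 93 residual degrees of freedom.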
eleph <- read.csv("http://www.cknudson.com/data/elephant.csv")
attach(eleph)
A histogram of the response is a good first check on whether Poisson regression is a reasonable choice.
hist(MATINGS)
We want to see a right-skewed distribution, and the responses should be counts: integers greater than or equal to zero.
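These checks can also be done numerically. The sketch below uses simulated counts, not the elephant data, and the rate of 2.5 is an arbitrary value picked for illustration. Comparing the mean and variance is only a rough check, since the Poisson assumption in a regression applies conditionally on the predictors.

```r
# Simulated counts for illustration only (not the elephant data).
set.seed(42)
y <- rpois(200, lambda = 2.5)

# Counts should be nonnegative integers.
all(y >= 0 & y == round(y))

# For Poisson data the mean and variance should be roughly equal.
mean(y)
var(y)

# The histogram should pile up near zero with a tail to the right.
hist(y, main = "Simulated Poisson counts", xlab = "count")
```

For the elephant data, the same idea would be to compare mean(MATINGS) with var(MATINGS).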
Pmod <- glm(MATINGS ~ AGE, family = poisson)
summary(Pmod)
##
## Call:
## glm(formula = MATINGS ~ AGE, family = poisson)
##
## Deviance Residuals:
##      Min       1Q   Median       3Q      Max
## -2.80798 -0.86137 -0.08629  0.60087  2.17777
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.58201    0.54462  -2.905  0.00368 **
## AGE          0.06869    0.01375   4.997 5.81e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 75.372 on 40 degrees of freedom
## Residual deviance: 51.012 on 39 degrees of freedom
## AIC: 156.46
##
## Number of Fisher Scoring iterations: 5
Pmod$coef[2]
## AGE
## 0.06869281
exp(.06869281)
## [1] 1.071107
exp(confint(Pmod, 'AGE', level=.95))
## Waiting for profiling to be done...
## 2.5 % 97.5 %
## 1.042558 1.100360
There are a few important interpretations here.
First, a one-year increase in an elephant's age is associated with an increase of 0.06869 in the log of the expected number of matings. The coefficient is on the log scale, so it is not a change in the count itself.
Second, exponentiating the coefficient gives a multiplicative factor that may be more interpretable: exp(0.06869) = 1.0711. A one-year increase in an elephant's age is associated with an approximately 7.11% increase in the expected number of matings.
Lastly, we calculate a 95% confidence interval for that factor. We are 95% confident that each additional year of age is associated with an increase in the expected number of matings of between 4.26% and 10.04%.
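The arithmetic behind these interpretations can be sketched from the numbers printed in the summary. The Wald interval below is my own addition for comparison and is only an approximation. The interval computed with confint above is the one reported, and its output message suggests it is based on profile likelihood.

```r
# Values copied from the summary(Pmod) output above.
b  <- 0.06869281  # AGE coefficient (log scale)
se <- 0.01375     # its standard error

# Percent change in expected matings per additional year of age.
(exp(b) - 1) * 100

# Effects multiply across years: factor for a 10-year age difference.
exp(10 * b)

# Approximate Wald interval: build it on the log scale, then exponentiate.
wald <- exp(b + c(-1, 1) * qnorm(0.975) * se)
(wald - 1) * 100
```

Because the model is multiplicative, a 10-year difference corresponds to exp(10 * b), not 10 times the one-year percentage.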
The last thing I want to mention is how to evaluate Poisson Regression results.
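Since the Poisson regression summary has no R-squared value, we can run a goodness-of-fit test instead: use pchisq to compare the residual deviance, as the test statistic, to a chi-squared distribution with the residual degrees of freedom.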
pchisq(Pmod$deviance, Pmod$df.residual, lower.tail = FALSE)
## [1] 0.09426231
This p-value is greater than our alpha of .05, so we do not have evidence of lack of fit.
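The same test can be reproduced directly from the deviances printed in the summary. The sketch below also adds a related check that is not part of the notes above: a drop-in-deviance test comparing the null model to the model with AGE.

```r
# Deviances and degrees of freedom copied from the summary(Pmod) output above.
null_dev  <- 75.372; null_df  <- 40
resid_dev <- 51.012; resid_df <- 39

# Goodness of fit: a small p-value here would signal lack of fit.
gof_p <- pchisq(resid_dev, resid_df, lower.tail = FALSE)
gof_p

# Drop-in-deviance test for AGE: does adding AGE reduce the deviance
# by more than chance alone would explain?
drop_p <- pchisq(null_dev - resid_dev, null_df - resid_df, lower.tail = FALSE)
drop_p
```

With the fitted model available, anova(Pmod, test = "Chisq") should report the same drop-in-deviance comparison.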