This week had no lectures or readings covering new material. However, after completing the first quiz and the Poisson regression assignment, I have some additional notes I want to document for future reference.

For the linear regression review, I want to revisit interaction terms.

lifedata <- read.csv("http://www.cknudson.com/data/LifeExp.csv")
attach(lifedata)
model1 <- lm(MaleLife ~ Birth*Death)
summary(model1)
## 
## Call:
## lm(formula = MaleLife ~ Birth * Death)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.5160 -1.8130  0.1159  2.3615  8.1887 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 76.09897    2.92098  26.053  < 2e-16 ***
## Birth       -0.22479    0.08676  -2.591  0.01112 *  
## Death       -0.03466    0.27949  -0.124  0.90157    
## Birth:Death -0.02209    0.00728  -3.035  0.00312 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.249 on 93 degrees of freedom
## Multiple R-squared:  0.8894, Adjusted R-squared:  0.8859 
## F-statistic: 249.3 on 3 and 93 DF,  p-value: < 2.2e-16

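In an R formula, Birth*Death is shorthand for Birth + Death + Birth:Death. So even though the formula contains only one interaction term, both main effects also appear in the model summary. Each estimated coefficient uses one degree of freedom, which is why the residual degrees of freedom are 93: the number of observations minus the four estimated coefficients (the intercept, Birth, Death, and Birth:Death).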
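To check that the * operator really does expand to both main effects plus their interaction, here is a minimal sketch. It uses R's built-in mtcars data rather than the life expectancy data, so it runs without a download; the variables mpg, wt, and hp are stand-ins chosen only for illustration.

```r
# Fit the same model two ways on the built-in mtcars data (illustrative only).
fit_star     <- lm(mpg ~ wt * hp, data = mtcars)           # shorthand
fit_expanded <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)   # written out in full

# Both fits contain the two main effects and the interaction.
names(coef(fit_star))

# The coefficient estimates are identical.
same_coefs <- isTRUE(all.equal(coef(fit_star), coef(fit_expanded)))

# Residual df = number of rows minus number of estimated coefficients.
resid_df <- df.residual(fit_star)
resid_df == nrow(mtcars) - length(coef(fit_star))
```

Passing the data frame through the data argument, as done here, is an alternative to attach() that avoids name clashes between data sets.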

For the new material on Poisson regression, I start by loading the elephant mating data.

eleph <- read.csv("http://www.cknudson.com/data/elephant.csv")
attach(eleph)

A histogram of the response is a quick first check on whether Poisson regression is a reasonable choice.

hist(MATINGS)

We want to see a right-skewed distribution, and the responses should be counts: whole numbers greater than or equal to zero.
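The checks on the response can also be done numerically. The sketch below uses simulated counts, not the elephant data, and looks_like_counts is a hypothetical helper written for this note. Comparing the mean and variance is an extra informal check that I am adding here, based on the fact that a Poisson distribution has equal mean and variance.

```r
# Simulated counts for illustration only; the rate of 3 is arbitrary.
set.seed(1)
y <- rpois(200, lambda = 3)

# Hypothetical helper: TRUE when every value is a non-negative whole number.
looks_like_counts <- function(x) {
  all(!is.na(x)) && all(x >= 0) && all(x == round(x))
}

looks_like_counts(y)

# Frequency table of the counts, a text version of the histogram.
table(y)

# For Poisson data these two numbers should be roughly similar.
c(mean = mean(y), variance = var(y))
```

With the real data attached, the same helper could be called as looks_like_counts(MATINGS).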

Pmod <- glm(MATINGS ~ AGE, family = poisson)
summary(Pmod)
## 
## Call:
## glm(formula = MATINGS ~ AGE, family = poisson)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.80798  -0.86137  -0.08629   0.60087   2.17777  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.58201    0.54462  -2.905  0.00368 ** 
## AGE          0.06869    0.01375   4.997 5.81e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 75.372  on 40  degrees of freedom
## Residual deviance: 51.012  on 39  degrees of freedom
## AIC: 156.46
## 
## Number of Fisher Scoring iterations: 5
Pmod$coef[2]
##        AGE 
## 0.06869281
exp(.06869281)
## [1] 1.071107
exp(confint(Pmod, 'AGE', level=.95))
## Waiting for profiling to be done...
##    2.5 %   97.5 % 
## 1.042558 1.100360

There are a couple of important interpretations here.

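First, a one-year increase in an elephant's age is associated with an increase of 0.06869 in the log of the expected number of matings. The coefficient is on the log scale because Poisson regression uses a log link, so it is not a change in the count itself.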

Second, exponentiating the coefficient gives a multiplicative factor that may be more interpretable. A one-year increase in an elephant's age is associated with an expected number of matings about 1.0711 times as large, an increase of approximately 7.11%.
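The reason exponentiating works follows from the log link. Writing the fitted model with the estimates from the summary above, where the symbols \mu, \beta_0, and \beta_1 are my own notation for the expected number of matings, the intercept, and the AGE coefficient:

```latex
\log \mu(\text{AGE}) = \beta_0 + \beta_1 \,\text{AGE}
\quad\Longrightarrow\quad
\frac{\mu(\text{AGE}+1)}{\mu(\text{AGE})}
  = \frac{e^{\beta_0 + \beta_1(\text{AGE}+1)}}{e^{\beta_0 + \beta_1 \text{AGE}}}
  = e^{\beta_1}
  = e^{0.06869} \approx 1.0711
```

The ratio does not depend on the starting age, so the same multiplicative change applies to every additional year.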

Lastly, we calculate a 95% confidence interval for that factor. We are 95% confident that an additional year of age is associated with an increase in the expected number of matings of between 4.26% and 10.04%.
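The percentages come from subtracting 1 from each ratio and multiplying by 100. A small sketch of that arithmetic, with the values typed in from the output above so that it runs without the fitted model; pct_change is a hypothetical helper name:

```r
# Values copied from the output above.
b_age    <- 0.06869281                           # AGE coefficient (log scale)
ci_ratio <- c(lower = 1.042558, upper = 1.100360) # exponentiated 95% CI

# Hypothetical helper: convert a multiplicative ratio to a percent change.
pct_change <- function(ratio) 100 * (ratio - 1)

round(pct_change(exp(b_age)), 2)   # about 7.11
round(pct_change(ci_ratio), 2)     # about 4.26 and 10.04
```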

The last thing I want to mention is how to evaluate Poisson Regression results.

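Since the Poisson regression summary has no R-squared value, we can instead run a goodness-of-fit test with pchisq, using the residual deviance as the test statistic and the residual degrees of freedom as the chi-squared degrees of freedom. The null hypothesis is that the model fits adequately.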

pchisq(Pmod$deviance, Pmod$df.residual, lower.tail = FALSE)
## [1] 0.09426231

Because this p-value (about 0.094) is greater than our alpha of .05, we do not have significant evidence of lack of fit.
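The same test can be reproduced by hand from the numbers printed in the model summary, which is a useful check when only the summary output is available. This is a sketch using the rounded residual deviance, so the p-value matches the one above only to a few decimal places.

```r
# Values read off the summary output above.
resid_dev <- 51.012   # residual deviance (rounded as printed)
resid_df  <- 39       # residual degrees of freedom
alpha     <- 0.05

# Upper-tail chi-squared probability: P(X >= resid_dev) when the model fits.
p_value <- pchisq(resid_dev, df = resid_df, lower.tail = FALSE)
p_value

# TRUE would signal significant lack of fit; here it is FALSE.
lack_of_fit <- p_value < alpha
lack_of_fit
```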