R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

library(rmarkdown); library(knitr)

library(dplyr)
library(psych); library(corrplot); library(GPArotation)
library(lavaan); library(semPlot); library(moments)
library(faraway)
data(gala)
attach(gala)

#in class note

modpos <- glm(Species ~ Area + Elevation + Nearest + Scruz, family = poisson)
summary(modpos)
## 
## Call:
## glm(formula = Species ~ Area + Elevation + Nearest + Scruz, family = poisson)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -13.429   -6.064   -1.679    2.049   16.582  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  3.702e+00  4.015e-02   92.20   <2e-16 ***
## Area        -2.314e-04  2.263e-05  -10.22   <2e-16 ***
## Elevation    2.147e-03  6.587e-05   32.60   <2e-16 ***
## Nearest      2.678e-02  1.660e-03   16.13   <2e-16 ***
## Scruz       -1.273e-02  7.058e-04  -18.04   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 3510.7  on 29  degrees of freedom
## Residual deviance: 1341.5  on 25  degrees of freedom
## AIC: 1512.3
## 
## Number of Fisher Scoring iterations: 5
pchisq(1341.5, 24, lower.tail=FALSE)
## [1] 1.567323e-268

We need to understand the meaning of the coefficents. Then take the coefficents for elevation. For example, in this model, an increase of 1 meter in elavation results in an increase of .001588 in the log mean number of species of tortioses holding Area and Nearest constant. An increase of 1 unit in the elevation in the islands highest points is associated w/ a multiplicative increase of 1.0021 (e^.0021) in the mean number of species, holding all other variables constant

A halfnorm plot of the residuals will show outliers but I know the issue lies somewhere else.

A key characteristic of Poisson distribution and regression is that the mean and variance are equal; however, I believe this may not be the case for my model.

dp <- sum(residuals(modpos, type ="pearson")^2 / modpos$df.residual)
summary(modpos, dispersion = dp)
## 
## Call:
## glm(formula = Species ~ Area + Elevation + Nearest + Scruz, family = poisson)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -13.429   -6.064   -1.679    2.049   16.582  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  3.7017746  0.3194589  11.588  < 2e-16 ***
## Area        -0.0002314  0.0001801  -1.285   0.1988    
## Elevation    0.0021471  0.0005241   4.097 4.19e-05 ***
## Nearest      0.0267753  0.0132058   2.028   0.0426 *  
## Scruz       -0.0127332  0.0056155  -2.267   0.0234 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 63.30627)
## 
##     Null deviance: 3510.7  on 29  degrees of freedom
## Residual deviance: 1341.5  on 25  degrees of freedom
## AIC: 1512.3
## 
## Number of Fisher Scoring iterations: 5

Fixed the issue with the above line. Dispersion parameter is important in Poisson regression

modpos2 <- glm(Species ~ Area + Elevation + Scruz + Nearest, family = poisson)
summary(modpos2)
## 
## Call:
## glm(formula = Species ~ Area + Elevation + Scruz + Nearest, family = poisson)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -13.429   -6.064   -1.679    2.049   16.582  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  3.702e+00  4.015e-02   92.20   <2e-16 ***
## Area        -2.314e-04  2.263e-05  -10.22   <2e-16 ***
## Elevation    2.147e-03  6.587e-05   32.60   <2e-16 ***
## Scruz       -1.273e-02  7.058e-04  -18.04   <2e-16 ***
## Nearest      2.678e-02  1.660e-03   16.13   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 3510.7  on 29  degrees of freedom
## Residual deviance: 1341.5  on 25  degrees of freedom
## AIC: 1512.3
## 
## Number of Fisher Scoring iterations: 5
dropinDev <- deviance(modpos) - deviance(modpos2)
dispL <- sum(residuals(modpos2, type ="pearson")^2 / modpos$df.residual)
teststat <- dropinDev/(dispL*1)
pf(teststat, df1 = 1, 25, lower.tail = FALSE)
## [1] 1

#now an intro to confidence intervals

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.