This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
library(rmarkdown); library(knitr)
library(dplyr)
library(psych); library(corrplot); library(GPArotation)
library(lavaan); library(semPlot); library(moments)
library(faraway)
data(gala)
attach(gala)
#in class note
modpos <- glm(Species ~ Area + Elevation + Nearest + Scruz, family = poisson)
summary(modpos)
##
## Call:
## glm(formula = Species ~ Area + Elevation + Nearest + Scruz, family = poisson)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -13.429 -6.064 -1.679 2.049 16.582
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.702e+00 4.015e-02 92.20 <2e-16 ***
## Area -2.314e-04 2.263e-05 -10.22 <2e-16 ***
## Elevation 2.147e-03 6.587e-05 32.60 <2e-16 ***
## Nearest 2.678e-02 1.660e-03 16.13 <2e-16 ***
## Scruz -1.273e-02 7.058e-04 -18.04 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 3510.7 on 29 degrees of freedom
## Residual deviance: 1341.5 on 25 degrees of freedom
## AIC: 1512.3
##
## Number of Fisher Scoring iterations: 5
pchisq(1341.5, 24, lower.tail=FALSE)
## [1] 1.567323e-268
We need to understand the meaning of the coefficents. Then take the coefficents for elevation. For example, in this model, an increase of 1 meter in elavation results in an increase of .001588 in the log mean number of species of tortioses holding Area and Nearest constant. An increase of 1 unit in the elevation in the islands highest points is associated w/ a multiplicative increase of 1.0021 (e^.0021) in the mean number of species, holding all other variables constant
A halfnorm plot of the residuals will show outliers but I know the issue lies somewhere else.
A key characteristic of Poisson distribution and regression is that the mean and variance are equal; however, I believe this may not be the case for my model.
dp <- sum(residuals(modpos, type ="pearson")^2 / modpos$df.residual)
summary(modpos, dispersion = dp)
##
## Call:
## glm(formula = Species ~ Area + Elevation + Nearest + Scruz, family = poisson)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -13.429 -6.064 -1.679 2.049 16.582
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.7017746 0.3194589 11.588 < 2e-16 ***
## Area -0.0002314 0.0001801 -1.285 0.1988
## Elevation 0.0021471 0.0005241 4.097 4.19e-05 ***
## Nearest 0.0267753 0.0132058 2.028 0.0426 *
## Scruz -0.0127332 0.0056155 -2.267 0.0234 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 63.30627)
##
## Null deviance: 3510.7 on 29 degrees of freedom
## Residual deviance: 1341.5 on 25 degrees of freedom
## AIC: 1512.3
##
## Number of Fisher Scoring iterations: 5
Fixed the issue with the above line. Dispersion parameter is important in Poisson regression
modpos2 <- glm(Species ~ Area + Elevation + Scruz + Nearest, family = poisson)
summary(modpos2)
##
## Call:
## glm(formula = Species ~ Area + Elevation + Scruz + Nearest, family = poisson)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -13.429 -6.064 -1.679 2.049 16.582
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.702e+00 4.015e-02 92.20 <2e-16 ***
## Area -2.314e-04 2.263e-05 -10.22 <2e-16 ***
## Elevation 2.147e-03 6.587e-05 32.60 <2e-16 ***
## Scruz -1.273e-02 7.058e-04 -18.04 <2e-16 ***
## Nearest 2.678e-02 1.660e-03 16.13 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 3510.7 on 29 degrees of freedom
## Residual deviance: 1341.5 on 25 degrees of freedom
## AIC: 1512.3
##
## Number of Fisher Scoring iterations: 5
dropinDev <- deviance(modpos) - deviance(modpos2)
dispL <- sum(residuals(modpos2, type ="pearson")^2 / modpos$df.residual)
teststat <- dropinDev/(dispL*1)
pf(teststat, df1 = 1, 25, lower.tail = FALSE)
## [1] 1
#now an intro to confidence intervals
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.