Set working directory
setwd("~/Documents/geog6000/lab05")
Read in data
island2 <- read.csv("../datafiles/island2.csv")
island2
## incidence area isolation quality
## 1 1 7.928 3.317 3
## 2 0 1.925 7.554 8
## 3 1 2.045 5.883 8
## 4 0 4.781 5.932 2
## 5 0 1.536 5.308 3
## 6 1 7.369 4.934 4
## 7 1 8.599 2.882 6
## 8 0 2.422 8.771 7
## 9 1 6.403 6.092 3
## 10 1 7.199 6.977 3
## 11 0 2.654 7.748 8
## 12 1 4.128 4.297 5
## 13 0 4.175 8.516 3
## 14 1 7.098 3.318 2
## 15 0 2.392 9.292 5
## 16 1 8.690 5.923 2
## 17 1 4.667 4.997 9
## 18 1 2.307 2.434 3
## 19 1 1.053 6.842 3
## 20 0 0.657 6.452 2
## 21 1 5.632 2.506 7
## 22 1 5.895 6.613 6
## 23 0 4.855 9.577 6
## 24 1 8.241 6.330 5
## 25 1 2.898 6.649 3
## 26 1 4.445 5.032 3
## 27 1 6.027 2.023 3
## 28 0 1.574 5.737 1
## 29 0 2.700 5.762 3
## 30 0 3.106 6.924 9
## 31 0 4.330 7.262 5
## 32 1 7.799 3.219 8
## 33 0 4.630 9.229 2
## 34 1 6.951 5.841 7
## 35 0 0.859 9.294 4
## 36 1 3.657 2.759 8
## 37 0 2.696 8.342 8
## 38 0 4.171 8.805 6
## 39 1 8.063 2.274 1
## 40 0 0.153 5.650 7
## 41 1 1.948 5.447 3
## 42 1 2.006 5.480 1
## 43 1 6.508 4.007 3
## 44 1 1.501 5.400 9
## 45 1 9.269 5.272 5
## 46 1 4.170 4.786 5
## 47 0 3.376 5.219 1
## 48 0 2.228 7.990 1
## 49 0 1.801 8.466 6
## 50 1 6.441 3.451 8
Convert to factors
island2$incidence = factor(island2$incidence, levels = c(0, 1), labels = c("absent", "present"))
island2
## incidence area isolation quality
## 1 present 7.928 3.317 3
## 2 absent 1.925 7.554 8
## 3 present 2.045 5.883 8
## 4 absent 4.781 5.932 2
## 5 absent 1.536 5.308 3
## 6 present 7.369 4.934 4
## 7 present 8.599 2.882 6
## 8 absent 2.422 8.771 7
## 9 present 6.403 6.092 3
## 10 present 7.199 6.977 3
## 11 absent 2.654 7.748 8
## 12 present 4.128 4.297 5
## 13 absent 4.175 8.516 3
## 14 present 7.098 3.318 2
## 15 absent 2.392 9.292 5
## 16 present 8.690 5.923 2
## 17 present 4.667 4.997 9
## 18 present 2.307 2.434 3
## 19 present 1.053 6.842 3
## 20 absent 0.657 6.452 2
## 21 present 5.632 2.506 7
## 22 present 5.895 6.613 6
## 23 absent 4.855 9.577 6
## 24 present 8.241 6.330 5
## 25 present 2.898 6.649 3
## 26 present 4.445 5.032 3
## 27 present 6.027 2.023 3
## 28 absent 1.574 5.737 1
## 29 absent 2.700 5.762 3
## 30 absent 3.106 6.924 9
## 31 absent 4.330 7.262 5
## 32 present 7.799 3.219 8
## 33 absent 4.630 9.229 2
## 34 present 6.951 5.841 7
## 35 absent 0.859 9.294 4
## 36 present 3.657 2.759 8
## 37 absent 2.696 8.342 8
## 38 absent 4.171 8.805 6
## 39 present 8.063 2.274 1
## 40 absent 0.153 5.650 7
## 41 present 1.948 5.447 3
## 42 present 2.006 5.480 1
## 43 present 6.508 4.007 3
## 44 present 1.501 5.400 9
## 45 present 9.269 5.272 5
## 46 present 4.170 4.786 5
## 47 absent 3.376 5.219 1
## 48 absent 2.228 7.990 1
## 49 absent 1.801 8.466 6
## 50 present 6.441 3.451 8
Center Isolation & Area
island2$isolation.cen = island2$isolation - mean(island2$isolation)
island2$area.cen = island2$area - mean(island2$area)
Boxplot Time!
Incidence & Area
boxplot(area.cen ~ incidence, data = island2)
Incidence & Isolation
boxplot(isolation.cen ~ incidence, data = island2)
Incidence & Quality
boxplot(quality ~ incidence, data = island2)
The two primary explanatory variables are the island area and island isolation.
GLM
island2.glm1 <- glm(incidence ~ area.cen + isolation.cen,
data = island2,
family = binomial(link = 'logit'))
summary(island2.glm1)
##
## Call:
## glm(formula = incidence ~ area.cen + isolation.cen, family = binomial(link = "logit"),
## data = island2)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8189 -0.3089 0.0490 0.3635 2.1192
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.1154 0.5877 1.898 0.05770 .
## area.cen 0.5807 0.2478 2.344 0.01909 *
## isolation.cen -1.3719 0.4769 -2.877 0.00401 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 68.029 on 49 degrees of freedom
## Residual deviance: 28.402 on 47 degrees of freedom
## AIC: 34.402
##
## Number of Fisher Scoring iterations: 6
Coefficients
coef(island2.glm1)
## (Intercept) area.cen isolation.cen
## 1.1154028 0.5807241 -1.3719411
Significance
* Intercept: 0.05770
* Area (centered): 0.01909*
* Isolation (centered): 0.00401**
* AIC: 34.402
Odd thing
exp(coef(island2.glm1))
## (Intercept) area.cen isolation.cen
## 3.0507967 1.7873322 0.2536142
Prediction Time! Area of 5 and isolation distance of 6
Build New Dataframe
newisland = data.frame(area.cen = 5 - mean(island2$area), data.frame(isolation.cen = 6 -mean(island2$isolation)))
Prediction
newislandpred = predict(island2.glm1,
newdata = newisland,
type = 'response',
se.fit = TRUE)
newislandpred
## $fit
## 1
## 0.7881208
##
## $se.fit
## 1
## 0.1125028
##
## $residual.scale
## [1] 1
Predicted value: 0.788 Standard error: 0.1125028
Poisson regression model (abundance/cover), stream distance & elevation are explanatory variables
Read in data:
tsuga.csv <- read.csv("../datafiles/tsuga.csv")
Poisson regression model:
tsuga.csv.glm <- glm(cover ~ streamdist + elev,
data = tsuga.csv,
family = poisson(link = 'log'))
summary(tsuga.csv.glm)
##
## Call:
## glm(formula = cover ~ streamdist + elev, family = poisson(link = "log"),
## data = tsuga.csv)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.31395 -0.82155 -0.07929 0.71900 2.62316
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.622e+00 5.226e-02 31.047 < 2e-16 ***
## streamdist -8.963e-04 1.173e-04 -7.641 2.15e-14 ***
## elev 8.901e-05 5.653e-05 1.575 0.115
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 748.23 on 744 degrees of freedom
## Residual deviance: 687.10 on 742 degrees of freedom
## (1 observation deleted due to missingness)
## AIC: 3150.2
##
## Number of Fisher Scoring iterations: 4
coef(tsuga.csv.glm)
## (Intercept) streamdist elev
## 1.622411e+00 -8.963237e-04 8.901257e-05
AIC: 3150.2
Significance
* Intercept: < 2e-16***
* streamdist: 2.15e-14***
* elev: 0.115
Transform the coefficients:
exp(coef(tsuga.csv.glm))
## (Intercept) streamdist elev
## 5.0652901 0.9991041 1.0000890
Brief Interpretation
The abundance/cover of trees is dependent on the two explanatory variables. The variable “streamdist” is less than one and before transforming it was negative. Both of these outputs indicate that the abundance of trees decreases as one gets further away from a stream. The “elev” variable is over one and was positive before transforming it. This indicates that the abundance of trees increases as elevation increases. (Not sure the units used for this dataset)
The sign indicates a positive explanatory relationship between abundance, stream distance, and elevation.
```