Set working directory

setwd("~/Documents/geog6000/lab05")

Exercise 1

Read in data

island2 <- read.csv("../datafiles/island2.csv")
island2
##    incidence  area isolation quality
## 1          1 7.928     3.317       3
## 2          0 1.925     7.554       8
## 3          1 2.045     5.883       8
## 4          0 4.781     5.932       2
## 5          0 1.536     5.308       3
## 6          1 7.369     4.934       4
## 7          1 8.599     2.882       6
## 8          0 2.422     8.771       7
## 9          1 6.403     6.092       3
## 10         1 7.199     6.977       3
## 11         0 2.654     7.748       8
## 12         1 4.128     4.297       5
## 13         0 4.175     8.516       3
## 14         1 7.098     3.318       2
## 15         0 2.392     9.292       5
## 16         1 8.690     5.923       2
## 17         1 4.667     4.997       9
## 18         1 2.307     2.434       3
## 19         1 1.053     6.842       3
## 20         0 0.657     6.452       2
## 21         1 5.632     2.506       7
## 22         1 5.895     6.613       6
## 23         0 4.855     9.577       6
## 24         1 8.241     6.330       5
## 25         1 2.898     6.649       3
## 26         1 4.445     5.032       3
## 27         1 6.027     2.023       3
## 28         0 1.574     5.737       1
## 29         0 2.700     5.762       3
## 30         0 3.106     6.924       9
## 31         0 4.330     7.262       5
## 32         1 7.799     3.219       8
## 33         0 4.630     9.229       2
## 34         1 6.951     5.841       7
## 35         0 0.859     9.294       4
## 36         1 3.657     2.759       8
## 37         0 2.696     8.342       8
## 38         0 4.171     8.805       6
## 39         1 8.063     2.274       1
## 40         0 0.153     5.650       7
## 41         1 1.948     5.447       3
## 42         1 2.006     5.480       1
## 43         1 6.508     4.007       3
## 44         1 1.501     5.400       9
## 45         1 9.269     5.272       5
## 46         1 4.170     4.786       5
## 47         0 3.376     5.219       1
## 48         0 2.228     7.990       1
## 49         0 1.801     8.466       6
## 50         1 6.441     3.451       8

Convert to factors

island2$incidence = factor(island2$incidence, levels = c(0, 1), labels = c("absent", "present"))
island2
##    incidence  area isolation quality
## 1    present 7.928     3.317       3
## 2     absent 1.925     7.554       8
## 3    present 2.045     5.883       8
## 4     absent 4.781     5.932       2
## 5     absent 1.536     5.308       3
## 6    present 7.369     4.934       4
## 7    present 8.599     2.882       6
## 8     absent 2.422     8.771       7
## 9    present 6.403     6.092       3
## 10   present 7.199     6.977       3
## 11    absent 2.654     7.748       8
## 12   present 4.128     4.297       5
## 13    absent 4.175     8.516       3
## 14   present 7.098     3.318       2
## 15    absent 2.392     9.292       5
## 16   present 8.690     5.923       2
## 17   present 4.667     4.997       9
## 18   present 2.307     2.434       3
## 19   present 1.053     6.842       3
## 20    absent 0.657     6.452       2
## 21   present 5.632     2.506       7
## 22   present 5.895     6.613       6
## 23    absent 4.855     9.577       6
## 24   present 8.241     6.330       5
## 25   present 2.898     6.649       3
## 26   present 4.445     5.032       3
## 27   present 6.027     2.023       3
## 28    absent 1.574     5.737       1
## 29    absent 2.700     5.762       3
## 30    absent 3.106     6.924       9
## 31    absent 4.330     7.262       5
## 32   present 7.799     3.219       8
## 33    absent 4.630     9.229       2
## 34   present 6.951     5.841       7
## 35    absent 0.859     9.294       4
## 36   present 3.657     2.759       8
## 37    absent 2.696     8.342       8
## 38    absent 4.171     8.805       6
## 39   present 8.063     2.274       1
## 40    absent 0.153     5.650       7
## 41   present 1.948     5.447       3
## 42   present 2.006     5.480       1
## 43   present 6.508     4.007       3
## 44   present 1.501     5.400       9
## 45   present 9.269     5.272       5
## 46   present 4.170     4.786       5
## 47    absent 3.376     5.219       1
## 48    absent 2.228     7.990       1
## 49    absent 1.801     8.466       6
## 50   present 6.441     3.451       8

Center Isolation & Area

island2$isolation.cen = island2$isolation - mean(island2$isolation)
island2$area.cen = island2$area - mean(island2$area)

Boxplot Time!
Incidence & Area

boxplot(area.cen ~ incidence, data = island2)

Incidence & Isolation

boxplot(isolation.cen ~ incidence, data = island2)

Incidence & Quality

boxplot(quality ~ incidence, data = island2)

The two primary explanatory variables are the island area and island isolation.

GLM

island2.glm1 <- glm(incidence ~ area.cen + isolation.cen, 
                    data = island2, 
                    family = binomial(link = 'logit'))
summary(island2.glm1)
## 
## Call:
## glm(formula = incidence ~ area.cen + isolation.cen, family = binomial(link = "logit"), 
##     data = island2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8189  -0.3089   0.0490   0.3635   2.1192  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)   
## (Intercept)     1.1154     0.5877   1.898  0.05770 . 
## area.cen        0.5807     0.2478   2.344  0.01909 * 
## isolation.cen  -1.3719     0.4769  -2.877  0.00401 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 68.029  on 49  degrees of freedom
## Residual deviance: 28.402  on 47  degrees of freedom
## AIC: 34.402
## 
## Number of Fisher Scoring iterations: 6

Coefficients

coef(island2.glm1)
##   (Intercept)      area.cen isolation.cen 
##     1.1154028     0.5807241    -1.3719411

Significance
* Intercept: 0.05770
* Area (centered): 0.01909*
* Isolation (centered): 0.00401**
* AIC: 34.402

Odd thing

exp(coef(island2.glm1))
##   (Intercept)      area.cen isolation.cen 
##     3.0507967     1.7873322     0.2536142

Prediction Time! Area of 5 and isolation distance of 6
Build New Dataframe

newisland = data.frame(area.cen = 5 - mean(island2$area), data.frame(isolation.cen = 6 -mean(island2$isolation)))

Prediction

newislandpred = predict(island2.glm1, 
        newdata = newisland, 
        type = 'response', 
        se.fit = TRUE)
newislandpred
## $fit
##         1 
## 0.7881208 
## 
## $se.fit
##         1 
## 0.1125028 
## 
## $residual.scale
## [1] 1

Predicted value: 0.788 Standard error: 0.1125028

Exercise 2

Poisson regression model (abundance/cover), stream distance & elevation are explanatory variables
Read in data:

tsuga.csv <- read.csv("../datafiles/tsuga.csv")

Poisson regression model:

tsuga.csv.glm <- glm(cover ~ streamdist + elev, 
               data = tsuga.csv, 
               family = poisson(link = 'log'))
summary(tsuga.csv.glm)
## 
## Call:
## glm(formula = cover ~ streamdist + elev, family = poisson(link = "log"), 
##     data = tsuga.csv)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.31395  -0.82155  -0.07929   0.71900   2.62316  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  1.622e+00  5.226e-02  31.047  < 2e-16 ***
## streamdist  -8.963e-04  1.173e-04  -7.641 2.15e-14 ***
## elev         8.901e-05  5.653e-05   1.575    0.115    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 748.23  on 744  degrees of freedom
## Residual deviance: 687.10  on 742  degrees of freedom
##   (1 observation deleted due to missingness)
## AIC: 3150.2
## 
## Number of Fisher Scoring iterations: 4
coef(tsuga.csv.glm)
##   (Intercept)    streamdist          elev 
##  1.622411e+00 -8.963237e-04  8.901257e-05

AIC: 3150.2

Significance
* Intercept: < 2e-16***
* streamdist: 2.15e-14***
* elev: 0.115

Transform the coefficients:

exp(coef(tsuga.csv.glm))
## (Intercept)  streamdist        elev 
##   5.0652901   0.9991041   1.0000890

Brief Interpretation
The abundance/cover of trees is dependent on the two explanatory variables. The variable “streamdist” is less than one and before transforming it was negative. Both of these outputs indicate that the abundance of trees decreases as one gets further away from a stream. The “elev” variable is over one and was positive before transforming it. This indicates that the abundance of trees increases as elevation increases. (Not sure the units used for this dataset)
The sign indicates a positive explanatory relationship between abundance, stream distance, and elevation.

```