Homework 8 - Geospatial Analysis

The goal of this homework is to interpolate the bird richness data to a surface with IDW and via kriging.

# Set working directory
setwd("~/Documents/ESCI Time Series/Homework 8")

# Load necessary packages
library(sp)

## Warning: package 'sp' was built under R version 3.3.2

library(gstat)

## Warning: package 'gstat' was built under R version 3.3.2

library(raster)
library(maptools)

## Warning: package 'maptools' was built under R version 3.3.2

## Checking rgeos availability: TRUE

First, load empty grid to interpolate data onto.

Andy made this empty grid with 25 km cells clipped to the boundary of Mexico to interpolate the bird richness data onto.

birds <- readRDS("birdRichnessMexico.rds")
spplot(birds, "nSpecies", colorkey=T)

# Read in empty grid, plot, add bird richess data points, and see how empty grid is formatted
gridMex <- readRDS("gridMex.rds")
plot(gridMex)
plot(birds,add=T,pch=20)

summary(gridMex)

## Object of class SpatialPixelsDataFrame
## Coordinates:
##          min     max
## x -1139251.5 1860748
## y  -995597.2 1004403
## Is projected: TRUE 
## proj4string :
## [+proj=aea +lat_1=14.5 +lat_2=32.5 +lat_0=24 +lon_0=-105 +x_0=0
## +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs +towgs84=0,0,0]
## Number of points: 3132
## Grid attributes:
##    cellcentre.offset cellsize cells.dim
## s1        -1276751.5    25000       127
## s2         -983097.2    25000        80
## Data attributes:
##      layer  
##  Min.   :0  
##  1st Qu.:0  
##  Median :0  
##  Mean   :0  
##  3rd Qu.:0  
##  Max.   :0

Inverse Distance Weighting (IDW)

The the IDW method estimates z by using known measurements at several locations and weights closer points more heavily than farther away points. This means that points that are closer will influence the estimate a lot more than points far away. By changing the value of p, we can adjust how points are weighted. This is a deterministic method that doesn’t rely upon the statistical relationship among the known points, but just uses a mathematical formula based on distance between points. This means there is no error/uncertainty in these calculations.

Weighted spatial average: \[\hat{Z}(s_0)=\frac{\sum_{i=1}^n w(s_i)Z(s_i)}{\sum_{i=1}^n w(s_i)}\]

Weight is calculated as an inverse function of distance: \[ w(s_i) = ||s_i - s_0||^{-p} \]

# Create an object to hold the inverse distance weight function output for the bird richness data for several powers

# Power 0.5
birdIDW <- idw(formula= nSpecies~1,
               locations= birds,
               newdata= gridMex,
               idp=0.5)

## [inverse distance weighted interpolation]

# View beginning of dataset 
head(birdIDW@data)

##   var1.pred var1.var
## 1  221.0785       NA
## 2  220.9107       NA
## 3  220.7543       NA
## 4  220.6127       NA
## 5  220.4880       NA
## 6  220.3803       NA

# Power 1
birdIDW1 <- idw(formula= nSpecies~1,
               locations= birds,
               newdata= gridMex,
               idp=1)

## [inverse distance weighted interpolation]

# Power 2
birdIDW2 <- idw(formula= nSpecies~1,
               locations= birds,
               newdata= gridMex,
               idp=2.5)

## [inverse distance weighted interpolation]

# Power 4
birdIDW3 <- idw(formula= nSpecies~1,
               locations= birds,
               newdata= gridMex,
               idp=4)

## [inverse distance weighted interpolation]

# Power 6
birdIDW4 <- idw(formula= nSpecies~1,
               locations= birds,
               newdata= gridMex,
               idp=6)

## [inverse distance weighted interpolation]

# Plot IDW calculations with different powers
par(mfrow=c(2,2))
spplot(birdIDW, "var1.pred", main="Power = 0.5")

spplot(birdIDW1, "var1.pred", main="Power = 1")

spplot(birdIDW2, "var1.pred", main="Power = 2")

spplot(birdIDW3, "var1.pred", main="Power = 4")

spplot(birdIDW4, "var1.pred", main="Power = 6")

From looking at these interpolated plots with different power values, it is difficult to tell which p value is better or worse. A much better way to determine the best p value would be to use a bootstrapping method where you hold out parts of the data, make predictions, and compare the z to \(\hat z\) values. Repeat this for several values of p and compare the \(R^2\) values. This would give you an idea of which model gave you the best predictions. However, a better method is to take into account the spatial autocorrelation within the data. We do this by a method called kriging outlined below.

Kriging

Kriging is a geostatistical method that uses the spatial autocorrelation information from the data to interpolate. First, we will do this manually and then automatically using a function.

First, fit a variogram and model to the data. This is a structural analysis.

birdVar <- variogram(nSpecies~1, birds)
summary(birdVar)

##        np              dist             gamma           dir.hor 
##  Min.   : 155.0   Min.   :  50455   Min.   : 237.5   Min.   :0  
##  1st Qu.: 757.5   1st Qu.: 298844   1st Qu.:1908.6   1st Qu.:0  
##  Median : 893.0   Median : 558141   Median :2440.4   Median :0  
##  Mean   : 810.0   Mean   : 559756   Mean   :2433.7   Mean   :0  
##  3rd Qu.: 956.0   3rd Qu.: 818467   3rd Qu.:3248.7   3rd Qu.:0  
##  Max.   :1083.0   Max.   :1078585   Max.   :4111.7   Max.   :0  
##     dir.ver     id    
##  Min.   :0   var1:15  
##  1st Qu.:0            
##  Median :0            
##  Mean   :0            
##  3rd Qu.:0            
##  Max.   :0

# Plot the semivariogram
plot(birdVar, 
     pch=20, cex=1.5, col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)", 
     main = "Bird Richness")

# Fit a spherical model
sph.model <- vgm(psill= 2500, 
                 model= "Sph", 
                 range= 600000, 
                 nugget= 0)

sph.fit <- fit.variogram(object= birdVar, 
                         model= sph.model)

# Plot the fitted model
plot(birdVar, 
     pch=20, cex=1.5, col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)", 
     main = "Bird Richness",
     model=sph.fit)

# Fit an exponential model
exp.model <- vgm(psill= 2500, 
                 model= "Exp", 
                 range= 600000, 
                 nugget= 0)

exp.fit <- fit.variogram(object= birdVar, 
                         model= exp.model)

# Plot the fitted model
plot(birdVar, 
     pch=20, cex=1.5, col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)", 
     main = "Bird Richness",
     model=exp.fit)

The exponential model appears to fit the semivariogram better than the spherical model, so I will utilize this for the Kriging interpolation.

Second, interpolate by kriging using hand-fit model of variogram.

birdExp <- krige(nSpecies~1, birds, gridMex, model= exp.fit)

## [using ordinary kriging]

spplot(birdExp, "var1.pred")

Third, perform this automatically using a function and compare to the hand-fit version. Automatically performing this does not necessarily give you a better result/one that makes sense.

library(automap)
birdVarauto <- autofitVariogram(nSpecies~1, birds)
summary(birdVarauto)

## Experimental variogram:
##      np       dist      gamma dir.hor dir.ver   id
## 1    12   18686.66   20.20833       0       0 var1
## 2    48   36495.36  174.90625       0       0 var1
## 3    78   59101.39  330.67949       0       0 var1
## 4   152   88775.27  736.08882       0       0 var1
## 5   204  122711.96 1269.84069       0       0 var1
## 6   273  158086.20 1517.68498       0       0 var1
## 7  1072  236608.91 1527.87313       0       0 var1
## 8  1432  352106.38 2163.75489       0       0 var1
## 9  2439  497644.17 2343.40693       0       0 var1
## 10 2310  675405.75 2838.77381       0       0 var1
## 11 2141  847102.90 3316.00771       0       0 var1
## 12 2604 1053083.48 3635.70315       0       0 var1
## 
## Fitted variogram model:
##   model    psill    range kappa
## 1   Nug    0.000      0.0   0.0
## 2   Ste 3310.266 445743.9   0.6
## Sums of squares betw. var. model and sample var.[1] 0.005326698

plot(birdVarauto)

The automatic function used the same nugget, a higher sill value, and a lower range value than I fit manually.

birdKrig <- autoKrige(nSpecies~1, birds, gridMex)

## [using ordinary kriging]

plot(birdKrig)

We can see from the kriging interpolation a general increasing bird richness moving from north to south. There also appear to be higher bird richness measures along the coastline of the main landmass of Mexico with a high density of bird richness in the southern end of the country and a low density of bird richness in the Baja.