# Set working directory
setwd("~/Documents/ESCI Time Series/Homework 8")
# Load necessary packages
library(sp)
## Warning: package 'sp' was built under R version 3.3.2
library(gstat)
## Warning: package 'gstat' was built under R version 3.3.2
library(raster)
library(maptools)
## Warning: package 'maptools' was built under R version 3.3.2
## Checking rgeos availability: TRUE
Andy made this empty grid with 25 km cells clipped to the boundary of Mexico to interpolate the bird richness data onto.
birds <- readRDS("birdRichnessMexico.rds")
spplot(birds, "nSpecies", colorkey=T)
# Read in empty grid, plot, add bird richess data points, and see how empty grid is formatted
gridMex <- readRDS("gridMex.rds")
plot(gridMex)
plot(birds,add=T,pch=20)
summary(gridMex)
## Object of class SpatialPixelsDataFrame
## Coordinates:
## min max
## x -1139251.5 1860748
## y -995597.2 1004403
## Is projected: TRUE
## proj4string :
## [+proj=aea +lat_1=14.5 +lat_2=32.5 +lat_0=24 +lon_0=-105 +x_0=0
## +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs +towgs84=0,0,0]
## Number of points: 3132
## Grid attributes:
## cellcentre.offset cellsize cells.dim
## s1 -1276751.5 25000 127
## s2 -983097.2 25000 80
## Data attributes:
## layer
## Min. :0
## 1st Qu.:0
## Median :0
## Mean :0
## 3rd Qu.:0
## Max. :0
The the IDW method estimates z by using known measurements at several locations and weights closer points more heavily than farther away points. This means that points that are closer will influence the estimate a lot more than points far away. By changing the value of p, we can adjust how points are weighted. This is a deterministic method that doesn’t rely upon the statistical relationship among the known points, but just uses a mathematical formula based on distance between points. This means there is no error/uncertainty in these calculations.
Weighted spatial average: \[\hat{Z}(s_0)=\frac{\sum_{i=1}^n w(s_i)Z(s_i)}{\sum_{i=1}^n w(s_i)}\]
Weight is calculated as an inverse function of distance: \[ w(s_i) = ||s_i - s_0||^{-p} \]
# Create an object to hold the inverse distance weight function output for the bird richness data for several powers
# Power 0.5
birdIDW <- idw(formula= nSpecies~1,
locations= birds,
newdata= gridMex,
idp=0.5)
## [inverse distance weighted interpolation]
# View beginning of dataset
head(birdIDW@data)
## var1.pred var1.var
## 1 221.0785 NA
## 2 220.9107 NA
## 3 220.7543 NA
## 4 220.6127 NA
## 5 220.4880 NA
## 6 220.3803 NA
# Power 1
birdIDW1 <- idw(formula= nSpecies~1,
locations= birds,
newdata= gridMex,
idp=1)
## [inverse distance weighted interpolation]
# Power 2
birdIDW2 <- idw(formula= nSpecies~1,
locations= birds,
newdata= gridMex,
idp=2.5)
## [inverse distance weighted interpolation]
# Power 4
birdIDW3 <- idw(formula= nSpecies~1,
locations= birds,
newdata= gridMex,
idp=4)
## [inverse distance weighted interpolation]
# Power 6
birdIDW4 <- idw(formula= nSpecies~1,
locations= birds,
newdata= gridMex,
idp=6)
## [inverse distance weighted interpolation]
# Plot IDW calculations with different powers
par(mfrow=c(2,2))
spplot(birdIDW, "var1.pred", main="Power = 0.5")
spplot(birdIDW1, "var1.pred", main="Power = 1")
spplot(birdIDW2, "var1.pred", main="Power = 2")
spplot(birdIDW3, "var1.pred", main="Power = 4")
spplot(birdIDW4, "var1.pred", main="Power = 6")
From looking at these interpolated plots with different power values, it is difficult to tell which p value is better or worse. A much better way to determine the best p value would be to use a bootstrapping method where you hold out parts of the data, make predictions, and compare the z to \(\hat z\) values. Repeat this for several values of p and compare the \(R^2\) values. This would give you an idea of which model gave you the best predictions. However, a better method is to take into account the spatial autocorrelation within the data. We do this by a method called kriging outlined below.
Kriging is a geostatistical method that uses the spatial autocorrelation information from the data to interpolate. First, we will do this manually and then automatically using a function.
birdVar <- variogram(nSpecies~1, birds)
summary(birdVar)
## np dist gamma dir.hor
## Min. : 155.0 Min. : 50455 Min. : 237.5 Min. :0
## 1st Qu.: 757.5 1st Qu.: 298844 1st Qu.:1908.6 1st Qu.:0
## Median : 893.0 Median : 558141 Median :2440.4 Median :0
## Mean : 810.0 Mean : 559756 Mean :2433.7 Mean :0
## 3rd Qu.: 956.0 3rd Qu.: 818467 3rd Qu.:3248.7 3rd Qu.:0
## Max. :1083.0 Max. :1078585 Max. :4111.7 Max. :0
## dir.ver id
## Min. :0 var1:15
## 1st Qu.:0
## Median :0
## Mean :0
## 3rd Qu.:0
## Max. :0
# Plot the semivariogram
plot(birdVar,
pch=20, cex=1.5, col="black",
ylab=expression("Semivariance ("*gamma*")"),
xlab="Distance (m)",
main = "Bird Richness")
# Fit a spherical model
sph.model <- vgm(psill= 2500,
model= "Sph",
range= 600000,
nugget= 0)
sph.fit <- fit.variogram(object= birdVar,
model= sph.model)
# Plot the fitted model
plot(birdVar,
pch=20, cex=1.5, col="black",
ylab=expression("Semivariance ("*gamma*")"),
xlab="Distance (m)",
main = "Bird Richness",
model=sph.fit)
# Fit an exponential model
exp.model <- vgm(psill= 2500,
model= "Exp",
range= 600000,
nugget= 0)
exp.fit <- fit.variogram(object= birdVar,
model= exp.model)
# Plot the fitted model
plot(birdVar,
pch=20, cex=1.5, col="black",
ylab=expression("Semivariance ("*gamma*")"),
xlab="Distance (m)",
main = "Bird Richness",
model=exp.fit)
The exponential model appears to fit the semivariogram better than the spherical model, so I will utilize this for the Kriging interpolation.
birdExp <- krige(nSpecies~1, birds, gridMex, model= exp.fit)
## [using ordinary kriging]
spplot(birdExp, "var1.pred")
library(automap)
birdVarauto <- autofitVariogram(nSpecies~1, birds)
summary(birdVarauto)
## Experimental variogram:
## np dist gamma dir.hor dir.ver id
## 1 12 18686.66 20.20833 0 0 var1
## 2 48 36495.36 174.90625 0 0 var1
## 3 78 59101.39 330.67949 0 0 var1
## 4 152 88775.27 736.08882 0 0 var1
## 5 204 122711.96 1269.84069 0 0 var1
## 6 273 158086.20 1517.68498 0 0 var1
## 7 1072 236608.91 1527.87313 0 0 var1
## 8 1432 352106.38 2163.75489 0 0 var1
## 9 2439 497644.17 2343.40693 0 0 var1
## 10 2310 675405.75 2838.77381 0 0 var1
## 11 2141 847102.90 3316.00771 0 0 var1
## 12 2604 1053083.48 3635.70315 0 0 var1
##
## Fitted variogram model:
## model psill range kappa
## 1 Nug 0.000 0.0 0.0
## 2 Ste 3310.266 445743.9 0.6
## Sums of squares betw. var. model and sample var.[1] 0.005326698
plot(birdVarauto)
The automatic function used the same nugget, a higher sill value, and a lower range value than I fit manually.
birdKrig <- autoKrige(nSpecies~1, birds, gridMex)
## [using ordinary kriging]
plot(birdKrig)
We can see from the kriging interpolation a general increasing bird richness moving from north to south. There also appear to be higher bird richness measures along the coastline of the main landmass of Mexico with a high density of bird richness in the southern end of the country and a low density of bird richness in the Baja.