Spatial autocorrelation in the Meuse River data

I called up all of the packages that we have been working with for spatial anaylsis, just in case knitr needed them, and because hey, why not? As one R fortune() so nicely put it, “RAM is cheap and thinking hurts.” (Credit: Uwe Ligges (about memory requirements in R), from R-help, June 2007).

library(knitr)
library(sp)
library(maptools)
## Checking rgeos availability: FALSE
##      Note: when rgeos is not available, polygon geometry     computations in maptools depend on gpclib,
##      which has a restricted licence. It is disabled by default;
##      to enable gpclib, type gpclibPermit()
library(rgdal)
## rgdal: version: 0.9-2, (SVN revision 526)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.11.2, released 2015/02/10
## Path to GDAL shared files: C:/Users/Ezra/Documents/R/win-library/3.1/rgdal/gdal
## GDAL does not use iconv for recoding strings.
## Loaded PROJ.4 runtime: Rel. 4.9.1, 04 March 2015, [PJ_VERSION: 491]
## Path to PROJ.4 shared files: C:/Users/Ezra/Documents/R/win-library/3.1/rgdal/proj
library(raster)
library(gstat)

I read in the meuse data from the gstat package and converted it into a SpatialPointsDataFrame as follows:

data(meuse)
head(meuse)
##        x      y cadmium copper lead zinc  elev       dist   om ffreq soil
## 1 181072 333611    11.7     85  299 1022 7.909 0.00135803 13.6     1    1
## 2 181025 333558     8.6     81  277 1141 6.983 0.01222430 14.0     1    1
## 3 181165 333537     6.5     68  199  640 7.800 0.10302900 13.0     1    1
## 4 181298 333484     2.6     81  116  257 7.655 0.19009400  8.0     1    2
## 5 181307 333330     2.8     48  117  269 7.480 0.27709000  8.7     1    2
## 6 181390 333260     3.0     61  137  281 7.791 0.36406700  7.8     1    2
##   lime landuse dist.m
## 1    1      Ah     50
## 2    1      Ah     30
## 3    1      Ah    150
## 4    0      Ga    270
## 5    0      Ah    380
## 6    0      Ga    470
class(meuse)
## [1] "data.frame"

coordinates(meuse) <- ~x+y
proj4string(meuse) <- CRS("+init=epsg:28992")
class(meuse)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"

Here is a bubble() plot of lead concentrations along the Meuse River. Data courtesy of gstat, lovely plotting capabilities thanks to ggplot2.

bubble(meuse, xcol="x",ycol="y",zcol="lead",
       maxsize = 2.75,
       main = "Lead Concentrations (ppm)")

Here is a line fitted to the lead semivariogram points.

lead.vgm = variogram(lead~1, meuse)
lead.fit = fit.variogram(lead.vgm, model = vgm(1, "Sph", 910, 1))
print(lead.fit)
##   model     psill    range
## 1   Nug  3024.179   0.0000
## 2   Sph 12412.891 906.6741
plot(lead.vgm, lead.fit, pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)",main = "Lead Concentrations (ppm)")

The nugget is at gamma = 3024.178991 (embedded in this text via R lead.fit$psill[1]), and the data are spatially autocorrelated out to 906.6740757 meters, at least based on how I am interpreting these values.

In the gstat tutorial PDF it says that cutoff and width “correspond to some extent to map extent and cell size,” which I’m having a little bit of trouble interpreting. I have yet to determine how exactly width is manifested in the plots, aside from the fact that it basically serves to spread things out vertically (or so it seems). I’ve made plots with various width values (specifically 0.5, 1, 2, 5) below in order to see how they compare (to reiterate: thinking hurts, but RAM = cheap). Based on the first two plots, it appears that width = 1 is the default. When the width command is used with cloud = FALSE and width is set to something other than the default value of 1, it basically compresses or stretches the plot vertically, depending on whether width is <1 or >1.

leadVarCloudT <- variogram(lead~1, meuse, cloud = TRUE)
plot(leadVarCloudT,pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)",main = "Lead Concentrations (ppm)")

leadVarCloudT1 <- variogram(lead~1, meuse, cloud = TRUE, width = 1)
plot(leadVarCloudT1,pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)",main = "Lead Concentrations (ppm)")

leadVarCloudF <- variogram(lead~1, meuse, cloud = FALSE)
plot(leadVarCloudF, lead.fit, pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)",main = "Lead Concentrations (ppm)")

leadVarCloudF1 <- variogram(lead~1, meuse, cloud = FALSE, width = 1)
plot(leadVarCloudF1, lead.fit, pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)",main = "Lead Concentrations (ppm)")

leadVar.5 <- variogram(lead~1, meuse, cloud = FALSE, width=0.5)
plot(leadVar.5, lead.fit, pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)", main = "Lead Concentrations (ppm)")

leadVar1 <- variogram(lead~1, meuse, cloud = FALSE, width=1)
plot(leadVar1, lead.fit, pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)", main = "Lead Concentrations (ppm)")

leadVar2 <- variogram(lead~1, meuse, cloud = FALSE, width=2)
plot(leadVar2, lead.fit, pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)", main = "Lead Concentrations (ppm)")

leadVar5 <- variogram(lead~1, meuse, cloud = FALSE, width=5)
plot(leadVar5, lead.fit, pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)", main = "Lead Concentrations (ppm)")