Andy Bunn Examples

This first part is submitted as homework for Western Washington Univeristy’s ESCI597 Time Series and Spatial Analysis course, lead by Dr. Andy Bunn. Please see his RPubs Page Here

# install.packages("sp")
# install.packages("rgeos")
# install.packages("maptools")
# install.packages("rgdal")
# install.packages("raster")
# install.packages("gstat")
# install.packages("usdm")


library(sp)
library(rgeos)

## rgeos version: 0.3-8, (SVN revision 460)
##  GEOS runtime version: 3.3.3-CAPI-1.7.4 
##  Polygon checking: TRUE

library(maptools)

## Checking rgeos availability: TRUE

library(rgdal)

## rgdal: version: 0.9-2, (SVN revision 526)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.9.2, released 2012/10/08
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.1/Resources/library/rgdal/gdal
## Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
## Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.1/Resources/library/rgdal/proj

library(raster)
library(gstat)

Meuse Cadmium Data

Let’s take a look at the classis Meuse floodplain metals concentration dataset.

require(sp)
data(meuse)
head(meuse)

##        x      y cadmium copper lead zinc  elev       dist   om ffreq soil
## 1 181072 333611    11.7     85  299 1022 7.909 0.00135803 13.6     1    1
## 2 181025 333558     8.6     81  277 1141 6.983 0.01222430 14.0     1    1
## 3 181165 333537     6.5     68  199  640 7.800 0.10302900 13.0     1    1
## 4 181298 333484     2.6     81  116  257 7.655 0.19009400  8.0     1    2
## 5 181307 333330     2.8     48  117  269 7.480 0.27709000  8.7     1    2
## 6 181390 333260     3.0     61  137  281 7.791 0.36406700  7.8     1    2
##   lime landuse dist.m
## 1    1      Ah     50
## 2    1      Ah     30
## 3    1      Ah    150
## 4    0      Ga    270
## 5    0      Ah    380
## 6    0      Ga    470

class(meuse)

## [1] "data.frame"

coordinates(meuse) <- ~x+y # this comes from the {sp} package
proj4string(meuse) <- CRS("+init=epsg:28992")
class(meuse)

## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"

bubble(meuse, xcol="x",ycol="y",zcol="cadmium", maxsize = 2.5, main = "Cadmium concentrations (ppm)")

This just set up the data.

next we use the variogram, which is “the simplest measure of spatial autocorrelation.”

cadVarCloud <- variogram(cadmium~1, meuse, cloud = TRUE)
plot(cadVarCloud,pch=20,cex=1.5,col="black",
     ylab=expression("Semivariance ("*gamma*")"),
     xlab="Distance (m)",main = "Cadmium concentrations (ppm)")

You can use this to find where points are no longer spatially autocorrelated. This could be a good baseline distance for point sampling if you wish to avoid autocorrelation. On the flipside, you can look at the distances that are spatially autocorrelated and investigate the reltionships at play there.

Perhaps we could do this with a RasterLayer

My Own Snow Departure Data

I will now perform a variogram on my own raster data.

I have developed here a snow departure timing map by scanning through the MOD10A2 MODIS Data Product to perform a change detection analysis. Every time that a pixel changes from snow to no snow that date (in this case, week of year) is recorded as a value in this map. Colors now represent the week of year that snow left the landscape for 2013.

library(raster)
WUS2013.path<-'/Users/bgatts/Desktop/WWU/ESCI597/ESCI_597_Snowmelt/SnowMeltTiffs/wus2013zero.tif'
WUS2013.raster=raster(WUS2013.path)
plot(WUS2013.raster)

plot(WUS2013.raster)

The color scheme here is a bit funky as this is a wok in progress. Zero values recieved no snow in 2013. Colors then represent 10+the number of weeks into the year that snow left that particular pixel. For example, pixels flanking many of the mountain ranges are dark green and have a value of 30, meaning that they melted off 20 weeks into the year (late May). I began counting here at 10 as to visually separate the snow from non-snow areas, as well as to separate them further in “snow melt timing” space in the variogram.

Crater Lake Subset

Let’s just take a close up look at the Central Sierra Nevada

CRLAclip<-'/Users/bgatts/Desktop/WWU/ESCI597/ESCI_597_Snowmelt/SnowMeltTiffs/2013CRLA.tif'
CRLA2013=raster(CRLAclip)



CRLA2013.values <- getValues(CRLA2013)
my.xlim <- c(10,40)
minElev <-min(CRLA2013.values)
maxElev <-max(CRLA2013.values)

plot(CRLA2013)

h<-hist(CRLA2013.values, breaks=seq(minElev,maxElev,length = 40), 
        col=rev(terrain.colors(40)), xlab="Melt Week of Year (weeks)",
        main="",
        xlim=my.xlim, 
        #ylim=c(0,0.001599), 
        freq=FALSE) 
xfit<-seq(minElev,maxElev,length=100) 
yfit<-dnorm(xfit,mean=mean(CRLA2013.values),sd=sd(CRLA2013.values)) 
lines(xfit, yfit, col="black", lwd=2)
boxplot(CRLA2013.values, horizontal=TRUE,  outline=TRUE,  axes=FALSE,at=0.0001,
        ylim=my.xlim, col = "white", add = TRUE, boxwex=0.02)

Spatial Variogram

Here is a discussion of Variogram {usdm}: Empirical variogram for raster data http://www.inside-r.org/packages/cran/usdm/docs/Variogram

#install.packages("usdm")
library("usdm")

Usage Variogram(x, lag, cutoff, cells, size=100) Arguments x a raster object (RasterLayer) lag the lag size (width of subsequent distance intervals) into which cell pairs are grouped for semivariance estimates. If missing, the cell size (raster resolution) is assigned. cutoff spatial separation distance up to which cell pairs are included in semivariance estimates; as a default, the length of the diagonal of the box spanning the data is divided by three. cells numeric (optional). A vector of cell numbers in the Raster object. This forces the function to only consider these cells (and their neighbours) to compute the variogram. size positive integer specifying the number of cells to be drawn from raster object. If the number of cells in the raster object is large, a sample with the specified size is drawn to make the computation more efficient.

RasterStack: interesting (http://www.inside-r.org/packages/cran/raster/docs/stack)

Well let’s look at the variogram:

CRLA2013.var=Variogram(CRLA2013, size=50)
plot(CRLA2013.var)

Interpretation

As one may expect, as physical distance increases, semivariance increases as well. Semvariance here resembles a mountain gradient, and I believe this is an artifact of the linear gradients at play along these mountains. It is hard to see, but it appears that the “nugget” is small and contained mostly within the very first bump (<0.1). Remember this x-axis is in decimal degrees.

Sierra Nevada Subset

Let’s just take a close up look at the Central Sierra Nevada

sierraclip<-'/Users/bgatts/Desktop/WWU/ESCI597/ESCI_597_Snowmelt/SnowMeltTiffs/sierrapos.tif'
sierra2013=raster(sierraclip)

sierra2013.values <- getValues(sierra2013)
# sierra2013.allvalues <- getValues(sierra2013)
sierra2013.sdvalues=sierra2013.values[sierra2013.values >0]
my.xlim <- c(10,34)
minElev <-min(sierra2013.values)
maxElev <-max(sierra2013.values)

plot(sierra2013)

h<-hist(sierra2013.values, breaks=seq(minElev,maxElev,length = 34), 
        col=rev(terrain.colors(34)), xlab="Melt Week of Year (weeks)",
        main="",
        xlim=my.xlim, 
        #ylim=c(0,0.001599), 
        freq=FALSE) 
xfit<-seq(10,maxElev,length=100) 
yfit<-dnorm(xfit,mean=mean(22),sd=sd(sierra2013.sdvalues)) 
lines(xfit, yfit, col="black", lwd=2)
boxplot(sierra2013.values, horizontal=TRUE,  outline=TRUE,  axes=FALSE,at=0.0001,
        ylim=my.xlim, col = "white", add = TRUE, boxwex=0.02)

Spatial Variogram

Well let’s look at the variogram:

sierra2013.var=Variogram(sierra2013, size=20)
plot(sierra2013.var)

plot(sierra2013.var, xlim=c(0,.5))

Interpretation

Descriptive Statistics

CRLA2013.values <- getValues(CRLA2013)
my.xlim <- c(10,40)
minElev <-min(CRLA2013.values)
maxElev <-max(CRLA2013.values)
h<-hist(CRLA2013.values, breaks=seq(minElev,maxElev,length = 40), 
        col=rev(terrain.colors(40)), xlab="Melt Week of Year (weeks)",
        main="",
        xlim=my.xlim, 
        #ylim=c(0,0.001599), 
        freq=FALSE) 
xfit<-seq(minElev,maxElev,length=100) 
yfit<-dnorm(xfit,mean=mean(CRLA2013.values),sd=sd(CRLA2013.values)) 
lines(xfit, yfit, col="black", lwd=2)
boxplot(CRLA2013.values, horizontal=TRUE,  outline=TRUE,  axes=FALSE,at=0.0001,
        ylim=my.xlim, col = "white", add = TRUE, boxwex=0.02)

Test for Normality

I then randomly sampled 50 points from the snow loss timing data and tested for normality using the Shapiro-Wilk test:

# Subset this sample to remove areas of no snow
CRLA2013.snowvalues=CRLA2013.values[CRLA2013.values >0]
CRLA2013.shapiro=seq(1:10)

for (i in 1:10) {
  CRLA2013.sample50=sample(CRLA2013.snowvalues, size=50, replace = FALSE, prob = NULL)
  CRLA2013.shapiro[i]=shapiro.test(CRLA2013.sample50)$p.value
  #hist(CRLA2013.sample50)
}

barplot(CRLA2013.shapiro, ylab="p-values")
abline(h=0.05)

Spatial Analysis and Variograms

Donal O’Leary

Wednesday, April 29, 2015

Andy Bunn Examples

Meuse Cadmium Data

My Own Snow Departure Data

Crater Lake Subset

Spatial Variogram

Interpretation

Sierra Nevada Subset

Spatial Variogram

Interpretation

Descriptive Statistics

Test for Normality