Introduction

This document provides a an introduction to applied spatial statistics. It is based mostly on the second edition of the book: Applied Spatial Data Analysis with R and on the Harvard Workshop: Applied Spatial Statistics in R

Handling Spatial Data in R

Classes for spatial data

We use the following R Packages:

#install.packages("maps")
#install.packages("maptools")
#install.packages("sp")
#install.packages("spdep")
#install.packages("gstat")
#install.packages("splancs")
#install.packages("spatstat")
#install.packages("lattice")
#install.packages("pgirmess")
#install.packages("RColorBrewer")
#install.packages("classInt")
#install.packages("spgwr")
#install.packages("leaflet")


library(maps)         ## Projections
library(maptools)     ## Data management
library(sp)           ## Data management
library(spdep)        ## Spatial autocorrelation
library(gstat)        ## Geostatistics
library(splancs)      ## Kernel Density
library(spatstat)     ## Geostatistics
#library(pgirmess)     ## Spatial autocorrelation        <<<Error: required 'rgdal'>>> 
library(RColorBrewer) ## Visualization
library(classInt)     ## Class intervals
library(spgwr)        ## GWR

# load some data:
#url <- url("http://www.people.fas.harvard.edu/~zhukov/Datasets.RData")
#load(url)
#save(laos,crime,cities,volcano,election,dat88,mat88,file="Datasets.RData")

for comprehensive description of spatial packeges see CRAN’s Spatial Task View

Spatial objects:

  1. Spatial* has two slots. The first is a bounding box, a matrix of numerical coordinates with column names c(‘min’, ‘max’), and at least two rows, with the first row eastings (x-axis) and the second northings (y-axis).

  2. CRS - coordinate reference system class, gets character string as its only slot value, which may be a missing value. If it is not missing, it should be a PROJ.4-format string describing the projection (more details are given in Sect. 4.1.2). For geographical coordinates, the simplest such string is “+proj=longlat”.

m <- matrix(c(0,0,1,1), 
            ncol = 2,
            dimnames = list(NULL, c("min", "max")))
crs <- CRS(projargs = as.character(NA))
crs
## CRS arguments: NA
S <- Spatial(bbox = m, proj4string = crs)
S
## An object of class "Spatial"
## Slot "bbox":
##      min max
## [1,]   0   1
## [2,]   0   1
## 
## Slot "proj4string":
## CRS arguments: NA

using standard read.table:

url1 <- "http://www.asdar-book.org/datasets/CRAN051001a.txt"
CRAN_df <- read.table(url1, header = T)
CRAN_mat <- cbind(CRAN_df$long, CRAN_df$lat)
row.names(CRAN_mat) <- 1:nrow(CRAN_mat)
summary(CRAN_mat)
##        V1                  V2        
##  Min.   :-122.9500   Min.   :-37.82  
##  1st Qu.: -47.3833   1st Qu.: 34.52  
##  Median :   7.8500   Median : 42.73  
##  Mean   :  -0.6617   Mean   : 31.73  
##  3rd Qu.:  16.8333   3rd Qu.: 47.58  
##  Max.   : 153.0333   Max.   : 57.05

SpatialPoints object:

llCRS <- CRS("+proj=longlat +ellps=WGS84")
CRAN_sp <- SpatialPoints(CRAN_mat, proj4string = llCRS)
summary(CRAN_sp)
## Object of class SpatialPoints
## Coordinates:
##                  min      max
## coords.x1 -122.95000 153.0333
## coords.x2  -37.81667  57.0500
## Is projected: FALSE 
## proj4string : [+proj=longlat +ellps=WGS84]
## Number of points: 54

the summary gives min+max values for coordinates.

Methods

The bbox method returns the bounding box of the object, and is used both for preparing plotting methods (see Chap.3) and internally in handling data objects

bbox(CRAN_sp)
##                  min      max
## coords.x1 -122.95000 153.0333
## coords.x2  -37.81667  57.0500

SpatialPointsDataFrame

The SpatialPointsDataFrame class is the container for this kind of spatial point information, and can be constructed in a number of ways, for example from a data frame and a matrix of coordinates

The Spatial*DataFrame classes have been designed to behave as far as possible like data frames

the match.ID argument is set to its default value of TRUE, then the matrix row names are checked against the row names of the data frame

CRAN_spdf1 <- SpatialPointsDataFrame(CRAN_mat, CRAN_df,
                                     proj4string = llCRS,
                                     match.ID = TRUE)
head(CRAN_spdf1)
##              coordinates          place   north     east       loc
## 1  (153.0333, -27.46667)       Brisbane 27d28'S 153d02'E Australia
## 2  (144.9667, -37.81667)      Melbourne 37d49'S 144d58'E Australia
## 3   (16.33333, 48.21667)           Wien 48d13'N  16d20'E   Austria
## 4 (-49.26667, -25.41667)       Curitiba 25d25'S  49d16'W    Brazil
## 5    (-42.86667, -20.75)      Vi\xe7oza 20d45'S  42d52'W    Brazil
## 6         (-43.2, -22.9) Rio de Janeiro 22d54'S  43d12'W    Brazil
##        long       lat
## 1 153.03333 -27.46667
## 2 144.96667 -37.81667
## 3  16.33333  48.21667
## 4 -49.26667 -25.41667
## 5 -42.86667 -20.75000
## 6 -43.20000 -22.90000

Example: creating a SpatialPointsDataFrame

we converte the data.frame to SpatialPointsDataFrame when calling the coordinates function

url2 <- "http://www.asdar-book.org/datasets/seamap105_mod.csv"
turtle_df <- read.csv(url2)
timestamp <- as.POSIXlt(strptime(as.character(turtle_df$obs_date),
                                 "%m/%d/%Y %H:%M:%S"), "GMT")
turtle_df1 <- data.frame(turtle_df, timestamp = timestamp)
turtle_df1$lon <- ifelse(turtle_df1$lon < 0, turtle_df1$lon + 360, turtle_df1$lon)
turtle_sp <- turtle_df1[order(turtle_df1$timestamp), ]
# turtle_sp is still a data.frame

coordinates(turtle_sp) <- c("lon", "lat")
# now turlte_sp is a SpatialPointsDataFrame

proj4string(turtle_sp) <- CRS("+proj=longlat +ellps=WGS84")

SpatialLines

Example: SpatialLines and maptools

library(maps)
japan <- map("world", "japan", plot = F)
p4s <- CRS("+proj=longlat +ellps=WGS84")

library(maptools)
SLjapan <- map2SpatialLines(japan, proj4string = p4s)
str(SLjapan, max.level = 2)
## Formal class 'SpatialLines' [package "sp"] with 3 slots
##   ..@ lines      :List of 34
##   ..@ bbox       : num [1:2, 1:2] 123.7 24.3 145.8 45.5
##   .. ..- attr(*, "dimnames")=List of 2
##   ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
plot(SLjapan)

A very typical way of exploring the contents of these objects is to use lapply or sapply in combination with slot

Lines_len <- sapply(slot(SLjapan, "lines"), 
                    function(x) length(slot(x,"Lines")))
table(Lines_len)
## Lines_len
##  1 
## 34

no Lines object contains more than one Line object:

converting data to SpatialLinesDataFrame using CcontourLines2SLDF function:

consider the volcano data:

image(volcano)

volcano_sl <- ContourLines2SLDF(contourLines(volcano))
t(slot(volcano_sl, "data"))
##       C_1   C_2   C_3   C_4   C_5   C_6   C_7   C_8   C_9   C_10 
## level "100" "110" "120" "130" "140" "150" "160" "170" "180" "190"

SpatialPolygons

Start with an example:

url3 <- "http://www.asdar-book.org/datasets/auckland_mapgen.dat"
llCRS <- CRS("+proj=longlat +ellps=WGS84")
auck_shore <- MapGen2SL(url3, llCRS)
summary(auck_shore)
## Object of class SpatialLines
## Coordinates:
##     min   max
## x 174.2 175.3
## y -37.5 -36.5
## Is projected: FALSE 
## proj4string : [+proj=longlat +ellps=WGS84]
lns <- slot(auck_shore, "lines")
table(sapply(lns, function(x) length(slot(x, "Lines"))))
## 
##  1 
## 80
islands_auck <- sapply(lns, function(x) {
        crds <- slot(slot(x, "Lines")[[1]], "coords")
        identical(crds[1, ], crds[nrow(crds), ])
        })
table(islands_auck)
## islands_auck
## FALSE  TRUE 
##    16    64

Since all the Lines in the auck_shore object contain only single Line objects, checking the equality of the first and last coordinates of the first Line object in each Lines object tells us which sets of coordinates can validly be made into polygons.

The Polygon class extends the Line class by adding slots needed for polygons and checking that the first and last coordinates are identical. The extra slots are: - a label point, taken as the centroid of the polygon - the area of the polygon in the metric of the coordinates - whether the polygon is declared as a hole or not – the default value is a logical NA - the ring direction of the polygon

Example: build a SpatialPolygons object

islands_sl <- auck_shore[islands_auck]
list_of_Lines <- slot(islands_sl, "lines")
islands_sp <- SpatialPolygons(lapply(list_of_Lines, function(x) {
        Polygons(list(Polygon(slot(slot(x, "Lines")[[1]],
                           "coords"))), ID = slot(x, "ID"))
        }), proj4string = CRS("+proj=longlat +ellps=WGS84"))

summary(islands_sp)
## Object of class SpatialPolygons
## Coordinates:
##         min       max
## x 174.30297 175.22791
## y -37.43877 -36.50033
## Is projected: FALSE 
## proj4string : [+proj=longlat +ellps=WGS84]
slot(islands_sp, "plotOrder")
##  [1] 45 54 37 28 38 27 12 11 59 53  5 25 26 46  7 55 17 34 30 16  6 43 14
## [24] 40 32 19 61 42 15 50 21 18 62 23 22 29 24 44 13  2 36  9 63 58 56 64
## [47] 52 39 51  1  8  3  4 20 47 35 41 48 60 31 49 57 10 33
order(sapply(slot(islands_sp, "polygons"), 
             function(x) slot(x, "area")), decreasing = TRUE)
##  [1] 45 54 37 28 38 27 12 11 59 53  5 25 26 46  7 55 17 34 30 16  6 43 14
## [24] 40 32 19 61 42 15 50 21 18 62 23 22 29 24 44 13  2 36  9 63 58 56 64
## [47] 52 39 51  1  8  3  4 20 47 35 41 48 60 31 49 57 10 33

As we saw with the construction of SpatialLines objects from raw coordinates, here we build a list of Polygon objects for each Polygons object, corresponding to a single identifying tag. A list of these Polygons objects is then passed to the SpatialPolygons function, with a coordinate reference system, to create the SpatialPolygons object. Again, like SpatialLines objects, SpatialPolygons objects are most often created by functions that import or manipulate such data objects, and seldom from scratch.

SpatialPolygonsDataFrame

Example: merging US state data with state boundary polygons

library(maps)
state.map <- map("state", plot = F, fill = T)
IDs <- sapply(strsplit(state.map$names, ":"), function(x) x[1])
library(maptools)
state.sp <- map2SpatialPolygons(state.map, IDs = IDs,
                                proj4string = CRS("+proj=longlat +ellps=WGS84"))
plot(state.sp)

Then we can use identifying tag matching to suit the rows of the data frame to the SpatialPolygons. Here, we subset to the matched rows of the data frame, to ensure that one row corresponds to each Polygons object, to achieve one-to-one matching:

url4 <- "http://www.asdar-book.org/datasets/state.sat.data_mod.txt"
sat <- read.table(url4, row.names = 5, header = T)
id <- match(row.names(sat), row.names(state.sp))
row.names(sat)[is.na(id)]
## [1] "alaska" "hawaii" "usa"
sat1 <- sat[!is.na(id), ]

state.spdf <- SpatialPolygonsDataFrame(state.sp, sat1)
slot(state.spdf, "data")
##                      oname vscore mscore pc
## alabama                ala    561    555  9
## arizona               ariz    524    525 34
## arkansas               ark    563    556  6
## california           calif    497    514 49
## colorado              colo    536    540 32
## connecticut           conn    510    509 80
## delaware              dela    503    497 67
## district of columbia  d.c.    494    478 77
## florida                fla    499    498 53
## georgia                 ga    487    482 63
## idaho                idaho    542    540 16
## illinois               ill    569    585 12
## indiana                ind    496    498 60
## iowa                  iowa    594    598  5
## kansas                 kan    578    576  9
## kentucky                ky    547    547 12
## louisiana               la    561    558  8
## maine                maine    507    503 68
## maryland                md    507    507 65
## massachusetts         mass    511    511 78
## michigan              mich    557    565 11
## minnesota             minn    586    598  9
## mississippi           miss    563    548  4
## missouri                mo    572    572  8
## montana               mont    545    546 21
## nebraska               neb    568    571  8
## nevada                 nev    512    517 34
## new hampshire           nh    520    518 72
## new jersey              nj    498    510 80
## new mexico              nm    549    542 12
## new york                ny    495    502 76
## north carolina          nc    493    493 61
## north dakota            nd    594    605  5
## ohio                  ohio    534    538 25
## oklahoma              okla    567    560  8
## oregon                 ore    525    525 53
## pennsylvania            pa    498    495 70
## rhode island            ri    504    499 70
## south carolina          sc    479    475 61
## south dakota            sd    585    588  4
## tennessee             tenn    559    553 13
## texas                texas    494    499 50
## utah                  utah    570    568  5
## vermont                 vt    514    506 70
## virginia                va    508    499 65
## washington            wash    525    526 52
## west virginia          wva    527    512 18
## wisconsin              wis    584    595  7
## wyoming                wyo    546    551 10

SpatialGrid and SpatialPixel Objects

Example: making a GridTopology object

#TODO

Visualising Spatial Data

A major pleasure in working with spatial data is their visualisation.

This chapter introduces the plotting methods for objects inheriting from class Spatial that are provided by package sp.

Traditional Plot System

Plotting Points, Lines, Polygons, and Grids

Example: creating spatial objects from data.frame objects from the sp package

library(sp)
data("meuse")
class(meuse)
## [1] "data.frame"
head(meuse)
##        x      y cadmium copper lead zinc  elev       dist   om ffreq soil
## 1 181072 333611    11.7     85  299 1022 7.909 0.00135803 13.6     1    1
## 2 181025 333558     8.6     81  277 1141 6.983 0.01222430 14.0     1    1
## 3 181165 333537     6.5     68  199  640 7.800 0.10302900 13.0     1    1
## 4 181298 333484     2.6     81  116  257 7.655 0.19009400  8.0     1    2
## 5 181307 333330     2.8     48  117  269 7.480 0.27709000  8.7     1    2
## 6 181390 333260     3.0     61  137  281 7.791 0.36406700  7.8     1    2
##   lime landuse dist.m
## 1    1      Ah     50
## 2    1      Ah     30
## 3    1      Ah    150
## 4    0      Ga    270
## 5    0      Ah    380
## 6    0      Ga    470
# converting to Spatial object:
coordinates(meuse) <- c("x","y")
class(meuse)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
plot(meuse, main = "points")

The SpatialPointsDataFrame object used is created from a data.frame provided with sp, and the plot method shows the points with the default symbol.

cc <- coordinates(meuse)
m.sl <- SpatialLines(list(Lines(list(Line(cc)), "line1")))

plot(m.sl, main = "lines")

A SpatialLines object is made by joining up the points in sequence, and plot draws the resulting zig-zags.

data("meuse.riv")
meuse.lst <- list(Polygons(list(Polygon(meuse.riv)), "meuse.riv"))
meuse.pol <- SpatialPolygons(meuse.lst)

plot(meuse.pol, col = "grey", main ="grid")

Grid:

data("meuse.grid")
coordinates(meuse.grid) <- c("x","y")
meuse.grid <- as(meuse.grid, "SpatialPixels")
image(meuse.grid, col = "grey", main = "grid")

A map becomes more readable when we combine several elements. We can display elements from those created above by using the add = TRUE argument in function calls:

image(meuse.grid, col = "lightgrey")
plot(meuse.pol, col = "grey", add = T)
plot(meuse, pch = 11 ,col = "blue", add = T)

Axes and Layout Elements

Example:some notes about plotting and axes:

oldpar = par(no.readonly = TRUE) # set oldpar
layout(matrix(c(1, 2), 1, 2))
plot(meuse, axes = TRUE, cex = 0.6)
plot(meuse.pol, add = TRUE)
title("Sample locations")
par(mar = c(0, 0, 0, 0) + 0.1)
plot(meuse, axes = FALSE, cex = 0.6)
plot(meuse.pol, add = TRUE)
box()

par(oldpar)

This table summerises the SP DataFrame classes

Class(es) Argument Meaning Further help
SpatialLinesDataFrame col Colour ?lines
lwd Line width ?lines
lty Line type ?lines
SpatialPolygonsDataFrame border Border colour ?polygon
density Hashing density ?polygon
angle Hashing angle ?polygon
lty Line type ?polygon
pbg Hole colour
SpatialPointsDataFrame pch Symbol ?points
col Colour ?points
bg Fill colour ?points
cex Symbol size ?points
SpatialPixelsDataFrame zlim Attribute value limits ?image.default
and col Colours ?image.default
SpatialGridDataFrame breaks Break points ?image.default

Plotting Attributes and Map Legends

Example: image methods for objects of class SpatialPixelsDataFrame and SpatialGridDataFrame

library(gstat)
zn.idw <- krige(log(zinc) ~ 1, meuse, meuse.grid)
## [inverse distance weighted interpolation]
grays = gray.colors(4, 0.55, 0.95)
image(zn.idw, col = grays, breaks = log(c(100, 200, 400, 800, 1800)))

plot(meuse.pol, add = TRUE)
plot(meuse, pch = 1, cex = sqrt(meuse$zinc)/20, add = TRUE)
legVals <- c(100, 200, 500, 1000, 2000)
legend("left", legend = legVals, pch = 1, pt.cex = sqrt(legVals)/20,
       bty = "n", title = "measured")
legend("topleft", legend = c("100-200", "200-400", "400-800","800-1800"),
       fill = grays, bty = "n", title = "interpolated")

Plotting with ssplot

library(lattice)
## 
## Attaching package: 'lattice'
## The following object is masked from 'package:spatstat':
## 
##     panel.histogram
zn <- krige(zinc~1,meuse,meuse.grid)
## [inverse distance weighted interpolation]
zn$direct <- zn$var1.pred
zn$log <- exp(krige(log(zinc)~1,meuse,meuse.grid)$var1.pred)
## [inverse distance weighted interpolation]
# traditional:
levelplot(z ~ x + y | name,
          spmap.to.lev(zn[c("direct","log")]),
          asp = "iso")

# with spplot:
spplot(zn[c("direct","log")])

Example: SpatialLinesDataFrame with contourLines function and spplot (using maptools)

data(meuse.grid)
coordinates(meuse.grid) <- c("x", "y")
meuse.grid <- as(meuse.grid, "SpatialPixelsDataFrame")
im <- as.image.SpatialGridDataFrame(meuse.grid["dist"])
cl <- ContourLines2SLDF(contourLines(im))
spplot(cl)

Example: Adding Reference and Layout Elements to Plots with spplot

river <- list("sp.polygons", meuse.pol)
north <- list("SpatialPolygonsRescale",
              layout.north.arrow(),
              offset = c(178750, 332500),
              scale = 400)

scale <- list("SpatialPolygonsRescale",
              layout.scale.bar(),
              offset = c(180200, 329800), 
              scale = 1000, 
              fill = c("transparent","black"))

txt1 <- list("sp.text", c(180200, 329950), "0")
txt2 <- list("sp.text", c(181200, 329950), "1 km")

pts <- list("sp.points",
            meuse,
            pch = 3, col = "black")

meuse.layout <- list(river, north, scale, txt1, txt2, pts)

spplot(zn["log"], sp.layout = meuse.layout)

Plotting with latticeExtra

Example: latticeExtra

library(latticeExtra)
p = spplot(meuse["zinc"])
m = SpatialPolygonsDataFrame(meuse.pol,
                             data.frame(col = 1),
                             match.ID = FALSE)
l = spplot(m)
l + p

p + l

Interactive Plots

(this can not be demonstrated in the pdf version)

Example: interactive with base

plot(meuse)
region <- locator(type = "o")
n <- length(region$x)
p <- Polygon(cbind(region$x, region$y)[c(1:n, 1), ],
             hole = FALSE)
ps <- Polygons(list(p), ID = "region")
sps <- SpatialPolygons(list(ps))
plot(meuse[sps, ], pch = 16, cex = 0.5, add = TRUE)

Example: interactive with spplot To select points with spplot, use: identify = T in the spplot function

Digitising can be done by the function grid.locator from package grid, which underlies the functionality in lattice. A single point is selected by:

library(grid)
trellis.focus("panel", column = 1, row = 1)
as.numeric(grid.locator())
trellis.unfocus()

note aboute choosing class intervals: Fisher-Jenks natural breaks is a better option than quantiles.

Some more Visualisation exercises

This exercises are mostly based on the Harvard workshop: Applied Spatial Statistics in R.

rm(list=ls())
url <- url("http://www.people.fas.harvard.edu/~zhukov/Datasets.RData")
load(url) #might takes some time
ls()
##  [1] "cities"   "crime"    "dat88"    "election" "laos"     "map2"    
##  [7] "map3"     "mat88"    "url"      "volcano"

Example: point data - crime database. (using Traditional Plot System)

head(crime)
##   ID      LONG      LAT
## 1  1 -76.65159 39.23941
## 2  2 -76.47434 39.35274
## 3  3 -76.51726 39.25874
## 4  4 -76.52607 39.40707
## 5  5 -76.51001 39.33571
## 6  6 -76.70375 39.26605
dim(crime)
## [1] 1061    3

First, we convert the crime dataframe into a SpatialDataFrame

proj <- CRS("+proj=utm +zone=17 +datum=WGS84")
crime.sp <- SpatialPointsDataFrame(coords = cbind(crime$LONG, crime$LAT),
                                   data = crime,
                                   proj4string = proj)

bounding box of data points:

bbox(crime.sp)
##                 min       max
## coords.x1 -76.82981 -76.38756
## coords.x2  39.22488  39.67995

Plot points:

plot(crime.sp, pch = 16, cex = 0.5, axes = T)

Example: polygon data : 2004 US election + crime. (using Traditional Plot System)

The election data is already in class: SpatialPolygonDataFrame

class(election)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
names(election)
##  [1] "NAME"       "STATE_NAME" "STATE_FIPS" "CNTY_FIPS"  "FIPS"      
##  [6] "AREA"       "FIPS_num"   "Bush"       "Kerry"      "County_F"  
## [11] "Nader"      "Total"      "Bush_pct"   "Kerry_pct"  "Nader_pct" 
## [16] "MDratio"    "hosp"       "pcthisp"    "pcturban"   "urbrural"  
## [21] "pctfemhh"   "pcincome"   "pctpoor"    "pctlt9ed"   "pcthsed"   
## [26] "pctcoled"   "unemploy"   "pctwhtcl"   "homevalu"   "rent"      
## [31] "popdens"    "crowded"    "ginirev"    "SmokecurM"  "SmokevrM"  
## [36] "SmokecurF"  "SmokevrF"   "Obese"      "Noins"      "XYLENES__M"
## [41] "TOLUENE"    "TETRACHLOR" "STYRENE"    "NICKEL_COM" "METHYLENE_"
## [46] "MERCURY_CO" "LEAD_COMPO" "BENZENE__I" "ARSENIC_CO" "POP2000"   
## [51] "POP00SQMIL" "MALE2000"   "FEMALE2000" "MAL2FEM"    "UNDER18"   
## [56] "AIAN"       "ASIA"       "BLACK"      "NHPI"       "WHITE"     
## [61] "AIAN_MORE"  "ASIA_MORE"  "BLK_MORE"   "NHPI_MORE"  "WHT_MORE"  
## [66] "HISP_LAT"   "CH19902000" "MEDAGE2000" "PEROVER65"
dim(election)
## [1] 3111   69
election.sp <- election

Lambert Conformal Conic Projection:

proj4string(election.sp) <- CRS("+proj=lcc+lon_0=90w +lat_1=20n +lat_2=60n")
summary(election.sp)[1:4]
## $class
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
## 
## $bbox
##           min       max
## r1 -124.73142 -66.96985
## r2   24.95597  49.37173
## 
## $is.projected
## [1] TRUE
## 
## $proj4string
## [1] "+proj=lcc+lon_0=90w +lat_1=20n +lat_2=60n"

plot basemap of counties:

par(mar = c(0,0,0,0))
plot(election)

basemap + crime

par(mar=rep(0.5,4))
plot(election,
     xlim=bbox(crime.sp)[1,], # defining the zoom with bbox
     ylim=bbox(crime.sp)[2,],
     col="beige")
plot(crime.sp,pch=1, cex=.5,add=T, col="blue")

Example: ploting some attributes : 2004 US election + crime. (using Traditional Plot System)

Firs let’s check the RColorBrewer package options:

par(mar=c(0,3,0,0),cex=.6)
display.brewer.all(n=5) 

# Create blue-state red-state palette
br.palette <- colorRampPalette(c("blue", "red"), space = "rgb")
br.palette(5)
## [1] "#0000FF" "#3F00BF" "#7F007F" "#BF003F" "#FF0000"

plot the % of vote for Bush

here are two plotting options:

Easy but unflexible option

bush <- election$Bush_pct

spplot(election, 
       zcol = "Bush_pct",
       col.regions = br.palette(100), 
       main = "Percent of County Vote for Bush (2004)")

Harder but more flexible option: Using classInt package.

pal <- br.palette(n=5) # Define number of colors in a palette

# Fixed intervals:
classes_fx <- classIntervals(bush, n=5, style = "fixed",
                             fixedBreaks=c(0,10,25,50,75,100), rtimes = 1)
# Other methods:
classes_sd <- classIntervals(bush, n=5, style = "sd", rtimes = 1)
classes_fi <- classIntervals(bush, n=5, style = "fisher", rtimes = 3)
classes_eq <- classIntervals(bush, n=5, style = "equal", rtimes = 1)
classes_km <- classIntervals(bush, n=5, style = "kmeans", rtimes = 1)
classes_qt <- classIntervals(bush, n=5, style = "quantile", rtimes = 1)

# Compare classes before plotting
par(mar=c(2,2,2,1)+0.1, mfrow=c(2,3))
plot(classes_fx, pal=pal, main="Fixed Intervals", xlab="", ylab="")
plot(classes_sd, pal=pal, main="Standard Deviation", xlab="", ylab="")
plot(classes_fi, pal=pal, main="Fisher-Jenks", xlab="", ylab="")
plot(classes_km, pal=pal, main="K Means", xlab="", ylab="")
plot(classes_eq, pal=pal, main="Equal Interval", xlab="", ylab="")
plot(classes_qt, pal=pal, main="Quantile", xlab="", ylab="")

par(mfrow = c(2,1))
# Plot using fixed intervals
cols <- findColours(classes_fx, pal) # this is a classInt function

par(mar=rep(0,4))
plot(election, col = cols, border=NA)
legend(x = "bottom", cex = 0.7, fill = attr(cols, "palette"),
       bty = "n", legend = names(attr(cols, "table")),
       title = "Percent of County Vote for Bush (2004)", ncol=5)

# Plot binary Bush/Kerry (Red/Blue)
cols <- ifelse(election.sp$Bush > election.sp$Kerry,"red","blue")

plot(election, col = cols, border = NA)
legend(x = "bottom", cex= 0.7, fill = c("red","blue"),
       bty = "n", legend = c("Bush","Kerry"),
       title = "Winner of County Vote (2004)", ncol=2)

Example: Grid data: Maunga Whau Volcano. (using also: contour and 3D plot with persp package)

# the volcano data:
class(volcano)
## [1] "matrix"
dim(volcano)
## [1] 87 61
z <- volcano            ## Height Variable
x <- 10*(1:nrow(z))     ## 10 Meter Spacing (S-N)
y <- 10*(1:ncol(z))     ## 10 Meter Spacing (E-W)

# Gradient + Contour plot:
image(x, y, z, col=terrain.colors(100), axes=F)
contour(x, y, z, levels=seq(from=min(z), to=max(z), by=10),axes=F, add=T)

# 3D Plot:
## 3-D Elevation Plot w/ color
z <- 2 * volcano
x <- 10 * (1:nrow(z))
y <- 10 * (1:ncol(z))

## Create new grid
z0 <- min(z) - 20 
z <- rbind(z0, cbind(z0, z, z0), z0) 
x <- c(min(x) - 1e-10, x, max(x) + 1e-10) 
y <- c(min(y) - 1e-10, y, max(y) + 1e-10) 

## Create matrix of base colors
fcol <- matrix("green3", nr = nrow(z)-1, nc = ncol(z)-1) 
fcol[ , i2 <- c(1,ncol(fcol))] <- "gray" 
fcol[i1 <- c(1,nrow(fcol)) , ] <- "gray" 

## Take average of four neighboring values for palette
zi <- (volcano[ -1,-1] + volcano[ -1,-61] + volcano[-87,-1] + volcano[-87,-61])/4
pal <- terrain.colors(20)[cut(zi, quantile(zi, seq(0,1, len = 21)), include.lowest = TRUE)]
fcol[-i1,-i2] <- pal

## Plot it
par(mar=rep(0,4))
persp(x, y, z, 
      theta=120, phi=15, col = fcol, scale = FALSE, shade = 0.4, border = NA)

# another way - plotly package
# this can be viewed only in html version
# library(plotly)
# plot_ly(z = ~volcano, type = "surface")

Spatial Data Import and Export

TODO

Analysing Spatial Data

Cressie (1993) spatial statistic classification: - Spatial Point Pattern Data - Interpolation and Geostatistical Data - Lattice Data (areal data)

before moving to specific Spatial Statistic area we start by introducing some spatial statistic concepts.

preliminaries: distance conversion function () and load data

km2d <- function(km){ # convert km to degrees
out <- (km/1.852)/60
return(out)
}

d2km <- function(d){ # convert degrees to km
out <- d*60*1.852
return(out)
}

spdf <- election # a SpatialPolygonsDataFrame object
map_crd <- coordinates(spdf)

Spatial Autocorrelation

this spatial topic can be best explored with the spdep package for Spatial autocorrelation analysis.

more on this package from the package CRAN page:

A collection of functions to create spatial weights matrix objects from polygon contiguities, from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial autocorrelation, including global Moran’s I, APLE, Geary’s C, Hubert/Mantel general cross product statistic, Empirical Bayes estimates and Assunção/Reis Index, Getis/Ord G and multicoloured join count statistics, local Moran’s I and Getis/Ord G, saddlepoint approximations and exact tests for global and local Moran’s I; and functions for estimating spatial simultaneous autoregressive (SAR) lag and error models, impact measures for lag models, weighted and unweighted SAR and CAR spatial regression models, semi-parametric and Moran eigenvector spatial filtering, GM SAR error models, and generalized spatial two stage least squares models.

contiguity neighbors:

W_cont_el <- poly2nb(spdf, queen = T)
W_cont_el_mat <- nb2listw(W_cont_el,
                          style = "W",
                          zero.policy = TRUE)

plot the connections:

par(mar=rep(0,4))
plot(W_cont_el_mat, 
     coords = map_crd,
     pch = 19, cex = 0.1, col = "gray")

Global Autocorrelation tests

Moran’s I:

the Moran’s I coefficient calculates the ratio between the product of the variable of interest and its spatial lag, with the product of the variable of interest, adjusted for the spatial weights used. \[ I = \frac {n} {\sum_{i=1}^{n}\sum_{j=1}^{n}w_{ij}} \frac {\sum_{i=1}^{n}\sum_{j=1}^{n}w_{ij}(y_i-\bar{y})(y_j-\bar{y})} {\sum_{i=1}^{n}(y_i-\bar{y})^2}\] note that I range from –1 (perfect dispersion) to +1 (perfect correlation). A zero value indicates a random spatial pattern

moran.test(spdf$Bush_pct,
           listw = W_cont_el_mat,
           zero.policy = T)
## 
##  Moran I test under randomisation
## 
## data:  spdf$Bush_pct  
## weights: W_cont_el_mat  
## 
## Moran I statistic standard deviate = 51.731, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      0.5565174275     -0.0003219575      0.0001158676

Geary’s C: The Geary’s C uses the sum of squared differences between pairs of data values as its measure of covariation.

\[C = \frac {(n-1)\sum _{i}\sum _{j}w_{ij}(y_{i}-y_{j})^{2}} {2W\sum _{i}(y_{i}-{\bar {y}})^{2}}\] Where \(W\) is the sum of all \(w_{ij}\)

geary.test(spdf$Bush_pct, listw=W_cont_el_mat, zero.policy=T)
## 
##  Geary C test under randomisation
## 
## data:  spdf$Bush_pct 
## weights: W_cont_el_mat 
## 
## Geary C statistic standard deviate = 50.359, p-value < 2.2e-16
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic       Expectation          Variance 
##      0.4204068521      1.0000000000      0.0001324606

Join Count: When the variable of interest is categorical, a join count analysis can be used to assess the degree of clustering or dispersion

spdf$BushWin <- as.factor(ifelse(spdf$Bush > spdf$Kerry,1,0))
joincount.multi(spdf$BushWin, listw=W_cont_el_mat, zero.policy=T)
##      Joincount  Expected  Variance z-value
## 0:0   130.7534   54.0599    6.7744  29.466
## 1:1  1111.0765 1030.8162   12.6147  22.598
## 1:0   311.6701  472.6272   29.4787 -29.645
## Jtot  311.6701  472.6272   29.9470 -29.413

Local Autocorrelation

It is possible to break Global measures down into their components, to construct local tests for spatial autocorrelation.

Moran Scatterplot: (note: we use the “W_cont_el_mat” wight matrix we clalculated before)

par(mar=c(4,4,1.5,0.5))
moran.plot(spdf$Bush_pct, listw = W_cont_el_mat, zero.policy=T,
           xlim = c(0,100), ylim = c(0,100), 
           pch = 16, col = "black",cex = 0.5, quiet=F,
           labels = as.character(spdf$NAME), 
           xlab = "Percent for Bush", 
           ylab="Percent for Bush (Spatial Lag)",
           main="Moran Scatterplot")

## Potentially influential observations of
##   lm(formula = wx ~ x) :
## 
##                                  dfb.1_ dfb.x dffit   cov.r   cook.d
## Glacier                           0.09  -0.08  0.09_*  1.00_*  0.00 
## Rolette                           0.10  -0.09  0.10_*  1.00    0.01 
## St. Louis                        -0.01   0.01 -0.01    1.00_*  0.00 
## San Juan                         -0.28   0.26 -0.29_*  0.97_*  0.04 
## Roosevelt                         0.06  -0.06  0.07    1.00_*  0.00 
## Clallam                          -0.05   0.04 -0.06    1.00_*  0.00 
## Jefferson                         0.01  -0.01  0.01    1.00_*  0.00 
## Garfield                          0.05  -0.05 -0.06    1.00    0.00 
## Missoula                          0.06  -0.05  0.07    1.00_*  0.00 
## Custer                           -0.01   0.02  0.04    1.00_*  0.00 
## Douglas                           0.01  -0.01  0.01    1.00_*  0.00 
## Ashland                           0.00   0.00  0.00    1.00_*  0.00 
## Sioux                             0.18  -0.17  0.19_*  0.99_*  0.02 
## Deer Lodge                        0.09  -0.08  0.09_*  1.00_*  0.00 
## Carter                           -0.01   0.02  0.02    1.00_*  0.00 
## Big Horn                          0.05  -0.05  0.06    1.00_*  0.00 
## Harding                          -0.01   0.01  0.01    1.00_*  0.00 
## Multnomah                         0.05  -0.05  0.05    1.00_*  0.00 
## Dewey                             0.10  -0.09  0.10_*  1.00_*  0.01 
## Valley                            0.01   0.00  0.04    1.00_*  0.00 
## Ramsey                            0.01  -0.01  0.01    1.00_*  0.00 
## Yellowstone National Park (Part)  0.51  -0.50  0.51_*  0.99_*  0.13 
## Clinton                          -0.03   0.02 -0.04    1.00_*  0.00 
## Lamoille                         -0.02   0.02 -0.03    1.00_*  0.00 
## Chittenden                       -0.02   0.02 -0.02    1.00_*  0.00 
## Clark                            -0.02   0.03  0.03    1.00_*  0.00 
## Buffalo                           0.16  -0.15  0.16_*  1.00_*  0.01 
## Blaine                            0.13  -0.12  0.14_*  0.99_*  0.01 
## Teton                             0.02  -0.01  0.05    1.00_*  0.00 
## Madison                           0.01  -0.01 -0.01    1.00_*  0.00 
## Mower                             0.01  -0.01  0.01    1.00_*  0.00 
## Ada                               0.00   0.00  0.04    1.00_*  0.00 
## Shannon                           0.36  -0.35  0.36_*  0.99_*  0.06 
## Todd                              0.20  -0.19  0.21_*  0.99_*  0.02 
## Windham                          -0.03   0.03 -0.04    1.00_*  0.00 
## Bannock                           0.00   0.01  0.05    1.00_*  0.00 
## Sioux                             0.01  -0.01 -0.02    1.00_*  0.00 
## Berkshire                         0.01  -0.01  0.01    1.00_*  0.00 
## Franklin                         -0.06   0.05 -0.06    1.00    0.00 
## Cassia                            0.00   0.00  0.00    1.00_*  0.00 
## Hampshire                        -0.06   0.06 -0.06    1.00_*  0.00 
## Suffolk                          -0.01   0.01 -0.01    1.00_*  0.00 
## Franklin                         -0.02   0.02  0.02    1.00_*  0.00 
## Blaine                           -0.03   0.04  0.04    1.00_*  0.00 
## Box Elder                        -0.02   0.02  0.02    1.00_*  0.00 
## Rich                             -0.01   0.01  0.01    1.00_*  0.00 
## Arthur                           -0.04   0.04  0.05    1.00_*  0.00 
## Geauga                           -0.01   0.00 -0.05    1.00_*  0.00 
## Banner                            0.00  -0.01 -0.01    1.00_*  0.00 
## Cuyahoga                          0.03  -0.03  0.03    1.00_*  0.00 
## Dukes                            -0.31   0.30 -0.32_*  0.98_*  0.05 
## Weber                            -0.03   0.04  0.06    1.00_*  0.00 
## Lincoln                          -0.02   0.03  0.05    1.00_*  0.00 
## Nantucket                        -0.26   0.24 -0.27_*  0.97_*  0.04 
## Morgan                            0.02  -0.03 -0.03    1.00_*  0.00 
## Summit                            0.05  -0.04  0.07    0.99_*  0.00 
## Mahoning                          0.00   0.00  0.00    1.00_*  0.00 
## Bergen                           -0.05   0.05 -0.06    1.00_*  0.00 
## Bronx                             0.09  -0.09  0.09_*  1.00_*  0.00 
## Salt Lake                         0.01   0.00  0.04    1.00_*  0.00 
## Nassau                           -0.04   0.03 -0.06    1.00_*  0.00 
## Essex                             0.01  -0.01  0.01    1.00_*  0.00 
## New York                         -0.07   0.06 -0.08_*  1.00_*  0.00 
## Uintah                            0.02  -0.03 -0.03    1.00_*  0.00 
## Duchesne                          0.01  -0.01 -0.01    1.00_*  0.00 
## Hudson                           -0.02   0.02 -0.02    1.00_*  0.00 
## Queens                            0.09  -0.08  0.09_*  1.00    0.00 
## Kings                            -0.05   0.04 -0.09_*  0.99_*  0.00 
## Hayes                            -0.02   0.02  0.02    1.00_*  0.00 
## Richmond                         -0.15   0.12 -0.19_*  0.96_*  0.02 
## Utah                              0.01  -0.01 -0.01    1.00_*  0.00 
## Philadelphia                      0.06  -0.06  0.06    1.00_*  0.00 
## Marion                            0.06  -0.05  0.07    1.00_*  0.00 
## Carbon                            0.02  -0.01  0.06    1.00_*  0.00 
## Denver                            0.05  -0.05  0.06    1.00_*  0.00 
## Pitkin                            0.05  -0.05  0.06    1.00    0.00 
## Montgomery                       -0.02   0.02 -0.02    1.00_*  0.00 
## Monroe                            0.05  -0.05  0.06    1.00_*  0.00 
## Anne Arundel                     -0.03   0.02 -0.06    0.99_*  0.00 
## Prince Georges                    0.05  -0.05  0.05    1.00_*  0.00 
## Washington                        0.01  -0.01  0.01    1.01_*  0.00 
## St. Louis                         0.13  -0.13  0.13_*  1.00    0.01 
## Sonoma                           -0.04   0.04 -0.04    1.00_*  0.00 
## Monroe                           -0.02   0.01 -0.04    1.00_*  0.00 
## Marin                            -0.07   0.07 -0.07    1.00_*  0.00 
## Elliott                           0.06  -0.05  0.06    1.00_*  0.00 
## San Miguel                        0.13  -0.13  0.13_*  1.00    0.01 
## Garfield                         -0.01   0.01  0.01    1.00_*  0.00 
## Dolores                           0.02  -0.03 -0.06    1.00_*  0.00 
## Alameda                           0.05  -0.04  0.05    1.00_*  0.00 
## San Francisco                    -0.05   0.05 -0.05    1.00_*  0.00 
## San Mateo                        -0.12   0.11 -0.12_*  1.00    0.01 
## Costilla                          0.01  -0.01  0.01    1.00_*  0.00 
## Jackson                           0.01  -0.01 -0.01    1.00_*  0.00 
## Knott                             0.00   0.00  0.00    1.00_*  0.00 
## Santa Clara                      -0.02   0.02 -0.02    1.00_*  0.00 
## Santa Cruz                       -0.02   0.02 -0.02    1.00_*  0.00 
## San Juan                          0.01  -0.02 -0.05    1.00_*  0.00 
## Rio Arriba                       -0.01   0.00 -0.01    1.00_*  0.00 
## Apache                            0.07  -0.07  0.08_*  1.00    0.00 
## Taos                              0.00   0.00  0.00    1.00_*  0.00 
## Hertford                         -0.02   0.01 -0.02    1.00_*  0.00 
## Warren                            0.01  -0.01  0.01    1.00_*  0.00 
## Person                           -0.01   0.00 -0.04    1.00_*  0.00 
## Ochiltree                        -0.05   0.05  0.06    1.00_*  0.00 
## Orange                            0.04  -0.04  0.04    1.00_*  0.00 
## Durham                            0.03  -0.03  0.03    1.00_*  0.00 
## Kingfisher                        0.00   0.00  0.00    1.00_*  0.00 
## Roberts                          -0.04   0.05  0.05    1.00_*  0.00 
## Santa Fe                          0.01  -0.01  0.01    1.00_*  0.00 
## McKinley                          0.02  -0.02  0.02    1.00_*  0.00 
## Los Alamos                       -0.04   0.03 -0.06    1.00_*  0.00 
## San Miguel                        0.04  -0.04  0.05    1.00_*  0.00 
## Tipton                            0.01  -0.02 -0.05    1.00_*  0.00 
## Potter                           -0.03   0.04  0.06    1.00_*  0.00 
## Oldham                           -0.01   0.01  0.01    1.00_*  0.00 
## St. Francis                      -0.02   0.02 -0.04    1.00_*  0.00 
## Torrance                          0.00  -0.01 -0.04    1.00_*  0.00 
## De Soto                           0.04  -0.06 -0.09_*  0.99_*  0.00 
## Tunica                            0.02  -0.02  0.03    1.00_*  0.00 
## Parmer                            0.01  -0.01 -0.01    1.00_*  0.00 
## Catron                            0.03  -0.04 -0.06    1.00_*  0.00 
## Grant                             0.00  -0.01 -0.04    1.00_*  0.00 
## Florence                         -0.02   0.01 -0.04    1.00_*  0.00 
## Foard                             0.01   0.00  0.04    1.00_*  0.00 
## Fulton                            0.08  -0.08  0.09_*  1.00_*  0.00 
## Clarke                            0.11  -0.10  0.12_*  0.99_*  0.01 
## Jefferson                         0.03  -0.02  0.05    1.00_*  0.00 
## Taliaferro                        0.02  -0.02  0.02    1.00_*  0.00 
## Carroll                           0.01  -0.02 -0.05    1.00_*  0.00 
## Clayton                           0.13  -0.12  0.13_*  1.00    0.01 
## Chicot                            0.01  -0.01  0.01    1.00_*  0.00 
## Fayette                           0.03  -0.03 -0.05    1.00_*  0.00 
## Hancock                           0.13  -0.13  0.13_*  1.00    0.01 
## Bamberg                          -0.01   0.01 -0.01    1.00_*  0.00 
## Holmes                            0.11  -0.10  0.11_*  1.00    0.01 
## Dorchester                        0.00  -0.01 -0.04    1.00_*  0.00 
## Glascock                          0.10  -0.12 -0.15_*  0.99_*  0.01 
## Noxubee                           0.06  -0.05  0.06    1.00_*  0.00 
## Greene                            0.09  -0.09  0.09_*  1.00_*  0.00 
## Allendale                         0.04  -0.04  0.04    1.00_*  0.00 
## West Carroll                      0.05  -0.06 -0.08    1.00_*  0.00 
## Dallas                            0.04  -0.03  0.06    1.00_*  0.00 
## Sumter                            0.02  -0.02  0.02    1.00_*  0.00 
## Shackelford                       0.00   0.00  0.00    1.00_*  0.00 
## Perry                             0.08  -0.07  0.08_*  1.00    0.00 
## Warren                           -0.02   0.01 -0.05    1.00_*  0.00 
## Macon                             0.20  -0.19  0.20_*  1.00    0.02 
## Effingham                         0.05  -0.06 -0.07    1.00_*  0.00 
## Marengo                          -0.04   0.03 -0.05    1.00_*  0.00 
## Schley                            0.02  -0.03 -0.05    1.00_*  0.00 
## Lowndes                           0.08  -0.07  0.08_*  1.00    0.00 
## Bullock                           0.02  -0.02  0.02    1.00_*  0.00 
## Wilcox                            0.02  -0.02  0.02    1.00_*  0.00 
## Beaufort                         -0.01   0.00 -0.04    1.00_*  0.00 
## Jasper                            0.05  -0.04  0.06    1.00_*  0.00 
## Claiborne                         0.13  -0.13  0.13_*  1.00    0.01 
## Sterling                         -0.02   0.03  0.03    1.00_*  0.00 
## Glasscock                        -0.02   0.03  0.03    1.00_*  0.00 
## Loving                            0.01  -0.02 -0.02    1.00_*  0.00 
## Lee                               0.05  -0.06 -0.08_*  1.00_*  0.00 
## Jefferson Davis                   0.29  -0.28  0.29_*  0.99_*  0.04 
## Crane                            -0.01   0.01  0.01    1.00_*  0.00 
## Miller                            0.02  -0.03 -0.06    1.00_*  0.00 
## Decatur                           0.10  -0.09  0.10_*  1.00    0.01 
## Grady                             0.00  -0.01 -0.05    1.00_*  0.00 
## Gadsden                           0.06  -0.06  0.07    1.00    0.00 
## Livingston                        0.05  -0.06 -0.07    1.00_*  0.00 
## Duval                             0.02  -0.01  0.04    1.00_*  0.00 
## Jefferson                         0.05  -0.04  0.06    1.00_*  0.00 
## Orleans                           0.22  -0.21  0.22_*  0.99_*  0.02 
## St. Bernard                       0.01  -0.02 -0.06    1.00_*  0.00 
## Alachua                           0.07  -0.06  0.07    1.00_*  0.00 
## LaFourche                         0.07  -0.08 -0.09_*  1.00_*  0.00 
## Bexar                             0.03  -0.02  0.05    1.00_*  0.00 
## Zavala                            0.10  -0.10  0.10_*  1.00    0.01 
## McMullen                          0.09  -0.11 -0.12_*  0.99_*  0.01 
## Dimmit                           -0.03   0.03 -0.03    1.00_*  0.00 
## Duval                             0.06  -0.06  0.06    1.00    0.00 
## Brooks                           -0.02   0.02 -0.02    1.00_*  0.00 
## Starr                            -0.02   0.02 -0.02    1.00_*  0.00 
## Willacy                           0.00   0.00  0.00    1.00_*  0.00 
## Menominee                         0.20  -0.19  0.20_*  1.00    0.02 
## Orleans                           0.00  -0.01 -0.06    0.99_*  0.00 
## Monroe                            0.22  -0.22  0.23_*  1.00_*  0.03 
## Wayne                            -0.01   0.00 -0.04    1.00_*  0.00 
## Dane                              0.03  -0.03  0.03    1.00_*  0.00 
## Milwaukee                         0.09  -0.08  0.09_*  1.00    0.00 
## Livingston                        0.00  -0.01 -0.04    1.00_*  0.00 
## Wayne                             0.02  -0.02  0.02    1.00_*  0.00 
## Washtenaw                         0.02  -0.01  0.02    1.00_*  0.00 
## Broome                            0.12  -0.11  0.12_*  1.00    0.01 
## Cook                              0.08  -0.07  0.08_*  1.00    0.00 
## Monroe                           -0.04   0.03 -0.05    1.00_*  0.00 
## Lagrange                          0.09  -0.08  0.09_*  1.00_*  0.00 
## Baltimore City                    0.15  -0.15  0.15_*  1.00    0.01 
## Fairfax                          -0.05   0.05 -0.06    1.00_*  0.00 
## Arlington                        -0.08   0.08 -0.08_*  1.00    0.00 
## Alexandria                       -0.12   0.11 -0.12_*  1.00_*  0.01 
## Harrisonburg                      0.02  -0.01  0.05    1.00_*  0.00 
## Staunton                          0.01   0.00  0.04    1.00_*  0.00 
## Charlottesville                   0.05  -0.05  0.05    1.00_*  0.00 
## Clifton Forge                     0.38  -0.37  0.38_*  1.00_*  0.07 
## Richmond City                     0.11  -0.10  0.11_*  1.00    0.01 
## Colonial Heights                  0.05  -0.06 -0.08_*  1.00_*  0.00 
## Petersburg                        0.26  -0.25  0.26_*  0.99_*  0.03 
## Poquoson City                     0.05  -0.06 -0.07    1.00_*  0.00 
## South Boston                      0.40  -0.40  0.40_*  0.99_*  0.08 
##                                  hat    
## Glacier                           0.00  
## Rolette                           0.00  
## St. Louis                         0.00  
## San Juan                          0.00  
## Roosevelt                         0.00  
## Clallam                           0.00  
## Jefferson                         0.00  
## Garfield                          0.00_*
## Missoula                          0.00  
## Custer                            0.00  
## Douglas                           0.00  
## Ashland                           0.00  
## Sioux                             0.00_*
## Deer Lodge                        0.00  
## Carter                            0.00  
## Big Horn                          0.00  
## Harding                           0.00  
## Multnomah                         0.00_*
## Dewey                             0.00  
## Valley                            0.00  
## Ramsey                            0.00  
## Yellowstone National Park (Part)  0.01_*
## Clinton                           0.00  
## Lamoille                          0.00  
## Chittenden                        0.00  
## Clark                             0.00  
## Buffalo                           0.00_*
## Blaine                            0.00  
## Teton                             0.00  
## Madison                           0.00_*
## Mower                             0.00  
## Ada                               0.00  
## Shannon                           0.00_*
## Todd                              0.00_*
## Windham                           0.00_*
## Bannock                           0.00  
## Sioux                             0.00  
## Berkshire                         0.00_*
## Franklin                          0.00_*
## Cassia                            0.00  
## Hampshire                         0.00_*
## Suffolk                           0.00_*
## Franklin                          0.00_*
## Blaine                            0.00_*
## Box Elder                         0.00  
## Rich                              0.00  
## Arthur                            0.00_*
## Geauga                            0.00  
## Banner                            0.00  
## Cuyahoga                          0.00  
## Dukes                             0.00_*
## Weber                             0.00  
## Lincoln                           0.00  
## Nantucket                         0.00  
## Morgan                            0.00  
## Summit                            0.00  
## Mahoning                          0.00  
## Bergen                            0.00  
## Bronx                             0.00_*
## Salt Lake                         0.00  
## Nassau                            0.00  
## Essex                             0.00_*
## New York                          0.00  
## Uintah                            0.00  
## Duchesne                          0.00  
## Hudson                            0.00  
## Queens                            0.00_*
## Kings                             0.00  
## Hayes                             0.00  
## Richmond                          0.00  
## Utah                              0.00  
## Philadelphia                      0.00_*
## Marion                            0.00  
## Carbon                            0.00  
## Denver                            0.00_*
## Pitkin                            0.00_*
## Montgomery                        0.00  
## Monroe                            0.00  
## Anne Arundel                      0.00  
## Prince Georges                    0.00_*
## Washington                        0.01_*
## St. Louis                         0.00_*
## Sonoma                            0.00_*
## Monroe                            0.00  
## Marin                             0.00_*
## Elliott                           0.00_*
## San Miguel                        0.00_*
## Garfield                          0.00  
## Dolores                           0.00  
## Alameda                           0.00_*
## San Francisco                     0.00_*
## San Mateo                         0.00_*
## Costilla                          0.00_*
## Jackson                           0.00  
## Knott                             0.00  
## Santa Clara                       0.00  
## Santa Cruz                        0.00_*
## San Juan                          0.00  
## Rio Arriba                        0.00  
## Apache                            0.00  
## Taos                              0.00_*
## Hertford                          0.00  
## Warren                            0.00  
## Person                            0.00  
## Ochiltree                         0.00_*
## Orange                            0.00  
## Durham                            0.00_*
## Kingfisher                        0.00  
## Roberts                           0.00_*
## Santa Fe                          0.00_*
## McKinley                          0.00  
## Los Alamos                        0.00  
## San Miguel                        0.00_*
## Tipton                            0.00  
## Potter                            0.00  
## Oldham                            0.00  
## St. Francis                       0.00  
## Torrance                          0.00  
## De Soto                           0.00  
## Tunica                            0.00_*
## Parmer                            0.00  
## Catron                            0.00  
## Grant                             0.00  
## Florence                          0.00  
## Foard                             0.00  
## Fulton                            0.00  
## Clarke                            0.00  
## Jefferson                         0.00  
## Taliaferro                        0.00  
## Carroll                           0.00  
## Clayton                           0.00_*
## Chicot                            0.00  
## Fayette                           0.00  
## Hancock                           0.00_*
## Bamberg                           0.00  
## Holmes                            0.00_*
## Dorchester                        0.00  
## Glascock                          0.00  
## Noxubee                           0.00_*
## Greene                            0.00_*
## Allendale                         0.00_*
## West Carroll                      0.00  
## Dallas                            0.00  
## Sumter                            0.00_*
## Shackelford                       0.00  
## Perry                             0.00_*
## Warren                            0.00  
## Macon                             0.00_*
## Effingham                         0.00  
## Marengo                           0.00  
## Schley                            0.00  
## Lowndes                           0.00_*
## Bullock                           0.00_*
## Wilcox                            0.00  
## Beaufort                          0.00  
## Jasper                            0.00  
## Claiborne                         0.00_*
## Sterling                          0.00  
## Glasscock                         0.00_*
## Loving                            0.00  
## Lee                               0.00  
## Jefferson Davis                   0.00_*
## Crane                             0.00  
## Miller                            0.00  
## Decatur                           0.00_*
## Grady                             0.00  
## Gadsden                           0.00_*
## Livingston                        0.00  
## Duval                             0.00  
## Jefferson                         0.00  
## Orleans                           0.00_*
## St. Bernard                       0.00  
## Alachua                           0.00  
## LaFourche                         0.00  
## Bexar                             0.00  
## Zavala                            0.00_*
## McMullen                          0.00  
## Dimmit                            0.00  
## Duval                             0.00_*
## Brooks                            0.00_*
## Starr                             0.00_*
## Willacy                           0.00  
## Menominee                         0.00_*
## Orleans                           0.00  
## Monroe                            0.00_*
## Wayne                             0.00  
## Dane                              0.00  
## Milwaukee                         0.00  
## Livingston                        0.00  
## Wayne                             0.00_*
## Washtenaw                         0.00  
## Broome                            0.00_*
## Cook                              0.00_*
## Monroe                            0.00  
## Lagrange                          0.00  
## Baltimore City                    0.00_*
## Fairfax                           0.00  
## Arlington                         0.00_*
## Alexandria                        0.00  
## Harrisonburg                      0.00  
## Staunton                          0.00  
## Charlottesville                   0.00_*
## Clifton Forge                     0.01_*
## Richmond City                     0.00_*
## Colonial Heights                  0.00  
## Petersburg                        0.00_*
## Poquoson City                     0.00  
## South Boston                      0.01_*

The plot is partitioned into four quadrants: low-low, low-high, high-low and high-high

Local Moran’s I Plot (normality assumption)

lm1 <- localmoran(spdf$Bush_pct, listw = W_cont_el_mat, zero.policy = T)

spdf$lm1 <- lm1[,4] ## Extract z-scores

lm.palette <- colorRampPalette(c("red","white","blue","blue4"), space = "rgb")
spplot(spdf, zcol="lm1", 
       col.regions = lm.palette(20),
       main = "Local Moran's I", pretty = T)

Spatial Weights

Point Pattern Analysis

Geostatistics

we will cover: - Interpolation with Inverse-distance weighted (IDW) - The variogram - Kringing

and more topics in the future

start with exploring the Laos Dataset

# subset of Laos bombing dataset:
df <- laos[sample(500, replace = F),]
str(df) #data.frame object
## 'data.frame':    500 obs. of  24 variables:
##  $ ID         : int  261 49 268 147 99 340 297 167 38 359 ...
##  $ LAT        : num  18.7 18.2 18.2 18.2 18.2 ...
##  $ LONG       : num  104 105 105 105 105 ...
##  $ DATE       : Factor w/ 900 levels "1/1/1966","1/1/1968",..: 82 617 474 749 480 46 503 835 364 525 ...
##  $ YEAR       : int  1970 1967 1970 1969 1969 1972 1970 1969 1967 1972 ...
##  $ MONTH      : int  1 4 2 5 2 1 3 8 2 3 ...
##  $ DAY        : int  24 13 5 4 6 17 10 16 10 16 ...
##  $ YRMO       : int  197001 196704 197002 196905 196902 197201 197003 196908 196702 197203 ...
##  $ LAT_IND60  : num  18.7 18.2 18.2 18.2 18.2 ...
##  $ LON_IND60  : num  104 105 105 105 105 ...
##  $ NUM_ACRFT  : int  2 2 4 2 2 2 2 4 2 2 ...
##  $ AIRCRAFT   : Factor w/ 44 levels "40002","A-1",..: 5 3 30 5 3 42 35 12 10 42 ...
##  $ LOAD_QTY   : int  4 4 2 2 2 12 6 20 6 12 ...
##  $ LOAD_LBS   : int  531 301 799 531 531 63720 531 531 260 63720 ...
##  $ ORDNANCE   : Factor w/ 33 levels "1000 LB  Bomb MK-83",..: 11 24 15 25 25 30 11 11 7 30 ...
##  $ ORD_CLASS  : Factor w/ 11 levels "1000LB MK-65",..: 9 6 10 9 9 9 9 9 6 9 ...
##  $ CATEGORY   : Factor w/ 2 levels "General_Purpose",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ TARGET     : Factor w/ 108 levels " ","100mm Gun Site",..: 19 19 19 19 19 21 19 19 19 21 ...
##  $ BDA        : Factor w/ 31 levels " ","Cratered",..: 30 30 14 30 24 28 24 30 22 28 ...
##  $ Total.Bombs: int  4 27 2 2 20 12 6 20 6 24 ...
##  $ PUNISH     : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ RISK       : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ DENIAL     : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ DECAP      : int  0 0 0 0 0 0 0 0 0 0 ...

open simple world map and create coordinates matrix:

data("wrld_simpl")
sp_point <- cbind(LONG = jitter(df$LONG, 0.001),
                  LAT = jitter(df$LAT, 0.001))

create SpatialPointsDataFrame object and plot it:

df.sp <- SpatialPointsDataFrame(coords = sp_point, data = df,
            proj4string = CRS("+proj=utm +zone=48 +datum=WGS84"))

par(mar = rep(0,4))
plot(df.sp, pch = 1, cex = log(df.sp$LOAD_LBS)/5)

zoom in:

par(mar=c(2,2,0.5,0.5))
plot(wrld_simpl,
     xlim = bbox(df.sp)[1,] + c(-1,1),
     ylim = bbox(df.sp)[2,] + c(-2,2),
     col = "lightgrey", axes = T) ## World Map
points(df.sp, pch = 16,cex = .5, col = "red")

bubble plot: PAYLOAD variable:

bubble(df.sp,"LOAD_LBS", col = "red")

Variogram cloud

A variogram cloud is a scatterplot of data pairs, in which the semivariance is plotted against interpoint distance

plot(variogram(log(LOAD_LBS)~1,
               locations = coordinates(sp_point),
               data = df.sp, cloud = T)
     ,pch=16, cex=0.5)

Upper left corner: point pairs are close together, but have very different values. Lower left corner: close together, similar values. Upper right corner: far apart, different values. Lower right corner: far apart, similar values. A variogram can also be used to identify outliers…

Sample Variogram:

plot(variogram(log(LOAD_LBS)~1,
               locations = coordinates(sp_point),
               data = df.sp, cloud = F),
     type = "b", pch=16)

Identify outlying pairs (in html verssion):

sel <- plot(variogram(LOAD_LBS ~ 1,
                      locations = coordinates(sp_point),
                      data = df.sp,
                      alpha = c(0,45,90,135), cloud = T),
            pch = 16, cex = 1, digitize = T,col = "blue")
plot(sel,df.sp)

Find the outlying pairs in the dataset

sel

out.pair <- function(x,data,sel){
    a <- as.data.frame(data[sel$head,as.character(x)])
    b <- as.data.frame(data[sel$tail,as.character(x)])
    ID.a <- round(as.numeric(rownames(as.data.frame(data[sel$head,]))),0)
    ID.b <- round(as.numeric(rownames(as.data.frame(data[sel$tail,]))),0)
    out <- cbind(ID.a,a[,as.character(x)],ID.b,b[,as.character(x)])
    colnames(out) <- c("ID.a",paste(x,".a",sep=""),"ID.b",paste(x,".b",sep=""))
    out
    }

out.pair(x="LOAD_LBS",data=df.sp,sel=sel)