Spatial Statistics With R
Tutorial
This document provides a an introduction to applied spatial statistics. It is based mostly on the second edition of the book: Applied Spatial Data Analysis with R and on the Harvard Workshop: Applied Spatial Statistics in R
We use the following R Packages:
#install.packages("maps")
#install.packages("maptools")
#install.packages("sp")
#install.packages("spdep")
#install.packages("gstat")
#install.packages("splancs")
#install.packages("spatstat")
#install.packages("lattice")
#install.packages("pgirmess")
#install.packages("RColorBrewer")
#install.packages("classInt")
#install.packages("spgwr")
#install.packages("leaflet")
library(maps) ## Projections
library(maptools) ## Data management
library(sp) ## Data management
library(spdep) ## Spatial autocorrelation
library(gstat) ## Geostatistics
library(splancs) ## Kernel Density
library(spatstat) ## Geostatistics
#library(pgirmess) ## Spatial autocorrelation <<<Error: required 'rgdal'>>>
library(RColorBrewer) ## Visualization
library(classInt) ## Class intervals
library(spgwr) ## GWR
# load some data:
#url <- url("http://www.people.fas.harvard.edu/~zhukov/Datasets.RData")
#load(url)
#save(laos,crime,cities,volcano,election,dat88,mat88,file="Datasets.RData")
for comprehensive description of spatial packeges see CRAN’s Spatial Task View
Spatial objects:
Spatial* has two slots. The first is a bounding box, a matrix of numerical coordinates with column names c(‘min’, ‘max’), and at least two rows, with the first row eastings (x-axis) and the second northings (y-axis).
CRS - coordinate reference system class, gets character string as its only slot value, which may be a missing value. If it is not missing, it should be a PROJ.4-format string describing the projection (more details are given in Sect. 4.1.2). For geographical coordinates, the simplest such string is “+proj=longlat”.
m <- matrix(c(0,0,1,1),
ncol = 2,
dimnames = list(NULL, c("min", "max")))
crs <- CRS(projargs = as.character(NA))
crs
## CRS arguments: NA
S <- Spatial(bbox = m, proj4string = crs)
S
## An object of class "Spatial"
## Slot "bbox":
## min max
## [1,] 0 1
## [2,] 0 1
##
## Slot "proj4string":
## CRS arguments: NA
using standard read.table:
url1 <- "http://www.asdar-book.org/datasets/CRAN051001a.txt"
CRAN_df <- read.table(url1, header = T)
CRAN_mat <- cbind(CRAN_df$long, CRAN_df$lat)
row.names(CRAN_mat) <- 1:nrow(CRAN_mat)
summary(CRAN_mat)
## V1 V2
## Min. :-122.9500 Min. :-37.82
## 1st Qu.: -47.3833 1st Qu.: 34.52
## Median : 7.8500 Median : 42.73
## Mean : -0.6617 Mean : 31.73
## 3rd Qu.: 16.8333 3rd Qu.: 47.58
## Max. : 153.0333 Max. : 57.05
SpatialPoints object:llCRS <- CRS("+proj=longlat +ellps=WGS84")
CRAN_sp <- SpatialPoints(CRAN_mat, proj4string = llCRS)
summary(CRAN_sp)
## Object of class SpatialPoints
## Coordinates:
## min max
## coords.x1 -122.95000 153.0333
## coords.x2 -37.81667 57.0500
## Is projected: FALSE
## proj4string : [+proj=longlat +ellps=WGS84]
## Number of points: 54
the summary gives min+max values for coordinates.
Methods
The bbox method returns the bounding box of the object, and is used both for preparing plotting methods (see Chap.3) and internally in handling data objects
bbox(CRAN_sp)
## min max
## coords.x1 -122.95000 153.0333
## coords.x2 -37.81667 57.0500
SpatialPointsDataFrame
The SpatialPointsDataFrame class is the container for this kind of spatial point information, and can be constructed in a number of ways, for example from a data frame and a matrix of coordinates
The Spatial*DataFrame classes have been designed to behave as far as possible like data frames
the match.ID argument is set to its default value of TRUE, then the matrix row names are checked against the row names of the data frame
CRAN_spdf1 <- SpatialPointsDataFrame(CRAN_mat, CRAN_df,
proj4string = llCRS,
match.ID = TRUE)
head(CRAN_spdf1)
## coordinates place north east loc
## 1 (153.0333, -27.46667) Brisbane 27d28'S 153d02'E Australia
## 2 (144.9667, -37.81667) Melbourne 37d49'S 144d58'E Australia
## 3 (16.33333, 48.21667) Wien 48d13'N 16d20'E Austria
## 4 (-49.26667, -25.41667) Curitiba 25d25'S 49d16'W Brazil
## 5 (-42.86667, -20.75) Vi\xe7oza 20d45'S 42d52'W Brazil
## 6 (-43.2, -22.9) Rio de Janeiro 22d54'S 43d12'W Brazil
## long lat
## 1 153.03333 -27.46667
## 2 144.96667 -37.81667
## 3 16.33333 48.21667
## 4 -49.26667 -25.41667
## 5 -42.86667 -20.75000
## 6 -43.20000 -22.90000
Example: creating a SpatialPointsDataFrame
we converte the data.frame to SpatialPointsDataFrame when calling the coordinates function
url2 <- "http://www.asdar-book.org/datasets/seamap105_mod.csv"
turtle_df <- read.csv(url2)
timestamp <- as.POSIXlt(strptime(as.character(turtle_df$obs_date),
"%m/%d/%Y %H:%M:%S"), "GMT")
turtle_df1 <- data.frame(turtle_df, timestamp = timestamp)
turtle_df1$lon <- ifelse(turtle_df1$lon < 0, turtle_df1$lon + 360, turtle_df1$lon)
turtle_sp <- turtle_df1[order(turtle_df1$timestamp), ]
# turtle_sp is still a data.frame
coordinates(turtle_sp) <- c("lon", "lat")
# now turlte_sp is a SpatialPointsDataFrame
proj4string(turtle_sp) <- CRS("+proj=longlat +ellps=WGS84")
SpatialLinesExample: SpatialLines and maptools
library(maps)
japan <- map("world", "japan", plot = F)
p4s <- CRS("+proj=longlat +ellps=WGS84")
library(maptools)
SLjapan <- map2SpatialLines(japan, proj4string = p4s)
str(SLjapan, max.level = 2)
## Formal class 'SpatialLines' [package "sp"] with 3 slots
## ..@ lines :List of 34
## ..@ bbox : num [1:2, 1:2] 123.7 24.3 145.8 45.5
## .. ..- attr(*, "dimnames")=List of 2
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
plot(SLjapan)
A very typical way of exploring the contents of these objects is to use lapply or sapply in combination with slot
Lines_len <- sapply(slot(SLjapan, "lines"),
function(x) length(slot(x,"Lines")))
table(Lines_len)
## Lines_len
## 1
## 34
no Lines object contains more than one Line object:
converting data to SpatialLinesDataFrame using CcontourLines2SLDF function:
consider the volcano data:
image(volcano)
volcano_sl <- ContourLines2SLDF(contourLines(volcano))
t(slot(volcano_sl, "data"))
## C_1 C_2 C_3 C_4 C_5 C_6 C_7 C_8 C_9 C_10
## level "100" "110" "120" "130" "140" "150" "160" "170" "180" "190"
SpatialPolygonsStart with an example:
url3 <- "http://www.asdar-book.org/datasets/auckland_mapgen.dat"
llCRS <- CRS("+proj=longlat +ellps=WGS84")
auck_shore <- MapGen2SL(url3, llCRS)
summary(auck_shore)
## Object of class SpatialLines
## Coordinates:
## min max
## x 174.2 175.3
## y -37.5 -36.5
## Is projected: FALSE
## proj4string : [+proj=longlat +ellps=WGS84]
lns <- slot(auck_shore, "lines")
table(sapply(lns, function(x) length(slot(x, "Lines"))))
##
## 1
## 80
islands_auck <- sapply(lns, function(x) {
crds <- slot(slot(x, "Lines")[[1]], "coords")
identical(crds[1, ], crds[nrow(crds), ])
})
table(islands_auck)
## islands_auck
## FALSE TRUE
## 16 64
Since all the Lines in the auck_shore object contain only single Line objects, checking the equality of the first and last coordinates of the first Line object in each Lines object tells us which sets of coordinates can validly be made into polygons.
The Polygon class extends the Line class by adding slots needed for polygons and checking that the first and last coordinates are identical. The extra slots are: - a label point, taken as the centroid of the polygon - the area of the polygon in the metric of the coordinates - whether the polygon is declared as a hole or not – the default value is a logical NA - the ring direction of the polygon
Example: build a SpatialPolygons object
islands_sl <- auck_shore[islands_auck]
list_of_Lines <- slot(islands_sl, "lines")
islands_sp <- SpatialPolygons(lapply(list_of_Lines, function(x) {
Polygons(list(Polygon(slot(slot(x, "Lines")[[1]],
"coords"))), ID = slot(x, "ID"))
}), proj4string = CRS("+proj=longlat +ellps=WGS84"))
summary(islands_sp)
## Object of class SpatialPolygons
## Coordinates:
## min max
## x 174.30297 175.22791
## y -37.43877 -36.50033
## Is projected: FALSE
## proj4string : [+proj=longlat +ellps=WGS84]
slot(islands_sp, "plotOrder")
## [1] 45 54 37 28 38 27 12 11 59 53 5 25 26 46 7 55 17 34 30 16 6 43 14
## [24] 40 32 19 61 42 15 50 21 18 62 23 22 29 24 44 13 2 36 9 63 58 56 64
## [47] 52 39 51 1 8 3 4 20 47 35 41 48 60 31 49 57 10 33
order(sapply(slot(islands_sp, "polygons"),
function(x) slot(x, "area")), decreasing = TRUE)
## [1] 45 54 37 28 38 27 12 11 59 53 5 25 26 46 7 55 17 34 30 16 6 43 14
## [24] 40 32 19 61 42 15 50 21 18 62 23 22 29 24 44 13 2 36 9 63 58 56 64
## [47] 52 39 51 1 8 3 4 20 47 35 41 48 60 31 49 57 10 33
As we saw with the construction of SpatialLines objects from raw coordinates, here we build a list of Polygon objects for each Polygons object, corresponding to a single identifying tag. A list of these Polygons objects is then passed to the SpatialPolygons function, with a coordinate reference system, to create the SpatialPolygons object. Again, like SpatialLines objects, SpatialPolygons objects are most often created by functions that import or manipulate such data objects, and seldom from scratch.
SpatialPolygonsDataFrame
Example: merging US state data with state boundary polygons
library(maps)
state.map <- map("state", plot = F, fill = T)
IDs <- sapply(strsplit(state.map$names, ":"), function(x) x[1])
library(maptools)
state.sp <- map2SpatialPolygons(state.map, IDs = IDs,
proj4string = CRS("+proj=longlat +ellps=WGS84"))
plot(state.sp)
Then we can use identifying tag matching to suit the rows of the data frame to the SpatialPolygons. Here, we subset to the matched rows of the data frame, to ensure that one row corresponds to each Polygons object, to achieve one-to-one matching:
url4 <- "http://www.asdar-book.org/datasets/state.sat.data_mod.txt"
sat <- read.table(url4, row.names = 5, header = T)
id <- match(row.names(sat), row.names(state.sp))
row.names(sat)[is.na(id)]
## [1] "alaska" "hawaii" "usa"
sat1 <- sat[!is.na(id), ]
state.spdf <- SpatialPolygonsDataFrame(state.sp, sat1)
slot(state.spdf, "data")
## oname vscore mscore pc
## alabama ala 561 555 9
## arizona ariz 524 525 34
## arkansas ark 563 556 6
## california calif 497 514 49
## colorado colo 536 540 32
## connecticut conn 510 509 80
## delaware dela 503 497 67
## district of columbia d.c. 494 478 77
## florida fla 499 498 53
## georgia ga 487 482 63
## idaho idaho 542 540 16
## illinois ill 569 585 12
## indiana ind 496 498 60
## iowa iowa 594 598 5
## kansas kan 578 576 9
## kentucky ky 547 547 12
## louisiana la 561 558 8
## maine maine 507 503 68
## maryland md 507 507 65
## massachusetts mass 511 511 78
## michigan mich 557 565 11
## minnesota minn 586 598 9
## mississippi miss 563 548 4
## missouri mo 572 572 8
## montana mont 545 546 21
## nebraska neb 568 571 8
## nevada nev 512 517 34
## new hampshire nh 520 518 72
## new jersey nj 498 510 80
## new mexico nm 549 542 12
## new york ny 495 502 76
## north carolina nc 493 493 61
## north dakota nd 594 605 5
## ohio ohio 534 538 25
## oklahoma okla 567 560 8
## oregon ore 525 525 53
## pennsylvania pa 498 495 70
## rhode island ri 504 499 70
## south carolina sc 479 475 61
## south dakota sd 585 588 4
## tennessee tenn 559 553 13
## texas texas 494 499 50
## utah utah 570 568 5
## vermont vt 514 506 70
## virginia va 508 499 65
## washington wash 525 526 52
## west virginia wva 527 512 18
## wisconsin wis 584 595 7
## wyoming wyo 546 551 10
SpatialGrid and SpatialPixel ObjectsExample: making a GridTopology object
#TODO
A major pleasure in working with spatial data is their visualisation.
This chapter introduces the plotting methods for objects inheriting from class Spatial that are provided by package sp.
Example: creating spatial objects from data.frame objects from the sp package
library(sp)
data("meuse")
class(meuse)
## [1] "data.frame"
head(meuse)
## x y cadmium copper lead zinc elev dist om ffreq soil
## 1 181072 333611 11.7 85 299 1022 7.909 0.00135803 13.6 1 1
## 2 181025 333558 8.6 81 277 1141 6.983 0.01222430 14.0 1 1
## 3 181165 333537 6.5 68 199 640 7.800 0.10302900 13.0 1 1
## 4 181298 333484 2.6 81 116 257 7.655 0.19009400 8.0 1 2
## 5 181307 333330 2.8 48 117 269 7.480 0.27709000 8.7 1 2
## 6 181390 333260 3.0 61 137 281 7.791 0.36406700 7.8 1 2
## lime landuse dist.m
## 1 1 Ah 50
## 2 1 Ah 30
## 3 1 Ah 150
## 4 0 Ga 270
## 5 0 Ah 380
## 6 0 Ga 470
# converting to Spatial object:
coordinates(meuse) <- c("x","y")
class(meuse)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
plot(meuse, main = "points")
The SpatialPointsDataFrame object used is created from a data.frame provided with sp, and the plot method shows the points with the default symbol.
cc <- coordinates(meuse)
m.sl <- SpatialLines(list(Lines(list(Line(cc)), "line1")))
plot(m.sl, main = "lines")
A SpatialLines object is made by joining up the points in sequence, and plot draws the resulting zig-zags.
data("meuse.riv")
meuse.lst <- list(Polygons(list(Polygon(meuse.riv)), "meuse.riv"))
meuse.pol <- SpatialPolygons(meuse.lst)
plot(meuse.pol, col = "grey", main ="grid")
Grid:
data("meuse.grid")
coordinates(meuse.grid) <- c("x","y")
meuse.grid <- as(meuse.grid, "SpatialPixels")
image(meuse.grid, col = "grey", main = "grid")
A map becomes more readable when we combine several elements. We can display elements from those created above by using the add = TRUE argument in function calls:
image(meuse.grid, col = "lightgrey")
plot(meuse.pol, col = "grey", add = T)
plot(meuse, pch = 11 ,col = "blue", add = T)
Example:some notes about plotting and axes:
oldpar = par(no.readonly = TRUE) # set oldpar
layout(matrix(c(1, 2), 1, 2))
plot(meuse, axes = TRUE, cex = 0.6)
plot(meuse.pol, add = TRUE)
title("Sample locations")
par(mar = c(0, 0, 0, 0) + 0.1)
plot(meuse, axes = FALSE, cex = 0.6)
plot(meuse.pol, add = TRUE)
box()
par(oldpar)
This table summerises the SP DataFrame classes
| Class(es) | Argument | Meaning | Further help |
|---|---|---|---|
| SpatialLinesDataFrame | col |
Colour | ?lines |
lwd |
Line width | ?lines |
|
lty |
Line type | ?lines |
|
| SpatialPolygonsDataFrame | border |
Border colour | ?polygon |
density |
Hashing density | ?polygon |
|
angle |
Hashing angle | ?polygon |
|
lty |
Line type | ?polygon |
|
pbg |
Hole colour | ||
| SpatialPointsDataFrame | pch |
Symbol | ?points |
col |
Colour | ?points |
|
bg |
Fill colour | ?points |
|
cex |
Symbol size | ?points |
|
| SpatialPixelsDataFrame | zlim |
Attribute value limits | ?image.default |
| and | col |
Colours | ?image.default |
| SpatialGridDataFrame | breaks |
Break points | ?image.default |
Example: image methods for objects of class SpatialPixelsDataFrame and SpatialGridDataFrame
library(gstat)
zn.idw <- krige(log(zinc) ~ 1, meuse, meuse.grid)
## [inverse distance weighted interpolation]
grays = gray.colors(4, 0.55, 0.95)
image(zn.idw, col = grays, breaks = log(c(100, 200, 400, 800, 1800)))
plot(meuse.pol, add = TRUE)
plot(meuse, pch = 1, cex = sqrt(meuse$zinc)/20, add = TRUE)
legVals <- c(100, 200, 500, 1000, 2000)
legend("left", legend = legVals, pch = 1, pt.cex = sqrt(legVals)/20,
bty = "n", title = "measured")
legend("topleft", legend = c("100-200", "200-400", "400-800","800-1800"),
fill = grays, bty = "n", title = "interpolated")
library(lattice)
##
## Attaching package: 'lattice'
## The following object is masked from 'package:spatstat':
##
## panel.histogram
zn <- krige(zinc~1,meuse,meuse.grid)
## [inverse distance weighted interpolation]
zn$direct <- zn$var1.pred
zn$log <- exp(krige(log(zinc)~1,meuse,meuse.grid)$var1.pred)
## [inverse distance weighted interpolation]
# traditional:
levelplot(z ~ x + y | name,
spmap.to.lev(zn[c("direct","log")]),
asp = "iso")
# with spplot:
spplot(zn[c("direct","log")])
Example: SpatialLinesDataFrame with contourLines function and spplot (using maptools)
data(meuse.grid)
coordinates(meuse.grid) <- c("x", "y")
meuse.grid <- as(meuse.grid, "SpatialPixelsDataFrame")
im <- as.image.SpatialGridDataFrame(meuse.grid["dist"])
cl <- ContourLines2SLDF(contourLines(im))
spplot(cl)
Example: Adding Reference and Layout Elements to Plots with spplot
river <- list("sp.polygons", meuse.pol)
north <- list("SpatialPolygonsRescale",
layout.north.arrow(),
offset = c(178750, 332500),
scale = 400)
scale <- list("SpatialPolygonsRescale",
layout.scale.bar(),
offset = c(180200, 329800),
scale = 1000,
fill = c("transparent","black"))
txt1 <- list("sp.text", c(180200, 329950), "0")
txt2 <- list("sp.text", c(181200, 329950), "1 km")
pts <- list("sp.points",
meuse,
pch = 3, col = "black")
meuse.layout <- list(river, north, scale, txt1, txt2, pts)
spplot(zn["log"], sp.layout = meuse.layout)
Example: latticeExtra
library(latticeExtra)
p = spplot(meuse["zinc"])
m = SpatialPolygonsDataFrame(meuse.pol,
data.frame(col = 1),
match.ID = FALSE)
l = spplot(m)
l + p
p + l
(this can not be demonstrated in the pdf version)
Example: interactive with base
plot(meuse)
region <- locator(type = "o")
n <- length(region$x)
p <- Polygon(cbind(region$x, region$y)[c(1:n, 1), ],
hole = FALSE)
ps <- Polygons(list(p), ID = "region")
sps <- SpatialPolygons(list(ps))
plot(meuse[sps, ], pch = 16, cex = 0.5, add = TRUE)
Example: interactive with spplot To select points with spplot, use: identify = T in the spplot function
Digitising can be done by the function grid.locator from package grid, which underlies the functionality in lattice. A single point is selected by:
library(grid)
trellis.focus("panel", column = 1, row = 1)
as.numeric(grid.locator())
trellis.unfocus()
note aboute choosing class intervals: Fisher-Jenks natural breaks is a better option than quantiles.
This exercises are mostly based on the Harvard workshop: Applied Spatial Statistics in R.
rm(list=ls())
url <- url("http://www.people.fas.harvard.edu/~zhukov/Datasets.RData")
load(url) #might takes some time
ls()
## [1] "cities" "crime" "dat88" "election" "laos" "map2"
## [7] "map3" "mat88" "url" "volcano"
Example: point data - crime database. (using Traditional Plot System)
head(crime)
## ID LONG LAT
## 1 1 -76.65159 39.23941
## 2 2 -76.47434 39.35274
## 3 3 -76.51726 39.25874
## 4 4 -76.52607 39.40707
## 5 5 -76.51001 39.33571
## 6 6 -76.70375 39.26605
dim(crime)
## [1] 1061 3
First, we convert the crime dataframe into a SpatialDataFrame
proj <- CRS("+proj=utm +zone=17 +datum=WGS84")
crime.sp <- SpatialPointsDataFrame(coords = cbind(crime$LONG, crime$LAT),
data = crime,
proj4string = proj)
bounding box of data points:
bbox(crime.sp)
## min max
## coords.x1 -76.82981 -76.38756
## coords.x2 39.22488 39.67995
Plot points:
plot(crime.sp, pch = 16, cex = 0.5, axes = T)
Example: polygon data : 2004 US election + crime. (using Traditional Plot System)
The election data is already in class: SpatialPolygonDataFrame
class(election)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
names(election)
## [1] "NAME" "STATE_NAME" "STATE_FIPS" "CNTY_FIPS" "FIPS"
## [6] "AREA" "FIPS_num" "Bush" "Kerry" "County_F"
## [11] "Nader" "Total" "Bush_pct" "Kerry_pct" "Nader_pct"
## [16] "MDratio" "hosp" "pcthisp" "pcturban" "urbrural"
## [21] "pctfemhh" "pcincome" "pctpoor" "pctlt9ed" "pcthsed"
## [26] "pctcoled" "unemploy" "pctwhtcl" "homevalu" "rent"
## [31] "popdens" "crowded" "ginirev" "SmokecurM" "SmokevrM"
## [36] "SmokecurF" "SmokevrF" "Obese" "Noins" "XYLENES__M"
## [41] "TOLUENE" "TETRACHLOR" "STYRENE" "NICKEL_COM" "METHYLENE_"
## [46] "MERCURY_CO" "LEAD_COMPO" "BENZENE__I" "ARSENIC_CO" "POP2000"
## [51] "POP00SQMIL" "MALE2000" "FEMALE2000" "MAL2FEM" "UNDER18"
## [56] "AIAN" "ASIA" "BLACK" "NHPI" "WHITE"
## [61] "AIAN_MORE" "ASIA_MORE" "BLK_MORE" "NHPI_MORE" "WHT_MORE"
## [66] "HISP_LAT" "CH19902000" "MEDAGE2000" "PEROVER65"
dim(election)
## [1] 3111 69
election.sp <- election
Lambert Conformal Conic Projection:
proj4string(election.sp) <- CRS("+proj=lcc+lon_0=90w +lat_1=20n +lat_2=60n")
summary(election.sp)[1:4]
## $class
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
##
## $bbox
## min max
## r1 -124.73142 -66.96985
## r2 24.95597 49.37173
##
## $is.projected
## [1] TRUE
##
## $proj4string
## [1] "+proj=lcc+lon_0=90w +lat_1=20n +lat_2=60n"
plot basemap of counties:
par(mar = c(0,0,0,0))
plot(election)
basemap + crime
par(mar=rep(0.5,4))
plot(election,
xlim=bbox(crime.sp)[1,], # defining the zoom with bbox
ylim=bbox(crime.sp)[2,],
col="beige")
plot(crime.sp,pch=1, cex=.5,add=T, col="blue")
Example: ploting some attributes : 2004 US election + crime. (using Traditional Plot System)
Firs let’s check the RColorBrewer package options:
par(mar=c(0,3,0,0),cex=.6)
display.brewer.all(n=5)
# Create blue-state red-state palette
br.palette <- colorRampPalette(c("blue", "red"), space = "rgb")
br.palette(5)
## [1] "#0000FF" "#3F00BF" "#7F007F" "#BF003F" "#FF0000"
plot the % of vote for Bush
here are two plotting options:
Easy but unflexible option
bush <- election$Bush_pct
spplot(election,
zcol = "Bush_pct",
col.regions = br.palette(100),
main = "Percent of County Vote for Bush (2004)")
Harder but more flexible option: Using classInt package.
pal <- br.palette(n=5) # Define number of colors in a palette
# Fixed intervals:
classes_fx <- classIntervals(bush, n=5, style = "fixed",
fixedBreaks=c(0,10,25,50,75,100), rtimes = 1)
# Other methods:
classes_sd <- classIntervals(bush, n=5, style = "sd", rtimes = 1)
classes_fi <- classIntervals(bush, n=5, style = "fisher", rtimes = 3)
classes_eq <- classIntervals(bush, n=5, style = "equal", rtimes = 1)
classes_km <- classIntervals(bush, n=5, style = "kmeans", rtimes = 1)
classes_qt <- classIntervals(bush, n=5, style = "quantile", rtimes = 1)
# Compare classes before plotting
par(mar=c(2,2,2,1)+0.1, mfrow=c(2,3))
plot(classes_fx, pal=pal, main="Fixed Intervals", xlab="", ylab="")
plot(classes_sd, pal=pal, main="Standard Deviation", xlab="", ylab="")
plot(classes_fi, pal=pal, main="Fisher-Jenks", xlab="", ylab="")
plot(classes_km, pal=pal, main="K Means", xlab="", ylab="")
plot(classes_eq, pal=pal, main="Equal Interval", xlab="", ylab="")
plot(classes_qt, pal=pal, main="Quantile", xlab="", ylab="")
par(mfrow = c(2,1))
# Plot using fixed intervals
cols <- findColours(classes_fx, pal) # this is a classInt function
par(mar=rep(0,4))
plot(election, col = cols, border=NA)
legend(x = "bottom", cex = 0.7, fill = attr(cols, "palette"),
bty = "n", legend = names(attr(cols, "table")),
title = "Percent of County Vote for Bush (2004)", ncol=5)
# Plot binary Bush/Kerry (Red/Blue)
cols <- ifelse(election.sp$Bush > election.sp$Kerry,"red","blue")
plot(election, col = cols, border = NA)
legend(x = "bottom", cex= 0.7, fill = c("red","blue"),
bty = "n", legend = c("Bush","Kerry"),
title = "Winner of County Vote (2004)", ncol=2)
Example: Grid data: Maunga Whau Volcano. (using also: contour and 3D plot with persp package)
# the volcano data:
class(volcano)
## [1] "matrix"
dim(volcano)
## [1] 87 61
z <- volcano ## Height Variable
x <- 10*(1:nrow(z)) ## 10 Meter Spacing (S-N)
y <- 10*(1:ncol(z)) ## 10 Meter Spacing (E-W)
# Gradient + Contour plot:
image(x, y, z, col=terrain.colors(100), axes=F)
contour(x, y, z, levels=seq(from=min(z), to=max(z), by=10),axes=F, add=T)
# 3D Plot:
## 3-D Elevation Plot w/ color
z <- 2 * volcano
x <- 10 * (1:nrow(z))
y <- 10 * (1:ncol(z))
## Create new grid
z0 <- min(z) - 20
z <- rbind(z0, cbind(z0, z, z0), z0)
x <- c(min(x) - 1e-10, x, max(x) + 1e-10)
y <- c(min(y) - 1e-10, y, max(y) + 1e-10)
## Create matrix of base colors
fcol <- matrix("green3", nr = nrow(z)-1, nc = ncol(z)-1)
fcol[ , i2 <- c(1,ncol(fcol))] <- "gray"
fcol[i1 <- c(1,nrow(fcol)) , ] <- "gray"
## Take average of four neighboring values for palette
zi <- (volcano[ -1,-1] + volcano[ -1,-61] + volcano[-87,-1] + volcano[-87,-61])/4
pal <- terrain.colors(20)[cut(zi, quantile(zi, seq(0,1, len = 21)), include.lowest = TRUE)]
fcol[-i1,-i2] <- pal
## Plot it
par(mar=rep(0,4))
persp(x, y, z,
theta=120, phi=15, col = fcol, scale = FALSE, shade = 0.4, border = NA)
# another way - plotly package
# this can be viewed only in html version
# library(plotly)
# plot_ly(z = ~volcano, type = "surface")
TODO
Cressie (1993) spatial statistic classification: - Spatial Point Pattern Data - Interpolation and Geostatistical Data - Lattice Data (areal data)
before moving to specific Spatial Statistic area we start by introducing some spatial statistic concepts.
preliminaries: distance conversion function () and load data
km2d <- function(km){ # convert km to degrees
out <- (km/1.852)/60
return(out)
}
d2km <- function(d){ # convert degrees to km
out <- d*60*1.852
return(out)
}
spdf <- election # a SpatialPolygonsDataFrame object
map_crd <- coordinates(spdf)
this spatial topic can be best explored with the spdep package for Spatial autocorrelation analysis.
more on this package from the package CRAN page:
A collection of functions to create spatial weights matrix objects from polygon contiguities, from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial autocorrelation, including global Moran’s I, APLE, Geary’s C, Hubert/Mantel general cross product statistic, Empirical Bayes estimates and Assunção/Reis Index, Getis/Ord G and multicoloured join count statistics, local Moran’s I and Getis/Ord G, saddlepoint approximations and exact tests for global and local Moran’s I; and functions for estimating spatial simultaneous autoregressive (SAR) lag and error models, impact measures for lag models, weighted and unweighted SAR and CAR spatial regression models, semi-parametric and Moran eigenvector spatial filtering, GM SAR error models, and generalized spatial two stage least squares models.
contiguity neighbors:
W_cont_el <- poly2nb(spdf, queen = T)
W_cont_el_mat <- nb2listw(W_cont_el,
style = "W",
zero.policy = TRUE)
par(mar=rep(0,4))
plot(W_cont_el_mat,
coords = map_crd,
pch = 19, cex = 0.1, col = "gray")
Moran’s I:
the Moran’s I coefficient calculates the ratio between the product of the variable of interest and its spatial lag, with the product of the variable of interest, adjusted for the spatial weights used. \[ I = \frac {n} {\sum_{i=1}^{n}\sum_{j=1}^{n}w_{ij}} \frac {\sum_{i=1}^{n}\sum_{j=1}^{n}w_{ij}(y_i-\bar{y})(y_j-\bar{y})} {\sum_{i=1}^{n}(y_i-\bar{y})^2}\] note that I range from –1 (perfect dispersion) to +1 (perfect correlation). A zero value indicates a random spatial pattern
moran.test(spdf$Bush_pct,
listw = W_cont_el_mat,
zero.policy = T)
##
## Moran I test under randomisation
##
## data: spdf$Bush_pct
## weights: W_cont_el_mat
##
## Moran I statistic standard deviate = 51.731, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.5565174275 -0.0003219575 0.0001158676
Geary’s C: The Geary’s C uses the sum of squared differences between pairs of data values as its measure of covariation.
\[C = \frac {(n-1)\sum _{i}\sum _{j}w_{ij}(y_{i}-y_{j})^{2}} {2W\sum _{i}(y_{i}-{\bar {y}})^{2}}\] Where \(W\) is the sum of all \(w_{ij}\)
geary.test(spdf$Bush_pct, listw=W_cont_el_mat, zero.policy=T)
##
## Geary C test under randomisation
##
## data: spdf$Bush_pct
## weights: W_cont_el_mat
##
## Geary C statistic standard deviate = 50.359, p-value < 2.2e-16
## alternative hypothesis: Expectation greater than statistic
## sample estimates:
## Geary C statistic Expectation Variance
## 0.4204068521 1.0000000000 0.0001324606
Join Count: When the variable of interest is categorical, a join count analysis can be used to assess the degree of clustering or dispersion
spdf$BushWin <- as.factor(ifelse(spdf$Bush > spdf$Kerry,1,0))
joincount.multi(spdf$BushWin, listw=W_cont_el_mat, zero.policy=T)
## Joincount Expected Variance z-value
## 0:0 130.7534 54.0599 6.7744 29.466
## 1:1 1111.0765 1030.8162 12.6147 22.598
## 1:0 311.6701 472.6272 29.4787 -29.645
## Jtot 311.6701 472.6272 29.9470 -29.413
It is possible to break Global measures down into their components, to construct local tests for spatial autocorrelation.
Moran Scatterplot: (note: we use the “W_cont_el_mat” wight matrix we clalculated before)
par(mar=c(4,4,1.5,0.5))
moran.plot(spdf$Bush_pct, listw = W_cont_el_mat, zero.policy=T,
xlim = c(0,100), ylim = c(0,100),
pch = 16, col = "black",cex = 0.5, quiet=F,
labels = as.character(spdf$NAME),
xlab = "Percent for Bush",
ylab="Percent for Bush (Spatial Lag)",
main="Moran Scatterplot")
## Potentially influential observations of
## lm(formula = wx ~ x) :
##
## dfb.1_ dfb.x dffit cov.r cook.d
## Glacier 0.09 -0.08 0.09_* 1.00_* 0.00
## Rolette 0.10 -0.09 0.10_* 1.00 0.01
## St. Louis -0.01 0.01 -0.01 1.00_* 0.00
## San Juan -0.28 0.26 -0.29_* 0.97_* 0.04
## Roosevelt 0.06 -0.06 0.07 1.00_* 0.00
## Clallam -0.05 0.04 -0.06 1.00_* 0.00
## Jefferson 0.01 -0.01 0.01 1.00_* 0.00
## Garfield 0.05 -0.05 -0.06 1.00 0.00
## Missoula 0.06 -0.05 0.07 1.00_* 0.00
## Custer -0.01 0.02 0.04 1.00_* 0.00
## Douglas 0.01 -0.01 0.01 1.00_* 0.00
## Ashland 0.00 0.00 0.00 1.00_* 0.00
## Sioux 0.18 -0.17 0.19_* 0.99_* 0.02
## Deer Lodge 0.09 -0.08 0.09_* 1.00_* 0.00
## Carter -0.01 0.02 0.02 1.00_* 0.00
## Big Horn 0.05 -0.05 0.06 1.00_* 0.00
## Harding -0.01 0.01 0.01 1.00_* 0.00
## Multnomah 0.05 -0.05 0.05 1.00_* 0.00
## Dewey 0.10 -0.09 0.10_* 1.00_* 0.01
## Valley 0.01 0.00 0.04 1.00_* 0.00
## Ramsey 0.01 -0.01 0.01 1.00_* 0.00
## Yellowstone National Park (Part) 0.51 -0.50 0.51_* 0.99_* 0.13
## Clinton -0.03 0.02 -0.04 1.00_* 0.00
## Lamoille -0.02 0.02 -0.03 1.00_* 0.00
## Chittenden -0.02 0.02 -0.02 1.00_* 0.00
## Clark -0.02 0.03 0.03 1.00_* 0.00
## Buffalo 0.16 -0.15 0.16_* 1.00_* 0.01
## Blaine 0.13 -0.12 0.14_* 0.99_* 0.01
## Teton 0.02 -0.01 0.05 1.00_* 0.00
## Madison 0.01 -0.01 -0.01 1.00_* 0.00
## Mower 0.01 -0.01 0.01 1.00_* 0.00
## Ada 0.00 0.00 0.04 1.00_* 0.00
## Shannon 0.36 -0.35 0.36_* 0.99_* 0.06
## Todd 0.20 -0.19 0.21_* 0.99_* 0.02
## Windham -0.03 0.03 -0.04 1.00_* 0.00
## Bannock 0.00 0.01 0.05 1.00_* 0.00
## Sioux 0.01 -0.01 -0.02 1.00_* 0.00
## Berkshire 0.01 -0.01 0.01 1.00_* 0.00
## Franklin -0.06 0.05 -0.06 1.00 0.00
## Cassia 0.00 0.00 0.00 1.00_* 0.00
## Hampshire -0.06 0.06 -0.06 1.00_* 0.00
## Suffolk -0.01 0.01 -0.01 1.00_* 0.00
## Franklin -0.02 0.02 0.02 1.00_* 0.00
## Blaine -0.03 0.04 0.04 1.00_* 0.00
## Box Elder -0.02 0.02 0.02 1.00_* 0.00
## Rich -0.01 0.01 0.01 1.00_* 0.00
## Arthur -0.04 0.04 0.05 1.00_* 0.00
## Geauga -0.01 0.00 -0.05 1.00_* 0.00
## Banner 0.00 -0.01 -0.01 1.00_* 0.00
## Cuyahoga 0.03 -0.03 0.03 1.00_* 0.00
## Dukes -0.31 0.30 -0.32_* 0.98_* 0.05
## Weber -0.03 0.04 0.06 1.00_* 0.00
## Lincoln -0.02 0.03 0.05 1.00_* 0.00
## Nantucket -0.26 0.24 -0.27_* 0.97_* 0.04
## Morgan 0.02 -0.03 -0.03 1.00_* 0.00
## Summit 0.05 -0.04 0.07 0.99_* 0.00
## Mahoning 0.00 0.00 0.00 1.00_* 0.00
## Bergen -0.05 0.05 -0.06 1.00_* 0.00
## Bronx 0.09 -0.09 0.09_* 1.00_* 0.00
## Salt Lake 0.01 0.00 0.04 1.00_* 0.00
## Nassau -0.04 0.03 -0.06 1.00_* 0.00
## Essex 0.01 -0.01 0.01 1.00_* 0.00
## New York -0.07 0.06 -0.08_* 1.00_* 0.00
## Uintah 0.02 -0.03 -0.03 1.00_* 0.00
## Duchesne 0.01 -0.01 -0.01 1.00_* 0.00
## Hudson -0.02 0.02 -0.02 1.00_* 0.00
## Queens 0.09 -0.08 0.09_* 1.00 0.00
## Kings -0.05 0.04 -0.09_* 0.99_* 0.00
## Hayes -0.02 0.02 0.02 1.00_* 0.00
## Richmond -0.15 0.12 -0.19_* 0.96_* 0.02
## Utah 0.01 -0.01 -0.01 1.00_* 0.00
## Philadelphia 0.06 -0.06 0.06 1.00_* 0.00
## Marion 0.06 -0.05 0.07 1.00_* 0.00
## Carbon 0.02 -0.01 0.06 1.00_* 0.00
## Denver 0.05 -0.05 0.06 1.00_* 0.00
## Pitkin 0.05 -0.05 0.06 1.00 0.00
## Montgomery -0.02 0.02 -0.02 1.00_* 0.00
## Monroe 0.05 -0.05 0.06 1.00_* 0.00
## Anne Arundel -0.03 0.02 -0.06 0.99_* 0.00
## Prince Georges 0.05 -0.05 0.05 1.00_* 0.00
## Washington 0.01 -0.01 0.01 1.01_* 0.00
## St. Louis 0.13 -0.13 0.13_* 1.00 0.01
## Sonoma -0.04 0.04 -0.04 1.00_* 0.00
## Monroe -0.02 0.01 -0.04 1.00_* 0.00
## Marin -0.07 0.07 -0.07 1.00_* 0.00
## Elliott 0.06 -0.05 0.06 1.00_* 0.00
## San Miguel 0.13 -0.13 0.13_* 1.00 0.01
## Garfield -0.01 0.01 0.01 1.00_* 0.00
## Dolores 0.02 -0.03 -0.06 1.00_* 0.00
## Alameda 0.05 -0.04 0.05 1.00_* 0.00
## San Francisco -0.05 0.05 -0.05 1.00_* 0.00
## San Mateo -0.12 0.11 -0.12_* 1.00 0.01
## Costilla 0.01 -0.01 0.01 1.00_* 0.00
## Jackson 0.01 -0.01 -0.01 1.00_* 0.00
## Knott 0.00 0.00 0.00 1.00_* 0.00
## Santa Clara -0.02 0.02 -0.02 1.00_* 0.00
## Santa Cruz -0.02 0.02 -0.02 1.00_* 0.00
## San Juan 0.01 -0.02 -0.05 1.00_* 0.00
## Rio Arriba -0.01 0.00 -0.01 1.00_* 0.00
## Apache 0.07 -0.07 0.08_* 1.00 0.00
## Taos 0.00 0.00 0.00 1.00_* 0.00
## Hertford -0.02 0.01 -0.02 1.00_* 0.00
## Warren 0.01 -0.01 0.01 1.00_* 0.00
## Person -0.01 0.00 -0.04 1.00_* 0.00
## Ochiltree -0.05 0.05 0.06 1.00_* 0.00
## Orange 0.04 -0.04 0.04 1.00_* 0.00
## Durham 0.03 -0.03 0.03 1.00_* 0.00
## Kingfisher 0.00 0.00 0.00 1.00_* 0.00
## Roberts -0.04 0.05 0.05 1.00_* 0.00
## Santa Fe 0.01 -0.01 0.01 1.00_* 0.00
## McKinley 0.02 -0.02 0.02 1.00_* 0.00
## Los Alamos -0.04 0.03 -0.06 1.00_* 0.00
## San Miguel 0.04 -0.04 0.05 1.00_* 0.00
## Tipton 0.01 -0.02 -0.05 1.00_* 0.00
## Potter -0.03 0.04 0.06 1.00_* 0.00
## Oldham -0.01 0.01 0.01 1.00_* 0.00
## St. Francis -0.02 0.02 -0.04 1.00_* 0.00
## Torrance 0.00 -0.01 -0.04 1.00_* 0.00
## De Soto 0.04 -0.06 -0.09_* 0.99_* 0.00
## Tunica 0.02 -0.02 0.03 1.00_* 0.00
## Parmer 0.01 -0.01 -0.01 1.00_* 0.00
## Catron 0.03 -0.04 -0.06 1.00_* 0.00
## Grant 0.00 -0.01 -0.04 1.00_* 0.00
## Florence -0.02 0.01 -0.04 1.00_* 0.00
## Foard 0.01 0.00 0.04 1.00_* 0.00
## Fulton 0.08 -0.08 0.09_* 1.00_* 0.00
## Clarke 0.11 -0.10 0.12_* 0.99_* 0.01
## Jefferson 0.03 -0.02 0.05 1.00_* 0.00
## Taliaferro 0.02 -0.02 0.02 1.00_* 0.00
## Carroll 0.01 -0.02 -0.05 1.00_* 0.00
## Clayton 0.13 -0.12 0.13_* 1.00 0.01
## Chicot 0.01 -0.01 0.01 1.00_* 0.00
## Fayette 0.03 -0.03 -0.05 1.00_* 0.00
## Hancock 0.13 -0.13 0.13_* 1.00 0.01
## Bamberg -0.01 0.01 -0.01 1.00_* 0.00
## Holmes 0.11 -0.10 0.11_* 1.00 0.01
## Dorchester 0.00 -0.01 -0.04 1.00_* 0.00
## Glascock 0.10 -0.12 -0.15_* 0.99_* 0.01
## Noxubee 0.06 -0.05 0.06 1.00_* 0.00
## Greene 0.09 -0.09 0.09_* 1.00_* 0.00
## Allendale 0.04 -0.04 0.04 1.00_* 0.00
## West Carroll 0.05 -0.06 -0.08 1.00_* 0.00
## Dallas 0.04 -0.03 0.06 1.00_* 0.00
## Sumter 0.02 -0.02 0.02 1.00_* 0.00
## Shackelford 0.00 0.00 0.00 1.00_* 0.00
## Perry 0.08 -0.07 0.08_* 1.00 0.00
## Warren -0.02 0.01 -0.05 1.00_* 0.00
## Macon 0.20 -0.19 0.20_* 1.00 0.02
## Effingham 0.05 -0.06 -0.07 1.00_* 0.00
## Marengo -0.04 0.03 -0.05 1.00_* 0.00
## Schley 0.02 -0.03 -0.05 1.00_* 0.00
## Lowndes 0.08 -0.07 0.08_* 1.00 0.00
## Bullock 0.02 -0.02 0.02 1.00_* 0.00
## Wilcox 0.02 -0.02 0.02 1.00_* 0.00
## Beaufort -0.01 0.00 -0.04 1.00_* 0.00
## Jasper 0.05 -0.04 0.06 1.00_* 0.00
## Claiborne 0.13 -0.13 0.13_* 1.00 0.01
## Sterling -0.02 0.03 0.03 1.00_* 0.00
## Glasscock -0.02 0.03 0.03 1.00_* 0.00
## Loving 0.01 -0.02 -0.02 1.00_* 0.00
## Lee 0.05 -0.06 -0.08_* 1.00_* 0.00
## Jefferson Davis 0.29 -0.28 0.29_* 0.99_* 0.04
## Crane -0.01 0.01 0.01 1.00_* 0.00
## Miller 0.02 -0.03 -0.06 1.00_* 0.00
## Decatur 0.10 -0.09 0.10_* 1.00 0.01
## Grady 0.00 -0.01 -0.05 1.00_* 0.00
## Gadsden 0.06 -0.06 0.07 1.00 0.00
## Livingston 0.05 -0.06 -0.07 1.00_* 0.00
## Duval 0.02 -0.01 0.04 1.00_* 0.00
## Jefferson 0.05 -0.04 0.06 1.00_* 0.00
## Orleans 0.22 -0.21 0.22_* 0.99_* 0.02
## St. Bernard 0.01 -0.02 -0.06 1.00_* 0.00
## Alachua 0.07 -0.06 0.07 1.00_* 0.00
## LaFourche 0.07 -0.08 -0.09_* 1.00_* 0.00
## Bexar 0.03 -0.02 0.05 1.00_* 0.00
## Zavala 0.10 -0.10 0.10_* 1.00 0.01
## McMullen 0.09 -0.11 -0.12_* 0.99_* 0.01
## Dimmit -0.03 0.03 -0.03 1.00_* 0.00
## Duval 0.06 -0.06 0.06 1.00 0.00
## Brooks -0.02 0.02 -0.02 1.00_* 0.00
## Starr -0.02 0.02 -0.02 1.00_* 0.00
## Willacy 0.00 0.00 0.00 1.00_* 0.00
## Menominee 0.20 -0.19 0.20_* 1.00 0.02
## Orleans 0.00 -0.01 -0.06 0.99_* 0.00
## Monroe 0.22 -0.22 0.23_* 1.00_* 0.03
## Wayne -0.01 0.00 -0.04 1.00_* 0.00
## Dane 0.03 -0.03 0.03 1.00_* 0.00
## Milwaukee 0.09 -0.08 0.09_* 1.00 0.00
## Livingston 0.00 -0.01 -0.04 1.00_* 0.00
## Wayne 0.02 -0.02 0.02 1.00_* 0.00
## Washtenaw 0.02 -0.01 0.02 1.00_* 0.00
## Broome 0.12 -0.11 0.12_* 1.00 0.01
## Cook 0.08 -0.07 0.08_* 1.00 0.00
## Monroe -0.04 0.03 -0.05 1.00_* 0.00
## Lagrange 0.09 -0.08 0.09_* 1.00_* 0.00
## Baltimore City 0.15 -0.15 0.15_* 1.00 0.01
## Fairfax -0.05 0.05 -0.06 1.00_* 0.00
## Arlington -0.08 0.08 -0.08_* 1.00 0.00
## Alexandria -0.12 0.11 -0.12_* 1.00_* 0.01
## Harrisonburg 0.02 -0.01 0.05 1.00_* 0.00
## Staunton 0.01 0.00 0.04 1.00_* 0.00
## Charlottesville 0.05 -0.05 0.05 1.00_* 0.00
## Clifton Forge 0.38 -0.37 0.38_* 1.00_* 0.07
## Richmond City 0.11 -0.10 0.11_* 1.00 0.01
## Colonial Heights 0.05 -0.06 -0.08_* 1.00_* 0.00
## Petersburg 0.26 -0.25 0.26_* 0.99_* 0.03
## Poquoson City 0.05 -0.06 -0.07 1.00_* 0.00
## South Boston 0.40 -0.40 0.40_* 0.99_* 0.08
## hat
## Glacier 0.00
## Rolette 0.00
## St. Louis 0.00
## San Juan 0.00
## Roosevelt 0.00
## Clallam 0.00
## Jefferson 0.00
## Garfield 0.00_*
## Missoula 0.00
## Custer 0.00
## Douglas 0.00
## Ashland 0.00
## Sioux 0.00_*
## Deer Lodge 0.00
## Carter 0.00
## Big Horn 0.00
## Harding 0.00
## Multnomah 0.00_*
## Dewey 0.00
## Valley 0.00
## Ramsey 0.00
## Yellowstone National Park (Part) 0.01_*
## Clinton 0.00
## Lamoille 0.00
## Chittenden 0.00
## Clark 0.00
## Buffalo 0.00_*
## Blaine 0.00
## Teton 0.00
## Madison 0.00_*
## Mower 0.00
## Ada 0.00
## Shannon 0.00_*
## Todd 0.00_*
## Windham 0.00_*
## Bannock 0.00
## Sioux 0.00
## Berkshire 0.00_*
## Franklin 0.00_*
## Cassia 0.00
## Hampshire 0.00_*
## Suffolk 0.00_*
## Franklin 0.00_*
## Blaine 0.00_*
## Box Elder 0.00
## Rich 0.00
## Arthur 0.00_*
## Geauga 0.00
## Banner 0.00
## Cuyahoga 0.00
## Dukes 0.00_*
## Weber 0.00
## Lincoln 0.00
## Nantucket 0.00
## Morgan 0.00
## Summit 0.00
## Mahoning 0.00
## Bergen 0.00
## Bronx 0.00_*
## Salt Lake 0.00
## Nassau 0.00
## Essex 0.00_*
## New York 0.00
## Uintah 0.00
## Duchesne 0.00
## Hudson 0.00
## Queens 0.00_*
## Kings 0.00
## Hayes 0.00
## Richmond 0.00
## Utah 0.00
## Philadelphia 0.00_*
## Marion 0.00
## Carbon 0.00
## Denver 0.00_*
## Pitkin 0.00_*
## Montgomery 0.00
## Monroe 0.00
## Anne Arundel 0.00
## Prince Georges 0.00_*
## Washington 0.01_*
## St. Louis 0.00_*
## Sonoma 0.00_*
## Monroe 0.00
## Marin 0.00_*
## Elliott 0.00_*
## San Miguel 0.00_*
## Garfield 0.00
## Dolores 0.00
## Alameda 0.00_*
## San Francisco 0.00_*
## San Mateo 0.00_*
## Costilla 0.00_*
## Jackson 0.00
## Knott 0.00
## Santa Clara 0.00
## Santa Cruz 0.00_*
## San Juan 0.00
## Rio Arriba 0.00
## Apache 0.00
## Taos 0.00_*
## Hertford 0.00
## Warren 0.00
## Person 0.00
## Ochiltree 0.00_*
## Orange 0.00
## Durham 0.00_*
## Kingfisher 0.00
## Roberts 0.00_*
## Santa Fe 0.00_*
## McKinley 0.00
## Los Alamos 0.00
## San Miguel 0.00_*
## Tipton 0.00
## Potter 0.00
## Oldham 0.00
## St. Francis 0.00
## Torrance 0.00
## De Soto 0.00
## Tunica 0.00_*
## Parmer 0.00
## Catron 0.00
## Grant 0.00
## Florence 0.00
## Foard 0.00
## Fulton 0.00
## Clarke 0.00
## Jefferson 0.00
## Taliaferro 0.00
## Carroll 0.00
## Clayton 0.00_*
## Chicot 0.00
## Fayette 0.00
## Hancock 0.00_*
## Bamberg 0.00
## Holmes 0.00_*
## Dorchester 0.00
## Glascock 0.00
## Noxubee 0.00_*
## Greene 0.00_*
## Allendale 0.00_*
## West Carroll 0.00
## Dallas 0.00
## Sumter 0.00_*
## Shackelford 0.00
## Perry 0.00_*
## Warren 0.00
## Macon 0.00_*
## Effingham 0.00
## Marengo 0.00
## Schley 0.00
## Lowndes 0.00_*
## Bullock 0.00_*
## Wilcox 0.00
## Beaufort 0.00
## Jasper 0.00
## Claiborne 0.00_*
## Sterling 0.00
## Glasscock 0.00_*
## Loving 0.00
## Lee 0.00
## Jefferson Davis 0.00_*
## Crane 0.00
## Miller 0.00
## Decatur 0.00_*
## Grady 0.00
## Gadsden 0.00_*
## Livingston 0.00
## Duval 0.00
## Jefferson 0.00
## Orleans 0.00_*
## St. Bernard 0.00
## Alachua 0.00
## LaFourche 0.00
## Bexar 0.00
## Zavala 0.00_*
## McMullen 0.00
## Dimmit 0.00
## Duval 0.00_*
## Brooks 0.00_*
## Starr 0.00_*
## Willacy 0.00
## Menominee 0.00_*
## Orleans 0.00
## Monroe 0.00_*
## Wayne 0.00
## Dane 0.00
## Milwaukee 0.00
## Livingston 0.00
## Wayne 0.00_*
## Washtenaw 0.00
## Broome 0.00_*
## Cook 0.00_*
## Monroe 0.00
## Lagrange 0.00
## Baltimore City 0.00_*
## Fairfax 0.00
## Arlington 0.00_*
## Alexandria 0.00
## Harrisonburg 0.00
## Staunton 0.00
## Charlottesville 0.00_*
## Clifton Forge 0.01_*
## Richmond City 0.00_*
## Colonial Heights 0.00
## Petersburg 0.00_*
## Poquoson City 0.00
## South Boston 0.01_*
The plot is partitioned into four quadrants: low-low, low-high, high-low and high-high
Local Moran’s I Plot (normality assumption)
lm1 <- localmoran(spdf$Bush_pct, listw = W_cont_el_mat, zero.policy = T)
spdf$lm1 <- lm1[,4] ## Extract z-scores
lm.palette <- colorRampPalette(c("red","white","blue","blue4"), space = "rgb")
spplot(spdf, zcol="lm1",
col.regions = lm.palette(20),
main = "Local Moran's I", pretty = T)
we will cover: - Interpolation with Inverse-distance weighted (IDW) - The variogram - Kringing
and more topics in the future
start with exploring the Laos Dataset
# subset of Laos bombing dataset:
df <- laos[sample(500, replace = F),]
str(df) #data.frame object
## 'data.frame': 500 obs. of 24 variables:
## $ ID : int 261 49 268 147 99 340 297 167 38 359 ...
## $ LAT : num 18.7 18.2 18.2 18.2 18.2 ...
## $ LONG : num 104 105 105 105 105 ...
## $ DATE : Factor w/ 900 levels "1/1/1966","1/1/1968",..: 82 617 474 749 480 46 503 835 364 525 ...
## $ YEAR : int 1970 1967 1970 1969 1969 1972 1970 1969 1967 1972 ...
## $ MONTH : int 1 4 2 5 2 1 3 8 2 3 ...
## $ DAY : int 24 13 5 4 6 17 10 16 10 16 ...
## $ YRMO : int 197001 196704 197002 196905 196902 197201 197003 196908 196702 197203 ...
## $ LAT_IND60 : num 18.7 18.2 18.2 18.2 18.2 ...
## $ LON_IND60 : num 104 105 105 105 105 ...
## $ NUM_ACRFT : int 2 2 4 2 2 2 2 4 2 2 ...
## $ AIRCRAFT : Factor w/ 44 levels "40002","A-1",..: 5 3 30 5 3 42 35 12 10 42 ...
## $ LOAD_QTY : int 4 4 2 2 2 12 6 20 6 12 ...
## $ LOAD_LBS : int 531 301 799 531 531 63720 531 531 260 63720 ...
## $ ORDNANCE : Factor w/ 33 levels "1000 LB Bomb MK-83",..: 11 24 15 25 25 30 11 11 7 30 ...
## $ ORD_CLASS : Factor w/ 11 levels "1000LB MK-65",..: 9 6 10 9 9 9 9 9 6 9 ...
## $ CATEGORY : Factor w/ 2 levels "General_Purpose",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ TARGET : Factor w/ 108 levels " ","100mm Gun Site",..: 19 19 19 19 19 21 19 19 19 21 ...
## $ BDA : Factor w/ 31 levels " ","Cratered",..: 30 30 14 30 24 28 24 30 22 28 ...
## $ Total.Bombs: int 4 27 2 2 20 12 6 20 6 24 ...
## $ PUNISH : int 0 0 0 0 0 0 0 0 0 0 ...
## $ RISK : int 0 0 0 0 0 0 0 0 0 0 ...
## $ DENIAL : int 1 1 1 1 1 1 1 1 1 1 ...
## $ DECAP : int 0 0 0 0 0 0 0 0 0 0 ...
open simple world map and create coordinates matrix:
data("wrld_simpl")
sp_point <- cbind(LONG = jitter(df$LONG, 0.001),
LAT = jitter(df$LAT, 0.001))
create SpatialPointsDataFrame object and plot it:
df.sp <- SpatialPointsDataFrame(coords = sp_point, data = df,
proj4string = CRS("+proj=utm +zone=48 +datum=WGS84"))
par(mar = rep(0,4))
plot(df.sp, pch = 1, cex = log(df.sp$LOAD_LBS)/5)
zoom in:
par(mar=c(2,2,0.5,0.5))
plot(wrld_simpl,
xlim = bbox(df.sp)[1,] + c(-1,1),
ylim = bbox(df.sp)[2,] + c(-2,2),
col = "lightgrey", axes = T) ## World Map
points(df.sp, pch = 16,cex = .5, col = "red")
bubble plot: PAYLOAD variable:
bubble(df.sp,"LOAD_LBS", col = "red")
A variogram cloud is a scatterplot of data pairs, in which the semivariance is plotted against interpoint distance
plot(variogram(log(LOAD_LBS)~1,
locations = coordinates(sp_point),
data = df.sp, cloud = T)
,pch=16, cex=0.5)
Upper left corner: point pairs are close together, but have very different values. Lower left corner: close together, similar values. Upper right corner: far apart, different values. Lower right corner: far apart, similar values. A variogram can also be used to identify outliers…
plot(variogram(log(LOAD_LBS)~1,
locations = coordinates(sp_point),
data = df.sp, cloud = F),
type = "b", pch=16)
sel <- plot(variogram(LOAD_LBS ~ 1,
locations = coordinates(sp_point),
data = df.sp,
alpha = c(0,45,90,135), cloud = T),
pch = 16, cex = 1, digitize = T,col = "blue")
plot(sel,df.sp)
Find the outlying pairs in the dataset
sel
out.pair <- function(x,data,sel){
a <- as.data.frame(data[sel$head,as.character(x)])
b <- as.data.frame(data[sel$tail,as.character(x)])
ID.a <- round(as.numeric(rownames(as.data.frame(data[sel$head,]))),0)
ID.b <- round(as.numeric(rownames(as.data.frame(data[sel$tail,]))),0)
out <- cbind(ID.a,a[,as.character(x)],ID.b,b[,as.character(x)])
colnames(out) <- c("ID.a",paste(x,".a",sep=""),"ID.b",paste(x,".b",sep=""))
out
}
out.pair(x="LOAD_LBS",data=df.sp,sel=sel)