Ecological studies use statistics to examine factors that modify risk, with the data typically aggregated by area. Our application concerns factors that modify the risk of tornadoes.
download.file(url = "http://www.spc.noaa.gov/gis/svrgis/zipped/tornado.zip",
destfile = "tornado.zip")
unzip("tornado.zip")
library("rgdal")
## Loading required package: sp
## rgdal: version: 0.9-1, (SVN revision 518)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.9.2, released 2012/10/08
## Path to GDAL shared files: /Users/jelsner/Library/R/3.1/library/rgdal/gdal
## Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
## Path to PROJ.4 shared files: /Users/jelsner/Library/R/3.1/library/rgdal/proj
TornL = readOGR(dsn = ".", layer = "tornado",
stringsAsFactors = FALSE)
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "tornado"
## with 57988 features and 21 fields
## Feature type: wkbLineString with 2 dimensions
library("raster")
r = raster(xmn = -102, xmx = -95,
ymn = 36, ymx = 42,
resolution = .25)
sp = as(r, 'SpatialPolygons')
spT = spTransform(sp, CRS(proj4string(TornL)))
Here we create a raster by specifying a bounding box and a resolution of .25 degree. The domain is centered on Kansas but includes parts of Nebraska, Iowa, Oklahoma, and Texas. At this resolution the cells all have about the same area, so the exposure to tornadoes is roughly equal across cells. The native CRS of the tornado data is Lambert conformal conic (lcc) with units of meters. Convert the raster to spatial polygons, which are used to perform the overlay of the tornado tracks, and then transform the polygon grid to the projection of the tornado tracks.
There are 24 cells north to south (rows) and 28 cells east to west (cols) for a total of 672. The CRS is geographic. Overlay the tornado tracks onto the spatial polygons with the over() function from the sp package. It works with various combinations of spatial objects but the rgeos package is needed for it to work with spatial polygons and spatial lines.
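As a quick check of the grid just described (the expected values follow from the bounding box and resolution):
nrow(r); ncol(r); ncell(r)  # 24 rows, 28 columns, 672 cells
projection(sp)              # geographic (longitude/latitude) coordinates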
Overlay the tracks onto the cells; over() returns a vector containing either NA (the track is not inside the extent) or the cell number. Subset the tornadoes to those within the raster extent. Compute the track lengths and store them in a data frame together with the start and end latitude/longitude and the date. Use duplicates of this data frame to identify and remove duplicate tracks.
library("rgeos")
## rgeos version: 0.3-4, (SVN revision 438)
## GEOS runtime version: 3.3.3-CAPI-1.7.4
## Polygon checking: TRUE
tc = over(TornL, spT)
TornL2 = subset(TornL, !is.na(tc))
PathD = SpatialLinesLengths(TornL2)
df = data.frame(PathD,
SLAT = as.numeric(TornL2$SLAT),
SLON = as.numeric(TornL2$SLON),
ELAT = as.numeric(TornL2$ELAT),
ELON = as.numeric(TornL2$ELON),
DATE = TornL2$DATE)
dup = duplicated(df)
TornL3 = subset(TornL2, !dup)
#TornL3 = subset(TornL3, as.numeric(YR) >= 1990)
Convert the path widths in the attribute table from yards to meters and assign the minimum nonzero width to tornadoes recorded with zero width. Then buffer each track by half its width to create a spatial polygons data frame of tornado paths, and overlay the path polygons on the cell polygons.
Width = as.numeric(TornL3$WID) * .9144
Width[Width == 0] = min(Width[Width != 0])
TornP3 = gBuffer(TornL3, byid = TRUE, width = Width/2)
ct = over(spT, TornP3, returnList = TRUE)
nT = sapply(ct, function(x) length(x))
sumL = sapply(ct, function(x) sum(TornP3$Length[x]))
nt = r
values(nt) = nT
The result is a list with one element per cell (672 elements). Each element is a vector of row numbers from the attribute table corresponding to the paths occurring in that cell. The order of the list matches the order of the raster cells (upper left to lower right in row-major order). From the list we get the length of each vector using sapply() (simple apply), where the first argument is the list object and the second argument is a function. These counts are then placed onto the original raster with the values() function. Note that the GIS overlay operation uses projected coordinates, while the analysis and display use geographic coordinates.
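To see what sapply() is doing here, consider a toy list (hypothetical index vectors standing in for three cells):
ct.demo = list(integer(0), c(3, 7), c(1, 2, 5))  # no paths, two paths, three paths
sapply(ct.demo, length)                          # returns 0 2 3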
Compare using lines only.
ct2 = over(spT, TornL3, returnList = TRUE)
nT2 = sapply(ct2, function(x) nrow(x))
(sum(nT) - sum(nT2))/sum(nT) * 100
## [1] 4.380215
About 4.4% of the path-based counts are missed when only the centerline tracks are used: a buffered path can intersect cells that its centerline does not enter.
Digital elevation model data are available from http://www.viewfinderpanoramas.org/DEM/TIF15/. Get the elevation raster and crop it to the extent of the tornado raster r. Compute the elevation roughness with the terrain() function; here roughness is the difference between the maximum and the minimum elevation within a cell and its eight surrounding cells. Match the resolution and origin of the elevation and roughness rasters to those of the tornado raster by aggregating and then resampling. We will later examine the relationship between the number of tornadoes, surface roughness, elevation, and population with scatter plots and correlations.
#download.file(url = "http://www.viewfinderpanoramas.org/DEM/TIF15/15-H.ZIP",
# destfile = "15-H.ZIP", mode = "wb")
#unzip("15-H.ZIP")
download.file(url = "http://myweb.fsu.edu/jelsner/data/15-H.tif.zip",
destfile = "15-H.tif.zip", mode = "wb")
unzip("15-H.tif.zip")
Elev = raster("15-H.tif")
Elev = crop(Elev, nt)
TR = terrain(Elev, opt = 'roughness')
el = resample(aggregate(Elev, fact = c(nrow(r), ncol(r)), fun = mean), r)
tr = resample(aggregate(TR, fact = c(nrow(r), ncol(r)), fun = mean), r)
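As a sketch of the roughness definition (not part of the original workflow), the same quantity can be computed directly with focal(), apart from edge and NA handling:
w = matrix(1, nrow = 3, ncol = 3)   # 3 x 3 neighborhood
TRcheck = focal(Elev, w = w, fun = max, na.rm = TRUE) -
  focal(Elev, w = w, fun = min, na.rm = TRUE)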
Population density comes from the Gridded Population of the World (GPW), v3 (http://sedac.ciesin.columbia.edu/data/set/gpw-v3-population-density/data-download). Values are persons per square kilometer. Download the data as a grid and read it as a raster, then crop it to the extent of the tornado raster and match the resolution of the tornado grid.
download.file(url = "http://myweb.fsu.edu/jelsner/data/usadens.zip",
destfile = "usadens.zip")
unzip("usadens.zip")
Pop = raster("usadens/usads00g/w001001.adf")
Pop = crop(Pop, r)
pop = resample(aggregate(Pop, fact = c(nrow(r), ncol(r)), fun = mean), r)
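Before combining the layers, confirm that the tornado, elevation, roughness, and population rasters share the same extent, resolution, and CRS (a quick check; compareRaster() throws an error if they do not):
compareRaster(nt, el, tr, pop)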
Scatter plots
library("ggplot2")
df = data.frame(nT = values(nt),
el = values(el),
tr = values(tr),
pop = values(pop))
ggplot(df, aes(x = tr, y = nT)) +
geom_point() +
geom_smooth() +
ylab("Number of Tornadoes") +
xlab("Terrain Roughness (m)")
## geom_smooth: method="auto" and size of largest group is <1000, so using loess. Use 'method = x' to change the smoothing method.
ggplot(df, aes(x = log2(pop), y = nT)) +
geom_point() +
geom_smooth(method = lm) +
ylab(expression(paste("Number of Central Plains Tornadoes (1990-2013), .25", degree, "resolution"))) +
xlab(expression(paste("Population density (people per ", km^2, "), in 2000"))) +
scale_x_continuous(breaks = c(1, 2, 4, 8),
labels = c(1, 2, 4, 8))
ggplot(df, aes(x = el, y = nT)) +
geom_point() +
geom_smooth(method = lm) +
ylab("Number of Tornadoes") +
xlab("Elevation (m)")
Histogram of the number of tornadoes
ggplot(df, aes(x = nT)) +
geom_histogram(binwidth = 2, color = "white") +
ylab("Number of cells") +
xlab("Number of tornadoes")
mean(df$nT)
## [1] 14.98214
var(df$nT)/mean(df$nT)
## [1] 3.651618
The mean number of tornadoes per cell is about 15 (tornadoes per .25 deg cell over the 64 years). The ratio of the variance to the mean is about 3.7. The cell counts are more dispersed than expected from a Poisson distribution.
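For comparison, counts drawn from a Poisson distribution with the same mean have a variance-to-mean ratio near one (a quick simulation sketch; the exact value depends on the random seed):
set.seed(1)
sim = rpois(length(df$nT), lambda = mean(df$nT))
var(sim)/mean(sim)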
Get county & state boundaries. Get the tornado tracks. Convert to the geographic coordinates of the raster.
library("mapproj")
## Loading required package: maps
library("maptools")
## Checking rgeos availability: TRUE
ext = as.vector(extent(r))
bndryC = map("county", fill = TRUE,
xlim = ext[1:2],
ylim = ext[3:4],
plot = FALSE)
IDs = sapply(strsplit(bndryC$names, ":"), function(x) x[1])
bndryCP = map2SpatialPolygons(bndryC, IDs = IDs,
proj4string = CRS(projection(r)))
bndryS = map("state", fill = TRUE,
xlim = ext[1:2],
ylim = ext[3:4],
plot = FALSE)
IDs = sapply(strsplit(bndryS$names, ":"), function(x) x[1])
bndrySP = map2SpatialPolygons(bndryS, IDs = IDs,
proj4string = CRS(projection(r)))
TornP3T = spTransform(TornP3, CRS(projection(r)))
Create the map
library("rasterVis")
## Loading required package: lattice
## Loading required package: latticeExtra
## Loading required package: RColorBrewer
##
## Attaching package: 'latticeExtra'
##
## The following object is masked from 'package:ggplot2':
##
## layer
##
## Loading required package: hexbin
library("wesanderson")
range(values(nt))
## [1] 1 49
rng = seq(0, 50, 5)
cr = wes.palette(name = "Zissou", type = "continuous")
p1 = levelplot(nt, margin = TRUE,
sub = "Tornado Counts (1950-2013)",
xlab = NULL, ylab = NULL,
col.regions = cr, at = rng,
colorkey = list(space = 'bottom'),
par.settings = list(fontsize = list(text = 15)))
p1 +
latticeExtra::layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
latticeExtra::layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1)) +
latticeExtra::layer(sp.polygons(TornP3T, fill = gray(.4),
col = gray(.4), alpha = .3))
Convert the tornado count raster to a spatial polygons data frame and rename the attribute nT. Add elevation, elevation roughness, and population density as additional columns, compute the cell areas in the projected coordinates, and add the base-2 logarithm of population density and a cell ID. Then check the correlations among the covariates.
spdf = as(nt, "SpatialPolygonsDataFrame")
names(spdf) = "nT"
spdf$el = values(el)
spdf$tr = values(tr)
spdf$pop = values(pop)
spdfT = spTransform(spdf, CRS(proj4string(TornP3)))
spdf$area = gArea(spdfT, byid = TRUE)
spdf$l2pop = log2(spdf$pop)
spdf$ID = 1:ncell(nt)
cor(spdf$el, spdf$tr)
## [1] -0.07755983
cor(spdf$el, spdf$pop)
## [1] -0.5615524
cor(spdf$tr, spdf$pop)
## [1] 0.1249989
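The corresponding correlations with the tornado counts themselves can be computed the same way (a sketch; output not shown):
cor(spdf$nT, spdf$tr)
cor(spdf$nT, spdf$el)
cor(spdf$nT, spdf$l2pop)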
Load the INLA package (the commented source() line installs it if needed).
#source("http://www.math.ntnu.no/inla/givemeINLA.R")
library("INLA")
## Loading required package: Matrix
## Loading required package: splines
Define the spatial neighborhoods (queen contiguity on the cell polygons) and write them out as a graph for INLA.
library("spdep")
nb = poly2nb(spdf)
nb2INLA("g", nb)
g = inla.read.graph("g")
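A quick check of the graph (a sketch): it should contain one node per grid cell, with neighbors defined by queen contiguity on the cell polygons.
g$n          # expected to equal ncell(nt), i.e. 672
summary(nb)  # distribution of the number of neighbors per cell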
Model for the smoothed relative rate. The expected count E is taken proportional to the cell area in square kilometers (area/10^6), and the spatially structured random effect follows the Besag intrinsic CAR model on the neighborhood graph.
formula0 = nT ~ f(ID, model = "besag", graph = g)
model0 = inla(formula0, family = "nbinomial", E = area/10^6,
data = spdf@data)
summary(model0)
##
## Call:
## c("inla(formula = formula0, family = \"nbinomial\", data = spdf@data, ", " E = area/10^6)")
##
## Time used:
## Pre-processing Running inla Post-processing Total
## 0.1481 0.8950 0.0375 1.0806
##
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -3.7458 0.0145 -3.7741 -3.7459 -3.7171 -3.746 0
##
## Random effects:
## Name Model
## ID Besags ICAR model
##
## Model hyperparameters:
## mean sd
## size for the nbinomial observations (overdispersion) 17.673 2.6565
## Precision for ID 2.782 0.4257
## 0.025quant 0.5quant
## size for the nbinomial observations (overdispersion) 12.995 17.489
## Precision for ID 2.047 2.748
## 0.975quant mode
## size for the nbinomial observations (overdispersion) 23.417 17.140
## Precision for ID 3.712 2.678
##
## Expected number of effective parameters(std dev): 210.17(24.12)
## Number of equivalent replicates : 3.197
##
## Marginal Likelihood: -2789.89
rSR = r
values(rSR) = (exp(model0$summary.random$ID$mean) - 1) * 100
The spatial random effects are on the log scale, so exponentiating and subtracting one expresses them as a percentage above or below the regional average rate. Plot the smoothed rate.
library("RColorBrewer")
range(values(rSR))
## [1] -64.70234 134.29750
rng = seq(-100, 150, 50)
rngL = paste(rng, '%', sep = "")
cr = rev(brewer.pal(8, "RdBu"))
cr = cr[-(1:2)]
p2 = levelplot(rSR, margin = TRUE,
sub = "Tornado Reports (Above/Below Regional Average) ",
xlab = NULL, ylab = NULL,
col.regions = cr, at = rng,
colorkey = list(at = rng, labels = rngL, col = cr),
par.settings = list(fontsize = list(text = 13)))
p2 +
latticeExtra::layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
latticeExtra::layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1))
Add covariates to the model
formula1 = nT ~ f(ID, model = "besag", graph = g) +
el + l2pop + tr
model1 = inla(formula = formula1, family = "nbinomial", E = area/10^6,
data = spdf@data)
summary(model1)
##
## Call:
## c("inla(formula = formula1, family = \"nbinomial\", data = spdf@data, ", " E = area/10^6)")
##
## Time used:
## Pre-processing Running inla Post-processing Total
## 0.1187 0.8853 0.0337 1.0377
##
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -3.4921 0.2533 -3.9956 -3.4902 -2.9993 -3.4864 0
## el -0.0002 0.0003 -0.0008 -0.0002 0.0005 -0.0002 0
## l2pop 0.0831 0.0329 0.0188 0.0830 0.1480 0.0828 0
## tr -0.0272 0.0034 -0.0339 -0.0272 -0.0204 -0.0272 0
##
## Random effects:
## Name Model
## ID Besags ICAR model
##
## Model hyperparameters:
## mean sd
## size for the nbinomial observations (overdispersion) 16.371 2.375
## Precision for ID 4.752 1.056
## 0.025quant 0.5quant
## size for the nbinomial observations (overdispersion) 12.147 16.223
## Precision for ID 3.069 4.616
## 0.975quant mode
## size for the nbinomial observations (overdispersion) 21.463 15.949
## Precision for ID 7.192 4.346
##
## Expected number of effective parameters(std dev): 151.01(25.48)
## Number of equivalent replicates : 4.45
##
## Marginal Likelihood: -2783.44
The 95% credible interval for the elevation coefficient includes zero, so elevation is not significant. Drop it from the model.
formula2 = nT ~ f(ID, model = "besag", graph = g) +
l2pop + tr
model2 = inla(formula = formula2, family = "nbinomial", E = area/10^6,
data = spdf@data)
summary(model2)
##
## Call:
## c("inla(formula = formula2, family = \"nbinomial\", data = spdf@data, ", " E = area/10^6)")
##
## Time used:
## Pre-processing Running inla Post-processing Total
## 0.1132 0.9698 0.0321 1.1151
##
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -3.6068 0.1015 -3.8079 -3.6063 -3.4088 -3.6053 0
## l2pop 0.0896 0.0303 0.0301 0.0895 0.1491 0.0895 0
## tr -0.0271 0.0034 -0.0338 -0.0271 -0.0204 -0.0271 0
##
## Random effects:
## Name Model
## ID Besags ICAR model
##
## Model hyperparameters:
## mean sd
## size for the nbinomial observations (overdispersion) 16.433 2.375
## Precision for ID 4.743 1.040
## 0.025quant 0.5quant
## size for the nbinomial observations (overdispersion) 12.200 16.290
## Precision for ID 3.088 4.609
## 0.975quant mode
## size for the nbinomial observations (overdispersion) 21.509 16.028
## Precision for ID 7.143 4.341
##
## Expected number of effective parameters(std dev): 150.76(25.17)
## Number of equivalent replicates : 4.457
##
## Marginal Likelihood: -2772.09
rSR = r
values(rSR) = (exp(model2$summary.random$ID$mean) - 1) * 100
range(values(rSR))
## [1] -41.40773 69.69852
rng = seq(-60, 80, 20)
rngL = paste(rng, '%', sep = "")
cr = rev(brewer.pal(8, "RdBu"))
cr = cr[-(1)]
p3 = levelplot(rSR, margin = TRUE,
sub = "Tornado Reports (Above/Below Regional Average) ",
xlab = NULL, ylab = NULL,
col.regions = cr, at = rng,
colorkey = list(at = rng, labels = rngL, col = cr),
par.settings = list(fontsize = list(text = 13)))
p3 +
latticeExtra::layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
latticeExtra::layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1)) +
latticeExtra::layer({SpatialPolygonsRescale(layout.north.arrow(type = 1),
offset = c(-95.5, 41.2),
scale = .75)})
# layer({SpatialPolygonsRescale(layout.scale.bar(),
# offset = c(-101.5, 41.8),
# scale = 1, fill = c("transparent", "black"))}) +
# layer(sp.text(loc = c(-101.5, 41.7), "0")) +
# layer(sp.text(loc = c(-100.1, 41.7), "100 km"))
A function for plotting the posterior density of a marginal from the INLA results, with vertical reference lines at specified values or quantiles.
ggplotmargin <- function(x, type, effect, xlab, ylab = "Posterior Density",
int.value = c(value = 0, 5, 95),
color = c("red", "gray", "gray")){
xx = as.data.frame(inla.smarginal(x[[paste("marginals", type, sep=".")]][[effect]]))
out = ggplot(xx, aes(x, y)) + geom_line(size = 1) + ylab(ylab) + xlab(xlab)
if(length(int.value) == 0) int.value = 0
int.value = lapply(int.value, function(x) if(is.character(x))
type.convert(x, as.is = TRUE) else x)
int.value = lapply(int.value, function(x) if(is.character(x))
lapply(strsplit(x, "=")[[1]], type.convert, as.is = TRUE) else x)
nx = names(int.value)
if(!is.null(nx))
for(i in which(nx != "")) int.value[[i]] = list(nx[i], int.value[[i]])
int.value = sapply(int.value, function(x) {
if(is.numeric(x)) xx$x[which.max(cumsum(xx$y)/sum(xx$y) >= as.numeric(x/100))]
else switch(x[[1]],
mean = sum(xx$y*xx$x)/sum(xx$y),
median = xx$x[which.max(cumsum(xx$y)/sum(xx$y) >=.5)],
mode = xx$x[which.max(xx$y)],
value = x[[2]],
zero = 0)})
if(length(color) <= length(int.value)) color = rep(color, length = length(int.value))
for(i in 1:length(int.value)) out = out + geom_vline(xintercept = int.value[i], color = color[i])
out
}
results = model2
results$marginals.fixed$tr[, 1] = (exp(-results$marginals.fixed$tr[, 1]) - 1) * 100
results$marginals.fixed$l2pop[, 1] = (exp(results$marginals.fixed$l2pop[, 1]) - 1) * 100
ggplotmargin(results, type = "fixed", effect = "tr",
xlab = "% decrease in tornado reports\n per meter increase in elevation roughness")
ggplotmargin(results, type = "fixed", effect = "l2pop",
xlab = "% increase in tornado reports\n per doubling of population")