Class Project: Spring 2015

Changing the domain

An ecological study of the relationship between tornado occurrence and terrain roughness

Ecological studies use statistics to examine factors that modify risk. Data are typically aggregated by area. Our application concerns factors that modify the risk of tornadoes.

Put tornado counts onto a raster grid

setwd("~/Dropbox")
download.file(url = "http://www.spc.noaa.gov/gis/svrgis/zipped/tornado.zip",
              destfile = "tornado.zip")
unzip("tornado.zip")
library("rgdal")
## Loading required package: sp
## rgdal: version: 0.9-1, (SVN revision 518)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.9.2, released 2012/10/08
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.1/Resources/library/rgdal/gdal
## Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
## Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.1/Resources/library/rgdal/proj
TornL = readOGR(dsn = ".", layer = "tornado", 
                stringsAsFactors = FALSE)
## OGR data source with driver: ESRI Shapefile 
## Source: ".", layer: "tornado"
## with 57988 features and 21 fields
## Feature type: wkbLineString with 2 dimensions
library("raster")
r = raster(xmn = -102, xmx = -95,
           ymn = 36, ymx = 42,
           resolution = .25)
sp = as(r, 'SpatialPolygons')
spT = spTransform(sp, CRS(proj4string(TornL)))

The CRS of the native spatial data is Lambert conic conformal (lcc) with units of meters. Here we create a raster by specifying a bounding box and a resolution of .25 degree. The domain is centered on Kansas but includes parts of Nebraska, Iowa, Oklahoma, and Texas. Change the raster to spatial polygons. This is used to perform the overlay of the tornado tracks. Convert the polygon grid CRS to the projection of the tornado tracks.

There are 24 cells north to south (rows) and 28 cells east to west (cols) for a total of 672. The CRS is geographic. Overlay the tornado tracks onto the spatial polygons with the over() function from the sp package. It works with various combinations of spatial objects but the rgeos package is needed for it to work with spatial polygons and spatial lines.

Remove duplicate tracks. Overlay tracks onto cells and return a vector indicating either NA (is not inside extent) or the cell number. Subset the tornadoes by the raster extent. Compute the track lengths and store in data frame with start and end lat/lon. Subset by duplicates of this data frame and paths exceeding 100 m.

library("rgeos")
## rgeos version: 0.3-8, (SVN revision 460)
##  GEOS runtime version: 3.3.3-CAPI-1.7.4 
##  Polygon checking: TRUE
tc = over(TornL, spT)
TornL2 = subset(TornL, !is.na(tc))
PathD = SpatialLinesLengths(TornL2)
df = data.frame(PathD, 
                SLAT = as.numeric(TornL2$SLAT), 
                SLON = as.numeric(TornL2$SLON), 
                ELAT = as.numeric(TornL2$ELAT), 
                ELON = as.numeric(TornL2$ELON),
                DATE = TornL2$DATE)
dup = duplicated(df)
TornL3 = subset(TornL2, !dup)
#TornL3 = subset(TornL3, as.numeric(YR) >= 1990)

Add a buffer to the tracks based on the width specified in the attribute table to create a spatial polygons data frame. Overlay the path polygons on the cell polygons. Add minimum widths to tornadoes without width.

Width = as.numeric(TornL3$WID) * .9144
Width[Width == 0] = min(Width[Width != 0])
TornP3 = gBuffer(TornL3, byid = TRUE, width = Width/2)
ct = over(spT, TornP3, returnList = TRUE)
nT = sapply(ct, function(x) length(x))
sumL = sapply(ct, function(x) sum(TornP3$Length[x]))
nt = r
values(nt) = nT

The result is a list. Each element of the list is a vector of row numbers from the attribute table corresponding to paths occurring in the cell. There are 672 cells. The order of the list matches the order of the raster (upper left to lower right in lexigraphical order). From the list we get the length of each vector. This is done with the sapply() (simple apply) where the first argument is the list object and the second argument is a function. These counts are then placed onto the original raster with the values() function. The GIS overlay operation uses projected coordinates. The analysis/display uses geographic coordinates.

Compare using lines only.

ct2 = over(spT, TornL3, returnList = TRUE)
nT2 = sapply(ct2, function(x) nrow(x))
(sum(nT) - sum(nT2))/sum(nT) * 100
## [1] 4.380215

There are 4.4% more tornado counts using tornado path compared to using tornado track.

Get elevation raster from a DEM

Digital elevation model data are available from http://www.viewfinderpanoramas.org/DEM/TIF15/ Get the elevation raster and crop it to the extent of the tornado raster r. Compute the elevation roughness with the terrain() function. Here roughness is the difference between the maximum and the minimum elevation in the cell and the eight surrounding cells. Match the resolution and origin of the elevation rasters with those of the tornado raster by degrading the resolution of elevation roughness. Compute the correlation between the number of tornadoes and surface roughness & elevation.

#download.file(url = "http://www.viewfinderpanoramas.org/DEM/TIF15/15-H.ZIP",
#              destfile = "15-H.ZIP", mode = "wb")
#unzip("15-H.ZIP")
download.file(url = "http://myweb.fsu.edu/jelsner/data/15-H.tif.zip",
              destfile = "15-H.tif.zip", mode = "wb")
unzip("15-H.tif.zip")
Elev = raster("15-H.tif")
Elev = crop(Elev, nt)
TR = terrain(Elev, opt = 'roughness')
el = resample(aggregate(Elev, fact = c(nrow(r), ncol(r)), fun = mean), r)
tr = resample(aggregate(TR, fact = c(nrow(r), ncol(r)), fun = mean), r)

Get a population raster

Gridded Population of the World (GPW), v3. http://sedac.ciesin.columbia.edu/data/set/gpw-v3-population-density/data-download Values are persons per square km. Download as grid and read as raster. Crop to extent of tornado raster and match the resolution of the tornado grid.

download.file(url = "http://myweb.fsu.edu/jelsner/data/usadens.zip",
              destfile = "usadens.zip")
unzip("usadens.zip")
Pop = raster("usadens/usads00g/w001001.adf")
Pop = crop(Pop, r)
pop = resample(aggregate(Pop, fact = c(nrow(r), ncol(r)), fun = mean), r)

Exploratory analysis

Scatter plots

library("ggplot2")
df = data.frame(nT = values(nt),
                el = values(el),
                tr = values(tr),
                pop = values(pop))
ggplot(df, aes(x = tr, y = nT)) +
  geom_point() +
  geom_smooth() +
  ylab("Number of Tornadoes") +
  xlab("Terrain Roughness (m)")
## geom_smooth: method="auto" and size of largest group is <1000, so using loess. Use 'method = x' to change the smoothing method.

ggplot(df, aes(x = log2(pop), y = nT)) +
  geom_point() +
  geom_smooth(method = lm) +
  ylab(expression(paste("Number of Central Plains Tornadoes (1990-2013), .25", degree, "resolution"))) +
  xlab(expression(paste("Population density (people per ", km^2, "), in 2000"))) +
  scale_x_continuous(breaks = c(1, 2, 4, 8),
                     labels = c(1, 2, 4, 8))

ggplot(df, aes(x = el, y = nT)) +
  geom_point() +
  geom_smooth(method = lm) +
  ylab("Number of Tornadoes") +
  xlab("Elevation (m)")

Histogram of the number of tornadoes

ggplot(df, aes(x = nT)) + 
  geom_histogram(binwidth = 2, color = "white") +
  ylab("Number of cells") +
  xlab("Number of tornadoes")

mean(df$nT)
## [1] 14.98214
var(df$nT)/mean(df$nT)
## [1] 3.651618

The mean number of tornadoes is 15.1 (tornadoes/.25 deg/64 years). The ratio of the variance to the mean is 3.76. The cell counts are more dispersed than a Poisson distribution.

Get county & state boundaries. Get the tornado tracks. Convert to the geographic coordinates of the raster.

library("mapproj")
## Loading required package: maps
library("maptools")
## Checking rgeos availability: TRUE
ext = as.vector(extent(r))
bndryC = map("county", fill = TRUE,
            xlim = ext[1:2],
            ylim = ext[3:4],
            plot = FALSE)
IDs = sapply(strsplit(bndryC$names, ":"), function(x) x[1])
bndryCP = map2SpatialPolygons(bndryC, IDs = IDs,
                              proj4string = CRS(projection(r)))
bndryS = map("state", fill = TRUE,
            xlim = ext[1:2],
            ylim = ext[3:4],
            plot = FALSE)
IDs = sapply(strsplit(bndryS$names, ":"), function(x) x[1])
bndrySP = map2SpatialPolygons(bndryS, IDs = IDs,
                              proj4string = CRS(projection(r)))
TornP3T = spTransform(TornP3, CRS(projection(r)))

Create the map

library("rasterVis")
## Loading required package: lattice
## Loading required package: latticeExtra
## Loading required package: RColorBrewer
## 
## Attaching package: 'latticeExtra'
## 
## The following object is masked from 'package:ggplot2':
## 
##     layer
## 
## Loading required package: hexbin
library("wesanderson")
range(values(nt))
## [1]  1 49
rng = seq(0, 50, 5)
cr = wes_palette(10, name = "Zissou", type = "continuous")
p1 = levelplot(nt, margin = TRUE, 
          sub = "Tornado Counts (1950-2013)", 
          xlab = NULL, ylab = NULL, 
          col.regions = cr, at = rng, 
          colorkey = list(space = 'bottom'),
          par.settings = list(fontsize = list(text = 15)))
p1 + 
  latticeExtra::layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
  latticeExtra::layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1)) +
  latticeExtra::layer(sp.polygons(TornP3T, fill = gray(.4), 
                                  col = gray(.4), alpha = .3))

Convert raster layers to polygons

Convert raster layers to polygons. Change the attribute name to nT in the spatial polygons data frame and add the elevation roughness as another column.

spdf = as(nt, "SpatialPolygonsDataFrame")
names(spdf) = "nT"
spdf$el = values(el)
spdf$tr = values(tr)
spdf$pop = values(pop)
spdfT = spTransform(spdf, CRS(proj4string(TornP3)))
spdf$area = gArea(spdfT, byid = TRUE)
spdf$l2pop = log2(spdf$pop)
spdf$ID = 1:ncell(nt)
cor(spdf$el, spdf$tr)
## [1] -0.07755983
cor(spdf$el, spdf$pop)
## [1] -0.5615524
cor(spdf$tr, spdf$pop)
## [1] 0.1249989

Spatial model

Some controls for INLA.

#source("http://www.math.ntnu.no/inla/givemeINLA.R")
library("INLA")
## Loading required package: Matrix
## Loading required package: splines

Spatial neighborhood definition as an inla graph

library("spdep")
nb = poly2nb(spdf)
nb2INLA("g", nb)
g = inla.read.graph("g")

Model for the smoothed relative rate.

formula0 = nT ~ f(ID, model = "besag", graph = g)
model0  =  inla(formula0, family = "nbinomial", E = area/10^6,
                data = spdf@data)
summary(model0)
## 
## Call:
## c("inla(formula = formula0, family = \"nbinomial\", data = spdf@data, ",  "    E = area/10^6)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.3330          3.2675          0.0798          3.6803 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant   mode kld
## (Intercept) -3.7458 0.0145    -3.7741  -3.7459    -3.7171 -3.746   0
## 
## Random effects:
## Name   Model
##  ID   Besags ICAR model 
## 
## Model hyperparameters:
##                                                        mean    sd
## size for the nbinomial observations (overdispersion) 17.672 2.656
## Precision for ID                                      2.782 0.426
##                                                      0.025quant 0.5quant
## size for the nbinomial observations (overdispersion)     12.994   17.489
## Precision for ID                                          2.046    2.748
##                                                      0.975quant   mode
## size for the nbinomial observations (overdispersion)     23.416 17.140
## Precision for ID                                          3.713  2.678
## 
## Expected number of effective parameters(std dev): 210.17(24.11)
## Number of equivalent replicates : 3.197 
## 
## Marginal Likelihood:  -2789.87
rSR = r
values(rSR) = (exp(model0$summary.random$ID$mean) - 1) * 100

Plot the smoothed rate.

library("RColorBrewer")
range(values(rSR))
## [1] -64.70207 134.29447
rng = seq(-100, 150, 50)
rngL = paste(rng, '%', sep = "")
cr = rev(brewer.pal(8, "RdBu"))
cr = cr[-(1:2)]
p2 = levelplot(rSR, margin = TRUE, 
          sub = "Tornado Reports (Above/Below Regional Average)             ",
          xlab = NULL, ylab = NULL, 
          col.regions = cr, at = rng,
          colorkey = list(at = rng, labels = rngL, col = cr),
          par.settings = list(fontsize = list(text = 13)))
p2 + 
  latticeExtra::layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
  latticeExtra::layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1))

Add covariates to the model

formula1 = nT ~ f(ID, model = "besag", graph = g) + 
                el + l2pop + tr
model1 = inla(formula = formula1, family = "nbinomial", E = area/10^6,
             data = spdf@data)
summary(model1)
## 
## Call:
## c("inla(formula = formula1, family = \"nbinomial\", data = spdf@data, ",  "    E = area/10^6)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.2847          2.7276          0.0697          3.0819 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -3.4921 0.2533    -3.9955  -3.4902    -2.9993 -3.4864   0
## el          -0.0002 0.0003    -0.0008  -0.0002     0.0005 -0.0002   0
## l2pop        0.0831 0.0329     0.0188   0.0830     0.1480  0.0828   0
## tr          -0.0272 0.0034    -0.0339  -0.0272    -0.0204 -0.0272   0
## 
## Random effects:
## Name   Model
##  ID   Besags ICAR model 
## 
## Model hyperparameters:
##                                                        mean    sd
## size for the nbinomial observations (overdispersion) 16.370 2.377
## Precision for ID                                      4.752 1.056
##                                                      0.025quant 0.5quant
## size for the nbinomial observations (overdispersion)      12.14   16.222
## Precision for ID                                           3.07    4.616
##                                                      0.975quant   mode
## size for the nbinomial observations (overdispersion)     21.464 15.948
## Precision for ID                                          7.193  4.346
## 
## Expected number of effective parameters(std dev): 151.00(25.48)
## Number of equivalent replicates : 4.45 
## 
## Marginal Likelihood:  -2783.43

Elevation is not significant. Drop it from the model.

formula2 = nT ~ f(ID, model = "besag", graph = g) + 
                l2pop + tr
model2 = inla(formula = formula2, family = "nbinomial", E = area/10^6,
             data = spdf@data)
summary(model2)
## 
## Call:
## c("inla(formula = formula2, family = \"nbinomial\", data = spdf@data, ",  "    E = area/10^6)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.2583          3.1393          0.0668          3.4644 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -3.6068 0.1015    -3.8079  -3.6063    -3.4088 -3.6053   0
## l2pop        0.0896 0.0303     0.0301   0.0895     0.1491  0.0895   0
## tr          -0.0271 0.0034    -0.0338  -0.0271    -0.0204 -0.0271   0
## 
## Random effects:
## Name   Model
##  ID   Besags ICAR model 
## 
## Model hyperparameters:
##                                                        mean    sd
## size for the nbinomial observations (overdispersion) 16.432 2.376
## Precision for ID                                      4.743 1.040
##                                                      0.025quant 0.5quant
## size for the nbinomial observations (overdispersion)     12.196   16.289
## Precision for ID                                          3.089    4.609
##                                                      0.975quant   mode
## size for the nbinomial observations (overdispersion)     21.510 16.027
## Precision for ID                                          7.144  4.341
## 
## Expected number of effective parameters(std dev): 150.75(25.16)
## Number of equivalent replicates : 4.458 
## 
## Marginal Likelihood:  -2772.08
rSR = r
values(rSR) = (exp(model2$summary.random$ID$mean) - 1) * 100
range(values(rSR))
## [1] -41.40614  69.69385
rng = seq(-60, 80, 20)
rngL = paste(rng, '%', sep = "")
cr = rev(brewer.pal(8, "RdBu"))
cr = cr[-(1)]
p3 = levelplot(rSR, margin = TRUE, 
          sub = "Tornado Reports (Above/Below Regional Average)             ",
          xlab = NULL, ylab = NULL, 
          col.regions = cr, at = rng,
          colorkey = list(at = rng, labels = rngL, col = cr),
          par.settings = list(fontsize = list(text = 13)))

p3 + 
  latticeExtra::layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
  latticeExtra::layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1)) +
  latticeExtra::layer({SpatialPolygonsRescale(layout.north.arrow(type = 1), 
             offset = c(-95.5, 44.0), 
             scale = .75)})

#  layer({SpatialPolygonsRescale(layout.scale.bar(),
#                                offset = c(-101.5, 41.8),
#                                scale = 1, fill = c("transparent", "black"))}) +
#  layer(sp.text(loc = c(-101.5, 41.7), "0")) +
#  layer(sp.text(loc = c(-100.1, 41.7), "100 km"))

Function for plotting the density of the marginal term.

ggplotmargin <- function(x, type, effect, xlab, ylab = "Posterior Density",
                         int.value = c(value = 0, 5, 95),
                         color = c("red", "gray", "gray")){
xx = as.data.frame(inla.smarginal(x[[paste("marginals", type, sep=".")]][[effect]]))
  out = ggplot(xx, aes(x, y)) + geom_line(size = 1) + ylab(ylab) + xlab(xlab)    
if(length(int.value) == 0) int.value = 0
int.value = lapply(int.value, function(x) if(is.character(x)) 
  type.convert(x, as.is = TRUE) else x)
int.value = lapply(int.value, function(x) if(is.character(x)) 
  lapply(strsplit(x, "=")[[1]], type.convert, as.is = TRUE) else x)
nx = names(int.value)
if(!is.null(nx))
   for(i in which(nx != ""))  int.value[[i]] = list(nx[i], int.value[[i]])
    int.value = sapply(int.value, function(x) {
                      if(is.numeric(x)) xx$x[which.max(cumsum(xx$y)/sum(xx$y) >= as.numeric(x/100))]
                      else switch(x[[1]], 
                      mean = sum(xx$y*xx$x)/sum(xx$y), 
                      median = xx$x[which.max(cumsum(xx$y)/sum(xx$y) >=.5)],
                      mode = xx$x[which.max(xx$y)],
                      value = x[[2]],
                      zero = 0)})

if(length(color) <= length(int.value)) color = rep(color, length = length(int.value))
for(i in 1:length(int.value)) out = out + geom_vline(xintercept = int.value[i], color = color[i]) 
out
}
results = model2
results$marginals.fixed$tr[, 1] = (exp(-results$marginals.fixed$tr[, 1]) - 1) * 100
results$marginals.fixed$l2pop[, 1] = (exp(results$marginals.fixed$l2pop[, 1]) - 1) * 100

ggplotmargin(results, type = "fixed", effect = "tr", 
             xlab = "% decrease in tornado reports\n per meter increase in elevation roughness")

ggplotmargin(results, type = "fixed", effect = "l2pop", 
             xlab = "% increase in tornado reports\n per doubling of population")

Expand the boundry Start by creating a new raster and compiling a list.*

#download.file(url = "http://www.spc.noaa.gov/gis/svrgis/zipped/tornado.zip",
#              destfile = "tornado.zip")
#unzip("tornado.zip")

#library("rgdal")
#TornL = readOGR(dsn = ".", layer = "tornado", 
#                stringsAsFactors = FALSE)
#library("raster")
r2 = raster(xmn = -102, xmx = -95,
           ymn = 30, ymx = 45,
           resolution = .25)
sp = as(r2, 'SpatialPolygons')
spT = spTransform(sp, CRS(proj4string(TornL)))

TornL = subset(TornL, !duplicated(TornL@data))
library("rgeos")
tc = over(TornL, spT)
TornL2 = subset(TornL, !is.na(tc))
PathD = SpatialLinesLengths(TornL2)
df = data.frame(PathD, 
                SLAT = as.numeric(TornL2$SLAT), 
                SLON = as.numeric(TornL2$SLON), 
                ELAT = as.numeric(TornL2$ELAT), 
                ELON = as.numeric(TornL2$ELON))
dup = duplicated(df)
dis = PathD > 100
both = dup & dis
TornL3 = subset(TornL2, !both)
#TornL3 = subset(TornL3, as.numeric(YR) >= 1990)

Width = as.numeric(TornL3$WID) * .9144
Width[Width == 0] = min(Width[Width != 0])
TornP3 = gBuffer(TornL3, byid = TRUE, width = Width/2)
ct = over(spT, TornP3, returnList = TRUE)
nT = sapply(ct, function(x) length(x))
sumL = sapply(ct, function(x) sum(TornP3$Length[x]))
nt = r2
values(nt) = nT

Comapre lines

ct2 = over(spT, TornL3, returnList = TRUE)
nT2 = sapply(ct2, function(x) nrow(x))
(sum(nT) - sum(nT2))/sum(nT) * 100
## [1] 4.941366

There are 4.9% more tornado counts using tornado path compared to using tornado track.

Get elevation data

#download.file(url = "http://myweb.fsu.edu/jelsner/data/15-H.tif.zip",
#              destfile = "15-H.tif.zip", mode = "wb")
#unzip("15-H.tif.zip")
Elev = raster("15-H.tif")
Elev = crop(Elev, nt)
TR = terrain(Elev, opt = 'roughness')
tr = resample(TR, nt)
el = resample(Elev, nt)

Get population data raster

#download.file(url = "http://myweb.fsu.edu/jelsner/data/usadens.zip",
#              destfile = "usadens.zip")
#unzip("usadens.zip")
Pop = raster("usadens/usads00g/w001001.adf")
Pop = crop(Pop, r2)
pop = resample(Pop, r2)

Exploratory analysis

Scatterplots

library("ggplot2")
df = data.frame(nT = values(nt),
                el = values(el),
                tr = values(tr),
                pop = values(pop))
ggplot(df, aes(x = tr, y = nT)) +
  geom_point() +
  geom_smooth() +
  ylab("Number of Tornadoes") +
  xlab("Terrain Roughness (m)")
## geom_smooth: method="auto" and size of largest group is >=1000, so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.

ggplot(df, aes(x = log2(pop), y = nT)) +
  geom_point() +
  geom_smooth(method = lm) +
  ylab(expression(paste("Number of Central Plains Tornadoes (1950-2013), .25", degree, "resolution"))) +
  xlab(expression(paste("Population density (people per ", km^2, "), in 2000"))) +
  scale_x_continuous(breaks = c(1, 2, 4, 8),
                     labels = c(1, 2, 4, 8))

ggplot(df, aes(x = el, y = nT)) +
  geom_point() +
  geom_smooth(method = lm) +
  ylab("Number of Tornadoes") +
  xlab("Elevation (m)")

Histogram of tornado counts

ggplot(df, aes(x = nT)) + 
  geom_histogram(binwidth = 2, color = "white") +
  ylab("Number of cells") +
  xlab("Number of tornadoes")

mean(df$nT)
## [1] 13.29881
var(df$nT)/mean(df$nT)
## [1] 5.427991

There are more tornadoes when the boundary is expanded, but the distribution is a similar shape to the previous histogram of tornado counts.

County and State boundaries

library("mapproj")
library("maptools")
ext = as.vector(extent(r2))
bndryC = map("county", fill = TRUE,
            xlim = ext[1:2],
            ylim = ext[3:4],
            plot = FALSE)
IDs = sapply(strsplit(bndryC$names, ":"), function(x) x[1])
bndryCP = map2SpatialPolygons(bndryC, IDs = IDs,
                              proj4string = CRS(projection(r2)))
bndryS = map("state", fill = TRUE,
            xlim = ext[1:2],
            ylim = ext[3:4],
            plot = FALSE)
IDs = sapply(strsplit(bndryS$names, ":"), function(x) x[1])
bndrySP = map2SpatialPolygons(bndryS, IDs = IDs,
                              proj4string = CRS(projection(r2)))
TornP3T = spTransform(TornP3, CRS(projection(r2)))

Create the map

library("rasterVis")
library("wesanderson")
range(values(nt))
## [1]  0 70
#rng = seq(0, 30, 3)
rng = seq(0, 70, 7)
cr = wes_palette(10, name = "Zissou", type = "continuous")
p1 = levelplot(nt, margin = TRUE, 
          sub = "Tornado Counts (1950-2013)", 
          xlab = NULL, ylab = NULL, 
          col.regions = cr, at = rng, 
          colorkey = list(space = 'bottom'),
          par.settings = list(fontsize = list(text = 15)))
p1 + 
  layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
  layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1)) +
  layer(sp.polygons(TornP3T, fill = gray(.4), col = gray(.4), alpha = .3))

try to convert raster layers to polygons and account for population density

(start at line 182)

Convert raster layers to polygons. Change the attribute name to nT in the spatial polygons data frame and add the elevation roughness as another column.

spdf = as(nt, "SpatialPolygonsDataFrame")
names(spdf) = "nT"
spdf$el = values(el)
spdf$tr = values(tr)
spdf$pop = values(pop)
spdfT = spTransform(spdf, CRS(proj4string(TornP3)))
spdf$area = gArea(spdfT, byid = TRUE)
spdf$l2pop = log2(spdf$pop)
spdf$ID = 1:ncell(nt)
cor(spdf$el, spdf$tr)
## [1] 0.0336101
cor(spdf$el, spdf$pop)
## [1] -0.2211108
cor(spdf$tr, spdf$pop)
## [1] -0.04658093

Spatial model

Some controls for INLA.

#source("http://www.math.ntnu.no/inla/givemeINLA.R")
library("INLA")

Spatial neighborhood definition as an inla graph

library("spdep")
nb = poly2nb(spdf)
nb2INLA("g", nb)
g = inla.read.graph("g")

Model for the smoothed relative rate.

formula0 = nT ~ f(ID, model = "besag", graph = g)
model0  =  inla(formula0, family = "nbinomial", E = area/10^6,
                data = spdf@data)
summary(model0)
## 
## Call:
## c("inla(formula = formula0, family = \"nbinomial\", data = spdf@data, ",  "    E = area/10^6)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.3358          9.8791          0.1135         10.3284 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -3.9923 0.0101     -4.012  -3.9923    -3.9725 -3.9923   0
## 
## Random effects:
## Name   Model
##  ID   Besags ICAR model 
## 
## Model hyperparameters:
##                                                        mean     sd
## size for the nbinomial observations (overdispersion) 23.017 3.0943
## Precision for ID                                      1.539 0.1193
##                                                      0.025quant 0.5quant
## size for the nbinomial observations (overdispersion)     17.647   22.764
## Precision for ID                                          1.318    1.535
##                                                      0.975quant   mode
## size for the nbinomial observations (overdispersion)     29.785 22.232
## Precision for ID                                          1.786  1.526
## 
## Expected number of effective parameters(std dev): 702.71(37.98)
## Number of equivalent replicates : 2.391 
## 
## Marginal Likelihood:  -6833.78
#rSR = r
rSR = r2
values(rSR) = (exp(model0$summary.random$ID$mean) - 1) * 100

Plot the smoothed rate.

library("RColorBrewer")
range(values(rSR))
## [1] -93.18267 362.58753
#rng = seq(-100, 150, 50)
#rng = seq(-100, 350, 75)
rng = seq(-100, 400, 100)
rngL = paste(rng, '%', sep = "")
cr = rev(brewer.pal(8, "RdBu"))
cr = cr[-(1:2)]
p2 = levelplot(rSR, margin = TRUE, 
          sub = "Tornado Reports (Above/Below Regional Average)             ",
          xlab = NULL, ylab = NULL, 
          col.regions = cr, at = rng,
          colorkey = list(at = rng, labels = rngL, col = cr),
          par.settings = list(fontsize = list(text = 13)))
p2 + 
  latticeExtra::layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
  latticeExtra::layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1))

Add covariates to the model

formula1 = nT ~ f(ID, model = "besag", graph = g) + 
                el + l2pop + tr
model1 = inla(formula = formula1, family = "nbinomial", E = area/10^6,
             data = spdf@data)
summary(model1)
## 
## Call:
## c("inla(formula = formula1, family = \"nbinomial\", data = spdf@data, ",  "    E = area/10^6)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.3738         11.1716          0.1284         11.6738 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -4.2809 0.1263    -4.5298  -4.2805    -4.0339 -4.2799   0
## el           0.0005 0.0002     0.0001   0.0005     0.0010  0.0005   0
## l2pop        0.1521 0.0085     0.1353   0.1521     0.1688  0.1521   0
## tr          -0.0199 0.0025    -0.0248  -0.0199    -0.0150 -0.0199   0
## 
## Random effects:
## Name   Model
##  ID   Besags ICAR model 
## 
## Model hyperparameters:
##                                                        mean     sd
## size for the nbinomial observations (overdispersion) 23.993 2.9644
## Precision for ID                                      2.824 0.2859
##                                                      0.025quant 0.5quant
## size for the nbinomial observations (overdispersion)     18.732    23.80
## Precision for ID                                          2.304     2.81
##                                                      0.975quant   mode
## size for the nbinomial observations (overdispersion)     30.354 23.408
## Precision for ID                                          3.426  2.783
## 
## Expected number of effective parameters(std dev): 503.91(38.48)
## Number of equivalent replicates : 3.334 
## 
## Marginal Likelihood:  -6671.27

Elevation is significant. Keep it in the model.

#rSR = r
rSR = r2
#values(rSR) = (exp(model2$summary.random$ID$mean) - 1) * 100
values(rSR) = (exp(model1$summary.random$ID$mean) - 1) * 100
range(values(rSR))
## [1] -81.32065 173.53160
#rng = seq(-60, 80, 20)
rng = seq(-100, 200, 50)
rngL = paste(rng, '%', sep = "")
cr = rev(brewer.pal(8, "RdBu"))
cr = cr[-(1)]
p3 = levelplot(rSR, margin = TRUE, 
          sub = "Tornado Reports (Above/Below Regional Average)             ",
          xlab = NULL, ylab = NULL, 
          col.regions = cr, at = rng,
          colorkey = list(at = rng, labels = rngL, col = cr),
          par.settings = list(fontsize = list(text = 13)))

p3 + 
  latticeExtra::layer(sp.polygons(bndryCP, col = gray(.85), lwd = 1)) +
  latticeExtra::layer(sp.polygons(bndrySP, col = gray(.15), lwd = 1)) +
  latticeExtra::layer({SpatialPolygonsRescale(layout.north.arrow(type = 1), 
             offset = c(-95.5, 44.0), 
             scale = .75)})

#  layer({SpatialPolygonsRescale(layout.scale.bar(),
#                                offset = c(-101.5, 41.8),
#                                scale = 1, fill = c("transparent", "black"))}) +
#  layer(sp.text(loc = c(-101.5, 41.7), "0")) +
#  layer(sp.text(loc = c(-100.1, 41.7), "100 km"))

Function for plotting the density of the marginal term.

ggplotmargin <- function(x, type, effect, xlab, ylab = "Posterior Density",
                         int.value = c(value = 0, 5, 95),
                         color = c("red", "gray", "gray")){
xx = as.data.frame(inla.smarginal(x[[paste("marginals", type, sep=".")]][[effect]]))
  out = ggplot(xx, aes(x, y)) + geom_line(size = 1) + ylab(ylab) + xlab(xlab)    
if(length(int.value) == 0) int.value = 0
int.value = lapply(int.value, function(x) if(is.character(x)) 
  type.convert(x, as.is = TRUE) else x)
int.value = lapply(int.value, function(x) if(is.character(x)) 
  lapply(strsplit(x, "=")[[1]], type.convert, as.is = TRUE) else x)
nx = names(int.value)
if(!is.null(nx))
   for(i in which(nx != ""))  int.value[[i]] = list(nx[i], int.value[[i]])
    int.value = sapply(int.value, function(x) {
                      if(is.numeric(x)) xx$x[which.max(cumsum(xx$y)/sum(xx$y) >= as.numeric(x/100))]
                      else switch(x[[1]], 
                      mean = sum(xx$y*xx$x)/sum(xx$y), 
                      median = xx$x[which.max(cumsum(xx$y)/sum(xx$y) >=.5)],
                      mode = xx$x[which.max(xx$y)],
                      value = x[[2]],
                      zero = 0)})

if(length(color) <= length(int.value)) color = rep(color, length = length(int.value))
for(i in 1:length(int.value)) out = out + geom_vline(xintercept = int.value[i], color = color[i]) 
out
}

Plotting the marginal terms

results = model1
results$marginals.fixed$tr[, 1] = (exp(-results$marginals.fixed$tr[, 1]) - 1) * 100
results$marginals.fixed$l2pop[, 1] = (exp(results$marginals.fixed$l2pop[, 1]) - 1) * 100
results$marginals.fixed$el[, 1] = (exp(results$marginals.fixed$el[, 1]) - 1) * 100

ggplotmargin(results, type = "fixed", effect = "tr", 
             xlab = "% decrease in tornado reports\n per meter increase in elevation roughness")

ggplotmargin(results, type = "fixed", effect = "l2pop", 
             xlab = "% increase in tornado reports\n per doubling of population")

ggplotmargin(results, type = "fixed", effect = "el", 
             xlab = "% increase in tornado reports\n per meter increase in elevation")

There is a difference in marginal values from the smaller domain to the larger domain.

With the smaller domain, we saw around a 2.5% decrease in tornado reports per meter increase in terrain roughness and around a 10% increase in tornado reports per doubling of population.

With the larger domain, we see a 2% decrease in tornado reports per meter increase in terrain roughness, around a 17% increase in tornado reports per doublung of population, and a 0.05% (negligible) increase in tornado reports per meter increase in elevation.