
The INLA Approach to Bayesian models

The Integrated Nested Laplace Approximation, or INLA, approach is a recently developed, computationally simpler method for fitting Bayesian models (Rue et al., 2009), compared to traditional Markov Chain Monte Carlo (MCMC) approaches. INLA fits models that are classified as latent Gaussian models, which are applicable in many settings (Martino & Rue, 2010). In general, INLA fits additive models of the form:

$$\eta_i = \alpha + \sum_{j=1}^{n_f} f^{(j)}(u_{ij}) + \sum_{k=1}^{n_\beta} \beta_k z_{ki} + \epsilon_i$$

where $\eta_i$ is the linear predictor for a generalized linear model, composed of smooth functions $f^{(j)}$ of some variables $u$, the $\beta_k$ are the effects of covariates $z$, and $\epsilon_i$ is an unstructured residual (Rue et al., 2009). As this model is often parameterized as a Bayesian one, we are interested in the posterior marginal distributions of all the model parameters. Rue and Martino (2007) show that the posterior marginal for the random effects ($x$) in such models can be approximated as:

$$\tilde{p}(x_i|y) = \sum_k \tilde{p}(x_i|\theta_k, y)\, \tilde{p}(\theta_k|y)\, \Delta_k$$

via numerical integration (Rue & Martino, 2007; Schrodle & Held, 2011a, 2011b). The posterior distribution of the hyperparameters ($\theta$) of the model can also be approximated as:

$$\tilde{p}(\theta|y) \propto \left. \frac{p(x, \theta, y)}{\tilde{p}_G(x|\theta, y)} \right|_{x = x^*(\theta)}$$

where $\tilde{p}_G$ is a Gaussian approximation of the conditional distribution of the latent field, and $x^*(\theta)$ is the mode of the conditional distribution $p(x|\theta,y)$. Thus, instead of using MCMC to obtain an iterative, sampling-based estimate of the posterior, the posterior is arrived at numerically. This method of fitting the spatial models specified above has been presented by numerous authors (Blangiardo & Cameletti, 2015; Blangiardo et al., 2013; Lindgren & Rue, 2015; Martins et al., 2013; Schrodle & Held, 2011a, 2011b), with results comparable to MCMC.

Libraries

#library(rgdal)
library(spdep)
## Loading required package: sp
## Loading required package: Matrix
## Loading required package: spData
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source'))`
library(RColorBrewer)
library(lattice)
library(INLA)
## This is INLA_18.07.12 built 2018-07-12 11:05:18 UTC.
## See www.r-inla.org/contact-us for how to get help.
library(tigris)
## To enable 
## caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
## 
## Attaching package: 'tigris'
## The following object is masked from 'package:graphics':
## 
##     plot
library(tidycensus)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Data

I have the data on my GitHub site under the nhgis_vs page. These are data from the NHGIS project by IPUMS, which has begun providing birth and death data from the US Vital Statistics program.

The data we will use here are infant mortality rates in US counties between 2000 and 2007.

files<-list.files("~/Google Drive/classes/dem7473/data/nhgis0022_csv/nhgis0022_csv/", pattern = "*.csv", full.names = T)
vital<-lapply(files, read.csv, header=T)


library(plyr)
## -------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## -------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
df <- ldply(vital, data.frame)
df$cofips<-paste(substr(df$GISJOIN, 2,3), substr(df$GISJOIN, 5,7), sep="")

df<-df%>%
  filter(YEAR %in%2000:2007)%>%
  mutate(rate=as.numeric(AGWJ001) )%>%
  select(YEAR, cofips,rate)
head(df)
##   YEAR cofips rate
## 1 2000  01001   34
## 2 2000  01003   61
## 3 2000  01005  125
## 4 2000  01007   70
## 5 2000  01009   89
## 6 2000  01011  242

Census intercensal population estimates

From the Census Bureau's Population Estimates Program.

popurl<-url("http://www2.census.gov/programs-surveys/popest/datasets/2000-2010/intercensal/county/co-est00int-tot.csv")
pops<-read.csv(popurl)
names(pops)<-tolower(names(pops))
pops<-pops%>%
  mutate(cofips = paste(sprintf(fmt = "%02d", state), sprintf(fmt = "%03d",county), sep=""))%>%
  filter(sumlev==50, !state%in%c(2, 15))

head(pops)
##   sumlev region division state county  stname        ctyname
## 1     50      3        6     1      1 Alabama Autauga County
## 2     50      3        6     1      3 Alabama Baldwin County
## 3     50      3        6     1      5 Alabama Barbour County
## 4     50      3        6     1      7 Alabama    Bibb County
## 5     50      3        6     1      9 Alabama  Blount County
## 6     50      3        6     1     11 Alabama Bullock County
##   estimatesbase2000 popestimate2000 popestimate2001 popestimate2002
## 1             43751           44021           44889           45909
## 2            140416          141342          144875          147957
## 3             29042           29015           28863           28653
## 4             19856           19913           21028           21199
## 5             50982           51107           51845           52551
## 6             11603           11581           11358           11256
##   popestimate2003 popestimate2004 popestimate2005 popestimate2006
## 1           46800           48366           49676           51328
## 2          151509          156266          162183          168121
## 3           28594           28287           28027           27861
## 4           21399           21721           22042           22099
## 5           53457           54124           54624           55485
## 6           11316           11056           11011           10776
##   popestimate2007 popestimate2008 popestimate2009 census2010pop
## 1           52405           53277           54135         54571
## 2          172404          175827          179406        182265
## 3           27757           27808           27657         27457
## 4           22438           22705           22941         22915
## 5           56240           57055           57341         57322
## 6           11011           10953           10987         10914
##   popestimate2010 cofips
## 1           54632  01001
## 2          183195  01003
## 3           27411  01005
## 4           22867  01007
## 5           57338  01009
## 6           10890  01011

Data prep

pops.long<-reshape(data = pops, idvar = "cofips", varying = list(names(pops)[9:16]), direction="long", drop = names(pops)[c(2,3,4,5,6,8,17,18,19,20)], v.names = "population")
pops.long$year<-pops.long$time+1999
head(pops.long)
##         sumlev        ctyname cofips time population year
## 01001.1     50 Autauga County  01001    1      44021 2000
## 01003.1     50 Baldwin County  01003    1     141342 2000
## 01005.1     50 Barbour County  01005    1      29015 2000
## 01007.1     50    Bibb County  01007    1      19913 2000
## 01009.1     50  Blount County  01009    1      51107 2000
## 01011.1     50 Bullock County  01011    1      11581 2000
dat.long<-merge(pops.long, df, by.x=c("cofips", "year"), by.y=c("cofips", "YEAR"))
head(dat.long)
##   cofips year sumlev        ctyname time population rate
## 1  01001 2000     50 Autauga County    1      44021   34
## 2  01001 2001     50 Autauga County    2      44889   78
## 3  01001 2002     50 Autauga County    3      45909   83
## 4  01001 2003     50 Autauga County    4      46800   79
## 5  01001 2004     50 Autauga County    5      48366   76
## 6  01001 2005     50 Autauga County    6      49676  124

Get census data using tidycensus

Here I get data from the 2000 decennial census, Summary File 3.

#v00<-load_variables(year=2000, dataset = "sf3", cache = T)
cov_dat<-get_decennial(geography = "county", year = 2000, sumfile = "sf3",
                       summary_var = "P001001",
                       variables = c("P007003", "P007004","P007010","P053001", "P089001", "P089002" ),
                      output = "wide")
## Getting data from the 2000 decennial Census
cov_dat<-cov_dat%>%
  mutate(cofips=GEOID,pwhite=P007003/summary_value, pblack=P007004/summary_value, phisp=P007010/summary_value,medhhinc=as.numeric(scale(P053001)), ppov=P089002/P089001)


final.dat<-merge(dat.long, cov_dat, by="cofips")
head(final.dat)
##   cofips year sumlev        ctyname time population rate GEOID
## 1  01001 2006     50 Autauga County    7      51328   93 01001
## 2  01001 2003     50 Autauga County    4      46800   79 01001
## 3  01001 2004     50 Autauga County    5      48366   76 01001
## 4  01001 2005     50 Autauga County    6      49676  124 01001
## 5  01001 2000     50 Autauga County    1      44021   34 01001
## 6  01001 2007     50 Autauga County    8      52405   83 01001
##             NAME P007003 P007004 P007010 P053001 P089001 P089002
## 1 Autauga County   34760    7450     394   42013   43377    4738
## 2 Autauga County   34760    7450     394   42013   43377    4738
## 3 Autauga County   34760    7450     394   42013   43377    4738
## 4 Autauga County   34760    7450     394   42013   43377    4738
## 5 Autauga County   34760    7450     394   42013   43377    4738
## 6 Autauga County   34760    7450     394   42013   43377    4738
##   summary_value    pwhite    pblack       phisp  medhhinc      ppov
## 1         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 2         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 3         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 4         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 5         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 6         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284

Create expected numbers of cases

In count data models, and in spatial epidemiology generally, we have to express the raw counts of events relative to some expected value, or population offset; see this Rpub for a reminder.
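As a quick refresher of that logic, here is a toy sketch with hypothetical death counts and populations (not our data): the expected count is each area's population times the overall rate, and the standardized ratio is observed over expected. The code below then applies the same idea to our rates, using the period-average IMR as the expected value.

#toy sketch of indirect standardization with hypothetical counts
toy <- data.frame(deaths = c(12, 40, 7),
                  population = c(10000, 25000, 8000))
overall_rate <- sum(toy$deaths)/sum(toy$population) #overall event rate
toy$E <- toy$population*overall_rate #expected count for each area
toy$SMR <- toy$deaths/toy$E #>1 means more events than expected
toy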

rates<-aggregate(rate~1, final.dat,mean) #in this case, we will standardize to the average IMR for the period

final.dat$E_d<-rates$rate

final.dat<-final.dat[order(final.dat$cofips, final.dat$year),]
final.dat$id<-1:dim(final.dat)[1]

head(final.dat)
##   cofips year sumlev        ctyname time population rate GEOID
## 5  01001 2000     50 Autauga County    1      44021   34 01001
## 8  01001 2001     50 Autauga County    2      44889   78 01001
## 7  01001 2002     50 Autauga County    3      45909   83 01001
## 2  01001 2003     50 Autauga County    4      46800   79 01001
## 3  01001 2004     50 Autauga County    5      48366   76 01001
## 4  01001 2005     50 Autauga County    6      49676  124 01001
##             NAME P007003 P007004 P007010 P053001 P089001 P089002
## 5 Autauga County   34760    7450     394   42013   43377    4738
## 8 Autauga County   34760    7450     394   42013   43377    4738
## 7 Autauga County   34760    7450     394   42013   43377    4738
## 2 Autauga County   34760    7450     394   42013   43377    4738
## 3 Autauga County   34760    7450     394   42013   43377    4738
## 4 Autauga County   34760    7450     394   42013   43377    4738
##   summary_value    pwhite    pblack       phisp  medhhinc      ppov
## 5         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 8         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 7         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 2         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 3         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
## 4         43671 0.7959515 0.1705938 0.009022005 0.7593459 0.1092284
##        E_d id
## 5 72.33683  1
## 8 72.33683  2
## 7 72.33683  3
## 2 72.33683  4
## 3 72.33683  5
## 4 72.33683  6
options(scipen=999)

Next we create the spatial information. We get the county polygons directly from the Census using the counties() function in the tigris package, and we drop counties outside the contiguous 48 US states.

us_co<-counties( cb = T)
us_co<-us_co%>%
  subset(!STATEFP%in%c("02", "15", "60", "66", "69", "72", "78"))

Construction of spatial relationships:

Contiguity based neighbors

In a general sense, we can think of a square grid. Cells that share common elements of their geometry are said to be “neighbors”. There are several ways to describe these patterns; for polygons, we generally use the rules of the chessboard.

Rook adjacency Neighbors must share a line segment

Queen adjacency Neighbors must share a vertex or a line segment

If polygons share these boundaries (based on the specific definition: rook or queen), they are said to be “spatial neighbors” of one another. The figure below illustrates this principle.

For an observation of interest, the pink area, the rook-adjacent areas are those in green in the figure, because they share a line segment. In the second part of the figure, on the right, the pink area has a different set of neighbors than under the rook rule, because areas that share only a vertex with it also count as queen neighbors.

Adjacency using Chessboard Rules

Order of adjacency

The figure above also highlights the order of adjacency among observations. By order of adjacency, we simply mean that observations are either immediate neighbors (the green areas) or neighbors of immediate neighbors. These are referred to as first- and second-order neighbors.

So we can see that the yellow polygons are the neighboring areas for this tract, which allows us to think about the spatial structure of the area surrounding this part of campus.

For an example, let’s consider the case of San Antonio again. If our data are polygons, then there is a function in the spdep library in R, poly2nb(), that will take a polygon layer and find the neighbors of all areas using either a queen or rook rule. First we would form the neighbors using the rook rule for all the tracts in Bexar County.
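We do not load the Bexar County tracts here, but as a minimal sketch of that workflow, applied instead to the us_co county polygons created above: poly2nb() builds the neighbor list, and spdep's nblag() can expand it to the second-order relations discussed earlier.

#sketch: polygon contiguity neighbors for the us_co counties from above
nb_rook<-poly2nb(us_co, queen = FALSE) #rook rule: shared edges only
nb_queen<-poly2nb(us_co, queen = TRUE) #queen rule: shared edges or vertices
summary(nb_queen)

#nblag() returns neighbor lists by order of adjacency; element [[2]]
#holds the second-order (neighbor-of-neighbor) relations
nb_lag2<-nblag(nb_queen, maxlag = 2)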

Distance based association

The queen and rook rules are useful for polygon features, but distance based contiguity is useful for all feature types (points, polygons, lines). The idea is similar to the polygon adjacency rule from above, but the distance rule is based on the calculated distance between areas. There are a variety of distance metrics that are used in statistics, but the most commonly assumed one is the Euclidean distance. The Euclidean distance between any two points is:

$$D^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2$$

where $x$ and $y$ are the coordinates of each of the two areas. For polygons, these coordinates are typically the centroid of the polygon, while for point features they are the feature's two-dimensional geometry. The collection of these distances between all features forms what is known as the distance matrix between observations, which summarizes all pairwise distances between features in the data.
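As a sketch, spdep's dnearneigh() builds a distance-band neighbor definition from the polygon centroids; the 100 km threshold used here is an arbitrary choice for illustration.

#distance-band neighbors: counties whose centroids lie within 100km of one
#another are treated as neighbors (threshold chosen only for illustration)
cents<-coordinates(us_co) #polygon centroids
nb_dist<-dnearneigh(cents, d1 = 0, d2 = 100, longlat = TRUE) #great-circle km
summary(nb_dist)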

K nearest neighbors

#In INLA, we don't need FIPS codes, we need a simple numeric index for our counties
us_co$struct<-1:dim(us_co@data)[1]
nbs<-knearneigh(coordinates(us_co), k = 5, longlat = T) #k=5 nearest neighbors
nbs<-knn2nb(nbs, row.names = us_co$struct, sym = T) #force symmetry!!
mat <- nb2mat(nbs, style="B",zero.policy=TRUE) #binary (0/1) adjacency matrix
colnames(mat) <- rownames(mat) 
mat <- as.matrix(mat[1:dim(mat)[1], 1:dim(mat)[1]])


nb2INLA("cl_graph",nbs) #write the neighbor list in INLA's graph format
am_adj <-paste(getwd(),"/cl_graph",sep="") #path to the graph file
H<-inla.read.graph(filename="cl_graph") #read it back as an INLA graph object
image(inla.graph2matrix(H), xlab="", ylab="", main="") #view the adjacency structure

Plot geographies

library(sf)
## Linking to GEOS 3.6.1, GDAL 2.2.3, proj.4 4.9.3
us_co<-st_as_sf(us_co)
us_co$cofips<-paste(us_co$STATEFP, us_co$COUNTYFP, sep="")
us_co%>%
  ggplot()+geom_sf()+coord_sf(crs = 102008)

final.dat<-merge( us_co,final.dat, by="cofips")
final.dat<-final.dat[order(final.dat$cofips, final.dat$year),]

Model setup

ggplot(data = final.dat)+geom_histogram(aes(x =log(rate) , y=0.5*..density..))+facet_wrap(~year)+
  ggtitle(label = "Distribution of Infant Mortality Rate by Year", subtitle = "US Counties, 2000-2007")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 5772 rows containing non-finite values (stat_bin).

ggplot(data = final.dat)+geom_histogram(aes(x =log(rate/E_d) , y=0.5*..density..))+facet_wrap(~year)+
  ggtitle(label = "Distribution of Infant Mortality Relative Risk by Year", subtitle = "US Counties, 2000-2007")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 5772 rows containing non-finite values (stat_bin).

final.dat%>%
  filter(year%in%c(2000))%>%
  mutate(qrr=cut(I(rate/E_d), breaks = quantile(I(rate/E_d), p=seq(0,1,length.out = 5)), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quartile"))+ggtitle(label="Relative Risk Quartile - raw data, 2000")+coord_sf(crs = 102008)

We can fit these models using the Bayesian framework with INLA.

First, we consider the basic GLM for the mortality outcome, without any hierarchical structure. We can write this model as a Negative Binomial model, for instance, as:

$$\log(\text{Deaths}_{ij}) = \log(E_d) + X' \beta$$

INLA will use vague Normal priors for the $\beta$'s, and we have no other parameters in the model to specify priors for. INLA does not require you to specify all priors, as all parameters have a default prior specification.
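If we did want to override those defaults, the control.fixed argument to inla() accepts prior means and precisions for the fixed effects. A hedged sketch, written against the formula f1 defined just below; the precision values are arbitrary illustrations, not recommendations:

#sketch: explicit Normal priors on the fixed effects (values illustrative only)
#mod1_alt<-inla(formula = f1, data = final.dat,
#           family = "nbinomial", E = E_d,
#           control.fixed = list(mean = 0, prec = 0.001, #slopes
#                                mean.intercept = 0, prec.intercept = 0.001))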

#Model specification:
f1<-rate~scale(pblack)+scale(phisp)+scale(ppov)+year

#Model fit
mod1<-inla(formula = f1,data = final.dat, #linear predictor - fixed effects
           family = "nbinomial", E = E_d,  #marginal distribution for the outcome, expected count
           control.compute = list(waic=T), # compute WAIC or not?
           control.predictor = list(link=1), #estimate predicted values & their marginals or not?
           num.threads = 3, 
               verbose = F)
#model summary
summary(mod1)
## 
## Call:
## c("inla(formula = f1, family = \"nbinomial\", data = final.dat, E = E_d, ",  "    verbose = F, control.compute = list(waic = T), control.predictor = list(link = 1), ",  "    num.threads = 3)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.8889         69.1257          1.1870         72.2017 
## 
## Fixed effects:
##                  mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)    9.3970 8.3295    -6.9569   9.3969    25.7365  9.3973   0
## scale(pblack)  0.1347 0.0110     0.1132   0.1347     0.1564  0.1347   0
## scale(phisp)  -0.0188 0.0099    -0.0380  -0.0188     0.0008 -0.0189   0
## scale(ppov)    0.0856 0.0114     0.0633   0.0855     0.1080  0.0855   0
## year          -0.0047 0.0042    -0.0129  -0.0047     0.0035 -0.0047   0
## 
## The model has no random effects
## 
## Model hyperparameters:
##                                                          mean    sd
## size for the nbinomial observations (1/overdispersion) 0.4437 0.004
##                                                        0.025quant 0.5quant
## size for the nbinomial observations (1/overdispersion)     0.4358   0.4437
##                                                        0.975quant   mode
## size for the nbinomial observations (1/overdispersion)     0.4516 0.4437
## 
## Expected number of effective parameters(std dev): 5.068(0.0045)
## Number of equivalent replicates : 4903.10 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 252724.95
## Effective number of parameters .................: 7.665
## 
## Marginal log-Likelihood:  -126406.15 
## Posterior marginals for linear predictor and fitted values computed

Plot our observed vs fitted values

plot(x= mod1$summary.fitted.values$mean, y=final.dat$rate/final.dat$E_d , ylab="Observed", xlab="Estimated" )

Basic county level random intercept model

Now we add basic nesting of rates within counties, with a random intercept term for each county. This would allow there to be heterogeneity in the mortality rate for each county, over and above each county’s observed characteristics.

This model would be:

$$\log(\text{Deaths}_{ij}) = \log(E_d) + X' \beta + u_j$$

$$u_j \sim \text{Normal}(0, \tau_u)$$

where $\tau_u$ here is the precision, not the variance, and precision = 1/variance. INLA places a log-gamma prior on the precision by default.
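If we wanted to state that prior explicitly, or change it, the hyper argument to f() takes the specification. In this sketch, the param values c(1, 0.00005) reflect what we understand INLA's default log-gamma parameters to be, so treat them as an assumption to verify rather than a recommendation:

#sketch: the IID random effect with its precision prior written out explicitly
f2_alt<-rate~scale(pblack)+scale(phisp)+scale(ppov)+year+
  f(struct, model = "iid",
    hyper = list(prec = list(prior = "loggamma", param = c(1, 0.00005))))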

f2<-rate~scale(pblack)+scale(phisp)+scale(ppov)+year+ #fixed effects
  f(struct, model = "iid")  #random effects

mod2<-inla(formula = f2,data = final.dat,
           family = "nbinomial", E = E_d, 
           control.compute = list(waic=T), 
           control.predictor = list(link=1),
           num.threads = 3, 
               verbose = F)

#total model summary
summary(mod2)
## 
## Call:
## c("inla(formula = f2, family = \"nbinomial\", data = final.dat, E = E_d, ",  "    verbose = F, control.compute = list(waic = T), control.predictor = list(link = 1), ",  "    num.threads = 3)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.4095        856.0284          1.2317        858.6695 
## 
## Fixed effects:
##                  mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)    9.3987 8.3257    -6.9479   9.3985    25.7309  9.3989   0
## scale(pblack)  0.1348 0.0110     0.1132   0.1347     0.1564  0.1347   0
## scale(phisp)  -0.0188 0.0099    -0.0381  -0.0189     0.0008 -0.0190   0
## scale(ppov)    0.0856 0.0114     0.0633   0.0855     0.1080  0.0855   0
## year          -0.0047 0.0042    -0.0129  -0.0047     0.0035 -0.0047   0
## 
## Random effects:
## Name   Model
##  struct   IID model 
## 
## Model hyperparameters:
##                                                              mean
## size for the nbinomial observations (1/overdispersion)     0.4436
## Precision for struct                                   23433.6648
##                                                                sd
## size for the nbinomial observations (1/overdispersion)     0.0041
## Precision for struct                                   22142.3842
##                                                        0.025quant
## size for the nbinomial observations (1/overdispersion)     0.4357
## Precision for struct                                    2125.9108
##                                                          0.5quant
## size for the nbinomial observations (1/overdispersion)     0.4436
## Precision for struct                                   17113.9178
##                                                        0.975quant
## size for the nbinomial observations (1/overdispersion)     0.4519
## Precision for struct                                   81547.0082
##                                                             mode
## size for the nbinomial observations (1/overdispersion)    0.4434
## Precision for struct                                   6044.6153
## 
## Expected number of effective parameters(std dev): 6.298(1.699)
## Number of equivalent replicates : 3945.58 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 252725.17
## Effective number of parameters .................: 8.418
## 
## Marginal log-Likelihood:  -126406.69 
## Posterior marginals for linear predictor and fitted values computed

Marginal Distributions of hyperparameters

We can plot the posterior marginal of the hyperparameter in this model, in this case $\sigma^2_u = 1/\tau_u$:

m2<- inla.tmarginal(
        function(x) (1/x), #invert the precision to be on variance scale
        mod2$marginals.hyperpar$`Precision for struct`)
#95% credible interval for the variance
inla.hpdmarginal(.95, marginal=m2)
##                      low         high
## level:0.95 0.00000473886 0.0003274601
plot(m2, type="l", main=c("Posterior distribution for between county variance", "- IID model -"), xlim=c(0, .01))

final.dat$fitted_m2<-mod2$summary.fitted.values$mean

final.dat%>%
  filter(year%in%c(2000))%>%
  mutate(qrr=cut(fitted_m2, breaks = quantile(fitted_m2, p=seq(0,1,length.out = 6)), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quintile"))+ggtitle(label="Relative Risk Quintile - IID Model, 2000")+coord_sf(crs = 102008)

final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(fitted_m2, breaks = quantile(fitted_m2, p=seq(0,1,length.out = 6)), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quintile"))+ggtitle(label="Relative Risk Quintile - IID Model, 2007")+coord_sf(crs = 102008)

# library(mapview)
# 
# map1<-final.dat%>%
#   filter(year%in%c(2007))%>%
#   mutate(qrr=cut(fitted_m2, breaks = quantile(fitted_m2, p=seq(0,1,length.out = 8))))
# clrs <- colorRampPalette(brewer.pal(8, "RdBu"))
# mapView(as(map1, "Spatial"), zcol="qrr", legend=T, col.regions=clrs)

BYM Model

A model with spatial correlation: the Besag, York, and Mollie (1991) model, with temporal heterogeneity added:

$$\log(\text{Deaths}_{ij}) = \log(E_d) + X' \beta + u_j + v_j + \gamma_t$$

This has two county-level random effects: $u_j$, an IID normal random effect, and $v_j$, a spatially correlated random effect, specified as a conditionally autoregressive prior. This is the Besag model:

$$v_j | v_{i \neq j} \sim \text{Normal}\left(\frac{1}{n_j} \sum_{i \sim j} v_i,\; \frac{1}{n_j \tau_v}\right)$$

where $n_j$ is the number of neighbors of area $j$. The temporal effect $\gamma_t$ is also given an IID Normal random effect specification. There are now three hyperparameters, $\tau_u$, $\tau_v$, and $\tau_\gamma$, each given a log-gamma prior.

For the BYM model we must specify the spatial connectivity matrix in the random effect.

#final.dat$year_c<-final.dat$year - 2004
f3<-rate~scale(pblack)+scale(phisp)+scale(ppov)+
  f(struct, model = "bym", constr = T, scale.model = T, graph = H)+
  f(year, model="iid") #temporal random effect
mod3<-inla(formula = f3,data = final.dat,
           family = "nbinomial", E = E_d, 
           control.compute = list(waic=T), 
           num.threads = 3, 
               verbose = F,
           control.predictor = list(link=1))

#total model summary
summary(mod3)
## 
## Call:
## c("inla(formula = f3, family = \"nbinomial\", data = final.dat, E = E_d, ",  "    verbose = F, control.compute = list(waic = T), control.predictor = list(link = 1), ",  "    num.threads = 3)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          4.7064        833.2760          1.2902        839.2726 
## 
## Fixed effects:
##                  mean     sd 0.025quant 0.5quant 0.975quant    mode    kld
## (Intercept)   -0.0224 0.0105    -0.0430  -0.0224    -0.0017 -0.0224 0.0003
## scale(pblack)  0.1146 0.0137     0.0875   0.1147     0.1412  0.1149 0.0000
## scale(phisp)  -0.0095 0.0125    -0.0337  -0.0096     0.0155 -0.0099 0.0000
## scale(ppov)    0.0829 0.0122     0.0590   0.0829     0.1069  0.0829 0.0000
## 
## Random effects:
## Name   Model
##  struct   BYM model 
## year   IID model 
## 
## Model hyperparameters:
##                                                              mean
## size for the nbinomial observations (1/overdispersion)     0.4448
## Precision for struct (iid component)                    3369.4086
## Precision for struct (spatial component)                 303.0818
## Precision for year                                     12313.4051
##                                                                sd
## size for the nbinomial observations (1/overdispersion)     0.0041
## Precision for struct (iid component)                    2419.8286
## Precision for struct (spatial component)                 157.6129
## Precision for year                                     14240.7406
##                                                        0.025quant
## size for the nbinomial observations (1/overdispersion)     0.4366
## Precision for struct (iid component)                     744.3457
## Precision for struct (spatial component)                 109.6141
## Precision for year                                       434.0078
##                                                         0.5quant
## size for the nbinomial observations (1/overdispersion)    0.4449
## Precision for struct (iid component)                   2743.4492
## Precision for struct (spatial component)                266.5949
## Precision for year                                     7708.3989
##                                                        0.975quant     mode
## size for the nbinomial observations (1/overdispersion)     0.4529    0.445
## Precision for struct (iid component)                    9734.7240 1793.162
## Precision for struct (spatial component)                 709.2834  210.039
## Precision for year                                     50622.7508  917.843
## 
## Expected number of effective parameters(std dev): 41.38(12.32)
## Number of equivalent replicates : 600.55 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 252687.29
## Effective number of parameters .................: 30.18
## 
## Marginal log-Likelihood:  -125275.75 
## Posterior marginals for linear predictor and fitted values computed
plot(y=mod3$summary.random$year$mean,x=unique(final.dat$year), type="l") #temporal random effect (named "year", matching the variable used in f())

m3a<- inla.tmarginal(
        function(x) (1/x),
        mod3$marginals.hyperpar$`Precision for struct (iid component)`)
m3b<- inla.tmarginal(
        function(x) (1/x),
        mod3$marginals.hyperpar$`Precision for struct (spatial component)`)
m3c<- inla.tmarginal(
        function(x) (1/x),
        mod3$marginals.hyperpar$`Precision for year`)

plot(m3a, type="l", main=c("Posterior distribution of the variance components", "- BYM model -"), xlim=c(0, .07),  ylim=c(0,600))
lines(m3b, col="red")
lines(m3c, col="green")
legend("topright", legend=c("County IID", "County spatial", "Year"), col=c("black", "red", "green"), lty=1)

inla.hpdmarginal(.95,m3a)
##                      low        high
## level:0.95 0.00005314888 0.001088671
inla.hpdmarginal(.95,m3b)
##                   low        high
## level:0.95 0.00102769 0.008057332
inla.hpdmarginal(.95,m3c)
##                       low        high
## level:0.95 0.000007064516 0.001517974

This indicates very low spatially correlated variance in these data and very low temporal heterogeneity as well.

Space-time mapping of the fitted values

final.dat$fitted_m3<-mod3$summary.fitted.values$mean

final.dat%>%
  filter(year%in%c(2000))%>%
  mutate(qrr=cut(fitted_m3, breaks = quantile(fitted_m3, p=seq(0,1,length.out = 6)), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quintile"))+ggtitle(label="Relative Risk Quintile - BYM Model, 2000")+coord_sf(crs = 102008)

final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(fitted_m3, breaks = quantile(fitted_m3, p=seq(0,1,length.out = 6)), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quintile"))+ggtitle(label="Relative Risk Quintile - BYM Model, 2007")+coord_sf(crs = 102008)

library(mapview)

#map1<-final.dat%>%
#  filter(year%in%c(2007))%>%
#  mutate(qrr=cut(fitted_m3, breaks = quantile(fitted_m3, p=seq(0,1,length.out = 8))))
#clrs <- colorRampPalette(brewer.pal(8, "RdBu"))
#mapView(as(map1, "Spatial"), zcol="qrr", legend=T, col.regions=clrs)

Map of spatial random effects

It is common to map the random effects from the BYM model to look for spatial trends; in this case, there are no strong spatial signals:

#for the "bym" model, summary.random$struct stacks two sets of effects:
#the first n elements (here, 3108 counties) are the combined (IID + spatial)
#effects, and elements n+1 to 2n are the spatially structured component alone
us_co$sp_re<-mod3$summary.random$struct$mean[1:3108]
us_co%>%
  mutate(qse=cut(sp_re, breaks = quantile(sp_re, p=seq(0,1,length.out = 6)), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qse))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Spatial Excess Risk"))+ggtitle(label="Spatial Random Effect - BYM Model")+coord_sf(crs = 102008)

Exceedence probabilities

In Bayesian spatial models that are centered on an epidemiological type of outcome, it is common to examine the data for spatial clustering. One way to do this is to examine the clustering in the relative risk from one of these GLMM models. For instance, if $\theta$ is the relative risk, $\theta = \exp(\beta_0 + \beta_1 x_1 + u_j)$, from one of our Negative Binomial models above, we can use the posterior marginals of the relative risk to ask whether $\theta > \theta^*$, where $\theta^*$ is a specific level of excess risk, say 50% extra, or $\theta^* = 1.5$. If the posterior probability $Pr(\theta > \theta^*)$ is high, then there is evidence that the excess risk is not only high, but significantly high.

To get the exceedence probabilities from one of our models, we can use the inla.pmarginal() function to compute $Pr(\theta > \theta^*)$:

thetastar<-1.5 #theta*, the excess risk threshold
inlaprob<- unlist(lapply(mod3$marginals.fitted.values, function(X){
   1-inla.pmarginal(thetastar, X)
}))
hist(inlaprob)

So, we see lots of occasions where the exceedence probability is greater than .9. We can visualize these in a map.

final.dat$exceedprob<-inlaprob

final.dat%>%
  filter(year%in%c(2000))%>%
  mutate(qrr=cut(exceedprob, breaks = c(0, .5, .9, .95, .99, 1), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "Blues" )+scale_fill_brewer(palette = "Blues", na.value="grey")+guides(fill=guide_legend(title=""))+ggtitle(label=expression(paste("Exceedence Probability Relative Risk ","Pr( ",theta," >1.5"," )  - 2000") ))+coord_sf(crs = 102008)

final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(exceedprob, breaks = c(0, .5, .9, .95, .99, 1), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "Blues" )+scale_fill_brewer(palette = "Blues", na.value="grey")+guides(fill=guide_legend(title=""))+ggtitle(label=expression(paste("Exceedence Probability Relative Risk ","Pr( ",theta," >1.5"," )  - 2007") ))+coord_sf(crs = 102008)

library(mapview)

#map1<-final.dat%>%
 # filter(year%in%c(2007))%>%
#  mutate(qrr=cut(exceedprob, breaks = c(0, .5, .9, .95, .99, 1), include.lowest = T))
  
#clrs <- colorRampPalette(brewer.pal(6, "Blues"))
#mapView(as(map1, "Spatial"), zcol="qrr", legend=T, col.regions=clrs, map.types="OpenStreetMap")

This shows several areas where the risk of infant mortality is significantly higher than the national rate.

References

Besag, J., York, J., & Mollié, A. (1991). Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43(1), 1-20. https://doi.org/10.1007/BF00116466