The INLA Approach to Bayesian models

The Integrated Nested Laplace Approximation, or INLA, approach is a recently developed, computationally simpler method for fitting Bayesian models [(Rue et al., 2009, compared to traditional Markov Chain Monte Carlo (MCMC) approaches. INLA fits models that are classified as latent Gaussian models, which are applicable in many settings (Martino & Rue, 2010. In general, INLA fits a general form of additive models such as:

\(\eta = \alpha + \sum_{j=1}^{nf} f^{(j)}(u_{ij}) + \sum_{k=1}^{n\beta}\beta_k z_{ki} + \epsilon_i\)

where \(\eta\) is the linear predictor for a generalized linear model formula, and is composed of a linear function of some variables u, \(\beta\) are the effects of covariates, z, and \(\epsilon\) is an unstructured residual (Rue et al., 2009). As this model is often parameterized as a Bayesian one, we are interested in the posterior marginal distributions of all the model parameters. Rue and Martino (2007) show that the posterior marginal for the random effects (x) in such models can be approximated as:

\(\tilde{p}(x_i|y) = \sum_k \tilde{p}(x_i|\theta_k, y) \tilde{p}(\theta_k|y) \Delta_k\)

via numerical integration (Rue & Martino, 2007; Schrodle & Held, 2011a, 2011b). The posterior distribution of the hyperparameters (\(\theta\)) of the model can also be approximated as:

\(\tilde{p}(\theta | y)) \propto \frac{p(x, \theta, y)}{\tilde{p}G(x| \theta,y)} \mid _{x} = x^*(\theta)\)

, where G is a Gaussian approximation of the posterior and \(x^*(\theta)\) is the mode of the conditional distribution of \(p(x|\theta,y)\). Thus, instead of using MCMC to find an iterative, sampling-based estimate of the posterior, it is arrived at numerically. This method of fitting the spatial models specified above has been presented by numerous authors (Blangiardo & Cameletti, 2015; Blangiardo et al., 2013; Lindgren & Rue, 2015; Martins et al., 2013; Schrodle & Held, 2011a, 2011b), with comparable results to MCMC.

Below, I show examples of using INLA to fit Bayesian regression models for data from US counties. One example will be a relatively small data set from Texas, while the other example will be all US counties. The US county example basically emulates the paper by Sparks et al 2012.

library(rgdal)

## Loading required package: sp

## rgdal: version: 1.2-5, (SVN revision 648)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.0.1, released 2015/09/15
##  Path to GDAL shared files: C:/Users/Corey Sparks/Documents/R/win-library/3.3/rgdal/gdal
##  Loaded PROJ.4 runtime: Rel. 4.9.2, 08 September 2015, [PJ_VERSION: 492]
##  Path to PROJ.4 shared files: C:/Users/Corey Sparks/Documents/R/win-library/3.3/rgdal/proj
##  Linking to sp version: 1.2-4

library(spdep)

## Loading required package: Matrix

library(RColorBrewer)
library(lattice)
library(INLA)

## Loading required package: splines

## This is INLA 0.0-1468872408, dated 2016-07-18 (14:43:05+0100).
## See www.r-inla.org/contact-us for how to get help.

#read in big US shapefile with lots of stuff in it
usdat<-readOGR(dsn = "D:/GoogleDrive/classes/dem7263/data",layer = "output12711")

## OGR data source with driver: ESRI Shapefile 
## Source: "D:/GoogleDrive/classes/dem7263/data", layer: "output12711"
## with 3064 features
## It has 328 fields
## Integer64 fields read as strings:  f1198408 f1198407 f1198406 f1198405 f1198404 f1198403 f1198402 f1198401_1 f0453000_1 f1198499_1 f1198498 f1198497 f1198496 f1198495 f1198494 f1198493 f1198492 f1198491 pubassist SSI socsec

#get outcome, which are the numbers of infant deaths
usdat$infd03<-usdat$f1194803
#get denominator, which are the number of births  
usdat$births03<-usdat$f1254603
usdat$e<-usdat$births03*(sum(usdat$infd03)/sum(usdat$births03))


#form spatial neighbors
#In INLA, we don't need FIPS codes, we need a simple numeric index for our counties
usdat$struct<-1:nrow(usdat@data)


#!!Be sure to set the row names of the neighbor list to the struct variable we just made!!
us.nb<-poly2nb(usdat, row.names=usdat$struct)

#Save the neighbor list to an external file, INLA needs the list this way!!
nb2INLA(us.nb, file="D:/GoogleDrive/classes/dem7263/class17//usa.gra")

We can fit the OLS model using the bayesian framework as:

#US Data
fit.us2<-inla(I(infd03/births03)~metro+giniz+hispz+blackz+povz+ne_bad+isowbz, family = "gaussian",control.predictor = list(link=1), data=usdat@data, control.compute = list(dic=T))
summary(fit.us2)

## 
## Call:
## c("inla(formula = I(infd03/births03) ~ metro + giniz + hispz + blackz + ",  "    povz + ne_bad + isowbz, family = \"gaussian\", data = usdat@data, ",  "    control.compute = list(dic = T), control.predictor = list(link = 1))" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.3043          1.9773          0.4622          2.7438 
## 
## Fixed effects:
##                mean    sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0066 1e-04     0.0063   0.0066     0.0068  0.0066   0
## metro        0.0007 2e-04     0.0003   0.0007     0.0011  0.0007   0
## giniz       -0.0001 1e-04    -0.0003  -0.0001     0.0001 -0.0001   0
## hispz       -0.0004 1e-04    -0.0006  -0.0004    -0.0002 -0.0004   0
## blackz       0.0016 1e-04     0.0014   0.0016     0.0018  0.0016   0
## povz         0.0010 1e-04     0.0007   0.0010     0.0013  0.0010   0
## ne_bad       0.0000 3e-04    -0.0005   0.0000     0.0006  0.0000   0
## isowbz       0.0006 1e-04     0.0004   0.0006     0.0008  0.0006   0
## 
## The model has no random effects
## 
## Model hyperparameters:
##                                              mean       sd 0.025quant
## Precision for the Gaussian observations 193269.11 18800.75  160773.79
##                                          0.5quant 0.975quant      mode
## Precision for the Gaussian observations 191776.57  234431.01 188810.97
## 
## Expected number of effective parameters(std dev): 2336.38(55.63)
## Number of equivalent replicates : 1.311 
## 
## Deviance Information Criterion (DIC) ...: -26281.16
## Effective number of parameters .........: 2311.74
## 
## Marginal log-Likelihood:  12017.41 
## Posterior marginals for linear predictor and fitted values computed

inla.hpdmarginal(p = .95, fit.us2$marginals.fixed$giniz)

##                     low         high
## level:0.95 -0.000325973 0.0001271985

inla.hpdmarginal(p = .95, fit.us2$marginals.fixed$ne_bad)

##                      low         high
## level:0.95 -0.0004679239 0.0005539363

inla.hpdmarginal(p = .95, fit.us2$marginals.fixed$isowbz)

##                     low         high
## level:0.95 0.0003971225 0.0008073215

OR, as a poisson model:

#US data
fit.p_us<-inla(infd03~metro+giniz+hispz+blackz+povz+ne_bad+isowbz, family = "poisson",E=e+.0001, data=usdat@data, control.compute = list(dic=T))
summary(fit.p_us)

## 
## Call:
## c("inla(formula = infd03 ~ metro + giniz + hispz + blackz + povz + ",  "    ne_bad + isowbz, family = \"poisson\", data = usdat@data, E = e + ",  "    1e-04, control.compute = list(dic = T))")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.1684          1.1952          0.4688          1.8324 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0348 0.0151     0.0050   0.0348     0.0643  0.0349   0
## metro        0.0649 0.0208     0.0241   0.0649     0.1058  0.0649   0
## giniz       -0.0016 0.0055    -0.0123  -0.0017     0.0092 -0.0018   0
## hispz       -0.0747 0.0075    -0.0893  -0.0747    -0.0600 -0.0747   0
## blackz       0.1389 0.0114     0.1165   0.1389     0.1614  0.1389   0
## povz         0.1448 0.0108     0.1237   0.1448     0.1659  0.1449   0
## ne_bad      -0.2301 0.0189    -0.2672  -0.2301    -0.1931 -0.2300   0
## isowbz      -0.0461 0.0126    -0.0708  -0.0461    -0.0213 -0.0461   0
## 
## The model has no random effects
## 
## The model has no hyperparameters
## 
## Expected number of effective parameters(std dev): 8.445(0.00)
## Number of equivalent replicates : 362.83 
## 
## Deviance Information Criterion (DIC) ...: 10303.27
## Effective number of parameters .........: 8.444
## 
## Marginal log-Likelihood:  -5205.15 
## Posterior marginals for linear predictor and fitted values computed

#or negative binonmial on the us data
fit.nb_us<-inla(infd03~metro+giniz+hispz+blackz+povz+ne_bad+isowbz, family = "nbinomial",E=e+.0001, data=usdat@data, control.compute = list(dic=T))
summary(fit.nb_us)

## 
## Call:
## c("inla(formula = infd03 ~ metro + giniz + hispz + blackz + povz + ",  "    ne_bad + isowbz, family = \"nbinomial\", data = usdat@data, ",  "    E = e + 1e-04, control.compute = list(dic = T))")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.1624          2.4732          0.4462          3.0818 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0333 0.0159     0.0020   0.0334     0.0644  0.0334   0
## metro        0.0601 0.0232     0.0146   0.0600     0.1055  0.0600   0
## giniz       -0.0046 0.0100    -0.0243  -0.0045     0.0149 -0.0044   0
## hispz       -0.0757 0.0101    -0.0955  -0.0757    -0.0559 -0.0757   0
## blackz       0.1178 0.0143     0.0896   0.1178     0.1460  0.1177   0
## povz         0.1574 0.0138     0.1303   0.1574     0.1844  0.1574   0
## ne_bad      -0.1969 0.0247    -0.2453  -0.1969    -0.1485 -0.1969   0
## isowbz      -0.0585 0.0164    -0.0907  -0.0585    -0.0262 -0.0586   0
## 
## The model has no random effects
## 
## Model hyperparameters:
##                                                       mean    sd
## size for the nbinomial observations (overdispersion) 52.41 4.189
##                                                      0.025quant 0.5quant
## size for the nbinomial observations (overdispersion)      44.75    52.24
##                                                      0.975quant  mode
## size for the nbinomial observations (overdispersion)      60.98 51.85
## 
## Expected number of effective parameters(std dev): 8.252(0.0063)
## Number of equivalent replicates : 371.32 
## 
## Deviance Information Criterion (DIC) ...: 10269.18
## Effective number of parameters .........: 8.815
## 
## Marginal log-Likelihood:  -5234.51 
## Posterior marginals for linear predictor and fitted values computed

Or a simple, unstructured random county intercept model:

#US model
fit.pre_u<-inla(infd03~metro+giniz+hispz+blackz+povz+ne_bad+isowbz + f(struct, model = "iid"), family = "poisson",E=e+.0001, data=usdat@data, control.compute = list(dic=T))
summary(fit.pre_u)

## 
## Call:
## c("inla(formula = infd03 ~ metro + giniz + hispz + blackz + povz + ",  "    ne_bad + isowbz + f(struct, model = \"iid\"), family = \"poisson\", ",  "    data = usdat@data, E = e + 1e-04, control.compute = list(dic = T))" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.2045          7.0177          1.0587          8.2809 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0304 0.0155    -0.0001   0.0304     0.0606  0.0305   0
## metro        0.0623 0.0219     0.0193   0.0623     0.1054  0.0623   0
## giniz       -0.0034 0.0078    -0.0187  -0.0034     0.0119 -0.0034   0
## hispz       -0.0753 0.0089    -0.0929  -0.0753    -0.0578 -0.0753   0
## blackz       0.1241 0.0130     0.0985   0.1241     0.1496  0.1241   0
## povz         0.1546 0.0123     0.1304   0.1546     0.1788  0.1546   0
## ne_bad      -0.2094 0.0220    -0.2525  -0.2095    -0.1663 -0.2095   0
## isowbz      -0.0554 0.0147    -0.0842  -0.0554    -0.0265 -0.0554   0
## 
## Random effects:
## Name   Model
##  struct   IID model 
## 
## Model hyperparameters:
##                        mean    sd 0.025quant 0.5quant 0.975quant   mode
## Precision for struct 141.12 25.40     100.45   137.94     200.23 131.76
## 
## Expected number of effective parameters(std dev): 147.40(20.21)
## Number of equivalent replicates : 20.79 
## 
## Deviance Information Criterion (DIC) ...: 10201.89
## Effective number of parameters .........: 148.05
## 
## Marginal log-Likelihood:  -5170.65 
## Posterior marginals for linear predictor and fitted values computed

A more complicated, Besag ICAR model:

#US Models
fit.psp1_u<-inla(infd03~metro+giniz+hispz+blackz+povz+ne_bad+isowbz + f(struct, model = "besag", graph = "D:/GoogleDrive/classes/dem7263/usa.gra"), family = "poisson",E=e+.0001, data=usdat@data, control.compute = list(dic=T))
summary(fit.psp1_u)

## 
## Call:
## c("inla(formula = infd03 ~ metro + giniz + hispz + blackz + povz + ",  "    ne_bad + isowbz + f(struct, model = \"besag\", graph = \"D:/GoogleDrive/classes/dem7263/usa.gra\"), ",  "    family = \"poisson\", data = usdat@data, E = e + 1e-04, control.compute = list(dic = T))" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.7133         13.2682          0.5886         14.5701 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0279 0.0156    -0.0028   0.0280     0.0584  0.0281   0
## metro        0.0581 0.0221     0.0148   0.0581     0.1015  0.0581   0
## giniz        0.0008 0.0080    -0.0148   0.0007     0.0164  0.0007   0
## hispz       -0.0566 0.0117    -0.0794  -0.0566    -0.0335 -0.0568   0
## blackz       0.1073 0.0148     0.0783   0.1074     0.1364  0.1074   0
## povz         0.1476 0.0131     0.1217   0.1476     0.1733  0.1476   0
## ne_bad      -0.1635 0.0243    -0.2112  -0.1635    -0.1158 -0.1636   0
## isowbz      -0.0658 0.0155    -0.0961  -0.0658    -0.0354 -0.0658   0
## 
## Random effects:
## Name   Model
##  struct   Besags ICAR model 
## 
## Model hyperparameters:
##                       mean    sd 0.025quant 0.5quant 0.975quant  mode
## Precision for struct 70.66 14.28      47.98    68.84     103.92 65.24
## 
## Expected number of effective parameters(std dev): 110.14(14.64)
## Number of equivalent replicates : 27.82 
## 
## Deviance Information Criterion (DIC) ...: 10100.03
## Effective number of parameters .........: 110.75
## 
## Marginal log-Likelihood:  -7506.17 
## Posterior marginals for linear predictor and fitted values computed

1/fit.psp1_u$summary.hyperpar$mean

## [1] 0.01415256

#Change the default prior on the spatial variance:

fit.psp2_u<-inla(infd03~metro+giniz+hispz+blackz+povz+ne_bad+isowbz + f(struct, model = "besag", graph = "D:/GoogleDrive/classes/dem7263/usa.gra", hyper = list(theta=list(prior="loggamma", param=c(.5, .005)))), family = "poisson",E=e+.0001, data=usdat@data, control.compute = list(dic=T))
summary(fit.psp2_u)

## 
## Call:
## c("inla(formula = infd03 ~ metro + giniz + hispz + blackz + povz + ",  "    ne_bad + isowbz + f(struct, model = \"besag\", graph = \"D:/GoogleDrive/classes/dem7263/usa.gra\", ",  "    hyper = list(theta = list(prior = \"loggamma\", param = c(0.5, ",  "        0.005)))), family = \"poisson\", data = usdat@data, E = e + ",  "    1e-04, control.compute = list(dic = T))")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.6748         13.5814          0.6258         14.8820 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0278 0.0156    -0.0029   0.0278     0.0583  0.0279   0
## metro        0.0580 0.0221     0.0146   0.0580     0.1015  0.0580   0
## giniz        0.0008 0.0080    -0.0149   0.0008     0.0165  0.0007   0
## hispz       -0.0563 0.0118    -0.0793  -0.0564    -0.0331 -0.0565   0
## blackz       0.1072 0.0149     0.0781   0.1072     0.1364  0.1072   0
## povz         0.1475 0.0132     0.1216   0.1475     0.1733  0.1476   0
## ne_bad      -0.1631 0.0244    -0.2110  -0.1631    -0.1153 -0.1632   0
## isowbz      -0.0659 0.0155    -0.0963  -0.0659    -0.0354 -0.0659   0
## 
## Random effects:
## Name   Model
##  struct   Besags ICAR model 
## 
## Model hyperparameters:
##                       mean    sd 0.025quant 0.5quant 0.975quant  mode
## Precision for struct 68.18 13.47      46.69    66.50      99.42 63.23
## 
## Expected number of effective parameters(std dev): 112.63(14.65)
## Number of equivalent replicates : 27.20 
## 
## Deviance Information Criterion (DIC) ...: 10100.53
## Effective number of parameters .........: 113.20
## 
## Marginal log-Likelihood:  -7501.95 
## Posterior marginals for linear predictor and fitted values computed

1/fit.psp2_u$summary.hyperpar$mean

## [1] 0.01466758

1/fit.pre_u$summary.hyperpar$mean

## [1] 0.007086127

You notice less variation in the US hyperparameter than in the Texas hyperparameter, that’s becuase we have more data in the US example.

Compare the marginal distributions for the precision parameter from each model. The precision = 1/\(\sigma^2\), so we need to transform it to a variance scale so we can interpret it as we normally would!

#US example
1/fit.psp1_u$summary.hyperpar$mean

## [1] 0.01415256

um<- inla.tmarginal(
        function(x) 1/x,
        fit.psp1_u$marginals.hyperpar$'Precision for struct')

plot(um, type="l", main="Posterior distibution for model hyperparameter - US data")

1/fit.psp2_u$summary.hyperpar$mean

## [1] 0.01466758

um2<- inla.tmarginal(
        function(x) 1/x,
        fit.psp2_u$marginals.hyperpar$'Precision for struct')
lines(um2,lty=3, lwd=3    )

The two different priors show very similar results in terms of the posterior distribution of the county level variance.

Let’s have a look at the raw vs fitted SMR for infant mortality:

##US example
usdat$smr<-usdat$infd03/usdat$e
usdat$fittedsmr<-fit.psp2_u$summary.fitted.values$mean
plot(usdat$smr, usdat$fittedsmr)

brks<-quantile(c(usdat$smr, usdat$fittedsmr), p=seq(0, 1,length.out = 9), na.rm=T)

brks

##        0%     12.5%       25%     37.5%       50%     62.5%       75% 
## 0.0000000 0.6108946 0.8173996 0.9131458 0.9998666 1.1018662 1.2424329 
##     87.5%      100% 
## 1.5389389 8.6480283

spplot(usdat, c("smr", "fittedsmr"), col.regions=brewer.pal(n=9, "RdBu"), at=brks , main="Raw and Fitted SMR")

spplot(usdat, "fittedsmr", col.regions=brewer.pal(n=9, "RdBu"), at=brks, main="US County Fitted SMR")

usdat$spre<-fit.psp2_u$summary.random$struct$mean
spplot(usdat, "spre", col.regions=brewer.pal(n=9, "RdBu"), at=quantile(usdat$spre, p=seq(0, 1,length.out = 9)), main="US County Fitted Spatial Random Effect")

The Besag, York and Mollie model. This model has two random effects, one which is spatially correlated, and the other which is iid normal. In INLA, and when using the BYM model, the first 1:n rows of the random effect are the combined spatial and “nonspatial” effects.

#US data
fit.us_bym<-inla(infd03~metro+giniz+hispz+blackz+povz+ne_bad+isowbz + f(struct, model = "bym", graph = "D:/GoogleDrive/classes/dem7263/usa.gra"), family = "poisson",E=e+.0001, data=usdat@data, control.compute = list(dic=T))
summary(fit.us_bym)

## 
## Call:
## c("inla(formula = infd03 ~ metro + giniz + hispz + blackz + povz + ",  "    ne_bad + isowbz + f(struct, model = \"bym\", graph = \"D:/GoogleDrive/classes/dem7263/usa.gra\"), ",  "    family = \"poisson\", data = usdat@data, E = e + 1e-04, control.compute = list(dic = T))" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          0.8081        139.0280          0.7370        140.5732 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0279 0.0156    -0.0028   0.0279     0.0584  0.0280   0
## metro        0.0580 0.0221     0.0145   0.0580     0.1014  0.0580   0
## giniz        0.0008 0.0080    -0.0148   0.0008     0.0166  0.0008   0
## hispz       -0.0571 0.0117    -0.0799  -0.0571    -0.0340 -0.0572   0
## blackz       0.1071 0.0148     0.0780   0.1071     0.1362  0.1071   0
## povz         0.1476 0.0132     0.1218   0.1476     0.1734  0.1477   0
## ne_bad      -0.1638 0.0244    -0.2116  -0.1638    -0.1159 -0.1638   0
## isowbz      -0.0659 0.0155    -0.0963  -0.0659    -0.0354 -0.0659   0
## 
## Random effects:
## Name   Model
##  struct   BYM model 
## 
## Model hyperparameters:
##                                             mean      sd 0.025quant
## Precision for struct (iid component)     3290.29 2277.54     694.13
## Precision for struct (spatial component)   73.54   15.71      47.99
##                                          0.5quant 0.975quant    mode
## Precision for struct (iid component)      2733.16    9209.90 1758.34
## Precision for struct (spatial component)    71.70     109.42   68.07
## 
## Expected number of effective parameters(std dev): 114.96(14.41)
## Number of equivalent replicates : 26.65 
## 
## Deviance Information Criterion (DIC) ...: 10102.76
## Effective number of parameters .........: 115.53
## 
## Marginal log-Likelihood:  -4689.67 
## Posterior marginals for linear predictor and fitted values computed

1/fit.us_bym$summary.hyperpar$mean

## [1] 0.0003039242 0.0135980476

usdat$bym_est<-fit.us_bym$summary.fitted.values$mean
spplot(usdat, "bym_est", col.regions=brewer.pal(n=9, "RdBu"), at=quantile(usdat$bym_est, p=seq(0, 1,length.out = 9)), main="US County Fitted SMR from BYM model")

Exceedence probabilities

Here we calculate our exceedence probabilities, in this case, I calculate the probability that a county has a SMR greater than 1.25.

a2<-1.25
usdat$inlaprob<-unlist(lapply(fit.us_bym$marginals.fitted.values, function(X){
  1-inla.pmarginal(a2, X)
}))
hist(usdat$inlaprob)

usdat$exceed.cut<-cut(usdat$inlaprob,breaks =  c(0,0.2,0.5,0.8,.95,1), include.lowest = T)

spplot(usdat,"exceed.cut" , col.regions=brewer.pal(n=6, name="Blues"), key.space="bottom", main= expression(paste("Exceedence Probability Infant SMR ","Pr( ",theta," >1.25"," )") ))

##           used  (Mb) gc trigger  (Mb) max used  (Mb)
## Ncells 1919786 102.6    5684620 303.6  5684620 303.6
## Vcells 4887416  37.3   37663367 287.4 56575267 431.7

The INLA Approach to Bayesian models

\(\eta = \alpha + \sum_{j=1}^{nf} f^{(j)}(u_{ij}) + \sum_{k=1}^{n\beta}\beta_k z_{ki} + \epsilon_i\)

\(\tilde{p}(x_i|y) = \sum_k \tilde{p}(x_i|\theta_k, y) \tilde{p}(\theta_k|y) \Delta_k\)

via numerical integration (Rue & Martino, 2007; Schrodle & Held, 2011a, 2011b). The posterior distribution of the hyperparameters (\(\theta\)) of the model can also be approximated as:

\(\tilde{p}(\theta | y)) \propto \frac{p(x, \theta, y)}{\tilde{p}G(x| \theta,y)} \mid _{x} = x^*(\theta)\)

INLA for Non - Aggregate data

Above, we saw how to use INLA to fit a Bayesian regression model to areal data (US Counties). This example will focus on how to use INLA to fit a Bayesian multi-level model, where our outcome is observed at the individual level, and we may or may not have information avaialble at a higher level of aggregation.

Age-period-cohort models

In this example, we use data from the Integrated Health Interview Series to fit an Age-Period-Cohort model. Discussion of these models can be found in the recent text on the subject. In addition to the APC model, hierarchical models could consider county or city of residence, or some other geography.

Here, we load the IHIS data and do some recodes.

library(readstata13)
library(car)
library(INLA)
dat<- read.dta13("D:/GoogleDrive/classes/dem7903_App_Hier/data/ihis_00008.dta", convert.factors = F)
names(dat)

##  [1] "year"         "serial"       "strata"       "psu"         
##  [5] "nhishid"      "hhweight"     "nhispid"      "hhx"         
##  [9] "fmx"          "px"           "perweight"    "sampweight"  
## [13] "fweight"      "astatflg"     "cstatflg"     "pernum"      
## [17] "age"          "sex"          "marstat"      "marst"       
## [21] "birthyr"      "racea"        "hispeth"      "yrsinus"     
## [25] "hispyn"       "usborn"       "citizen"      "racenew"     
## [29] "racebr"       "educrec2"     "educrec1"     "educ"        
## [33] "health"       "bmicalc"      "alcstat1"     "smokestatus2"
## [37] "mod10fwk"     "vig10fwk"     "nhisiid"

names(dat)<-tolower(names(dat))
dat<-dat[dat$age>=18,]
dat$goodhealth<-recode(dat$health, recodes="1:3=0; 4:5=1; else=NA")
dat$bmi_clean<-ifelse(dat$bmicalc%in%c(0,996),NA, dat$bmicalc)
dat$obese<-ifelse(dat$bmi_clean>=30,1,0)
dat$hisp<-ifelse(dat$hispeth==10, 0, ifelse(dat$hispeth %in%c(90:93), NA, 1))
dat$race<-recode(dat$racea, recodes="100=1; 200=2; 300:340=3; 400:434=4; 500:600=5; else=NA")
dat$race_eth<-factor(ifelse(dat$hisp==1, "Hispanic",
                     ifelse(dat$hisp==0&dat$race==1, "NHWhite", 
                            ifelse(dat$hisp==0&dat$race==2, "NHBlack",
                                   ifelse(dat$hisp==0&dat$race==3, "NHNAHN",
                                          ifelse(dat$hisp==0&dat$race==4, "NHAsian", 
                                                 ifelse(dat$hisp==0&dat$race==5, "NHAsian", NA)))))))
dat$educ<-recode(dat$educrec2, recodes="10:41='lths'; 42='hs';50:53='somecoll'; 54:60='colledu'; else=NA", as.factor.result=T, levels = c("lths", "hs", "somecoll", "colledu"))
dat$male<-recode(dat$sex, recodes="1=1;2=0")
dat$agec<-cut(dat$age, breaks =seq(15, 100, 5))
dat$currsmoke<-recode(dat$smokestatus2, recodes="0=NA;90=NA; 10:13=1; else=0", as.factor.result = F)
dat$currdrink<-recode(dat$alcstat1, recodes="0='never'; 1:2='former'; 3='current'; else=NA", as.factor.result = T)
dat$birth_year<-ifelse(dat$birthyr>=9997, NA, dat$birthyr)
dat$age2<-dat$age^2
dat<-subset(dat, complete.cases(goodhealth,bmi_clean, educ, currsmoke, currdrink, birth_year))
dat$bmiz<- scale(dat$bmi_clean)
dat$cohort<- cut(dat$birth_year, breaks=seq(1915, 2000, 5))
set.seed(1115)
samps<-sample(1:nrow(dat), size = 75000, replace = F) #take a sample of subjects, because the NHIS is huge.
dat<- dat[samps,]

Below we fit the model using INLA. The APC model basically has three random effects, one for Age, one for period (year of survey) and a third random effect for cohort (birth year). If we have a continuous outcome, our model structure would be :

\(y_{ijkl} = \mu_{jkl} +\sum_c \beta_c x_{ik} + e_{ij}\) with \(\mu_{jkl} = \mu + u_j + v_k + \tau_l\)

\(u_j \sim N(0, \sigma^2_u)\)

\(v_k \sim N(0, \sigma^2_v)\)

\(\tau_l \sim N(0, \sigma^2_\tau)\)

library(lme4)
#user lmer to find good starting values for INLA
fit_in1.mer0<- lmer(bmiz~ race_eth+educ+male+currsmoke+currdrink+agec+(1|year)+(1|cohort), data=dat)
fit_in1.mer1<- lmer(bmiz~ race_eth+educ+male+currsmoke+currdrink+(1|agec)+(1|year)+(1|cohort), data=dat)


#INLA parameterizes variances on the log-precision scale, so if we take log(1/variance), we can get that number and feed it to INLA
vc0<-as.data.frame(VarCorr(fit_in1.mer0))
vc1<-as.data.frame(VarCorr(fit_in1.mer1))
vc0$init<-log(1/vc0$vcov); vc0

##        grp        var1 var2        vcov      sdcor       init
## 1   cohort (Intercept) <NA> 0.001480208 0.03847348 6.51557246
## 2     year (Intercept) <NA> 0.005698834 0.07549062 5.16749364
## 3 Residual        <NA> <NA> 0.920389201 0.95936917 0.08295865

vc1$init<-log(1/vc1$vcov); vc1

##        grp        var1 var2       vcov      sdcor       init
## 1   cohort (Intercept) <NA> 0.00170954 0.04134659 6.37153073
## 2     year (Intercept) <NA> 0.00578871 0.07608357 5.15184586
## 3     agec (Intercept) <NA> 0.04411133 0.21002697 3.12103867
## 4 Residual        <NA> <NA> 0.92036819 0.95935822 0.08298148

#Feed the init values to the hyperparameter inits in each f()
#Age as fixed
fit_in0<- inla(bmiz~ race_eth+educ+male+currsmoke+currdrink+agec+f(year, model="iid",hyper = list(prec=list(initial=vc0$init[2])) )+f(cohort, model="iid",hyper = list(prec=list(initial=vc0$init[1]))), family = "gaussian", data=dat, num.threads = 2, verbose = F,  control.inla=list(strategy='gaussian'), control.family = list(hyper = list(prec=list(initial=vc0$init[3]))), control.compute = list(dic=T))
gc()

##            used  (Mb) gc trigger  (Mb) max used  (Mb)
## Ncells  3556074 190.0    6861544 366.5  6861544 366.5
## Vcells 36082862 275.3   97449684 743.5 90501012 690.5

#Age as unstructured random effect
fit_in1<- inla(bmiz~ race_eth+educ+male+currsmoke+currdrink+f(agec, model="iid",hyper = list(prec=list(initial=vc1$init[3])))+f(year, model="iid",hyper = list(prec=list(initial=vc1$init[2])) )+f(cohort, model="iid",hyper = list(prec=list(initial=vc1$init[1]))), family = "gaussian", data=dat, num.threads = 2, verbose = F,  control.inla=list(strategy='gaussian'), control.family = list(hyper = list(prec=list(initial=vc1$init[4]))), control.compute = list(dic=T))
gc()

##            used  (Mb) gc trigger  (Mb) max used  (Mb)
## Ncells  4458502 238.2    6861544 366.5  6861544 366.5
## Vcells 51561884 393.4  117019620 892.8 95904325 731.7

#Age and year as smooth random effect
fit_in2<- inla(bmiz~ race_eth+educ+male+currsmoke+currdrink+f(agec, model="rw2",hyper = list(prec=list(initial=vc1$init[3])))+f(year, model="rw1",hyper = list(prec=list(initial=vc1$init[2])) )+f(cohort, model="iid",hyper = list(prec=list(initial=vc1$init[1]))), family = "gaussian", data=dat, num.threads = 2, verbose = F,  control.inla=list(strategy='gaussian'), control.family = list(hyper = list(prec=list(initial=vc1$init[4]))), control.compute = list(dic=T))
gc()

##            used  (Mb) gc trigger  (Mb)  max used  (Mb)
## Ncells  5360915 286.4    8273852 441.9   6861544 366.5
## Vcells 67041643 511.5  117019620 892.8 116997713 892.7

summary(fit_in0)

## 
## Call:
## c("inla(formula = bmiz ~ race_eth + educ + male + currsmoke + currdrink + ",  "    agec + f(year, model = \"iid\", hyper = list(prec = list(initial = vc0$init[2]))) + ",  "    f(cohort, model = \"iid\", hyper = list(prec = list(initial = vc0$init[1]))), ",  "    family = \"gaussian\", data = dat, verbose = F, control.compute = list(dic = T), ",  "    control.family = list(hyper = list(prec = list(initial = vc0$init[3]))), ",  "    control.inla = list(strategy = \"gaussian\"), num.threads = 2)" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          2.6701        150.9579          6.6530        160.2810 
## 
## Fixed effects:
##                    mean      sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)     -0.4245  0.0337    -0.4894  -0.4250    -0.3567 -0.4260   0
## race_ethNHAsian -0.4906  0.0189    -0.5278  -0.4906    -0.4535 -0.4906   0
## race_ethNHBlack  0.1762  0.0129     0.1509   0.1762     0.2015  0.1762   0
## race_ethNHNAHN   0.2001  0.0439     0.1138   0.2001     0.2862  0.2001   0
## race_ethNHWhite -0.0924  0.0104    -0.1129  -0.0924    -0.0719 -0.0924   0
## educhs          -0.0360  0.0115    -0.0585  -0.0360    -0.0136 -0.0360   0
## educsomecoll    -0.0512  0.0118    -0.0744  -0.0512    -0.0280 -0.0512   0
## educcolledu     -0.2777  0.0127    -0.3025  -0.2777    -0.2528 -0.2777   0
## male             0.1252  0.0071     0.1112   0.1252     0.1392  0.1252   0
## currsmoke       -0.1662  0.0091    -0.1840  -0.1662    -0.1483 -0.1662   0
## currdrinkformer  0.0658  0.0078     0.0505   0.0658     0.0811  0.0658   0
## currdrinknever   0.0000 31.6228   -62.0862  -0.0009    62.0344  0.0000   0
## agec(20,25       0.2460  0.0237     0.1994   0.2461     0.2923  0.2461   0
## agec(25,30       0.4144  0.0264     0.3622   0.4145     0.4657  0.4147   0
## agec(30,35       0.5038  0.0286     0.4468   0.5041     0.5590  0.5047   0
## agec(35,40       0.5824  0.0298     0.5226   0.5830     0.6394  0.5841   0
## agec(40,45       0.6157  0.0300     0.5549   0.6164     0.6726  0.6179   0
## agec(45,50       0.6442  0.0304     0.5823   0.6450     0.7017  0.6467   0
## agec(50,55       0.6862  0.0312     0.6225   0.6872     0.7449  0.6891   0
## agec(55,60       0.6912  0.0319     0.6259   0.6922     0.7511  0.6943   0
## agec(60,65       0.6895  0.0327     0.6224   0.6905     0.7509  0.6927   0
## agec(65,70       0.6375  0.0330     0.5699   0.6385     0.6999  0.6404   0
## agec(70,75       0.5437  0.0334     0.4760   0.5444     0.6075  0.5458   0
## agec(75,80       0.4378  0.0345     0.3685   0.4383     0.5045  0.4391   0
## agec(80,85       0.1653  0.0345     0.0964   0.1656     0.2326  0.1661   0
## agec(85,90       0.0000 31.6228   -62.0862  -0.0009    62.0344  0.0000   0
## agec(90,95       0.0000 31.6228   -62.0862  -0.0009    62.0344  0.0000   0
## agec(95,100      0.0000 31.6228   -62.0862  -0.0009    62.0344  0.0000   0
## 
## Random effects:
## Name   Model
##  year   IID model 
## cohort   IID model 
## 
## Model hyperparameters:
##                                             mean       sd 0.025quant
## Precision for the Gaussian observations    1.087   0.0057      1.075
## Precision for year                       203.637  77.5473     90.254
## Precision for cohort                    1242.376 896.5609    311.286
##                                         0.5quant 0.975quant    mode
## Precision for the Gaussian observations    1.087      1.097   1.087
## Precision for year                       191.121    389.766 168.000
## Precision for cohort                     999.814   3606.933 676.177
## 
## Expected number of effective parameters(std dev): 49.30(1.534)
## Number of equivalent replicates : 1521.25 
## 
## Deviance Information Criterion (DIC) ...: 206673.42
## Effective number of parameters .........: 51.35
## 
## Marginal log-Likelihood:  -103540.07 
## Posterior marginals for linear predictor and fitted values computed

summary(fit_in1)

## 
## Call:
## c("inla(formula = bmiz ~ race_eth + educ + male + currsmoke + currdrink + ",  "    f(agec, model = \"iid\", hyper = list(prec = list(initial = vc1$init[3]))) + ",  "    f(year, model = \"iid\", hyper = list(prec = list(initial = vc1$init[2]))) + ",  "    f(cohort, model = \"iid\", hyper = list(prec = list(initial = vc1$init[1]))), ",  "    family = \"gaussian\", data = dat, verbose = F, control.compute = list(dic = T), ",  "    control.family = list(hyper = list(prec = list(initial = vc1$init[4]))), ",  "    control.inla = list(strategy = \"gaussian\"), num.threads = 2)" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          2.6171        286.3867          8.0204        297.0242 
## 
## Fixed effects:
##                    mean      sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)      0.0645  0.0610    -0.0562   0.0644     0.1852  0.0644   0
## race_ethNHAsian -0.4905  0.0189    -0.5277  -0.4905    -0.4534 -0.4905   0
## race_ethNHBlack  0.1763  0.0129     0.1510   0.1763     0.2016  0.1763   0
## race_ethNHNAHN   0.2001  0.0439     0.1139   0.2001     0.2863  0.2001   0
## race_ethNHWhite -0.0923  0.0104    -0.1128  -0.0923    -0.0719 -0.0923   0
## educhs          -0.0360  0.0115    -0.0585  -0.0360    -0.0135 -0.0360   0
## educsomecoll    -0.0512  0.0118    -0.0744  -0.0512    -0.0280 -0.0512   0
## educcolledu     -0.2774  0.0127    -0.3022  -0.2774    -0.2525 -0.2774   0
## male             0.1252  0.0071     0.1112   0.1252     0.1392  0.1252   0
## currsmoke       -0.1660  0.0091    -0.1839  -0.1660    -0.1482 -0.1660   0
## currdrinkformer  0.0657  0.0078     0.0504   0.0657     0.0810  0.0657   0
## currdrinknever   0.0000 31.6228   -62.0862  -0.0009    62.0344  0.0000   0
## 
## Random effects:
## Name   Model
##  agec   IID model 
## year   IID model 
## cohort   IID model 
## 
## Model hyperparameters:
##                                             mean       sd 0.025quant
## Precision for the Gaussian observations    1.087   0.0056      1.075
## Precision for agec                        26.037   9.8506     11.469
## Precision for year                       201.452  77.1009     89.280
## Precision for cohort                    1042.177 725.8413    262.906
##                                         0.5quant 0.975quant    mode
## Precision for the Gaussian observations    1.087      1.098   1.087
## Precision for agec                        24.519     49.596  21.660
## Precision for year                       188.793    386.989 165.552
## Precision for cohort                     851.250   2948.947 582.227
## 
## Expected number of effective parameters(std dev): 49.58(1.433)
## Number of equivalent replicates : 1512.59 
## 
## Deviance Information Criterion (DIC) ...: 206671.66
## Effective number of parameters .........: 51.66
## 
## Marginal log-Likelihood:  -103489.79 
## Posterior marginals for linear predictor and fitted values computed

summary(fit_in2)

## 
## Call:
## c("inla(formula = bmiz ~ race_eth + educ + male + currsmoke + currdrink + ",  "    f(agec, model = \"rw2\", hyper = list(prec = list(initial = vc1$init[3]))) + ",  "    f(year, model = \"rw1\", hyper = list(prec = list(initial = vc1$init[2]))) + ",  "    f(cohort, model = \"iid\", hyper = list(prec = list(initial = vc1$init[1]))), ",  "    family = \"gaussian\", data = dat, verbose = F, control.compute = list(dic = T), ",  "    control.family = list(hyper = list(prec = list(initial = vc1$init[4]))), ",  "    control.inla = list(strategy = \"gaussian\"), num.threads = 2)" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          2.4672        606.9693          8.2611        617.6976 
## 
## Fixed effects:
##                    mean      sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)     -0.0743  0.0265    -0.1266  -0.0742    -0.0225 -0.0740   0
## race_ethNHAsian -0.4910  0.0189    -0.5280  -0.4910    -0.4540 -0.4910   0
## race_ethNHBlack  0.1761  0.0128     0.1509   0.1761     0.2013  0.1761   0
## race_ethNHNAHN   0.1994  0.0438     0.1135   0.1994     0.2853  0.1994   0
## race_ethNHWhite -0.0924  0.0104    -0.1128  -0.0924    -0.0720 -0.0924   0
## educhs          -0.0359  0.0114    -0.0583  -0.0359    -0.0135 -0.0359   0
## educsomecoll    -0.0512  0.0118    -0.0743  -0.0512    -0.0281 -0.0512   0
## educcolledu     -0.2773  0.0126    -0.3021  -0.2773    -0.2526 -0.2773   0
## male             0.1252  0.0071     0.1112   0.1252     0.1391  0.1252   0
## currsmoke       -0.1657  0.0091    -0.1835  -0.1657    -0.1480 -0.1657   0
## currdrinkformer  0.0654  0.0078     0.0502   0.0654     0.0807  0.0654   0
## currdrinknever   0.0000 31.6228   -62.0862  -0.0009    62.0344  0.0000   0
## 
## Random effects:
## Name   Model
##  agec   RW2 model 
## year   RW1 model 
## cohort   IID model 
## 
## Model hyperparameters:
##                                             mean       sd 0.025quant
## Precision for the Gaussian observations    1.095   0.0012      1.092
## Precision for agec                       521.098 137.4216    363.681
## Precision for year                      1368.891 357.2214    668.643
## Precision for cohort                     746.105 112.8574    509.518
##                                         0.5quant 0.975quant     mode
## Precision for the Gaussian observations    1.095      1.097    1.095
## Precision for agec                       484.517    878.026  415.435
## Precision for year                      1389.006   2008.814 1574.899
## Precision for cohort                     756.162    943.107  793.667
## 
## Expected number of effective parameters(std dev): 42.27(0.8441)
## Number of equivalent replicates : 1774.16 
## 
## Deviance Information Criterion (DIC) ...: 206659.97
## Effective number of parameters .........: 42.16
## 
## Marginal log-Likelihood:  -103459.54 
## Posterior marginals for linear predictor and fitted values computed

#Posterior summaries for our fixed effects
round(fit_in1$summary.fixed, 3)

##                   mean     sd 0.025quant 0.5quant 0.975quant   mode kld
## (Intercept)      0.064  0.061     -0.056    0.064      0.185  0.064   0
## race_ethNHAsian -0.491  0.019     -0.528   -0.491     -0.453 -0.491   0
## race_ethNHBlack  0.176  0.013      0.151    0.176      0.202  0.176   0
## race_ethNHNAHN   0.200  0.044      0.114    0.200      0.286  0.200   0
## race_ethNHWhite -0.092  0.010     -0.113   -0.092     -0.072 -0.092   0
## educhs          -0.036  0.011     -0.058   -0.036     -0.013 -0.036   0
## educsomecoll    -0.051  0.012     -0.074   -0.051     -0.028 -0.051   0
## educcolledu     -0.277  0.013     -0.302   -0.277     -0.253 -0.277   0
## male             0.125  0.007      0.111    0.125      0.139  0.125   0
## currsmoke       -0.166  0.009     -0.184   -0.166     -0.148 -0.166   0
## currdrinkformer  0.066  0.008      0.050    0.066      0.081  0.066   0
## currdrinknever   0.000 31.623    -62.086   -0.001     62.034  0.000   0

plot( fit_in1$summary.random$agec$mean, type="l",xaxt="n", ylab="Age Random Effect (IID)", main="Age random effect", ylim=c(-1,1))
axis(1, at=1:17, labels=levels(dat$agec))
lines(fit_in1$summary.random$agec$`0.025quant`, col=1, lty=5)
lines(fit_in1$summary.random$agec$`0.975quant`, col=1, lty=5)
abline(h=0, lwd=2)
legend("topright", legend = c("Posterior Mean", "95% Credible Interval"), lty=c(1,5))

plot(fit_in1$summary.random$year$mean, type="l",xaxt="n", ylab="Period Random Effect (IID)", main="Period random effect", ylim=c(-.3, .3))
yrs<-unique(dat$year); yrs<-yrs[order(yrs)]
axis(1, at=1:length(yrs), labels=yrs)
lines(fit_in1$summary.random$year$`0.025quant`, col=1, lty=5)
lines(fit_in1$summary.random$year$`0.975quant`, col=1, lty=5)
abline(h=0, lwd=2)
legend("topright", legend = c("Posterior Mean", "95% Credible Interval"), lty=c(1,5))

plot(fit_in1$summary.random$cohort$mean, type="l", xaxt="n", ylab="Cohort Random Effect (IID)", main="Cohort random effect", ylim=c(-.2, .2))
coh<-unique(dat$cohort); coh<-coh[order(coh)]
axis(1, at=1:length(coh), labels=coh)
lines(fit_in1$summary.random$cohort$`0.025quant`, col=1, lty=5)
lines(fit_in1$summary.random$cohort$`0.975quant`, col=1, lty=5)
abline(h=0, lwd=2)
legend("topright", legend = c("Posterior Mean", "95% Credible Interval"), lty=c(1,5))

Plot the marginals for the variance terms

m1<- inla.tmarginal(
        function(x) 1/x,
        fit_in1$marginals.hyperpar$`Precision for agec`)
m2<- inla.tmarginal(
        function(x) 1/x,
        fit_in1$marginals.hyperpar$`Precision for year`)

m3<- inla.tmarginal(
        function(x) 1/x,
        fit_in1$marginals.hyperpar$`Precision for cohort`)


plot(m1, type="l", main="Posterior distibution for age variance")

plot(m2, type="l", main="Posterior distibution for year variance")

plot(m3, type="l", main="Posterior distibution for cohort variance")

DEM 7263 Bayesian Regression using the INLA Approximation

Corey S. Sparks, Ph.D.

April 5, 2017

The INLA Approach to Bayesian models

Exceedence probabilities

The INLA Approach to Bayesian models

INLA for Non - Aggregate data

Age-period-cohort models