Introduction

In the previous examples, we saw how to use Stan to do posterior inference for regression models. The way Stan does this is through sampling from the posterior distribution. This can be an accurate way to learn about the posterior, but it can be slow and you are not guaranteed to have a model that reaches convergence.

Alternative strategies exist for doing posterior inference for regression models. These are generally classified as approximations to the posterior, since they use numerical methods versus sampling to arrive at the estimates of the posterior.

Stan - approximations

Stan does two forms of this type of approximation in its routines. These are meanfield and the full rank methods, and are classified as variational Bayes methods. You can find a description of these methods here.

There is another strategy that uses the Laplace approximation to the posterior for all model parameters. It was published specifically to deal with models that are classified as Gaussian Markov Random Field models. Basically, this means a model with Gaussian random effects that can be structured, meaning correlated over time or space or both.

This method was published by Rue, Martino & Chopin (2009) who developed much about what we know about latent Gaussian models.

Bayesian analysis principles

Need to get at:

*\[p(\theta|y) \propto p(y|\theta)p(\theta)\]

Traditionally, we would get \(p(\theta|y)\) by:
- either figuring out what the full conditionals for all our model parameters are (hard)
- Use some form of MCMC to arrive at the posterior marginal distributions for our parameters (time consuming)
Integrated Nested Laplace Approximation - Rue, Martino & Chopin (2009)
One of several techniques that approximate the marginal and conditional posterior densities
- Laplace, PQL, E-M, Variational Bayes
Assumes all random effects in the model are latent, zero-mean Gaussian random field, \(x\) with some precision matrix
- The precision matrix depends on a small set of hyperparameters
Attempts to construct a joint Gaussian approximation for \(p(x | \theta, y)\)
- where \(\theta\) is a small subset of hyper-parameters
Apply these approximations to arrive at:

\[\tilde{\pi}(x_i | y) = \int \tilde{\pi}(x_i |\theta, y)\tilde{\pi}(\theta| y) d\theta\]

\[\tilde{\pi}(\theta_j | y) = \int \tilde{\pi}(\theta| y) d\theta_{-j}\]

where each \(\tilde{\pi}(. |.)\) is an approximated conditional density of its parameters
Approximations to \(\pi(x_i | y)\) are computed by approximating both \(\pi(\theta| y)\) and \(\pi(x_i| \theta, y)\) using numerical integration to integrate out the nuisance parameters.
This is possible if the dimension of \(\theta\) is small.
Approximations to \(\tilde{\pi}(\theta|y)\) are based on the Laplace appoximation of the marginal posterior density for \(\pi(x,\theta|y)\)
Their approach relies on numerical integration of the posterior of the latent field, as opposed to a pure Gaussian approximation of it

Long and short of it…

Is that you can get estimates and predictions from Bayesian models faster, with the same results (See De Smedt et al, 2015 and Carroll et al, 2015 for examples).

The author of one of our books, Julian Farayway has a good site where he documents how to use INLA for various things here. He also has a R library, brinla that has useful functions for working with INLA output on Github

Libraries

library (car)

## Loading required package: carData

library(INLA)

## Loading required package: Matrix

## Loading required package: sp

## This is INLA_18.07.12 built 2018-07-12 11:05:18 UTC.
## See www.r-inla.org/contact-us for how to get help.

library(haven)
library(ggplot2)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following object is masked from 'package:car':
## 
##     recode

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(brinla)

Load data and recode some variables:

eclsk11<-read_sas("~/Google Drive/classes/dem7473/data/childk4p.sas7bdat")
names(eclsk11)<-tolower(names(eclsk11))
#get out only the variables I'm going to use for this example

myvars<-c("childid", "x_chsex_r", "x1locale", "x_raceth_r", "x2povty", "x12par1ed_i","p1curmar", "x1htotal",
          "x1mscalk4", "x2mscalk4", "x3mscalk4", "x4mscalk4","x5mscalk4","x6mscalk4","x7mscalk4","x8mscalk4",
          "p1hscale", "p2hscale", "p4hscale","p6hscale","p7hscale","p8hscale",
           "x12sesl","x4sesl_i", "p2parct1", "p2parct2", "s1_id", "p2safepl","x2krceth","p1o2near", "x_distpov" ,
          "w8cf8p_80","w8cf8p_8psu","w8cf8p_8str","x2fsstat2","x4fsstat2", "x1height","x2height","x3height","x4height","x5height","x6height","x7height","x8height",
        "x1kage_r","x2kage_r","x3age","x4age", "x5age", "x6age", "x7age", "x8age", "w1c0","w1c0psu","w1c0str", "w1p0", "w4c4p_20", "w4cf4p_2str", "w4cf4p_2psu")


#subset the data
eclsk.sub<-eclsk11[,myvars]

rm(eclsk11); gc()

##           used  (Mb) gc trigger   (Mb)  max used   (Mb)
## Ncells 1909536 102.0    3808665  203.5   3808665  203.5
## Vcells 4414432  33.7  681440989 5199.0 783705057 5979.2

Longitudinal variables

#First we recode some Child characteristics
#Child's sex: recode as male =1
eclsk.sub$male<-Recode(eclsk.sub$x_chsex_r, recodes="1=1; 2=0; -9=NA")

#Recode race with white, non Hispanic as reference using dummy vars
eclsk.sub$hisp<-Recode(eclsk.sub$x_raceth_r, recodes="3:4=1;-9=NA; else=0")
eclsk.sub$black<-Recode(eclsk.sub$x_raceth_r, recodes="2=1;-9=NA; else=0")
eclsk.sub$asian<-Recode(eclsk.sub$x_raceth_r, recodes="5=1;-9=NA; else=0")
eclsk.sub$nahn<-Recode(eclsk.sub$x_raceth_r, recodes="6:7=1;-9=NA; else=0")
eclsk.sub$other<-Recode(eclsk.sub$x_raceth_r, recodes="8=1;-9=NA; else=0")


#Longitudinal variables
#recode our outcomes, the  first is the child's math standardized test score  in Kindergarten
eclsk.sub$math1<-ifelse(eclsk.sub$x1mscalk4<0, NA, eclsk.sub$x1mscalk4)
eclsk.sub$math2<-ifelse(eclsk.sub$x2mscalk4<0, NA, eclsk.sub$x2mscalk4)
eclsk.sub$math4<-ifelse(eclsk.sub$x4mscalk4<0, NA, eclsk.sub$x4mscalk4)
eclsk.sub$math6<-ifelse(eclsk.sub$x6mscalk4<0, NA, eclsk.sub$x6mscalk4)
eclsk.sub$math7<-ifelse(eclsk.sub$x7mscalk4<0, NA, eclsk.sub$x7mscalk4)
eclsk.sub$math8<-ifelse(eclsk.sub$x8mscalk4<0, NA, eclsk.sub$x8mscalk4)

#Second outcome is child's height for age, continuous outcome
eclsk.sub$height1<-ifelse(eclsk.sub$x1height<=-7, NA, eclsk.sub$x1height)
eclsk.sub$height2<-ifelse(eclsk.sub$x2height<=-7, NA, eclsk.sub$x2height)
eclsk.sub$height4<-ifelse(eclsk.sub$x4height<=-7, NA, eclsk.sub$x4height)
eclsk.sub$height6<-ifelse(eclsk.sub$x6height<=-7, NA, eclsk.sub$x6height)
eclsk.sub$height7<-ifelse(eclsk.sub$x7height<=-7, NA, eclsk.sub$x7height)
eclsk.sub$height8<-ifelse(eclsk.sub$x8height<=-7, NA, eclsk.sub$x8height)

#Age at each wave
eclsk.sub$age_yrs1<-ifelse(eclsk.sub$x1kage_r<0, NA, eclsk.sub$x1kage_r/12)
eclsk.sub$age_yrs2<-ifelse(eclsk.sub$x2kage_r<0, NA, eclsk.sub$x2kage_r/12)
#eclsk.sub$age_yrs3<-ifelse(eclsk.sub$x3age<0, NA, eclsk.sub$x3age/12)
eclsk.sub$age_yrs4<-ifelse(eclsk.sub$x4age<0, NA, eclsk.sub$x4age/12)
eclsk.sub$age_yrs6<-ifelse(eclsk.sub$x6age<0, NA, eclsk.sub$x6age/12)
eclsk.sub$age_yrs7<-ifelse(eclsk.sub$x7age<0, NA, eclsk.sub$x7age/12)
eclsk.sub$age_yrs8<-ifelse(eclsk.sub$x8age<0, NA, eclsk.sub$x8age/12)



eclsk.sub<- eclsk.sub[is.na(eclsk.sub$age_yrs1)==F, ]

#Height for age z score standardized by sex and age
eclsk.sub$height_z1<-ave(eclsk.sub$height1, as.factor(paste(round(eclsk.sub$age_yrs1, 1.5), eclsk.sub$male)), FUN=scale)
eclsk.sub$height_z2<-ave(eclsk.sub$height2, as.factor(paste(round(eclsk.sub$age_yrs2, 1.5), eclsk.sub$male)), FUN=scale)
#eclsk.sub$height_z3<-ave(eclsk.sub$height3, as.factor(paste(round(eclsk.sub$age_yrs3, 1.5), eclsk.sub$male)), FUN=scale)
eclsk.sub$height_z4<-ave(eclsk.sub$height4, as.factor(paste(round(eclsk.sub$age_yrs4, 1.5), eclsk.sub$male)), FUN=scale)
eclsk.sub$height_z6<-ave(eclsk.sub$height6, as.factor(paste(round(eclsk.sub$age_yrs6, 1.5), eclsk.sub$male)), FUN=scale)
eclsk.sub$height_z7<-ave(eclsk.sub$height7, as.factor(paste(round(eclsk.sub$age_yrs7, 1.5), eclsk.sub$male)), FUN=scale)
eclsk.sub$height_z8<-ave(eclsk.sub$height8, as.factor(paste(round(eclsk.sub$age_yrs8, 1.5), eclsk.sub$male)), FUN=scale)



#Child health assessment Excellent to poor , ordinal outcome
eclsk.sub$chhealth1<-ifelse(eclsk.sub$p1hscale<0, NA, eclsk.sub$p1hscale)
eclsk.sub$chhealth2<-ifelse(eclsk.sub$p2hscale<0, NA, eclsk.sub$p2hscale)
eclsk.sub$chhealth4<-ifelse(eclsk.sub$p4hscale<0, NA, eclsk.sub$p4hscale)
eclsk.sub$chhealth6<-ifelse(eclsk.sub$p6hscale<0, NA, eclsk.sub$p6hscale)
eclsk.sub$chhealth7<-ifelse(eclsk.sub$p7hscale<0, NA, eclsk.sub$p7hscale)
eclsk.sub$chhealth8<-ifelse(eclsk.sub$p8hscale<0, NA, eclsk.sub$p8hscale)


eclsk.sub$badhealth2<-ifelse(eclsk.sub$chhealth2 >2, 1,0)

#SES
eclsk.sub$hhses1<-ifelse(eclsk.sub$x12sesl==-9, NA, scale(eclsk.sub$x12sesl))
eclsk.sub$hhses4<-ifelse(eclsk.sub$x4sesl_i==-9, NA, scale(eclsk.sub$x4sesl_i))


#Household food insecurity, dichotomous outcome
#This outcome is only present at two waves
eclsk.sub$foodinsec1<-Recode(eclsk.sub$x2fsstat2, recodes="2:3=1; 1=0; else=NA")
eclsk.sub$foodinsec2<-Recode(eclsk.sub$x4fsstat2, recodes="2:3=1; 1=0; else=NA")

Non time varying variables

#First we recode some Child characteristics
#Child's sex: recode as male =1
eclsk.sub$male<-Recode(eclsk.sub$x_chsex_r, recodes="1=1; 2=0; -9=NA")

#Recode race with white, non Hispanic as reference using dummy vars
eclsk.sub$hisp<-Recode(eclsk.sub$x_raceth_r, recodes="3:4=1;-9=NA; else=0")
eclsk.sub$black<-Recode(eclsk.sub$x_raceth_r, recodes="2=1;-9=NA; else=0")
eclsk.sub$asian<-Recode(eclsk.sub$x_raceth_r, recodes="5=1;-9=NA; else=0")
eclsk.sub$nahn<-Recode(eclsk.sub$x_raceth_r, recodes="6:7=1;-9=NA; else=0")
eclsk.sub$other<-Recode(eclsk.sub$x_raceth_r, recodes="8=1;-9=NA; else=0")


#Then we recode some parent/mother characteristics
#Mother's education, recode as 2 dummys with HS = reference
eclsk.sub$lths<-Recode(eclsk.sub$x12par1ed_i, recodes = "0:2=1; 3:8=0; else = NA")
eclsk.sub$gths<-Recode(eclsk.sub$x12par1ed_i, recodes = "1:3=0; 4:8=1; else =NA") 

#marital status, recode as 2 dummys, ref= married
eclsk.sub$single<-Recode(eclsk.sub$p1curmar, recodes="4=1; -7:-9=NA; else=0")
eclsk.sub$notmar<-Recode(eclsk.sub$p1curmar, recodes="2:3=1; -7:-9=NA; else=0")


#Then we do some household level variables

#Urban school location = 1
eclsk.sub$urban<-Recode(eclsk.sub$x1locale, recodes = "1:3=1; 4=0; -1:-9=NA")

#poverty level in poverty = 1
eclsk.sub$pov<-Recode(eclsk.sub$x2povty , recodes ="1:2=1; 3=0; -9=NA")

#Household size
eclsk.sub$hhsize<-eclsk.sub$x1htotal

#school % minority student body
eclsk.sub$minorsch<-ifelse(eclsk.sub$x2krceth <0, NA, eclsk.sub$x2krceth/10)

#Unsafe neighborhood
eclsk.sub$unsafe<-Recode(eclsk.sub$p2safepl , recodes = "1:2='unsafe'; 3='safe'; else=NA", as.factor = T)

#school district poverty
eclsk.sub$dist_pov<-ifelse(eclsk.sub$x_distpov==-9, NA, scale(eclsk.sub$x_distpov))


#mean centered weights for wave 1
eclsk.sub$wts1<-eclsk.sub$w4c4p_20/mean(eclsk.sub$w4c4p_20, na.rm=T)
#eclsk.sub$wtsp1<-eclsk.sub$w1p0/mean(eclsk.sub$w1p0, na.rm=T)

Basic INLA regression models

The basic multiple regression model can be done by:

inla.setOption("enable.inla.argument.weights", TRUE)

mod1<-inla(height_z2~chhealth1+foodinsec1+hhsize+hhses1,weights = eclsk.sub$wts1,
           data=eclsk.sub,
           control.compute = list(waic=T),
           family = "gaussian",
           num.threads = 2)

summary(mod1)

## 
## Call:
## c("inla(formula = height_z2 ~ chhealth1 + foodinsec1 + hhsize + ",  "    hhses1, family = \"gaussian\", data = eclsk.sub, weights = eclsk.sub$wts1, ",  "    control.compute = list(waic = T), num.threads = 2)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          3.0620         21.9952          0.9440         26.0011 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0885 0.0210     0.0472   0.0885     0.1298  0.0885   0
## chhealth1   -0.0228 0.0096    -0.0417  -0.0228    -0.0040 -0.0228   0
## foodinsec1  -0.0798 0.0263    -0.1315  -0.0798    -0.0281 -0.0798   0
## hhsize      -0.0082 0.0047    -0.0174  -0.0082     0.0009 -0.0082   0
## hhses1       0.0424 0.0103     0.0223   0.0424     0.0625  0.0424   0
## 
## The model has no random effects
## 
## Model hyperparameters:
##                                          mean     sd 0.025quant 0.5quant
## Precision for the Gaussian observations 1.038 0.0116      1.016    1.038
##                                         0.975quant  mode
## Precision for the Gaussian observations      1.061 1.038
## 
## Expected number of effective parameters(std dev): 5.10(0.0012)
## Number of equivalent replicates : 2351.76 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 43942.65
## Effective number of parameters .................: 11.05
## 
## Marginal log-Likelihood:  -22012.38 
## Posterior marginals for linear predictor and fitted values computed

bri.fixed.plot(mod1)

The model summary includes the posterior mean point estimates and the 95% credible intervals for each parameter.

Robust regression in INLA

Here I use a student - t distribution with 3 degrees of freedom instead of the Normal distribution.

mod2<-inla(height_z2~chhealth1+foodinsec1+hhsize+hhses1,weights = eclsk.sub$wts1,
           data=eclsk.sub,
           family = "t",
           control.compute = list(waic=T),
           control.family=list(hyper=list(dof = list(initial=3, fixed=T) )),
           num.threads = 2)

summary(mod2)

## 
## Call:
## c("inla(formula = height_z2 ~ chhealth1 + foodinsec1 + hhsize + ",  "    hhses1, family = \"t\", data = eclsk.sub, weights = eclsk.sub$wts1, ",  "    control.compute = list(waic = T), control.family = list(hyper = list(dof = list(initial = 3, ",  "        fixed = T))), num.threads = 2)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.4169         56.1848          1.6995         59.3011 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)  0.0851 0.0212     0.0435   0.0851     0.1267  0.0851   0
## chhealth1   -0.0227 0.0096    -0.0416  -0.0227    -0.0038 -0.0227   0
## foodinsec1  -0.0803 0.0264    -0.1322  -0.0803    -0.0285 -0.0803   0
## hhsize      -0.0082 0.0047    -0.0175  -0.0082     0.0010 -0.0082   0
## hhses1       0.0442 0.0103     0.0240   0.0442     0.0644  0.0442   0
## 
## The model has no random effects
## 
## Model hyperparameters:
##                                           mean    sd 0.025quant 0.5quant
## precision for the student-t observations 1.016 0.012     0.9927    1.016
##                                          0.975quant  mode
## precision for the student-t observations       1.04 1.016
## 
## Expected number of effective parameters(std dev): 5.099(0.0011)
## Number of equivalent replicates : 2352.27 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 44071.54
## Effective number of parameters .................: 11.13
## 
## Marginal log-Likelihood:  -22076.73 
## Posterior marginals for linear predictor and fitted values computed

bri.fixed.plot(mod2)

Logistic regression in INLA

The basic logistic regression model in INLA is fit using the inla() command, with family = "binomial", Ntrials = 1

mod3<-inla(badhealth2~male+foodinsec1+hhsize,weights = eclsk.sub$wts1,
           data=eclsk.sub,
           family = "binomial", Ntrials = 1,
           control.compute = list(waic=T),
           num.threads = 2)

summary(mod3)

## 
## Call:
## c("inla(formula = badhealth2 ~ male + foodinsec1 + hhsize, family = \"binomial\", ",  "    data = eclsk.sub, weights = eclsk.sub$wts1, Ntrials = 1, ",  "    control.compute = list(waic = T), num.threads = 2)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          2.2709         14.4266          0.5982         17.2958 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -1.9870 0.0710    -2.1273  -1.9867    -1.8484 -1.9861   0
## male         0.0036 0.0525    -0.0994   0.0035     0.1065  0.0035   0
## foodinsec1   0.7622 0.0662     0.6314   0.7625     0.8914  0.7630   0
## hhsize      -0.0072 0.0142    -0.0350  -0.0072     0.0206 -0.0072   0
## 
## The model has no random effects
## 
## The model has no hyperparameters
## 
## Expected number of effective parameters(std dev): 4.009(0.00)
## Number of equivalent replicates : 2494.43 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 9935.71
## Effective number of parameters .................: 7.914
## 
## Marginal log-Likelihood:  -4984.91 
## Posterior marginals for linear predictor and fitted values computed

bri.fixed.plot(mod3)

Basic random intercept model in INLA

Hierarhical models are easily estimated in INLA. The basic way this is done is by specifying the random effect using f() in the model. For example a random effect for each school can be done using f(s1_id, model="iid"). The model="iid" part tells INLA that these are uncorrelated random effects. There are lots of different types of random effect models we can specify using INLA.

Adding a random slope is also easy.

mod4<-inla(badhealth2~male+foodinsec1+hhsize+f(s1_id, model="iid"),weights = eclsk.sub$wts1,
           data=eclsk.sub,
           family = "binomial", Ntrials = 1, #Ntrials=1 for bernoulli, logistic regression
           control.compute = list(waic=T),
           num.threads = 2)

summary(mod4)

## 
## Call:
## c("inla(formula = badhealth2 ~ male + foodinsec1 + hhsize + f(s1_id, ",  "    model = \"iid\"), family = \"binomial\", data = eclsk.sub, weights = eclsk.sub$wts1, ",  "    Ntrials = 1, control.compute = list(waic = T), num.threads = 2)" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.2790         86.6469          1.2676         89.1935 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -2.1896 0.0803    -2.3485  -2.1892    -2.0332 -2.1884   0
## male         0.0114 0.0546    -0.0958   0.0114     0.1186  0.0114   0
## foodinsec1   0.6898 0.0708     0.5500   0.6901     0.8281  0.6906   0
## hhsize      -0.0023 0.0146    -0.0310  -0.0023     0.0265 -0.0023   0
## 
## Random effects:
## Name   Model
##  s1_id   IID model 
## 
## Model hyperparameters:
##                      mean     sd 0.025quant 0.5quant 0.975quant  mode
## Precision for s1_id 1.978 0.2425      1.556    1.959      2.508 1.924
## 
## Expected number of effective parameters(std dev): 349.15(20.98)
## Number of equivalent replicates : 28.64 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 9772.10
## Effective number of parameters .................: 496.24
## 
## Marginal log-Likelihood:  -4889.18 
## Posterior marginals for linear predictor and fitted values computed

bri.fixed.plot(mod4)

bri.hyperpar.plot(mod4)

Random slopes model - correlated with intercept

This is the Multivariate normal model for random effects, you have to make a copy of the random effect and specify the model as iid2d, meaning a 2-dimensional multivariate normal random effect distribution.

eclsk.sub2<-filter(eclsk.sub, complete.cases(badhealth2))
nwnsch<-table(eclsk.sub2$s1_id) #number of kids within each school
eclsk.sub2$sch<-rep(1:length(unique(eclsk.sub2$s1_id)), nwnsch) #make a number indicating which school each kid  is in
eclsk.sub2$sch2<- eclsk.sub2$sch #copy it
n<- length(unique(eclsk.sub2$s1_id)) #get number of unique schools, so we know how big the random effect vectors have to be


mod5<-inla(badhealth2~male+foodinsec1+hhsize+f(sch, n=2*n, model="iid2d")+f( sch2,male, copy="sch"),weights = eclsk.sub2$wts1,
           data=eclsk.sub2,
           family = "binomial", Ntrials = 1, #Ntrials=1 for bernoulli, logistic regression
           control.compute = list(waic=T),
           num.threads = 2)

summary(mod5)

## 
## Call:
## c("inla(formula = badhealth2 ~ male + foodinsec1 + hhsize + f(sch, ",  "    n = 2 * n, model = \"iid2d\") + f(sch2, male, copy = \"sch\"), ",  "    family = \"binomial\", data = eclsk.sub2, weights = eclsk.sub2$wts1, ",  "    Ntrials = 1, control.compute = list(waic = T), num.threads = 2)" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          2.2183         94.3348          0.5822         97.1353 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -2.0323 0.0745    -2.1795  -2.0319    -1.8870 -2.0313   0
## male        -0.1570 0.0588    -0.2728  -0.1569    -0.0421 -0.1567   0
## foodinsec1   0.8024 0.0695     0.6652   0.8026     0.9381  0.8031   0
## hhsize      -0.0124 0.0146    -0.0411  -0.0124     0.0164 -0.0124   0
## 
## Random effects:
## Name   Model
##  sch   IID2D model 
## sch2   Copy 
## 
## Model hyperparameters:
##                                    mean     sd 0.025quant 0.5quant
## Precision for sch (component 1)  6.3737 0.8595     4.8749   6.3066
## Precision for sch (component 2)  4.3152 3.3097     0.7900   3.4514
## Rho1:2 for sch                  -0.0043 0.3105    -0.5885  -0.0068
##                                 0.975quant    mode
## Precision for sch (component 1)      8.249  6.1663
## Precision for sch (component 2)     12.969  2.0488
## Rho1:2 for sch                       0.590 -0.0187
## 
## Expected number of effective parameters(std dev): 321.31(22.28)
## Number of equivalent replicates : 31.12 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 9898.46
## Effective number of parameters .................: 474.20
## 
## Marginal log-Likelihood:  -4921.75 
## Posterior marginals for linear predictor and fitted values computed

bri.fixed.plot(mod5)

bri.hyperpar.plot(mod5)

random slopes - uncorrelated with random intercepts

If the way above seemed confusing, you can also make the random effects uncorrelated by just specifying two iid models, one for the intecept and one for the slope of male between schools. According to the WAIC, this model fits better.

mod6<-inla(badhealth2~male+foodinsec1+hhsize+f(sch, model="iid")+f( sch2,male, model="iid"),weights = eclsk.sub2$wts1,
           data=eclsk.sub2,
           family = "binomial", Ntrials = 1, #Ntrials=1 for bernoulli, logistic regression
           control.compute = list(waic=T),
           num.threads = 2)

summary(mod6)

## 
## Call:
## c("inla(formula = badhealth2 ~ male + foodinsec1 + hhsize + f(sch, ",  "    model = \"iid\") + f(sch2, male, model = \"iid\"), family = \"binomial\", ",  "    data = eclsk.sub2, weights = eclsk.sub2$wts1, Ntrials = 1, ",  "    control.compute = list(waic = T), num.threads = 2)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.3988        331.6369          0.6368        333.6725 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -2.1066 0.0794    -2.2638  -2.1062    -1.9520 -2.1052   0
## male        -0.1869 0.0704    -0.3271  -0.1862    -0.0507 -0.1847   0
## foodinsec1   0.8276 0.0716     0.6864   0.8278     0.9675  0.8282   0
## hhsize      -0.0125 0.0149    -0.0418  -0.0126     0.0168 -0.0126   0
## 
## Random effects:
## Name   Model
##  sch   IID model 
## sch2   IID model 
## 
## Model hyperparameters:
##                     mean     sd 0.025quant 0.5quant 0.975quant  mode
## Precision for sch  2.949 0.5021      2.111    2.897      4.078 2.786
## Precision for sch2 1.516 0.2829      1.047    1.485      2.156 1.423
## 
## Expected number of effective parameters(std dev): 491.23(33.11)
## Number of equivalent replicates : 20.36 
## 
## Watanabe-Akaike information criterion (WAIC) ...: 9820.68
## Effective number of parameters .................: 685.92
## 
## Marginal log-Likelihood:  -4901.87 
## Posterior marginals for linear predictor and fitted values computed

bri.fixed.plot(mod6)

bri.hyperpar.plot(mod6)

DEM 7473 - Bayesian Regression using the INLA Approach

Corey Sparks, Ph.D.

October 29, 2018