First, lets get some prerequisites out of the way.
require(rjags)
## Loading required package: rjags
## Loading required package: coda
## Loading required package: lattice
## Linked to JAGS 3.3.0
## Loaded modules: basemod,bugs
require(coda)
Create some simulated data:
set.seed(432104)
n <- 1000
x <- rnorm(n, 0, 5)
Most JAGS
/BUGS
examples save the probability model in a separate file and then read it in in the call to JAGS
. Below we’ll use the function textConnection
to pass a string object as if it were a text file. The benefit is we can keep the probability model inline with the rest of our R code which helps makes this document self-contained).
model1.string <-"
model {
for (i in 1:N){
x[i] ~ dnorm(mu, tau)
}
mu ~ dnorm(0,.0001)
tau <- pow(sigma, -2)
sigma ~ dunif(0,100)
}
"
model1.spec<-textConnection(model1.string)
Now we pass rajgs
the model, data, and other options to run four MCMC chains to estimate the paramters of interest in the the model. The update step is unnecessary and I’m just making sure it works. The jags.samples
function samples from the samples and calculates means.
jags <- jags.model(model1.spec,
data = list('x' = x,
'N' = n),
n.chains=4,
n.adapt=100)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 1009
##
## Initializing model
update(jags, 1000)
jags.samples(jags,
c('mu', 'tau'),
1000)
## $mu
## mcarray:
## [1] 0.04152
##
## Marginalizing over: iteration(1000),chain(4)
##
## $tau
## mcarray:
## [1] 0.03854
##
## Marginalizing over: iteration(1000),chain(4)
Here’s an example from Chapter 5 fo Gelman and Hill. Data is simple: change in SAT average SAT scores at 8 schools as well as the standerd devations of the difference in the mean SAT scores. The question Gelman is interested in answering is: do different SAT training programs have a positive effect on SAT scores?
Here’s the data, where \(\sigma\) (sigma
) is the standard deviation and schoolobs
are the observed change in means, \(\mu\):
sigma <- c(15,10,16,11, 9,11,10,18)
schoolobs <- c(28,8, -3, 7,-1, 1,18,12)
Gelman’s recommendation is a multi-level model where mean training effect is drawn from a mean zero normal distribtion, a school’s indiviudal effect is drawn from normal distribution with with the mean equal to the mean training effect, and the observed scores (realizations) are then drawn from a distribtion paramaterized with the mean given by the individual school effect. The model is given by:
model.sat.text<-"
model {
for(i in 1:N) {
schoolmean[i] ~ dnorm(mu,itau)
thes[i] <- 1/pow(sigma[i],2)
schoolobs[i] ~ dnorm(schoolmean[i],thes[i])
}
mu ~ dnorm(0,alpha)
alpha <- .01
itau ~ dgamma(1e-3,pow(15,2)*1e-3)
tau <- pow(1/itau,1/2)
}
"
model.sat.spec<-textConnection(model.sat.text)
This probability model can be represented graphically like so (note that the y1
are the observed schools difference in means, schoolobs
):
require(igraph)
## Loading required package: igraph
gr<-graph.formula("N(0,0.01)"-+"mu",
"mu"-+"N(0,1/tau)",
"N(0,1/tau)"-+"m1",
"N(0,1/tau)"-+"m2",
"N(0,1/tau)"-+"m8",
"m1"-+"N(0,1/simga21)",
"m2"-+"N(0,1/simga22)",
"m8"-+"N(0,1/simga28)",
"N(0,1/simga21)"-+"y1",
"N(0,1/simga22)"-+"y2",
"N(0,1/simga28)"-+"y8")
lo<-data.frame(x=c(2,2,2,1,2,3,1,2,3,1,2,3),y=c(6,5,4,3,3,3,2,2,2,1,1,1))
plot(gr,
layout=layout.reingold.tilford(gr),
edge.arrow.size=.25
)
To estimate the unobserved parameters we send the model and data it to JAGS
and get the MCMC chains back.
sat.jags <- jags.model(model.sat.spec,
data=list('sigma'=sigma,
'schoolobs'=schoolobs,
'N'=length(schoolobs)
),
n.adapt = 1000)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 49
##
## Initializing model
#run with jags.samples
samps.jags <- jags.samples(sat.jags,
c('mu','tau'),
n.iter=10000,
thin=10
)
samps.jags
## $mu
## mcarray:
## [1] 6.284
##
## Marginalizing over: iteration(1000),chain(1)
##
## $tau
## mcarray:
## [1] 3.426
##
## Marginalizing over: iteration(1000),chain(1)
#same thing but return a coda MCMC object
samps.coda <- coda.samples(sat.jags,
c('mu','tau', 'schoolmean'),
n.iter=10000,
thin=10
)
head(samps.coda)
## [[1]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 10010
## End = 10070
## Thinning interval = 10
## mu schoolmean[1] schoolmean[2] schoolmean[3] schoolmean[4]
## [1,] 1.5970 3.3191 2.15458 0.55528 0.1701
## [2,] -0.4155 0.9781 0.19525 -1.08118 -1.2155
## [3,] -0.1734 -0.8012 -0.04613 0.02149 -0.1771
## [4,] 1.6738 25.6627 13.60300 8.96727 -4.5142
## [5,] 6.4025 6.5380 1.41094 8.76639 1.8506
## [6,] 12.1820 15.1568 7.07287 10.84956 3.1356
## [7,] 16.7768 22.7075 13.78895 17.89082 9.5781
## schoolmean[5] schoolmean[6] schoolmean[7] schoolmean[8] tau
## [1,] 1.4440 2.4067 0.4271 3.66363 1.8163
## [2,] -3.0866 -1.9588 1.4212 -1.09447 2.4566
## [3,] -0.4967 -0.5765 -0.1417 0.04553 0.7958
## [4,] 4.1873 5.7488 3.5961 9.56629 8.0728
## [5,] 0.3272 4.0743 8.9044 7.14496 3.2625
## [6,] 4.7446 12.9049 10.0564 8.24538 5.1041
## [7,] 5.6913 5.2256 25.4259 14.47439 7.2656
##
## attr(,"class")
## [1] "mcmc.list"
summary(samps.coda)
##
## Iterations = 10010:20000
## Thinning interval = 10
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mu 6.76 3.87 0.122 0.204
## schoolmean[1] 8.26 5.62 0.178 0.241
## schoolmean[2] 6.90 4.92 0.156 0.224
## schoolmean[3] 6.16 5.42 0.171 0.218
## schoolmean[4] 6.90 5.02 0.159 0.222
## schoolmean[5] 5.68 4.88 0.154 0.229
## schoolmean[6] 6.11 5.09 0.161 0.228
## schoolmean[7] 8.14 5.17 0.163 0.254
## schoolmean[8] 7.08 5.50 0.174 0.230
## tau 3.34 3.31 0.105 0.155
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mu -0.777 4.30 6.82 9.32 14.0
## schoolmean[1] -0.905 4.82 7.71 11.24 20.9
## schoolmean[2] -2.786 3.96 6.80 9.92 16.2
## schoolmean[3] -5.461 3.27 6.38 9.39 16.6
## schoolmean[4] -2.655 3.74 6.73 9.92 17.5
## schoolmean[5] -5.665 2.90 5.82 8.81 14.7
## schoolmean[6] -5.350 3.18 6.33 9.37 15.5
## schoolmean[7] -0.803 4.67 7.63 11.33 19.2
## schoolmean[8] -3.012 3.81 7.01 9.77 18.7
## tau 0.377 1.08 2.23 4.46 12.0
plot(samps.coda[[1]][,c("mu","tau")])
plot(samps.coda[[1]][,2:5])
plot(samps.coda[[1]][,6:9])