The purpose of this document is to describe the mechanics of using JAGS, with simple models. To understand how JAGS works internally, or Bayesian analysis in general, you will have to look elsewhere. I also do not cover model diagnostics and the like here, as they are better described in other resources.
The following are excellent starting points:
http://www.johnmyleswhite.com/notebook/2010/08/20/using-jags-in-r-with-the-rjags-package/
http://www.sumsar.net/blog/2014/01/bayesian-first-aid/
https://sites.google.com/site/doingbayesiandataanalysis/what-s-new-in-2nd-ed
http://www.stat.columbia.edu/~gelman/arm/
JAGS uses a dialect of the BUGS language, so the models will be similar to what you might write for Stan or BUGS, although there are some differences. No attempt was made to use anything but the simplest constructs, and we do not use many of the options available in JAGS; check the JAGS and rjags documentation for those.
The general format of each section is similar: we describe a problem, generate some data, build a JAGS model, run it, and look at some of the output.
We load rjags, and ggmcmc, which provides easy access to the results returned by JAGS.
library(rjags)
## Loading required package: coda
## Linked to JAGS 3.4.0
## Loaded modules: basemod,bugs
library(ggmcmc)
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
##
## Loading required package: tidyr
## Loading required package: ggplot2
In this section we look at one of the simplest possible examples, that of estimating the mean of a sample.
First, we generate some fake data
set.seed(1)
n <- 10
mu <- 10
s <- 5
x <- rnorm(n, mu, s)
Note that we could have generated data from any distribution we like; the normal was chosen here for simplicity, and because everybody is familiar with it.
Next we build the JAGS model
modelString <- "model
{
for (i in 1:n)
{
x[i] ~ dnorm(mu, tau)
}
mu ~ dnorm(0, 0.0001)
tau <- pow(sigma, -2)
sigma ~ dunif(0, 1000)
}"
Note that JAGS parameterizes the normal with the precision (tau = 1/sigma^2), rather than the variance or standard deviation. This is the reason for the tau <-> sigma shuffle you will see throughout this document.
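To make the relationship concrete, here is a small R illustration (not part of the JAGS model) of converting between a precision and a standard deviation:
tau <- 0.04            # a precision
sigma <- 1 / sqrt(tau) # the equivalent standard deviation, here 5
dnorm(12, mean = 10, sd = sigma) # R's dnorm takes the sd; JAGS's dnorm takes tau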
We set up the model with 4 chains
model <- jags.model(textConnection(modelString),
data = list('x' = x, 'n' = n),
n.chains = 4)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 19
##
## Initializing model
and then we sample from the model
samples <- coda.samples(model, c('mu'), 5000)
look at some summary info
plot(samples)
summary(samples)
##
## Iterations = 1001:6000
## Thinning interval = 1
## Number of chains = 4
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## 10.63979 1.50587 0.01065 0.01050
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## 7.632 9.720 10.640 11.570 13.626
and extract some of the data for further analysis
s1 <- samples[[1]] #chain 1
plot(s1[1:length(s1)], pch=19)
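The ggmcmc package loaded at the top gives another convenient route: ggs converts the coda samples into a long data frame (one row per chain, iteration, and parameter), which its plotting helpers and ggplot2 can use directly. For example:
ggd <- ggs(samples)   # long-format data frame of the samples
ggs_traceplot(ggd)    # trace plot of mu, per chain
ggs_density(ggd)      # posterior density of mu, per chain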
This example is similar to the previous one, but now we are also interested in the standard deviation. The only thing that changes is the outputs we ask for and get.
set.seed(1)
n <- 10
mu <- 10
s <- 5
x <- rnorm(n, mu, s)
modelString <- "model
{
for (i in 1:n)
{
x[i] ~ dnorm(mu, tau)
}
mu ~ dnorm(0, 0.0001)
tau <- pow(sigma, -2)
sigma ~ dunif(0, 1000)
}"
model <- jags.model(textConnection(modelString),
data = list('x' = x, 'n' = n),
n.chains = 4)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 19
##
## Initializing model
samples <- coda.samples(model, c('mu','sigma'), 5000)
plot(samples)
summary(samples)
##
## Iterations = 1001:6000
## Thinning interval = 1
## Number of chains = 4
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mu 10.664 1.506 0.010647 0.01062
## sigma 4.579 1.336 0.009444 0.01665
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mu 7.692 9.735 10.660 11.581 13.704
## sigma 2.789 3.661 4.325 5.209 7.861
s1 <- samples[[1]] #chain 1
s2 <- s1[,2] #get the second var, i.e. sigma
plot(s2[1:length(s2)], pch=19)
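Note that, since mcmc objects carry variable names, we could also have extracted sigma by name rather than by position:
s2 <- s1[, "sigma"]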
This example is similar to the previous one, but now we add a few outliers to the data.
set.seed(1)
n <- 50
mux <- 10
s <- 3
x <- rnorm(n, mux, s)
x[23]<-50 # inject a few outliers
x[35]<-50
x[45]<-50
If we use the same model as before, we get a fairly bad fit to the generating model, because the data are now pretty far from normal.
modelString <- "model
{
for (i in 1:n)
{
x[i] ~ dnorm(mu, tau)
}
mu ~ dnorm(0, 0.0001)
tau <- pow(sigma, -2)
sigma ~ dunif(0, 1000)
}"
model <- jags.model(textConnection(modelString),
data = list('x' = x, 'n' = n),
n.chains = 4)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 59
##
## Initializing model
samples <- coda.samples(model, c('mu','sigma'), 5000)
plot(samples)
summary(samples)
##
## Iterations = 1001:6000
## Thinning interval = 1
## Number of chains = 4
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mu 12.81 1.429 0.010108 0.01009
## sigma 10.06 1.064 0.007522 0.01034
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mu 9.989 11.852 12.80 13.77 15.62
## sigma 8.256 9.298 9.95 10.71 12.40
For example, the standard deviation is estimated at about 10.06, rather than 3, and the mean at about 12.81, rather than 10.
One of the nice things about Bayesian analysis and these tools is the ability to easily build more complicated models. If we instead model the data as having come from a t distribution, and also estimate the degrees-of-freedom parameter of that t distribution, we get much better results.
modelString <- "model
{
for (i in 1:n) {
x[i] ~ dt(mu, tau,nu)
}
mu ~ dnorm(0, 0.0001)
tau <- pow(sigma, -2)
nu <- nuMinusOne+1
nuMinusOne ~ dexp(1.0/29)
sigma ~ dunif(0, 1000)
}"
model <- jags.model(textConnection(modelString), data = list('x' = x, 'n' = n), n.chains = 4)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 64
##
## Initializing model
samples <- coda.samples(model, c('mu','sigma','nu'), 5000)
plot(samples)
summary(samples)
##
## Iterations = 1001:6000
## Thinning interval = 1
## Number of chains = 4
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mu 10.765 0.3826 0.002705 0.003256
## nu 1.599 0.3908 0.002764 0.004317
## sigma 1.929 0.3464 0.002449 0.003573
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mu 10.003 10.508 10.768 11.023 11.510
## nu 1.046 1.308 1.537 1.817 2.541
## sigma 1.349 1.683 1.900 2.136 2.696
Here the mean and standard deviation are much better, at 10.765 and 1.929 respectively, and the degrees-of-freedom parameter of the t distribution is estimated as 1.599. Note that as nu gets large, the t distribution converges to the normal, so such a small value for nu says the tails of this data's distribution are heavier than normal, indicating the presence of outliers.
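A quick base-R sketch makes the heavy-tails point visually: the standard t with a small number of degrees of freedom puts much more mass in the tails than the standard normal.
curve(dnorm(x), from = -6, to = 6, ylab = "density")   # standard normal
curve(dt(x, df = 2), add = TRUE, lty = 2)              # t with 2 df: heavier tails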
More details about this, and about the nuMinusOne trick, are available in the references mentioned above. Note also that the same technique for robustness to outliers can be applied to the more sophisticated models that follow.
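As a rough illustration of the nuMinusOne trick: the dexp(1.0/29) prior, shifted up by one, keeps nu above 1 and gives it a prior mean of about 30, so the model can move between heavy-tailed (small nu) and effectively normal (large nu) behaviour. We can sketch the implied prior in plain R:
nuPrior <- rexp(100000, rate = 1/29) + 1  # same shifted exponential as nuMinusOne + 1
mean(nuPrior)                             # close to 30
hist(nuPrior, breaks = 100)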
Here we have paired samples and are interested in the distribution of the differences. We use almost the same code, with our individual measurements being z & y, and their difference being x, which is then input to the same model as above.
set.seed(1)
n <- 10
mux <- 10
muy <- 12
s <- 5
z <- rnorm(n, mux, s)
y <- rnorm(n, muy, s)
x <- z-y
modelString <- "model
{
for (i in 1:n)
{
x[i] ~ dnorm(mu, tau)
}
mu ~ dnorm(0, 0.0001)
tau <- pow(sigma, -2)
sigma ~ dunif(0, 1000)
}"
model <- jags.model(textConnection(modelString),
data = list('x' = x, 'n' = n),
n.chains = 4)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 19
##
## Initializing model
samples <- coda.samples(model, c('mu','sigma'), 5000)
plot(samples)
summary(samples)
##
## Iterations = 1001:6000
## Thinning interval = 1
## Number of chains = 4
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mu -2.566 2.957 0.02091 0.02098
## sigma 9.024 2.597 0.01836 0.03382
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mu -8.599 -4.379 -2.564 -0.7511 3.44
## sigma 5.537 7.227 8.512 10.2298 15.44
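Because the samples are just draws from the posterior, further questions can be answered directly from them. For example, pooling the 4 chains, we can estimate the posterior probability that the mean difference is negative, and a 95% credible interval (a small sketch):
pooled <- as.matrix(samples)               # stack the chains into one matrix
mean(pooled[, "mu"] < 0)                   # posterior probability the mean difference is negative
quantile(pooled[, "mu"], c(0.025, 0.975))  # 95% credible interval for the mean difference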
This is similar to the previous examples, but here we have two samples with different numbers of observations, and they are not paired. We are interested in how the two distributions differ, in particular in the differences of their means & standard deviations.
set.seed(1)
nx <- 100
ny <- 50
mux <- 10
muy <- 12
s <- 5
x <- rnorm(nx, mux, s)
y <- rnorm(ny, muy, s)
modelString <- "model {
for(i in 1:nx) {
x[i] ~ dnorm( mux , taux)
}
for(i in 1:ny) {
y[i] ~ dnorm( muy , tauy)
}
mux ~ dnorm( 0 , 1/1000 )
taux <- 1/pow( sigmax , 2 )
sigmax ~ dunif( 0 , 1000 )
muy ~ dnorm( 0 , 1/1000 )
tauy <- 1/pow( sigmay , 2 )
sigmay ~ dunif( 0 , 1000 )
mudiff <- mux - muy
sigmadiff <- sigmax - sigmay
}"
dataList <- list( x = x, y = y, nx = nx, ny = ny)
model <- jags.model(textConnection(modelString), data = dataList, n.chains = 3)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 167
##
## Initializing model
update(model, 1000) # burn-in: discard the first 1000 iterations
params <- c("mux", "muy", "mudiff", "sigmax", "sigmay", "sigmadiff")
samples <- coda.samples(model, params, n.iter=5000)
plot(samples)
summary(samples)
##
## Iterations = 2001:7000
## Thinning interval = 1
## Number of chains = 3
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## mudiff -0.69867 0.8028 0.006555 0.006555
## mux 10.53650 0.4574 0.003734 0.003734
## muy 11.23518 0.6600 0.005389 0.005389
## sigmadiff -0.07243 0.5862 0.004787 0.006269
## sigmax 4.54554 0.3325 0.002715 0.003445
## sigmay 4.61796 0.4817 0.003933 0.005031
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## mudiff -2.276 -1.238 -0.69950 -0.1526 0.8545
## mux 9.644 10.229 10.53970 10.8456 11.4373
## muy 9.951 10.795 11.22743 11.6777 12.5382
## sigmadiff -1.300 -0.445 -0.04456 0.3232 1.0129
## sigmax 3.950 4.312 4.52879 4.7562 5.2519
## sigmay 3.788 4.281 4.57648 4.9117 5.6708
In this section, we do a Bayesian regression, i.e. we obtain posterior distributions for the coefficients of the regression function.
set.seed(1)
n <- 100
a <- 3
b <- 0.5
s <- 25
x <- runif(n, 0, 100)
y <- a * x + b + rnorm(n, 0, s)
plot(x, y, pch=19)
modelString <- "model
{
for (i in 1:n)
{
y[i] ~ dnorm(mu[i], tau)
mu[i] <- a * x[i] + b
}
a ~ dnorm(0, 1/1000)
b ~ dnorm(0, 1/1000)
tau <- pow(sigma, -2)
sigma ~ dunif(0, 1000)
}"
dataList <- list( x = x, y = y, n = n)
model <- jags.model(textConnection(modelString), data = dataList, n.chains = 3)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 411
##
## Initializing model
update(model, 1000)
params <- c("a", "b", "sigma")
samples <- coda.samples(model, params, n.iter=5000)
plot(samples)
summary(samples)
##
## Iterations = 2001:7000
## Thinning interval = 1
## Number of chains = 3
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## a 3.075 0.08907 0.0007273 0.002135
## b -3.780 5.16341 0.0421590 0.122905
## sigma 23.814 1.72986 0.0141243 0.018742
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## a 2.90 3.015 3.074 3.1333 3.252
## b -13.95 -7.226 -3.742 -0.3542 6.195
## sigma 20.70 22.622 23.707 24.9109 27.433
Next we do a fancy plot. We randomly select 100 of the posterior samples and plot the regression line implied by each. We overlay the actual data, as well as the line that actually generated the data.
ggd <- ggs(samples)
a2 <- ggd$value[ which(ggd$Parameter == "a") ]
b2 <- ggd$value[ which(ggd$Parameter == "b") ]
d2 <- data.frame(a2, b2)
d <- data.frame(x, y)
linesToPlot <- 100
d3 <- d2[sample(nrow(d2),linesToPlot),]
ggplot(d, aes(x = x, y = y)) +
geom_abline(intercept = b, slope = a, color = "red") +
geom_abline(aes(intercept = b2, slope = a2), data = d3, color = "blue", alpha = 0.05) +
theme_bw() +
geom_point()
In this section, we do a hierarchical regression. Seriously, see the book by Gelman & Hill for more info. The point here is to demonstrate the mechanics of how to do it.
set.seed(1)
n <- 100
G <- 5
a <- rnorm(G,5,1)
b <- rnorm(G,-3,1)
s <- 2
x1 <- runif(n, 0, 100)
y1 <- a[1] * x1 + b[1] + rnorm(n, 0, s)
f1<-rep(1,n)
x<-x1
y<-y1
f<-f1
for (j in 2:G){
x1 <- runif(n, 0, 100)
y1 <- a[j] * x1 + b[j] + rnorm(n, 0, s)
f1<-rep(j,n)
x<-c(x,x1)
y<-c(y,y1)
f<-c(f,f1)
}
N <- length(x)
modelString <- "model
{
for (i in 1:N)
{
y[i] ~ dnorm(mu[i], tau)
mu[i] <- a[f[i]] * x[i] + b[f[i]]
}
for (j in 1:G)
{
a[j] ~ dnorm(mua, tauc)
b[j] ~ dnorm(mub, tauc)
}
mua ~ dnorm(0, 0.0001)
mub ~ dnorm(0, 0.0001)
tau <- pow(sigma, -2)
sigma ~ dunif(0, 1000)
tauc <- pow(sigmac, -2)
sigmac ~ dunif(0, 1000)
}"
dataList <- list( x = x, y = y, f=f, N = N, G = G)
model <- jags.model(textConnection(modelString), data = dataList, n.chains = 3)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 2523
##
## Initializing model
update(model, 1000)
params <- c("a", "b", "sigma",'mua','mub','sigmac')
samples <- coda.samples(model, params, n.iter=5000)
plot(samples)
summary(samples)
##
## Iterations = 2001:7000
## Thinning interval = 1
## Number of chains = 3
## Sample size per chain = 5000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## a[1] 4.3716 0.007243 5.913e-05 0.0001602
## a[2] 5.1935 0.006569 5.363e-05 0.0001382
## a[3] 4.1732 0.007040 5.748e-05 0.0001537
## a[4] 6.5904 0.006536 5.337e-05 0.0001365
## a[5] 5.3210 0.006794 5.547e-05 0.0001351
## b[1] -3.8001 0.406815 3.322e-03 0.0089677
## b[2] -3.0037 0.394462 3.221e-03 0.0084004
## b[3] -2.9776 0.405114 3.308e-03 0.0086804
## b[4] -2.2646 0.385704 3.149e-03 0.0085357
## b[5] -3.1146 0.370697 3.027e-03 0.0073706
## mua 5.1325 0.469493 3.833e-03 0.0038155
## mub -3.0339 0.502968 4.107e-03 0.0060330
## sigma 2.0466 0.065567 5.354e-04 0.0005899
## sigmac 0.9919 0.329775 2.693e-03 0.0052103
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## a[1] 4.3576 4.3667 4.3716 4.377 4.386
## a[2] 5.1803 5.1892 5.1936 5.198 5.207
## a[3] 4.1595 4.1684 4.1732 4.178 4.187
## a[4] 6.5775 6.5861 6.5904 6.595 6.603
## a[5] 5.3076 5.3163 5.3210 5.326 5.334
## b[1] -4.5945 -4.0764 -3.7961 -3.519 -3.008
## b[2] -3.7684 -3.2664 -3.0047 -2.744 -2.221
## b[3] -3.7778 -3.2492 -2.9734 -2.705 -2.189
## b[4] -3.0132 -2.5231 -2.2686 -2.010 -1.501
## b[5] -3.8367 -3.3669 -3.1181 -2.864 -2.386
## mua 4.2096 4.8465 5.1302 5.409 6.075
## mub -4.0281 -3.3498 -3.0430 -2.719 -2.041
## sigma 1.9232 2.0009 2.0444 2.090 2.182
## sigmac 0.5614 0.7658 0.9264 1.143 1.804
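Since we simulated the data ourselves, we can line the posterior means of the group-level slopes and intercepts up against the true values in a and b (a small sketch; the indexing assumes the parameter ordering shown in the summary above):
est <- summary(samples)$statistics[, "Mean"]
cbind(trueSlope = a, estSlope = est[1:G],
      trueIntercept = b, estIntercept = est[(G + 1):(2 * G)])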
Hopefully this document has made clear how easily complicated models can be described, fit, and analyzed with JAGS, and will encourage the reader to follow up with the references above.