1 Introduction

You should start by reading Section 10 of the textbook.

All of Bayesian inference is based on the posterior distribution of parameters given sample data. Therefore, the main computational task in a Bayesian analysis is computing, or approximating, the posterior distribution. We have seen three ways to compute/approximate the posterior distribution:

  • Math. Only works in a few special cases (e.g., Beta-Binomial, Normal-Normal, conjugate prior situations).
  • Grid approximation. Not feasible when there are many parameters.
  • Simulation. By far the most common method (though we have only seen a very naive approach so far).

In principle, the posterior distribution \(\pi(\theta|y)\) of parameters \(\theta\) given observed data \(y\) can be found by

  • simulating many \(\theta\) values from the prior distribution
  • simulating, for each simulated value of \(\theta\), a \(Y\) value from the corresponding conditional distribution of \(Y\) given \(\theta\) (\(Y\) could be a sample or the value of a sample statistic)
  • discarding \((\theta, Y)\) pairs for which the simulated \(Y\) value is not equal to the observed \(y\) value
  • summarizing the simulated \(\theta\) values for the remaining pairs with \(Y=y\).

However, this is a very computationally inefficient way of approximating the posterior distribution. Unless the sample size is really small, the simulated sample statistic \(Y\) will only match the observed \(y\) value in relatively few samples, simply because in large samples there are just many more possibilities. For example, in 1000 flips of a fair coin, the most likely value of the number of heads is 500, but the probability of exactly 500 heads in 1000 flips is only about 0.025. When there are many possibilities, the probability gets stretched fairly thin. Therefore, if we want, say, 10000 simulated values of \(\theta\) given \(y\), we would first have to simulate many, many more values.
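
For concreteness, here is a minimal sketch of this naive approach in R, using the Beta(2, 1) prior and the 29-successes-in-36-trials data from the exercise later in this lab (the variable names and the number of prior simulations are just illustrative choices):

n_prior_sim <- 1000000                        # number of draws from the prior
theta_sim <- rbeta(n_prior_sim, 2, 1)         # simulate theta from the Beta(2, 1) prior
y_sim <- rbinom(n_prior_sim, 36, theta_sim)   # simulate a y value for each simulated theta
theta_kept <- theta_sim[y_sim == 29]          # keep only pairs matching the observed y = 29
length(theta_kept) / n_prior_sim              # proportion of simulations kept (only a small fraction)
hist(theta_kept, freq = FALSE)                # approximate posterior distribution of theta

dbinom(500, 1000, 0.5)                        # probability of exactly 500 heads in 1000 flips, about 0.025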

The situation is even more extreme when the data is continuous, where the probability of replicating the observed sample is essentially 0.

Therefore, we need more efficient simulation algorithms for approximating posterior distributions. Markov chain Monte Carlo (MCMC) methods provide powerful and widely applicable algorithms for simulating from probability distributions, including complex and high-dimensional distributions. These algorithms include Metropolis-Hastings, Gibbs sampling, and Hamiltonian Monte Carlo, among others. We will see some of the ideas behind MCMC algorithms later. However, we will rely on the software package JAGS to carry out MCMC simulations.

JAGS (“Just Another Gibbs Sampler”) is a standalone program for performing MCMC simulations. JAGS takes as input a Bayesian model description (prior plus likelihood) and data, and returns an MCMC sample from the posterior distribution. JAGS uses a combination of Metropolis sampling, Gibbs sampling, and other MCMC algorithms.

This lab gets you started with the JAGS software. You will code examples in JAGS that we have already seen, so that you can compare the JAGS output with results from the other methods (math, grid approximation).

The purpose of this lab is to just get you up and running with JAGS. We will talk a little bit more about how MCMC works later.

2 Instructions

This RMarkdown file provides a template for you to fill in. Read the file from start to finish, completing the parts as indicated. Some code is provided for you. Be sure to run this code, and make sure you understand what it’s doing. Some blank “code chunks” are provided; you are welcome to add more (Code > Insert chunk or CTRL-ALT-I) as needed. There are also a few places where you should type text responses. Feel free to add additional text responses as necessary.

You can run individual code chunks using the play button. You can use objects defined in one chunk in others. Just keep in mind that chunks are evaluated in order. So if you call something x in one chunk, but redefine x as something else in another chunk, it’s essential that you evaluate the chunks in the proper order.

When you click the Knit button a document will be generated that includes both the content and the output of any embedded R code chunks within the document. When you are finished:

  • Click Knit and check to make sure that you have no errors and everything runs properly. (Fix any problems and redo this step until it works.)
  • Make sure you type your name(s) at the top of the notebook where it says “Type your name(s) here”. If you worked in a team, you will submit a single notebook with both names; make sure both names are included.
  • Submit your completed files in Canvas.

You’ll need a few R packages, which you can install using install.packages.

library(ggplot2)
library(dplyr)
library(knitr)
library(janitor)
library(rjags)
library(bayesplot)
library(runjags)

knitr::opts_chunk$set(
  echo = TRUE,
  warning = FALSE,
  message = FALSE
)

3 Install JAGS

JAGS is a standalone software program that you need to download and install outside of R. Install JAGS using Step 3 of these instructions.

4 Install rjags

JAGS is a separate program that must be installed outside of R. However, we will only interface with JAGS through R/RStudio. To do this, you will need to install the rjags package in R: install.packages("rjags").

You should also install the bayesplot package.

# you'll need to install the packages first
# install.packages("rjags")
# install.packages("bayesplot")

library(rjags)
library(bayesplot)

5 Introduction to JAGS

Read Section 10.1 of the textbook for a brief introduction to JAGS.

6 Exercise: left/right eye and hand dominance

In Lab 1, you used simulation to approximate a posterior distribution. Now you will use JAGS to conduct a similar but much more computationally efficient simulation.

Suppose you’re interested in \(\theta\), the population proportion of Cal Poly students whose dominant eye (right or left) is the same as their dominant hand (right or left).

Assume a Beta(2, 1) prior for \(\theta\). (This is similar to the prior used in parts 3-6 of Lab 1.)

In a sample of 36 students, 29 students have the same eye and hand dominance (right eye and right hand). (This is based on my STAT 217 class in Fall 2021, but you can assume it’s a random sample.)

To do:

  1. Code and run the model in JAGS, and produce summaries of the posterior distribution of \(\theta\) given the sample data.
  2. Compare to the results from Lab 1. Is the JAGS output reasonably consistent with the previous results?
library(rjags)
library(bayesplot)
library(runjags) 
n = 36 # sample size
y = 29 # number of successes

model_string <- "model{

  # Likelihood
  y ~ dbinom(theta, n)

  # Prior
  theta ~ dbeta(alpha, beta)
  alpha <- 2 # prior successes
  beta <- 1 # prior failures

}"

dataList = list(y = y, n = n)

model <- jags.model(file = textConnection(model_string), 
                    data = dataList)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 1
##    Unobserved stochastic nodes: 1
##    Total graph size: 5
## 
## Initializing model
update(model, n.iter = 1000)

Nrep = 10000 # number of values to simulate

posterior_sample <- coda.samples(model,
                       variable.names = c("theta"),
                       n.iter = Nrep)
summary(posterior_sample)
## 
## Iterations = 2001:12000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 10000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean             SD       Naive SE Time-series SE 
##      0.7934053      0.0642869      0.0006429      0.0008769 
## 
## 2. Quantiles for each variable:
## 
##   2.5%    25%    50%    75%  97.5% 
## 0.6524 0.7530 0.7983 0.8403 0.9051
plot(posterior_sample)

library(bayesplot)
mcmc_hist(posterior_sample)
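
The simulated values can also be extracted from the coda object and summarized directly. Here is a minimal sketch (theta_sim is just an illustrative name), which also checks the simulation against the exact Beta(2 + 29, 1 + 7) posterior available from the math approach:

theta_sim <- as.matrix(posterior_sample)[, "theta"]  # simulated theta values as a plain vector

mean(theta_sim)                          # posterior mean
sd(theta_sim)                            # posterior SD
quantile(theta_sim, c(0.025, 0.975))     # central 95% posterior credible interval

# Exact Beta(2 + 29, 1 + 36 - 29) = Beta(31, 8) posterior for comparison
31 / (31 + 8)                            # exact posterior mean
qbeta(c(0.025, 0.975), 31, 8)            # exact central 95% credible interval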

Type your response here. The simulated posterior distribution of theta from JAGS is extremely similar to the simulated posterior distribution from Lab 1. Both are centered around 0.8 and appear to have similar spreads as well. Thus, the JAGS output is reasonably consistent with the previous results of Lab 1.

Hints:

  • You should be able to follow the code in Section 10.1 pretty closely.
  • The main output of JAGS will be values simulated from the posterior distribution. You can experiment with different ways of summarizing them (bayesplot, DBDA2E functions, creating your own plots/summaries). But your summaries should include the usual: plot, posterior mean, posterior SD, posterior credible intervals.

7 Exercise: body temperatures

You’ll use JAGS to perform a Bayesian analysis of the body temperature problem from Lab 3. Recall the assumptions:

  • body temperatures (degrees Fahrenheit) of healthy adult humans follow a Normal distribution with unknown mean \(\theta\) and known standard deviation \(\sigma=1\).
  • the prior distribution of \(\theta\) is a Normal distribution with mean 98.6 and standard deviation 0.3 (degrees Fahrenheit).

In a recent study, the sample mean body temperature in a sample of 208 healthy adults was 97.7 degrees F. Here is the sample data on the 208 temperature measurements. Notice that the sample mean is 97.7 and the sample SD is 1 (consistent with the assumed population SD of \(\sigma = 1\)).

data = read.csv("temperature.csv")
y = data$temperature

hist(y, freq = FALSE)

summary(y)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   95.05   97.06   97.60   97.70   98.32  100.88
sd(y)
## [1] 0.9999203
n = length(y)
ybar = mean(y)
y
##   [1]  98.44  99.40  99.07  97.32  99.90  97.33  97.21  96.30  97.00  99.05
##  [11]  96.57  97.61  97.56  98.58  97.83  98.94  97.27  97.10  97.42  97.36
##  [21]  95.86  97.49  96.87  98.82  98.25  98.68  96.35  97.11  99.71  97.98
##  [31]  99.36  96.45  99.11  98.53  97.70  95.80  98.12  97.33  96.50  97.49
##  [41]  97.04  96.09  97.56  99.35  96.86  98.14  97.47  97.94  97.13  95.82
##  [51]  98.24  99.35  97.64  97.98  96.01  98.20  98.50  97.04  95.41  99.81
##  [61]  98.13  97.13  98.51  96.62  98.22  98.83  97.19  98.82  96.90  99.70
##  [71]  96.32  99.05  99.22  97.28  98.10  96.96  98.02  98.00  98.72  98.68
##  [81]  98.03  96.99  98.22  99.66  97.27  99.99  98.74  98.12  96.45  98.49
##  [91]  97.92  96.30  98.67  95.75  97.39  97.06  97.17  98.55  97.24  99.06
## [101]  98.59  96.63  97.58  98.30  96.73  97.43  97.38  98.72  97.26  97.85
## [111]  98.27  96.49  96.99  99.15  98.61  97.68  97.10  97.49  95.95  96.39
## [121]  97.82  95.05  97.59  98.29  97.31  98.31  97.54  97.32  97.86  97.52
## [131]  98.84  96.71  98.17  97.60  97.20  96.57  98.72  97.98  96.99  97.71
## [141]  98.78  96.96  98.54  98.25  96.68  98.03  97.19  96.21  98.15  98.16
## [151]  98.52  98.23  97.29  95.24  97.39  97.56  97.55  97.54  95.90  98.15
## [161]  97.58  97.10  96.48 100.19  97.65  96.79  99.61  97.58  97.30  97.72
## [171]  97.74  98.20  98.97  98.22  97.89  98.66  98.35  98.20  96.85  97.11
## [181]  95.77  96.64  97.79  96.62  97.71  97.10  96.57  96.77  96.93  96.75
## [191]  96.50  96.92  97.25  99.06  97.37  98.44 100.88  98.38  97.73  97.60
## [201]  97.82  97.06  97.43  98.48  98.20  98.60  97.82  97.39

To do:

  1. Code and run the model in JAGS, and produce summaries of the posterior distribution of \(\theta\) given the sample data.
  2. Use simulation to approximate the posterior predictive distribution of body temperatures.
  3. Compare to the results from Lab 3. Is the JAGS output reasonably consistent with the previous results?
library(rjags)
library(bayesplot)
library(runjags) 


model_string <- "model{

  # Likelihood
  # (JAGS dnorm is parameterized by mean and precision = 1 / variance)
  for (i in 1:n){
    y[i] ~ dnorm(theta, 1 / sigma ^ 2)
  }
  sigma <- 1 # known population SD

  # Prior: Normal with mean 98.6 and SD 0.3
  theta ~ dnorm(mu0, 1 / sd0 ^ 2)
  mu0 <- 98.6 # prior mean
  sd0 <- 0.3 # prior SD

}"

dataList = list(y = y, n = n)

# Compile
model <- jags.model(file = textConnection(model_string), 
                    data = dataList)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 208
##    Unobserved stochastic nodes: 1
##    Total graph size: 212
## 
## Initializing model
update(model, n.iter = 1000)

Nrep = 10000 # number of values to simulate

posterior_sample <- coda.samples(model,
                       variable.names = c("theta"),
                       n.iter = Nrep)
summary(posterior_sample)
## 
## Iterations = 1001:11000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 10000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean             SD       Naive SE Time-series SE 
##      9.770e+01      6.925e-02      6.925e-04      6.925e-04 
## 
## 2. Quantiles for each variable:
## 
##  2.5%   25%   50%   75% 97.5% 
## 97.57 97.66 97.71 97.75 97.84
plot(posterior_sample)

library(bayesplot)
mcmc_hist(posterior_sample)
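
For to-do item 2, JAGS has already produced simulated values of \(\theta\) from the posterior, so the posterior predictive distribution of body temperatures can be approximated with one more simulation step in R. A minimal sketch, assuming the Normal(\(\theta\), \(\sigma = 1\)) data model stated above (theta_sim and y_pred are just illustrative names):

theta_sim <- as.matrix(posterior_sample)[, "theta"]   # simulated theta values from JAGS

# For each simulated theta, simulate one body temperature from Normal(theta, sd = 1)
y_pred <- rnorm(length(theta_sim), mean = theta_sim, sd = 1)

hist(y_pred, freq = FALSE)               # approximate posterior predictive distribution
quantile(y_pred, c(0.025, 0.975))        # 95% posterior prediction interval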

Type your response here. After approximating the posterior distribution of theta using JAGS, I can see that the results are extremely similar to those from Lab 3. Both have posterior means of around 97.7, and the 25th and 75th percentiles are very similar as well: 97.66 and 97.75 for the JAGS simulation versus 97.70007 and 97.79121 for the Normal-Normal results in Lab 3. Thus, we can conclude that the JAGS output is reasonably consistent with the previous results from Lab 3.
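
For reference, this is a conjugate Normal-Normal model, so the exact posterior can be computed and compared with the JAGS output. A minimal sketch using the standard Normal-Normal updating formulas, the stated prior (mean 98.6, SD 0.3), the known \(\sigma = 1\), and the n and ybar values computed earlier (the other variable names are just illustrative):

prior_prec <- 1 / 0.3 ^ 2            # prior precision = 1 / (prior SD)^2
data_prec <- n / 1 ^ 2               # precision contributed by the data (sigma = 1)
post_prec <- prior_prec + data_prec

post_mean <- (prior_prec * 98.6 + data_prec * ybar) / post_prec  # precision-weighted average
post_sd <- sqrt(1 / post_prec)

c(post_mean, post_sd)                # exact posterior mean and SD of theta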

Hints:

  • The data consists of 208 individual temperature measurements, so make sure you read Section 10.1.7 on loading data as individual values.
  • Also make sure the likelihood reflects the data, i.e., individual values.
  • The JAGS code in the textbook is for Beta-Binomial type models. The main modifications to the JAGS code will be to implement a Normal-Normal type model. But you might need to make other small modifications.
  • The main output of JAGS will be values simulated from the posterior distribution. You can experiment with different ways of summarizing them (bayesplot, DBDA2E functions, creating your own plots/summaries). But your summaries should include the usual: plot, posterior mean, posterior SD, posterior credible intervals.
  • JAGS simulates values of \(\theta\) from the posterior distribution. Once you have these values, you can run posterior predictive simulation as usual. That is, if you think of posterior predictive simulation as a two-stage process, then JAGS takes care of the first stage for you.

8 Reflection

  1. Write a few sentences summarizing one important concept you have learned in this lab.
  2. What is (at least) one question that you still have?

TYPE YOUR RESPONSE HERE. 1) One important concept I’ve learned in this lab is that there are many different methods for simulating a Bayesian posterior distribution, not just one unique way to do it. I also learned some helpful syntax and feel fairly confident about how to apply and use JAGS. 2) What are the advantages of using JAGS over the other techniques we have used in class thus far (e.g., grid approximation)?