Introduction

In this tutorial, you will learn how to perform some simple pair-wise meta-analyses using Bayesian methods. The platform that you will use is R with the JAGS program installed. RStudio is the interface for R which allows you to write your codes and compile them into the console. For this tutorial, you will need to have installed R, RStudio, JAGS, and the various packages associated with JAGS such as rjags, R2jags, mcmcplot, code, and boa.

Example 1 – Aspirin meta-analysis using a fixed effects model

Read-in the data

When you read your data, make sure you are using the correct working directory. You can change your working directory using the setwd command. In this tutorial am using my working directory to allow the codes to work. Make sure that you replace the directory path that I’m using with your directory path when you perform these tutorials for yourself.

I have saved the data as data1.csv. Please use this data for this example. The data can be located here: https://github.com/mbounthavong/Bayesian-meta-analysis

##### Clear data
rm(list=ls())

#### I commented out the system that I am not on. In this case, I commented out the PC system for the Mac.
##### For mac:
setwd("/Users/mbounthavong/Dropbox/UW Courses/PHARM 536 -- Beth's lecture/Beths Winbugs examples")

##### For PC:
# setwd("D:\\mboun\\Dropbox\\UW Courses\\PHARM 536 -- Beth's lecture\\Beths Winbugs examples")

##### Read the data into R.
##### The data is saved as a CSV file.
data = read.csv("data1.csv")
head(data) # Shows the first six entries
##   N          y         V
## 1 1  0.3289011 0.0388957
## 2 2  0.3845458 0.0411673
## 3 3  0.2195622 0.0204915
## 4 4  0.2222206 0.0647646
## 5 5  0.2254672 0.0351996
## 6 6 -0.1246363 0.0096167

Defining values

After you read-in your data, we will define some values that will be used in the JAGS model. These values are necessary for the JAGS model to run. These include the initial values for the priors and the headers for the data columns.

##### Values for simulation
N <- length(data$y) # There are 7 total studies. 

##### load libraries
library(rjags)
library(coda)

##### now prepare dat for JAGS
## N is the number of entries (e.g., 7)
## y is the outcome in the data (e.g., ln(OR))
## V is the variance in the data
dat <- list("N" = N, "y" = data$y, "V" = data$V)  # names list of numbers

##### Initial values
inits <- list( d = 0.0 )

Enter your model

Next, you will need to define your JAGS model using the BUGS syntax. This can be done in two ways:

  1. Using a text editor or

  2. Embed the BUGS model into R.

For this example, I will embed the BUGS model into R. We will save this as “aspirinFE.txt.” The d parameter will be given a normal distribution with mean 0 and precision 1E-5.

##### define JAGS model within R

cat("model
    {

    for ( i in 1:N ) {
    
    P[i] <- 1/V[i]
    
    y[i] ~ dnorm( d, P[i] )

    }

    ### Define the priors
    d ~ dnorm( 0, 0.00001 )

    ### Transform the ln(OR) to OR
    OR <- exp( d )

    }", file="aspirinFE.txt")

Set up the JAGS model

Use the following syntax to set up the JAGS model. Notice that the number of chains is set to “1.” We are going to only use a single chain for our MCMC simulations. The number of adaptions is set at 500. This is done to improve the calibation of the MCMC and improve the mixing of the chain.

#### Set up the JAGS model and settings
jags.m <- jags.model( file = "aspirinFE.txt", data=dat, inits=inits, n.chains=1, n.adapt=500 )
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 7
##    Unobserved stochastic nodes: 1
##    Total graph size: 33
## 
## Initializing model

Specify the parameters to be monitored

These are the parameters that the output will provide for us. Review the BUGS model and notice that the output parameters only include d and OR.

## specify parameters to be monitored
params <- c("d", "OR")

Run JAGS and save the posterior samples

We will save the coda samples as “samps” and iterate this MCMC for 10,000 iterations. The more iterations your run, the longer it will take to complete.

## run JAGS and save posterior samples
samps <- coda.samples( jags.m, params, n.iter=10000 )

Summarize the posteriors samples with a burn-in of 5000 iterations

We need to have a burn-in period to eliminate any poor mixing of the MCMC at the beginning of the simulation. The usual number of burn-ins is approximately 5,000.

## summarize posterior samples
summary(samps)
## 
## Iterations = 1:10000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 10000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##      Mean      SD  Naive SE Time-series SE
## OR 1.1153 0.03701 0.0003701      0.0003701
## d  0.1086 0.03318 0.0003318      0.0003318
## 
## 2. Quantiles for each variable:
## 
##       2.5%     25%    50%    75%  97.5%
## OR 1.04453 1.09005 1.1148 1.1398 1.1897
## d  0.04356 0.08622 0.1087 0.1309 0.1737
summary(window(samps, start=5001))  # Burn in of 5000. Start at 5001
## 
## Iterations = 5001:10000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 5000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##      Mean      SD  Naive SE Time-series SE
## OR 1.1154 0.03694 0.0005224      0.0005224
## d  0.1087 0.03312 0.0004684      0.0004684
## 
## 2. Quantiles for each variable:
## 
##       2.5%     25%    50%    75%  97.5%
## OR 1.04453 1.09018 1.1153 1.1395 1.1892
## d  0.04356 0.08634 0.1091 0.1306 0.1733

Plot distributions of the parameters

Plot the posterior distributions of the parameters. We are looking for ditributions that are stable and not erratic.

plot(samps)

Example #2 – Aspirin meta-analysis using a random effects model

In this example, we will evaluate the same meta-analysis, but we will use a random effects model instead of a fixed effects model. In random effects model, there is heterogeneity in the outcomes across the different studies. In order to address this, we need to include a new parameter into the meta-analysis called tau.

Read-in the data

I have saved the data as data1.csv. Please use this data for this example.

##### Clear the dataset
rm(list=ls())

##### For mac:
setwd("/Users/mbounthavong/Dropbox/UW Courses/PHARM 536 -- Beth's lecture/Beths Winbugs examples")

##### For PC:
# setwd("D:\\mboun\\Dropbox\\UW Courses\\PHARM 536 -- Beth's lecture\\Beths Winbugs examples")

##### Read the data into R.
##### The data is saved as a CSV file.
data = read.csv("data1.csv")
head(data) # Shows the first six entries
##   N          y         V
## 1 1  0.3289011 0.0388957
## 2 2  0.3845458 0.0411673
## 3 3  0.2195622 0.0204915
## 4 4  0.2222206 0.0647646
## 5 5  0.2254672 0.0351996
## 6 6 -0.1246363 0.0096167

Define the values for the model.

With a random effects model, the prior will need to have initial values. These are usually zero unless you have some idea of what the values should be. For this example, we will set the initial values to zero. Only tau has an initial value of 1. Since there are 7 studies in the meta-analysis, there must be 7 initial values for delta.

##### Values for simulation
N <- length(data$y) # There are 7 total studies. 

##### load libraries
library(rjags)
library(coda)

##### now prepare dat for JAGS
## N is the number of entries (e.g., 7)
## y is the outcome in the data (e.g., ln(OR))
## V is the variance in the data
dat <- list("N" = N, "y" = data$y, "V" = data$V)  # names list of numbers

##### Initial values
inits <- list( d = 0.0, tau = 1, delta = c (0, 0, 0, 0, 0, 0, 0) )

Embed the BUGS model into R.

After setting up the initial values, embed the BUGS model directly into R. We will save this as “aspirinRE2.txt.”

##### define JAGS model within R
## The BUGS model is saved as a text or jag file. I used Notepad ++ to build the model
## but you can use any text editor. Please do not use Words.

#### Alternatively, you can enter the mode directly in R:
cat("model
    {

    for (i in 1:N) {
    
    y[i] ~ dnorm( delta[i], P[i] )

    delta[i] ~ dnorm( d, prec )

    P[i] <- 1/V[i]

    }

    ### Define the priors
    d ~ dnorm( 0, 0.00001 )
    tau ~ dunif( 0, 10 )
    tau.sq <- tau * tau
    prec <- 1 / tau.sq
  
    ### Transform ln(OR) into OR
    OR <- exp ( d )

    }", file="aspirinRE2.txt")

Set up the JAGS model

## Set up the JAGS model.
jags.m <- jags.model( file = "aspirinRE2.txt", data=dat, inits=inits, n.chains=1, n.adapt=500 )
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 7
##    Unobserved stochastic nodes: 9
##    Total graph size: 48
## 
## Initializing model

Specify the parameters of interest.

Normally this is the outcome, such as the odds ratio or standardized mean difference. We also include tau and delta. Delta tells us the treatment effect of ASA. We won’t plot delta[i]; however, you can do this by including delta as part of the parameters to be monitored. Keep in mind that you may run into errors if you plot too many graphics into a single viewing pane due to the margins and size limitations. If you encounter this problem, it maay be best to plot a few parameters at a time.

## specify parameters to be monitored
params <- c("d", "OR", "tau")

Run JAGS model and save the samples

## run JAGS and save posterior samples
samps <- coda.samples( jags.m, params, n.iter=10000 )

Use a burn-in of 5,000 iterations

## summarize posterior samples
summary(samps)
## 
## Iterations = 501:10500
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 10000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##       Mean      SD  Naive SE Time-series SE
## OR  1.1637 0.11715 0.0011715       0.002366
## d   0.1469 0.09571 0.0009571       0.001840
## tau 0.1626 0.11296 0.0011296       0.006094
## 
## 2. Quantiles for each variable:
## 
##          2.5%     25%   50%    75%  97.5%
## OR   0.981045 1.09357 1.148 1.2160 1.4382
## d   -0.019137 0.08944 0.138 0.1956 0.3634
## tau  0.009806 0.08769 0.144 0.2118 0.4274
summary(window(samps, start=5001))  # Burn in of 5000. Start at 5001.
## 
## Iterations = 5001:10500
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 5500 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##       Mean      SD Naive SE Time-series SE
## OR  1.1656 0.11952 0.001612       0.003319
## d   0.1485 0.09574 0.001291       0.002548
## tau 0.1580 0.11639 0.001569       0.009341
## 
## 2. Quantiles for each variable:
## 
##          2.5%     25%    50%    75%  97.5%
## OR   0.983968 1.09745 1.1483 1.2158 1.4343
## d   -0.016162 0.09299 0.1383 0.1954 0.3607
## tau  0.004856 0.08291 0.1424 0.2090 0.4012

Plot the posterior distribution

Plot the traces and posterior distributions.

plot(samps)