Answers to Bayesian Computation with JAGS and Farmer Jöns

One thing to note is that the code changes you have to make between questions often are minimal. Yet we go from running a simple binomial model to running a pretty advanced linear model.

All answers below use wide “sloppy” uniform priors, and these could certainly be shaped up, and be made more informative.

Question I

Not much to do here, other than to run it. Here is the graph you would see if everything is working properly.

Question 2

s <- as.data.frame(stan_samples)
mean(abs(s$theta2 - s$theta1) < 0.2)

## [1] 0.21575

Question 3

cowA <- c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0)
cowB <- c(0, 0, 1, 1, 1, 0, 1, 1, 1, 0)

# Using the same model as in Question 1, just using the new data. 

data_list <- list(y1 = cowA, y2 = cowB, n1 = length(cowA), n2 = length(cowB))

# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)

# Plotting and summarizing the posterior distribution
stan_samples

## Inference for Stan model: dc298a19c9e8503d8b6766e7e6e386e0.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
## theta1   0.25    0.00 0.12   0.06   0.15   0.23   0.32   0.52  2752    1
## theta2   0.58    0.00 0.13   0.31   0.49   0.58   0.68   0.82  2388    1
## lp__   -15.93    0.02 1.04 -18.66 -16.31 -15.59 -15.21 -14.93  2122    1
## 
## Samples were drawn using NUTS(diag_e) at Thu Sep 22 16:59:21 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

plot(stan_samples)

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

Calculate the probability that medicine A is better than medicine B.

s <- as.data.frame(stan_samples)
mean(s$theta1 > s$theta2)

## [1] 0.0415

mean(s[,"theta1"] > s[,"theta2"])

## [1] 0.0415

So should probably go with medicine B then…

Question 4

# The Stan model as a string.
model_string <- "
data {
  int n1;
  int n2;
  vector[n1] y1;
  vector[n2] y2;
}

parameters {
  real mu1;
  real mu2;
  real<lower=0> sigma1;
  real<lower=0> sigma2;
}

model {  
  mu1 ~ uniform(0, 2000);
  mu2 ~ uniform(0, 2000);
  sigma1 ~ uniform(0, 1000);
  sigma2 ~ uniform(0, 1000);
  y1 ~ normal(mu1, sigma1);
  y2 ~ normal(mu2, sigma2); 
}

generated quantities {
}
"

diet_milk <- c(651, 679, 374, 601, 401, 609, 767, 709, 704, 679)
normal_milk <- c(798, 1139, 529, 609, 553, 743, 151, 544, 488, 555, 257, 692, 678, 675, 538)
data_list <- list(y1 = diet_milk, y2 = normal_milk, 
                  n1 = length(diet_milk), n2 = length(normal_milk))

# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)

# Plotting and summarizing the posterior distribution
stan_samples

## Inference for Stan model: ce9fc71c270eed0bb9482a8456fb506d.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##           mean se_mean    sd    2.5%     25%     50%     75%   97.5% n_eff
## mu1     617.85    0.93 48.39  523.38  587.22  616.69  647.30  719.13  2723
## mu2     598.35    1.20 62.41  477.44  558.10  598.14  638.87  724.56  2684
## sigma1  152.81    0.89 43.92   95.35  123.79  144.93  172.82  255.41  2447
## sigma2  250.20    1.03 51.11  173.45  214.45  242.64  277.54  370.88  2439
## lp__   -133.29    0.03  1.42 -136.91 -134.03 -132.97 -132.24 -131.51  1774
##        Rhat
## mu1       1
## mu2       1
## sigma1    1
## sigma2    1
## lp__      1
## 
## Samples were drawn using NUTS(diag_e) at Thu Sep 22 16:59:57 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

plot(stan_samples)

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

Is it likely that the diet is going to make the cows produce more milk on average?

s <- as.data.frame(stan_samples)
mu_diff <- s$mu2 - s$mu1 
hist(mu_diff)

mean(mu_diff > 0)

## [1] 0.4005

mean(mu_diff < 0)

## [1] 0.5995

It is almost as likely that the diet is better as that the diet is worse. So this experiment does not really support that the diet will result in the cows producing more milk .

Question 5

# The Stan model as a string.
model_string <- "
data {
  int n1;
  int n2;
  vector[n1] y1;
  vector[n2] y2;
}

parameters {
  real mu1;
  real mu2;
  real<lower=0> sigma1;
  real<lower=0> sigma2;
}

model {  
  mu1 ~ uniform(0, 2000);
  mu2 ~ uniform(0, 2000);
  sigma1 ~ uniform(0, 1000);
  sigma2 ~ uniform(0, 1000);
  y1 ~ student_t(3, mu1, sigma1);
  y2 ~ student_t(3, mu2, sigma2);
}

generated quantities {
}
"

diet_milk <- c(651, 679, 374, 601, 4000, 401, 609, 767, 3890, 704, 679)
normal_milk <- c(798, 1139, 529, 609, 553, 743, 3,151, 544, 488, 15, 257, 692, 678, 675, 538)
data_list <- list(y1 = diet_milk, y2 = normal_milk, 
                  n1 = length(diet_milk), n2 = length(normal_milk))

# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)

# Plotting and summarizing the posterior distribution
stan_samples

## Inference for Stan model: 5494eeeda3320aee4c85504377e391da.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##           mean se_mean     sd    2.5%     25%     50%     75%   97.5%
## mu1     655.59    3.83 141.77  406.20  571.41  640.90  725.53  988.77
## mu2     555.53    1.60  69.55  403.35  515.39  559.51  600.47  680.06
## sigma1  416.04    5.08 194.64  150.06  260.41  376.02  539.46  871.09
## sigma2  243.43    1.57  63.57  145.12  198.40  233.61  280.35  398.09
## lp__   -166.96    0.04   1.26 -170.05 -167.56 -166.70 -166.02 -165.33
##        n_eff Rhat
## mu1     1367    1
## mu2     1886    1
## sigma1  1466    1
## sigma2  1650    1
## lp__    1199    1
## 
## Samples were drawn using NUTS(diag_e) at Thu Sep 22 17:00:34 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

plot(stan_samples)

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

Is it likely that diet is going to make the cows produce more milk on average?

s <- as.data.frame(stan_samples)
mu_diff <- s$mu2 - s$mu1
hist(mu_diff)

mean(mu_diff > 0)

## [1] 0.2525

mean(mu_diff < 0)

## [1] 0.7475

Again there is no strong evidence that the diet is any good (but compare with the result, would you have used the original Normal model!).

Question 6

# The Stan model as a string.
model_string <- "
data {
  int n1;
  int n2;
  int y1[n1];
  int y2[n2];
}

parameters {
  real<lower=0> lambda1;
  real<lower=0> lambda2;
}

model {  
  lambda1 ~ uniform(0, 100);
  lambda2 ~ uniform(0, 100);
  y1 ~ poisson(lambda1);
  y2 ~ poisson(lambda2); 
}

generated quantities {
}
"

diet_eggs <- c(6, 4, 2, 3, 4, 3, 0, 4, 0, 6, 3)
normal_eggs <- c(4, 2, 1, 1, 2, 1, 2, 1, 3, 2, 1)
data_list <- list(y1 = diet_eggs, y2 = normal_eggs, 
                  n1 = length(diet_eggs), n2 = length(normal_eggs))

# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)

# Plotting and summarizing the posterior distribution
stan_samples

## Inference for Stan model: 03c92e31683dae13696b3e76eed7a9ec.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##          mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## lambda1  3.27    0.01 0.53  2.33  2.91  3.24  3.60  4.38  2385    1
## lambda2  1.91    0.01 0.41  1.20  1.62  1.87  2.16  2.81  2108    1
## lp__    -1.70    0.02 0.97 -4.26 -2.02 -1.40 -1.02 -0.75  1750    1
## 
## Samples were drawn using NUTS(diag_e) at Thu Sep 22 17:01:10 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

plot(stan_samples)

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

Is it likely that diet going to make the chickens produce more eggs on average?

s <- as.data.frame(stan_samples)
lambda_diff <- s$lambda1 - s$lambda2 
hist(lambda_diff)

mean(lambda_diff > 0)

## [1] 0.979

There is pretty good evidence that the diet is effective and that chickens on the diet produce more eggs on average (that is, lambda1 seems higher than lambda2). Looking at lambda_diff a “best guess” is that the diet results in around 1-2 more eggs on average.

Question 7

This implements the same model as in question 4, but using smarter indexing so that the code is not as redundant and so that it works with the format of the data in the data.frame d .

# The Stan model as a string.
model_string <- "
data {
  int n;
  int n_groups;
  int x[n];
  vector[n] y;
}

parameters {
  vector[n_groups] mu;
  vector<lower=0>[n_groups] sigma;
}

model {  
  mu ~ uniform(0, 2000);
  sigma ~ uniform(0, 1000);
  y ~ normal(mu[x], sigma[x]);
}

generated quantities {
}
"

d <- data.frame(
  milk = c(651, 679, 374, 601, 401, 609, 767, 709, 704, 679, 798, 1139,
           529, 609, 553, 743, 151, 544, 488, 555, 257, 692, 678, 675, 538),
  group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 
            2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2))
data_list <- list(y = d$milk, x = d$group, n = length(d$milk), 
                  n_groups = max(d$group))

# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)

# Plotting and summarizing the posterior distribution
stan_samples

## Inference for Stan model: 68927fed8cab7398aeed267d67b0a586.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##             mean se_mean    sd    2.5%     25%     50%     75%   97.5%
## mu[1]     617.73    0.94 49.12  519.52  587.58  617.22  648.79  715.36
## mu[2]     595.91    1.20 63.40  469.83  555.09  595.32  637.08  721.19
## sigma[1]  152.89    0.91 42.82   94.69  123.08  144.38  173.71  259.34
## sigma[2]  249.42    1.00 51.68  173.39  213.98  241.83  274.69  378.02
## lp__     -133.31    0.04  1.49 -137.06 -134.03 -132.97 -132.23 -131.50
##          n_eff Rhat
## mu[1]     2725    1
## mu[2]     2805    1
## sigma[1]  2214    1
## sigma[2]  2650    1
## lp__      1735    1
## 
## Samples were drawn using NUTS(diag_e) at Thu Sep 22 17:01:47 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

plot(stan_samples)

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

This should give you the same result as in question 4.

Question 8

Amazingly we don’t have to change the model at all from question 7, we can just rerun it with the new data. That is, if we were smart with how we defined the priors and instead of writing:

mu[1] ~ uniform(0, 2000);
mu[2] ~ uniform(0, 2000);

simply wrote mu ~ uniform(0, 2000);.

d <- data.frame(
  milk = c(651, 679, 374, 601, 401, 609, 767, 709, 704, 679, 798, 1139, 529,
           609, 553, 743, 151, 544, 488, 555, 257, 692, 678, 675, 538, 1061,
           721, 595, 784, 877, 562, 800, 684, 741, 516),
  group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
            2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3))

data_list <- list(y = d$milk, x = d$group, n = length(d$milk), 
                  n_groups = max(d$group))


# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)

# Plotting and summarizing the posterior distribution
stan_samples

## Inference for Stan model: 68927fed8cab7398aeed267d67b0a586.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##             mean se_mean    sd    2.5%     25%     50%     75%   97.5%
## mu[1]     617.11    1.04 48.22  518.99  586.71  617.48  647.66  711.91
## mu[2]     594.81    1.35 64.56  468.83  553.04  593.69  636.60  723.08
## mu[3]     732.33    1.32 62.38  604.11  695.85  733.63  770.48  855.91
## sigma[1]  152.42    0.88 42.15   94.56  123.60  144.26  172.78  255.26
## sigma[2]  250.11    1.02 49.98  173.43  214.22  243.17  278.00  365.28
## sigma[3]  190.66    1.24 54.41  117.07  153.06  180.27  216.14  325.17
## lp__     -184.64    0.05  1.82 -189.04 -185.58 -184.30 -183.32 -182.14
##          n_eff Rhat
## mu[1]     2131    1
## mu[2]     2296    1
## mu[3]     2245    1
## sigma[1]  2299    1
## sigma[2]  2397    1
## sigma[3]  1938    1
## lp__      1552    1
## 
## Samples were drawn using NUTS(diag_e) at Thu Sep 22 17:01:47 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

plot(stan_samples)

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

# Now comparing the tree different groups.
s <- as.data.frame(stan_samples)
hist(s[,"mu[3]"] - s[,"mu[1]"] )

hist(s[,"mu[3]"] - s[,"mu[2]"] )

mean(s[,"mu[3]"] - s[,"mu[1]"] > 0)

## [1] 0.93125

mean(s[,"mu[3]"] - s[,"mu[2]"] > 0)

## [1] 0.94075

So it is pretty likely that diet 2 (mu[3]) is better than both diet 1 (mu[2]) and using no special diet (mu[1]).

Question 9

So, let’s change the model from question 7 into a regression model!

# The Stan model as a string.
model_string <- "
data {
  int n;
  vector[n] x;
  vector[n] y;
}

parameters {
  real beta0;
  real beta1;
  real<lower=0> sigma;
}

model {  
  vector[n] mu;
  beta0 ~ uniform(-1000, 1000);
  beta1 ~ uniform(-1000, 1000);
  sigma ~ uniform(0, 1000);
  mu = beta0 + beta1 * x;
  y ~ normal(mu, sigma);
}

generated quantities {
}
"

d <- data.frame(milk = c(685, 691, 476, 1151, 879, 725, 1190, 1107, 809, 539,
                         298, 805, 820, 498, 1026, 1217, 1177, 684, 1061, 834),
                hours = c(3, 7, 6, 10, 6, 5, 10, 11, 9, 3, 6, 6, 3, 5, 8, 11, 
                          12, 9, 5, 5))
data_list <- list(y = d$milk, x = d$hours, n = length(d$milk))

# Compiling and producing posterior samples from the model.
stan_samples <- stan(model_code = model_string, data = data_list)

# Plotting and summarizing the posterior distribution
stan_samples

## Inference for Stan model: 8ff03882e9f9a899c2e5b85844a34427.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##          mean se_mean     sd    2.5%     25%     50%     75%   97.5% n_eff
## beta0  395.97    3.71 134.07  135.64  307.45  397.91  481.30  662.98  1306
## beta1   62.33    0.50  17.95   27.07   50.95   62.46   74.00   97.67  1278
## sigma  222.83    0.91  38.72  162.23  195.37  216.90  245.03  316.25  1792
## lp__  -111.90    0.03   1.22 -115.10 -112.46 -111.59 -111.00 -110.49  1352
##       Rhat
## beta0    1
## beta1    1
## sigma    1
## lp__     1
## 
## Samples were drawn using NUTS(diag_e) at Thu Sep 22 17:02:24 2016.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

plot(stan_samples)

## ci_level: 0.8 (80% intervals)

## outer_level: 0.95 (95% intervals)

plot(d$hours, d$milk, xlim=c(0, 13), ylim = c(0, 1300))

# Adding a sample of the posterior draws to the plot in order to visualize the
# uncertainty of the regression line.
s <- as.data.frame(stan_samples)
for(i in sample(nrow(s), size = 20)) {
  abline(s[i,"beta0"], s[i,"beta1"], col = "gray")
}

It seems like there is good evidence that an increase in sunshine (or something that co-varies with sunshine perhaps…) results in an increase in milk production.