1 Introduction

Most interesting statistical problems involve multiple unknown parameters. For example, many problems involve comparing two (or more) populations or groups based on two (or more) samples. In such situations, each population or group will have its own parameters, and there will often be dependence between parameters. We are usually interested in difference or ratios of parameters between groups.

The example below concerns the familiar context of comparing two proportions. We will start by assuming prior independence between parameters, but we will also consider dependence between parameters.

2 Instructions

This RMarkdown file provides a template for you to fill in. Read the file from start to finish, completing the parts as indicated. Some code is provided for you. Be sure to run this code, and make sure you understand what it’s doing. In particular, mind the PAUSEs; be sure to PAUSE and think about these questions before proceeding.

Some blank “code chunks” are provided; you are welcome to add more (Code > Insert chunk or CTRL-ALT-I) as needed. There are also a few places where you should type text responses. Feel free to add additional text responses as necessary.

You can run individual code chunks using the play button. You can use objects defined in one chunk in others. Just keep in mind that chunks are evaluated in order. So if you call something x in one chunk, but redefine x as something else in another chunk, it’s essential that you evaluate the chunks in the proper order.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. When you are finished

  • click Knit and check to make sure that you have no errors and everything runs properly. (Fix any problems and redo this step until it works.)
  • Make sure your type your name(s) at the top of the notebook where it says “Type your name(s) here”. If you worked in a team, you will submit a single notebook with both names; make sure both names are included
  • Submit your completed files in Canvas.

You’ll need a few R packages, which you can install using install.packages

library(ggplot2)
library(dplyr)
library(knitr)
library(janitor)
library(rjags)
library(viridis)

knitr::opts_chunk$set(
  echo = TRUE,
  warning = FALSE,
  message = FALSE
)

3 Setup

Many studies of animals involve tagging the animals, but can the tags be harmful? Specifically, are metal bands used for tagging penguins harmful? Researchers studied the effects of banding in a sample of 100 king penguins near Antarctica that had already been tagged with RFID (radio frequency identification) chips. Of these 100 penguins, 50 were randomly assigned to receive metal bands on their flippers (in addition to the RFID chip), while the remaining 50 did not receive metal bands. The penguins were followed for 4.5 years to see which penguins survived.

Let \(\theta_1\) denote the probability that a penguin without a metal band survives a 4.5-year period, and let \(\theta_2\) denote the probability that a penguin with a metal band survives a 4.5-year period. (Throughout, “penguin” refers to whatever population of penguins this sample would be reasonbly representative of.)

PAUSE to consider your prior distribution for \(\theta_1\) and \(\theta_2\). Suppose you don’t know much about penguin survival in Antarctica; what prior distribution might you assume? Would you necessarily assume that \(\theta_1\) and \(\theta_2\) are independent? Why might you want to assume some dependence between the parameters in the prior?

We’ll discuss some issues in choosing the prior later. Here’s a summary of the sample data.

  • Of the 50 penguins without metal bands, 31 survived this 4.5-year study period.
  • Of the 50 penguins with metal bands, 16 survived the 4.5-year study period.

We will use the sample data to perform Bayesian inference to compare the probability of surviving with and without the metal band. In particular, we’re interested in the parameters \(\theta_1 - \theta_2\) and \(\theta_1 / \theta_2\).

4 Prior independence

We’ll start with a prior that assumes \(\theta_1\) and \(\theta_2\) are independent. We’ll choose the same marginal prior distribution for both \(\theta_1\) and \(\theta_2\) to reflect a prior belief of “no difference”. A quick Google search reveals that the lifespan of Antarctic king penguins is 15-20 years, though we don’t know how old the penguins were at the start of the 4.5-year study. Let’s assume a prior with a mean that represents penguins are more likely to survive than not, but is still fairly non-informative. Therefore, our prior assumes

  • \(\theta_1\) and \(\theta_2\) are independent according to the prior
  • \(\theta_1\) has a Beta(3, 2) prior distribution
  • \(\theta_2\) has a Beta(3, 2) prior distribution

(We could have used a Uniform(0, 1) = Beta(1, 1) prior, but choosing \(\alpha\) and \(\beta\) values other than 1 will illustrate the process of finding the posterior a little better.)

The following simulates \((\theta_1, \theta_2)\) pairs from the joint prior distribution and plots them.

N = 10000

# prior 
alpha1_prior = 3
beta1_prior = 2

alpha2_prior = 3
beta2_prior = 2

theta1 = rbeta(N, alpha1_prior, beta1_prior)
theta2 = rbeta(N, alpha2_prior, beta2_prior)

sim_prior = data.frame(theta1, theta2)

ggplot(sim_prior, aes(theta1, theta2)) +
  geom_point(color = "skyblue", alpha = 0.4) +
  stat_ellipse(level = 0.98, color = "black", size = 2) +
  stat_density_2d(color = "grey", size = 1) +
  geom_abline(intercept = 0, slope = 1)

ggplot(sim_prior, aes(theta1, theta2)) +
  stat_density_2d(aes(fill = ..level..),
                  geom = "polygon", color = "white") +
  scale_fill_viridis_c() +
  geom_abline(intercept = 0, slope = 1)

We can use simulation to approximate the prior distribution of \(\theta_1 - \theta_2\). PAUSE to consider what this prior distribution says about how the survival probabilities compare with and without the band.

theta_diff = theta1 - theta2

hist(theta_diff,
     freq = FALSE,
     xlab = "Difference in survival probabilities (no band - band)",
     main = "Prior Distribution")

quantile(theta_diff, c(0.01, 0.10, 0.25, 0.75, 0.90, 0.99))
##         1%        10%        25%        75%        90%        99% 
## -0.6480639 -0.3755530 -0.2030572  0.1957492  0.3726763  0.6374987

This prior distribution gives credibility to a wide range of scenarios: survival could be more likely either with or without the band, and the difference in survival probabilities could either be 0, small, or large.

4.1 Exercise

The parameter \(\theta_1-\theta_2\) compares the two survival probabilities with an absolute difference. For a relative difference, we can investigate \(\theta_1 / \theta_2\) (often called the “relative risk”).

Use simulation to approximate the prior distribution of \(\theta_1 / \theta_2\) and write a short sentence or two describing the prior distribution. (Note: you might see a few super extreme ratios that will throw of the scale, so you might consider \(\log(\theta_1 / \theta_2)\) instead for plotting.)

# Type your code here.
N = 10000

# prior 
alpha1_prior = 3
beta1_prior = 2

alpha2_prior = 3
beta2_prior = 2

theta1 = rbeta(N, alpha1_prior, beta1_prior)
theta2 = rbeta(N, alpha2_prior, beta2_prior)

theta_diff = theta1/theta2

hist(log(theta_diff),
     freq = FALSE,
     xlab = "Relative Risk in survival probabilities (no band/band)",
     main = "Prior Distribution")

quantile(theta_diff, c(0.01, 0.10, 0.25, 0.75, 0.90, 0.99))
##        1%       10%       25%       75%       90%       99% 
## 0.2119108 0.4951411 0.7137775 1.4091113 2.0323688 4.8520884

TYPE YOUR RESPONSE HERE. The prior distribution is centered around 0 and does give some credibility to values for the Relative risk ratio up to +- 2. However, we can conclude from this distribution that there is no assciation between the penguins wearing bands and the penguins not wearing bands, assuming this prior.

5 Deriving the posterior distribution

Recall the prior distribution

  • \(\theta_1\) and \(\theta_2\) are independent according to the prior
  • \(\theta_1\) has a Beta(3, 2) prior distribution
  • \(\theta_2\) has a Beta(3, 2) prior distribution

PAUSE to carefully write an expression for the prior distribution.

The prior distribution for \(\theta_1\) satisfies \[ \pi(\theta_1) \propto \theta_1^{3 - 1}(1-\theta_1)^{2-1} \] The prior distribution for \(\theta_2\) satisfies \[ \pi(\theta_2) \propto \theta_2^{3 - 1}(1-\theta_2)^{2-1} \] Since the prior assumes \(\theta_1\) and \(\theta_2\) are independent, the joint prior distribution for \(\theta_1\) and \(\theta_2\) satisfies \[ \pi(\theta_1, \theta_2) = \pi(\theta_1)\pi(\theta_2) \propto \left(\theta_1^{3 - 1}(1-\theta_1)^{2-1} \right)\left(\theta_2^{3 - 1}(1-\theta_2)^{2-1} \right) \]

Recall the sample data

  • Of the 50 penguins without metal bands, 31 survived this 4.5-year study period.
  • Of the 50 penguins with metal bands, 16 survived the 4.5-year study period.

PAUSE to carefully write the likelihood function. Is it reasonable to assume that the two samples (50 without bands, 50 with bands) are independent?

For the penguins without bands, the likelihood of 31 out of 50 penguins surviving is, from a Binomial likelihood, \[ f(y_1 = 31 | \theta_1) \propto \theta_1^{31}(1-\theta_1)^{50-31} \]

For the penguins with bands, the likelihood of 16 out of 50 penguins surviving is, from a Binomial likelihood, \[ f(y_2 = 16 | \theta_2) \propto \theta_2^{16}(1-\theta_2)^{50-16} \]

Because the penguins were randomly assigned to band/no band, it is reasonable to assume that these are two independent random samples (conditional on \(\theta_1, \theta_2\)). Therefore, the likelihood of the sample is the product of the two likelihoods above \[ f(y_1 = 31, y_2 = 16 | \theta_1, \theta_2) \propto \left(\theta_1^{31}(1-\theta_1)^{50-31}\right)\left(\theta_2^{16}(1-\theta_2)^{50-16}\right) \]

PAUSE to carefully write an expression for the posterior distribution. What do you notice? Are \(\theta_1\) and \(\theta_2\) independent according to the posterior distribution? What is the marginal posterior distribution of \(\theta_1\)? Of \(\theta_2\)?

Posterior is proportional to the product of prior and likelihood

\[\begin{align*} & \quad \pi(\theta_1, \theta_2 | y_1 = 31, y_2 = 16)\\ \propto & \quad \pi(\theta_1, \theta_2) f(y_1 = 31, y_2 = 16|\theta_1, \theta_2)\\ \propto & \quad \left[\left(\theta_1^{3 - 1}(1-\theta_1)^{2-1} \right)\left(\theta_2^{3 - 1}(1-\theta_2)^{2-1} \right)\right]\left[\left(\theta_1^{31}(1-\theta_1)^{50-31}\right)\left(\theta_2^{16}(1-\theta_2)^{50-16}\right)\right]\\ \propto & \quad \left(\theta_1^{3 + 31 - 1}(1-\theta_1)^{2 + 50 - 31 -1} \right)\left(\theta_2^{3 + 16 - 1}(1-\theta_2)^{2 + 50 - 16 -1} \right) \end{align*}\]

  • Since the joint posterior distribution factors into two factors, one involving \(\theta_1\) only and one involving \(\theta_2\) only, then \(\theta_1\) and \(\theta_2\) are independent according to the posterior
  • The marginal posterior distribution of \(\theta_1\) is Beta(3 + 31, 2 + 19). That is, the posterior distribution of \(\theta_1\) is updated based on the data from sample 1 (no band) only as in the Beta-Binomial model.
  • The marginal posterior distribution of \(\theta_2\) is Beta(3 + 16, 2 + 34). That is, the posterior distribution of \(\theta_2\) is updated based on the data from sample 2 (band) only as in the Beta-Binomial model.
# data
n1 = 50
y1 = 31

n2 = 50
y2 = 16

# posterior
alpha1 = alpha1_prior + y1
beta1 = beta1_prior + n1 - y1

alpha2 = alpha2_prior + y2
beta2 = beta2_prior + n2 - y2

theta1 = rbeta(N, alpha1, beta1)
theta2 = rbeta(N, alpha2, beta2)

sim_posterior = data.frame(theta1, theta2)

# plots
ggplot(sim_posterior, aes(theta1, theta2)) +
  geom_point(color = "skyblue", alpha = 0.4) +
  stat_ellipse(level = 0.98, color = "black", size = 2) +
  stat_density_2d(color = "grey", size = 1) +
  geom_abline(intercept = 0, slope = 1)

ggplot(sim_posterior, aes(theta1, theta2)) +
  stat_density_2d(aes(fill = ..level..),
                  geom = "polygon", color = "white") +
  scale_fill_viridis_c() +
  geom_abline(intercept = 0, slope = 1)

6 Comparing proportions

Once we have the posterior distribution of \((\theta_1, \theta_2)\) pairs, we can find the posterior distribution of any transformations of these parameters.

We can approximate the posterior distribution of \(\theta_1-\theta_2\) by simulating \((\theta_1, \theta_2)\) pairs from the joint posterior distribution and then finding the difference \(\theta_1-\theta_2\) for each pair. Because \(\theta_1\) and \(\theta_2\) are independent in the posterior, we can simulate \((\theta_1, \theta_2)\) pairs by simulating \(\theta_1\) and \(\theta_2\) independently.

theta_diff = theta1 - theta2

hist(theta_diff,
     freq = FALSE,
     xlab = "Difference in survival probabilities (no band - band)",
     main = "Posterior Distribution")

6.1 Exercises

Compute and interpret the posterior probability that \(\theta_1 > \theta_2\).

# Type your code here
p = sum((theta_diff > 0))/length(theta_diff)
p
## [1] 0.9981

TYPE YOUR RESPONSE HERE. Therefore, there is a 99.85% posterior probability that the mean proportion of penguins that survive a 4.5 year period while not wearing a metal band is larger than the mean proportion of penguins that survive a 4.5 year period while wearing a metal band.

Compute and interpret posterior central 50%, 80%, and 98% credible intervals for \(\theta_1 - \theta_2\)

# Type your code here
quantile(theta_diff, c(0.01, 0.10, 0.25, 0.75, 0.90, 0.99))
##         1%        10%        25%        75%        90%        99% 
## 0.05832112 0.15392920 0.21180891 0.33504909 0.38889007 0.47677307

TYPE YOUR RESPONSE HERE. There is a posterior probability of 50% that mean proportion of penguins that survive a 4.5 year period while not wearing a metal band is is between 21.18 and 33.71 percent greater than the mean proportion of penguins that survive a 4.5 year period while wearing a metal band.

There is a posterior probability of 80% that mean proportion of penguins that survive a 4.5 year period while not wearing a metal band is between 15.62 and 39.06 percent greater than the mean proportion of penguins that survive a 4.5 year period while wearing a metal band.

There is a posterior probability of 98% that mean proportion of penguins that survive a 4.5 year period while not wearing a metal band is is between 6.07 and 47.65 percent greater than the mean proportion of penguins that survive a 4.5 year period while wearing a metal band.

Use simulation to approximate the posterior distribution of \(\theta_1/\theta_2\).

# Type your code here
theta_over = theta1 /theta2

hist(log(theta_over),
     freq = FALSE,
     xlab = "Difference in survival probabilities (no band - band)",
     main = "Posterior Distribution")

Compute and interpret posterior central 50%, 80%, and 98% credible intervals for \(\theta_1 /\theta_2\).

# Type your code here
quantile(theta_over, c(0.01, 0.10, 0.25, 0.75, 0.90, 0.99))
##       1%      10%      25%      75%      90%      99% 
## 1.130165 1.378434 1.561738 2.083290 2.393758 3.107152

TYPE YOUR RESPONSE HERE. There is a posterior probability of 50% that true relative risk ratio of the mean proportion penguins that survive a 4.5 year period while not wearing a metal band and the mean proportion of penguins that survive a 4.5 year period while wearing a metal band is between .7055 and 1.387.

There is a posterior probability of 80% that true relative risk ratio of the mean proportion penguins that survive a 4.5 year period while not wearing a metal band and the mean proportion of penguins that survive a 4.5 year period while wearing a metal band is between .486 and 2.007

There is a posterior probability of 98% that true relative risk ratio of the mean proportion penguins that survive a 4.5 year period while not wearing a metal band and the mean proportion of penguins that survive a 4.5 year period while wearing a metal band is between .2116 and 4.477.

7 A few notes about independence

  • It is typical to assume independence in the data, e.g., independence of values of the measured variables within and between samples (conditional on the parameters).
    • Whether independence in the data is a reasonable assumption depends on how the data is collected.
  • But whether it is reasonable to assume independence of parameters is a completely separate question and is dependent upon our subjective beliefs about any relationships between parameters.
  • When there are multiple parameters and
    • there is independence between parameters in the prior
    • and independence in the data (e.g., independently selected samples)
    • then there will be independence between parameters in the posterior
    • so the posterior distribution of each parameter can be updated separately as in a one parameter model.

8 Prior dependence

We might wish to assume some prior dependence between parameters. In this example, while there could be differences in survival probability due to bands, knowing the survival probability with the band tells us something about the survival probability without the band (and vice versa). Furthermore, if we wanted a prior that reflects a prior belief in “no difference” then we would want a prior correlation between \(\theta_1\) and \(\theta_2\). With the independent prior, and same prior means, the average difference was 0, but there was wide variability about that average.

Constructing a prior distribution in which the marginal distributions are Beta but there is dependence is difficult. Instead, we’ll try an easier way to introduce correlation.

Bivariate Normal distributions are commonly used as joint distributions for two random variables. We’ll assume a Bivariate Normal distribution for \(\theta_1\) and \(\theta_2\) with

  • prior mean of 0.6 and prior SD of 0.2 for \(\theta_1\) (similar mean and SD as the Beta(3, 2) prior)
  • prior mean of 0.6 and prior SD of 0.2 for \(\theta_2\) (similar mean and SD as the Beta(3, 2) prior)
  • prior correlation of 0.7 between \(\theta_1\) and \(\theta_2\).

Note: Normal distributions are not the most natural in this example since \(\theta_1\) and \(\theta_2\) can only take values between 0 and 1. We could truncate the Bivariate Normal prior to assign plausibility only to pairs in (0, 1). But we’re primarily interested in the posterior, and the likelihood will handle zero-ing out the posterior for \(\theta\) values outside of (0, 1). So we’ll leave the Bivariate Normal prior as is for simplicity.

The following code simulates \((\theta_1, \theta_2)\) pairs from the joint prior distribution. (The function mvrnorm in the MASS package can be used to simulate values from a Multivariate Normal distribution. The inputs are a mean vector, and a covariance matrix.)

prior_mean <- c(0.6, 0.6)
prior_sd <- c(0.2, 0.2)
prior_corr <- .7

prior_cov <- matrix(c(prior_sd[1] ^ 2,
              prior_corr * prior_sd[1] * prior_sd[2],
              prior_corr * prior_sd[1] * prior_sd[2],
              prior_sd[2] ^ 2), nrow = 2)

library(MASS)

sim_prior = data.frame(mvrnorm(N, prior_mean, prior_cov))
names(sim_prior) = c("theta1", "theta2")

ggplot(sim_prior, aes(theta1, theta2)) +
  geom_point(color = "skyblue", alpha = 0.4) +
  stat_ellipse(level = 0.98, color = "black", size = 2) +
  stat_density_2d(color = "grey", size = 1) +
  geom_abline(intercept = 0, slope = 1)

ggplot(sim_prior, aes(theta1, theta2)) +
  stat_density_2d(aes(fill = ..level..),
                  geom = "polygon", color = "white") +
  scale_fill_viridis_c() +
  geom_abline(intercept = 0, slope = 1)

8.1 Exercises

Use simulation to approximate the prior distribution of \(\theta_1-\theta_2\). How does this distribution compare to the one for the independent prior?

# Type your code here
sim_prior <- sim_prior %>%
  mutate(mu_diff = theta1 - theta2)

ggplot(sim_prior,
   aes(mu_diff)) +
  geom_histogram(aes(y=..density..), color = "black", fill = "white") +
  geom_density(size = 1, color = "skyblue") +
  labs(x = "Difference in  survival probabilities (no band - band)",
   title = "Prior Distribution")

TYPE YOUR RESPONSE HERE. Compared to the independent case, there seems to be much more concentration and density around the 0.0 difference. Whereas in the independent case, the width of the x-axis was larger and placed much more plausibility to the difference of +- .5 and more. In the dependent case, as seen above, places very small plausibility in these values.

Compute prior central 50%, 80%, and 98% credible intervals for \(\theta_1 - \theta_2\). How do the credible intervals compare to the ones for the independent prior?

# Type your code here
quantile(sim_prior$mu_diff, c(0.01, 0.10, 0.25, 0.75, 0.90, 0.99))
##         1%        10%        25%        75%        90%        99% 
## -0.3590794 -0.1984750 -0.1066716  0.1034970  0.1979609  0.3504054

TYPE YOUR RESPONSE HERE. The credible intervals are much narrower when compared to the ones using the independent prior. The independent case places much more plausbility to higher absolute values of theta (true difference between no band and band).

Use simulation to approximate the prior distribution of \(\theta_1/\theta_2\). How does this distribution compare to the one for the independent prior?

# Type your code here
sim_prior <- sim_prior %>%
  mutate(mu_diff = theta1 / theta2)

ggplot(sim_prior,
   aes(mu_diff)) +
  geom_histogram(aes(y=..density..), color = "black", fill = "white", bins = 50) +
  geom_density(size = 1, color = "skyblue") +
  labs(x = "Relative Risk ratio in  survival probabilities (no band / band))",
   title = "Prior Distribution")

TYPE YOUR RESPONSE HERE. Compared to the independent case, there seems to be much more concentration and density around the 0.0 relative risk ratio. Whereas in the independent case, the width of the x-axis was larger and placed much more plausibility to the difference of +- 2 and more. In the dependent case, as seen above, places very small plausibility in these values.

Compute and interpret prior central 50%, 80%, and 98% credible intervals for \(\theta_1 /\theta_2\). How do the credible intervals compare to the ones for the independent prior?

# Type your code here
quantile(sim_prior$mu_diff, c(0.01, 0.10, 0.25, 0.75, 0.90, 0.99))
##        1%       10%       25%       75%       90%       99% 
## 0.3884867 0.6947323 0.8325098 1.1889151 1.4270973 2.3367066

TYPE YOUR RESPONSE HERE. The credible intervals are much narrower when compared to the ones using the independent prior. The independent case places much more plausbility to higher absolute values of theta (true realtive risk ratio between no band and band).

Which prior distribution — independent or dependent — represents a stronger prior belief in “no difference”?

TYPE YOUR RESPONSE HERE. The dependent prior distribution, as values are more centered at or near 0.

9 Posterior distribution in JAGS

We’ll assume the dependent prior from the previous section and the same sample data as before.

Exercise. Describe in detail how you would use naive simulation (not JAGS or MCMC) to approximate the joint posterior distribution of \((\theta_1, \theta_2)\). In particular, how would you use simulation to approximate the posterior probability that \(\theta_1 > \theta_2\)?

TYPE YOUR RESPONSE HERE. 1) I would use a multivariate distribution that incorportates the correlation between the two varables and then simulate 1000s of pairs from both variables from this prior distribution. 2) Next, I would come up with likelihood function based on the data, and find the likelihood of observing what was seen in the data given the pair simulated from the prior distribution. 3) Next, create a product (prior * likelihood) and sum(product) 4) To create a posterior distribution do, for each product quantity, product/sum(product). Now summarize all values and this is the joint posterior distribution for theta1 and theta2. 5) Then simulate theta1 and theta2 pairs from the joint posterior distribution and then finding the difference theta1−theta2 for each pair. 6) Finally, find the probability that the difference between theta1 and theta 2 > 0 in the simulated distribution and this is the approximate posterior probability that \(\theta_1 > \theta_2\).

The following JAGS code uses simulation to approximate the joint posterior distribution of \((\theta_1, \theta_2)\). The syntax is tricky, so don’t worry about the details yet. Later we’ll cover other examples that have similar syntax, in more common situations (e.g. regression, hierarchical models). Just know that JAGS is carrying out the simulation you described in the previous exercise in an efficient way. (Some notes about similar syntax for comparing two means are near the end of Chapter 16.)

# data
n_groups = 2

y = c(31, 16)
n = c(50, 50)

# model: likelihood and prior

model_string <- "model{

  # Likelihood
  for (g in 1:n_groups){
      y[g] ~ dbinom(theta[g], n[g])
  }

  # Prior
  theta[1:n_groups] ~ dmnorm.vcov(prior_mean[1:n_groups],
                                  prior_cov[1:n_groups, 1:n_groups])

}"

# pass inputs to JAGS
dataList = list(y = y,
                n = n,
                n_groups = n_groups,
                prior_mean = prior_mean,
                prior_cov = prior_cov)

# Compile
model <- jags.model(textConnection(model_string),
                data = dataList,
                n.chains = 5)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 2
##    Unobserved stochastic nodes: 1
##    Total graph size: 16
## 
## Initializing model
update(model, 1000, progress.bar = "none")

Nrep = 10000

posterior_sample <- coda.samples(model,
                             variable.names = c("theta"),
                             n.iter = Nrep,
                             progress.bar = "none")

# Summarize and check diagnostics
summary(posterior_sample)
## 
## Iterations = 2001:12000
## Thinning interval = 1 
## Number of chains = 5 
## Sample size per chain = 10000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##            Mean      SD  Naive SE Time-series SE
## theta[1] 0.5842 0.06212 0.0002778      0.0009138
## theta[2] 0.3720 0.06296 0.0002815      0.0009452
## 
## 2. Quantiles for each variable:
## 
##            2.5%    25%    50%    75%  97.5%
## theta[1] 0.4609 0.5424 0.5850 0.6263 0.7031
## theta[2] 0.2528 0.3286 0.3707 0.4142 0.4990

The following code extracts the JAGS output as a data.frame/matrix.

sim_posterior = data.frame(as.matrix(posterior_sample))
names(sim_posterior) = c("theta1", "theta2")

head(sim_posterior, 10) %>%
  kable(digits = 5)
theta1 theta2
0.53141 0.45899
0.54312 0.44903
0.62453 0.54292
0.53623 0.50436
0.53311 0.51847
0.58552 0.54623
0.65845 0.52675
0.64845 0.53344
0.61298 0.54065
0.55485 0.50705

The following plots the joint posterior distribution of \((\theta_1, \theta_2)\)

ggplot(sim_posterior, aes(theta1, theta2)) +
  geom_point(color = "skyblue", alpha = 0.4) +
  stat_ellipse(level = 0.98, color = "black", size = 2) +
  stat_density_2d(color = "grey", size = 1) +
  geom_abline(intercept = 0, slope = 1)

ggplot(sim_posterior, aes(theta1, theta2)) +
  stat_density_2d(aes(fill = ..level..),
              geom = "polygon", color = "white") +
 scale_fill_viridis_c() +
  geom_abline(intercept = 0, slope = 1)

9.1 Exercises

Be sure you don’t miss the “how would you simulate the posterior distribution” exercise at the start of this section!

Compute and interpret the posterior probability that \(\theta_1 > \theta_2\). How does it compare to the posterior probability for the independent prior?

# Type your code here
sim_posterior = sim_posterior %>%
      mutate(mu_diff = theta1 - theta2)
sum(sim_posterior$mu_diff > 0) / length(sim_posterior$mu_diff)
## [1] 0.99382

TYPE YOUR RESPONSE HERE. Therefore, there is a 99.48% posterior probability that the mean proportion of penguins that survive a 4.5 year period while not wearing a metal band is larger than the mean proportion of penguins that survive a 4.5 year period while wearing a metal band; in the dependent case. This posterior probability is slightly lower (99.48 vs 99.85) than the case of theta1 and theta2 being independent.

Use simulation to approximate the posterior distribution of \(\theta_1-\theta_2\). How does this distribution compare to the one for the independent prior?

# Type your code here
sim_posterior = sim_posterior %>%
      mutate(mu_diff = theta1 - theta2)

ggplot(sim_posterior,
       aes(mu_diff)) +
  geom_histogram(aes(y=..density..), color = "black", fill = "white") +
  geom_density(size = 1, color = "seagreen") +
  labs(x = "Difference in survival probabilities (no band - band)",
       title = "Posterior Distribution")

TYPE YOUR RESPONSE HERE. In the dependent case (Above) there appears to be a higher concentration of plausibility/density placed on values at or near .2, whereas in the independent case the values for theta (true difference in proportions) was centered at around .3.

Compute posterior central 50%, 80%, and 98% credible intervals for \(\theta_1 - \theta_2\). How do the credible intervals compare to the ones for the independent prior?

# Type your code here
quantile(sim_posterior$mu_diff, c(0.01, 0.10, 0.25, 0.75, 0.90, 0.99))
##         1%        10%        25%        75%        90%        99% 
## 0.01612315 0.10597208 0.15679894 0.26871082 0.31746787 0.39718697

TYPE YOUR RESPONSE HERE. There is a posterior probability of 50% that mean proportion of penguins that survive a 4.5 year period while not wearing a metal band is is between 15.27 and 26.52 percent greater than the mean proportion of penguins that survive a 4.5 year period while wearing a metal band.

There is a posterior probability of 80% that mean proportion of penguins that survive a 4.5 year period while not wearing a metal band is between 10.24 and 31.51 percent greater than the mean proportion of penguins that survive a 4.5 year period while wearing a metal band.

There is a posterior probability of 98% that mean proportion of penguins that survive a 4.5 year period while not wearing a metal band is is between 1.94 and 39.80 percent greater than the mean proportion of penguins that survive a 4.5 year period while wearing a metal band.

These values in the dependent case are much narrower when compared to the posterior credible intervals calculated above.

Use simulation to approximate the posterior distribution of \(\theta_1/\theta_2\). How does this distribution compare to the one for the independent prior?

# Type your code here
sim_posterior = sim_posterior %>%
      mutate(mu_ratio = theta1 / theta2)

ggplot(sim_posterior,
       aes(mu_ratio)) +
  geom_histogram(aes(y=..density..), color = "black", fill = "white") +
  geom_density(size = 1, color = "seagreen") +
  labs(x = "Difference in survival probabilities (no band - band)",
       title = "Posterior Distribution")

In the dependent case (Above) there appears to be a higher concentration of plausibility/density placed on values at or near 1.6, whereas in the independent case the values for theta (true realtive risk between the two groups) was centered at around 1.85.

Compute and interpret posterior central 50%, 80%, and 98% credible intervals for \(\theta_1 /\theta_2\). How do the credible intervals compare to the ones for the independent prior?

# Type your code here
quantile(sim_posterior$mu_ratio, c(0.01, 0.10, 0.25, 0.75, 0.90, 0.99))
##       1%      10%      25%      75%      90%      99% 
## 1.033492 1.247060 1.388887 1.793720 2.031435 2.562147

There is a posterior probability of 50% that true relative risk ratio of the mean proportion penguins that survive a 4.5 year period while not wearing a metal band and the mean proportion of penguins that survive a 4.5 year period while wearing a metal band is between 1.377 and 1.781

There is a posterior probability of 80% that true relative risk ratio of the mean proportion penguins that survive a 4.5 year period while not wearing a metal band and the mean proportion of penguins that survive a 4.5 year period while wearing a metal band is between 1.237 and 1.781

There is a posterior probability of 98% that true relative risk ratio of the mean proportion penguins that survive a 4.5 year period while not wearing a metal band and the mean proportion of penguins that survive a 4.5 year period while wearing a metal band is between 1.041 and 2.554

These values in the dependent case are much narrower when compared to the posterior credible intervals calculated above.

10 Conclusions

Is there evidence that penguins with metal bands are more likely to die than those without? Can we conclude that the bands are the cause of any difference in survival probabilities? Write a few sentences summarizing your analysis and reporting conclusions in context. Be sure to support you conclusions with appropriate values reported in context.

TYPE YOUR RESPONSE HERE.

As is most likely the case, there should intuitively be a dependency between the survival rate of the penguins with a metal band and those without a metal band. This is because they will still be living near each other and depending on each other for survival in the wild. In this case, when looking at the posterior credible intervals, the 50, 80, and 98 percent credible intervals for the true difference in survival proportions amongst no band and band do NOT contain 0. Thus, we have evidence to suggest that penguins with metal bands are more likely to die than those without. In the outline of this experient, the experimenters randomly assigned the penguins to each “treatment” group (no band or band). Therefore, we CAN conclude that the bands are the cause of any difference in survival probabilities.

11 Reflection

  1. Write a few sentences summarizing one important concept you have learned in this lab
  2. What is (at least) one question that you still have?

TYPE YOUR RESPONSE HERE. 1) One important concept I have learned in this lab is being able to distinguish and understand when I should be using a joint probability distribution where there is dependency between the variables, and how to do so using simulation. 2) Is there a way to better understand when exactly we should be assuming independence vs. dependency, or is it simply based on the research question being investigated?