Applied Bayesian Lab 5

Complete all Questions and submit your final PDF or html (either works!) under Assignments in Canvas.

Goal

Today we are continuing to explore Markov Chain Monte Carlo (MCMC) methods. These methods to allow us to draw samples from our posterior distribution, and hence perform inference on those parameters, when we cannot sample from the posterior directly.

In this lab, we are going to explore the Metropolis algorithm.

The Data

We will work with the same researcher from our last class. Recall that this researcher is modeling hummingbirds egg lengths. However, we are going to use a slightly different model than we did in class.

Let \(Y_i\) represent the length of hummingbird egg \(i\), where \(i = 1, 2, \dots, n = 45\). In class, we used the following model:

The Sampling Model

\[Y_1, \dots, Y_n | \mu \sim N(\mu, \sigma^2)\]

The Prior

\[\mu \sim Beta(\alpha,\beta)\] The motivation for this prior choice was the knowledge that hummingbird eggs had to be between 0 and 1 inch in length.

However, we are now going to adapt our model:

The Sampling Model

\[Y_1, \dots, Y_n | \mu \sim N(\mu, \sigma^2)\]

The Prior

\[\mu \sim Normal(a,b^2)\]

Why are we using a different model now?? Well, today the goal is to learn about how Metropolis works, so I wanted us in a setting where we actually could derive the form of the posterior distribution. This means that we can check to see how our samples from Metropolis compare to the actual samples we draw from the posterior. This allows to assess how well Metropolis works!

The Actual Posterior

With our new model with the normal prior, we can actually sample directly from the posterior. Let’s play with this.

Question 1

The scientist believes that the eggs in the hummingbird boxes should have an average length of around .6, with a variance around this belief of .0025. What prior hyper parameters would you set to reflect this belief?

Question 2

Draw a graph to illustrate the prior distribution you have chosen. Hint: The library bayesrules contains a handy function called plot_normal which takes two arguments, the mean and the standard deviation.

Question 3

Assuming that \(\sigma\) is known, what is the posterior distribution for \(\mu\) in this model? Do not fill in the values for the prior hyper parameters yet, leave them \(a\) and \(b\). Hint: As always, you must state the name of the distribution and the form of any needed parameters.

We are told by the scientist that for our \(n = 45\) randomly sampled eggs, the average length was \(\bar{y} = .55\). We are also told that \(\sigma = .12\).

Question 4

What is the posterior mean and posterior standard deviation of \(\mu\) based on this information and our prior hyper parameters? Round to 3 non-zero decimal places.

Question 5

Draw a graph to illustrate the posterior distribution for \(\mu\). Hint: The library bayesrules contains a handy function called plot_normal which takes two arguments, the mean and the standard deviation.

Question 6

What is a 95% posterior credible interval for \(\mu\)? Here, use the middle 95% of the posterior distribution.

Okay, so this is the actual posterior. This means that is we use Metropolis to estimate \(\mu\), we should get samples that yield an empirical distribution that is very similar to what we have just derived. Remember, empirical distribution just means the distribution of the samples. Let’s try it!

Metropolis: First Iteration

Okay, so now we are going to use Metropolis to draw posterior samples for \(\mu\). We know that there are two steps in Metropolis:

1. Propose
1. Accept/Reject

The Proposal Step

The proposal step starts with a initial guess at \(\mu\), which we call \(\mu^{(0)}\). We are going to have some fun and initialize poorly. Let’s set \(\mu^{(0)} = .1\).

Once we have our initial sample, we need to update that guess with a proposal. We propose a new value \(\mu^{*}\) from a proposal distribution:

\[\mu^{*}|\mu^{(0)} \sim N( \mu^{(0)}, \tau^2)\]

For now, let’s set \(\tau^2 = (.2^2) =.04\). We’ll explore other choices of \(\tau\) later.

Question 7

Set a seed of 100. Draw a proposal \(\mu^{*}\) for \(\mu\) and state your proposed value \(\mu^{*}\).

Question 8

You will notice that your proposal of \(\mu^{*}\) is not valid. Why is this not a valid draw, and why do you think we obtained such a value for \(\mu^{*}\)?

When a proposal is not valid, we discard the proposal and try again.

Question 9

Copy and paste your chunk from Question 7. Change the seed to 365 and to draw a new proposal of \(\mu^{*}\). State the value of \(\mu^{*}\) and whether this proposal is valid.

The Accept/Reject Step

Now that we have a proposal of \(\mu^{*}\), we have a decision to make. Do we think that \(\mu^{*}\) is more likely to be draw from the posterior of \(\mu\) than \(\mu^{(0)}\)? If so, we move our chain to \(\mu^{*}\) by setting

\[\mu^{(1)} = \mu^{*}\]

However, if \(\mu^{*}\) is not likely draw from the posterior, than we want to leave our chain right where it is by setting

\[\mu^{(1)} = \mu^{(0)}\]

This means we somehow need to determine whether or not \(\mu^{*}\) is likely to be a draw from the posterior (or at least, more likely than \(\mu^{(0)}\)).

Question 10

We know what the actual posterior distribution for \(\mu\) is. Which value of \(\mu\) is more likely to occur under the posterior: \(\mu^{(0)}\) or \(\mu^{*}\)? Explain your reasoning.

Typically, we don’t have the actual posterior (if we did, we wouldn’t need Metropolis). In Metropolis, we use an acceptance ratio to help us determine the next step in our MCMC chain.

\[R = min\left( \frac{P(\mu^{*} | Y_1, \dots, Y_n)}{P(\mu^{(0)} | Y_1, \dots, Y_n)}, 1 \right) = min( R_{inner}, 1)\] We can simplify the inner part of the ratio, \(R_{inner}\), to

\[\begin{aligned} R_{inner} &= \frac{P(Y_1, \dots, Y_n|\mu^{*})P(\mu^{*} )}{P(Y_1, \dots, Y_n|\mu^{(0)})P(\mu^{(0)} )} \\ &= \underbrace{\frac{P(Y_1, \dots, Y_n|\mu^{*})}{P(Y_1, \dots, Y_n|\mu^{(0)})}}_\text{Likelihood Component} \times \overbrace{\frac{P(\mu^{*} )}{P(\mu^{(0)} )}}^\text{Prior Component} \end{aligned}\]

Question 11

Simplify the Likelihood Component of \(R_{inner}\). Leave it in a general form, meaning do not plug in any numeric values that are specific to our data example yet (so leave \(a\) and \(b\) as \(a\) and \(b\) and any \(y_i\) terms as \(y_i\) terms).

Question 12

Simplify the Prior Component of \(R_{inner}\). Leave it in a general form, meaning do not plug in any numeric values that are specific to our data example yet (so leave \(a\) and \(b\) as \(a\) and \(b\) and any \(y_i\) terms as \(y_i\) terms).

With both components computed, we now know the form of the acceptance ratio!! The next step is to write a function that will help us to compute it.

Pause: What’s a function?

A function is a computer structure that takes inputs and produces an output. It has a structure that looks like this:

nameOfFunction <- function( inputs ){
  
  Compute what you want to output 
  
  return(output)

}

In our case, we want to input \(\mu^{*}\) and \(\mu = \mu^{(0)}\) and output our acceptance ratio R. We also need to input any constants, like \(a\) or \(\sigma\), that are needed in the ratio. This means we will have a structure like

computeAcceptanceRatio <- function( mustar, mu , a, b, sigma, n, ybar ){
  
  Compute R 
  
  return(R)
}

Inside the function we need to put in the steps to compute R, which luckily we just derived!

computeAcceptanceRatio <- function( mustar, mu , a, b, sigma, n, ybar ){
  
  # Compute the likelihood part
  likelihood_numerator <- 
  likelihood_denominator <- 
  likelihood_component <- likelihood_numerator / likelihood_denominator
  
  # Compute the prior part
  prior_numerator <- 
  prior_denominator <- 
  prior_component <- prior_numerator / prior_denominator
  
  # Multiply the two
  R <- likelihood_component * prior_component
  
  # Store 1 or the value from previous step
  R <- min( 1, R)
  
  # Output R
  return(R)
}

Question 13

Fill in the missing pieces for the function above to compute your acceptance ratio. Show your function as the answer to this question.

Question 14

Using your function, compute the acceptance ratio for our given draws of \(\mu^{*}\) and \(\mu^{(0)}\).

The value \(R\) you have just computed is the probability of accepting \(\mu^{*}\). This means that \((R \times 100)\)% of the time we propose \(\mu^{*}\) we want to accept it, and \(((1-R)\times 100)\)% of the time we want to reject it. How can we do that??

We are going to draw a random number \(u\) from a \(Uniform(0,1)\). If \(u\) is less than \(R\), then we accept \(\mu^{*}\). Otherwise, we reject \(\mu^{*}\)

Question 15

Using the same seed (365), draw a random draw from a \(Uniform(0,1)\). State what value you draw, and whether you accept or reject \(\mu^{*}\).

Question 16

Based on your results so far, what is the value of \(\mu^{(1)}\)?

And that’s one iteration of Metropolis!

Running 10 Metropolis Iterations

We now have all the code that we need to run one iteration of Metropolis. Now, let’s do 10 iterations.

Question 17

Create a chunk and set a seed of 365 (this is the ONLY time you set a seed in this code chunk). In this code chunk, create a for loop (or use something else if you’d rather!) to run 10 iterations of Metropolis. Save (1) the values of \(\mu^{(1)}\) through \(\mu^{(10)}\) and (2) whether the algorithm accepted or rejected at each step. Print out your data frame as your answer to this question.

Question 18

Create a trace plot of your 10 draws of \(\mu\). Add horizontal lines on your graph for (1) the mean of your draws and (2) the actual posterior mean of \(\mu\) from the first section of this project. Comment on what you see.

Running it for Real

Now that we have everything set up, let’s let this run longer. In reality, we usually run thousands of iterations of Metropolis. We are going to run \(S = 6000\), and then burn off the first 1000.

Question 19

Run your Metropolis sampler for \(S = 6000\) iterations. Do not show your data frame. Instead, as your answer to this question tell me what percent of your proposals were accepted and what percent were rejected (before burn-in, meaning use all \(S\) iterations). The percent you accept is called your Metropolis Acceptance Rate.

The Metropolis Acceptance rate is important to keep track of because it helps us assess the efficiency of our sampler. Ideally, we like an acceptance rate around about 40%. If we are less than that, we probably need to adjust our \(\tau\) value.

Question 20

Try other values of \(\tau\) to try to get your acceptance rate in the range of .3 - .4 (30% - 40%). State the \(\tau\) that you choose. What do you think is the benefit of having a Metropolis Acceptance rate that is 40% rather than, say, something like 20% or 15%?

Question 21

Create a trace plot of your draws (all \(S\) of them). Add horizontal lines on your graph for (1) the mean of your draws and (2) the actual posterior mean of \(\mu\) from the first section of this project. Comment on what you see.

Question 22

Remove the first 1000 draws as burn-in. Make a 95% HPD interval for \(\mu\) using your Metropolis draws. State the interval, and compare it to the true posterior credible interval in Question 6 (remember that normal distributions are symmetric, so the HPD interval and the usual equal-tail credible intervals are the same).

Summary

So, we have just verified that the Metropolis algorithm works! We have just seen that whether we use Metropolis to sample from the posterior, or sample from the posterior directly, we get very similar results!

Turning in your assignment

When your Markdown document is complete, do a final knit and make sure everything compiles. You will then submit your document on Canvas. You must submit a PDF or html document to Canvas. No other formats will be accepted. Make sure you look through the final document to make sure everything has knit correctly.

This work was created by Nicole Dalzell and is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Last updated 2023 March 25.