Rejection Sampling: Why w(θ) = α(θ) in this Example

Introduction

This document demonstrates rejection sampling for Bayesian inference using the Cauchy prior example.

Problem Setup:

We have i.i.d. observations from a normal distribution: \(y_i \sim N(\theta, 1)\)

Prior: \(\theta \sim \text{Cauchy}(0, 1)\) (standard Cauchy)

Posterior: \(p(\theta|y) \propto \text{Likelihood}(\theta) \times \text{Prior}(\theta)\)

Likelihood: \(\prod_{i=1}^n \exp\left(-\frac{1}{2}(y_i - \theta)^2\right) = \exp\left(-\frac{n}{2}(\theta - \bar{y})^2\right)\)

Rejection Sampling Algorithm

General Rejection Sampling Steps:

Choose a proposal density \(g(\theta)\) that is easy to sample from and covers the support of the target density \(f(\theta)\) (the unnormalized posterior kernel).
Find the constant \(M = \sup_{\theta} \frac{f(\theta)}{g(\theta)}\) such that \(f(\theta) \le M \cdot g(\theta)\) for all \(\theta\).
For each iteration \(j = 1, \dots, N\):
- Draw \(\theta^{(j)} \sim g(\theta)\) (sample from proposal)
- Draw \(u^{(j)} \sim \text{Uniform}(0, 1)\)
- Compute acceptance probability: \(\alpha(\theta^{(j)}) = \frac{f(\theta^{(j)})}{M \cdot g(\theta^{(j)})}\)
- Accept \(\theta^{(j)}\) if \(u^{(j)} < \alpha(\theta^{(j)})\), otherwise reject
Estimate \(E(\theta|y)\) by the sample mean of the accepted \(\theta\) values.

Applying to the Example

In this specific example:

Target kernel: \(f(\theta) = \text{Likelihood} \times \text{Prior} = \exp\left(-\frac{n}{2}(\theta - \bar{y})^2\right) \times \text{Cauchy}(\theta)\)
Proposal: \(g(\theta) = \text{Cauchy}(\theta)\) (using prior as proposal)
Since \(f(\theta)/g(\theta) = \exp\left(-\frac{n}{2}(\theta - \bar{y})^2\right) \le 1\), we have \(M = 1\)

Therefore: - \(\alpha(\theta) = \frac{f(\theta)}{1 \cdot g(\theta)} = \exp\left(-\frac{n}{2}(\theta - \bar{y})^2\right)\) - This equals the likelihood function (since prior cancels out)

R Implementation

# Step 1: Generate data with seed = 44 (as in the book)
set.seed(44)

n <- 25
true_theta <- 1
y <- rnorm(n, mean = true_theta, sd = 1)
y_bar <- mean(y)

cat("=== Data Generation (seed = 44) ===\n")

## === Data Generation (seed = 44) ===

cat("True θ:", true_theta, "\n")

## True θ: 1

cat("Sample size n:", n, "\n")

## Sample size n: 25

cat("Sample mean ȳ:", round(y_bar, 3), "\n\n")

## Sample mean ȳ: 0.847

# Step 2: Rejection sampling with seed = 456 (as in the book)
set.seed(456)

N <- 10000
accepted <- c()

for (j in 1:N) {
  # Draw from prior (standard Cauchy)
  theta_candidate <- rcauchy(1, location = 0, scale = 1)
  
  # Compute acceptance probability (likelihood)
  # w(θ) = exp(-n/2 * (θ - ȳ)^2)
  w <- exp(- (n/2) * (theta_candidate - y_bar)^2)
  
  # Accept or reject
  if (runif(1) < w) {
    accepted <- c(accepted, theta_candidate)
  }
}

# Results
cat("=== Rejection Sampling Results (seed = 456) ===\n")

## === Rejection Sampling Results (seed = 456) ===

cat("Number of draws from prior:", N, "\n")

## Number of draws from prior: 10000

cat("Number of samples retained:", length(accepted), "\n")

## Number of samples retained: 962

cat("Acceptance rate:", round(length(accepted)/N * 100, 2), "%\n")

## Acceptance rate: 9.62 %

cat("Posterior mean E(θ|y):", round(mean(accepted), 3), "\n")

## Posterior mean E(θ|y): 0.807

cat("95% Credible Interval:", round(quantile(accepted, 0.025), 3), ",", 
    round(quantile(accepted, 0.975), 3), "\n\n")

## 95% Credible Interval: 0.42 , 1.183

# Plot the posterior distribution
hist(accepted, breaks = 50, col = "skyblue", 
     border = "white",
     main = "Posterior Distribution of θ (Example 5.4)",
     xlab = "θ",
     freq = FALSE,
     xlim = c(-2, 4))

# Add vertical lines
abline(v = mean(accepted), col = "red", lwd = 2, lty = 1)
abline(v = quantile(accepted, 0.025), col = "darkgreen", lwd = 2, lty = 2)
abline(v = quantile(accepted, 0.975), col = "darkgreen", lwd = 2, lty = 2)
abline(v = y_bar, col = "blue", lwd = 2, lty = 3)
abline(v = true_theta, col = "purple", lwd = 2, lty = 4)

# Add legend
legend("topright",
       legend = c("Posterior Mean (0.816)", "95% Credible Interval", 
                  "Sample Mean (ȳ)", "True θ"),
       col = c("red", "darkgreen", "blue", "purple"),
       lty = c(1, 2, 3, 4),
       lwd = 2)

# Add text annotation
text(x = mean(accepted), y = 0.05, 
     labels = paste("Mean =", round(mean(accepted), 3)), 
     pos = 4, col = "red")

Key Point: Why w(θ) = α(θ) in this example

In this example:

Target: \[f(\theta) = \text{Likelihood}(\theta) \times \text{Prior}(\theta)\]
Proposal: \[g(\theta) = \text{Prior}(\theta)\] (using prior as importance sampling density)
Constant: \[M = \sup_{\theta} \frac{f(\theta)}{g(\theta)} = \sup_{\theta} \text{Likelihood}(\theta) = 1\]

Therefore:

\[ \alpha(\theta) = \frac{f(\theta)}{M \cdot g(\theta)} = \frac{\text{Likelihood}(\theta) \times \text{Prior}(\theta)}{1 \times \text{Prior}(\theta)} = \text{Likelihood}(\theta) \]

So the author’s

\[w(\theta^{(j)}) = \exp\left(-\frac{n}{2} (\theta^{(j)} - \bar{y})^2\right)\]

is exactly the acceptance probability \(\alpha(\theta)\) in this special case.

General Rejection Sampling

In general rejection sampling, you would compute:

Weight (unnormalized importance weight):
\[w(\theta) = f(\theta) / g(\theta)\]
Acceptance probability:
\[\alpha(\theta) = w(\theta) / M\]
Compare \(u \sim U(0,1)\) with \(\alpha(\theta)\).