We have:
Prior density for \(\theta\) is:
\[ \pi(\theta) = \frac{1}{\sqrt{2\pi} \, \theta(1-\theta)} \exp\left[ -\frac12 \left( \log\frac{\theta}{1-\theta} \right)^2 \right], \quad 0<\theta<1 \]
The posterior is:
\[ \pi(\theta \mid y) \propto \pi(\theta) \times \theta^S (1-\theta)^{n-S} \]
where \(S = \sum_{i=1}^n y_i\).
The posterior is not conjugate, so the Bayesian estimator \(E(\theta \mid y)\) is not directly available.
We want the acceptance probability for an independence Metropolis-Hastings algorithm that uses the prior as the proposal distribution, independent of current state.
Let the target distribution be the posterior:
\[ \pi(\theta \mid y) \propto \pi(\theta) \times L(\theta) \]
where: \[ L(\theta) = \prod_{i=1}^n \theta^{y_i} (1-\theta)^{1-y_i} = \theta^{S} (1-\theta)^{n-S}, \quad S = \sum_{i=1}^n y_i \]
Hence:
\[ \pi(\theta \mid y) \propto \frac{1}{\sqrt{2\pi} \, \theta(1-\theta)} \exp\left[ -\frac12 \left( \log\frac{\theta}{1-\theta} \right)^2 \right] \times \theta^{S} (1-\theta)^{n-S} \]
Proposal: \(q(\theta' \mid \theta) = \pi(\theta')\) (the prior), independent of current \(\theta\).
For MH with proposal \(q(\theta' \mid \theta)\), at state \(\theta\), propose \(\theta' \sim q(\cdot|\theta)\) and accept with probability:
\[ \alpha(\theta, \theta') = \min\left\{ 1, \frac{\pi(\theta' \mid y)}{\pi(\theta \mid y)} \times \frac{q(\theta \mid \theta')}{q(\theta' \mid \theta)} \right\} \]
Here \(q(\theta' \mid \theta) = \pi(\theta')\) by design, so \(q(\theta \mid \theta') = \pi(\theta)\).
Thus:
\[ \frac{q(\theta \mid \theta')}{q(\theta' \mid \theta)} = \frac{\pi(\theta)}{\pi(\theta')} \]
\[ \frac{\pi(\theta' \mid y)}{\pi(\theta \mid y)} = \frac{\pi(\theta') L(\theta')}{\pi(\theta) L(\theta)} \]
Multiply by \(\frac{\pi(\theta)}{\pi(\theta')}\):
\[ \alpha = \min\left\{ 1, \frac{\pi(\theta') L(\theta')}{\pi(\theta) L(\theta)} \cdot \frac{\pi(\theta)}{\pi(\theta')} \right\} \]
The \(\pi(\theta)\) and \(\pi(\theta')\) cancel, leaving:
\[ \alpha = \min\left\{ 1, \frac{L(\theta')}{L(\theta)} \right\} \]
The acceptance probability is:
\[ \boxed{\alpha(\theta, \theta') = \min\left\{ 1, \frac{(\theta')^{S} (1-\theta')^{n-S}}{\theta^{S} (1-\theta)^{n-S}} \right\}} \]
So the proposal prior cancels out: MH step reduces to accepting with probability \(\min(1, \text{likelihood ratio})\).
Step 1: Propose - Draw \(z' \sim \mathcal{N}(0, 1)\) - Set \(\theta' = \frac{e^{z'}}{1 + e^{z'}}\) (draw from prior, independent of current)
Step 2: Compute acceptance probability - Compute \(r = \frac{(\theta')^{S} (1-\theta')^{n-S}}{(\theta^{(t-1)})^{S} (1-\theta^{(t-1)})^{n-S}}\) - Set \(\alpha = \min(1, r)\)
Step 3: Accept or reject - Draw \(u \sim \text{Uniform}(0, 1)\) - If \(u < \alpha\): set \(\theta^{(t)} = \theta'\) (accept) - Else: set \(\theta^{(t)} = \theta^{(t-1)}\) (reject)
Step 4: Repeat steps 1-3 for \(t = 2, \ldots, T\)
# Generate synthetic data
set.seed(123)
n <- 100
true_theta <- 0.3
y <- rbinom(n, 1, true_theta)
S <- sum(y)
cat("Observed data: n =", n, ", S =", S, "\n")
## Observed data: n = 100 , S = 29
cat("True theta =", true_theta, "\n")
## True theta = 0.3
# Function to draw from prior (proposal distribution)
draw_from_prior <- function() {
z <- rnorm(1, 0, 1)
theta <- exp(z) / (1 + exp(z))
return(theta)
}
# Function to compute likelihood
likelihood <- function(theta, S, n) {
return(theta^S * (1-theta)^(n-S))
}
# Metropolis-Hastings independence sampler
mh_independence <- function(S, n, n_iter, initial_theta = NULL) {
# Initialize: draw from prior if not specified
if (is.null(initial_theta)) {
theta_current <- draw_from_prior()
} else {
theta_current <- initial_theta
}
# Store samples
samples <- numeric(n_iter)
samples[1] <- theta_current
acceptance_count <- 0
# MCMC loop
for (t in 2:n_iter) {
# Propose from prior (independent of current)
theta_proposed <- draw_from_prior()
# Compute likelihood ratio
L_current <- likelihood(theta_current, S, n)
L_proposed <- likelihood(theta_proposed, S, n)
# Handle numerical stability
if (L_proposed == 0 && L_current == 0) {
# Use log scale for very small values
log_r <- S * (log(theta_proposed) - log(theta_current)) +
(n-S) * (log(1-theta_proposed) - log(1-theta_current))
r <- exp(min(log_r, 700)) # Cap to avoid Inf
} else {
r <- L_proposed / L_current
}
alpha <- min(1, r)
# Accept/reject
if (runif(1) < alpha) {
theta_current <- theta_proposed
acceptance_count <- acceptance_count + 1
}
samples[t] <- theta_current
}
acceptance_rate <- acceptance_count / (n_iter - 1)
return(list(samples = samples, acceptance_rate = acceptance_rate))
}
# Run the sampler
n_iter <- 10000
results <- mh_independence(S, n, n_iter)
cat("\nAcceptance rate:", round(results$acceptance_rate * 100, 2), "%\n")
##
## Acceptance rate: 18.69 %
# Posterior mean (Bayes estimator under squared error loss)
posterior_mean <- mean(results$samples)
cat("Posterior mean (Bayes estimator):", round(posterior_mean, 4), "\n")
## Posterior mean (Bayes estimator): 0.2979
# Compare with MLE
mle <- S / n
cat("MLE:", round(mle, 4), "\n")
## MLE: 0.29
# Trace plot
plot(results$samples, type = "l",
main = "Trace Plot of θ",
xlab = "Iteration", ylab = "θ",
col = "steelblue")
abline(h = true_theta, col = "red", lwd = 2, lty = 2)
abline(h = posterior_mean, col = "darkgreen", lwd = 2)
legend("topright",
legend = c("True θ", "Posterior Mean"),
col = c("red", "darkgreen"),
lty = c(2, 1), lwd = 2)
# Histogram of posterior samples (after burn-in)
burn_in <- 1000
post_burnin <- results$samples[-(1:burn_in)]
hist(post_burnin, breaks = 50,
main = "Posterior Distribution of θ",
xlab = "θ", ylab = "Frequency",
col = "lightblue", border = "white")
abline(v = true_theta, col = "red", lwd = 2, lty = 2)
abline(v = posterior_mean, col = "darkgreen", lwd = 2)
abline(v = mle, col = "purple", lwd = 2, lty = 3)
legend("topright",
legend = c("True θ", "Posterior Mean", "MLE"),
col = c("red", "darkgreen", "purple"),
lty = c(2, 1, 3), lwd = 2)
# Autocorrelation
acf(post_burnin, main = "Autocorrelation of θ (after burn-in)")
# Effective sample size
library(coda)
effective_size <- effectiveSize(post_burnin)
cat("\nEffective sample size:", round(effective_size, 0), "\n")
##
## Effective sample size: 1249
cat("Efficiency (ESS/n_iter):", round(effective_size / length(post_burnin) * 100, 2), "%\n")
## Efficiency (ESS/n_iter): 13.88 %
summary_stats <- data.frame(
Statistic = c("Mean", "Median", "SD", "2.5% Quantile", "97.5% Quantile"),
Value = c(mean(post_burnin),
median(post_burnin),
sd(post_burnin),
quantile(post_burnin, 0.025),
quantile(post_burnin, 0.975))
)
print(summary_stats)
## Statistic Value
## 1 Mean 0.29827353
## 2 Median 0.29607928
## 3 SD 0.04419668
## 4 2.5% Quantile 0.21820202
## 5 97.5% Quantile 0.38711688
Initialization: We draw \(\theta^{(0)}\) from the prior (via \(\mathcal{N}(0,1)\) on logit scale)
Each iteration: Propose \(\theta'\) from prior (independent of current state)
Acceptance probability: \(\min(1, \text{likelihood ratio})\) — prior cancels out!
This is an independence sampler — proposals don’t depend on current state
The acceptance rate is determined solely by how well the prior matches the likelihood. If the prior is informative and close to the true \(\theta\), acceptance will be high. Otherwise, it may be low. ```