Highest Posterior Density (HPD) Intervals

Introduction

This document explains Highest Posterior Density (HPD) intervals—a Bayesian credible interval that contains only the most probable values of a parameter.

The Problem with Simple Credible Intervals

A standard credible interval of level \(1-\alpha\) is any interval \((a, b)\) such that:

\[P(a < \theta < b \mid y) = 1-\alpha\]

However, this definition does not guarantee the interval contains the highest probability regions. In multi-modal or skewed posterior distributions, an equal-tailed interval may include low-probability “valleys” while excluding high-probability “peaks.”

Your text notes: “Such an interval may leave out high density areas. To guard against such problems we define the highest posterior density, or HPD, intervals.”

Definition of HPD Interval

A credible interval satisfying \(P(\theta \in (a, b) \mid y) = 1-\alpha\) is an HPD credible interval if:

\[\pi(\theta \mid y) \geq \pi(\psi \mid y) \text{ for all } \theta \in (a, b) \text{ and } \psi \notin (a, b)\]

In plain English: Every point inside the interval has a higher posterior density than every point outside the interval.

Generalization to Multiple Parameters

For a parameter vector \(\theta\):

A set \(A\) is a \(100(1-\alpha)\%\) credible set if \(P(\theta \in A \mid y) = 1-\alpha\)
A set \(A\) is an HPD credible region if \(\pi(\theta \mid y) \geq \pi(\psi \mid y)\) for all \(\theta \in A\) and \(\psi \notin A\)

Steps to Calculate an HPD Interval

Conceptual Algorithm

Start at the peak: Find the posterior mode \(\hat{\theta}\) (where \(\pi(\theta \mid y)\) is maximized)
Set a horizontal line: Begin at height \(h = \pi(\hat{\theta} \mid y)\)
Lower the line gradually: Decrease \(h\)
Form a region: At each level \(h\), define \(C(h) = \{\theta: \pi(\theta \mid y) \geq h\}\)
Check coverage: Calculate \(P(\theta \in C(h) \mid y) = \int_{C(h)} \pi(\theta \mid y) d\theta\)
Find target threshold: Find \(h^*\) such that \(P(\theta \in C(h^*) \mid y) = 1-\alpha\)
Result: The HPD region is \(C(h^*)\)

Visual Demonstration

# Create a simple visualization
x <- seq(-4, 4, length.out = 500)
y <- dnorm(x, 0, 1)

plot(x, y, type = "l", lwd = 2, 
     main = "HPD Interval for a Normal Posterior",
     xlab = expression(theta), ylab = expression(pi(theta*"|"*y)))
abline(h = 0, col = "gray")
abline(v = 0, lty = 2, col = "gray")

# Add a horizontal line for HPD threshold
h_level <- dnorm(qnorm(0.025, 0, 1), 0, 1)
abline(h = h_level, col = "red", lwd = 2, lty = 2)

# Shade the HPD region
lower <- qnorm(0.025, 0, 1)
upper <- qnorm(0.975, 0, 1)
polygon(c(x[x >= lower & x <= upper], upper, lower),
        c(y[x >= lower & x <= upper], 0, 0),
        col = rgb(0.3, 0.5, 0.8, 0.4), border = NA)

legend("topright", legend = c("Posterior", "HPD threshold", "95% HPD region"),
       col = c("black", "red", rgb(0.3, 0.5, 0.8, 0.4)),
       lwd = c(2, 2, NA), fill = c(NA, NA, rgb(0.3, 0.5, 0.8, 0.4)))

Special Case: Normal-Normal Model

Model Setup

Prior: \(\theta \sim N(\mu, \tau^2)\)
Likelihood: \(y_i \sim N(\theta, \sigma^2)\) for \(i=1, \ldots, n\)
Posterior: \(\theta \mid y \sim N(\mu_n, \sigma_n^2)\) (Normal distribution)

Posterior Parameters

Posterior precision (inverse variance):

\[\frac{1}{\sigma_n^2} = \frac{n}{\sigma^2} + \frac{1}{\tau^2}\]

Posterior mean (weighted average):

\[\mu_n = \frac{\frac{n\bar{y}}{\sigma^2} + \frac{\mu}{\tau^2}}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}}\]

The 95% HPD Interval Formula

For a Normal posterior (unimodal and symmetric), the HPD interval equals the equal-tailed credible interval:

\[\mu_n \pm 1.96 \times \left(\frac{n}{\sigma^2} + \frac{1}{\tau^2}\right)^{-1/2}\]

Note: The text you provided had a typographical error in the exponent. The standard deviation is \(\sqrt{\sigma_n^2}\), not \(\sqrt[4]{\sigma_n^2}\).

Why Normal Posterior is Special

For a symmetric, unimodal distribution:

The HPD interval = shortest credible interval
The shortest interval is centered on the mean/mode
Therefore, HPD = equal-tailed credible interval

# Demonstration: For Normal posterior, HPD = equal-tailed interval
alpha <- 0.05
z_critical <- qnorm(1 - alpha/2)

# Example parameters
mu_n <- 48.5
sigma_n <- 0.85

# HPD (and equal-tailed) interval
lower_hpd <- mu_n - z_critical * sigma_n
upper_hpd <- mu_n + z_critical * sigma_n

cat("95% HPD Interval:", round(lower_hpd, 2), "to", round(upper_hpd, 2), "\n")

## 95% HPD Interval: 46.83 to 50.17

cat("Interval width:", round(upper_hpd - lower_hpd, 3), "\n")

## Interval width: 3.332

New York Air Pollution Example

Data Context

Applying the Normal-Normal model to New York air pollution data:

Case 1: Prior parameters \(k=1\), \(m=10\) → HPD = (46.20, 49.61)
Case 2: Prior parameters \(k=1\), \(m=1\) → HPD = (46.62, 49.60)

Interpretation

The parameter \(m\) controls prior precision:

Smaller \(m\) → larger \(\tau^2\) → more diffuse prior
When \(m=1\), the prior is more diffuse than when \(m=10\)
This slightly shifts the posterior mean and changes interval width
Both intervals strongly concentrate around 47-49, suggesting robust inference

# Simulate the NY air pollution example
set.seed(42)

# Hypothetical data (posterior summaries from the text)
posterior_means <- c(47.905, 48.11)   # Approximated from interval midpoints
posterior_sds <- c(0.87, 0.76)        # Back-calculated from interval widths

case_labels <- c("k=1, m=10", "k=1, m=1")
z <- qnorm(0.975)

for(i in 1:2) {
  lower <- posterior_means[i] - z * posterior_sds[i]
  upper <- posterior_means[i] + z * posterior_sds[i]
  cat(case_labels[i], ":\n")
  cat("  95% HPD = (", round(lower, 2), ",", round(upper, 2), ")\n")
  cat("  Width =", round(upper - lower, 3), "\n\n")
}

## k=1, m=10 :
##   95% HPD = ( 46.2 , 49.61 )
##   Width = 3.41 
## 
## k=1, m=1 :
##   95% HPD = ( 46.62 , 49.6 )
##   Width = 2.979

Calculation Methods by Distribution Type

Distribution Type	HPD Calculation Method
Symmetric & Unimodal (Normal, t)	Equal-tailed interval: mean ± quantile × SE
Skewed (Gamma, Beta, Lognormal)	Numerically find narrowest interval covering \(1-\alpha\) probability (grid search or optimization)
Multi-modal	HPD region may be disjoint (multiple intervals). Use horizontal line method

Computational Approach with MCMC

When posterior samples are available (e.g., from Stan, JAGS, or PyMC):

# Example: Finding HPD from MCMC samples
set.seed(123)

# Simulate skewed posterior samples (e.g., Gamma posterior)
posterior_samples <- rgamma(10000, shape = 5, scale = 2)

# Function to compute HPD (simplified)
compute_hpd <- function(samples, prob = 0.95) {
  cred_mass <- prob
  alpha <- 1 - cred_mass
  n <- length(samples)
  sorted_samples <- sort(samples)
  
  # Number of samples to include
  interval_size <- floor(cred_mass * n)
  
  # Find shortest interval
  widths <- sorted_samples[(interval_size):n] - sorted_samples[1:(n - interval_size + 1)]
  idx <- which.min(widths)
  
  c(lower = sorted_samples[idx], 
    upper = sorted_samples[idx + interval_size - 1])
}

hpd_interval <- compute_hpd(posterior_samples)

cat("95% HPD Interval from MCMC samples:\n")

## 95% HPD Interval from MCMC samples:

cat("(", round(hpd_interval[1], 3), ",", round(hpd_interval[2], 3), ")\n")

## ( 2.508 , 18.634 )

# Compare with equal-tailed (which would be inappropriate here)
equal_tailed <- quantile(posterior_samples, c(0.025, 0.975))
cat("\nEqual-tailed 95% interval (for comparison):\n")

## 
## Equal-tailed 95% interval (for comparison):

cat("(", round(equal_tailed[1], 3), ",", round(equal_tailed[2], 3), ")\n")

## ( 3.266 , 20.138 )

cat("Equal-tailed width:", round(equal_tailed[2] - equal_tailed[1], 3), "\n")

## Equal-tailed width: 16.872

cat("HPD width:", round(hpd_interval[2] - hpd_interval[1], 3), "\n")

## HPD width: 16.127

cat("HPD is narrower by:", 
    round((equal_tailed[2] - equal_tailed[1]) - (hpd_interval[2] - hpd_interval[1]), 4))

## HPD is narrower by: 0.7449

Using Specialized Packages

In R:

# install.packages("coda")
library(coda)
HPDinterval(as.mcmc(posterior_samples), prob = 0.95)

In Python:

# pip install arviz
import arviz as az
az.hdi(posterior_samples, hdi_prob=0.95)

Key Takeaways

HPD intervals contain only the highest density regions of the posterior
For Normal posteriors, HPD = equal-tailed credible interval (due to symmetry)
For skewed or multi-modal posteriors, HPD differs from equal-tailed intervals and is generally narrower (more precise)
Computation requires more work than simple quantiles—either analytical derivation or numerical methods
HPD is the preferred credible interval because it doesn’t exclude high-probability areas of the posterior