Overview of Acceptance/Rejection Sampling

Ihsan E. Buker

Introduction

  • Simulating data is an essential components of statistical research.
  • If cdf is known, pdf is tractable, etc. direct methods can be utilized.
    • Ex: inverse transform sampling

Introduction: Inverse Transform Sampling

Continuous Case:

  • Generate \(U \sim \operatorname{Unif}(0,1)\)
  • Let \(X = F_X^{-1}(U=u)\)

Discrete Case:

  • Generate \(U \sim \operatorname{Unif}(0,1)\)
  • Find \(k\) s.t. \(\sum_{j=1}^{k-1} p_j \leq U < \sum_{j=1}^k p_j\).
  • Let \(X = x_k\)

Direct approaches form the basis of functions such as r* from R, or rand from SAS.

Motivation

  • What if we do not have the cdf, or cannot invert it?
  • What if the density is not a proper pdf/pmf?
    • i.e., \(\int^\infty_{-\infty}f(x) \ \text{d}x \neq 1\)

This is where indirect methods come into play!

Motivation: Example

Consider an alternative game of darts.

Motivation: Example

Consider an alternative game of darts.

Motivation: Example

Consider an alternative game of darts.

Motivation: Example

Now, let’s start throwing our darts!

Motivation: Example

Now, let’s start throwing our darts!

Motivation: Example

After throwing many darts, we look at the distribution…

Motivation: Example

… and discard the blue observations.

Motivation: Example

Without having to fully evaluate the target density, we were able to sample from it!

The Accept/Reject Algorithm: Notation

  • \(Y \sim f_Y(y)\) and \(V \sim f_V(V)\).
    • \(\operatorname{supp}(f_Y) \subseteq \operatorname{supp}(f_V)\).
  • \(M \equiv \sup_y \frac{f_Y(y)}{f_V(y)} < \infty\)
    • Ensures \(M \cdot f_V(y) \geq f_Y(y) \ \forall y\)

The Accept/Reject Algorithm: Procedure

To generate \(Y \sim f_Y\):

  1. Generate \(U \sim \operatorname{Unif}(0,1)\)

  2. Generate \(V \sim f_V\) s.t. \(U \perp V\)

  3. If \(U < \frac{1}{M} \cdot \frac{f_Y(V)}{f_V(V)}\), set \(Y=V\); else, discard \(V\).

The Accept/Reject Algorithm: Proof

Consider the cdf of \(Y\):

\[ \begin{aligned} P(Y \leq y) &= P(V \leq y \mid \text{accepted } V) \\ &= P\left(V \leq y \mid U < \frac{1}{M} \cdot \frac{f_Y(V)}{f_V(V)}\right) \\ &= \frac{P\left(V \leq y, U < \frac{1}{M} \cdot \frac{f_Y(V)}{f_V(V)}\right)}{P\left(U < \frac{1}{M} \cdot \frac{f_Y(V)}{f_V(V)}\right)} \end{aligned} \]

The Accept/Reject Algorithm: Proof

Consider the numerator of the previous expression. Let us condition it upon \(V=v\) and integrate over the support of \(V\) to account for this modification.

\[ \begin{aligned} .&= \int_{V} P\left(V \leq y, U < \frac{1}{M} \cdot \frac{f_Y(V)}{f_V(V)} \mid V=v \right) \cdot f_V(v) \ \text{d}v \\ &= \int_{V} P\left(v \leq y, U < \frac{1}{M} \cdot \frac{f_Y(v)}{f_V(v)} \right) \cdot f_V(v) \ \text{d}v \\ &= \int_{V} I(v \leq y) \cdot P\left(U < \frac{1}{M} \cdot \frac{f_Y(v)}{f_V(v)} \right) \cdot f_V(v) \ \text{d}v \\ \end{aligned} \]

The Accept/Reject Algorithm: Proof

…continued:

\[ \begin{aligned} .&= \int_{V} I(v \leq y) \cdot P\left(U < \frac{1}{M} \cdot \frac{f_Y(v)}{f_V(v)} \right) \cdot f_V(v) \ \text{d}v \\ &= \int^{y}_{-\infty} \frac{1}{M} \cdot \frac{f_Y(v)}{f_V(v)} \cdot f_V(v) \ \text{d}v \\ &= \frac{1}{M} \int^{y}_{-\infty} f_Y(v) \ \text{d}v \end{aligned} \]

The Accept/Reject Algorithm: Proof

Recall the denominator of the very first expression:

\[ \begin{aligned} P\left(U < \frac{1}{M} \cdot \frac{f_Y(V)}{f_V(V)}\right) &= \int_V P\left(U < \frac{1}{M} \cdot \frac{f_Y(V)}{f_V(V)} \mid V=v \right) \cdot f_V(v) \ \text{d}v \\ &= \int_V P\left(U < \frac{1}{M} \cdot \frac{f_Y(v)}{f_V(v)}\right) \cdot f_V(v) \ \text{d}v \\ &= \int_V \frac{1}{M} \cdot \frac{f_Y(v)}{f_V(v)} \cdot f_V(v) \ \text{d}v \\ &= \frac{1}{M} \int_V f_Y(v) \ \text{d}v \end{aligned} \]

The Accept/Reject Algorithm: Proof

Lastly, we combine the simplified numerator and denominator:

\[ \begin{aligned} P(Y \leq y) &= \frac{\frac{1}{M} \int^{y}_{-\infty} f_Y(v) \ \text{d}v}{\frac{1}{M}} \\ &= \int^{y}_{-\infty} f_Y(v) \ \text{d}v \\ &\implies Y \sim f_Y \end{aligned} \]

The Accept/Reject Algorithm: Interesting Result

Let \(N\) be the number of iterations needed to generate one sample from the target distribution.

  • \(N \sim \operatorname{Geom}(p = 1/M) \implies \mathbb{E}(N) = M\)
  • Smallest \(M\) satisfying \(M \cdot f_V(y) \geq f_Y(y) \ \forall y\) should be chosen to increase efficiency of algorithm.

Application of Algorithm

Consider the target density \(f_X(x) = \sin(x^2); \ 0 \leq x \leq \sqrt{\pi}\).

Application of Algorithm

A good proposal distribution may be \(f_Y(y) = N(1.25, 0.4); -\infty \leq x \leq \infty\).

Application of Algorithm

\[ M = \sup_x \left(\frac{\text{Target Density}}{\text{Proposal Density}}\right) \]

target <-
  function(x) {
    sin(x^2)
  }

proposal <-
  function(x) {
    dnorm(x = x, mean = 1.25, sd = 0.4)
  }

x <- seq(from = 0, to = sqrt(pi), by = 0.001)

M <- max(target(x) / proposal(x), na.rm = TRUE)

Application of Algorithm

Application of Algorithm

Assume that we want a sample of size \(n_x=500\).

set.seed(7423)

y <- c()
n <- 0

for(i in 1:1e5){
  u <- runif(1, 0,1)
  norm_val <- rnorm(1, mean = 1.25, sd = 0.4)
  
  ratio <- 
    sin(norm_val^2) / 
    {dnorm(norm_val, mean = 1.25, sd = 0.4)*M}
  
  y[i] <- ifelse(u < ratio, norm_val, NA)
  n <- sum(!is.na(y))
  if(n == 500) break
}

Application of Algorithm

set.seed(7423)

y <- c()
n <- 0

for (i in 1:1e5) {
  u <- runif(1, 0, 1)
  norm_val <- rnorm(1, mean = 1.25, sd = 0.4)

  ratio <-
    sin(norm_val^2) / {
      dnorm(norm_val, mean = 1.25, sd = 0.4) * M
    }

  y[i] <- ifelse(u < ratio, norm_val, NA)
  n <- sum(!is.na(y))
  if (n == 500) break
}

Application of Algorithm

Review of Algorithm

  • 56 % of proposals accepted.
  • 771 iterations expected to generate 500 samples.
    • Our sampling was less efficient than expected.
  • Generating 500 samples took 0.088 seconds.
    • Could be improved via vectorization.

Various Scenarios: Larger M

Various Scenarios: Bad Proposal Distribution

Consider \(f(x) = \Gamma(\alpha = 2, \beta = 5); 0 \leq x\).

Various Scenarios: Bad Proposal Distribution

References

Chen, Yen-Chi. n.d. “Lecture 4: Importance Sampling and Rejection Sampling.”
Choo, Nathaniel, and Kyle Kolsti. 2018. “Sampling from Arbitrary Distributions Using the Rejection Method.” STAT COE-Report-37-2018.
Peng, Roger. n.d. Advanced Statistical Computing.
Sachdeva, Kapil. 2021. “What Is Rejection Sampling?” Medium. https://towardsdatascience.com/what-is-rejection-sampling-1f6aff92330d.