Rejection Sampling

Theory

\(\newcommand{\target}{p_{obj}} % target distribution\) \(\newcommand{\test}{p_{test}} % test, candidate distribution\) \(\newcommand{\var}{Y} % sampled variable\) \(\newcommand{\tvar}{X} % target variable\) \(\newcommand{\acc}{\text{Accept}}\) \(\newcommand{\scaling}{k}\) \(\newcommand{\R}{\mathbb{R}}\)

Good source: https://bookdown.org/rdpeng/advstatcomp/rejection-sampling.html

Goal: sample from target or objective distribution \(\target\) - known, but we cannot sample from it directly
Got: test distribution \(\test\) that we can sample from
Bound: \(\scaling \, \test(x) \geq \target(x)\) for some positive \(\scaling\)
Idea: sample from \(\test\) and accept/reject draw according to some rule, so that accepted samples look like they come from \(\target\)

Algorithm

sample \(\var \sim \test\)
sample \(U \sim \text{Uniform}[0,1]\)
if \(\test(\var)\) scaled by \(\scaling\) and \(U\) lies below \(\target(\var)\), accept:

\[ \acc\iff \quad \scaling \, \test(\var) \, U \le \target(\var) \iff U \le \frac{\target(\var)}{\scaling \, \test(\var)} \]

\(N_\acc\), the number of trials before a sample is accepted, is a geometric RV with parameter \(1 / \scaling\) (hence expected value \(\scaling\)).
Hence, \(\scaling\) should be chosen as small as possible.

\[ \mathbb{P}(\acc) = \mathbb{P}\left( U \le \frac{\target(\var)}{\scaling \, \test(\var)} \right) = \mathbb{E}\left( \frac{\target(\var)}{\scaling \, \test(\var)} \right) = \int_\mathbb{R} \frac{\target(y)}{\scaling \, \test(y)} \, \test(y) \, dy = 1 / \scaling \] - the RV \(\tvar = (\var | \acc)\) has the desired target density \(\target\)

We should show that \[ \mathbb{P}( \tvar \le t ) = \mathbb{P}( \var \le t | \acc ) = \int_{-\infty}^t \target(x) \, dx \] See lecture notes or https://bookdown.org/rdpeng/advstatcomp/rejection-sampling.html for a proof.

Simulation

knitr::opts_chunk$set(echo = TRUE) # This command sets the default option for code chunks in R Markdown to display the source code in the output document

see https://rpubs.com/janselmi/TD10-RM

use RS to generate \(n = 10^5\) samples from a RV \(\tvar\) with target density \(\target(t) = \frac{t^2}{9} 1_{[0,3]}\)
Estimate \(P(\tvar < 2)\) and compare with the theoretical result
Compute the average number of iterations of the rejection method to generate one sample, and compare with theoretical result
Create a density histogram for your generated values and compare it with the theoretical density \(\target\)

Hints

runif(n, a, b) generates \(n\) samples from \(\text{Uniform}[a,b]\)
hist makes a histogram
curve plots a curve

1. Use RS to generate samples

Bound target distribution \(\target\)

how do you check \(\target\) actually is a pdf?

\[ \int_{\R} \target(t) dt = \int_0^3 \frac{t^2}{9} dt = \frac{t^3}{27}|_0^3 = 1 \quad \text{q.e.d.} \]

what is the simplest way to bound the target distribution \(\target\)? Note that it has compact support

curve(x^2/9, from = 0, to = 3)

Just use a uniform distribution on \([0,3]\) and scale it so that its maximal value is the same as the maximal value attained by \(\target\), that is \(3^2/9 = 1\).
The maximal (constant) value of a uniform distribution on \([0,3]\) is \(1/3\) (by normalization of pdfs), so the optimal scaling factor is \(\scaling = 3\).

k = 3 # scaling parameter

curve(x^2/9, from = 0, to = 3, 
      col = "red", xlab = "x", ylab = "Density", main = "Bound target pdf")        # target pdf

curve(1/3 + 0*x, from = 0, to = 3, 
      col = "black", add = TRUE, lty = 2)                                          # uniform [0,3], dotted

curve(1 +0*x, from = 0, to = 3, 
      col = "blue", add = TRUE)                                                    # rescaled uniform [0,3] to bound the target pdf

legend(x = 0.1, y = 0.9,
       legend = c("Target: x^2 / 9", "Test: Uniform[0,3]", "Bound: k Uniform[0,3]"),
       col = c("red", "black", "blue"),
       lty = c(1, 2, 1)
       )

Implement RS

n=1e5;        # Number of samples to generate
accept = c(); # Create an empty vector. This vector will contain the samples of X
iter_vec = c(); # This vector will contain the number of iterations per sample

for (count in 1:n){        # to generate n samples
  # Rejection method here
  j=0;                     # for each sample, count how many iterations are required
  while(TRUE){             # Loop: until a sample is accepted
    #Generate Ui and Yi
    j=j+1;                # increment counter of iteration until a sample is accepted
    Ui=runif(1,0,1);      # sample U from uniform[0,1]
    Yi=runif(1,0,3);      # sample Y from test distribution (in this case, uniform[0,3], with expression test(x) = 1/3 on [0,3], 0 else)
    
    # REJECTION RULE
    # if k * test(Yi) * Ui <= target(Yi), accept Yi sample and break loop
    if( k * Ui * 1/3 <= Yi^2/9 ) {
      accept[count]=Yi;  # add Yi to accept vector
      iter_vec[count]=j; # add number of iterations required to accept sample to iter_vec
      break;
    }
  }  
}

for low values of n, let’s have a look at the output

if (n <= 10){
  print(accept)
}

if (n <= 10){
  print(iter_vec)
}

2. Estimate \(P(\tvar < 2)\) and compare with the theoretical result

The theoretical result is \[ P(\tvar < 2) = \int_{-\infty}^2 \target(t) dt = \int_{0}^2 t^2 / 9 dt = 8 / 27 \]

p_le2_theory=8/27;

to estimate this value given our samples, build the logic vector accept < 2 (returning 0 if false and 1 if true), and average it:

p_le2_rm = mean(accept<2);

# p_le2_rm = sum(accept<2)/n; # equivalent

print(paste0("The rejection method says P(X<2) = ", p_le2_rm ));

## [1] "The rejection method says P(X<2) = 0.29743"

print(paste0("...while theory says P(X<2) = ", p_le2_theory ));

## [1] "...while theory says P(X<2) = 0.296296296296296"

3. Compute the average number of iterations of the rejection method to generate one sample, and compare with theoretical result

Theory says \(\mathbb{E}(N_\acc) = \scaling = 3\)
to estimate this value given our sample, average the iter_vec vector (containing the number of iterations performed at each round)

average_n_iter = mean(iter_vec)
print(paste0("The average number of iterations per sample is ", average_n_iter ));

## [1] "The average number of iterations per sample is 2.99194"

4. Create a density histogram for your generated values and compare it with the theoretical density \(\target\)

given the generated samples accept, the function hist automatically takes care of generating an histogram, dividing the data domain over suitable intervals and computing the corresponding frequency
the flag probability = TRUE normalizes the histogram (computing relative frequencies as opposed to absolute frequencies)

hist(accept, probability = TRUE)
curve(x^2/9, from = 0, to = 3,add=TRUE)

to make the histogram finer, play with the parameter breaks

hist(accept, probability = TRUE, breaks = 50)
curve(x^2/9, from = 0, to = 3,add=TRUE)