Introduction

Accept-Reject simulation method Typically, a C.D.F. will not come up in nice closed expression for us to invert evrytime. There are many distributions for which the inverse transform method and even general transformations will fail to be able to generate the required randomvariables. When we come up against that problem, the next simulation method of choice is almost always going to be the Accept-Reject method. This method falls in the indirect simulation methods.

With indirect methods we generate samples from a simpler random variable and only accept it as the sample of our target random variable subject to a passing test.

Algorithm of Accept-Reject Method:

1: Generate Y ~ g, U ~ \(U_{[0,1]}\);

2: Accept X = Y if U \(\leq\) \(\frac{f(Y)}{M_g}\)(Y);

3: Return to 1 otherwise.

Example 1: Generate 1000 samples from Beta(4, 3) using the accept-reject method.

We know that if \(X \sim Beta(\alpha, \beta)\), then its density \(f\) is given by, \[ f(x\lvert \alpha,\beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha - 1}(1-x)^{\beta - 1} \] Plugging in \(\alpha = 4\) and \(\beta = 3\), we get \[ f(x) = 60x^3(1-x)^2 \] What is the support of f?

Any Beta distribution is supported on [0, 1]. Let us plot this density.

# Creating a continues variable
x.seq <- seq(from = 0, to = 1, length.out = 101)
f <- sapply(x.seq, function(x) 60*x^3*(1-x)^2)

# put the x and y values in a data frame
plot_data <- data.frame("x" = x.seq, "f" = f)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.0.5
# create the base layer of ggplot with geom_Line
plot_f <- ggplot(plot_data, aes(x = x, y = f)) +
          geom_line(color = "red", size = 1.5) +
          labs(x = "x", y = "Beta(4,3)")
plot_f

Now, our candidate density function g should have the same support as f and should be a valid density. Lets take g to be uniform density U(0, 1).

A naive choice for the envelope e(x) = Mg(x), would be to take M = max(f(x)). Thus, we can guarantee that e(x) \(\geq\) f(x). This naive choice works whenever we have a compact support for f(x) and can find a global maxima.

Lets find M.

We can use the optimize function in R to find the maxima!

max_p <- optimize(function(x) 60*x^3*(1-x)^2, c(0.05, 0.95), maximum = TRUE)
round(max_p$maximum, 4)
## [1] 0.6

Thus, M = f(0.6).

# Lets create our beta density f
beta.f <- function(x) {60*x^3*(1-x)^2}
M <- beta.f(0.6)
print(M)
## [1] 2.0736

Our candidate density function g(x) = 1 for any x in [0, 1] because we chose it to be uniform(0,1). Hence, our envelope function e(y) = M. Now, let us write a simple accept-reject algorithm for this example

num.samp <- 10000
beta.samples <- rep(0, num.samp)
count.samp <- 0

Setting up a while loop for to draw 1000 samples

while(count.samp < num.samp) {
# Step 1: generat sample y from g and u from unif(0,1)
y <- runif(1) # g is U(0,1)
u <- runif(1)
# Step 2: Check the test
if (u < beta.f(y)/M) {
  count.samp <- count.samp + 1
  # set X = Y
  beta.samples[count.samp] <- y
  }
}

Let us check if this worked

hist(beta.samples,prob=T,ylab="f(x)",xlab="x",ylim=c(0,max(beta.f(x.seq))),
main="Histogram of draws from Beta(4,3)" )
lines(x.seq,f,lty=2,col = "red")

Let us visualize which samples were accepted and which were rejected.

# Total number of samples
num.sim <- 3000
# testing condition : u < f/e and hence u*e < f
# e is just M, so u*e is u * M
ue <- M*runif(num.sim)
# y is a sample from g and g is U(0,1)
y <- runif(num.sim)

# create all the pairs y and u*e
mat <- cbind(y, ue)
# put them in a data frame
plot_data <- data.frame(mat)
colnames(plot_data) <- c("y", "ue")
# create a factor variable to see which were accepted
plot_data$Accepted <- ue < beta.f(y)
# show the first 5 pairs
head(plot_data,5)
##           y         ue Accepted
## 1 0.3217643 1.36190394    FALSE
## 2 0.4522343 0.06472544     TRUE
## 3 0.2537980 0.99258897    FALSE
## 4 0.3216093 0.83463968     TRUE
## 5 0.4725194 0.77717469     TRUE
# If (u*e < f(y)) then samples are accepted otherwise rejected
# Hence, in scatter plot we will color by the Accepted variable
plot_ar <- ggplot(plot_data, aes(x=y, y=ue)) +
           geom_point(shape=20, aes(color=Accepted)) +
           stat_function(fun = beta.f) +
           geom_vline(xintercept = 0.0) +
           geom_vline(xintercept = 1.0) +
           geom_hline(yintercept = M) +
           geom_hline(yintercept = 0)

# beautify : Increase font size in axes
plot_ar + 
  theme(axis.text.x = element_text(face = "bold",size = 12),
        axis.text.y =element_text(face = "bold", size = 12),
        axis.line = element_line(colour = "black",
                                 size = 1, linetype = "solid"))

Let us check what proportion of the samples were accepted

round(sum(plot_data$Accepted)/nrow(plot_data),2)
## [1] 0.48
round((1/M),2)
## [1] 0.48

Let’s check the result by using R function.

Example Generate 1000 samples from gamma distribution with alpha=4 and beta=3

beta.samples <- rbeta(10000,4,3)

# Plot histogram 
hist(beta.samples, breaks = 21, col = "steelblue")

# Plot the density 
plot(density(beta.samples))
polygon(density(beta.samples), col="steelblue", border="red")

It looks like our accept-reject method works very well.

This is not a coincidence! \(\frac{1}{M}\) is the expected proportion of candidates that are accepted. If we find a better M (i.e a tighter envelope) we will on average have fewer rejections.

A good envelope should have the following properties:
1. Exceeds the target f everywhere
2. Easy to sample from g.
3. Generate few rejections.

If f is supported on a set [a, b], then a naive but good choice for e is max(f(x)) *\(\frac{1}{(b-a)}\) We take the candidate density to be uniform U(a, b) and M = max(f(x)).

References:

  1. Colorado State University MAS Computational Course Notes.
  2. Colorado State University MAS Mathematical Probability Course Notes.
  3. Introducing Monte Carlo Methods with R-Springer



***********