Accept-Reject simulation method Typically, a C.D.F. will not come up in nice closed expression for us to invert evrytime. There are many distributions for which the inverse transform method and even general transformations will fail to be able to generate the required randomvariables. When we come up against that problem, the next simulation method of choice is almost always going to be the Accept-Reject method. This method falls in the indirect simulation methods.
With indirect methods we generate samples from a simpler random variable and only accept it as the sample of our target random variable subject to a passing test.
Algorithm of Accept-Reject Method: 1: Generate Y ~ g, U ~ \(U_{[0,1]}\); 2: Accept X = Y if U \(\leq\) \(\frac{f(Y)}{M_g}\)(Y); 3: Return to 1 otherwise.
Example 1: Generate 1000 samples from Beta(4, 3) using the accept-reject method.
We know that if \(X \sim Beta(\alpha, \beta)\), then its density \(f\) is given by, \[ f(x\lvert \alpha,\beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha - 1}(1-x)^{\beta - 1} \] Plugging in \(\alpha = 4\) and \(\beta = 3\), we get \[ f(x) = 60x^3(1-x)^2 \] What is the support of f?
Any Beta distribution is supported on [0, 1]. Let us plot this density.
# Creating a continues variable
x.seq <- seq(from = 0, to = 1, length.out = 101)
f <- sapply(x.seq, function(x) 60*x^3*(1-x)^2)
# put the x and y values in a data frame
plot_data <- data.frame("x" = x.seq, "f" = f)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.0.5
# create the base layer of ggplot with geom_Line
plot_f <- ggplot(plot_data, aes(x = x, y = f)) +
geom_line(color = "red", size = 1.5) +
labs(x = "x", y = "Beta(4,3)")
plot_f
Now, our candidate density function g should have the same support as f and should be a valid density. Lets take g to be uniform density U(0, 1).
A naive choice for the envelope e(x) = Mg(x), would be to take M = max(f(x)). Thus, we can guarantee that e(x) \(\geq\) f(x). This naive choice works whenever we have a compact support for f(x) and can find a global maxima.
Lets find M.
We can use the optimize function in R to find the maxima!
max_p <- optimize(function(x) 60*x^3*(1-x)^2, c(0.05, 0.95), maximum = TRUE)
round(max_p$maximum, 4)
## [1] 0.6
Thus, M = f(0.6).
# Lets create our beta density f
beta.f <- function(x) {60*x^3*(1-x)^2}
M <- beta.f(0.6)
print(M)
## [1] 2.0736
Our candidate density function g(x) = 1 for any x in [0, 1] because we chose it to be uniform(0,1). Hence, our envelope function e(y) = M. Now, let us write a simple accept-reject algorithm for this example
num.samp <- 10000
beta.samples <- rep(0, num.samp)
count.samp <- 0
Setting up a while loop for to draw 1000 samples
while(count.samp < num.samp) {
# Step 1: generat sample y from g and u from unif(0,1)
y <- runif(1) # g is U(0,1)
u <- runif(1)
# Step 2: Check the test
if (u < beta.f(y)/M) {
count.samp <- count.samp + 1
# set X = Y
beta.samples[count.samp] <- y
}
}
Let us check if this worked
hist(beta.samples,prob=T,ylab="f(x)",xlab="x",ylim=c(0,max(beta.f(x.seq))),
main="Histogram of draws from Beta(4,3)" )
lines(x.seq,f,lty=2,col = "red")
Let us visualize which samples were accepted and which were rejected.
# Total number of samples
num.sim <- 3000
# testing condition : u < f/e and hence u*e < f
# e is just M, so u*e is u * M
ue <- M*runif(num.sim)
# y is a sample from g and g is U(0,1)
y <- runif(num.sim)
# create all the pairs y and u*e
mat <- cbind(y, ue)
# put them in a data frame
plot_data <- data.frame(mat)
colnames(plot_data) <- c("y", "ue")
# create a factor variable to see which were accepted
plot_data$Accepted <- ue < beta.f(y)
# show the first 5 pairs
head(plot_data,5)
## y ue Accepted
## 1 0.3217643 1.36190394 FALSE
## 2 0.4522343 0.06472544 TRUE
## 3 0.2537980 0.99258897 FALSE
## 4 0.3216093 0.83463968 TRUE
## 5 0.4725194 0.77717469 TRUE
# If (u*e < f(y)) then samples are accepted otherwise rejected
# Hence, in scatter plot we will color by the Accepted variable
plot_ar <- ggplot(plot_data, aes(x=y, y=ue)) +
geom_point(shape=20, aes(color=Accepted)) +
stat_function(fun = beta.f) +
geom_vline(xintercept = 0.0) +
geom_vline(xintercept = 1.0) +
geom_hline(yintercept = M) +
geom_hline(yintercept = 0)
# beautify : Increase font size in axes
plot_ar +
theme(axis.text.x = element_text(face = "bold",size = 12),
axis.text.y =element_text(face = "bold", size = 12),
axis.line = element_line(colour = "black",
size = 1, linetype = "solid"))
Let us check what proportion of the samples were accepted
round(sum(plot_data$Accepted)/nrow(plot_data),2)
## [1] 0.48
round((1/M),2)
## [1] 0.48
Let’s check the result by using R function.
Example Generate 1000 samples from gamma distribution with alpha=4 and beta=3
beta.samples <- rbeta(10000,4,3)
# Plot histogram
hist(beta.samples, breaks = 21, col = "steelblue")
# Plot the density
plot(density(beta.samples))
polygon(density(beta.samples), col="steelblue", border="red")
It looks like our accept-reject method works very well.
This is not a coincidence! \(\frac{1}{M}\) is the expected proportion of candidates that are accepted. If we find a better M (i.e a tighter envelope) we will on average have fewer rejections.
A good envelope should have the following properties: 1. Exceeds the target f everywhere 2. Easy to sample from g. 3. Generate few rejections.
If f is supported on a set [a, b], then a naive but good choice for e is max(f(x)) *\(\frac{1}{(b-a)}\) We take the candidate density to be uniform U(a, b) and M = max(f(x)).
References: