knitr::opts_chunk$set(echo = TRUE)
f <- function(x) {
(1/3) * dbeta(x, 1, 5) +
(1/3) * dbeta(x, 3, 5) +
(1/3) * dbeta(x, 10, 5)
}
x <- seq(0,1, length.out = 100000)
y <- f(x)
plot(x, y, type = "l", col="blue",
lwd = 2, main = "Mixture of Beta Distributions",
xlab = expression(theta), ylab = "Density")
I plotted the density function f(x) where:
\[
f(x) =
\frac{1}{3}\operatorname{Be}(\alpha_1 = 1,\beta_1 = 5) +
\frac{1}{3}\operatorname{Be}(\alpha_2 = 3,\beta_2 = 5) +
\frac{1}{3}\operatorname{Be}(\alpha_3 = 10, \beta_3 = 5)
\]
set.seed(1234)
K <- max(y)
x_K <- x[which.max(y)]
K
## [1] 1.666667
x_K
## [1] 0
N <- 10000
x_proposal <- runif(N, 0, 1)
u <- runif(N, 0, 1)
accept_prob <- f(x_proposal) / K
accepted_values <- x_proposal[u <= accept_prob]
length(accepted_values)
## [1] 5978
Using K <- max(y), we find the largest value of the
density values stored in y. This value becomes the upper
bound for the rejection sampling. The maximum density occurs at
x = 0, which is found using x[which.max(y)].
Using the given function, at x = 0, the first beta density
has value 5, while the other two beta densities have a value of 0,
giving us,
\(f(0) = \frac{1}{3}(5) + \frac{1}{3}(0) + \frac{1}{3}(0) = \frac{5}{3} \approx 1.6667\).
Therefore, K = 1.6667 and the corresponding value is
x = 0.
We then simulate 10,000 candidate values, stored in
x_proposal, from a Uniform(0,1)
distribution.
A corresponding value is then generated in u which is
used to decide whether to accept or reject the proposed values.
accept_prob calculates the acceptance probability for
every proposed value. Since the proposal distribution is
Uniform(0,1), its density is \(q(x) = 1\), therefore giving us a rejection
sampling acceptance probability of \(\frac{f(x)}{K}\). We then compare the
values in u with accept_prob and store the
resulting accepted values in accepted_values.
observed_acceptance_rate <- length(accepted_values) / N
observed_acceptance_rate
## [1] 0.5978
theoretical_acceptance_rate <- 1 / K
theoretical_acceptance_rate
## [1] 0.6
The observed acceptance rate is calculated as the number of accepted values divided by the total number of proposed values:
\[ \text{Observed acceptance rate} = \frac{\text{number of accepted values}}{N} \]
The theoretical acceptance rate for rejection sampling is:
\[ \text{Theoretical acceptance rate} = \frac{1}{K} \]
This is because the proposal distribution is
Uniform(0,1), so q(x) = 1 on the interval
[0,1], and the target density integrates to 1. Since \(K \approx 1.6667\),
\[ \frac{1}{K} = \frac{1}{1.6667} \approx 0.6 \]
Therefore, the theoretical acceptance rate is approximately 0.6. The observed acceptance rate should be close to this value, although it may not be exactly equal due to random simulation.
theta_imp <- x_proposal
q_theta <- dunif(theta_imp, 0, 1)
weights <- f(theta_imp) / q_theta
norm_weights <- weights / sum(weights)
mean_imp <- sum(theta_imp * norm_weights)
var_imp <- sum((theta_imp - mean_imp)^2 * norm_weights)
mean_imp
## [1] 0.4048396
var_imp
## [1] 0.06148562
head(weights)
## [1] 1.3076239 1.2601404 1.2414027 1.2615142 0.3347256 1.2785976
For the importance sampling algorithm, I used the same 10,000 values
generated from the Uniform(0,1) proposal distribution in
the accept/reject algorithm. These values are stored in
theta_imp.
The importance weights are calculated as
\[ w_i = \frac{f(\theta_i)}{q(\theta_i)} \]
where \(f(\theta_i)\) is the target
density and \(q(\theta_i)\) is the
proposal density. Since the proposal distribution is
Uniform(0,1), we have \(q(\theta_i) = 1\) for \(\theta_i \in [0,1]\). Therefore, the
weights simplify to
\[ w_i = f(\theta_i). \]
The weights are then normalised by dividing each weight by the sum of all weights.
density_accept_reject <- density(
accepted_values,
from = 0,
to = 1,
cut = 0
)
bw_imp <- bw.nrd0(theta_imp)
density_importance <- density(
theta_imp,
weights = norm_weights,
bw = bw_imp,
from = 0,
to = 1,
cut = 0
)
plot(x, y, type = "l", col = "blue", lwd = 2,
ylim = c(0, 2), main = "Comparison of Target and Estimated Densities",
xlab = expression(theta), ylab = "Density")
lines(
density_accept_reject, col = "red",
lwd = 2, lty = 2
)
lines(
density_importance, col = "green",
lwd = 2, lty = 3
)
legend(
"topright",
legend = c("Target Density", "Accept/Reject Density", "Importance Sampling Weighted Density"),
col = c("blue", "red", "green"),
lwd = 2, lty = c(1,2,3)
)
The blue curve shows the true target density. The red dashed curve
shows the density estimate from the values accepted by the accept/reject
algorithm. The green dotted curve shows the weighted density estimate
from the importance sampling algorithm, where the normalised importance
weights are included using the weights argument in the
density() function.
Both simulated density estimates are close to the target density, although they are not exactly identical because they are based on random simulation.
From the plot, the accept/reject algorithm appears to approximate the target density slightly better in this simulation. The density of the accepted values closely follows the shape of the target density, and the observed acceptance rate is approximately 0.5978, which is very close to the theoretical acceptance rate of 0.6. This shows that the accept/reject algorithm is working efficiently for this target distribution.
The importance sampling algorithm also provides a reasonable approximation, and it has the advantage of using all 10,000 simulated values. However, it relies on the importance weights to correct for the difference between the proposal distribution and the target distribution. This can make the approximation more sensitive to the behaviour of the weights.
I would use the accept/reject algorithm for this problem because the acceptance rate is reasonably high, the method is simple, and the accepted values are direct samples from the target distribution. Importance sampling is still useful, especially for estimating expectations, but for simulating values from this target density, accept/reject sampling is more direct.