The Cauchy distribution with parameter \(\alpha\) has pdf \[ f_X(x) = \frac{\alpha}{\pi(\alpha^2 + x^2)}, \quad -\infty < x < \infty. \]
Deriving the CDF. Integrating the pdf: \[ F_X(x) = \int_{-\infty}^x \frac{\alpha}{\pi(\alpha^2 + t^2)}\,dt. \] Substituting \(t = \alpha u\) (\(dt = \alpha\,du\)): \[ F_X(x) = \int_{-\infty}^{x/\alpha} \frac{\alpha}{\pi\,\alpha^2(1+u^2)}\,\alpha\,du = \frac{1}{\pi}\int_{-\infty}^{x/\alpha}\frac{du}{1+u^2} = \frac{1}{\pi}\arctan\!\left(\frac{x}{\alpha}\right) + \frac{1}{2}. \]
Inverse CDF (quantile function). Setting \(u = F_X(x)\) and solving for \(x\): \[ u - \frac{1}{2} = \frac{1}{\pi}\arctan\!\left(\frac{x}{\alpha}\right) \implies x = F_X^{-1}(u) = \alpha\tan\!\bigl(\pi(u - \tfrac{1}{2})\bigr). \]
Simulation algorithm: generate \(U \sim \text{Uniform}(0,1)\), then \(X = \alpha\tan(\pi(U - \tfrac{1}{2}))\).
rcauchy_custom <- function(n, alpha = 1) {
u <- runif(n)
alpha * tan(pi * (u - 0.5))
}
# Verify against built-in Cauchy with scale = alpha
set.seed(42)
alpha_test <- 2
x_custom <- rcauchy_custom(5000, alpha = alpha_test)
x_builtin <- rcauchy(5000, location = 0, scale = alpha_test) # scale = alpha
# Compare empirical CDFs
plot(ecdf(x_custom[abs(x_custom) < 30]),
main = paste0("ECDF comparison: custom vs built-in Cauchy(alpha=", alpha_test, ")"),
xlab = "x", col = "steelblue", lwd = 2, xlim = c(-30, 30))
lines(ecdf(x_builtin[abs(x_builtin) < 30]), col = "tomato", lwd = 2, lty = 2)
legend("topleft", c("Custom (inversion)", "Built-in rcauchy"),
col = c("steelblue","tomato"), lwd = 2, lty = 1:2, bty = "n")
The two ECDFs are visually indistinguishable, confirming the
inversion sampler is correct. Note: R’s
rcauchy(n, location=0, scale=alpha) parameterises the
Cauchy distribution with scale \(= \alpha\), exactly matching the pdf \(f_X(x) = \alpha/[\pi(\alpha^2+x^2)]\).
We use the Cauchy pdf \(f_C(x;\alpha) = \dfrac{\alpha}{\pi(\alpha^2+x^2)}\) as an envelope for \(\phi(x) = \dfrac{1}{\sqrt{2\pi}}e^{-x^2/2}\) (the standard normal density). We need \(k\,f_C(x;\alpha) \ge \phi(x)\) for all \(x\), so the minimum valid \(k\) is:
\[ k(\alpha) = \sup_x \frac{\phi(x)}{f_C(x;\alpha)} = \sup_x \frac{\dfrac{1}{\sqrt{2\pi}}e^{-x^2/2}}{\dfrac{\alpha}{\pi(\alpha^2+x^2)}} = \sup_x \frac{\pi(\alpha^2+x^2)}{\alpha\sqrt{2\pi}}\,e^{-x^2/2}. \]
Finding the supremum. Differentiating with respect to \(x\) and setting to zero: \[ x\bigl[2 - \alpha^2 - x^2\bigr] = 0 \implies x = 0 \text{ or } x^2 = 2 - \alpha^2 \quad (\alpha < \sqrt{2}). \] For \(\alpha < \sqrt{2}\), the global maximum occurs at \(x^2 = 2 - \alpha^2\), giving: \[ k(\alpha) = \frac{\pi \cdot 2 \cdot e^{-(2-\alpha^2)/2}}{\alpha\sqrt{2\pi}} = \frac{\sqrt{2\pi}}{\alpha}\,e^{-1+\alpha^2/2}. \]
Minimising \(k\) over \(\alpha\). Taking \(dk/d\alpha = 0\): \[ \frac{d}{d\alpha}\!\left[\frac{e^{\alpha^2/2}}{\alpha}\right] = e^{\alpha^2/2}\left(1 - \frac{1}{\alpha^2}\right) = 0 \implies \alpha^* = 1. \] The optimal constants are \(\alpha^* = 1\) and \[ k^* = \sqrt{2\pi}\,e^{-1/2} = \sqrt{2\pi/e} \approx 1.5203. \] The theoretical acceptance probability is \(1/k^* = \sqrt{e/(2\pi)} \approx 65.8\%\).
alpha_star <- 1
k_star <- sqrt(2 * pi) * exp(-0.5)
cat(sprintf("Optimal alpha: %g\n", alpha_star))
## Optimal alpha: 1
cat(sprintf("Optimal k*: %.4f (acceptance prob = %.4f)\n", k_star, 1/k_star))
## Optimal k*: 1.5203 (acceptance prob = 0.6577)
rn_rej <- function(n, alpha = 1) {
k <- sqrt(2 * pi) * exp(-0.5)
samples <- numeric(n)
n_accept <- 0L
n_total <- 0L
while (n_accept < n) {
y <- alpha * tan(pi * (runif(1) - 0.5)) # Y ~ Cauchy(alpha)
u2 <- runif(1)
# envelope value at y
env <- k * alpha / (pi * (alpha^2 + y^2))
if (u2 * env <= dnorm(y)) {
n_accept <- n_accept + 1L
samples[n_accept] <- y
}
n_total <- n_total + 1L
}
cat(sprintf("Empirical acceptance rate: %.4f (theoretical: %.4f)\n",
n / n_total, 1 / k))
samples
}
set.seed(42)
samp_norm <- rn_rej(1000)
## Empirical acceptance rate: 0.6596 (theoretical: 0.6577)
hist(samp_norm, breaks = 40, freq = FALSE,
main = "N(0,1) via Rejection Sampling (Cauchy envelope)",
xlab = "x", col = "steelblue", border = "white", ylim = c(0, 0.46))
curve(dnorm(x), add = TRUE, col = "red", lwd = 2)
legend("topright", "N(0,1) density", col = "red", lwd = 2, bty = "n")
Histogram of 1000 N(0,1) samples via rejection sampling. Red curve is the theoretical N(0,1) density.
A \(\text{Gamma}(2,1)\) variable has density \(f_X(x) = x e^{-x}\), \(x > 0\).
Normalising constant: integrating by parts, \[ P(X > 4) = \int_4^\infty x e^{-x}\,dx = \Bigl[-xe^{-x}\Bigr]_4^\infty + \int_4^\infty e^{-x}\,dx = 4e^{-4} + e^{-4} = 5e^{-4}. \]
Conditional density: \[ f(x) = \frac{f_X(x)}{P(X>4)} = \frac{xe^{-x}}{5e^{-4}} = \frac{x}{5}\exp\{-(x-4)\}, \quad x > 4. \qquad\square \]
Full density: \(f_e(x) = \tfrac{1}{2}e^{-x/2}\), \(x > 0\). Tail probability: \(P(X>4) = e^{-4/2} = e^{-2}\). \[ h_1(x) = \frac{\tfrac{1}{2}e^{-x/2}}{e^{-2}} = \frac{1}{2}e^{2-x/2} = \frac{1}{2}\exp\!\bigl\{-\tfrac{1}{2}(x-4)\bigr\}, \quad x > 4. \] CDF: \(H_1(x) = 1 - e^{-(x-4)/2}\). Inverse CDF (for simulation): \(H_1^{-1}(u) = 4 - 2\ln(1-u)\).
Optimal \(k_1\): Form the ratio \[ \frac{f(x)}{h_1(x)} = \frac{(x/5)e^{-(x-4)}}{(1/2)e^{-(x-4)/2}} = \frac{2x}{5}\,e^{-(x-4)/2}. \] Differentiating: \(\tfrac{d}{dx}\bigl[\tfrac{2x}{5}e^{-(x-4)/2}\bigr] = \tfrac{2}{5}e^{-(x-4)/2}(1-x/2) < 0\) for all \(x > 4\). The ratio is strictly decreasing; its supremum is achieved at \(x = 4\): \[ \boxed{k_1 = \frac{2 \cdot 4}{5} = \frac{8}{5} = 1.6.} \]
Standard Cauchy pdf: \(f_C(x) = \tfrac{1}{\pi(1+x^2)}\). Define \(C_2 = P(\text{Cauchy} > 4) = \tfrac{1}{2} - \tfrac{\arctan 4}{\pi}\). \[ h_2(x) = \frac{f_C(x)}{C_2} = \frac{1}{\pi C_2(1+x^2)}, \quad x > 4. \] CDF: \(H_2(x) = \dfrac{\arctan x - \arctan 4}{\pi/2 - \arctan 4}\). Inverse CDF: \(H_2^{-1}(u) = \tan\!\bigl(u(\tfrac{\pi}{2} - \arctan 4) + \arctan 4\bigr).\)
Optimal \(k_2\): \[ \frac{f(x)}{h_2(x)} = \frac{\pi C_2}{5}\cdot x(1+x^2)\,e^{-(x-4)}. \] Let \(g(x) = x(1+x^2)e^{-(x-4)}\). Then \(g'(x) = e^{-(x-4)}\bigl(1 + 3x^2 - x - x^3\bigr)\). Evaluating at \(x = 4\): \(1 + 48 - 4 - 64 = -19 < 0\), and this expression remains negative for all \(x > 4\) (since \(x^3\) dominates). So \(g\) is decreasing on \((4,\infty)\) and its supremum is at \(x = 4\): \(g(4) = 4 \times 17 \times 1 = 68\). \[ \boxed{k_2 = \frac{68\pi C_2}{5}.} \]
C2 <- 0.5 - atan(4) / pi
k1 <- 8 / 5
k2 <- 68 * pi * C2 / 5
cat(sprintf("C2 = %.6f\n", C2))
## C2 = 0.077979
cat(sprintf("k1 = %.4f (acceptance prob = %.4f)\n", k1, 1/k1))
## k1 = 1.6000 (acceptance prob = 0.6250)
cat(sprintf("k2 = %.4f (acceptance prob = %.4f)\n", k2, 1/k2))
## k2 = 3.3317 (acceptance prob = 0.3001)
Since \(k_1 = 1.6 < k_2 \approx 3.33\), the exponential envelope \(h_1\) yields a higher acceptance probability (\(\approx 62.5\%\) vs \(\approx 30\%\)).
rej_sample <- function(N, k, fx, hx, ihx) {
x.sim <- numeric(N)
N.rej <- 0L
acc <- 0L
while (acc < N) {
y <- ihx(runif(1)) # draw from h via inverse CDF
u2 <- runif(1)
if (u2 * k * hx(y) <= fx(y)) {
acc <- acc + 1L
x.sim[acc] <- y
} else {
N.rej <- N.rej + 1L
}
}
return(list(x.sim, N.rej))
}
# Target density f(x)
fx <- function(x) (x / 5) * exp(-(x - 4))
# Envelope 1: Exp(1/2) | x > 4
h1x <- function(x) 0.5 * exp(-(x - 4) / 2)
ih1x <- function(u) 4 - 2 * log(1 - u)
# Envelope 2: Standard Cauchy | x > 4
h2x <- function(x) 1 / (pi * C2 * (1 + x^2))
ih2x <- function(u) tan(u * (pi/2 - atan(4)) + atan(4))
set.seed(42)
res1 <- rej_sample(1000, k1, fx, h1x, ih1x)
res2 <- rej_sample(1000, k2, fx, h2x, ih2x)
cat(sprintf("h1 (Exp): rejections = %d, acceptance rate = %.4f (theory = %.4f)\n",
res1[[2]], 1000/(1000 + res1[[2]]), 1/k1))
## h1 (Exp): rejections = 593, acceptance rate = 0.6277 (theory = 0.6250)
cat(sprintf("h2 (Cauchy): rejections = %d, acceptance rate = %.4f (theory = %.4f)\n",
res2[[2]], 1000/(1000 + res2[[2]]), 1/k2))
## h2 (Cauchy): rejections = 2360, acceptance rate = 0.2976 (theory = 0.3001)
x_seq <- seq(4.01, 20, length.out = 600)
plot(x_seq, fx(x_seq), type = "l", lwd = 2.5, col = "black",
ylim = c(0, 0.88), xlab = "x", ylab = "Density",
main = "f(x) and scaled envelopes")
lines(x_seq, k1 * h1x(x_seq), col = "steelblue", lwd = 2, lty = 2)
lines(x_seq, k2 * h2x(x_seq), col = "tomato", lwd = 2, lty = 3)
legend("topright",
legend = c("f(x) [target]",
"k1*h1(x) [Exp, k=1.6]",
"k2*h2(x) [Cauchy, k~3.33]"),
col = c("black","steelblue","tomato"), lwd = 2, lty = 1:3, bty = "n")
Target density f(x) with scaled envelopes k1h1(x) and k2h2(x) on (4, 20).
Discussion. The expected number of proposals per accepted sample is \(k_i\), so the expected rejection count over \(N=1000\) accepted samples is \(N(k_i - 1)\): approximately \(600\) for \(h_1\) and \(2330\) for \(h_2\). The simulation results confirm this. The plot explains why: \(k_1 h_1(x)\) hugs \(f(x)\) tightly throughout \((4,\infty)\) (both decay at exponential rates), whereas \(k_2 h_2(x)\) (Cauchy, heavy-tailed) stays far above \(f(x)\) for large \(x\), creating a large wasted region and many rejections.
We estimate the integral \(I = \displaystyle\int_0^1\sqrt{1-x^2}\,dx\). The true value is \(I = \pi/4 \approx 0.7854\) (area of a unit quarter-circle).
N <- 50000
true_I <- pi / 4
set.seed(42)
\[ \hat{I}_{\text{MC}} = \frac{1}{N}\sum_{i=1}^N h(U_i), \quad h(u) = \sqrt{1-u^2}, \quad U_i \overset{\mathrm{iid}}{\sim} \mathrm{Uniform}(0,1). \]
u_mc <- runif(N)
h_mc <- sqrt(1 - u_mc^2)
I_mc <- mean(h_mc)
var_mc <- var(h_mc) / N # variance of the estimator
cat(sprintf("Simple MC: estimate = %.6f, SE = %.6f, true = %.6f\n",
I_mc, sqrt(var_mc), true_I))
## Simple MC: estimate = 0.783978, SE = 0.001006, true = 0.785398
Write \(I = E_g\!\bigl[h(X)/g(X)\bigr]\) and estimate with \(N\) draws from \(g\).
Choice of IS densities. The key is how the IS weight \(w(x) = h(x)/g(x)\) behaves. Near \(x = 1\), \(h(x) \approx \sqrt{2(1-x)}\) (goes to zero like \((1-x)^{1/2}\)). For \(g(x) \propto (1-x)^\beta\), the weight \(w \propto (1-x)^{1/2-\beta}\):
Beta\((1,b)\) has density \(\propto (1-x)^{b-1}\), so \(\beta = b-1\).
Three choices (all with \(g(0) > 0\) so variance is also finite near \(x=0\)):
set.seed(42)
# g2: Beta(1, 1.5)
u2 <- rbeta(N, 1, 1.5)
w2 <- sqrt(1 - u2^2) / dbeta(u2, 1, 1.5)
I_IS2 <- mean(w2)
var_IS2 <- var(w2) / N
# g3: Beta(1, 2)
u3 <- rbeta(N, 1, 2)
w3 <- sqrt(1 - u3^2) / dbeta(u3, 1, 2)
I_IS3 <- mean(w3)
var_IS3 <- var(w3) / N
cat(sprintf("g1 Uniform : est = %.6f, var = %.2e, %%red = %+.1f%%\n",
I_mc, var_mc, 0))
## g1 Uniform : est = 0.783978, var = 1.01e-06, %red = +0.0%
cat(sprintf("g2 Beta(1,1.5) : est = %.6f, var = %.2e, %%red = %+.1f%%\n",
I_IS2, var_IS2, 100*(var_mc - var_IS2)/var_mc))
## g2 Beta(1,1.5) : est = 0.785333, var = 1.08e-07, %red = +89.4%
cat(sprintf("g3 Beta(1,2) : est = %.6f, var = %.2e, %%red = %+.1f%%\n",
I_IS3, var_IS3, 100*(var_mc - var_IS3)/var_mc))
## g3 Beta(1,2) : est = 0.786336, var = 2.65e-06, %red = -161.9%
cat(sprintf("\nTrue value: %.6f\n", true_I))
##
## True value: 0.785398
Comparison. Beta\((1,\tfrac{3}{2})\) achieves roughly a 90% variance reduction because the IS weight \(w_2 = \tfrac{2}{3}\sqrt{1+x}\) is nearly constant (range \([\tfrac{2}{3}, \tfrac{2\sqrt{2}}{3}]\)), meaning the density closely matches the integrand’s shape. Beta\((1,2)\) is worse than the uniform: its density decays too steeply near \(x=1\), creating IS weights that blow up there and inflate variance.
Divide \((0,1)\) into \(J=3\) equal strata and sample \(n_j = N/3\) uniform points per stratum.
\[ \hat{I}_{\text{strat}} = \sum_{j=1}^J \frac{1}{J}\,\bar{h}_j, \qquad \widehat{\mathrm{Var}}(\hat{I}_{\text{strat}}) = \sum_{j=1}^J \frac{1}{J^2}\cdot\frac{\hat\sigma_j^2}{n_j}. \]
J <- 3
n_j <- N / J
set.seed(42)
I_str <- 0
v_str <- 0
for (j in seq_len(J)) {
a <- (j - 1) / J; b <- j / J
u_j <- runif(n_j, a, b)
h_j <- sqrt(1 - u_j^2)
I_str <- I_str + (1/J) * mean(h_j)
v_str <- v_str + (1/J)^2 * var(h_j) / n_j
}
pct_str <- 100 * (var_mc - v_str) / var_mc
cat(sprintf("Stratified (J=3): est = %.6f, var = %.2e, %%red = %.1f%%\n",
I_str, v_str, pct_str))
## Stratified (J=3): est = 0.784442, var = 2.27e-07, %red = 77.6%
Stratification reduces variance because within each narrow stratum the integrand \(\sqrt{1-x^2}\) varies less than over the whole interval, lowering within-stratum variance.
Use \(N/2\) correlated pairs \((U_i,\,1-U_i)\): \[ \hat{I}_{\text{anti}} = \frac{1}{N/2}\sum_{i=1}^{N/2}\frac{h(U_i)+h(1-U_i)}{2}. \] Since \(h(x) = \sqrt{1-x^2}\) is strictly decreasing on \([0,1]\), \(h(U)\) and \(h(1-U)\) are negatively correlated (\(\rho < 0\)), and \(\mathrm{Var}(\hat{I}_{\text{anti}}) = (1+\rho)\,\mathrm{Var}(\hat{I}_{\text{MC}})\).
set.seed(42)
n_half <- N / 2
u_a <- runif(n_half)
h_a1 <- sqrt(1 - u_a^2)
h_a2 <- sqrt(1 - (1 - u_a)^2)
a_vals <- (h_a1 + h_a2) / 2
I_anti <- mean(a_vals)
var_anti <- var(a_vals) / n_half
rho_hat <- cor(h_a1, h_a2)
pct_anti <- 100 * (var_mc - var_anti) / var_mc
cat(sprintf("Antithetic: est = %.6f, var = %.2e, %%red = %.1f%%\n",
I_anti, var_anti, pct_anti))
## Antithetic: est = 0.784466, var = 2.79e-07, %red = 72.5%
cat(sprintf("Estimated correlation rho(h(U), h(1-U)) = %.4f\n", rho_hat))
## Estimated correlation rho(h(U), h(1-U)) = -0.7241
Approximate \(h(x) = \sqrt{1-x^2}\) by the straight line \(L(x) = 1 - x\) (connecting the endpoints \((0,1)\) and \((1,0)\)). Since \(\int_0^1 L(x)\,dx = \tfrac{1}{2}\) is known, write:
\[ I = \int_0^1(h(x) - L(x))\,dx + \tfrac{1}{2}. \]
More generally, use \(U\) as a control variate with \(E[U] = \tfrac{1}{2}\): \[ \hat{I}_{\text{CV}} = \bar{h} - \hat\beta\!\left(\bar{U} - \tfrac{1}{2}\right), \qquad \hat\beta = \frac{\widehat{\mathrm{Cov}}(h(U),\,U)}{\widehat{\mathrm{Var}}(U)}. \] (The fixed line \(L(x)=1-x\) corresponds to \(\beta = -1\); the data-driven \(\hat\beta\) optimises the variance reduction.)
set.seed(42)
u_cv <- runif(N)
h_cv <- sqrt(1 - u_cv^2)
# Fixed control: L(x) = 1 - x => beta = -1
I_cv_fixed <- mean(h_cv) - (-1) * (mean(u_cv) - 0.5)
var_cv_fixed <- var(h_cv + u_cv) / N # h - (-1)*u = h + u
# Optimal beta (minimises variance)
beta_hat <- cov(h_cv, u_cv) / var(u_cv)
I_cv_opt <- mean(h_cv) - beta_hat * (mean(u_cv) - 0.5)
var_cv_opt <- var(h_cv - beta_hat * u_cv) / N
pct_fixed <- 100 * (var_mc - var_cv_fixed) / var_mc
pct_opt <- 100 * (var_mc - var_cv_opt) / var_mc
cat(sprintf("Straight-line beta = -1: est = %.6f, var = %.2e, %%red = %.1f%%\n",
I_cv_fixed, var_cv_fixed, pct_fixed))
## Straight-line beta = -1: est = 0.784416, var = 2.92e-07, %red = 71.1%
cat(sprintf("Optimal beta = %.4f: est = %.6f, var = %.2e, %%red = %.1f%%\n",
beta_hat, I_cv_opt, var_cv_opt, pct_opt))
## Optimal beta = -0.7134: est = 0.784290, var = 1.53e-07, %red = 84.8%
The estimated \(\hat\beta\) is close to \(-1\) (the slope of \(L\)), confirming that \(L(x) = 1-x\) is a reasonable linear approximation to \(\sqrt{1-x^2}\) on \([0,1]\). The control variate approach reduces variance by removing the linear trend in \(h(U)\).
methods <- c("Simple MC", "IS: Beta(1, 3/2)", "IS: Beta(1, 2)",
"Stratified (J=3)", "Antithetic", "CV (optimal beta)")
ests <- c(I_mc, I_IS2, I_IS3, I_str, I_anti, I_cv_opt)
vars <- c(var_mc, var_IS2, var_IS3, v_str, var_anti, var_cv_opt)
pct_reds <- round(100 * (1 - vars / var_mc), 1)
summary_df <- data.frame(
Method = methods,
Estimate = round(ests, 5),
Estimator_Var = formatC(vars, format = "e", digits = 2),
Pct_Var_Reduc = pct_reds
)
knitr::kable(summary_df,
col.names = c("Method", "Estimate", "Est. Variance", "% Var. Reduction"),
caption = paste0("Comparison of Monte Carlo methods for $I = \\pi/4 = ",
round(true_I, 5), "$. Variance reduction is relative",
" to Simple MC."))
| Method | Estimate | Est. Variance | % Var. Reduction |
|---|---|---|---|
| Simple MC | 0.78398 | 1.01e-06 | 0.0 |
| IS: Beta(1, 3/2) | 0.78533 | 1.08e-07 | 89.4 |
| IS: Beta(1, 2) | 0.78634 | 2.65e-06 | -161.9 |
| Stratified (J=3) | 0.78444 | 2.27e-07 | 77.6 |
| Antithetic | 0.78447 | 2.79e-07 | 72.5 |
| CV (optimal beta) | 0.78429 | 1.53e-07 | 84.8 |