This section contains the setup and the various utility functions used throughout.
Libraries used:
library(ggplot2)
library(data.table)
Consider the following problem:
\[ \begin{aligned} H = \int_0^3 (x^2 + 2x) \, dx &= \frac{x^3}{3} + \frac{2x^2}{2} \Big|_0^3 \\ &= \left[ \frac{27}{3} + \frac{18}{2} \right] - 0 \\ &= 9 + 9 = 18 \end{aligned} \]
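Before turning to the Monte Carlo approach, the closed-form result can be checked with base R's deterministic quadrature routine integrate() (a sanity check only, not part of the method that follows):
# deterministic quadrature check of the closed-form result (should be 18)
integrate(function(x) x^2 + 2*x, lower = 0, upper = 3)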
Monte Carlo integration provides a numerical approach to solving the problem by re-expressing it as an expectation. Specifically, by the definition of expectation for a continuous random variable \(X\) and the law of the unconscious statistician (LOTUS), the expectation of any function \(h\) of \(X\) can be found via:
\[ \begin{aligned} E[h(X)] = \int_{\mathcal{X}} h(x) f(x) dx \end{aligned} \]
where \(f(x)\) is the probability density function of \(X\). If we take \(X \sim \mathsf{Uniform}(a, b)\), where \(a, b\) define the limits of our integral, then the PDF is \(f(x) = 1/(b-a)\) and so \(H = \int_a^b h(x) \, dx = (b - a) E[h(X)]\). A Monte Carlo estimator for \(H\) is therefore
\[ \begin{aligned} \hat{H} = (b - a)\frac{1}{N}\sum_{n=1}^N h(X_n) \end{aligned} \]
where \(X_n\) denotes the \(n^{\text{th}}\) random draw from a uniform distribution with limits \(a\) and \(b\). For the above example, for large \(N\), we have:
\[ \begin{aligned} H &\approx \hat{H} = (3 - 0)\frac{1}{N}\sum_{n=1}^N h(X_n) \\ X &\sim \mathsf{Uniform}(0, 3) \\ h(x) &= x^2 + 2x \end{aligned} \]
In R, this can be implemented as follows:
set.seed(44)
a <- 0
b <- 3
N <- 50
# uniform draws over the limits of integration
x <- runif(N, a, b)
# the integrand
h <- function(x) x^2 + 2*x
d <- data.table(x = x, y = h(x))
with the first 50 samples visualised below:
ggplot(d) +
  geom_linerange(aes(x = x, ymin = 0, ymax = y)) +
  stat_function(fun = h, lty = 2) +
  geom_hline(yintercept = (b - a) * mean(d$y), lwd = 0.2, lty = 3) +
  geom_text(data = data.table(a = a, b = b, mu = mean(d$y)),
            aes(x = (b - a)/2, y = 1.02 * (b - a) * mu),
            label = paste0("I = ", round((b - a) * mean(d$y), 2))) +
  scale_x_continuous(name = "X", limits = c(a, b)) +
  scale_y_continuous(name = "h(x)") +
  theme_bw()
Figure: Monte Carlo integration
By the law of large numbers, as \(N\) increases the estimate \(\hat{H}\) converges to \(H\):
N <- 100000
x <- runif(N, a, b)
h <- function(x) x^2 + 2*x
(H <- (b - a) * mean(h(x)))
## [1] 18.0152
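The convergence can be visualised by plotting the running estimate against the number of samples. The sketch below reuses x and h from the chunk above, thinning to every 100th point purely to keep the plot light:
# running value of the estimator after each additional sample
d_conv <- data.table(n = seq_len(N),
                     Hn = (b - a) * cumsum(h(x)) / seq_len(N))
ggplot(d_conv[n %% 100 == 0]) +
  geom_line(aes(x = n, y = Hn)) +
  geom_hline(yintercept = 18, lty = 2) +
  scale_x_continuous(name = "N") +
  scale_y_continuous(name = "Running estimate of H") +
  theme_bw()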
Since the draws are independent (uncorrelated would suffice), the variance of the estimator is obtained from
\[ \begin{aligned} \mathsf{Var}(\hat{H}) &= \mathsf{Var} \left( (b - a)\frac{1}{N}\sum_{n=1}^N h(X_n) \right) \\ &= \left( \frac{b - a}{N} \right)^2 \mathsf{Var} \left( \sum_{n=1}^N h(X_n) \right) \\ &= \left( \frac{b - a}{N} \right)^2 N \mathsf{Var}(h(X)) \\ &= \frac{(b - a)^2}{N} \mathsf{Var}(h(X)) \end{aligned} \]
and hence the standard deviation of the estimator shrinks at a rate of \(\mathcal{O}(1/\sqrt{N})\). The variance of the estimator allows us to compute a Monte Carlo error bound on the integration result using the standard error
\[ \begin{aligned} \hat{\sigma}_e &= \sqrt{ \frac{(b - a)^2}{N} \widehat{\mathsf{Var}}(h(X)) } \end{aligned} \]
where \(\widehat{\mathsf{Var}}\) denotes the sample variance, and the usual method for constructing a \(100(1-\alpha)\)% confidence interval, for example a 95% CI \(\hat{H} \pm 1.96 \hat{\sigma}_e\). The coverage of an interval constructed in this manner can be tested empirically by simulation:
sim <- function(){
  N <- 100
  x <- runif(N, a, b)
  h <- function(x) x^2 + 2*x
  hx <- h(x)
  # point estimate, variance and standard error of the estimator
  H <- (b - a) * mean(hx)
  v <- ((b - a)^2 / N) * var(hx)
  se <- sqrt(v)
  # 95% normal-approximation confidence interval
  lb <- H + qnorm(0.025) * se
  ub <- H + qnorm(0.975) * se
  c(H = H, lb = lb, ub = ub)
}
m <- do.call(rbind, lapply(1:1000, function(i) sim()))
# proportion of intervals containing the true value of 18
mean(m[, "lb"] < 18 & m[, "ub"] > 18)
## [1] 0.953
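The same standard error applies to the single large-\(N\) run from earlier; a short sketch reusing the x, h and N still in the workspace from that chunk:
# standard error and 95% CI for the N = 100000 estimate
se <- sqrt((b - a)^2 / N * var(h(x)))
H + qnorm(c(0.025, 0.975)) * se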
Monte Carlo integration scales to the multidimensional setting. Consider
\[ \begin{aligned} H &= \int_{x=0}^2 \int_{y=0}^1 (2x^2 + y^2) \, dy \, dx \\ &= \int_{x=0}^2 \left( 2x^2 + \frac{1}{3} \right) dx \\ &= \frac{2x^3}{3} + \frac{x}{3} \Big|_0^2 \\ &= \frac{16}{3} + \frac{2}{3} = 6 \end{aligned} \]
in which I computed the 1-d integral with respect to \(y\) first and then ran a second 1-d integral with respect to \(x\). This suggests that a 2-d integral boils down to two 1-d integrals, which we can already handle. Monte Carlo integration, however, extends directly to any number of dimensions while retaining its \(\mathcal{O}(1/\sqrt{N})\) error rate. Specifically, in \(D\)-dimensional space we have:
\[ \begin{aligned} H &= \int_{V_D} h(\vec{x}) \, d\vec{x} \\ &= \int_{V_D} h(\vec{x}) \, dx_1 dx_2 \dots dx_D \\ &\approx \frac{V_D}{N} \sum_{n=1}^N h(\vec{x}_n) \end{aligned} \]
where \(V_D\) is the \(D\)-dimensional volume and \(\vec{x}_n\) denotes the \(n^{\text{th}}\) point drawn uniformly at random from \(V_D\). The implementation is a minor extension of the univariate setting:
set.seed(44)
ax <- 0; bx <- 2
ay <- 0; by <- 1
N <- 50000
# uniform draws over the rectangle [ax, bx] x [ay, by]
x <- runif(N, ax, bx)
y <- runif(N, ay, by)
h <- function(x, y) 2*x^2 + y^2
# scale the sample mean by the area of the region
(H <- (bx - ax) * (by - ay) * mean(h(x, y)))
## [1] 5.991426
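As in the univariate case, the result can be checked against deterministic quadrature. A sketch using nested calls to base R's integrate() (the sapply is needed because integrate expects a vectorised integrand):
# integrate over y for each fixed x, then integrate the result over x
inner <- function(xs) {
  sapply(xs, function(xi) integrate(function(y) 2*xi^2 + y^2, 0, 1)$value)
}
integrate(inner, 0, 2)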
Monte Carlo integration provides a general-purpose numerical approach to solving integration problems. The approach has been introduced here in its simplest form for univariate and multivariable problems. Extensions recognise the inefficiency of uniform sampling and address it via, for example, importance sampling, sketched briefly below.
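Purely as an illustration of the idea, here is a minimal importance-sampling sketch for the original 1-d problem. It assumes the proposal density \(g(x) = 2x/9\) on \((0, 3)\), which can be sampled via the inverse-CDF transform \(X = 3\sqrt{U}\):
set.seed(44)
N <- 100000
h <- function(x) x^2 + 2*x
# assumed proposal density on (0, 3); chosen for illustration only
g <- function(x) 2 * x / 9
u <- runif(N)
# inverse-CDF draw from g, since G(x) = x^2 / 9
x <- 3 * sqrt(u)
# importance-sampling estimate of H: average of h(x) / g(x)
(H_is <- mean(h(x) / g(x)))
For this integrand the ratio \(h(x)/g(x) = 9(x+2)/2\) varies far less over \((0, 3)\) than \(h\) itself, which is where the variance reduction comes from.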