Ronald WESONGA (Ph.D)
October 2022
Monte Carlo (MC) methods refer to a broad collection of tools for approximating quantities based on artificially generated random samples. They include Monte Carlo integration, the inversion technique for generating the required samples, and Markov chain Monte Carlo methods. In principle, the approximation can be made as good as required by taking the sample size sufficiently large. These methods are a very useful tool in Bayesian inference. As an example, consider Buffon's needle problem, where a needle of length 10 cm is dropped randomly onto a floor with parallel lines 10 cm apart. What is the probability p that the needle crosses a line? The exact value can be worked out analytically as \(p = \frac{2}{\pi} \approx 0.63662\).
If the analytical solution were unavailable, we could instead estimate p using Monte Carlo methods. The simplest way would be to toss the needle onto the floor 1,000 times. If the needle crosses a line 843 times (say), then the Monte Carlo estimate of p is simply 843/1,000 = 0.843. Alternatively, we could simulate each drop on a computer and each time determine whether the virtual needle has crossed a virtual line. Although faster and more accurate, this approach requires at least some mathematical work to identify exactly what the parameters of each drop are and which configurations of those parameters correspond to the needle crossing a line.
The method of Monte Carlo integration is popularly used for estimating means. Suppose \(\mu\) is the mean of some distribution defined by a density or mass function \(f(x)\) with cumulative distribution function \(F(x)\), but we are unable to calculate \(\mu\) exactly via the usual formula \[\mu=E[X]=\int xf(x)dx \quad \text{or} \quad \mu=\sum_x xf(x),\] depending on whether \(X\) is continuous or discrete. Suppose, further, that we are able to generate a random sample from the distribution in question, \(x_1, \ldots, x_J \sim iid ~ f(x)\). Then the sample mean \[\bar{x}=\frac{1}{J}\sum_{j=1}^J x_j\] is an unbiased MC estimate of \(\mu\), since \(E[\bar{x}]=\mu\). A \(1 - \alpha\) confidence interval (CI) for \(\mu\) is given by \[CI=\left(\bar{x}\pm z_{\alpha/2}\frac{s}{\sqrt{J}}\right),\] where \(s^2=\frac{1}{J-1}\sum_{j=1}^{J}(x_j-\bar{x})^2\).
# MC integration: estimate theta = integral of exp(-x) over (2,6)
m <- 10000
x <- runif(m,2,6)                      # sample uniformly on (2,6)
thetaESt <- mean(exp(-x))*4            # multiply by the interval length 6-2 = 4
thetaEXa <- exp(-2)-exp(-6)            # exact value for comparison
print(c(thetaESt, thetaEXa),digits=3) ## [1] 0.133 0.133
# MC estimate of the Gamma(3,2) mean, with a 95% CI (J = 100)
options(digits=4)
J = 100
set.seed(221)
xv=rgamma(J,3,2)
xbar=mean(xv)
s=sd(xv)
ci=xbar + c(-1,1)*qnorm(0.975)*s/sqrt(J)
c(xbar,s,s^2,ci,ci[2]-ci[1])## [1] 1.5170 0.8320 0.6921 1.3539 1.6800 0.3261
# Same estimate with J = 1000: the CI narrows roughly as 1/sqrt(J)
J = 1000
set.seed(231)
xv=rgamma(J,3,2)
xbar=mean(xv)
s=sd(xv)
ci=xbar + c(-1,1)*qnorm(0.975)*s/sqrt(J)
c(xbar,s,s^2,ci,ci[2]-ci[1]) ## [1] 1.5199 0.8722 0.7607 1.4658 1.5740 0.1081
Estimating the value of \(x\) such that \(F(x) = p\) gives the \(p\)-quantile \[q_p = F^{-1}(p),\] which can be estimated by the corresponding sample quantile. For example, the estimate of the median \(q_{1/2}\), in terms of the order statistics \(x_{(1)} \le \cdots \le x_{(J)}\), is \[\hat{q}_{1/2}=\begin{cases} x_{((J+1)/2)}, &\text{if } J \text{ is odd}; \\ \frac{1}{2}\left(x_{(J/2)}+x_{(J/2+1)}\right), &\text{if } J \text{ is even}. \end{cases}\]
The \((1-\alpha)\) CDR of \(x\), \((q_{\alpha/2}, q_{1-\alpha/2})\), is estimated by \((\hat{q}_{\alpha/2},\hat{q}_{1-\alpha/2})\).
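As a quick illustration, here is a minimal sketch of MC quantile and CDR estimation; the Gamma(3,2) distribution, seed, and sample size are illustrative choices, not part of the original notes.

# Sketch: MC estimates of quantiles and an 80% CDR for a Gamma(3,2) sample
set.seed(101)
J=1000
xv=rgamma(J,3,2)
qhat=quantile(xv,c(0.1,0.5,0.9))    # sample 10%, 50% (median), 90% quantiles
qtrue=qgamma(c(0.1,0.5,0.9),3,2)    # exact quantiles for comparison
rbind(MC=qhat,true=qtrue)           # (qhat[1], qhat[3]) estimates the 80% CDR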
Now consider some function of \(x\), say \(y = g(x)\), and suppose we wish to estimate \[\psi=E[y]= \int yf(y)dy = E[g(x)]=\int g(x) f(x)dx.\] We calculate \(y_j=g(x_j)\) for each \(j=1,\ldots,J\) to obtain a random sample \(y_1,\ldots,y_J \sim iid~f(y)\), and then apply the MC method as before. Hence \[\hat{\psi}= \bar{y}= \frac{1}{J} \sum_{j=1}^J y_j.\]
The density \(f(x)\) can be estimated by smoothing a probability histogram of \(x_1,\ldots,x_J\). Likewise, the density \(f(y)\) can be estimated by smoothing a probability histogram of \(y_1,\ldots,y_J\), which is especially useful for complicated functions of \(x\).
Suppose that \(x \sim G(3, 2)\). Use MC methods and a sample of size \(J = 1000\) to estimate the mean \(E[x]\) (with a \(95\%\) CI), the \(80\%\) CDR, and the density \(f(x)\), together with the same quantities for \(y = g(x) = \frac{x^2 e^{-x}}{1 + x + 1/x}\).
Present your results graphically, and wherever possible show the true values of the quantities being estimated. Then repeat everything but using a Monte Carlo sample size of \(J=10000\).
options(digits=4)
J = 1000
set.seed(221)
xv=rgamma(J,3,2)
xbar=mean(xv)
xci=xbar + c(-1,1)*qnorm(0.975)*sd(xv)/sqrt(J)
xcdr=quantile(xv,c(0.1,0.9)); xden=density(xv)
yv=xv^2 * exp(-xv) / ( 1 + xv + 1/xv )
ybar=mean(yv)
yci=ybar + c(-1,1)*qnorm(0.975)*sd(yv)/sqrt(J)
ycdr=quantile(yv,c(0.1,0.9))
yden=density(yv)
hist(xv,prob=T,breaks=seq(0,7,0.25),xlim=c(0,7),ylim=c(0,0.6),xlab="x", main="")
lines(xden,lty=2,lwd=2)
xvec=seq(0,10,0.01)
lines(xvec,dgamma(xvec,3,2),lty=1,lwd=2)
abline(v= c(xbar, xci, xcdr), lty=2, lwd=2)
abline(v=c(3/2,qgamma(c(0.1,0.9),3,2)), lty=1,lwd=2)
legend(4,0.6,c("MC estimates","True values"),lty=c(2,1),lwd=c(2,2))
hist(yv,prob=T,breaks=seq(0,0.2,0.005),xlim=c(0,0.2),ylim=c(0,30),xlab="y", main="")
lines(yden,lty=2,lwd=2)
abline(v= c(ybar, yci, ycdr), lty=2, lwd=2)
legend(0.12,30,c("MC estimates","True values"),lty=c(2,1),lwd=c(2,2))   # legend placed within the y-plot's axis limits
## Solution q2 b
options(digits=4)
J = 10000
set.seed(221)
xv=rgamma(J,3,2)
xbar=mean(xv)
xci=xbar + c(-1,1)*qnorm(0.975)*sd(xv)/sqrt(J)
xcdr=quantile(xv,c(0.1,0.9)); xden=density(xv)
yv=xv^2 * exp(-xv) / ( 1 + xv + 1/xv )
ybar=mean(yv)
yci=ybar + c(-1,1)*qnorm(0.975)*sd(yv)/sqrt(J)
ycdr=quantile(yv,c(0.1,0.9))
yden=density(yv)
hist(xv,prob=T,breaks=seq(0,9,0.25),xlim=c(0,7),ylim=c(0,0.6),xlab="x", main="")
lines(xden,lty=2,lwd=2)
xvec=seq(0,10,0.01)
lines(xvec,dgamma(xvec,3,2),lty=1,lwd=2)
abline(v= c(xbar, xci, xcdr), lty=2, lwd=2)
abline(v=c(3/2,qgamma(c(0.1,0.9),3,2)), lty=1,lwd=2)
legend(4,0.6,c("MC estimates","True values"),lty=c(2,1),lwd=c(2,2))
hist(yv,prob=T,breaks=seq(0,0.2,0.005),xlim=c(0,0.2),ylim=c(0,30),xlab="y", main="")
lines(yden,lty=2,lwd=2)
abline(v= c(ybar, yci, ycdr), lty=2, lwd=2)
legend(0.12,30,c("MC estimates","True values"),lty=c(2,1),lwd=c(2,2))   # legend placed within the y-plot's axis limits

To estimate \[\psi= E[g(x)]=\int g(x) f(x)dx,\] we sample from a proposal density \(h(x)\), which should be as similar to \(f(x)\) as possible. We then write \[\psi= E[g(x)]=\int \left( g(x) \frac{f(x)}{h(x)}\right)h(x)dx = \int w(x)h(x)dx, \quad \text{where } w(x)=\frac{g(x)f(x)}{h(x)}.\] Suppose now that \(f(x)\) is known only up to a multiplicative constant, that is, \(f(x) = \frac{k(x)}{c}\), where the kernel \(k(x)\) is known exactly but it is too difficult or impossible to evaluate the normalising constant \(c = \int k(x)dx\).
In that case we write \[\psi=\int g(x)f(x)dx = \frac{\int g(x)k(x)dx}{\int k(x)dx} = \frac{\int \frac{g(x)k(x)}{h(x)}h(x)dx}{\int \frac{k(x)}{h(x)}h(x)dx} = \frac{\int w(x)h(x)dx}{\int u(x) h(x)dx},\] where now \(w(x)=\frac{g(x)k(x)}{h(x)}\) and \(u(x)=\frac{k(x)}{h(x)}\). We then sample \(x_1,\ldots,x_J\sim iid~h(x)\) and apply MC estimation to the means of \(w(x)\) and \(u(x)\) to obtain \[\hat{\psi}=\frac{\bar{w}}{\bar{u}} = \frac{\frac{1}{J}\sum_j w_j}{\frac{1}{J}\sum_j u_j},\] where \(w_j = w(x_j)\) and \(u_j=u(x_j)\).
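For the first case above, where \(f(x)\) is fully known, a minimal sketch is the following; the Gamma(3,2) target and exponential proposal are illustrative choices, not part of the original notes.

# Sketch: importance sampling for psi = E[x] with x ~ Gamma(3,2),
# using the standard exponential h(x) = exp(-x) as the proposal.
set.seed(314)
J=100000
xh=rexp(J,1)                        # draws from h(x)
w=xh*dgamma(xh,3,2)/dexp(xh,1)      # w(x) = g(x) f(x)/h(x), with g(x) = x
mean(w)                             # MC estimate of psi; the true value is 3/2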
Use Monte Carlo methods and importance sampling to estimate the mean \(\mu=E[X]\), where \(x\) has density \[f(x) \propto \frac{e^{-x}}{x+1},~x > 0.\]
Let \(k(x)=\frac{e^{-x}}{x+1}\) and \(h(x)=e^{-x},~~ x>0,\) the standard exponential or \(Gamma(1,1)\) density. So, \[\mu = \int_0^{\infty} xf(x)dx = \frac{\int\frac{x}{x+1}h(x)dx}{\int\frac{1}{x+1}h(x)dx}\] and the MC estimate of \(\mu\) is \[\hat{\mu}=\frac{\sum_1^J\frac{x_j}{x_j+1}}{\sum_1^J\frac{1}{x_j+1}}\]
options(digits=10)
kfun=function(x){ exp(-x)/(x+1) }                 # kernel k(x)
cval=integrate(f=kfun,lower=0,upper=Inf)$value    # normalising constant c
xffun= function(x){ x*(1/cval)*exp(-x)/(x+1) }    # x*f(x), with f(x) = k(x)/c
mu = integrate(f=xffun,lower=0,upper=Inf)$value   # 'true' mu by numerical integration
J=100000
set.seed(413)
xv=rgamma(J,1,1)            # draws from h(x) = exp(-x), i.e. Gamma(1,1)
num=mean(xv/(xv+1))         # MC estimate of the numerator integral
den=mean(1/(xv+1))          # MC estimate of the denominator integral
est=num/den                 # importance-sampling estimate of mu
c(num, den, est) ## [1] 0.4034510685 0.5965489315 0.6763084254
Suppose that we have a random sample from the bivariate distribution of two random variables \(x\) and \(y\), denoted \((x_1, y_1),\ldots,(x_J , y_J ) \sim iid ~f(x,y)\), and that we are interested in some function of \(x\) and \(y\), say \(r = g(x, y)\). We calculate \(r_j =g(x_j,y_j)\) and perform MC inference on the resulting sample \(r_1,\ldots,r_J \sim iid~f(r)\). If \(x\) and \(y\) are independent, we simply sample \(x_j \sim f(x)\) and \(y_j \sim f(y)\). If they are dependent, it may not be obvious how to generate \((x_j,y_j)\sim f(x,y)\), in which case we may apply the composition method, as outlined below:
To sample a pair \((x_j, y_j ) \sim f(x, y)\), one way is to first sample \(x_j \sim f(x)\) and then sample \(y_j \sim f(y \mid x_j)\). This works because of the identity \(f(x,y) = f(x)f(y|x)\). Alternatively, first sample \(y_j \sim f(y)\) and then sample \(x_j \sim f(x \mid y_j)\), which works by the identity \(f(x,y) = f(y)f(x|y)\).
One way to sample a triplet \((x_j,y_j,z_j) \sim f(x,y,z)\) is to first sample \(y_j \sim f(y)\), then sample \(x_j \sim f(x \mid y_j)\), and finally sample \(z_j \sim f(z \mid x_j , y_j)\). This works because of the identity \(f(x,y,z) = f(y)f(x|y)f(z|x,y)\).
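For instance, here is a minimal sketch of the triplet case; the particular conditional distributions are illustrative assumptions, not part of the original notes.

# Sketch of the composition method for a triplet (x, y, z)
set.seed(7)
J=1000
yv=rgamma(J,3,2)          # y_j ~ f(y)
xv=rnorm(J,yv,1)          # x_j ~ f(x|y_j), here N(y_j, 1)
zv=rnorm(J,xv*yv,1)       # z_j ~ f(z|x_j, y_j), here N(x_j*y_j, 1)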
Suppose that we are interested in the distribution of a random variable defined by \(r = \frac{y}{x+\sqrt{|y|}}\), where \(x\) and \(y\) have a joint distribution defined by the pdf \(f(x,y)= f(x)f(y|x)\), with \(x\sim G(3,2)\) and \((y|x) \sim N(x,x)\) (mean \(x\), variance \(x\)). Use the R functions rgamma() and rnorm() to generate a sample of size \(J = 1000\) from the joint distribution of \(x\) and \(y\).
Then use the MC method to estimate \(\psi= E[r]\) and report a \(95\%\) CI for \(\psi\).
Also estimate the \(80\%\) CDR for \(r\) and the density \(f(r)\).
Present the results both graphically and numerically.
options(digits=4)
J = 1000
set.seed(221)
xv=rgamma(J,3,2)
yv=rnorm(J,xv,sqrt(xv))   # (y|x) ~ N(x,x): mean xv, standard deviation sqrt(xv)
rv=yv/(xv+sqrt(abs(yv)))
rbar=mean(rv)
rci=rbar + c(-1,1)*qnorm(0.975)*sd(rv)/sqrt(J)
rcdr=quantile(rv,c(0.1,0.9))
rden=density(rv)
c(rbar,rci,rcdr)
hist(rv,prob=T,breaks=seq(floor(min(rv)),ceiling(max(rv)),0.1),xlab="r",main="")   # breaks chosen to cover the sample range
lines(rden,lty=1,lwd=2)
abline(v= c(rbar, rci, rcdr), lty=2, lwd=2)

To estimate a binomial proportion \(p\), we may interpret \(p\) as the mean \(\mu\) of a Bernoulli distribution and directly apply the method of Monte Carlo in the usual way. Suppose we are able to generate \(x_1,\ldots,x_J \sim iid ~ Bern(p)\). Then the MC estimate of \(p\) is \(\hat{p}=\bar{x}=\frac{1}{J}\sum_{j=1}^J x_j\). Since \(x_j^2=x_j\) for Bernoulli variables, \(s^2=\frac{1}{J-1}\left(\sum_{j=1}^J x_j^{2}-J\bar{x}^2\right)=\frac{1}{J-1}\left(J\bar{x}-J\bar{x}^2\right)=\frac{J}{J-1}\bar{x}(1-\bar{x}),\) so the MC SE is \(\frac{s}{\sqrt{J}}=\sqrt{\frac{\bar{x}(1-\bar{x})}{J-1}}.\) The MC \(1-\alpha\) CI for \(p\) is \(\hat{p}\pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{J}}.\)
Other CIs for \(p\) may also be considered.
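For instance (a sketch; these particular alternatives are not named in the notes), base R's binom.test() gives an exact (Clopper-Pearson) interval and prop.test() a score interval, illustrated here with the needle-tossing counts from the earlier example:

# Sketch: alternative CIs for a proportion, using 843 crossings in 1,000 tosses
binom.test(843,1000)$conf.int   # exact (Clopper-Pearson) interval
prop.test(843,1000)$conf.int    # score interval with continuity correction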
Use MC to estimate \(p=P\left(\sqrt\frac{x}{x+1}>0.3e^x\right)\) where \(x \sim Gamma(3,2).\)
Hint: With \(J = 20000\), we sample \(x_1,\ldots, x_J \sim iid ~ G(3,2)\), let \[r_j=I\left(\sqrt{\frac{x_j}{x_j+1}}>0.3e^{x_j}\right)\] and compute \[\hat{p}=\frac{1}{J}\sum_{j=1}^J r_j.\]
options(digits=4)
J=20000
set.seed(162)
xv=rgamma(J,3,2)
yv= sqrt(xv)*exp(-xv) / sqrt(xv+1)   # the event sqrt(x/(x+1)) > 0.3*exp(x) is equivalent to yv > 0.3
ct=sum(yv > 0.3)                     # vectorised count of the indicators r_j
phat=ct/J
ci=phat+c(-1,1)*qnorm(0.975)*sqrt(phat*(1-phat)/J)
c(phat,ci) ## [1] 0.2117 0.2060 0.2173
hist(yv,prob=T,breaks=seq(0,0.5,0.005),xlim=c(0,0.4),xlab="y",main=" ")
abline(v=0.3,lwd=3)
lines(density(yv),lwd=3)

A needle of length 10 cm is dropped randomly onto a floor with lines on it that are parallel and 10 cm apart.
Let:
X = perpendicular distance from the centre of the needle to the nearest line, in units of 5 cm;
Y = acute angle between the lines and the needle, in radians;
C = 'the needle crosses a line'.
Then \[X \sim U (0,1)\Longrightarrow f(x)=1, \quad 0<x<1\] \[Y \sim U (0,\tfrac{\pi}{2})\Longrightarrow f(y)=\frac{2}{\pi}, \quad 0<y<\frac{\pi}{2}.\] Since \(X \perp Y\), \[f(x,y)=\frac{2}{\pi}, \quad 0<x<1,~0<y<\frac{\pi}{2},\] and \(C=\{X<\sin(Y)\}=\{(x,y):x<\sin(y)\}.\)
\[\therefore ~~~~~~ p=P(C)=P(X< \sin(Y))\] \[p=\int\!\!\int_C f(x,y)\,dx\,dy = \frac{2}{\pi}\int_{y=0}^{\pi/2}\left(\int_{x=0}^{\sin(y)}dx\right)dy\] \[=\frac{2}{\pi}\int_{y=0}^{\pi/2}\sin(y)\,dy=\frac{2}{\pi}\big[-\cos(y)\big]_{0}^{\pi/2}\] \[\therefore~ p=\frac{2}{\pi}\]
We make use of the analysis in (a), whereby \(C=\{(x,y):x < \sin(y)\}\), with \(x \sim U(0,1)\), \(y \sim U(0,\frac{\pi}{2})\), and \(X \perp Y\). We now sample \(x_1,\ldots,x_J \sim iid ~ U(0,1)\) and \(y_1,\ldots,y_J \sim iid ~ U(0,\frac{\pi}{2})\) (all independent of one another). Next, we obtain the indicators defined by \[r_j=I(x_j < \sin(y_j)) = \begin{cases} 1, & x_j < \sin(y_j); \\ 0, & \text{otherwise}. \end{cases}\] The MC estimate of \(p\) is \(\hat{p}=\bar{r}=\frac{1}{J}\sum_{j=1}^{J} r_j = \frac{r_T}{J}\), where \(r_T\) is the total number of crossings. The \(95\%\) CI for \(p\) is \(CI=\left( \hat{p}\pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{J}}\right).\)
plot(seq(0, pi/2, 0.01), sin(seq(0, pi/2, 0.01)), type="l", lwd=3, xlab="y", ylab="x")
abline(v=c(0,pi/2),lty=3)
abline(h=c(0,1),lty=3)
text(0.2,0.4,"x = sin(y)")
text(1,0.4,"C")
text(0.35,0.8,"Complement of C")
text(1.52,0.06,"pi/2")
J=1000
set.seed(213)
xv=runif(J,0,1)
yv=runif(J,0,pi/2)
options(digits=4)
rv=as.numeric(xv<sin(yv))   # indicators r_j = I(x_j < sin(y_j)), vectorised
phat=mean(rv)
z=qnorm(0.975)
pci=phat+c(-1,1)*z*sqrt(phat*(1-phat)/J)
c(phat,pci,pci[2]-pci[1]) ## [1] 0.61800 0.58789 0.64811 0.06023
J=10000
set.seed(215)
xv=runif(J,0,1)
yv=runif(J,0,pi/2)
rv=as.numeric(xv<sin(yv))   # indicators, as before
phat=mean(rv)
z=qnorm(0.975)
pci=phat+c(-1,1)*z*sqrt(phat*(1-phat)/J)
c(phat,pci,pci[2]-pci[1])## [1] 0.63320 0.62375 0.64265 0.01889
Using the R function rgamma(), generate a random sample of size J = 100 from the gamma distribution with parameters 3 and 2 and mean \(\mu=3/2.\) Then use the method of Monte Carlo to estimate \(\mu.\) In your estimation, include a \(95\%\) CI for \(\mu\) and the width of this CI. Also report whether the CI contains the true value of \(\mu.\)
Repeat (a) but with J = 200, 500, 1000, 10000 and 100000, respectively. Report the widths of the resulting CIs and, for each CI, state whether it contains \(\mu.\) Discuss any patterns that you see.
Repeat (a) M = 100 times and report the proportion of the resulting M \(95\%\) MC CIs which contain the true value of the mean. (In each case use J = 100.) Hence calculate a \(95\%\) CI for p, the true coverage probability of the \(95\%\) MC CI for \(\mu\) based on a MC sample of size J = 100 from the Gamma(3,2) distribution.
Repeat (c), but with M = 200, 500, 1000 and 10000, respectively. Discuss any patterns that you see.
options(digits=5)
J = 100
set.seed(221)
xv=rgamma(J,3,2)
xbar=mean(xv)
ci=xbar + c(-1,1)*qnorm(0.975)*sd(xv)/sqrt(J)
c(xbar,ci)## [1] 1.5170 1.3539 1.6800
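The code used for part (b) is not shown in these notes; the following sketch would produce a table of the form reported below (the seed handling across the different J values is an assumption, so exact values may differ):

# Sketch for part (b): CI widths across increasing MC sample sizes J
Jvec=c(100,200,500,1000,10000,100000)
xbarvec=LBvec=UBvec=Wvec=numeric(length(Jvec))
set.seed(221)
for(i in seq_along(Jvec)){
  xv=rgamma(Jvec[i],3,2)
  xbarvec[i]=mean(xv)
  ci=xbarvec[i]+c(-1,1)*qnorm(0.975)*sd(xv)/sqrt(Jvec[i])
  LBvec[i]=ci[1]; UBvec[i]=ci[2]; Wvec[i]=ci[2]-ci[1]
}
rbind(Jvec,xbarvec,LBvec,UBvec,Wvec)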
## [,1] [,2] [,3] [,4] [,5] [,6]
## Jvec 100.00 200.00 500.00 1000.00 1.0e+04 1.0e+05
## xbarvec 1.52 1.47 1.43 1.47 1.5e+00 1.5e+00
## LBvec 1.35 1.35 1.36 1.42 1.5e+00 1.5e+00
## UBvec 1.68 1.59 1.50 1.53 1.5e+00 1.5e+00
## Wvec 0.33 0.25 0.14 0.11 3.4e-02 1.1e-02
J=100
M=100
ct=0
set.seed(442)
for(m in 1:M)
{
xv=rgamma(J,3,2)
xbar=mean(xv)
ci=xbar + c(-1,1)*qnorm(0.975)*sd(xv)/sqrt(J)
if((ci[1]<=1.5)&&(1.5<=ci[2]))
ct = ct + 1
}
p=ct/M
ci=p+c(-1,1)*qnorm(0.975)*sqrt(p*(1-p)/M)   # SE uses M, the number of replicate CIs (here M = J = 100)
c(ct,p,ci)## [1] 93.00000 0.93000 0.87999 0.98001
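The code for part (d) is likewise not shown; a sketch is given below (the seed placement is an assumption, so exact values may differ from those reported):

# Sketch for part (d): repeat the coverage study for larger M
J=100
for(M in c(200,500,1000,10000)){
  set.seed(442)
  ct=0
  for(m in 1:M){
    xv=rgamma(J,3,2)
    ci=mean(xv)+c(-1,1)*qnorm(0.975)*sd(xv)/sqrt(J)
    if((ci[1]<=1.5)&&(1.5<=ci[2])) ct=ct+1
  }
  p=ct/M
  pci=p+c(-1,1)*qnorm(0.975)*sqrt(p*(1-p)/M)
  print(c(M,p,pci,pci[2]-pci[1]))
}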
For \(M = 10000\), the reported output (M, coverage \(\hat{p}\), CI bounds, and CI width) was:
## [1] 1.00e+04 9.44e-01 9.40e-01 9.49e-01 9.00e-03