Text Book: Simulation, Ross, 2013
Let \(X\) be an exponential random variable with mean 1. Give an efficient algorithm for simulating a random variable whose distribution is the conditional distribution of \(X\) given that \(X<0.05\). That is, its density function is
\[ f(x) = \frac{e^{-x}}{1-e^{-0.05}}, \quad 0<x<0.05 \]
Generate 1000 such variables and use them to estimate \(E( \ X \mid X<0.05 \ )\). Then determine the exact value of \(E( \ X \mid X<0.05 \ )\).
solution.
\[ U = F(X) = \int_0^{X} f(t)dt = \int_0^{X} \frac{e^{-t}}{1-e^{-0.05}}dt = \frac{1-e^{-X}}{1-e^{-0.05}}, \quad 0<X<0.05 \]
Then
\[ X = F^{-1}(U) = -\log\bigg( 1-(1-e^{-0.05})U \bigg) \]
ex6 = function(n){
U = runif(n)
X = -log(1-(1-exp(-0.05))*U)
return(X)
}
f = function(x){ exp(-x)/(1-exp(-0.05)) }
set.seed(1150413)
res = ex6(1000)
hist(res, breaks = 50, probability = T)
lines(seq(0,0.05,0.0001), f(seq(0,0.05,0.0001)), lwd = 2, col = "red")
## The mean of the simulated data is 0.02505983
We directly compute the mean of the distribution \(X \mid X<0.05\).
\[ \int_0^{0.05} x \frac{e^{-x}}{1-e^{-0.05}} dx = \frac{ -xe^{-x}-e^{-x} \vert_0^{0.05} }{ 1-e^{-0.05} } = \frac{ 1-1.05e^{-0.05} }{ 1-e^{-0.05} } \approx 0.0248 \]
Give algorithms for generating random variables from the following distributions.
\[ F(x)=\left\{\begin{array}{ll} \frac{1-e^{-2x}+2x}{3} & \text{if} \ 0<x<1 \\ \frac{3-e^{-2x}}{3} & \text{if} \ 1<x<\infty \end{array}\right. \]
solution.
Note that the distribution can be divide into two groups.
\[ F(x) = \frac{1}{3}F_{X_1}(x) + \frac{2}{3}F_{X_2}(x) \]
In group 1, \(X_1 \sim exp(1/2), \ F_{X_1}(x) = 1-e^{-2x} \ \Rightarrow \ X=\frac{-1}{2}\log(1-U) = \frac{-1}{2}\log U\)
In group 2, \(X_2 \sim Unif(0,1), \ F_{X_2}(x) = x \ \Rightarrow \ X=U\)
ex8.b = function(n){
X = rep(NA, n)
for (i in 1:n){
U = runif(1)
X[i] = ifelse(U < 1/3, -1/2*log(runif(1)), runif(1))
}
return(X)
}
set.seed(1150413)
f = function(x){
return(
ifelse(x < 1,
(2 * exp(-2*x) + 2) / 3,
(2 * exp(-2*x)) / 3)
)
}
res = ex8.b(100000)
hist(res, breaks = 500, probability = T, xlim = c(0,3))
lines(seq(0,3,0.01), f(seq(0,3,0.01)), lwd = 2, col = "red")
Suppose it is easy to generate random variable from any of the distribution \(F_i, \ i=1,\dots,n\). How can we generate from the following distributions?
\(F(x) = \prod_{i=1}^n F_i(x)\)
\(F(x) = 1- \prod_{i=1}^n [ 1-F_i(x) ]\)
solution.
Suppose \(X_i\) and \(X_j\) are independent for \(i \neq j\). Then
\[ F_X(x) = \prod_{i=1}^n F_i(x) = \prod_{i=1}^n P(X_i \le x) = P(X_i \le x, \ \forall i) = P(\max(X_1,\dots,X_n) \le x) \]
Therefore, \[ X = \max( X_1,\dots,X_n ) \]Similarly,
\[ F_Y(x) = 1 - \prod_{i=1}^n [1 - F_i(x)] = 1 - \prod_{i=1}^n P(X_i > x) = 1 - P(X_i > x, \ \forall i) = P(\min(X_1,\dots,X_n) \le x) \]
Therefore, \[ Y = \min(X_1,\dots,X_n) \]
The process of the algorithm is:
Step 1: Generate \(U_1, \dots, U_n\)
Step 2: Generate \(X_1,\dots,X_n\) by \(U_1, \dots, U_n\), respectively
Step 3: Take maximum or the minimum
Using the rejection method and the results of Exercise 12, give two other methods, aside from the inverse transform method, that can be used to generate a random variable having distribution function
\[ F(x) = x^n, \quad 0 \le x \le 1 \]
Discuss the efficiency of the three approaches to generating from \(F\).
solution.
By rejection method, take \(Y \sim U(0,1)\), then
\[ c = \max_{0 \le x \le 1}h(x) = \max_{0 \le x \le 1}\frac{f(x)}{g(x)} = \max_{0 \le x \le 1} nx^{n-1} = n \]
The mean times of iteration is \(n\).
Consider \(n=10\)
ex13.1 = function(n, N){
X = rep(NA,n)
iter = rep(NA,n)
f = function(x){ return(N*x^(N-1)) }
for (j in 1:n) {
Y = runif(1)
cri = f(Y)/N
i = 1
while (runif(1) >= cri) {
Y = runif(1)
cri = f(Y)/N
i = i+1
}
X[j] = Y
iter[j] = i
}
return(list("X"=X, "iter"=iter))
}
set.seed(1150413)
res = ex13.1(100000, 10)
hist(res$X, breaks = 100, probability = T, xlim = c(0,1), ylim = c(0,10))
f = function(x, N){ return(N*x^(N-1)) }
lines(seq(0,1,0.01), f(seq(0,1,0.01), 10), lwd = 2, col = "red")
Alternatively, by Exercise 12, this random variable is the maximum of \(X_1,\dots,X_n\), where \(X_i \sim U(0,1)\) for all \(i\).
ex13.2 = function(n, N){
X = rep(NA, n)
for (j in 1:n) {
X[j] = max(runif(N))
}
return(X)
}
set.seed(1150413)
res = ex13.2(100000, 10)
hist(res, breaks = 100, probability = T, xlim = c(0,1), ylim = c(0,10))
lines(seq(0,1,0.01), f(seq(0,1,0.01), 10), lwd = 2, col = "red")
Give two methods for generating a random variable having density function
\[ f(x) = xe^{-x}, \quad 0 \le x < \infty\]
solution.
We apply the rejection method with the exponential distribution with mean 2 and 3.
\[ c_1 = \max 2xe^{-x/2} = 4e^{-1} \approx 1.471518 \quad \text{and} \quad c_2 = \max 3xe^{-2x/3} = \frac{9}{2}e^{-1} \approx 1.655457 \]
f = function(x){ return(x*exp(-x)) } # target
ex15 = function(n, theta){
X = rep(NA,n)
iter = rep(NA,n)
g = function(x){ return(dexp(x, 1/theta)) } ## exp(theta)
h = function(x){ f(x)/g(x) }
det.c = optim(1, h, method = "BFGS",
control = list(fnscale = -1))
c = det.c$value
for (j in 1:n){
Y = -theta*log(runif(1))
i = 1
cri = h(Y)/c
while(runif(1) >= cri){
Y = -theta*log(runif(1))
cri = h(Y)/c
i = i + 1
}
X[j] = Y
iter[j] = i
}
return(list("X"=X, "iter"=iter, "c"=c))
}
set.seed(1150413)
res = ex15(100000, 2)
hist(res$X, breaks = 250, probability = T, xlim = c(0,10), ylim = c(0,0.4))
lines(seq(0,10,0.01), f(seq(0,10,0.01)), lwd = 2, col = "red")
Give two algorithms for generating a random variable having distribution function
\[ F(x) = 1-e^{-x}-e^{-2x}+e^{-3x}, \quad x>0 \]
solution.
We apply the rejection method first. Let \(Y \sim exp(1)\).
\[ c = \max_{x>0} \frac{f(x)}{g(x)} = \max_{x>0} \frac{e^{-x}+2e^{-2x}-3e^{-3x}}{e^{-x}} = \max_{x>0} (1+2e^{-x}-3e^{-2x}) = 4/3 \]
f = function(x){ return(exp(-x)+2*exp(-2*x)-3*exp(-3*x)) }
g = function(x){ return(dexp(x)) }
h = function(x){ return(f(x)/g(x)) }
ex16.1 = function(n){
X = rep(NA,n)
iter = rep(NA,n)
det.c = optim(1, h, method = "BFGS", control = list(fnscale = -1))
c = det.c$value
for (j in 1:n){
Y = -log(runif(1))
cri = h(Y)/c
i = 1
while(runif(1) >= cri){
Y = -log(runif(1))
cri = h(Y)/c
i = i+1
}
X[j] = Y
iter[j] = i
}
return(list("X"=X, "iter"=iter, "c"=c))
}
set.seed(1150413)
res = ex16.1(100000)
hist(res$X, breaks = 100, probability = T, xlim = c(0,8), ylim = c(0,0.8))
lines(seq(0,8,0.01), f(seq(0,8,0.01)), lwd = 2, col = "red")
Note that
\[ F(x) = 1-e^{-x}-e^{-2x}+e^{-3x} = (1-e^{-x})(1-e^{-2x}), \]
which is the product of the distribution function of \(exp(1)\) and \(exp(1/2)\).
By Exercise 12, it is the random variable which is the largest value from these two distribution.
ex16.2 = function(n){
Y = pmax(-log(runif(n)), -1/2 * log(runif(n)))
return(Y)
}
set.seed(1150413)
res = ex16.2(100000)
hist(res, breaks = 200, probability = T, xlim = c(0,8), ylim = c(0,0.8))
lines(seq(0,8,0.01), f(seq(0,8,0.01)), lwd = 2, col = "red")
Give two algorithms or generating a random variable having density function
\[ f(x) = \frac{1}{4} +2x^3 + \frac{5}{4}x^4, \quad 0<x<1 \]
solution.
We apply the rejection method with the uniform distribution \(U(0,1)\). Then
\[ c = \max_{0<x<1} \frac{f(x)}{g(x)} = \max_{0<x<1} \frac{1}{4} +2x^3 + \frac{5}{4}x^4 = \frac{7}{2} \]
f = function(x){ return(1/4 + 2*x^3 +5/4*x^4) }
ex17.1 = function(n){
X = rep(NA,n)
iter = rep(NA,n)
det.c = optim(1, f, method = "Brent", lower = 0, upper = 1,
control = list(fnscale = -1))
c = det.c$value
for (j in 1:n){
Y = runif(1)
cri = f(Y)/c
i = 1
while(runif(1) >= cri){
Y = runif(1)
cri = f(Y)/c
i = i+1
}
X[j] = Y
iter[j] = i
}
return(list(X=X, iter=iter, c=c))
}
set.seed(1150413)
res = ex17.1(100000)
hist(res$X, breaks = 100, probability = T, ylim = c(0, 3.5))
lines(seq(0,1,0.01), f(seq(0,1,0.01)), lwd = 2, col = "red")
Note that
\[ F(x) = \frac{1}{4}x + \frac{1}{2}x^4 + \frac{1}{4}x^5 \]
There are three group with the probability 1/4, 1/2, and 1/4.
In group 1, \(X_1 \sim U(0,1), \ F_{X_1}(x)=x \ \Rightarrow \ U=X\)
In group 2, \(X_2 \sim Beta(4,1), \ F_{X_2}(x)=x^4 \ \Rightarrow \ U=X^{1/4}\)
In group 3, \(X_3 \sim Beta(5,1), \ F_{X_3}(x)=x^5 \ \Rightarrow \ U=X^{1/5}\)
ex17.2 = function(n){
X = rep(NA, n)
for (j in 1:n){
U = runif(1)
if (U < 0.25) {
X[j] = runif(1)
}else if(U < 0.75){
X[j] = runif(1)^(1/4)
}else{
X[j] = runif(1)^(1/5)
}
}
return(X)
}
set.seed(1150413)
res = ex17.2(100000)
hist(res, breaks = 100, probability = T)
lines(seq(0,1,0.01), f(seq(0,1,0.01)), lwd = 2, col = "red")
Give an algorithm for generating a random variable having density function
\[ f(x) = 2xe^{-x^2}, \quad x>0 \]
solution.
We apply the rejection method. Consider \(Y\) follows exponential distribution with mean 1, that is,
\[ g(y) = e^{-x}, \quad x>0 \]
Then
\[ c = \max_{x>0} \frac{f}{g} = \max_{x>0} 2xe^{-x^2+x} = 2xe^{-x^2+x} \bigg\vert_{x=1} = 2 \]
f = function(x){ return(2*x*exp(-x^2)) }
g = function(x){ return(exp(-x)) }
h = function(x){ return(f(x)/g(x)) }
ex18 = function(n){
X = rep(NA,n)
iter = rep(NA,n)
det.c = optim(1, h, method = "Brent", lower = 0, upper = 5,
control = list(fnscale = -1))
c = det.c$value
for (j in 1:n) {
Y = -log(runif(1))
cri = h(Y)/c
i = 1
while(runif(1) >= cri){
Y = -log(runif(1))
cri = h(Y)/c
i = i+1
}
X[j] = Y
iter[j] = i
}
return(list(X=X, iter=iter, c=c))
}
set.seed(1150413)
res = ex18(100000)
hist(res$X, breaks = 100, probability = T, ylim = c(0,0.9))
lines(seq(0,3,0.01), f(seq(0,3,0.01)), lwd = 2, col = "red")
Give an algorithm that generates a random variable \(X\) having density
\[ f(x)=30(x^2-2x^3+x^4), \quad 0 \le x \le 1 \]
Discuss the efficiency of this approach.
solution.
Note that
\[ f(x) = 30x^2(1-x)^2, \quad 0<x<1 \]
It follows a \(Beta(3)(3)\) distribution. We apply the rejection method with \(Beta(3,1)\), that is,
\[ g(y) = 3x^2, \quad 0<x<1 \]
Then
\[ c = \max_{0<x<1} f/g = \max_{0<x<1} 10(1-x)^2 = 10 \]
f = function(x){ return(30*x^2*(1-x)^2) }
g = function(x){ return(3*x^2) }
h = function(x){ return(f(x)/g(x)) }
ex22 = function(n){
X = rep(NA, n)
iter = rep(NA,n)
det.c = optim(1, h, method = "Brent", lower = 0, upper = 1,
control = list(fnscale = -1))
c = det.c$value
for (j in 1:n) {
Y = runif(1)^(1/3)
cri = h(Y)/c
i = 1
while(runif(1) >= cri){
Y = runif(1)^(1/3)
cri = h(Y)/c
i = i+1
}
X[j] = Y
iter[j] = i
}
return(list(X=X, iter=iter, c=c))
}
set.seed(1150413)
res = ex22(100000)
hist(res$X, breaks = 100, probability = T, ylim = c(0,2))
lines(seq(0,1,0.01), f(seq(0,1,0.01)), lwd = 2, col = "red")