Question I:

Part 1:

Given the integral \(\int_{0}^1 x\log(x) dx\), solving it directly with parts with setting \(u = \log(x), dv = x dx\), so \(du = \frac{1}{x}\), and \(v = \frac{1}{2x^2}\). Thus, we can use the formula \(\int udv = uv - \int v du\) under the limit (0,1) for it. By plugging in those values, the integral solved to be -0.25.

Part 2:

Question gives the information that the integral is the expectation with respect to an uniform distirbution. In other words, x ~ Unif(0,1), and x has probability density function h(x) = log(x), so E(x) is the intergral we are estimating. The naive Montel Carlo can be shown as R codes as below:

set.seed(2)
n = 1000
x = runif(n,0,1)
y = x*log(x)
c("mean"= mean(y),"st.err" = sd(y))
##       mean     st.err 
## -0.2445820  0.1098266

The expectation is also -0.25 roughly, and the standard error is 0.1098266 calculated by R.

Part 3:

Now, there are two different importance distributions: (i) Be(1.5,1.5) and (ii) Be(1.25,1.5). Our weight funciton is the ratio between h(x)/htilde(x), where h(x) is the probability density function of x, and htilde(x) is the probablity densidty function(beta distribution) of x. Let us look at the R code:

set.seed(1);
x2      = rbeta(1000, 1.5, 1.5);
gx2     = x2*log(x2)/dbeta(x2, 1.5, 1.5);
Exp_gx2 = mean(gx2);
Sd_gx2  = sqrt(var(gx2));
set.seed(1);
x3      = rbeta(1000, 1.25, 1.5);
gx3     = x3*log(x3)/dbeta(x3, 1.25, 1.5);
Exp_gx3 = mean(gx3);
Sd_gx3  = sqrt(var(gx3));
x  = seq(0, 1, 0.01);
y1 = x * log(x); 
y2 = x * log(x) / dbeta(x, 1.5, 1.5);
y3 = x * log(x) / dbeta(x, 1.25, 1.5);
plot(x, y1, main = "x * log(x) * w(x) for importance densities", ylab = "y", type = 'l');
lines(x, y2, lty = 2);
lines(x, y3, lty = 3);
legend(0.7, -0.27, c("uniform", "beta(1.5,1.5)", "beta(1.25,1.5)"), lty = c(1, 2, 3));

By checking the result of the those three cases, it is able to find that the native Monte Carlo’ standard error is 0.11 roughtly; later on, by different weight functions based on different shape of beta distirbution, the standard error shrinkages to 0.064 and 0.056 respectively. Case two is giving more precise result. From the plot above, the importance distribution with beta(1.25,1.5) gives the product of g(x)w(x) gives the flattest plot, so the estimate of the expected value is most accurate.

Question II

Part 1: \[ \begin{align*} L(\theta) &= \prod_{i=1}^n \int f_\theta(y_i|x_i)h_\theta(x_i) dx_i = \prod_{i=1}^n \int \frac{f_\theta(y_i|x_i)h_\theta(x_i) \tilde{h}_\theta(x_i)}{\tilde{h}_\theta(x_i)} dx_i \\ &= \prod_{i=1}^n \int \frac{f_\theta(y_i|x_i)h_\theta(x_i) h_{\theta_0}(x_i)}{h_{\theta_0}(x_i)} dx_i = \prod_{i=1}^n \int f_\theta(y_i|x_i) w(x_i) h_{\theta_0}(x_i)dx_i = E_{h_{\theta_0}}(\prod_{i=1}^n f_\theta(y_i|x_i) w(x_i)) \end{align*} \]

Above likehood can be calculated by the product of the conditional expectation, which includes the important distribution \(h_{\theta_0}(x_i)\), and the weight function \(w(x_i) = \frac{h_\theta(x_i)}{h_{\theta_0}(x_i)}\). The importance sampling estimator of likelihood will be: \[ \begin{align*} \hat{E}_{h_{\theta_0}}(\prod_{i=1}^n f_\theta(y_i|x_i)) & = \frac{1}{N} \sum_{i=1}^N (\prod_{i=1}^n f_\theta(y_i|x_i) w(x_i)) \\ & = \frac{1}{N} \sum_{i=1}^N (\prod_{i=1}^n \frac{1}{\sqrt{2\pi x_i}} \exp(-\frac{1}{2x_i}(y-\theta)^2) \frac{\theta \exp(-x_i\theta)}{\theta_0 \exp(-x_i\theta_0)}) \end{align*} \]

Part 2:

Here, x has probability density function \(h_\theta(x_i) = \theta e^{-x_i\theta}\), so R can help us to calculate the different values of likelihood based on \(\theta = 0.5\)

random.y = function(theta, n){
  set.seed(1)
  x = rexp(n,rate=theta); cond.y = NULL
  for(i in 1:n){
  cond.y[i] = rnorm(1, mean = theta, sd = sqrt(x[i]))  
  }
  return(list("likelihood" = prod(dnorm(cond.y,theta,sqrt(x))) * prod(dexp(x, theta)), "deviates" = cond.y))
}
(deviates = random.y(0.5,30)$deviates)
##  [1] -0.349160530 -1.474812192  0.525224061  0.375367046 -0.006994052
##  [6] -0.542643769 -0.518475110  1.255039704  2.093279678  1.038051028
## [11] -0.216330815  2.028721250  0.060510335  5.728947213  1.314354374
## [16] -0.151519462 -1.111689631 -0.834942758 -0.374736910 -1.196512144
## [21]  3.015044004  1.442744711  0.325646040  0.783124507  0.326493637
## [26]  1.341751655 -0.355655130  0.345581910  0.883183867  1.372933677

Part 3 (a):

(log.likelihood = log(random.y(0.5,30)$likelihood))
## [1] -98.85318

Part 3 (b)

library(ggplot2)
set.seed(1);
theta = 0.5;
x     = rexp(30, theta);
y     = rnorm(30, theta, sqrt(x));
L     = prod(dnorm(y,theta,sqrt(x))) * prod(dexp(x, theta));
Xij = matrix(,1000,30)
theta0 = 1;
for(i in 1:1000){
  Xij[i,] = rexp(30, theta0);
 }
L = c();
theta = seq(0.2, 1.5, length = 100);
for (k in 1:100){
    theta_k = theta[k];
    sum_L = 0;
    for (i in 1:1000){
    sum_L = sum_L + prod(dnorm(y, theta_k, sqrt(Xij[i,]))) * prod(dexp(Xij[i,], theta_k)) / prod(dexp(Xij[i,], theta0));
    }
    L[k] = sum_L/1000;
}
L[which.max(L)]; ## the maximum likelihood value
## [1] 1.385919e-22
qplot(theta,log(L), geom = "line",main = "log of likelihood estimates")

Question III:

Part 1: Proof \[ \begin{align*} E(e^{-x_2^2}|X_1 = x_1) & = \int \exp(-x^2) \frac{1}{\sqrt{2\pi \tau^2}} \exp[-\frac{(x-\mu)^2}{2\tau^2}] dx \\ & = \frac{1}{\sqrt{2\pi \tau^2}} \int \exp[-\frac{x^2 - 2x\mu + \mu^2 + 2\tau^2x^2}{2\tau^2}] dx \\ & = \frac{1}{\sqrt{2\pi \tau^2}} \int \exp[-\frac{(2\tau^2 + 1)x^2 - 2x\mu + \mu^2}{2\tau^2}] dx \\ & = \frac{1}{\sqrt{2\pi \tau^2}} \int \exp[-\frac{x^2 - 2\frac{\mu}{2\tau^2 + 1}x + \frac{\mu^2}{2\tau^2 + 1}}{2\frac{\tau^2}{2\tau^2 + 1}}] dx \\ & = \frac{1}{\sqrt{2\pi \tau^2}} \int \exp[-\frac{(x - \frac{\mu}{2\tau^2 + 1})^2 + \frac{\mu^2}{2\tau^2 + 1} - \frac{\mu^2}{(2\tau^2 + 1)^2}}{2 \frac{\tau^2}{2\tau^2 + 1}}] dx\\ & = \frac{1}{\sqrt{2\pi \tau^2}} \int \exp[-\frac{x - \frac{\mu}{2\tau^2 + 1}}{2 \frac{\tau^2}{2\tau^2 + 1}}] \exp[-\frac{\mu^2}{2\tau^2 + 1}] dx \\ & = \frac{1}{\sqrt{2\pi \frac{\tau^2}{2\tau^2 + 1}}} \frac{1}{\sqrt{2\tau^2 + 1}} \exp[-\frac{\mu^2}{2\tau^2 + 1}] \int \exp[-\frac{x - \frac{\mu}{2\tau^2 + 1}}{2\frac{\tau^2}{2\tau^2 + 1}}] dx = \frac{1}{\sqrt{2\tau^2 + 1}}\exp[-\frac{\mu^2}{2\tau^2 + 1}] \end{align*} \]

Part 2:

Below R code is to help estimate the standard deviation of \(g(x) = exp(-X_2^2)\):

v = 5; mu = 3; sigma = 0.5; N=1000
set.seed(1)
x1 = rchisq(N, v)
x2 = NULL;tau = NULL
for(i in 1:N){
  tau = sqrt(sigma^2*v/x1[i])
  x2[i] = rnorm(1, mu, tau)
}
(se.x2 = sd(exp(-x2^2))) 
## [1] 0.06889511

Respectively, the estimated standard deviation of phi(x) is calculated:

(sd.phi = sqrt(exp(-mu^2 / (2*tau^2 + 1)) / sqrt(2*tau^2 + 1)))
## [1] 0.1508096