\(1\). Write a Monte Carlo Simulation program to approximate the value of the constant π by generating points uniformly distributed in \([−1,1]^2\) and counting the number of points that fall within the unit circle (radius \(1\), centered at \((0,0)\)). Report your results for sample sizes \(n = 100, 1000, 10000\). Find the estimated variance of each \(\hat{\pi}\) estimate. Define the functions \(h(x)\) and \(f(x)\) you used in your estimate
\[\pi=E\left[ h(X)\right] = \int h\left( x \right) f\left(x \right)dx\]
First, define \(h(x)\) and \(f(x)\) for \(x = (d_1, d_2)\):
\[h(x)=4 \cdot \mathbf{1}_{\left\{ d_1^2 + d_2^2 < 1 \right\}}, \hspace{1cm} \text{where } d_1, d_2 \in [-1,1]\]
\[f(x) = \left\{ \begin{array}{ll} \frac{1}{4} & -1 \le d_1 \le 1, \ -1 \le d_2 \le 1 \\ 0 & \text{otherwise} \end{array} \right.\]
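With these choices the estimator does target \(\pi\): the point \((d_1, d_2)\) is uniform on the square \([-1,1]^2\), which has area \(4\), while the unit circle has area \(\pi\), so
\[E\left[ h(X) \right] = 4 \cdot P\left( d_1^2 + d_2^2 < 1 \right) = 4 \cdot \frac{\pi}{4} = \pi.\]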
set.seed(12)
estimates <- function(power){
  N <- 10^power
  x <- numeric()
  p <- numeric()
  x[1] <- 0
  for (n in 1:N){
    d1 <- runif(1, -1, 1)        # first coordinate, sampled from Uniform(-1, 1)
    d2 <- runif(1, -1, 1)        # second coordinate, sampled from Uniform(-1, 1)
    unit <- d1^2 + d2^2          # squared distance from the origin (Pythagoras' theorem) #
    x[n+1] <- x[n] + (unit < 1)  # running count of points inside the unit circle #
    p[n] <- 4 * x[n+1] / n       # running estimate of pi #
  }
  plot(p, type = "l", ylab = "running estimate of pi", xlab = "number of uniform[-1,1] points")
  abline(h = pi, col = "red")
  mu_hat <- mean(p)                                # compute the estimated value of pi #
  var_hat <- sum((p - mu_hat)^2) / (N * (N - 1))   # compute the estimated variance of the pi estimate #
  res <- as.data.frame(c(mu_hat, var_hat))
  rownames(res) <- c("estimate of pi", "estimated variance of pi")
  colnames(res) <- c("Values")
  print(res)
}
# compute the estimated value of pi and its estimated variance as n varies #
# for n=100 #
estimates(2)
## Values
## estimate of pi 3.263835682
## estimated variance of pi 0.002765397
# for n=1000 #
estimates(3)
## Values
## estimate of pi 3.0769379997
## estimated variance of pi 0.0000157039
# for n=100000 #
estimates(5)
## Values
## estimate of pi 3.137837e+00
## estimated variance of pi 3.355179e-09
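For reference, the same estimator can be written without the running loop. Below is a minimal vectorized sketch (the helper name pi_mc is only for illustration and was not used to produce the results above); it estimates the variance of the mean as the sample variance of \(h\) divided by \(n\), so its numbers will differ slightly from those reported.
pi_mc <- function(n) {
  d1 <- runif(n, -1, 1)
  d2 <- runif(n, -1, 1)
  h_vals <- 4 * (d1^2 + d2^2 < 1)                          # h(x) = 4 * indicator(point inside unit circle)
  c(estimate = mean(h_vals), variance = var(h_vals) / n)   # variance of the sample-mean estimator
}
pi_mc(10000)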
\(2\). Suppose we want to find the value of the integral
\[\mu=\int^5_{-10} \frac{1}{1+\theta^2} \exp\left( -\frac{1}{2} \left( \theta-5 \right)^2 \right) d\theta\]
\((a)\) Write an R-function that approximates the value of μ through Monte Carlo integration using a random sample of size n from the Cauchy distribution
\[f(x) = \frac{1}{\pi} \frac{1}{1+x^2}, \hspace{1cm} -\infty < x < \infty\]
Use your function to estimate the value of the integral for values \(n = 10^3, \space 10^4, \space 10^5, \space 10^6, \space 10^7, \space 10^8\).
Also provide an estimate of the uncertainty of the approximation (variance of the estimate) for each sample size.
\[h(x)=\pi \exp\left(-\frac{1}{2} (x-5)^2\right) \mathbf{1}_{\{-10\le x \le 5\}}\] \[f(x) =\frac{1}{\pi} \frac{1}{1+x^2}, \hspace{1cm} -\infty < x < \infty\]
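These choices simply factor the integrand on \([-10, 5]\) into \(h(x)f(x)\):
\[\frac{1}{1+x^2} \exp\left( -\frac{1}{2}(x-5)^2 \right) \mathbf{1}_{\{-10 \le x \le 5\}} = \underbrace{\pi \exp\left( -\frac{1}{2}(x-5)^2 \right) \mathbf{1}_{\{-10 \le x \le 5\}}}_{h(x)} \cdot \underbrace{\frac{1}{\pi}\frac{1}{1+x^2}}_{f(x)},\]
so \(\mu = E_f\left[ h(X) \right]\) and the sample mean of \(h\) over Cauchy draws estimates \(\mu\).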
h <- function(x){ ifelse(x <= 5 & x >= -10, pi * exp(-0.5 * (x - 5)^2), 0) } # defining the h-function
estimates <- function(power){
  n <- 10^power
  set.seed(15)
  X <- rcauchy(n)
  mu.hat <- mean(h(X))                               # compute the estimated values of mu #
  var.hat <- sum((h(X) - mu.hat)^2) / (n * (n - 1))  # compute the estimated variances of mu #
  return(c(mu.hat, var.hat))
}
# compute the estimated values of mu and the estimated variances of mu when n varies #
est <- matrix(NA, nrow = 2, ncol = 6)
colnames(est) <- c("n=10^3", "n=10^4", "n=10^5", "n=10^6", "n=10^7", "n=10^8")
rownames(est) <- c("mu_hat", "var_hat")
for (i in 1:6){
  est[, i] <- estimates(i + 2)   # sample sizes 10^3 through 10^8
}
est_cauchy <- as.data.frame(t(est))
est_cauchy
## mu_hat var_hat
## n=10^3 0.05792099 1.004299e-04
## n=10^4 0.07198648 1.331973e-05
## n=10^5 0.07230242 1.343482e-06
## n=10^6 0.07203166 1.334564e-07
## n=10^7 0.07212664 1.334616e-08
## n=10^8 0.07212863 1.334056e-09
\((b)\) Write an R-function that approximates the value of μ through Monte Carlo integration using a random sample of size n from a Normal distribution. Use your function to estimate the value of the integral for values
\[n = 10^3, \space 10^4, \space 10^5, \space 10^6, \space 10^7, \space 10^8.\]
Also provide an estimate of the uncertainty of the approximation (variance of the estimate) for each sample size.
\[h(x)=\frac{\sqrt{2\pi}}{1+x^2} \mathbf{1}_{\{-10\le x \le 5\}}\] \[f(x) =\frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} (x-5)^2\right), \hspace{1cm} -\infty < x < \infty\]
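Again the integrand factors into \(h(x)f(x)\), now with the \(N(5,1)\) density as \(f\):
\[\frac{1}{1+x^2} \exp\left( -\frac{1}{2}(x-5)^2 \right) \mathbf{1}_{\{-10 \le x \le 5\}} = \underbrace{\frac{\sqrt{2\pi}}{1+x^2} \mathbf{1}_{\{-10 \le x \le 5\}}}_{h(x)} \cdot \underbrace{\frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2}(x-5)^2 \right)}_{f(x)},\]
so averaging \(h\) over \(N(5,1)\) draws again estimates \(\mu\).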
h <- function(x){ ifelse(x <= 5 & x >= -10, sqrt(2 * pi) / (1 + x^2), 0) } # defining the h-function
estimates <- function(power){
  n <- 10^power
  set.seed(15)
  X <- rnorm(n, 5, 1)
  mu.hat <- mean(h(X))                               # compute the estimated values of mu #
  var.hat <- sum((h(X) - mu.hat)^2) / (n * (n - 1))  # compute the estimated variances of mu #
  return(c(mu.hat, var.hat))
}
# compute the estimated values of mu and the estimated variances of mu when n varies #
est <- matrix(NA, nrow = 2, ncol = 6)
colnames(est) <- c("n=10^3", "n=10^4", "n=10^5", "n=10^6", "n=10^7", "n=10^8")
rownames(est) <- c("mu_hat", "var_hat")
for (i in 1:6){
  est[, i] <- estimates(i + 2)   # sample sizes 10^3 through 10^8
}
est_normal <- as.data.frame(t(est))
est_normal
## mu_hat var_hat
## n=10^3 0.07112256 9.124895e-06
## n=10^4 0.07279729 7.155691e-07
## n=10^5 0.07202128 6.866685e-08
## n=10^6 0.07210907 6.881617e-09
## n=10^7 0.07212751 6.885426e-10
## n=10^8 0.07211416 6.883729e-11
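As a quick sanity check on the results from \((a)\) and \((b)\) (not part of the assignment), the integral can also be evaluated by deterministic quadrature with base R's integrate(); its value should be close to the Monte Carlo estimates above (about 0.072).
integrand <- function(theta) exp(-0.5 * (theta - 5)^2) / (1 + theta^2)   # integrand of mu
integrate(integrand, lower = -10, upper = 5)                             # expected to agree with the MC estimates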
\((c)\) Visualize your collected results in one plot (using ggplot2). Create a line graph of the estimates \(\hat{\mu}\) as a function of sample size \(n\). Use a logarithmic axis for the sample size. Use different colors for the lines corresponding to the two methods (from \((a)\) and \((b)\)). Add error bars to the lines (in the same color as the line) to visualize the uncertainty estimates. Let the uncertainty be defined as the standard error (square root of estimated variance) of the estimate. Your finished plot should look similar to the ones shown here: http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/
# create the data set for the line plot #
sample_size <- c(10^3, 10^4, 10^5, 10^6, 10^7, 10^8)
cauchy_tab <- as.data.frame(cbind(sample_size, est_cauchy))
normal_tab <- as.data.frame(cbind(sample_size, est_normal))
sum_tab <- rbind(cauchy_tab, normal_tab)
rownames(sum_tab) <- 1:12
Subject <- c(rep("Cauchy", 6), rep("Normal", 6)) # label the two sampling methods #
sum_tab <- as.data.frame(cbind(Subject, sum_tab))
sum_tab$Subject <- factor(sum_tab$Subject)
sum_tab
## Subject sample_size mu_hat var_hat
## 1 Cauchy 1e+03 0.05792099 1.004299e-04
## 2 Cauchy 1e+04 0.07198648 1.331973e-05
## 3 Cauchy 1e+05 0.07230242 1.343482e-06
## 4 Cauchy 1e+06 0.07203166 1.334564e-07
## 5 Cauchy 1e+07 0.07212664 1.334616e-08
## 6 Cauchy 1e+08 0.07212863 1.334056e-09
## 7 Normal 1e+03 0.07112256 9.124895e-06
## 8 Normal 1e+04 0.07279729 7.155691e-07
## 9 Normal 1e+05 0.07202128 6.866685e-08
## 10 Normal 1e+06 0.07210907 6.881617e-09
## 11 Normal 1e+07 0.07212751 6.885426e-10
## 12 Normal 1e+08 0.07211416 6.883729e-11
# line graph of the estimates mu hat as a function of sample size n #
library(ggplot2)
ggplot(sum_tab, aes(x = sample_size, y = mu_hat, color = Subject)) +
  geom_errorbar(aes(ymin = mu_hat - sqrt(var_hat), ymax = mu_hat + sqrt(var_hat)), width = .1) +
  geom_line() +
  geom_point() +
  scale_x_log10() +
  labs(x = "Sample Size", y = "Estimate of mu",
       title = "Monte Carlo estimates of mu by sample size and sampling distribution")