Overview

This report mainly focus on running a random simulation and compare the statiscs of the sample (mean, variance) with the thoretical statistics (mean, variance) # Simulations of exponential distribution In this section, we will investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda) where lambda is the rate parameter. The mean of exponential distribution is \(\dfrac{1}{\lambda}\) and the standard deviation is also \(\dfrac{1}{\lambda}\). Set \(\lambda=0.2\) for all of the simulations. we will investigate the distribution of averages of \(n=40\) exponentials. Note that we will do a \(B=2000\) simulations.

set.seed(1001)
M <- 40
B <- 1000
lambda <- 0.2
data <- matrix(0,B,M)
for (i in 1:B) {
    data[i,] <- rexp(M,lambda)
}

Sample Mean versus Theoretical Mean

First of all, We compute our sample mean from simiulation by the following code:

single.mean <- numeric()
for (i in 1:B){
  single.mean[i] <- mean(data[i,])
}
sample.mean <- mean(single.mean)
theor.mean <- 1/lambda
cbind(sample.mean, theor.mean)
##      sample.mean theor.mean
## [1,]    4.973765          5

The simulation result shows the real mean of \(4.973765\) is close to the theoretical mean of \(\dfrac{1}{\lambda}=5\).

Now we can compare them by plotting the histogram with the sample mean and theoretical mean as follow.

hist(single.mean, xlab="Sample Mean", ylab = "Freq", main="Histogram of Sample Means")
abline(v=sample.mean, col = "red", lty = 1, lwd = 2)
abline(v=1/lambda, col = "blue", lty =2, lwd = 2)
legend('topright', c("Sample Mean", "Theoretical Mean"), bty = "n", lty = c(1,2), lwd=c(2,2), col = c(col = "red", col = "blue"))

From the plot we can observe the sample mean is approximately bell-shape, and the mean of the sample mean is very close to the theoretical mean. # Sample Variance versus Theoretical Variance First of all, We compute our sample variance from simiulation by the following code:

sample.var <- var(single.mean)
theor.var <- ((1/lambda)^2)/M
cbind(sample.var, theor.var)
##      sample.var theor.var
## [1,]  0.6220001     0.625

The simulation results show that the real variance of \(0.6220001\) is very close to the theoretical variance of sample mean \(\dfrac{\sigma^2}{n}=\dfrac{1}{\lambda^2 \cdot n}=0.625\).

Now we can compare the result with the thoretical, we can plot the histogram with the fitted line.

hist(single.mean, prob = TRUE, main = "Exponential Distribution for n=40 B = 2000", xlab = "Sample mean", ylab="Freq")
lines(density(single.mean),lty=1, col="red",lwd=2)
abline(v=sample.mean, col = "red", lty = 1, lwd = 2)
abline(v = 1/lambda, col = "blue",lty=2,lwd=2)
xfit <- seq(min(single.mean), max(single.mean), length = 100)
yfit <- dnorm(xfit, mean = 1/lambda, sd = (1/lambda/sqrt(M)))
lines(xfit, yfit, pch = 22, col = "blue", lty = 2,lwd=2)
legend('topright', c("Simulated Values", "Theoretical Values"), bty = "n", lty = c(1,2), lwd = c(2,2), col = c("red", "blue"))

The distribution of sample mean is approximately normal

Due to the central limit theorem, the averages of samples follow normal distribution. The figure above also shows the density computed using the histogram and the normal density plotted with theoretical mean and variance values. Also, the q-q plot below suggests the normality. The theoretical quantiles again match closely with the actual quantiles. The above methods of comparison prove that the distribution is approximately normal.

qqnorm(single.mean, col='red',lty = 1,lwd=1) 
qqline(single.mean, col='blue',lty = 2,lwd=2);
legend('topleft', c("qqnorm", "qqline"),bty = "n", lty = c(1,2), lwd = c(1,2), col = c("red", "blue"))