Overview

This project investigates the exponential distribution in R and compares the distribution of its sample means with what the Central Limit Theorem (CLT) predicts. The rate parameter lambda is set to 0.2 for all simulations, so the theoretical mean and the theoretical standard deviation of the distribution are both 1/lambda = 5. We look at the distribution of averages of 40 exponentials over 1000 simulations. Setting the values:

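n <- 40          # number of exponentials per average
nosim <- 1000    # number of simulations
lambda <- 0.2    # rate parameter
# One simulation per row: nosim rows of n exponential draws each
expdata <- matrix(rexp(n*nosim, lambda), nosim, n)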
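The simulation draws random numbers, so the figures change from run to run unless a seed is fixed. The following is a minimal sketch of a reproducible setup. The seed value 2024 is an arbitrary assumption and is not taken from the original analysis, so it will not necessarily reproduce the numbers reported in this document.

```r
# Assumption: 2024 is an arbitrary seed chosen for illustration only.
set.seed(2024)

n <- 40
nosim <- 1000
lambda <- 0.2
expdata <- matrix(rexp(n * nosim, lambda), nosim, n)

# The matrix has nosim rows (simulations) and n columns (draws per simulation).
dim(expdata)
```

Resetting the same seed before the call to rexp gives an identical matrix, which makes the means and variances in the following sections repeatable.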

1. Show the sample mean and compare it to the theoretical mean of the distribution.

Sample mean:

# Average of each row, i.e. of each simulated sample of 40 exponentials
expmean <- rowMeans(expdata)
# Mean of the 1000 averages
samplemean <- mean(expmean)
round(samplemean, 2)
## [1] 5.05

Theoretical mean:

1/lambda
## [1] 5

Comparison:

The sample mean (5.05) is within about 1% of the theoretical mean (5). The following code plots the 1000 averages as a histogram and marks the sample mean in red.

hist(expmean, main = "1000 averages of 40 exponentials", xlab = "Average of 40 exponentials", breaks = 20)
abline(v = samplemean, col = "red", lwd = 2)
legend("topright", legend = "Sample mean", col = "red", cex = 0.8, lwd = 2)
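To judge how close the two means should be, it helps to compute the standard error of the mean of the 1000 averages. The sketch below uses only the theoretical values already defined in this document. The names sd_of_mean and se_samplemean are introduced here for illustration.

```r
n <- 40
nosim <- 1000
lambda <- 0.2

# Theoretical standard deviation of one average of n exponentials
sd_of_mean <- 1 / (lambda * sqrt(n))

# Standard error of the mean of nosim such averages
se_samplemean <- sd_of_mean / sqrt(nosim)
round(se_samplemean, 3)
## [1] 0.025
```

The reported gap of about 0.05 between the sample mean and the theoretical mean is therefore roughly two standard errors, which is a plausible amount of simulation noise.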

2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.

The variance of a single exponential is 1/lambda^2, so the theoretical variance of the mean of n independent exponentials is (1/lambda^2)/n = 1/((lambda^2)*n).
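For reference, the formula follows from the independence of the draws. The derivation below is a short sketch, where X_1, ..., X_n denote the n exponentials in one simulation and \bar{X} their average (notation introduced here).

```latex
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{1}{n^{2}}\cdot\frac{n}{\lambda^{2}}
  = \frac{1}{\lambda^{2} n},
\qquad
\text{so for } \lambda = 0.2,\ n = 40:\quad
\frac{1}{0.04 \times 40} = 0.625 .
```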

expvar <- var(expmean)
round(expvar, 2)
## [1] 0.63
theovar <- 1/((lambda^2)*n)
round(theovar, 2)
## [1] 0.62

The variance of the sample means (0.63) differs from the theoretical variance (0.625, printed as 0.62 after rounding) by about 0.01 or less.

3. Show that the distribution of sample means is approximately normal.

x <- seq(min(expmean), max(expmean), length.out = 2*n)
# Normal density predicted by the CLT for averages of n exponentials
y <- dnorm(x, mean = 1/lambda, sd = 1/(lambda*sqrt(n)))
# Bin densities, computed without plotting, so the y-axis fits both the bars and the curve
h <- hist(expmean, breaks = 20, plot = FALSE)
hist(expmean, main = "1000 averages of 40 exponentials", xlab = "Average of 40 exponentials", breaks = 20, freq = FALSE, ylim = c(0, max(y, h$density)))
lines(x, y, col = "magenta", lwd = 2, lty = 1)
legend("topright", legend = "Normal density", col = "magenta", cex = 0.8, lwd = 2, lty = 1)

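From the plot, we see that the distribution of sample means closely follows the normal density with mean 1/lambda = 5 and standard deviation 1/(lambda*sqrt(n)), which is about 0.79. This is consistent with the Central Limit Theorem (CLT), which predicts that averages of independent draws are approximately normally distributed when n is large enough.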
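A histogram overlay is a visual check only. As a complementary sketch, the averages can be standardized with the theoretical mean and standard deviation and then compared with the standard normal distribution. The block is self-contained and regenerates the data. The seed 2024 and the names z and coverage are assumptions made for illustration and are not part of the original analysis.

```r
# Assumption: 2024 is an arbitrary seed chosen for illustration only.
set.seed(2024)

n <- 40
nosim <- 1000
lambda <- 0.2
expmean <- rowMeans(matrix(rexp(n * nosim, lambda), nosim, n))

# Standardize using the theoretical mean and standard deviation of the averages
z <- (expmean - 1 / lambda) / (1 / (lambda * sqrt(n)))

# Normal Q-Q plot: points near the reference line indicate approximate normality
qqnorm(z, main = "Q-Q plot of standardized averages")
abline(0, 1, col = "red", lwd = 2)

# Share of standardized averages within 1.96 of zero;
# this is about 0.95 for a standard normal distribution
coverage <- mean(abs(z) < 1.96)
coverage
```

Because the exponential distribution is right-skewed, averages of 40 draws keep a mild right skew, so small departures from the line in the tails of the Q-Q plot would not be surprising.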