Synopsis

The exponential distribution can be simulated in R with rexp(n, lambda) where lambda is the rate parameter. The mean of exponential distribution is 1/lambda and the standard deviation is also also 1/lambda. Set lambda = 0.2 for all of the simulations. This simulation investigates the distribution of averages of 40 exponential(0.2)s. In this simulation, we investigate the distribution of averages of 40 numbers sampled from exponential distribution with lambda=0.2.

The code below calculates 1000 simulated averages of 40 exponentials:

library(ggplot2)
set.seed(3)
lambda <- 0.2
num_sim <- 1000
sample_size <- 40
sim <- matrix(rexp(num_sim*sample_size, rate=lambda), num_sim, sample_size)
row_means <- rowMeans(sim)

The code for the distribution of the means is as below:

# plotting the histogram of averages:
hist(row_means, breaks=50, prob=TRUE,
     main="Distribution of averages of samples,
     drawn from exponential distribution with lambda=0.2",
     xlab="")
# density of the averages of samples:
lines(density(row_means))
# theoretical center of distribution:
abline(v=1/lambda, col="red")
# theoretical density of the averages of samples:
xfit <- seq(min(row_means), max(row_means), length=100)
yfit <- dnorm(xfit, mean=1/lambda, sd=(1/lambda/sqrt(sample_size)))
lines(xfit, yfit, pch=22, col="red", lty=2)
# adding legends 
legend('topright', c("simulation", "theoretical"), lty=c(1,2), col=c("black", "red"))

plot of chunk unnamed-chunk-2

Due to the central limit theorem, the averages of samples follow normal distribution. The figure above also shows the density computed using the histogram and the normal density plotted with theoretical mean and variance values. Also, the q-q plot below suggests the normality.

qqnorm(row_means); qqline(row_means)

plot of chunk unnamed-chunk-3

Evaluatation of the coverage of the confidence interval for 1/lambda:

lambda_vals <- seq(4, 6, by=0.01)
coverage <- sapply(lambda_vals, function(lamb) {
  mu_hats <- rowMeans(matrix(rexp(sample_size*num_sim, rate=0.2),
                             num_sim, sample_size))
  ll <- mu_hats - qnorm(0.975) * sqrt(1/lambda**2/sample_size)
  ul <- mu_hats + qnorm(0.975) * sqrt(1/lambda**2/sample_size)
  mean(ll < lamb & ul > lamb)
})

library(ggplot2)
qplot(lambda_vals, coverage) + geom_hline(yintercept=0.95)

plot of chunk unnamed-chunk-4