Overview

This experiment demonstrates the Central Limit Theorem by simulating draws from the exponential distribution in R. The Central Limit Theorem states that if we repeatedly draw independent samples from a population with finite variance, the distribution of the sample means approaches a normal distribution as the sample size increases, whatever the shape of the population distribution.

Simulations

The following R code sets the seed of the pseudorandom number generator for reproducibility. It then runs one thousand simulations, each drawing forty observations from an exponential distribution with the rate parameter set to 0.2. The mean of each sample is saved, resulting in a vector of 1000 means.

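set.seed(1)                # fix the seed so the results are reproducible
lambda <- 0.2              # rate parameter of the exponential distribution
n <- 40                    # observations per simulation
values <- numeric(1000)    # one mean per simulation
for (i in 1:1000) values[i] <- mean(rexp(n, lambda))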
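The same simulation can also be written without an explicit loop. The following is a minimal sketch, not the code used for this report: the names `n_sims`, `draws` and `values_vec` are introduced here for illustration only. It draws all observations in one call and stores one simulation per row. Because the draws are consumed in the same order as in the loop, the resulting means should match the loop version, although that is worth confirming with `all.equal()` rather than taking on trust.

```r
# Sketch: draw all 1000 * 40 observations at once, one simulation per row.
set.seed(1)
lambda <- 0.2   # rate parameter, as in the report
n <- 40         # observations per simulation
n_sims <- 1000  # number of simulations (illustrative name)

# byrow = TRUE fills row 1 with the first n draws, row 2 with the next n, and so on
draws <- matrix(rexp(n_sims * n, rate = lambda), nrow = n_sims, byrow = TRUE)

# One mean per simulation
values_vec <- rowMeans(draws)
```

Preallocating a matrix avoids growing a vector inside a loop, which matters little at this size but scales better for larger simulations.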

Sample Mean versus Theoretical Mean

sample_mean <- mean(values)
theoretical_mean <- 1 / lambda

The sample mean (the average of the 1000 simulation means) was 4.9900252, which is very close to the theoretical mean of the exponential distribution, 1 / lambda = 5. This is shown in the table and figure below.
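To put a number on "very close", the gap between the two means can be expressed in standard errors. The sketch below uses the values reported above. The names `n_sims`, `se_of_average` and `gap_in_se` are assumptions introduced for illustration and do not appear in the original code.

```r
# Values as reported in the text
sample_mean <- 4.9900252
lambda <- 0.2
n <- 40
n_sims <- 1000  # number of simulations (illustrative name)

theoretical_mean <- 1 / lambda

# The standard deviation of one simulated mean is (1 / lambda) / sqrt(n);
# averaging n_sims independent means divides it by sqrt(n_sims).
se_of_average <- (1 / lambda) / sqrt(n) / sqrt(n_sims)  # 0.025

# Gap between sample and theoretical mean, in standard errors (about 0.4)
gap_in_se <- abs(sample_mean - theoretical_mean) / se_of_average
```

A gap of well under one standard error is consistent with ordinary sampling variation.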

Sample Variance versus Theoretical Variance

sample_var <- var(values)
theoretical_var <-  (1 / lambda) ^ 2 / n

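The sample variance (the variance of the 1000 simulation means) was 0.6111165, which is very close to the theoretical variance of the mean of 40 exponential observations: (1 / lambda)^2 / n = 0.625. Note that this is the variance of the sample mean, not of the exponential distribution itself, whose variance is (1 / lambda)^2 = 25.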
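The theoretical value 0.625 follows from the variance of a mean of independent draws. As a short worked derivation, assuming the n = 40 observations in each simulation are independent with common variance sigma^2 = (1 / lambda)^2:

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^2}{n}
  = \frac{(1/\lambda)^2}{n}
  = \frac{25}{40}
  = 0.625
\]
```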

##             Sample Theoretical
## Mean     4.9900252   5.0000000
## Variance 0.6111165   0.6250000
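The code that produced this table is not shown. One way such a table could be assembled is sketched below, under the assumption that a plain matrix with row and column names was used. The name `comparison` is hypothetical, and the numbers are the reported values entered directly so that the sketch runs on its own.

```r
# Values as reported in the text; in the report they would come from
# sample_mean, theoretical_mean, sample_var and theoretical_var.
comparison <- matrix(
  c(4.9900252, 5,
    0.6111165, 0.625),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Mean", "Variance"), c("Sample", "Theoretical"))
)
print(comparison)
```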

Distribution

As predicted by the Central Limit Theorem, the distribution of the sample means is approximately normal. See the figure below.

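m <- sample_mean
std <- sqrt(sample_var)
# Histogram of the simulated means on the density scale
hist(values, density = 20, breaks = 20, prob = TRUE,
     xlab = "values", ylim = c(0, 0.7),
     main = "Normal Curve Over Histogram")
# Overlay a normal density with the sample mean and standard deviation
curve(dnorm(x, mean = m, sd = std),
      col = "darkblue", lwd = 2, add = TRUE)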
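The histogram gives a visual impression of normality. A complementary check, offered here as a sketch rather than as part of the original analysis, is to compare the share of simulated means within one and two standard deviations against the values expected for a normal distribution, and to draw a normal Q-Q plot. The names `z`, `within_1` and `within_2` are introduced for illustration, and the simulation is repeated with `replicate()` so that the block runs on its own.

```r
# Regenerate the simulated means so this block is self-contained
set.seed(1)
lambda <- 0.2
n <- 40
values <- replicate(1000, mean(rexp(n, lambda)))

# Standardize with the theoretical mean and standard deviation of the sample mean
z <- (values - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of simulated means within 1 and 2 standard deviations;
# a normal distribution gives roughly 0.683 and 0.954
within_1 <- mean(abs(z) <= 1)
within_2 <- mean(abs(z) <= 2)

# Visual check: points close to the reference line suggest approximate normality
qqnorm(values)
qqline(values, col = "darkblue")
```

Some departure in the tails is to be expected, since the mean of 40 exponential observations is still slightly right-skewed.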