Synopsis

The exponential distribution has a mean and a standard deviation that are both equal to the inverse of its rate, $1/\lambda$.

The central limit theorem (CLT) states that if a population has mean $\mu$ and standard deviation $\sigma$, and sufficiently large random samples are drawn from it with replacement, then the sample means are approximately normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, where $n$ is the size of each sample.
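The statement above can be written compactly as follows. The symbols $X_i$ (the individual draws) and $\bar{X}_n$ (their mean) are notation introduced here for illustration; they do not appear in the original text.

```latex
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\bar{X}_n \;\overset{\text{approx.}}{\sim}\; \mathcal{N}\!\left(\mu,\ \frac{\sigma^{2}}{n}\right)
\quad \text{for sufficiently large } n .
```

The variance $\sigma^2/n$ corresponds to the standard deviation $\sigma/\sqrt{n}$ quoted in the text.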

Here the Central Limit Theorem is tested against an exponential distribution, using 1000 simulated samples of size 40.

Importing libraries

library(knitr)
library(ggplot2)

Simulation of exponential distribution

The aim is to obtain 1000 sample means, each computed from 40 random draws from an exponential distribution with rate $\lambda = 0.2$. This requires 1000 * 40 = 40,000 random exponential values. The distribution has a mean and a standard deviation of $1/\lambda = 1/0.2 = 5$.

set.seed(1994)

nosim  <- 1000  # number of simulated averages
n <- 40         # sample size for each mean
lambda <- 0.2   # rate parameter
# data frame with all 40,000 simulated values
expsim <- data.frame(x = rexp(nosim * n, lambda))
# arrange the values as a nosim x n matrix (one sample per row) and take the mean of each row
expmns <- data.frame(x = apply(matrix(expsim$x, nrow = nosim), 1, mean))

# after_stat() and linewidth assume ggplot2 >= 3.4.0; older versions use ..density.. and size
ggplot(data = expsim, aes(x)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.8, color = "black", fill = "red") +
  stat_function(fun = function(x) dexp(x, rate = lambda), linewidth = 2) +
  labs(title = "Exponential Distribution (Rate = 0.2)")

Comparing mean and variance of sample and population

mean(expmns$x)
## [1] 5.011837

The mean of the distribution of sample means is 5.012, which is very close to the population mean of $1/\lambda = 1/0.2 = 5$.

var(expmns$x)
## [1] 0.6076351

The variance of the distribution of sample means is 0.608, which is close to the theoretical variance of the sample mean, $\sigma^2/n = 1/(\lambda^2 n) = 1/(0.2^2 \times 40) = 0.625$.
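The two comparisons above can also be collected in one small table. This is a minimal, self-contained sketch: it reruns the simulation with the same seed and parameters, but uses `rowMeans()` instead of `apply()`, and the names `means` and `comparison` are introduced here for illustration only.

```r
# Sketch: simulated vs. theoretical mean and variance of the sample mean.
set.seed(1994)

nosim  <- 1000   # number of simulated averages
n      <- 40     # sample size for each mean
lambda <- 0.2    # rate parameter

# One simulated sample per row; rowMeans() returns the 1000 sample means
means <- rowMeans(matrix(rexp(nosim * n, lambda), nrow = nosim))

# 'comparison' is a hypothetical helper table, not part of the original analysis
comparison <- data.frame(
  statistic   = c("mean", "variance"),
  simulated   = c(mean(means), var(means)),
  theoretical = c(1 / lambda, 1 / (lambda^2 * n))
)
print(comparison)
```

The `theoretical` column holds $1/\lambda$ and $1/(\lambda^2 n)$, so the two rows can be read off directly against the simulated values.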

Applying Central Limit Theorem

According to the CLT, the sample means will be approximately normally distributed with a mean of 5 and a standard deviation of $5/\sqrt{40} \approx 0.79$.

mu    <- 1/lambda
sigma <- 1/lambda

# after_stat() and linewidth assume ggplot2 >= 3.4.0; older versions use ..density.. and size
ggplot(expmns, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.1, colour = "black", fill = "blue") +
  stat_function(fun = dnorm, args = list(mean = mu, sd = sigma / sqrt(n)), linewidth = 2) +
  labs(title = "Distribution of Sample Means", x = "Mean", y = "Density")
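As a rough numeric complement to the histogram, one could check what share of the sample means falls within 1.96 standard errors of the theoretical mean; under a normal distribution this share would be about 95%. This check is not part of the original analysis. It is a sketch that reruns the simulation with the same seed and parameters, and the names `means`, `z`, and `coverage` are assumptions made here.

```r
# Sketch: share of sample means within the central 95% band of the CLT normal.
set.seed(1994)

nosim  <- 1000   # number of simulated averages
n      <- 40     # sample size for each mean
lambda <- 0.2    # rate parameter

means <- rowMeans(matrix(rexp(nosim * n, lambda), nrow = nosim))

mu <- 1 / lambda                # theoretical mean of the sample mean
se <- (1 / lambda) / sqrt(n)    # theoretical standard deviation of the sample mean

z        <- (means - mu) / se              # standardized sample means
coverage <- mean(abs(z) <= qnorm(0.975))   # proportion inside +/- 1.96 standard errors
print(coverage)
```

A value near 0.95 is consistent with the normal approximation. Because the exponential distribution is skewed, some deviation is expected at n = 40.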

Conclusion

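The simulated mean (5.012) and variance (0.608) of the sample means are close to the theoretical values of 5 and 0.625, and the histogram follows the overlaid normal curve. It can be concluded that the sample means are approximately normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, as stated by the Central Limit Theorem.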