Overview

This simulation illustrates that the Central Limit Theorem (CLT) applies to the exponential distribution.

Simulations

This simulation is intended to illustrate the applicability of the Central Limit Theorem to an exponential distribution. From Wikipedia:

In probability theory, the central limit theorem (CLT) states that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed, regardless of the underlying distribution

The exponential distribution has the density function:

\[ f(x;\lambda) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0,\\ 0, & x < 0. \end{cases} \]

It satisfies the conditions of the CLT because its mean and its standard deviation are well defined; both have the same value: \(1/\lambda\).
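For reference, the CLT prediction for this setup can be written out explicitly. This is a short sketch using the standard results for the mean of \(n\) independent, identically distributed variables; the symbol \(\bar{X}_n\) for the sample mean is introduced here and does not appear in the original text.

\[ E[\bar{X}_n] = \frac{1}{\lambda}, \qquad \mathrm{Var}(\bar{X}_n) = \frac{1}{n\lambda^2}, \qquad \bar{X}_n \approx N\!\left(\frac{1}{\lambda}, \frac{1}{n\lambda^2}\right) \text{ for large } n. \]

With \(\lambda = 0.2\) and \(n = 40\), the values used in the simulation, this gives a mean of 5 and a standard deviation of \(5/\sqrt{40} \approx 0.79\) for the sample means.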

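The simulation consists of drawing 1000 samples of 40 exponential random variables with \(\lambda = 0.2\) and recording the mean of each sample. For comparison, the same is done with normal random variables whose mean and standard deviation are both \(1/\lambda\):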

lambda <- 0.2   # rate of the exponential distribution
n <- 40         # size of each sample
set.seed(1234)

ed <- numeric(1000) # means of the exponential samples
nd <- numeric(1000) # means of the normal samples

for (i in 1:1000) {
  ed[i] <- mean(rexp(n, lambda))
  nd[i] <- mean(rnorm(n, mean = 1/lambda, sd = 1/lambda))
}
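The loop above can also be written without an explicit loop. The following is a minimal sketch, assuming the same rate, sample size and number of replications; the names `n_sims`, `draws` and `sim_means` are introduced here for illustration. Because it draws the random numbers in a different order, it will not reproduce the exact values stored in `ed`.

```r
lambda <- 0.2
n <- 40
n_sims <- 1000   # illustrative name; the original hard-codes 1000
set.seed(1234)

# One row per simulated sample, one column per draw
draws <- matrix(rexp(n_sims * n, rate = lambda), nrow = n_sims, ncol = n)
sim_means <- rowMeans(draws)   # one sample mean per row

# Compare the average of the simulated means with the theoretical mean 1/lambda
mean_of_means <- mean(sim_means)
theoretical_mean <- 1 / lambda
c(simulated = mean_of_means, theoretical = theoretical_mean)
```

The two printed values should be close, since the average of 1000 sample means is a much less noisy estimate of \(1/\lambda\) than a single sample mean.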

Sample Mean versus Theoretical Mean

library(ggplot2)

# A single sample of n exponential draws
dat <- data.frame(distribution = factor("Exponential"), iid = rexp(n, lambda))

# Mean of that sample and the theoretical mean 1/lambda
means <- data.frame(Means = factor(c("Sample", "Theoretical")), values = c(mean(dat$iid), 1/lambda))

ggplot(dat, aes(x = iid)) +
    geom_histogram(aes(y = after_stat(density)),  # Histogram with density instead of count on y-axis
                   binwidth = .2,
                   colour = "black", fill = "white") +
    geom_density(alpha = .2, fill = "#FF6666") +  # Overlay with transparent density plot
    geom_vline(data = means, aes(xintercept = values, color = Means), linewidth = 1, show.legend = TRUE)

The theoretical mean is \(1/\lambda=5\); the mean of this sample of 40 draws is 4.5592522, as shown in the figure.

Sample Variance versus Theoretical Variance

ggplot(dat, aes(x = distribution)) +
    geom_point(aes(y = sd(iid))) +            # sample standard deviation
    geom_point(aes(x = "Normal", y = 5)) +    # theoretical value 1/lambda = 5
    ylim(0, 6) +
    labs(x = "Distribution Type", y = "Standard Deviation")

The theoretical standard deviation is \(1/\lambda=5\) (variance 25); the standard deviation of this sample of 40 draws is 4.1378964 (variance about 17.12).
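The comparison above uses a single sample of 40 draws. A related check, sketched here on the assumption that the spread of the sample means is also of interest, compares the variance of 1000 simulated means with the value \(1/(n\lambda^2) = 0.625\) that follows from the variance of a mean of independent draws. The names `sim_means`, `var_sim` and `var_theory` are illustrative, and the seed is reused only for reproducibility.

```r
lambda <- 0.2
n <- 40
set.seed(1234)

# 1000 means of n exponential draws each
sim_means <- replicate(1000, mean(rexp(n, rate = lambda)))

var_sim <- var(sim_means)           # variance observed across the simulated means
var_theory <- (1 / lambda)^2 / n    # (1/lambda)^2 / n = 25 / 40 = 0.625

c(simulated = var_sim, theoretical = var_theory)
```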

Distribution

To compare the distribution of the means of exponential random variables with the normal distribution, the following box plot is used. A box plot is a convenient way of graphically depicting groups of numerical data through their quartiles.

dat <- data.frame(distribution = factor(rep(c("Exponential", "Normal"), each = 1000)), iid = c(ed, nd))

ggplot(dat, aes(x = distribution, y = iid, fill = distribution)) + geom_boxplot() + guides(fill = "none") + coord_flip()

As the figure shows, the distribution of the 1000 means of sets of 40 exponential random variables matches fairly well that of the 1000 means of sets of 40 normal random variables. Hence it may be concluded that the sample means of the exponential distribution behave as the CLT predicts.
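A numerical complement to the box plot, sketched here as one possible check rather than as part of the original analysis: standardize the simulated means with the theoretical mean \(1/\lambda\) and standard error \((1/\lambda)/\sqrt{n}\), then count how many fall within the central 95% interval of a standard normal distribution. The names `sim_means`, `z` and `share_within` are illustrative.

```r
lambda <- 0.2
n <- 40
set.seed(1234)

# 1000 means of n exponential draws each
sim_means <- replicate(1000, mean(rexp(n, rate = lambda)))

# Standardize with the theoretical mean and standard error of the sample mean
z <- (sim_means - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of standardized means inside the central 95% interval of N(0, 1)
share_within <- mean(abs(z) <= qnorm(0.975))
share_within
```

A share close to 0.95 is consistent with approximate normality, although it probes the distribution at a single cut-off only.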