This document compares the distribution of averages of exponential random variables with the normal distribution predicted by the central limit theorem. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. We set lambda = 0.2 for all simulations and investigate the distribution of averages of 40 exponentials over a thousand simulations.
We load ggplot2 and set the seed of the pseudo-random number generator so that the simulation is reproducible, then draw 1000 averages of 40 exponentials and plot their histogram.
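For reference, the central limit theorem statement being checked can be written as follows. The notation is introduced here for illustration and does not appear in the code: X_1, ..., X_n denote independent exponential draws with rate lambda, and \bar{X}_n is their average.

```latex
\[
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
\frac{\bar{X}_n - 1/\lambda}{(1/\lambda)/\sqrt{n}} \;\xrightarrow{d}\; N(0, 1)
\quad \text{as } n \to \infty .
\]
```

With n = 40 and lambda = 0.2, this suggests that the averages should be approximately normal and centered at 1/lambda = 5.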
library(ggplot2)
set.seed(19830)
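# 1000 simulated averages, each over 40 exponential draws with rate 0.2
data <- replicate(1000, mean(rexp(40, 0.2)))
df <- data.frame(mean = data)
# Histogram of the simulated averages (bins = 30 is the ggplot2 default, stated explicitly)
p <- ggplot(df, aes(x = mean)) + labs(x = "", y = "") + geom_histogram(bins = 30, colour = "black", fill = "white") + ggtitle("Distribution of Averages of 40 Exponentials")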
p
The theoretical mean of the exponential distribution, and hence of the averages, is
1/0.2
## [1] 5
The sample mean of the simulated data is
mean(data)
## [1] 4.992918
# Mark the sample mean (red) and the theoretical mean (blue) on the histogram
p + geom_vline(aes(xintercept=mean(data)), color="red", linetype="dashed", size=1)+
geom_vline(aes(xintercept=1/0.2), color="blue", linetype="dashed", size=1)
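The theoretical variance of the average of 40 exponentials is (1/lambda)^2/40; its square root, 1/(0.2*sqrt(40)), about 0.79, is the corresponding standard deviation.
(1/0.2)^2/40
## [1] 0.625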
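The variance of an average of independent draws follows from a short derivation. As a sketch, assume X_1, ..., X_n are independent exponentials with rate lambda (notation introduced here for illustration), each with variance 1/lambda^2:

```latex
\[
\operatorname{Var}(\bar{X}_n)
= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
= \frac{1}{n^2}\cdot n\cdot\frac{1}{\lambda^2}
= \frac{1}{n\lambda^2}
= \frac{1}{40 \cdot 0.2^2} = 0.625,
\qquad
\operatorname{sd}(\bar{X}_n) = \frac{1}{\lambda\sqrt{n}} \approx 0.7906 .
\]
```

So 0.625 is the variance of the average and 0.7906 is its standard deviation; the sample variance should be compared with the former.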
The sample variance of the simulated data is
var(data)
## [1] 0.6495278
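The theoretical and simulated quantities can also be collected side by side. The following is a minimal sketch that repeats the simulation with the same seed; the names lambda, n, n_sims, sample_means and comparison are illustrative and not part of the original code.

```r
# Illustrative names (assumptions): lambda, n, n_sims, sample_means, comparison
lambda <- 0.2   # rate parameter
n      <- 40    # exponentials per average
n_sims <- 1000  # number of simulated averages

set.seed(19830)
sample_means <- replicate(n_sims, mean(rexp(n, lambda)))

# Theoretical values for the average of n exponentials
theoretical_mean <- 1 / lambda              # 5
theoretical_var  <- (1 / lambda)^2 / n      # 0.625
theoretical_sd   <- sqrt(theoretical_var)   # about 0.7906

# Side-by-side table of theoretical and simulated quantities
comparison <- data.frame(
  quantity    = c("mean", "variance", "standard deviation"),
  theoretical = c(theoretical_mean, theoretical_var, theoretical_sd),
  simulated   = c(mean(sample_means), var(sample_means), sd(sample_means))
)
print(comparison)
```

Keeping the parameters in named variables makes it easy to rerun the comparison with a different rate or sample size.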
A normal Q-Q plot shows how close the distribution of the simulated averages is to a normal distribution. The normal distribution appears to be a good approximation to the distribution of the averages, as expected from the central limit theorem.
qqnorm(data)
qqline(data)
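A numerical check can complement the Q-Q plot. The sketch below standardizes the simulated averages with the theoretical mean and standard deviation, then compares them with the standard normal distribution. The names lambda, n, sample_means, z, coverage and probs are illustrative assumptions, and the simulation is repeated so that the block runs on its own.

```r
# Illustrative names (assumptions): lambda, n, sample_means, z, coverage, probs
lambda <- 0.2
n      <- 40

set.seed(19830)
sample_means <- replicate(1000, mean(rexp(n, lambda)))

# Standardize with the theoretical mean 1/lambda and
# the theoretical standard deviation (1/lambda)/sqrt(n)
z <- (sample_means - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Share of standardized averages within the central 95% normal range;
# a value near 0.95 is consistent with approximate normality
coverage <- mean(abs(z) <= qnorm(0.975))
print(coverage)

# Selected empirical quantiles of z next to standard normal quantiles
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
print(rbind(empirical = quantile(z, probs), normal = qnorm(probs)))
```

Because an average of exponentials is exactly gamma distributed, a mild right skew may still show up in the upper quantiles at n = 40.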