This report tests whether simulated values agree with the expected values predicted by the Central Limit Theorem (CLT), through the following experiment:

  1. Generate 1000 simulated samples of i.i.d. exponential observations, each sample having 40 observations (n=40) and a rate of 0.2 (\(\lambda\)=0.2).
  2. Reduce each sample to its mean, yielding a vector of 1000 averages, from which we extract the sample mean and variance.

As the CLT states, the distribution of these 1000 averages is approximately \(N(\mu, \sigma^2 / n)\). For an exponential distribution, \(\mu=\sigma=\lambda^{-1}\).
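As a quick worked check, substituting the parameters of this experiment (\(\lambda=0.2\), \(n=40\)) into those formulas gives the theoretical values that the simulation should approach. The rounded standard deviation is included only for orientation:

\[
\mu = \frac{1}{\lambda} = \frac{1}{0.2} = 5, \qquad
\frac{\sigma^2}{n} = \frac{(1/\lambda)^2}{n} = \frac{25}{40} = 0.625, \qquad
\frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.79
\]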

# Parameters and theoretical values
simulations <- 1000
n           <- 40
lambda      <- 0.2
theorMean   <- 1 / lambda          # mean of the exponential distribution
theorVar    <- (1 / lambda)^2 / n  # variance of the mean of n observations
# Simulation
set.seed(0)                        # ensure reproducibility
sample      <- replicate(simulations, mean(rexp(n, lambda)))  # vector of 1000 averages
# Results
sampleMean  <- mean(sample)
sampleVar   <- var(sample)
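One way to put the theoretical and simulated values next to each other is a small comparison table. The sketch below is self-contained and re-runs the simulation with the same seed and parameters. The names `means`, `comparison`, and `abs_diff` are illustrative and not part of the original analysis.

```r
# Self-contained re-run of the simulation with the same settings as above
simulations <- 1000
n           <- 40
lambda      <- 0.2
set.seed(0)
means <- replicate(simulations, mean(rexp(n, lambda)))  # 'means' is an illustrative name

# Side-by-side comparison of theoretical and simulated values
comparison <- data.frame(
  quantity    = c("mean", "variance"),
  theoretical = c(1 / lambda, (1 / lambda)^2 / n),
  simulated   = c(mean(means), var(means))
)
comparison$abs_diff <- abs(comparison$simulated - comparison$theoretical)
print(comparison)
```

The `abs_diff` column gives a direct numeric reading of how close each simulated value is to its theoretical counterpart.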

Expected Value Visualization

The variance is visualized as one standard deviation on either side of the mean (dashed lines), with the mean itself drawn as a solid line.

These results (points 1 & 2 of the assignment) already show how closely the sample values approximate the theoretical ones. To further verify the similarity of the distributions (and to address point 3), a histogram of the sample is overlaid with the theoretical normal density. A normal quantile-quantile plot of the sample is also displayed.
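As a numeric complement to the plots, one could compare how many of the averages fall within 1, 2, and 3 theoretical standard deviations of the theoretical mean against the proportions a normal distribution would give. This is only a rough sketch and was not part of the original analysis. It re-runs the simulation so that it is self-contained, and the names `means`, `theorSd`, and `coverage` are illustrative.

```r
# Self-contained re-run of the simulation with the same settings as above
simulations <- 1000
n           <- 40
lambda      <- 0.2
set.seed(0)
means <- replicate(simulations, mean(rexp(n, lambda)))

theorMean <- 1 / lambda               # theoretical mean of the averages
theorSd   <- (1 / lambda) / sqrt(n)   # theoretical standard deviation of the averages

# Share of averages within k standard deviations, next to the normal reference
k <- 1:3
coverage <- data.frame(
  k         = k,
  normal    = pnorm(k) - pnorm(-k),
  simulated = sapply(k, function(j) mean(abs(means - theorMean) <= j * theorSd))
)
print(coverage)
```

If the normal approximation is reasonable, the `simulated` column should sit close to the `normal` column. Some discrepancy is expected, since the exponential distribution is skewed and n=40 is finite.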

Using R’s random number generation functions, we have obtained simulation results that are consistent with the Central Limit Theorem. A simulation cannot prove the theorem, but the distribution of 1000 averages of exponential observations agrees closely with what the CLT predicts. Further research on randomization algorithms remains to be done.

Code for plots

# Two histograms side by side: theoretical vs. sample mean and standard deviation
par(mfrow = c(1, 2), mar = c(2, 3.8, 2, 1), cex.main = .8)

hist(sample, freq = FALSE, breaks = 40, main = "Theoretical Values",
     col = "aquamarine3", xlab = NULL, border = "aquamarine3")
abline(v = theorMean, col = "gold", lwd = 4)                                       # mean (solid)
abline(v = theorMean + c(-1, 1) * sqrt(theorVar), col = "gold", lwd = 4, lty = 2)  # mean +/- 1 sd (dashed)

hist(sample, freq = FALSE, breaks = 40, main = "Sample Values",
     col = "aquamarine3", xlab = NULL, border = "aquamarine3")
abline(v = sampleMean, col = "gold", lwd = 4)                                        # mean (solid)
abline(v = sampleMean + c(-1, 1) * sqrt(sampleVar), col = "gold", lwd = 4, lty = 2)  # mean +/- 1 sd (dashed)
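# Normal overlay and Q-Q plot, stacked vertically
par(mfrow = c(2, 1), mar = c(4, 5, 3, 3))

# Theoretical normal density N(mu, sigma^2 / n) over the range of the sample
xnorm <- seq(min(sample), max(sample), length.out = 100)
ynorm <- dnorm(xnorm, mean = 1 / lambda, sd = (1 / lambda) / sqrt(n))

hist(sample, breaks = 40, freq = FALSE, col = "aquamarine3", border = "aquamarine3",
     xlab = "Average Values", main = "Normal Overlay")
lines(xnorm, ynorm, col = "red", lty = 5, lwd = 3)

# Quantile-quantile plot of the sample against a normal distribution
qqnorm(sample)
qqline(sample, col = 2)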