Overview

We investigate the exponential distribution and compare it with the Central Limit Theorem. Using simulation and accompanying explanatory text, we examine the properties of the distribution of the mean of 40 exponentials.

Setup

We first set up the parameters as follows:

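set.seed(101)  # to allow reproducibility
lambda <- 0.2  # rate parameter of the exponential distribution
sim <- 1000    # number of simulations
n <- 40        # number of exponential values per simulation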

With these parameters, we can create a matrix with 1000 columns (the number of simulations) and 40 rows (the number of exponential values per simulation). After that, we can calculate the sample mean and the theoretical mean.
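The step described above can be sketched as follows. This is a minimal sketch, not necessarily the code used to produce the figures in this report: the calls to rexp, matrix and colMeans are assumptions about how the matrix was built, and the name simulation is hypothetical. The names simulation.mean, sample.mean and theoretical.mean are the ones used by the plotting code later on. Depending on how the draws are generated, the exact values may differ slightly from those reported.

```r
set.seed(101)  # to allow reproducibility
lambda <- 0.2  # rate parameter
sim <- 1000    # number of simulations
n <- 40        # number of exponential values per simulation

# Assumed layout: one simulation per column, n exponential draws per column
simulation <- matrix(rexp(n * sim, rate = lambda), nrow = n, ncol = sim)

# Mean of each simulation (one value per column)
simulation.mean <- colMeans(simulation)

# Average of the 1000 simulated means
sample.mean <- mean(simulation.mean)

# The mean of an exponential distribution is 1 / lambda
theoretical.mean <- 1 / lambda
```

Storing one simulation per column makes colMeans return exactly one mean per simulation, which is the vector the histogram and Q-Q plot work on.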

Sample Mean vs Theoretical Mean

We can now plot the histogram of the simulation means and compare the sample mean to the theoretical mean.

hist(simulation.mean, col="blue",main="Sample Mean vs Theoretical Mean")
abline(v=sample.mean,col="red",lwd=4)
abline(v=theoretical.mean,col="green",lwd=4)
legend("topright", c("sample mean","theoretical mean"), col=c("red","green"), lwd=5)

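From the histogram, we can see that the theoretical mean (5) and the sample mean (5.0126026) are close. This is consistent with the Central Limit Theorem, under which the distribution of sample means is centered at the population mean 1/lambda.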

Sample Variance vs Theoretical Variance

We can also calculate the sample and theoretical variance using the formulas below.

sample.variance <-round(var(simulation.mean),4)
theoretical.variance <- round(((1/lambda)/sqrt(n))^2,4)

With this we can confirm that the sample variance (0.5985) and the theoretical variance (0.625) are close.
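The theoretical value can be checked by hand. The standard deviation of an exponential distribution is 1/lambda, so the variance of the mean of n independent draws is (1/lambda)^2 / n. The short sketch below works through that arithmetic; the names sigma, variance.of.mean and sd.of.mean are hypothetical and used only for illustration.

```r
lambda <- 0.2
n <- 40

# Standard deviation of a single exponential draw: 1 / 0.2 = 5
sigma <- 1 / lambda

# Variance of the mean of n independent draws: 25 / 40 = 0.625
variance.of.mean <- sigma^2 / n

# Equivalent form used above: squared standard error of the mean
sd.of.mean <- sigma / sqrt(n)
```

Both forms give the same number, since (sigma / sqrt(n))^2 equals sigma^2 / n.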

Is the distribution of means normal?

Let’s check if the distribution of the mean is approximately normal as stated by the Central Limit Theorem.

To do so, we can use a Q-Q plot, which compares the theoretical quantiles of the chosen distribution (here the normal distribution) against the quantiles of the observed sample (the simulation means).

qqnorm(simulation.mean)
qqline(simulation.mean)

We can see on the Q-Q plot that the points follow the line and that the deviations are minimal, which is a strong indication that the distribution of the means is approximately normal.
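As a numeric companion to the visual check, one option is to compute the correlation between the sorted simulation means and the corresponding normal quantiles; a value close to 1 means the points lie close to a straight line. This is a sketch of a different technique from the plot above (a probability plot correlation), and it regenerates the simulation means under the assumption that they come from rexp and colMeans. The names theoretical.quantiles and qq.correlation are hypothetical.

```r
set.seed(101)
lambda <- 0.2
sim <- 1000
n <- 40

# Assumed construction of the simulation means (one simulation per column)
simulation.mean <- colMeans(matrix(rexp(n * sim, rate = lambda), nrow = n))

# Normal quantiles at the plotting positions that qqnorm also uses
theoretical.quantiles <- qnorm(ppoints(sim))

# Correlation between theoretical and observed quantiles
qq.correlation <- cor(theoretical.quantiles, sort(simulation.mean))
print(qq.correlation)
```

A mild departure in the tails is still expected, because the mean of 40 exponentials keeps a small right skew.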