Overview

The exponential distribution, rexp(n, lambda), is compared with central limit theorem. The mean and standard deviation of exponential distribution is 1/lambda. Setting lambda = 0.2 for all of the simulations, we will investigate the distribution of averages of 40 exponentials, in a thousand simulations.

Simulations

Setting seed for replication, mean /sd (lambda =0.2), number of exponentials (n = 40), and number of simulations (nosim = 1000) are defined. Simulation is generated thru simul and thereafter, mean is calculated as simul_mean.

set.seed(1)
lambda <- 0.2
n <- 40
nosim <- 1000

simul <- replicate(nosim, rexp(n, lambda))
simul_mean <- apply(simul, 2, mean)

Sample Mean versus Theoretical Mean

Thru calculating both theoritical and analytical means and comparing them in the histogram as below, we find that they are very close and almost equal.

analytical_mean <- mean(simul_mean)
analytical_mean
## [1] 4.990025
theoritical_mean <- 1/lambda
theoritical_mean
## [1] 5
hist(simul_mean, xlab = "mean", main = "Exponential Distribution")
abline(v = analytical_mean, col = "blue", lwd = 2)
abline(v = theoritical_mean, col = "red", lwd = 2)

Figure-1: distribution of the means of the simulation, which roughly looks to follows normal distribution. The analytical mean (colored blue) and the theoritical mean (colored red) are approximately coinciding as the red vertical line at x = 5.

Therefore, the centre of distribution is almost identical with the theoritical centre of distribution.

Sample Variance versus Theoretical Variance

The sample and the theoritical variances are found very close at around 0.62.

analytical_sd <- sd(simul_mean)
analytical_sd
## [1] 0.7817394
analytical_var <- sd(simul_mean)^2
analytical_var
## [1] 0.6111165
theoritical_sd <- 1/lambda/sqrt(n)
theoritical_sd
## [1] 0.7905694
theoritical_var <- (1/lambda/sqrt(n))^2
theoritical_var
## [1] 0.625

Distribution - whether Normal?

creating a normal distribution thru dnorm formula and plotting it as the ‘red’ line over the sample distributions of exponentials ( the blue bars in the histogram), it shows that these distributions are very very closely approximated.

sample <- seq(min(simul_mean), max(simul_mean), length = nosim)
sample_normal <- dnorm(sample, mean = 1/lambda, sd = (1/lambda/sqrt(n)))

hist(simul_mean, breaks = n, prob = T, col = "lightblue", xlab = "means", ylab = "density", main = "Theoritical Normal and Actual Distribution")
lines(sample, sample_normal, col = "red")

Figure-2: Comparison of simul_means (our sample distribution of means of exponentials) with a normal distribution shows that they look very similar.

qqnorm(simul_mean)
qqline(simul_mean, col = 2)

Figure -3: as also evident from the NOrmal Q-Q plot, the sample distribution of means of exponentials is closely approximated to a normal distribution.