Overview

Investigate the exponential distribution in R and compare the distribution of its sample means with what the Central Limit Theorem predicts. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. Investigate the distribution of averages of 40 exponentials, using one thousand simulations.
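As a quick sanity check of the two facts stated above (mean and standard deviation both equal to 1/lambda), the following minimal sketch draws one large exponential sample and compares its summary statistics with 1/lambda. The seed and the sample size of 100,000 are assumptions chosen for illustration only; they are not part of the analysis that follows.

```r
# Illustrative check only: the seed and sample size are assumptions,
# not values used elsewhere in this report.
set.seed(1)
lambda <- 0.2

# One large sample from the exponential distribution with rate lambda
x <- rexp(100000, lambda)

# Both values should be close to 1/lambda = 5
check <- c(sample_mean = mean(x), sample_sd = sd(x), theoretical = 1 / lambda)
print(check)
```

With a sample this large, both statistics should land close to 5, which is the value that the later sections use as the theoretical mean.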

Load required libraries

library(ggplot2)


Initialize variables and simulate data

set.seed(10)

lambda <- 0.2

n <- 40

sims <- 1000

# Perform 1000 simulations, each drawing n = 40 exponentials
# (the result is an n x sims matrix, one simulation per column)
exponentials <- replicate(sims, rexp(n, lambda))

# Calculate the mean of each simulation (each column)
means <- colMeans(exponentials)


Compare sample mean to theoretical mean

Calculate the mean of the simulated sample means.

dist_mean <- mean(means)
dist_mean
## [1] 5.04506

Calculate the theoretical mean via the equation 1/lambda.

theo_mean <- 1/lambda
theo_mean
## [1] 5

The mean of the simulated sample means (5.045) is a good approximation of the theoretical mean (5); the two differ by less than 1%. Both appear at the center of the following histogram of the simulated sample means. The red line represents the theoretical mean, while the yellow line represents the mean of the simulated sample means.
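The histogram described above can be drawn with ggplot2 along the following lines. This is a sketch rather than the exact plotting code behind the original figure: the bin width, fill colours, line types, and axis labels are assumptions, and the simulation is repeated inside the block so that it runs on its own.

```r
library(ggplot2)

# Repeat the simulation so the block is self-contained
set.seed(10)
lambda <- 0.2
n <- 40
sims <- 1000
means <- colMeans(replicate(sims, rexp(n, lambda)))

# Histogram of the sample means with both means marked
# (bin width, colours and labels are illustrative choices)
p <- ggplot(data.frame(means = means), aes(x = means)) +
  geom_histogram(binwidth = 0.2, fill = "grey80", colour = "black") +
  geom_vline(xintercept = 1 / lambda, colour = "red") +
  geom_vline(xintercept = mean(means), colour = "yellow", linetype = "dashed") +
  labs(x = "Mean of 40 exponentials", y = "Count",
       title = "Simulated sample means")
print(p)
```

Because the two means are so close, the dashed yellow line sits almost on top of the red line; the dashed line type is used here only so that both remain visible.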


Compare sample variance to theoretical variance

Calculate the standard deviation and variance of the simulated sample means.

dist_stdev <- sd(means)
dist_var <- var(means)

dist_stdev
## [1] 0.7982821
dist_var
## [1] 0.6372544

Calculate the theoretical standard deviation of the sample mean via the equation (1/lambda)/sqrt(n); the theoretical variance is its square.

theo_stdev <- (1/lambda)/sqrt(n)
theo_var <- theo_stdev^2

theo_stdev
## [1] 0.7905694
theo_var
## [1] 0.625

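The values calculated from the simulated sample means closely match the theoretical values: the simulated variance (0.637) is about 2% above the theoretical variance (0.625), and the simulated standard deviation (0.798) is about 1% above the theoretical one (0.791).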
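To put a number on how close the simulated and theoretical values are, one option is to compute the relative difference for each quantity. The sketch below repeats the simulation so that it is self-contained; the names rel_diff_var and rel_diff_sd are introduced here for illustration and do not appear elsewhere in this report.

```r
# Repeat the simulation so the block is self-contained
set.seed(10)
lambda <- 0.2
n <- 40
sims <- 1000
means <- colMeans(replicate(sims, rexp(n, lambda)))

# Theoretical variance and standard deviation of the mean of n exponentials
theo_var <- (1 / lambda)^2 / n
theo_stdev <- sqrt(theo_var)

# Relative differences (hypothetical helper names, for illustration)
rel_diff_var <- (var(means) - theo_var) / theo_var
rel_diff_sd <- (sd(means) - theo_stdev) / theo_stdev

# Express both as percentages
round(100 * c(variance = rel_diff_var, stdev = rel_diff_sd), 2)
```

The relative difference in the variance is roughly twice that of the standard deviation, which is expected because the variance is the square of the standard deviation.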

The distribution of sample means

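Plotting the density of the sample means from the simulated exponential distributions (represented by the red line) against a normal density (represented by the yellow line) shows that the two are very close. Additional simulations would smooth the estimated density, but would not make it exactly normal: the mean of 40 exponentials follows a slightly right-skewed gamma distribution, and it is increasing the sample size n, not the number of simulations, that brings the distribution closer to normal.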
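A comparison of this kind can be sketched with ggplot2 as follows. The original plotting code is not shown in this report, so the choices here are assumptions: the normal curve uses the theoretical mean 1/lambda and standard deviation (1/lambda)/sqrt(n), and the density estimate uses the default bandwidth of geom_density.

```r
library(ggplot2)

# Repeat the simulation so the block is self-contained
set.seed(10)
lambda <- 0.2
n <- 40
sims <- 1000
means <- colMeans(replicate(sims, rexp(n, lambda)))

# Red: kernel density estimate of the simulated sample means
# Yellow: normal density with the theoretical mean and standard deviation
p <- ggplot(data.frame(means = means), aes(x = means)) +
  geom_density(colour = "red") +
  stat_function(fun = dnorm,
                args = list(mean = 1 / lambda, sd = (1 / lambda) / sqrt(n)),
                colour = "yellow") +
  labs(x = "Mean of 40 exponentials", y = "Density",
       title = "Simulated sample means vs. normal density")
print(p)
```

A normal quantile-quantile plot, for example qqnorm(means) followed by qqline(means), is another way to inspect the same question and tends to make the mild right skew easier to see.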