Investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. Investigate the distribution of averages of 40 exponentials, using a thousand simulations.
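As a quick sanity check on the claim that both the mean and the standard deviation equal 1/lambda, the sketch below draws one large sample and compares its summary statistics with 1/lambda. The sample size of 100000 and the seed are arbitrary choices for this illustration, not part of the assignment.

```r
# Illustrative check only: sample size and seed are arbitrary assumptions
set.seed(1)
lambda <- 0.2
x <- rexp(100000, lambda)

# Both the sample mean and the sample sd should be near 1/lambda = 5
check <- c(sample_mean = mean(x), sample_sd = sd(x), theoretical = 1 / lambda)
check
```

With a sample this large, both statistics should land close to 5, although the exact digits depend on the seed.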
library(ggplot2)
set.seed(10)
lambda <- 0.2
n <- 40
sims <- 1000
# Run 1000 simulations, each drawing 40 exponentials; the result is a
# 40 x 1000 matrix with one simulation per column
exponentials <- replicate(sims, rexp(n, lambda))
# Calculate the mean of each simulation (each column)
means <- apply(exponentials, 2, mean)
Calculate the mean of the simulated distribution means.
dist_mean <- mean(means)
dist_mean
## [1] 5.04506
Calculate theoretical mean via the equation: 1/lambda
theo_mean <- 1/lambda
theo_mean
## [1] 5
The mean of the simulated distribution means (5.045) is within about 1% of the theoretical mean (5). Both appear at the center of the following histogram of the simulated distribution means. The red line represents the theoretical mean, while the yellow line represents the mean of the simulated distribution means.
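The plotting code for the histogram is not shown in this text. A minimal ggplot2 sketch that would produce a comparable figure is given here. The bin width, fill colour, and axis labels are assumptions for illustration; only the red and yellow line colours come from the description above.

```r
library(ggplot2)

# Recreate the simulated means with the same seed and parameters as above
set.seed(10)
lambda <- 0.2
n <- 40
sims <- 1000
means <- apply(replicate(sims, rexp(n, lambda)), 2, mean)

# binwidth = 0.2 and the grey fill are arbitrary choices for this sketch
hist_plot <- ggplot(data.frame(means = means), aes(x = means)) +
  geom_histogram(binwidth = 0.2, fill = "grey80", colour = "black") +
  geom_vline(xintercept = 1 / lambda, colour = "red") +       # theoretical mean
  geom_vline(xintercept = mean(means), colour = "yellow") +   # simulated mean
  labs(x = "Mean of 40 exponentials", y = "Count",
       title = "Distribution of simulated sample means")
hist_plot
```

Because the two means differ by less than 0.05, the two vertical lines nearly overlap at this scale.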
Calculate the standard deviation and variance of the simulated distribution means.
dist_stdev <- sd(means)
dist_var <- var(means)
dist_stdev
## [1] 0.7982821
dist_var
## [1] 0.6372544
Calculate the theoretical standard deviation via the equation (1/lambda)/sqrt(n); the theoretical variance is its square.
theo_stdev <- (1/lambda)/sqrt(n)
theo_var <- theo_stdev^2
theo_stdev
## [1] 0.7905694
theo_var
## [1] 0.625
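The values calculated from the simulated distribution means closely match the theoretical values: the standard deviation is 0.798 against a theoretical 0.791 (about 1% higher), and the variance is 0.637 against a theoretical 0.625 (about 2% higher).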
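To illustrate how the formula (1/lambda)/sqrt(n) behaves, the sketch below repeats the simulation for a few sample sizes and compares the simulated standard deviation of the means with the theoretical value. The sizes 10 and 160 are hypothetical additions for illustration; only n = 40 is used in this report.

```r
# Illustrative sketch: sample sizes other than 40 are assumptions
set.seed(10)
lambda <- 0.2
sims <- 1000
sizes <- c(10, 40, 160)

# Standard deviation of 1000 simulated means for each sample size
sim_sd <- sapply(sizes, function(n) sd(replicate(sims, mean(rexp(n, lambda)))))
# Theoretical standard deviation of the mean: (1/lambda)/sqrt(n)
theo_sd <- (1 / lambda) / sqrt(sizes)

comparison <- data.frame(n = sizes, simulated_sd = sim_sd, theoretical_sd = theo_sd)
comparison
```

Each fourfold increase in n should halve the theoretical standard deviation, and the simulated values should track it to within sampling error.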
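As shown in the above plot, the distribution of sample means from simulated exponential distributions (represented by the red line) is very close to a normal distribution (represented by the yellow line). Additional simulations would smooth the red line, but it would settle on the exact distribution of the mean of 40 exponentials, which is a gamma distribution with a slight right skew. The agreement with the normal curve improves as the sample size n grows, which is what the Central Limit Theorem describes.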
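The code for this comparison plot is also not shown in the text. One way to draw it, assuming the yellow normal curve uses the theoretical mean 1/lambda and standard deviation (1/lambda)/sqrt(n), is sketched here. That choice of parameters is an assumption; the original plot may have used the simulated values instead.

```r
library(ggplot2)

# Recreate the simulated means with the same seed and parameters as above
set.seed(10)
lambda <- 0.2
n <- 40
sims <- 1000
means <- apply(replicate(sims, rexp(n, lambda)), 2, mean)

# Assumed parameters of the reference normal curve
norm_mean <- 1 / lambda
norm_sd <- (1 / lambda) / sqrt(n)

dens_plot <- ggplot(data.frame(means = means), aes(x = means)) +
  geom_density(colour = "red") +                       # density of simulated means
  stat_function(fun = dnorm,
                args = list(mean = norm_mean, sd = norm_sd),
                colour = "yellow") +                   # normal reference curve
  labs(x = "Mean of 40 exponentials", y = "Density")
dens_plot
```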