In this project you will investigate the exponential distribution in R and compare the behavior of its sample means with what the Central Limit Theorem predicts. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and its standard deviation is also 1/lambda. Set lambda = 0.2 for all of the simulations. You will investigate the distribution of averages of 40 exponentials, which requires a thousand simulations.
Illustrate via simulation and associated explanatory text the properties of the distribution of the mean of 40 exponentials. You should:
1. Show the sample mean and compare it to the theoretical mean of the distribution.
2. Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
3. Show that the distribution is approximately normal.
library(ggplot2)
# lambda is 0.2
lambda = 0.2
# we will be using 40 exponentials
n = 40
# we will be running 1000 simulations (nsims holds the simulation indices 1 to 1000)
nsims = 1:1000
# set a seed to reproduce the data
set.seed(876)
# gather the means: each simulation draws n exponentials and averages them
means <- data.frame(x = sapply(nsims, function(x) {mean(rexp(n, lambda))}))
# let's take a look at the first few means
head(means)
##          x
## 1 4.129788
## 2 5.302187
## 3 4.699978
## 4 4.332842
## 5 5.726389
## 6 4.387737
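The code that produced the next six values is not shown in this report. The sketch below is a reconstruction under stated assumptions, not the original code: it computes the theoretical and sample mean, standard deviation and variance of the simulated averages and prints them in the same order as the output that follows. The variable names (theoretical_mean, sample_mean and so on) are hypothetical. The theoretical standard deviation of the mean of n exponentials is (1/lambda)/sqrt(n), and the theoretical variance is its square.

```r
# Reconstruction (assumption): the original code for these values is not shown.
# Rebuild the simulated means so the block is self-contained.
lambda <- 0.2
n <- 40
set.seed(876)
means <- data.frame(x = sapply(1:1000, function(i) mean(rexp(n, lambda))))

# Theoretical values for the mean of n exponentials
theoretical_mean <- 1 / lambda             # 1 / 0.2 = 5
theoretical_sd <- (1 / lambda) / sqrt(n)   # 5 / sqrt(40), about 0.7906
theoretical_var <- theoretical_sd^2        # 25 / 40 = 0.625

# Sample values from the 1000 simulated means
sample_mean <- mean(means$x)
sample_sd <- sd(means$x)
sample_var <- var(means$x)

# Print in the order: theoretical mean, sample mean,
# theoretical sd, theoretical variance, sample sd, sample variance
print(theoretical_mean)
print(sample_mean)
print(theoretical_sd)
print(theoretical_var)
print(sample_sd)
print(sample_var)
```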
## [1] 5
## [1] 4.991893
## [1] 0.7905694
## [1] 0.625
## [1] 0.78538
## [1] 0.6168217
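The sample mean of the simulated averages, 4.991893, is very close to the theoretical mean of 5, and the sample variance, 0.6168217, is close to the theoretical variance of 0.625 (standard deviations 0.78538 and 0.7905694 respectively). The shape of the simulated data, shown in blue on the graph, closely follows the normal curve shown in red, which is what the Central Limit Theorem predicts for the distribution of averages of 40 exponentials.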
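The graph itself and the code that drew it are not included in this report. The following is a minimal sketch of how such a plot could be produced with ggplot2, assuming the same simulation setup as above. The bin width, fill colour, title and axis labels are assumptions chosen for illustration, not the original settings.

```r
# Sketch (assumption): the original plotting code is not shown.
library(ggplot2)

# Rebuild the simulated means so the block is self-contained.
lambda <- 0.2
n <- 40
set.seed(876)
means <- data.frame(x = sapply(1:1000, function(i) mean(rexp(n, lambda))))

# Histogram of the simulated means on the density scale (blue),
# overlaid with the normal density the CLT predicts (red).
p <- ggplot(means, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 0.2, fill = "lightblue", colour = "blue") +
  stat_function(fun = dnorm,
                args = list(mean = 1 / lambda, sd = (1 / lambda) / sqrt(n)),
                colour = "red") +
  labs(title = "Means of 40 exponentials (1000 simulations)",
       x = "Sample mean", y = "Density")

print(p)
```

Plotting the histogram on the density scale rather than as raw counts keeps it on the same vertical scale as the normal curve, so the two shapes can be compared directly.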