In this project, we simulate the exponential distribution and use it to illustrate the central limit theorem. The sampling distribution of the sample mean is centered at the theoretical mean of the distribution, its variance is \(\sigma^2/n\), and its shape is approximately normal when the sample size n is large enough.
The exponential distribution can be simulated in R with rexp(n, lambda), where n is the number of values to draw and lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and the standard deviation is also 1/lambda. With lambda = 0.2, the theoretical mean and standard deviation are both 5.
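As a quick check of these two values, the sketch below draws one large sample and compares its mean and standard deviation with 1/lambda. The seed, the sample size of 100000, and the variable names are assumptions chosen for illustration, not part of the original analysis.

```r
set.seed(123)            # arbitrary seed (assumption), only for reproducibility
lambda <- 0.2
x_big <- rexp(100000, rate = lambda)   # one large sample from Exp(rate = 0.2)

mean(x_big)   # should be close to 1 / lambda = 5
sd(x_big)     # should also be close to 1 / lambda = 5
```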
We will perform one thousand simulations, each with sample size n = 40 and lambda = 0.2, and show that the mean of the sampling distribution is close to 5 while its variance is close to \(\sigma^2/n\).
We compute the mean of 40 draws from an exponential distribution with mean 5 (lambda = 0.2) using mean(rexp(40, 0.2)), and repeat this 1000 times. We then draw a histogram of the 1000 sample means and mark their mean with a vertical line.
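# Collect 1000 sample means, each computed from 40 exponential draws with rate 0.2
mns <- replicate(1000, mean(rexp(40, 0.2)))
hist(mns, xlab = "Sample mean", main = "Distribution of sample means")
# Mark the mean of the simulated sample means
abline(v = mean(mns), col = "red", lwd = 2)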
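The same simulation can also be written in vectorized form, which avoids repeated calls to rexp. This is a minimal sketch, not the code used to produce the results in this report: the seed and the names n_sims, draws, and mns_alt are assumptions introduced here.

```r
set.seed(1)      # arbitrary seed (assumption), only for reproducibility
n_sims <- 1000   # number of simulated samples
n      <- 40     # size of each sample
lambda <- 0.2    # rate parameter

# One row per simulated sample, one column per draw
draws <- matrix(rexp(n_sims * n, rate = lambda), nrow = n_sims, ncol = n)

# Row means give the 1000 sample means in a single call
mns_alt <- rowMeans(draws)

length(mns_alt)
mean(mns_alt)
```

Setting a seed makes the numbers repeatable from run to run; without one, the mean and variance will differ slightly each time the simulation is executed.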
The theoretical mean is 5. The mean of the simulated sample means, computed below, is close to 5.
mean(mns)
## [1] 5.013422
The theoretical standard deviation is 5, so the theoretical variance is \(\sigma^2=25\). By theory, the variance of the sampling distribution of the mean is \(\sigma^2/n\), which for n = 40 is 25/40. The variance of the simulated sample means, computed below, is approximately equal to this value.
var(mns)
## [1] 0.6661614
5^2/40
## [1] 0.625
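The \(\sigma^2/n\) formula follows in one line, assuming the n draws \(X_1, \dots, X_n\) are independent and share the variance \(\sigma^2\). The symbols \(X_i\) and \(\bar{X}\) (the sample mean) are notation introduced here for the derivation.

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{n\,\sigma^2}{n^2}
  = \frac{\sigma^2}{n}
\]
```

With \(\sigma = 5\) and \(n = 40\), this gives \(25/40 = 0.625\), the value computed above.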
The central limit theorem says that the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size. We compare the histogram of the sample means with a normal density that has the same mean and standard deviation. The close agreement between the two supports the claim that the sampling distribution is approximately normal.
hist(mns, xlab = "Sample mean", main = "Comparison with normal distribution", probability = TRUE)
# curve() evaluates the expression over the plot's x-range, so no separate x vector is needed
curve(dnorm(x, mean = mean(mns), sd = sd(mns)), col = "blue", lwd = 2, add = TRUE)
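A visual comparison can be supplemented with a simple numeric check. The sketch below standardizes the sample means with the theoretical mean 5 and standard error 5/sqrt(40), then measures what fraction fall within 1.96 standard errors; under normality this would be about 95%. It also draws a normal Q-Q plot. The seed and the names mns_chk, z, and coverage are assumptions introduced here, and the simulation is regenerated so that the block runs on its own.

```r
set.seed(2)   # arbitrary seed (assumption), only for reproducibility
mns_chk <- replicate(1000, mean(rexp(40, 0.2)))

# Standardize using the theoretical mean and standard error
z <- (mns_chk - 5) / (5 / sqrt(40))

# Fraction of standardized means inside (-1.96, 1.96); about 0.95 if normal
coverage <- mean(abs(z) < 1.96)
coverage

# Points near the reference line indicate approximate normality
qqnorm(z)
qqline(z, col = "blue", lwd = 2)
```

Because the exponential distribution is right-skewed, some curvature in the upper tail of the Q-Q plot would not be surprising at n = 40; the approximation is expected to improve as n grows.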