In this project, I investigate the exponential distribution in R and compare the distribution of its sample means with what the Central Limit Theorem predicts. The exponential distribution is simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. I set lambda = 0.2 for all of the simulations.
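As a worked statement of what the Central Limit Theorem predicts here, the following is a sketch assuming independent draws and the sample size of 40 used in the simulations. The symbols X_i, n, and \bar{X}_n are notation introduced for this illustration only.

```latex
\[
X_1, \dots, X_n \overset{\text{iid}}{\sim} \operatorname{Exp}(\lambda), \qquad
\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i
\]
\[
\mathbb{E}[\bar{X}_n] = \frac{1}{\lambda}, \qquad
\operatorname{sd}(\bar{X}_n) = \frac{1}{\lambda \sqrt{n}}, \qquad
\bar{X}_n \;\overset{\text{approx.}}{\sim}\; \mathcal{N}\!\left(\frac{1}{\lambda},\, \frac{1}{\lambda^{2} n}\right)
\]
\[
\lambda = 0.2,\; n = 40: \qquad
\mathbb{E}[\bar{X}_n] = 5, \qquad
\operatorname{sd}(\bar{X}_n) = \frac{5}{\sqrt{40}} \approx 0.79
\]
```

So the sample means should centre on 5, but with a much smaller spread than the individual exponential draws.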
Any simulation experiment should start by setting a seed so that the results are reproducible. I then set up the basic variables.
Each sample has size 40, and the simulation is repeated 1000 times. The mean and standard deviation of each sample are stored in two variables named sampleMean and sampleSd, respectively.
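# Fix the seed so the results are reproducible
set.seed(1)
lambda = 0.2              # rate parameter
sampleSize = 40           # observations per sample
theoryMean = 1 / lambda   # theoretical mean of the exponential distribution
theorySd = 1 / lambda     # theoretical standard deviation
simulation = 1000         # number of simulated samples
sampleMean = NULL
sampleSd = NULL
# Draw one sample per iteration and record its mean and standard deviation
for (i in 1:simulation)
{
  random = rexp(sampleSize, lambda)
  sampleMean = c(mean(random), sampleMean)
  sampleSd = c(sd(random), sampleSd)
}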
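# Histogram of the sample means; the red line marks the theoretical mean
hist(sampleMean, main = "distribution of mean value of 1000 samples")
abline(v = theoryMean, col = "red", lwd = 10)
# Histogram of the sample standard deviations; the red line marks the theoretical sd
hist(sampleSd, main = "distribution of sd value of 1000 samples")
abline(v = theorySd, col = "red", lwd = 10)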
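The simulation loop grows sampleMean and sampleSd one element at a time. As a possible alternative (a sketch, not the approach used in this report), all draws can be generated in one call and arranged in a matrix with one sample per row. The names draws, sampleMeanVec, and sampleSdVec are introduced here for illustration only.

```r
set.seed(1)
lambda = 0.2
sampleSize = 40
simulation = 1000

# One row per simulated sample; byrow = TRUE keeps consecutive draws
# grouped together in the same row
draws = matrix(rexp(simulation * sampleSize, lambda),
               nrow = simulation, byrow = TRUE)

sampleMeanVec = rowMeans(draws)      # mean of each sample
sampleSdVec = apply(draws, 1, sd)    # standard deviation of each sample
```

This avoids repeated copying of the result vectors, which matters little for 1000 iterations but becomes noticeable for much larger simulations.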
The table below compares the theoretical mean and standard deviation with the averages of the 1000 sample means and sample standard deviations:
x_hat = mean(sampleMean)
s = mean(sampleSd)
rbind(c("theoretical mean", theoryMean),
c("sample mean", x_hat),
c("theoretical sd", theorySd),
c("sample sd", s))
##      [,1]               [,2]
## [1,] "theoretical mean" "5"
## [2,] "sample mean"      "4.99002520077716"
## [3,] "theoretical sd"   "5"
## [4,] "sample sd"        "4.89577686514373"
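The table compares averages, but the Central Limit Theorem also predicts how much the sample means vary: their standard deviation should be close to the theoretical standard deviation divided by the square root of the sample size. A minimal, self-contained sketch of that check follows. It re-runs a simulation rather than reusing the objects above, and the names means, theorySe, and observedSe are hypothetical.

```r
set.seed(1)
lambda = 0.2
sampleSize = 40
simulation = 1000

# Mean of each simulated sample
means = replicate(simulation, mean(rexp(sampleSize, lambda)))

theorySe = (1 / lambda) / sqrt(sampleSize)  # predicted sd of the sample mean
observedSe = sd(means)                      # sd observed across the simulations

c(theoretical = theorySe, observed = observedSe)
```

If the two values are close, the spread of the simulated means agrees with the prediction as well as their centre.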
# Show the raw exponential draws and the sample means side by side
par(mfcol = c(1, 2))
set.seed(1)
hist(rexp(1000, lambda), main = "1000 random exponentials")
hist(sampleMean, main = "1000 means of random exponentials")
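To make the comparison with the normal distribution explicit, one option is to draw the histogram of the sample means on the density scale and overlay the normal curve suggested by the Central Limit Theorem. This is a sketch under the same settings as above; means and theorySe are illustrative names, and the simulation is repeated so that the block runs on its own.

```r
set.seed(1)
lambda = 0.2
sampleSize = 40
means = replicate(1000, mean(rexp(sampleSize, lambda)))
theorySe = (1 / lambda) / sqrt(sampleSize)

# freq = FALSE plots densities so the curve and the bars share a scale
hist(means, freq = FALSE, breaks = 30,
     main = "Sample means with normal approximation")
curve(dnorm(x, mean = 1 / lambda, sd = theorySe),
      add = TRUE, col = "red", lwd = 2)
```

The closer the bars follow the red curve, the better the normal approximation describes the simulated means.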