The exponential distribution describes the time between events in a process where events occur independently and at a constant average rate (a “Poisson process”). Its single parameter, lambda, is the average rate of events. The mean and standard deviation of the distribution are both equal to 1/lambda, so the variance is 1/lambda^2.
In R, rexp(n, rate) draws n samples from an exponential distribution with rate lambda. For example, with n = 5 and lambda = 0.2:
rexp(5,0.2)
## [1] 1.0156533 0.4406902 8.5772410 1.5403781 2.5494587
For a larger sample, we also load tidyverse, which provides the qplot function used for the histogram:
library(tidyverse)
large <- rexp(100000,0.55)
summary(large)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000008 0.523749 1.264569 1.824301 2.540215 18.985635
sd(large)
## [1] 1.819019
qplot(large,col="black", xlab = "large", ylab = "count")
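As a rough cross-check, the sample statistics can be compared with the theoretical values for rate 0.55: the mean and standard deviation are both 1/0.55, about 1.818, and the median is log(2)/0.55, about 1.260. The sketch below is an illustration rather than part of the original analysis; the seed and the names rate, x, theory, and sample_stats are assumptions.

```r
set.seed(1)                      # assumed seed, only for reproducibility
rate <- 0.55
x <- rexp(100000, rate)

# Theoretical values for an exponential distribution with this rate
theory <- c(mean = 1 / rate, sd = 1 / rate, median = log(2) / rate)

# Matching statistics computed from the simulated sample
sample_stats <- c(mean = mean(x), sd = sd(x), median = median(x))

round(rbind(theory, sample_stats), 3)
```

With 100000 draws, the two rows should agree to roughly two decimal places.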
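In this exercise, we investigate the Central Limit Theorem (CLT) by drawing 1000 independent samples of size n = 40 from an exponential distribution (all with the same lambda) and calculating the mean of each sample. These 1000 means form a distribution of their own. By the CLT, as the sample size n grows, this distribution of means should approach a normal distribution centered at 1/lambda with standard deviation 1/(lambda*sqrt(n)). For n = 40 and lambda = 0.2, the expected values are: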
n <- 40
lambda <- 0.2
mean_expected <- 1/lambda
sd_expected <- 1/(lambda*sqrt(n))
c(mean_expected, sd_expected)
## [1] 5.0000000 0.7905694
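The expected values follow from standard properties of the sample mean of independent draws. A short derivation, where $X_1, \dots, X_n$ and $\bar{X}_n$ are notation introduced here for the individual draws and their mean:

```latex
\begin{aligned}
E[\bar{X}_n] &= E[X_i] = \frac{1}{\lambda} = \frac{1}{0.2} = 5, \\
\operatorname{Var}(\bar{X}_n) &= \frac{\operatorname{Var}(X_i)}{n} = \frac{1}{\lambda^2 n}, \\
\operatorname{sd}(\bar{X}_n) &= \frac{1}{\lambda \sqrt{n}} = \frac{1}{0.2 \sqrt{40}} \approx 0.79.
\end{aligned}
```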
# draw 1000 samples of size n and keep the mean of each
means <- replicate(1000, mean(rexp(n, lambda)))
qplot(means, col="black",xlab="means", ylab="count")
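The histogram can be complemented with a numerical check of normality. If the means are approximately normal, about 95% of them should fall within 1.96 expected standard deviations of the expected mean. This is a minimal sketch that re-runs the simulation with an assumed seed; coverage is a name introduced here for illustration.

```r
set.seed(42)                     # assumed seed, only for reproducibility
n <- 40
lambda <- 0.2
means <- replicate(1000, mean(rexp(n, lambda)))

mean_expected <- 1 / lambda
sd_expected <- 1 / (lambda * sqrt(n))

# Share of simulated means inside the central 95% interval of the
# normal distribution predicted by the CLT
coverage <- mean(abs(means - mean_expected) < 1.96 * sd_expected)
coverage
```

A value near 0.95 is consistent with approximate normality, although it is not a formal test.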
The histogram is centered near 5. Let's compare the simulated mean and standard deviation with the expected values:
round(c(mean_expected, mean(means)),2)
## [1] 5.00 5.03
round(c(sd_expected, sd(means)),2)
## [1] 0.79 0.77
The simulated mean (5.03) and standard deviation (0.77) are close to the expected values (5.00 and 0.79), which is consistent with the CLT.
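The CLT concerns what happens as the sample size grows, so it can help to repeat the simulation for several values of n. The sketch below assumes sample sizes of 10, 40, and 160 and a helper named simulate_means; both are illustrative choices, not part of the original exercise.

```r
set.seed(123)                    # assumed seed, only for reproducibility
lambda <- 0.2

# Hypothetical helper: means of n_sims samples, each of size n
simulate_means <- function(n, lambda, n_sims = 1000) {
  replicate(n_sims, mean(rexp(n, lambda)))
}

sizes <- c(10, 40, 160)

# Simulated versus expected standard deviation of the means for each n
results <- data.frame(
  n = sizes,
  sd_simulated = sapply(sizes, function(n) sd(simulate_means(n, lambda))),
  sd_expected = 1 / (lambda * sqrt(sizes))
)
results
```

Each fourfold increase in n should roughly halve the standard deviation of the means, in line with the 1/(lambda*sqrt(n)) formula.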