In this project we investigate the exponential distribution in R and compare the distribution of its sample means with what the Central Limit Theorem predicts. The exponential distribution is simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. We examine the distribution of averages of 40 exponentials, using 1,000 simulations to arrive at the results.
#Loading libraries necessary for the analysis
library(ggplot2)
#Initialising data
set.seed(170388)
lambda <- 0.2
n <- 40
size <- 1000
#Creating the sample
sample <- replicate(size, rexp(n, lambda))
#Calculating the mean of each column in the sample (one column per simulation of 40 exponentials)
means <- apply(sample, 2, mean)
sample_mean <- round(mean(means),3)
original_mean <- round(1/lambda, 3)
The sample mean is 4.98 and the original mean is 5. The values are almost equal to each other.
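As a quick check on the orientation of the simulated matrix, the sketch below confirms that replicate() stores each simulation as a column, so the means must be taken over margin 2. The names sims, means_apply and means_col are illustrative and do not appear in the original analysis; colMeans() is shown only as an equivalent base R alternative to apply().

```r
# Regenerate the simulations with the report's seed and parameters
set.seed(170388)
lambda <- 0.2
n <- 40
size <- 1000
sims <- replicate(size, rexp(n, lambda))

# replicate() returns an n x size matrix: 40 rows (draws), 1000 columns (simulations)
dim(sims)

# Column-wise means: one mean per simulation of 40 exponentials
means_apply <- apply(sims, 2, mean)
means_col <- colMeans(sims)

# Both approaches should agree up to floating-point tolerance
all.equal(means_apply, means_col)
```

Taking means over margin 1 instead would give 40 averages of 1,000 draws each, which is not the quantity studied here.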
sample_variance <- round(var(means),3)
original_variance <- round((1/lambda)^2/n,3)
The sample variance is 0.655 and the original variance is 0.625.
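The original variance used above follows from the standard result for the mean of n independent draws with common variance sigma^2. A short worked version, with lambda = 0.2 and n = 40 as in the code, is sketched here:

```latex
\[
\operatorname{Var}(\bar{X}_n) = \frac{\sigma^2}{n}
  = \frac{(1/\lambda)^2}{n}
  = \frac{25}{40}
  = 0.625,
\qquad
\operatorname{SD}(\bar{X}_n) = \frac{1/\lambda}{\sqrt{n}}
  = \frac{5}{\sqrt{40}}
  \approx 0.791 .
\]
```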
sample_sd <- round(sd(means),3)
original_sd <- round(sqrt(original_variance),3)
The sample standard deviation is 0.809 and the original standard deviation is 0.791. The values are almost equal to each other.
The plot below shows the distribution of the 1,000 sample means, each computed from 40 exponentials. It also marks the sample mean and the original mean (calculated from the exponential distribution formula, 1/lambda).
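The plotting code is not included in this text, so the following is only a minimal sketch of how such a histogram could be drawn with ggplot2. The bin width, colours, axis labels and the names plot_data and mean_plot are assumptions, not the original settings.

```r
library(ggplot2)

# Regenerate the simulated means with the report's seed and parameters
set.seed(170388)
lambda <- 0.2
n <- 40
size <- 1000
means <- colMeans(replicate(size, rexp(n, lambda)))
plot_data <- data.frame(means = means)

# Histogram of the 1,000 means; bin width and colours are arbitrary choices
mean_plot <- ggplot(plot_data, aes(x = means)) +
  geom_histogram(binwidth = 0.2, fill = "grey80", colour = "grey40") +
  # Solid red line: mean of the simulated means
  geom_vline(xintercept = mean(means), colour = "red") +
  # Dashed blue line: original mean 1/lambda
  geom_vline(xintercept = 1 / lambda, colour = "blue", linetype = "dashed") +
  labs(x = "Mean of 40 exponentials", y = "Count")
print(mean_plot)
```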
The plot below shows the density of the 1,000 sample means and compares it with the normal distribution to test the Central Limit Theorem.
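Again as a hedged sketch rather than the original plotting code, the comparison could be drawn by overlaying the normal density implied by the Central Limit Theorem, with mean 1/lambda and standard deviation (1/lambda)/sqrt(n), on the estimated density of the means. The names density_data and density_plot and the colours are assumptions.

```r
library(ggplot2)

# Regenerate the simulated means with the report's seed and parameters
set.seed(170388)
lambda <- 0.2
n <- 40
size <- 1000
means <- colMeans(replicate(size, rexp(n, lambda)))
density_data <- data.frame(means = means)

# Parameters of the normal curve predicted by the Central Limit Theorem
clt_mean <- 1 / lambda
clt_sd <- (1 / lambda) / sqrt(n)

density_plot <- ggplot(density_data, aes(x = means)) +
  # Kernel density estimate of the simulated means
  geom_density(colour = "black") +
  # Normal density with the theoretical mean and standard deviation
  stat_function(fun = dnorm, args = list(mean = clt_mean, sd = clt_sd),
                colour = "red", linetype = "dashed") +
  labs(x = "Mean of 40 exponentials", y = "Density")
print(density_plot)
```

If the two curves nearly coincide, that supports the normal approximation; a slight right skew in the simulated density would not be surprising at n = 40, since the exponential distribution itself is strongly skewed.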
The distribution above shows that the sample means are approximately normally distributed.
sample_confinterval <- round(sample_mean + c(-1,1)*1.96*sample_sd/sqrt(n),2)
original_confinterval <- original_mean + c(-1,1)*1.96*sqrt(original_variance)/sqrt(n)
The sample confidence interval is (4.73, 5.23) and the theoretical confidence interval is (4.755, 5.245).
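Note that sd(means) already estimates the standard error sigma/sqrt(n) of a mean of 40 exponentials. If the aim is the range expected to contain roughly 95% of the simulated means (an assumption about intent, since the text does not say), one possible sketch uses that standard deviation directly, without a further division by sqrt(n). The names central_interval, theoretical_interval and coverage are illustrative.

```r
# Regenerate the simulated means with the report's seed and parameters
set.seed(170388)
lambda <- 0.2
n <- 40
size <- 1000
means <- colMeans(replicate(size, rexp(n, lambda)))

# Interval based on the simulated means: mean +/- 1.96 * sd of the means
central_interval <- mean(means) + c(-1, 1) * 1.96 * sd(means)

# Interval based on theory: 1/lambda +/- 1.96 * (1/lambda) / sqrt(n)
theoretical_interval <- 1 / lambda + c(-1, 1) * 1.96 * (1 / lambda) / sqrt(n)

# Share of simulated means that fall inside the theoretical interval
coverage <- mean(means >= theoretical_interval[1] &
                 means <= theoretical_interval[2])

round(central_interval, 2)
round(theoretical_interval, 2)
coverage
```

A coverage close to 0.95 would be a further indication that the normal approximation describes the sample means well.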