In this project, we will simulate the exponential distribution in
R and compare it with the Central Limit Theorem.
The exponential distribution can be simulated in R with
rexp(n, lambda) where lambda is the rate
parameter. The mean of an exponential distribution is
1/lambda and the standard deviation is also
1/lambda. Set lambda = 0.2 for all of the
simulations. You will investigate the distribution of averages of 40
exponentials. Note that you will need to do a thousand simulations.
library("ggplot2")
#set seed for reproducibility
set.seed(54)
#set lambda to 0.2
lambda <- 0.2
#40 samples
n <- 40
#1000 simulations
simulations <- 1000
#simulation
simulated_Exp <- replicate(simulations, rexp(n, lambda))
#mean of simulated exponentials
mean_Exp <- apply(simulated_Exp, 2, mean)
Show the sample mean and compare it to the theoretical mean of the distribution.
#sample mean
Samp_mean <- mean(mean_Exp)
Samp_mean
## [1] 4.969918
#theoretical mean
theor_mean <- 1/lambda
theor_mean
## [1] 5
#plot showing the exponential function simulations of sample means
hist(mean_Exp, xlab = "Means", main = "Exponential Function Simulations of Sample Means", col = "yellow")
abline(v = Samp_mean, col = "red")
abline(v = theor_mean, col = "blue")
The Sample mean is 4.9699178 while the theoretical mean is 5. The center of the distribution of averages for 40 exponentials is very close to the theoretical center of the distribution.
Show how variable the sample is (via variance) and compare it to the theoretical variance of the distribution.
#sample standard deviation
samp_sd <- sd(mean_Exp)
samp_sd
## [1] 0.7826315
#theoretical standard deviation
theor_sd <- (1/lambda)/sqrt(n)
theor_sd
## [1] 0.7905694
#sample variance
samp_var <- samp_sd^2
samp_var
## [1] 0.612512
#theoretical variance
theor_var <- theor_sd^2
theor_var
## [1] 0.625
The sample varies by 0.612512, which is very close to the theoretical variance of 0.625
#distribution density of means
xfit <- seq(min(mean_Exp), max(mean_Exp), length=100)
yfit <- dnorm(xfit, mean=1/lambda, sd=(1/lambda/sqrt(n)))
hist(mean_Exp,breaks=n, prob=T, col="yellow", xlab = "Means", main="Density Distribution of Means",ylab="Density")
lines(xfit, yfit, pch=22, col="black", lty=1)
#compare the distribution for averages of 40 exponentials to a normal distribution
qqnorm(mean_Exp)
qqline(mean_Exp, col = 2)