In this project we compare the exponential distribution in R with the Central Limit Theorem. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda, and its standard deviation is also 1/lambda. For the purpose of our comparison, we will set lambda to 0.2 and run 1000 simulations, each averaging 40 exponentials.
To ensure reproducibility, we will set the seed to 99.
# Set seed
set.seed(99)
# set lambda to 0.2
lambda <- 0.2
# set sample size to 40
sample_size <- 40
# execute 1000 simulations
simulations <- 1000
# begin simulations
simulations_results <- replicate(simulations, rexp(sample_size, lambda))
# now calculate the mean for the results
mean_results <- apply(simulations_results, 2, mean)
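As a quick check on the claim that rexp draws have mean and standard deviation 1/lambda, the sketch below draws one large sample and compares both statistics with 1/lambda. The check_ names, the sample size of 100000, and the seed value of 1 are assumptions made for this illustration and are not part of the analysis above.

```r
# Hypothetical sanity check, separate from the simulation above
set.seed(1)                       # arbitrary seed, used only for this sketch
check_rate <- 0.2                 # same value as lambda in the analysis
check_draws <- rexp(100000, rate = check_rate)

# Both statistics should land close to 1 / check_rate = 5
check_mean <- mean(check_draws)
check_sd <- sd(check_draws)
c(mean = check_mean, sd = check_sd, expected = 1 / check_rate)
```

Because the second argument of rexp is a rate rather than a mean, a smaller lambda gives larger draws on average.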
Now that we have results from the exponential distribution, let's compare the actual mean with the theoretical mean.
# actual mean from exponential results
actual_expo_mean <- mean(mean_results)
actual_expo_mean
## [1] 5.014808
# theoretical mean of exponential distribution is 1/lambda
theorical_mean <- 1/lambda
theorical_mean
## [1] 5
# note: the red line is the actual mean, the blue line is the theoretical mean
hist(mean_results, xlab = "mean", main = "Exponential Results with Theoretical and Actual Mean")
abline(v = actual_expo_mean, col = "red")
abline(v = theorical_mean, col = "blue")
The graph above shows that the theoretical mean is 5, whereas the actual mean is 5.0148085.
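To put the gap between the two means in context, it can be expressed in standard-error units. This is a rough sketch that uses the values reported above. The names actual_mean_reported, se_grand_mean, and z_gap are hypothetical, and the standard error formula assumes that all 40 * 1000 draws are independent.

```r
# Values taken from the output reported above
actual_mean_reported <- 5.0148085
theoretical_mean_value <- 1 / 0.2

# Standard error of the average of 1000 sample means, each over 40 draws:
# (1/lambda) / sqrt(40 * 1000)
se_grand_mean <- (1 / 0.2) / sqrt(40 * 1000)

# Gap between actual and theoretical mean, in standard-error units
z_gap <- (actual_mean_reported - theoretical_mean_value) / se_grand_mean
z_gap
```

The gap comes out at roughly 0.6 standard errors, which is consistent with ordinary simulation noise.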
Now, let's take a look at the variance of the results.
# standard deviation of the simulated sample means
sd_exp_results <- sd(mean_results)
sd_exp_results
## [1] 0.7700348
# let's look at the variance of the simulated sample means
variance_results <- sd_exp_results^2
variance_results
## [1] 0.5929536
# theoretical standard deviation of the sample mean: (1/lambda)/sqrt(sample_size)
theorical_sd <- (1/lambda)/sqrt(sample_size)
theorical_sd
## [1] 0.7905694
# let's look at the variance from the theory
variance_theory <- ((1/lambda)*(1/sqrt(sample_size)))^2
variance_theory
## [1] 0.625
Standard deviation of the results: 0.7700348. Theoretical standard deviation: 0.7905694.
Actual variance: 0.5929536. Theoretical variance: 0.625.
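The theoretical values follow from the variance of a mean of independent draws. As a worked derivation, write n for sample_size, \sigma = 1/\lambda for the standard deviation of a single draw, and \bar{X} for the sample mean (this notation is introduced here for illustration only):

```latex
\begin{aligned}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n} \\
  &= \frac{(1/\lambda)^2}{n}
   = \frac{(1/0.2)^2}{40}
   = \frac{25}{40}
   = 0.625, \\
\operatorname{sd}(\bar{X})
  &= \frac{1/\lambda}{\sqrt{n}}
   = \frac{5}{\sqrt{40}}
   \approx 0.7906 .
\end{aligned}
```

The simulated variance of 0.5929536 is about 5% below this theoretical value.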
Finally, let's show the distribution of the means.
xfit <- seq(min(mean_results), max(mean_results), length.out = 100)
yfit <- dnorm(xfit, mean = 1/lambda, sd = (1/lambda)/sqrt(sample_size))
hist(mean_results, breaks = sample_size, probability = TRUE, col = "orange", xlab = "means", main = "Density of Means from 1000 Simulations", ylab = "density")
lines(xfit, yfit, col = "black", lty = 5)
# compare the distribution of averages of 40 exponentials to a normal distribution
qqnorm(mean_results)
qqline(mean_results, col = 2)
Based on 1000 simulations with a sample size of 40, the distribution of the sample means is very close to a normal distribution.
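One rough numerical complement to the plots is to count how many simulated means fall inside the central 95% interval of the normal distribution suggested by the Central Limit Theorem. The sketch below re-creates the simulation so that it runs on its own. The names clt_mean, clt_sd, lower, upper, and coverage are hypothetical, and colMeans is used here as a shorthand for the apply call above.

```r
# Sketch only: re-create the simulation with the same seed and parameters
set.seed(99)
lambda <- 0.2
sample_size <- 40
simulations <- 1000
mean_results <- colMeans(replicate(simulations, rexp(sample_size, lambda)))

# Normal approximation suggested by the Central Limit Theorem
clt_mean <- 1 / lambda
clt_sd <- (1 / lambda) / sqrt(sample_size)

# Central 95% interval of that normal distribution
lower <- clt_mean + qnorm(0.025) * clt_sd
upper <- clt_mean + qnorm(0.975) * clt_sd

# Share of simulated means that fall inside the interval
coverage <- mean(mean_results >= lower & mean_results <= upper)
coverage
```

If the normal approximation is reasonable, coverage should come out near 0.95. A value far from that would suggest that 40 draws per sample is not enough for this distribution.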