The purpose of this project is to investigate the exponential distribution in R and compare the distribution of its sample averages with what the Central Limit Theorem predicts. The exponential distribution can be simulated in R with rexp(n, lambda), where lambda is the rate parameter. The mean of the exponential distribution is 1/lambda and the standard deviation is also 1/lambda. The following instructions have been provided: set lambda = 0.2 for all of the simulations, and investigate the distribution of averages of 40 exponentials over 1000 simulations.
# set seed for reproducibility
set.seed(12345)
# Set sampling values as described in the project instructions
lambda <- 0.2 # lambda
n <- 40 # number of exponentials
sims <- 1000 # number of simulations
#Run simulations
sim_exp <- replicate(sims, rexp(n, lambda))
#Calc the means of the exponential simulations
means_exp <- apply(sim_exp, 2, mean)
#Histogram of the means
hist(means_exp, breaks=40, xlim = c(2,9), main="Exponential Function Simulation Means",
col = "salmon")
Question 1 - Sample Mean vs Theoretical Mean
The mean of the exponential distribution is 1/lambda. In this case, lambda is 0.2, so the theoretical mean is 5 (i.e. 1 / 0.2). The average of 40 exponentials has the same expected value, so the sample means should center on 5. Let's see if that holds true.
# plot histogram of the sample means
hist(means_exp, col="salmon", main="Theoretical Mean vs. Actual Mean", xlim = c(2,9),breaks=40, xlab = "Simulation Means")
# plot a vertical black line at the mean of the sample means
abline(v=mean(means_exp), lwd=4, col="black")
# determine the mean of our sample means
mean(means_exp)
## [1] 4.971972
Our sample mean of 4.972 is very close to the theoretical mean of 5: the difference is about 0.028, or under 1%.
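One rough way to judge how close "close" is, offered here as an illustrative sketch rather than part of the project instructions, is to express the gap in units of the Monte Carlo standard error of the mean of the simulated means, (1/lambda) / sqrt(n) / sqrt(sims). The names theo_mean, mc_se, gap and z are made up for this example, and colMeans() is used in place of apply().

```r
# Rough check (illustrative, not part of the project instructions):
# how many Monte Carlo standard errors separate the simulated mean from 1/lambda?
set.seed(12345)
lambda <- 0.2
n <- 40
sims <- 1000

# One column per simulation, so the column means are the 1000 sample means
means_exp <- colMeans(replicate(sims, rexp(n, lambda)))

theo_mean <- 1 / lambda                       # theoretical mean, 5
mc_se <- (1 / lambda) / sqrt(n) / sqrt(sims)  # standard error of mean(means_exp)
gap <- mean(means_exp) - theo_mean            # observed difference
z <- gap / mc_se                              # gap in standard-error units

print(round(c(gap = gap, mc_se = mc_se, z = z), 4))
```

With the values reported above (4.971972 versus 5), the gap of about 0.028 is roughly 1.1 standard errors of 0.025, which is well within ordinary simulation noise.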
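Question 2 - Sample Variance vs Theoretical Variance
The standard deviation of a single exponential draw is 1/lambda, so the standard deviation of the mean of n draws is (1/lambda) / sqrt(n), and its variance is the square of that value. Next, we'll see if this matches our simulations.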
# theoretical standard deviation vs. simulation standard deviation
print(paste("Theoretical standard deviation: ", round( (1/lambda)/sqrt(n) ,4)))
## [1] "Theoretical standard deviation: 0.7906"
print(paste("Practical standard deviation: ", round(sd(means_exp) ,4)))
## [1] "Practical standard deviation: 0.7716"
print(paste("Theoretical variance: ", round( ((1/lambda)/sqrt(n))^2 ,4)))
## [1] "Theoretical variance: 0.625"
print(paste("Practical variance: ", round(sd(means_exp)^2 ,4)))
## [1] "Practical variance: 0.5954"
The results above show that the variances are close: the sample variance of 0.5954 is within about 5% of the theoretical variance of 0.625.
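The theoretical values printed above follow from the variance of a mean of n independent draws. As a short worked derivation, writing \bar{X}_n for the mean of n = 40 exponentials with lambda = 0.2 (the notation is introduced here for illustration):

```latex
\operatorname{Var}(\bar{X}_n) = \frac{\sigma^2}{n} = \frac{(1/\lambda)^2}{n} = \frac{25}{40} = 0.625,
\qquad
\operatorname{sd}(\bar{X}_n) = \frac{1/\lambda}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906
```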
Question 3 - Distribution
Finally, we'll investigate whether the distribution of the sample means is approximately normal. As per the Central Limit Theorem, the means of the simulated samples should approximately follow a normal distribution with mean 1/lambda and standard deviation (1/lambda) / sqrt(n).
# General plot with distribution curve drawn
hist(means_exp, prob=TRUE, col="salmon", main="Exponential Function Simulation Means", breaks=40, xlim=c(2,9), xlab = "Simulation Means")
lines(density(means_exp), lwd=3, col="red")
# Normal distribution line creation
x <- seq(min(means_exp), max(means_exp), length.out=2*n)
y <- dnorm(x, mean=1/lambda, sd=(1/lambda)/sqrt(n))
lines(x, y, col="black", lwd=2, lty=2)
As the graph shows, the distribution of the means of our sampled exponentials appears to follow a normal distribution, as the Central Limit Theorem predicts. If we increased the number of exponentials averaged in each sample (currently 40), the distribution of the means would be even closer to a normal distribution; increasing the number of simulations (currently 1000) would only make the histogram and density estimate smoother. The dashed black line above is the theoretical normal curve, and we can see that it is very close to the estimated density of our sample means, which is the red line.
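The visual comparison above can be supplemented with a numerical one. The following is a minimal sketch, not part of the original analysis: it computes the sample skewness of the simulated means for a few sample sizes and draws a normal Q-Q plot for the n = 40 case. The skewness() helper, the sizes vector and the object names are assumptions made for this example.

```r
# Illustrative sketch (not from the original report): the normal approximation
# improves as the sample size n grows, not as the number of simulations grows.
set.seed(12345)
lambda <- 0.2
sims <- 1000

# Sample skewness computed by hand to avoid extra packages;
# it is near 0 for a symmetric distribution such as the normal
skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

sizes <- c(5, 40, 400)  # hypothetical sample sizes chosen for comparison
skews <- sapply(sizes, function(n) {
  skewness(colMeans(replicate(sims, rexp(n, lambda))))
})
names(skews) <- paste0("n=", sizes)
print(round(skews, 3))

# Normal Q-Q plot for the n = 40 case used in this report;
# points close to the line indicate approximate normality
means_40 <- colMeans(replicate(sims, rexp(40, lambda)))
qqnorm(means_40, main = "Normal Q-Q Plot of Simulation Means (n = 40)")
qqline(means_40, col = "red", lwd = 2)
```

For reference, the mean of n exponentials has a theoretical skewness of 2 / sqrt(n), so the printed values should shrink toward 0 as n increases, up to simulation noise.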