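This report examines the distribution of averages of 40 exponentials across a thousand simulations and compares it with what the Central Limit Theorem predicts. I illustrate the properties of the distribution of the mean of 40 exponentials by comparing the simulated mean and variance with their theoretical values and by checking the shape of the distribution against a normal curve: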
set.seed(459)
# create 1000 randomly generated means for the exponential distribution with lambda as rate.
# The exponential distribution is simulated in R with rexp(n, lambda) where lambda is the rate parameter and equal to 0.2. The mean of exponential distribution is 1/lambda and the standard deviation is also 1/lambda.
lambda=.2
mymean=1/lambda
mysd=1/lambda
n=40
# replicate() runs the simulation 1000 times and returns the 1000 means as a vector
data = replicate(1000, mean(rexp(n, lambda)))
summary(data) #mean is just below mymean of 1/lambda=5
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   2.792   4.427   4.923   4.981   5.477   7.717
The simulated mean (4.981) is close to the theoretical mean of 1/lambda = 5.
#compare Theoretical Mean to Simulated Mean
theoretical_mean <- 1/lambda
print (paste("Theoretical Mean Distribution = ", theoretical_mean))
## [1] "Theoretical Mean Distribution = 5"
print (paste("Simulated Mean Distribution = ", mean(data)))
## [1] "Simulated Mean Distribution = 4.98131322933715"
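The comparison above can also be expressed as a one-sample t-test of the simulated means against the theoretical mean. The following is a minimal sketch, not part of the original analysis: `sim_means` and `mean_test` are illustrative names, and it assumes that `replicate()` draws from the random stream in the same order as the loop used in this report.

```r
# Re-create the simulated means with the report's seed and parameters.
set.seed(459)
lambda <- 0.2
n <- 40
sim_means <- replicate(1000, mean(rexp(n, lambda)))  # illustrative name

# One-sample t-test: is the mean of the simulated averages
# consistent with the theoretical mean 1/lambda?
mean_test <- t.test(sim_means, mu = 1 / lambda)

print(mean_test$estimate)  # sample mean of the simulated averages
print(mean_test$conf.int)  # 95% confidence interval for that mean
print(mean_test$p.value)   # a large p-value means no evidence of a difference
```

Unlike a comparison of two single numbers, this test uses all 1000 simulated means, so its confidence interval reflects the actual sampling variability.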
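The simulated variance (0.628) is close to the theoretical variance of (1/lambda)^2/n = 0.625, a difference of about 0.5%. The Welch t-test below reports t = 445.48 with p-value = 5.039e-06, which formally rejects the null hypothesis that the true difference in means is zero. That result should not be read as evidence about the variances, though: each sample consists of only two artificial points (the variance and the variance plus 0.00001), so the test statistic is driven by the arbitrary 0.00001 offset rather than by sampling variability.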
#compare Theoretical Variance to Simulated Variance
theoretical_variance <- (1/lambda)^2/n;
print (paste("Theoretical Variance = ", theoretical_variance))
## [1] "Theoretical Variance = 0.625"
print (paste("Simulated Variance = ", var(data)))
## [1] "Simulated Variance = 0.628150010288495"
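# Welch t-test on two artificial two-point samples. The output rejects the null hypothesis
# that true difference in means=0 (p-value = 5.039e-06), but the 0.00001 offset is arbitrary.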
t.test(x=c(var(data), var(data)+.00001), y=c(theoretical_variance,theoretical_variance+.00001))
##
## Welch Two Sample t-test
##
## data: c(var(data), var(data) + 1e-05) and c(theoretical_variance, theoretical_variance + 1e-05)
## t = 445.48, df = 2, p-value = 5.039e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.003119586 0.003180435
## sample estimates:
## mean of x mean of y
## 0.628155 0.625005
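Because the t-test above is run on two-point samples built from an arbitrary offset, a resampling check may be a more informative way to compare the variances. The sketch below is one possible alternative, not the method used in this report: it computes a percentile bootstrap interval for the variance of the simulated means and checks whether the theoretical value falls inside it. The names `sim_means`, `boot_vars`, and `ci`, and the choice of 2000 resamples, are assumptions made for illustration.

```r
# Re-create the simulated means with the report's seed and parameters.
set.seed(459)
lambda <- 0.2
n <- 40
sim_means <- replicate(1000, mean(rexp(n, lambda)))  # illustrative name
theoretical_variance <- (1 / lambda)^2 / n

# Percentile bootstrap: resample the 1000 means with replacement
# and recompute the variance each time (2000 resamples is an arbitrary choice).
boot_vars <- replicate(2000, var(sample(sim_means, replace = TRUE)))
ci <- quantile(boot_vars, c(0.025, 0.975))

print(ci)
# TRUE if the theoretical variance lies inside the 95% bootstrap interval
print(ci[[1]] <= theoretical_variance && theoretical_variance <= ci[[2]])
```

If the theoretical variance lies inside the interval, the simulation gives no reason to doubt the value predicted by the Central Limit Theorem.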
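The histogram shows that the distribution of simulated means closely follows the normal curve (overlaid in blue); the red vertical line marks the theoretical mean of 5. In the normal Q-Q plot, points that fall close to the reference line are consistent with approximate normality.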
hist(data, density=25, breaks=20, prob=TRUE, xlab="Means", main="Distribution of Simulated Means with Normal Curve Overlay")
curve(dnorm(x, mean=mean(data), sd=sd(data)),
col="blue", lwd=2, add=TRUE, yaxt="n")
abline(v=5, col="red", lwd=4)
qqnorm(data); qqline(data) #shows normality
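The visual check above can be complemented by a simple numeric one. If the simulated means are approximately normal with mean 1/lambda and standard deviation (1/lambda)/sqrt(n), roughly 95% of them should fall within 1.96 standard deviations of the theoretical mean. The sketch below is illustrative only; `sim_means`, `clt_mean`, `clt_sd`, `z`, and `coverage` are names introduced here, not part of the original analysis.

```r
# Re-create the simulated means with the report's seed and parameters.
set.seed(459)
lambda <- 0.2
n <- 40
sim_means <- replicate(1000, mean(rexp(n, lambda)))  # illustrative name

# Mean and standard deviation predicted by the Central Limit Theorem.
clt_mean <- 1 / lambda
clt_sd <- (1 / lambda) / sqrt(n)

# Standardize the simulated means and measure the share
# that falls inside the central 95% region of a standard normal.
z <- (sim_means - clt_mean) / clt_sd
coverage <- mean(abs(z) <= qnorm(0.975))

print(coverage)  # a value near 0.95 is consistent with approximate normality
```

This only checks the central 95% region; the Q-Q plot remains the better tool for spotting skewness in the tails.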