require("ggplot2")
## Loading required package: ggplot2
set.seed(1729)
num.sims <- numeric(1000) # preallocate a vector to hold the 1000 sample means
lambda <- 0.2
n <- 40
for (i in 1:1000) {
  # draw n = 40 values from the exponential distribution with rate lambda
  draws <- rexp(n = n, rate = lambda)
  # mean of this sample of 40 draws
  meanSample <- mean(draws)
  # store the mean at the index of the current iteration
  num.sims[i] <- meanSample
}
meanOfMeans <- mean(num.sims)
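The same simulation can also be written without an explicit loop. The following is only a sketch of one alternative, not the code used for the results reported here; the names `n_sims` and `sim_means` are introduced purely for illustration.

```r
# Sketch: the simulation above, using replicate() instead of a for loop.
# `n_sims` and `sim_means` are illustrative names, not part of the original analysis.
set.seed(1729)
lambda <- 0.2   # rate of the exponential distribution
n <- 40         # draws per sample
n_sims <- 1000  # number of simulated samples

# replicate() evaluates the expression n_sims times and collects the results
sim_means <- replicate(n_sims, mean(rexp(n, rate = lambda)))

length(sim_means)  # number of simulated means
mean(sim_means)    # should land near 1/lambda
```

Because `replicate()` returns the vector directly, there is no need to initialise a container or index into it.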
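The Central Limit Theorem states that, for large \(n\), the sample mean \(\bar X_n\) is approximately \(N(\mu, \sigma^2 / n)\). For the exponential distribution, \(\mu = \sigma = 1/\lambda\).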
Our simulation shows this behavior. The mean of the 1000 sample means was about 4.998, and the population mean it estimates (1/lambda) is 5, so the two agree closely. We can also see evidence of the Central Limit Theorem by plotting the density of the sample means, with a vertical blue line marking their observed mean (4.998), which is nearly indistinguishable from the theoretical mean of 5. If each sample contained more draws (a larger ‘n’), the density of the means would look more and more like a normal distribution.
print(1/lambda)
## [1] 5
print(meanOfMeans)
## [1] 4.997828
ggplot(as.data.frame(num.sims), aes(x=num.sims)) + geom_density() + geom_vline(xintercept = meanOfMeans, size = .75, color = "blue") +labs(x="x's", y="density")
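The claim that a larger ‘n’ brings the density of the means closer to a normal shape can be checked numerically. The sketch below measures the skewness of the simulated means for a few sample sizes; a normal distribution has skewness 0, and the mean of n independent exponential draws has theoretical skewness 2/sqrt(n). The helpers `skewness()` and `sim_skew()` and the sample sizes chosen are assumptions made for this illustration.

```r
# Sketch: skewness of the sample mean shrinks as the sample size n grows.
# skewness() and sim_skew() are helper names introduced here for illustration.
set.seed(1729)
lambda <- 0.2

# simple moment-based skewness estimate
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

# simulate n_sims sample means of size n and return their skewness
sim_skew <- function(n, n_sims = 1000) {
  means <- replicate(n_sims, mean(rexp(n, rate = lambda)))
  skewness(means)
}

sizes <- c(5, 40, 400)
skew_by_n <- sapply(sizes, sim_skew)

# compare simulated skewness with the theoretical value 2 / sqrt(n)
data.frame(n = sizes, simulated = skew_by_n, theoretical = 2 / sqrt(sizes))
```

The simulated values should fall toward 0 as n increases, which is the sense in which the density becomes more normal.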
As we draw more samples from the distribution and take their means, not only does the mean of the sample means converge on the population mean, but the variance of the sample means also converges on its theoretical value, the population variance divided by n (\(\sigma^2 / n\)). This gives further evidence consistent with the Central Limit Theorem.
ssd <- 1/(lambda*sqrt(n)) # theoretical standard deviation of the sample mean: sigma/sqrt(n)
ssd
## [1] 0.7905694
psd <- sd(num.sims) # observed standard deviation of the simulated sample means
psd
## [1] 0.7772278
The theoretical standard deviation of the sample mean is 1/(lambda * sqrt(n)), about 0.79, whereas the observed standard deviation of the 1000 simulated means is about 0.78. The two values agree closely.
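As a check on the arithmetic, the theoretical values can be worked out from the moments of the exponential distribution, assuming the usual rate parameterisation in which both the mean and the standard deviation equal \(1/\lambda\):

\[
\mu = \sigma = \frac{1}{\lambda} = \frac{1}{0.2} = 5,
\qquad
\frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906,
\qquad
\frac{\sigma^2}{n} = \frac{25}{40} = 0.625 .
\]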
Extending this example, we can square the two standard deviations to compare variances: the theoretical value is 0.625 and the observed value is about 0.604. The observed value is, of course, identical to the variance computed directly from num.sims.
sample_var <- ssd^2 # theoretical variance of the sample mean: (1/lambda^2)/n
sample_var
## [1] 0.625
pop_var <- psd^2 # observed variance of the simulated means; same as var(num.sims)
pop_var
## [1] 0.604083
var(num.sims)
## [1] 0.604083
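Finally, we overlay two curves on one plot: a normal density (in blue) centered on the mean of our sample means with the theoretical standard deviation 1/(lambda * sqrt(n)), and the density of our simulated sample means from the exponential distribution, which is approaching the normal shape predicted by the Central Limit Theorem. The goldenrod vertical line marks the mean of the sample means.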
ggplot(as.data.frame(num.sims), aes(x = num.sims)) + geom_density() + stat_function(geom = "line", fun = dnorm, args = list(mean = meanOfMeans, sd = 1/lambda/sqrt(n)), size = 1.5, color = "blue") + geom_vline(xintercept = meanOfMeans, color = "goldenrod")
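The visual comparison can be complemented with a numeric one. As a rough sketch, the quantiles of the simulated means can be set beside the quantiles of the normal distribution the Central Limit Theorem predicts. The simulation is regenerated here so the block runs on its own; `sim_means`, `quantile_check`, and the chosen probabilities are illustrative assumptions.

```r
# Sketch: compare quantiles of the simulated means with the normal approximation.
# `sim_means` and `quantile_check` are illustrative names.
set.seed(1729)
lambda <- 0.2
n <- 40
sim_means <- replicate(1000, mean(rexp(n, rate = lambda)))

probs <- c(0.05, 0.25, 0.50, 0.75, 0.95)

# empirical quantiles of the simulated sample means
empirical <- quantile(sim_means, probs = probs)

# quantiles of N(1/lambda, (1/(lambda * sqrt(n)))^2), the CLT approximation
theoretical <- qnorm(probs, mean = 1 / lambda, sd = 1 / (lambda * sqrt(n)))

quantile_check <- data.frame(
  prob = probs,
  empirical = as.numeric(empirical),
  theoretical = theoretical
)
quantile_check
```

Small gaps between the two columns, typically largest in the upper tail where the exponential's right skew lingers, indicate how close the approximation is at n = 40.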