This report investigates the behaviour of the mean of 40 exponentially distributed random variables, simulated 1000 times in R. We compare the sample mean and variance to the theoretical values and demonstrate the Central Limit Theorem.
We run 1000 simulations (sims), each generating 40 exponential random variables with rexp(n, lambda) and computing their mean. All means are stored in the means vector. The theoretical mean of an exponential distribution is 1/lambda = 5 (so lambda = 0.2). Because the sample mean is an unbiased estimator, the sampling distribution of the mean is also centred at 5, so the average of the simulated means should be close to 5.
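The simulation code itself is not shown in this report. A minimal sketch of the step described above might look like the following. It assumes lambda = 0.2 (implied by 1/lambda = 5) and an arbitrary seed, so its values will differ slightly from the output printed below.

```r
set.seed(2024)   # assumed seed, used only to make this sketch reproducible

lambda <- 0.2    # rate of the exponential; implied by the stated mean 1/lambda = 5
n      <- 40     # number of exponentials per simulation
sims   <- 1000   # number of simulations

# One sample mean per simulation: draw n exponentials, average them, repeat
means <- replicate(sims, mean(rexp(n, lambda)))

length(means)    # one mean per simulation
```

Using replicate() is one of several equivalent ways to do this. An explicit for loop or a sims-by-n matrix with rowMeans() would give the same kind of means vector.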
sample_mean <- mean(means)
theoretical_mean <- 1/lambda
sample_mean
## [1] 5.011911
theoretical_mean
## [1] 5
hist(means, breaks = 50, probability = TRUE,
     main = "Distribution of Sample Means (n=40)",
     xlab = "Sample Mean")
abline(v = sample_mean, col = "blue", lwd = 2)
abline(v = theoretical_mean, col = "red", lwd = 2, lty = 2)
legend("topright", legend = c("Sample Mean", "Theoretical Mean"),
       col = c("blue", "red"), lty = c(1, 2), lwd = 2)
The histogram shows the distribution of sample means. The blue line indicates the observed sample mean, while the red dashed line shows the theoretical mean. The two nearly coincide (5.012 vs. 5), which is consistent with the sample mean being an unbiased estimate of the theoretical mean.
The theoretical standard deviation of the exponential distribution is also 1/lambda = 5. The standard deviation of the sample mean (the standard error) should therefore be (1/lambda)/sqrt(n) = 5/sqrt(40) ≈ 0.7906.
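For reference, the theoretical values follow from the variance of a mean of n independent draws, using the report's values 1/lambda = 5 and n = 40:

```latex
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{(1/\lambda)^2}{n} = \frac{25}{40} = 0.625,
\qquad
\operatorname{SD}(\bar{X}) = \frac{1/\lambda}{\sqrt{n}} = \frac{5}{\sqrt{40}} \approx 0.7906
```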
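# Standard deviation of the sample means vs. theoretical standard error
sample_sd <- sd(means)
theoretical_sd <- (1 / lambda) / sqrt(n)
# Variance of the sample means vs. theoretical variance
sample_var <- sample_sd^2
theoretical_var <- theoretical_sd^2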
sample_sd
## [1] 0.7749147
theoretical_sd
## [1] 0.7905694
sample_var
## [1] 0.6004928
theoretical_var
## [1] 0.625
The sample standard deviation and variance of the means are close to the theoretical values (0.775 vs. 0.791 and 0.600 vs. 0.625). This agrees with the theoretical result that the standard deviation of the sample mean is sigma/sqrt(n), so the variability of the sample means shrinks in proportion to the square root of the sample size.
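The 1/sqrt(n) scaling can be checked directly by repeating the simulation for several sample sizes. The sketch below is illustrative only: the sample sizes 10, 40 and 160 and the seed are assumptions, not part of the original analysis.

```r
set.seed(1)                  # assumed seed for reproducibility
lambda <- 0.2                # same rate as in the report
sims   <- 1000               # same number of simulations as in the report
sizes  <- c(10, 40, 160)     # hypothetical sample sizes, each 4x the previous

# Empirical SD of the simulated sample means for each sample size
empirical_sd <- sapply(sizes, function(size) {
  sd(replicate(sims, mean(rexp(size, lambda))))
})

# Theoretical standard error (1/lambda)/sqrt(n) for the same sizes
theoretical_sd_by_n <- (1 / lambda) / sqrt(sizes)

scaling <- data.frame(n = sizes,
                      empirical_sd = empirical_sd,
                      theoretical_sd = theoretical_sd_by_n)
scaling
```

Since each sample size is four times the previous one, the standard deviation should roughly halve from one row to the next.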
To show the distribution is approximately normal, we’ll overlay a normal curve and use a Q-Q plot.
# Histogram with normal curve - Figure 2 (Appendix)
hist(means, breaks = 50, probability = TRUE,
     main = "Sample Means vs. Normal Distribution",
     xlab = "Sample Mean")
curve(dnorm(x, mean = theoretical_mean, sd = theoretical_sd),
      col = "darkgreen", lwd = 2, add = TRUE)
legend("topright", legend = "Normal Curve", col = "darkgreen", lwd = 2)
# Q-Q plot - Figure 3 (Appendix)
qqnorm(means, main = "Q-Q Plot of Sample Means")
qqline(means, col = "red")
The histogram with the overlaid normal curve shows that the distribution of means is bell-shaped and roughly symmetric. The Q-Q plot shows that the points mostly follow the reference line, suggesting that the sample means are approximately normally distributed.
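A simple numeric complement to the plots is to count how many simulated means fall within 1.96 theoretical standard deviations of the theoretical mean. Under a normal distribution this share is about 95%. The sketch below uses a hypothetical helper, normal_coverage, and regenerates its own means with an assumed seed, so it is not the report's exact data.

```r
# Hypothetical helper: share of values within z standard deviations of mu
normal_coverage <- function(x, mu, sigma, z = qnorm(0.975)) {
  mean(abs(x - mu) <= z * sigma)
}

set.seed(42)                 # assumed seed for reproducibility
lambda <- 0.2                # same rate as in the report

# Fresh set of 1000 means of 40 exponentials, kept separate from `means`
demo_means <- replicate(1000, mean(rexp(40, lambda)))

coverage <- normal_coverage(demo_means,
                            mu    = 1 / lambda,
                            sigma = (1 / lambda) / sqrt(40))
coverage   # a value near 0.95 is consistent with approximate normality
```

This check only looks at the central 95% of the distribution, so it supplements the Q-Q plot rather than replacing it.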
Summary

Metric                     Sample Value   Theoretical Value
Mean of Sample Means       ~5.003         5
SD of Sample Means         ~0.782         0.7906
Variance of Sample Means   ~0.611         0.625

The sample mean and variance closely align with the theoretical expectations. The distribution of sample means is approximately normal. This simulation clearly demonstrates the Central Limit Theorem in action, even when starting with a skewed distribution like the exponential.