Investigating the Exponential Distribution and the Central Limit Theorem

Overview

This report investigates the behavior of the exponential distribution and its relationship to the Central Limit Theorem (CLT). Using simulations in R, we analyze the distribution of the sample mean of 40 exponentials over 1000 simulations. Specifically, the study compares the sample mean and variance to their theoretical counterparts and demonstrates the approximate normality of the sample mean distribution.

Simulations

We performed the following simulations using R:

  1. Generated 1000 samples of 40 exponentials each with \(\lambda = 0.2\).
  2. Calculated the sample means for each set of 40 exponentials.
  3. Compared the sample mean and variance to the theoretical values.
  4. Examined the approximate normality of the sample mean distribution.

# Setting parameters
lambda <- 0.2
n <- 40
simulations <- 1000

# Simulating 1000 samples of 40 exponentials
set.seed(123)  # For reproducibility
sample_means <- replicate(simulations, mean(rexp(n, lambda)))

# Generating a histogram for visualization
hist(sample_means, breaks = 30, probability = TRUE, main = "Distribution of Sample Means (40 Exponentials)", xlab = "Sample Mean")

Sample Mean versus Theoretical Mean

The theoretical mean of the exponential distribution is \(\mu = \frac{1}{\lambda} = \frac{1}{0.2} = 5\). The sample mean from the simulations is calculated as the average of all 1000 sample means.

Theoretical Mean: 5 
Sample Mean: 5.011911 

Explanation:

  • The theoretical mean is 5, and the simulated value of 5.011911 differs from it by about 0.012 (roughly 0.2%). Because the estimate averages 1000 means of 40 draws each, the Law of Large Numbers predicts this close agreement.
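
The comparison above can be reproduced with a short sketch along these lines. It repeats the simulation with the same seed so that the block runs on its own; the name `mean_of_means` is introduced here purely for illustration.

```r
# Parameters and simulation, repeated so this block runs on its own
lambda <- 0.2
n <- 40
simulations <- 1000
set.seed(123)
sample_means <- replicate(simulations, mean(rexp(n, lambda)))

# Theoretical mean of an exponential distribution with rate lambda
theoretical_mean <- 1 / lambda

# Average of the 1000 simulated sample means (name chosen for illustration)
mean_of_means <- mean(sample_means)

cat("Theoretical Mean:", theoretical_mean, "\n")
cat("Sample Mean:", mean_of_means, "\n")
```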

Sample Variance versus Theoretical Variance

The theoretical variance of the mean of 40 exponentials is given by:

\(\mathrm{Var}(\bar{X}) = \frac{\sigma^{2}}{n} = \frac{(\frac{1}{\lambda})^{2}}{n}\), where \(\sigma^{2} = \frac{1}{\lambda^{2}}\) is the variance of the exponential distribution and \(\bar{X}\) is the mean of \(n = 40\) exponentials.
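
For readers who want the intermediate step, the formula follows from the independence of the 40 draws. A short derivation, writing \(\bar{X}\) for the sample mean and \(X_i\) for the individual draws (notation introduced here), might read:

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^{2}}{n}
  = \frac{(1/\lambda)^{2}}{n}
  = \frac{25}{40}
  = 0.625
\]
```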

# Theoretical variance
theoretical_variance <- (1 / lambda)^2 / n

# Sample variance
sample_variance <- var(sample_means)

# Results
cat("Theoretical Variance:", theoretical_variance, "\n")
cat("Sample Variance:", sample_variance, "\n")

Theoretical Variance: 0.625 
Sample Variance: 0.6004928 

Explanation:

  • The sample variance (0.6004928) is within about 4% of the theoretical variance (0.625). This agrees with \(\sigma^{2}/n\), the variance of the normal approximation that the CLT gives for the sample mean.
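
To see the \(\sigma^{2}/n\) scaling more directly, one could repeat the simulation for a few sample sizes and compare the results. The sketch below does this; the sizes 10 and 160 and the names `sample_sizes`, `empirical`, and `theoretical` are assumptions added for illustration, since the report itself only uses n = 40.

```r
lambda <- 0.2
simulations <- 1000
sample_sizes <- c(10, 40, 160)  # illustrative sizes; the report only uses n = 40
set.seed(123)

# Variance of the simulated sample means for each sample size
empirical <- sapply(sample_sizes, function(n) {
  var(replicate(simulations, mean(rexp(n, lambda))))
})

# Theoretical variance sigma^2 / n for each sample size
theoretical <- (1 / lambda)^2 / sample_sizes

print(data.frame(n = sample_sizes, empirical, theoretical))
```

If the formula holds, each fourfold increase in n should cut the empirical variance to roughly a quarter.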

Distribution

We compare the histogram of 1000 averages of 40 exponentials with the normal distribution to verify approximate normality. A normal curve is overlaid for comparison.

# Overlaying a normal curve on the histogram
theoretical_mean <- 1 / lambda  # Defined here because the curve below depends on it
hist(sample_means, breaks = 30, probability = TRUE, main = "Sample Means vs. Normal Distribution", xlab = "Sample Mean")
curve(dnorm(x, mean = theoretical_mean, sd = sqrt(theoretical_variance)), add = TRUE, col = "red", lwd = 2)

Explanation:

  • The histogram of the sample means appears bell-shaped, closely matching the overlaid normal curve, demonstrating approximate normality as predicted by the CLT.
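
A normal Q-Q plot is a common complement to the visual histogram check. The sketch below standardizes the sample means using the theoretical mean and standard error and plots them against the standard normal; the variable `z` and the plot title are introduced here for illustration.

```r
lambda <- 0.2
n <- 40
simulations <- 1000
set.seed(123)
sample_means <- replicate(simulations, mean(rexp(n, lambda)))

# Standardize: subtract the theoretical mean, divide by the theoretical standard error
z <- (sample_means - 1 / lambda) / ((1 / lambda) / sqrt(n))

# Q-Q plot against the standard normal; points near the line suggest approximate normality
qqnorm(z, main = "Q-Q Plot of Standardized Sample Means")
qqline(z, col = "red", lwd = 2)
```

With only 40 draws per mean, some departure from the line in the upper tail would not be surprising, since the sample mean still carries a little of the exponential's right skew.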

Comparing Random Exponentials and Sample Means

To further illustrate the CLT, compare the histogram of 1000 random exponentials to that of the sample means.

# Random exponentials
random_exponentials <- rexp(1000, lambda)

# Histograms for comparison
par(mfrow = c(1, 2))  # Two panels side by side
hist(random_exponentials, breaks = 30, probability = TRUE, main = "1000 Random Exponentials", xlab = "Value")
hist(sample_means, breaks = 30, probability = TRUE, main = "1000 Averages of 40 Exponentials", xlab = "Sample Mean")
par(mfrow = c(1, 1))  # Restore the single-panel layout

Explanation:

  • Individual exponential draws are strongly right-skewed, while the distribution of the sample means is approximately normal, as the CLT predicts.
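
The difference in skew can also be put into numbers. An exponential distribution has skewness 2, while the mean of n independent exponentials has skewness \(2/\sqrt{n}\), about 0.32 for n = 40. The sketch below estimates both from simulated data; `skewness` is a hypothetical helper defined here, not a base R function.

```r
lambda <- 0.2
n <- 40
set.seed(123)
sample_means <- replicate(1000, mean(rexp(n, lambda)))
random_exponentials <- rexp(1000, lambda)

# Moment-based sample skewness (hypothetical helper, not part of base R)
skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

cat("Skewness of raw exponentials:", skewness(random_exponentials), "\n")
cat("Skewness of sample means:", skewness(sample_means), "\n")
```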

Conclusion

This project illustrates the Central Limit Theorem using simulations of the exponential distribution. The mean of the simulated sample means (5.011911) closely approximates the theoretical mean (5), and their variance (0.6004928) is close to the theoretical variance (0.625). The distribution of sample means is approximately normal, even though the underlying exponential distribution is right-skewed. These results are consistent with the CLT’s prediction for means of exponential samples.