Overview

In this project, we investigated the exponential distribution in R and compared it with the predictions of the Central Limit Theorem, using 1,000 simulations, each a sample of 40 exponentials with a fixed rate parameter (lambda) of 0.2. We focused on the distribution of the average of 40 exponentials, computing its sample mean and variance and comparing them with their theoretical values. The simulated mean and variance were close to the theoretical mean and variance, indicating that the simulation behaves as theory predicts.

We also show that the distribution of the average of 40 exponentials is approximately normal, as the Central Limit Theorem predicts. The histogram of the simulated sample means closely follows a normal density curve, which highlights the difference between the distribution of a large collection of random exponentials (strongly right-skewed) and that of a large collection of averages of 40 exponentials. These results are consistent with the Central Limit Theorem, which states that the distribution of sample means approaches a normal distribution as the sample size grows, regardless of the shape of the population distribution, provided the population has finite variance.

Simulations

To compare the exponential distribution in R with the Central Limit Theorem, we ran 1,000 simulations. Each simulation draws a sample of 40 exponentials with the same rate parameter (lambda) of 0.2. We computed the sample mean of each simulation and then examined the mean, variance, and shape of the resulting 1,000 sample means.

# set up parameters for simulations
n <- 40
lambda <- 0.2
num_sims <- 1000

# generate simulations with seed for reliable reproduction of results
set.seed(123)
simulated_means <- replicate(num_sims, {
  # draw n exponentials; avoid naming the vector `sample`, which masks base::sample
  draws <- rexp(n, lambda)
  mean(draws)
})
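
The same experiment can also be laid out as a matrix with one row per simulation, which makes the 1,000 by 40 design explicit. The following is a minimal sketch rather than the code used for this report; the names `draws` and `row_means` are introduced here purely for illustration.

```r
# Parameters as stated in the report
n <- 40
lambda <- 0.2
num_sims <- 1000

set.seed(123)

# One row per simulation, one column per exponential draw
draws <- matrix(rexp(num_sims * n, rate = lambda),
                nrow = num_sims, ncol = n, byrow = TRUE)

# Average across each row to get one sample mean per simulation
row_means <- rowMeans(draws)

length(row_means)
summary(row_means)
```

Because the random draws are consumed in the same order, this layout should reproduce the values from `replicate()` up to floating-point rounding, though that is worth confirming with `all.equal()` rather than assuming.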

Comparing the Sample Mean to the Theoretical Mean

We computed the mean of the 1,000 simulated sample means and compared it with the theoretical mean of the exponential distribution, 1/lambda (1/0.2 = 5 here). The simulated value, 5.011911, differs from 5 by about 0.012, so the distribution of sample means is centered very close to the theoretical mean.

## Mean of simulated means: 5.011911
## Theoretical mean: 5
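
One way to judge whether this gap is small is to compare it with the Monte Carlo standard error of an average of 1,000 sample means. The sketch below is an illustrative check, not part of the original analysis; `mc_se` and `gap_in_se` are names introduced here, and `observed_mean` is copied from the output above.

```r
n <- 40
lambda <- 0.2
num_sims <- 1000

theoretical_mean <- 1 / lambda
# Standard deviation of a single mean of n exponentials
theoretical_sd_of_mean <- (1 / lambda) / sqrt(n)

# Standard error of the average of num_sims such means
mc_se <- theoretical_sd_of_mean / sqrt(num_sims)

# Observed gap, using the value reported in the output above
observed_mean <- 5.011911
gap_in_se <- (observed_mean - theoretical_mean) / mc_se

cat("Monte Carlo standard error:", mc_se, "\n")
cat("Gap in standard errors:", gap_in_se, "\n")
```

The standard error works out to about 0.025, so the observed gap is roughly half a standard error, well within ordinary simulation noise.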

Comparing the Sample Variance to the Theoretical Variance

We computed the variance of the simulated sample means and compared it with the theoretical variance of the mean of 40 exponentials, (1/lambda^2)/n = 25/40 = 0.625. The simulated variance, 0.6004928, is about 4% below this value, so the spread of the simulation is close to what theory predicts.

## Variance of simulated means: 0.6004928
## Theoretical variance: 0.625
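
For reference, the theoretical figure follows from the variance of an average of n independent draws, using the fact that an exponential with rate lambda has variance 1/lambda^2. Written out, with the corresponding standard deviation added for convenience:

```latex
\operatorname{Var}(\bar{X})
  = \frac{\sigma^2}{n}
  = \frac{1/\lambda^2}{n}
  = \frac{1/0.2^2}{40}
  = \frac{25}{40}
  = 0.625,
\qquad
\operatorname{sd}(\bar{X}) = \sqrt{0.625} \approx 0.791
```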

Illustrating the Sample Distribution is Approximately Normal

We plotted a histogram of the simulated sample means and overlaid a normal density curve whose mean and standard deviation are those of the simulated means. The histogram follows the curve closely, showing that the distribution of the mean of 40 exponentials is approximately normal, in agreement with the Central Limit Theorem.
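
The visual impression can be supplemented with a simple numerical check. The sketch below is an assumption-laden illustration rather than part of the original analysis: it standardizes the simulated means with the theoretical mean and standard deviation, then compares a few empirical quantiles with standard normal quantiles. The names `z`, `coverage`, and `quantile_table` are introduced here.

```r
n <- 40
lambda <- 0.2
num_sims <- 1000

set.seed(123)
simulated_means <- replicate(num_sims, mean(rexp(n, lambda)))

theoretical_mean <- 1 / lambda
theoretical_sd <- (1 / lambda) / sqrt(n)

# Standardize using the theoretical mean and standard deviation
z <- (simulated_means - theoretical_mean) / theoretical_sd

# Share of simulated means within 1.96 standard deviations;
# a normal distribution would give about 0.95
coverage <- mean(abs(z) <= 1.96)

# Empirical quantiles of z next to standard normal quantiles
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
quantile_table <- data.frame(
  prob = probs,
  empirical = unname(quantile(z, probs)),
  normal = qnorm(probs)
)

print(coverage)
print(quantile_table)
```

Some mismatch in the tails is expected, since the mean of 40 exponentials is still slightly right-skewed; base R's `qqnorm(z)` followed by `qqline(z)` would show the same comparison graphically.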

Appendix: All Source Code

# set up parameters for simulations
n <- 40
lambda <- 0.2
num_sims <- 1000

# generate simulations with seed for reliable reproduction of results
set.seed(123)
simulated_means <- replicate(num_sims, {
  # draw n exponentials; avoid naming the vector `sample`, which masks base::sample
  draws <- rexp(n, lambda)
  mean(draws)
})

# calculate the mean of simulated means
mean_simulated_means <- mean(simulated_means)

# calculate the theoretical mean
theoretical_mean <- 1 / lambda

# print the results
cat("Mean of simulated means:", mean_simulated_means, "\n")
cat("Theoretical mean:", theoretical_mean, "\n")

# calculate the variance of simulated means
variance_simulated_means <- var(simulated_means)

# calculate the theoretical variance
theoretical_variance <- (1 / lambda^2) / n

cat("Variance of simulated means:", variance_simulated_means, "\n")
cat("Theoretical variance:", theoretical_variance, "\n")

# Load library
library(ggplot2)

# Create a data frame with the simulated_means
simulated_means_df <- data.frame(means = simulated_means)

# Plot histogram with overlaid normal density curve
ggplot(simulated_means_df, aes(x = means)) +
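    # after_stat(density) and linewidth replace the deprecated ..density.. and
    # size (for lines); these assume ggplot2 >= 3.4.0
    geom_histogram(aes(y = after_stat(density)),
                   binwidth = 0.5, fill = "#1F77B4",
                   alpha = 0.8, color = "white",
                   linewidth = 0.2) +
    # normal curve uses the mean and sd of the simulated means
    stat_function(fun = dnorm,
                  args = list(mean = mean_simulated_means,
                              sd = sqrt(variance_simulated_means)),
                  color = "#D62728", linewidth = 1.5, linetype = "solid") +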
    theme_bw() +
    labs(title = "Distribution of Simulated Means of 40 Exponentials",
         subtitle = "Comparison with Normal Density Curve",
         x = "Simulated Means of 40 Exponentials",
         y = "Density") +
    theme(plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
          plot.subtitle = element_text(size = 14, hjust = 0.5),
          axis.title = element_text(size = 12, face = "bold"),
          axis.text = element_text(size = 10),
          panel.grid.major = element_line(color = "grey80"),
          panel.grid.minor = element_line(color = "grey90"),
          panel.border = element_blank(),
          panel.background = element_blank())