Overview:

This report uses R to explore the distribution of the mean of 40 exponential random variables, comparing it with the theoretical values and with the prediction of the Central Limit Theorem. We run 1000 simulations with a rate parameter of lambda = 0.2 for the exponential distribution and analyze the results.

Simulations:

Set lambda and number of simulations

set.seed(2024) # for reproducibility
lambda <- 0.2 # rate parameter
n_simulations <- 1000 # number of simulations

Store sample means and variances

sample_means <- rep(NA, n_simulations)
sample_variances <- rep(NA, n_simulations)

Perform simulations

for (i in 1:n_simulations) {
  exponentials <- rexp(n = 40, rate = lambda)
  sample_means[i] <- mean(exponentials)
  sample_variances[i] <- var(exponentials)
}
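
The loop above can also be written without explicit indexing. The following is a minimal sketch of a vectorized alternative, not the method used for the results in this report; the names `n`, `draws`, `sample_means_vec` and `sample_variances_vec` are introduced here for illustration only. Because the random draws are consumed in the same order, it is expected to give the same values as the loop, though that is worth confirming with `all.equal()`.

```r
set.seed(2024)          # same seed as above, for reproducibility
lambda <- 0.2           # rate parameter
n_simulations <- 1000   # number of simulated samples
n <- 40                 # observations per sample (illustrative name)

# One row per simulation, one column per observation.
# byrow = TRUE fills row 1 with the first 40 draws, row 2 with the next 40, etc.
draws <- matrix(rexp(n_simulations * n, rate = lambda),
                nrow = n_simulations, ncol = n, byrow = TRUE)

sample_means_vec <- rowMeans(draws)           # mean of each simulated sample
sample_variances_vec <- apply(draws, 1, var)  # variance of each simulated sample
```

Storing all draws in a matrix costs more memory than the loop, but it keeps the raw data available for later checks.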

Sample Mean versus Theoretical Mean

The theoretical mean of an exponential distribution with rate \(\lambda = 0.2\) is \(\frac{1}{\lambda} = 5\).
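
For completeness, this value follows from integrating by parts against the exponential density \(f(x) = \lambda e^{-\lambda x}\), \(x \ge 0\):

\[
E[X] = \int_0^\infty x \, \lambda e^{-\lambda x} \, dx
     = \Big[ -x e^{-\lambda x} \Big]_0^\infty + \int_0^\infty e^{-\lambda x} \, dx
     = 0 + \frac{1}{\lambda}
     = \frac{1}{0.2} = 5 .
\]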

Figure 1 shows the distribution of the sample means along with a vertical line indicating the theoretical mean.

Figure 1: Distribution of Sample Means

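library(ggplot2)
# my_colors is not defined in the code shown; this palette is an assumed placeholder
my_colors <- c("skyblue", "lightgreen", "lightcoral")
ggplot(data.frame(sample_means), aes(x = sample_means)) +
  geom_histogram(fill = my_colors[1], color = "black", bins = 30) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_vline(xintercept = 5, color = "blue", linetype = "dashed", linewidth = 1) +
  labs(title = "Distribution of Sample Means",
       x = "Sample Mean", y = "Frequency") +
  theme_minimal()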

Calculate the average of the sample means

mean_sample <- mean(sample_means)

Function to calculate confidence interval

calculate_ci <- function(x) {
  se <- sd(x) / sqrt(length(x))
  mean_val <- mean(x)
  ci_lower <- mean_val - 1.96 * se
  ci_upper <- mean_val + 1.96 * se
  return(c(lower = ci_lower, upper = ci_upper))
}

Confidence interval for sample means

ci_sample_means <- calculate_ci(sample_means)

The average of the sample means is 5.009, with a 95% confidence interval of (4.96, 5.058). The interval contains the theoretical mean of 5, so the simulated samples are consistent with the expected mean.
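
The interval above uses the normal critical value 1.96. As a rough cross-check, the sketch below compares it with the t-based interval from base R's `t.test()`. It re-simulates the sample means with `replicate()` so that it runs on its own; the names `ci_normal`, `ci_t` and `covers` are illustrative and not part of the original analysis.

```r
set.seed(2024)
lambda <- 0.2
sample_means <- replicate(1000, mean(rexp(n = 40, rate = lambda)))

# Normal-approximation interval, same formula as calculate_ci()
se <- sd(sample_means) / sqrt(length(sample_means))
ci_normal <- mean(sample_means) + c(-1, 1) * qnorm(0.975) * se

# t-based interval from base R
ci_t <- as.numeric(t.test(sample_means, conf.level = 0.95)$conf.int)

# Does each interval contain the theoretical mean 1 / lambda?
theoretical_mean <- 1 / lambda
covers <- c(
  normal = ci_normal[1] <= theoretical_mean && theoretical_mean <= ci_normal[2],
  t      = ci_t[1] <= theoretical_mean && theoretical_mean <= ci_t[2]
)

rbind(normal = ci_normal, t = ci_t)
covers
```

With 1000 means the t and normal critical values are nearly identical, so the two intervals should differ only in the third or fourth decimal place.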

Sample Variance versus Theoretical Variance

The theoretical variance of an exponential distribution with \(\lambda = 0.2\) is \(\frac{1}{\lambda^2} = 25\).
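
The theoretical mean and variance quoted in this report can also be checked numerically. The sketch below integrates against the exponential density with base R's `integrate()` and `dexp()`; the variable names are illustrative, and the results are numerical approximations rather than exact values.

```r
lambda <- 0.2

# E[X]: integrate x * f(x) over the support [0, Inf)
theo_mean <- integrate(function(x) x * dexp(x, rate = lambda),
                       lower = 0, upper = Inf)$value

# E[X^2], then Var(X) = E[X^2] - (E[X])^2
second_moment <- integrate(function(x) x^2 * dexp(x, rate = lambda),
                           lower = 0, upper = Inf)$value
theo_variance <- second_moment - theo_mean^2

c(mean = theo_mean, variance = theo_variance)
```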

Figure 2 compares the distribution of the sample variances with the theoretical value.

Figure 2: Distribution of Sample Variances

ggplot(data.frame(sample_variances), aes(x = sample_variances)) +
  geom_histogram(fill = my_colors[2], color = "black", bins = 30) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_vline(xintercept = 25, color = "red", linetype = "dashed", linewidth = 1) +
  labs(title = "Distribution of Sample Variances",
       x = "Sample Variance", y = "Frequency") +
  theme_minimal()

mean_variance <- mean(sample_variances)

The average of the sample variances is 24.961, close to the theoretical value of 25. The simulated samples therefore also reflect the expected variance.
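
A related quantity is the variance of the sample means themselves, which for independent samples of size n is \(\sigma^2 / n = 25 / 40 = 0.625\). The sketch below compares that value with the simulated one. It is an additional check, not part of the analysis above, and the names `theoretical_var_of_mean` and `observed_var_of_mean` are illustrative.

```r
set.seed(2024)
lambda <- 0.2
n <- 40   # observations per sample
sample_means <- replicate(1000, mean(rexp(n, rate = lambda)))

# Variance of the mean of n independent draws: sigma^2 / n
theoretical_var_of_mean <- (1 / lambda^2) / n
observed_var_of_mean <- var(sample_means)

c(theoretical = theoretical_var_of_mean, observed = observed_var_of_mean)
```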

Distribution of Sample Means

Figure 3 shows the distribution of the sample means, which is visually close to a normal distribution.

Figure 3: Distribution of Sample Means (with Density)

ggplot(data.frame(sample_means), aes(x = sample_means)) +
  # after_stat(density) replaces the deprecated ..density.. notation
  geom_histogram(aes(y = after_stat(density), fill = "Histogram"),
                 bins = 30, color = "black") +
  stat_function(fun = dnorm, aes(color = "Normal Distribution Curve"),
                args = list(mean = mean(sample_means), sd = sd(sample_means))) +
  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_density(aes(color = "Density Curve"), linewidth = 1) +
  labs(title = "Distribution of Sample Means (with Density and Normal Distribution)",
       x = "Sample Mean", y = "Density") +
  scale_color_manual(values = c("Density Curve" = "blue",
                                "Normal Distribution Curve" = "red")) +
  scale_fill_manual(values = c("Histogram" = my_colors[3])) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.3), legend.position = "bottom") +
  guides(fill = guide_legend(title = "Distribution: "),
         color = guide_legend(title = "Curves: "))

The distribution of the sample means in Figure 3 is consistent with the Central Limit Theorem: although the underlying exponential distribution is strongly right-skewed, the means of samples of size 40 are approximately normal. The kernel density estimate (blue) stays close to the fitted normal curve (red), which emphasizes this alignment. Because only one sample size is simulated here, the figure shows the quality of the approximation at n = 40 rather than how it improves as the sample size grows.
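
To illustrate how the approximation depends on sample size, the sketch below repeats the simulation for several sample sizes and reports the skewness of the resulting means. The sample sizes and the helper `skewness()` are assumptions chosen for illustration and are not part of the original analysis. The theoretical column uses the fact that the mean of n independent exponentials is a scaled Gamma variable with skewness \(2 / \sqrt{n}\).

```r
set.seed(2024)
lambda <- 0.2
n_simulations <- 1000
sample_sizes <- c(5, 10, 40, 160)   # illustrative choices

# Moment-based sample skewness; zero for a perfectly symmetric distribution
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

# Skewness of the simulated sample means for each sample size
observed_skew <- sapply(sample_sizes, function(n) {
  means <- replicate(n_simulations, mean(rexp(n, rate = lambda)))
  skewness(means)
})

data.frame(n = sample_sizes,
           observed_skew = round(observed_skew, 3),
           theoretical_skew = round(2 / sqrt(sample_sizes), 3))
```

Skewness shrinking toward zero as n grows is one simple numerical sign of the convergence that the Central Limit Theorem describes.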

Figure 4: QQ Plot of Sample Means

qqnorm(sample_means)
qqline(sample_means, col = "red")

In the QQ plot above, the points closely follow the red line, indicating that the sample means are approximately normally distributed, consistent with the Central Limit Theorem.
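
The visual impression from the QQ plot can be supplemented with a small numerical comparison. The sketch below standardizes the sample means and sets a few empirical quantiles next to the corresponding standard normal quantiles. The chosen probabilities and the name `quantile_check` are illustrative, and the means are re-simulated so that the block runs on its own.

```r
set.seed(2024)
lambda <- 0.2
sample_means <- replicate(1000, mean(rexp(n = 40, rate = lambda)))

# Standardize the means so they can be compared with N(0, 1) quantiles
z <- (sample_means - mean(sample_means)) / sd(sample_means)

probs <- c(0.025, 0.25, 0.5, 0.75, 0.975)
quantile_check <- data.frame(
  prob = probs,
  empirical = as.numeric(quantile(z, probs)),
  normal = qnorm(probs)
)
quantile_check
```

Small differences in the outer quantiles would be expected, since the mean of 40 exponentials is still mildly right-skewed.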

Conclusion

This simulation exercise confirms theoretical expectations about the exponential distribution and shows the Central Limit Theorem in action. The average of the sample means (approximately 5.009) and the average of the sample variances (approximately 24.961) closely match the theoretical values of 5 and 25. Furthermore, the distribution of sample means is visually close to a normal distribution, as the Central Limit Theorem predicts. This analysis serves as a foundation for further exploration of the Central Limit Theorem and its implications for statistical inference.