This report explores the distribution of the mean of 40 exponential random variables in R, comparing the simulated results with the theoretical values and with the Central Limit Theorem. We run 1000 simulations with rate parameter \(\lambda = 0.2\) and analyze the results.
set.seed(2024) # for reproducibility
lambda <- 0.2 # rate parameter
n_simulations <- 1000 # number of simulations
sample_means <- numeric(n_simulations)
sample_variances <- numeric(n_simulations)
for (i in seq_len(n_simulations)) {
  # draw one sample of 40 exponentials and record its mean and variance
  exponentials <- rexp(n = 40, rate = lambda)
  sample_means[i] <- mean(exponentials)
  sample_variances[i] <- var(exponentials)
}
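The loop above can also be written without explicit iteration. The following is a minimal sketch under the same settings; the names `n_per_sample`, `draws`, `sample_means_vec` and `sample_variances_vec` are introduced here for illustration only. It draws all values at once, arranges them with one simulated sample per row, and summarises each row. The individual values need not match the loop draw for draw, because the matrix is filled column by column.

```r
set.seed(2024)
lambda <- 0.2
n_simulations <- 1000
n_per_sample <- 40  # illustrative name for the sample size used in the loop

# One row per simulated sample, one column per draw
draws <- matrix(rexp(n_simulations * n_per_sample, rate = lambda),
                nrow = n_simulations, ncol = n_per_sample)

sample_means_vec <- rowMeans(draws)           # mean of each row
sample_variances_vec <- apply(draws, 1, var)  # sample variance of each row
```

This form trades a little memory (a 1000 x 40 matrix) for shorter code; for a simulation of this size either approach runs almost instantly.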
The theoretical mean of an exponential distribution with rate \(\lambda = 0.2\) is \(\frac{1}{\lambda} = 5\).
Figure 1 shows the distribution of the sample means along with a vertical line indicating the theoretical mean.
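library(ggplot2)
# my_colors is not defined in this section; the fallback palette below is an assumed placeholder
if (!exists("my_colors")) my_colors <- c("steelblue", "darkorange", "seagreen")
ggplot(data.frame(sample_means), aes(x = sample_means)) +
  geom_histogram(fill = my_colors[1], color = "black", bins = 30) +
  # dashed line at the theoretical mean 1 / lambda = 5
  geom_vline(xintercept = 1 / lambda, color = "blue", linetype = "dashed", linewidth = 1) +
  labs(title = "Distribution of Sample Means",
       x = "Sample Mean", y = "Frequency") +
  theme_minimal()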
# Calculation of the average sample mean
mean_sample <- mean(sample_means)
# Function to calculate a 95% confidence interval (normal approximation)
calculate_ci <- function(x) {
  se <- sd(x) / sqrt(length(x))
  mean_val <- mean(x)
  ci_lower <- mean_val - 1.96 * se
  ci_upper <- mean_val + 1.96 * se
  return(c(lower = ci_lower, upper = ci_upper))
}
# Confidence interval for the sample means
ci_sample_means <- calculate_ci(sample_means)
The average of the sample means is 5.009, with a 95% confidence interval of (4.96, 5.058). The interval contains the theoretical mean of 5, so the simulated samples are consistent with the expected mean.
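The constant 1.96 used in `calculate_ci` is the 0.975 quantile of the standard normal distribution, rounded to two decimals. A minimal sketch of the same normal-approximation interval with a configurable confidence level is shown below; the function name `calculate_ci_level` and its `level` argument are illustrative additions, not part of the original analysis.

```r
# Normal-approximation confidence interval for the mean of x.
# `level` is the desired coverage, e.g. 0.95 for a 95% interval.
calculate_ci_level <- function(x, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # two-sided critical value
  se <- sd(x) / sqrt(length(x))     # standard error of the mean
  mean(x) + c(lower = -1, upper = 1) * z * se
}

# Example on a small made-up vector
calculate_ci_level(c(4.2, 5.1, 5.6, 4.9, 5.3), level = 0.90)
```

With `level = 0.95` this should reproduce `calculate_ci` up to the rounding of the critical value.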
The theoretical variance of an exponential distribution with \(\lambda = 0.2\) is \(\frac{1}{\lambda^2} = 25\).
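The theoretical mean and variance quoted above can be checked numerically from the density itself. The sketch below uses base R's `integrate()` and `dexp()`; the variable names are illustrative, and the results should agree with 5 and 25 only up to numerical integration error.

```r
lambda <- 0.2

# E[X]: integral of x * f(x) over [0, Inf)
theoretical_mean <- integrate(function(x) x * dexp(x, rate = lambda),
                              lower = 0, upper = Inf)$value

# E[X^2], then Var(X) = E[X^2] - E[X]^2
second_moment <- integrate(function(x) x^2 * dexp(x, rate = lambda),
                           lower = 0, upper = Inf)$value
theoretical_variance <- second_moment - theoretical_mean^2

c(mean = theoretical_mean, variance = theoretical_variance)
```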
Figure 2 compares the distribution of the sample variances with the theoretical value.
ggplot(data.frame(sample_variances), aes(x = sample_variances)) +
  geom_histogram(fill = my_colors[2], color = "black", bins = 30) +
  # dashed line at the theoretical variance 1 / lambda^2 = 25
  geom_vline(xintercept = 25, color = "red", linetype = "dashed", linewidth = 1) +
  labs(title = "Distribution of Sample Variances",
       x = "Sample Variance", y = "Frequency") +
  theme_minimal()
mean_variance <- mean(sample_variances)
The average of the sample variances is 24.961, again closely matching the theoretical value. This demonstrates that the simulated samples also reflect the expected variance.
Figure 3 illustrates the distribution of the sample means, which visually approaches a normal distribution.
ggplot(data.frame(sample_means), aes(x = sample_means)) +
  geom_histogram(aes(y = after_stat(density), fill = "Histogram"), bins = 30, color = "black") +
  # normal curve with the same mean and standard deviation as the simulated means
  stat_function(fun = dnorm, aes(color = "Normal Distribution Curve"),
                args = list(mean = mean(sample_means), sd = sd(sample_means))) +
  geom_density(aes(color = "Density Curve"), linewidth = 1) +
  labs(title = "Distribution of Sample Means (with Density and Normal Distribution)",
       x = "Sample Mean", y = "Density") +
  scale_color_manual(values = c("Density Curve" = "blue", "Normal Distribution Curve" = "red")) +
  scale_fill_manual(values = c("Histogram" = my_colors[3])) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.3), legend.position = "bottom") +
  guides(fill = guide_legend(title = "Distribution: "), color = guide_legend(title = "Curves: "))
The distribution of the sample means, shown in Figure 3, is consistent with the Central Limit Theorem, which predicts that the mean of a sufficiently large sample is approximately normally distributed even when the underlying data (here, exponential) are not. The empirical density curve lies close to the normal curve with the same mean and standard deviation. With a sample size of 40, the simulation therefore agrees with what the theorem leads us to expect for the distribution of sample means.
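The normal approximation also has a specific spread: the mean of \(n\) independent draws has standard deviation \(\sigma / \sqrt{n}\), which here is \((1/\lambda)/\sqrt{40} \approx 0.79\). The sketch below compares that value with the standard deviation of simulated means. It re-runs the simulation with `replicate()` so that it is self-contained; the names `n_per_sample`, `sim_means`, `theoretical_se` and `empirical_sd` are illustrative.

```r
set.seed(2024)
lambda <- 0.2
n_per_sample <- 40

# 1000 simulated means of 40 exponentials each
sim_means <- replicate(1000, mean(rexp(n_per_sample, rate = lambda)))

theoretical_se <- (1 / lambda) / sqrt(n_per_sample)  # sigma / sqrt(n)
empirical_sd <- sd(sim_means)

c(theoretical = theoretical_se, empirical = empirical_sd)
```

If the two values are close, the histogram of sample means should sit near a normal curve centred at 5 with this standard deviation.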
qqnorm(sample_means)
qqline(sample_means, col = "red")
In the QQ plot above, the points closely follow the red line, indicating
that the sample means are approximately normally distributed, consistent
with the Central Limit Theorem.
This simulation exercise confirms theoretical expectations about the exponential distribution and shows the Central Limit Theorem in action. The average of the sample means is approximately 5.009 and the average of the sample variances is approximately 24.961, close to the theoretical values of 5 and 25. Furthermore, the distribution of sample means is visually close to a normal distribution, as the Central Limit Theorem predicts. This analysis serves as a foundation for further exploration of the Central Limit Theorem and its implications for statistical inference.