library(plotly)
library(ggplot2)
library(reshape2)
2025-03-11
The Central Limit Theorem (CLT) is a cornerstone of probability and statistics, and it explains why the normal distribution is so pervasive:
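When independent, identically distributed random variables \(X_1, \dots, X_n\) with mean \(\mu\) and finite variance \(\sigma^2\) are added, their standardized sum converges in distribution to a standard normal as \(n\) grows: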
\[ Z_n = \frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt{n}} \]
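The standardized sum can be checked by simulation. The sketch below is a minimal illustration, assuming an Exponential(1) population (so \(\mu = 1\) and \(\sigma = 1\)); the helper name `standardized_sum` and the choice of population are hypothetical and not part of the original slides.

```r
# Simulate Z_n = (sum(X_i) - n * mu) / (sigma * sqrt(n)) for an assumed
# Exponential(rate = 1) population, where mu = 1 and sigma = 1.
set.seed(42)

standardized_sum <- function(n, mu, sigma, rdist) {
  x <- rdist(n)                              # draw n independent values
  (sum(x) - n * mu) / (sigma * sqrt(n))      # standardize the sum
}

z <- replicate(10000, standardized_sum(n = 50, mu = 1, sigma = 1, rdist = rexp))

# Z_n has mean 0 and standard deviation 1 by construction.
mean(z)
sd(z)

# The CLT adds that the shape is approximately normal:
# about 95% of the values should fall within +/- 1.96.
coverage <- mean(abs(z) <= 1.96)
coverage
```

Swapping `rexp` for another generator (with the matching `mu` and `sigma`) gives a quick way to try other populations.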
The CLT is widely applied across many fields.
The CLT is robust across various initial distributions. The next slide demonstrates this visually through an interactive 3D plot.
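The code for the 3D plot is not shown here, so the following is only one possible sketch of such a figure. It assumes an Exponential(1) population, estimates the density of the standardized sample mean for several sample sizes, and draws the result as a plotly surface. The names `standardized_means` and `density_surface`, the sample sizes, and the grid are all illustrative assumptions.

```r
set.seed(99)
sample_sizes <- c(1, 2, 5, 10, 30, 100)    # illustrative choice
grid <- seq(-4, 4, length.out = 81)        # evaluation points for the density

# Standardized means of Exponential(1) samples (mu = 1, sigma = 1)
standardized_means <- function(size, reps = 5000) {
  (replicate(reps, mean(rexp(size))) - 1) * sqrt(size)
}

# Kernel density estimate on the grid: rows = sample sizes, columns = grid points
density_surface <- t(sapply(sample_sizes, function(size) {
  density(standardized_means(size), from = -4, to = 4, n = length(grid))$y
}))

# Build the interactive surface only if plotly is installed
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(x = grid, y = seq_along(sample_sizes),
                         z = density_surface, type = "surface")
  fig <- plotly::layout(fig, scene = list(
    xaxis = list(title = "Standardized sample mean"),
    yaxis = list(title = "Sample size",
                 tickvals = seq_along(sample_sizes),
                 ticktext = as.character(sample_sizes)),
    zaxis = list(title = "Density")
  ))
  # Print `fig` in an interactive session to view the plot
}
```

Under these assumptions, the slices for small sample sizes stay visibly skewed, while the slices for larger sample sizes move toward the symmetric bell shape.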
The next slide shows histograms demonstrating how sample means from uniformly distributed samples approximate a normal distribution as sample sizes grow.
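A minimal sketch of how such histograms could be produced is given below. The sample sizes, the number of replications, and the names `means_by_size` and `means_long` are assumptions for illustration; the columns `variable` and `value` simply mirror the default column names of `reshape2::melt()`.

```r
set.seed(123)
sample_sizes <- c(2, 5, 30, 100)   # illustrative choice
n_reps <- 1000                     # sample means per sample size

# One column of sample means per sample size
means_by_size <- sapply(sample_sizes, function(size) {
  replicate(n_reps, mean(runif(size)))
})
colnames(means_by_size) <- paste0("n = ", sample_sizes)

# Long format: one row per (sample size, sample mean) pair
means_long <- data.frame(
  variable = factor(rep(colnames(means_by_size), each = n_reps),
                    levels = colnames(means_by_size)),
  value = as.vector(means_by_size)
)

# Draw one histogram panel per sample size if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(means_long, aes(x = value)) +
    geom_histogram(bins = 30, fill = "lightblue", color = "darkblue") +
    facet_wrap(~ variable) +
    labs(title = "Sample Means from Uniform(0, 1)",
         x = "Sample Mean", y = "Count") +
    theme_minimal()
  print(p)
}
```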
The following slide includes a boxplot visualizing how variability decreases with larger sample sizes, highlighting the precision benefit of the CLT.
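# means_df is a long-format data frame (as produced by reshape2::melt()):
# `variable` holds the sample size and `value` the sample mean.
ggplot(means_df, aes(x = variable, y = value)) +
  geom_boxplot(fill = "lightblue", color = "darkblue") +
  labs(title = "Variability of Sample Means Across Sample Sizes",
       x = "Sample Size", y = "Sample Mean") +
  theme_minimal()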
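The shrinking spread follows the standard error formula: the standard deviation of a sample mean is \(\sigma / \sqrt{n}\). The sketch below compares that value with simulated results for a Uniform(0, 1) population, where \(\sigma = \sqrt{1/12}\). The sample sizes and the name `spread` are illustrative assumptions.

```r
set.seed(2025)
sigma_unif <- sqrt(1 / 12)             # standard deviation of Uniform(0, 1)
sample_sizes <- c(5, 30, 100, 500)     # illustrative sizes

# Standard deviation of 5000 simulated sample means, per sample size
empirical_sd <- sapply(sample_sizes, function(size) {
  sd(replicate(5000, mean(runif(size))))
})

# Standard error predicted by sigma / sqrt(n)
theoretical_sd <- sigma_unif / sqrt(sample_sizes)

spread <- data.frame(
  sample_size = sample_sizes,
  empirical_sd = empirical_sd,
  theoretical_sd = theoretical_sd
)
print(spread)
```

Quadrupling the sample size halves the standard error, which is why the boxes narrow more slowly as the sample size grows.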
Although the CLT applies to any population with finite variance, the next slide illustrates how highly skewed populations require larger samples before the distribution of the sample mean looks approximately normal.
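One way to quantify this is the skewness of the sample mean, which for \(n\) independent draws equals the population skewness divided by \(\sqrt{n}\). The sketch below compares a symmetric Uniform(0, 1) population with a skewed Exponential(1) population (skewness 2). The choice of populations, the sample sizes, and the helper `skewness` are assumptions for illustration.

```r
set.seed(7)

# Moment-based sample skewness
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

sample_sizes <- c(5, 30, 200)   # illustrative sizes

# Skewness of 5000 simulated sample means, per population and sample size
skew_by_size <- sapply(sample_sizes, function(size) {
  c(uniform     = skewness(replicate(5000, mean(runif(size)))),
    exponential = skewness(replicate(5000, mean(rexp(size)))))
})
colnames(skew_by_size) <- paste0("n=", sample_sizes)

round(skew_by_size, 2)
```

The uniform row should stay near zero at every sample size, while the exponential row should start clearly positive and shrink toward zero as the sample size increases.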
# Generate n sample means, each computed from `size` draws from Uniform(0, 1)
generate_means_uniform <- function(n, size) {
  replicate(n, mean(runif(size, min = 0, max = 1)))
}
# Example: 1000 sample means, each from a sample of size 30
sample_means <- generate_means_uniform(1000, 30)
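As a rough check of the result, the quantiles of the simulated means can be compared with those of the normal distribution the CLT predicts, with mean 0.5 and standard deviation \(\sqrt{1/12} / \sqrt{30}\). The sketch below repeats the function so that it runs on its own; the seed, the chosen probabilities, and the name `comparison` are assumptions.

```r
# Same function as above, repeated so this block is self-contained
generate_means_uniform <- function(n, size) {
  replicate(n, mean(runif(size, min = 0, max = 1)))
}

set.seed(1)
sample_means <- generate_means_uniform(1000, 30)

mu <- 0.5                          # mean of Uniform(0, 1)
se <- sqrt(1 / 12) / sqrt(30)      # predicted standard error for size 30

# Empirical quantiles next to the quantiles of the predicted normal
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
comparison <- data.frame(
  prob = probs,
  empirical = unname(quantile(sample_means, probs)),
  normal = qnorm(probs, mean = mu, sd = se)
)
print(comparison)

# Visual check: points close to the line suggest approximate normality
qqnorm(sample_means)
qqline(sample_means)
```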
## Thank You!