Central Limit Theorem (CLT)

2025-03-11

Libraries Used

library(plotly)
library(ggplot2)
library(reshape2)

Introduction to the Central Limit Theorem

The Central Limit Theorem (CLT) is a cornerstone of probability and statistics, offering insights into why the normal distribution is pervasive:

Formal Definition:

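Let \(X_1, \dots, X_n\) be independent and identically distributed random variables with mean \(\mu\) and finite variance \(\sigma^2 > 0\). Their normalized sum

\[ Z_n = \frac{\sum_{i=1}^n X_i - n\mu}{\sigma \sqrt{n}} \]

converges in distribution to the standard normal distribution \(N(0, 1)\) as \(n \to \infty\).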
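The normalized sum can be checked by simulation. The following is a minimal sketch, not part of the original slides: the helper `standardized_sum`, the choice of an Exponential(1) population (for which \(\mu = 1\) and \(\sigma = 1\)), and the seed are all assumptions made for illustration.

```r
# Sketch: standardize sums of i.i.d. Exponential(1) draws (mu = 1, sigma = 1).
set.seed(42)  # arbitrary seed, for reproducibility only

# Hypothetical helper: draws n values with rdist and returns Z_n
standardized_sum <- function(n, mu, sigma, rdist) {
  x <- rdist(n)  # n independent draws X_1, ..., X_n
  (sum(x) - n * mu) / (sigma * sqrt(n))
}

# 5000 replicates of Z_n for n = 50
z <- replicate(5000, standardized_sum(50, mu = 1, sigma = 1, rdist = rexp))

# If the normal approximation is good, these should be near 0 and 1
c(mean = mean(z), sd = sd(z))

# Share of values within +/- 1.96; close to 0.95 under N(0, 1)
mean(abs(z) <= 1.96)
```

Swapping `rexp` for another generator (with the matching `mu` and `sigma`) is a quick way to see that the limit does not depend on the shape of the population.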

Significance:
- Experimental results influenced by many small, independent random disturbances are often approximately normal, because the combined disturbance behaves like a sum of many random terms.

Applications of the Central Limit Theorem

The CLT is widely applied across various fields:

Economics and Finance: Risk assessment and portfolio management
Quality Control: Evaluating manufacturing consistency
Survey Analysis: Estimating population parameters from sample data

Visualization of CLT (3D Interactive) - Explanation

The CLT holds for a wide range of initial distributions, provided their variance is finite. The next slide demonstrates this visually through an interactive 3D plot.

X-axis: Increasing sample sizes
Y-axis: Different initial distributions
Z-axis: Sample means density
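As a numeric complement to the plot, the sketch below measures how far standardized sample means are from \(N(0, 1)\) for several populations and sample sizes, using the Kolmogorov-Smirnov distance. It is not the code behind the 3D plot; the three populations, the sample sizes, and the helper `ks_to_normal` are assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

sample_sizes <- c(1, 5, 30, 100)
n_reps <- 2000  # sample means simulated per cell

# Hypothetical choice of initial distributions (all with finite variance)
draw <- list(
  uniform     = function(k) runif(k),
  exponential = function(k) rexp(k),
  beta_u      = function(k) rbeta(k, 0.5, 0.5)  # U-shaped population
)

# Kolmogorov-Smirnov distance between standardized sample means and N(0, 1)
ks_to_normal <- function(rdist, k) {
  means <- replicate(n_reps, mean(rdist(k)))
  z <- (means - mean(means)) / sd(means)  # standardize empirically
  unname(ks.test(z, "pnorm")$statistic)
}

# Rows: populations, columns: sample sizes
ks_table <- sapply(sample_sizes, function(k) {
  sapply(draw, ks_to_normal, k = k)
})
colnames(ks_table) <- paste0("n=", sample_sizes)
round(ks_table, 3)
```

Smaller values mean a closer match to the normal distribution, so each row should shrink from left to right.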

Visualization of CLT (3D Interactive) - Plot

Distribution of Sample Means (Uniform) - Explanation

The next slide shows histograms demonstrating how sample means from uniformly distributed samples approximate a normal distribution as sample sizes grow.
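Histograms of this kind could be produced along the following lines. This is a sketch under stated assumptions rather than the original slide code: the sample sizes, the 1000 replicates, and the name `means_wide` are hypothetical, and the long-format result is named `means_df` with columns `variable` and `value` only because those names appear in the boxplot code later in the deck.

```r
library(ggplot2)
library(reshape2)

set.seed(123)  # arbitrary seed, for reproducibility only

sample_sizes <- c(2, 5, 30, 100)  # hypothetical sample sizes

# One column of 1000 simulated sample means per sample size
means_wide <- as.data.frame(
  sapply(sample_sizes, function(k) replicate(1000, mean(runif(k))))
)
names(means_wide) <- paste0("n = ", sample_sizes)

# Long format: `variable` holds the sample size label, `value` the sample mean
means_df <- melt(means_wide, measure.vars = names(means_wide))

p <- ggplot(means_df, aes(x = value)) +
  geom_histogram(bins = 30, fill = "lightblue", color = "darkblue") +
  facet_wrap(~ variable, scales = "free_y") +
  labs(title = "Sample Means from Uniform(0, 1)",
       x = "Sample Mean", y = "Count") +
  theme_minimal()
p
```

Each panel is centered near 0.5, and the panels for larger sample sizes are narrower and more bell-shaped.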

Distribution of Sample Means (Uniform) - Plot


Variability of Sample Means - Explanation

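The following slide includes a boxplot showing how the spread of the sample means shrinks as the sample size grows. The standard deviation of the sample mean is \(\sigma / \sqrt{n}\), so larger samples give more precise estimates of the population mean.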

Variability of Sample Means - Plot

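# means_df is assumed to be in long format: `variable` labels the sample size
# and `value` holds one simulated sample mean per row (e.g. output of melt()).
ggplot(means_df, aes(x=variable, y=value)) +
  geom_boxplot(fill="lightblue", color="darkblue") +
  labs(title="Variability of Sample Means Across Sample Sizes",
       x="Sample Size", y="Sample Mean") +
  theme_minimal()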
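The shrinking spread can also be checked numerically against the theoretical standard error \(\sigma / \sqrt{n}\). The sketch below assumes a Uniform(0, 1) population, for which \(\sigma = \sqrt{1/12}\); the sample sizes, the number of replicates, and the name `se_table` are illustrative choices, not taken from the slides.

```r
set.seed(2025)  # arbitrary seed, for reproducibility only

sample_sizes <- c(5, 30, 100, 500)  # hypothetical sample sizes
sigma <- sqrt(1 / 12)               # standard deviation of Uniform(0, 1)

# Empirical standard deviation of 5000 simulated sample means per size
empirical_sd <- sapply(sample_sizes, function(k) {
  sd(replicate(5000, mean(runif(k))))
})

se_table <- data.frame(
  sample_size    = sample_sizes,
  empirical_sd   = round(empirical_sd, 4),
  theoretical_se = round(sigma / sqrt(sample_sizes), 4)
)
se_table
```

The two right-hand columns should agree closely, and both fall by roughly half each time the sample size quadruples.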

Impact of Population Skewness - Explanation

Although the CLT applies to any population with finite variance, the next slide illustrates how highly skewed populations require larger samples before the sample means look approximately normal.
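One way to quantify this is to track the skewness of the sample means as the sample size grows. The sketch below assumes an Exponential(1) population, whose skewness is 2, so the mean of \(n\) draws has skewness \(2 / \sqrt{n}\). The helper `sample_skewness` and the chosen sample sizes are hypothetical and not part of the original slides.

```r
set.seed(7)  # arbitrary seed, for reproducibility only

# Moment-based sample skewness (hypothetical helper)
sample_skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

sample_sizes <- c(2, 10, 30, 100, 500)

# Skewness of 5000 simulated sample means from Exponential(1), per sample size
skew_of_means <- sapply(sample_sizes, function(k) {
  sample_skewness(replicate(5000, mean(rexp(k))))
})

data.frame(
  sample_size = sample_sizes,
  simulated   = round(skew_of_means, 3),
  theoretical = round(2 / sqrt(sample_sizes), 3)
)
```

The skewness decays only at rate \(1 / \sqrt{n}\), which is why a strongly skewed population can still produce visibly asymmetric sample means at moderate sample sizes.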

Impact of Population Skewness - Plot

Key R Code Highlights

Generating Sample Means

# Function to generate sample means from a uniform distribution
#   n:    number of sample means to simulate
#   size: number of observations in each sample
generate_means_uniform <- function(n, size) {
  replicate(n, mean(runif(size, min=0, max=1)))
}

# Example: 1000 sample means, each computed from a sample of size 30
sample_means <- generate_means_uniform(1000, 30)
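The simulated means can be compared with the normal distribution the CLT predicts. For Uniform(0, 1) samples of size 30, that is a normal distribution with mean 0.5 and standard deviation \(\sqrt{1/(12 \cdot 30)}\). This is a minimal sketch: the function is repeated so the block runs on its own, and the seed, the chosen quantiles, and the name `quantile_check` are assumptions.

```r
set.seed(30)  # arbitrary seed, for reproducibility only

generate_means_uniform <- function(n, size) {
  replicate(n, mean(runif(size, min = 0, max = 1)))
}
sample_means <- generate_means_uniform(1000, 30)

mu <- 0.5                # mean of Uniform(0, 1)
se <- sqrt(1 / 12 / 30)  # sd of Uniform(0, 1) divided by sqrt(30)

# Compare empirical quantiles with those of the approximating normal
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
quantile_check <- rbind(
  empirical = quantile(sample_means, probs),
  normal    = qnorm(probs, mean = mu, sd = se)
)
round(quantile_check, 3)
```

Close agreement between the two rows suggests that a sample size of 30 is already large enough for a uniform population.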


## Thank You!