2025-03-23

Introduction

The Central Limit Theorem states that the sample mean \(\overline{\text{X}}_n\) of many independent, identically distributed random variables tends to follow a normal distribution, even if the original data is not normally distributed.

As the sample size \(n\) increases, the distribution of the sample mean looks more and more like a bell curve.

The standard deviation of the sample mean (the standard error) is \(\frac{\sigma}{\sqrt{n}}\), where \({\sigma}\) is the population standard deviation, so it shrinks as \(n\) grows.

The sample mean approaches the population mean \(\mu\) (the law of large numbers): \[\lim_{n \to \infty} {\overline{\text{X}}}_n = \mu\]

In the next slides, we will set up a simple experiment to see this in action.

source: https://en.wikipedia.org/wiki/Central_limit_theorem

Example: Coin Toss Experiment

A coin toss is the most basic example. If the coin is fair, each toss has a 50% probability of landing heads and a 50% probability of landing tails.

Let us set up 10000 trials of 1 coin flip and graph the density of heads being the outcome for each trial.
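
A minimal sketch of that single-toss experiment, assuming base R only. The names `trials_n1` and `single_toss` and the fixed seed are illustrative choices, not part of the original slides.

```r
set.seed(42)                       # assumption: fixed seed so the run is reproducible

trials_n1 <- 10000                 # number of trials, each consisting of one toss
single_toss <- rbinom(trials_n1, size = 1, prob = 0.5)  # 1 = heads, 0 = tails

# With n = 1 the "sample mean" of a trial is just the toss itself,
# so only the values 0 and 1 can occur.
proportions <- table(single_toss) / trials_n1
print(proportions)

# Bar chart of the two outcomes; both bars should sit near 0.5
barplot(proportions, names.arg = c("Tails (0)", "Heads (1)"),
        ylab = "Proportion of trials", main = "One toss per trial (n = 1)")
```

Because every trial yields exactly 0 or 1, the chart has two bars and nothing in between, which is why no bell shape can appear at n = 1.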

Example: Coin Toss (Continued)

As we can see, each trial is either heads or not heads. With just n = 1 toss, the graph does not look like a bell curve. In keeping with the theorem, if we increase n substantially, say to 1000, the histogram of sample means should look much more like a bell curve.

R Code for Samples and Trials

Here is my code to set up the coin tosses and the data frame of sample means.

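n <- 1000        # tosses per trial
trials <- 10000  # number of trials

# One row per trial, one column per toss (1 = heads, 0 = tails)
coin_tosses <- matrix(rbinom(n * trials, size = 1, prob = 0.5),
                      nrow = trials)
sample_means <- rowMeans(coin_tosses)  # proportion of heads in each trial
df <- data.frame(sample_mean = sample_means)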

R Code Plot

Here is my code for the plot.

library(ggplot2)  # provides ggplot(), geom_histogram(), stat_function()

ggplot(df, aes(x = sample_mean)) +
  geom_histogram(aes(y = after_stat(density)), 
                 bins = 30, fill = "orange", color = "black") +
  stat_function(fun = dnorm, 
                args = list(mean = mean(df$sample_mean), 
                            sd = sd(df$sample_mean)), 
                color = "darkred") +
  labs(
    title = paste0("Central Limit Theorem Demo (n = ", n, ")"),
    x = "Sample Mean (Of Heads)",
    y = "Density"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold")
  )

3D Plot Stacking Different Trials

Let us continue to observe how the histogram becomes more like the normal distribution as n grows larger. Imagine the histograms we created earlier, stacked along a third dimension of n values from 1 to 1000. Note how the density concentrates around the expected sample mean of 0.5 as n increases.

Code for setting up Data on 3D Graph

n_values <- seq(1, 1000)             # sample sizes to compare
trials <- 1000                       # trials per sample size
bins <- seq(0, 1, length.out = 50)   # 50 break points give 49 bins
bin_centers <- (bins[-1] + bins[-length(bins)]) / 2

density_matrix <- matrix(0, nrow = length(n_values), 
                         ncol = length(bin_centers))

for (i in seq_along(n_values)) {
  n <- n_values[i]
  tosses <- matrix(rbinom(n * trials, size = 1, prob = 0.5), 
                   nrow = trials)
  sample_means <- rowMeans(tosses)
  hist_data <- hist(sample_means, breaks = bins, plot = FALSE)
  density_matrix[i, ] <- hist_data$density
}

Code for plotly 3D Graph

library(plotly)  # provides plot_ly(), layout(), and the %>% pipe

plot_ly(
  x = bin_centers, 
  y = n_values,     
  z = ~density_matrix,
  type = "surface",
  colorscale = "Viridis"
) %>%
  layout(
    title = "3D Central Limit Theorem Surface (Sample Mean vs. n)",
    scene = list(
      xaxis = list(title = "Sample Mean"),
      yaxis = list(title = "Number of Tosses (n)"),
      zaxis = list(title = "Density")
    )
  )

Mathematical Equations Relating to CLT (using LaTeX)

Suppose \(X_1, X_2, \dots, X_n\) are independent and identically distributed (i.i.d.) random variables with mean \(\mu\) and finite standard deviation \(\sigma\).

The sum of these can be described via: \(S_n = X_1 + X_2 + \dots + X_n\)

The sample mean is: \({\overline{\text{X}}}_n = \frac{S_n}{n}\)

The standard deviation of \({\overline{\text{X}}}_n\) (the standard error) is: \(\sigma_{\overline{\text{X}}_n} = \frac{\sigma}{\sqrt{n}}\)
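
One way to see where \(\frac{\sigma}{\sqrt{n}}\) comes from, assuming each \(X_i\) has finite variance \(\sigma^2\): independence lets the variances add, and the constant \(\frac{1}{n}\) comes out squared.

\[\text{Var}(\overline{\text{X}}_n) = \text{Var}\left(\frac{S_n}{n}\right) = \frac{1}{n^2}\sum_{i=1}^{n}\text{Var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}\]

Taking the square root gives \(\frac{\sigma}{\sqrt{n}}\). For a fair coin, \(\sigma = \sqrt{0.5 \cdot 0.5} = 0.5\), so with \(n = 1000\) tosses the standard deviation of the sample mean is \(\frac{0.5}{\sqrt{1000}} \approx 0.0158\).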

As n increases, \({\overline{\text{X}}_n}\) is approximately normally distributed: \[\overline{\text{X}}_n \approx N\left(\mu, \frac{\sigma^2}{n}\right)\]
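
As a rough numerical check of the \(\frac{\sigma}{\sqrt{n}}\) spread for the fair coin, here is a small sketch in base R. The names `check_trials`, `n_check`, and `means`, the seed, and the chosen values of n are assumptions made for illustration. It draws the number of heads directly from a binomial distribution instead of building a toss matrix, which gives sample means with the same distribution.

```r
set.seed(1)                        # assumption: fixed seed for reproducibility

sigma <- 0.5                       # sd of a single fair coin toss: sqrt(0.5 * 0.5)
check_trials <- 5000               # hypothetical number of repeated samples per n

for (n_check in c(10, 100, 1000)) {
  # The number of heads in n tosses is Binomial(n, 0.5);
  # dividing by n gives the sample mean of that trial.
  means <- rbinom(check_trials, size = n_check, prob = 0.5) / n_check
  cat(sprintf("n = %4d  observed sd = %.4f  sigma/sqrt(n) = %.4f\n",
              n_check, sd(means), sigma / sqrt(n_check)))
}
```

The observed and theoretical columns should agree closely for each n, and both shrink by roughly a factor of \(\sqrt{10}\) each time n grows tenfold.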

Source: https://math.mit.edu/~dav/05.dir/class6-prep.pdf