The Central Limit Theorem

For our purposes, the CLT states that the distribution of averages of \(iid\) variables (properly normalized) becomes that of a standard normal as the sample size increases.

Because it has fairly loose requirements on the collection of populations that it applies to, the Central Limit Theorem applies in a nearly endless variety of settings, and we’ll go through several.

The basic result is that

\[ \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} = \frac{\text{Estimate} - \text{Mean of estimate}}{\text{Std. Err. of estimate}} \]

has a distribution like that of a standard normal for large \(n\).

The most useful way to think about the CLT is that \(\bar{X}_n\) is approximately \(N(\mu, \sigma^2 / n)\).

So the sample average is approximately normally distributed, with mean equal to the population mean and standard deviation equal to the standard error of the mean, \(\sigma / \sqrt{n}\).
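As a quick check of this approximation, here is a minimal sketch that simulates many sample means and compares their spread with \(\sigma / \sqrt{n}\). The exponential population (rate 1, so \(\mu = 1\) and \(\sigma = 1\)), the sample size, and the variable names are assumptions chosen for illustration; they are not part of the original notes.

```r
# Sketch: compare the spread of simulated sample means with sigma / sqrt(n).
# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
set.seed(1)
n <- 40          # sample size for each mean
nosim <- 10000   # number of simulated sample means

xbar <- replicate(nosim, mean(rexp(n, rate = 1)))

mean_xbar <- mean(xbar)    # should be close to mu = 1
sd_xbar <- sd(xbar)        # should be close to sigma / sqrt(n)
se_theory <- 1 / sqrt(n)   # theoretical standard error of the mean

c(mean = mean_xbar, sd = sd_xbar, theory = se_theory)
```

The simulated standard deviation should land close to the theoretical standard error even though the exponential population is strongly skewed.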

Example

Simulate a standard normal random variable by rolling \(n\) (six-sided) dice.

Let \(X_i\) be the outcome for die \(i\)

Then note that \(\mu = E[X_i] = 3.5\)

\(Var(X_i) = 2.92\)

\(SE = \sqrt{2.92/n} = 1.71/\sqrt{n}\)
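The constants above can be reproduced directly from the six equally likely faces. This is only a sketch; the variable names are illustrative.

```r
# Mean and variance of a single fair six-sided die, computed from its faces.
faces <- 1:6
mu <- mean(faces)                # E[X_i] = 3.5
sigma2 <- mean((faces - mu)^2)   # Var(X_i) = 35/12, about 2.92
sigma <- sqrt(sigma2)            # about 1.71

c(mu = mu, sigma2 = sigma2, sigma = sigma)
```

Note that the population variance divides by 6, not 5, so `var(faces)` would give a different number.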

Let’s roll \(n\) dice, take their mean, subtract off \(3.5\), and divide by \(1.71/\sqrt{n}\). If the Central Limit Theorem is right this will be like a bell curve.

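par(mfrow = c(1, 3))    # three histograms side by side
n <- c(10, 20, 30)      # number of dice rolled in each experiment
mu <- 3.5               # mean of a single die roll
sigma2 <- 2.92          # variance of a single die roll
nosim <- 10000          # number of simulated experiments per sample size

# preallocate the results instead of growing the vectors inside the loop
experiment_10 <- numeric(nosim)
experiment_20 <- numeric(nosim)
experiment_30 <- numeric(nosim)

for (i in 1:nosim) {
        # standardized mean: (sample mean - mu) / (sigma / sqrt(n))
        se <- sqrt(sigma2 / n[1])
        dieroll <- sample(1:6, n[1], replace = TRUE)
        experiment_10[i] <- (mean(dieroll) - mu) / se

        se <- sqrt(sigma2 / n[2])
        dieroll <- sample(1:6, n[2], replace = TRUE)
        experiment_20[i] <- (mean(dieroll) - mu) / se

        se <- sqrt(sigma2 / n[3])
        dieroll <- sample(1:6, n[3], replace = TRUE)
        experiment_30[i] <- (mean(dieroll) - mu) / se
}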
hist(experiment_10)
hist(experiment_20)
hist(experiment_30)
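The same experiment can also be written once and applied to each sample size. The sketch below uses a hypothetical helper, `standardized_die_means`, which is not part of the original notes, and checks numerically that the standardized means have mean near 0 and standard deviation near 1, as a standard normal would.

```r
# Hypothetical helper: simulate `nosim` standardized means of `n` dice.
standardized_die_means <- function(n, nosim = 10000) {
  mu <- 3.5
  sigma <- sqrt(35 / 12)
  # one row per experiment, one column per die
  rolls <- matrix(sample(1:6, n * nosim, replace = TRUE), nrow = nosim)
  (rowMeans(rolls) - mu) / (sigma / sqrt(n))
}

set.seed(42)
z <- sapply(c(10, 20, 30), standardized_die_means)   # 10000 x 3 matrix

# Each column should have mean near 0 and standard deviation near 1.
round(rbind(mean = colMeans(z), sd = apply(z, 2, sd)), 2)
```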

Coin CLT

Let \(X_i\) be the \(0\) or \(1\) result of the \(i^{th}\) flip of a possibly unfair coin.

* The sample proportion, say \(\hat{p}\), is the average of the coin flips
* \(E[X_i] = p\) and \(Var(X_i) = p(1-p)\)
* Standard error of the mean is \(\sqrt{p(1-p)/n}\)

So if we take the statistic

\[ \frac{\hat{p}-p}{\sqrt{p(1-p)/n}} \]

this is approximately standard normal if \(n\) is large enough.

Notice that if the coin is fair then

\(p = 1/2\)

\(p(1-p) = 1/4\)

\(\sqrt{p(1-p)} = 1/2\)

So the standard error of the sample proportion for a fair coin is \(1/(2\sqrt{n})\).

So let’s flip the coin \(n\) times, take the sample proportion of heads, subtract off \(0.5\), and multiply the result by \(2 \sqrt{n}\).
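Spelling out the algebra behind that last step: dividing by the fair-coin standard error is the same as multiplying by \(2\sqrt{n}\), not by \(2/\sqrt{n}\).

\[ \frac{\hat{p} - 1/2}{\sqrt{(1/2)(1 - 1/2)/n}} = \frac{\hat{p} - 1/2}{1/(2\sqrt{n})} = 2\sqrt{n}\left(\hat{p} - \frac{1}{2}\right) \]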

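par(mfrow = c(1, 3))    # three histograms side by side
n <- c(10, 20, 30)      # number of coin flips in each experiment
nosim <- 10000          # number of simulated experiments per sample size
experiment_10 <- numeric(nosim)
experiment_20 <- numeric(nosim)
experiment_30 <- numeric(nosim)

for (i in 1:nosim) {
        # standardize: (p_hat - 0.5) / (1 / (2 * sqrt(n))) = (p_hat - 0.5) * 2 * sqrt(n)
        coinflip <- sample(0:1, n[1], replace = TRUE)
        experiment_10[i] <- (mean(coinflip) - 0.5) * 2 * sqrt(n[1])

        coinflip <- sample(0:1, n[2], replace = TRUE)
        experiment_20[i] <- (mean(coinflip) - 0.5) * 2 * sqrt(n[2])

        coinflip <- sample(0:1, n[3], replace = TRUE)
        experiment_30[i] <- (mean(coinflip) - 0.5) * 2 * sqrt(n[3])
}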

hist(experiment_10, breaks = 10)
hist(experiment_20, breaks = 10)
hist(experiment_30, breaks = 10)

Simulation results, \(p = 0.9\)

The speed at which the normalized coin flips converge to normality is governed by how biased the coin is. Here the coin lands heads with probability \(p = 0.9\):

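par(mfrow = c(1, 3))    # three histograms side by side
n <- c(10, 20, 30)      # number of coin flips in each experiment
p <- 0.9                # probability of heads for the biased coin
nosim <- 10000          # number of simulated experiments per sample size
experiment_10 <- numeric(nosim)
experiment_20 <- numeric(nosim)
experiment_30 <- numeric(nosim)

for (i in 1:nosim) {
        # standardize with the biased coin's own mean and standard error:
        # (p_hat - p) / sqrt(p * (1 - p) / n)
        coinflip <- sample(c(0, 1), n[1], replace = TRUE, prob = c(1 - p, p))
        experiment_10[i] <- (mean(coinflip) - p) / sqrt(p * (1 - p) / n[1])

        coinflip <- sample(c(0, 1), n[2], replace = TRUE, prob = c(1 - p, p))
        experiment_20[i] <- (mean(coinflip) - p) / sqrt(p * (1 - p) / n[2])

        coinflip <- sample(c(0, 1), n[3], replace = TRUE, prob = c(1 - p, p))
        experiment_30[i] <- (mean(coinflip) - p) / sqrt(p * (1 - p) / n[3])
}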

hist(experiment_10, breaks = 10)
hist(experiment_20, breaks = 10)
hist(experiment_30, breaks = 10)

When \(n = 10\) the distribution is not very bell-shaped. By \(n = 30\) it is getting there, but probabilities approximated with the normal distribution would still not be very accurate.

So keep in mind that the Central Limit Theorem doesn’t guarantee that the normal distribution will be a good approximation for any particular \(n\), only that as the number of coin flips tends to infinity, it eventually will be.
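To put a number on how imperfect the approximation can be, the sketch below compares an exact binomial probability with its plain normal approximation for the \(p = 0.9\) coin. The event \(\hat{p} \le 0.8\) is an arbitrary choice made for illustration, and no continuity correction is applied.

```r
# Exact vs. normal-approximation probability that p_hat <= 0.8 when p = 0.9.
p <- 0.9
n <- c(10, 30)
k <- c(8, 24)   # number of heads corresponding to p_hat = 0.8 for each n

exact <- pbinom(k, size = n, prob = p)             # exact binomial probability
approx <- pnorm((k / n - p) / sqrt(p * (1 - p) / n))   # CLT approximation

round(data.frame(n = n, exact = exact, normal = approx), 4)
```

For both sample sizes the normal approximation noticeably understates the exact probability, which is the skewness of the biased coin showing through.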

Galton’s quincunx

https://en.wikipedia.org/wiki/Bean_machine#/media/File:Quincunx_(Galton_Box)_-_Galton_1889_diagram.png

This is a machine that you might have seen if you visited a science museum. The crux of this machine is that it illustrates the Central Limit Theorem with a game that looks a little like Pachinko. Every time a ball hits a peg, it bounces left or right, which is a Bernoulli trial, so the bin a ball ends up in is a binomial outcome. The balls collected at the bottom therefore pile up in the bell shape of a normal distribution.
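The machine can be imitated in a few lines. This is a minimal sketch that assumes each peg sends the ball left or right with probability \(1/2\), independently of the other pegs; the number of peg rows and balls are arbitrary illustrative values.

```r
# Sketch of a quincunx: each ball passes `rows` pegs and bounces
# right (1) or left (0) at each one with probability 1/2.
set.seed(7)
rows <- 20      # rows of pegs (assumed)
balls <- 10000  # number of balls dropped (assumed)

# The final bin is the number of rightward bounces: Binomial(rows, 1/2).
bins <- rbinom(balls, size = rows, prob = 0.5)
counts <- table(factor(bins, levels = 0:rows))

c(mean = mean(bins), sd = sd(bins))   # theory: rows / 2 and sqrt(rows) / 2
```

Calling `barplot(counts)` afterwards draws the pile of balls, which should look like the bell curve in Galton's diagram.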

07 02 Asymptotics and the CLT

Federico Viscioletti

12 November 2015
