The Central Limit Theorem says, roughly, that the sampling distribution of a sample mean (or sum) turns out to be approximately bell-shaped once the sample size is large enough. We saw an example of a sampling distribution in The Lady Tasting Tea module. We took repeated samples of coin flips where \(n=8\), and after thousands of samples, we plotted the results. Using those empirical results, we estimated the probabilities involved. The estimates get better as the number of samples increases. That intuitive idea is called the Law of Large Numbers.

Initializing RStudio

The Mosaic package was created by statistics instructors to help students learn to code in R. Commands are streamlined to be more intuitive. Execute the code block below to load Mosaic (required each session).

library(mosaic)

I. Sampling Distributions

A sampling distribution is created by repeatedly drawing samples of the same size from an underlying distribution. In the examples below, \(n\) is the sample size for each individual draw, and Trials is the number of samples drawn.

Key Idea: the sampling distribution of a mean (or a sum) will be approximately bell-shaped, regardless of the underlying distribution, provided the sample size is large enough. The examples below show the bell-shaped sampling distribution emerging in the histograms as the number of trials increases.

Coin Flips = 20, Trials = 50

coins = do(50) * rflip(20)
histogram(~heads, data = coins,
          width = 1,
          fit = "normal")

The bell shape is apparent but falls short of perfection. With more trials, the bars in the histogram will match up better with the envelope of the superimposed bell curve.

Coin Flips = 20, Trials = 200

coins = do(200) * rflip(20)
histogram(~heads, data = coins,
          width = 1,
          fit = "normal")

Coin Flips = 20, Trials = 1000

coins = do(1000) * rflip(20)
histogram(~heads, data = coins,
          width = 1,
          fit = "normal")

With a thousand trials, we observe a near-perfect bell-shape emerging in the histogram.

What about Wilder Underlying Distributions?

Perhaps you don’t find it surprising that the coin flips are bell-shaped. After all, the probability was fifty-fifty, so the symmetry, at least, isn’t surprising. A single fair flip is an example of the uniform distribution, where the bar plot of the probabilities is flat.

Unfair Coin Flips

Let’s change the game by using an underlying distribution that isn’t uniform. What if we still flip coins, but they aren’t fair? What if the probability of heads is 80%? The underlying probability distribution looks very different.
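To picture that underlying distribution, here is a minimal sketch in base R. The vector name probs is only illustrative, and the plot styling is borrowed from the bar plot used later in this module.

```r
# Probability of each outcome for a single flip of the unfair coin
probs <- c(tails = 0.2, heads = 0.8)

# Bar heights are probabilities, so they sum to 1.
# A fair coin would give two bars of equal height instead.
barplot(probs, space = 0, col = "red",
        ylim = c(0, 1), ylab = "Probability")
```

The bars are clearly unequal, so the single-flip distribution is no longer flat.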

Let’s rerun these simulations and see what happens.

Coin Flips = 20, Trials = 50, \(P(\text{heads})={80\%}\)

coins = do(50) * rflip(20, prob = .8)
histogram(~heads, data = coins,
          width = 1,
          fit = "normal")

Coin Flips = 20, Trials = 200, \(P(\text{heads})={80\%}\)

coins = do(200) * rflip(20, prob = .8)
histogram(~heads, data = coins,
          width = 1,
          fit = "normal")

Coin Flips = 20, Trials = 1000, \(P(\text{heads})={80\%}\)

coins = do(1000) * rflip(20, prob = .8)
histogram(~heads, data = coins,
          width = 1,
          fit = "normal")

Notice the left tail of the histogram is slightly longer, but the bell shape is nearly perfect and centered at 16, which is 80% of 20, as one would anticipate.
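The center of 16 can be checked against the usual binomial formulas. The sketch below is an assumption-laden stand-in: it uses base R's rbinom in place of do(1000) * rflip(20, prob = .8), and the variable names are illustrative.

```r
n <- 20    # flips per sample
p <- 0.8   # probability of heads

# Theoretical mean and standard deviation of the number of heads
mu    <- n * p                   # 16
sigma <- sqrt(n * p * (1 - p))   # about 1.79

# Simulate 1000 samples of 20 unfair flips and count heads in each
set.seed(1)
heads <- rbinom(1000, size = n, prob = p)

# The simulated center and spread should land close to mu and sigma
c(mean = mean(heads), sd = sd(heads))
```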

Dice Rolls with Values Squared

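What about the following distribution: we roll a 6-sided die and take the square of the value showing. The six possible outcomes are 1, 4, 9, 16, 25, and 36, each with probability 1/6. The outcomes bunch together at the low end and spread far apart at the high end, so the underlying distribution is very non-normal and skewed to the right.

barplot(c(1,4,9,16,25,36),space = 0, col = "red")

This isn’t a proper probability plot because the bar heights are the outcome values themselves, not probabilities. The point is only to show how unevenly the six equally likely outcomes grow.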
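The histograms that follow use center = 300, and a short calculation suggests why. This is a sketch in base R with illustrative variable names; it computes the mean and standard deviation of one squared roll and then scales up to the sum of 20 rolls.

```r
outcomes <- (1:6)^2     # 1, 4, 9, 16, 25, 36
probs    <- rep(1/6, 6) # each face is equally likely

# Mean and standard deviation of a single squared roll
mu    <- sum(outcomes * probs)                  # 91/6, about 15.17
sigma <- sqrt(sum((outcomes - mu)^2 * probs))   # about 12.21

# The mean sits above the median (12.5), a sign of right skew
median(outcomes)

# Sum of 20 independent squared rolls
sum_mean <- 20 * mu            # about 303.3
sum_sd   <- sqrt(20) * sigma   # about 54.6
c(sum_mean = sum_mean, sum_sd = sum_sd)
```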

Dice Rolls = 20, Trials = 50

d = do(50) * resample(1:6,20)
D = d^2
sumD = rowSums(D)
histogram(sumD ,
          width = 20,
          center = 300,
          fit = "normal")

Dice Rolls = 20, Trials = 200

d = do(200) * sample(1:6,20, replace = TRUE)
D = d^2
sumD = rowSums(D)
histogram(sumD ,
          width = 20,
          center = 300,
          fit = "normal")

Dice Rolls = 20, Trials = 1000

d = do(1000) * sample(1:6,20, replace = TRUE)
D = d^2
sumD = rowSums(D)
histogram(sumD ,
          width = 20,
          center = 300,
          fit = "normal")

The bell-shape has not yet completely emerged, but we’re getting close. Let’s try one more example with 10,000 trials.

Dice Rolls = 20, Trials = 10,000

d = do(10000) * sample(1:6,20, replace = TRUE)
D = d^2
sumD = rowSums(D)
histogram(sumD ,
          width = 20,
          center = 300,
          fit = "normal")

If we set the number of trials high enough, the histogram settles onto the true sampling distribution, which is already close to bell-shaped for sums of 20 rolls.

II. Central Limit Theorem

Suppose that we are sampling from a distribution of any shape, and further suppose that we actually know the distribution mean (\(\mu\)) and distribution standard deviation (\(\sigma\)). In other words, we know the probability density function (pdf) for the distribution and can calculate its expected value (mean) and standard deviation. In that case, the Central Limit Theorem applies.

Central Limit Theorem. The means \(\bar x\) of samples of size \(n\) drawn from a population with mean \(\mu\) and finite standard deviation \(\sigma\) will have approximately the normal distribution with mean of \(\mu\) and standard deviation of \(\frac{\sigma}{\sqrt{n}}\), that is the \[N\left(\mu,\frac{\sigma}{\sqrt n}\right)\] distribution. The approximation improves as \(n\) grows, and this is true regardless of the shape of the underlying distribution.

As \(n\) grows, the spread or dispersion shrinks due to the \(\sqrt n\) term in the denominator. So larger values for \(n\) lead to \(\bar x\) being a better and better estimate of \(\mu\). This fact is closely related to the Law of Large Numbers, which states that statistics (like \(\bar x\)) become better and better estimators of parameters (like \(\mu\)) as \(n\) grows larger and larger.
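The \(\frac{\sigma}{\sqrt n}\) shrinkage can be checked by simulation. The sketch below is only an illustration: it assumes an exponential population with rate 1 (so \(\mu = 1\) and \(\sigma = 1\)), which is not one of the examples above, and uses base R's replicate rather than Mosaic's do.

```r
set.seed(42)
sample_sizes <- c(4, 16, 64)   # three choices of n
trials <- 5000                 # samples drawn for each n

# For each n, draw many samples, take each sample's mean,
# and record the standard deviation of those means
observed_sd <- sapply(sample_sizes, function(n) {
  xbar <- replicate(trials, mean(rexp(n, rate = 1)))
  sd(xbar)
})

# The Central Limit Theorem predicts sigma / sqrt(n), with sigma = 1 here
predicted_sd <- 1 / sqrt(sample_sizes)

data.frame(n = sample_sizes, observed_sd, predicted_sd)
```

Each fourfold increase in \(n\) should roughly halve the spread of the sample means.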

III. Normal Distribution

We refer to the Normal distribution with mean \(\mu\) and standard deviation \(\sigma\) with the notation \(N(\mu,\sigma)\). If we integrate the Gaussian function \[f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2 }\] across all real numbers, the area under the curve is equal to one. The parameters \(\mu\) and \(\sigma\) center the bell curve at \(\mu\) and control the dispersion or spread.
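The unit-area claim can be checked numerically. In this sketch, gauss is a hypothetical helper that writes out the normal density by hand, with the factor of \(\frac{1}{2}\) in the exponent, and compares it with R's built-in dnorm.

```r
# Normal density written out by hand (illustrative helper)
gauss <- function(x, mu = 0, sigma = 1) {
  exp(-0.5 * ((x - mu) / sigma)^2) / (sigma * sqrt(2 * pi))
}

# Numerical integration over the whole real line, standard normal case
area <- integrate(gauss, lower = -Inf, upper = Inf)$value
area

# The hand-written formula agrees with dnorm, including for N(21, 6)
all.equal(gauss(c(-1, 0, 2.5)), dnorm(c(-1, 0, 2.5)))
all.equal(gauss(29, mu = 21, sigma = 6), dnorm(29, mean = 21, sd = 6))
```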

If we need to do any computations, we use the Standard Normal distribution where \(\mu = 0\) and \(\sigma = 1\), i.e. \(N(0,1)\). As you can see from the formula above, these values make the expression as uncomplicated as possible. If we want to convert values in a normal distribution to their Standard Normal equivalent, we compute a \(z\)-score.

\[z=\frac{x-\mu}{\sigma}\]

Standardized Scores

Standardized tests and standardized scores have their roots in the Standard Normal distribution. Each value in a Normal distribution corresponds to an exact percentile.

Example: ACT Scores

ACT Composite Scores have (approximately) the \(N(21,6)\) distribution, though different test years have slightly different values. If Mateo earned a 29 on his ACT, what is his percentile score?

In R, we can calculate exact percentiles using the pnorm function, which returns cumulative probabilities from the normal distribution: the area under the curve to the left of a given value.

pnorm(29,21,6)
[1] 0.9087888

So Mateo’s score is at about the 91st percentile.

Using Mosaic we have a better option: the function xpnorm calculates the \(z\)-score and indicates it with a vertical line. The upper- and lower-tail probabilities also print out with the results and are displayed in different colors.

xpnorm(29,21,6)


If X ~ N(21, 6), then 

    P(X <= 29) = P(Z <= 1.333) = 0.9088
    P(X >  29) = P(Z >  1.333) = 0.09121
[1] 0.9087888

We can calculate Mateo’s standardized score. \[z_M=\frac{29-21}{6}=1.333\] After converting to a standardized score, we can look up the percentile in a \(z\)-table. This is the method we use in traditional by-hand statistics, where the calculus-based computations were avoided by converting to standardized scores and using tables. Now, cell phones can generate \(p\)-values to an exacting level of detail. R does the calculation, too, assuming the standard normal distribution unless a different mean and standard deviation are specified in the function call.

xpnorm(1.333)


If X ~ N(0, 1), then 

    P(X <= 1.333) = P(Z <= 1.333) = 0.9087
    P(X >  1.333) = P(Z >  1.333) = 0.09127
[1] 0.9087341

The slight difference between the two results comes from rounding the \(z\)-score to three decimal places. The shaded area corresponds to a vertical line drawn at \(Z=1.333\), with the area to the left of it equaling 90.87% of the total area under the curve.
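The same comparison can be made with base R's pnorm, which returns only the number and draws no plot. This is a small sketch with illustrative variable names.

```r
x     <- 29   # Mateo's ACT score
mu    <- 21   # distribution mean
sigma <- 6    # distribution standard deviation

# Standardized score
z <- (x - mu) / sigma

# Same probability, computed three ways
p_direct  <- pnorm(x, mean = mu, sd = sigma)  # on the original scale
p_from_z  <- pnorm(z)                         # on the standard normal scale
p_rounded <- pnorm(round(z, 3))               # after rounding z to 1.333

c(p_direct = p_direct, p_from_z = p_from_z, p_rounded = p_rounded)
```

The first two values agree, and the third differs only because of the rounding.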

We can also determine a person’s ACT score if we know their percentile. For example, suppose we knew Joe’s score was 45th percentile. Here, we use the R function xqnorm.

xqnorm(.45,21,6)


If X ~ N(21, 6), then 

    P(X <= 20.24603) = 0.45
    P(X >  20.24603) = 0.55
[1] 20.24603

Note the vertical line has been drawn at the 45th percentile, and the corresponding \(z\)-score has been determined and displayed. Above the graph we see the printout showing the exact cutoff would be an ACT score of \(x=20.24603\), but we can round to 20 since only integer scores are possible.
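Percentile lookups and score lookups undo each other. The sketch below uses base R's qnorm and pnorm, the functions that xqnorm and xpnorm wrap; the variable names are illustrative.

```r
# Score at the 45th percentile of N(21, 6)
score <- qnorm(0.45, mean = 21, sd = 6)
score

# Feeding that score back into pnorm recovers the percentile
pnorm(score, mean = 21, sd = 6)

# Percentile of the rounded, reportable score of 20
pnorm(round(score), mean = 21, sd = 6)
```

The rounded score of 20 sits a little below the 45th percentile, which is the cost of rounding down.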

Normal Curve Areas as Probabilities

Georgia ACT Scores

As reported in the AJC, Georgia ACT scores were about half a point higher in 2018 than the US national average.

Because the bell curves we are using represent the distribution of populations, we can think of percentiles as probabilities. Suppose we draw a senior in high school at random from the state of Georgia, and assume the population has the \(N(21,6)\) distribution just like the nation-wide distribution. If so, what is the probability of the person we draw at random having an ACT score above 30?

xpnorm(30,21,6)


If X ~ N(21, 6), then 

    P(X <= 30) = P(Z <= 1.5) = 0.9332
    P(X >  30) = P(Z >  1.5) = 0.06681
[1] 0.9331928

The probability is approximately 93.32% that a person would be to the left of the vertical line at 30, so we subtract from 100% to get the probability of scoring above 30: 6.68%.
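The subtraction can also be left to R. This sketch uses base R's pnorm with its lower.tail argument; the variable names are illustrative.

```r
# Upper-tail probability by subtraction
p_above <- 1 - pnorm(30, mean = 21, sd = 6)

# Same probability, asking pnorm for the upper tail directly
p_above_direct <- pnorm(30, mean = 21, sd = 6, lower.tail = FALSE)

c(p_above = p_above, p_above_direct = p_above_direct)
```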

IV. A Primitive Statistical Test

We know that resting heart rates (RHR) for healthy adult females have the \(N(75,8)\) distribution. Suppose we have a sample of 9 women whose average RHR is 70. We can use the Central Limit Theorem to tweak the above normal curve calculations into a group \(z\)-score.

Let’s imagine that this sample was drawn from a sub-population of adult women in their 40’s who have completed a half marathon in the last 3 years. Since RHR’s are a coarse measure of aerobic fitness, this sub-population of women might have average RHR’s that are different from their peers. Let’s call this sub-population of women \(R\), for runners.

We have a sample size of \(n=9\), and we use the null hypothesis that \(\mu_R=75\), i.e. that these women have the same average RHR as the overall population of healthy adult women.

By the Central Limit Theorem, the standard deviation of the sample mean is smaller than the population standard deviation.\[\sigma_{\bar x} = \frac{\sigma}{\sqrt n}=\frac{8}{\sqrt 9}=\frac{8}{3}\]

We can now use the xpnorm function to evaluate how likely a sample mean of 70 or lower would be if the null hypothesis were true.

xpnorm(70,75,8/3)


If X ~ N(75, 2.667), then 

    P(X <= 70) = P(Z <= -1.875) = 0.0304
    P(X >  70) = P(Z >  -1.875) = 0.9696
[1] 0.03039636
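The group \(z\)-score in that printout can be reproduced step by step. This is a minimal sketch in base R; names such as mu0 and se are illustrative.

```r
mu0   <- 75   # population mean under the null hypothesis
sigma <- 8    # population standard deviation
n     <- 9    # sample size
xbar  <- 70   # observed sample mean

# Standard deviation of the sample mean
se <- sigma / sqrt(n)    # 8/3, about 2.667

# Group z-score: how many of those units the sample mean lies from mu0
z <- (xbar - mu0) / se   # -1.875

# Lower-tail probability on the standard normal scale
p_lower <- pnorm(z)
c(se = se, z = z, p_lower = p_lower)
```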

Results

Recall the Lady Tasting Tea example we worked out. Ronald Fisher, the godfather of statistics, suggested creating a null hypothesis: a falsifiable probability model that could be compared to the experimental data by using empirical or theoretical probabilities.

For this experiment, the null hypothesis is that the sample of nine women was drawn from the overall population and therefore from the \(N(75,8)\) distribution of RHR’s. We used a theoretical probability calculation assuming the sample of 9 women was drawn from this population. Our results indicate that only about a 3% chance exists of finding a random group of nine drawn from the \(N(75,8)\) distribution with a mean at least this far below the population mean \(\mu_0=75\). Since this probability is less than our default level of significance (\(\alpha = .05\)), we have reasonably good evidence the null hypothesis is false.
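The theoretical 3% figure can be compared with an empirical one, in the spirit of the earlier simulations. The sketch below is one possible check, using base R's replicate and rnorm with illustrative variable names: it draws many groups of nine from the null distribution and counts how often the group mean is 70 or lower.

```r
set.seed(2024)
trials <- 10000

# Under the null hypothesis, each group of 9 comes from N(75, 8)
sample_means <- replicate(trials, mean(rnorm(9, mean = 75, sd = 8)))

# Empirical probability of a group mean of 70 or lower
p_sim <- mean(sample_means <= 70)

# Theoretical probability from the Central Limit Theorem calculation
p_theory <- pnorm(70, mean = 75, sd = 8 / sqrt(9))

c(p_sim = p_sim, p_theory = p_theory)
```

The two values should be close, with the simulated one varying slightly from run to run if the seed is changed.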

In modern statistical terms, we reject the null when the observed data would be unlikely if it were true. What does this mean about our experiment? There appears to be a structural difference between the sub-population and the overall population. It would appear that female runners in their 40’s have a different RHR distribution than typical healthy adult females do. If we were to generalize, which is the goal of quantitative research, we would suggest we had evidence that female runners in their 40’s have lower RHR’s than the overall population of healthy adult women.

V. Exercises

  1. Find the percentile ranking for an IQ score of 120 using the xpnorm function. IQ’s have the \(N(100,15)\) distribution.

  2. Find the approximate IQ score that corresponds to the 25th percentile IQ. Use the xqnorm function. IQ’s have the \(N(100,15)\) distribution.

  3. Find the percentile ranking for an SAT-Math score of 700 using the xpnorm function. SAT components have the \(N(500,100)\) distribution.

  4. Find the approximate SAT-Math score that corresponds to the 60th percentile SAT-Math score. Use the xqnorm function. SAT components have the \(N(500,100)\) distribution.

  5. Find the “middle 90%” of the IQ distribution. The middle 90% will be a symmetric interval that traps exactly 5% of the area below it, 5% above it, and thus 90% within it. Use the xqnorm function. IQ’s have the \(N(100,15)\) distribution.

