1 Introduction

Below is a breakdown of four widely used probability distributions: the normal, Student’s T, chi-square and F. Throughout, I examine each distribution through the lens of its probability density, that is, the relative likelihood of values across a range, rather than the cumulative probability up to a particular point.

2 Normal Distribution

The normal distribution is the most fundamental and widely used distribution in statistics. Even those without statistical training have likely heard of the “bell curve.” That shape is the probability density of a normally distributed variable. Once the data are standardized (by subtracting the mean and dividing by the standard deviation), the curve is centered at zero with a standard deviation of one.

At the heart of the normal distribution’s characteristics is the empirical rule, which states that about 68% of a normally distributed population falls within one standard deviation of the mean, about 95% falls within two standard deviations and about 99.7% falls within three standard deviations. An example of the bell curve with the empirical rule labeled is below.

include_graphics("Bell Curve.png")
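The percentages in the empirical rule can be checked directly from the standard normal cumulative distribution function. The short sketch below uses base R's `pnorm()`; the variable names `k` and `within_k` are just illustrative choices.

```r
# Probability that a normal variable lies within k standard deviations
# of its mean, computed from the standard normal CDF.
k <- 1:3
within_k <- pnorm(k) - pnorm(-k)
names(within_k) <- paste0("within ", k, " sd")

round(within_k, 4)
# within 1 sd: 0.6827, within 2 sd: 0.9545, within 3 sd: 0.9973
```

Because any normal variable can be standardized, the same three probabilities apply whatever the mean and standard deviation happen to be.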

The normal distribution also underpins the central limit theorem (CLT). The CLT states that, as the sample size of a study or experiment increases, the sampling distribution of the sample mean increasingly resembles a normal distribution. The theorem makes no assumption about the shape of the population, so it holds regardless of the distribution from which the sample data came. In other words, as long as the population of interest has a finite mean (or average) and variance, which need not be known, we can treat the sample mean, and statistics built from it such as a sample proportion, as approximately normally distributed once samples are reasonably large.
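A small simulation can make the CLT concrete. The sketch below assumes an Exponential(rate = 1) population, which is strongly right skewed with mean 1 and standard deviation 1; the helper names `sample_means` and `skewness`, the seed and the sample sizes are all illustrative choices rather than anything prescribed by the theorem.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Draw `reps` samples of size n from an Exponential(rate = 1) population
# and return the mean of each sample.
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

# Sample skewness: roughly 0 for a symmetric, normal-like distribution.
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

means_small <- sample_means(n = 5)
means_large <- sample_means(n = 100)

# Skewness of the sample means shrinks toward 0 as n grows.
c(skew_n5 = skewness(means_small), skew_n100 = skewness(means_large))

# The spread of the sample means should be close to sigma / sqrt(n).
c(sd_n100 = sd(means_large), theory = 1 / sqrt(100))
```

Calling `hist(means_small)` and `hist(means_large)` afterwards gives a visual version of the same comparison.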

3 T Distribution

The normal distribution’s closest relative, so to speak, in the world of probability distributions is the T distribution (often called Student’s T). The T distribution is bell shaped and symmetric, just like the normal distribution, but it has heavier tails. The crucial difference between the T and normal distributions concerns knowledge of the population variance.

The normal distribution is the right tool for inference about a mean when the population variance (\(\sigma^2\)) is known. In the far more common scenario where \(\sigma^2\) is unknown, the T distribution is used instead. For the T distribution, the sample variance (\(S^2\)) serves as an estimator of the population variance, and the heavier tails account for the extra uncertainty that this estimate introduces.

Additionally, the shape of the T distribution depends on the sample size, while the shape of the standard normal distribution does not. The T distribution’s exact shape is set by its degrees of freedom (df). For a one-sample test of a mean, we find the degrees of freedom by subtracting one from the sample size. As the degrees of freedom increase, so does the T distribution’s resemblance to the normal distribution, which is why the T distribution matters most in cases of small sample sizes. An example of T distribution curves with differing degrees of freedom in comparison to the standard normal distribution’s bell curve is below.

### Normal PDF 
curve(dnorm(x, mean = 0, sd = 1),
      from = -4,
      to = 4,
      lwd = 2,
      main = "Normal vs T Distributions",
      xlab = "Critical Value",
      ylab = "Probability Density")

box(lwd = 3)

### T Distribution PDF's

# df = 3
curve(dt(x, df = 3),
      from = -4, to = 4,
      lwd = 2,
      lty = 2,
      col = "red",
      add = TRUE)

# df = 10
curve(dt(x, df = 10),
      from = -4, to = 4,
      lwd = 2,
      lty = 2,
      col = "blue",
      add = TRUE)

# df = 20
curve(dt(x, df = 20),
      from = -4, to = 4,
      lwd = 2,
      lty = 2,
      col = "orange",
      add = TRUE)

legend("topright",
       legend = c("N (0,1)", 
                  paste0("T (df=", 3, ")"),
                  paste0("T (df=", 10, ")"),
                  paste0("T (df=", 20, ")")
                  ),
       col = c("black", "red", "blue", "orange"),
       lwd = 2,
       lty = c(1, 2, 2, 2),  # solid for the normal, dashed for the T curves
       box.lwd = 3)
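The convergence of the T distribution toward the normal can also be seen numerically. As a rough sketch, the code below compares two-sided 95% critical values from `qt()` with the normal value from `qnorm()`; the particular degrees of freedom are arbitrary choices that extend the ones plotted above.

```r
df_values <- c(3, 10, 20, 100, 1000)

# Two-sided 95% critical values: the 97.5th percentile of each distribution.
t_crit <- qt(0.975, df = df_values)
z_crit <- qnorm(0.975)

data.frame(df = df_values,
           t_critical = round(t_crit, 3),
           excess_over_z = round(t_crit - z_crit, 3))
```

At df = 3 the critical value is about 3.18, well above the normal value of about 1.96, and the gap shrinks steadily as the degrees of freedom grow.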

4 Chi-Square Distribution

The chi-square distribution is tied to the normal, T and gamma distributions. A chi-square variable with v degrees of freedom is the sum of v squared independent standard normal variables. The gamma distribution, in turn, is used for a population that is non-negative and skewed rightward. A typical example of when the gamma distribution might be applicable is modeling the length of time until a rare event (like a natural disaster) occurs.

The chi-square distribution is a special case of the gamma. The gamma distribution has two parameters, a shape alpha (\(\alpha\)) and a scale beta (\(\beta\)). If \(\alpha = \frac{v}{2}\) (with \(v\) representing the degrees of freedom) and \(\beta = 2\), then the distribution is chi-square with \(v\) degrees of freedom.
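The relationship between the gamma and chi-square distributions is easy to confirm in R. The sketch below assumes \(\beta\) is the gamma scale parameter (the `scale` argument of `dgamma()`); the value of `v` and the grid `x` are arbitrary.

```r
v <- 6                       # degrees of freedom (arbitrary choice)
x <- seq(0.5, 30, by = 0.5)  # grid of positive values

# Gamma density with shape v / 2 and scale 2 ...
gamma_density <- dgamma(x, shape = v / 2, scale = 2)
# ... versus the chi-square density with v degrees of freedom.
chisq_density <- dchisq(x, df = v)

all.equal(gamma_density, chisq_density)
# TRUE
```

Note that `dgamma()` also accepts a `rate` argument, which is the reciprocal of the scale, so the same density could be written with `rate = 1 / 2`.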

Just like the T distribution, the chi-square distribution’s degrees of freedom are often one less than the sample size being observed, for example when working with a sample variance. Unlike the normal and T distributions, though, the chi-square has a clear rightward skew. This is because, as mentioned above, the chi-square distribution only deals with non-negative values: its left side is bounded at zero while its right tail extends indefinitely. As the degrees of freedom increase, the peak of the density occurs at a greater value along the x axis (for two or more degrees of freedom, the peak sits at df - 2), and the curve becomes lower, wider and more symmetric. This can be seen in the plot below.

### Normal PDF 
curve(dnorm(x, mean = 0, sd = 1),
      from = 0,
      to = 40,
      lwd = 2,
      ylim = c(0,0.25),
      main = "Normal vs Chi-Square Distributions",
      xlab = "Critical Value",
      ylab = "Probability Density")

box(lwd = 3)

### Chi-Square Distribution PDF's

# df = 3
curve(dchisq(x, df = 3),
      from = 0, to = 40,
      lwd = 2,
      lty = 2,
      col = "red",
      add = TRUE)

# df = 10
curve(dchisq(x, df = 10),
      from = 0, to = 40,
      lwd = 2,
      lty = 2,
      col = "blue",
      add = TRUE)

# df = 20
curve(dchisq(x, df = 20),
      from = 0, to = 40,
      lwd = 2,
      lty = 2,
      col = "orange",
      add = TRUE)

legend("topright",
       legend = c("N (0,1)", 
                  paste0("Chi (df=", 3, ")"),
                  paste0("Chi (df=", 10, ")"),
                  paste0("Chi (df=", 20, ")")
                  ),
       col = c("black", "red", "blue", "orange"),
       lwd = 2,
       lty = c(1, 2, 2, 2),  # solid for the normal, dashed for the chi-square curves
       box.lwd = 3)
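The rightward drift of the peak can be quantified. For two or more degrees of freedom, the chi-square density peaks at df - 2; the sketch below locates each peak numerically with base R's `optimize()` over the plotted range and compares it with that closed form. The variable names are illustrative.

```r
df_values <- c(3, 10, 20)

# Numerically locate the peak of each density on the plotted range [0, 40].
peak <- sapply(df_values, function(v) {
  optimize(function(x) dchisq(x, df = v),
           interval = c(0, 40), maximum = TRUE)$maximum
})

data.frame(df = df_values,
           numeric_peak = round(peak, 2),
           df_minus_2 = df_values - 2)
```

The numeric peaks should agree with df - 2 up to the optimizer's tolerance, matching where the red, blue and orange curves crest in the plot.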

5 F Distribution

The F distribution largely resembles the chi-square, and that is because an F variable is the quotient of two independent chi-square variables, each divided by its own degrees of freedom. While the T and chi-square distributions above took the degrees of freedom to be (n - 1), the F distribution, when used to compare two sample variances, applies this formula to each of the two chi-square variables in its calculation. Therefore, the F distribution has separate degrees of freedom parameters for the numerator and the denominator.
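The construction of an F variable from two chi-square variables can be checked by simulation. The sketch below is only illustrative: the degrees of freedom `d1` and `d2`, the seed and the number of replications are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

d1 <- 5        # numerator degrees of freedom
d2 <- 10       # denominator degrees of freedom
reps <- 100000

# Ratio of two independent chi-square draws, each divided by its own df.
f_sim <- (rchisq(reps, df = d1) / d1) / (rchisq(reps, df = d2) / d2)

# Compare simulated quantiles with the theoretical F quantiles.
probs <- c(0.50, 0.90, 0.95)
rbind(simulated = quantile(f_sim, probs),
      theoretical = qf(probs, df1 = d1, df2 = d2))
```

With this many replications the two rows should agree to roughly two decimal places, though the exact simulated values depend on the seed.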

Like the chi-square distribution, the F distribution is non-negative and rightward skewed. The numerator degrees of freedom primarily control where the curve’s peak sits along the plot’s x axis. An increasing numerator degrees of freedom moves the peak of the probability density to a greater critical value (a higher value along the x axis), although the peak always stays below 1. The plot below illustrates how, while holding the denominator degrees of freedom constant, an increase in the numerator degrees of freedom means a rightward shift in the distribution’s peak, and hence a weakening of the rightward skew.

### Normal PDF 
curve(dnorm(x, mean = 0, sd = 1),
      from = 0,
      to = 10,
      lwd = 2,
      ylim = c(0,1),
      main = "Normal vs F Distributions",
      xlab = "Critical Value",
      ylab = "Probability Density")

box(lwd = 3)

### F Distribution PDF's

# ylim is set by the first curve() call; it has no effect when add = TRUE

# numerator df = 2
curve(df(x, df1 = 2, df2 = 10),
      from = 0, to = 10,
      lwd = 2,
      lty = 2,
      col = "red",
      add = TRUE)

# numerator df = 10
curve(df(x, df1 = 10, df2 = 10),
      from = 0, to = 10,
      lwd = 2,
      lty = 2,
      col = "blue",
      add = TRUE)

# numerator df = 100
curve(df(x, df1 = 100, df2 = 10),
      from = 0, to = 10,
      lwd = 2,
      lty = 2,
      col = "orange",
      add = TRUE)

legend("topright",
       legend = c("N (0,1)", 
                  paste0("F (df=", 2, ",", 10,  ")"),
                  paste0("F (df=", 10, ",",  10,  ")"),
                  paste0("F (df=", 100, ",",  10, ")")
                  ),
       col = c("black", "red", "blue", "orange"),
       lwd = 2,
       lty = c(1, 2, 2, 2),  # solid for the normal, dashed for the F curves
       box.lwd = 3)
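The shift in the peak can be made precise. For a numerator df greater than 2, the F density peaks at ((df1 - 2) / df1) * (df2 / (df2 + 2)); with a numerator df of 2 the density simply decreases from its value at zero. The sketch below compares that closed form with a numeric search using `optimize()`; the variable names are illustrative.

```r
df1_values <- c(2, 10, 100)
df2_fixed  <- 10

# Numerically locate the peak of each F density on the plotted range [0, 10].
numeric_peak <- sapply(df1_values, function(d1) {
  optimize(function(x) df(x, df1 = d1, df2 = df2_fixed),
           interval = c(0, 10), maximum = TRUE)$maximum
})

# Closed-form peak; pmax() maps the df1 = 2 case to a peak at zero.
formula_peak <- pmax(df1_values - 2, 0) / df1_values *
  df2_fixed / (df2_fixed + 2)

data.frame(df1 = df1_values,
           numeric_peak = round(numeric_peak, 3),
           formula_peak = round(formula_peak, 3))
```

Both factors in the formula are below 1, which is why the peak approaches but never reaches 1 no matter how large the numerator degrees of freedom become.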

Meanwhile, the denominator degrees of freedom control the distribution’s concentration. We can see in the plot below that an increasing denominator degrees of freedom corresponds to the probability density “spiking” within a smaller and more confined range of critical values, with a thinner right tail. Hence, an F distribution with a smaller denominator degrees of freedom is much more likely to produce a large F statistic.

### Normal PDF 
curve(dnorm(x, mean = 0, sd = 1),
      from = 0,
      to = 10,
      lwd = 2,
      ylim = c(0,1),
      main = "Normal vs F Distributions",
      xlab = "Critical Value",
      ylab = "Probability Density")

box(lwd = 3)

### F Distribution PDF's


# ylim is set by the first curve() call; it has no effect when add = TRUE

# denominator df = 2
curve(df(x, df1 = 10, df2 = 2),
      from = 0, to = 10,
      lwd = 2,
      lty = 2,
      col = "red",
      add = TRUE)

# denominator df = 10
curve(df(x, df1 = 10, df2 = 10),
      from = 0, to = 10,
      lwd = 2,
      lty = 2,
      col = "blue",
      add = TRUE)

# denominator df = 100
curve(df(x, df1 = 10, df2 = 100),
      from = 0, to = 10,
      lwd = 2,
      lty = 2,
      col = "orange",
      add = TRUE)

legend("topright",
       legend = c("N (0,1)", 
                  paste0("F (df=", 10, ",", 2,  ")"),
                  paste0("F (df=", 10, ",",  10,  ")"),
                  paste0("F (df=", 10, ",",  100, ")")
                  ),
       col = c("black", "red", "blue", "orange"),
       lwd = 2,
       lty = c(1, 2, 2, 2),  # solid for the normal, dashed for the F curves
       box.lwd = 3)
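The effect of the denominator degrees of freedom on large F statistics can be put in numbers with the upper-tail probability from `pf()`. In the sketch below, the threshold of 4 is an arbitrary stand-in for a "large" F statistic, and the degrees of freedom match the three curves plotted above.

```r
df2_values <- c(2, 10, 100)

# P(F > 4) with the numerator df fixed at 10, as in the plot above.
tail_prob <- pf(4, df1 = 10, df2 = df2_values, lower.tail = FALSE)

data.frame(df2 = df2_values,
           prob_F_above_4 = signif(tail_prob, 3))
```

The tail probability falls sharply as the denominator degrees of freedom rise, which is the numerical counterpart of the curves tightening in the plot.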

