Normal Distribution
The normal distribution is the most fundamental and widely used
distribution in statistics. Even those without statistical training
have likely heard of the “bell curve.” That shape is the probability
density of a normally distributed variable; when the data are
standardized (mean 0, standard deviation 1), it is the standard normal
curve.
At the heart of the normal distribution’s characteristics is the
empirical rule, which states that about 68% of a normally distributed
population falls within one standard deviation of the mean, about 95%
falls within two standard deviations and about 99.7% falls within three
standard deviations. An example of the bell curve with the empirical
rule labeled is below.
include_graphics("Bell Curve.png")
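The percentages in the empirical rule can be checked directly from the standard normal cumulative distribution function. The snippet below is a minimal sketch using base R's `pnorm()`; the variable names `k` and `within_k` are chosen here for illustration.

```r
# Probability that a normal variable falls within k standard deviations
# of its mean: P(-k < Z < k) = pnorm(k) - pnorm(-k)
k <- 1:3
within_k <- pnorm(k) - pnorm(-k)

round(within_k, 4)
# 0.6827 0.9545 0.9973
```

Because any normal variable can be standardized, the same three probabilities apply whatever the mean and standard deviation are.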

Closely tied to the normal distribution is the central limit theorem
(CLT). The CLT states that, as the sample size of a study or experiment
increases, the sampling distribution of the sample mean increasingly
resembles a normal distribution. The theorem is distribution-free,
meaning it holds regardless of the shape of the population distribution
from which the sample data came. In other words, as long as the
observations are independent and the population of interest has a
finite mean (or average) and variance, we can presume an approximately
normal distribution of sample statistics (such as the mean or a
proportion) computed from sufficiently large samples of that
population.
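A small simulation can make the CLT concrete. The sketch below assumes a hypothetical, strongly right-skewed population (exponential with mean 1) and compares sample means for two sample sizes; the seed, the sample sizes, and the helper names `sample_mean` and `skewness` are illustrative choices, not part of the original analysis.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical skewed population: exponential with mean 1 and variance 1
sample_mean <- function(n) mean(rexp(n, rate = 1))

# 5000 simulated sample means for a small and a larger sample size
means_n5  <- replicate(5000, sample_mean(5))
means_n50 <- replicate(5000, sample_mean(50))

# Sample skewness: 0 for a perfectly symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

skewness(means_n5)   # noticeably right-skewed
skewness(means_n50)  # closer to 0, i.e. closer to a symmetric bell shape

# The spread of the sample means should be near sigma / sqrt(n)
sd(means_n50)
1 / sqrt(50)
```

Plotting `hist(means_n5)` next to `hist(means_n50)` shows the same effect visually: the larger sample size gives a narrower and more symmetric histogram.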
T Distribution
The normal distribution’s closest relative, so to speak, in the world
of probability distributions is the T distribution (sometimes called
Student’s T). Like the normal distribution, the T distribution is bell
shaped and symmetric, but it has heavier tails. The crucial difference
between the T and normal distributions deals with knowledge of the
population variance.
The normal distribution is the appropriate reference for a
standardized sample mean when the population variance (\(\sigma^2\)) is known.
Alternatively, in a scenario where the population variance is unknown, the T distribution
is used instead. For the T distribution, the sample variance (\(S^2\)) is used as an estimator of the
population variance.
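To illustrate how the sample variance stands in for the unknown population variance, here is a minimal sketch of a one-sample T statistic. The data vector `x` and the hypothesized mean `mu0` are made-up values for illustration, and the reference distribution uses n - 1 degrees of freedom (one less than the sample size).

```r
# Hypothetical sample of 8 measurements (made-up values for illustration)
x   <- c(5.1, 4.9, 5.6, 5.8, 5.0, 5.3, 4.7, 5.4)
mu0 <- 5   # hypothesized population mean (an assumption)

n      <- length(x)
s      <- sd(x)                            # square root of the sample variance S^2
t_stat <- (mean(x) - mu0) / (s / sqrt(n))  # sigma is replaced by its estimate s

# Two-sided p-value from the T distribution with n - 1 degrees of freedom
p_val <- 2 * pt(-abs(t_stat), df = n - 1)

# Cross-check against R's built-in one-sample t test
check <- t.test(x, mu = mu0)
c(manual = t_stat, builtin = unname(check$statistic))
```

The manual statistic and the one reported by `t.test()` should agree, since both divide the distance between the sample mean and `mu0` by the estimated standard error.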
Additionally, the T distribution takes sample size into account while
the standard normal distribution does not. The T distribution’s exact
shape depends on its degrees of freedom (df). For a one-sample mean, we
find the degrees of freedom by subtracting one from the sample size. As
the degrees of freedom increase, so does the T distribution’s
resemblance to the normal distribution, which is why the T distribution
matters most in cases of small sample sizes. An example of T
distribution curves with differing degrees of freedom in comparison to
the standard normal distribution’s bell curve is below.
### Normal PDF
curve(dnorm(x, mean = 0, sd = 1),
from = -4,
to = 4,
lwd = 2,
main = "Normal vs T Distributions",
xlab = "Critical Value",
ylab = "Probability Density")
box(lwd = 3)
### T Distribution PDF's
# df = 3
curve(dt(x, df = 3),
from = -4, to = 4,
lwd = 2,
lty = 2,
col = "red",
add = TRUE)
# df = 10
curve(dt(x, df = 10),
from = -4, to = 4,
lwd = 2,
lty = 2,
col = "blue",
add = TRUE)
# df = 20
curve(dt(x, df = 20),
from = -4, to = 4,
lwd = 2,
lty = 2,
col = "orange",
add = TRUE)
legend("topright",
legend = c("N (0,1)",
paste0("T (df=", 3, ")"),
paste0("T (df=", 10, ")"),
paste0("T (df=", 20, ")")
),
col = c("black", "red", "blue", "orange"),
lwd = 2,
lty = c(1, 2, 2, 2),
box.lwd = 3)
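The convergence of the T curves toward the normal curve can also be read off numerically. The sketch below compares two-sided 95% critical values for the same degrees of freedom used in the plot; the 95% level is an illustrative choice.

```r
dfs <- c(3, 10, 20)

# Two-sided 95% critical values: T for each df, then the standard normal
t_crit <- qt(0.975, df = dfs)
z_crit <- qnorm(0.975)

round(c(setNames(t_crit, paste0("T(df=", dfs, ")")), "N(0,1)" = z_crit), 3)
# 3.182 2.228 2.086 1.960
```

The T critical values are always larger than the normal one, reflecting the heavier tails, and they shrink toward 1.96 as the degrees of freedom grow.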

Chi-Square Distribution
Related to the normal, T and gamma distributions is the chi-square
distribution. A chi-square variable is the sum of squared independent
standard normal variables, with one degree of freedom per term. The
gamma distribution is used for a population that is non-negative and
skewed rightward. A typical example of when the gamma distribution
might be applicable is modeling the length of time until a rare event
(like a natural disaster) occurs.
The chi-square distribution is considered a special instance of the
gamma. The gamma distribution has two parameters, a shape parameter alpha (\(\alpha\)) and a scale parameter beta (\(\beta\)). If \(\alpha = \frac{v}{2}\) (with \(v\) representing the
degrees of freedom) and \(\beta = 2\),
then the distribution is chi-square.
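This relationship can be verified numerically. The sketch below assumes the shape/scale parameterization of the gamma distribution (base R's `dgamma()` accepts `shape` and `scale`); the degrees of freedom `v = 6` and the grid `x` are arbitrary choices for illustration.

```r
v <- 6                        # degrees of freedom (arbitrary choice)
x <- seq(0.5, 20, by = 0.5)   # grid of non-negative values

gamma_pdf <- dgamma(x, shape = v / 2, scale = 2)
chisq_pdf <- dchisq(x, df = v)

all.equal(gamma_pdf, chisq_pdf)  # TRUE
```

Note that `dgamma()` also has a `rate` argument, which is the reciprocal of the scale; with that parameterization the chi-square case corresponds to `rate = 1/2`.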
Just like the T distribution, the chi-square’s degrees of freedom are
one less than the sample size when it describes a sample variance.
Unlike the normal and T distributions though, the chi-square has a clear
rightward skew. This is because, as mentioned above, the chi-square
distribution only takes non-negative values, so its left side is bounded
at zero while its right tail is not. With more degrees of freedom, the
peak of the density sits at a greater critical value (the mean equals
the degrees of freedom), and the curve becomes flatter and more
symmetric. This can be seen in the plot below.
### Normal PDF
curve(dnorm(x, mean = 0, sd = 1),
from = 0,
to = 40,
lwd = 2,
ylim = c(0,0.25),
main = "Normal vs Chi-Square Distributions",
xlab = "Critical Value",
ylab = "Probability Density")
box(lwd = 3)
### Chi-Square Distribution PDF's
# df = 3
curve(dchisq(x, df = 3),
from = 0, to = 40,
lwd = 2,
lty = 2,
col = "red",
add = TRUE)
# df = 10
curve(dchisq(x, df = 10),
from = 0, to = 40,
lwd = 2,
lty = 2,
col = "blue",
add = TRUE)
# df = 20
curve(dchisq(x, df = 20),
from = 0, to = 40,
lwd = 2,
lty = 2,
col = "orange",
add = TRUE)
legend("topright",
legend = c("N (0,1)",
paste0("Chi (df=", 3, ")"),
paste0("Chi (df=", 10, ")"),
paste0("Chi (df=", 20, ")")
),
col = c("black", "red", "blue", "orange"),
lwd = 2,
lty = c(1, 2, 2, 2),
box.lwd = 3)
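The rightward movement of the peak can be quantified. The sketch below locates each chi-square density's peak numerically with base R's `optimize()` and compares it with the known closed forms for this distribution (mode `df - 2` for `df >= 2`, mean `df`); the variable names are illustrative.

```r
dfs <- c(3, 10, 20)   # same degrees of freedom as in the plot

# Locate each density's peak numerically on the plotted range
peak <- sapply(dfs, function(v) {
  optimize(function(x) dchisq(x, df = v),
           interval = c(0, 40), maximum = TRUE)$maximum
})

data.frame(df = dfs, peak = round(peak, 2), mode_formula = dfs - 2, mean = dfs)
```

The numerical peaks should line up with `df - 2`, while the mean sits two units further right at `df`, which is one way to see the right skew.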

F Distribution
The F distribution largely resembles the chi-square, and that is
because an F variable is the quotient of two independent chi-square
variables, each divided by its own degrees of freedom. While the T and
chi-square distributions considered the degrees of freedom as (n - 1),
the F distribution applies this formula to each of the two chi-square
variables in its calculation. Therefore, the F distribution has separate
degrees of freedom parameters for the numerator and the denominator.
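The quotient construction can be checked by simulation. The following is a minimal sketch, assuming the standard construction in which each chi-square draw is divided by its own degrees of freedom before taking the ratio; the seed, the degrees of freedom, and the number of draws are arbitrary illustrative choices.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

num_df <- 5    # numerator degrees of freedom (arbitrary choice)
den_df <- 10   # denominator degrees of freedom (arbitrary choice)

# Ratio of two independent chi-square draws, each divided by its own df
f_sim <- (rchisq(1e5, df = num_df) / num_df) /
         (rchisq(1e5, df = den_df) / den_df)

# Simulated quantiles next to the theoretical F quantiles
probs <- c(0.25, 0.50, 0.75, 0.90)
rbind(simulated   = quantile(f_sim, probs),
      theoretical = qf(probs, df1 = num_df, df2 = den_df))
```

The two rows should be close to each other, with small differences due to simulation noise.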
Like the chi-square distribution, the F distribution is non-negative
and rightward skewed. The numerator degrees of freedom primarily
control where the curve peaks along the plot’s x axis. Increasing the
numerator degrees of freedom moves the rise in relative probability to
a greater critical value (a higher value along the x axis). The plot
below illustrates how, while holding the denominator degrees of freedom
constant, an increase in the numerator degrees of freedom means a
rightward shift in the distribution’s peak, hence a weakening of the
rightward skew.
### Normal PDF
curve(dnorm(x, mean = 0, sd = 1),
from = 0,
to = 10,
lwd = 2,
ylim = c(0,1),
main = "Normal vs F Distributions",
xlab = "Critical Value",
ylab = "Probability Density")
box(lwd = 3)
### F Distribution PDF's
# df1 = 2, df2 = 10
curve(df(x, df1 = 2, df2 = 10),
from = 0, to = 10,
lwd = 2,
lty = 2,
col = "red",
add = TRUE)
# df1 = 10, df2 = 10
curve(df(x, df1 = 10, df2 = 10),
from = 0, to = 10,
lwd = 2,
lty = 2,
col = "blue",
add = TRUE)
# df1 = 100, df2 = 10
curve(df(x, df1 = 100, df2 = 10),
from = 0, to = 10,
lwd = 2,
lty = 2,
col = "orange",
add = TRUE)
legend("topright",
legend = c("N (0,1)",
paste0("F (df=", 2, ",", 10, ")"),
paste0("F (df=", 10, ",", 10, ")"),
paste0("F (df=", 100, ",", 10, ")")
),
col = c("black", "red", "blue", "orange"),
lwd = 2,
lty = c(1, 2, 2, 2),
box.lwd = 3)
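The shift of the peak can be made precise. The sketch below finds the peak of each plotted F density numerically with `optimize()` and compares it with the closed-form mode of the F distribution, `((df1 - 2) / df1) * (df2 / (df2 + 2))`, which applies for `df1 >= 2` here; the variable names `num_df` and `den_df` are illustrative.

```r
den_df <- 10               # denominator df held constant, as in the plot
num_df <- c(2, 10, 100)    # numerator df values from the plot

# Numerical location of each curve's peak on the plotted range
peak <- sapply(num_df, function(d1) {
  optimize(function(x) df(x, df1 = d1, df2 = den_df),
           interval = c(0, 10), maximum = TRUE)$maximum
})

# Closed-form mode of the F distribution
mode_formula <- ((num_df - 2) / num_df) * (den_df / (den_df + 2))

round(cbind(num_df, peak, mode_formula), 3)
```

As the numerator degrees of freedom grow, the mode approaches `den_df / (den_df + 2)`, so the peak moves right but never passes 1.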

Meanwhile, the denominator degrees of freedom control the
distribution’s concentration. We can see via the graphic below that
increasing the denominator degrees of freedom concentrates the
probability density within a smaller range of critical values. Hence,
an F distribution with smaller denominator degrees of freedom has a
heavier right tail and is much more likely to produce a large F
statistic.
### Normal PDF
curve(dnorm(x, mean = 0, sd = 1),
from = 0,
to = 10,
lwd = 2,
ylim = c(0,1),
main = "Normal vs F Distributions",
xlab = "Critical Value",
ylab = "Probability Density")
box(lwd = 3)
### F Distribution PDF's
# df1 = 10, df2 = 2
curve(df(x, df1 = 10, df2 = 2),
from = 0, to = 10,
lwd = 2,
lty = 2,
col = "red",
add = TRUE)
# df1 = 10, df2 = 10
curve(df(x, df1 = 10, df2 = 10),
from = 0, to = 10,
lwd = 2,
lty = 2,
col = "blue",
add = TRUE)
# df1 = 10, df2 = 100
curve(df(x, df1 = 10, df2 = 100),
from = 0, to = 10,
lwd = 2,
lty = 2,
col = "orange",
add = TRUE)
legend("topright",
legend = c("N (0,1)",
paste0("F (df=", 10, ",", 2, ")"),
paste0("F (df=", 10, ",", 10, ")"),
paste0("F (df=", 10, ",", 100, ")")
),
col = c("black", "red", "blue", "orange"),
lwd = 2,
lty = c(1, 2, 2, 2),
box.lwd = 3)
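The claim about large F statistics can be checked with upper-tail probabilities. The sketch below uses `pf()` with the numerator degrees of freedom fixed at 10, as in the plot; the cutoff of 4 is an arbitrary illustrative threshold for a "large" F statistic.

```r
den_df <- c(2, 10, 100)   # denominator df values from the plot

# P(F > 4) with the numerator df fixed at 10
tail_prob <- pf(4, df1 = 10, df2 = den_df, lower.tail = FALSE)

setNames(round(tail_prob, 4), paste0("df2=", den_df))
```

The tail probability should drop sharply as the denominator degrees of freedom increase, matching the tighter concentration seen in the plot.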

