Discussions

1. Student t distribution convergence to the Normal distribution

The Student t distribution with ν degrees of freedom converges in distribution to the standard Normal distribution as ν goes to infinity.

\[ t_\nu \xrightarrow[\nu \to \infty]{d} \mathcal{N}(0, 1) \]

Both distributions are centered at 0.
For small degrees of freedom, the t distribution has heavier tails than the Normal distribution.
As the degrees of freedom increase, the tails become lighter and the t distribution converges to the Normal distribution.
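
One way to quantify the lightening tails is through the variance. This is a standard property of the t distribution rather than something derived in this write-up:

\[ \operatorname{Var}(t_\nu) = \frac{\nu}{\nu - 2} \quad (\nu > 2), \qquad \lim_{\nu \to \infty} \frac{\nu}{\nu - 2} = 1 \]

For the degrees of freedom plotted below, this gives \(5/3 \approx 1.67\) at \(\nu = 5\), \(15/13 \approx 1.15\) at \(\nu = 15\), \(30/28 \approx 1.07\) at \(\nu = 30\), and \(120/118 \approx 1.02\) at \(\nu = 120\). At \(\nu = 2\) the variance is infinite, which is the most extreme heavy-tail case in the plot.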

df_list <- c(2, 5, 15, 30, 120)

x <- seq(-5, 5, length.out = 2000)

dens_norm <- dnorm(x)

dens_t <- sapply(df_list, function(v) dt(x, df = v))

plot(
  x, dens_norm,
  type = "l",
  lwd = 3,
  xlab = "x",
  ylab = "Density",
  main = "Normal Distribution vs Student t Distributions"
)

for (i in seq_along(df_list)) {
  lines(x, dens_t[, i], lwd = 2, lty = i + 1)
}

legend(
  "topright",
  legend = c("Normal(0,1)", paste0("t(df = ", df_list, ")")),
  lwd = c(3, rep(2, length(df_list))),
  lty = c(1, seq_along(df_list) + 1),
  bty = "n"
)

For df = 2 and df = 5, the Student t distributions have much heavier tails than the Normal distribution. As the degrees of freedom increase to 15 and 30, the difference between the t distribution and the Normal distribution becomes smaller. By df = 120, the t distribution is visually almost identical to the Normal distribution across most of the range, illustrating convergence.
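
The visual comparison can be backed up with numbers. The sketch below is an illustrative check, not part of the original analysis: it compares the upper 2.5% critical value and the two-sided tail probability beyond |x| = 3 for each t distribution against the standard Normal. The cutoffs 0.975 and 3 are assumed values chosen only for illustration.

```r
# Degrees of freedom used in the plot above
df_list <- c(2, 5, 15, 30, 120)

# Upper 2.5% critical values: t distributions vs. standard Normal
crit_t    <- qt(0.975, df = df_list)
crit_norm <- qnorm(0.975)

# Two-sided tail probability beyond |x| = 3 (illustrative cutoff)
tail_t    <- 2 * pt(3, df = df_list, lower.tail = FALSE)
tail_norm <- 2 * pnorm(3, lower.tail = FALSE)

comparison <- data.frame(
  dist          = c(paste0("t(df = ", df_list, ")"), "Normal(0,1)"),
  crit_975      = round(c(crit_t, crit_norm), 3),
  tail_beyond_3 = signif(c(tail_t, tail_norm), 3)
)
print(comparison)
```

Both columns should shrink toward the Normal values as the degrees of freedom grow, which is the same convergence the plot shows.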

2. Normal data and z-score transformation

We generate data according to:

\[ X_1, \ldots, X_n \sim \mathcal{N}(\mu, \sigma^2), \quad \mu = 108, \ \sigma = 7.2, \ n = 1000 \]

We then compute z-scores using the sample mean \(\bar{X}\) and sample standard deviation \(S\):

\[ Z_i = \frac{X_i - \bar{X}}{S} \]

This transformation shifts the data to have mean 0 and rescales it to have standard deviation 1.
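
A short derivation shows why the sample mean and standard deviation of the z-scores are exactly 0 and 1. It assumes \(S\) uses the \(n - 1\) denominator (as R's `sd()` does), and \(S_Z\) is notation introduced here for the sample standard deviation of the \(Z_i\):

\[ \bar{Z} = \frac{1}{n} \sum_{i=1}^{n} \frac{X_i - \bar{X}}{S} = \frac{\bar{X} - \bar{X}}{S} = 0 \]

\[ S_Z^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (Z_i - \bar{Z})^2 = \frac{1}{S^2} \cdot \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{S^2}{S^2} = 1 \]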

set.seed(42)

n <- 1000
mu <- 108
sigma <- 7.2

x <- rnorm(n, mean = mu, sd = sigma)

x_bar <- mean(x)
s <- sd(x)

z <- (x - x_bar) / s

par(mfrow = c(1, 2))

hist(
  x,
  breaks = 35,
  probability = TRUE,
  main = "Distribution of Raw Data",
  xlab = "X",
  border = "white"
)
lines(density(x), lwd = 2)

hist(
  z,
  breaks = 35,
  probability = TRUE,
  main = "Distribution of Z-Scores",
  xlab = "Z",
  border = "white"
)
lines(density(z), lwd = 2)

par(mfrow = c(1, 1))

The two histograms have the same shape. The z-score transformation is an affine transformation: it subtracts a constant (the sample mean) and divides by a positive constant (the sample standard deviation). Affine transformations preserve the shape of a distribution, changing only its location and scale. Because the original data are drawn from a Normal distribution, the standardized values are also approximately normally distributed. The raw data are centered near 108 with a standard deviation near 7.2, while the z-scores have mean 0 and standard deviation 1 by construction, but both retain the same bell-shaped form.
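
The shape-preservation claim can also be checked numerically. The sketch below is an illustrative check that regenerates the same data (same seed and parameters as above) so it runs on its own. The probability grid `probs` is an assumed choice for illustration.

```r
# Regenerate the data from this section so the block is self-contained
set.seed(42)
x <- rnorm(1000, mean = 108, sd = 7.2)
z <- (x - mean(x)) / sd(x)

# Location and scale of the z-scores: 0 and 1 up to floating-point rounding
c(mean_z = mean(z), sd_z = sd(z))

# The built-in scale() performs the same standardization
all.equal(as.numeric(scale(x)), z)

# An affine map with positive slope keeps every point's relative position:
# x and z are perfectly correlated, and quantiles of z are the standardized
# quantiles of x
cor(x, z)
probs <- c(0.05, 0.25, 0.50, 0.75, 0.95)  # illustrative grid
all.equal(quantile(z, probs), (quantile(x, probs) - mean(x)) / sd(x))
```

Because every quantile of `z` is just the corresponding quantile of `x` shifted and rescaled, the two histograms differ only in their axis labels.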

3. Explain a p-value

A p-value is the probability, assuming the null hypothesis \(H_0\) is true, of observing a test statistic at least as extreme as the one computed from the sample.

Let \(T\) be a test statistic and let \(t_{\text{obs}}\) be its observed value. Then:

Right-tailed test:

\[ p = P(T \ge t_{\text{obs}} \mid H_0) \]

Left-tailed test:

\[ p = P(T \le t_{\text{obs}} \mid H_0) \]

Two-tailed test:

\[ p = P(|T| \ge |t_{\text{obs}}| \mid H_0) \]
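
As a minimal sketch, the three definitions map onto R's `pt()` as follows when \(T\) follows a t distribution under \(H_0\). The values `t_obs = 2.5` and `nu = 24` are assumed for illustration (they match the worked example later in this section), and the variable names are hypothetical.

```r
t_obs <- 2.5   # illustrative observed statistic
nu    <- 24    # illustrative degrees of freedom

# Right-tailed: P(T >= t_obs | H0)
p_right <- pt(t_obs, df = nu, lower.tail = FALSE)

# Left-tailed: P(T <= t_obs | H0)
p_left <- pt(t_obs, df = nu)

# Two-tailed: P(|T| >= |t_obs| | H0), using symmetry of the t distribution
p_two <- 2 * pt(abs(t_obs), df = nu, lower.tail = FALSE)

c(right = p_right, left = p_left, two_tailed = p_two)
```

For a continuous statistic the left- and right-tailed p-values sum to 1, and for a symmetric null distribution the two-tailed p-value is twice the smaller of the two.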

A small p-value means the observed result would be unlikely if \(H_0\) were true, which provides evidence against \(H_0\).
A large p-value means the observed result is reasonably plausible under \(H_0\), so the data do not provide strong evidence against \(H_0\).

A p-value is not the probability that \(H_0\) is true. It is a probability about the data (or more extreme data), conditional on \(H_0\) being true.
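
A small simulation can make this interpretation concrete. The sketch below is illustrative, with an assumed sample size (`n = 25`), number of replications, and seed. It repeatedly draws data for which \(H_0: \mu = 108\) is true and records the two-sided t-test p-value each time.

```r
set.seed(123)    # assumed seed, for reproducibility only
n_sim <- 5000    # assumed number of simulated samples
n     <- 25      # assumed sample size

# Each replicate: draw a sample with H0 true, then test H0: mu = 108
p_values <- replicate(n_sim, {
  sample_h0 <- rnorm(n, mean = 108, sd = 7.2)
  t.test(sample_h0, mu = 108)$p.value
})

# Under H0 the p-values are approximately Uniform(0, 1), so the share
# falling at or below 0.05 should be close to 0.05
mean(p_values <= 0.05)

hist(p_values, breaks = 20, main = "p-values when H0 is true", xlab = "p")
```

Small p-values still occur when \(H_0\) is true, just rarely. This is why a p-value describes how unusual the data are under \(H_0\), not how likely \(H_0\) is.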

Suppose we test whether the population mean equals 108:

\[ H_0: \mu = 108 \quad \text{vs} \quad H_a: \mu \ne 108 \]

A one-sample t-test uses the statistic:

\[ t = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \]

If we observe \(t_{\text{obs}} = 2.5\) with \(df = n - 1 = 24\) (that is, \(n = 25\)), then the two-tailed p-value is:

\[ p = 2 \cdot P(T \ge |t_{\text{obs}}| \mid H_0), \quad T \sim t_{df} \]

t_obs <- 2.5
df <- 24

p_two_tailed <- 2 * (1 - pt(abs(t_obs), df = df))
p_two_tailed

## [1] 0.01965418
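
The statistic and p-value formulas can be checked against R's built-in `t.test()`. The sketch below uses simulated data with an assumed true mean of 110 and an assumed seed, purely for illustration. It is not the data behind \(t_{\text{obs}} = 2.5\).

```r
set.seed(1)      # assumed seed
n   <- 25
mu0 <- 108       # hypothesized mean under H0

# Illustrative sample; the true mean of 110 is an assumption
sample_x <- rnorm(n, mean = 110, sd = 7.2)

# Manual one-sample t statistic and two-tailed p-value
t_manual <- (mean(sample_x) - mu0) / (sd(sample_x) / sqrt(n))
p_manual <- 2 * pt(abs(t_manual), df = n - 1, lower.tail = FALSE)

# Built-in test for comparison
fit <- t.test(sample_x, mu = mu0)

all.equal(unname(fit$statistic), t_manual)
all.equal(fit$p.value, p_manual)
```

Both comparisons should return `TRUE`, confirming that `t.test()` computes the same statistic and two-tailed p-value as the formulas.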

Given a significance level \(\alpha = 0.05\):

If \(p \le \alpha\), reject \(H_0\).

If \(p > \alpha\), do not reject \(H_0\).

The p-value measures the strength of evidence against \(H_0\) on a continuous scale, while \(\alpha\) is a fixed cutoff used to make a binary decision.
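
Applied to the worked example (\(t_{\text{obs}} = 2.5\), \(df = 24\)), the decision rule can be sketched as below. The variable names are hypothetical.

```r
alpha   <- 0.05
p_value <- 2 * pt(2.5, df = 24, lower.tail = FALSE)  # about 0.0197

# Binary decision at the fixed cutoff alpha
decision <- if (p_value <= alpha) "reject H0" else "do not reject H0"
decision
```

Here the p-value falls below 0.05, so \(H_0: \mu = 108\) is rejected at the 5% level. At a stricter cutoff such as \(\alpha = 0.01\), the same p-value would not lead to rejection.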

Sam Denomme

2026-01-16
