Understanding the p-value

What is a p-value?

A p-value measures how surprising the data are if the null hypothesis is true.

It answers this question: if \(H_0\) were true, how likely is a result at least as extreme as the one observed?

\[ \text{p-value} = P(\text{data at least as extreme as observed} \mid H_0 \text{ is true}) \]

Smaller p-values mean the observed result is less consistent with the null hypothesis.

Example question

We will use the built-in mtcars data set and test whether the mean miles per gallon is different from 18.

Sample size: 32
Sample mean mpg: 20.09
Null mean: 18
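
The summary values above can be reproduced in R. This is a minimal sketch; the variable names (`mpg`, `n`, `x_bar`, `mu_0`) are illustrative choices and are not taken from the original slides.

```r
# mtcars ships with base R (datasets package), so nothing needs installing.
mpg <- mtcars$mpg

n     <- length(mpg)  # sample size
x_bar <- mean(mpg)    # sample mean of miles per gallon
mu_0  <- 18           # mean under the null hypothesis

round(c(n = n, mean_mpg = x_bar, null_mean = mu_0), 2)
```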

Hypotheses and test statistic

We test

\[ H_0 : \mu = 18 \]

against

\[ H_a : \mu \ne 18 \]

The one-sample t statistic is

\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]

Larger values of \(|t|\) lie further into the tails of the null distribution and therefore give smaller p-values.
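
As a rough illustration, the t statistic can be computed term by term from the formula. The names `s`, `se`, and `t_obs` are assumptions made for this sketch.

```r
mpg  <- mtcars$mpg
mu_0 <- 18                   # null mean

n     <- length(mpg)         # sample size
x_bar <- mean(mpg)           # sample mean
s     <- sd(mpg)             # sample standard deviation

se    <- s / sqrt(n)         # estimated standard error of the mean
t_obs <- (x_bar - mu_0) / se # one-sample t statistic

round(t_obs, 3)
```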

Distribution of mpg

The solid line is the sample mean and the dashed line is the null mean.
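
A plot like the one described could be drawn along the following lines. This is only a sketch that assumes ggplot2 is installed; the bin count, colors, and the object name `p_mpg` are guesses, not the settings used for the original figure.

```r
library(ggplot2)

mu_0  <- 18                  # null mean
x_bar <- mean(mtcars$mpg)    # sample mean

p_mpg <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10, fill = "gray80", color = "white") +
  # Solid line: sample mean
  geom_vline(xintercept = x_bar, linetype = "solid", linewidth = 1) +
  # Dashed line: null mean
  geom_vline(xintercept = mu_0, linetype = "dashed", linewidth = 1) +
  labs(title = "Distribution of mpg", x = "Miles per gallon", y = "Count") +
  theme_minimal(base_size = 16)

p_mpg
```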

What the p-value means

For a two-sided t-test, where \(T\) follows a t distribution with \(n - 1 = 31\) degrees of freedom under \(H_0\),

\[ p\text{-value} = P\left(|T| \ge |t_{\text{obs}}| \mid H_0\right) \]

If the p-value is smaller than the chosen significance level \(\alpha\), we reject the null hypothesis.
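
The tail probability and the decision rule can be sketched with `pt()`, the t distribution's cumulative distribution function in base R. The choice of \(\alpha = 0.05\) and the names `p_value` and `reject_h0` are assumptions for illustration.

```r
mpg   <- mtcars$mpg
mu_0  <- 18
alpha <- 0.05                # significance level (assumed for this sketch)

n     <- length(mpg)
t_obs <- (mean(mpg) - mu_0) / (sd(mpg) / sqrt(n))

# Under H0 (and roughly normal data), T has a t distribution with
# n - 1 degrees of freedom. Doubling the upper-tail area beyond |t_obs|
# gives the two-sided p-value.
p_value <- 2 * pt(abs(t_obs), df = n - 1, lower.tail = FALSE)

reject_h0 <- p_value < alpha # TRUE means "reject H0 at level alpha"

round(p_value, 4)
reject_h0
```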

Simulated null distribution

The two-sided p-value is the area in both tails beyond \(\pm|t_{\text{obs}}|\).
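
The slides do not show how the null distribution was simulated. One plausible approach, sketched here under stated assumptions, is to draw many normal samples whose true mean equals the null value and compute a t statistic for each. The seed, the number of simulations, the normal model, and the names `null_t`, `null_df`, and `p_sim` are all assumptions.

```r
set.seed(1)                  # arbitrary seed, for reproducibility only

mpg   <- mtcars$mpg
mu_0  <- 18
n     <- length(mpg)
t_obs <- (mean(mpg) - mu_0) / (sd(mpg) / sqrt(n))

# Assumption: under H0 the data are normal with mean mu_0 and a
# standard deviation equal to the observed sample value.
n_sims <- 10000
null_t <- replicate(n_sims, {
  x <- rnorm(n, mean = mu_0, sd = sd(mpg))
  (mean(x) - mu_0) / (sd(x) / sqrt(n))
})
null_df <- data.frame(t = null_t)

# Simulated two-sided p-value: share of null t statistics that are
# at least as extreme as the observed one.
p_sim <- mean(abs(null_df$t) >= abs(t_obs))
p_sim
```

With enough simulations, `p_sim` should land close to the p-value from the theoretical t distribution, though it will vary with the seed.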

Interactive plotly plot

This interactive chart shows the null distribution and the observed cutoff.

R code used for the null distribution plot

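# Assumes `null_df` (a data frame with a column `t` of simulated t statistics)
# and `t_obs` (the observed t statistic) already exist in the session.
library(ggplot2)

ggplot(null_df, aes(x = t)) +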
  geom_histogram(
    aes(y = after_stat(density)),
    bins = 40,
    fill = "gray80",
    color = "white"
  ) +
  geom_vline(
    xintercept = c(-abs(t_obs), abs(t_obs)),
    color = "#8C1D40",
    linetype = "dashed",
    linewidth = 1
  ) +
  annotate(
    "text",
    x = abs(t_obs),
    y = 0.38,
    label = paste0("observed |t| = ", round(abs(t_obs), 2)),
    color = "#8C1D40",
    hjust = -0.05
  ) +
  labs(
    title = "Null Distribution of t Statistics",
    x = "t statistic under H0",
    y = "Density"
  ) +
  coord_cartesian(xlim = c(-6, 6)) +
  theme_minimal(base_size = 16)

Interpreting the result

Observed t statistic: 1.962

p-value: 0.0588

Because the p-value (0.0588) is greater than 0.05, we fail to reject \(H_0\) at the 5% level: the data do not give sufficient evidence that the mean mpg differs from 18.
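
The reported values can be checked with base R's `t.test()`. This is a quick sketch; the object name `res` is an arbitrary choice.

```r
# One-sample, two-sided t-test of H0: mu = 18 on the mtcars mpg column.
res <- t.test(mtcars$mpg, mu = 18, alternative = "two.sided")

unname(res$statistic)  # observed t statistic
unname(res$parameter)  # degrees of freedom (n - 1)
res$p.value            # two-sided p-value
```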

Key takeaways

A p-value is a probability computed under the null hypothesis.
Small p-values suggest the observed data are unusual under \(H_0\).
The p-value is not the probability that \(H_0\) is true.
Statistical significance depends on the chosen significance level.
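
To make the last takeaway concrete, the sketch below compares the same p-value against several significance levels. The levels in `alphas` and the name `decision` are illustrative assumptions.

```r
# Same test as in the example: H0: mu = 18 for mtcars mpg.
p_value <- t.test(mtcars$mpg, mu = 18)$p.value

alphas   <- c(0.01, 0.05, 0.10)   # illustrative significance levels
decision <- data.frame(
  alpha     = alphas,
  reject_h0 = p_value < alphas    # TRUE where H0 would be rejected
)
decision
```

With a p-value near 0.0588, \(H_0\) is rejected at the 10% level but not at the 5% or 1% levels, so the conclusion depends on the level chosen before looking at the data.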

P-values are one of the most common tools for making decisions in statistics.