The P-Value: Measuring Evidence in Data

2025-10-26

Slide 1: What Is a P-Value?

-The p-value quantifies how surprising the data are under a chosen model or assumption.

-It is not the probability that the model is true or false.

-Instead, it measures how compatible the observed data are with that assumption.

Smaller p-values mean less compatibility between data and model expectations.

-Imagine simulating data many times under some assumption (for example, a fair coin).

-The p-value tells us how extreme our observed result is in that simulated world.

-It answers:

“If the model were true, how often would we see data this unusual?”
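
The simulation idea above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation. It borrows the 20-flip, 15-head example that appears later in these slides; the function name `simulate_p_value`, the number of simulations, and the seed are all assumptions made for the sketch.

```python
import random

def simulate_p_value(observed_heads=15, n_flips=20, n_sims=20_000, seed=0):
    """Estimate a two-sided p-value by simulating a fair coin many times."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    expected = n_flips / 2             # expected heads under the fair-coin model
    observed_distance = abs(observed_heads - expected)
    extreme = 0
    for _ in range(n_sims):
        # One simulated experiment: count heads in n_flips fair flips.
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        # Count it if it is at least as far from the expectation as our data.
        if abs(heads - expected) >= observed_distance:
            extreme += 1
    return extreme / n_sims

p_sim = simulate_p_value()
print(f"Simulated two-sided p-value: {p_sim:.3f}")
```

The fraction of simulated experiments that look at least as unusual as the observed one is the simulation-based estimate of the p-value. It gets closer to the exact value as `n_sims` grows.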

The p-value measures how extreme the observed data are under a model.

It is defined as:

\[ p = P(T \ge t_{\text{obs}} \mid \text{model}) \]

where:

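-\(T\) = test statistic measuring extremeness (like a z-score)

-\(t_{\text{obs}}\) = observed value of the statistic

The p-value is the area in the tail(s) of the sampling distribution at or beyond \(t_{\text{obs}}\). The formula above gives the upper tail; a two-sided test also counts equally extreme values in the opposite tail.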

The standard score (z-score) for a value is:

\[ z = \frac{\text{observed value} - \text{mean}}{\text{standard deviation}} \]
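
As a rough worked example, the z-score formula can be applied to a coin-flip count. The sketch below assumes 20 flips of a fair coin with 15 heads observed (the numbers used in the coin example in these slides) and uses the standard binomial mean \(np\) and standard deviation \(\sqrt{np(1-p)}\). The helper name `z_score` is hypothetical.

```python
import math

def z_score(observed, mean, sd):
    """Standard score: distance from the mean in standard-deviation units."""
    return (observed - mean) / sd

n_flips, p_heads = 20, 0.5
mean_heads = n_flips * p_heads                           # 10 heads expected
sd_heads = math.sqrt(n_flips * p_heads * (1 - p_heads))  # sqrt(5)

z = z_score(15, mean_heads, sd_heads)
print(f"z = {z:.2f}")
```

Under these assumptions, 15 heads sits a little more than two standard deviations above what a fair coin is expected to produce.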

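The probability of observing exactly \(k\) heads in \(n\) coin flips is given by the binomial formula. Here \(p\) is the probability of heads on a single flip (\(p = 0.5\) for a fair coin), not the p-value:

\[ P(T = k) = \binom{n}{k} p^k (1-p)^{n-k} \]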

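For \(n = 20\) flips of a fair coin, the two-sided p-value for observing 15 or more heads (or, equally extreme in the other direction, 5 or fewer) is:

\[ p\text{-value} = \sum_{k=15}^{20} \binom{20}{k} (0.5)^k (0.5)^{20-k} \;+\; \sum_{k=0}^{5} \binom{20}{k} (0.5)^k (0.5)^{20-k} \approx 0.041 \]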
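
The two sums above can be evaluated directly. The following is a minimal sketch using only Python's standard library. The function names `binomial_pmf` and `two_sided_p_value` are illustrative assumptions, not part of any particular package.

```python
from math import comb

def binomial_pmf(k, n, p_heads):
    """Probability of exactly k heads in n flips with heads probability p_heads."""
    return comb(n, k) * p_heads**k * (1 - p_heads)**(n - k)

def two_sided_p_value(n=20, upper=15, lower=5, p_heads=0.5):
    """Sum both tails: `upper` or more heads, plus `lower` or fewer heads."""
    upper_tail = sum(binomial_pmf(k, n, p_heads) for k in range(upper, n + 1))
    lower_tail = sum(binomial_pmf(k, n, p_heads) for k in range(0, lower + 1))
    return upper_tail + lower_tail

p_value = two_sided_p_value()
print(f"Exact two-sided p-value: {p_value:.4f}")  # 0.0414
```

Because the fair-coin distribution is symmetric, the two tails contribute equally, so the result is twice the upper-tail probability.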

-The p-value measures how extreme the observed data are under a model.

-It is not proof of truth or falsehood, only a measure of surprise.

-Combine it with context, effect size, and data quality for sound conclusions.

-Use visualization to understand what the number actually represents.