2025-10-26

Slide 1: What Is a P-Value?

-The p-value quantifies how surprising the data are under a chosen model or assumption.

-It is not the probability that the model is true or false.

-Instead, it measures how compatible the observed data are with that assumption.

-Smaller p-values mean less compatibility between data and model expectations.

Slide 2: The Intuition

-Imagine simulating data many times under some assumption (for example, a fair coin).

-The p-value tells us how extreme our observed result is in that simulated world.

-It answers:

“If the model were true, how often would we see data this unusual?”
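The simulation idea above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `simulated_p_value`, the 20 flips, the 15 observed heads, the number of simulations, and the seed are all assumptions chosen for the example. "Unusual" is taken here to mean "at least as far from the expected number of heads as the observed count, in either direction".

```python
import random


def simulated_p_value(observed_heads, n_flips=20, n_sims=20_000, seed=0):
    """Estimate a two-sided p-value for a fair coin by simulation.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    expected = n_flips / 2     # expected heads under the fair-coin model
    observed_distance = abs(observed_heads - expected)

    extreme = 0
    for _ in range(n_sims):
        # Simulate one experiment in the "fair coin" world.
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        # Count it if it is at least as unusual as what we observed.
        if abs(heads - expected) >= observed_distance:
            extreme += 1

    # Fraction of simulated experiments at least as extreme as the data.
    return extreme / n_sims


print(simulated_p_value(15))
```

The returned fraction is only an estimate: it varies slightly with the seed and becomes more stable as `n_sims` grows.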

Slide 3: Mathematical Definition of the P-Value

The p-value measures how extreme the observed data are under a model.

It is defined as:

\[ p = P(T \ge t_{\text{obs}} \mid \text{model}) \]

where:

  • \(T\) = test statistic measuring extremeness (like a z-score)
  • \(t_{\text{obs}}\) = observed value of the statistic
  • The p-value is the area in the tail of the sampling distribution at or beyond \(t_{\text{obs}}\); a two-sided test also counts the equally extreme values in the opposite tail

Slide 4: Z-Score and P-Value Visualization

The standard score (z-score) for a value is:

\[ z = \frac{\text{observed value} - \text{mean}}{\text{standard deviation}} \]
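As a rough sketch of how a z-score turns into a p-value, the snippet below computes the z-score from the formula above and then the tail area of a standard normal distribution. The helper names `z_score` and `normal_tail_p_value` and the example numbers are assumptions made for illustration, and the normal model is itself an assumption that may only approximate the true sampling distribution.

```python
import math


def z_score(observed, mean, sd):
    """Standard score: how many standard deviations from the mean."""
    return (observed - mean) / sd


def normal_tail_p_value(z, two_sided=True):
    """Tail area of the standard normal distribution beyond z.

    Uses the identity P(Z >= z) = 0.5 * erfc(z / sqrt(2)).
    """
    if two_sided:
        # Both tails: P(|Z| >= |z|).
        return math.erfc(abs(z) / math.sqrt(2))
    # Upper tail only: P(Z >= z).
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical numbers: observed value 15, mean 10, standard deviation 2.5.
z = z_score(15, 10, 2.5)
print(z)                                         # 2.0
print(normal_tail_p_value(z))                    # two-sided tail area
print(normal_tail_p_value(z, two_sided=False))   # upper-tail area only
```

The two-sided value is exactly twice the one-sided value because the normal distribution is symmetric around zero.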

Slide 5: 3D Visualization of Normal Distribution (Plotly)

Slide 6: Histogram of Simulated Z-Scores (ggplot2)

Slide 7: Calculating the P-Value Using the Binomial Formula, with an Example

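The probability of observing exactly \(k\) heads in \(n\) coin flips, where \(p\) is the probability of heads on a single flip (\(p = 0.5\) for a fair coin; this \(p\) is not the p-value), is: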

\[ P(T = k) = \binom{n}{k} p^k (1-p)^{n-k} \]

For \(n = 20\) flips of a fair coin, the two-sided p-value for observing 15 or more heads (or, equally extreme in the other direction, 5 or fewer) is:

\[ p\text{-value} = \sum_{k=15}^{20} \binom{20}{k} 0.5^k (0.5)^{20-k} \;+\; \sum_{k=0}^{5} \binom{20}{k} 0.5^k (0.5)^{20-k} \]
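The two sums can be evaluated directly. The sketch below is one way to do it with Python's standard library; the function names `binom_pmf` and `two_sided_p_value` are hypothetical. It picks the extreme outcomes by distance from the expected count, which reproduces the two sums above only because a fair coin (\(p = 0.5\)) gives a symmetric distribution.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k heads in n flips with heads probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(k_obs, n, p=0.5):
    """Sum the probabilities of all outcomes at least as far from the
    expected count n*p as the observed count.

    Assumption: this distance rule matches the formula above only in the
    symmetric case p = 0.5.
    """
    expected = n * p
    distance = abs(k_obs - expected)
    return sum(
        binom_pmf(k, n, p)
        for k in range(n + 1)
        if abs(k - expected) >= distance
    )


# 15 heads in 20 flips: sums k = 15..20 and k = 0..5.
print(round(two_sided_p_value(15, 20), 4))  # 0.0414
```

In exact terms the sum is 43400 / 2^20, so under the fair-coin model a result this far from 10 heads turns up in roughly 4% of experiments.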

Slide 8: Key Takeaways

-The p-value measures how extreme the observed data are under a model.

-It is not proof of truth or falsehood, only a measure of surprise.

-Combine it with context, effect size, and data quality for sound conclusions.

-Use visualization to understand what the number actually represents.