P-Value

2026-02-07

Overview

The \(p\)-value is the probability, computed under the assumption that the null hypothesis is true, of obtaining a test result at least as extreme as the one actually observed.

In hypothesis testing, the \(p\)-value is used to assess whether the data provide sufficient evidence to reject the null hypothesis. Smaller \(p\)-values indicate stronger evidence against the null hypothesis, which is rejected when the \(p\)-value falls below a pre-chosen significance level \(\alpha\).
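
Written out for a test statistic \(T\) with observed value \(t_{\text{obs}}\) (notation introduced here for illustration only), the definition reads as follows. The two-sided form assumes the null distribution of \(T\) is symmetric about zero.

\[
p = P\left(T \ge t_{\text{obs}} \mid H_0\right) \ \text{(right-tailed)}, \qquad
p = P\left(|T| \ge |t_{\text{obs}}| \mid H_0\right) \ \text{(two-sided)}
\]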

How to Calculate a \(p\)-value

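State the null hypothesis \(H_0\) and the alternative \(H_1\)
Calculate a test statistic, such as
t-score: \(t = \frac{\bar{x} - \mu}{s / \sqrt{n}}\) (population standard deviation unknown and estimated by the sample standard deviation \(s\))
z-score: \(z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}\) (population standard deviation \(\sigma\) known)
Here \(\mu\) is the population mean claimed by \(H_0\)
Find the tail area of the null distribution beyond the observed test statistic; this area is the \(p\)-value
Interpret the result
if \(p < \alpha\) then reject \(H_0\)
if \(p \ge \alpha\) then fail to reject \(H_0\)
By convention \(\alpha = 0.05\), so a \(p\)-value below 0.05 is generally considered statistically significant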
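
The steps above can be sketched in R for a one-sample t-test. This is a minimal illustration, not a prescribed workflow: the sample values, the null mean `mu0` (the \(\mu\) of the formula), and `alpha` are made up. Base R's `t.test` is used only as a cross-check on the manual calculation.

```r
# Hypothetical sample and null value (illustrative only, not from the source)
x     <- c(5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7)
mu0   <- 5      # H0: the population mean equals 5
alpha <- 0.05   # significance level

# Test statistic: t = (xbar - mu0) / (s / sqrt(n))
n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided tail area under the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_stat), df = n - 1)

# Cross-check against the built-in test
p_builtin <- t.test(x, mu = mu0)$p.value

# Decision rule
decision <- if (p_manual < alpha) "reject H0" else "fail to reject H0"
cat(sprintf("t = %.3f, p = %.4f -> %s\n", t_stat, p_manual, decision))
```

The factor of 2 makes the test two-sided. For a one-sided alternative, a single tail such as `pt(t_stat, df = n - 1, lower.tail = FALSE)` would be used instead.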

\(p\)-value Graphically

Type I and Type II Errors

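The curves show the sampling distribution of the test statistic under no effect (\(H_0\)) and under a real effect (\(H_1\)). Rejecting a true null hypothesis is a Type I error (probability \(\alpha\)), while failing to reject a false null hypothesis, that is, missing a real effect, is a Type II error (probability \(\beta\)). The vertical line marks the threshold beyond which \(H_0\) is rejected.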

Code from Previous Graph

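library(ggplot2)

# Setup not shown in the original; the values below are assumptions:
# a one-sided z-test at alpha = 0.05 with H0 ~ N(0, 1) and H1 ~ N(2, 1).
z_alpha <- qnorm(0.95)  # rejection threshold
x <- seq(-4, 6, length.out = 1000)
df <- data.frame(x = x, H0 = dnorm(x), H1 = dnorm(x, mean = 2))

ggplot(df, aes(x = x)) +
  # Sampling distributions under H0 (solid) and H1 (dashed)
  geom_line(aes(y = H0), color = "black", linewidth = 1.2) +
  geom_line(aes(y = H1), color = "blue", linetype = "dashed", linewidth = 1.2) +
  # Type I error: H0 area beyond the threshold; Type II error: H1 area below it
  geom_area(data = subset(df, x > z_alpha), aes(y = H0), fill = "red", alpha = 0.4) +
  geom_area(data = subset(df, x <= z_alpha), aes(y = H1), fill = "blue", alpha = 0.3) +
  geom_vline(xintercept = z_alpha, linetype = "dotted", color = "black", linewidth = 1) +
  # Arrow to the Type I error region, drawn once via annotate() rather than once per row
  annotate("segment", x = 2.8, y = 0.13, xend = 2, yend = 0.05,
           arrow = arrow(length = unit(0.2, "inches")),
           color = "red", linewidth = 1) +
  annotate("text", x = z_alpha + 0.6, y = 0.35, label = "Reject H0", color = "black") +
  annotate("text", x = -1.9, y = 0.3, label = "Fail to Reject H0", color = "black") +
  annotate("text", x = 3.2, y = 0.145, label = "Type I Error (α)", color = "red") +
  annotate("text", x = 0, y = 0.05, label = "Type II Error (β)", color = "blue") +
  labs(title = "Type I and Type II Errors",
       x = "Test Statistic / Sample Mean",
       y = "Probability Density") +
  theme_minimal(base_size = 12)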
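
The two shaded areas in a plot of this kind are the Type I and Type II error rates, and they can be computed directly from the normal distribution. The sketch below assumes a one-sided z-test with \(H_0 \sim N(0, 1)\) and \(H_1 \sim N(2, 1)\). These parameters and the names `alpha_target` and `mu1` are hypothetical, chosen only to make the numbers concrete.

```r
# Assumed setup (hypothetical): one-sided z-test, H0 ~ N(0, 1), H1 ~ N(2, 1)
alpha_target <- 0.05
mu1 <- 2

# Rejection threshold: upper 5% point of the H0 distribution
z_alpha <- qnorm(1 - alpha_target)

# Type I error rate: area under H0 to the right of the threshold
alpha <- pnorm(z_alpha, mean = 0, sd = 1, lower.tail = FALSE)

# Type II error rate: area under H1 to the left of the threshold
beta <- pnorm(z_alpha, mean = mu1, sd = 1)

# Power: probability of detecting the effect when it is real
power <- 1 - beta

print(round(c(threshold = z_alpha, alpha = alpha, beta = beta, power = power), 3))
```

Moving the threshold to the right lowers \(\alpha\) but raises \(\beta\), so for a fixed sample size the two error rates trade off against each other.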

How Sample Size Affects the \(p\)-value

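For a fixed nonzero effect, increasing the sample size tends to make the \(p\)-value smaller, because the standard error \(s / \sqrt{n}\) shrinks and the test statistic grows
A small sample size lowers the power of the test, which increases the probability of a Type II error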
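
Both points can be illustrated numerically. The sketch below holds a hypothetical effect fixed (a mean difference of 0.2 with standard deviation 1, both made up) and varies only the sample size. The \(p\)-value comes from the z-score formula, and the Type II error rate from `power.t.test` in base R's stats package.

```r
# Hypothetical fixed effect: mean differs from the null value by 0.2, sigma = 1
effect <- 0.2
sigma  <- 1
n      <- c(10, 50, 100, 500)

# z = effect / (sigma / sqrt(n)); two-sided p-value from the standard normal
z <- effect / (sigma / sqrt(n))
p <- 2 * pnorm(-abs(z))

# Power of a two-sided one-sample t-test at alpha = 0.05 for the same true effect
power <- sapply(n, function(k)
  power.t.test(n = k, delta = effect, sd = sigma, sig.level = 0.05,
               type = "one.sample", alternative = "two.sided")$power)

print(data.frame(n = n, z = round(z, 2), p_value = signif(p, 3),
                 type_II = round(1 - power, 3)))
```

The effect is identical in every row. Only \(n\) changes, which is why a small \(p\)-value on its own says little about how large the effect is.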

Common Misconceptions

A \(p\)-value is not the probability that \(H_0\) is true. It’s a measure of how compatible the observed data are with \(H_0\).
A small \(p\)-value is evidence against \(H_0\), not proof of an effect.
A large \(p\)-value is inconclusive, not proof of no effect. It could be due to a small sample size or noisy data.
Always consider effect size, sample size, and context.