Understanding p-values

What is a p-value?

A p-value is the probability, computed under the assumption that the null hypothesis is true, of observing a test statistic at least as extreme as the one actually observed. For a two-sided test with statistic \(T\) and observed value \(t_{\text{obs}}\):

\[ p = P\left(|T| \ge |t_{\text{obs}}| \mid H_0\right) \]

A small p-value means that a result at least as extreme as the one observed would be unlikely if the null hypothesis were true. It is not the probability that the null hypothesis is true.
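As a minimal sketch of the two-sided formula above, the tail probability can be computed in R with `pt()`, the cumulative distribution function of the t distribution. The values of `t_obs` and `df` are the ones reported for the exam-score example later in this document; the variable names themselves are illustrative.

```r
t_obs <- 2.0478  # observed t statistic reported for the exam-score example
df <- 39         # degrees of freedom, n - 1 with n = 40

# P(|T| >= |t_obs|) under H0: by symmetry, twice the area below -|t_obs|
p_value <- 2 * pt(-abs(t_obs), df = df)
p_value
```

The result should agree with the p-value printed by `t.test()` up to rounding of the reported statistic.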

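A common rule, using the conventional significance level \(\alpha = 0.05\), is:

If \(p \le 0.05\): reject \(H_0\) (the data provide evidence against the null hypothesis)
If \(p > 0.05\): fail to reject \(H_0\) (the data do not provide strong evidence against it, which is not the same as showing that \(H_0\) is true)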
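The rule above can be sketched as a small function. The name `decide` and the second input value (0.20) are hypothetical, chosen only for illustration; the threshold defaults to the 0.05 used in the rule.

```r
# Hypothetical helper: applies the decision rule at significance level alpha
decide <- function(p_value, alpha = 0.05) {
  if (p_value <= alpha) "reject H0" else "fail to reject H0"
}

decide(0.047)  # p-value from the exam-score example
decide(0.20)   # an illustrative larger p-value
```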

The four steps of hypothesis testing

State the null and alternative hypotheses
Choose a test statistic
Find its distribution under \(H_0\)
Compare the observed value to that distribution to obtain the p-value
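The four steps can be illustrated by simulation, as a rough sketch rather than the method used in this document. It assumes normally distributed scores and an arbitrary population standard deviation (`sigma <- 3` is a made-up value; the t statistic does not depend on it under normality). The seed and the number of replications are also arbitrary choices.

```r
set.seed(1)       # arbitrary seed, for reproducibility only
n <- 40           # sample size from the exam-score example
mu0 <- 70         # Step 1: H0 says the mean is 70
sigma <- 3        # assumed population SD (hypothetical)
t_obs <- 2.0478   # observed statistic reported for the example

# Step 2: the one-sample t statistic
t_stat_fn <- function(x) (mean(x) - mu0) / (sd(x) / sqrt(n))

# Step 3: approximate the null distribution by simulating samples under H0
t_null <- replicate(20000, t_stat_fn(rnorm(n, mean = mu0, sd = sigma)))

# Step 4: proportion of simulated statistics at least as extreme as observed
p_sim <- mean(abs(t_null) >= abs(t_obs))
p_sim
```

The simulated proportion should land close to the exact p-value, with some Monte Carlo noise.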

Step 1: Hypotheses in LaTeX

\[ H_0: \mu = 70 \]

\[ H_a: \mu \neq 70 \]

Step 2: Test statistic in LaTeX

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

\[ t \sim t_{n-1} \]
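A short sketch of the formula above in R follows. The sample standard deviation is not reported in this document, so `s_assumed` is a back-solved value, chosen so that the statistic matches the reported 2.0478. It is an assumption, not a figure from the data. The function name `one_sample_t` is also hypothetical.

```r
# Hypothetical helper implementing t = (xbar - mu0) / (s / sqrt(n))
one_sample_t <- function(xbar, mu0, s, n) {
  (xbar - mu0) / (s / sqrt(n))
}

s_assumed <- 3.0885  # ASSUMPTION: back-solved from the reported t, not given in the source
t_stat <- one_sample_t(xbar = 71, mu0 = 70, s = s_assumed, n = 40)

# Under H0, t follows a t distribution with n - 1 = 39 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = 40 - 1)
c(t = t_stat, p = p_value)
```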

Example data and results

We analyze exam scores for 40 students.

Sample mean: 71

Test statistic: 2.048

p-value: 0.047

Figure: Histogram of exam scores

Figure: Boxplot of exam scores

R code for p-value visualization

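n <- 40            # sample size (exam scores for 40 students)
t_stat <- 2.0478   # observed t statistic reported for this example

# Density of the t distribution with n - 1 degrees of freedom on a grid
curve_data <- data.frame(t = seq(-4, 4, length.out = 400))
curve_data$density <- dt(curve_data$t, df = n - 1)

# Grid points in each tail, at least as extreme as the observed statistic
tail_left <- subset(curve_data, t <= -abs(t_stat))
tail_right <- subset(curve_data, t >= abs(t_stat))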
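The plotting step itself is not shown in the code above. One possible way to draw the figure with base R graphics is sketched below. The helper `plot_p_value_tails` is hypothetical and is not the original plotting code, which may well have used a different package.

```r
# Hypothetical helper: draws the null t density, shades both tails beyond
# the observed statistic, and returns the two-sided p-value invisibly.
plot_p_value_tails <- function(t_stat, df, limit = 4) {
  t_grid <- seq(-limit, limit, length.out = 400)
  plot(t_grid, dt(t_grid, df = df), type = "l",
       xlab = "t", ylab = "Density", main = "p-value as tail area")
  cutoff <- abs(t_stat)
  for (side in c(-1, 1)) {
    # Tail grid from the cutoff out to the plot limit on this side
    x <- side * seq(cutoff, limit, length.out = 100)
    polygon(c(x[1], x, x[length(x)]), c(0, dt(x, df = df), 0),
            col = "red", border = NA)
  }
  abline(v = c(-cutoff, cutoff), lty = 2)  # dashed lines at +/- |t_stat|
  invisible(2 * pt(-cutoff, df = df))      # total area of both tails
}

p_two_sided <- plot_p_value_tails(t_stat = 2.0478, df = 39)
```

Note that the shading stops at the plot limit (here 4), while the returned p-value uses the full tails.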

p-value as tail area

What this graph shows

The curve shows the distribution of the test statistic assuming the null hypothesis is true.

The dashed lines mark the observed test statistic and its mirror image, at \(\pm|t_{\text{obs}}|\).

The shaded red areas represent values at least as extreme as the observed value.

The total shaded area is the p-value.
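As a quick numerical check of the claim that the shaded area equals the p-value, the two tails of the t density can be integrated with `integrate()` and compared with the closed-form value from `pt()`. This is a sketch using the statistic and degrees of freedom reported for the example; the variable names are illustrative.

```r
t_obs <- 2.0478  # observed statistic reported for the example
df <- 39         # degrees of freedom, n - 1

# Area under the null density in each tail
right_tail <- integrate(dt, lower = abs(t_obs), upper = Inf, df = df)$value
left_tail <- integrate(dt, lower = -Inf, upper = -abs(t_obs), df = df)$value

shaded_area <- left_tail + right_tail
p_value <- 2 * pt(-abs(t_obs), df = df)

c(shaded_area = shaded_area, p_value = p_value)
```

The two numbers should agree up to the numerical tolerance of `integrate()`.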

R code for test

# scores: numeric vector holding the 40 exam scores (data not listed here)
t.test(scores, mu = 70)

## 
##  One Sample t-test
## 
## data:  scores
## t = 2.0478, df = 39, p-value = 0.04735
## alternative hypothesis: true mean is not equal to 70
## 95 percent confidence interval:
##  70.01227 71.98773
## sample estimates:
## mean of x 
##        71
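The 95 percent confidence interval in the output can be reconstructed from the reported mean and t statistic alone, since the standard error equals \((\bar{x} - \mu_0) / t\). The sketch below uses only values printed above; the variable names are illustrative.

```r
xbar <- 71       # sample mean from the output
mu0 <- 70        # null value
t_obs <- 2.0478  # reported t statistic
df <- 39         # reported degrees of freedom

se <- (xbar - mu0) / t_obs          # standard error implied by the output
margin <- qt(0.975, df = df) * se   # half-width of the 95% interval
ci <- c(lower = xbar - margin, upper = xbar + margin)
ci
```

The interval excludes 70, which is consistent with a two-sided p-value just below 0.05.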

Conclusion

The p-value is 0.047.

Because \(0.047 \le 0.05\), we reject \(H_0\) at the 5% level: the data provide evidence that the mean exam score differs from 70.

Smaller p-values indicate stronger evidence against \(H_0\).
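To illustrate how the p-value shrinks as the observed statistic moves further from zero, the sketch below evaluates the two-sided p-value for a few made-up t values with the example's 39 degrees of freedom. The t values are illustrative, not results from the data.

```r
df <- 39
t_values <- c(1, 2, 3, 4)  # illustrative statistics, not from the data

# Two-sided p-value for each statistic
p_values <- 2 * pt(-abs(t_values), df = df)
data.frame(t = t_values, p = signif(p_values, 3))
```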