A p-value is the probability of observing data at least as extreme as yours if the null hypothesis were true.
This presentation explains what a p-value means, includes examples, and provides visuals created in R.
2025-11-15
For a one-sided hypothesis test:
\[ H_0: \mu = \mu_0 \qquad H_A: \mu > \mu_0 \]
The test statistic for a one-sample t-test is:
\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]
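The p-value is the upper-tail probability of the test statistic under \(H_0\), where \(T\) follows a t-distribution with \(n - 1\) degrees of freedom: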
\[ p = P(T \ge t_{\text{obs}} \mid H_0) \]
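To make the two formulas concrete, here is a minimal sketch that computes the statistic and the upper-tail p-value by hand and compares them with `t.test()`. The helper `one_sided_t()` and the vector `x_demo` are illustrative assumptions, not part of the original slides.

```r
# Hypothetical helper: one-sided, one-sample t-test computed from the formulas
one_sided_t <- function(x, mu0) {
  n <- length(x)
  t_obs <- (mean(x) - mu0) / (sd(x) / sqrt(n))     # t = (xbar - mu0) / (s / sqrt(n))
  p <- pt(t_obs, df = n - 1, lower.tail = FALSE)   # P(T >= t_obs | H0)
  list(t = t_obs, df = n - 1, p = p)
}

# Made-up demo data, used only to compare the two computations
x_demo <- c(5.1, 6.3, 5.8, 7.0, 4.9, 6.1, 5.5, 6.7)

manual  <- one_sided_t(x_demo, mu0 = 5)
builtin <- t.test(x_demo, mu = 5, alternative = "greater")

# Both comparisons should return TRUE
all.equal(manual$t, unname(builtin$statistic))
all.equal(manual$p, builtin$p.value)
```

Using `lower.tail = FALSE` rather than `1 - pt(...)` avoids losing precision when the p-value is very small.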
knitr::opts_chunk$set(echo = TRUE)
library(ggplot2)
library(plotly)
library(dplyr)
set.seed(2025)
n <- 20
mu0 <- 5
x <- rnorm(n, mean = 5.8, sd = 1.2)
t_test <- t.test(x, mu = mu0, alternative = "greater")
t_test
##
##  One Sample t-test
##
## data:  x
## t = 4.3012, df = 19, p-value = 0.0001927
## alternative hypothesis: true mean is greater than 5
## 95 percent confidence interval:
##  5.626641      Inf
## sample estimates:
## mean of x
##  6.047919
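As a sanity check on the printed output, the p-value and the lower confidence bound can be recomputed from the reported `t`, `df`, and sample mean alone. This sketch uses the rounded values printed above, so agreement is only expected up to rounding; the `*_rep` variable names are illustrative.

```r
# Values copied from the printed t-test output (rounded as displayed)
t_rep    <- 4.3012
df_rep   <- 19
mean_rep <- 6.047919
mu0_rep  <- 5

# Upper-tail probability of the t-distribution at the observed statistic
p_rep <- pt(t_rep, df = df_rep, lower.tail = FALSE)

# Standard error recovered from t = (mean - mu0) / se
se_rep <- (mean_rep - mu0_rep) / t_rep

# One-sided 95% lower confidence bound: mean - t_{0.95, df} * se
lower_rep <- mean_rep - qt(0.95, df = df_rep) * se_rep

c(p = p_rep, lower = lower_rep)
```

The two numbers should be close to the reported `0.0001927` and `5.626641`.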
df <- n - 1
t_obs <- unname(t_test$statistic)
# Extend the grid past t_obs so the shaded tail is actually drawn
tvals <- seq(-4, max(4, t_obs + 1), length.out = 400)
dens <- dt(tvals, df)
df_plot <- data.frame(t = tvals, density = dens)
ggplot(df_plot, aes(t, density)) +
  geom_line() +
  # Shaded upper tail = the one-sided p-value
  geom_area(data = subset(df_plot, t >= t_obs),
            aes(t, density), alpha = 0.35) +
  geom_vline(xintercept = t_obs, linetype = "dashed") +
  labs(title = "Null t-distribution",
       subtitle = paste("Observed t =", round(t_obs, 3)),
       x = "t", y = "Density")
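The shaded region in this plot is the p-value itself: the area under the null density to the right of the observed statistic. A small sketch, assuming the rounded `t = 4.3012` and `df = 19` from the test output, checks this by integrating the density numerically and comparing with `pt()`. The variable names are illustrative.

```r
t_rep  <- 4.3012   # observed statistic, as printed by t.test()
df_rep <- 19       # degrees of freedom, n - 1

# Area under the null t-density from t_rep to infinity
tail_area <- integrate(function(t) dt(t, df = df_rep),
                       lower = t_rep, upper = Inf,
                       rel.tol = 1e-8)$value

# The same quantity from the t-distribution's CDF
tail_prob <- pt(t_rep, df = df_rep, lower.tail = FALSE)

c(tail_area = tail_area, tail_prob = tail_prob)
```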
dfx <- data.frame(x = x, index = seq_along(x))
ggplot(dfx, aes(index, x)) +
  geom_point(size = 2, alpha = 0.8) +
  geom_hline(yintercept = mean(x), color = "maroon", linewidth = 1) +
  labs(title = "Sample Observations with Mean Line")
mu_hat_seq <- seq(4.5, 7.5, length.out = 60)
n_seq <- 5:60
sd_pop <- 1.2
mu0 <- 5
grid <- expand.grid(mu_hat = mu_hat_seq, n = n_seq) %>%
  mutate(se = sd_pop / sqrt(n),
         t = (mu_hat - mu0) / se,
         p = 1 - pt(t, df = n - 1))
zmat <- matrix(grid$p, nrow = length(n_seq), ncol = length(mu_hat_seq), byrow = TRUE)
plot_ly(x = mu_hat_seq, y = n_seq, z = zmat) %>%
  add_surface() %>%
  layout(title = "p-value Surface",
         scene = list(xaxis = list(title = "Mean"),
                      yaxis = list(title = "n"),
                      zaxis = list(title = "p-value")))
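To read the surface, it can help to look at a single slice. The sketch below fixes the sample mean at 5.5 (an arbitrary illustrative value) and applies the same formula as the grid to a few sample sizes; the helper `p_for_n()` is hypothetical. For a fixed mean above \(\mu_0\), the p-value should shrink as \(n\) grows.

```r
# Hypothetical helper: p-value for a given n at a fixed sample mean,
# using the same standard deviation (1.2) and null mean (5) as the surface
p_for_n <- function(n, mu_hat = 5.5, mu0 = 5, sd_pop = 1.2) {
  t <- (mu_hat - mu0) / (sd_pop / sqrt(n))
  pt(t, df = n - 1, lower.tail = FALSE)
}

n_check <- c(5, 20, 60)
p_check <- p_for_n(n_check)

data.frame(n = n_check, p = signif(p_check, 3))
```

The same difference of 0.5 from the null mean is not significant at the 5% level with 5 observations, but it is with 60.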
A small p-value means your sample would be unlikely under \(H_0\).
It does not measure the probability that \(H_0\) is true.
Interpretation formula:
\[ \text{p-value} = P(\text{data as or more extreme} \mid H_0) \]
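One way to see what conditioning on \(H_0\) means is to simulate many samples for which \(H_0\) really is true and count how often the p-value falls below 0.05. The sketch below assumes normal data with the null mean of 5 and a standard deviation of 1.2; the seed, the number of replications, and the variable names are arbitrary choices. Under these assumptions the rejection rate should be close to 5%.

```r
set.seed(1)        # arbitrary seed, for reproducibility only
n_sim   <- 20      # sample size per simulated study
mu0_sim <- 5       # true mean equals the null value, so H0 holds
reps    <- 5000    # number of simulated studies

# One-sided p-value for each simulated sample drawn under H0
p_null <- replicate(reps, {
  x_sim <- rnorm(n_sim, mean = mu0_sim, sd = 1.2)
  t.test(x_sim, mu = mu0_sim, alternative = "greater")$p.value
})

# Share of simulated studies that would (wrongly) reject H0 at the 5% level
reject_rate <- mean(p_null < 0.05)
reject_rate
```

This is why a small p-value counts as evidence against \(H_0\): when \(H_0\) holds, values below 0.05 occur only about one time in twenty.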
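From our sample: \(t = 4.3012\) with 19 degrees of freedom, giving \(p = 0.0001927\).
If \(p < 0.05\), we reject \(H_0\).
Does this result reject the null? Yes. Since \(0.0001927 < 0.05\), we reject the null hypothesis at the 5% level.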
Thank you for your time!