Understanding the p-value

What is a p-value?

The p-value is the probability of obtaining data at least as extreme as what was actually observed, assuming the null hypothesis is true.

It helps us judge whether a pattern in the data could plausibly be the product of random variation alone, or is unlikely enough under the null hypothesis to be called statistically significant.

The Role of the Null Hypothesis in P-value Testing

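Every hypothesis test begins with a null hypothesis (\(H_0\)), which typically states that there is no effect or no difference. It is paired with an alternative hypothesis (\(H_1\)), which states that there is one. For example: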

\[ \begin{aligned} H_0\!: &\ \text{There is no difference in lifespan between Group A and Group B} \\ H_1\!: &\ \text{There is a difference in lifespan between Group A and Group B} \end{aligned} \]
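To make this pair of hypotheses concrete, here is a minimal sketch of one way a p-value for "no difference between Group A and Group B" could be estimated. It uses a permutation test, which is a technique chosen here for illustration and not one the text itself describes. The function name `permutation_p_value`, its parameters, and the lifespan values are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are interchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical lifespans in years, invented purely for illustration.
group_a = [71.2, 68.5, 74.1, 69.9, 72.3, 70.4, 73.0, 67.8]
group_b = [75.6, 73.9, 78.2, 74.4, 76.8, 72.7, 77.1, 75.0]

p = permutation_p_value(group_a, group_b)
print(f"estimated p-value: {p:.4f}")
```

The returned value is the share of label shufflings that produce a difference in means at least as large as the observed one, which is the "at least as extreme, assuming \(H_0\)" idea expressed directly as a count.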

How to Interpret a p-value

A p-value tells us how likely it is to see results at least as extreme as the ones we found if the null hypothesis is true.

- A small p-value (typically < 0.05) means results this extreme would be unlikely if \(H_0\) were true → we reject \(H_0\)
- A large p-value means results this extreme could easily arise by chance under \(H_0\) → we fail to reject \(H_0\)

The p-value does not prove that the alternative hypothesis is true, and it is not the probability that the null hypothesis is true.
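As a small worked illustration of this interpretation, the sketch below computes an exact two-sided p-value for a coin-flip experiment under \(H_0\): the coin is fair. The scenario (60 heads in 100 flips) and the helper name `binomial_two_sided_p` are assumptions made for this example only.

```python
from math import comb


def binomial_two_sided_p(heads, flips, p_heads=0.5):
    """Exact two-sided p-value for a head count under H0: P(heads) = p_heads."""

    def pmf(k):
        # Probability of exactly k heads in `flips` tosses under H0.
        return comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)

    observed_prob = pmf(heads)
    # Sum the probability of every outcome that is no more likely than the
    # observed one; these are the "at least as extreme" outcomes.
    total = sum(
        pmf(k) for k in range(flips + 1) if pmf(k) <= observed_prob * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical experiment: 60 heads in 100 flips of a supposedly fair coin.
p_coin = binomial_two_sided_p(heads=60, flips=100)
print(f"two-sided p-value: {p_coin:.4f}")
```

Whatever value comes out, it describes how surprising the data are under the fair-coin assumption. It says nothing directly about how probable it is that the coin is fair.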

How Is the p-value Calculated?

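Each statistical test computes a test statistic that measures how far the observed data fall from what is expected under the null hypothesis. The p-value is then read from the distribution that statistic follows when \(H_0\) is true.

For example, the test statistic for a one-sample z-test is: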

\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]

Where:
- \(\bar{x}\) = sample mean
- \(\mu\) = population mean under the null hypothesis
- \(\sigma\) = population standard deviation (assumed known)
- \(n\) = sample size
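The formula above can be turned into a p-value by looking up the z statistic in the standard normal distribution. The following is a minimal sketch using only Python's standard library. The function name `z_test` and the input numbers are hypothetical, and a two-sided test is assumed.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, population_mean, sigma, n):
    """Return the z statistic and two-sided p-value for a one-sample z-test."""
    # z = (x_bar - mu) / (sigma / sqrt(n)), as in the formula above.
    z = (sample_mean - population_mean) / (sigma / sqrt(n))
    # Two-sided: probability of a standard normal value at least as far
    # from zero as |z|, in either tail.
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical inputs, chosen only to illustrate the arithmetic.
z, p_z = z_test(sample_mean=52.0, population_mean=50.0, sigma=10.0, n=100)
print(f"z = {z:.2f}, two-sided p-value = {p_z:.4f}")
```

With these inputs the standard error is \(10 / \sqrt{100} = 1\), so \(z = 2\): the sample mean sits two standard errors above the mean assumed under \(H_0\).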

Visualizing Tree Data with ggplot2

[Figure: ggplot2 plot of the tree data with a `geom_smooth()` fit, formula y ~ x]

Interactive Plot with Plotly

[Figure: interactive plot titled "ggplot2: MPG vs Weight"]

Interpreting the p-value with a Significance Level

A small p-value indicates strong evidence against the null hypothesis. How small is small enough is fixed in advance by the significance level \(\alpha\):

\[ \begin{aligned} \text{If } p \leq \alpha, &\ \text{reject } H_0 \\ \text{If } p > \alpha, &\ \text{fail to reject } H_0 \end{aligned} \]

Common choices for \(\alpha\):
- \(\alpha = 0.05\) (5%)
- \(\alpha = 0.01\) (1%)
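The decision rule is simple enough to write as a one-line comparison. The sketch below, with the hypothetical helper `decide`, shows that the same p-value can lead to different conclusions depending on the \(\alpha\) chosen beforehand.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when p <= alpha, otherwise fail to reject."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# A hypothetical p-value of 0.03 checked against both common thresholds.
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(0.03, alpha)}")
```

Because the conclusion depends on the threshold, \(\alpha\) should be fixed before looking at the data and not adjusted afterwards to fit the result.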

Summary

- The p-value helps determine whether a result is statistically significant.
- A p-value at or below the chosen significance level \(\alpha\) is evidence against the null hypothesis, but it does not prove the alternative hypothesis.
- Visualization tools like ggplot2 and Plotly let us display data and test results clearly.
- Knowing how to interpret and apply p-values is essential in hypothesis testing and data-driven decision-making.