2025-06-08

P-value Definition

  • The p-value is a measure used in statistics that tells us the probability in which the tested results occurred due to chance.
  • The p-value allows us to test whether what is observed is statistically significant enough to reject the null hypothesis.
  • P-values of 0.05 and under are considered to be statistically significant.

Problems with P-values in Statistics

  • Even with larger samples, the p-value may be unreliable as researchers note redoing your experiment may yield extremely different results.
  • P-hacking is an abuse of statistics where a researcher will select the analysis that the holds a more pleasing result.

P-value and Significance

\[ \text{If } p \lt \alpha , \text{Reject } H_0 \]

\[ \text{If } p \geq \alpha , \text{Fail to Reject the } H_0\]

Exploring P-values with Coins

Suppose we flip a coin 100 times and get 65 tails out of the 100 times we flipped the coin.

We would now need to evaluate the likelihood of this coin landing 65 tails out of 100 flips if it is fair.

  • Null Hypothesis \(H_0\): The coin is fair p = 0.05

  • Alternative Hypothesis H\(_a\): The coin is not fair p \(\neq\) 0.05

Visualizing Coin Flip Results

This plot allows us to visualize our counts of coin flips to further understand p-values.

How Variablilty Affects P-values

In this graph, we have a histogram that represents the distribution of p-values had we conducted the previous experiment 50 times. Each simulation consists of 100 coin flips 50 times. The resulting proportion of tails was used to compute a p-value assuming a fair coin. We can now visualize how p-values can very from sample to sample, even when the null hypothesis is true.

Interactive Scatter Plot of P-values

This graph is an interactive plot that shows the p-values of the 50 repeated simulations. Each dot represents a p-value for each trial. This allows us to see more clearly how p-values can range from sample to sample even when conditions are the same.

Understanding P-hacking

  • P-hacking is a form of abuse in statistics that may occur either intentionally or unintentionally. Understanding how p-value variability may occur from sample to sample shows how one trial might be preferred over another to maintain desired results.

Recap - The null hypothesis is typically rejected when: \[ \text{p-value} < \alpha = 0.05 \]

Yet, even when the null hypothesis is true, false positives may occur and allow for such data to be perceived as statistically significant: \[ H_0 : \text{No effect but still p-value} < 0.05 \]