P-Values: Understanding and Applications

Introduction to P-Values

P-values are widely used in statistical hypothesis testing.
They measure the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true.
A smaller P-value suggests stronger evidence against the null hypothesis.

Mathematical Definition of P-Values

The P-value is defined as:

\[ P = P(T \geq t | H_0) \]

where:
- \(T\) is the test statistic.
- \(t\) is its observed value.
- \(H_0\) is the null hypothesis.

This is the one-sided (upper-tail) form of the definition.

Equivalently, for a continuous test statistic, the P-value is the upper-tail integral:

\[ P = \int_{t}^{\infty} f(x) \, dx \]

where \(f(x)\) is the probability density function of \(T\) under \(H_0\).
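As a rough numerical check that the two definitions agree, the sketch below assumes the test statistic follows a standard normal distribution under \(H_0\) and uses a hypothetical observed value `t_obs = 1.96`. Both choices are illustrative assumptions and are not part of the sleep example.

```r
# Assumption: T ~ N(0, 1) under H0; t_obs is a hypothetical observed value.
t_obs <- 1.96

# P(T >= t | H0) from the normal distribution function
p_tail <- pnorm(t_obs, lower.tail = FALSE)

# The same probability as the integral of the density from t to infinity
p_integral <- integrate(dnorm, lower = t_obs, upper = Inf)$value

c(tail = p_tail, integral = p_integral)
```

Both values should agree at roughly 0.025, the familiar one-sided tail area beyond 1.96.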

Importance of P-Values

Used in scientific research to determine statistical significance.
Common threshold: \(P < 0.05\) is often considered significant.
Helps decide whether to reject the null hypothesis.
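One way to see what the 0.05 threshold means is to simulate studies in which the null hypothesis really is true. In the sketch below, the number of studies, the sample size of 30, and the seed are all arbitrary assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed, chosen only for reproducibility

# Assumption: 2000 simulated studies in which H0 is true (population mean is 0)
n_studies <- 2000
p_values <- replicate(n_studies, t.test(rnorm(30, mean = 0, sd = 1))$p.value)

# Share of studies declared "significant" even though H0 holds
false_positive_rate <- mean(p_values < 0.05)
false_positive_rate
```

The share should land close to 0.05: the threshold fixes how often a true null hypothesis is rejected by chance.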

Example: Does Sleep Improve Memory?

Experiment Setup:

Null Hypothesis (\(H_0\)): Sleep has no effect on memory recall.
Alternative Hypothesis (\(H_A\)): Sleep improves memory recall.
A study tests participants’ recall ability before and after a full night’s sleep.

Simulating the Experiment

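set.seed(42)  # fix the random seed so the simulation is reproducible
# Simulated recall scores for 30 participants
before_sleep <- rnorm(30, mean = 65, sd = 10)  # before sleep
after_sleep <- rnorm(30, mean = 72, sd = 10)   # after a full night's sleep

# Paired t-test; the alternative defaults to "two.sided"
t_test_result <- t.test(before_sleep, after_sleep, paired = TRUE)
t_test_result$p.value  # Extract the P-value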

## [1] 0.1304872
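The reported P-value is above 0.05, so this two-sided test does not reject \(H_0\). Since \(H_A\) is directional (sleep improves recall), a one-sided test is arguably a closer match. The sketch below repeats the simulation with the same seed and adds `alternative = "greater"`. Putting `after_sleep` first is a choice made here so that a positive difference means improvement.

```r
set.seed(42)  # same seed and draws as the simulation above
before_sleep <- rnorm(30, mean = 65, sd = 10)
after_sleep <- rnorm(30, mean = 72, sd = 10)

# Two-sided test (the default); argument order does not change this P-value
p_two_sided <- t.test(after_sleep, before_sleep, paired = TRUE)$p.value

# One-sided test of H_A: mean(after - before) > 0
p_one_sided <- t.test(after_sleep, before_sleep, paired = TRUE,
                      alternative = "greater")$p.value

c(two_sided = p_two_sided, one_sided = p_one_sided)
```

When the observed mean difference points in the hypothesized direction, the one-sided P-value is half the two-sided one. The choice between the two should be made before looking at the data.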

Boxplot: Comparing Memory Scores

Density Plot: Memory Recall Distribution

What is P-Hacking?

P-hacking occurs when researchers manipulate analyses to obtain significant results.
Common practices:
- Running multiple tests and reporting only significant ones.
- Stopping data collection early if \(P < 0.05\).
- Selecting variables after seeing the data.
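The first practice can be illustrated with a small simulation. The sketch below assumes 1000 hypothetical studies that each run 20 independent tests on pure noise and count as a "finding" if any test reaches \(P < 0.05\). The counts, sample size, and seed are arbitrary assumptions.

```r
set.seed(123)  # arbitrary seed for reproducibility

# Assumptions: 1000 hypothetical studies, each running 20 independent tests
# on pure noise, so every null hypothesis is true.
n_studies <- 1000
n_tests <- 20

any_significant <- replicate(n_studies, {
  p <- replicate(n_tests, t.test(rnorm(30))$p.value)
  any(p < 0.05)  # the study "reports only the significant one"
})

observed_rate <- mean(any_significant)
expected_rate <- 1 - 0.95^n_tests  # about 0.64 for 20 independent tests

c(observed = observed_rate, expected = expected_rate)
```

With 20 tests per study, roughly two thirds of the studies report a "significant" result even though no real effect exists.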

Avoiding P-Hacking

Pre-register your hypothesis before analyzing data.
Use appropriate corrections for multiple comparisons (e.g., Bonferroni or false discovery rate adjustments).
Report effect sizes and confidence intervals, not just P-values.
Conduct replication studies to validate findings.
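As a minimal sketch of a statistical adjustment, base R's `p.adjust()` can correct a set of P-values for multiple testing. The four raw P-values below are hypothetical and serve only as an illustration.

```r
# Hypothetical raw P-values from four tests run in the same study
raw_p <- c(0.01, 0.04, 0.03, 0.20)

# Bonferroni: multiply each P-value by the number of tests (capped at 1)
p_bonferroni <- p.adjust(raw_p, method = "bonferroni")

# Benjamini-Hochberg: controls the false discovery rate, less conservative
p_bh <- p.adjust(raw_p, method = "BH")

data.frame(raw = raw_p, bonferroni = p_bonferroni, BH = p_bh)
```

Three of the raw P-values are below 0.05, but after the Bonferroni correction only the first (0.04) remains below that threshold.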

Annotated 3D Visualization of P-Values

Importance of the plot

Values near the peak (high Z, i.e. high on the vertical axis) are likely outcomes under \(H_0\).
Values in the tails (low Z) are unlikely outcomes under \(H_0\), leading to small P-values.
This visualization helps explain why a small P-value marks a result that would be rare if \(H_0\) were true.

Conclusion

P-values help determine statistical significance.
A small P-value suggests strong evidence against the null hypothesis.
Be cautious of P-hacking and always follow rigorous statistical practices.