2024-10-20

Introduction to p-value

What is the p-value?

  • The p-value is the probability of obtaining a result equal to or more extreme than what was actually observed, assuming the null hypothesis is true.

  • The p-value helps us determine if results are statistically significant.

  • Common threshold: p ≤ 0.05, that is, a significance level of \(\alpha = 0.05\).

  • Limitations: A small p-value does not guarantee a large effect size or practical significance.
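The limitation in the last bullet can be illustrated with a small simulation. This is only a sketch: the sample size, the true mean of 0.02, and the seed are illustrative assumptions, not values from these slides.

```r
set.seed(42)  # assumed seed, for reproducibility only

# Assumed scenario: a very large sample whose true mean (0.02) is barely
# different from the null value of 0.
n <- 1e6
x <- rnorm(n, mean = 0.02, sd = 1)

result <- t.test(x, mu = 0)
p_small <- result$p.value

# Standardized effect size (Cohen's d for a one-sample test)
cohens_d <- (mean(x) - 0) / sd(x)

p_small    # far below 0.05
cohens_d   # close to 0.02, a negligible effect
```

With a sample this large, even a negligible difference from the null value yields a very small p-value, which is why effect size should be reported alongside it.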

Formal Definition of p-value

The p-value is formally defined as:

\[ p = P(T \geq t \mid H_0) \]

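where \(T\) is the test statistic, \(t\) is its observed value, and \(H_0\) is the null hypothesis. This form applies to a right-tailed test; for a two-sided test with a symmetric test statistic, the probability is \(P(|T| \geq |t| \mid H_0)\).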
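As a rough check of this definition, the tail probability can be computed by hand for a one-sample t-test and compared with R's built-in result. The sample `x`, its assumed mean of 0.3, and the seed are hypothetical choices made only for this sketch.

```r
set.seed(1)  # assumed seed
x <- rnorm(30, mean = 0.3)  # hypothetical sample; H0: mean = 0

n <- length(x)

# Observed value t of the one-sample t statistic
t_obs <- (mean(x) - 0) / (sd(x) / sqrt(n))

# p = P(T >= t | H0): upper tail of the t distribution with n - 1 df
p_manual <- pt(t_obs, df = n - 1, lower.tail = FALSE)

# The same quantity from the built-in one-sided test
p_builtin <- t.test(x, mu = 0, alternative = "greater")$p.value

all.equal(p_manual, p_builtin)
```

For a two-sided test, the corresponding manual calculation would be `2 * pt(abs(t_obs), df = n - 1, lower.tail = FALSE)`.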

Hypothesis Testing Process

  1. Define null and alternative hypotheses.
  2. Choose a significance level (\(\alpha\)).
  3. Calculate the test statistic.
  4. Find the p-value.
  5. Compare p-value with \(\alpha\).
  6. Draw conclusions.
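The six steps above can be sketched in R with a one-sample t-test. The data, the null value `mu0`, and the seed are assumptions chosen for illustration; any other test would follow the same pattern.

```r
set.seed(123)  # assumed seed
x <- rnorm(30, mean = 0.5, sd = 1)  # hypothetical data

# 1. Hypotheses: H0: mu = 0 versus H1: mu != 0
mu0 <- 0

# 2. Significance level
alpha <- 0.05

# 3. and 4. Test statistic and p-value
test <- t.test(x, mu = mu0)
t_stat <- unname(test$statistic)
p_value <- test$p.value

# 5. and 6. Compare the p-value with alpha and state the conclusion
decision <- if (p_value <= alpha) "reject H0" else "fail to reject H0"
decision
```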

3D Plotly Plot: P-value Distribution

R code to generate p-value distribution plot

library(ggplot2)

# Simulate 1000 one-sample t-tests on data generated under the null (true mean = 0)
p_values <- replicate(1000, t.test(rnorm(30))$p.value)

# Histogram of the simulated p-values
ggplot(data.frame(p_values = p_values), aes(x = p_values)) +
  geom_histogram(binwidth = 0.05, fill = "blue", color = "white") +
  labs(title = "Distribution of p-values under the Null Hypothesis",
       x = "p-value", y = "Frequency")
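When the null hypothesis is true and the test's assumptions hold, p-values are approximately uniform on [0, 1], so the share falling at or below a level \(\alpha\) should be close to \(\alpha\). The sketch below repeats the simulation with an assumed seed and checks this numerically; the name `rejection_rate` is introduced here for illustration.

```r
set.seed(2024)  # assumed seed
p_values <- replicate(1000, t.test(rnorm(30))$p.value)

# Share of simulated tests that would be rejected at alpha = 0.05.
# Under the null this estimates the Type I error rate and should be
# close to 0.05, up to simulation noise.
alpha <- 0.05
rejection_rate <- mean(p_values <= alpha)
rejection_rate
```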

Significance Level and p-value (LaTeX)

If \(p \leq \alpha\), we reject the null hypothesis.

Common choices for \(\alpha\) include 0.05, 0.01, and 0.10.
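The same p-value can lead to different decisions depending on the chosen level. A minimal sketch, using a hypothetical p-value of 0.03 that does not come from any real test:

```r
p_value <- 0.03  # hypothetical p-value
alphas <- c(0.10, 0.05, 0.01)

# Apply the rule "reject H0 if p <= alpha" at each level
reject <- p_value <= alphas
data.frame(alpha = alphas, reject_H0 = reject)
```

Here the null hypothesis would be rejected at the 0.10 and 0.05 levels but not at 0.01, which is why the level should be fixed before looking at the data.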

Conclusion

  • The p-value is a critical concept in hypothesis testing.

  • It allows us to make decisions based on statistical evidence.

  • Always interpret p-values in context.