p-value in Hypothesis Testing

2024-10-20

Introduction to p-value

What is the p-value?

The p-value is the probability of obtaining a result equal to or more extreme than what was actually observed, assuming the null hypothesis is true.
Comparing the p-value with a chosen significance level tells us whether a result is statistically significant.
Common threshold: p < 0.05.
Limitations: A small p-value does not guarantee a large effect size or practical significance.
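
The limitation above can be illustrated with a short simulation. This is only a sketch under assumed values: the sample size, the true mean of 0.02, and the seed are chosen for illustration and do not come from any real study. With a very large sample, a negligible effect can still produce a small p-value.

```r
# Illustrative simulation (assumed values): a tiny true effect, a very large sample
set.seed(123)
n <- 100000
x <- rnorm(n, mean = 0.02, sd = 1)   # true effect: 0.02 standard deviations

res <- t.test(x, mu = 0)             # two-sided one-sample t-test of H0: mu = 0
effect_size <- mean(x) / sd(x)       # standardized effect size (Cohen's d)

res$p.value    # small, because the standard error shrinks as n grows
effect_size    # still close to 0.02, a practically negligible effect
```

The p-value reflects both the effect and the sample size, so it is worth reporting an effect size alongside it.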

Formal Definition of p-value

The p-value is formally defined as:

\[ p = P(T \geq t \mid H_0) \]

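where $T$ is the test statistic, $t$ is its observed value, and $H_0$ is the null hypothesis. This is the right-tailed form; for a two-sided test, "more extreme" is measured in both tails.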
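
As a minimal sketch of this definition, assume a one-sample t-test of $H_0: \mu = 0$ with a right-tailed alternative. The sample `x` and the seed are illustrative assumptions. The tail probability is computed directly from the t distribution and then compared with the value reported by `t.test()`.

```r
# Illustrative data (assumed): 30 draws from a normal distribution
set.seed(1)
x <- rnorm(30, mean = 0.3)

# Observed test statistic t for H0: mu = 0
n <- length(x)
t_obs <- mean(x) / (sd(x) / sqrt(n))

# p = P(T >= t | H0), where T follows a t distribution with n - 1 degrees of freedom
p_manual <- pt(t_obs, df = n - 1, lower.tail = FALSE)

# The same right-tailed p-value from the built-in test
p_builtin <- t.test(x, mu = 0, alternative = "greater")$p.value

all.equal(p_manual, p_builtin)
```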

Hypothesis Testing Process

Define null and alternative hypotheses.
Choose a significance level ($\alpha$).
Calculate the test statistic.
Find the p-value.
Compare p-value with $\alpha$.
Draw conclusions.
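
The six steps above can be sketched in R as follows. The data, the hypothesized mean `mu0`, and the choice of a two-sided one-sample t-test are assumptions made for illustration only.

```r
# Illustrative data (assumed): 25 simulated measurements
set.seed(42)
x <- rnorm(25, mean = 5.4, sd = 1)

# 1. Define hypotheses: H0: mu = 5 versus H1: mu != 5
mu0 <- 5

# 2. Choose a significance level
alpha <- 0.05

# 3. and 4. Calculate the test statistic and find the p-value
result <- t.test(x, mu = mu0, alternative = "two.sided")
t_stat <- unname(result$statistic)
p_value <- result$p.value

# 5. and 6. Compare the p-value with alpha and draw a conclusion
decision <- if (p_value <= alpha) "reject H0" else "fail to reject H0"
cat("t =", round(t_stat, 3), " p =", signif(p_value, 3), " ->", decision, "\n")
```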

3D Plotly Plot: P-value Distribution

R code to generate p-value distribution plot

library(ggplot2)

# Simulate 1000 one-sample t-tests on data generated under the null hypothesis (true mean 0)
p_values <- replicate(1000, t.test(rnorm(30))$p.value)

# Histogram of the simulated p-values
ggplot(data.frame(p_values = p_values), aes(x = p_values)) +
  geom_histogram(binwidth = 0.05, fill = "blue", color = "white") +
  labs(title = "Distribution of p-values under the Null Hypothesis", x = "p-value", y = "Frequency")
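
When the null hypothesis is true and the test's assumptions hold, p-values are approximately uniform on [0, 1], so about a fraction $\alpha$ of them should fall at or below $\alpha$. The sketch below checks this numerically by repeating the same simulation. The seed and the name `false_positive_rate` are illustrative assumptions.

```r
# Repeat the simulation: 1000 t-tests on samples of size 30 drawn under H0
set.seed(2024)
p_values <- replicate(1000, t.test(rnorm(30))$p.value)

# Share of simulated tests that would wrongly reject H0 at alpha = 0.05
alpha <- 0.05
false_positive_rate <- mean(p_values <= alpha)
false_positive_rate   # expected to be close to 0.05
```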

Significance Level and p-value

If $p \leq \alpha$, we reject the null hypothesis.

Common choices for $\alpha$ include 0.05, 0.01, and 0.10.
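
The conclusion depends on the chosen significance level. As a small sketch, the hypothetical p-value of 0.03 below (not taken from any real test) is compared against each of the common choices of $\alpha$.

```r
# Hypothetical p-value, used only to illustrate the decision rule
p_value <- 0.03

# Common significance levels
alphas <- c(0.10, 0.05, 0.01)

# Decision rule: reject H0 when p <= alpha
reject_H0 <- p_value <= alphas
data.frame(alpha = alphas, reject_H0 = reject_H0)
```

Here the null hypothesis is rejected at $\alpha = 0.10$ and $\alpha = 0.05$ but not at $\alpha = 0.01$, which is why $\alpha$ should be fixed before looking at the data.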

Conclusion

The p-value is a critical concept in hypothesis testing.
It allows us to make decisions based on statistical evidence.
Always interpret p-values in context.