2025-03-16

P-Value in Statistics

This slideshow covers the p-value and its role in statistics, mainly in hypothesis testing.

What is a P-Value?

A p-value is the probability of obtaining a result at least as extreme as the one actually observed, assuming the null hypothesis \(H_0\) is true.

Lower p-values suggest stronger evidence against \(H_0\).

Key rule: If \(p < \alpha\), where the significance level \(\alpha\) is typically 0.05, we reject \(H_0\); otherwise we fail to reject it.

What is Hypothesis Testing?

  • Hypothesis testing determines if there is enough evidence to reject the null hypothesis.

  • It involves:

    1. Defining hypotheses:
      • \(H_0\): Null hypothesis (no effect).
      • \(H_1\): Alternative hypothesis (some effect).
    2. Collecting sample data.
    3. Computing a test statistic.
    4. Comparing the p-value to a significance level (\(\alpha\)).

The p-value is central to hypothesis testing: it is the quantity compared against \(\alpha\) to make the final decision.
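The four steps above can be sketched in R with a one-sample t-test. The hypothesized mean of 50, the simulated sample (true mean 53, standard deviation 5, 25 observations), and the variable names are illustrative assumptions, not values from the slides.

```r
# Step 1: define the hypotheses, H0: mu = 50 versus H1: mu != 50
# (50 is a hypothetical value chosen for illustration)
mu0 <- 50
alpha <- 0.05

# Step 2: collect sample data (simulated here, assuming a true mean of 53)
set.seed(1)
x <- rnorm(25, mean = 53, sd = 5)

# Step 3: compute the test statistic; t.test() also returns the p-value
result <- t.test(x, mu = mu0)

# Step 4: compare the p-value to the significance level
reject_h0 <- result$p.value < alpha

cat("t =", round(result$statistic, 3), " p =", signif(result$p.value, 3), "\n")
cat(if (reject_h0) "Reject H0" else "Fail to reject H0", "\n")
```

Changing the simulated mean to 50 makes \(H_0\) true, in which case a rejection should occur in only about 5% of repeated runs.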

Mathematical Derivation of the P-Value

The test statistic for a one-sample t-test is:

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

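where \(\bar{x}\) is the sample mean, \(\mu_0\) is the hypothesized population mean, \(s\) is the sample standard deviation, and \(n\) is the sample size.

Under \(H_0\), and assuming approximately normal data, \(t\) follows a t distribution with \(n - 1\) degrees of freedom. The two-sided p-value is then \(p = 2 \, P(T_{n-1} \ge |t|)\).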
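As a check on the formula, the sketch below computes \(t\) by hand, converts it to a two-sided p-value using the t distribution with \(n - 1\) degrees of freedom, and compares the result with R's built-in t.test(). The simulated sample and \(\mu_0 = 0\) are assumptions made for illustration.

```r
# Simulated sample (assumption: 20 draws from a normal with mean 0.5)
set.seed(42)
x <- rnorm(20, mean = 0.5, sd = 1)
mu0 <- 0  # hypothesized population mean

# Test statistic: t = (x_bar - mu0) / (s / sqrt(n))
n <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value: probability of a |T| at least as large as |t|
# under a t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_stat), df = n - 1)

# The built-in test should give the same p-value
p_builtin <- t.test(x, mu = mu0)$p.value
all.equal(p_manual, p_builtin)
```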

3D Visualization of P-Values

This 3D plot (plotly) shows how sample size affects p-values.

When the sample size is small, p-values tend to be larger for the same underlying effect, so we fail to reject \(H_0\) more often.
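The code behind the 3D plot is not shown on the slide. A surface of this kind could be built roughly as follows, assuming the second axis is a standardized effect \((\bar{x} - \mu_0)/s\); the grids and names are hypothetical. The p-values come directly from the one-sample t formula, and the plot is drawn only if plotly is installed.

```r
# Hypothetical grids: sample sizes and standardized effects (x_bar - mu0) / s
n_values <- seq(5, 100, by = 5)
d_values <- seq(0.1, 1, by = 0.1)

# Two-sided p-value for each (effect, sample size) pair, using t = d * sqrt(n)
# Rows correspond to effects (y axis), columns to sample sizes (x axis)
p_matrix <- outer(d_values, n_values,
                  function(d, n) 2 * pt(-abs(d * sqrt(n)), df = n - 1))

# Draw the surface only when plotly is available
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(x = n_values, y = d_values, z = p_matrix,
                         type = "surface")
  fig <- plotly::layout(fig, scene = list(
    xaxis = list(title = "Sample size"),
    yaxis = list(title = "Standardized effect"),
    zaxis = list(title = "P-value")))
  fig
}
```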

Simple Line Plot of P-Values

A simple line plot showing how p-values decrease as sample size increases when the observed effect is fixed and nonzero.
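A minimal sketch of such a line plot, assuming a fixed standardized effect of 0.3 (a hypothetical value) and computing each p-value from the one-sample t formula rather than from simulated data:

```r
library(ggplot2)

# Assumed fixed standardized effect: (x_bar - mu0) / s = 0.3
d <- 0.3
n <- seq(5, 200, by = 5)

# t = d * sqrt(n); two-sided p-value from the t distribution with n - 1 df
t_stat <- d * sqrt(n)
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
line_data <- data.frame(n = n, p_value = p_value)

ggplot(line_data, aes(x = n, y = p_value)) +
  geom_line() +
  geom_hline(yintercept = 0.05, linetype = "dashed") +  # alpha = 0.05
  labs(title = "P-Value vs. Sample Size",
       x = "Sample Size",
       y = "P-Value")
```

Using the formula keeps the curve smooth; with simulated samples the same downward trend appears, but with random noise around it.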

Simple Histogram of Simulated P-Values

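A histogram showing the distribution of p-values from 100 one-sample t-tests. Each test uses 30 draws from a standard normal distribution, so \(H_0\) (mean 0) is true and the p-values should be roughly uniform on [0, 1].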

library(ggplot2)

# Simulate 100 one-sample t-tests of H0: mu = 0 on standard normal samples (n = 30)
set.seed(123)
p_values <- replicate(100, t.test(rnorm(30))$p.value)
data <- data.frame(p_values)

# boundary = 0 aligns the bins with [0, 0.05), [0.05, 0.10), ... so none extends below 0
ggplot(data, aes(x = p_values)) +
  geom_histogram(binwidth = 0.05, boundary = 0, fill = "blue", color = "black") +
  labs(title = "Histogram of Simulated P-Values",
       x = "P-Value",
       y = "Frequency")