HW3

2026-06-09

What is a P-Value?

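The \(p\)-value is a core metric used to weigh the strength of evidence against a baseline assumption. The smaller the \(p\)-value, the less compatible the observed data are with that assumption; a \(p\)-value close to 1 means the data look just like what the baseline assumption would produce.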

The Core Definition: The probability of obtaining test results at least as extreme as the observed results, assuming that the null hypothesis (\(H_0\): the variables under study have no effect) is true.
What it is NOT: It is not the probability that the null hypothesis is true, nor is it the probability that the alternative hypothesis is false.
Statistical Significance: We compare the \(p\)-value to a pre-determined significance threshold (\(\alpha\), typically \(0.05\)):
- If \(p \le \alpha\): reject \(H_0\) (statistically significant).
- If \(p > \alpha\): fail to reject \(H_0\) (not statistically significant).
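
As a minimal sketch of this decision rule, the R snippet below runs a one-sample test on a small made-up sample and compares the resulting \(p\)-value with \(\alpha\). The data vector `x`, the null value `mu_0`, and the variable names are illustrative assumptions, not data from this assignment.

```r
# Hypothetical measurements, invented purely for illustration
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.3)
mu_0 <- 5      # assumed null-hypothesis mean
alpha <- 0.05  # pre-determined significance threshold

# t.test() (base R, stats package) is two-sided by default
result <- t.test(x, mu = mu_0)
p_value <- result$p.value

# Apply the decision rule: reject H0 only when p <= alpha
decision <- if (p_value <= alpha) "Reject H0" else "Fail to reject H0"
print(decision)
```

For this particular sample the \(p\)-value falls below 0.05, so the rule rejects \(H_0\); a different sample could just as easily land on the other branch.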

Formal Setup & Test Statistics (LaTeX Slide #1)

To compute a \(p\)-value, we model our data under a specific statistical test framework. Let’s start with a One-Sample \(t\)-Test:

Hypotheses: \[\begin{cases} H_0: \mu = \mu_0 \\ H_1: \mu \neq \mu_0 \end{cases}\]
The \(t\)-Statistic formula: \[t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}\]
Where:
- \(\bar{X}\) is the sample mean.
- \(\mu_0\) is the hypothesized population mean according to the null hypothesis.
- \(s\) is the sample standard deviation.
- \(n\) is the sample size.
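
To make the formula concrete, here is a small sketch that computes \(t\) term by term and checks it against R's built-in `t.test()`. The sample `x` and the value `mu_0` are hypothetical and chosen only for illustration.

```r
# Hypothetical sample and null mean (illustrative assumptions)
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.3)
mu_0 <- 5

x_bar <- mean(x)    # sample mean
s <- sd(x)          # sample standard deviation (n - 1 denominator)
n <- length(x)      # sample size

# t = (x_bar - mu_0) / (s / sqrt(n))
t_manual <- (x_bar - mu_0) / (s / sqrt(n))

# Cross-check against the statistic reported by t.test()
t_builtin <- unname(t.test(x, mu = mu_0)$statistic)
print(c(manual = t_manual, builtin = t_builtin))
```

The two values agree, which confirms that `t.test()` uses the same statistic as the formula above.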

Integral Calculation of P-Values (LaTeX Slide #2)

Once the test statistic (\(t_{obs}\)) is found, the \(p\)-value is an area under the probability density curve of the reference distribution: the proportion of that area lying at values at least as extreme as the one observed.

For a continuous probability density function (PDF) \(f(t)\) representing the Student’s \(t\)-distribution with \(n-1\) degrees of freedom, a two-tailed \(p\)-value is defined mathematically as:

\[p\text{-value} = 2 \times P(T \ge |t_{obs}|) = 2 \int_{|t_{obs}|}^{\infty} f(t) \, dt\]

Because the \(t\)-distribution is symmetric about zero, the same tail area can be written using the cumulative distribution function (CDF) \(F\):

\[F(t) = \int_{-\infty}^{t} f(u) \, du \implies p\text{-value} = 2 \times (1 - F(|t_{obs}|))\]
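
The two expressions can be checked against each other numerically. The sketch below integrates the \(t\) density over the upper tail with `integrate()` and compares the result with the CDF form via `pt()`. The values `t_obs = 2.1` and `df = 19` are arbitrary assumptions picked for illustration.

```r
t_obs <- 2.1   # hypothetical observed statistic
df <- 19       # hypothetical degrees of freedom (n = 20)

# Integral form: 2 * integral of f(t) from |t_obs| to infinity
tail_area <- integrate(function(t) dt(t, df = df),
                       lower = abs(t_obs), upper = Inf)$value
p_integral <- 2 * tail_area

# CDF form: 2 * (1 - F(|t_obs|))
p_cdf <- 2 * (1 - pt(abs(t_obs), df = df))

print(c(integral = p_integral, cdf = p_cdf))
```

Both routes give the same \(p\)-value up to numerical integration error. In practice `pt()` is the usual choice.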

The Continuous Null Distribution (ggplot2 Slide #1)

The \(p\)-value corresponds visually to the shaded tail regions of the null distribution, that is, the distribution of possible test results when the null hypothesis is true.

Data Preparation Code Chunk (R Code Slide)

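This code models how the sample size (\(n\)) and the effect size (\(\delta\), called `effect` in the code) change the calculated \(p\)-value. It evaluates the \(p\)-value at every combination of the two, with the standard deviation fixed at 1, producing the grid that is later drawn as a 3D surface.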

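library(dplyr)  # provides %>% and mutate()

# Every combination of sample size and effect size
sample_sizes <- seq(10, 200, by = 10)
effect_sizes <- seq(0.1, 1.0, by = 0.05)
grid <- expand.grid(n = sample_sizes, effect = effect_sizes)

grid_results <- grid %>%
  mutate(
    # One-sample t statistic with the standard deviation fixed at 1
    t_stat = effect / (1 / sqrt(n)),
    # Two-tailed p-value; the upper tail avoids rounding error in 1 - pt() for large t
    p_value = 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)
  )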

How Sample Size and Effect Shifting Compress P (ggplot2 Slide #2)

As the sample size or the effect size grows, the \(p\)-value shrinks toward zero.
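
A few spot values make the trend visible without a plot. This sketch assumes the same simplification as the grid code (standard deviation fixed at 1, so \(t = \text{effect} \times \sqrt{n}\)). The helper name `p_from` is hypothetical.

```r
# Two-tailed p-value for a one-sample t-test with standard deviation 1
# (p_from is an illustrative helper, not part of the original code)
p_from <- function(effect, n) {
  t_stat <- effect * sqrt(n)
  2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)
}

# Fixed effect, growing sample size
p_by_n <- p_from(effect = 0.2, n = c(10, 50, 200))

# Fixed sample size, growing effect
p_by_effect <- p_from(effect = c(0.1, 0.5, 1.0), n = 30)

print(signif(p_by_n, 3))
print(signif(p_by_effect, 3))
```

Each vector decreases from left to right: the same effect of 0.2 is far from significant at \(n = 10\) but clears the 0.05 threshold at \(n = 200\).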

3D Sample Surface Mapping (Plotly Slide)

Rotate and hover over this interactive 3D surface to see how sample size (\(n\)) and effect size interact to determine the \(p\)-value.

Summary Rules for Modern Research

Sample Size Sensitivity: With a large enough sample (\(n\)), any nonzero effect will eventually reach statistical significance, even when its real-world size is completely negligible.
Arbitrary Thresholds: Never rely purely on \(p \le 0.05\) as an absolute truth threshold. Always present effect sizes alongside confidence intervals.
Reproducibility: A single low \(p\)-value is weak evidence until the result is independently replicated, and a high one means only that the data failed to provide enough evidence against \(H_0\), not that \(H_0\) is true.
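
The first two rules can be illustrated with one small calculation. The sketch below assumes a standardized effect of 0.01 and a sample of one million observations with standard deviation 1. All three numbers are hypothetical, chosen to show a tiny effect that is still "significant".

```r
effect <- 0.01   # hypothetical effect, negligible in practice
n <- 1e6         # hypothetical, very large sample
s <- 1           # assumed standard deviation

se <- s / sqrt(n)                  # standard error of the mean
t_stat <- effect / se
p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

# 95% confidence interval for the effect
half_width <- qt(0.975, df = n - 1) * se
ci <- c(lower = effect - half_width, upper = effect + half_width)

print(p_value)
print(ci)
```

The \(p\)-value is far below 0.05, yet the confidence interval shows the effect is at most roughly 0.012 standard deviations. Reporting the interval alongside the \(p\)-value makes that clear.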

\[\text{Scientific Truth} \neq (p \le 0.05)\]