Hypothesis Testing

2026-04-12

What is Hypothesis Testing?

Hypothesis testing is a formal statistical framework for making decisions based on data.

A hypothesis is a claim or assumption about a population parameter
We use sample data to assess whether that claim is supported
The process produces a decision: reject or fail to reject the null hypothesis

Common applications:

Medicine: Does a new drug lower blood pressure?
Engineering: Does a new process reduce defect rates?
Business: Does a website redesign improve conversion?

The Two Hypotheses

Every hypothesis test begins with two competing statements:

Null Hypothesis \(H_0\): The default assumption — no effect, no difference.

\[H_0: \mu = \mu_0\]

Alternative Hypothesis \(H_a\): What we suspect or want to detect.

\[H_a: \mu \neq \mu_0 \quad \text{(two-tailed)}\]
\[H_a: \mu > \mu_0 \quad \text{(right-tailed)}\]
\[H_a: \mu < \mu_0 \quad \text{(left-tailed)}\]

We never prove \(H_0\) true — we either reject it or fail to reject it based on evidence.

The Test Statistic (Z-test)

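When the population standard deviation \(\sigma\) is known, and either the population is normal or \(n\) is large enough for the Central Limit Theorem to apply, we use the Z-test statistic: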

\[Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}\]

Symbol	Meaning
\(\bar{x}\)	Sample mean
\(\mu_0\)	Hypothesized population mean
\(\sigma\)	Population standard deviation
\(n\)	Sample size

Under \(H_0\), this statistic follows a standard normal distribution \(Z \sim \mathcal{N}(0, 1)\).
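
As a minimal sketch of how the formula and the three forms of \(H_a\) fit together, the two helpers below compute the Z statistic and the matching tail probability under \(\mathcal{N}(0, 1)\). The function names `z_statistic` and `p_value_z`, and the input values in the usage lines, are illustrative assumptions and do not come from the slides.

```r
# Hypothetical helper: one-sample Z statistic for a known population sd.
z_statistic <- function(x_bar, mu0, sigma, n) {
  stopifnot(sigma > 0, n > 0)
  (x_bar - mu0) / (sigma / sqrt(n))
}

# Hypothetical helper: p-value under Z ~ N(0, 1) for each form of H_a.
p_value_z <- function(z, alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = 2 * pnorm(-abs(z)),             # both tails
         greater   = pnorm(z, lower.tail = FALSE),   # right tail
         less      = pnorm(z))                       # left tail
}

# Illustrative inputs (assumed, not from the slides): (103 - 100) / (15 / 5) = 1
z <- z_statistic(x_bar = 103, mu0 = 100, sigma = 15, n = 25)
p_two   <- p_value_z(z, "two.sided")
p_right <- p_value_z(z, "greater")
p_left  <- p_value_z(z, "less")
```

For a positive Z, the two-tailed p-value is twice the right-tailed one, which is why the choice of \(H_a\) should be made before looking at the data.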

The Standard Normal Distribution

Worked Example

Scenario: A manufacturer claims the mean weight is \(\mu_0 = 500\) g. A sample of \(n = 36\) items gives \(\bar{x} = 508\) g, and the population standard deviation is known to be \(\sigma = 24\) g. Test at \(\alpha = 0.05\).

Step 1 — Hypotheses: \[H_0: \mu = 500 \quad \text{vs.} \quad H_a: \mu \neq 500\]

Step 2 — Test Statistic: \[Z = \frac{508 - 500}{24 / \sqrt{36}} = \frac{8}{4} = 2.0\]

Step 3 — Critical Value: \(z_{\alpha/2} = 1.96\)

Step 4 — Decision: Since \(|Z| = 2.0 > 1.96\), we reject \(H_0\).
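
The four steps can be checked directly from the summary statistics. This is a small sketch that uses only the numbers given in the scenario; the variable names are illustrative.

```r
# Summary statistics taken from the worked example.
x_bar <- 508
mu0   <- 500
sigma <- 24
n     <- 36
alpha <- 0.05

z      <- (x_bar - mu0) / (sigma / sqrt(n))  # 8 / 4 = 2
z_crit <- qnorm(1 - alpha / 2)               # two-tailed critical value, about 1.96
p_val  <- 2 * pnorm(-abs(z))                 # two-tailed p-value, about 0.0455

reject <- abs(z) > z_crit                    # critical-value rule
cat(sprintf("Z = %.2f, critical value = %.2f, p-value = %.4f, reject H0: %s\n",
            z, z_crit, p_val, reject))
```

The critical-value rule and the p-value rule agree here: \(|Z| > z_{\alpha/2}\) exactly when the two-tailed p-value is below \(\alpha\).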

Visualising the Worked Example

Effect of \(n\) and \(\sigma\) on Z-statistic (Static)

Effect of \(n\) and \(\sigma\) on Z-statistic (Interactive 3D)

Type I and Type II Errors

	\(H_0\) True	\(H_0\) False
Reject \(H_0\)	❌ Type I Error (\(\alpha\))	✅ Correct (Power)
Fail to Reject \(H_0\)	✅ Correct	❌ Type II Error (\(\beta\))

Type I Error: Rejecting \(H_0\) when it is actually true — probability = \(\alpha\)
Type II Error: Failing to reject \(H_0\) when it is actually false — probability = \(\beta\)
Power = \(1 - \beta\) = probability of correctly detecting a false \(H_0\)
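
One way to see what "probability = \(\alpha\)" means for a Type I error is to simulate many samples for which \(H_0\) is true and count how often the two-tailed Z-test rejects. The sketch below assumes the worked-example settings (\(\mu_0 = 500\), \(\sigma = 24\), \(n = 36\)); the seed and the number of simulations are arbitrary choices.

```r
set.seed(1)      # arbitrary seed, for reproducibility only
n_sims <- 10000  # number of simulated samples (assumed)
n      <- 36
mu0    <- 500
sigma  <- 24
alpha  <- 0.05

# Each row is one sample drawn with H0 true (true mean equals mu0).
samples <- matrix(rnorm(n_sims * n, mean = mu0, sd = sigma), nrow = n_sims)

z          <- (rowMeans(samples) - mu0) / (sigma / sqrt(n))
type1_rate <- mean(abs(z) > qnorm(1 - alpha / 2))  # share of false rejections

cat(sprintf("Estimated Type I error rate: %.3f\n", type1_rate))
```

The estimated rate should land close to 0.05, with some simulation noise.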

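For a fixed significance level \(\alpha\), increasing the sample size \(n\) improves power and so reduces the Type II error rate \(\beta\). The Type I error rate does not shrink with \(n\): it stays at the \(\alpha\) we choose.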
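
The link between sample size and power can be sketched for the two-tailed Z-test. If the true mean is \(\mu_0 + \delta\), then Z is normal with mean \(\delta\sqrt{n}/\sigma\) and variance 1, so power is the probability that it falls beyond \(\pm z_{\alpha/2}\). The function name `power_z_two_sided` and the assumed true shift of 8 g are illustrative, not part of the slides.

```r
# Hypothetical helper: power of a two-tailed one-sample Z-test.
# delta is the assumed true difference (mu_true - mu0).
power_z_two_sided <- function(delta, sigma, n, alpha = 0.05) {
  z_crit <- qnorm(1 - alpha / 2)
  shift  <- delta * sqrt(n) / sigma  # mean of Z when the true mean is mu0 + delta
  pnorm(shift - z_crit) + pnorm(-z_crit - shift)
}

# Assumed true shift of 8 g with sigma = 24 g, across several sample sizes.
n_values <- c(9, 36, 100, 200)
power    <- power_z_two_sided(delta = 8, sigma = 24, n = n_values)
print(data.frame(n = n_values, power = round(power, 3)))

# With no true effect, the rejection probability is just alpha.
power_at_null <- power_z_two_sided(delta = 0, sigma = 24, n = 36)
```

Power rises towards 1 as \(n\) grows, while the rejection probability under a true \(H_0\) stays at \(\alpha\) regardless of \(n\).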

R Code: Running the Test

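# Simulate 36 weights whose true mean is 508 g (seed fixed for reproducibility)
set.seed(42)
sample_data <- rnorm(n = 36, mean = 508, sd = 24)

x_bar <- mean(sample_data)    # observed sample mean
mu0   <- 500                  # hypothesized mean under H0
sigma <- 24                   # known population standard deviation
n     <- length(sample_data)  # sample size

Z       <- (x_bar - mu0) / (sigma / sqrt(n))  # Z-test statistic
p_value <- 2 * pnorm(-abs(Z))                 # two-tailed p-value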

cat(sprintf("Sample mean : %.3f\n", x_bar))

## Sample mean : 509.621

cat(sprintf("Z-statistic : %.4f\n", Z))

## Z-statistic : 2.4053

cat(sprintf("p-value     : %.4f\n", p_value))

## p-value     : 0.0162

cat(sprintf("Decision    : %s H0 at alpha = 0.05\n",
            ifelse(p_value < 0.05, "Reject", "Fail to Reject")))

## Decision    : Reject H0 at alpha = 0.05

Summary & Key Takeaways

The Hypothesis Testing Workflow:

State \(H_0\) and \(H_a\) clearly
Choose significance level \(\alpha\) (commonly 0.05)
Collect data and compute the test statistic
Find the p-value or compare to the critical value
Make a decision and state a conclusion in context

Remember:

A small p-value = strong evidence against \(H_0\)
Failing to reject \(H_0\) does not prove it true
Statistical significance \(\neq\) practical significance
Larger samples increase power (lower \(\beta\)) at a fixed \(\alpha\)