Hypothesis Testing

October 19, 2025

What is Hypothesis Testing?

Hypothesis Testing is a statistical procedure used to test an assumption about a population using sample data.
Key terms:
- Null hypothesis (H₀): The default assumption that there is no effect or no difference.
- Alternative hypothesis (Ha or H₁): The claim we are looking for evidence to support; it contradicts H₀.
- Significance level (α): The threshold for deciding whether or not to reject the null hypothesis (α is typically 0.05).
- P-value: The probability, assuming H₀ is true, of observing a result at least as extreme as the one in the sample. We compare it to the significance level:
  - If p ≤ α, we reject H₀ (the result is statistically significant).
  - If p > α, we fail to reject H₀ (not enough evidence against it).
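The decision rule above can be written as a small R function. This is only an illustrative sketch: `decide` is a hypothetical helper name, not something from a package, and the p-values passed to it are made up.

```r
# Hypothetical helper: turn a p-value into a decision at significance level alpha.
decide <- function(p_value, alpha = 0.05) {
  if (p_value <= alpha) "reject H0" else "fail to reject H0"
}

decide(0.03)  # p-value below alpha = 0.05
decide(0.20)  # p-value above alpha = 0.05
```

Note that "fail to reject H0" is not the same as accepting H₀; it only means the sample did not provide enough evidence against it.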

The Z-test is a statistical test for hypotheses about a population mean or proportion. It is appropriate when:
- The population standard deviation (σ) is known
- The sample size is large, usually n ≥ 30, so that the Central Limit Theorem justifies treating the sample mean as approximately normal
It is used to test whether:
- The sample mean differs from a known population mean
- Two sample means or proportions differ significantly

The Central Limit Theorem (CLT) states that if the sample size is large enough, the sampling distribution of the sample mean is approximately normal, regardless of the shape of the original population (provided its variance is finite).
Conditions that need to be met:
- Random: Each individual of the population has an equal chance of being selected.
- Independent: The value of one observation does not affect the value of another.
- Large enough: The sample size must be sufficiently large, typically \(n\ge 30\).

Visual of the Central Limit Theorem
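One way to see the CLT at work is a short simulation. The sketch below is not the original figure; it assumes a right-skewed Exponential(rate = 1) population (mean 1, standard deviation 1), and the seed, the number of replications, and the sample size are arbitrary choices.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Draw 5000 samples of size n from a skewed Exponential(rate = 1)
# population and record the mean of each sample.
n <- 30
sample_means <- replicate(5000, mean(rexp(n, rate = 1)))

# The CLT suggests the means centre near 1 with spread near 1 / sqrt(n).
mean(sample_means)
sd(sample_means)
1 / sqrt(n)

# Histogram of the sample means; it should look roughly bell-shaped
# even though the population itself is strongly skewed.
hist(sample_means, breaks = 40,
     main = "Sample means, n = 30", xlab = "Sample mean")
```

Rerunning with a smaller `n` (say 5) should show a visibly skewed histogram, which is why a rule of thumb such as n ≥ 30 is used.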

\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \]

Where:
- \(\bar{x}\) = sample mean
- \(\mu_0\) = hypothesized mean
- \(\sigma\) = population standard deviation
- \(n\) = sample size

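For a two-tailed test, reject H₀ if |z| > z_critical. For a one-tailed test, compare z with the critical value in the direction of the alternative: for example, in a left-tailed test, reject H₀ if z is below the negative critical value.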
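The formula and the rejection rule can be sketched in R as follows. `z_stat` is a hypothetical helper name, and the numbers passed to it are invented purely for illustration; `qnorm` is the standard normal quantile function from base R.

```r
# Hypothetical helper: z statistic for a one-sample test with known sigma.
z_stat <- function(xbar, mu0, sigma, n) {
  (xbar - mu0) / (sigma / sqrt(n))
}

alpha <- 0.05
z_crit_two  <- qnorm(1 - alpha / 2)  # two-tailed cutoff, about 1.96
z_crit_left <- qnorm(alpha)          # left-tailed cutoff, about -1.645

# Illustrative numbers only: sample mean 52, hypothesized mean 50,
# sigma 6, n 36, so the standard error is 1 and z is 2.
z <- z_stat(xbar = 52, mu0 = 50, sigma = 6, n = 36)
abs(z) > z_crit_two  # TRUE: reject H0 in a two-tailed test
```

The critical value depends on both α and the direction of the test, which is why the two-tailed cutoff (about 1.96) is larger in magnitude than the one-tailed cutoff (about 1.645).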

A factory claims that the average weight of its cereal boxes is 500 g. You suspect the boxes might weigh less than advertised.

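You take a sample of 40 boxes and find a sample mean of 496 g. The population standard deviation is taken to be 10 g.

Hypotheses:
- H₀: μ = 500 (the average weight is 500 g, as claimed)
- Ha: μ < 500 (the average weight is less than 500 g)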

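# Sample data
n <- 40        # sample size
xbar <- 496    # sample mean (g)
mu0 <- 500     # hypothesized mean under H0 (g)
sigma <- 10    # known population standard deviation (g)

# z statistic: distance of the sample mean from mu0, in standard errors
z <- (xbar - mu0) / (sigma / sqrt(n))
z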

[1] -2.529822

For a left-tailed test at α = 0.05, the critical value is z = −1.645, and the decision rule in terms of the p-value is:

\[ p = P(Z \le z) < 0.05 \Rightarrow \text{Reject } H_0 \]

z = −2.53 < −1.645

p = 0.0057 < 0.05
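The critical value and p-value quoted above can be reproduced with base R's `qnorm` and `pnorm`. This is a minimal sketch that restates the example's values so it runs on its own; the variable names `z_crit` and `p_value` are chosen here for illustration.

```r
# Values from the cereal-box example
n <- 40; xbar <- 496; mu0 <- 500; sigma <- 10
alpha <- 0.05

z <- (xbar - mu0) / (sigma / sqrt(n))

z_crit  <- qnorm(alpha)  # left-tail critical value, about -1.645
p_value <- pnorm(z)      # P(Z <= z), about 0.0057

z < z_crit       # TRUE: z lies in the rejection region
p_value < alpha  # TRUE: same decision via the p-value
```

Both comparisons lead to the same decision, since the critical-value and p-value approaches are two ways of expressing one rule.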

Conclusion: There is significant evidence at the 5% level that the average weight of cereal boxes is less than 500 g.