What is Hypothesis Testing?
- A method to decide whether data provides enough evidence to reject a claim about a population.
- Uses sample statistics to test statements (hypotheses) about population parameters.
In hypothesis testing, we start with two statements:
\[ H_0: \text{No effect, no difference (status quo)} \]
\[ H_1: \text{Effect or difference exists (research claim)} \]
Example: testing if the mean height differs from 170 cm.
\[ H_0: \mu = 170, \quad H_1: \mu \neq 170 \]
| Decision | True State = H₀ | True State = H₁ |
|---|---|---|
| Reject H₀ | Type I error (α) | Correct |
| Fail to reject H₀ | Correct | Type II error (β) |
\[ \alpha = P(\text{reject } H_0 \mid H_0 \text{ true}), \quad \beta = P(\text{fail to reject } H_0 \mid H_1 \text{ true}) \]
For a fixed sample size and effect size, choosing a smaller α reduces false positives but increases β (more false negatives).
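The trade-off between α and β can be illustrated with a small simulation. This is a minimal sketch, not part of the original material: it swaps in a two-sided z-test with known standard deviation so that only the Python standard library is needed, and the values `sigma=10`, `n=25`, the alternative mean of 174 cm, and the helper name `rejection_rate` are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

def rejection_rate(true_mu, alpha, mu0=170.0, sigma=10.0, n=25, trials=4000, seed=0):
    """Fraction of simulated samples in which a two-sided z-test rejects H0: mu = mu0."""
    rng = random.Random(seed)                      # fixed seed for reproducibility
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    se = sigma / n ** 0.5                          # standard error (sigma assumed known)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

for alpha in (0.10, 0.05, 0.01):
    type1 = rejection_rate(true_mu=170.0, alpha=alpha)      # H0 is true
    beta = 1 - rejection_rate(true_mu=174.0, alpha=alpha)   # H1 is true (assumed mean 174)
    print(f"alpha={alpha:.2f}  est. Type I rate={type1:.3f}  est. beta={beta:.3f}")
```

Under these assumptions the estimated Type I rate should track α, while the estimated β grows as α shrinks.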
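For a one-sample test such as the height example, the test statistic compares the sample mean \(\bar{X}\) with the hypothesized mean \(\mu_0\), scaled by the standard error, where \(s\) is the sample standard deviation and \(n\) is the sample size:
\[ t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \]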
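The one-sample statistic can be computed directly from its definition. The sketch below is illustrative only: the `heights` values are invented, and `one_sample_t` is a hypothetical helper name rather than a library function.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

# Hypothetical heights in cm, invented for illustration only.
heights = [168.0, 172.5, 171.0, 169.5, 174.0, 170.5, 173.0, 167.5]
t_stat = one_sample_t(heights, mu0=170.0)
print(f"t = {t_stat:.3f} with df = {len(heights) - 1}")
```

The resulting value would then be compared against a t distribution with n - 1 degrees of freedom.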
Imagine that two classrooms took the same test:
- Class A: taught with traditional methods
- Class B: taught with interactive methods
Let us use a two-sample t-test to determine whether the mean scores of the two groups differ significantly:
\[
t = \frac{\bar{X}_1 - \bar{X}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
\]
\[
\begin{aligned}
\bar{X}_1, \, \bar{X}_2 &= \text{sample means} \\
s_p &= \text{pooled standard deviation} \\
n_1, \, n_2 &= \text{sample sizes}
\end{aligned}
\]
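A minimal sketch of the pooled two-sample statistic follows. It assumes the standard pooled variance, s_p² = ((n₁ − 1)s₁² + (n₂ − 1)s₂²) / (n₁ + n₂ − 2), which the text names but does not write out. The scores are invented, and `pooled_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x1, x2):
    """Return (t, df) for the equal-variance (pooled) two-sample t-test."""
    n1, n2 = len(x1), len(x2)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    t = (mean(x1) - mean(x2)) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical test scores, invented for illustration only.
class_a = [88.0, 92.0, 79.0, 101.0, 95.0, 84.0]
class_b = [104.0, 98.0, 111.0, 93.0, 107.0, 115.0]
t_stat, df = pooled_t(class_a, class_b)
print(f"t = {t_stat:.3f}, df = {df}")
```

A negative t here means the first group's mean is lower than the second's, matching the sign convention of the formula above.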
    ## 
    ##  Two Sample t-test
    ## 
    ## data:  group1 and group2
    ## t = -3.0399, df = 48, p-value = 0.003825
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -19.990085  -4.073946
    ## sample estimates:
    ## mean of x mean of y 
    ##  99.50005 111.53206
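Using a 5% significance level (equivalently, a 95% confidence level) and the following hypotheses:
\[ H_0: \mu_A = \mu_B \qquad H_1: \mu_A \neq \mu_B \]
Because the p-value (0.0038) is below 0.05, we reject the null hypothesis. Equivalently, the 95% confidence interval for the difference in means (-19.99 to -4.07) excludes 0. There is statistically significant evidence that the mean scores differ, and the sample means indicate that Class B scored higher than Class A by about 12 points on average (111.53 vs. 99.50).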
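As a consistency check, the reported numbers can be tied together with a few lines of arithmetic. This sketch only reuses values from the output above. The one added assumption is the constant `T_CRIT_48`, the 97.5th percentile of Student's t with 48 degrees of freedom (approximately 2.0106), which is hard-coded because the Python standard library has no t distribution.

```python
# Values copied from the reported t-test output.
mean_x, mean_y = 99.50005, 111.53206
t_stat, df = -3.0399, 48
ci_low, ci_high = -19.990085, -4.073946

# Assumed constant: 97.5th percentile of Student's t with 48 degrees of freedom.
T_CRIT_48 = 2.0106

diff = mean_x - mean_y         # estimated difference in means (x minus y)
se = diff / t_stat             # standard error implied by t = diff / se
half_width = T_CRIT_48 * se    # margin of error of the 95% interval

print(f"difference = {diff:.3f}")
print(f"implied SE = {se:.3f}")
print(f"rebuilt CI = ({diff - half_width:.3f}, {diff + half_width:.3f})")
```

The rebuilt interval should agree with the reported one up to rounding, and the difference in sample means sits at the center of that interval.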