2024-11-14

Introduction to Hypothesis Testing

Hypothesis testing is a method for inferring whether research results support a theory about a population

  • This helps to determine if there is enough evidence to support a claim about a population

  • It involves comparing the null hypothesis and the alternative hypothesis

  • It allows us to quantify how strongly the evidence goes against the null hypothesis

Key Terms in Hypothesis Testing

Null Hypothesis (H₀)

The default assumption, usually that there is no effect or no difference

Alternative Hypothesis (H₁ or Hₐ)

The claim we gather evidence for; it contradicts the null hypothesis

Type I Error (α)

Rejecting the null hypothesis when it is true (a false positive); α denotes the probability of this error

Type II Error (β)

Failing to reject the null hypothesis when it is false (a false negative); β denotes the probability of this error

Significance Level (α)

A predetermined threshold for deciding when a result is statistically significant. Common values include .05, .01, and .001.
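
To make the link between the significance level and the Type I error concrete, here is a minimal simulation sketch. It assumes two groups drawn from the same normal distribution, so H₀ is true by construction; the seed, group size, mean, standard deviation, and number of repetitions are arbitrary illustrative choices, not values from these slides. Under those assumptions the share of tests that reject H₀ should land close to α.

```r
# Illustrative assumptions: seed, sample size, and distribution are arbitrary
set.seed(42)

alpha <- 0.05        # significance level
n_sims <- 2000       # number of simulated experiments
n_per_group <- 20    # observations per group

# Both groups share the same true mean, so every rejection is a Type I error
p_values <- replicate(n_sims, {
  a <- rnorm(n_per_group, mean = 50, sd = 10)
  b <- rnorm(n_per_group, mean = 50, sd = 10)
  t.test(a, b)$p.value
})

# Proportion of simulated tests that (wrongly) reject H0
type1_rate <- mean(p_values <= alpha)
type1_rate
```

Lowering `alpha` in this sketch should lower the rejection rate accordingly, at the cost of more Type II errors when a real difference exists.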

3D Visualization of Type I and II Errors

Hypothesis Testing Steps

  1. State your hypotheses (H₀ and H₁)
  2. Choose a significance level (α)
  3. Select the appropriate test statistic
  4. Calculate that test statistic
  5. Determine the p-value
  6. Reject H₀ or Fail to reject H₀
  7. Interpret the results

The p-value

The p-value is the probability of observing data at least as extreme as yours if the null hypothesis were true.

You have strong evidence against the null hypothesis when you have a small p-value that is less than or equal to the significance level, and weak evidence when the p-value is above that level.

P-value calculation (for a two-tailed test): \[ p\text{-value} = 2 \times P(T \geq |t_{\text{obs}}|) \]
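
The two-tailed formula can be sketched in R with `pt()`, the cumulative distribution function of the t distribution. The helper name `two_sided_p` and the example values of `t_obs` and `df` are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: two-tailed p-value for an observed t statistic
two_sided_p <- function(t_obs, df) {
  # P(T >= |t_obs|) is the upper-tail probability; doubling covers both tails
  2 * pt(abs(t_obs), df = df, lower.tail = FALSE)
}

# Illustrative values, not taken from the slides
p_val <- two_sided_p(t_obs = -2.1, df = 20)
p_val
```

Because the formula uses the absolute value of the statistic, `two_sided_p(2.1, 20)` and `two_sided_p(-2.1, 20)` give the same result.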

Example: Two-Sample t-Test

Let’s imagine we want to compare the effectiveness of two different study methods on exam scores

H₀: μ₁ = μ₂ (no difference in mean scores)
H₁: μ₁ ≠ μ₂ (there is a difference in mean scores)
α: .05 (significance level)

To perform our test we will run the following:

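# Sample data: exam scores for five students per study method
method1 <- c(75, 80, 85, 90, 95)
method2 <- c(70, 75, 80, 85, 90)

# Perform a two-sample t-test
# (t.test() defaults to Welch's test, which does not assume equal variances)
t_test_result <- t.test(method1, method2)

# Print the test results
t_test_result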

Test Results

## 
##  Welch Two Sample t-test
## 
## data:  method1 and method2
## t = 1, df = 8, p-value = 0.3466
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.530021 16.530021
## sample estimates:
## mean of x mean of y 
##        85        80
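
As a check on the output above, the statistic, degrees of freedom, and p-value can be reproduced by hand. This is a sketch of the standard Welch formulas, not a description of how `t.test()` is implemented internally; the variable names `v1`, `v2`, `t_obs`, and `df_welch` are introduced here for illustration.

```r
# Same data as in the example
method1 <- c(75, 80, 85, 90, 95)
method2 <- c(70, 75, 80, 85, 90)

n1 <- length(method1)
n2 <- length(method2)

# Squared standard error contributed by each group
v1 <- var(method1) / n1
v2 <- var(method2) / n2

# Welch t statistic: difference in means over its standard error
t_obs <- (mean(method1) - mean(method2)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation for the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Two-tailed p-value
p_value <- 2 * pt(abs(t_obs), df = df_welch, lower.tail = FALSE)

c(t = t_obs, df = df_welch, p = p_value)
```

These should match the reported t = 1, df = 8, and p-value = 0.3466. The degrees of freedom come out as a whole number here only because both groups have the same size and the same sample variance.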

\[ p\text{-value} = 0.3466 > \alpha = 0.05 \]

Therefore we fail to reject H₀: the data provide only weak evidence against it. With just five scores per group, the results are best treated as inconclusive for now.

Visualizing the Test Results

Statistical Power

Statistical power is the probability of correctly rejecting a false null hypothesis, which equals 1 − β. This graph shows the statistical power relating to Two-Sample t-Tests.
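
Power can also be estimated numerically with `power.t.test()` from base R's stats package. The sketch below plugs in numbers from the study-method example as assumptions: a true difference of 5 points and a standard deviation equal to the sample standard deviation of one group. Note that `power.t.test()` assumes the classical equal-variance t-test rather than Welch's test, so treat the result as a rough guide.

```r
# Assumed population SD: the sample SD of one group from the example
sd_assumed <- sd(c(75, 80, 85, 90, 95))

# Power of a two-sided, two-sample t-test with 5 observations per group,
# assuming the true difference in means is 5 points
pw <- power.t.test(n = 5, delta = 5, sd = sd_assumed, sig.level = 0.05)
pw$power

# Per-group sample size needed to reach 80% power under the same assumptions
needed <- power.t.test(delta = 5, sd = sd_assumed, sig.level = 0.05, power = 0.80)
ceiling(needed$n)
```

Under these assumptions the power at n = 5 is low, which is consistent with reading the earlier test result as inconclusive rather than as evidence of no difference.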