2024-11-03

1. Introduction to Hypothesis Testing

Hypothesis testing is a statistical method for making inferences about population parameters from sample data.

  • Null Hypothesis (H0): No effect or difference.
  • Alternative Hypothesis (H1): There is an effect or difference.

Goal:

Determine whether to reject H0 based on sample data.

2. Steps for Hypothesis Testing

  1. State the Hypotheses:
    • H0: Population mean = some value.
    • H1: Population mean ≠ some value (two-sided), or > / < some value for a one-sided test.
  2. Choose the Significance Level (α):
    • Commonly set at 0.05.
  3. Collect Data:
    • Obtain sample data.
  4. Calculate the Test Statistic:
    • Use appropriate tests (e.g., t-test, z-test).
  5. Make a Decision:
    • Compare the p-value with α.
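
As a rough sketch, the five steps above map onto R's built-in `t.test()` as follows. The vector `grades` and the hypothesized mean of 70 are made-up values for illustration only, not data from any real study.

```r
# Step 1: hypotheses. H0: population mean = 70, H1: population mean != 70
mu0 <- 70  # hypothetical value under H0 (assumption)

# Step 2: significance level
alpha <- 0.05

# Step 3: sample data (made-up values for illustration only)
grades <- c(72, 68, 75, 71, 69, 74, 77, 70, 73, 76)

# Step 4: test statistic and p-value from a one-sample, two-sided t-test
result <- t.test(grades, mu = mu0, alternative = "two.sided")

# Step 5: decision. Reject H0 when the p-value falls below alpha
reject_h0 <- result$p.value < alpha
reject_h0
```

For a one-sided alternative, `alternative = "greater"` or `alternative = "less"` would replace `"two.sided"`.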

3. Example Scenario

Suppose a school claims that its app improves student performance: the average grade of app users is said to be higher than the current state-wide test average of 75. We use a hypothesis test to check whether sample data support this claim.

4. Code

Hypotheses:

  • H0: µ = 75
  • H1: µ > 75

Sample Data:

  • Sample size (n) = 40
  • Sample mean (x̄) = 78
  • Sample standard deviation (s) = 10

# Sample data (summary statistics, in grade points)
n <- 40      # sample size
x_bar <- 78  # sample mean
mu <- 75     # population mean under H0
s <- 10      # sample standard deviation

5. Performing the Test

  1. Calculate the Test Statistic:

Using the t-test formula: \[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]

  2. Calculate the p-value using the t-distribution with n − 1 degrees of freedom.

  3. Decision Rule:

    • Reject H0 if p-value < α.
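
The decision rule can equivalently be phrased with a critical value: for a one-sided (upper-tail) test such as H1: µ > 75, the p-value is below α exactly when t exceeds the (1 − α) quantile of the t-distribution. A minimal sketch, assuming α = 0.05 and the summary statistics of the example:

```r
# Summary statistics from the example; alpha = 0.05 is an assumed choice
n <- 40; x_bar <- 78; mu <- 75; s <- 10
alpha <- 0.05

# Test statistic
t_stat <- (x_bar - mu) / (s / sqrt(n))

# Critical value for an upper-tail test with n - 1 degrees of freedom
t_crit <- qt(1 - alpha, df = n - 1)

# One-sided (upper-tail) p-value
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Both formulations lead to the same decision
reject_by_p <- p_value < alpha
reject_by_t <- t_stat > t_crit
c(reject_by_p = reject_by_p, reject_by_t = reject_by_t)
```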

6. Example Scenario (continued)

# Calculate the t statistic: how many standard errors the sample mean
# lies from the mean under the null hypothesis
t_stat <- (x_bar - mu) / (s / sqrt(n))

# Calculate the one-sided p-value for H1: µ > 75 (upper tail of the
# t-distribution with n - 1 degrees of freedom)
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Output results
t_stat
## [1] 1.897367
p_value
## [1] 0.03260065

7. Visualization with ggplot2

Visualizing the sample data and the population mean:
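
The plotting code is not shown here, so the following is only a sketch of one way such a figure could be built. The summary statistics do not determine individual grades, so both groups are simulated with `rnorm()`; the seed, the normality assumption, and the column names `grade` and `group` are all assumptions.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, for reproducibility only
n <- 40; x_bar <- 78; mu <- 75; s <- 10

# Simulated grades (assumption: normally distributed in both groups)
plot_data <- data.frame(
  grade = c(rnorm(n, mean = x_bar, sd = s), rnorm(n, mean = mu, sd = s)),
  # `each` takes a single number, so use n rather than c(n, n)
  group = rep(c("Sample", "Population"), each = n)
)

# Overlaid densities with a dashed line at the population mean
p <- ggplot(plot_data, aes(x = grade, fill = group)) +
  geom_density(alpha = 0.4) +
  geom_vline(xintercept = mu, linetype = "dashed") +
  labs(title = "Simulated grades vs. population mean",
       x = "Grade", y = "Density", fill = NULL)
p
```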

8. Interactive Density Plotly Plot
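
No code accompanies this heading. One possible sketch, assuming the plotly package and simulated grades (the seed and the normal model are assumptions, not the original data), estimates a kernel density with `density()` and draws it as an interactive curve:

```r
library(plotly)

set.seed(42)  # arbitrary seed, for reproducibility only
n <- 40; x_bar <- 78; s <- 10
grades <- rnorm(n, mean = x_bar, sd = s)  # simulated, not real data

# Kernel density estimate of the simulated sample
dens <- density(grades)

# Interactive filled density curve
fig <- plot_ly(x = dens$x, y = dens$y, type = "scatter", mode = "lines",
               fill = "tozeroy", name = "Sample density")
fig <- layout(fig,
              title = "Density of simulated grades",
              xaxis = list(title = "Grade"),
              yaxis = list(title = "Density"))
fig
```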

9. Boxplot
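
The boxplot itself is not reproduced here. A hedged sketch with ggplot2, again using simulated grades (seed, normality, and column names are assumptions), might look like this:

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, for reproducibility only
n <- 40; x_bar <- 78; mu <- 75; s <- 10

# Simulated grades for the two groups (not real data)
box_data <- data.frame(
  grade = c(rnorm(n, mean = x_bar, sd = s), rnorm(n, mean = mu, sd = s)),
  group = rep(c("Sample", "Population"), each = n)
)

# Side-by-side boxplots with a dashed reference line at the population mean
b <- ggplot(box_data, aes(x = group, y = grade, fill = group)) +
  geom_boxplot(show.legend = FALSE) +
  geom_hline(yintercept = mu, linetype = "dashed") +
  labs(title = "Simulated grades by group", x = NULL, y = "Grade")
b
```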

10. Conclusion

  • Hypothesis testing provides a framework for making decisions based on sample data.
  • Always check the test's assumptions and consider the context of the results before drawing conclusions.
  • Hypothesis testing is widely used in fields such as medicine, business, and the social sciences.

11. Credits

  • Stack Overflow