HW 3

2025-11-10

Description

Statistical Testing: T-Tests

Example: US Unemployment

  1. Line chart of unemployment rate over time

We test whether the average U.S. unemployment rate differed between the 2000s and the 2010s. The first six rows of the data are shown below.
## # A tibble: 6 × 4
##   date        year decade unemp_rate
##   <date>     <dbl> <chr>       <dbl>
## 1 2000-01-01  2000 2000s        2.03
## 2 2000-02-01  2000 2000s        2.08
## 3 2000-03-01  2000 2000s        2.04
## 4 2000-04-01  2000 2000s        1.95
## 5 2000-05-01  2000 2000s        2.04
## 6 2000-06-01  2000 2000s        2.00
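
The code that built this table is not shown in the report. A minimal sketch of one way to produce a table of the same shape follows. It assumes the data come from `ggplot2::economics` and that `unemp_rate` is the number unemployed as a percentage of total population; neither assumption is confirmed by the source. It also draws the line chart named in item 1.

```r
# Sketch only: the data source and the unemp_rate definition are assumptions,
# not taken from the original homework.
library(ggplot2)

econ <- economics[, c("date", "unemploy", "pop")]
econ$year <- as.numeric(format(econ$date, "%Y"))

# Keep the two decades being compared (economics ends in 2015)
econ <- econ[econ$year >= 2000 & econ$year <= 2019, ]
econ$decade <- ifelse(econ$year < 2010, "2000s", "2010s")

# Assumed definition: unemployed persons as a percent of total population
econ$unemp_rate <- 100 * econ$unemploy / econ$pop
econ <- econ[, c("date", "year", "decade", "unemp_rate")]

head(econ)

# Line chart of the rate over time
p_line <- ggplot(econ, aes(date, unemp_rate)) +
  geom_line() +
  labs(title = "Unemployment Rate Over Time", x = "Date", y = "Percent")
p_line
```

If the assumptions hold, the group sizes should match the 120 and 64 monthly observations reported in the t-test summary.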

Welch t-test formulas
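
The formulas themselves did not survive in this section, so the standard Welch definitions are restated here. The notation is an assumption chosen for this write-up: subscript 1 is the 2010s sample and subscript 2 is the 2000s sample, which matches the reported difference (2010s − 2000s); \bar{x}, s^2 and n are the sample mean, sample variance and sample size.

```latex
% Test statistic
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}

% Welch-Satterthwaite degrees of freedom
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
                 {\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}

% 95% confidence interval for the difference in means
(\bar{x}_1 - \bar{x}_2) \pm t_{0.975,\,\nu}\,\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}
```

Unlike the pooled-variance t-test, this version does not assume that the two decades have equal variances, which is why the degrees of freedom (133.8 in the summary) are not a whole number.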

The t-test

## n_2000s: 120 
##  n_2010s: 64 
##  mean_2000s: 2.79% 
##  mean_2010s: 3.88% 
##  difference (2010s - 2000s): 1.09% 
##  t (df): 10.09 (df = 133.8) 
##  p-value: 3.82e-18 
##  95% CI: [0.87, 1.30]%
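
The code behind this summary is not shown. As a rough sketch of how such numbers can be obtained, the block below computes Welch's statistic, degrees of freedom, p-value and confidence interval by hand and checks them against `t.test()`. The helper `welch_summary()` and the toy vectors are hypothetical and are not the homework's code or data.

```r
# Hypothetical helper (not from the original homework): Welch's t-test by hand.
welch_summary <- function(x, y, conf_level = 0.95) {
  n_x <- length(x)
  n_y <- length(y)
  v_x <- var(x) / n_x              # squared standard error of mean(x)
  v_y <- var(y) / n_y              # squared standard error of mean(y)
  diff <- mean(x) - mean(y)        # direction: x minus y
  se <- sqrt(v_x + v_y)
  t_stat <- diff / se
  # Welch-Satterthwaite degrees of freedom
  df <- (v_x + v_y)^2 / (v_x^2 / (n_x - 1) + v_y^2 / (n_y - 1))
  p_value <- 2 * pt(-abs(t_stat), df)          # two-sided p-value
  half <- qt(1 - (1 - conf_level) / 2, df) * se
  list(diff = diff, t = t_stat, df = df, p_value = p_value,
       ci = c(diff - half, diff + half))
}

# Toy vectors for illustration only; these are not the unemployment data
x <- c(3.9, 4.1, 3.7, 4.4, 3.8, 4.0)
y <- c(2.8, 3.1, 2.6, 2.9, 3.0)

by_hand <- welch_summary(x, y)
built_in <- t.test(x, y)   # var.equal = FALSE by default, i.e. Welch's test
```

With the real data, passing the 2010s rates as `x` and the 2000s rates as `y` keeps the sign of the statistic and the interval aligned with a difference defined as 2010s − 2000s. Swapping the arguments flips both signs but leaves the p-value unchanged.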

Graphed in Boxplot

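library(ggplot2)

# Box plot of the monthly rate for each decade, with the raw points overlaid
ggplot(econ, aes(decade, unemp_rate)) +
  geom_boxplot(outlier.alpha = 0.3) +      # fade outliers; the jitter layer also draws them
  geom_jitter(width = 0.08, alpha = 0.6) + # small horizontal jitter to reduce overplotting
  labs(title = "Unemployment Rate by Decade",
       x = "Decade", y = "Percent")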

Boxplot Interpretation

Histogram
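
The histogram and its code are missing from this section. One possible way to draw overlaid histograms by decade is sketched below. `plot_unemp_hist()`, the bin width and the toy rows are hypothetical choices for illustration, not the original figure's settings.

```r
# Sketch only: plot_unemp_hist() is a hypothetical helper, not the original code.
library(ggplot2)

plot_unemp_hist <- function(df, binwidth = 0.25) {
  # Expects columns `unemp_rate` (numeric) and `decade` (character or factor)
  ggplot(df, aes(unemp_rate, fill = decade)) +
    geom_histogram(binwidth = binwidth, alpha = 0.6, position = "identity") +
    labs(title = "Distribution of Unemployment Rate by Decade",
         x = "Percent", y = "Months", fill = "Decade")
}

# Toy rows for illustration only; pass the real `econ` data frame instead
toy <- data.frame(
  decade = rep(c("2000s", "2010s"), each = 4),
  unemp_rate = c(2.0, 2.1, 2.6, 3.0, 3.2, 3.9, 4.1, 4.4)
)
p_hist <- plot_unemp_hist(toy)
p_hist
```

Using `position = "identity"` with partial transparency overlays the two decades on a shared axis, so a shift in the centre of the distribution is easy to see.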

Conclusion

## Difference in means (2010s − 2000s): 1.09 % 
##  t (df): 10.09 (df = 133.8) 
##  p-value: 3.82e-18 
##  95% CI: [0.87, 1.30] %

The mean unemployment rate was 1.09 percentage points higher in the 2010s than in the 2000s (Welch t = 10.09, df = 133.8, p = 3.82e-18, 95% CI [0.87, 1.30]), so the null hypothesis of equal decade means is rejected at the 5% level.