Understanding t-tests with R

Illya Mowerman, Ph.D.

What is a t-test?

Understanding Key Terms

One-Sample t-test Example

Let’s test whether the average MPG of cars in mtcars differs from 20 MPG:

# Perform one-sample t-test
t_test_result <- t.test(mtcars$mpg, mu = 20)
t_test_result
## 
##  One Sample t-test
## 
## data:  mtcars$mpg
## t = 0.08506, df = 31, p-value = 0.9328
## alternative hypothesis: true mean is not equal to 20
## 95 percent confidence interval:
##  17.91768 22.26357
## sample estimates:
## mean of x 
##  20.09062

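Interpretation:

  • The p-value (0.9328) is far above 0.05, so we fail to reject H₀: μ = 20
  • The sample mean (20.09 MPG) is very close to the hypothesized 20 MPG
  • The 95% confidence interval (17.92 to 22.26) contains 20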
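To see where the numbers in the output come from, the statistic can be recomputed by hand as t = (x̄ - μ₀) / (s / √n) with n - 1 degrees of freedom. This is a minimal sketch; the names x, mu0, t_manual and so on are illustrative and not part of the original example.

```r
# Recompute the one-sample t-test by hand and compare with t.test()
x   <- datasets::mtcars$mpg     # untouched copy of the mpg column
mu0 <- 20                       # hypothesized mean from the example
n   <- length(x)
se  <- sd(x) / sqrt(n)          # standard error of the mean

t_manual  <- (mean(x) - mu0) / se
df_manual <- n - 1
# Two-sided p-value: probability in both tails beyond |t|
p_manual  <- 2 * pt(abs(t_manual), df = df_manual, lower.tail = FALSE)
# 95% confidence interval for the mean
ci_manual <- mean(x) + c(-1, 1) * qt(0.975, df = df_manual) * se

ref <- t.test(x, mu = mu0)
all.equal(t_manual, unname(ref$statistic))   # TRUE
all.equal(p_manual, ref$p.value)             # TRUE
```

The manual values should agree with the t, df, p-value, and interval printed by t.test().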

Visualizing One-Sample t-test

# ggplot2 must be loaded before calling ggplot()
library(ggplot2)

ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2, fill = "skyblue", color = "black") +
  geom_vline(xintercept = 20, color = "red", linetype = "dashed") +
  annotate("text", x = 21, y = 8, label = "H₀: μ = 20") +
  labs(title = "Distribution of MPG",
       x = "Miles per Gallon",
       y = "Count")

Independent Two-Sample t-test

Let’s compare MPG between automatic and manual transmission cars:

# Convert am to factor (in mtcars, 0 = automatic, 1 = manual)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))

# Perform two-sample t-test
t_test_trans <- t.test(mpg ~ am, data = mtcars)
t_test_trans
## 
##  Welch Two Sample t-test
## 
## data:  mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means between group Automatic and group Manual is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean in group Automatic    mean in group Manual 
##                17.14737                24.39231

Visualizing Two-Sample t-test

ggplot(mtcars, aes(x = am, y = mpg, fill = am)) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "red") +
  labs(title = "MPG by Transmission Type",
       x = "Transmission",
       y = "Miles per Gallon") +
  theme_minimal()

Interpreting Two-Sample t-test Results

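For transmission type comparison:

  • Manual cars average 24.39 MPG versus 17.15 MPG for automatic cars, a difference of about 7.2 MPG
  • The p-value (0.001374) is below 0.05, so we reject H₀ of equal means
  • The 95% confidence interval for the difference (Automatic minus Manual) runs from -11.28 to -3.21 and excludes 0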
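The Welch statistic and its fractional degrees of freedom can be reproduced by hand using the Welch-Satterthwaite approximation. The sketch below assumes the original 0/1 coding of am (taken from a fresh copy of the data) and uses illustrative variable names.

```r
# Welch two-sample t-test by hand (unequal variances)
cars   <- datasets::mtcars              # fresh copy, am still coded 0/1
auto   <- cars$mpg[cars$am == 0]        # automatic transmission
manual <- cars$mpg[cars$am == 1]        # manual transmission

# Squared standard error of each group mean
v1 <- var(auto)   / length(auto)
v2 <- var(manual) / length(manual)

t_welch  <- (mean(auto) - mean(manual)) / sqrt(v1 + v2)
# Welch-Satterthwaite degrees of freedom
df_welch <- (v1 + v2)^2 /
  (v1^2 / (length(auto) - 1) + v2^2 / (length(manual) - 1))
p_welch  <- 2 * pt(abs(t_welch), df = df_welch, lower.tail = FALSE)

ref <- t.test(auto, manual)             # Welch is the default (var.equal = FALSE)
c(t = t_welch, df = df_welch, p = p_welch)
```

Passing var.equal = TRUE to t.test() would instead run the classic pooled-variance test with n₁ + n₂ - 2 degrees of freedom.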

Paired t-test Example

Let’s simulate before/after data for a fuel efficiency modification:

# Simulate paired data
set.seed(123)
before <- mtcars$mpg
after <- before + rnorm(32, mean = 2, sd = 1)
paired_data <- data.frame(before, after)

# Perform paired t-test
paired_test <- t.test(after, before, paired = TRUE)
paired_test
## 
##  Paired t-test
## 
## data:  after and before
## t = 11.626, df = 31, p-value = 7.811e-13
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  1.616110 2.303783
## sample estimates:
## mean difference 
##        1.959946
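A paired t-test is equivalent to a one-sample t-test on the within-pair differences, tested against zero. The sketch below re-creates the simulated data so that it runs on its own; diffs, paired, and one_sample are illustrative names.

```r
# Paired t-test versus one-sample t-test on the differences
set.seed(123)
before <- datasets::mtcars$mpg
after  <- before + rnorm(length(before), mean = 2, sd = 1)

diffs <- after - before                     # one difference per car

paired     <- t.test(after, before, paired = TRUE)
one_sample <- t.test(diffs, mu = 0)

# Same statistic, degrees of freedom, and p-value
all.equal(unname(paired$statistic), unname(one_sample$statistic))  # TRUE
all.equal(unname(paired$parameter), unname(one_sample$parameter))  # TRUE
all.equal(paired$p.value, one_sample$p.value)                      # TRUE
```

This is why the normality assumption for a paired test applies to the differences rather than to the two sets of measurements separately.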

Visualizing Paired t-test

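# Reshape to long format; id links each car's two measurements
paired_long <- data.frame(
  id = rep(seq_along(before), times = 2),
  mpg = c(before, after),
  # Set the level order explicitly so "Before" is drawn to the left of "After"
  time = factor(rep(c("Before", "After"), each = length(before)),
                levels = c("Before", "After"))
)

ggplot(paired_long, aes(x = time, y = mpg, fill = time)) +
  geom_boxplot() +
  geom_line(aes(group = id), alpha = 0.2) +
  labs(title = "MPG Before and After Modification",
       x = "Time",
       y = "Miles per Gallon") +
  theme_minimal()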

t-test Assumptions

  1. Normality
    • Data within each group (or the differences, for a paired test) should be approximately normally distributed
    • Can check using QQ plots
  2. Equal Variances (for independent t-test)
    • Can use Levene’s test (e.g., leveneTest() in the car package)
    • Alternative: Welch’s t-test (default in R)

# Draw the two QQ plots side by side, then restore the plotting layout
old_par <- par(mfrow = c(1, 2))
qqnorm(mtcars$mpg[mtcars$am == "Automatic"], main = "QQ Plot: Automatic")
qqline(mtcars$mpg[mtcars$am == "Automatic"])
qqnorm(mtcars$mpg[mtcars$am == "Manual"], main = "QQ Plot: Manual")
qqline(mtcars$mpg[mtcars$am == "Manual"])
par(old_par)
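QQ plots can be supplemented with formal tests. The sketch below uses only base R: shapiro.test() for normality within each group and var.test() for equality of variances. Note that var.test() is an F test, not Levene’s test, and is used here only as a stand-in that needs no extra package; it is itself sensitive to non-normality.

```r
# Formal checks of the t-test assumptions (base R only)
cars <- datasets::mtcars
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Shapiro-Wilk p-value for each transmission group
# (a small p-value suggests departure from normality)
shapiro_by_group <- tapply(cars$mpg, cars$am,
                           function(x) shapiro.test(x)$p.value)
shapiro_by_group

# F test comparing the two group variances
# (stand-in for Levene's test; assumes normality)
var_check <- var.test(mpg ~ am, data = cars)
var_check$p.value
```

With small groups these tests have little power, so they are best read alongside the plots rather than as a pass/fail gate.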

Common Mistakes to Avoid

  1. Using t-test when assumptions are violated
    • Consider non-parametric alternatives (e.g., Wilcoxon test)
  2. Multiple testing without correction
    • Increases risk of false positives
    • Use Bonferroni or other corrections
  3. Confusing statistical and practical significance
    • Small p-value ≠ Important difference
    • Consider effect size and practical implications
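The first two points can be sketched in code: a Wilcoxon rank-sum test as a non-parametric alternative, and a Bonferroni correction applied with p.adjust(). Comparing mpg, hp, and wt across transmission types is an illustrative choice made here, not part of the original analysis.

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# 1. Non-parametric alternative to the independent t-test
#    (exact = FALSE uses the normal approximation, since mpg has ties)
wilcox_result <- wilcox.test(mpg ~ am, data = cars, exact = FALSE)
wilcox_result$p.value

# 2. Several t-tests on the same data, then a Bonferroni correction
outcomes <- c("mpg", "hp", "wt")          # illustrative set of outcomes
p_raw <- sapply(outcomes,
                function(v) t.test(cars[[v]] ~ cars$am)$p.value)
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Bonferroni multiplies each p-value by the number of tests (capped at 1)
rbind(raw = p_raw, bonferroni = p_bonf)
```

p.adjust() also supports less conservative methods such as "holm" and "BH".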

Effect Size (Cohen’s d)

# Calculate Cohen's d for transmission comparison
library(effsize)
cohen.d(mpg ~ am, data = mtcars)
## 
## Cohen's d
## 
## d estimate: -1.477947 (large)
## 95 percent confidence interval:
##     lower     upper 
## -2.304209 -0.651685

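Interpretation:

  • d ≈ -1.48: automatic cars average about 1.5 pooled standard deviations lower MPG than manual cars
  • By the usual convention (|d| ≥ 0.8), this is a large effect
  • The 95% confidence interval (-2.30 to -0.65) excludes 0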
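Cohen’s d for two independent groups is the difference in means divided by the pooled standard deviation. The sketch below recomputes it without the effsize package; it assumes the original 0/1 coding of am and uses illustrative variable names.

```r
# Cohen's d by hand: mean difference in units of pooled standard deviation
cars   <- datasets::mtcars              # fresh copy, am still coded 0/1
auto   <- cars$mpg[cars$am == 0]
manual <- cars$mpg[cars$am == 1]
n1 <- length(auto)
n2 <- length(manual)

# Pooled SD weights each group's variance by its degrees of freedom
sd_pooled <- sqrt(((n1 - 1) * var(auto) + (n2 - 1) * var(manual)) /
                    (n1 + n2 - 2))

d_manual <- (mean(auto) - mean(manual)) / sd_pooled
d_manual   # about -1.478, matching the cohen.d() estimate
```

The sign depends only on which group is listed first; the magnitude is what conveys the size of the effect.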

When to Use Each Type of t-test

  1. One-sample t-test
    • Comparing sample mean to known value
    • Example: Is average customer satisfaction different from 3.5?
  2. Independent t-test
    • Comparing means between independent groups
    • Example: Do treatment and control groups differ?
  3. Paired t-test
    • Comparing means of paired observations
    • Example: Before/after measurements

Summary

Thank You!