Statistical Inference: One-Sample t-test

Hypothesis Testing and Normality Checks in R

Author

Abdullah Al Shamim

Published

February 16, 2026

Introduction

A One-Sample t-test is used to determine whether the mean of the population from which a single sample was drawn is statistically different from a known or hypothesized value (\(\mu\)).
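
For reference, the test statistic behind this comparison can be written as follows. The symbols \(\bar{x}\) (sample mean), \(s\) (sample standard deviation), and \(n\) (sample size) are notation introduced here for illustration; \(\mu\) is the hypothesized value as above.

\[
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad \text{df} = n - 1
\]

If the null hypothesis holds and the data are approximately normal, \(t\) follows a Student's t distribution with \(n - 1\) degrees of freedom, which is where the p-value comes from.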


1. Data Preparation and Exploration

We will use the mpg (miles per gallon) variable from the built-in mtcars dataset to test our hypothesis.

Code
# 1. Prepare Data
data <- mtcars$mpg  

# 2. Check Data
summary(data)  # Basic descriptive statistics
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.40   15.43   19.20   20.09   22.80   33.90 
Code
hist(data, main="Distribution of MPG", col="steelblue", border="white")


2. Assumptions Check: Normality

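Before running a t-test, we should check that the data are approximately normally distributed, because the test assumes normality (it is fairly robust to mild departures at moderate sample sizes such as our n = 32). We use the Shapiro-Wilk Test for this purpose.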

Code
# Shapiro-Wilk Normality Test
shapiro.test(data)

    Shapiro-Wilk normality test

data:  data
W = 0.94756, p-value = 0.1229

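Interpretation: The null hypothesis of the Shapiro-Wilk test is that the data come from a normal distribution. Here the p-value (0.1229) is greater than 0.05, so we fail to reject it: there is no significant evidence against normality. This does not prove that the data are normally distributed, but it supports proceeding with the t-test.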
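
A Q-Q plot is a common visual complement to a formal normality test. The sketch below is an addition to the original analysis, not part of it; it uses base R's qqnorm() and qqline(), and the variable names mpg_values and qq are chosen here for illustration.

```r
# Visual normality check: points close to the reference line
# suggest the sample is roughly normal.
mpg_values <- mtcars$mpg

# qqnorm() draws the plot and invisibly returns the plotted coordinates
qq <- qqnorm(mpg_values, main = "Normal Q-Q Plot of MPG")

# Reference line through the first and third quartiles
qqline(mpg_values, col = "red", lwd = 2)
```

Strong curvature or points far from the line in the tails would be a reason to look more closely, even when the Shapiro-Wilk p-value is above 0.05.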


3. Performing the One-Sample t-test

We are testing whether the average mileage of the cars is significantly different from 20 mpg (\(\mu = 20\)).

Code
# One-Sample t-test
t_test_result <- t.test(data, mu = 20)
print(t_test_result)

    One Sample t-test

data:  data
t = 0.08506, df = 31, p-value = 0.9328
alternative hypothesis: true mean is not equal to 20
95 percent confidence interval:
 17.91768 22.26357
sample estimates:
mean of x 
 20.09062 
Code
# Advanced options (commented out):
# t.test(data, mu = 20, alternative = "greater", conf.level = 0.99)
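
To make the output above less of a black box, the same quantities can be computed by hand. This is a minimal sketch, assuming the standard one-sample formula t = (mean - mu) / (sd / sqrt(n)); the variable names (x, mu0, se, t_stat, and so on) are illustrative and do not appear in the original code.

```r
# Manual one-sample t-test on mtcars$mpg against a hypothesized mean of 20
x   <- mtcars$mpg
mu0 <- 20                      # hypothesized mean (illustrative name)

n      <- length(x)
se     <- sd(x) / sqrt(n)      # standard error of the mean
t_stat <- (mean(x) - mu0) / se # test statistic
df     <- n - 1                # degrees of freedom

# Two-sided p-value from the t distribution
p_two_sided <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# 95% confidence interval for the mean
ci <- mean(x) + c(-1, 1) * qt(0.975, df = df) * se

cat("t =", t_stat, " df =", df, " p-value =", p_two_sided, "\n")
cat("95% CI:", ci, "\n")
```

These values should agree with the t.test() output shown above (t = 0.08506, df = 31, p-value = 0.9328).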

4. Results and Interpretation

We extract specific values from the test result to formulate our conclusion.

Code
cat("p-value:", t_test_result$p.value, "\n")
p-value: 0.9327606 
Code
cat("95% Confidence Interval:", t_test_result$conf.int, "\n")
95% Confidence Interval: 17.91768 22.26357 
Code
cat("Sample Mean:", mean(data), "\n")
Sample Mean: 20.09062 
Code
# Logic-based Conclusion
if (t_test_result$p.value < 0.05) {
  cat("Result: The mean mileage is significantly different from 20 mpg (p < 0.05).\n")
} else {
  cat("Result: No statistically significant difference from 20 mpg (p >= 0.05).\n")
}
Result: No statistically significant difference from 20 mpg (p >= 0.05).
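
When writing up a result, it is often convenient to collect the extracted values into a single line. The sketch below is one possible way to do that; the reporting format and the names res and report are assumptions made here, not a convention from the original analysis.

```r
# Build a one-line summary from the components of the test result
res <- t.test(mtcars$mpg, mu = 20)

report <- sprintf(
  "t(%d) = %.3f, p = %.3f, 95%% CI [%.2f, %.2f]",
  as.integer(res$parameter),   # degrees of freedom
  unname(res$statistic),       # t statistic
  res$p.value,                 # two-sided p-value
  res$conf.int[1],             # lower confidence bound
  res$conf.int[2]              # upper confidence bound
)

cat(report, "\n")
```

The components used here ($statistic, $parameter, $p.value, $conf.int) are the standard elements of the object returned by t.test().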

5. Visualization

Visualizing the data with a boxplot helps us see where the hypothesized mean (20) sits relative to our sample distribution.

Code
library(tidyverse)

mtcars %>% 
  ggplot(aes(x = "", y = mpg)) +
  geom_boxplot(fill = "lightblue", outlier.color = "red") +
  geom_hline(yintercept = 20, 
             color = "red", 
             linetype = "dashed", 
             linewidth = 1) +  # 'size' for lines is deprecated since ggplot2 3.4.0
  labs(title = "One-Sample t-test: MPG vs Hypothesized Mean (20)",
       subtitle = "Red dashed line represents mu = 20",
       y = "Miles Per Gallon (mpg)",
       x = "") +
  theme_minimal()


6. Non-Parametric Alternative

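If your data fail the normality check (Shapiro-Wilk p-value < 0.05), consider using the Wilcoxon Signed-Rank Test instead. Note that it tests the location (median) of a distribution assumed to be symmetric, rather than the mean.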

Code
# Non-parametric alternative
wilcox.test(data, mu = 20)

    Wilcoxon signed rank test with continuity correction

data:  data
V = 249, p-value = 0.7863
alternative hypothesis: true location is not equal to 20

Systematic Checklist (Cheat Sheet):

  • Normality Check: shapiro.test()
  • Parametric Test: t.test(data, mu = target)
  • Non-Parametric Test: wilcox.test(data, mu = target)
  • Significance Threshold: p < 0.05 (Typically)
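
The checklist can be sketched as a small helper function. This is an illustration only: run_one_sample_test() is a hypothetical name introduced here, and choosing the test automatically from a preliminary normality test is a simplification of the workflow above rather than a recommended rule.

```r
# Hypothetical helper (not part of base R): follow the checklist in one call
run_one_sample_test <- function(x, mu, alpha = 0.05) {
  # Step 1: normality check
  normality <- shapiro.test(x)

  # Step 2: pick the parametric or non-parametric test
  if (normality$p.value >= alpha) {
    test <- t.test(x, mu = mu)
  } else {
    test <- wilcox.test(x, mu = mu)  # may warn if the data contain ties
  }

  # Step 3: return the pieces needed for a conclusion
  list(
    normality_p = normality$p.value,
    method      = test$method,
    p_value     = test$p.value,
    significant = test$p.value < alpha
  )
}

result <- run_one_sample_test(mtcars$mpg, mu = 20)
print(result)
```

For the mpg data this takes the t-test branch, since the Shapiro-Wilk p-value reported earlier (0.1229) is above 0.05.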

Well done! You have successfully performed and interpreted a One-Sample t-test.