2025-11-09

Introduction

In this presentation, I explore hypothesis testing with the mtcars dataset.

  • I will test whether the mean fuel efficiency of the cars, in miles per gallon (mpg), differs from 20 mpg.
  • Components:
    • Null hypothesis (\(H_0\))
    • Alternative hypothesis (\(H_a\))
    • Test statistic
    • P-value

Hypothesis Test Setup

  • Null hypothesis: \(H_0: \mu = 20\)
  • Alternative hypothesis (two-sided): \(H_a: \mu \neq 20\)
  • Test statistic (t-test):

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

  • \(\bar{x}\) = sample mean
  • \(\mu_0\) = hypothesized mean under \(H_0\) (here, 20 mpg)
  • \(s\) = sample standard deviation
  • \(n\) = sample size
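
As a minimal sketch, the test statistic above can be computed by hand in base R. The variable names (`mu0`, `x_bar`, `s`, `n`, `t_stat`, `p_val`) are chosen here for illustration and do not come from the original slides.

```r
# Compute the one-sample t statistic directly from the formula.
data(mtcars)

mu0   <- 20                    # hypothesized mean under H0
x_bar <- mean(mtcars$mpg)      # sample mean
s     <- sd(mtcars$mpg)        # sample standard deviation
n     <- length(mtcars$mpg)    # sample size

t_stat <- (x_bar - mu0) / (s / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_val <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

t_stat
p_val
```

The two values should agree with the `t` and `p-value` fields that `t.test()` reports for the same data.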

Sample Data

data(mtcars)
head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Histogram of mpg

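library(ggplot2)  # provides ggplot(), geom_histogram(), labs(), theme_minimal()

# Histogram of mpg in 2-mpg bins
ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2, fill = "#8C1D40", color = "black") +
  labs(title = "Histogram of MPG", x = "Miles per Gallon", y = "Frequency") +
  theme_minimal()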

t-test

    One Sample t-test

data:  mtcars$mpg
t = 0.08506, df = 31, p-value = 0.9328
alternative hypothesis: true mean is not equal to 20
95 percent confidence interval:
 17.91768 22.26357
sample estimates:
mean of x 
 20.09062 
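
The code chunk that produced this output is not shown on the slide. A call along the following lines would print the same result; the explicit `alternative = "two.sided"` argument and the name `result` are assumptions made here for clarity.

```r
# One-sample, two-sided t-test of H0: mu = 20 on mpg
data(mtcars)

result <- t.test(mtcars$mpg, mu = 20, alternative = "two.sided")
result

# Individual pieces of the result can be extracted by name
result$statistic   # t statistic
result$parameter   # degrees of freedom
result$p.value     # two-sided p-value
result$conf.int    # 95 percent confidence interval for the mean
```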

Interpreting the t-test

  • Suppose \(H_0: \mu = 20\)
  • Alternative: \(H_a: \mu \neq 20\)
  • Test statistic: \(t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}\)
  • P-value < 0.05 → Reject \(H_0\)
  • P-value ≥ 0.05 → Fail to reject \(H_0\)
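
The decision rule above can be written as a short check. This is a sketch that assumes a significance level of 0.05, as in the bullets; `alpha`, `p_val`, and `decision` are illustrative names.

```r
# Apply the decision rule: reject H0 only when the p-value is below alpha
data(mtcars)

alpha <- 0.05
p_val <- t.test(mtcars$mpg, mu = 20)$p.value

decision <- if (p_val < alpha) "Reject H0" else "Fail to reject H0"
decision
```

With the p-value of 0.9328 reported for this test, the rule returns "Fail to reject H0".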

Diamond Chart: Weight vs MPG

library(ggplot2)

# Scatter plot of mpg against weight; shape = 23 draws filled diamonds
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(shape = 23, size = 4, fill = "#8C1D40", color = "black") +
  labs(title = "Diamond Chart: Weight vs MPG",
       x = "Weight (1000 lbs)", y = "Miles per Gallon") +
  theme_minimal(base_size = 6)

# Heavier cars tend to have lower mpg, which may explain part of the spread in mpg across the sample.
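
The pattern in the plot can also be summarized numerically. The sketch below uses the Pearson correlation, which is not part of the original slides and is added here only as one possible way to quantify the trend.

```r
# Pearson correlation between weight and mpg; a negative value means
# that mpg tends to fall as weight rises
data(mtcars)

r_wt_mpg <- cor(mtcars$wt, mtcars$mpg)
r_wt_mpg
```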

3D Scatter Plot: Weight, MPG, and Horsepower

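library(plotly)  # provides plot_ly(), layout(), and the %>% pipe

# Interactive 3D scatter plot; marker color encodes horsepower
plot_ly(mtcars, x = ~wt, y = ~mpg, z = ~hp,
        type = 'scatter3d', mode = 'markers',
        marker = list(size = 5, color = mtcars$hp, colorscale = 'Viridis')) %>%
  layout(title = '3D Relationship: Weight, MPG, and Horsepower')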

Confidence Interval for the Mean

\[\bar{x} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}}\]
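
As a sketch, the interval can be computed from this formula in base R, assuming a 95 percent confidence level (\(\alpha = 0.05\)). The names `t_crit`, `margin`, and `ci` are illustrative.

```r
# 95 percent confidence interval for the mean of mpg, from the formula
data(mtcars)

alpha <- 0.05
x_bar <- mean(mtcars$mpg)
s     <- sd(mtcars$mpg)
n     <- length(mtcars$mpg)

t_crit <- qt(1 - alpha / 2, df = n - 1)   # critical value t_{alpha/2, n-1}
margin <- t_crit * s / sqrt(n)            # margin of error

ci <- c(lower = x_bar - margin, upper = x_bar + margin)
ci
```

The bounds should match the interval reported in the t-test output, 17.91768 to 22.26357, which contains the hypothesized value of 20.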

Conclusion

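  • The t-test gives t = 0.085 with 31 degrees of freedom and a p-value of 0.9328, so we fail to reject \(H_0\): the sample mean of 20.09 mpg does not differ significantly from 20 mpg, and the 95% confidence interval (17.92, 22.26) contains 20.
  • The weight vs. mpg scatter plot shows that heavier cars tend to have lower mpg.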
  • Combining statistical testing and visualization gives deeper insight into vehicle efficiency.