Motor Trend Magazine commissioned an investigation of vehicle data to determine answers to the following questions.
“Is an automatic or manual transmission better for MPG?”
“Quantify the MPG difference between automatic and manual transmissions”
The data does not provide statistically significant evidence that transmission type on its own affects MPG.
A difference of 7.2 mpg is observed between the average manual and automatic transmission vehicles in the data assessed, with manual vehicles averaging higher. Other factors were identified as having more impact on fuel economy than transmission type.
Data from the mtcars dataset in the datasets package in R was used.
https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/mtcars.html
The mtcars dataset documentation provides the following explanation of the data.
[, 1] mpg Miles/(US) gallon
[, 2] cyl Number of cylinders
[, 3] disp Displacement (cu.in.)
[, 4] hp Gross horsepower
[, 5] drat Rear axle ratio
[, 6] wt Weight (lb/1000)
[, 7] qsec 1/4 mile time
[, 8] vs V/S
[, 9] am Transmission (0 = automatic, 1 = manual)
[,10] gear Number of forward gears
[,11] carb Number of carburetors
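The code in this report compares am against the labels "Automatic" and "Manual", and the model output contains terms such as cyl6, gear4 and amManual, which suggests that the categorical columns were converted to factors before the analysis. That step is not shown, so the following is only a sketch of what it may have looked like; the choice of columns and the label names are assumptions inferred from the output.

```r
# Assumed preprocessing (not shown in the original report): treat the
# categorical columns as factors so that later code and model output match.
data(mtcars)
mtcars$am   <- factor(mtcars$am, levels = c(0, 1),
                      labels = c("Automatic", "Manual"))
mtcars$cyl  <- factor(mtcars$cyl)
mtcars$vs   <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)
mtcars$carb <- factor(mtcars$carb)

# Number of cars for each transmission type
table(mtcars$am)
```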
A boxplot of fuel economy vs transmission type shows a visible difference between Automatic and Manual transmission. [Appendix fig 1]
Specifically, the mpg for Manual transmissions plots higher than the mpg for Automatic transmissions.
Calculating the mean mpg of automatic and manual cars confirms a difference in fuel economy.
mean(subset(mtcars, mtcars$am=="Automatic")$mpg)
## [1] 17.14737
mean(subset(mtcars, mtcars$am=="Manual")$mpg)
## [1] 24.39231
This comparison can be misleading, since the vehicles assessed differ in engine size, weight and a range of other factors that affect fuel economy.
For example, a comparison of the average weights (in units of 1000 lb) of Manual and Automatic vehicles shows automatic vehicles are on average heavier.
mean(subset(mtcars, mtcars$am=="Manual")$wt)
## [1] 2.411
mean(subset(mtcars, mtcars$am=="Automatic")$wt)
## [1] 3.768895
Similarly, a comparison of the mean engine sizes (displacement in cu.in.) of Manual and Automatic vehicles reveals that Automatic vehicles have larger engines.
mean(subset(mtcars, mtcars$am=="Automatic")$disp)
## [1] 290.3789
mean(subset(mtcars, mtcars$am=="Manual")$disp)
## [1] 143.5308
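The three comparisons above can also be computed in a single call. This is a minimal sketch, assuming am is recoded as a factor with the labels used in this report; the names cars and group_means are illustrative and do not appear in the original analysis.

```r
# Work on a copy; 'cars' is an illustrative name, not from the original report
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("Automatic", "Manual"))

# Mean mpg, weight and displacement for each transmission type
group_means <- aggregate(cbind(mpg, wt, disp) ~ am, data = cars, FUN = mean)
print(group_means)
```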
The correlations examined below show that vehicle weight and engine size are strongly associated with fuel economy.
We now explore statistical measures to identify whether transmission type has a significant impact on fuel economy.
The dataset parameters were then plotted against each other to show graphically which are likely to be good predictors of fuel economy. [Appendix fig 2]
Parameter pairs whose plots showed high scatter were discounted; pairs whose plots indicated high correlation were shortlisted for consideration.
Parameter pairs with a visually obvious relationship to fuel economy include mpg-cyl, mpg-disp, mpg-hp and mpg-wt. The other parameter pair plots showed visible scatter.
Evaluating the correlation of each parameter with mpg showed which pairs are strongly correlated and whether each correlation is positive or negative. [Appendix fig 3]
Positive correlation parameter pairs : mpg-‘1/4 mile time’, mpg-vs, mpg-‘manual transmission’ (am is coded 1 for manual), mpg-‘number of gears’, mpg-‘rear axle ratio’.
Negative correlation parameter pairs : mpg-‘number of cylinders’, mpg-‘engine displacement’, mpg-‘gross horsepower’, mpg-‘vehicle weight’, mpg-‘number of carburetors’.
Of more statistical interest was the size of the correlations.
Parameter pairs with low correlation (i.e. magnitude less than 0.5) were identified : mpg-‘1/4 mile time’, mpg-‘number of gears’.
Parameter pairs with the strongest correlation (magnitude above 0.8) were identified: mpg-‘number of cylinders’, mpg-‘engine displacement’ and mpg-‘weight’.
Interestingly, the parameter pair mpg-‘transmission type’ returned a correlation of 0.5998. The question remains whether this parameter pair has a statistically significant correlation.
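One way to address that question is a correlation test. The sketch below uses cor.test from the stats package on the raw 0/1 coding of am; that coding, and the names cars and ct, are assumptions made for illustration. The test reports a p-value and a confidence interval for the correlation.

```r
# Pearson correlation test between mpg and transmission type
# (am: 0 = automatic, 1 = manual, as in the raw dataset)
cars <- datasets::mtcars
ct <- cor.test(cars$mpg, cars$am)

ct$estimate   # sample correlation
ct$p.value    # p-value for the null hypothesis of zero correlation
ct$conf.int   # 95 percent confidence interval for the correlation
```

Like the two-sample t-test, this assesses the unadjusted association only and does not account for weight or engine size.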
A linear model of fuel economy was then fitted using all the other parameters as predictors, giving a coefficient for each parameter.
lm(mpg ~ ., data = mtcars)$coeff
## (Intercept)        cyl6        cyl8        disp          hp        drat
## 23.87913244 -2.64869528 -0.33616298  0.03554632 -0.07050683  1.18283018
##          wt        qsec         vs1    amManual       gear4       gear5
## -4.52977584  0.36784482  1.93085054  1.21211570  1.11435494  2.52839599
##       carb2       carb3       carb4       carb6       carb8
## -0.97935432  2.99963875  1.09142288  4.47756921  7.25041126
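Among the continuous variables, Weight has the largest coefficient in magnitude (-4.53 mpg per 1000 lb). The Transmission-type coefficient (amManual) is 1.21 mpg. Motor size (disp) and Motor power (hp) have much smaller coefficients, although each coefficient is expressed in the units of its own variable, so the magnitudes are not directly comparable.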
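The standard errors and p-values of these coefficients can be inspected alongside the estimates. The following is a minimal sketch, assuming the categorical columns were coded as factors in the way the coefficient names (cyl6, vs1, amManual, gear4, carb2) suggest; the names cars and coef_table are illustrative.

```r
# Assumed factor coding, inferred from the coefficient names in the output
cars <- datasets::mtcars
for (v in c("cyl", "vs", "am", "gear", "carb")) {
  cars[[v]] <- factor(cars[[v]])
}
levels(cars$am) <- c("Automatic", "Manual")

# Fit the full model and extract the coefficient table
fit <- lm(mpg ~ ., data = cars)
coef_table <- summary(fit)$coefficients

# Estimate, standard error and p-value for every term
coef_table[, c("Estimate", "Std. Error", "Pr(>|t|)")]
```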
A t-test was then conducted to verify if transmission type has a significant effect on fuel economy.
t.test(mpg ~ am, data = mtcars)
##
## Welch Two Sample t-test
##
## data: mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.280194 -3.209684
## sample estimates:
## mean in group Automatic mean in group Manual
## 17.14737 24.39231
The null hypothesis is that transmission type has no effect on fuel economy, i.e. that the true difference in mean mpg between the two groups is zero.
The returned p-value of 0.001374 is less than the 0.05 significance level, so the observed data are inconsistent with the null hypothesis, which is rejected: taken on its own, the difference in mean mpg between automatic and manual vehicles is statistically significant.
Appendix Figure 4 presents a histogram of the residuals from the model fitted against all variables.
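Because the t-test compares the two groups without adjusting for vehicle weight and engine size, which also differ between them, the data does not by itself support the hypothesis that transmission type has a significant impact on fuel economy.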
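A direct way to check this conclusion is to fit a model that includes transmission type together with the confounding variables named earlier, and then inspect the am term. The following is a sketch rather than the analysis performed in this report; the choice of wt and disp as adjustment variables is an assumption based on the comparisons of group means, and the object names are illustrative.

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("Automatic", "Manual"))

# Unadjusted model: the am coefficient equals the difference in group means
fit_unadjusted <- lm(mpg ~ am, data = cars)

# Adjusted model: transmission effect with weight and displacement held constant
fit_adjusted <- lm(mpg ~ am + wt + disp, data = cars)

# Estimate, standard error, t value and p-value for the am term in each model
summary(fit_unadjusted)$coefficients["amManual", ]
summary(fit_adjusted)$coefficients["amManual", ]

# F-test: do wt and disp add explanatory power beyond am alone?
anova(fit_unadjusted, fit_adjusted)
```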
Figure 1
boxplot(mpg ~ am, data = mtcars,
        xlab = "Type of Transmission", ylab = "Miles per gallon (mpg)",
        main = "Fuel Economy vs Type of Transmission", col = c("red", "yellow"),
        names = c("Auto", "Manual"))
Figure 2
pairs(mtcars, panel = panel.smooth, main = "mtcars data - variable comparison")
Figure 3
cov2cor(cov(sapply(mtcars, as.numeric)))[1,]
##        mpg        cyl       disp         hp       drat         wt
##  1.0000000 -0.8521620 -0.8475514 -0.7761684  0.6811719 -0.8676594
##       qsec         vs         am       gear       carb
##  0.4186840  0.6640389  0.5998324  0.4802848 -0.6067431
Figure 4
fit <- lm(mpg~., data=mtcars)
hist(resid(fit))
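A histogram shows the overall shape of the residuals. A plot of residuals against fitted values and a normal Q-Q plot are common companions to it. The sketch below assumes the same factor coding of the categorical columns as inferred from the model output; it is an illustration, not part of the original appendix.

```r
# Assumed factor coding of the categorical columns
cars <- datasets::mtcars
for (v in c("cyl", "vs", "am", "gear", "carb")) {
  cars[[v]] <- factor(cars[[v]])
}
fit <- lm(mpg ~ ., data = cars)

# Two diagnostic plots side by side
par(mfrow = c(1, 2))
plot(fitted(fit), resid(fit),
     xlab = "Fitted mpg", ylab = "Residual", main = "Residuals vs fitted")
abline(h = 0, lty = 2)   # reference line at zero residual
qqnorm(resid(fit))       # residual quantiles against normal quantiles
qqline(resid(fit))
```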