Executive Summary

Motor Trend Magazine commissioned an investigation of vehicle data to address the following two points.
“Is an automatic or manual transmission better for MPG?”
“Quantify the MPG difference between automatic and manual transmissions”
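In the data assessed, manual transmission vehicles average 7.2 mpg more than automatic transmission vehicles (24.4 mpg vs 17.1 mpg), and a t-test finds this unadjusted difference statistically significant (p = 0.0014).
However, the manual vehicles in the data are also lighter and have smaller engines. With all other variables included in a linear model, the estimated transmission effect falls to about 1.2 mpg, so the data does not provide clear evidence that transmission type by itself affects MPG. Other factors, notably vehicle weight, were identified as having more impact on fuel economy than transmission type.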

Data Summary

Data from the mtcars dataset in the datasets package in R was used.
https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/mtcars.html
The mtcars dataset documentation provides the following explanation of the data.
[, 1] mpg Miles/(US) gallon
[, 2] cyl Number of cylinders
[, 3] disp Displacement (cu.in.)
[, 4] hp Gross horsepower
[, 5] drat Rear axle ratio
[, 6] wt Weight (lb/1000)
[, 7] qsec 1/4 mile time
[, 8] vs V/S
[, 9] am Transmission (0 = automatic, 1 = manual)
[,10] gear Number of forward gears
[,11] carb Number of carburetors
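
The code and output later in this report use labels such as "Automatic", "Manual", cyl6 and gear4, which suggests that the categorical columns were converted to factors before the analysis. That step is not shown, so the following is only a sketch of what it might have looked like. The choice of columns and the label strings are assumptions inferred from the output.

```r
# Assumed preprocessing (not shown in the original analysis).
data(mtcars)

# Label transmission type: the dataset codes 0 = automatic, 1 = manual.
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Treat the remaining discrete columns as categorical.
for (v in c("cyl", "vs", "gear", "carb")) {
  mtcars[[v]] <- factor(mtcars[[v]])
}

table(mtcars$am)  # number of cars per transmission type
```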

Exploratory Analysis of Data

A boxplot of fuel economy vs transmission type shows a visible difference between Automatic and Manual transmission. [Appendix fig 1]
Specifically the mpg for Manual transmissions plots higher than the mpg for Automatic transmissions.
Calculating the mean mpg of automatic and manual cars confirms the difference in fuel economy, which is about 7.2 mpg.

mean(subset(mtcars, mtcars$am=="Automatic")$mpg)
## [1] 17.14737
mean(subset(mtcars, mtcars$am=="Manual")$mpg)
## [1] 24.39231

Taken alone, this comparison is misleading, since the vehicles assessed differ in engine size, weight and a range of other factors that affect fuel economy.
For example, a comparison of the average weights of Manual and Automatic vehicles shows automatic vehicles are on average heavier.

mean(subset(mtcars, mtcars$am=="Manual")$wt)
## [1] 2.411
mean(subset(mtcars, mtcars$am=="Automatic")$wt)
## [1] 3.768895

Similarly, a comparison of the mean engine sizes of Manual and Automatic vehicles reveals that Automatic vehicles have larger engines.

mean(subset(mtcars, mtcars$am=="Automatic")$disp)
## [1] 290.3789
mean(subset(mtcars, mtcars$am=="Manual")$disp)
## [1] 143.5308
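
The group means above can also be produced in a single call, which makes the imbalance between the two groups easier to see side by side. This is a minimal sketch using aggregate() from base R; the am labels are an assumed recoding of the dataset's 0/1 values.

```r
data(mtcars)

# Assumed recoding: 0 = automatic, 1 = manual.
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Mean mpg, weight and displacement for each transmission type.
group_means <- aggregate(cbind(mpg, wt, disp) ~ am, data = mtcars, FUN = mean)
print(group_means)
```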

The diagnostic testing below shows that vehicle weight and engine size are both strongly correlated with fuel economy.
We now explore statistical measures to identify whether transmission type has a significant impact on fuel economy.

The dataset parameters were then plotted against each other to show graphically which variables are likely to be good predictors of fuel economy. [Appendix fig 2]
Parameter pairs whose plots showed high scatter were discounted; pairs whose plots indicated high correlation were shortlisted for consideration.
Parameter pairs with visually obvious correlation to fuel economy include mpg-cyl, mpg-disp, mpg-hp and mpg-wt. The other parameter pair plots showed visible scatter.

Diagnostic Testing

Evaluating the correlation of each parameter with mpg revealed a small number of pairs with high levels of correlation, and showed which correlations were positive and which were negative. [Appendix fig 3]
Positive correlation parameter pairs : mpg-‘1/4 mile time’, mpg-vs, mpg-‘manual transmission’, mpg-‘number of gears’, mpg-‘rear axle ratio’.
Negative correlation parameter pairs : mpg-‘number of cylinders’, mpg-‘engine displacement’, mpg-‘gross horsepower’, mpg-‘vehicle weight’, mpg-‘number of carburetors’.
Of more statistical interest was the size of the correlations.
Parameter pairs with low correlation (ie: magnitude less than 0.5) were identified : mpg-‘1/4 mile time’, mpg-‘number of gears’.
Parameter pairs with strong correlation (ie: magnitude above 0.8) were identified: mpg-‘number of cylinders’, mpg-‘engine displacement’ and mpg-‘weight’.
Interestingly, the parameter pair mpg-‘transmission type’ returned a correlation of 0.5998. The question remains whether this correlation is statistically significant, and whether it reflects transmission type itself or the other variables associated with it.
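
One way to check whether the mpg-‘transmission type’ correlation differs significantly from zero is cor.test() from base R. The sketch below uses the dataset's original 0/1 coding of am. A significant result here would say nothing about whether weight or engine size is responsible for the association.

```r
data(mtcars)  # am is the original 0/1 coding here

# Pearson correlation between mpg and transmission type, with a test
# of the null hypothesis that the true correlation is zero.
ct <- cor.test(mtcars$mpg, mtcars$am)

ct$estimate  # sample correlation
ct$p.value   # p-value of the test
ct$conf.int  # 95 percent confidence interval for the correlation
```
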
A linear model of fuel economy was then fitted using all the other parameters as predictors, giving a coefficient for each.

coef(lm(mpg ~ ., data = mtcars))
## (Intercept)        cyl6        cyl8        disp          hp        drat 
## 23.87913244 -2.64869528 -0.33616298  0.03554632 -0.07050683  1.18283018 
##          wt        qsec         vs1    amManual       gear4       gear5 
## -4.52977584  0.36784482  1.93085054  1.21211570  1.11435494  2.52839599 
##       carb2       carb3       carb4       carb6       carb8 
## -0.97935432  2.99963875  1.09142288  4.47756921  7.25041126

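The largest coefficients are those for the carburetor levels carb8 and carb6 and for Weight: each additional 1000 lb is associated with about 4.5 mpg lower fuel economy. The Transmission-type coefficient shows manual vehicles achieving about 1.2 mpg more than automatic ones with the other variables held constant, far less than the raw difference of 7.2 mpg. Motor size and Motor power had much smaller coefficient values, but these are expressed per cubic inch and per horsepower, so they are not directly comparable with the other coefficients.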
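
To see whether transmission type still adds explanatory power once weight and engine size are accounted for, two nested models can be compared. This is a sketch only: the choice of wt and disp as adjustment variables and the names fit_base and fit_am are assumptions made for illustration, not part of the original analysis.

```r
data(mtcars)

# Assumed recoding: 0 = automatic, 1 = manual.
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Model without and with transmission type, adjusting for weight
# and displacement in both.
fit_base <- lm(mpg ~ wt + disp, data = mtcars)
fit_am   <- lm(mpg ~ wt + disp + am, data = mtcars)

# Estimate, standard error, t value and p-value for transmission type.
coef(summary(fit_am))["amManual", ]

# F-test of whether adding am improves on the base model.
cmp <- anova(fit_base, fit_am)
print(cmp)
```

A small p-value in the anova table would indicate that transmission type explains variation in mpg beyond what weight and displacement already explain; a large one would be consistent with the conclusion of this report.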

Hypothesis testing

A t-test was then conducted to verify if transmission type has a significant effect on fuel economy.

t.test(mpg ~ am, data = mtcars)  
## 
##  Welch Two Sample t-test
## 
## data:  mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean in group Automatic    mean in group Manual 
##                17.14737                24.39231

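The null hypothesis is that transmission type has no effect on fuel economy, ie: that the mean mpg of automatic and manual vehicles is the same.
The returned p-value of 0.001374 is less than the 0.05 significance level, so the observed data are inconsistent with the null hypothesis and it is rejected. The 7.2 mpg difference in group means is statistically significant, with a 95 percent confidence interval of 3.2 to 11.3 mpg in favour of manual transmissions. Note that this test compares the two groups as they are and does not adjust for the differences in weight and engine size between them.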
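
For reference, the Welch statistic reported above can be reproduced by hand from the group means, variances and sizes. The sketch below uses the dataset's original 0/1 coding of am; the variable names are illustrative.

```r
data(mtcars)  # am: 0 = automatic, 1 = manual

auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]

# Per-group variance of the mean.
va <- var(auto) / length(auto)
vm <- var(manual) / length(manual)

# Welch t statistic: difference in means over its standard error.
t_stat <- (mean(auto) - mean(manual)) / sqrt(va + vm)

# Welch-Satterthwaite approximation to the degrees of freedom.
df <- (va + vm)^2 /
  (va^2 / (length(auto) - 1) + vm^2 / (length(manual) - 1))

# Two-sided p-value.
p_value <- 2 * pt(-abs(t_stat), df)

c(t = t_stat, df = df, p = p_value)
```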

Appendix Figure 4 presents a histogram of the residuals from the linear model fitted on all variables.
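
A histogram gives only a rough impression of whether the residuals look normal. As a complementary check, a Shapiro-Wilk test could be run on the same residuals. The sketch below assumes the factor recoding implied by the coefficient names in this report; it is not part of the original analysis.

```r
data(mtcars)

# Assumed recoding, inferred from the coefficient names in the model output.
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))
for (v in c("cyl", "vs", "gear", "carb")) {
  mtcars[[v]] <- factor(mtcars[[v]])
}

# Model on all variables, as used for the residual histogram.
fit <- lm(mpg ~ ., data = mtcars)
res <- resid(fit)

# Shapiro-Wilk test: the null hypothesis is that the residuals
# are drawn from a normal distribution.
sw <- shapiro.test(res)
print(sw)
```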

Conclusion

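Taken alone, the t-test shows that the manual transmission vehicles in this data have significantly higher fuel economy than the automatic ones. However, the manual vehicles are also lighter and have smaller engines, and once the other variables are included in the linear model the estimated effect of transmission type falls from 7.2 mpg to about 1.2 mpg. The data therefore does not support the hypothesis that transmission type by itself has a significant impact on fuel economy.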

Appendix

Figure 1

boxplot(mpg ~ am, data = mtcars,  
        xlab = "Type of Transmission", ylab = "Miles per gallon (mpg)",  
        main = "Fuel Economy vs Type of Transmission", col = c("red", "yellow"),   
        names = c("Auto", "Manual"))  

Fig 1

Figure 2

pairs(mtcars, panel = panel.smooth, main = "mtcars data - variable comparison")

Fig 2

Figure 3

cor(sapply(mtcars, as.numeric))[1, ]
##        mpg        cyl       disp         hp       drat         wt 
##  1.0000000 -0.8521620 -0.8475514 -0.7761684  0.6811719 -0.8676594 
##       qsec         vs         am       gear       carb 
##  0.4186840  0.6640389  0.5998324  0.4802848 -0.6067431

Figure 4

fit <- lm(mpg~., data=mtcars)
hist(resid(fit))

Fig 4