Automatic or Manual transmission? Which one should you get?

Executive summary

This study attempts to determine whether automatic transmission cars are less fuel-efficient than manual transmission cars.

In this study, we apply a parametric test (t-test) and regression analysis to analyze this question.

Results of the statistical tests show that, without controlling for other car features, manual cars are more fuel-efficient than automatic cars.

On average, manual cars run about 7 miles more per gallon than automatic cars. However, regression analysis indicates that once other car features such as displacement, rear axle ratio and car weight are taken into account, manual transmission cars are no longer significantly better for MPG than automatic cars.

Section I. Data processing and summary statistics

The mtcars dataset for this study is from the 1974 “Motor Trend US” magazine. It consists of fuel consumption measurements (mpg) and 10 other variables for 32 automobiles (1973 to 1974 models).

#Information about the cars dataset.
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
summary(mtcars)
##       mpg            cyl            disp             hp       
##  Min.   :10.4   Min.   :4.00   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.4   1st Qu.:4.00   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.2   Median :6.00   Median :196.3   Median :123.0  
##  Mean   :20.1   Mean   :6.19   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.8   3rd Qu.:8.00   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.9   Max.   :8.00   Max.   :472.0   Max.   :335.0  
##       drat            wt            qsec            vs       
##  Min.   :2.76   Min.   :1.51   Min.   :14.5   Min.   :0.000  
##  1st Qu.:3.08   1st Qu.:2.58   1st Qu.:16.9   1st Qu.:0.000  
##  Median :3.69   Median :3.33   Median :17.7   Median :0.000  
##  Mean   :3.60   Mean   :3.22   Mean   :17.8   Mean   :0.438  
##  3rd Qu.:3.92   3rd Qu.:3.61   3rd Qu.:18.9   3rd Qu.:1.000  
##  Max.   :4.93   Max.   :5.42   Max.   :22.9   Max.   :1.000  
##        am             gear           carb     
##  Min.   :0.000   Min.   :3.00   Min.   :1.00  
##  1st Qu.:0.000   1st Qu.:3.00   1st Qu.:2.00  
##  Median :0.000   Median :4.00   Median :2.00  
##  Mean   :0.406   Mean   :3.69   Mean   :2.81  
##  3rd Qu.:1.000   3rd Qu.:4.00   3rd Qu.:4.00  
##  Max.   :1.000   Max.   :5.00   Max.   :8.00
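
The am column is stored as 0/1, which is why later output refers to "group 0" and "group 1". In mtcars the coding is 0 = automatic, 1 = manual (see ?mtcars). As a minimal sketch, the labels can be attached explicitly; the names cars_labelled, transmission and n_by_type are illustrative and not part of the original analysis.

```r
# Work on a copy so the built-in dataset is left untouched
cars_labelled <- mtcars

# am is coded 0 = automatic, 1 = manual in mtcars (see ?mtcars)
cars_labelled$transmission <- factor(cars_labelled$am,
                                     levels = c(0, 1),
                                     labels = c("automatic", "manual"))

# Number of cars in each transmission group
n_by_type <- table(cars_labelled$transmission)
print(n_by_type)
```

The two groups are unequal in size (19 automatic, 13 manual), which is worth keeping in mind when comparing their means.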

Section II. Statistical Analysis

#Mean and standard deviation of mpg by transmission type (am: 0 = automatic, 1 = manual)
aggmpg <- tapply(mtcars$mpg, mtcars$am, mean, na.rm = TRUE)

sdmpg <- tapply(mtcars$mpg, mtcars$am, sd, na.rm = TRUE)

aggam <- c("automatic", "manual")

#Bar plot of the group means; the segments() calls that follow add error bars of one standard deviation
barmpg <- barplot(aggmpg, names.arg = aggam, ylim = c(0, 35), main = "Average Miles per Gallon by Transmission Type", 
    space = 0.4, axes = TRUE, axis.lty = 10, col = "white", xlab = "Transmission Type", 
    ylab = "MPG")

box()

segments(barmpg, aggmpg - sdmpg, barmpg, aggmpg + sdmpg, lwd = 3)

segments(barmpg - 0.05, aggmpg - sdmpg, barmpg + 0.05, aggmpg - sdmpg, lwd = 2)

segments(barmpg - 0.05, aggmpg + sdmpg, barmpg + 0.05, aggmpg + sdmpg, lwd = 2)

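[Figure: bar plot "Average Miles per Gallon by Transmission Type", with error bars of one standard deviation around each group mean]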

A Welch two-sample t-test compares mpg between automatic and manual transmission cars. The results show that manual transmission cars are more fuel-efficient than automatic cars: 24.39 vs 17.15 mpg on average, with p = 0.0014.

t.test(mpg ~ factor(am), data = mtcars)
## 
##  Welch Two Sample t-test
## 
## data:  mpg by factor(am)
## t = -3.767, df = 18.33, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.28  -3.21
## sample estimates:
## mean in group 0 mean in group 1 
##           17.15           24.39
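
To make the test statistic less of a black box, the sketch below recomputes the Welch t statistic, the Welch-Satterthwaite degrees of freedom and the two-sided p-value by hand, then compares them with what t.test() returns. The variable names (mpg_auto, se2_auto, t_welch and so on) are illustrative only.

```r
# Split mpg by transmission (0 = automatic, 1 = manual)
mpg_auto <- mtcars$mpg[mtcars$am == 0]
mpg_man  <- mtcars$mpg[mtcars$am == 1]

# Squared standard error of each group mean
se2_auto <- var(mpg_auto) / length(mpg_auto)
se2_man  <- var(mpg_man)  / length(mpg_man)

# Welch t statistic: group 0 minus group 1, matching t.test()'s ordering
t_welch <- (mean(mpg_auto) - mean(mpg_man)) / sqrt(se2_auto + se2_man)

# Welch-Satterthwaite approximate degrees of freedom
df_welch <- (se2_auto + se2_man)^2 /
  (se2_auto^2 / (length(mpg_auto) - 1) + se2_man^2 / (length(mpg_man) - 1))

# Two-sided p-value from the t distribution
p_welch <- 2 * pt(-abs(t_welch), df = df_welch)

# Reference values from the built-in test
tt <- t.test(mpg ~ factor(am), data = mtcars)
print(c(t = t_welch, df = df_welch, p = p_welch))
```

The negative sign of t only reflects the ordering (automatic minus manual); the confidence interval of -11.28 to -3.21 is for that same difference.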

Section III. Regression Analysis

The t-test considers mpg and am only, without controlling for the influence of other variables. In a multiple regression, the marginal effect of manual versus automatic transmission is no longer significant. Potential confounding variables include displacement (disp), rear axle ratio (drat) and car weight (wt). Using car weight as an example:

fit0 <- lm(mpg ~ factor(am), data = mtcars)
summary(fit0)
## 
## Call:
## lm(formula = mpg ~ factor(am), data = mtcars)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.392 -3.092 -0.297  3.244  9.508 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    17.15       1.12   15.25  1.1e-15 ***
## factor(am)1     7.24       1.76    4.11  0.00029 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.9 on 30 degrees of freedom
## Multiple R-squared:  0.36,   Adjusted R-squared:  0.338 
## F-statistic: 16.9 on 1 and 30 DF,  p-value: 0.000285
fit1 <- lm(mpg ~ factor(am) + wt, data = mtcars)
summary(fit1)
## 
## Call:
## lm(formula = mpg ~ factor(am) + wt, data = mtcars)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.530 -2.362 -0.132  1.403  6.878 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.3216     3.0546   12.22  5.8e-13 ***
## factor(am)1  -0.0236     1.5456   -0.02     0.99    
## wt           -5.3528     0.7882   -6.79  1.9e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.1 on 29 degrees of freedom
## Multiple R-squared:  0.753,  Adjusted R-squared:  0.736 
## F-statistic: 44.2 on 2 and 29 DF,  p-value: 1.58e-09
#Holding weight constant, the estimated difference between manual and automatic cars is -0.02 mpg (p = 0.99): manual cars are no longer estimated to be more fuel-efficient, and the difference is not statistically significant.
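
Weight can only absorb the transmission effect if it is related to both transmission type and mpg. A quick check, written as a sketch with illustrative variable names (wt_by_am, r_wt_am, r_wt_mpg):

```r
# Mean weight (in 1000 lbs) by transmission: 0 = automatic, 1 = manual
wt_by_am <- tapply(mtcars$wt, mtcars$am, mean)

# Correlation between weight and the 0/1 transmission indicator
r_wt_am <- cor(mtcars$wt, mtcars$am)

# Correlation between weight and fuel efficiency
r_wt_mpg <- cor(mtcars$wt, mtcars$mpg)

print(round(wt_by_am, 2))
print(round(c(wt_vs_am = r_wt_am, wt_vs_mpg = r_wt_mpg), 2))
```

In this dataset the manual cars are lighter on average, and heavier cars get fewer miles per gallon, so the raw gap between transmission types partly reflects a gap in weight.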

anova(fit0, fit1)
## Analysis of Variance Table
## 
## Model 1: mpg ~ factor(am)
## Model 2: mpg ~ factor(am) + wt
##   Res.Df RSS Df Sum of Sq    F  Pr(>F)    
## 1     30 721                              
## 2     29 278  1       443 46.1 1.9e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
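
The analysis names displacement and rear axle ratio as confounders alongside weight, but only the weight model is fitted here. As a hedged sketch of how the comparison might be extended, the block below adds all three covariates. fit2, coef_table and cmp are hypothetical names that do not appear in the original analysis, and no results are claimed for them.

```r
# Models from the analysis, refitted so this block runs on its own
fit0 <- lm(mpg ~ factor(am), data = mtcars)
fit1 <- lm(mpg ~ factor(am) + wt, data = mtcars)

# Hypothetical extension: add displacement and rear axle ratio
fit2 <- lm(mpg ~ factor(am) + wt + disp + drat, data = mtcars)

# Coefficient table: estimate, standard error, t value and p-value per term
coef_table <- summary(fit2)$coefficients
print(round(coef_table, 4))

# Sequential F-tests across the three nested models
cmp <- anova(fit0, fit1, fit2)
print(cmp)
```

The row labelled factor(am)1 in the coefficient table is the one to read for the transmission effect after adjustment.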
#A plot of the residuals suggests that the errors scatter around zero; the normal QQ plot that follows checks whether they are approximately normally distributed.

plot(residuals(fit1), main = "Residuals of fit1 (mpg ~ factor(am) + wt)")

[Figure: residuals of fit1 plotted against observation index]

qqnorm(residuals(fit1))
qqline(residuals(fit1))

[Figure: normal QQ plot of the residuals of fit1]

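#Diagnostics (dfbetas and hatvalues) indicate that no single observation dominates the fit: every |dfbetas| value for the transmission coefficient is below 0.5 and every hat value is below 0.25, so the coefficient and the predicted response change little no matter which point is removed.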

round(dfbetas(fit1)[1:32, 2], 4)
##           Mazda RX4       Mazda RX4 Wag          Datsun 710 
##             -0.1512             -0.0749             -0.0996 
##      Hornet 4 Drive   Hornet Sportabout             Valiant 
##             -0.0897              0.0115              0.0378 
##          Duster 360           Merc 240D            Merc 230 
##              0.1902             -0.3057             -0.1745 
##            Merc 280           Merc 280C          Merc 450SE 
##             -0.0161              0.0612             -0.0153 
##          Merc 450SL         Merc 450SLC  Cadillac Fleetwood 
##              0.0021              0.0659              0.0725 
## Lincoln Continental   Chrysler Imperial            Fiat 128 
##              0.1637              0.4549              0.3072 
##         Honda Civic      Toyota Corolla       Toyota Corona 
##              0.0088              0.1303              0.3386 
##    Dodge Challenger         AMC Javelin          Camaro Z28 
##              0.1522              0.2128              0.1105 
##    Pontiac Firebird           Fiat X1-9       Porsche 914-2 
##             -0.0768              0.0088              0.0058 
##        Lotus Europa      Ford Pantera L        Ferrari Dino 
##             -0.0018             -0.4877             -0.2121 
##       Maserati Bora          Volvo 142E 
##             -0.4433             -0.0775
round(hatvalues(fit1)[1:32], 4)
##           Mazda RX4       Mazda RX4 Wag          Datsun 710 
##              0.0798              0.0909              0.0775 
##      Hornet 4 Drive   Hornet Sportabout             Valiant 
##              0.0725              0.0596              0.0588 
##          Duster 360           Merc 240D            Merc 230 
##              0.0552              0.0743              0.0774 
##            Merc 280           Merc 280C          Merc 450SE 
##              0.0596              0.0596              0.0585 
##          Merc 450SL         Merc 450SLC  Cadillac Fleetwood 
##              0.0527              0.0526              0.1947 
## Lincoln Continental   Chrysler Imperial            Fiat 128 
##              0.2300              0.2135              0.0798 
##         Honda Civic      Toyota Corolla       Toyota Corona 
##              0.1179              0.0984              0.1627 
##    Dodge Challenger         AMC Javelin          Camaro Z28 
##              0.0566              0.0598              0.0530 
##    Pontiac Firebird           Fiat X1-9       Porsche 914-2 
##              0.0530              0.0916              0.0817 
##        Lotus Europa      Ford Pantera L        Ferrari Dino 
##              0.1291              0.1142              0.0853 
##       Maserati Bora          Volvo 142E 
##              0.1639              0.0857
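
How these diagnostics are read depends on the cutoff chosen. As a sketch, assuming the commonly quoted rule-of-thumb cutoffs of 2/sqrt(n) for dfbetas and 2p/n for hat values (conventions, not formal tests, and not part of the original analysis), the cars that stand out can be listed directly. The names dfbetas_cut, hat_cut, flag_dfbetas and flag_hat are illustrative.

```r
# Refit the weight-adjusted model so this block runs on its own
fit1 <- lm(mpg ~ factor(am) + wt, data = mtcars)

n <- nrow(mtcars)        # number of observations
p <- length(coef(fit1))  # number of estimated coefficients

# Rule-of-thumb cutoffs (assumed conventions, not formal tests)
dfbetas_cut <- 2 / sqrt(n)
hat_cut     <- 2 * p / n

# dfbetas for the transmission coefficient (second column)
db_am <- dfbetas(fit1)[, 2]
flag_dfbetas <- names(db_am)[abs(db_am) > dfbetas_cut]

# Leverage (hat values) of each car
hv <- hatvalues(fit1)
flag_hat <- names(hv)[hv > hat_cut]

print(flag_dfbetas)
print(flag_hat)
```

Against the values printed above, these cutoffs single out Chrysler Imperial, Ford Pantera L and Maserati Bora on dfbetas, and Cadillac Fleetwood, Lincoln Continental and Chrysler Imperial on leverage. Whether that matters is a judgment call, since the cutoffs are only heuristics.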