Executive Summary

The type of transmission does not significantly affect fuel economy (mpg) once vehicle weight and horsepower are taken into account. A simple analysis shows that manual transmission cars average 3.6 to 10.8 mpg better fuel economy than automatics (95% confidence interval, p < 0.05). However, a more complete multivariate analysis shows that about 87% of the variance in fuel economy (adjusted R^2 = 0.87) can be explained by vehicle weight, horsepower, and their interaction. Adding transmission to this model does not improve it: the transmission coefficient is not significantly different from zero (p = 0.93).

Analysis

A boxplot of transmission versus fuel efficiency shows a difference in the median fuel economy and no overlap in the interquartile ranges of the two types of transmission. Using linear regression, we can model fuel efficiency (in mpg) as a function of transmission (am), which can be either automatic (0) or manual (1). We include an intercept in the model because we are interested in the difference between the two groups: the intercept is the mean mpg of the automatic cars, and the slope coefficient is the change in expected mpg when going from an automatic to a manual transmission. Constructing a 95% confidence interval (with 30 degrees of freedom) around the slope coefficient gives a difference of 3.6 to 10.8 mpg between the means of the two groups.
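The single-predictor model and its confidence interval can be sketched as follows. This is a minimal reconstruction that assumes the built-in mtcars data set and the factor(am) coding shown in the Figures; the object names (fit_am, coef_am, ci_am) are illustrative and not taken from the original analysis.

```r
# Simple model: mpg as a function of transmission (0 = automatic, 1 = manual).
data(mtcars)
fit_am <- lm(mpg ~ factor(am), data = mtcars)

# Coefficient table: the intercept is the automatic-group mean,
# the factor(am)1 row is the manual-minus-automatic difference.
coef_am <- summary(fit_am)$coefficients

# 95% confidence interval for the difference, built by hand from the
# t quantile with the model's residual degrees of freedom (32 - 2 = 30).
est <- coef_am["factor(am)1", "Estimate"]
se  <- coef_am["factor(am)1", "Std. Error"]
ci_am <- est + c(-1, 1) * qt(0.975, df = fit_am$df.residual) * se
ci_am
```

The same interval should also be available from confint(fit_am); building it by hand simply makes the role of the 30 degrees of freedom explicit.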

Running a multivariate regression using all available variables revealed high pairwise correlations and large variance inflation factors for engine displacement (disp), number of cylinders (cyl), and weight (wt). Of these three, weight is the controlling variable. A heavier car requires a larger displacement to produce more torque in order to have acceptable acceleration off the line. Larger displacement can be produced either by using larger pistons or by increasing the number of cylinders. Engines with more cylinders are typically arranged in a V rather than a straight configuration (vs). Torque can also be increased by changing the gear ratio in the rear differential (drat), which is correlated with the number of forward gears (gear) because both relate to the turning of the crankshaft.
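The variance inflation factors can be reproduced without any add-on package. The sketch below is an assumption about how the numbers could be obtained, not the original code (the warning in the Figures suggests the car package was used): it computes each VIF as 1 / (1 - R^2), where R^2 comes from regressing one predictor on all the others. The names predictors and vif_manual are illustrative.

```r
data(mtcars)

# Every column except the response is treated as a predictor.
predictors <- setdiff(names(mtcars), "mpg")

# VIF for predictor p: regress p on the remaining predictors and
# convert the resulting R^2 into 1 / (1 - R^2).
vif_manual <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = mtcars))$r.squared
  1 / (1 - r2)
})
round(vif_manual, 2)

# Pairwise correlations of weight with cylinders and displacement.
cor(mtcars$wt, mtcars[, c("cyl", "disp")])
```

Values well above 10, as for disp, cyl, and wt, are a common rule-of-thumb signal that a predictor is largely redundant with the others.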

Quarter-mile time (qsec) is determined by acceleration off the starting line and by horsepower (hp). Increasing the number of barrels in the carburetor (carb) will increase acceleration off the line. Of these three variables, horsepower is the most fundamental.
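The relationship among these three variables can be checked with a small correlation matrix. This is an illustrative sketch on the built-in mtcars data; the name cm is hypothetical.

```r
data(mtcars)

# Correlation matrix for quarter-mile time, horsepower, and carburetor barrels.
cm <- cor(mtcars[, c("qsec", "hp", "carb")])
round(cm, 3)
```

Horsepower is correlated with both of the other variables at roughly 0.7 in absolute value, which is consistent with keeping hp as the single representative of this group.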

A model using transmission (am), weight (wt), and horsepower (hp) fits the data well; however, the summary reveals that all of the coefficients are significant at the 95% confidence level except for transmission (p = 0.14). Including an interaction term between weight and horsepower improves the R^2, but the standard error of the am coefficient (1.33) is then an order of magnitude larger than the coefficient itself (0.13). Removing am leaves a parsimonious model that explains 87% of the variance in mpg (adjusted R^2 = 0.87) with no significant outliers.
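The three candidate models can be fitted and compared as sketched below. The formulas follow the coefficient tables in the Figures; the object names and the nested F-test via anova() are additions for illustration and were not part of the original analysis.

```r
data(mtcars)

# Additive model, interaction model with am, and the final model without am.
fit_add   <- lm(mpg ~ factor(am) + wt + hp, data = mtcars)
fit_int   <- lm(mpg ~ factor(am) + wt * hp, data = mtcars)
fit_final <- lm(mpg ~ wt * hp, data = mtcars)

# Coefficient tables: check the factor(am)1 row in each model.
summary(fit_add)$coefficients
summary(fit_int)$coefficients

# Nested F-test: does adding am to the wt * hp model reduce the residual error?
cmp <- anova(fit_final, fit_int)
cmp

# Adjusted R^2 of the final, parsimonious model.
adj_r2 <- summary(fit_final)$adj.r.squared
adj_r2
```

Because the two models in the F-test differ by a single coefficient, its p-value matches the t-test p-value for factor(am)1 in the interaction model, so it offers a second view of the same conclusion rather than independent evidence.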

Figures

##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 17.147368   1.124603 15.247492 1.133983e-15
## factor(am)1  7.244939   1.764422  4.106127 2.850207e-04
## [1]  3.64151 10.84837

##       cyl      disp        hp      drat        wt      qsec        vs 
## 15.373833 21.620241  9.832037  3.374620 15.164887  7.527958  4.965873 
##        am      gear      carb 
##  4.648487  5.357452  7.908747
## [1]  0.7824958  0.8879799 -0.7124406 -0.5549157
## [1] -0.8108118  0.9020329 -0.6999381 -0.7102139  0.6996101 -0.7104159
## [7]  0.4402785
## [1] -0.7082234  0.7498125 -0.6562492
##                Estimate  Std. Error   t value     Pr(>|t|)
## (Intercept) 34.00287512 2.642659337 12.866916 2.824030e-13
## factor(am)1  2.08371013 1.376420152  1.513862 1.412682e-01
## wt          -2.87857541 0.904970538 -3.180850 3.574031e-03
## hp          -0.03747873 0.009605422 -3.901830 5.464023e-04
##                Estimate  Std. Error     t value     Pr(>|t|)
## (Intercept) 49.45224079 5.280730731  9.36465866 5.694894e-10
## factor(am)1  0.12510693 1.333430965  0.09382333 9.259423e-01
## wt          -8.10055755 1.789325217 -4.52715777 1.084926e-04
## hp          -0.11930318 0.026549992 -4.49352965 1.187315e-04
## wt:hp        0.02748826 0.008472529  3.24439879 3.130390e-03
## 
## Call:
## lm(formula = mpg ~ wt * hp, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.0632 -1.6491 -0.7362  1.4211  4.5513 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 49.80842    3.60516  13.816 5.01e-14 ***
## wt          -8.21662    1.26971  -6.471 5.20e-07 ***
## hp          -0.12010    0.02470  -4.863 4.04e-05 ***
## wt:hp        0.02785    0.00742   3.753 0.000811 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.153 on 28 degrees of freedom
## Multiple R-squared:  0.8848, Adjusted R-squared:  0.8724 
## F-statistic: 71.66 on 3 and 28 DF,  p-value: 2.981e-13