MPG - Effect of Transmission?

Brandon Bartell

Thursday, January 22, 2015

Executive Summary

Using the mtcars data set, we conducted exploratory analyses and identified 4 key covariates, in addition to transmission type, with which to predict fuel economy in miles per gallon (mpg). We fit a linear model on these 5 variables and concluded that, for cars in this data set, manual transmissions offer better fuel economy by 4.32 +/- 1.57 mpg (estimate +/- standard error), holding the other covariates fixed.

Is an automatic or manual transmission better for MPG?

What is the quantitative difference in MPG between automatic and manual transmissions?

We performed our MPG analysis on the mtcars data set, taken from the 1974 Motor Trend US magazine. This data set consists of 11 features of automobile design and performance for 32 distinct 1973-1974 model automobiles. The 11 features are miles per gallon (mpg), number of cylinders, engine displacement in cubic inches, gross horsepower, rear axle ratio, weight in 1000 lbs, ¼ mile time, engine type (V-shaped or straight), transmission type (automatic or manual), number of forward gears, and number of carburetors.

Initially, in order to understand the data, we did some exploratory analysis by plotting mpg vs transmission type for all cars in the data set (this and all figures referenced henceforth can be found in the appendix). We then fit a linear model for the outcome, mpg, with only the transmission type as a predictor. The results can be seen below.

fit1<-lm(mpg~factor(am),data=mtcars)
round(fit1$coef,2)
## (Intercept) factor(am)1 
##       17.15        7.24

Based on this fit, the average fuel economy for cars with an automatic transmission is 17.15 mpg, and the estimated effect of having a manual transmission is an additional 7.24 mpg, so manual transmission cars have an average fuel economy of 24.39 mpg. The p-value for the transmission coefficient is of order 10^-4, suggesting that this difference is statistically significant. The box plots also show that the lower quartile of mpg for manual transmission cars is greater than the upper quartile for automatic transmission cars.
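
The coefficient table and a confidence interval for this single-predictor model can be pulled out directly. The sketch below is illustrative and not part of the original analysis: the object names fit_am and welch are introduced here, and the Welch t-test is an added cross-check that does not assume equal variances in the two groups.

```r
data(mtcars)

# Refit the transmission-only model (same formula as fit1).
fit_am <- lm(mpg ~ factor(am), data = mtcars)

# Full coefficient table: estimate, standard error, t value, p-value.
summary(fit_am)$coefficients

# 95% confidence interval for the manual-vs-automatic difference.
confint(fit_am, "factor(am)1", level = 0.95)

# Cross-check with a Welch two-sample t-test (no equal-variance assumption).
welch <- t.test(mpg ~ am, data = mtcars)
welch$estimate  # group means for automatic (0) and manual (1)
welch$p.value
```

The regression p-value and the Welch p-value differ slightly because the linear model pools the variance of the two groups.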

Of course, there are 9 other covariates in the data that may contribute to the difference in mpg. We quickly examined these by looking at pairwise plots of all variables against one another. These showed clear correlations between mpg and the other variables, suggesting that transmission type should not be the only variable included in the model. As such, we fit a second model for the outcome mpg as a linear combination of transmission type and all 9 other covariates, treating engine type, transmission type, cylinder number, and forward gear number as discrete factor variables.
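
As a rough numerical companion to the pairwise plots, the correlations with mpg can be ranked. This is only a screening sketch, not the method used above: it treats the 0/1 and count variables as numeric, and the name cors is introduced here for illustration.

```r
data(mtcars)

# Pearson correlation of every other variable with mpg.
cors <- cor(mtcars)["mpg", ]
cors <- cors[names(cors) != "mpg"]

# Rank covariates by the strength (absolute value) of their correlation.
round(cors[order(abs(cors), decreasing = TRUE)], 2)
```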

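We ran the anova() function on this fit, which reports sequential sums of squares: each row tests the variance in mpg explained by a covariate after accounting for the terms listed above it, so the results depend on the order of the terms. The output is displayed below.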

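# Numeric covariates are listed explicitly: `.` would re-add vs, am, cyl and gear as aliased numeric terms
fit3<-lm(mpg~factor(vs)+factor(am)+factor(cyl)+factor(gear)+disp+hp+drat+wt+qsec+carb,data=mtcars)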
round(anova(fit3),2)
## Analysis of Variance Table
## 
## Response: mpg
##              Df Sum Sq Mean Sq F value Pr(>F)    
## factor(vs)    1 496.53  496.53   72.54 <2e-16 ***
## factor(am)    1 276.03  276.03   40.33 <2e-16 ***
## factor(cyl)   2  94.59   47.30    6.91   0.01 ** 
## factor(gear)  2   6.30    3.15    0.46   0.64    
## disp          1  35.75   35.75    5.22   0.03 *  
## hp            1  57.93   57.93    8.46   0.01 ** 
## drat          1   4.66    4.66    0.68   0.42    
## wt            1  14.99   14.99    2.19   0.16    
## qsec          1   5.26    5.26    0.77   0.39    
## carb          1   3.95    3.95    0.58   0.46    
## Residuals    19 130.05    6.84                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This suggests that the statistically significant (p<0.05) covariates to include in our model are engine type, cylinder number, displacement, and horsepower in addition to transmission type, our predictor of interest. Fitting a new model with only these 5 predictors yields the following coefficients.

fit4<-lm(mpg~factor(vs)+factor(am)+factor(cyl)+I(disp-mean(disp))+I(hp-mean(hp)),data=mtcars)
summ<-round(summary(fit4)$coef,2)
summ
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)             16.95       2.99    5.67     0.00
## factor(vs)1              2.38       1.92    1.24     0.23
## factor(am)1              4.32       1.57    2.76     0.01
## factor(cyl)6            -2.09       1.82   -1.15     0.26
## factor(cyl)8             1.84       3.78    0.49     0.63
## I(disp - mean(disp))    -0.01       0.01   -1.34     0.19
## I(hp - mean(hp))        -0.04       0.01   -2.78     0.01

The intercept, 16.95 +/- 2.99 (estimate +/- standard error), should be interpreted as the predicted fuel economy in mpg for a car with 4 cylinders, an automatic transmission, and a V-shaped engine (the baseline level, vs = 0) that has the mean horsepower and displacement of all cars in the data set. Our model predicts that, with the other covariates held fixed, the difference between a car with a manual and an automatic transmission is 4.32 +/- 1.57 mpg, with the manual transmission delivering superior fuel economy.
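
The interpretation above can be checked numerically. The following is a minimal sketch under stated assumptions: the names fit_simple, fit_adj, ci_am, and baseline are introduced here for illustration, and the confidence interval and nested-model F-test are additions, not steps from the original analysis.

```r
data(mtcars)

fit_simple <- lm(mpg ~ factor(am), data = mtcars)
fit_adj <- lm(mpg ~ factor(vs) + factor(am) + factor(cyl) +
                I(disp - mean(disp)) + I(hp - mean(hp)), data = mtcars)

# 95% confidence interval for the adjusted transmission effect.
ci_am <- confint(fit_adj, "factor(am)1")
ci_am

# F-test: do the four added covariates improve on the transmission-only model?
anova(fit_simple, fit_adj)

# The intercept equals the prediction for the baseline car:
# vs = 0, am = 0, cyl = 4, with disp and hp at their sample means.
# Note: mean(disp) and mean(hp) inside I() are re-evaluated on newdata,
# so with a single row the centered terms are 0 regardless of the values given.
baseline <- data.frame(vs = 0, am = 0, cyl = 4,
                       disp = mean(mtcars$disp), hp = mean(mtcars$hp))
predict(fit_adj, newdata = baseline)
```

Because the centering is recomputed from whatever data is passed to predict(), centering the columns in the data frame before fitting would be a safer design if predictions for several new cars were needed.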

Now we look at the residuals to check model fit. There appears to be no pattern suggesting poor model fit in either the plot of the residuals vs. the predictor of interest (transmission type) or the plot of the residuals vs. the model-predicted outcome.
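
Beyond the two residual plots in the appendix, R's standard diagnostic panels offer a quick additional look. This sketch is an optional extra, not part of the original analysis; the names fit_adj and sw are introduced here, and the Shapiro-Wilk test is only a rough check of residual normality on 32 observations.

```r
data(mtcars)

fit_adj <- lm(mpg ~ factor(vs) + factor(am) + factor(cyl) +
                I(disp - mean(disp)) + I(hp - mean(hp)), data = mtcars)

# Four standard diagnostic panels: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage.
op <- par(mfrow = c(2, 2))
plot(fit_adj)
par(op)

# Rough numerical check of residual normality.
sw <- shapiro.test(resid(fit_adj))
sw
```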

Using some residual diagnostics, we can see which cars were most influential in the fit. The dfbetas values measure how much each coefficient changes, in units of its standard error, when a given car is left out of the fit, while the hat values measure each car's leverage, that is, how strongly its observed mpg influences its own prediction. The Maserati Bora had the highest hat value at 0.526, and the Toyota Corolla had the highest dfbetas value for the transmission type coefficient at 0.419. The Hornet 4 Drive had the largest absolute dfbetas value for the 6-cylinder coefficient at 0.347. In the appendix scatter plots, the Maserati, Toyota, and Hornet are highlighted in blue, red, and green respectively.

round(dfbetas(fit4),3)
##                     (Intercept) factor(vs)1 factor(am)1 factor(cyl)6
## Mazda RX4                -0.022       0.088      -0.045       -0.067
## Mazda RX4 Wag            -0.022       0.088      -0.045       -0.067
## Datsun 710                0.110      -0.313      -0.383        0.124
## Hornet 4 Drive           -0.109       0.349       0.093        0.347
## Hornet Sportabout        -0.039      -0.014       0.025        0.039
## Valiant                   0.046      -0.097      -0.008       -0.117
## Duster 360                0.094      -0.043      -0.104       -0.071
## Merc 240D                 0.051      -0.023      -0.050       -0.050
## Merc 230                  0.013      -0.006      -0.014       -0.012
## Merc 280                 -0.004       0.008      -0.006        0.013
## Merc 280C                 0.057      -0.115       0.080       -0.175
## Merc 450SE                0.012       0.007       0.022       -0.017
## Merc 450SL               -0.017      -0.010      -0.031        0.024
## Merc 450SLC               0.051       0.030       0.094       -0.072
## Cadillac Fleetwood       -0.163       0.006      -0.163        0.187
## Lincoln Continental      -0.159       0.019      -0.083        0.169
## Chrysler Imperial         0.175      -0.037      -0.002       -0.170
## Fiat 128                 -0.209       0.232       0.298        0.044
## Honda Civic              -0.054       0.050       0.070        0.019
## Toyota Corolla           -0.325       0.340       0.419        0.085
## Toyota Corona            -0.266       0.120       0.309        0.239
## Dodge Challenger          0.108       0.002      -0.018       -0.109
## AMC Javelin               0.149       0.008       0.002       -0.154
## Camaro Z28                0.016      -0.008      -0.021       -0.012
## Pontiac Firebird         -0.011       0.000       0.155       -0.010
## Fiat X1-9                 0.112      -0.125      -0.161       -0.023
## Porsche 914-2             0.323      -0.355      -0.105       -0.261
## Lotus Europa             -0.007       0.185       0.149       -0.134
## Ford Pantera L            0.074      -0.083      -0.140       -0.037
## Ferrari Dino              0.062      -0.083      -0.021        0.016
## Maserati Bora            -0.016       0.072       0.043       -0.006
## Volvo 142E               -0.003      -0.336      -0.383        0.262
##                     factor(cyl)8 I(disp - mean(disp)) I(hp - mean(hp))
## Mazda RX4                  0.011               -0.012            0.053
## Mazda RX4 Wag              0.011               -0.012            0.053
## Datsun 710                -0.033               -0.146            0.075
## Hornet 4 Drive            -0.020                0.384           -0.122
## Hornet Sportabout          0.120                0.058           -0.198
## Valiant                   -0.026               -0.052            0.031
## Duster 360                -0.060               -0.038            0.143
## Merc 240D                 -0.041                0.005            0.007
## Merc 230                  -0.011               -0.002            0.008
## Merc 280                   0.005               -0.007            0.005
## Merc 280C                 -0.066                0.093           -0.063
## Merc 450SE                -0.038                0.051            0.007
## Merc 450SL                 0.053               -0.071           -0.009
## Merc 450SLC               -0.162                0.216            0.028
## Cadillac Fleetwood         0.241               -0.615            0.157
## Lincoln Continental        0.212               -0.444            0.051
## Chrysler Imperial         -0.203                0.298            0.067
## Fiat 128                   0.209               -0.014           -0.230
## Honda Civic                0.055                0.000           -0.069
## Toyota Corolla             0.343               -0.084           -0.334
## Toyota Corona              0.207                0.099           -0.197
## Dodge Challenger          -0.207                0.072            0.223
## AMC Javelin               -0.293                0.155            0.270
## Camaro Z28                -0.009               -0.011            0.027
## Pontiac Firebird           0.054                0.377           -0.373
## Fiat X1-9                 -0.112                0.006            0.124
## Porsche 914-2             -0.270                0.039            0.032
## Lotus Europa              -0.030               -0.027            0.146
## Ford Pantera L            -0.049               -0.044           -0.025
## Ferrari Dino              -0.052               -0.036            0.066
## Maserati Bora              0.005               -0.106            0.277
## Volvo 142E                 0.133               -0.238           -0.110
round(hatvalues(fit4),3)
##           Mazda RX4       Mazda RX4 Wag          Datsun 710 
##               0.293               0.293               0.127 
##      Hornet 4 Drive   Hornet Sportabout             Valiant 
##               0.272               0.102               0.230 
##          Duster 360           Merc 240D            Merc 230 
##               0.132               0.203               0.253 
##            Merc 280           Merc 280C          Merc 450SE 
##               0.240               0.240               0.181 
##          Merc 450SL         Merc 450SLC  Cadillac Fleetwood 
##               0.181               0.181               0.277 
## Lincoln Continental   Chrysler Imperial            Fiat 128 
##               0.228               0.177               0.149 
##         Honda Civic      Toyota Corolla       Toyota Corona 
##               0.180               0.152               0.274 
##    Dodge Challenger         AMC Javelin          Camaro Z28 
##               0.167               0.179               0.139 
##    Pontiac Firebird           Fiat X1-9       Porsche 914-2 
##               0.143               0.149               0.458 
##        Lotus Europa      Ford Pantera L        Ferrari Dino 
##               0.131               0.273               0.333 
##       Maserati Bora          Volvo 142E 
##               0.526               0.136
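
Rather than scanning the tables above by eye, the most influential cars can be extracted programmatically. This is a minimal sketch: the names fit_adj, hv, db, and top_by_coef are introduced here, and the leverage cutoff of twice the average hat value is a common rule of thumb, not a threshold used in the original analysis.

```r
data(mtcars)

fit_adj <- lm(mpg ~ factor(vs) + factor(am) + factor(cyl) +
                I(disp - mean(disp)) + I(hp - mean(hp)), data = mtcars)

hv <- hatvalues(fit_adj)
db <- dfbetas(fit_adj)

# Car with the highest leverage.
names(which.max(hv))

# For each coefficient, the car with the largest absolute dfbetas value.
top_by_coef <- rownames(db)[apply(abs(db), 2, which.max)]
names(top_by_coef) <- colnames(db)
top_by_coef

# Rule of thumb: flag leverage above twice the average hat value, 2 * p / n.
p <- length(coef(fit_adj))
n <- nrow(mtcars)
names(hv)[hv > 2 * p / n]
```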

Appendix of Figures

data(mtcars)

# Scatter plot of mpg against transmission type (0 = automatic, 1 = manual)
plot(mtcars$am,mtcars$mpg,xaxt="n",ylab="MPG",xlim=c(-0.5,1.5),xlab="Transmission Type")
axis(1,at=c(0,1),labels=c("Automatic","Manual"))
abline(fit1)  # regression line from the transmission-only model
# Highlight the three influential cars discussed in the text
hl<-c("Maserati Bora","Toyota Corolla","Hornet 4 Drive")
points(mtcars[hl,"am"],mtcars[hl,"mpg"],col=c("blue","red","forestgreen"),cex=2)

MPG vs. transmission type with regression based on predicting mpg with transmission type only.

plot(factor(mtcars$am),mtcars$mpg,xaxt="n",ylab="MPG")
axis(1,at=c(1,2),labels=c("Automatic","Manual"))

Box plot of mpg vs. transmission type, illustrating difference in distribution.

pairs(mtcars)

Pairwise plots of all variables in the mtcars data set.

# Residuals of the 5-predictor model against transmission type
plot(mtcars$am,resid(fit4),xaxt="n",xlim=c(-0.5,1.5),xlab="Transmission Type",ylab="Residual (mpg)")
axis(1,at=c(0,1),labels=c("Automatic","Manual"))
hl<-c("Maserati Bora","Toyota Corolla","Hornet 4 Drive")
points(mtcars[hl,"am"],resid(fit4)[hl],col=c("blue","red","forestgreen"),cex=2)

Residuals vs. transmission type.

# Residuals against fitted values for the 5-predictor model
plot(predict(fit4),resid(fit4),xlab="Predicted MPG",ylab="Residual (mpg)")
hl<-c("Maserati Bora","Toyota Corolla","Hornet 4 Drive")
points(predict(fit4)[hl],resid(fit4)[hl],col=c("blue","red","forestgreen"),cex=2)

Residuals vs. predicted values.