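Here at Motor Trend, we explore whether automatic or manual transmission cars achieve more miles per gallon (mpg). Specifically, we will answer the following two questions: (1) Is an automatic or manual transmission better for mpg? (2) What is the mpg difference between automatic and manual transmissions?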
We will apply different regression models to the mtcars dataset, which comprises data on fuel consumption (mpg) and 10 aspects of automobile design and performance for 32 cars: number of cylinders (cyl), displacement (disp), horsepower (hp), rear axle ratio (drat), weight in 1000 lbs (wt), 1/4 mile time (qsec), V or straight engine (vs), automatic [0] or manual [1] transmission (am), number of forward gears (gear) and number of carburetors (carb).
The structure of the mtcars data is shown in Appendix Figure 1. There are no missing (NA) values (any(is.na(mtcars)) returns FALSE); however, five variables (cyl, vs, am, gear, and carb) will be changed from class numeric to factor as follows:
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$vs <- as.factor(mtcars$vs)
mtcars$am <- as.factor(mtcars$am)
mtcars$gear <- as.factor(mtcars$gear)
mtcars$carb <- as.factor(mtcars$carb)
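The five conversions above can also be written as a single step over the column names. The following is a minimal sketch, not the code used to produce the outputs in this report; `cars` and `factor_cols` are hypothetical names introduced for illustration, and the sketch works on a copy so that the built-in data set stays unchanged.

```r
# Work on a copy so the built-in mtcars data set is left untouched
cars <- datasets::mtcars

# Columns that hold categories rather than measurements
factor_cols <- c("cyl", "vs", "am", "gear", "carb")

# Convert each listed column from numeric to factor
cars[factor_cols] <- lapply(cars[factor_cols], as.factor)

# Confirm the new classes and that no values are missing
sapply(cars[factor_cols], class)
any(is.na(cars))
```

When a model is fitted on a factor column, R labels the transmission coefficient `am1` rather than `am`; for a two-level factor the estimate itself is the same.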
A plot of mpg by transmission type (Appendix Figure 2) shows that there are no outlier values and that manual transmission cars appear to achieve more mpg than automatic transmission cars. We can continue our analysis using this data.
Linear Regression Model
In model1, we fit a linear regression of mpg on transmission (am). This simple strategy shows how the two relate to one another, without considering the other variables.
model1 <- lm(mpg ~ am, mtcars)
summary(model1)$coeff
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 17.147368   1.124603 15.247492 1.133983e-15
## am           7.244939   1.764422  4.106127 2.850207e-04
Interpreting the coefficients of model1 shows that, on average, automatic transmission cars achieve 17.15 mpg, whereas manual cars achieve 24.39 mpg (17.15 + 7.24). These values are also the mean mpg of the automatic and manual transmission data, respectively. This suggests that manual cars are significantly more fuel efficient (p=0.00029); however, this model explains only 0.36 of the total variation (\(R^2\)), so it is a rather poor model.
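The statement that the model1 coefficients reproduce the group means can be checked directly. This is a minimal sketch under the assumption that am is still numeric (0/1), as in the built-in data set; `cars`, `fit1`, `group_means`, `fitted_means` and `r2` are hypothetical names used only here.

```r
cars <- datasets::mtcars  # am is numeric here: 0 = automatic, 1 = manual

# Mean mpg within each transmission group
group_means <- tapply(cars$mpg, cars$am, mean)

# Same single-predictor fit as model1
fit1 <- lm(mpg ~ am, data = cars)
b <- unname(coef(fit1))  # b[1] = intercept, b[2] = am coefficient

# Intercept is the automatic mean; intercept + am is the manual mean
fitted_means <- c(automatic = b[1], manual = b[1] + b[2])

# Proportion of the variance in mpg explained by transmission alone
r2 <- summary(fit1)$r.squared

round(group_means, 2)
round(fitted_means, 2)
round(r2, 2)
```

With a single binary predictor, least squares can do no better than predict each group's own mean, which is why the two sets of numbers agree.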
Multivariate Linear Models
In the full model, we regress mpg on all 10 variables and show the coefficient of each variable (Appendix Figure 3).
Interpreting the coefficients of the full model shows that the variables with the largest coefficients are wt, followed by am and qsec. Changing from automatic to manual transmission adds an extra 2.52 mpg; however, for each 1000 lb increase in weight there is a reduction of 3.72 mpg, and for each one-second increase in the 1/4 mile time there is an increase of 0.82 mpg. This model explains 0.869 of the total variation, but none of its coefficients is individually significant at the 0.05 level (Appendix Figure 3), and it may include variables that are not informative (overfitting the model).
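One way to look at the overfitting concern is to compare \(R^2\) with adjusted \(R^2\), which penalises extra predictors, and to list which predictors have p-values below 0.05. This is a hedged sketch rather than part of the original analysis; it assumes the built-in numeric columns, and `cars`, `fit_full`, `s_full`, `fit_stats`, `p_values` and `significant` are hypothetical names.

```r
cars <- datasets::mtcars

# Full model: mpg regressed on all 10 other variables
fit_full <- lm(mpg ~ ., data = cars)
s_full <- summary(fit_full)

# R-squared never decreases when predictors are added;
# adjusted R-squared penalises the extra terms
fit_stats <- c(r.squared = s_full$r.squared,
               adj.r.squared = s_full$adj.r.squared)
round(fit_stats, 3)

# Predictors with a p-value below 0.05 (intercept excluded)
p_values <- s_full$coefficients[-1, "Pr(>|t|)"]
significant <- names(p_values)[p_values < 0.05]
significant
```

A high \(R^2\) combined with no individually significant predictor is a common sign that correlated predictors are sharing the same explanatory power.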
In model3, our strategy is to add only the variables with the greatest impact on mpg (wt, am and qsec), to show how transmission affects mpg when holding wt and qsec constant.
model3 <- lm(mpg ~ am + wt + qsec, mtcars)
summary(model3)$coeff
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  9.617781  6.9595930  1.381946 1.779152e-01
## am           2.935837  1.4109045  2.080819 4.671551e-02
## wt          -3.916504  0.7112016 -5.506882 6.952711e-06
## qsec         1.225886  0.2886696  4.246676 2.161737e-04
Interpreting the coefficients of model3 shows that, holding weight and 1/4 mile time constant, manual transmission cars achieve on average an extra 2.94 mpg compared to automatic cars (intercepts of 12.55 versus 9.62 mpg; these intercepts are extrapolations to a car of zero weight and zero 1/4 mile time, not realistic predictions). For each 1000 lb increase in weight there is a reduction of 3.92 mpg, and for each one-second increase in 1/4 mile time there is an increase of 1.23 mpg. This model explains 0.85 of the variation. Model3 is a parsimonious model that explains most of the variation in mpg, and the influence of transmission on mpg (p=0.0467) is significant at the 0.05 alpha level of the t-test; however, weight (p=6.95e-06) and qsec (p=0.000216) are more influential.
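Because the intercept describes a car with zero weight and zero 1/4 mile time, predictions at realistic values are easier to read. The sketch below predicts mpg for two hypothetical cars of average weight and average 1/4 mile time that differ only in transmission. It is an illustration and not part of the original analysis; it assumes am is numeric, and `cars`, `fit3`, `typical`, `pred` and `gap` are hypothetical names.

```r
cars <- datasets::mtcars
fit3 <- lm(mpg ~ am + wt + qsec, data = cars)

# Two hypothetical cars: same (average) weight and 1/4 mile time,
# differing only in transmission (0 = automatic, 1 = manual)
typical <- data.frame(
  am   = c(0, 1),
  wt   = mean(cars$wt),
  qsec = mean(cars$qsec)
)

pred <- predict(fit3, newdata = typical)
names(pred) <- c("automatic", "manual")
pred

# The gap between the two predictions equals the am coefficient
gap <- unname(pred["manual"] - pred["automatic"])
gap
unname(coef(fit3)["am"])
```

The gap is the same at any fixed weight and 1/4 mile time, because the model contains no interaction terms.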
The simplest diagnostic plot displays residuals versus fitted values (Appendix Figure 4). The residuals are uncorrelated with the fitted values of model3, and appear to be independent and identically distributed with a mean of zero (8.5e-17, which is zero to within floating-point precision). Influence is the change in the coefficients induced by including or excluding a sample, and is measured by the dfbeta function (Appendix Figure 5). We can see that no single car exerts a much larger effect on the transmission coefficient (the am column) than the other cars.
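The two diagnostics described above can be summarised numerically instead of being read off the plot and the full dfbeta table. This is a minimal sketch assuming the same three-predictor model with numeric am; `cars`, `fit3`, `res_mean`, `am_infl` and `top_am` are hypothetical names.

```r
cars <- datasets::mtcars
fit3 <- lm(mpg ~ am + wt + qsec, data = cars)

# Least-squares residuals average to zero when the model has an intercept
res_mean <- mean(resid(fit3))
res_mean

# dfbeta() returns, per car, the change in each coefficient
# when that car is left out of the fit
am_infl <- dfbeta(fit3)[, "am"]

# The three cars that move the am coefficient the most
top_am <- head(sort(abs(am_infl), decreasing = TRUE), 3)
round(top_am, 3)
```

Sorting by absolute value makes it easier to judge whether any one car dominates the am estimate than scanning all 32 rows.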
Question 1: Is automatic or manual better for MPG?
A manual transmission car is more fuel efficient than an automatic transmission car (p=0.0467), although the weight of the car is more influential.
Question 2: Quantify the mpg difference between automatic and manual transmission.
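Holding weight and 1/4 mile time constant, a manual car achieves on average 2.94 mpg more than an automatic car. In terms of the model3 intercepts, an automatic car achieves 9.62 mpg, whereas a manual car achieves 12.55 mpg (95% confidence interval 9.66 - 15.44).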
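The interval quoted above (9.66 - 15.44) appears to be the interval for the am coefficient shifted by the 9.62 intercept. The interval for the difference itself can be obtained with confint, and from the estimate and standard error reported for model3 it works out to roughly 0.05 to 5.83 mpg. The sketch below assumes numeric am; `cars`, `fit3` and `ci` are hypothetical names.

```r
cars <- datasets::mtcars
fit3 <- lm(mpg ~ am + wt + qsec, data = cars)

# 95% confidence interval for each coefficient
# (t distribution on the residual degrees of freedom)
ci <- confint(fit3, level = 0.95)

# Row "am" is the interval for the manual-minus-automatic difference
round(ci["am", ], 2)

# Residual degrees of freedom: 32 cars minus 4 estimated coefficients
df.residual(fit3)
```

The lower bound lies only just above zero, which is consistent with the p-value of 0.0467 being close to the 0.05 threshold.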
Figure 1. Structure of the mtcars data set
# structure of mtcars data set
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
Figure 2. MPG of automatic and manual cars
myColors <- c("red", "blue")
names(myColors) <- levels(mtcars$am)
boxplot(mpg ~ am, mtcars, main = "MPG with automatic and manual transmission")
stripchart(mpg ~ am, mtcars, vertical=TRUE, add=TRUE, col=myColors, pch=20)
Figure 3. Coefficients of 10 variables with mpg
summary(lm(mpg ~ ., mtcars))$coeff
##                Estimate  Std. Error    t value   Pr(>|t|)
## (Intercept) 12.30337416 18.71788443  0.6573058 0.51812440
## cyl         -0.11144048  1.04502336 -0.1066392 0.91608738
## disp         0.01333524  0.01785750  0.7467585 0.46348865
## hp          -0.02148212  0.02176858 -0.9868407 0.33495531
## drat         0.78711097  1.63537307  0.4813036 0.63527790
## wt          -3.71530393  1.89441430 -1.9611887 0.06325215
## qsec         0.82104075  0.73084480  1.1234133 0.27394127
## vs           0.31776281  2.10450861  0.1509915 0.88142347
## am           2.52022689  2.05665055  1.2254035 0.23398971
## gear         0.65541302  1.49325996  0.4389142 0.66520643
## carb        -0.19941925  0.82875250 -0.2406258 0.81217871
Figure 4. Model3 residuals versus fitted values
plot(model3, which=1)
Figure 5. Influence (dfbeta) of each car on Model3
dfbeta(model3)
## (Intercept) am wt qsec
## Mazda RX4 -0.2461584964 -0.11820185 -4.746437e-03 0.014521532
## Mazda RX4 Wag 0.2251908640 -0.15870039 -4.248697e-02 -0.003578144
## Datsun 710 1.2851506339 -0.41166650 -4.838616e-02 -0.060648888
## Hornet 4 Drive -0.0009160085 -0.04770156 -1.531829e-02 0.004927993
## Hornet Sportabout 1.1461897563 -0.24792055 -8.625946e-02 -0.039763689
## Valiant 1.2013961470 0.07340726 -2.151292e-02 -0.070445608
## Duster 360 -0.7982961510 0.13767654 4.813027e-02 0.031369673
## Merc 240D -0.4884355806 -0.19942472 -5.800431e-02 0.047679073
## Merc 230 3.7768092638 -0.09364883 -8.861203e-02 -0.199856961
## Merc 280 0.1431986851 -0.05984083 -1.695030e-02 -0.002447620
## Merc 280C -0.0477877566 0.11373204 2.570864e-02 -0.007369114
## Merc 450SE 0.1678357970 -0.06001273 1.859599e-02 -0.008793664
## Merc 450SL 0.1958833740 -0.05844892 -9.622038e-03 -0.006580399
## Merc 450SLC -0.1825165399 0.10164559 3.597664e-03 0.004158812
## Cadillac Fleetwood 0.7093013414 -0.11601028 -1.088590e-01 -0.019057137
## Lincoln Continental -0.2060673288 0.03555769 3.244818e-02 0.005315500
## Chrysler Imperial -4.0502327416 0.73800106 7.232273e-01 0.090353150
## Fiat 128 -2.7847883142 0.63390200 8.648916e-02 0.135225124
## Honda Civic 0.2055636072 0.02417316 -7.972712e-02 0.005220188
## Toyota Corolla -2.1348391286 0.42825460 -3.476154e-02 0.124612663
## Toyota Corona -0.9939850447 0.56323564 2.849541e-01 -0.014762378
## Dodge Challenger -0.6838130862 0.14263474 4.686719e-02 0.024664836
## AMC Javelin -1.2570053528 0.29443304 9.988023e-02 0.041576099
## Camaro Z28 -0.1744629320 0.02715155 7.759725e-03 0.007407989
## Pontiac Firebird 1.4415687252 -0.32374700 -4.788059e-02 -0.057740876
## Fiat X1-9 0.1790722703 -0.06392546 1.410698e-02 -0.012772259
## Porsche 914-2 0.5549474391 0.03080781 -4.990977e-02 -0.020174866
## Lotus Europa 2.5114514103 -0.19045782 -3.004374e-01 -0.075846799
## Ford Pantera L -1.0886062551 -0.19828288 -4.397052e-02 0.068982876
## Ferrari Dino -0.4095707439 -0.06879133 2.542940e-06 0.022524204
## Maserati Bora -0.2795976381 -0.24007693 -9.517475e-02 0.035103981
## Volvo 142E 2.1398103249 -0.57794121 -1.782619e-01 -0.080731844