Consumption of Cars with Automatic vs Manual Transmissions

Summary

Automatic transmissions are a big step towards a more convenient driving experience. However, this comfort might come at the price of higher gasoline consumption. This analysis sheds some light on the relation between fuel economy (in miles per gallon) and the kind of transmission.

The fitted regression model confirms the suspicion that automatic transmissions are more expensive to run: a car with a manual transmission travels almost 3 miles further per gallon of fuel, all else (weight and quarter-mile time) being equal.

Exploratory Data Analysis

First, let's examine the direct connection between the kind of transmission and the consumption.

library(datasets)
data(mtcars)

boxplot(mpg ~ am, data = mtcars, ylab = "Miles per Gallon", names = c("automatic", 
    "manual"))

[Figure: boxplot of miles per gallon by transmission type (automatic, manual)]

At first glance, cars with manual transmissions get considerably more miles per gallon.

## For the purpose of this analysis I create two subsets: one for cars with
## automatic transmission (am == 0), one for cars with manual transmission (am == 1).
carsa <- subset(mtcars, am == 0)
carsm <- subset(mtcars, am == 1)
t.test(carsa$mpg, carsm$mpg)$p.value
## [1] 0.001374

With a p-value of about 0.0014, well below the usual 0.05 threshold, the difference in mean mpg between the two groups is statistically significant.
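The same Welch test can also be run through the formula interface of t.test, which additionally reports the two group means and a confidence interval for their difference. This is only a minimal sketch; the object name `tt` is an illustrative choice and not part of the original analysis.

```r
library(datasets)
data(mtcars)

## Welch two-sample t-test of mpg by transmission (am: 0 = automatic, 1 = manual).
tt <- t.test(mpg ~ am, data = mtcars)

tt$estimate   # mean mpg in each group
tt$conf.int   # 95% confidence interval for the difference (automatic minus manual)
tt$p.value    # same p-value as the two-sample call above
```

The confidence interval is useful here because it shows the plausible size of the raw difference, not just whether it differs from zero.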

However, other effects might come into play. Cars with automatic transmissions might be more expensive, heavier and generally less fuel-efficient. Let's check the plots of mpg (miles per gallon) against the other variables (see Appendix 1). The plot of mpg against weight (wt) shows that the heavier cars tend to have automatic transmissions, while cars with manual transmissions tend to be lighter. Since heavier cars also get fewer miles per gallon, the raw difference between the two groups may overstate the effect of the transmission itself. Let's build a regression model.
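The weight imbalance between the two groups can also be checked numerically instead of by eye. The following is a small sketch; `wt_by_am` is a name introduced here for illustration only.

```r
library(datasets)
data(mtcars)

## Mean weight (wt is in 1000 lbs) per transmission type (0 = automatic, 1 = manual).
wt_by_am <- tapply(mtcars$wt, mtcars$am, mean)
wt_by_am

## Correlation between weight and mpg across all cars.
cor(mtcars$wt, mtcars$mpg)
```

If the automatic cars are heavier on average and weight is negatively correlated with mpg, weight is a plausible confounder and belongs in the model.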

Building the Model

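The regression model is built by starting with mpg as the dependent variable and all other variables as predictors. Backward elimination with step() then drops one variable at a time, always the one whose removal lowers the AIC the most, until no further removal improves the AIC. In the resulting model all three remaining predictors are significant at the 5% level. (See Appendix 2 for the stepwise calculations.)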
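To see what the selection achieves without reading the full trace, one can compare the AIC of the full model with the AIC of the selected model. This is a hedged sketch: the names `full` and `final` are illustrative, and `trace = 0` is used here only to suppress the printout.

```r
library(datasets)
data(mtcars)

full <- lm(mpg ~ ., data = mtcars)
## trace = 0 suppresses the step-by-step printout shown in Appendix 2.
final <- step(full, direction = "backward", trace = 0)

formula(final)         # predictors that remain after elimination
extractAIC(full)[2]    # AIC of the full model, on the scale step() reports
extractAIC(final)[2]   # AIC of the selected model
```

Note that extractAIC() uses the same scale as the step() trace, which differs from the value returned by AIC() by an additive constant.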

model <- step(lm(mpg ~ ., data = mtcars), direction = "backward")
summary(model)$coef
##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept)    9.618     6.9596   1.382 1.779e-01
## wt            -3.917     0.7112  -5.507 6.953e-06
## qsec           1.226     0.2887   4.247 2.162e-04
## am             2.936     1.4109   2.081 4.672e-02

Model

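The regression model is based on the kind of transmission (am, 0 = automatic, 1 = manual), the weight (wt, in 1000 lbs), and the quarter-mile time (qsec, in seconds), which serves as a measure of acceleration.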

mpg = 9.62 + 2.94*am + 1.23*qsec - 3.92*wt

If we keep weight and quarter-mile time constant, a car with a manual transmission travels about 2.9 miles further per gallon than a car with an automatic transmission. With a p-value of 0.047, this effect is only just significant at the 5% level.
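The interpretation of the am coefficient can be illustrated by predicting mpg for two cars that differ only in their transmission. This is a sketch under stated assumptions: the model is refitted directly with the three selected predictors, and the values wt = 3 and qsec = 18 are arbitrary illustrative choices.

```r
library(datasets)
data(mtcars)

fit <- lm(mpg ~ wt + qsec + am, data = mtcars)

## Two hypothetical cars that differ only in transmission type.
## wt = 3 (3000 lbs) and qsec = 18 seconds are arbitrary illustrative values.
cars <- data.frame(wt = c(3, 3), qsec = c(18, 18), am = c(0, 1))
pred <- predict(fit, newdata = cars)

diff(pred)             # manual minus automatic; equals the am coefficient
confint(fit)["am", ]   # 95% confidence interval for the am effect
```

Because the model has no interaction terms, the predicted difference is the same for any choice of wt and qsec.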

Residual Analysis

With a coefficient of determination of R² = 0.85, around 85% of the variance in mpg is explained by the model and only 15% remains unexplained.
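The R² value can be reproduced from the residual and total sums of squares. The sketch below refits the selected model directly; the names `rss`, `tss` and `r2` are illustrative.

```r
library(datasets)
data(mtcars)

fit <- lm(mpg ~ wt + qsec + am, data = mtcars)

rss <- sum(residuals(fit)^2)                     # unexplained variation
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)    # total variation in mpg
r2 <- 1 - rss / tss

r2                           # share of variance explained by the model
summary(fit)$r.squared       # the same value as reported by summary()
summary(fit)$adj.r.squared   # adjusted for the number of predictors
```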

Finally, let's check if the residuals are distributed normally.

par(mfrow = c(2, 2))
plot(model)

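[Figure: the four diagnostic plots of the regression model (Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage)]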

The first plot (residuals vs. fitted values) is a bit worrisome, but given the small sample size of 32 cars and the lack of a separate data set for validation, I'm going to leave the model as it is.
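As a numeric complement to the diagnostic plots, a Shapiro-Wilk test can be applied to the residuals. This check is not part of the original analysis and is offered only as a sketch; the names `res` and `sw` are illustrative.

```r
library(datasets)
data(mtcars)

fit <- lm(mpg ~ wt + qsec + am, data = mtcars)
res <- residuals(fit)

## Shapiro-Wilk test: the null hypothesis is that the residuals are normal.
sw <- shapiro.test(res)
sw$statistic
sw$p.value
```

A small p-value would speak against normally distributed residuals. With only 32 observations the test has limited power, so it should be read together with the Q-Q plot rather than on its own.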

Appendix 1

The nine plots show mpg (miles per gallon) against each of the other variables. Dots for cars with automatic transmission are colored red, those for cars with manual transmission blue. Each panel also shows a separate regression line for each of the two groups.

par(mfrow = c(3, 3))

## Draw one panel per variable: an empty frame first, then the points and a
## regression line for each transmission group (carsa = automatic, carsm = manual).
for (v in c("hp", "cyl", "disp", "drat", "wt", "qsec", "vs", "gear", "carb")) {
    f <- reformulate(v, response = "mpg")
    plot(f, data = mtcars, type = "n")
    points(f, data = carsa, col = "red")
    points(f, data = carsm, col = "blue")
    abline(lm(f, data = carsa))
    abline(lm(f, data = carsm))
}

[Figure: 3 x 3 grid of scatter plots of mpg against hp, cyl, disp, drat, wt, qsec, vs, gear and carb]

Appendix 2

The stepwise calculation of the final regression model, as printed by step():

model <- step(lm(mpg ~ ., data = mtcars), direction = "backward")
## Start:  AIC=70.9
## mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
## 
##        Df Sum of Sq RSS  AIC
## - cyl   1      0.08 148 68.9
## - vs    1      0.16 148 68.9
## - carb  1      0.41 148 69.0
## - gear  1      1.35 149 69.2
## - drat  1      1.63 149 69.2
## - disp  1      3.92 151 69.7
## - hp    1      6.84 154 70.3
## - qsec  1      8.86 156 70.8
## <none>              148 70.9
## - am    1     10.55 158 71.1
## - wt    1     27.01 174 74.3
## 
## Step:  AIC=68.92
## mpg ~ disp + hp + drat + wt + qsec + vs + am + gear + carb
## 
##        Df Sum of Sq RSS  AIC
## - vs    1      0.27 148 67.0
## - carb  1      0.52 148 67.0
## - gear  1      1.82 149 67.3
## - drat  1      1.98 150 67.3
## - disp  1      3.90 152 67.7
## - hp    1      7.36 155 68.5
## <none>              148 68.9
## - qsec  1     10.09 158 69.0
## - am    1     11.84 159 69.4
## - wt    1     27.03 175 72.3
## 
## Step:  AIC=66.97
## mpg ~ disp + hp + drat + wt + qsec + am + gear + carb
## 
##        Df Sum of Sq RSS  AIC
## - carb  1      0.69 148 65.1
## - gear  1      2.14 150 65.4
## - drat  1      2.21 150 65.4
## - disp  1      3.65 152 65.8
## - hp    1      7.11 155 66.5
## <none>              148 67.0
## - am    1     11.57 159 67.4
## - qsec  1     15.68 164 68.2
## - wt    1     27.38 175 70.4
## 
## Step:  AIC=65.12
## mpg ~ disp + hp + drat + wt + qsec + am + gear
## 
##        Df Sum of Sq RSS  AIC
## - gear  1       1.6 150 63.5
## - drat  1       1.9 150 63.5
## <none>              148 65.1
## - disp  1      10.1 159 65.2
## - am    1      12.3 161 65.7
## - hp    1      14.8 163 66.2
## - qsec  1      26.4 175 68.4
## - wt    1      69.1 218 75.3
## 
## Step:  AIC=63.46
## mpg ~ disp + hp + drat + wt + qsec + am
## 
##        Df Sum of Sq RSS  AIC
## - drat  1       3.3 153 62.2
## - disp  1       8.5 159 63.2
## <none>              150 63.5
## - hp    1      13.3 163 64.2
## - am    1      20.0 170 65.5
## - qsec  1      25.6 176 66.5
## - wt    1      67.6 218 73.4
## 
## Step:  AIC=62.16
## mpg ~ disp + hp + wt + qsec + am
## 
##        Df Sum of Sq RSS  AIC
## - disp  1       6.6 160 61.5
## <none>              153 62.2
## - hp    1      12.6 166 62.7
## - qsec  1      26.5 180 65.3
## - am    1      32.2 186 66.3
## - wt    1      69.0 222 72.1
## 
## Step:  AIC=61.52
## mpg ~ hp + wt + qsec + am
## 
##        Df Sum of Sq RSS  AIC
## - hp    1       9.2 169 61.3
## <none>              160 61.5
## - qsec  1      20.2 180 63.3
## - am    1      26.0 186 64.3
## - wt    1      78.5 239 72.3
## 
## Step:  AIC=61.31
## mpg ~ wt + qsec + am
## 
##        Df Sum of Sq RSS  AIC
## <none>              169 61.3
## - am    1      26.2 195 63.9
## - qsec  1     109.0 278 75.2
## - wt    1     183.3 353 82.8