Executive Summary

Those of us who drive stick-shift cars tend to believe that a manual transmission yields more miles per gallon (MPG) than an automatic. But some automatic cars get higher MPG than certain manual cars. So which is better? In this analysis I examine the relationship between type of transmission and MPG, together with additional variables.

First I look at a simple linear model of MPG on type of transmission. Then I take a closer look at the other variables, in particular weight, which is strongly correlated with both MPG and type of transmission. After all variables are included in the model, I estimate with 95% confidence that manual transmission is better and that it yields between 0.25 and 18.9 more MPG than automatic transmission.

An interesting point in this analysis is that after adjusting for weight alone (holding weight constant), the type of transmission has little or no impact on MPG. Furthermore, when the interaction between type of transmission and weight is included, the model explains more than 80% of the variation in MPG, which indicates a good fit.

Data Fields

The data set includes the following variables:

* mpg: Miles/(US) gallon
* cyl: Number of cylinders
* hp: Gross horsepower
* drat: Rear axle ratio
* wt: Weight (lb/1000)
* am: Transmission (0 = automatic, 1 = manual)
* gear: Number of forward gears

Approach

I use a multivariable regression model with particular focus on the interaction between type of transmission and weight.

Step 1: Process Data

First I load the mtcars dataset and define the factor variables.
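The processing code is not shown in this report. A minimal sketch of what this step might look like follows. The data frame name motor and the factor labels are inferred from the model output later in the analysis (ammanual, cylsix cyl, cyleight cyl); the column selection follows the Data Fields list, and everything else is an assumption.

```r
# Sketch only: load mtcars and keep the variables listed under Data Fields.
data(mtcars)
motor <- mtcars[, c("mpg", "cyl", "hp", "drat", "wt", "am", "gear")]

# Factor labels are assumptions inferred from the coefficient names
# printed later in the report.
motor$am  <- factor(motor$am, levels = c(0, 1),
                    labels = c("automatic", "manual"))
motor$cyl <- factor(motor$cyl, levels = c(4, 6, 8),
                    labels = c("four cyl", "six cyl", "eight cyl"))

# gear is left numeric, matching the single "gear" coefficient in the output.
str(motor)
```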

Step 2: Data Analysis

In Figure 1 I create a boxplot to show the relationship between type of transmission and MPG. Judging by the plot, manual transmission looks better than automatic because it has a higher median MPG. Not only is there a clear difference between the median values, there is also no overlap between the interquartile ranges (IQR). However, I need to check whether this difference is significant with a two-sample t-test:

## 
##  Welch Two Sample t-test
## 
## data:  mn and at
## t = 3.767, df = 18.33, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   3.21 11.28
## sample estimates:
## mean of x mean of y 
##     24.39     17.15
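The call that produced this output is not shown. A minimal sketch that would give a test of this form follows, assuming mn and at (the names printed in the output) hold the MPG values for manual and automatic cars respectively.

```r
# Sketch only: Welch two-sample t-test of MPG, manual vs automatic.
data(mtcars)
mn <- mtcars$mpg[mtcars$am == 1]  # manual
at <- mtcars$mpg[mtcars$am == 0]  # automatic

# t.test() does not assume equal variances by default (var.equal = FALSE),
# which is why the output is labelled "Welch Two Sample t-test".
tt <- t.test(mn, at)
tt
```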

Because the p-value is < 0.05, I can reject the null hypothesis “H0: true difference in means is 0” and state that the difference in means is significant. So, manual is better than automatic. Further, ignoring all other variables (unadjusted), manual transmission yields about 7 more MPG (24 - 17) on average. Notice that the linear model mpg(hat) = b0 + b1(transmission = “manual”) gives the same result:

tr <- lm(mpg ~ am
         ,data = motor)
round(tr$coeff[2],0)
## ammanual 
##        7
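The agreement is not a coincidence: with a single two-level factor as the only predictor, the fitted slope is exactly the difference between the two group means. A small sketch of that check, rebuilding the factor under the same assumed labels:

```r
# Sketch only: slope of mpg ~ am equals the difference in group means.
data(mtcars)
motor <- transform(mtcars,
                   am = factor(am, levels = c(0, 1),
                               labels = c("automatic", "manual")))

group_means <- tapply(motor$mpg, motor$am, mean)
diff_means  <- unname(group_means["manual"] - group_means["automatic"])
slope       <- unname(coef(lm(mpg ~ am, data = motor))["ammanual"])

all.equal(diff_means, slope)
```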

Since many other variables affect MPG, I will explore their relationship with transmission using a scatterplot matrix. See Figure 2.

I decided to exclude qsec (1/4 mile time) because it is a performance measurement rather than a specification of the car, and number of carburetors and displacement because they are directly related to horsepower. Finally, I ignore V/S, which describes the shape of the engine (V-shaped or straight).

The scatterplot matrix in Figure 2 shows some useful information:

* All other variables seem to be related to am (type of transmission).
* The most weakly related variable is perhaps hp (horsepower).
* drat (rear axle ratio) and wt (weight) show a strong correlation with am.
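These visual impressions can be backed with numbers. The sketch below is not part of the original analysis: it treats am as a 0/1 numeric variable and computes its Pearson correlation with each of the other selected variables. The commented-out pairs() call is one way a scatterplot matrix like Figure 2 could be drawn.

```r
# Sketch only: correlation of each selected variable with am (coded 0/1).
data(mtcars)
vars <- c("mpg", "cyl", "hp", "drat", "wt", "gear")

cor_with_am <- sapply(mtcars[vars], function(x) cor(x, mtcars$am))

# Rank by absolute strength of the relationship.
round(sort(abs(cor_with_am), decreasing = TRUE), 2)

# A scatterplot matrix of the same variables could be drawn with:
# pairs(mtcars[c(vars, "am")])
```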

In Figure 3 I take a closer look at the relationship with weight. The plot shows a correlation between weight and MPG; so let’s go ahead and adjust for weight:

ftwt1 <- lm(mpg ~ wt + am
            ,data = motor)
ftwt1$coeff
## (Intercept)          wt    ammanual 
##    37.32155    -5.35281    -0.02362

After adjusting for weight there is almost no difference in MPG between automatic and manual transmissions (only -0.02). In other words, holding weight constant, transmission has little or no impact on mpg. Let’s update the model to see if there is an interaction with weight:

mpg(hat) = b0 + b1 wt + b2(transmission = “manual”) + b3 wt (transmission = “manual”)

where

* b0: mpg when wt = 0 for transmission = “automatic”
* b1: change in mpg per unit of wt for “automatic”
* b0 + b2: mpg when wt = 0 for “manual”
* b1 + b3: change in mpg per unit of wt for “manual”

fitwt <- lm(mpg ~ wt + am*wt
            ,data = motor)
fitwt$coeff
## (Intercept)          wt    ammanual wt:ammanual 
##      31.416      -3.786      14.878      -5.298

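Clearly after adjusting for weight and taking into account the interaction between transmission and weight, the model gives manual transmission a coefficient of almost 15 mpg over automatic. More than our original 7! Keep in mind that with the interaction in the model this coefficient is the manual advantage at wt = 0; the advantage shrinks by about 5.3 mpg for every additional 1000 lb (the wt:ammanual coefficient), so it depends on the weight of the car.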
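To see what the interaction model implies at realistic weights, the manual-minus-automatic difference at weight w can be computed as b2 + b3 * w. The sketch below refits the same model (written as wt * am, which expands to the same terms) and evaluates that difference at a few illustrative weights; the helper manual_gain and the chosen weights are hypothetical, not part of the original analysis.

```r
# Sketch only: manual-vs-automatic MPG gap implied by the interaction model.
data(mtcars)
motor <- transform(mtcars,
                   am = factor(am, levels = c(0, 1),
                               labels = c("automatic", "manual")))
fitwt <- lm(mpg ~ wt * am, data = motor)
b <- coef(fitwt)

# Difference in predicted mpg (manual - automatic) at weight w (lb/1000).
manual_gain <- function(w) unname(b["ammanual"] + b["wt:ammanual"] * w)

weights <- c(2, 3, 4)  # illustrative weights, in 1000 lb
data.frame(wt = weights, manual_minus_automatic = manual_gain(weights))

# Weight at which the two fitted lines cross (gap = 0).
crossover <- unname(-b["ammanual"] / b["wt:ammanual"])
crossover
```

With the coefficients printed above, the gap is positive for light cars and turns negative for heavy ones, with the crossover a little under 3 (that is, a little under 3000 lb).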

I’ll go ahead and do a quick check on residuals to make sure the model is robust. See Figure 5.
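The code behind Figure 5 is not shown. One common way to run such a check, offered here as an assumption rather than the method actually used, is a residuals-versus-fitted plot together with a normal Q-Q plot and a Shapiro-Wilk test:

```r
# Sketch only: basic residual diagnostics for the weight-interaction model.
data(mtcars)
motor <- transform(mtcars,
                   am = factor(am, levels = c(0, 1),
                               labels = c("automatic", "manual")))
fitwt <- lm(mpg ~ wt * am, data = motor)
res <- resid(fitwt)

op <- par(mfrow = c(1, 2))
# A patternless band around zero supports the linearity assumption.
plot(fitted(fitwt), res, xlab = "Fitted MPG", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)
# Points close to the line support approximately normal residuals.
qqnorm(res)
qqline(res)
par(op)

# Formal normality check; a large p-value gives no evidence against normality.
sw <- shapiro.test(res)
sw
```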

The residual plot shows that the residuals are approximately normally distributed and that a linear relationship is reasonable. Good! Now I will check R2, the percentage of variation explained by the model:

summary(fitwt)$r.squared
## [1] 0.833

More than 80% of the variation is explained by this model. Very good. So far we have adjusted for weight and we still conclude that manual transmission is better than automatic because it yields about 15 more mpg. Now I will adjust for all other variables and check the coefficient for manual transmission.

fitmotor <- lm(mpg ~ . + am:wt
               ,data = motor
               )
fitmotor$coeff
##  (Intercept)   cylsix cyl cyleight cyl           hp         drat 
##     28.46885     -2.22328     -2.10653     -0.02310     -0.06085 
##           wt     ammanual         gear  wt:ammanual 
##     -2.11607      9.57144      0.73338     -3.21353

After adjusting for all other variables, the model shows that manual transmission yields almost 10 more mpg than automatic. Interestingly, this is somewhere between 7 and 15, which were my previous estimates. I’ll check the confidence interval for the coefficient.

fitmcf <- summary(fitmotor)$coefficients

fitmcf["ammanual", 1] + c(-1, 1) * qt(0.975, df = fitmotor$df.residual) * fitmcf["ammanual", 2]
## [1]  0.2512 18.8917

With 95% confidence I estimate that manual transmission is better because it yields between 0.25 and 18.9 more mpg than automatic transmission.
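The interval can be cross-checked against R's built-in confint(), which uses the same t quantile and residual degrees of freedom. The sketch below rebuilds motor under the assumed column selection and factor labels, then compares the manual calculation with confint().

```r
# Sketch only: manual t-based interval vs confint() for the ammanual coefficient.
data(mtcars)
motor <- mtcars[, c("mpg", "cyl", "hp", "drat", "wt", "am", "gear")]
motor$am  <- factor(motor$am, levels = c(0, 1),
                    labels = c("automatic", "manual"))
motor$cyl <- factor(motor$cyl, levels = c(4, 6, 8),
                    labels = c("four cyl", "six cyl", "eight cyl"))

fitmotor <- lm(mpg ~ . + am:wt, data = motor)
est <- summary(fitmotor)$coefficients["ammanual", ]

# Estimate +/- t quantile * standard error, on the residual degrees of freedom.
manual_ci <- unname(est["Estimate"] +
  c(-1, 1) * qt(0.975, df = fitmotor$df.residual) * est["Std. Error"])

builtin_ci <- confint(fitmotor, "ammanual", level = 0.95)

manual_ci
builtin_ci
```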

Appendix of Figures

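Figure 1: Boxplot of MPG by type of transmission

Figure 2: Scatterplot matrix of the selected variables

Figure 3: MPG versus weight

Figure 4: plot of chunk figure04

Figure 5: Residual plot for the weight and transmission interaction model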