Executive Summary

We used data from the 1974 edition of Motor Trend magazine (the mtcars dataset) to examine the effect of transmission type (manual or automatic) on fuel efficiency, measured in miles per gallon (MPG). We are particularly interested in answering the following questions:

  1. “Is an automatic or manual transmission better for MPG?”
  2. “How different is the MPG between automatic and manual transmissions?”

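There is indeed a difference in fuel efficiency based on transmission type: manual cars average 24.39 MPG versus 17.15 MPG for automatic cars, a difference of about 7.2 MPG (Welch t-test, p-value = 0.0014).

Nonetheless, we concluded that transmission type on its own is not a good predictor of MPG: the weight of the car, the number of cylinders and horsepower are better predictors of fuel efficiency, with an adjusted R-squared of 0.82. Once these variables are included in the model, the estimated MPG advantage of manual vehicles is much smaller (about 1.8 MPG) and is not statistically significant (p-value = 0.21).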

Loading and Preprocessing of Data

# load the mtcars dataset
data(mtcars)

Qualitative variables (number of cylinders, engine shape, number of gears, transmission type and number of carburetors) are converted to factor variables.

# convert qualitative data to factors
mtcars$cyl <- factor(mtcars$cyl)
mtcars$vs <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))
mtcars$carb <- factor(mtcars$carb)
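
As a quick sanity check on the conversion above, the following sketch inspects the transmission factor. It assumes only base R and the built-in mtcars data, where `am` is coded 0 for automatic and 1 for manual, so the labels are applied in that order.

```r
# Sketch: confirm that the transmission factor is labelled as intended
data(mtcars)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))

# Levels follow the original coding: 0 -> "Automatic", 1 -> "Manual"
levels(mtcars$am)

# Number of cars of each transmission type
am.counts <- table(mtcars$am)
am.counts
```

Checking the group sizes early is useful because the two groups are not the same size, which matters when comparing their means later.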

Exploratory Data Analysis

A pair-wise scatterplot matrix (Appendix, Figure 1) was constructed to observe the correlation between miles per gallon “mpg” and other variables of interest such as displacement “disp”, horsepower “hp”, cylinders “cyl”, rear axle ratio “drat”, weight “wt”, transmission “am”, engine shape “vs”, etc.

A box-and-whisker plot (Appendix, Figure 2) was produced to explore the relationship between transmission type and MPG. It shows that cars with a manual transmission tend to have a higher MPG than cars with an automatic transmission.
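
The pattern seen in the box plot can also be summarised numerically. The sketch below, which assumes only base R, computes the mean and median MPG for each transmission type; the object names `group.means` and `group.medians` are illustrative and not part of the original analysis.

```r
# Sketch: numeric summary of MPG by transmission type
data(mtcars)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))

# Mean and median MPG within each transmission group
group.means <- tapply(mtcars$mpg, mtcars$am, mean)
group.medians <- tapply(mtcars$mpg, mtcars$am, median)

# One row per statistic, one column per transmission type
round(rbind(mean = group.means, median = group.medians), 2)
```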

Regression Analysis

Here we built several regression models to find the best model. Analysis of residuals is performed after model selection.

The stepwise selection function step() is used to determine the best model. Starting from the full model with all predictors, it repeatedly drops the term whose removal most improves the AIC and stops when no further removal improves it; the terms that remain are taken as the best predictors.

# step wise selection function
best.model <- step(lm(mpg ~ ., data = mtcars), trace = 0)

From the result presented in Figure 3 (see Appendix), we observe that the best model retains the predictors cyl (coefficients cyl6 and cyl8), hp, wt and am (coefficient amManual), with an overall p-value < 0.001. The adjusted R-squared indicates that approximately 84% of the variance is explained by the regression model.

Examining the coefficients of this model, and holding the other predictors fixed, we observed that mpg is lower for 6- and 8-cylinder cars than for 4-cylinder cars (by 3.03 and 2.16, respectively), decreases with horsepower (by 0.03 per unit of horsepower) and with weight (by 2.5 for every 1,000 lb), and increases with having a manual transmission (by 1.8). The cyl8 and amManual coefficients, however, are not statistically significant at the 5% level (p-values of 0.35 and 0.21).
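
To see how much the selected model improves on transmission type alone, one option is to compare it with a single-predictor model. The sketch below assumes base R only; `base.model` is an illustrative name introduced here, and the nested-model F-test via anova() is one possible comparison rather than a step taken in the original analysis.

```r
# Sketch: compare the am-only model with the stepwise-selected model
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mtcars$vs <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))
mtcars$carb <- factor(mtcars$carb)

# Baseline: transmission type as the only predictor (illustrative name)
base.model <- lm(mpg ~ am, data = mtcars)

# Selected model, obtained exactly as in the text
best.model <- step(lm(mpg ~ ., data = mtcars), trace = 0)

# Which terms were kept by the selection
formula(best.model)

# F-test for the nested models: do the extra terms improve the fit?
anova(base.model, best.model)

# Adjusted R-squared of both models side by side
adj.r2 <- c(base = summary(base.model)$adj.r.squared,
            best = summary(best.model)$adj.r.squared)
round(adj.r2, 3)
```

The F-test is valid here because the baseline model is nested in the selected one, that is, `am` is among the terms the selection keeps.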
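
Point estimates are easier to judge alongside their uncertainty. As a sketch, the model reported in Figure 3 can be refitted directly and passed to confint(); the name `fit` and the explicit formula are assumptions made so that the example runs without repeating the stepwise search.

```r
# Sketch: 95% confidence intervals for the selected model's coefficients
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))

# Same terms as the model summarised in Figure 3 (fitted directly here)
fit <- lm(mpg ~ cyl + hp + wt + am, data = mtcars)

# Intervals are on the mpg scale, per unit of each predictor
ci <- confint(fit, level = 0.95)
round(ci, 2)
```

An interval that spans zero, as the one for amManual does, is consistent with the non-significant p-value reported for that coefficient in Figure 3.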

Diagnostics

The residuals of the selected model are plotted in the Appendix, Figure 4.
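
The residual plots can be complemented with a few numerical checks. The following is a minimal sketch under the assumption that the selected model is mpg ~ cyl + hp + wt + am, as in Figure 3; the choice of a Shapiro-Wilk test and of listing the three largest leverage and Cook's distance values is illustrative, not part of the original analysis.

```r
# Sketch: numerical diagnostics to accompany the residual plots
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))
fit <- lm(mpg ~ cyl + hp + wt + am, data = mtcars)

# Normality of residuals (a small p-value would suggest non-normality)
shapiro.test(residuals(fit))

# Cars with the highest leverage (diagonal of the hat matrix)
lev <- hatvalues(fit)
head(sort(lev, decreasing = TRUE), 3)

# Cars with the most influence on the fitted coefficients
cooks <- cooks.distance(fit)
head(sort(cooks, decreasing = TRUE), 3)
```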

Statistical Inference

The output of the Welch two-sample t-test below shows that the difference in mean MPG between manual and automatic transmissions is statistically significant (p-value = 0.0014 < 0.05).

t.test(mpg ~ am, data = mtcars)
## 
##  Welch Two Sample t-test
## 
## data:  mpg by am
## t = -3.767, df = 18.33, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.28  -3.21
## sample estimates:
## mean in group Automatic    mean in group Manual 
##                   17.15                   24.39
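
Because R subtracts the second group from the first, the interval above is for Automatic minus Manual, which is why both bounds are negative. The sketch below, assuming base R only and using the illustrative names `tt`, `mean.diff` and `slope`, expresses the difference as Manual minus Automatic and shows that it equals the slope of a regression of mpg on am alone.

```r
# Sketch: express the t-test result as "Manual minus Automatic"
data(mtcars)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))

tt <- t.test(mpg ~ am, data = mtcars)

# Difference in group means, Manual minus Automatic
mean.diff <- unname(diff(tt$estimate))

# The same quantity is the amManual coefficient of a one-predictor model
slope <- unname(coef(lm(mpg ~ am, data = mtcars))["amManual"])
c(mean.diff = mean.diff, slope = slope)

# Confidence interval with the sign flipped to match Manual minus Automatic
ci.manual.minus.auto <- -rev(as.numeric(tt$conf.int))
ci.manual.minus.auto
```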

Conclusion

From the results obtained, we can state that cars with a manual transmission have better fuel efficiency (mpg) than cars with an automatic transmission, by about 7.2 MPG on average.

However, we conclude that transmission type on its own is not a good predictor of MPG: the weight of the car, the number of cylinders and horsepower are better predictors of fuel efficiency (see the best.model summary in the Appendix, Figure 3), and once they are accounted for, the estimated effect of a manual transmission drops to about 1.8 MPG.

Appendix

Figure 1 - Pair-wise scatter plot

# baseline model: mpg explained by transmission type alone
fit1 <- lm(mpg ~ am, data = mtcars)

# pair-wise scatterplot
pairs(mtcars, panel = panel.smooth, main = "Pairwise plot of mtcars data")


Figure 2 - Box-plot

# boxplot
boxplot(mpg ~ am, data = mtcars,
        xlab = "Transmission type", ylab = "Miles per gallon",
        main = "MPG vs Transmission", col = c("green", "purple"), 
        names = c("Automatic", "Manual"))


Figure 3 - Summary of Best Model Coefficients

best.model <- step(lm(mpg ~ ., data = mtcars), trace = 0)
summary(best.model)$coef
##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept) 33.70832    2.60489 12.9404 7.733e-13
## cyl6        -3.03134    1.40728 -2.1540 4.068e-02
## cyl8        -2.16368    2.28425 -0.9472 3.523e-01
## hp          -0.03211    0.01369 -2.3450 2.693e-02
## wt          -2.49683    0.88559 -2.8194 9.081e-03
## amManual     1.80921    1.39630  1.2957 2.065e-01

Figure 4 - Residual plot

# residual plot
par(mfrow=c(2, 2))
plot(best.model)
