Car Mileage Study

V. Rajaraman

Executive Summary

This study of car mileage was conducted for a leading automobile industry magazine. The input data set is mtcars, which records 11 parameters for 32 different car models. The main question to be answered is whether automatic or manual transmission gives better mileage, and by how much.

The study concluded that manual cars give, on average, 7.245 miles per gallon more than automatic cars.

Exploratory Analysis

We first load the mtcars dataset and compare mileage across the two transmission types (am: 0 = automatic, 1 = manual). The average mileage of each group is plotted as a red dot on the boxplot. The plot shows that the first quartile of manual transmission mileage is higher than the third quartile for automatic cars.

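data(mtcars)
# Mean mpg per transmission type (am: 0 = automatic, 1 = manual)
means = tapply(mtcars$mpg, mtcars$am, mean)
boxplot(mpg ~ am, data = mtcars, names = c("Automatic", "Manual"),
    xlab = "Transmission Type", ylab = "Miles per Gallon")
# Overlay the group means as red dots at box positions 1 and 2
points(means, col = "red", pch = 19, cex = 1.5)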

[Figure: boxplot of mpg by transmission type, with group means shown as red dots]
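The quartile comparison read off the boxplot can also be checked numerically. The following is a minimal sketch; the names auto_mpg, manual_mpg, q1_manual and q3_auto are illustrative and not part of the original analysis, and quantile() is used with its default method, which can differ slightly from the hinges drawn by boxplot().

```r
data(mtcars)

# Split mileage by transmission type (am: 0 = automatic, 1 = manual)
auto_mpg   <- mtcars$mpg[mtcars$am == 0]
manual_mpg <- mtcars$mpg[mtcars$am == 1]

# First quartile of manual cars and third quartile of automatic cars
q1_manual <- unname(quantile(manual_mpg, 0.25))
q3_auto   <- unname(quantile(auto_mpg, 0.75))

# TRUE when the lower quartile of manual cars exceeds
# the upper quartile of automatic cars
q1_manual > q3_auto
```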

Model Selection

Several models were considered for this study, including tree-based models and a linear regression on all the available parameters. However, the strong correlation among the parameters left none of the regression coefficients statistically significant. To keep things simple, we model mileage as a linear function of transmission type alone.
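The loss of significance in the all-parameter regression can be reproduced with a short sketch. It assumes every mtcars column is entered as a numeric predictor, which may differ from the exact model tried in the study; full_fit, p_values and any_significant are illustrative names.

```r
data(mtcars)

# Regress mpg on all ten remaining columns, entered as numeric predictors
full_fit <- lm(mpg ~ ., data = mtcars)

# Extract the p-value of each coefficient from the summary table
p_values <- summary(full_fit)$coefficients[, "Pr(>|t|)"]
round(sort(p_values), 3)

# Check whether any predictor (ignoring the intercept) is significant at 5%
predictor_p     <- p_values[names(p_values) != "(Intercept)"]
any_significant <- any(predictor_p < 0.05)
any_significant
```

Under this assumption no predictor reaches the 5% level, even though several of them are strongly correlated with mpg when taken one at a time.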

We regress mileage on transmission type. The variable am is stored as a numeric 0/1 indicator in mtcars but is really categorical, so we convert it to a factor when fitting the model. Automatic transmission (am=0) is taken as the baseline, and a dummy variable is created for manual transmission (am=1).

lm1 = lm(mpg ~ as.factor(am), data = mtcars)
summary(lm1)
## 
## Call:
## lm(formula = mpg ~ as.factor(am), data = mtcars)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.392 -3.092 -0.297  3.244  9.508 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       17.15       1.12   15.25  1.1e-15 ***
## as.factor(am)1     7.24       1.76    4.11  0.00029 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.9 on 30 degrees of freedom
## Multiple R-squared:  0.36,   Adjusted R-squared:  0.338 
## F-statistic: 16.9 on 1 and 30 DF,  p-value: 0.000285

The intercept is 17.147 and the slope is 7.245. This should be interpreted as follows: when am=0 (automatic), the expected mileage is 17.147 mpg. When the transmission is manual, 7.245 is added to the baseline, so manual cars give an average mileage of 17.147 + 7.245 = 24.392 mpg.
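With a single two-level factor as predictor, the fitted values are simply the group means, which the sketch below confirms. It also computes a 95% confidence interval for the difference; the interval is an addition for illustration and is not reported in the study. The names group_means, fitted_manual and ci are illustrative.

```r
data(mtcars)

# Same model as in the study: automatic (am = 0) is the baseline level
lm1 <- lm(mpg ~ as.factor(am), data = mtcars)

# Mean mileage per transmission type, computed directly from the data
group_means <- tapply(mtcars$mpg, mtcars$am, mean)

# Intercept = mean of automatic cars; intercept + slope = mean of manual cars
intercept     <- unname(coef(lm1)[1])
slope         <- unname(coef(lm1)[2])
fitted_manual <- intercept + slope

all.equal(intercept, unname(group_means["0"]))
all.equal(fitted_manual, unname(group_means["1"]))

# 95% confidence intervals; the second row is the manual-minus-automatic difference
ci <- confint(lm1, level = 0.95)
ci
```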

plot(mtcars$am, mtcars$mpg, pch = 19, col = rgb(0, 0, 1, 0.5), xlab = "Transmission Type", 
    ylab = "Miles per Gallon", main = "Linear Model")
abline(lm1, col = "red", lwd = 2)
abline(h = coef(lm1)[1], col = "gray", lty = 2)
abline(h = (coef(lm1)[1] + coef(lm1)[2]), col = "gray", lty = 2)

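[Figure: "Linear Model", Miles per Gallon against Transmission Type, with the fitted line in red and two gray dashed horizontal lines]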

The red line is the fitted regression line, but more informative are the two gray dashed lines, which mark the average mileage of automatic and manual cars.

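Conclusion

On average, manual cars in the mtcars data give 24.392 mpg against 17.147 mpg for automatic cars, a difference of 7.245 mpg that is statistically significant (p-value 0.000285). Transmission type alone explains about 36% of the variation in mileage (R-squared 0.36).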

Appendix: Residual Analysis

resid1 = resid(lm1)
plot(resid1)
abline(h = 0, col = "gray", lty = 2)

[Figure: residuals of lm1 plotted by observation index]

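qqnorm(resid1)
# Reference line through the quartiles, to judge departures from normality
qqline(resid1, col = "red")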

[Figure: normal Q-Q plot of the residuals]

hist(resid1)

[Figure: histogram of the residuals]