This is a study of car mileage conducted for a leading automobile industry magazine. The input data set is the mtcars data, which records 11 variables for 32 different car models. The main questions to be answered are whether automatic or manual transmission gives better mileage, and how large the difference is.
The study concludes that manual cars give, on average, about 7.24 miles per gallon more than automatic cars.
We first load the mtcars dataset and plot mileage by transmission type (0 = automatic, 1 = manual). The mean mileage of each group is drawn as a red dot on the boxplot. The plot shows that the first quartile of manual-transmission mileage is higher than the third quartile of automatic-transmission mileage.
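# Load the built-in dataset
data(mtcars)
# Mean mpg for each transmission type (0 = automatic, 1 = manual)
means = tapply(mtcars$mpg, mtcars$am, mean)
boxplot(mtcars$mpg ~ mtcars$am, xlab = "Transmission Type", ylab = "Miles per Gallon")
# Overlay the group means as red dots at box positions 1 and 2
points(means, col = "red", pch = 19, cex = 1.5)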
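The quartile comparison can also be checked numerically instead of being read off the plot. The following is a minimal sketch, assuming R's default quantile definition (type 7); the names `quartiles`, `q3_auto` and `q1_manual` are introduced here only for illustration.

```r
data(mtcars)

# Quartiles of mpg within each transmission group.
# Columns are "0" (automatic) and "1" (manual); rows are "25%", "50%", "75%".
quartiles = sapply(split(mtcars$mpg, mtcars$am), quantile, probs = c(0.25, 0.5, 0.75))
quartiles

q3_auto = quartiles["75%", "0"]    # third quartile, automatic
q1_manual = quartiles["25%", "1"]  # first quartile, manual

# TRUE when the manual first quartile exceeds the automatic third quartile
q1_manual > q3_auto
```

The boxplot itself draws hinges rather than type 7 quantiles, so the box edges can differ slightly from these values in other data sets.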
Several models were considered for this study, including tree-based models and a linear regression using all the available parameters. However, the strong correlation among the parameters left none of the coefficients in the full regression statistically significant. To keep things simple, we assume a simple linear model relating transmission type to mileage.
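The behaviour of the full regression can be reproduced with a short sketch. The exact formula used in the study is not shown, so `mpg ~ .` (every other column as a predictor) is an assumption, and the names `lm_full` and `p_full` are illustrative only.

```r
data(mtcars)

# Assumed form of the "all parameters" model: every other column as a predictor
lm_full = lm(mpg ~ ., data = mtcars)

# p-values of the individual coefficients, smallest first
p_full = summary(lm_full)$coefficients[, "Pr(>|t|)"]
round(sort(p_full), 3)

# Pairwise correlations among a few predictors, as a rough look at collinearity
round(cor(mtcars[, c("wt", "disp", "cyl", "hp")]), 2)
```

Large correlations among the predictors inflate the standard errors of the individual coefficients, which is one plausible reason why none of them stands out on its own.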
We fit a linear model of mileage on transmission type. The variable am is stored as a numeric 0/1 code but is really categorical, so we convert it to a factor when fitting the model. Automatic transmission (am=0) is taken as the baseline, and a dummy variable is created for manual transmission (am=1).
lm1 = lm(mpg ~ as.factor(am), data = mtcars)
summary(lm1)
##
## Call:
## lm(formula = mpg ~ as.factor(am), data = mtcars)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -9.392 -3.092 -0.297  3.244  9.508
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)       17.15       1.12   15.25  1.1e-15 ***
## as.factor(am)1     7.24       1.76    4.11  0.00029 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.9 on 30 degrees of freedom
## Multiple R-squared: 0.36, Adjusted R-squared: 0.338
## F-statistic: 16.9 on 1 and 30 DF, p-value: 0.000285
The intercept is 17.147 and the coefficient of the manual dummy variable is 7.245. These are interpreted as follows: when am=0 (automatic), the expected mileage is 17.147 mpg. When the transmission is manual, 7.245 is added to this baseline, so manual cars have an average mileage of 17.147 + 7.245 = 24.392 mpg.
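With a single 0/1 predictor, the fitted coefficients are simply the group means in disguise. A minimal sketch of that check follows; `mean_auto`, `mean_manual` and `b` are names introduced here for illustration, and the confidence interval at the end is an optional extra rather than part of the original analysis.

```r
data(mtcars)
lm1 = lm(mpg ~ as.factor(am), data = mtcars)

# Sample means of mpg for each transmission type
mean_auto = mean(mtcars$mpg[mtcars$am == 0])
mean_manual = mean(mtcars$mpg[mtcars$am == 1])

# Coefficients without their names: b[1] is the intercept, b[2] the manual dummy
b = unname(coef(lm1))

# The intercept matches the automatic mean,
# and intercept + dummy coefficient matches the manual mean
all.equal(b[1], mean_auto)
all.equal(b[1] + b[2], mean_manual)

# 95% confidence interval for the manual-minus-automatic difference
confint(lm1)["as.factor(am)1", ]
```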
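# Scatter of mpg against transmission type (0 = automatic, 1 = manual)
plot(mtcars$am, mtcars$mpg, pch = 19, col = rgb(0, 0, 1, 0.5), xlab = "Transmission Type",
     ylab = "Miles per Gallon", main = "Linear Model")
# Fitted regression line, which passes through the two group means
abline(lm1, col = "red", lwd = 2)
# Dashed horizontal lines at the automatic and manual group means
abline(h = coef(lm1)[1], col = "gray", lty = 2)
abline(h = (coef(lm1)[1] + coef(lm1)[2]), col = "gray", lty = 2)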
The red line is the fitted regression line, but the two gray dashed lines are more informative: they mark the average mileage of automatic and manual cars.
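# Residuals of the fitted model
resid1 = resid(lm1)
# Residuals in data set order, with a reference line at zero
plot(resid1)
abline(h = 0, col = "gray", lty = 2)
# Normal Q-Q plot and histogram to check for approximate normality
qqnorm(resid1)
hist(resid1)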
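The residual plots can be complemented with a few numeric summaries. The sketch below is one possible way to do this and is not part of the original study; the Shapiro-Wilk test is an assumed choice of normality check, and `resid_mean_by_am` and `resid_sd_by_am` are illustrative names.

```r
data(mtcars)
lm1 = lm(mpg ~ as.factor(am), data = mtcars)
resid1 = resid(lm1)

# With a single dummy predictor, each residual is a deviation from its own
# group's mean, so the residuals average to zero within each transmission type
resid_mean_by_am = tapply(resid1, mtcars$am, mean)
resid_mean_by_am

# Residual spread per group; a large gap would question the constant-variance assumption
resid_sd_by_am = tapply(resid1, mtcars$am, sd)
resid_sd_by_am

# Shapiro-Wilk test as a numeric companion to the Q-Q plot and histogram
shapiro.test(resid1)
```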