Automatic or Manual Transmission

Which gives better Miles Per Gallon?

Executive Summary

Which is better, manual or automatic? How do they work? Which one should you choose, and why? Anyone looking to buy a car asks these questions. It is commonly believed that manual transmission (MT) cars use less fuel: they give more control over the car but are less convenient, while automatic transmission (AT) cars burn a little more gas and provide less control but are easier to use.

Using the “mtcars” data set, we explore the relationship between a set of variables and Miles Per Gallon (MPG).

We are particularly interested in the following two questions:

 1. Is an automatic or manual transmission better for MPG?
 2. How large is the MPG difference between automatic and manual transmissions?
 

Our conclusions are summarized below:

 1. In this data set, manual transmission is better for MPG.
 2. Our final model explains about 85% of the variance in Miles Per Gallon (R-squared 0.85, adjusted R-squared 0.83). Holding weight and quarter-mile time constant, it estimates that manual transmission vehicles get 2.94 MPG more than automatic transmission vehicles.

Data Processing

We explore the data set and try to answer the questions above using exploratory data analysis and regression models.

library(knitr)
library(ggplot2)
dim(mtcars)
## [1] 32 11
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

The numeric variable ‘am’ records whether a car has an automatic transmission (0) or a manual transmission (1).

We convert this variable to a factor with levels AT and MT for easier classification.

mtcars$am <- as.factor(mtcars$am)
levels(mtcars$am) <- c("AT", "MT")
hist(mtcars$mpg, breaks = 12, col = "green", xlab = "Miles Per Gallon", main = "MPG Histogram")
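The recoding above can also be written in a single step, which makes the 0/1 to AT/MT mapping explicit. This is a minimal sketch, not the report's own code: `cars` and `am_counts` are names introduced here for illustration, and the copy is used so the built-in data set is left unchanged.

```r
# Work on a copy (hypothetical name `cars`) so mtcars itself is untouched.
cars <- datasets::mtcars

# In mtcars, am is coded 0 = automatic, 1 = manual.
# Naming the levels explicitly guards against a silent mislabeling.
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("AT", "MT"))

# Count the cars in each transmission group.
am_counts <- table(cars$am)
print(am_counts)
```

The two groups are of unequal size, which is worth keeping in mind when comparing them.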

A boxplot was created to examine the relationship between mpg and transmission type.

boxplot(mpg ~ am, data=mtcars, xlab="Transmission Type", ylab="Miles per Gallon",
        main="Automatic versus Manual Transmission MPG", col="yellow")

The plot suggests that manual transmission cars get better MPG, at around 24 compared with around 18 for automatic transmission cars.
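A boxplot draws medians rather than means, so it can help to compute both per group before quoting numbers. The sketch below is illustrative only; `cars`, `mpg_median` and `mpg_mean` are names assumed here, not part of the original analysis.

```r
# Copy of the data with am recoded as in the report (0 = AT, 1 = MT).
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("AT", "MT"))

# Median and mean MPG for each transmission type.
mpg_median <- tapply(cars$mpg, cars$am, median)
mpg_mean   <- tapply(cars$mpg, cars$am, mean)

print(rbind(median = mpg_median, mean = mpg_mean))
```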

A Welch two-sample t-test was run to estimate the difference in mean MPG between automatic and manual transmission vehicles, together with its confidence interval. The null hypothesis is that the two groups have the same mean mpg; the alternative is that the means differ.

mpg.at <- mtcars[mtcars$am == "AT",]$mpg
mpg.mt <- mtcars[mtcars$am == "MT",]$mpg
t.test(mpg.at, mpg.mt) 
## 
##  Welch Two Sample t-test
## 
## data:  mpg.at and mpg.mt
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean of x mean of y 
##  17.14737  24.39231

The p-value is 0.001374, so we reject the null hypothesis of equal means and conclude that automatic transmission vehicles have lower mpg than manual transmission vehicles. This comparison assumes that all other characteristics of the automatic and manual cars are the same, which the regression analysis below examines.
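To see where the t statistic, degrees of freedom and p-value come from, the Welch test can be reproduced by hand. This is a sketch for illustration; the variable names (`at`, `mt`, `t_stat`, `df_welch`, `p_value`) are assumptions introduced here.

```r
cars <- datasets::mtcars
at <- cars$mpg[cars$am == 0]  # automatic transmission
mt <- cars$mpg[cars$am == 1]  # manual transmission

# Squared standard error of each group mean.
se2_at <- var(at) / length(at)
se2_mt <- var(mt) / length(mt)

# Welch t statistic: difference in means over its standard error.
t_stat <- (mean(at) - mean(mt)) / sqrt(se2_at + se2_mt)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (se2_at + se2_mt)^2 /
  (se2_at^2 / (length(at) - 1) + se2_mt^2 / (length(mt) - 1))

# Two-sided p-value, matching the "not equal to 0" alternative.
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

print(c(t = t_stat, df = df_welch, p = p_value))
```

These values should agree with the `t.test` output shown above.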

Regression Analysis

We fit a simple linear regression model to analyse this further.

fit <- lm(mpg~am, data = mtcars)
summary(fit)
## 
## Call:
## lm(formula = mpg ~ am, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.3923 -3.0923 -0.2974  3.2439  9.5077 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   17.147      1.125  15.247 1.13e-15 ***
## amMT           7.245      1.764   4.106 0.000285 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.902 on 30 degrees of freedom
## Multiple R-squared:  0.3598, Adjusted R-squared:  0.3385 
## F-statistic: 16.86 on 1 and 30 DF,  p-value: 0.000285

The null hypothesis of no difference is rejected (p-value = 0.000285), but the R-squared value is only 0.3598, so this model explains just 35.98% of the variance in mpg. We need to include other factors.
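With a single two-level factor as the predictor, the regression simply restates the group means: the intercept is the AT mean and the amMT coefficient is the MT mean minus the AT mean. A minimal sketch, using the assumed names `cars`, `fit_simple` and `group_means`:

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("AT", "MT"))

fit_simple  <- lm(mpg ~ am, data = cars)
group_means <- tapply(cars$mpg, cars$am, mean)

# Intercept = mean mpg of the reference level (AT).
intercept <- coef(fit_simple)[["(Intercept)"]]
# Slope = difference between the MT and AT means.
slope <- coef(fit_simple)[["amMT"]]

print(c(intercept = intercept, at_mean = group_means[["AT"]]))
print(c(slope = slope, mean_diff = group_means[["MT"]] - group_means[["AT"]]))

# Share of variance explained by transmission type alone.
r2 <- summary(fit_simple)$r.squared
print(r2)
```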

Next, a multiple regression model was fitted. The step function was used to select the best model, starting from a model containing all variables.

stepmodel <- step(lm(mpg ~ ., data = mtcars), trace = 0, steps = 10000)
summary(stepmodel)
## 
## Call:
## lm(formula = mpg ~ wt + qsec + am, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.4811 -1.5555 -0.7257  1.4110  4.6610 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   9.6178     6.9596   1.382 0.177915    
## wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## qsec          1.2259     0.2887   4.247 0.000216 ***
## amMT          2.9358     1.4109   2.081 0.046716 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336 
## F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11

The R-squared value is 0.85 (adjusted R-squared 0.83), so this model explains about 85% of the variation in mpg, a much better fit than the transmission-only model. In addition to transmission, the weight of the vehicle (wt) and its acceleration (qsec, the quarter-mile time) are the variables most strongly related to the variation in mpg. Note that the transmission effect is only marginally significant in this model (p = 0.047).
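Because the p-value for amMT is close to 0.05, a confidence interval gives a clearer picture of how precisely the transmission effect is estimated. The following is a sketch under the assumption that the selected model is `mpg ~ wt + qsec + am`, as in the output above; `cars`, `fit3` and `ci_am` are illustrative names.

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("AT", "MT"))

# Refit the model chosen by step().
fit3 <- lm(mpg ~ wt + qsec + am, data = cars)

# 95% confidence interval for the manual-vs-automatic difference,
# holding wt and qsec fixed.
ci_am <- confint(fit3, "amMT", level = 0.95)
print(ci_am)

# R-squared and adjusted R-squared side by side.
fit_summary <- summary(fit3)
print(c(r2 = fit_summary$r.squared, adj_r2 = fit_summary$adj.r.squared))
```

The interval lies above zero but its lower end is very close to it, so the size of the manual advantage is uncertain even though its sign is supported.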

We then refit the model with the three selected variables (wt, qsec and am) and compared it with the simple model.

bestfit <- lm(mpg~am + wt + qsec, data = mtcars)
anova(fit, bestfit)
## Analysis of Variance Table
## 
## Model 1: mpg ~ am
## Model 2: mpg ~ am + wt + qsec
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)    
## 1     30 720.90                                 
## 2     28 169.29  2    551.61 45.618 1.55e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This model captures about 85% of the overall variation in mpg. With a p-value of 1.55e-09, we reject the null hypothesis that the added variables contribute nothing, and conclude that the three-variable model fits significantly better than the simple linear regression model.
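The F statistic in the ANOVA table compares the drop in residual sum of squares with the residual variance of the larger model. A minimal sketch of that calculation, with `fit_small`, `fit_big` and the other names assumed here for illustration:

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("AT", "MT"))

fit_small <- lm(mpg ~ am, data = cars)
fit_big   <- lm(mpg ~ am + wt + qsec, data = cars)

# Residual sums of squares of the two nested models.
rss_small <- sum(resid(fit_small)^2)
rss_big   <- sum(resid(fit_big)^2)

# Numerator df = number of added terms; denominator df = residual df
# of the larger model.
df_num <- df.residual(fit_small) - df.residual(fit_big)
df_den <- df.residual(fit_big)

# F = (reduction in RSS per added term) / (residual variance of big model).
f_stat  <- ((rss_small - rss_big) / df_num) / (rss_big / df_den)
p_value <- pf(f_stat, df_num, df_den, lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```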

summary(bestfit)
## 
## Call:
## lm(formula = mpg ~ am + wt + qsec, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.4811 -1.5555 -0.7257  1.4110  4.6610 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   9.6178     6.9596   1.382 0.177915    
## amMT          2.9358     1.4109   2.081 0.046716 *  
## wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## qsec          1.2259     0.2887   4.247 0.000216 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336 
## F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11

Conclusion

This model explains about 85% of the variance in miles per gallon (mpg). It also estimates that, for cars of the same weight and quarter-mile time, manual transmission vehicles get 2.94 mpg more than automatic transmission vehicles.

Thus, we can conclude that manual transmission is better for mpg.
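One way to read the 2.94 mpg figure is as the predicted gap between two otherwise identical cars. The sketch below is an illustration, not part of the original analysis: it predicts mpg for two hypothetical cars of average weight and average quarter-mile time that differ only in transmission. The names `cars`, `fit3`, `new_cars` and `pred` are assumptions.

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("AT", "MT"))

fit3 <- lm(mpg ~ am + wt + qsec, data = cars)

# Two hypothetical cars: same weight and quarter-mile time,
# differing only in transmission type.
new_cars <- data.frame(
  am   = factor(c("AT", "MT"), levels = levels(cars$am)),
  wt   = mean(cars$wt),
  qsec = mean(cars$qsec)
)

pred <- predict(fit3, newdata = new_cars)
print(pred)

# The gap between the two predictions equals the amMT coefficient.
print(unname(diff(pred)))
```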