Executive Summary

We explore the mtcars dataset to examine the effect of transmission type on miles per gallon (MPG). A t-test and a fitted linear model both indicate that manual transmissions are associated with higher MPG. After adjusting for weight and quarter-mile time, the estimated advantage of a manual transmission is about 2.9 MPG.

Problem Statement

In this project, we explore the mtcars dataset to examine two questions:

  1. Is an automatic or manual transmission better for MPG?
  2. How large is the MPG difference between automatic and manual transmissions?

Exploratory Analysis

Variables

  • mpg: Miles/(US) gallon
  • cyl: Number of cylinders
  • disp: Displacement (cu.in.)
  • hp: Gross horsepower
  • drat: Rear axle ratio
  • wt: Weight (1000 lbs)
  • qsec: 1/4 mile time (seconds)
  • vs: Engine shape (0 = V-shaped, 1 = straight)
  • am: Transmission (0 = automatic, 1 = manual)
  • gear: Number of forward gears
  • carb: Number of carburetors

For subsequent exploration and modelling, we transform the categorical variables into factors.

# Transform categorical variables into factors
mtcars$cyl  <- factor(mtcars$cyl)
mtcars$vs   <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)
mtcars$carb <- factor(mtcars$carb)
mtcars$am   <- factor(mtcars$am,labels=c("Automatic","Manual"))

Effect of transmission on MPG

Visually, cars with manual transmissions appear to achieve higher MPG than cars with automatic transmissions.

boxplot(mpg ~ am, data = mtcars)

A Welch two-sample t-test indicates that the difference is statistically significant (p = 0.0014).

automatic <- mtcars$mpg[mtcars$am == "Automatic"]
manual <- mtcars$mpg[mtcars$am == "Manual"]
t.test(automatic,manual)
## 
##  Welch Two Sample t-test
## 
## data:  automatic and manual
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean of x mean of y 
##  17.14737  24.39231
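The same comparison can also be written with the formula interface of t.test, which avoids building the two vectors by hand. The sketch below is illustrative only: it works on a fresh copy of the data (the name `cars` is an assumption, not part of the original analysis), where am is still coded 0 = automatic, 1 = manual.

```r
# Fresh copy of the data so the session's own mtcars is left untouched.
cars <- datasets::mtcars

# Welch two-sample t-test of mpg by transmission (am: 0 = automatic, 1 = manual).
welch <- t.test(mpg ~ am, data = cars)

# Pull out the pieces usually reported: p-value and 95% confidence interval.
p_value <- welch$p.value
ci_95   <- as.numeric(welch$conf.int)

print(signif(p_value, 4))
print(round(ci_95, 2))
```

Keeping the result in an object makes it easy to reuse the p-value and the interval in the text instead of copying them from the console output.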

To quantify the difference, we calculate means of mpg per group:

aggregate(mpg~am, data = mtcars, mean)
##          am      mpg
## 1 Automatic 17.14737
## 2    Manual 24.39231
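The gap between the two group means can be computed directly rather than read off the table. This is a minimal sketch on a fresh copy of the data. The names `cars`, `group_means`, and `mpg_gap` are illustrative, and am is assumed to be in its original 0/1 coding.

```r
# Fresh copy of the data so the session's own mtcars is left untouched.
cars <- datasets::mtcars

# Mean mpg per transmission type; names are "0" (automatic) and "1" (manual).
group_means <- tapply(cars$mpg, cars$am, mean)

# Unadjusted difference: manual minus automatic.
mpg_gap <- unname(group_means["1"] - group_means["0"])
print(round(mpg_gap, 2))
```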

On average, manual cars get about 7.24 MPG more than automatic cars. This comparison, however, does not control for other variables. To isolate the effect of transmission type, we fit a linear model.

Regression modelling

Candidate Models

We fit three candidate models:

  • Simple model with one predictor
  • Model with all variables as predictors
  • Model with predictors selected by the step function

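# Note: data(mtcars) reloads the original dataset, so the factor conversions
# above are discarded and every predictor below is treated as numeric.
data(mtcars)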
simple_fit <- lm(mpg ~ am, data = mtcars)
init_fit <- lm(mpg ~ ., data = mtcars)
best_fit <- step(init_fit, direction = "both", trace = FALSE)

We compare the three models using anova.

anova(simple_fit,best_fit,init_fit)
## Analysis of Variance Table
## 
## Model 1: mpg ~ am
## Model 2: mpg ~ wt + qsec + am
## Model 3: mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
##   Res.Df    RSS Df Sum of Sq       F    Pr(>F)    
## 1     30 720.90                                   
## 2     28 169.29  2    551.61 39.2687 8.025e-08 ***
## 3     21 147.49  7     21.79  0.4432    0.8636    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

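The model selected by the step function fits significantly better than the simple model (p = 8.0e-08), while the full model does not improve on it significantly (p = 0.86). We therefore keep the stepwise model.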
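As a complementary check on the anova comparison, the three candidates can also be ranked by adjusted R-squared and AIC (lower AIC is better). This is a sketch, not part of the original analysis: the stepwise formula is copied from the anova table above, and the names `cars`, `fits`, and `comparison` are illustrative.

```r
# Fresh copy of the data so the session's own mtcars is left untouched.
cars <- datasets::mtcars

# Refit the three candidates; the stepwise formula is taken from the anova table.
fits <- list(
  simple   = lm(mpg ~ am, data = cars),
  stepwise = lm(mpg ~ wt + qsec + am, data = cars),
  full     = lm(mpg ~ ., data = cars)
)

# One row per model: adjusted R-squared (higher is better) and AIC (lower is better).
comparison <- data.frame(
  model  = names(fits),
  adj_r2 = sapply(fits, function(f) summary(f)$adj.r.squared),
  aic    = sapply(fits, AIC)
)
print(comparison, row.names = FALSE)
```

Both criteria penalise extra predictors, so they guard against preferring the full model only because it has more terms.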

Description of the best model

summary(best_fit)
## 
## Call:
## lm(formula = mpg ~ wt + qsec + am, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.4811 -1.5555 -0.7257  1.4110  4.6610 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   9.6178     6.9596   1.382 0.177915    
## wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## qsec          1.2259     0.2887   4.247 0.000216 ***
## am            2.9358     1.4109   2.081 0.046716 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336 
## F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11

The model reads as follows:

  • Cars with manual transmissions get about 2.94 more MPG than automatics (p = 0.047). This is adjusted for the weight of the vehicle and its quarter-mile time.

  • MPG decreases with the weight of the car, by about 3.92 for every 1000 lb increase.

  • MPG increases by about 1.23 for every additional second of quarter-mile time.

  • The model explains about 85% of the variance in MPG (adjusted R-squared 0.83).
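To see how precisely the transmission effect is estimated, the coefficient can be reported together with a confidence interval. The sketch below refits the model shown in the summary output above (mpg ~ wt + qsec + am) on a fresh copy of the data. The names `cars`, `fit`, `am_effect`, and `am_ci` are illustrative.

```r
# Fresh copy of the data so the session's own mtcars is left untouched.
cars <- datasets::mtcars

# Same formula as in the summary output above.
fit <- lm(mpg ~ wt + qsec + am, data = cars)

# Point estimate and 95% confidence interval for the transmission coefficient.
am_effect <- unname(coef(fit)["am"])
am_ci     <- confint(fit, "am", level = 0.95)

print(round(am_effect, 2))
print(round(am_ci, 2))
```

Because the p-value for am is just below 0.05, the 95% interval excludes zero only narrowly, so the size of the adjusted effect should be read with some caution.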

Model Diagnostics

par(mfrow = c(2,2))
plot(best_fit)
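The four diagnostic plots can be supplemented with a few numeric checks. The sketch below is one possible set, not the original author's procedure: a Shapiro-Wilk test on the residuals, leverage from hatvalues, and influence from cooks.distance. The cutoff of twice the mean leverage is a common rule of thumb, assumed here only for illustration, and the names `cars`, `fit`, and the result variables are hypothetical.

```r
# Fresh copy of the data so the session's own mtcars is left untouched.
cars <- datasets::mtcars
fit  <- lm(mpg ~ wt + qsec + am, data = cars)

# Normality of residuals (complements the Q-Q plot).
normality <- shapiro.test(residuals(fit))

# Leverage and influence per car (complements the residuals-vs-leverage plot).
leverage  <- hatvalues(fit)
influence <- cooks.distance(fit)

# Rule-of-thumb flag (an assumption): leverage above twice the mean leverage.
high_leverage <- names(leverage)[leverage > 2 * mean(leverage)]

# The three most influential cars by Cook's distance.
top_influence <- head(sort(influence, decreasing = TRUE), 3)

print(normality$p.value)
print(high_leverage)
print(round(top_influence, 3))
```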