Regression Study of Manual and Automatic Transmission Efficiency in MPG

Executive Summary

This study investigates a question raised by Motor Trend magazine: is there a relationship between transmission type and MPG? We use the mtcars data set in R and apply hypothesis testing and regression modelling to explore that relationship. Specifically, we answer two questions: whether transmission type is related to MPG, and how large the MPG difference between automatic and manual transmissions is.

Furthermore, in the linear model that separates the two transmission types, manual cars had a mean of 24.39 MPG while automatic cars had a mean of 17.14 MPG. Though these numbers are subject to the same omitted-variable bias as the earlier linear model, they at least support the notion that manual transmissions are more efficient than automatic ones in terms of MPG.

Load Data

Load the dataset and convert categorical variables to factors.

library(ggplot2)
data(mtcars)
head(mtcars, n=3)
dim(mtcars)
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$vs <- as.factor(mtcars$vs)
mtcars$am <- factor(mtcars$am)
mtcars$gear <- factor(mtcars$gear)
mtcars$carb <- factor(mtcars$carb)
attach(mtcars)

The Data set

The data set was extracted from the 1974 Motor Trend US magazine and covers 1973-74 models. It consists of 32 observations on 11 variables:

  • mpg: Miles per US gallon
  • cyl: Number of cylinders
  • disp: Displacement (cubic inches)
  • hp: Gross horsepower
  • drat: Rear axle ratio
  • wt: Weight (1000 lbs)
  • qsec: 1/4 mile time (seconds)
  • vs: Engine shape (0 = V-shaped, 1 = straight)
  • am: Transmission (0 = automatic, 1 = manual)
  • gear: Number of forward gears
  • carb: Number of carburetors

Exploratory Analysis

Appendix Figure I is an exploratory box plot comparing MPG for automatic and manual transmissions. The plot suggests that vehicles with a manual transmission have noticeably higher MPG than those with an automatic transmission.
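
The visual comparison can be backed by per-group summary statistics. The following is a minimal sketch, not part of the original analysis; the names `cars`, `mpgByTransmission`, `groupMeans` and `groupSizes` are illustrative, and the transmission labels are added here only for readability.

```r
# `cars` is a local copy so the report's own `mtcars` object is untouched.
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Five-number summary plus mean of MPG for each transmission type.
mpgByTransmission <- tapply(cars$mpg, cars$am, summary)
mpgByTransmission

# Group means and group sizes on their own.
groupMeans <- tapply(cars$mpg, cars$am, mean)
groupSizes <- table(cars$am)
groupMeans
groupSizes
```

Unequal group sizes are worth keeping in mind when reading the box plot, since the two boxes summarize different numbers of cars.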

Statistical Inference

T-test of MPG by transmission type

testResults <- t.test(mpg ~ am)
testResults$p.value
## [1] 0.001373638

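With a p-value of about 0.0014, the t-test (Welch's two-sample test, the default in t.test) rejects at the 0.05 level the null hypothesis that the difference in mean MPG between transmission types is 0.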

testResults$estimate
## mean in group 0 mean in group 1 
##        17.14737        24.39231

The estimated difference between the two transmission types is 24.39231 - 17.14737 = 7.24494 MPG in favor of manual.
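
The difference and its uncertainty can be read directly from the test object. This is a sketch under the assumption that the test is rerun with an explicit `data` argument rather than relying on `attach`; `cars`, `welch`, `mpgDifference` and `confInterval` are illustrative names.

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am)

# Welch two-sample t-test of MPG by transmission type.
welch <- t.test(mpg ~ am, data = cars)

# Manual minus automatic, computed from the two group means.
mpgDifference <- unname(welch$estimate[2] - welch$estimate[1])
mpgDifference

# t.test() reports the interval for (group 0 - group 1), i.e. automatic minus
# manual, so an interval lying entirely below zero favors manual.
confInterval <- welch$conf.int
confInterval
```

Reporting the confidence interval alongside the point estimate shows how imprecise the raw difference is with only 32 cars.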

Regression Analysis

Fit the full model of the data

fullModelFit <- lm(mpg ~ ., data = mtcars)
summary(fullModelFit)  # results hidden
summary(fullModelFit)$coeff  # results hidden

Since none of the coefficients has a p-value below 0.05, the full model does not tell us which variables matter most for MPG.
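
One way to check this claim without reading the whole summary is to filter the coefficient table by p-value. The sketch below refits the full model on a local copy of the data; `cars`, `fullFit`, `coefTable` and `significant` are illustrative names, not objects from the original report.

```r
cars <- datasets::mtcars

# Same factor conversions as in the Load Data step.
for (v in c("cyl", "vs", "am", "gear", "carb")) {
  cars[[v]] <- factor(cars[[v]])
}

fullFit <- lm(mpg ~ ., data = cars)
coefTable <- summary(fullFit)$coefficients

# Keep only the rows whose p-value is below 0.05 (drop = FALSE keeps a matrix
# even when zero or one row remains).
significant <- coefTable[coefTable[, "Pr(>|t|)"] < 0.05, , drop = FALSE]
nrow(significant)

# Number of estimated coefficients versus residual degrees of freedom.
nrow(coefTable)
fullFit$df.residual
```

With this many coefficients estimated from 32 observations, few residual degrees of freedom remain, which is one plausible reason the individual p-values are large.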

Backward selection to determine which variables are most statistically significant

stepFit <- step(fullModelFit)
summary(stepFit) # results hidden
summary(stepFit)$coeff # results hidden

The new model has four predictors (cylinders, horsepower, weight, transmission). Its R-squared of 0.8659 means that the model explains about 87% of the variance in MPG. Not every coefficient is individually significant at the 0.05 level, however: the p-values in the summary output should be checked term by term, and in this fit the 8-cylinder and transmission terms fall short of that threshold. Holding the other variables constant, the coefficients indicate that moving from 4 to 6 cylinders decreases MPG by 3.03, while moving from 4 to 8 cylinders decreases MPG by 2.16 (both cylinder effects are measured against the 4-cylinder baseline, not against each other). Each additional 100 horsepower decreases MPG by 3.21, and each additional 1000 lbs of weight decreases MPG by 2.5. A manual transmission improves MPG by 1.81.
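
The coefficient interpretation can be reproduced with a short sketch. It assumes the four-predictor model is fitted directly rather than through `step`, and it uses illustrative names (`cars`, `reducedFit`, `baselineCyl`, `hpPer100`) that do not appear in the original report.

```r
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am)

# Fit the four-predictor model directly instead of rerunning step().
reducedFit <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# The first factor level is the baseline that cyl6 and cyl8 are compared to.
baselineCyl <- levels(cars$cyl)[1]
baselineCyl

# Estimates, standard errors, t values and p-values for every term.
coefTable <- summary(reducedFit)$coefficients
round(coefTable, 4)

rSquared <- summary(reducedFit)$r.squared
rSquared

# Effect of 100 extra horsepower, other predictors held fixed.
hpPer100 <- 100 * coef(reducedFit)[["hp"]]
hpPer100
```

Looking at the p-value column of `coefTable` term by term is what separates the predictors that are individually significant from those that are not.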

Residuals & Diagnostics

For the residual plots, see Appendix Figure II.

The plots suggest:

  1. The random scatter in the Residuals vs. Fitted plot supports the assumption of independence
  2. The points of the Normal Q-Q plot follow the line closely, which suggests that the residuals are approximately normally distributed
  3. The random scatter in the Scale-Location plot supports the constant variance assumption
  4. Since all points fall within the 0.5 Cook's distance lines, the Residuals vs. Leverage plot shows no unduly influential points

As a further check, count the dfbetas values that exceed 1 in absolute value. A count of 0 means that no single observation shifts any coefficient by more than one standard error:

sum((abs(dfbetas(stepFit)))>1)
## [1] 0
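
The leverage and influence reading can also be checked numerically. The sketch below is an addition to the original diagnostics; it refits the four-predictor model on a local copy, and `cars`, `fit`, `cooks`, `leverage` and `largeDfbetas` are illustrative names.

```r
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am)

fit <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# Cook's distance per observation: the quantity drawn as contour lines in the
# Residuals vs. Leverage plot.
cooks <- cooks.distance(fit)
head(sort(cooks, decreasing = TRUE), 3)

# Hat values (leverage); they sum to the number of estimated coefficients.
leverage <- hatvalues(fit)
head(sort(leverage, decreasing = TRUE), 3)

# Same dfbetas count as in the report, on the refitted model.
largeDfbetas <- sum(abs(dfbetas(fit)) > 1)
largeDfbetas
```

Sorting these measures makes it easy to name the specific cars that would be worth a second look if any value were large.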

Conclusion

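There is a difference in MPG based on transmission type: manual cars average about 7.24 MPG more than automatic cars. Once weight, horsepower and number of cylinders are taken into account, the estimated manual advantage shrinks to a slight boost of about 1.81 MPG, and weight, horsepower and number of cylinders appear to be more important than transmission type in determining MPG.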

Appendix Figures

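Figure I: Box plot of MPG by transmission type (automatic vs. manual)

Figure II: Residual diagnostic plots for the stepwise model (Residuals vs. Fitted, Normal Q-Q, Scale-Location, Residuals vs. Leverage)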
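
The two appendix figures could be regenerated along the following lines. This is a sketch that assumes base R graphics; the report loads ggplot2, so the original figures may have been drawn differently. The names `cars`, `boxStats`, `fit` and `oldPar`, as well as the axis labels and title, are illustrative.

```r
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Figure I: box plot of MPG by transmission type.
boxStats <- boxplot(mpg ~ am, data = cars,
                    xlab = "Transmission", ylab = "Miles per gallon",
                    main = "MPG by transmission type")

# Figure II: the four standard lm diagnostic plots in a 2 x 2 grid.
fit <- lm(mpg ~ cyl + hp + wt + am, data = cars)
oldPar <- par(mfrow = c(2, 2))
plot(fit)
par(oldPar)  # restore the previous plotting layout
```

`boxplot` returns the statistics it draws, so `boxStats$stats` can be inspected if the exact quartiles are needed.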