Research Question 1: Is an automatic or manual transmission better for MPG?
Research Question 2: Quantify the MPG difference between automatic and manual transmissions
Among the 1974 cars in the data, manual transmissions give better mpg than automatic ones. Under the best-fitting model, with the other predictors held constant, a manual car is expected to get about 2.94 mpg more than an automatic car.
To tackle the second question, we fit a parsimonious model by backward elimination: we first constructed a full model, then removed the predictor with the highest p-value and refit the model. We repeated this step until every remaining predictor had a p-value below the critical value of 0.05.
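The elimination loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used in this report: it works on a copy of the data named `cars` (a hypothetical name), and it uses `drop1()` with an F-test so that a multi-level factor such as `carb` is dropped as a whole term rather than judged by its individual level coefficients. Because of that choice, the model it selects may differ from the one reported here.

```r
# Sketch: backward elimination by p-value, one whole term at a time.
# `cars` is a local copy so the mtcars object used elsewhere is untouched.
cars <- datasets::mtcars
for (v in c("cyl", "vs", "am", "gear", "carb")) {
  cars[[v]] <- factor(cars[[v]])   # treat categorical columns as factors
}

alpha <- 0.05                      # critical value from the text
fit <- lm(mpg ~ ., data = cars)    # start from the full model

repeat {
  # Stop if only the intercept is left.
  if (length(attr(terms(fit), "term.labels")) == 0) break

  # F-test for dropping each term; the first row is "<none>", so skip it.
  tests <- drop1(fit, test = "F")
  pvals <- tests[["Pr(>F)"]][-1]
  names(pvals) <- rownames(tests)[-1]

  # Stop once every remaining term is significant at alpha.
  if (max(pvals, na.rm = TRUE) < alpha) break

  # Otherwise drop the term with the largest p-value and refit.
  worst <- names(pvals)[which.max(pvals)]
  fit <- update(fit, as.formula(paste(". ~ . -", worst)))
}

formula(fit)                       # terms that survived elimination
```

Printing `summary(fit)$coef` afterwards shows the surviving coefficients in the same layout as the tables in this report.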
data(mtcars)
library(ggplot2)
library(dplyr)
2. Exploratory analysis
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
summary(mtcars$mpg)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 10.40 15.42 19.20 20.09 22.80 33.90
mtcars$am <- as.factor(mtcars$am)
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$vs <- as.factor(mtcars$vs)
mtcars$gear <- as.factor(mtcars$gear)
mtcars$carb <- as.factor(mtcars$carb)
# look for correlation
pairs(mtcars, panel = panel.smooth)
# mpg difference between manual and auto transmission
diff(tapply(mtcars$mpg, mtcars$am, mean))
## 1
## 7.244939
boxplot(mpg ~ am, data = mtcars, main = "auto vs manual on mpg", xlab = "transmission (0: auto, 1: manual)", ylab = "mpg")
The pairs plot suggests that am is correlated with vs, gear, and carb. The boxplot shows that manual cars have higher mpg than automatic cars.
Conclusion for Question 1: Among the 32 cars in the data set (1973 to 1974 models), manual transmissions are generally associated with better mpg than automatic transmissions; the unadjusted difference in mean mpg is about 7.24.
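One way to check whether this unadjusted difference is larger than chance alone would explain is a Welch two-sample t-test. The test is not part of the original analysis; it is added here as a hedged sketch. The names `cars` and `tt` are hypothetical and chosen only for illustration.

```r
# Sketch: Welch two-sample t-test of mpg by transmission type.
# am is coded 0 = automatic, 1 = manual in mtcars.
cars <- datasets::mtcars
tt <- t.test(mpg ~ am, data = cars)

tt$estimate                              # mean mpg in each group
mean_diff <- unname(diff(tt$estimate))   # manual minus automatic
mean_diff
tt$p.value                               # two-sided p-value
tt$conf.int                              # 95% CI for (automatic - manual)
```

Note that `t.test()` reports the confidence interval for group 0 minus group 1, so its sign is the opposite of `mean_diff`.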
# Backward Elimination
mdl_whole <- lm(mpg ~., data = mtcars)
summary(mdl_whole)$coef
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.87913244 20.06582026 1.19004018 0.25252548
## cyl6 -2.64869528 3.04089041 -0.87102622 0.39746642
## cyl8 -0.33616298 7.15953951 -0.04695316 0.96317000
## disp 0.03554632 0.03189920 1.11433290 0.28267339
## hp -0.07050683 0.03942556 -1.78835344 0.09393155
## drat 1.18283018 2.48348458 0.47627845 0.64073922
## wt -4.52977584 2.53874584 -1.78425732 0.09461859
## qsec 0.36784482 0.93539569 0.39325050 0.69966720
## vs1 1.93085054 2.87125777 0.67247551 0.51150791
## am1 1.21211570 3.21354514 0.37718957 0.71131573
## gear4 1.11435494 3.79951726 0.29328856 0.77332027
## gear5 2.52839599 3.73635801 0.67670068 0.50889747
## carb2 -0.97935432 2.31797446 -0.42250436 0.67865093
## carb3 2.99963875 4.29354611 0.69863900 0.49546781
## carb4 1.09142288 4.44961992 0.24528452 0.80956031
## carb6 4.47756921 6.38406242 0.70136677 0.49381268
## carb8 7.25041126 8.36056638 0.86721532 0.39948495
# Final model: drop the non-significant predictors, keeping wt, qsec and am
mdl <- lm(mpg ~ . - cyl - carb - gear - vs - drat - disp - hp, data = mtcars)
summary(mdl)$coef
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.617781 6.9595930 1.381946 1.779152e-01
## wt -3.916504 0.7112016 -5.506882 6.952711e-06
## qsec 1.225886 0.2886696 4.246676 2.161737e-04
## am1 2.935837 1.4109045 2.080819 4.671551e-02
After backward elimination, the most parsimonious model is mpg ~ wt + qsec + am. All three remaining predictors have p-values below 0.05.
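Written out with the coefficients from the table above (rounded to three decimals), the fitted model is shown below. The indicator notation for am is an assumption made here for readability: it equals 1 for a manual car and 0 for an automatic one.

```latex
\widehat{\mathrm{mpg}}
  = 9.618 \;-\; 3.917\,\mathrm{wt} \;+\; 1.226\,\mathrm{qsec} \;+\; 2.936\,\mathrm{am},
\qquad
\mathrm{am} =
\begin{cases}
  1 & \text{manual} \\
  0 & \text{automatic}
\end{cases}

% Difference between a manual and an automatic car with the same wt and qsec:
\widehat{\mathrm{mpg}}\big|_{\mathrm{am}=1} - \widehat{\mathrm{mpg}}\big|_{\mathrm{am}=0} = 2.936
```

The wt and qsec terms cancel in the difference, which is why the am1 coefficient can be read directly as the adjusted mpg gap between the two transmission types.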
res <- resid(mdl)
qqnorm(res)
qqline(res)
hist(res, breaks = 5)
The residuals of the fitted model appear to be skewed to the right. Under this model, with weight (wt) and quarter-mile time (qsec) held constant, a manual car is expected to get 2.9358372 mpg more than an automatic car among the 1974 cars in the data.
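The uncertainty around this estimate and the impression of skewed residuals can both be checked numerically. The sketch below refits the same three-predictor model on a copy of the data and reports a 95% confidence interval for the am1 coefficient together with a Shapiro-Wilk normality test. Neither step is part of the original analysis, and the names `cars`, `fit_final`, `am_est` and `am_ci` are hypothetical.

```r
# Sketch: interval estimate for the transmission effect, plus a normality check.
cars <- datasets::mtcars
cars$am <- factor(cars$am)                    # 0 = automatic, 1 = manual

fit_final <- lm(mpg ~ wt + qsec + am, data = cars)

am_est <- coef(fit_final)[["am1"]]            # adjusted manual-vs-automatic gap
am_ci  <- confint(fit_final, "am1", level = 0.95)
am_est
am_ci

# Shapiro-Wilk test: a small p-value would indicate non-normal residuals.
shapiro.test(resid(fit_final))
```

An interval whose lower bound sits close to zero would suggest that the adjusted benefit of a manual transmission is estimated with considerable uncertainty in a sample of only 32 cars.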