Executive Summary

This project analyzes the Motor Trend Car Road Tests data (mtcars) in the R datasets package to investigate whether automatic or manual transmissions are better for MPG. It also quantifies the MPG difference between the two transmission types.

Summarizing Data

As the Summary of Data section in the Appendix shows, this project uses a data frame with 32 observations of 11 variables.

Exploratory Data Analysis

As Figure-1 in the Appendix shows, cars with automatic transmissions appear to have lower MPG than cars with manual transmissions.

Building Regression Models

Observing Simple Linear Regression Model

Now, fit a simple linear regression with MPG as the dependent variable and transmission type as the only independent variable, and observe how well it fits.

summary(lm(mpg ~ am, data=mtcars))$adj.r.squared
## [1] 0.3384589

The adjusted R-squared is 0.338, so this model explains only 33.8% of the variance in MPG. Therefore, this project uses a multivariable linear regression.
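Because am is coded 0 (automatic) and 1 (manual), the slope of this single-predictor model is just the difference between the two group means. The following is a minimal sketch of that check, assuming the standard mtcars data frame; the names fit_simple, group_means, mean_diff, slope_am, and adj_r2 are illustrative and not part of the original analysis.

```r
# Fit the simple model from the text: MPG explained by transmission type alone.
fit_simple <- lm(mpg ~ am, data = mtcars)

# Mean MPG per transmission type (am = 0 is automatic, am = 1 is manual).
group_means <- tapply(mtcars$mpg, mtcars$am, mean)

# Difference in group means (manual minus automatic).
mean_diff <- unname(group_means["1"] - group_means["0"])

# The slope on am should match the difference in group means.
slope_am <- unname(coef(fit_simple)["am"])
print(c(slope = slope_am, mean_diff = mean_diff))

# Adjusted R-squared: share of MPG variance explained, penalized for model size.
adj_r2 <- summary(fit_simple)$adj.r.squared
print(adj_r2)
```

A low adjusted R-squared here says that transmission type alone leaves most of the variation in MPG unexplained, which is the motivation for adding more variables.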

Observing Variables Correlation with MPG

From the results in the Variable Correlation with MPG section in the Appendix, the variables wt, cyl, disp, and hp are strongly correlated with mpg (correlations between -0.776 and -0.868). These variables are used to build the multivariable linear regression in the following sections.
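The variable selection described above can be sketched as a simple filter on the correlation matrix. The cutoff of 0.75 is an assumption chosen for illustration (the source does not state a threshold), and the names cor_with_mpg, cutoff, and strong_vars are hypothetical.

```r
# Correlation of every variable with mpg, dropping mpg's correlation with itself.
cor_with_mpg <- cor(mtcars)["mpg", ]
cor_with_mpg <- cor_with_mpg[names(cor_with_mpg) != "mpg"]

# Assumed cutoff for a "strong" correlation; not taken from the original report.
cutoff <- 0.75

# Keep the variables whose absolute correlation with mpg exceeds the cutoff.
strong_vars <- names(cor_with_mpg)[abs(cor_with_mpg) > cutoff]
print(sort(round(cor_with_mpg[strong_vars], 3)))
```

With this cutoff the filter keeps wt, cyl, disp, and hp. A different cutoff would change the selection, so the threshold should be treated as a judgment call.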

Investigating Variables

From the results in the Model Comparison section in the Appendix, wt, hp, cyl, and disp confound the relationship between mpg and am: once they are included in the model, the am coefficient is no longer statistically significant (p = 0.29). The adjusted R-squared is 0.827, so this model explains 82.7% of the variance in MPG.
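One way to check whether the extra variables improve on the simple model is to compare the two nested fits directly. The sketch below uses a partial F-test via anova(), which is a swapped-in technique and not something the original report runs; the names fit_simple, fit_multi, adj_r2, f_test, and p_value are illustrative.

```r
# Simple model: transmission type only.
fit_simple <- lm(mpg ~ am, data = mtcars)

# Multivariable model from the Model Comparison section.
fit_multi <- lm(mpg ~ am + wt + cyl + disp + hp, data = mtcars)

# Adjusted R-squared of both models, side by side.
adj_r2 <- c(simple = summary(fit_simple)$adj.r.squared,
            multi  = summary(fit_multi)$adj.r.squared)
print(round(adj_r2, 3))

# Partial F-test: do wt, cyl, disp, and hp jointly improve the fit?
f_test <- anova(fit_simple, fit_multi)
print(f_test)

# P-value of the comparison (second row of the ANOVA table).
p_value <- f_test[["Pr(>F)"]][2]
```

A small p-value suggests the added variables jointly explain variation in MPG that transmission type alone does not.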

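Conclusion

In this sample, manual transmissions are associated with higher MPG than automatic transmissions. As the MPG Difference section in the Appendix shows, the simple linear regression estimates that manual cars get 7.24 MPG more than automatic cars. After adjusting for wt, cyl, disp, and hp, the estimated difference shrinks to 1.56 MPG and is not statistically significant (p = 0.29), so most of the unadjusted difference cannot be attributed to transmission type alone. The Residual Plot section in the Appendix shows the diagnostic plots for the multivariable model.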

Appendix

Summary of Data

data(mtcars)
head(mtcars)
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

Exploratory Data Analysis

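library(ggplot2)
# Keep mpg and add transmission as a labeled factor (am: 0 = automatic, 1 = manual).
mtcars_mp_transmission <- mtcars["mpg"]
mtcars_mp_transmission$transmission <- factor(mtcars$am, labels = c('automatic', 'manual'))
# Figure-1: box plot of MPG by transmission type.
g <- ggplot(mtcars_mp_transmission, aes(x = transmission, y = mpg, colour = transmission))
g <- g + geom_boxplot()
g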

Median for MPG of Automatic and Manual Transmission Cars

aggregate(mpg ~ transmission, data = mtcars_mp_transmission, median)

Variable Correlation with MPG

library(car)
## Loading required package: carData
sort(vif(lm(mpg ~ . , data = mtcars)), decreasing = TRUE)
##      disp       cyl        wt        hp      carb      qsec      gear 
## 21.620241 15.373833 15.164887  9.832037  7.908747  7.527958  5.357452 
##        vs        am      drat 
##  4.965873  4.648487  3.374620
sort(round(cor(mtcars), 3)["mpg",])
##     wt    cyl   disp     hp   carb   qsec   gear     am     vs   drat 
## -0.868 -0.852 -0.848 -0.776 -0.551  0.419  0.480  0.600  0.664  0.681 
##    mpg 
##  1.000

Model Comparison

lm_summary <- summary(lm(mpg ~ am + wt + cyl + disp + hp, data = mtcars))
sort(lm_summary$coefficients[-1,4])
##          wt          hp         cyl          am        disp 
## 0.007256888 0.055096587 0.113932156 0.289843011 0.304719404
lm_summary$adj.r.squared
## [1] 0.8272816
lm_summary <- summary(lm(mpg ~ am + wt + cyl + hp, data = mtcars))
sort(lm_summary$coefficients[-1,4])
##          wt          hp         cyl          am 
## 0.008603218 0.078553374 0.211916611 0.314179886
lm_summary$adj.r.squared
## [1] 0.8266657

MPG Difference Between Automatic and Manual Transmissions

summary(lm(mpg ~ am, data = mtcars))$coefficients["am", "Estimate"]
## [1] 7.244939
summary(lm(mpg ~ am + wt + cyl + disp + hp, data = mtcars))$coefficients["am", "Estimate"]
## [1] 1.556492
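
A point estimate alone does not show how uncertain the adjusted difference is. As a hedged sketch (not part of the original analysis), a 95% confidence interval for the am coefficient can be obtained with confint(); the names fit_multi, am_estimate, am_ci, and ci_contains_zero are illustrative.

```r
# Multivariable model used for the adjusted MPG difference.
fit_multi <- lm(mpg ~ am + wt + cyl + disp + hp, data = mtcars)

# Point estimate of the manual-vs-automatic difference, holding the other variables fixed.
am_estimate <- unname(coef(fit_multi)["am"])
print(am_estimate)

# 95% confidence interval for the am coefficient (1 x 2 matrix: lower, upper).
am_ci <- confint(fit_multi, "am", level = 0.95)
print(am_ci)

# TRUE when the interval spans zero, i.e. the data are compatible with no difference.
ci_contains_zero <- am_ci[1, 1] < 0 && am_ci[1, 2] > 0
print(ci_contains_zero)
```

If the interval spans zero, the adjusted difference between transmission types cannot be distinguished from no difference at the 5% level.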

Residual Plot

par(mfrow = c(2, 2))
plot(lm(mpg ~ am + wt + cyl + disp + hp, data = mtcars))