This report analyzes and evaluates fuel efficiency, measured in miles per gallon (MPG), for automatic versus manual transmission automobiles. The data were extracted from the 1974 Motor Trend US magazine and comprise fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).
The methods used are the R functions “t.test” (Welch two-sample t-test), “lm” (linear regression), and “step” (stepwise model selection). A “boxplot” and summaries of the data per transmission type support the results.
The results show that, on average, manual transmission automobiles achieve higher MPG than their automatic counterparts (24.39 versus 17.15 MPG). Once weight and 1/4 mile time are accounted for, the estimated advantage of manual transmission narrows to about 2.94 MPG, so factors other than transmission type need to be considered to estimate fuel efficiency more precisely.
The objective of this study is to respond to the following inquiries:
1. “Is an automatic or manual transmission better for MPG?”
2. “Quantify the MPG difference between automatic and manual transmissions”
# Load the dataset
library(datasets)
data(mtcars)
mpg: Miles/(US) gallon
cyl: Number of cylinders
disp: Displacement (cu.in.)
hp: Gross horsepower
drat: Rear axle ratio
wt: Weight (1000 lb)
qsec: 1/4 mile time (seconds)
vs: Engine shape (0 = V-shaped, 1 = straight)
am: Transmission (0 = automatic, 1 = manual)
gear: Number of forward gears
carb: Number of carburetors
# Inspect the structure of the data
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
# Transform numeric values of needed variables to factors
mtcars$am <- factor(mtcars$am, labels=c('Automatic', 'Manual'))
A preliminary data exploration shows that:
a) The box plot (Appendix A) suggests that manual transmission automobiles are better for MPG.
b) The summaries per transmission type (Appendix A) also suggest that manual transmission automobiles are better for MPG, as the mean for Manual is 24.39 MPG, while the mean for Automatic is 17.15 MPG.
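The group means quoted in item b) can be reproduced directly. The following is a minimal sketch rather than part of the original analysis; it works on a fresh copy of the dataset under the illustrative name `cars`, so it does not depend on the factor conversion applied to `mtcars` above.

```r
# Fresh copy of the data; `cars` is an illustrative name, not from the report
cars <- datasets::mtcars

# Label the transmission codes (0 = automatic, 1 = manual)
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Mean MPG and number of automobiles per transmission type
group_means <- tapply(cars$mpg, cars$am, mean)
group_sizes <- table(cars$am)

print(round(group_means, 2))
print(group_sizes)
```

The group sizes are unequal (19 automatic, 13 manual), which is one reason a test that does not assume equal variances is a reasonable choice.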
To test this difference formally, the hypotheses are:
H0: There is no difference in MPG between automatic and manual transmissions.
H1: There is a difference in MPG between automatic and manual transmissions.
t.test(mtcars$mpg ~ mtcars$am, conf.level=0.95)
##
## Welch Two Sample t-test
##
## data: mtcars$mpg by mtcars$am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.280194 -3.209684
## sample estimates:
## mean in group Automatic mean in group Manual
## 17.14737 24.39231
The p-value (0.001374) is less than the chosen significance level of 0.05, so the null hypothesis is rejected. The 95% confidence interval for the difference in means (automatic minus manual) runs from -11.28 to -3.21 MPG and excludes zero, which confirms the differences noted in items a) and b) above.
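To make the test statistic less of a black box, the Welch calculation can be written out by hand. This is an illustrative sketch, not the report's own code; the variable names (`cars`, `auto`, `manual`, `se`, `t_stat`, `df_welch`, `p_value`) are assumptions introduced here, and the data are taken from a fresh copy of the dataset so the raw 0/1 coding of `am` is available.

```r
# Fresh copy of the data with the original 0/1 coding of am
cars <- datasets::mtcars
auto   <- cars$mpg[cars$am == 0]
manual <- cars$mpg[cars$am == 1]

# Per-group variance of the sample mean
v_auto   <- var(auto) / length(auto)
v_manual <- var(manual) / length(manual)

# Welch t statistic: difference in means over its standard error
se     <- sqrt(v_auto + v_manual)
t_stat <- (mean(auto) - mean(manual)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_auto + v_manual)^2 /
  (v_auto^2 / (length(auto) - 1) + v_manual^2 / (length(manual) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df_welch)

print(c(t = t_stat, df = df_welch, p = p_value))
```

These values should agree with the t, df, and p-value in the t.test output above (t = -3.7671, df = 18.332, p-value = 0.001374).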
In this step, an attempt is made to determine whether other variables account for MPG variability. Given the considerable number of variables that could affect MPG, the step() function is used to select a linear regression model: starting from the model with all predictors, it drops terms stepwise, keeping the model with the lowest AIC.
bestModel <- step(lm(data = mtcars, mpg ~ .), trace=0)
summary(bestModel)
##
## Call:
## lm(formula = mpg ~ wt + qsec + am, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4811 -1.5555 -0.7257 1.4110 4.6610
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.6178 6.9596 1.382 0.177915
## wt -3.9165 0.7112 -5.507 6.95e-06 ***
## qsec 1.2259 0.2887 4.247 0.000216 ***
## amManual 2.9358 1.4109 2.081 0.046716 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared: 0.8497, Adjusted R-squared: 0.8336
## F-statistic: 52.75 on 3 and 28 DF, p-value: 1.21e-11
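The selected model includes weight (wt), 1/4 mile time (qsec) and transmission (am), and explains about 85% of the variance in MPG (adjusted R-squared 0.83). Holding the other variables constant, the coefficients indicate that MPG decreases by 3.92 for every 1000 lb increase in weight, and increases by 1.23 for every additional second of 1/4 mile time. With respect to transmission type, a manual transmission is on average 2.94 MPG better than an automatic one for automobiles of the same weight and 1/4 mile time. This effect is only marginally significant at the 0.05 level (p = 0.047) and is much smaller than the unadjusted difference in group means.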
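The gap between the unadjusted and adjusted transmission effect can be checked by fitting both models side by side. The sketch below is an addition for illustration, not part of the original analysis; `cars`, `fit_am`, `fit_best`, and `am_effect` are assumed names, and the nested-model F-test via anova() is a technique brought in here rather than one the report uses.

```r
# Fresh copy of the data with labelled transmission types
cars <- datasets::mtcars
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Transmission-only baseline and the model selected above
fit_am   <- lm(mpg ~ am, data = cars)
fit_best <- lm(mpg ~ wt + qsec + am, data = cars)

# 95% confidence intervals for the adjusted coefficients
print(confint(fit_best))

# Manual-transmission effect before and after adjusting for wt and qsec
am_effect <- c(
  unadjusted = unname(coef(fit_am)["amManual"]),
  adjusted   = unname(coef(fit_best)["amManual"])
)
print(round(am_effect, 2))

# F-test for nested models: do wt and qsec add explanatory power beyond am?
print(anova(fit_am, fit_best))
```

The unadjusted coefficient equals the difference in group means (24.39 - 17.15), while the adjusted coefficient matches the amManual estimate in the summary above. The confidence interval for amManual indicates how precisely the adjusted effect is estimated.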
boxplot(mpg ~ am, data = mtcars, xlab = "Transmission Type", ylab = "Miles per Gallon", main="Miles per Gallon (MPG) by Transmission Type")
summary(mtcars[mtcars$am == "Automatic",])
## mpg cyl disp hp
## Min. :10.40 Min. :4.000 Min. :120.1 Min. : 62.0
## 1st Qu.:14.95 1st Qu.:6.000 1st Qu.:196.3 1st Qu.:116.5
## Median :17.30 Median :8.000 Median :275.8 Median :175.0
## Mean :17.15 Mean :6.947 Mean :290.4 Mean :160.3
## 3rd Qu.:19.20 3rd Qu.:8.000 3rd Qu.:360.0 3rd Qu.:192.5
## Max. :24.40 Max. :8.000 Max. :472.0 Max. :245.0
## drat wt qsec vs
## Min. :2.760 Min. :2.465 Min. :15.41 Min. :0.0000
## 1st Qu.:3.070 1st Qu.:3.438 1st Qu.:17.18 1st Qu.:0.0000
## Median :3.150 Median :3.520 Median :17.82 Median :0.0000
## Mean :3.286 Mean :3.769 Mean :18.18 Mean :0.3684
## 3rd Qu.:3.695 3rd Qu.:3.842 3rd Qu.:19.17 3rd Qu.:1.0000
## Max. :3.920 Max. :5.424 Max. :22.90 Max. :1.0000
## am gear carb
## Automatic:19 Min. :3.000 Min. :1.000
## Manual : 0 1st Qu.:3.000 1st Qu.:2.000
## Median :3.000 Median :3.000
## Mean :3.211 Mean :2.737
## 3rd Qu.:3.000 3rd Qu.:4.000
## Max. :4.000 Max. :4.000
summary(mtcars[mtcars$am == "Manual",])
## mpg cyl disp hp
## Min. :15.00 Min. :4.000 Min. : 71.1 Min. : 52.0
## 1st Qu.:21.00 1st Qu.:4.000 1st Qu.: 79.0 1st Qu.: 66.0
## Median :22.80 Median :4.000 Median :120.3 Median :109.0
## Mean :24.39 Mean :5.077 Mean :143.5 Mean :126.8
## 3rd Qu.:30.40 3rd Qu.:6.000 3rd Qu.:160.0 3rd Qu.:113.0
## Max. :33.90 Max. :8.000 Max. :351.0 Max. :335.0
## drat wt qsec vs
## Min. :3.54 Min. :1.513 Min. :14.50 Min. :0.0000
## 1st Qu.:3.85 1st Qu.:1.935 1st Qu.:16.46 1st Qu.:0.0000
## Median :4.08 Median :2.320 Median :17.02 Median :1.0000
## Mean :4.05 Mean :2.411 Mean :17.36 Mean :0.5385
## 3rd Qu.:4.22 3rd Qu.:2.780 3rd Qu.:18.61 3rd Qu.:1.0000
## Max. :4.93 Max. :3.570 Max. :19.90 Max. :1.0000
## am gear carb
## Automatic: 0 Min. :4.000 Min. :1.000
## Manual :13 1st Qu.:4.000 1st Qu.:1.000
## Median :4.000 Median :2.000
## Mean :4.385 Mean :2.923
## 3rd Qu.:5.000 3rd Qu.:4.000
## Max. :5.000 Max. :8.000
par(mfrow = c(2,2))
plot(bestModel)
sessionInfo()
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] magrittr_1.5 tools_3.2.2 htmltools_0.3 yaml_2.1.13
## [5] stringi_0.5-5 rmarkdown_0.9.2 knitr_1.12.3 stringr_1.0.0
## [9] digest_0.6.8 evaluate_0.8