The mtcars dataset was extracted from the 1974 Motor Trend US magazine and comprises fuel efficiency in MPG and 10 aspects of automobile design and performance for 32 cars. In analysing the relationship between Transmission Type (Automatic/Manual) and Fuel Efficiency (Miles Per Gallon), we are particularly interested in the following two questions:
** Is an automatic or manual transmission better for MPG?
** How large is the MPG difference between automatic and manual transmissions?
In summary, the conclusions from the analysis are:
** Cars with manual transmissions achieve more miles per gallon, and therefore better fuel efficiency, than cars with automatic transmissions.
** With Transmission Type as the only explanatory variable in a linear regression model, changing from automatic to manual is associated with an MPG increase of 7.245.
** The variables Cylinders, Horsepower and Weight have a statistically more significant effect on MPG than Transmission Type.
Let’s explore the data and display the first 10 lines:
data("mtcars")
head(mtcars, n=10)
##                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
dim(mtcars)
## [1] 32 11
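Create factor variables and split the data by transmission type:
# Treat the categorical columns as factors
mtcars$cyl <- factor(mtcars$cyl)
mtcars$vs <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)
mtcars$carb <- factor(mtcars$carb)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))
# Subsets by transmission type, used for the boxplot
tta <- subset(mtcars, am == "Automatic")
ttm <- subset(mtcars, am == "Manual")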
The following graph gives a first visual impression of whether MPG differs between the Automatic and Manual transmission types.
ttacol <- "darkred"
ttmcol <- "darkblue"
# mpg measures fuel efficiency (higher is better), so the labels say so
boxplot(tta$mpg, ttm$mpg, col = c(ttacol, ttmcol), varwidth = TRUE,
        xlab = "Transmission type", ylab = "Fuel efficiency (mpg)",
        main = "Fuel efficiency vs transmission type",
        names = c("automatic", "manual"))
# Overlay the individual cars as jittered points
stripchart(mpg ~ am, data = mtcars, vertical = TRUE, method = "jitter",
           jitter = 0.15, pch = 16, col = c("red", "blue"), add = TRUE)
Visual inspection of the graph shows a clear difference in MPG between automatic and manual transmissions.
Now let’s apply the t.test function (by default a Welch two-sample t-test) to test whether the mean MPG of the two groups is equal.
ttest <- t.test(mtcars$mpg ~ mtcars$am)
ttest$p.value
## [1] 0.001373638
With a p-value of 0.0014, well below 0.05, the t-test rejects the null hypothesis that the difference in mean MPG between the transmission types is 0.
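To make the statistic behind this p-value concrete, the following is a minimal sketch that recomputes the Welch t statistic by hand and compares it with `t.test`. The names `mt`, `auto`, `manual` and `t_by_hand` are illustrative and not part of the original analysis; the sketch reads a fresh copy of the data via `datasets::mtcars`, so it does not depend on the factor conversion above.

```r
mt <- datasets::mtcars  # illustrative copy; am is coded 0 = automatic, 1 = manual

auto   <- mt$mpg[mt$am == 0]
manual <- mt$mpg[mt$am == 1]

# Welch statistic: difference in means over the unpooled standard error
se <- sqrt(var(auto) / length(auto) + var(manual) / length(manual))
t_by_hand <- (mean(auto) - mean(manual)) / se

# Reference value from the built-in test (Welch is the default)
t_builtin <- unname(t.test(mpg ~ am, data = mt)$statistic)

all.equal(t_by_hand, t_builtin)  # TRUE when the two computations agree
```

The statistic is negative because the automatic group, which has the lower mean, comes first in the comparison.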
ttest$estimate
## mean in group Automatic    mean in group Manual
##                17.14737                24.39231
The estimate indicates that the difference is 24.39231 - 17.14737 = 7.24494. So cars with manual transmission achieve better fuel efficiency, on average more than 7 additional miles per gallon compared to cars with automatic transmission.
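The same difference is what a linear model with transmission type as its only predictor reports as its slope. The sketch below illustrates this under stated assumptions: `mt`, `fit_am`, `group_means` and `diff_means` are illustrative names, and the data is a fresh copy taken from `datasets::mtcars`.

```r
mt <- datasets::mtcars  # illustrative copy of the data
mt$am <- factor(mt$am, labels = c("Automatic", "Manual"))

# Transmission type as the only explanatory variable
fit_am <- lm(mpg ~ am, data = mt)
coef(fit_am)                   # intercept = Automatic mean, amManual = difference
confint(fit_am)["amManual", ]  # 95% interval for the difference

# The slope equals the difference between the two group means
group_means <- tapply(mt$mpg, mt$am, mean)
diff_means  <- unname(group_means["Manual"] - group_means["Automatic"])
all.equal(unname(coef(fit_am)["amManual"]), diff_means)
```

The interval from `confint` assumes equal variances in both groups, so it need not match the Welch interval reported by `t.test` exactly.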
Fitting the full model:
fitmodel <- lm(mpg ~ ., data = mtcars)
summary(fitmodel)
##
## Call:
## lm(formula = mpg ~ ., data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.5087 -1.3584 -0.0948 0.7745 4.6251
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.87913   20.06582   1.190   0.2525
## cyl6        -2.64870    3.04089  -0.871   0.3975
## cyl8        -0.33616    7.15954  -0.047   0.9632
## disp         0.03555    0.03190   1.114   0.2827
## hp          -0.07051    0.03943  -1.788   0.0939 .
## drat         1.18283    2.48348   0.476   0.6407
## wt          -4.52978    2.53875  -1.784   0.0946 .
## qsec         0.36784    0.93540   0.393   0.6997
## vs1          1.93085    2.87126   0.672   0.5115
## amManual     1.21212    3.21355   0.377   0.7113
## gear4        1.11435    3.79952   0.293   0.7733
## gear5        2.52840    3.73636   0.677   0.5089
## carb2       -0.97935    2.31797  -0.423   0.6787
## carb3        2.99964    4.29355   0.699   0.4955
## carb4        1.09142    4.44962   0.245   0.8096
## carb6        4.47757    6.38406   0.701   0.4938
## carb8        7.25041    8.36057   0.867   0.3995
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.833 on 15 degrees of freedom
## Multiple R-squared: 0.8931, Adjusted R-squared: 0.779
## F-statistic: 7.83 on 16 and 15 DF, p-value: 0.000124
summary(fitmodel$coeff)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -4.53000 -0.07051 1.11400 2.32400 2.52800 23.88000
The p-values of all coefficients are higher than 0.05, so the full model alone does not show which variables are significant.
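Rather than reading the p-values off the printed table, they can be pulled out of the model summary and checked programmatically. This is a small sketch, not part of the original analysis; `mt`, `factor_cols`, `fit_full` and `pvals` are illustrative names, and the model is refitted on a fresh copy of the data with the same factor conversions as above.

```r
mt <- datasets::mtcars  # illustrative copy of the data
factor_cols <- c("cyl", "vs", "gear", "carb")
mt[factor_cols] <- lapply(mt[factor_cols], factor)
mt$am <- factor(mt$am, labels = c("Automatic", "Manual"))

fit_full <- lm(mpg ~ ., data = mt)

# Named vector of p-values, one per coefficient
pvals <- coef(summary(fit_full))[, "Pr(>|t|)"]

any(pvals < 0.05)     # FALSE: no coefficient is significant at the 5% level
head(sort(pvals), 3)  # the three smallest p-values, for orientation
```

Note that the overall F-test of the same model is significant (p-value 0.000124) even though no single coefficient is, which is often a sign that the predictors overlap in the information they carry.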
Use stepwise selection (backward elimination by AIC) to find a smaller model and identify the most relevant variables:
fitstep <- step(fitmodel)
## Start: AIC=76.4
## mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
##
## Df Sum of Sq RSS AIC
## - carb 5 13.5989 134.00 69.828
## - gear 2 3.9729 124.38 73.442
## - am 1 1.1420 121.55 74.705
## - qsec 1 1.2413 121.64 74.732
## - drat 1 1.8208 122.22 74.884
## - cyl 2 10.9314 131.33 75.184
## - vs 1 3.6299 124.03 75.354
## <none> 120.40 76.403
## - disp 1 9.9672 130.37 76.948
## - wt 1 25.5541 145.96 80.562
## - hp 1 25.6715 146.07 80.588
##
## Step: AIC=69.83
## mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear
##
## Df Sum of Sq RSS AIC
## - gear 2 5.0215 139.02 67.005
## - disp 1 0.9934 135.00 68.064
## - drat 1 1.1854 135.19 68.110
## - vs 1 3.6763 137.68 68.694
## - cyl 2 12.5642 146.57 68.696
## - qsec 1 5.2634 139.26 69.061
## <none> 134.00 69.828
## - am 1 11.9255 145.93 70.556
## - wt 1 19.7963 153.80 72.237
## - hp 1 22.7935 156.79 72.855
##
## Step: AIC=67
## mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am
##
## Df Sum of Sq RSS AIC
## - drat 1 0.9672 139.99 65.227
## - cyl 2 10.4247 149.45 65.319
## - disp 1 1.5483 140.57 65.359
## - vs 1 2.1829 141.21 65.503
## - qsec 1 3.6324 142.66 65.830
## <none> 139.02 67.005
## - am 1 16.5665 155.59 68.608
## - hp 1 18.1768 157.20 68.937
## - wt 1 31.1896 170.21 71.482
##
## Step: AIC=65.23
## mpg ~ cyl + disp + hp + wt + qsec + vs + am
##
## Df Sum of Sq RSS AIC
## - disp 1 1.2474 141.24 63.511
## - vs 1 2.3403 142.33 63.757
## - cyl 2 12.3267 152.32 63.927
## - qsec 1 3.1000 143.09 63.928
## <none> 139.99 65.227
## - hp 1 17.7382 157.73 67.044
## - am 1 19.4660 159.46 67.393
## - wt 1 30.7151 170.71 69.574
##
## Step: AIC=63.51
## mpg ~ cyl + hp + wt + qsec + vs + am
##
## Df Sum of Sq RSS AIC
## - qsec 1 2.442 143.68 62.059
## - vs 1 2.744 143.98 62.126
## - cyl 2 18.580 159.82 63.466
## <none> 141.24 63.511
## - hp 1 18.184 159.42 65.386
## - am 1 18.885 160.12 65.527
## - wt 1 39.645 180.88 69.428
##
## Step: AIC=62.06
## mpg ~ cyl + hp + wt + vs + am
##
## Df Sum of Sq RSS AIC
## - vs 1 7.346 151.03 61.655
## <none> 143.68 62.059
## - cyl 2 25.284 168.96 63.246
## - am 1 16.443 160.12 63.527
## - hp 1 36.344 180.02 67.275
## - wt 1 41.088 184.77 68.108
##
## Step: AIC=61.65
## mpg ~ cyl + hp + wt + am
##
## Df Sum of Sq RSS AIC
## <none> 151.03 61.655
## - am 1 9.752 160.78 61.657
## - cyl 2 29.265 180.29 63.323
## - hp 1 31.943 182.97 65.794
## - wt 1 46.173 197.20 68.191
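The AIC values in the trace above can be reproduced by hand, which helps when reading it. The sketch below assumes the form that `extractAIC` uses for linear models, n * log(RSS / n) + 2 * edf, and checks it against the starting value of 76.4; `mt`, `fit_full`, `aic_by_hand` and `aic_builtin` are illustrative names.

```r
mt <- datasets::mtcars  # illustrative copy of the data
factor_cols <- c("cyl", "vs", "gear", "carb")
mt[factor_cols] <- lapply(mt[factor_cols], factor)
mt$am <- factor(mt$am, labels = c("Automatic", "Manual"))

fit_full <- lm(mpg ~ ., data = mt)

n   <- nrow(mt)                   # number of cars
rss <- sum(resid(fit_full)^2)     # residual sum of squares
edf <- n - df.residual(fit_full)  # number of estimated coefficients

aic_by_hand <- n * log(rss / n) + 2 * edf
aic_builtin <- extractAIC(fit_full)[2]  # second element is the AIC
c(by_hand = aic_by_hand, extractAIC = aic_builtin)
```

At each step, `step` removes the term whose removal gives the lowest AIC and stops once `<none>` is at the top of the table, which is why the final model keeps `am` even though dropping it would change the AIC only from 61.655 to 61.657.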
summary(fitstep)
##
## Call:
## lm(formula = mpg ~ cyl + hp + wt + am, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9387 -1.2560 -0.4013 1.1253 5.0513
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.70832    2.60489  12.940 7.73e-13 ***
## cyl6        -3.03134    1.40728  -2.154  0.04068 *
## cyl8        -2.16368    2.28425  -0.947  0.35225
## hp          -0.03211    0.01369  -2.345  0.02693 *
## wt          -2.49683    0.88559  -2.819  0.00908 **
## amManual     1.80921    1.39630   1.296  0.20646
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.41 on 26 degrees of freedom
## Multiple R-squared: 0.8659, Adjusted R-squared: 0.8401
## F-statistic: 33.57 on 5 and 26 DF, p-value: 1.506e-10
summary(fitstep$coeff)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3.031 -2.414 -1.098 4.632 1.349 33.710
The R-squared value of 0.8659 indicates that the model explains about 87% of the variance in MPG.
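As a worked illustration of what R-squared measures, the sketch below recomputes it as one minus the ratio of the residual sum of squares to the total sum of squares and compares the result with the value reported by `summary`. The names `mt`, `fit_final`, `rss` and `tss` are illustrative; the model formula is the one selected by `step` above.

```r
mt <- datasets::mtcars  # illustrative copy of the data
mt$cyl <- factor(mt$cyl)
mt$am  <- factor(mt$am, labels = c("Automatic", "Manual"))

fit_final <- lm(mpg ~ cyl + hp + wt + am, data = mt)

rss <- sum(resid(fit_final)^2)         # variation left unexplained by the model
tss <- sum((mt$mpg - mean(mt$mpg))^2)  # total variation in mpg

r2_by_hand <- 1 - rss / tss
all.equal(r2_by_hand, summary(fit_final)$r.squared)
```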
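The stepwise selection retains the variables Cylinders, Horsepower, Weight and Transmission Type as predictors of fuel efficiency (MPG), although the Transmission Type coefficient (1.809, p = 0.206) is not significant at the 5% level.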
** Transmission Type has an effect on MPG: cars with manual transmissions achieve more miles per gallon, and therefore better fuel efficiency, than cars with automatic transmissions.
** The variables Cylinders, Horsepower and Weight have a statistically more significant effect on MPG than Transmission Type.
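To put a range on the transmission effect once Cylinders, Horsepower and Weight are accounted for, a confidence interval for the `amManual` coefficient of the selected model can be computed. This is a minimal sketch rather than part of the original analysis; `mt`, `fit_final` and `ci_am` are illustrative names.

```r
mt <- datasets::mtcars  # illustrative copy of the data
mt$cyl <- factor(mt$cyl)
mt$am  <- factor(mt$am, labels = c("Automatic", "Manual"))

fit_final <- lm(mpg ~ cyl + hp + wt + am, data = mt)

# 95% confidence interval for the adjusted manual-vs-automatic difference
ci_am <- confint(fit_final)["amManual", ]
ci_am

# TRUE when the interval includes zero
ci_am[1] < 0 && ci_am[2] > 0
```

An interval that includes zero is consistent with the p-value of 0.206 for `amManual` in the model summary: the adjusted estimate of 1.809 MPG points in the same direction as the unadjusted 7.245, but with much less certainty.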