1. Executive Summary

The mtcars dataset was extracted from the 1974 Motor Trend US magazine, and comprises fuel efficiency in MPG and 10 aspects of automobile design and performance for 32 automobiles. When analysing the relationship between Transmission Type (Automatic/Manual) and Fuel Efficiency (Miles Per Gallon), we are particularly interested in the following two questions:

** Is an automatic or manual transmission better for MPG?

** Quantify the MPG difference between automatic and manual transmissions

To summarize, the conclusion from the analysis is:

** Cars with Manual transmissions produce more miles per gallon and achieve better fuel efficiency than automatic transmissions

** When Transmission Type is the only explanatory variable in a linear regression model, changing the transmission type from automatic to manual is associated with an increase of 7.245 MPG. After adjusting for Cylinders, Horsepower and Weight, the estimated increase shrinks to 1.809 MPG and is no longer statistically significant (p = 0.206)

** The variables Cylinders, Horsepower and Weight have a statistically more significant effect on MPG than Transmission Type.

2. Exploratory data analysis

Let’s explore the data and display the first 10 lines:

data("mtcars")
head(mtcars, n=10)
##                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
dim(mtcars)
## [1] 32 11

Create factor variables

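# Convert the categorical columns to factors
mtcars$cyl <- factor(mtcars$cyl)
mtcars$vs <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)
mtcars$carb <- factor(mtcars$carb)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))
# Split the data by transmission type for plotting
tta <- subset(mtcars, am == "Automatic")
ttm <- subset(mtcars, am == "Manual")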

The following box plot compares the distribution of MPG for Automatic and Manual transmissions. It gives a visual impression of the difference; whether that difference is statistically significant is tested afterwards.

ttacol="darkred"
ttmcol="darkblue"

# Box plot of MPG by transmission type; box widths reflect group sizes
boxplot(tta$mpg, ttm$mpg, col = c(ttacol, ttmcol), varwidth = TRUE,
        xlab = "Transmission type", ylab = "Fuel efficiency (mpg)",
        main = "Fuel efficiency vs transmission type",
        names = c("automatic", "manual"))
# Overlay the individual observations with horizontal jitter
stripchart(mpg ~ am, data = mtcars, vertical = TRUE, method = "jitter",
           jitter = 0.15, pch = 16, col = c("red", "blue"), add = TRUE)

When visually inspecting the graph, a clear difference can be observed in MPG for Automatic versus Manual transmissions.

Now let’s apply R's t.test function (Welch's two-sample t-test by default) to test whether the mean MPG of the two groups is equal.

ttest <- t.test(mtcars$mpg ~ mtcars$am)
ttest$p.value
## [1] 0.001373638

With a p-value of 0.0014, well below 0.05, the t-test rejects the null hypothesis that the difference in mean MPG between the two transmission types is 0.

ttest$estimate
## mean in group Automatic    mean in group Manual 
##                17.14737                24.39231

The estimates give a difference of 24.39231 - 17.14737 = 7.24494. So, on average, cars with Manual transmission achieve better fuel efficiency in this sample: more than 7 additional miles per gallon compared to cars with Automatic transmission.
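
The figure of 7.245 MPG quoted in the summary corresponds to a simple linear regression with transmission type as the only predictor, which the analysis above does not show explicitly. The following is a minimal sketch of that check; the names `cars`, `fit_am` and `mean_diff` are illustrative and not part of the original analysis, and a local copy of the data is used so that the session's `mtcars` object is left untouched.

```r
# Work on a local copy (the name `cars` is illustrative)
cars <- datasets::mtcars
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Simple linear regression with transmission type as the only predictor
fit_am <- lm(mpg ~ am, data = cars)

# Difference in group means (Manual minus Automatic)
group_means <- tapply(cars$mpg, cars$am, mean)
mean_diff <- unname(group_means["Manual"] - group_means["Automatic"])

# The amManual coefficient equals the difference in group means
coef(fit_am)["amManual"]
mean_diff

# 95% confidence interval for the difference from the Welch t-test
# (reported as Automatic minus Manual, so both bounds are negative)
t.test(mpg ~ am, data = cars)$conf.int
```

Because this model contains no other predictors, the 7.245 MPG figure is an unadjusted difference: it does not separate the effect of the transmission from that of other characteristics that differ between the two groups.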

3. Regression analysis

Fitting the full model:

fitmodel <- lm(mpg ~ ., data = mtcars)
summary(fitmodel)
## 
## Call:
## lm(formula = mpg ~ ., data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5087 -1.3584 -0.0948  0.7745  4.6251 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 23.87913   20.06582   1.190   0.2525  
## cyl6        -2.64870    3.04089  -0.871   0.3975  
## cyl8        -0.33616    7.15954  -0.047   0.9632  
## disp         0.03555    0.03190   1.114   0.2827  
## hp          -0.07051    0.03943  -1.788   0.0939 .
## drat         1.18283    2.48348   0.476   0.6407  
## wt          -4.52978    2.53875  -1.784   0.0946 .
## qsec         0.36784    0.93540   0.393   0.6997  
## vs1          1.93085    2.87126   0.672   0.5115  
## amManual     1.21212    3.21355   0.377   0.7113  
## gear4        1.11435    3.79952   0.293   0.7733  
## gear5        2.52840    3.73636   0.677   0.5089  
## carb2       -0.97935    2.31797  -0.423   0.6787  
## carb3        2.99964    4.29355   0.699   0.4955  
## carb4        1.09142    4.44962   0.245   0.8096  
## carb6        4.47757    6.38406   0.701   0.4938  
## carb8        7.25041    8.36057   0.867   0.3995  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.833 on 15 degrees of freedom
## Multiple R-squared:  0.8931, Adjusted R-squared:  0.779 
## F-statistic:  7.83 on 16 and 15 DF,  p-value: 0.000124
summary(fitmodel$coeff)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -4.53000 -0.07051  1.11400  2.32400  2.52800 23.88000

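The p-values of all coefficients are higher than 0.05, so no individual variable can be identified as significant from the full model, even though the model as a whole is significant (F-test p-value 0.000124). Note that the full model estimates 17 coefficients from only 32 observations.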
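
One plausible reason for this pattern, which the original analysis does not examine, is that several predictors are strongly correlated with one another, which inflates the standard errors of the individual coefficients. The sketch below looks at pairwise correlations between four of the numeric predictors in the unmodified data; the choice of columns and the name `predictor_cor` are assumptions made for illustration.

```r
# Local copy of the unmodified data, where cyl is still numeric
cars <- datasets::mtcars

# Pairwise Pearson correlations between four engine and size predictors
predictor_cor <- cor(cars[, c("cyl", "disp", "hp", "wt")])
round(predictor_cor, 2)
```

High correlations here would suggest that the predictors carry overlapping information, which is one motivation for reducing the model in the next step.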

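Stepwise selection with step(), which drops one variable at a time for as long as doing so lowers the AIC. Note that this selects variables by AIC rather than by p-value, so a retained variable is not necessarily statistically significant: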

fitstep <- step(fitmodel)
## Start:  AIC=76.4
## mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
## 
##        Df Sum of Sq    RSS    AIC
## - carb  5   13.5989 134.00 69.828
## - gear  2    3.9729 124.38 73.442
## - am    1    1.1420 121.55 74.705
## - qsec  1    1.2413 121.64 74.732
## - drat  1    1.8208 122.22 74.884
## - cyl   2   10.9314 131.33 75.184
## - vs    1    3.6299 124.03 75.354
## <none>              120.40 76.403
## - disp  1    9.9672 130.37 76.948
## - wt    1   25.5541 145.96 80.562
## - hp    1   25.6715 146.07 80.588
## 
## Step:  AIC=69.83
## mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear
## 
##        Df Sum of Sq    RSS    AIC
## - gear  2    5.0215 139.02 67.005
## - disp  1    0.9934 135.00 68.064
## - drat  1    1.1854 135.19 68.110
## - vs    1    3.6763 137.68 68.694
## - cyl   2   12.5642 146.57 68.696
## - qsec  1    5.2634 139.26 69.061
## <none>              134.00 69.828
## - am    1   11.9255 145.93 70.556
## - wt    1   19.7963 153.80 72.237
## - hp    1   22.7935 156.79 72.855
## 
## Step:  AIC=67
## mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am
## 
##        Df Sum of Sq    RSS    AIC
## - drat  1    0.9672 139.99 65.227
## - cyl   2   10.4247 149.45 65.319
## - disp  1    1.5483 140.57 65.359
## - vs    1    2.1829 141.21 65.503
## - qsec  1    3.6324 142.66 65.830
## <none>              139.02 67.005
## - am    1   16.5665 155.59 68.608
## - hp    1   18.1768 157.20 68.937
## - wt    1   31.1896 170.21 71.482
## 
## Step:  AIC=65.23
## mpg ~ cyl + disp + hp + wt + qsec + vs + am
## 
##        Df Sum of Sq    RSS    AIC
## - disp  1    1.2474 141.24 63.511
## - vs    1    2.3403 142.33 63.757
## - cyl   2   12.3267 152.32 63.927
## - qsec  1    3.1000 143.09 63.928
## <none>              139.99 65.227
## - hp    1   17.7382 157.73 67.044
## - am    1   19.4660 159.46 67.393
## - wt    1   30.7151 170.71 69.574
## 
## Step:  AIC=63.51
## mpg ~ cyl + hp + wt + qsec + vs + am
## 
##        Df Sum of Sq    RSS    AIC
## - qsec  1     2.442 143.68 62.059
## - vs    1     2.744 143.98 62.126
## - cyl   2    18.580 159.82 63.466
## <none>              141.24 63.511
## - hp    1    18.184 159.42 65.386
## - am    1    18.885 160.12 65.527
## - wt    1    39.645 180.88 69.428
## 
## Step:  AIC=62.06
## mpg ~ cyl + hp + wt + vs + am
## 
##        Df Sum of Sq    RSS    AIC
## - vs    1     7.346 151.03 61.655
## <none>              143.68 62.059
## - cyl   2    25.284 168.96 63.246
## - am    1    16.443 160.12 63.527
## - hp    1    36.344 180.02 67.275
## - wt    1    41.088 184.77 68.108
## 
## Step:  AIC=61.65
## mpg ~ cyl + hp + wt + am
## 
##        Df Sum of Sq    RSS    AIC
## <none>              151.03 61.655
## - am    1     9.752 160.78 61.657
## - cyl   2    29.265 180.29 63.323
## - hp    1    31.943 182.97 65.794
## - wt    1    46.173 197.20 68.191
summary(fitstep)
## 
## Call:
## lm(formula = mpg ~ cyl + hp + wt + am, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.9387 -1.2560 -0.4013  1.1253  5.0513 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 33.70832    2.60489  12.940 7.73e-13 ***
## cyl6        -3.03134    1.40728  -2.154  0.04068 *  
## cyl8        -2.16368    2.28425  -0.947  0.35225    
## hp          -0.03211    0.01369  -2.345  0.02693 *  
## wt          -2.49683    0.88559  -2.819  0.00908 ** 
## amManual     1.80921    1.39630   1.296  0.20646    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.41 on 26 degrees of freedom
## Multiple R-squared:  0.8659, Adjusted R-squared:  0.8401 
## F-statistic: 33.57 on 5 and 26 DF,  p-value: 1.506e-10
summary(fitstep$coeff)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -3.031  -2.414  -1.098   4.632   1.349  33.710
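
The AIC values in the step() trace can be reproduced by hand. For linear models, step() relies on extractAIC(), which computes n * log(RSS / n) + 2 * k, where k is the number of estimated coefficients. The following sketch checks this for the final model; the names `cars`, `fit_final` and `aic_manual` are illustrative.

```r
# Local copy with the two factors the final model needs
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Refit the model selected by step()
fit_final <- lm(mpg ~ cyl + hp + wt + am, data = cars)

n <- nrow(cars)                      # number of observations
rss <- sum(residuals(fit_final)^2)   # residual sum of squares
k <- length(coef(fit_final))         # number of estimated coefficients

# AIC as used by step() for linear models
aic_manual <- n * log(rss / n) + 2 * k
aic_manual
extractAIC(fit_final)[2]
```

In the last step of the trace, dropping am would raise the AIC only from 61.655 to 61.657, so transmission type is retained by a very narrow margin.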

The R-squared value of 0.8659 indicates that the model explains about 87% of the variance in MPG (adjusted R-squared: 0.8401).
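
R-squared is one minus the ratio of the residual sum of squares to the total sum of squares of MPG. A short sketch of that calculation follows; the variable names are illustrative.

```r
# Local copy with the two factors the final model needs
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))
fit_final <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# Total variation in MPG around its mean
tss <- sum((cars$mpg - mean(cars$mpg))^2)
# Variation left unexplained by the model
rss <- sum(residuals(fit_final)^2)

# Share of the variation explained by the model
r_squared <- 1 - rss / tss
r_squared
summary(fit_final)$r.squared
```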

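The stepwise procedure retains Cylinders, Horsepower, Weight and Transmission Type as predictors of MPG. In this model, Weight (p = 0.009), Horsepower (p = 0.027) and the 6-cylinder indicator (p = 0.041) are significant at the 0.05 level, whereas the Manual transmission coefficient (1.809 MPG, p = 0.206) is not.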
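
Whether transmission type adds explanatory value beyond Cylinders, Horsepower and Weight can be checked with a confidence interval for its coefficient and a nested-model F-test. This is a sketch rather than part of the original analysis; `fit_without_am`, `ci_am` and `p_am` are illustrative names.

```r
# Local copy with the two factors the models need
cars <- datasets::mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Final model from step() and the same model without transmission type
fit_final <- lm(mpg ~ cyl + hp + wt + am, data = cars)
fit_without_am <- lm(mpg ~ cyl + hp + wt, data = cars)

# 95% confidence interval for the adjusted Manual vs Automatic difference
ci_am <- confint(fit_final)["amManual", ]
ci_am

# F-test comparing the nested models; with one extra coefficient its
# p-value matches the t-test p-value for amManual in summary(fit_final)
p_am <- anova(fit_without_am, fit_final)[["Pr(>F)"]][2]
p_am
```

An interval that spans zero means the data are compatible with no adjusted difference between the transmission types, as well as with a difference of a few MPG in favour of manual.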

4. Conclusion

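** Transmission Type is associated with MPG: in this sample, cars with Manual transmissions average 7.245 more miles per gallon than cars with Automatic transmissions (Welch t-test, p = 0.0014). After adjusting for Cylinders, Horsepower and Weight, the estimated difference falls to 1.809 MPG and is not statistically significant (p = 0.206)

** The variables Cylinders, Horsepower and Weight have a statistically more significant effect on MPG than Transmission Type.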