Executive Summary

This investigation analyses the factors that affect fuel economy for a selection of automobiles, with particular emphasis on the effect of the transmission type (manual or automatic). It uses data extracted from the 1974 issue of Motor Trend US magazine. The data comprise fuel economy in miles per gallon (mpg) and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).

Exploratory Analysis

Design and performance variables

The data set:

##               mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4      21   6  160 110  3.9 2.620 16.46  0  1    4    4
## Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4

The correlation plot in the appendix shows a strong negative correlation between mpg (fuel economy in miles per gallon) and each of cyl, disp, hp and wt. However, these variables are also strongly positively correlated with each other. The histogram and mean plot in the appendix show the effect of transmission on mpg specifically: there is some evidence that manual transmissions are more fuel efficient than automatics.
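The correlations and group means described above can be checked numerically. The following is a minimal sketch, assuming the data set is the `mtcars` data frame that ships with base R; the object names `mpg_cor`, `regressor_cor` and `mpg_by_am` are illustrative and not taken from the original analysis.

```r
# Assumption: the data are the built-in mtcars data frame (datasets package)
data(mtcars)

# Correlation of mpg with each of the four regressors discussed above
mpg_cor <- cor(mtcars$mpg, mtcars[, c("cyl", "disp", "hp", "wt")])
print(round(mpg_cor, 2))

# Pairwise correlations among those regressors themselves
regressor_cor <- cor(mtcars[, c("cyl", "disp", "hp", "wt")])
print(round(regressor_cor, 2))

# Mean mpg by transmission type (am: 0 = automatic, 1 = manual)
mpg_by_am <- aggregate(mpg ~ am, data = mtcars, FUN = mean)
print(mpg_by_am)
```

The two group means differ by the same amount as the am coefficient in the single-regressor model reported in the Conclusion.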

Model Selection

Because am must be included as a regressor, I will build the model using a nested approach. I will start with a model that has mpg as the outcome and am as the only regressor, then add other regressors that correlate well with mpg and assess the effect of each addition using analysis of variance. I will, however, leave cyl out of the nested sequence: it is a discrete variable that correlates very strongly with disp, so including it would carry a large variance inflation factor. The model I choose will be referred to as the “selected model”. As a control, I will also consider a model containing all of the regressors, referred to as the “full model”, and perform the subsequent analysis on both.
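The variance inflation argument for leaving out cyl can be illustrated directly. This is a sketch only, assuming the built-in `mtcars` data: the helper `vif_manual` is hypothetical (written for this example, not part of the original analysis) and computes each variance inflation factor as 1 / (1 - R²), where R² comes from regressing one regressor on the others.

```r
# Hypothetical helper: VIF_j = 1 / (1 - R^2_j), where R^2_j is obtained by
# regressing regressor j on all the other regressors in the set
vif_manual <- function(data, regressors) {
  sapply(regressors, function(v) {
    others <- setdiff(regressors, v)
    r2 <- summary(lm(reformulate(others, response = v), data = data))$r.squared
    1 / (1 - r2)
  })
}

# Correlation between cyl and disp
cyl_disp_cor <- cor(mtcars$cyl, mtcars$disp)
print(cyl_disp_cor)

# VIFs for the candidate regressors, without and with cyl
vif_without_cyl <- vif_manual(mtcars, c("am", "wt", "disp", "hp"))
vif_with_cyl <- vif_manual(mtcars, c("am", "wt", "disp", "hp", "cyl"))
print(round(vif_without_cyl, 2))
print(round(vif_with_cyl, 2))
```

Comparing the two vectors shows how much the variance of each coefficient estimate would be inflated by adding cyl to the set.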

## Analysis of Variance Table
## 
## Model 1: mpg ~ as.factor(am)
## Model 2: mpg ~ as.factor(am) + wt
## Model 3: mpg ~ as.factor(am) + wt + disp
## Model 4: mpg ~ as.factor(am) + wt + disp + hp
## Model 5: mpg ~ as.factor(am) + wt + disp + hp + drat
## Model 6: mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
##   Res.Df    RSS Df Sum of Sq       F    Pr(>F)    
## 1     30 720.90                                   
## 2     29 278.32  1    442.58 63.0133 9.325e-08 ***
## 3     28 246.56  1     31.76  4.5224  0.045474 *  
## 4     27 179.91  1     66.65  9.4893  0.005672 ** 
## 5     26 175.67  1      4.24  0.6037  0.445844    
## 6     21 147.49  5     28.17  0.8023  0.560634    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

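Based on the p-values for the F statistics above, adding wt, disp and hp each significantly improves the fit at the 5% level, whereas adding drat (p = 0.45) or the remaining regressors (p = 0.56) does not. I will therefore include just am, wt, disp and hp as the regressors in my selected model.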
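The nested comparison in the table above can be reproduced with a sequence of `lm` fits passed to `anova`. This is a sketch assuming the built-in `mtcars` data; the object names `fit1` to `fit6` and `nested` are illustrative. The model formulas are the ones listed in the table.

```r
# Nested sequence of models, each adding one regressor to the previous one
fit1 <- lm(mpg ~ as.factor(am), data = mtcars)
fit2 <- update(fit1, . ~ . + wt)
fit3 <- update(fit2, . ~ . + disp)
fit4 <- update(fit3, . ~ . + hp)
fit5 <- update(fit4, . ~ . + drat)

# Full model with all ten regressors, used as a control
fit6 <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb,
           data = mtcars)

# F tests comparing each model with the one before it
nested <- anova(fit1, fit2, fit3, fit4, fit5, fit6)
print(nested)
```

Each row of the result tests whether the regressors added at that step reduce the residual sum of squares by more than would be expected by chance.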

Residuals and Diagnostics

The first two dfbetas plots in the appendix identify the Chrysler Imperial and the Maserati Bora as highly influential for the selected model, whilst for the full model the Ford Pantera L and the Merc 230 are highly influential. For both models the dfbetas stay within broadly similar bounds, and in neither model do the coefficients of any one regressor seem to be affected more than the others.

The first hatvalue plot in the appendix shows that, overall, there is less leverage for the selected model than for the full model.

The residual plots in the appendix do not reveal any obvious patterns. Note that the Chrysler Imperial and the Maserati Bora feature as influential points for the selected model, as seen above. Similarly, the Ford Pantera L and the Merc 230 appear as influential points for the full model.
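The influence and leverage measures discussed in this section are available in base R. The sketch below assumes the built-in `mtcars` data and the two model formulas used in this report; the object names (`selected`, `full`, `max_dfbeta_selected`, `lev`) are illustrative, and ranking cars by their largest absolute dfbeta is one possible summary rather than necessarily the one behind the appendix plots.

```r
selected <- lm(mpg ~ as.factor(am) + wt + disp + hp, data = mtcars)
full <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb,
           data = mtcars)

# dfbetas: standardised change in each coefficient when one car is removed.
# For each car, keep the largest absolute change across all coefficients.
max_dfbeta_selected <- apply(abs(dfbetas(selected)), 1, max)
max_dfbeta_full <- apply(abs(dfbetas(full)), 1, max)
print(head(sort(max_dfbeta_selected, decreasing = TRUE), 3))
print(head(sort(max_dfbeta_full, decreasing = TRUE), 3))

# Hat values (leverage) for both models, side by side
lev <- cbind(selected = hatvalues(selected), full = hatvalues(full))
print(colMeans(lev))

# Residuals against fitted values for the selected model
plot(fitted(selected), resid(selected),
     xlab = "Fitted mpg", ylab = "Residual")
abline(h = 0, lty = 2)
```

The mean hat value of a linear model equals the number of estimated coefficients divided by the number of observations, so a model with fewer coefficients necessarily has lower average leverage on the same data.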

Conclusion

Statistical inference for the model containing am as the only regressor

##                 Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)    17.147368   1.124603 15.247492 1.133983e-15
## as.factor(am)1  7.244939   1.764422  4.106127 2.850207e-04

On average, mpg is 7.24 higher for manual than for automatic automobiles, and this difference is significantly different from 0 at the 5% significance level (p = 0.000285).
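A confidence interval makes the size of this difference easier to judge than the p-value alone. The following is a small sketch, assuming the built-in `mtcars` data; `fit_am` and `ci_am` are illustrative names.

```r
# Model with transmission as the only regressor (0 = automatic, 1 = manual)
fit_am <- lm(mpg ~ as.factor(am), data = mtcars)

# Coefficient table, as shown above
print(summary(fit_am)$coefficients)

# 95% confidence interval for the manual-versus-automatic difference in mpg
ci_am <- confint(fit_am, "as.factor(am)1", level = 0.95)
print(ci_am)
```

If the interval excludes 0, the difference is significant at the 5% level, which agrees with the p-value in the table.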

Statistical inference for the Selected Model

##                    Estimate Std. Error    t value     Pr(>|t|)
## (Intercept)    34.209443370 2.82282610 12.1188632 1.979953e-12
## as.factor(am)1  2.159270737 1.43517565  1.5045341 1.440531e-01
## wt             -3.046747000 1.15711931 -2.6330448 1.382936e-02
## disp            0.002489354 0.01037681  0.2398959 8.122229e-01
## hp             -0.039323213 0.01243358 -3.1626624 3.842032e-03

On average, mpg is 2.16 higher for a manual than for an automatic automobile, holding disp, hp and wt fixed. However, this result is not significantly different from 0 at the 5% significance level (p = 0.144).

Statistical inference for Full Model

##                Estimate  Std. Error    t value   Pr(>|t|)
## (Intercept) 12.30337416 18.71788443  0.6573058 0.51812440
## cyl         -0.11144048  1.04502336 -0.1066392 0.91608738
## disp         0.01333524  0.01785750  0.7467585 0.46348865
## hp          -0.02148212  0.02176858 -0.9868407 0.33495531
## drat         0.78711097  1.63537307  0.4813036 0.63527790
## wt          -3.71530393  1.89441430 -1.9611887 0.06325215
## qsec         0.82104075  0.73084480  1.1234133 0.27394127
## vs           0.31776281  2.10450861  0.1509915 0.88142347
## am           2.52022689  2.05665055  1.2254035 0.23398971
## gear         0.65541302  1.49325996  0.4389142 0.66520643
## carb        -0.19941925  0.82875250 -0.2406258 0.81217871

On average, mpg is 2.52 higher for a manual than for an automatic automobile, holding all the other variables fixed. However, this result is not significantly different from 0 at the 5% significance level (p = 0.234).

To conclude, automobiles with a manual transmission in this data set are on average more fuel efficient than those with an automatic transmission. However, the effect is no longer statistically significant once other variables are held constant. This is consistent with the initial correlation plots, which show a negative correlation between am and each of disp, hp and wt (recall that am = 0 is automatic and am = 1 is manual): the manual automobiles in the sample tend to be lighter, with smaller and less powerful engines, and those characteristics are themselves associated with higher mpg. Therefore an automobile with a manual transmission is likely to be more fuel efficient, but we cannot conclude that the increase in efficiency is due to the transmission system alone.
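The confounding described in the conclusion can be seen by looking at how am relates to the other regressors in the selected model. This is a minimal sketch assuming the built-in `mtcars` data; `am_cor` and `means_by_am` are illustrative names.

```r
# Correlation of am (0 = automatic, 1 = manual) with disp, hp and wt
am_cor <- cor(mtcars$am, mtcars[, c("disp", "hp", "wt")])
print(round(am_cor, 2))

# Group means of the same variables by transmission type
means_by_am <- aggregate(cbind(disp, hp, wt) ~ am, data = mtcars, FUN = mean)
print(means_by_am)
```

Negative correlations here mean that the manual automobiles in the sample differ systematically from the automatics in displacement, power and weight, so the unadjusted difference in mpg mixes the transmission effect with those differences.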

Appendix - Exploratory Plots

Appendix - Residual and Diagnostic Plots

Residual plots for the selected model

Residual plots for the full model