Summary

In this analysis we investigate the effect of car transmission type on fuel economy. A naive analysis of the data shows that manual transmission vehicles appear to have better fuel economy than automatic transmission vehicles. Upon further investigation we show that this apparent effect can be explained away in this data by the weight of the vehicle. We conclude that this data cannot support a reasonable argument about the effect of transmission on fuel economy, and we give some suggestions for future data collection to help avoid the problems faced here.

Data and Preprocessing

We use the mtcars dataset provided in R for this analysis. The data was extracted from the 1974 Motor Trend US magazine and covers 32 distinct cars, with 11 variables describing fuel consumption and other aspects of automobile design.

The only preprocessing done for this data was renaming the levels of the am variable to Automatic and Manual instead of the default values of 0 and 1.
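The recoding step can be sketched as follows. This is a minimal sketch, not the original code: the data frame name dat and the column name transmission are assumptions (the column name is inferred from the coefficient labels reported in the Analysis section).

```r
# mtcars ships with base R (package datasets); copy it so the original is untouched.
dat <- mtcars

# Assumption: the recoded factor is stored in a new column called `transmission`.
# In mtcars, am = 0 is automatic and am = 1 is manual.
dat$transmission <- factor(dat$am, levels = c(0, 1),
                           labels = c("Automatic", "Manual"))

# Number of cars of each transmission type
table(dat$transmission)
```

Listing Automatic first makes it the reference level in any model that includes an intercept.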

Analysis

First we explore the direct effect of transmission on fuel economy. We fit a linear model of fuel economy (mpg) on transmission type. The model is fit without an intercept term, so each coefficient is directly the mean fuel economy for that transmission type.

                       Estimate  Std. Error  t value  Pr(>|t|)
transmissionAutomatic    17.147       1.125   15.248     0.000
transmissionManual       24.392       1.360   17.941     0.000
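A model of this form can be fit as sketched below. The object names dat and fit_means are illustrative assumptions. The sketch also checks that, with no intercept and a single factor, the coefficients are simply the group means.

```r
# Assumed preprocessing: recode am into a labelled factor named `transmission`.
dat <- transform(mtcars,
                 transmission = factor(am, levels = c(0, 1),
                                       labels = c("Automatic", "Manual")))

# "- 1" drops the intercept, so each coefficient is one group's mean mpg.
fit_means <- lm(mpg ~ transmission - 1, data = dat)
round(summary(fit_means)$coefficients, 3)

# The same numbers computed directly as group means
group_means <- tapply(dat$mpg, dat$transmission, mean)
group_means
```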

We see a clear difference in average fuel economy between transmission types: about 17.1 mpg for automatic versus 24.4 mpg for manual. This difference is also apparent in the boxplot of the data in Figure 1.

To quantify the significance of the difference in means we can look at the p-values for a linear model fit with Automatic transmission as the reference level.

                    Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)          17.1474      1.1246  15.2475    0.0000
transmissionManual    7.2449      1.7644   4.1061    0.0003

We see that manual transmission cars do appear to have better fuel economy on average, by about 7.2 mpg, and given the low p-value (0.0003) this difference in means is statistically significant.
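The reference-level model can be sketched as below; dat, fit_ref, diff_mpg and p_value are illustrative names, not taken from the original analysis. With an intercept, the transmissionManual coefficient is the difference in mean mpg relative to Automatic, and its p-value tests whether that difference is zero.

```r
# Assumed preprocessing: Automatic is the first (reference) level.
dat <- transform(mtcars,
                 transmission = factor(am, levels = c(0, 1),
                                       labels = c("Automatic", "Manual")))

# Intercept = mean mpg for Automatic; slope = Manual minus Automatic.
fit_ref <- lm(mpg ~ transmission, data = dat)
coefs <- summary(fit_ref)$coefficients
round(coefs, 4)

# Estimated difference in means and its p-value
diff_mpg <- coefs["transmissionManual", "Estimate"]
p_value  <- coefs["transmissionManual", "Pr(>|t|)"]
```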

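Perhaps another variable can explain this apparent difference in fuel economy. When looking at fuel economy vs weight (Figure 2) we see two noticeable features: fuel economy decreases as weight increases, and the manual transmission cars are concentrated at the lighter end of the weight range while the automatic transmission cars are concentrated at the heavier end.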

We see a dramatic change in conclusions if we take into account the interaction between the weight of the car and transmission type. From the coefficients, the expected change in fuel economy (mpg) per 1000 lbs of weight is -9.08 for a manual transmission car and -3.79 for an automatic transmission car. The trend appears to reverse our initial conclusion that manual transmission cars have better fuel economy: once we account for the interaction with weight, manual transmission cars tend to have worse fuel economy than automatic transmission cars as the car’s weight increases. Figure 3 highlights this trend reversal.
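The interaction model and the two per-transmission slopes can be sketched as follows. The names dat, fit_int, slope_auto, slope_manual and cross_wt are illustrative assumptions. The manual slope is the wt coefficient plus the interaction coefficient, and the weight at which the two fitted lines cross follows from setting the two line equations equal.

```r
# Assumed preprocessing: recode am into a labelled factor named `transmission`.
dat <- transform(mtcars,
                 transmission = factor(am, levels = c(0, 1),
                                       labels = c("Automatic", "Manual")))

# wt * transmission expands to wt + transmission + wt:transmission,
# i.e. a separate intercept and slope for each transmission type.
fit_int <- lm(mpg ~ wt * transmission, data = dat)
b <- coef(fit_int)

# Change in mpg per 1000 lbs for each transmission type
slope_auto   <- b[["wt"]]
slope_manual <- b[["wt"]] + b[["wt:transmissionManual"]]

# Weight (in 1000 lbs) where the two fitted lines intersect
cross_wt <- -b[["transmissionManual"]] / b[["wt:transmissionManual"]]

round(c(auto = slope_auto, manual = slope_manual, crossing = cross_wt), 2)
```

Below the crossing weight the fitted line for manual cars lies above the automatic line, and above it the order flips.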

How confident can we be in the conclusions from this trend reversal? If we look at the 95% confidence intervals for the fitted fuel economy of an average-weight car of each transmission type, the intervals overlap, so we cannot conclude with confidence which transmission results in the better fuel economy.

                          fit    lwr    upr
Automatic Transmission  19.24  17.72  20.75
Manual Transmission     17.07  14.30  19.84
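Intervals of this kind can be obtained from predict() with interval = "confidence". The sketch below assumes the interaction model from above and takes "average weight" to be the mean of wt over all 32 cars; the object names are illustrative.

```r
# Assumed preprocessing and model (mpg on weight, transmission and their interaction)
dat <- transform(mtcars,
                 transmission = factor(am, levels = c(0, 1),
                                       labels = c("Automatic", "Manual")))
fit_int <- lm(mpg ~ wt * transmission, data = dat)

# One hypothetical car of each transmission type at the overall mean weight
new_cars <- data.frame(
  wt = mean(dat$wt),
  transmission = factor(c("Automatic", "Manual"),
                        levels = levels(dat$transmission))
)

# 95% confidence intervals for the mean response (not prediction intervals)
ci <- predict(fit_int, newdata = new_cars, interval = "confidence", level = 0.95)
rownames(ci) <- paste(new_cars$transmission, "Transmission")
round(ci, 2)
```

The two intervals overlap, which is why no ordering between transmission types can be claimed at this weight.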

It is difficult to draw a conclusion about a car’s fuel economy given the transmission type with this dataset. We encounter a key problem: the two transmission types are not evenly represented across the weights of the cars in the data. To draw better conclusions we require a dataset that is more representative of cars of both transmission types across a broader range of weights than mtcars provides. In an ideal situation, we would compare fuel economy across transmission types for cars within the same weight range. Unfortunately this dataset does not provide us with this luxury, and we must remain undecided until more data is made available.
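The imbalance in weights can be checked directly. The sketch below is an illustration rather than part of the original analysis; dat, wt_range, overlap and in_overlap are assumed names. It computes the weight range for each transmission type and counts how many cars of each type fall inside the range the two groups share.

```r
# Assumed preprocessing: recode am into a labelled factor named `transmission`.
dat <- transform(mtcars,
                 transmission = factor(am, levels = c(0, 1),
                                       labels = c("Automatic", "Manual")))

# Minimum and maximum weight (1000 lbs) per transmission type
wt_range <- sapply(split(dat$wt, dat$transmission), range)
rownames(wt_range) <- c("min", "max")
wt_range

# Weight interval covered by both transmission types
overlap <- c(max(wt_range["min", ]), min(wt_range["max", ]))

# Cars of each type that fall inside the shared interval
in_overlap <- dat$wt >= overlap[1] & dat$wt <= overlap[2]
table(dat$transmission[in_overlap])
```

A comparison restricted to the shared interval would be closer to the ideal described above, at the cost of a smaller sample.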

Appendix

Figure 1

Boxplot showing the distributions of fuel economy vs transmission type

Figure 2

Scatter plot of fuel economy vs weight

Figure 3

Scatter plot of fuel economy vs weight, with the fitted regression line for each transmission type

Figure 4

Residual plots