Executive summary

The study confirms the hypothesis that automatic transmission gives better fuel economy in cars once weight, the most important observable control variable among the car characteristics, is controlled for. The difference at the sample mean weight is small (automatic transmission yields around 13 percent more miles per gallon than manual transmission) but significant. Study validity is low due to skewness in the subsamples, the small overall sample size, and the potentially large confounding impact of unobserved driving behaviour and patterns on observed fuel efficiency, which weighs more heavily on regression results in a small sample.


This paper investigates, for the US magazine Motor Trend, the impact of automatic versus manual transmission on fuel efficiency for automobiles that are now veteran cars in the US marketplace, although they were current models when the data were collected.

Specifically we want to know:

  1. Whether automatic or manual transmission is better for miles per gallon (mpg)?
  2. And specifically, what is the quantifiable difference between automatic and manual transmissions?

Theory and hypothesis

The hypothesis tested is that vehicles with automatic transmission have superior fuel efficiency. This hypothesis takes its outset in the USEPA report on the same subject published in 1975, which showed that driving behaviour is often one of the main factors behind fuel efficiency. Hence, on average, automatic transmission should be more fuel efficient because it reduces the impact of the human factor. Alternatively, this impact may only affect the variation in fuel economy within the two transmission types, in which case our regression analysis should show no significant difference between them.

The available dataset gives limited opportunity to test this hypothesis. A larger dataset, and one that allows controlling for driving behaviour, would help to shed more light on the research questions.

The best way to select control variables is to start from some theory about what determines fuel efficiency in cars. As I am not a mechanic, I searched for prior similar studies to see what they had found. Based on the available variables in the dataset (see Henderson and Velleman, 1981), intuition suggests that weight in particular is the most important control factor. A report from around the same time and place as the dataset (USEPA, 1975) concludes that the most salient vehicle characteristics affecting fuel economy are, in order of importance, 1) weight and 2) displacement, followed by factors that cannot be observed in the present study, such as driving behaviour and driving patterns. To keep the study simple we therefore focus on regressions that control for weight and displacement.


The data come from the mtcars dataset in R, originally published in Henderson and Velleman (1981).

Summary statistics for the study variables, cross tabulated by transmission (0=automatic, 1=manual), are shown in the summary table. Somewhat surprisingly, manual transmission has better average fuel economy (measured as mpg, an abbreviation for Miles Per Gallon) in this sample. But this may be due to the skewness in weight (wt) across the two subsamples: cars with automatic transmission are on average heavier, and cars with manual transmission lighter. This may inflate the mpg average for manual transmission, so it is paramount to control for weight in the study. The same is true for displacement (disp), where there are apparently more high-displacement vehicles in the automatic transmission subsample. The sample has 19 cars with automatic transmission and 13 with manual.

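library(doBy)  # provides summaryBy()
# Mean (m) and standard deviation (s) of mpg, wt and disp by transmission type
summaryBy(mpg + wt + disp ~ am, data = mtcars,
    FUN = function(x) { c(m = mean(x), s = sd(x)) } )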
##   am    mpg.m    mpg.s     wt.m      wt.s   disp.m    disp.s
## 1  0 17.14737 3.833966 3.768895 0.7774001 290.3789 110.17165
## 2  1 24.39231 6.166504 2.411000 0.6169816 143.5308  87.20399
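
The subsample sizes behind the table can be checked directly in base R. The following is a minimal sketch; the names `trans`, `counts` and `mean_mpg` are illustrative and not part of the original analysis.

```r
# Illustrative names (trans, counts, mean_mpg); base R only.
# Label the am dummy so the output is self-explanatory (0 = automatic, 1 = manual).
trans <- factor(mtcars$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Number of cars in each transmission subsample.
counts <- table(trans)
print(counts)
# automatic: 19, manual: 13

# Mean mpg per subsample, matching the mpg.m column of the summary table.
mean_mpg <- tapply(mtcars$mpg, trans, mean)
print(round(mean_mpg, 3))
```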

Initial Exploratory Data Plot

Here I used a conditioning plot (as suggested in the notes to the mtcars dataset) to visualise in a simple way the role that transmission plays for fuel economy in the sampled cars. The plot shows that, conditional on weight, manual transmission (right-hand panel) still appears to be more fuel efficient, as it goes more miles per gallon. However, there is a lack of observations for cars with automatic transmission in the lower weight classes. Against intuition, we also observe a stronger linear relationship between mpg and wt for manual transmission. One might have expected the opposite, since automatic transmission should reduce the impact of human factors and thereby increase the importance of car characteristics such as weight.

# Conditioning plot of mpg against weight, one panel per transmission type
coplot(mpg ~ wt | as.factor(am), data = mtcars,
       col = "red", bg = "pink", pch = 21, bar.bg = c(fac = "light blue"))
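
The visual impression of a stronger linear relationship in one panel can be given a number by computing the correlation between mpg and wt within each subsample. This is a small sketch rather than part of the original analysis; the name `cors` is illustrative.

```r
# Illustrative name (cors); base R only.
# Split the data by transmission type and compute the mpg-wt correlation in each part.
cors <- sapply(split(mtcars, mtcars$am), function(d) cor(d$mpg, d$wt))
names(cors) <- c("automatic", "manual")  # split() orders the groups as am = 0, then am = 1

print(round(cors, 3))
```

A correlation closer to -1 indicates a tighter negative linear relationship between weight and fuel economy in that subsample.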

Regression results

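library(stargazer)  # provides stargazer()
# (1) common slopes, separate intercept by transmission
fit1 <- lm(mpg ~ wt + disp + factor(am), data=mtcars)
# (2) separate intercept and separate weight and displacement slopes
fit2 <- lm(mpg ~ factor(am) + wt*factor(am) + disp*factor(am), data=mtcars)
# (3) separate intercept and separate weight slope only
fit3 <- lm(mpg ~ factor(am) + wt*factor(am), data=mtcars)
stargazer(fit1, fit2, fit3, dep.var.labels=c("Miles per Gallon"),
          covariate.labels=c("weight", "displacement", "weight if manual",
                             "displacement if manual", "manual", "constant (automatic)"),
          align=TRUE, type='html')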
                          Dependent variable: Miles per Gallon
                          (1)                       (2)                       (3)
weight                    -3.279**                  -1.775                    -3.786***
                          (1.328)                   (1.299)                   (0.786)
displacement              -0.018*                   -0.017*
                          (0.009)                   (0.009)
weight if manual                                    -5.171**                  -5.298***
                                                    (2.440)                   (1.445)
displacement if manual                              -0.001
manual                    0.178                     14.885***                 14.878***
                          (1.484)                   (4.726)                   (4.264)
constant (automatic)      34.676***                 28.868***                 31.416***
                          (3.241)                   (3.166)                   (3.020)
Observations              32                        32                        32
R2                        0.781                     0.861                     0.833
Adjusted R2               0.758                     0.834                     0.815
Residual Std. Error       2.967 (df = 28)           2.458 (df = 26)           2.591 (df = 28)
F Statistic               33.293*** (df = 3; 28)    32.085*** (df = 5; 26)    46.567*** (df = 3; 28)
Note: *p<0.1; **p<0.05; ***p<0.01

Several regression lines were fitted to the data. The first (column 1) tests whether the intercept differs significantly between the two transmission types. The results suggest that, controlling for weight and displacement, there is no significant difference in fuel efficiency between them. Note that the coefficient estimates labelled manual (the dummy takes the value 1 for manual transmission) should be read as deviations from the corresponding estimates for automatic transmission. Hence, according to column 1, all the observations lie on essentially the same regression line with a similar intercept.

Column 2 tests whether there is a difference in both intercept and slope across the two transmission types. Allowing for slope differences makes the intercept difference significant: at zero weight (which is out of sample) manual transmission has a much better fuel economy. Within the sample, however, this result may be irrelevant.

The preferred specification is column 3, since weight and displacement appear to be close to collinear (a full investigation of which was considered beyond the scope of the report) and, also in view of the model degrees of freedom, better results are obtained by including only weight as a control. All the parameter estimates are now highly significant, and the reduction in overall fit (as captured by R2) from dropping displacement is minor. The separate slope by transmission type resolves the apparent paradox in the study: an increase in weight reduces fuel efficiency much more in cars with manual than with automatic transmission. Again, this result may owe entirely to the skewness in the subsamples of the main study variables (transmission type and wt).

The correct interpretation is therefore to insert the overall sample mean weight into the column 3 equation. The intercept and simple slope give the result for automatic transmission; the result for manual transmission is obtained by adding the additional intercept and slope estimates (since am takes the value 1 for manual transmission):

## [1] 3.21725
  1. Automatic at the sample mean weight: 31.416 - 3.786 × 3.21725 = 19.235 mpg.
  2. Manual at the sample mean weight: (31.416 + 14.878) - (3.786 + 5.298) × 3.21725 = 17.069 mpg.

In other words, at the sample mean weight automatic transmission is (19.235 - 17.069)/17.069 ≈ 13% more fuel efficient than manual transmission.
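
The hand calculation can be reproduced with `predict()`. The sketch below refits the column 3 specification with a labelled factor so that the coefficient names are readable. The names `cars`, `trans`, `fit`, `at_mean`, `pred`, `pct_gain` and `crossover` are illustrative and do not appear in the original analysis.

```r
# Illustrative names: cars, trans, fit, at_mean, pred, pct_gain, crossover.
cars <- mtcars
cars$trans <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Same specification as column 3: separate intercept and weight slope by transmission.
fit <- lm(mpg ~ trans * wt, data = cars)

# Predicted mpg for each transmission type at the overall sample mean weight.
at_mean <- data.frame(
  trans = factor(c("automatic", "manual"), levels = levels(cars$trans)),
  wt    = mean(cars$wt)
)
pred <- predict(fit, newdata = at_mean)
names(pred) <- levels(cars$trans)

# Relative advantage of automatic over manual at the mean weight, in percent.
pct_gain <- 100 * (pred[["automatic"]] - pred[["manual"]]) / pred[["manual"]]

# Weight (in 1000 lbs) at which the two fitted lines cross.
b <- coef(fit)
crossover <- -b[["transmanual"]] / b[["transmanual:wt"]]

print(round(pred, 3))
print(round(pct_gain, 1))
print(round(crossover, 2))
```

The crossover weight is the point where the two fitted lines meet: below it the manual line predicts the higher mpg, above it the automatic line does. With the column 3 estimates it is 14.878/5.298 ≈ 2.81, that is roughly 2,800 lbs, which lies below the sample mean weight.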


The study investigates the impact of transmission on fuel efficiency, using a small sample of 32 veteran automobiles (1973-74 models) available from the R datasets (mtcars). It is assumed that we should answer the research questions for a representative veteran car, to the extent that this is possible given the limited size of the sample. The best means we have of attaining some representativeness in this situation is to include as many conditioning or control variables as possible. However, as we have also learned in this class, unnecessary control variables should be avoided, and there is a tradeoff between the number of control variables included and inference, since the model degrees of freedom are already low. The most important aim for study validity is therefore to select the correct control variables. The regression analysis suggests that only weight is an important control variable and that the influence of transmission on fuel efficiency also depends on the weight of the vehicle in question. The final equation therefore includes both a separate intercept and a separate slope estimator for transmission. The results are interpreted to mean that, at the mean weight, automatic transmission increases fuel efficiency by 13 percent, which confirms the theoretical hypothesis proposed at the outset of the study.
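
The choice of weight as the only control can be examined with two quick checks: the correlation between weight and displacement, and a nested-model F test of column 2 against column 3. This is a sketch under the assumption that a standard `anova()` comparison is an acceptable test here; it is not part of the original analysis, and the names `r_wt_disp`, `fit_full`, `fit_wt` and `cmp` are illustrative.

```r
# Illustrative names: r_wt_disp, fit_full, fit_wt, cmp.
# Correlation between the two candidate control variables.
r_wt_disp <- cor(mtcars$wt, mtcars$disp)

# Column 2 (weight and displacement, each interacted with transmission)
# versus column 3 (weight only).
fit_full <- lm(mpg ~ factor(am) * wt + factor(am) * disp, data = mtcars)
fit_wt   <- lm(mpg ~ factor(am) * wt, data = mtcars)

# Nested-model F test: do the two displacement terms jointly improve the fit?
cmp <- anova(fit_wt, fit_full)

print(round(r_wt_disp, 3))
print(cmp)
```

A high correlation between wt and disp, together with an F test that does not reject the smaller model at the 5 percent level, is consistent with dropping displacement.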

An alternative approach could have been to exclude outliers (the appendix results suggest removing the observations for the Fiat 128, the Mercedes 240D and the Toyota Corolla) or to conduct a matching procedure on the two types of cars. However, at such a small sample size these would be costly strategies and therefore possibly futile for improving the validity of the study.


Henderson and Velleman (1981). Building multiple regression models interactively. Biometrics, 37, 391–411.

USEPA (1975). Factors Affecting Automotive Fuel Economy. US Environmental Protection Agency. September 1975.

Appendix - outlier diagnostics

Here I show, using the plot() call in R, the outlier diagnostics for the preferred specification in the study. Three cars stand out as outliers with unusually good fuel economy relative to their weight class: the Mercedes 240D (automatic), the Fiat 128 (manual) and the Toyota Corolla (manual). This could owe to inaccuracies in the data collected or to the individual driving behaviour of the specific owners of these three vehicles. Comparing the sampled fuel economy of these three cars with that of similar cars from around the same time (which can be looked up) confirms that these outliers have unusually good fuel economy for their time and weight class. However, as the factors behind these outliers appear to be independent of the main study variable, transmission, excluding them would probably not affect the reported results very much, except perhaps to shift them slightly further in favour of automatic transmission, as two of the outliers have manual gear.

fit3 <- lm(mpg ~ factor(am) + wt*factor(am), data=mtcars)
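
A minimal sketch of how the diagnostics described above might be produced follows. It assumes the default `plot()` method for `lm` objects and ranks cars by absolute raw residual; the names `op`, `res` and `top3` are illustrative.

```r
# Illustrative names: op, res, top3. Refit the preferred specification (column 3).
fit3 <- lm(mpg ~ factor(am) + wt * factor(am), data = mtcars)

# Standard diagnostic plots for a linear model, arranged in a 2 x 2 grid.
op <- par(mfrow = c(2, 2))
plot(fit3)
par(op)  # restore the previous graphics settings

# The three cars with the largest absolute residuals.
res  <- residuals(fit3)
top3 <- names(sort(abs(res), decreasing = TRUE))[1:3]
print(round(res[top3], 2))
```

Positive residuals mean that a car achieves more miles per gallon than the model predicts for its weight and transmission type.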