After analysis of the mtcars dataset, we are 85% confident that cars with manual transmissions have higher MPG values than cars with automatic transmissions.
When considered by itself, the automatic/manual variable has a significant effect of 7.24 miles per gallon in favor of manual transmissions (see Appendix 1). However, when it is combined with the weight and horsepower variables, this effect goes down to 2.08 MPG and is associated with more uncertainty.
The uncertainty of this effect is quantified as an interval of possible values in which we have 85% confidence. The interval ranges from 0.0464 to 4.121 MPG and represents the MPG improvement in favor of manual transmissions.
When plotting the relationship between transmission type and miles per gallon without the influence of other variables, there is a significant difference between the group means (see Appendix 1).
A two-sample t test (Welch, unequal variances) at the 0.95 confidence level confirms a difference of 7.24 MPG with a p-value of 0.0014. We can also observe some overlap between the two groups in the plot.
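The comparison above can be sketched in a few lines of R. This is a minimal sketch rather than the exact call used for the report: the object names `welch`, `pooled` and `mean_diff` are illustrative, and both the Welch (unequal variance) and pooled (equal variance) variants are computed so that the reported p-value can be checked against each.

```r
# Compare mean mpg between automatic (am = 0) and manual (am = 1) cars.
data(mtcars)

# Welch test (R's default, var.equal = FALSE) and pooled-variance test.
welch  <- t.test(mpg ~ am, data = mtcars, conf.level = 0.95)
pooled <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE, conf.level = 0.95)

# Difference in group means, manual minus automatic.
mean_diff <- unname(welch$estimate[2] - welch$estimate[1])

round(mean_diff, 2)
c(welch = welch$p.value, pooled = pooled$p.value)
```

Both variants give the same difference in means; only the standard error, the degrees of freedom and therefore the p-value differ.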
In the rest of the analysis, we will challenge the validity of this effect by adjusting for other covariates and checking whether the initial effect is still present.
For model selection, we shall try two methods: one that is based on domain knowledge and one that is automated.
Based on our domain knowledge, it seems reasonable to assume that miles per gallon is related to the car's weight and horsepower. Furthermore, the car's weight does not necessarily depend only on its horsepower, so both variables should be included to avoid potential bias.
fitneg <- lm(mpg ~ am + wt + hp, data = mtcars); summary(fitneg)$coefficients
##                Estimate  Std. Error   t value     Pr(>|t|)
## (Intercept) 34.00287512 2.642659337 12.866916 2.824030e-13
## am           2.08371013 1.376420152  1.513862 1.412682e-01
## wt          -2.87857541 0.904970538 -3.180850 3.574031e-03
## hp          -0.03747873 0.009605422 -3.901830 5.464023e-04
For the automated version, we used the step() function (part of R's stats package; MASS provides the closely related stepAIC()), which gave us the following model:
fitpos <- lm(mpg ~ am + wt + qsec, data = mtcars); summary(fitpos)$coefficients
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  9.617781  6.9595930  1.381946 1.779152e-01
## am           2.935837  1.4109045  2.080819 4.671551e-02
## wt          -3.916504  0.7112016 -5.506882 6.952711e-06
## qsec         1.225886  0.2886696  4.246676 2.161737e-04
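The report does not show the selection call itself, so the following is only a sketch of how such an automated search might be run. It assumes the search starts from the full model with every mtcars variable as a predictor and moves in both directions; the names `full` and `selected` are illustrative.

```r
# Stepwise selection by AIC, starting from the model with all predictors.
# Assumption: full starting model and direction = "both"; the report does
# not state which settings were actually used.
data(mtcars)

full     <- lm(mpg ~ ., data = mtcars)
selected <- step(full, direction = "both", trace = 0)  # trace = 0 silences the log

# Predictors retained by the search.
attr(terms(selected), "term.labels")
```

Because step() ranks candidate models by AIC alone, the predictors it keeps are not guaranteed to be the ones that make most sense from a domain point of view, which is why both selection approaches are compared here.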
Both models have very similar adjusted R-squared values (share of variability explained by the model, penalized for the number of predictors): 82.27% and 83.36% respectively.
Their F-statistics (overall model fit) are 48.96 and 52.75 respectively. Both are highly significant (p-values of 2.908e-11 and 1.21e-11).
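These fit statistics can be pulled out of the model summaries as follows. The helper `fit_stats` and the result name `stats_tbl` are illustrative names introduced for this sketch; the two models are refitted here so that the block runs on its own.

```r
# Refit the two candidate models so the block is self-contained.
data(mtcars)
fitneg <- lm(mpg ~ am + wt + hp,   data = mtcars)
fitpos <- lm(mpg ~ am + wt + qsec, data = mtcars)

# Extract R-squared, adjusted R-squared, the F-statistic and its p-value.
fit_stats <- function(fit) {
  s <- summary(fit)
  f <- s$fstatistic  # named vector: value, numdf, dendf
  c(r_squared     = s$r.squared,
    adj_r_squared = s$adj.r.squared,
    f_statistic   = unname(f["value"]),
    p_value       = unname(pf(f["value"], f["numdf"], f["dendf"],
                              lower.tail = FALSE)))
}

# One column per model, one row per statistic.
stats_tbl <- sapply(list(fitneg = fitneg, fitpos = fitpos), fit_stats)
stats_tbl
```

Both models use three predictors on the same 32 observations, so comparing their R-squared values directly is reasonable; the adjusted version matters more when the numbers of predictors differ.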
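However, when we compute the 95% confidence intervals of the transmission coefficient for both models, we can observe that the first model fails to reject the null hypothesis of no effect (its interval includes 0) whereas the second rejects it (its interval excludes 0). With an 85% confidence interval, both models reject the null hypothesis of no effect. In the table below, the row labels refer to the model selection method and not to the transmission type: Manual is the domain-knowledge model (fitneg) and Automatic is the model chosen by step() (fitpos).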
##                 2.5 %   97.5 %      7.5 %   92.5 %
## Manual    -0.73575874 4.903179 0.04641094 4.121009
## Automatic  0.04573031 5.825944 0.84749620 5.024178
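A table of this shape can be rebuilt with confint(). This is a sketch under the assumption that each row holds the interval for the `am` coefficient of one model; the helper `am_ci` and the name `ci_table` are illustrative.

```r
# Refit the two candidate models so the block is self-contained.
data(mtcars)
fitneg <- lm(mpg ~ am + wt + hp,   data = mtcars)  # domain-knowledge model
fitpos <- lm(mpg ~ am + wt + qsec, data = mtcars)  # step()-selected model

# Confidence interval of the transmission coefficient at a given level.
am_ci <- function(fit, level) confint(fit, parm = "am", level = level)

# Rows: model selection method; columns: 95% bounds, then 85% bounds.
ci_table <- rbind(
  Manual    = c(am_ci(fitneg, 0.95), am_ci(fitneg, 0.85)),
  Automatic = c(am_ci(fitpos, 0.95), am_ci(fitpos, 0.85))
)
colnames(ci_table) <- c("2.5 %", "97.5 %", "7.5 %", "92.5 %")
ci_table
```

Each interval is the coefficient estimate plus or minus a t quantile (28 residual degrees of freedom) times its standard error, so lowering the confidence level from 95% to 85% narrows the interval without moving its center.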
The residual plots of both models, contained in the diagnostic plots (Appendix 2 and Appendix 3), are very similar. No apparent pattern can be observed and the residuals are more or less evenly distributed around zero.
The Q-Q plots inform us about the normality of the residuals. The residuals of the first model appear closer to normal than those of the second, whose Q-Q plot shows an S shape.
Finally, both models have outliers.
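The appendix figures are not reproduced here, but diagnostics of this kind might be produced as sketched below with base R's plot method for lm objects. The rule used to flag outliers (absolute standardized residual above 2) is a common convention assumed for illustration, not a threshold taken from this analysis, and `flagged` is an illustrative name.

```r
# Refit the two candidate models so the block is self-contained.
data(mtcars)
fitneg <- lm(mpg ~ am + wt + hp,   data = mtcars)
fitpos <- lm(mpg ~ am + wt + qsec, data = mtcars)

# Four standard diagnostic plots per model (residuals vs fitted, Q-Q,
# scale-location, residuals vs leverage), arranged on a 2 x 2 grid.
old_par <- par(mfrow = c(2, 2))
plot(fitneg)
plot(fitpos)
par(old_par)  # restore the previous graphics settings

# Assumed rule of thumb: flag cars whose standardized residual exceeds 2
# in absolute value.
flagged <- lapply(
  list(fitneg = fitneg, fitpos = fitpos),
  function(fit) names(which(abs(rstandard(fit)) > 2))
)
flagged
```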
Both models seem to indicate that there is an independent positive effect on MPG when going from automatic to manual transmission. The model including the weight and horsepower variables has residuals closer to normal and a more significant intercept term, but is less confident about the effect. The model with the qsec variable has a slightly higher adjusted R-squared value and is more confident about the effect. For our conclusions we will use the results of the first model, which are more conservative.
To the question “Is an automatic or manual transmission better for MPG”, we can answer that we are 85% confident that a manual transmission is better for MPG.
To the second question, “Quantify the MPG difference between automatic and manual transmissions”, we give the 85% confidence interval of the more conservative model, which ranges from 0.0464 to 4.121 MPG and represents the MPG improvement in favor of manual transmissions.