This exploratory analysis investigates the conditional effect of transmission type (manual versus automatic) on vehicle fuel efficiency (\(MPG\)), using the mtcars dataset. An initial, unadjusted two-sample \(t\)-test indicates a significant baseline disparity: manual vehicles show a \(7.25\text{ MPG}\) advantage over their automatic counterparts (\(p < 0.05\)). However, this bivariate comparison is subject to omitted variable bias.

To better isolate the effect of transmission type, we used backward stepwise elimination to build a reduced multivariate linear regression model. After controlling for key confounding variables, specifically vehicle weight (\(wt\)), gross horsepower (\(hp\)), and cylinder count (\(cyl\)), the apparent effect of a manual transmission attenuates sharply, to a smaller and less statistically significant improvement of just \(1.81\text{ MPG}\). The final model explains about 87% of the variance in fuel efficiency (\(R^2 = 0.866\)), with weight and engine characteristics carrying most of the explanatory power. We conclude that the raw advantage attributed to transmission type is largely driven by underlying structural differences between the vehicles themselves.
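library(ggplot2)  # plotting library for the appendix figures
data(mtcars)      # load the built-in dataset (View(mtcars) only works interactively)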
See Appendix Figure I: an exploratory scatter plot comparing MPG for automatic and manual transmissions. The plot suggests that vehicles with a manual transmission have noticeably higher MPG than vehicles with an automatic transmission.
T-test on miles per gallon by transmission (manual, automatic)
testResults <- t.test(mpg ~ am, mtcars)
testResults$p.value
## [1] 0.001373638
With a p-value of about 0.0014, the t-test (Welch's two-sample test, the default for t.test) rejects the null hypothesis that the difference in mean MPG between transmission types is 0.
testResults$estimate
## mean in group 0 mean in group 1
##        17.14737        24.39231
The estimated difference between the two transmission types is 7.24494 MPG in favor of manual (in mtcars, am = 0 is automatic and am = 1 is manual).
Linear model with all variables as predictors
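The difference quoted above can be recomputed directly from the test object, together with its confidence interval. The following is a minimal sketch using only base R; the variable name `mpg_diff` is introduced here for illustration and is not part of the original analysis.

```r
# Welch two-sample t-test, as in the report (am: 0 = automatic, 1 = manual)
testResults <- t.test(mpg ~ am, data = mtcars)

# Difference in group means, manual minus automatic
mpg_diff <- unname(diff(testResults$estimate))
mpg_diff

# 95% confidence interval for (group 0 - group 1), i.e. automatic minus manual,
# so both bounds are negative when manual cars have the higher mean MPG
testResults$conf.int
```

Because the interval is reported for automatic minus manual, its sign is the opposite of `mpg_diff`.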
all_model_reg <- lm(mpg ~ ., data = mtcars)
summary(all_model_reg)
##
## Call:
## lm(formula = mpg ~ ., data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.4506 -1.6044 -0.1196  1.2193  4.6271
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.30337   18.71788   0.657   0.5181
## cyl         -0.11144    1.04502  -0.107   0.9161
## disp         0.01334    0.01786   0.747   0.4635
## hp          -0.02148    0.02177  -0.987   0.3350
## drat         0.78711    1.63537   0.481   0.6353
## wt          -3.71530    1.89441  -1.961   0.0633 .
## qsec         0.82104    0.73084   1.123   0.2739
## vs           0.31776    2.10451   0.151   0.8814
## am           2.52023    2.05665   1.225   0.2340
## gear         0.65541    1.49326   0.439   0.6652
## carb        -0.19942    0.82875  -0.241   0.8122
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.65 on 21 degrees of freedom
## Multiple R-squared: 0.869, Adjusted R-squared: 0.8066
## F-statistic: 13.93 on 10 and 21 DF, p-value: 3.793e-07
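None of the individual coefficients has a p-value below 0.05, even though the overall F-test is highly significant (p = 3.793e-07). With all ten predictors included, the full model cannot tell us which variables matter most, so we reduce it.
Backward selection (by AIC, the default criterion for step) to determine which variables to retain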
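A model whose overall fit is strong while no individual coefficient is significant often points to correlated predictors. As a hedged illustration that is not part of the original analysis, the sketch below computes variance inflation factors (VIFs) in base R from the inverse of the predictors' correlation matrix; the name `vif` is a local variable chosen here, not a call to any package.

```r
# Predictors of the full model: every column of mtcars except the response mpg
predictors <- mtcars[, setdiff(names(mtcars), "mpg")]

# For a linear model with an intercept, the VIF of predictor j equals the
# j-th diagonal element of the inverse of the predictors' correlation matrix
vif <- diag(solve(cor(predictors)))

# Larger values mean the predictor is better explained by the other predictors
sort(vif, decreasing = TRUE)
```

A common rule of thumb treats VIFs above roughly 5 to 10 as a sign that the corresponding standard errors are inflated, which would be consistent with the large p-values in the full model.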
stepFit <- step(all_model_reg)
summary(stepFit)
The new model has 4 variables (cylinders, horsepower, weight, transmission). Its R-squared value of 0.8659 means that the model explains about 87% of the variance in MPG. Not every coefficient is equally significant, however: the transmission effect is less statistically significant than in the unadjusted t-test. The coefficients indicate that, holding all other variables constant:
Cylinders: A 6-cylinder engine decreases MPG by 3.03 compared to a 4-cylinder engine, while an 8-cylinder engine decreases MPG by 2.16 compared to a 4-cylinder engine.
Horsepower: Every 1-unit increase in horsepower decreases MPG by 0.0321 (a 3.21 drop per 100 horsepower).
Weight: Every 1,000 lb increase in weight decreases MPG by 2.5.
Transmission: A manual transmission improves MPG by 1.81 compared to an automatic transmission.
Residual plots: see Appendix Figure II.
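The separate 6- and 8-cylinder coefficients reported above imply that cylinder count entered the reduced model as a categorical variable. The sketch below fits that four-variable model directly, assuming `cyl` and `am` are converted to factors first; the conversion step and the names `cars` and `fit` are assumptions made here, since the original preprocessing is not shown.

```r
# Assumption: cyl and am are treated as categorical predictors
cars <- within(mtcars, {
  cyl <- factor(cyl)                                    # levels 4, 6, 8
  am  <- factor(am, labels = c("Automatic", "Manual"))  # 0 = automatic, 1 = manual
})

# Fit the reduced model named in the text (cylinders, horsepower, weight, transmission)
fit <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# Estimates, standard errors, t values, and p-values for each term
summary(fit)$coefficients

# Share of MPG variance explained by the model
summary(fit)$r.squared

# Horsepower effect rescaled to a 100 hp increase
100 * coef(fit)[["hp"]]
```

Reading the p-value column of the coefficient table is the most direct way to check which terms are individually significant.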
The plots conclude:
sum((abs(dfbetas(stepFit)))>1)
## [1] 1
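The count above says how many DFBETAS entries exceed 1 in absolute value, but not which car or coefficient is involved. The following minimal sketch lists them, assuming the reduced model is mpg ~ cyl + hp + wt + am with `cyl` as a factor (inferred from the reported coefficients, not shown in the original code); the names `cars`, `fit`, `db`, and `flagged` are illustrative.

```r
# Assumption: refit the reduced model with cyl as a categorical predictor
cars <- transform(mtcars, cyl = factor(cyl))
fit  <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# DFBETAS: scaled change in each coefficient when one observation is dropped
db <- dfbetas(fit)

# Row and column positions of entries whose absolute value exceeds 1
flagged <- which(abs(db) > 1, arr.ind = TRUE)

# Translate positions into car names and coefficient names
data.frame(
  car         = rownames(db)[flagged[, "row"]],
  coefficient = colnames(db)[flagged[, "col"]]
)
```

The cutoff of 1 is a common rule of thumb for small datasets; a flagged entry marks an observation worth inspecting rather than one that must be removed.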
There is a difference in MPG based on transmission type: a manual transmission gives a slight MPG boost. However, weight, horsepower, and number of cylinders appear to matter more than transmission type in determining MPG.