Executive Summary

In this report, we investigate the difference in MPG (miles per gallon) between manual and automatic transmission cars in the mtcars dataset. Through exploratory analysis, a hypothesis test, regression analysis, and a review of residuals, we find that, on average, a manual transmission car gets about 2.936 mpg more than an automatic transmission car after adjusting for weight (wt) and quarter-mile time (qsec).

Data Set-up

Exploratory Data Analysis

We first inspect the data to see what type each variable is. MPG is numeric, while transmission type (am) is a binary variable (0 = automatic, 1 = manual).

head(mtcars)
summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000

We then take a preliminary look at how MPG differs between automatic and manual transmission cars.

boxplot(mpg ~ am, data = mtcars, names = c("Automatic", "Manual"), xlab = "Transmission", ylab = "Miles Per Gallon")

Here we see that automatic transmission cars have a lower median mpg value than that of manual transmission cars.
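The boxplot comparison can also be checked numerically. The following is a minimal sketch, assuming a fresh copy of the built-in dataset; the names cars, transmission, group_n, group_median, and group_mean are illustrative and not part of the original analysis.

```r
# Work on a fresh copy so the original mtcars object is left untouched.
cars <- datasets::mtcars

# Label the binary am variable (0 = automatic, 1 = manual) for readability.
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("Automatic", "Manual"))

# Group sizes, medians, and means of mpg by transmission type.
group_n      <- table(cars$transmission)
group_median <- tapply(cars$mpg, cars$transmission, median)
group_mean   <- tapply(cars$mpg, cars$transmission, mean)

print(group_n)
print(group_median)
print(group_mean)
```

The medians correspond to the center lines of the two boxes in the plot.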

Hypothesis Testing

We further test whether mean MPG differs by transmission type using a two-sample t-test. In this case:

Null hypothesis: There is no difference in mean mpg between automatic and manual transmission cars.

Alternative hypothesis: There is a difference in mean mpg between the two transmission types.

mtcars$am <- as.factor(mtcars$am)
t.test(mpg~am, data=mtcars, conf.level=0.95)
## 
##  Welch Two Sample t-test
## 
## data:  mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.280194  -3.209684
## sample estimates:
## mean in group 0 mean in group 1 
##        17.14737        24.39231

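Since the p-value of 0.001374 is below the significance level of 0.05, we reject the null hypothesis. The estimated mean MPG is 24.39 for manual cars versus 17.15 for automatic cars, a difference of about 7.24 mpg, and the 95% confidence interval for the difference (automatic minus manual) runs from -11.28 to -3.21, which excludes zero. Note that this comparison does not yet adjust for any other variable.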
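To make the test result less of a black box, the Welch statistic, its degrees of freedom, the p-value, and the confidence interval can be reproduced by hand. This is a sketch under the assumption that the built-in dataset is used unchanged; all variable names (auto, manual, se2_auto, and so on) are illustrative.

```r
cars <- datasets::mtcars

# Split mpg by transmission type (0 = automatic, 1 = manual).
auto   <- cars$mpg[cars$am == 0]
manual <- cars$mpg[cars$am == 1]

# Squared standard error of each group mean.
se2_auto   <- var(auto) / length(auto)
se2_manual <- var(manual) / length(manual)
se_diff    <- sqrt(se2_auto + se2_manual)

# Welch t statistic for the difference in means (automatic minus manual).
mean_diff <- mean(auto) - mean(manual)
t_stat    <- mean_diff / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df <- (se2_auto + se2_manual)^2 /
  (se2_auto^2 / (length(auto) - 1) + se2_manual^2 / (length(manual) - 1))

# Two-sided p-value and 95% confidence interval for the difference.
p_value <- 2 * pt(-abs(t_stat), df = welch_df)
ci      <- mean_diff + c(-1, 1) * qt(0.975, df = welch_df) * se_diff

print(c(t = t_stat, df = welch_df, p = p_value))
print(ci)
```

These values should agree with the t.test() output above, since t.test() performs the Welch (unequal variance) test by default.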

Model selection (Regression Analysis)

In this next section, we aim to quantify the MPG difference between transmission types. We first regress MPG on all other variables to see which of them should be included in our regression model.

model <- lm(mpg ~ ., data=mtcars)
summary(model)$coef
##                Estimate  Std. Error    t value   Pr(>|t|)
## (Intercept) 12.30337416 18.71788443  0.6573058 0.51812440
## cyl         -0.11144048  1.04502336 -0.1066392 0.91608738
## disp         0.01333524  0.01785750  0.7467585 0.46348865
## hp          -0.02148212  0.02176858 -0.9868407 0.33495531
## drat         0.78711097  1.63537307  0.4813036 0.63527790
## wt          -3.71530393  1.89441430 -1.9611887 0.06325215
## qsec         0.82104075  0.73084480  1.1234133 0.27394127
## vs           0.31776281  2.10450861  0.1509915 0.88142347
## am1          2.52022689  2.05665055  1.2254035 0.23398971
## gear         0.65541302  1.49325996  0.4389142 0.66520643
## carb        -0.19941925  0.82875250 -0.2406258 0.81217871

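Based on the summary of the coefficients, no single predictor is significant at the 0.05 level when all variables are included, which is not surprising given how strongly many of the predictors are correlated with one another. wt has the smallest p-value (0.063), with am (0.234) and qsec (0.274) following at some distance. To arrive at a more parsimonious model, we can use the step() function, which here performs backward elimination based on AIC: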

finalmodel <- step(lm(mpg~.,data=mtcars),trace=0)
summary(finalmodel)
## 
## Call:
## lm(formula = mpg ~ wt + qsec + am, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.4811 -1.5555 -0.7257  1.4110  4.6610 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   9.6178     6.9596   1.382 0.177915    
## wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## qsec          1.2259     0.2887   4.247 0.000216 ***
## am1           2.9358     1.4109   2.081 0.046716 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336 
## F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11
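As a rough check on what the stepwise search did, the AIC of the full model can be compared with that of the selected model. This is a sketch only; full_fit, step_fit, aic_table, and kept_terms are illustrative names, and direction = "backward" is written out explicitly although it is already the default when no scope is supplied.

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am)  # treat transmission as a factor, as in the report

# Full model with every predictor, then backward elimination by AIC.
full_fit <- lm(mpg ~ ., data = cars)
step_fit <- step(full_fit, direction = "backward", trace = 0)

# A lower AIC indicates a better trade-off between fit and complexity.
aic_table <- AIC(full_fit, step_fit)
print(aic_table)

# Predictors that survived the elimination.
kept_terms <- attr(terms(step_fit), "term.labels")
print(kept_terms)
```

Setting trace = 1 instead would print each elimination step, which can help when judging how close the discarded variables were to being kept.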

As anticipated, the stepwise algorithm keeps wt, and it also retains qsec and am. The resulting model has an R^2 of about 0.85 (adjusted R^2 of about 0.83). Holding weight and quarter-mile time constant, manual transmission cars get an estimated 2.9358 more miles per gallon than automatic cars (p = 0.047), considerably less than the unadjusted difference of about 7.24 mpg found in the t-test.
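The uncertainty around the transmission effect can be expressed as a confidence interval, and the value of adding wt and qsec can be assessed with a nested-model F-test. The sketch below refits the selected model on a fresh copy of the data; fit, am_only, am_ci, and nested_test are illustrative names rather than objects from the analysis above.

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am)

# Selected model and a transmission-only baseline for comparison.
fit     <- lm(mpg ~ wt + qsec + am, data = cars)
am_only <- lm(mpg ~ am, data = cars)

# 95% confidence interval for the adjusted manual-vs-automatic difference.
am_ci <- confint(fit, parm = "am1", level = 0.95)
print(am_ci)

# Unadjusted vs adjusted estimate of the transmission effect.
print(c(unadjusted = coef(am_only)[["am1"]], adjusted = coef(fit)[["am1"]]))

# F-test: do wt and qsec jointly improve on the transmission-only model?
nested_test <- anova(am_only, fit)
print(nested_test)
```

Because the am1 p-value is just under 0.05, the lower end of the interval sits only slightly above zero, so the adjusted effect should be read as modest and fairly uncertain.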

Appendix

Paired Relationships between all variables

pairs(mpg ~ ., data=mtcars, main = "Relationships Between Variables")

Residuals

par(mfrow=c(2,2))
plot(finalmodel)
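
The diagnostic plots can be complemented with a few numeric checks. This is a minimal sketch that refits the selected model; the cutoff of twice the mean leverage is a common rule of thumb assumed here for illustration, not a threshold used elsewhere in this report, and all object names are illustrative.

```r
cars <- datasets::mtcars
cars$am <- factor(cars$am)
fit <- lm(mpg ~ wt + qsec + am, data = cars)

# Shapiro-Wilk test of the residuals as a rough normality check.
res <- resid(fit)
sw  <- shapiro.test(res)
print(sw)

# Leverage (hat values); flag cars above twice the mean leverage
# (assumed rule-of-thumb cutoff).
lev <- hatvalues(fit)
high_leverage <- names(lev)[lev > 2 * mean(lev)]
print(high_leverage)

# Cook's distance summarises each car's influence on the fitted coefficients.
cook <- cooks.distance(fit)
print(head(sort(cook, decreasing = TRUE), 3))
```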