Executive Summary

We have been tasked with analysing data for Motor Trend, a magazine about the automobile industry. Using a data set covering a collection of cars, the magazine wants to explore the relationship between a set of variables and the outcome, miles per gallon (mpg). It is particularly interested in the following two questions:

“Is an automatic or manual transmission better for mpg”

“Quantify the mpg difference between automatic and manual transmissions”

Our aim is to answer these two questions through exploratory and inferential analysis, and to close by fitting a regression model that supports our conclusion.

Analysis

data(mtcars)
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

To begin, we load the data and take a quick look at its structure. For now we are interested in the relationship between transmission type (am) and miles per gallon (mpg), so we compare the mean mpg of the two transmission types to see whether they differ. Note: am is coded 0 = automatic, 1 = manual.

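# Mean mpg by transmission type (am: 0 = automatic, 1 = manual)
t.m <- tapply(mtcars$mpg, mtcars$am, mean)
names(t.m) <- c("Automatic", "Manual")
# Absolute difference between the two group means
d.tm <- abs(t.m[1] - t.m[2])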
t.m
## Automatic    Manual 
##  17.14737  24.39231

The mean mpg differs by 7.2449393 between the two transmission types, with manual transmission having the higher mpg. We next test whether this difference is statistically significant.

# Two-sample t-test of mpg by transmission type (t.test() defaults to the Welch test)
t.t <- with(mtcars, t.test(mpg ~ am))
p.tt <- round(t.t$p.value, 8)

The t-test gives a p-value of 0.0013736, so at the conventional 5% level we reject the null hypothesis that the two transmission types have the same mean mpg. This supports the difference between manual and automatic transmission that we observed earlier.
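As a rough sketch of how the size of the difference could be read off the same test, the block below pulls the 95% confidence interval and the group means out of the t-test result. The names tt, ci and est are illustrative and not part of the original analysis.

```r
data(mtcars)

# Welch two-sample t-test of mpg by transmission (am: 0 = automatic, 1 = manual)
tt <- t.test(mpg ~ am, data = mtcars)

# 95% confidence interval for mean(automatic) - mean(manual)
ci <- tt$conf.int

# Group means as estimated by the test
est <- tt$estimate

print(round(ci, 2))
print(round(est, 2))
```

Because the interval is for automatic minus manual, an interval lying entirely below zero is consistent with manual cars having the higher mean mpg.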

To quantify the mpg difference we need a model that predicts mpg well from the variables available. We fit three models: Model 1 regresses mpg on am only, Model 2 regresses mpg on all the other variables, and Model 3 is chosen by R's step function, which by default performs stepwise selection on AIC, here starting from the full model. We then compare the three models by adjusted R squared.

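# Model 1: transmission type only
m1 <- lm(mpg ~ am, data = mtcars)
# Model 2: all remaining variables as predictors
m2 <- lm(mpg ~ ., data = mtcars)
# Model 3: stepwise selection from the full model (step() uses AIC by default)
m3 <- step(lm(mpg ~ ., data = mtcars), trace = 0)
# Adjusted R squared for each model
rm1 <- round(summary(m1)$adj.r.squared, 8)
rm2 <- round(summary(m2)$adj.r.squared, 8)
rm3 <- round(summary(m3)$adj.r.squared, 8)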

Fitting the three models gives the following adjusted R squared values: Model 1 (0.3384589), Model 2 (0.8066423), and Model 3 (0.8335561).

Model 3 (m3) has the highest adjusted R squared value.
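Since step() selects on AIC by default rather than on adjusted R squared, it may be worth checking that the two criteria agree. The sketch below refits the three models, compares their AIC values, and runs a nested F-test of the am-only model against the selected one. The names aic and cmp are illustrative.

```r
data(mtcars)

# Refit the three models so the block is self-contained
m1 <- lm(mpg ~ am, data = mtcars)
m2 <- lm(mpg ~ ., data = mtcars)
m3 <- step(m2, trace = 0)

# AIC for each model (lower is better)
aic <- AIC(m1, m2, m3)
print(aic)

# Nested F-test: do the extra predictors in m3 improve on the am-only model?
cmp <- anova(m1, m3)
print(cmp)
```

The F-test is valid here because Model 1 is nested in Model 3: both contain am, and Model 3 adds wt and qsec.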

summary(m3)
## 
## Call:
## lm(formula = mpg ~ wt + qsec + am, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.4811 -1.5555 -0.7257  1.4110  4.6610 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   9.6178     6.9596   1.382 0.177915    
## wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## qsec          1.2259     0.2887   4.247 0.000216 ***
## am            2.9358     1.4109   2.081 0.046716 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336 
## F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11

This indicates that the selected model uses wt, qsec, and am as predictors. wt has a negative relationship with mpg, while qsec and am have positive relationships. For our question (mpg versus am), the model estimates that a manual transmission gets about 2.94 more mpg than an automatic, holding wt and qsec constant. That is well below the unadjusted difference of about 7.24 mpg, and with a p-value of 0.047 the adjusted effect is only marginally significant at the 5% level. To conclude, Motor Trend should look not only at am but also at wt and qsec when assessing what drives mpg.
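To attach some uncertainty to the 2.94 mpg estimate, one option is a confidence interval for the am coefficient. The following is a minimal sketch that refits the selected formula directly. The names fit and ci_am are illustrative.

```r
data(mtcars)

# Refit the selected model (mpg ~ wt + qsec + am) so the block is self-contained
fit <- lm(mpg ~ wt + qsec + am, data = mtcars)

# 95% confidence interval for the am coefficient:
# the estimated manual-minus-automatic mpg difference, holding wt and qsec fixed
ci_am <- confint(fit, "am", level = 0.95)
print(ci_am)
```

An interval whose lower bound sits close to zero would match the marginal p-value for am in the model summary.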

Appendix

The boxplot compares mpg across the two transmission types, and the residual plots for Model 3 provide a check on the model's assumptions.

library(ggplot2)
# Boxplot of mpg grouped by transmission type
p <- ggplot(mtcars, aes(factor(am, labels = c("Automatic", "Manual")), mpg))
p + geom_boxplot() + labs(title = "MPG by Transmission Type", x = "Transmission Type", y = "Miles per gallon (mpg)")

plot(m3)
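plot() on an lm object produces four diagnostic plots by default, so it can help to arrange them in one grid. The sketch below does that and also lists the observations with the largest Cook's distance as a simple influence check. The names fit, op and cd are illustrative, and the influence check is an addition to the original appendix.

```r
data(mtcars)

# Refit the selected model so the block is self-contained
fit <- lm(mpg ~ wt + qsec + am, data = mtcars)

# Draw the four default diagnostic plots in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous graphics settings

# Cook's distance measures each observation's influence on the fit
cd <- cooks.distance(fit)
print(head(sort(cd, decreasing = TRUE), 3))
```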