Exploratory Data Analysis Motor Trend

In this project, I will answer two questions:

"Is an automatic or manual transmission better for MPG?"

"Quantify the MPG difference between automatic and manual transmissions."


```r
data<-mtcars
data$am<-factor(data$am, labels=c("Auto","Manual"))
str(data)
```

```
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : Factor w/ 2 levels "Auto","Manual": 2 2 2 1 1 1 1 1 1 1 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
```

```r
require(ggplot2)
```

```
## Loading required package: ggplot2
```

```r
ggplot(data, aes(x=am, y=mpg, colour=am))+
  geom_boxplot()+
  xlab("Auto or Manual")+
  ylab("MPG")+
  ggtitle("MPG vs Auto or Manual")
```

<img src="Courseara-Regression-Model-Final_files/figure-html/cars-1.png" width="672" />

Analysis

From the boxplot above, we see that cars with manual transmissions have a higher median MPG than cars with automatic transmissions, and their mean MPG is higher as well. On this evidence alone, we would expect someone who purchases a manual car to get better MPG than someone who purchases an automatic.
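
The comparison of means and medians can also be checked numerically rather than read off the plot. The following is a minimal sketch, assuming the same `mtcars` recoding used above; the names `group_mean`, `group_median`, and `group_n` are introduced here for illustration only.

```r
# Rebuild the data frame so this block runs on its own.
data <- mtcars
data$am <- factor(data$am, labels = c("Auto", "Manual"))

# Mean, median, and sample size of MPG within each transmission group.
group_mean   <- tapply(data$mpg, data$am, mean)
group_median <- tapply(data$mpg, data$am, median)
group_n      <- as.vector(table(data$am))

# One row per statistic, one column per transmission type.
print(rbind(mean = group_mean, median = group_median, n = group_n))
```

The group sizes are worth printing alongside the averages, since a comparison of two small, unequal groups is more sensitive to a few unusual cars.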

## Build Linear Regression Model

```r
reg<-lm(mpg~am,data=data)
summary(reg)
```

```
## 
## Call:
## lm(formula = mpg ~ am, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.3923 -3.0923 -0.2974  3.2439  9.5077 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   17.147      1.125  15.247 1.13e-15 ***
## amManual       7.245      1.764   4.106 0.000285 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.902 on 30 degrees of freedom
## Multiple R-squared:  0.3598, Adjusted R-squared:  0.3385 
## F-statistic: 16.86 on 1 and 30 DF,  p-value: 0.000285
```

From the above analysis, we find that manual cars average 7.245 more MPG than automatic cars. The intercept is the mean for the reference level, so automatic cars average 17.147 MPG. Adding the amManual coefficient to the intercept gives the mean for manual cars: 17.147 + 7.245 = 24.392 MPG. The p-value for amManual (0.000285) is below alpha = 0.05, so the difference is statistically significant. However, R-squared is only 0.3598: transmission type alone explains about 36% of the variance in MPG, so this is not a good model for predicting MPG. Other variables need to be included to predict MPG.
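
The arithmetic and the closing claim above can be sketched in code. This is a rough illustration, not part of the original analysis: it recomputes the manual-car mean from the coefficients, adds a 95% confidence interval for the difference, and compares R-squared against a second model. The choice of `wt` as the extra predictor is an assumption made only for illustration, and the names `manual_mean`, `ci_am`, `reg_wt`, and `r2` are hypothetical.

```r
# Rebuild the objects used above so this block runs on its own.
data <- mtcars
data$am <- factor(data$am, labels = c("Auto", "Manual"))
reg <- lm(mpg ~ am, data = data)

# Mean MPG for manual cars = intercept + amManual coefficient.
manual_mean <- sum(coef(reg))

# 95% confidence interval for the manual-vs-automatic difference.
ci_am <- confint(reg, "amManual", level = 0.95)

# Illustrative comparison only: wt is one candidate extra predictor,
# chosen here as an assumption, not taken from the analysis above.
reg_wt <- lm(mpg ~ am + wt, data = data)
r2 <- c(am_only    = summary(reg)$r.squared,
        am_plus_wt = summary(reg_wt)$r.squared)

print(manual_mean)
print(ci_am)
print(r2)
```

Because the second model nests the first, its R-squared cannot be lower. The more informative check is how much it rises and whether the amManual coefficient changes once another variable is held fixed.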

## How good of a fit is our model?

```r
res<-resid(reg)
fit<-fitted(reg)
plot(fit,res)
```
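
With a single two-level predictor, the fitted values can take only two distinct values (the two group means), so the residual plot shows two vertical strips of points. A slightly fuller version of the plot might look like the sketch below, which adds axis labels and a zero reference line and compares the residual spread in each group. The name `res_sd` is introduced here for illustration.

```r
# Rebuild the model so this block runs on its own.
data <- mtcars
data$am <- factor(data$am, labels = c("Auto", "Manual"))
reg <- lm(mpg ~ am, data = data)

res <- resid(reg)   # observed MPG minus fitted MPG
fit <- fitted(reg)  # one of two group means per car

# Residuals against fitted values, with a dashed reference line at zero.
plot(fit, res, xlab = "Fitted MPG", ylab = "Residual",
     main = "Residuals vs Fitted")
abline(h = 0, lty = 2)

# Residual standard deviation within each transmission group.
res_sd <- tapply(res, data$am, sd)
print(res_sd)
```

If the two strips have clearly different spreads, the constant-variance assumption behind the reported standard errors is questionable for this model.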