Executive Summary

In this report I analyse the relationship between mileage (miles per gallon) and transmission type across different motor vehicles. The aim is to find out which transmission type gives better mileage while accounting for other factors such as vehicle weight, number of cylinders, and horsepower. The data set used is R's built-in “mtcars” dataset. A multiple linear regression model is fitted to the data to estimate the relationship between mileage and transmission type. Cars with automatic transmission are heavier on average than cars with manual transmission (Appendix, Fig 2); this difference is not examined separately in the analysis.

Results: In the mtcars dataset, manual transmission appears to give better mileage than automatic transmission, although the difference is not statistically significant once the other predictors are included (p = 0.206). Number of cylinders and vehicle weight are two of the most important factors affecting mileage.

Load Data

## [1] 32 11
##                mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
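
The report does not show the calls that produced this output. A minimal sketch that would print the same kind of information, assuming the data comes straight from base R's `datasets` package, might look like this:

```r
# mtcars ships with base R (package 'datasets'), so no download is needed
data(mtcars)

dims <- dim(mtcars)      # number of rows and columns
print(dims)
print(head(mtcars, 3))   # first three cars
str(mtcars)              # type and a preview of each column
```

All eleven columns are stored as numeric, including the ones that are really categories (`cyl`, `vs`, `am`, `gear`, `carb`).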

Data

Exploratory Analysis

The correlation matrix plot shows the correlation between every pair of variables in the dataset. From it we can see that “cyl”, “disp”, “hp” and “wt” are negatively correlated with mileage, while “vs”, “am” and “gear” are positively correlated with it.
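
The plotting package behind the correlation figure is not shown, so the sketch below only computes the numbers such a plot would display. The names `mpg_cor`, `negative` and `positive` are illustrative, not taken from the report.

```r
data(mtcars)

# Pearson correlation of every other variable with mpg
mpg_cor <- cor(mtcars)["mpg", ]
mpg_cor <- mpg_cor[names(mpg_cor) != "mpg"]
print(round(sort(mpg_cor), 2))

# Split the variables by the sign of their correlation with mpg
negative <- names(mpg_cor)[mpg_cor < 0]
positive <- names(mpg_cor)[mpg_cor > 0]
print(negative)
print(positive)
```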

Exploratory box plot comparing MPG for automatic and manual transmissions.

Exploratory violin plot comparing mileage by transmission type.
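
The two figures could be drawn roughly as follows. This is a sketch under assumptions: the original plotting code is not shown, the box plot here uses base graphics, and the violin plot is only attempted if `ggplot2` happens to be installed. In `mtcars`, `am = 0` is automatic and `am = 1` is manual.

```r
data(mtcars)
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Group means that the plots visualise
mpg_by_am <- tapply(cars$mpg, cars$am, mean)
print(round(mpg_by_am, 2))

# Box plot with base graphics
boxplot(mpg ~ am, data = cars, xlab = "Transmission", ylab = "Miles per gallon")

# Violin plot, drawn only when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(cars, ggplot2::aes(x = am, y = mpg, fill = am)) +
    ggplot2::geom_violin() +
    ggplot2::labs(x = "Transmission", y = "Miles per gallon")
  print(p)
}
```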

Regression Analysis

The full model uses all the variables in the data to predict mileage.

Categorical variables are converted to factors before fitting.
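
A minimal sketch of this step is given below. The choice of factor columns is inferred from the term names in the coefficient table (`cyl6`, `vs1`, `am1`, `gear4`, `carb2`, ...), not from the report's code. The tibble output suggests something like `broom::tidy()` was used for printing, which is also an assumption, so base R is used here instead.

```r
data(mtcars)
cars <- mtcars

# Treat the discrete columns as categories rather than numbers
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
cars[factor_cols] <- lapply(cars[factor_cols], factor)

# Full model: mpg regressed on every other column
full_fit <- lm(mpg ~ ., data = cars)

# Estimate, standard error, t statistic and p-value for each term
print(round(coef(summary(full_fit)), 4))
```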

## # A tibble: 17 x 5
##    term        estimate std.error statistic p.value
##    <chr>          <dbl>     <dbl>     <dbl>   <dbl>
##  1 (Intercept)  23.9      20.1       1.19    0.253 
##  2 cyl6         -2.65      3.04     -0.871   0.397 
##  3 cyl8         -0.336     7.16     -0.0470  0.963 
##  4 disp          0.0355    0.0319    1.11    0.283 
##  5 hp           -0.0705    0.0394   -1.79    0.0939
##  6 drat          1.18      2.48      0.476   0.641 
##  7 wt           -4.53      2.54     -1.78    0.0946
##  8 qsec          0.368     0.935     0.393   0.700 
##  9 vs1           1.93      2.87      0.672   0.512 
## 10 am1           1.21      3.21      0.377   0.711 
## 11 gear4         1.11      3.80      0.293   0.773 
## 12 gear5         2.53      3.74      0.677   0.509 
## 13 carb2        -0.979     2.32     -0.423   0.679 
## 14 carb3         3.00      4.29      0.699   0.495 
## 15 carb4         1.09      4.45      0.245   0.810 
## 16 carb6         4.48      6.38      0.701   0.494 
## 17 carb8         7.25      8.36      0.867   0.399

None of the coefficients is significant at the 5% level, so the full model does not allow a conclusion about transmission type. This may be due to the inclusion of too many variables.

Stepwise Selection

## # A tibble: 6 x 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)  33.7       2.60      12.9   7.73e-13
## 2 cyl6         -3.03      1.41      -2.15  4.07e- 2
## 3 cyl8         -2.16      2.28      -0.947 3.52e- 1
## 4 hp           -0.0321    0.0137    -2.35  2.69e- 2
## 5 wt           -2.50      0.886     -2.82  9.08e- 3
## 6 am1           1.81      1.40       1.30  2.06e- 1

After stepwise selection, the final model contains number of cylinders, horsepower, weight and transmission type as predictors of the outcome variable, mileage.
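
The selection could be reproduced roughly as below. This is a sketch that assumes base R's AIC-based `step()` was used with its default settings; the report does not show the actual call.

```r
data(mtcars)
cars <- mtcars
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
cars[factor_cols] <- lapply(cars[factor_cols], factor)

full_fit <- lm(mpg ~ ., data = cars)

# AIC-based selection; with no scope given, step() searches backward
step_fit <- step(full_fit, trace = 0)

print(formula(step_fit))
print(round(coef(summary(step_fit)), 4))
```

Stepwise selection picks the model by AIC, so a term such as `am` can be retained even when its individual p-value is above 0.05.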

Residuals & Diagnostics

Residual Plot

The diagnostic plots suggest:

  1. The random scatter in the Residuals vs. Fitted plot shows no systematic pattern, which supports the independence assumption
  2. The points of the Normal Q-Q plot follow the line closely, which suggests that the residuals are approximately normally distributed
  3. The random spread of points in the Scale-Location plot supports the constant variance assumption
  4. Since all points are within the 0.5 Cook's distance lines, the Residuals vs. Leverage plot suggests that there are no influential outliers
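
The four panels discussed above are the default output of `plot()` on an `lm` object. A minimal sketch, refitting the selected model directly (the object names `fit` and `cooks` are illustrative):

```r
data(mtcars)
cars <- mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am)

# Final model from the stepwise selection
fit <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)

# Largest Cook's distance, to compare against the contours in the last panel
cooks <- cooks.distance(fit)
print(round(max(cooks), 3))
```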

Results

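In the fitted model, with the other predictors held fixed, going from 4 to 6 cylinders reduces mileage by about 3.0 miles per gallon (p = 0.041). Going from 4 to 8 cylinders reduces it by about 2.2 miles per gallon, but this estimate is not significant (p = 0.352). Each additional horsepower is associated with a decrease of about 0.03 miles per gallon (p = 0.027), and every 1000 pound increase in the weight of the car with a decrease of about 2.5 miles per gallon (p = 0.009). Finally, manual transmission increases mileage by about 1.8 miles per gallon compared to automatic transmission, but this effect is not statistically significant (p = 0.206).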
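
Confidence intervals are one way to read these estimates alongside their uncertainty. The sketch below is not part of the original analysis and assumes the final model is refitted directly with `lm()`.

```r
data(mtcars)
cars <- mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am)

fit <- lm(mpg ~ cyl + hp + wt + am, data = cars)

# 95% confidence intervals for every coefficient
ci <- confint(fit, level = 0.95)
print(round(cbind(estimate = coef(fit), ci), 3))
```

Because the p-value for `am1` is above 0.05, its 95% interval includes zero, so the data are also compatible with no transmission effect after adjustment.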

Conclusion

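Cars with manual transmission show higher MPG than cars with automatic transmission. After adjusting for weight, horsepower and number of cylinders, the estimated benefit of a manual transmission is slight (about 1.8 MPG) and not statistically significant. Weight, horsepower and number of cylinders are more statistically significant in determining MPG.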

Appendix

Fig 1
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Fig 2
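
The figure itself is not reproduced here. According to the Executive Summary it concerns vehicle weight by transmission type, so a hedged sketch of the underlying comparison, with illustrative names, could be:

```r
data(mtcars)
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Mean weight (in 1000 lb) for each transmission type
wt_by_am <- tapply(cars$wt, cars$am, mean)
print(round(wt_by_am, 2))

# Box plot of weight by transmission
boxplot(wt ~ am, data = cars, xlab = "Transmission", ylab = "Weight (1000 lb)")
```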