In this report I analyse the relationship between mileage (miles per gallon, MPG) and transmission type for different motor vehicles. The aim is to find out which transmission type gives better mileage once other factors, such as vehicle weight, number of cylinders and horsepower, are taken into account. The data set used is “mtcars”, which is built into R. A multiple linear regression model is fitted to the data to estimate the relationship between mileage and transmission type. Cars with automatic transmission are on average heavier than cars with manual transmission (Appendix, Fig. 2), and this imbalance is not examined separately in the analysis.
Results: In the mtcars dataset, manual transmission is associated with somewhat higher mileage than automatic transmission, although the adjusted difference (about 1.8 MPG) is not statistically significant. Number of cylinders and vehicle weight are two of the most important predictors of mileage.
## [1] 32 11
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
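The output above looks like the result of three standard inspection calls. The report's own code is not shown, so the following is only a sketch of how such output could be produced with base R.

```r
# mtcars ships with base R (package "datasets"), so nothing needs downloading.
data(mtcars)

dim(mtcars)      # number of rows and columns
head(mtcars, 3)  # first three cars
str(mtcars)      # column types; every column is numeric at this stage
```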
The correlation matrix plot shows the pairwise correlations between all variables in the dataset. From it we can see that “cyl”, “disp”, “hp” and “wt” are negatively correlated with mileage, while “vs”, “am” and “gear” are positively correlated with it.
Exploratory box plot comparing MPG for automatic and manual transmissions.
Exploratory violin plot comparing mileage by transmission type.
Convert the categorical variables (cyl, vs, am, gear and carb) to factors.
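The plotting function used for the correlation matrix is not shown in the report. As a minimal sketch, the signs described above can be checked numerically with base R; the object name `mpg_cors` is chosen here for illustration.

```r
data(mtcars)

# Pearson correlation of every variable with mpg.
mpg_cors <- cor(mtcars)[, "mpg"]

# Drop mpg's correlation with itself and sort from most negative to most positive.
sort(mpg_cors[names(mpg_cors) != "mpg"])
```

Variables at the top of the sorted output move against mileage, and those at the bottom move with it.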
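The two exploratory plots could be drawn roughly as follows. This sketch assumes ggplot2 is installed (the `geom_smooth()` messages elsewhere in the report suggest it was used); the axis labels and the names `plot_data`, `box_plot` and `violin_plot` are illustrative.

```r
library(ggplot2)
data(mtcars)

# Label the 0/1 transmission code for readable axes (0 = automatic, 1 = manual).
plot_data <- transform(
  mtcars,
  transmission = factor(am, levels = c(0, 1), labels = c("Automatic", "Manual"))
)

# Box plot: median, quartiles and outliers of mpg per transmission type.
box_plot <- ggplot(plot_data, aes(x = transmission, y = mpg)) +
  geom_boxplot() +
  labs(x = "Transmission", y = "Miles per gallon")

# Violin plot: the same comparison, showing the shape of each distribution.
violin_plot <- ggplot(plot_data, aes(x = transmission, y = mpg)) +
  geom_violin() +
  labs(x = "Transmission", y = "Miles per gallon")

print(box_plot)
print(violin_plot)
```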
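A minimal sketch of the conversion is given below. The set of columns is inferred from the dummy-coded terms in the regression output (cyl6, vs1, am1, gear4, carb2 and so on), and the copy name `mt` is chosen for this example.

```r
data(mtcars)

# Work on a copy so the built-in data set stays untouched.
mt <- mtcars

# Columns treated as categorical; inferred from the model terms in the report.
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
mt[factor_cols] <- lapply(mt[factor_cols], factor)

str(mt[factor_cols])
```

After the conversion, `lm()` creates one dummy variable per non-reference level instead of treating, say, cylinder count as a single numeric slope.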
## # A tibble: 17 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 23.9 20.1 1.19 0.253
## 2 cyl6 -2.65 3.04 -0.871 0.397
## 3 cyl8 -0.336 7.16 -0.0470 0.963
## 4 disp 0.0355 0.0319 1.11 0.283
## 5 hp -0.0705 0.0394 -1.79 0.0939
## 6 drat 1.18 2.48 0.476 0.641
## 7 wt -4.53 2.54 -1.78 0.0946
## 8 qsec 0.368 0.935 0.393 0.700
## 9 vs1 1.93 2.87 0.672 0.512
## 10 am1 1.21 3.21 0.377 0.711
## 11 gear4 1.11 3.80 0.293 0.773
## 12 gear5 2.53 3.74 0.677 0.509
## 13 carb2 -0.979 2.32 -0.423 0.679
## 14 carb3 3.00 4.29 0.699 0.495
## 15 carb4 1.09 4.45 0.245 0.810
## 16 carb6 4.48 6.38 0.701 0.494
## 17 carb8 7.25 8.36 0.867 0.399
None of the p-values is significant at the 5% level, so this model does not allow a conclusion about transmission type. This may be due to the inclusion of too many variables.
Stepwise model selection
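The full model can be reproduced approximately as below. The tibble layout of the table above suggests a tidying helper such as `broom::tidy()`, but that is an assumption; this sketch uses only base R, and the names `mt` and `full_fit` are illustrative.

```r
data(mtcars)
mt <- mtcars
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
mt[factor_cols] <- lapply(mt[factor_cols], factor)

# Regress mpg on every other column; factors expand into dummy variables.
full_fit <- lm(mpg ~ ., data = mt)

# Estimate, standard error, t statistic and p-value for each term.
round(summary(full_fit)$coefficients, 3)
```

With 17 coefficients estimated from 32 cars, only 15 residual degrees of freedom remain, which is one reason the standard errors are so wide.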
## # A tibble: 6 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 33.7 2.60 12.9 7.73e-13
## 2 cyl6 -3.03 1.41 -2.15 4.07e- 2
## 3 cyl8 -2.16 2.28 -0.947 3.52e- 1
## 4 hp -0.0321 0.0137 -2.35 2.69e- 2
## 5 wt -2.50 0.886 -2.82 9.08e- 3
## 6 am1 1.81 1.40 1.30 2.06e- 1
After stepwise selection, the final model contains number of cylinders, horsepower, weight and transmission type as predictors of the outcome variable, mileage.
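The exact selection call is not shown in the report. One common way to do this in base R is `step()`, which adds or drops terms to minimise AIC; the sketch below assumes that approach with default settings, and the object names are illustrative.

```r
data(mtcars)
mt <- mtcars
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
mt[factor_cols] <- lapply(mt[factor_cols], factor)

full_fit <- lm(mpg ~ ., data = mt)

# AIC-based stepwise search starting from the full model; trace = 0 silences the log.
step_fit <- step(full_fit, trace = 0)

formula(step_fit)                          # which predictors survived
round(summary(step_fit)$coefficients, 3)   # their estimates and p-values
```

Because the search optimises AIC rather than p-values, a term such as transmission type can be retained even when its individual p-value is above 0.05.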
Residual Plot
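The residual plot itself is not reproduced here. As a sketch, the standard base R diagnostic panel for the final model could be drawn like this, assuming the model formula reported by the stepwise selection; `fit` and `mt` are names chosen for the example.

```r
data(mtcars)
mt <- mtcars
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
mt[factor_cols] <- lapply(mt[factor_cols], factor)

fit <- lm(mpg ~ cyl + hp + wt + am, data = mt)

# Four diagnostic plots in a 2 x 2 grid: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage.
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)  # restore the previous graphics settings
```

A patternless residuals-vs-fitted panel and a roughly straight Q-Q line would support the linear model's assumptions.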
The plots conclude:
The fitted model shows significant effects for several predictors. Holding the other predictors fixed, going from 4 to 6 cylinders reduces mileage by about 3 MPG, and 8-cylinder cars get about 2.2 MPG less than 4-cylinder cars, although this second difference is not statistically significant (p = 0.35). Each additional 1000 pounds of weight is associated with a decrease of 2.5 MPG, and each additional horsepower with a decrease of about 0.03 MPG. Finally, manual transmission increases mileage by an estimated 1.8 MPG compared with automatic transmission, but this effect is not statistically significant (p = 0.21).
There is a difference in MPG by transmission type: a manual transmission is associated with a slight MPG boost. However, weight, horsepower and number of cylinders are more statistically significant predictors of MPG, and the transmission effect itself is not significant at the 5% level once they are included.
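Written out with the rounded estimates from the stepwise table, the fitted model is as follows. The bracket notation $[\cdot]$ is introduced here for illustration: it equals 1 when the condition holds and 0 otherwise, and wt is in units of 1000 lb.

$$
\widehat{\text{mpg}} = 33.7 - 3.03\,[\text{cyl}=6] - 2.16\,[\text{cyl}=8] - 0.0321\,\text{hp} - 2.50\,\text{wt} + 1.81\,[\text{am}=1]
$$

For example, the Mazda RX4 (6 cylinders, 110 hp, wt = 2.620, manual) gets

$$
33.7 - 3.03 - 0.0321 \times 110 - 2.50 \times 2.620 + 1.81 \approx 22.4
$$

against an observed value of 21.0 MPG. Both cylinder coefficients are measured relative to the 4-cylinder baseline, so they are not cumulative.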
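One way to make the uncertainty around the transmission effect concrete is a confidence interval. The sketch below refits the final model reported above and uses base R's `confint()`; the names `mt`, `fit` and `ci` are illustrative.

```r
data(mtcars)
mt <- mtcars
factor_cols <- c("cyl", "vs", "am", "gear", "carb")
mt[factor_cols] <- lapply(mt[factor_cols], factor)

fit <- lm(mpg ~ cyl + hp + wt + am, data = mt)

# 95% confidence intervals for every coefficient.
ci <- confint(fit, level = 0.95)
round(ci, 2)

# Interval for the manual-vs-automatic difference only.
ci["am1", ]
```

If the interval for `am1` contains zero, the data are consistent with no difference between the two transmission types after adjusting for cylinders, horsepower and weight.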