In this report we explore the relationship between different features and miles per gallon (MPG), the outcome, using the mtcars dataset. We also try to answer the following questions: 1- Is an automatic or manual transmission better for MPG? 2- What is the MPG difference between automatic and manual transmissions?
1- The variables most strongly correlated with mpg are wt, cyl, disp and hp; all four correlations are negative.
cor(mtcars$mpg,mtcars)
##      mpg       cyl       disp         hp      drat         wt     qsec
## [1,]   1 -0.852162 -0.8475514 -0.7761684 0.6811719 -0.8676594 0.418684
##             vs        am      gear       carb
## [1,] 0.6640389 0.5998324 0.4802848 -0.5509251
See the appendix for the ggpairs plot.
2- Next we make a violin plot of MPG for automatic and manual transmissions.
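As a small illustration of the ranking above, the correlations can be sorted by absolute value so that strong negative relationships stand out. This is only a sketch using base R and the built-in mtcars data; the names `cors`, `ranked` and `top4` are introduced here for illustration and are not part of the original analysis.

```r
# Correlation of mpg with every other variable in mtcars (base R only).
cors <- cor(mtcars$mpg, mtcars[, names(mtcars) != "mpg"])[1, ]

# Rank by absolute value so strong negative correlations are not overlooked.
ranked <- cors[order(abs(cors), decreasing = TRUE)]
print(round(ranked, 3))

# Names of the four variables most strongly correlated with mpg.
top4 <- names(ranked)[1:4]
print(top4)
```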
amMpgViolin = ggplot(data = mtcars, aes(x=am, y = mpg, fill = factor(am))) + geom_violin(colour = "black", size = 2)
The graph above shows that cars with manual transmission have higher MPG than cars with automatic transmission. Therefore manual is better for mpg.
3- Next we compare nested models built from the most highly correlated features, using an ANOVA test.
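The group labels in the t-test output suggest that am was recoded as a factor before plotting. The sketch below shows one way this could be done, together with group summaries that back up the reading of the plot. It assumes the standard mtcars coding (0 = automatic, 1 = manual); the copy `cars` and the summary names are hypothetical and not taken from the original code.

```r
# Work on a copy so the built-in dataset is left untouched.
cars <- mtcars

# In mtcars, am is coded 0 = automatic, 1 = manual.
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("Automatic", "Manual"))

# Group means and medians of mpg by transmission type.
group_means <- tapply(cars$mpg, cars$am, mean)
group_medians <- tapply(cars$mpg, cars$am, median)
print(group_means)
print(group_medians)
```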
fit1 <- lm(mpg ~ factor(cyl) + wt , data=mtcars)
fit2 <- lm(mpg ~ factor(cyl) + wt + hp, data=mtcars)
fit3 <- lm(mpg ~ factor(cyl) + wt + hp + am, data=mtcars)
fit4 <- lm(mpg ~ factor(cyl)*disp + wt + hp + am, data=mtcars)
The results are the adjusted R-squared values of the four models, followed by the ANOVA results for the first three.
## [1] 0.8200146
## [1] 0.8360668
## [1] 0.8400875
## [1] 0.8625561
## Analysis of Variance Table
##
## Model 1: mpg ~ factor(cyl) + wt
## Model 2: mpg ~ factor(cyl) + wt + hp
## Model 3: mpg ~ factor(cyl) + wt + hp + am
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)
## 1     28 183.06
## 2     27 160.78  1    22.281 3.8358 0.06098 .
## 3     26 151.03  1     9.752 1.6789 0.20646
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
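The ANOVA results above show that adding am as a regressor is not significant at the 5% level (p = 0.206), although adjusted R-squared rises slightly, from 0.836 to 0.840, so am explains a small share of the variability. fit4 has the highest adjusted R-squared of the four models (0.863), so we take it as the best model; note that fit4 was not part of the ANOVA comparison. The residuals of this model do not follow a specific pattern. See the appendix for the complete diagnostic plots.
1- We run a t-test on the mpg of the two groups, automatic and manual transmission.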
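The model comparison can be reproduced along the following lines. This is a sketch rather than the original code: it refits the four models on the built-in mtcars data, collects their adjusted R-squared values, and adds an ANOVA of fit3 against fit4, a comparison that does not appear in the table above. The names `fits`, `adj_r2` and `cmp` are introduced here for illustration.

```r
# Refit the four candidate models on the built-in mtcars data.
fit1 <- lm(mpg ~ factor(cyl) + wt, data = mtcars)
fit2 <- lm(mpg ~ factor(cyl) + wt + hp, data = mtcars)
fit3 <- lm(mpg ~ factor(cyl) + wt + hp + am, data = mtcars)
fit4 <- lm(mpg ~ factor(cyl) * disp + wt + hp + am, data = mtcars)

# Adjusted R-squared penalises extra regressors, unlike plain R-squared.
fits <- list(fit1 = fit1, fit2 = fit2, fit3 = fit3, fit4 = fit4)
adj_r2 <- sapply(fits, function(f) summary(f)$adj.r.squared)
print(round(adj_r2, 4))

# fit3 is nested in fit4 (fit4 adds disp and its interaction with cyl),
# so the two can be compared with an F-test.
cmp <- anova(fit3, fit4)
print(cmp)
```

Looking at the p-value of this last comparison, alongside adjusted R-squared, gives a fuller basis for preferring fit4 than adjusted R-squared alone.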
t.test(mpg ~ am, data=mtcars)
##
## Welch Two Sample t-test
##
## data: mpg by am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.280194 -3.209684
## sample estimates:
## mean in group Automatic    mean in group Manual
##                17.14737                24.39231
The test shows that mean mpg differs between automatic and manual transmissions (p = 0.001374). The sample means differ by about 7.24 mpg (24.39 for manual against 17.15 for automatic), and with 95% confidence the mean mpg of automatic cars is between 3.209684 and 11.280194 lower than that of manual cars.
We explored the relationship between different features and miles per gallon (MPG) as the outcome. We found that the best features to include in our model as predictors are cyl, disp, wt, hp and am. We concluded that cars with manual transmission have, on average, higher mpg than cars with automatic transmission.
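The size of the difference can be read directly from the test object. The sketch below assumes am is recoded as a factor with the labels shown in the output above; `cars`, `tt`, `diff_means` and `ci` are illustrative names that do not appear in the original code.

```r
# Recode am so the groups carry the labels seen in the output.
cars <- mtcars
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("Automatic", "Manual"))

# Welch two-sample t-test (the default, unequal variances).
tt <- t.test(mpg ~ am, data = cars)

# Difference in sample means, expressed as manual minus automatic.
diff_means <- unname(tt$estimate[2] - tt$estimate[1])

# t.test reports the interval for automatic minus manual;
# negate and reverse it to get manual minus automatic.
ci <- rev(-as.numeric(tt$conf.int))

print(diff_means)
print(ci)
```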
1- Explore different relations between variables and the correlation values between them
print(mtcarsPairs)
2- Detailed model residuals and leverage
plot(fit4)
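To complement the diagnostic plots, leverage can also be inspected numerically. The sketch below refits fit4 on the built-in mtcars data and flags cars whose hat value exceeds twice the average leverage, a common rule of thumb and not a threshold taken from the original report. The names `lev`, `threshold` and `high_leverage` are illustrative.

```r
# Refit the final model so the block is self-contained.
fit4 <- lm(mpg ~ factor(cyl) * disp + wt + hp + am, data = mtcars)

# Hat values measure how far each car's predictors sit from the bulk
# of the data; they sum to the number of estimated coefficients.
lev <- hatvalues(fit4)
p <- fit4$rank
n <- nrow(mtcars)

# Rule-of-thumb cut-off (an assumption): twice the average leverage p / n.
threshold <- 2 * p / n
high_leverage <- lev[lev > threshold]
print(round(sort(high_leverage, decreasing = TRUE), 3))
```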