The report analyses the mtcars data, which was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). The report tries to answer the following questions based on the best model fit and confidence intervals for independent (unpaired) groups.
With reference to this report and the questions it aims to answer, there is one dependent variable (mpg) and 10 predictor variables. See the plot in appendix section 5.1 for the correlations among the variables.
##                mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710    22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
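As a rough numeric companion to the correlation plot, the sketch below loads the built-in mtcars data, previews the first three rows, and lists the correlation of each predictor with mpg. The variable name `cors` is illustrative and not part of the original analysis.

```r
# Load the built-in dataset and preview the first three rows
data(mtcars)
print(head(mtcars, 3))

# Pearson correlation of every predictor with mpg (mpg itself excluded)
cors <- cor(mtcars)["mpg", names(mtcars) != "mpg"]
print(round(sort(cors), 2))
```

Sorting the vector puts the most negatively correlated predictors first, which gives a quick sense of which variables may compete with transmission type in a model.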
Automatic versus manual transmission appears to have a direct effect on mpg. The exploratory plot titled “MPG v/s Transmission type” in appendix section 5.2 supports this. However, several other variables, together with transmission type, affect mpg. We will use the forward selection method for model selection.
Under the forward selection method, we start by fitting a model with transmission (am) as the only regressor and then add regressors one at a time, each time choosing the one that lowers AIC the most. If adding a regressor has no significant positive effect on the criterion, we stop and take the current model forward for further diagnostics.
add1(lm(mpg~am,data=mtcars),mpg~cyl+disp+hp+drat+wt+qsec+vs+am+gear+carb,test="F")
add1(lm(mpg~am+cyl,data=mtcars),mpg~cyl+disp+hp+drat+wt+qsec+vs+am+gear+carb,test="F")
add1(lm(mpg~am+cyl+wt,data=mtcars),mpg~cyl+disp+hp+drat+wt+qsec+vs+am+gear+carb,test="F")
add1(lm(mpg~am+cyl+wt+qsec,data=mtcars),mpg~cyl+disp+hp+drat+wt+qsec+vs+am+gear+carb,test="F")
Fit = lm(mpg~am+cyl+wt+qsec,data=mtcars)
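The manual add1() sequence above can also be written as a loop. This is a minimal sketch under stated assumptions, not the procedure the report actually ran: the 0.05 cutoff (`alpha`) and the names `all_terms`, `start`, `current`, `cand` and `best` are illustrative, and an automated loop may pick different terms from the hand-selected sequence.

```r
data(mtcars)

# Candidate terms, matching the scope used in the add1() calls above
all_terms <- c("cyl", "disp", "hp", "drat", "wt", "qsec", "vs", "am", "gear", "carb")
scope <- reformulate(all_terms, response = "mpg")

alpha <- 0.05                          # assumed significance cutoff
start <- lm(mpg ~ am, data = mtcars)   # transmission is always in the model
current <- start

repeat {
  in_model <- attr(terms(current), "term.labels")
  if (length(setdiff(all_terms, in_model)) == 0) break  # nothing left to add

  # One-term additions to the current model, with AIC and F tests
  cand <- add1(current, scope = scope, test = "F")
  cand <- cand[rownames(cand) != "<none>", , drop = FALSE]

  best <- rownames(cand)[which.min(cand$AIC)]   # candidate with the lowest AIC
  if (cand[best, "Pr(>F)"] >= alpha) break      # no significant gain: stop

  current <- update(current, as.formula(paste(". ~ . +", best)))
}

print(formula(current))
```

Because every candidate adds a single coefficient, the term with the lowest AIC is also the one with the largest F statistic, so the selection rule and the stopping rule agree.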
Evaluate the model for collinearity using variance inflation factors (VIF). If a value is greater than 4, we have a problem and will need to re-evaluate our model.
## am cyl wt qsec
## 3.667222 8.264447 3.952340 4.363475
The value for cyl is well above the threshold (and qsec is slightly above it), so let us re-evaluate our model fit, this time using the stepwise function step() in both directions.
fit = lm(mpg~., data=mtcars)
step(fit, direction = "both")
This gives the new, better model \(mpg = {\beta_0} + {\beta_1} wt + {\beta_2} am + {\beta_3} qsec + {\epsilon_i}\).
We can see that the variance inflation factors are all well within the limit.
## am wt qsec
## 2.541437 2.482952 1.364339
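The report does not show how the variance inflation factors were computed. A common definition for a single-coefficient predictor j is VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on the other predictors. The sketch below applies that definition in base R; `vif_manual` is a hypothetical helper name, and it is an assumption that the values reported above follow this definition.

```r
data(mtcars)

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j
# on the remaining predictors. vif_manual is a hypothetical helper.
vif_manual <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}

# Predictors of the first (forward-selected) and second (stepwise) models
vif_first  <- vif_manual(mtcars, c("am", "cyl", "wt", "qsec"))
vif_second <- vif_manual(mtcars, c("am", "wt", "qsec"))
print(vif_first)
print(vif_second)
```

Dropping a predictor can only lower or leave unchanged the VIF of those that remain, which is why removing cyl brings the other values down.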
Conclusion: \(mpg = {\beta_0} + {\beta_1} wt + {\beta_2} am + {\beta_3} qsec + {\epsilon_i}\) appears to be a good fit. Diagnostic plots for it can be found in appendix section 5.3. The multiple R-squared value for this model is 0.8496636.
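To check the fit statistics quoted here, one could refit the selected model and read both R-squared values from summary(). This is a minimal sketch; the names `final_fit` and `fit_summary` are illustrative.

```r
data(mtcars)

# Refit the model chosen by stepwise selection
final_fit <- lm(mpg ~ wt + am + qsec, data = mtcars)
fit_summary <- summary(final_fit)

# Coefficient table: estimates, standard errors, t values, p values
print(coef(fit_summary))

# Multiple and adjusted R-squared side by side
print(c(multiple = fit_summary$r.squared, adjusted = fit_summary$adj.r.squared))
```

Calling plot(final_fit) on the same object draws the standard residual diagnostic plots.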
Use the t.test() function to get the t confidence interval for the difference in mean mpg between two unpaired groups, manual and automatic transmissions.
t.test(mtcars[mtcars$am == 1, ]$mpg, mtcars[mtcars$am == 0, ]$mpg, paired = FALSE)$conf.int[1:2]
## [1] 3.209684 11.280194
Using the above information, we can conclude with 95 percent confidence that the mean MPG of manual transmission cars is higher than that of automatic transmission cars by a value in the range 3.2096842 to 11.2801944.
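To show where this interval comes from, the sketch below rebuilds it by hand. By default, t.test() on two unpaired samples uses the Welch approximation (unequal variances), so the code computes the Welch standard error and degrees of freedom directly. All variable names are illustrative.

```r
data(mtcars)

manual_mpg <- mtcars$mpg[mtcars$am == 1]
auto_mpg   <- mtcars$mpg[mtcars$am == 0]

# Per-group variance of the sample mean
v_m <- var(manual_mpg) / length(manual_mpg)
v_a <- var(auto_mpg) / length(auto_mpg)

diff_means <- mean(manual_mpg) - mean(auto_mpg)
se <- sqrt(v_m + v_a)

# Welch-Satterthwaite degrees of freedom
df <- (v_m + v_a)^2 /
  (v_m^2 / (length(manual_mpg) - 1) + v_a^2 / (length(auto_mpg) - 1))

# 95% confidence interval for the difference in group means
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se
print(ci)
```

The interval describes the difference between the two group means, not the difference between any two individual cars.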