Motor Trend, a magazine about the automobile industry, is interested in the relationship between a set of variables and miles per gallon (MPG). In particular, it wants to understand how MPG relates to transmission type (automatic or manual), and asks two questions:
- “Is an automatic or manual transmission better for MPG?”
- “How different is the MPG between automatic and manual transmissions?”
This analysis focuses on MPG and transmission type and builds a linear regression model for that relationship.
Exploring MPG by transmission type, we find strong evidence that cars with manual transmissions get better mileage (average 24.39 MPG) than cars with automatic transmissions (average 17.15 MPG), a difference of 7.24 MPG in favor of manual. Weight, horsepower and number of cylinders also have a strong effect on MPG and are treated as confounders; once they are included in the model, the estimated manual advantage falls to about 1.81 MPG and is no longer statistically significant (p = 0.2065).
data(mtcars)
mtcars$am <- factor(mtcars$am,labels=c('Automatic','Manual'))
Our interest is in the relationship between mpg and the am variable: how does transmission type affect MPG, which transmission type gives better mileage, and how large is the difference? Let us first take a graphical view of mpg against am.
The boxplot shows the difference in MPG by transmission type: manual cars tend to get better mileage than automatic cars. A plot alone is not convincing evidence, however, so the next step is to frame the question as a hypothesis test.
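The boxplot discussed above could be drawn with a sketch like the following. It assumes the built-in mtcars data; the copy `cars`, the axis labels and the `medians` helper are illustrative names, not part of the original analysis.

```r
# Work on a copy so the built-in data set is left unchanged.
cars <- mtcars
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Side-by-side boxplots of mpg for each transmission type.
boxplot(mpg ~ am, data = cars,
        xlab = "Transmission type",
        ylab = "Miles per gallon (MPG)",
        main = "MPG by transmission type")

# Group medians, i.e. the thick line drawn inside each box.
medians <- tapply(cars$mpg, cars$am, median)
medians
```

Comparing the medians alongside the plot gives a quick numeric check of what the boxes show.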
Hypothesis testing: the null hypothesis is that there is no relationship between mpg and am.
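data(mtcars)  # reload the data, so am is the original numeric 0/1 column
# Simple linear regression of mpg on transmission type alone.
fitAm <- lm(mpg ~ am, data = mtcars)
summary(fitAm)$coef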
##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept)   17.147      1.125  15.247 1.134e-15
## am             7.245      1.764   4.106 2.850e-04
The small p-value for am (2.85e-04) lets us reject the null hypothesis that there is no relationship between mpg and am; this is strong evidence that MPG and transmission type are related. The coefficients have a direct reading: the intercept (17.147) is the mean MPG of automatic cars, and the am coefficient (7.245) is the estimated increase in mean MPG for manual cars.
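As a cross-check, a regression on a single two-level predictor tests the same null hypothesis as a pooled-variance two-sample t-test. The sketch below assumes the raw mtcars data; `cars`, `fit`, `slope_p` and `tt` are illustrative names, and the t-test is an equivalent formulation rather than a step from the original report.

```r
cars <- mtcars  # am is the numeric 0/1 column here

# p-value of the am slope in the simple regression.
fit <- lm(mpg ~ am, data = cars)
slope_p <- summary(fit)$coef["am", "Pr(>|t|)"]

# Pooled-variance two-sample t-test of the same null hypothesis.
tt <- t.test(mpg ~ am, data = cars, var.equal = TRUE)

# The two p-values agree up to floating-point error.
all.equal(slope_p, tt$p.value)
```

Dropping `var.equal = TRUE` gives Welch's test instead, which does not assume equal variances and therefore need not match the regression p-value exactly.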
Without adjusting for other variables, manual transmissions are therefore associated with higher miles per gallon (MPG) than automatic transmissions. The average MPG for each transmission type is:
## amAutomatic    amManual
##       17.15       24.39
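The code that produced these group averages is not shown. One way to obtain coefficients with these names, offered here as an assumption about how it may have been done, is to fit the model without an intercept on the labelled factor. `cars`, `fit_means`, `group_means` and `mean_diff` are illustrative names.

```r
cars <- mtcars
cars$am <- factor(cars$am, labels = c("Automatic", "Manual"))

# Dropping the intercept makes each coefficient a group mean.
fit_means <- lm(mpg ~ am - 1, data = cars)
group_means <- coef(fit_means)  # named amAutomatic and amManual

# Difference in means; this equals the slope of lm(mpg ~ am).
mean_diff <- unname(group_means["amManual"] - group_means["amAutomatic"])
round(group_means, 2)
round(mean_diff, 2)
```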
Correlation between MPG and other variables
cor(mtcars)[1,]
##     mpg     cyl    disp      hp    drat      wt    qsec      vs      am
##  1.0000 -0.8522 -0.8476 -0.7762  0.6812 -0.8677  0.4187  0.6640  0.5998
##    gear    carb
##  0.4803 -0.5509
cyl, disp, hp, drat and wt are the variables most strongly correlated with mpg (absolute correlation above 0.68). Let us use stepwise selection to find which of these variables best fit the model.
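The ranking behind that statement can be made explicit by sorting the correlations by absolute value. This is a small sketch on the raw mtcars data; `mpg_cor` and `ranked` are illustrative names.

```r
# Correlation of mpg with every other column of the raw (all-numeric) data.
mpg_cor <- cor(mtcars)["mpg", -1]

# Order by absolute strength so the sign does not affect the ranking.
ranked <- mpg_cor[order(abs(mpg_cor), decreasing = TRUE)]
round(ranked, 4)
```

Note that `cor()` needs numeric columns, so this has to run before cyl and am are converted to factors.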
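# Treat cylinder count and transmission type as categorical predictors.
mtcars$cyl <- factor(mtcars$cyl)
mtcars$am <- factor(mtcars$am)
# Start from am plus the strongly correlated candidates, then let step()
# add or drop terms by AIC.
fitAll <- lm(mpg ~ am + cyl + disp + hp + drat + wt, data = mtcars)
fitBest <- step(fitAll, direction = "both")
summary(fitBest)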
##
## Call:
## lm(formula = mpg ~ am + cyl + hp + wt, data = mtcars)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -3.939 -1.256 -0.401  1.125  5.051
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  33.7083     2.6049   12.94  7.7e-13 ***
## am1           1.8092     1.3963    1.30   0.2065
## cyl6         -3.0313     1.4073   -2.15   0.0407 *
## cyl8         -2.1637     2.2843   -0.95   0.3523
## hp           -0.0321     0.0137   -2.35   0.0269 *
## wt           -2.4968     0.8856   -2.82   0.0091 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.41 on 26 degrees of freedom
## Multiple R-squared: 0.866, Adjusted R-squared: 0.84
## F-statistic: 33.6 on 5 and 26 DF, p-value: 1.51e-10
Stepwise selection kept ‘hp’, ‘wt’ and ‘cyl’ alongside ‘am’, and dropped ‘disp’ and ‘drat’. The coefficients of the resulting multivariable regression model are:
##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept) 33.70832    2.60489 12.9404 7.733e-13
## am1          1.80921    1.39630  1.2957 2.065e-01
## cyl6        -3.03134    1.40728 -2.1540 4.068e-02
## cyl8        -2.16368    2.28425 -0.9472 3.523e-01
## hp          -0.03211    0.01369 -2.3450 2.693e-02
## wt          -2.49683    0.88559 -2.8194 9.081e-03
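To see how much the extra terms matter, the am-only model can be compared with the selected model using a nested-model F-test, and the adjusted transmission effect can be given a confidence interval. This is a sketch that refits both models on a copy of the data; `cars`, `fit_simple`, `fit_adjusted`, `comparison` and `am_ci` are illustrative names, and the comparison itself is an addition to the original analysis.

```r
cars <- mtcars
cars$cyl <- factor(cars$cyl)
cars$am  <- factor(cars$am)

fit_simple   <- lm(mpg ~ am, data = cars)
fit_adjusted <- lm(mpg ~ am + cyl + hp + wt, data = cars)

# F-test of whether cyl, hp and wt improve on the am-only model.
comparison <- anova(fit_simple, fit_adjusted)
comparison

# 95% confidence interval for the manual effect after adjustment.
am_ci <- confint(fit_adjusted)["am1", ]
am_ci
```

An interval for am1 that contains zero is consistent with the non-significant p-value reported for am1 in the coefficient table.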
par(mfrow = c(2,2))
plot(fitBest)
The diagnostic plots above suggest that the residuals are approximately normal and show no obvious outliers.
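The visual impression can be complemented with numeric diagnostics. The sketch below refits the selected model on a copy of the data and computes a Shapiro-Wilk test on the residuals together with leverage and Cook's distance; these checks and the names `cars`, `fit`, `normality`, `leverage` and `cooks` are additions for illustration, not part of the original analysis.

```r
cars <- mtcars
cars$cyl <- factor(cars$cyl)
cars$am  <- factor(cars$am)
fit <- lm(mpg ~ am + cyl + hp + wt, data = cars)

# Shapiro-Wilk test: a large p-value is consistent with normal residuals.
normality <- shapiro.test(residuals(fit))
normality$p.value

# Leverage and Cook's distance flag potentially influential cars.
leverage <- hatvalues(fit)
cooks    <- cooks.distance(fit)
head(sort(cooks, decreasing = TRUE), 3)
```

The cars with the largest Cook's distance are the ones worth inspecting individually in the residual plots.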
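Taken on its own, manual transmission is associated with higher miles per gallon than automatic transmission: the gap between their average MPG is 7.24. After adjusting for weight, horsepower and number of cylinders, which also affect MPG, the estimated manual advantage shrinks to 1.81 MPG and is not statistically significant (p = 0.2065).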