This report explores the relationship between several factors that influence the mileage of cars and quantifies the difference between automatic and manual transmissions. The analysis is consistent with the general assumption that a manual transmission is better for mileage. A linear regression model suggests that, holding the number of cylinders, horsepower and weight fixed, a manual transmission gives on average 1.81 MPG more than an automatic transmission, with a standard error of ± 1.40 MPG. This adjusted difference is not statistically significant at the 5% level (p = 0.21).
library(datasets)
library(ggplot2)
library(car)
library(stats)
# Recode the transmission indicator (column 9, am: 0 = automatic, 1 = manual)
colnames(mtcars)[9] <- "transmission"
mtcars$transmission <- factor(mtcars$transmission, levels=c(0,1),
                              labels=c("Automatic","Manual"))
# Treat the number of cylinders as a factor; the cyl6 and cyl8 rows in the
# model summary show that the fitted models used it this way
mtcars$cyl <- as.factor(mtcars$cyl)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs transmission gear
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0       Manual    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0       Manual    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1       Manual    4
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1    Automatic    3
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0    Automatic    3
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1    Automatic    3
##                   carb
## Mazda RX4            4
## Mazda RX4 Wag        4
## Datsun 710           1
## Hornet 4 Drive       1
## Hornet Sportabout    2
## Valiant              1
par(mfrow=c(1,2))
boxplot(mpg~transmission,data=mtcars,
        xlab="Transmission",ylab="Miles per Gallon",col=c("salmon","gold"))
title("MPG vs Transmission")
# Colour the points by cylinder count: factor codes 1-3 index the default palette
cyl_f <- factor(mtcars$cyl)
plot(mtcars$wt,mtcars$mpg,col=as.integer(cyl_f),pch=17,
     ylab="Miles per Gallon",xlab="Weight (1000 lbs)")
title("MPG vs Weight vs Cyl")
legend("topright",title="No. of Cyl",
       legend=levels(cyl_f),pch=17,col=seq_along(levels(cyl_f)))
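The boxplot compares MPG across the two transmission types without adjusting for anything else. As a rough numeric companion to it, the sketch below computes the unadjusted group means. The names `cars`, `group_means` and `raw_diff` are illustrative only and are not part of the analysis; the block works on a fresh copy of the dataset so it does not depend on the recoding done elsewhere.

```r
# Illustrative sketch: `cars` is a hypothetical local copy, not used elsewhere
cars <- datasets::mtcars
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("Automatic", "Manual"))

# Mean MPG per transmission type, unadjusted for any other variable
group_means <- tapply(cars$mpg, cars$transmission, mean)
print(round(group_means, 2))

# Raw difference in means, manual minus automatic
raw_diff <- unname(group_means["Manual"] - group_means["Automatic"])
print(round(raw_diff, 2))
```

The raw gap is considerably larger than the adjusted estimate from the regression, which is what one would expect if the manual cars in this sample also tend to be lighter and less powerful.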
# Building possible models
lm1 <- lm(mpg~transmission,data=mtcars)
lm2 <- lm(mpg~transmission+cyl,data=mtcars)
lm3 <- lm(mpg~transmission+cyl+hp,data=mtcars)
lm4 <- lm(mpg~transmission+cyl+hp+wt,data=mtcars)
lm5 <- lm(mpg~transmission+cyl+hp+wt+carb,data=mtcars)
anova(lm1,lm2,lm3,lm4,lm5)
## Analysis of Variance Table
##
## Model 1: mpg ~ transmission
## Model 2: mpg ~ transmission + cyl
## Model 3: mpg ~ transmission + cyl + hp
## Model 4: mpg ~ transmission + cyl + hp + wt
## Model 5: mpg ~ transmission + cyl + hp + wt + carb
##   Res.Df    RSS Df Sum of Sq       F    Pr(>F)
## 1     30 720.90
## 2     28 264.50  2    456.40 37.7751 2.783e-08 ***
## 3     27 197.20  1     67.30 11.1399  0.002647 **
## 4     26 151.03  1     46.17  7.6433  0.010547 *
## 5     25 151.03  1      0.00  0.0000  0.999841
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
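To make the F column easier to read, here is a small worked example that reproduces the statistic for adding `wt` (Model 3 to Model 4) from the numbers printed in the table. It assumes, as the table values indicate, that each step is scaled by the residual mean square of the largest model (Model 5). The variable names are illustrative.

```r
# Values copied from the ANOVA table (rounded as printed there)
rss_full <- 151.03   # residual sum of squares of Model 5
df_full  <- 25       # residual degrees of freedom of Model 5
ss_wt    <- 46.17    # Sum of Sq gained by adding wt (Model 3 -> Model 4)
df_wt    <- 1        # degrees of freedom used by wt

# F = (sum of squares per added df) / (residual mean square of the largest model)
f_wt <- (ss_wt / df_wt) / (rss_full / df_full)
p_wt <- pf(f_wt, df_wt, df_full, lower.tail = FALSE)
print(round(c(F = f_wt, p = p_wt), 4))
```

Because the inputs are rounded table entries, the result agrees with the printed row only up to rounding.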
# Summary of the selected model (lm4): adding carb in Model 5 leaves the RSS unchanged (p = 0.9998)
summary(lm4)
##
## Call:
## lm(formula = mpg ~ transmission + cyl + hp + wt, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.9387 -1.2560 -0.4013  1.1253  5.0513
##
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)        33.70832    2.60489  12.940 7.73e-13 ***
## transmissionManual  1.80921    1.39630   1.296  0.20646
## cyl6               -3.03134    1.40728  -2.154  0.04068 *
## cyl8               -2.16368    2.28425  -0.947  0.35225
## hp                 -0.03211    0.01369  -2.345  0.02693 *
## wt                 -2.49683    0.88559  -2.819  0.00908 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.41 on 26 degrees of freedom
## Multiple R-squared: 0.8659, Adjusted R-squared: 0.8401
## F-statistic: 33.57 on 5 and 26 DF, p-value: 1.506e-10
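The `transmissionManual` row gives an estimate and a standard error. A 95% confidence interval is often easier to interpret than the pair of numbers, so the sketch below refits the same formula on a fresh copy of the data and asks for the interval with `confint()`. The names `cars`, `fit`, `est` and `ci` are illustrative, and treating `cyl` as a factor is an assumption inferred from the `cyl6` and `cyl8` rows above.

```r
# Illustrative sketch on a local copy of the data
cars <- datasets::mtcars
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("Automatic", "Manual"))
cars$cyl <- factor(cars$cyl)  # assumption: cyl entered the model as a factor

# Same formula as the selected model
fit <- lm(mpg ~ transmission + cyl + hp + wt, data = cars)

# Point estimate and 95% confidence interval for the manual-vs-automatic effect
est <- coef(fit)["transmissionManual"]
ci  <- confint(fit, "transmissionManual", level = 0.95)
print(round(est, 3))
print(round(ci, 3))
```

With 26 residual degrees of freedom the interval is the estimate plus or minus about 2.06 standard errors, so it spans zero. This agrees with the p-value of 0.206 in the coefficient table.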
# Influence diagnostics: largest dfbeta and largest hat value (leverage) over all cars
maxvalues <- c(Maximum.Beta=max(dfbetas(lm4)),
               Maximum.Hatvalue=max(hatvalues(lm4)))
round(maxvalues,3)
##     Maximum.Beta Maximum.Hatvalue
##            0.939            0.471
par(mfrow=c(2,2))
plot(lm4)
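The diagnostic plots show leverage and influence graphically. To see which cars stand out, one option is to list the observations whose hat value exceeds a cutoff. The sketch below uses the common rule of thumb of twice the average leverage, 2p/n, which is a convention and not something the analysis above relies on. All names (`cars`, `fit`, `h`, `cutoff`, `worst`) are illustrative, and `cyl` is again assumed to be a factor.

```r
# Illustrative sketch on a local copy of the data
cars <- datasets::mtcars
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("Automatic", "Manual"))
cars$cyl <- factor(cars$cyl)  # assumption: cyl entered the model as a factor
fit <- lm(mpg ~ transmission + cyl + hp + wt, data = cars)

h <- hatvalues(fit)        # leverage of each car
p <- length(coef(fit))     # number of estimated coefficients
n <- nrow(cars)            # number of observations
cutoff <- 2 * p / n        # rule-of-thumb threshold (a convention)

# Cars with unusually high leverage, largest first
print(round(sort(h[h > cutoff], decreasing = TRUE), 3))

# For each coefficient, the car with the largest absolute dfbeta
worst <- apply(abs(dfbetas(fit)), 2, which.max)
print(rownames(cars)[worst])
```

Hat values always sum to the number of coefficients, so the average leverage is p/n, which is why the cutoff is expressed as a multiple of it.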
Source of Data:
The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models). Henderson and Velleman (1981), Building multiple regression models interactively. Biometrics, 37, 391-411.
Variables:
The data frame (mtcars) has 32 observations on 11 variables.