I work to a Motor Trend, a magazine about the automobile industry. In this study I show the relationship of Miles per Gallon (MPG) to 10 aspects of automobile design. The main objective is to anwser the two questions below:
“Is an automatic or manual transmission better for MPG”
“Quantify the MPG difference between automatic and manual transmissions”
The results of this study verify that the manual transmission should be better than automatic transmission for mpg if we consider only these element(transmission). However when we look for other variables like Weight it is possible to observe that a car with manual transmission will have 0.0236 less miles per galleon than a similar car with automatic transmission.
The variables present in the dataset are:
mpg - Miles/(US) gallon
cyl - Number of cylinders
disp - Displacement (cu.in.)
hp - Gross horsepower
drat - Rear axle ratio
wt - Weight (lb/1000)
qsec - 1/4 mile time
vs - V/S
am - Transmission (0 = automatic, 1 = manual)
gear - Number of forward gears
carb - Number of carburators
First, is important to convert the “am” variable to a factor and put in a correct classification like “Automatic” and “Manual”. After we show some simples analysis to know about the variable ‘MPG’ and ‘AM’.
library(car)
mtcars$am = as.factor(mtcars$am)
levels(mtcars$am) = c("Automatic", "Manual")
summary(mtcars$mpg); summary(mtcars$am)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 10.40 15.43 19.20 20.09 22.80 33.90
## Automatic Manual
## 19 13
In a exploratory analysis, first, shows a scatterplot dividing the relation of variables considering the differents types of transmission.
scatterplotMatrix(~mpg+disp+drat+wt+hp|am, data=mtcars,
col = c("blue", "red"),
main="Type of Transmission")
The boxplot below show us that if we consider the cars that have manual transmission they appears little bit more economic and efficiently than cars with automatic transmission.
autmanbox = boxplot(mpg ~ am, data=mtcars, main = "Comparison of MPG by type of Transmission",
xlab = "Type of Gear",
ylab = "Car consumption (MPG)",
ylim = c(10, 35),
col = c("blue", "red"))
To check more precising the relations of variable it is necessary to verify the correlations. In this case, indicate to us that weight has a important correlation to MPG variable.
To check more precising the relations of variable it is necessary to verify the correlations. In this case, indicate to us that weight has a important correlation to MPG variable.
cor(mtcars[, -c(9)])[1, ]
## mpg cyl disp hp drat wt
## 1.0000000 -0.8521620 -0.8475514 -0.7761684 0.6811719 -0.8676594
## qsec vs gear carb
## 0.4186840 0.6640389 0.4802848 -0.5509251
wtplot = scatterplot(mpg ~ wt | am, data=mtcars,
xlab="Weight of Car", ylab="Car Consumption (MPG)",
main="Car Weight and Consumption by Type of Transmission",
col = c("blue", "red"),
legend.title = "Type of Transmission",
legend.coords = "topright")
6.1 Fit 1
fit1 = lm(mpg ~ am, data=mtcars)
rmse1 = sqrt(sum(fit1$residuals ^ 2) / nrow(mtcars))
rsq1 = summary(fit1)$r.squared
6.2 Fit 2
fit2 = lm(mpg ~ wt, data=mtcars)
rmse2 = sqrt(sum(fit2$residuals ^ 2) / nrow(mtcars))
rsq2 = summary(fit2)$r.squared
6.3 Fit 3
fit3 = lm(mpg ~ am + wt, data=mtcars)
rmse3 = sqrt(sum(fit3$residuals ^ 2) / nrow(mtcars))
rsq3 = summary(fit3)$r.squared
7.1 Residuals Plot
par(mfcol = c(1, 3))
plot(mtcars$wt, resid(fit1), main = "Model 1", xlab = "Weight (lbs/1000)", ylab = "Residuals")
plot(mtcars$wt, resid(fit2), main = "Model 2", xlab = "Weight (lbs/1000)", ylab = "Residuals")
plot(mtcars$wt, resid(fit3), main = "Model 3", xlab = "Weight (lbs/1000)", ylab = "Residuals")
The residuals for model 1 exhibit a linear pattern. Model 1 has larger residuals than the other two models.The residuals of models 2 and 3 are almost identical.
<<<<<<< HEAD 7.2 table of comparison - Root Mean Squared Error
print(paste('Model 1 = ', rmse1))
## [1] "Model 1 = 4.74636900427014"
print(paste('Model 2 = ', rmse2))
## [1] "Model 2 = 2.94916268595503"
print(paste('Model 3 = ', rmse3))
## [1] "Model 3 = 2.94915081645569"
7.3 Table of comparisation - R2
print(paste('Model 1 = ', rsq1))
## [1] "Model 1 = 0.359798943425465"
print(paste('Model 2 = ', rsq2))
## [1] "Model 2 = 0.752832793658264"
print(paste('Model 3 = ', rsq3))
## [1] "Model 3 = 0.752834783202689"
Model 1 does not predict MPG very well, models 2 and 3 have very similar performance characteristics. The R2 values reveals the fact that adding the transmission type to model 2 does not add any predictive power.
7.4 Coefficients
coef2 = summary(fit2)$coef
coef2
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.285126 1.877627 19.857575 8.241799e-19
## wt -5.344472 0.559101 -9.559044 1.293959e-10
Coefficient Model 3:
coef3 = summary(fit3)$coef
coef3
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.32155131 3.0546385 12.21799285 5.843477e-13
## amManual -0.02361522 1.5456453 -0.01527855 9.879146e-01
## wt -5.35281145 0.7882438 -6.79080719 1.867415e-07
transmission_ci <- coef3[2, 1] + c(-1, 1) * qt(.975, df = fit3$df) * coef3[2, 2]
transmission_ci
## [1] -3.184815 3.137584
The coefficient for the transmission variable has an estimated value of -0.0236, meaning that a car with manual transmission will have 0.0236 less miles per galleon than a similar car with automatic transmission.
The results of this study verify that the manual transmission should be better than automatic transmission for mpg if we consider only these element(transmission). However when we look for other variables like Weight it is possible to observe that a car with manual transmission will have 0.0236 less miles per galleon than a similar car with automatic transmission.
The large width of the confidence interval means that the estimated difference between the cars with manual and automatic transmission should not be taken at face value and a more detailed analysis is necessary.
The coefficient for the transmission variable has an estimated value of -0.0236, meaning that a car with manual transmission will have 0.0236 less miles per galleon than a similar car with automatic transmission. The 95% confidence interval for this coefficient is rather large compared to its estimated value, namely (-3.1848, 3.1376). To provide a basis for comparison, an increase in weight of 1000 lbs would lower the MPG by an average of 5.3528.
The large width of the confidence interval means that the estimated difference between the cars with manual and automatic transmission should not be taken at face value and a more detailed analysis is necessary.