Summary
Motor Trend is a magazine about the automobile industry. This report explores a data set of cars to determine the relationship between a set of variables and miles per gallon (MPG).
Description
The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).
A data frame with 32 observations on 11 (numeric) variables.
[, 1] mpg Miles/(US) gallon
[, 2] cyl Number of cylinders
[, 3] disp Displacement (cu.in.)
[, 4] hp Gross horsepower
[, 5] drat Rear axle ratio
[, 6] wt Weight (1000 lbs)
[, 7] qsec 1/4 mile time
[, 8] vs Engine (0 = V-shaped, 1 = straight)
[, 9] am Transmission (0 = automatic, 1 = manual)
[,10] gear Number of forward gears
[,11] carb Number of carburetors
Is an automatic or manual transmission better for MPG?
Load data
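The loading code is not shown in this report. A minimal sketch, assuming `mydata` is a copy of R's built-in `mtcars` data set and that `am` was recoded as a factor labelled Automatic/Manual (an assumption that matches the printed rows in the data summary), might look like this:

```r
# Assumption: mydata is a copy of the built-in mtcars data set.
mydata <- mtcars

# Recode the 0/1 transmission flag as a labelled factor
# (0 = automatic, 1 = manual, per the variable description).
mydata$am <- factor(mydata$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Show the first six rows.
head(mydata)
```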
Summary of data
## mpg cyl disp hp drat wt qsec vs am gear
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 Manual 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 Manual 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 Manual 4
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 Automatic 3
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 Automatic 3
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 Automatic 3
## carb
## Mazda RX4 4
## Mazda RX4 Wag 4
## Datsun 710 1
## Hornet 4 Drive 1
## Hornet Sportabout 2
## Valiant 1
Histogram and box plot to answer the question
library("ggplot2")
# Overlaid density-scaled histograms of MPG by transmission type
ggplot(mydata, aes(x = mpg, fill = am, color = am)) +
geom_histogram(binwidth = 5, aes(y = ..density..), position = "identity", alpha = 0.5) +
labs(title = "", x = "MPG", y = "Density") +
geom_density(alpha = 0.6) + theme_dark() + theme(legend.position = "right")
# Box plot of MPG by transmission type
ggplot(mydata, aes(x = factor(am), y = mpg, fill = am)) +
geom_boxplot(position = position_dodge(1), alpha = .7) +
scale_fill_manual(values = c("yellow", "blue")) +
labs(title = "", x = "Transmission", y = "MPG") + theme_dark() +
theme(legend.position = "right")
The conclusion drawn is that manual cars get better mileage than automatic cars.
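The visual comparison can be backed with numbers. The sketch below is not part of the original analysis: it assumes `mydata` is derived from the built-in `mtcars` data set with `am` recoded as a factor, and it uses a Welch two-sample t-test as one reasonable way to check the unadjusted difference. The names `group_means` and `welch` are illustrative.

```r
# Assumption: mydata is mtcars with am recoded as a labelled factor.
mydata <- mtcars
mydata$am <- factor(mydata$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Mean MPG for each transmission type.
group_means <- tapply(mydata$mpg, mydata$am, mean)
print(round(group_means, 2))

# Welch two-sample t-test of MPG by transmission (unequal variances).
welch <- t.test(mpg ~ am, data = mydata)
print(welch)
```

This comparison ignores every other variable, such as weight, so it describes only the raw difference between the two groups.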
Quantify the MPG difference between automatic and manual transmissions
As an initial step, we find the correlation of MPG with the other variables, then carry the strongly correlated ones into a multivariable linear regression to assess the effect of transmission on MPG.
## Registered S3 method overwritten by 'GGally':
## method from
## +.gg ggplot2
# Correlation plot
ggcorr(mydata, palette = "RdBu", label = TRUE) + theme_dark() + theme(legend.position = "none")
From the result above, cylinders, displacement, horsepower, and weight are selected for the regression.
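The same selection can be read off numerically rather than from the plot. A minimal sketch, assuming the data are the built-in `mtcars` data set with all columns left numeric (the names `mpg_cor` and `ranked` are illustrative):

```r
# Assumption: the data are the built-in mtcars data set (all numeric).
mydata <- mtcars

# Pearson correlation of every other variable with mpg.
mpg_cor <- cor(mydata)["mpg", ]
mpg_cor <- mpg_cor[names(mpg_cor) != "mpg"]

# Rank the variables by absolute correlation strength.
ranked <- mpg_cor[order(abs(mpg_cor), decreasing = TRUE)]
print(round(ranked, 3))
```

The four variables at the top of this ranking are the ones carried into the regression.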
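# Multivariable linear regression of MPG on the selected variables plus transmission.
# Note: disp is listed twice in the formula; R keeps a single disp term.
reg <- lm(mydata$mpg ~ mydata$cyl+mydata$disp+mydata$disp+mydata$hp+mydata$wt+mydata$am,data = mydata)
# Fitted values and residuals, reused in the appendix plots
pred <- predict(reg)
residuals <- residuals(reg)
summary(reg)
##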
## Call:
## lm(formula = mydata$mpg ~ mydata$cyl + mydata$disp + mydata$disp +
## mydata$hp + mydata$wt + mydata$am, data = mydata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.5952 -1.5864 -0.7157 1.2821 5.5725
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 38.20280 3.66910 10.412 9.08e-11 ***
## mydata$cyl -1.10638 0.67636 -1.636 0.11393
## mydata$disp 0.01226 0.01171 1.047 0.30472
## mydata$hp -0.02796 0.01392 -2.008 0.05510 .
## mydata$wt -3.30262 1.13364 -2.913 0.00726 **
## mydata$amManual 1.55649 1.44054 1.080 0.28984
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.505 on 26 degrees of freedom
## Multiple R-squared: 0.8551, Adjusted R-squared: 0.8273
## F-statistic: 30.7 on 5 and 26 DF, p-value: 4.029e-10
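To quantify the transmission effect directly, the coefficient for manual transmission and its confidence interval can be pulled out of the fitted model. The sketch below assumes `mydata` is built from `mtcars` as elsewhere in this report and writes the formula with bare column names, which fits the same model; `fit`, `am_row`, and `am_ci` are illustrative names.

```r
# Assumption: mydata is mtcars with am recoded as a labelled factor.
mydata <- mtcars
mydata$am <- factor(mydata$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Same model as above, written with column names and data = mydata.
fit <- lm(mpg ~ cyl + disp + hp + wt + am, data = mydata)

# Estimate, standard error, t value and p-value for manual vs. automatic.
am_row <- coef(summary(fit))["amManual", ]
print(round(am_row, 4))

# 95% confidence interval for the adjusted MPG difference.
am_ci <- confint(fit, "amManual", level = 0.95)
print(round(am_ci, 2))
```

An interval that spans zero indicates that the adjusted difference cannot be distinguished from no difference at the 5% level.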
Conclusion
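- Holding the other variables constant, manual cars are estimated to get 1.56 MPG more than automatic cars, but this difference is not statistically significant (p = 0.29).
- Weight is the only predictor significant at the 5% level (p = 0.007); horsepower is marginal (p = 0.055), and the number of cylinders is not significant once the other variables are included (p = 0.114).
- Beyond about 200 hp, MPG changes little and can be considered constant.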
Appendix: Residual plots
# MPG vs. cylinders; segments connect each observation to its fitted value
ggplot(mydata,aes(x = cyl, y = mpg)) +
geom_segment(aes(xend = cyl, yend = pred), alpha = .2) +
geom_point(aes(color = residuals)) +
scale_color_gradient2(low = "blue", mid = "white", high = "red") +
guides(color = "none") +
geom_smooth(aes(y = pred)) +
theme_dark()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
# MPG vs. horsepower; segments connect each observation to its fitted value
ggplot(mydata,aes(x = hp, y = mpg)) +
geom_segment(aes(xend = hp, yend = pred), alpha = .2) +
geom_point(aes(color = residuals)) +
scale_color_gradient2(low = "blue", mid = "white", high = "red") +
guides(color = "none") +
geom_smooth(aes(y = pred)) +
theme_dark()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
# MPG vs. weight; segments connect each observation to its fitted value
ggplot(mydata,aes(x = wt, y = mpg)) +
geom_segment(aes(xend = wt, yend = pred), alpha = .2) +
geom_point(aes(color = residuals)) +
scale_color_gradient2(low = "blue", mid = "white", high = "red") +
guides(color = "none") +
geom_smooth(aes(y = pred),method = "lm") +
theme_dark()
## `geom_smooth()` using formula 'y ~ x'
# MPG vs. displacement; segments connect each observation to its fitted value
ggplot(mydata,aes(x = disp, y = mpg)) +
geom_segment(aes(xend = disp, yend = pred), alpha = .2) +
geom_point(aes(color = residuals)) +
scale_color_gradient2(low = "blue", mid = "white", high = "red") +
guides(color = "none") +
geom_smooth(aes(y = pred),method = "lm") +
theme_dark()
## `geom_smooth()` using formula 'y ~ x'