The data frame mtcars contains 32 observations on 11 variables, such as miles per gallon (MPG) and number of cylinders.
The main focus of this study is how transmission type (automatic or manual) affects miles per gallon; the goal is to define a relationship between mileage and transmission type.
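library(dplyr)    # mutate, %>%
library(ggplot2)  # plots
library(ggpubr)   # ggarrange
library(car)      # vif
data("mtcars")
# Recode transmission (0 = automatic, 1 = manual) as a labelled factor
mtcars <- mtcars %>%
  mutate(am = factor(am, levels = c(0, 1), labels = c("Automatic", "Manual")))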
summary(mtcars)
## mpg cyl disp hp
## Min. :10.40 Min. :4.000 Min. : 71.1 Min. : 52.0
## 1st Qu.:15.43 1st Qu.:4.000 1st Qu.:120.8 1st Qu.: 96.5
## Median :19.20 Median :6.000 Median :196.3 Median :123.0
## Mean :20.09 Mean :6.188 Mean :230.7 Mean :146.7
## 3rd Qu.:22.80 3rd Qu.:8.000 3rd Qu.:326.0 3rd Qu.:180.0
## Max. :33.90 Max. :8.000 Max. :472.0 Max. :335.0
## drat wt qsec vs
## Min. :2.760 Min. :1.513 Min. :14.50 Min. :0.0000
## 1st Qu.:3.080 1st Qu.:2.581 1st Qu.:16.89 1st Qu.:0.0000
## Median :3.695 Median :3.325 Median :17.71 Median :0.0000
## Mean :3.597 Mean :3.217 Mean :17.85 Mean :0.4375
## 3rd Qu.:3.920 3rd Qu.:3.610 3rd Qu.:18.90 3rd Qu.:1.0000
## Max. :4.930 Max. :5.424 Max. :22.90 Max. :1.0000
## am gear carb
## Automatic:19 Min. :3.000 Min. :1.000
## Manual :13 1st Qu.:3.000 1st Qu.:2.000
## Median :4.000 Median :2.000
## Mean :3.688 Mean :2.812
## 3rd Qu.:4.000 3rd Qu.:4.000
## Max. :5.000 Max. :8.000
Displacement, mileage, horsepower, rear axle ratio, quarter-mile time, and weight are continuous variables.
The other variables (cyl, vs, am, gear, carb) are categorical or discrete counts.
Our only interest is the relationship between transmission type and mileage.
We first examine scatter plots of the continuous variables against mileage.
mtcars.con <- mtcars[c("mpg","disp","hp","drat","wt", "qsec")]
my_cols <- c("#00AFBB", "#E7B800", "#FC4E07")
pairs(mtcars.con ,pch = 19, cex =0.5, col = my_cols[mtcars$am], lower.panel = NULL, font.labels = 2, cex.labels = 1.3)
g <- ggplot(data = mtcars, aes(x = disp, y = mpg, color = am))
g <- g + geom_point(alpha = 0.5)
g <- g + labs(x = "Displacement in cubic inches", y = "Miles/(US) gallon", title = "Mileage vs Displacement", color = "Transmission Type")
g1 <- ggplot(data = mtcars, aes(x = hp, y = mpg, color = am))
g1 <- g1 + geom_point(alpha = 0.5)
g1 <- g1 + labs(x = "Gross horsepower", y = "Miles/(US) gallon", title = "Mileage vs Gross Horsepower", color = "Transmission Type")
g2 <- ggplot(data = mtcars, aes(x = wt, y = mpg, color = am))
g2 <- g2 + geom_point(alpha = 0.5)
g2 <- g2 + labs(x = "Weight in 1000 lbs", y = "Miles/(US) gallon", title = "Mileage vs Weight", color = "Transmission Type")
g3 <- ggplot(data = mtcars, aes(x = drat, y = mpg, color = am))
g3 <- g3 + geom_point(alpha = 0.5)
g3 <- g3 + labs(x = "Rear axle ratio", y = "Miles/(US) gallon", title = "Mileage vs Rear Axle Ratio", color = "Transmission Type")
g4 <- ggplot(data = mtcars, aes(x = qsec, y = mpg, color = am))
g4 <- g4 + geom_point(alpha = 0.5)
g4 <- g4 + labs(x = "Quarter-mile time", y = "Miles/(US) gallon", title = "Mileage vs Quarter-Mile Time", color = "Transmission Type")
ggarrange(g,g1,g2, g3,g4, ncol = 2, nrow = 3,
common.legend = TRUE, legend = "bottom")
corr_disp<- cor(mtcars$disp , mtcars$mpg)
corr_hp<- cor(mtcars$hp , mtcars$mpg)
corr_wt<- cor(mtcars$wt , mtcars$mpg)
corr_drat <-cor(mtcars$drat , mtcars$mpg)
corr_qsec<- cor(mtcars$qsec , mtcars$mpg)
**Correlation of each continuous variable with mileage**
The plot shows a negative trend between displacement and mileage, with a correlation of -0.848.
The plot shows a negative trend between horsepower and mileage, with a correlation of -0.776.
The plot shows a negative trend between weight and mileage, with a correlation of -0.868.
The plot shows a positive trend between rear axle ratio and mileage, with a correlation of 0.681.
The plot shows a positive trend between quarter-mile time and mileage, with a correlation of 0.419.
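The five correlations can also be computed in a single call rather than one variable at a time. The following is a minimal sketch using base R only; the names `predictors` and `corr_with_mpg` are illustrative and not part of the original analysis.

```r
data("mtcars")

# Continuous predictors compared against mileage
predictors <- c("disp", "hp", "wt", "drat", "qsec")

# Pearson correlation of each predictor with mpg, returned as a named vector
corr_with_mpg <- sapply(mtcars[predictors], cor, y = mtcars$mpg)

round(corr_with_mpg, 3)
```

Keeping the results in one named vector makes it easy to sort them by absolute value or to paste them into the report text.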
The dependence of MPG on transmission type is illustrated by the box and violin plots.
l <- labs(x = "Transmission Type", y = "Miles per gallon", fill = "Transmission Type")
box <- ggplot(data = mtcars, aes(am , mpg, fill = am))
box_plot <- box+geom_boxplot()+l
violin <- box+geom_violin(color = "black", size = 1)+l
ggarrange(box_plot, violin, ncol = 2, common.legend = TRUE, legend = "bottom")
The box plot reveals a large difference in median MPG between automatic and manual transmissions.
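The visual difference can be backed by per-group summary numbers. The sketch below uses base R only and recodes `am` locally so that it runs on its own; `group_stats` is an illustrative name.

```r
data("mtcars")

# Local copy of the transmission factor (0 = automatic, 1 = manual)
am <- factor(mtcars$am, labels = c("Automatic", "Manual"))

# Group size, mean and median mpg per transmission type
group_stats <- data.frame(
  n      = as.vector(table(am)),
  mean   = as.vector(tapply(mtcars$mpg, am, mean)),
  median = as.vector(tapply(mtcars$mpg, am, median)),
  row.names = levels(am)
)

group_stats
```

The box plot draws the median column, while the t-test later in the analysis compares the mean column.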
Our question is how mileage relates to transmission type, and displacement and weight both show high correlation with mileage.
We therefore fit a regression with mileage as the outcome and displacement, weight, and transmission type as predictors.
First, we test whether transmission type has a significant effect on MPG.
t.test(mtcars$mpg~mtcars$am,conf.level=0.95)
##
## Welch Two Sample t-test
##
## data: mtcars$mpg by mtcars$am
## t = -3.7671, df = 18.332, p-value = 0.001374
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.280194 -3.209684
## sample estimates:
## mean in group Automatic mean in group Manual
## 17.14737 24.39231
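The t-test rejects the null hypothesis that the difference in mean MPG between the two transmission types is 0 (p = 0.0014). The 95% confidence interval for the difference (automatic minus manual) is -11.28 to -3.21 mpg.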
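The quantities quoted from the test output can be pulled from the test object directly instead of being copied by hand. This is a minimal, self-contained sketch; the object name `tt` is illustrative.

```r
data("mtcars")

# Local copy of the transmission factor (0 = automatic, 1 = manual)
am <- factor(mtcars$am, labels = c("Automatic", "Manual"))

# Welch two-sample t-test of mpg by transmission type
tt <- t.test(mtcars$mpg ~ am, conf.level = 0.95)

tt$p.value    # p-value of the test
tt$conf.int   # 95% CI for mean(Automatic) - mean(Manual)
tt$estimate   # the two group means
```

Because `Automatic` is the first factor level, the interval is for automatic minus manual, which is why both bounds are negative.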
mdl <- lm(mpg~disp+wt+am , data = mtcars)
coef_mdl <- coef(mdl)
rsquare_val <- summary(mdl)$adj.r.squared
| Feature | Coefficient value |
|---|---|
| Intercept | 34.6759109 |
| displacement | -0.0178049 |
| Weight | -3.2790439 |
| manual transmission | 0.1777241 |
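The table gives point estimates only. Whether the manual-transmission coefficient can be distinguished from zero once displacement and weight are in the model is read from the standard errors and p-values. A sketch that refits the same model on a local copy of the data (`mt` and `coef_table` are illustrative names):

```r
data("mtcars")

# Local copy with transmission recoded as a labelled factor
mt <- transform(mtcars, am = factor(am, labels = c("Automatic", "Manual")))

# Same model as above: mileage on displacement, weight and transmission
mdl <- lm(mpg ~ disp + wt + am, data = mt)

# Estimate, standard error, t value and p-value for each term
coef_table <- summary(mdl)$coefficients
round(coef_table, 4)
```

The row named `amManual` is the estimated difference between manual and automatic cars with displacement and weight held fixed.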
We can use the step function to let R choose the best model itself by stepwise selection on AIC.
bestmodel <- step(lm(mpg ~ ., data = mtcars), trace = 0)
coef_bdl <- coef(bestmodel)
rsquare_bval <- summary(bestmodel)$adj.r.squared
vif_model <- vif(bestmodel)
The model selected by step for MPG as the outcome has weight, quarter-mile time, and transmission type as predictors.
The adjusted R-squared value for this model is 0.8335561.
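Adjusted R-squared penalises plain R-squared for the number of predictors: with n observations and p predictors it equals 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below refits the selected model directly (without running step) and reproduces the value by hand; `fit` and `adj_r2` are illustrative names.

```r
data("mtcars")

# Local copy with transmission recoded as a labelled factor
mt <- transform(mtcars, am = factor(am, labels = c("Automatic", "Manual")))

# The selected model: mileage on weight, quarter-mile time and transmission
fit <- lm(mpg ~ wt + qsec + am, data = mt)

n  <- nrow(mt)               # number of observations
p  <- length(coef(fit)) - 1  # number of predictors, excluding the intercept
r2 <- summary(fit)$r.squared

# Adjusted R-squared computed from its definition
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
adj_r2
```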
| Feature | Coefficient value | VIF |
|---|---|---|
| Weight | -3.9165037 | 2.4829515 |
| Quarter-mile time | 1.225886 | 1.3643391 |
| Manual transmission | 2.9358372 | 2.5414372 |
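The variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the remaining ones. The sketch below computes it by hand with base R as a cross-check on the VIF column. It assumes transmission is kept as the original 0/1 indicator, and `X` and `manual_vif` are illustrative names.

```r
data("mtcars")

# Predictors of the selected model; am stays as a 0/1 indicator here
X <- data.frame(wt = mtcars$wt, qsec = mtcars$qsec, am = mtcars$am)

# For each predictor: regress it on the others and apply 1 / (1 - R^2)
manual_vif <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

round(manual_vif, 4)
```

Values close to 1 indicate a predictor that is nearly uncorrelated with the others; all three values here are modest, so collinearity among the selected predictors looks limited.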
The residual diagnostic plots for the selected model:
par(mfrow = c(2,2))
plot(bestmodel)
Based on this analysis, manual transmission cars achieve on average about 2.9 mpg more than automatic cars once weight and quarter-mile time are held constant. Transmission type is not the only factor affecting MPG, however: weight and acceleration (quarter-mile time) also need to be considered.
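The 2.9 mpg figure is a point estimate, so it is worth reporting alongside a confidence interval. A minimal sketch that refits the selected model on a local copy of the data and extracts the interval for the transmission term (`fit` and `ci_am` are illustrative names):

```r
data("mtcars")

# Local copy with transmission recoded as a labelled factor
mt <- transform(mtcars, am = factor(am, labels = c("Automatic", "Manual")))

# The selected model: mileage on weight, quarter-mile time and transmission
fit <- lm(mpg ~ wt + qsec + am, data = mt)

# 95% confidence interval for the manual-vs-automatic difference
ci_am <- confint(fit, "amManual", level = 0.95)
ci_am
```

The width of this interval shows how precisely the data pin down the transmission effect after adjusting for weight and quarter-mile time.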