In this homework, we explore how miles per gallon (mpg) depends on the number of cylinders and the transmission type, with and without an interaction between the two predictors.
The data come from the ‘mtcars’ dataset, which ships with the built-in R datasets.
library(dplyr)
library(visreg)
library(texreg)
data(mtcars)
glimpse(mtcars)
## Observations: 32
## Variables: 11
## $ mpg <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19....
## $ cyl <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, ...
## $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 1...
## $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, ...
## $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.9...
## $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3...
## $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 2...
## $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, ...
## $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, ...
## $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, ...
# Rename columns 2 (cyl) and 9 (am) so that the model output is more readable
colnames(mtcars)[c(2, 9)] <- c("Cylinder", "Transmission")
# Transmission coding: 0 = automatic, 1 = manual
mod1 <- lm(mpg ~ Cylinder, data = mtcars) # Model 1: number of cylinders only
mod2 <- lm(mpg ~ Cylinder + Transmission, data = mtcars) # Model 2: no interaction term
mod3 <- lm(mpg ~ Cylinder * Transmission, data = mtcars) # Model 3: adds the interaction between cylinders and transmission
htmlreg(list(mod1, mod2, mod3), doctype = FALSE)
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 37.88*** | 34.52*** | 30.87*** |
| | (2.07) | (2.60) | (3.19) |
| Cylinder | -2.88*** | -2.50*** | -1.98*** |
| | (0.32) | (0.36) | (0.45) |
| Transmission | | 2.57 | 10.18* |
| | | (1.29) | (4.30) |
| Cylinder:Transmission | | | -1.31 |
| | | | (0.71) |
| R² | 0.73 | 0.76 | 0.79 |
| Adj. R² | 0.72 | 0.74 | 0.76 |
| Num. obs. | 32 | 32 | 32 |
| RMSE | 3.21 | 3.06 | 2.94 |

Standard errors in parentheses. ***p < 0.001, **p < 0.01, *p < 0.05
Model 1 in the table above shows how miles per gallon changes with the number of cylinders: each additional cylinder is associated with a decrease of 2.88 mpg.
Model 2 shows how mpg changes with the number of cylinders and the transmission type when the two enter the model additively. Holding transmission type fixed, each additional cylinder is associated with a decrease of 2.50 mpg; holding the number of cylinders fixed, a manual transmission is associated with an increase of 2.57 mpg over an automatic one, although this coefficient carries no significance star in the table. Model 2 does not include the interaction between number of cylinders and transmission type.
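Model 3 shows how mpg changes with the number of cylinders and the transmission type when their interaction is included. Because of the interaction term, the Cylinder and Transmission coefficients are now conditional effects. For automatic cars (Transmission = 0), each additional cylinder is associated with a decrease of 1.98 mpg; for manual cars (Transmission = 1), the slope is -1.98 - 1.31 = -3.29 mpg per cylinder. The Transmission coefficient of 10.18 is the manual-versus-automatic difference extrapolated to zero cylinders, so it should not be read as the overall effect of a manual transmission.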
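The conditional slopes and intercepts of Model 3 can be recovered directly from the fitted coefficients. The following is a minimal sketch, assuming the same column renaming as above; the names `mt`, `fit`, `auto_line`, and `manual_line` are introduced here for illustration only.

```r
# Refit Model 3 on a fresh copy of mtcars with the same renamed columns.
mt <- datasets::mtcars
colnames(mt)[c(2, 9)] <- c("Cylinder", "Transmission")
fit <- lm(mpg ~ Cylinder * Transmission, data = mt)
b <- coef(fit)

# Automatic cars (Transmission = 0): only the intercept and Cylinder terms apply.
auto_line <- c(intercept = b[["(Intercept)"]],
               slope     = b[["Cylinder"]])

# Manual cars (Transmission = 1): add the Transmission and interaction terms.
manual_line <- c(intercept = b[["(Intercept)"]] + b[["Transmission"]],
                 slope     = b[["Cylinder"]] + b[["Cylinder:Transmission"]])

# One row per transmission type, rounded like the regression table.
print(round(rbind(automatic = auto_line, manual = manual_line), 2))
```

Reading the two rows side by side makes it clear that the Cylinder coefficient in the table describes automatic cars only, and that the manual slope is the sum of two coefficients.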
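Including the interaction between number of cylinders and transmission type improves the fit from Model 2 to Model 3: R² rises from 0.76 to 0.79, adjusted R² from 0.74 to 0.76, and RMSE falls from 3.06 to 2.94. The interaction coefficient itself (-1.31, standard error 0.71) carries no significance star, so the improvement is modest.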
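Because R-squared can never decrease when a term is added, a nested-model F-test is one way to check whether the interaction earns its place. This is a sketch under the same renaming assumption, not part of the original analysis; `additive`, `with_interaction`, and `comparison` are illustrative names.

```r
# Rebuild Models 2 and 3 on a fresh copy of mtcars.
mt <- datasets::mtcars
colnames(mt)[c(2, 9)] <- c("Cylinder", "Transmission")
additive         <- lm(mpg ~ Cylinder + Transmission, data = mt)
with_interaction <- lm(mpg ~ Cylinder * Transmission, data = mt)

# F-test of the smaller model against the larger one; the second row
# reports the test for the single added interaction term.
comparison <- anova(additive, with_interaction)
print(comparison)

# p-value for the added interaction term.
p_value <- comparison[["Pr(>F)"]][2]
print(p_value)
```

With only one extra term, this F-test is equivalent to the t-test on the Cylinder:Transmission coefficient in Model 3.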
visreg(mod3, "Cylinder", by="Transmission", scale="response")
In the graph above, the fitted line is steeper for manual-transmission cars (Transmission = 1), so each additional cylinder is associated with a larger decrease in mpg for a manual car than for an automatic car.
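The fitted lines in the plot can also be inspected numerically. The sketch below, assuming the same renaming, evaluates Model 3 at the three cylinder counts that occur in mtcars (4, 6, and 8) for each transmission type; `mt`, `fit`, `grid`, and `gap` are illustrative names.

```r
# Refit Model 3 on a fresh copy of mtcars.
mt <- datasets::mtcars
colnames(mt)[c(2, 9)] <- c("Cylinder", "Transmission")
fit <- lm(mpg ~ Cylinder * Transmission, data = mt)

# Every combination of cylinder count and transmission type.
grid <- expand.grid(Cylinder = c(4, 6, 8), Transmission = c(0, 1))
grid$predicted_mpg <- predict(fit, newdata = grid)
print(grid)

# Manual-minus-automatic gap in predicted mpg at each cylinder count.
gap <- with(grid, predicted_mpg[Transmission == 1] - predicted_mpg[Transmission == 0])
names(gap) <- c("4 cyl", "6 cyl", "8 cyl")
print(round(gap, 2))
```

Because the manual line is steeper, the predicted advantage of a manual transmission shrinks as the number of cylinders grows.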
In conclusion, mpg can be predicted from the number of cylinders and the transmission type. Adding the interaction between the two predictors leads to a better model in terms of R-squared and RMSE. Mpg decreases as the number of cylinders increases, and mpg increases as we switch from an automatic transmission car to a manual transmission car, with the decrease per cylinder being steeper for manual cars.