This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:
summary(mtcars)
## mpg cyl disp hp
## Min. :10.40 Min. :4.000 Min. : 71.1 Min. : 52.0
## 1st Qu.:15.43 1st Qu.:4.000 1st Qu.:120.8 1st Qu.: 96.5
## Median :19.20 Median :6.000 Median :196.3 Median :123.0
## Mean :20.09 Mean :6.188 Mean :230.7 Mean :146.7
## 3rd Qu.:22.80 3rd Qu.:8.000 3rd Qu.:326.0 3rd Qu.:180.0
## Max. :33.90 Max. :8.000 Max. :472.0 Max. :335.0
## drat wt qsec vs
## Min. :2.760 Min. :1.513 Min. :14.50 Min. :0.0000
## 1st Qu.:3.080 1st Qu.:2.581 1st Qu.:16.89 1st Qu.:0.0000
## Median :3.695 Median :3.325 Median :17.71 Median :0.0000
## Mean :3.597 Mean :3.217 Mean :17.85 Mean :0.4375
## 3rd Qu.:3.920 3rd Qu.:3.610 3rd Qu.:18.90 3rd Qu.:1.0000
## Max. :4.930 Max. :5.424 Max. :22.90 Max. :1.0000
## am gear carb
## Min. :0.0000 Min. :3.000 Min. :1.000
## 1st Qu.:0.0000 1st Qu.:3.000 1st Qu.:2.000
## Median :0.0000 Median :4.000 Median :2.000
## Mean :0.4062 Mean :3.688 Mean :2.812
## 3rd Qu.:1.0000 3rd Qu.:4.000 3rd Qu.:4.000
## Max. :1.0000 Max. :5.000 Max. :8.000
colnames(mtcars)
## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
## [11] "carb"
lm_mpg <- lm(mpg ~ wt + hp, data = mtcars)
summary(lm_mpg)
##
## Call:
## lm(formula = mpg ~ wt + hp, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.941 -1.600 -0.182 1.050 5.854
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
## wt -3.87783 0.63273 -6.129 1.12e-06 ***
## hp -0.03177 0.00903 -3.519 0.00145 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.593 on 29 degrees of freedom
## Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
## F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
Intercept (37.2273): When both wt and hp are 0, the model predicts an mpg of 37.23. This is only a mathematical baseline with no practical meaning, because a car cannot have zero weight or zero horsepower.
wt (−3.8778): Holding horsepower constant, each one-unit increase in weight (1,000 lbs) lowers predicted mpg by 3.88. This effect is statistically significant (p < 0.001), which is consistent with heavier vehicles burning more fuel.
hp (−0.0318): Holding weight constant, each one-unit increase in horsepower lowers predicted mpg by 0.032. This effect is statistically significant (p = 0.0015), which is consistent with more powerful engines consuming more fuel.
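To make the coefficient interpretations concrete, the sketch below computes one prediction by hand and compares it with predict(). The car used here (wt = 3, hp = 150) is a hypothetical example and not a row discussed in this report; the names b, new_car, by_hand, and by_predict are also just illustrative.

```r
# Refit the first model so this chunk runs on its own
lm_mpg <- lm(mpg ~ wt + hp, data = mtcars)
b <- coef(lm_mpg)

# Hypothetical car: 3,000 lbs (wt = 3) and 150 horsepower
new_car <- data.frame(wt = 3, hp = 150)

# Intercept + wt slope * weight + hp slope * horsepower
by_hand <- unname(b["(Intercept)"] + b["wt"] * new_car$wt + b["hp"] * new_car$hp)

# The same prediction from predict()
by_predict <- unname(predict(lm_mpg, newdata = new_car))

by_hand     # roughly 20.8 mpg
by_predict  # should match by_hand
```

Both values should agree, since predict() applies the same linear formula to the new data.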
plot(lm_mpg, which = 1:4)
mean(lm_mpg$residuals^2)
## [1] 6.095242
summary(lm_mpg)$r.squared
## [1] 0.8267855
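The MSE above divides the residual sum of squares by n = 32, while the residual standard error in the summary output divides by the 29 residual degrees of freedom before taking the square root. A minimal sketch of how the two numbers and R-squared relate, assuming the same lm_mpg fit (the names sse, sst, mse, rse, and r2 are illustrative):

```r
# Refit the first model so this chunk runs on its own
lm_mpg <- lm(mpg ~ wt + hp, data = mtcars)

res <- residuals(lm_mpg)
n <- length(res)                                # 32 cars
sse <- sum(res^2)                               # residual sum of squares
sst <- sum((mtcars$mpg - mean(mtcars$mpg))^2)   # total sum of squares

mse <- sse / n                            # same quantity as mean(lm_mpg$residuals^2)
rse <- sqrt(sse / df.residual(lm_mpg))    # residual standard error, uses 29 df
r2 <- 1 - sse / sst                       # multiple R-squared

c(mse = mse, rse = rse, r2 = r2)
```

The MSE is slightly smaller than the squared residual standard error because it does not adjust for the three estimated coefficients.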
lm_wt_hp <- lm(mpg ~ wt * hp, data = mtcars)
summary(lm_wt_hp)
##
## Call:
## lm(formula = mpg ~ wt * hp, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.0632 -1.6491 -0.7362 1.4211 4.5513
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 49.80842 3.60516 13.816 5.01e-14 ***
## wt -8.21662 1.26971 -6.471 5.20e-07 ***
## hp -0.12010 0.02470 -4.863 4.04e-05 ***
## wt:hp 0.02785 0.00742 3.753 0.000811 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.153 on 28 degrees of freedom
## Multiple R-squared: 0.8848, Adjusted R-squared: 0.8724
## F-statistic: 71.66 on 3 and 28 DF, p-value: 2.981e-13
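This model has one more term than model #1 (lm_mpg): four coefficients instead of three, because wt * hp expands to wt + hp + wt:hp. It has 28 residual degrees of freedom instead of 29, since the interaction term uses one. The intercept is 12.58 larger than in the first model (49.81 vs. 37.23). The wt and hp coefficients change from -3.878 and -0.032 to -8.217 and -0.120, and the wt:hp term, which is absent from the first model, is +0.02785. With an interaction in the model, the wt and hp coefficients are the slopes when the other predictor equals 0, so they are not directly comparable to the coefficients of the first model.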
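One way to read the interaction model is to compute the slope of wt at a few realistic hp values, since that slope is no longer a single number. The sketch below is only illustrative: evaluating at the hp quartiles is our own choice, and the names hp_levels, wt_slope, and cmp are not used elsewhere in this report. It also runs a nested-model F test with anova() as one possible way to compare the two fits.

```r
# Refit both models so this chunk runs on its own
lm_mpg <- lm(mpg ~ wt + hp, data = mtcars)
lm_wt_hp <- lm(mpg ~ wt * hp, data = mtcars)
b <- coef(lm_wt_hp)

# Slope of wt at a given hp is: b_wt + b_wt:hp * hp
hp_levels <- quantile(mtcars$hp, c(0.25, 0.50, 0.75))
wt_slope <- unname(b["wt"]) + unname(b["wt:hp"]) * hp_levels
wt_slope

# Nested F test: does adding wt:hp improve on the additive model?
cmp <- anova(lm_mpg, lm_wt_hp)
cmp
```

Because the wt:hp coefficient is positive, the negative effect of weight on mpg becomes weaker as horsepower increases. With only one added term, the F test p-value equals the p-value of the wt:hp t-test in the summary.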
q5 <- quantile(mtcars$hp, 0.05)  # 5th percentile of hp
q95 <- quantile(mtcars$hp, 0.95)  # 95th percentile of hp
hp_win <- pmax(pmin(mtcars$hp, q95), q5)  # winsorize: clip hp to the range [q5, q95]
lm_win <- lm(mpg ~ wt + hp_win, data = mtcars)  # refit the first model with winsorized hp
summary(lm_win)
##
## Call:
## lm(formula = mpg ~ wt + hp_win, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.8825 -1.6545 -0.0968 0.8367 5.7259
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.31722 1.56964 23.774 < 2e-16 ***
## wt -3.58279 0.66427 -5.394 8.5e-06 ***
## hp_win -0.03952 0.01059 -3.732 0.000824 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.546 on 29 degrees of freedom
## Multiple R-squared: 0.833, Adjusted R-squared: 0.8215
## F-statistic: 72.34 on 2 and 29 DF, p-value: 5.348e-12
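To see what the winsorizing step actually changed, a short check like the one below may help. It is a sketch that repeats the clipping from above and lists the cars whose hp was replaced; the names changed and clipped are illustrative only.

```r
# Repeat the winsorizing step so this chunk runs on its own
q5 <- quantile(mtcars$hp, 0.05)
q95 <- quantile(mtcars$hp, 0.95)
hp_win <- pmax(pmin(mtcars$hp, q95), q5)

# Flag the observations that were pulled in to a percentile bound
changed <- hp_win != mtcars$hp

# Original and winsorized hp for the affected cars only
clipped <- data.frame(
  car = rownames(mtcars),
  hp = mtcars$hp,
  hp_win = hp_win
)[changed, ]
clipped

# After clipping, hp_win lies between the two percentile bounds
range(hp_win)
```

Only the most extreme hp values are altered, which is why the coefficients of lm_win stay close to those of lm_mpg.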
library(car)
## Loading required package: carData
vif_values <- vif(lm_mpg)
corrplot_data <- cor(mtcars[, c("mpg", "wt", "hp")])
We find that the VIF for both wt and hp is about 1.77, which is well below the common threshold of 5. Although the correlation between wt and hp is 0.659, it is not high enough to cause problematic multicollinearity. Each predictor contributes explanatory power for mpg beyond what the other provides.
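With only two predictors, the VIF can be reproduced without the car package, because it reduces to 1 / (1 - r^2), where r is the correlation between wt and hp. The sketch below shows that calculation two ways; the names r, vif_from_cor, aux_r2, and vif_from_aux are illustrative.

```r
# Correlation between the two predictors
r <- cor(mtcars$wt, mtcars$hp)

# Two-predictor shortcut: VIF = 1 / (1 - r^2)
vif_from_cor <- 1 / (1 - r^2)

# General definition: regress one predictor on the other(s)
# and use the R-squared of that auxiliary regression
aux_r2 <- summary(lm(wt ~ hp, data = mtcars))$r.squared
vif_from_aux <- 1 / (1 - aux_r2)

c(r = r, vif_from_cor = vif_from_cor, vif_from_aux = vif_from_aux)
```

Both calculations give the same value for wt and hp, which is why the two VIFs reported by vif(lm_mpg) are identical.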