Last class we talked about quadratic regression. Unlike linear regression, where we assume a straight-line relationship between x and y, here the relationship is a smooth curve, so a quadratic equation helps us build an accurate model. The equation for our quadratic regression is:
\(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{\beta}_2 x_i^2\)
In this equation, \(\hat{\beta}_0\) is our y-intercept, \(\hat{\beta}_1\) shifts the parabola left or right, and \(\hat{\beta}_2\) controls the curvature of the parabola.
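To build some intuition, here is a small illustrative sketch (the coefficient values are made up, not estimated from any data) showing how the size of the quadratic coefficient changes the curvature:

curve(5 + 1*x + 0.2*x^2, from = -10, to = 10, ylim = c(-10, 120), ylab = "y")  # mild curvature
curve(5 + 1*x + 1.0*x^2, from = -10, to = 10, add = TRUE, lty = 2)             # larger beta2, stronger curvature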
How do we determine if the quadratic term is necessary? There are two tests. The first option is a t-test for \(\hat{\beta}_2\). The second option is to run an F-test comparing the reduced model to the complete model that includes \(x^2\). In the t-test, if the p-value for the \(x^2\) term is high, then the quadratic term is not necessary and can be dropped from the model.
Another way to see if the quadratic model is useful is by looking for trends in the residuals. A trend appears when the data points sit mostly above or mostly below the regression line in some region. If more points lie above the line, the model is systematically under-predicting there; if more lie below it, the model is over-predicting. I will use the women data set to show what this looks like.
data(women)    # built-in data set: height (in) and weight (lb) for 15 women
attach(women)  # so we can refer to height and weight directly
Below, I’m going to plot the residuals and see if there are trends.
womenmod <- lm(weight ~ height)     # simple linear model
womenresid <- womenmod$residuals    # pull out the residuals
plot(womenresid ~ womenmod$fitted)  # residuals vs. fitted values
abline(0, 0)                        # horizontal reference line at zero
After fitting our linear regression model and plotting the residuals against the predicted values, it is evident there is a trend. Remember, we don't want a trend in our residuals, because a trend shows that our line is systematically over-predicting in some regions and under-predicting in others, which means there could be a better line. We can create a better line by making it quadratic. I will add the quadratic term below.
xsq <- height^2                       # create the squared term
wommod2 <- lm(weight ~ height + xsq)  # quadratic model
wommod2
##
## Call:
## lm(formula = weight ~ height + xsq)
##
## Coefficients:
## (Intercept) height xsq
## 261.87818 -7.34832 0.08306
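Plugging these estimates into the quadratic equation from above, the fitted model is \(\hat{\text{weight}} = 261.878 - 7.348\,\text{height} + 0.083\,\text{height}^2\).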
womresid2 <- wommod2$residuals
womresid2
## 1 2 3 4 5
## -0.102941176 -0.473109244 -0.009405301 0.288170653 0.419618617
## 6 7 8 9 10
## 0.384938591 0.184130575 -0.182805430 0.284130575 -0.415061409
## 11 12 13 14 15
## -0.280381383 -0.311829347 -0.509405301 0.126890756 0.597058824
plot(womresid2 ~ wommod2$fitted)
abline(0,0)
After adding the quadratic term, we can observe that the residuals look randomly scattered around zero with no visible trend. Since the trend is gone, we have evidence that the quadratic term should not be dropped.
Now, let's use the t-test to confirm that adding the quadratic term creates a better line.
summary(wommod2)
##
## Call:
## lm(formula = weight ~ height + xsq)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.50941 -0.29611 -0.00941 0.28615 0.59706
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 261.87818 25.19677 10.393 2.36e-07 ***
## height -7.34832 0.77769 -9.449 6.58e-07 ***
## xsq 0.08306 0.00598 13.891 9.32e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3841 on 12 degrees of freedom
## Multiple R-squared: 0.9995, Adjusted R-squared: 0.9994
## F-statistic: 1.139e+04 on 2 and 12 DF, p-value: < 2.2e-16
Look at the Pr(>|t|) value for xsq. The value of 9.32e-09 is a very small p-value, which means there is strong evidence that \(\hat{\beta}_2\) is not zero and the quadratic term belongs in the model.
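The F-test option mentioned earlier reaches the same conclusion. A quick sketch using the two models already fitted above (the reduced model womenmod and the complete model wommod2):

anova(womenmod, wommod2)  # F-test comparing the reduced and complete models

Since the two models differ by a single term, this F-test is equivalent to the t-test on xsq (the F statistic is the square of the t value).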
In conclusion, adding an \(x^2\) term creates a better regression line when it removes the trend we saw in the linear model's residuals, leaving them randomly scattered. We want to eliminate systematic under- and over-prediction.
Multicollinearity happens when our predictors are correlated. This is not good.
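In fact, our own quadratic model is a simple example: height and its square xsq are almost perfectly correlated. A quick check with the variables defined above:

cor(height, xsq)  # correlation between the two predictors; very close to 1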
Why is it bad?