In class today we covered sections 4.7 and 5.1.
In general, the idea of this topic is that if we fit a simple linear regression and it turns out not to be a good model, we can expand to a polynomial regression.
First, you start with a linear model (lm(y ~ x)). Next, plot the residuals against the fitted values. If that plot shows a clear trend, simple linear regression is not the best tool for you!
data(women)
attach(women)
womenmod <- lm(weight ~ height)
plot(womenmod$residuals ~ womenmod$fitted.values)
In this graph, you can see a clear parabola-like shape. This is a great indicator that we should use quadratic regression instead of linear. If the residuals instead looked random, that would be evidence that linear regression is sufficient.
So, if we create a quadratic regression model and plot its residuals against the fitted values, we will see that it is a much better fit.
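For reference, the quadratic model being fit below is \(y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon\), where \(y\) is weight and \(x\) is height.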
womenquadmod <- lm(weight ~ height + I(height^2))
plot(womenquadmod$residuals ~ womenquadmod$fitted.values)
You can see these residuals are much better and more random! This is a sign that quadratic regression is a better predictor than linear regression. To get more technical, you can look at the p-value for the t test on \(\beta_2\).
summary(womenquadmod)
##
## Call:
## lm(formula = weight ~ height + I(height^2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.50941 -0.29611 -0.00941 0.28615 0.59706
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 261.87818 25.19677 10.393 2.36e-07 ***
## height -7.34832 0.77769 -9.449 6.58e-07 ***
## I(height^2) 0.08306 0.00598 13.891 9.32e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3841 on 12 degrees of freedom
## Multiple R-squared: 0.9995, Adjusted R-squared: 0.9994
## F-statistic: 1.139e+04 on 2 and 12 DF, p-value: < 2.2e-16
If the p-value for \(\beta_2\) is significant, and in this scenario it is, we can say that \(\beta_2\) is not equal to zero and the quadratic term is therefore useful. (Formally, the t test here checks \(H_0: \beta_2 = 0\) against \(H_a: \beta_2 \neq 0\).)
Finally, you can expand this topic to polynomial regression of any degree, as sketched below. This will be extremely useful in real world data analysis.
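For example, here is a minimal sketch of a cubic fit on the same data (the model names womencubmod and womencubmod2 are just illustrative choices):
womencubmod <- lm(weight ~ height + I(height^2) + I(height^3))
summary(womencubmod)
# poly() is an equivalent shortcut that builds the polynomial terms for you
# (orthogonal by default, so the coefficients differ but the fitted values match)
womencubmod2 <- lm(weight ~ poly(height, 3))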
In the second part of class we went over multicollinearity. In short, you do NOT want multicollinearity. The process to check for it is straightforward (see the sketch after this list):
1. Plot the data.
2. Visually inspect for correlation between the predictors.
3. If they look correlated, calculate the correlation coefficient. Multicollinearity is considered severe if |cor| > .9.
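Here is a minimal sketch of those three steps in R, using the built-in mtcars data purely as an example (the choice of predictor columns is illustrative):
# Steps 1 and 2: plot every pair of predictors and look for linear patterns
pairs(mtcars[, c("disp", "hp", "wt")])
# Step 3: compute the pairwise correlation coefficients
cor(mtcars[, c("disp", "hp", "wt")])
# Multicollinearity is considered severe if any |cor| exceeds .9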
Overall, this class period was super helpful for real world data analysis going forward.