library(boot) # provides cv.glm()
auto <- read.table('data/auto-mpg.data')
names(auto) <- c('displacement', 'horsepower', 'weight', 'acceleration', 'mpg')

Polynomial fits of mpg against the remaining four variables (displacement, horsepower, weight, and acceleration) are created with degrees ranging from 1 to 8, and each fit is scored by 5-fold cross-validation:
bias_var <- data.frame(N = rep(NA, 8), Error = rep(NA, 8))
set.seed(46) # set seed for replicable results
for (n in 1:8) {
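  # degree-n polynomial in the sum of the four predictors
  polyfit <- glm(mpg ~ poly(displacement + horsepower + weight + acceleration, n), data = auto)
  bias_var$N[n] <- n
  # delta[1] is the raw 5-fold cross-validation estimate of prediction error
  bias_var$Error[n] <- cv.glm(auto, polyfit, K = 5)$delta[1]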
}

The plot below shows the mean cross-validation error against the degree of the polynomial fit to the data. Apart from a slight, unexpected bump at N=6, the characteristic U-shaped curve is visible. The lowest error occurs at N=2.
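
To make the cross-validation step less of a black box, here is a minimal sketch of K-fold cross-validation written by hand. It is not the code used above: it assumes R's built-in mtcars data instead of auto-mpg, uses a single predictor (wt) with lm(), and the helper name kfold_mse is hypothetical. Under those assumptions it computes roughly the quantity that cv.glm() reports in delta[1], then picks the degree with the smallest error and plots the curve.

```r
# Hand-rolled K-fold cross-validation (illustrative sketch only).
# Assumptions: built-in mtcars data, one predictor (wt), hypothetical helper name.
kfold_mse <- function(data, degree, k = 5) {
  # Randomly assign each row to one of k folds of near-equal size
  folds <- sample(rep(1:k, length.out = nrow(data)))
  fold_mse <- numeric(k)
  for (i in 1:k) {
    train <- data[folds != i, ]   # fit on the other k - 1 folds
    test  <- data[folds == i, ]   # evaluate on the held-out fold
    fit   <- lm(mpg ~ poly(wt, degree), data = train)
    pred  <- predict(fit, newdata = test)
    fold_mse[i] <- mean((test$mpg - pred)^2)
  }
  # Average the fold errors, weighting each fold by its number of rows
  sum(fold_mse * tabulate(folds, k)) / nrow(data)
}

set.seed(46) # fold assignment is random, so fix the seed
degrees  <- 1:4
cv_error <- sapply(degrees, function(d) kfold_mse(mtcars, d))

# Degree with the lowest estimated prediction error
best_degree <- degrees[which.min(cv_error)]

plot(degrees, cv_error, type = "b",
     xlab = "Polynomial degree", ylab = "5-fold CV error")
```

Because the folds are drawn at random, the error curve shifts somewhat from one seed to the next, which is one plausible source of small bumps like the one seen at N=6 above.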