Using the stats and boot libraries in R, perform a cross-validation experiment to observe the bias-variance tradeoff. You'll use the auto data set from previous assignments, which has 392 observations on 5 variables. We want to fit polynomial models of various degrees using the glm function and then measure the cross-validation error with the cv.glm function. Fit polynomial models that predict mpg as a function of the other four variables (acceleration, weight, horsepower, and displacement) using glm. For example:

glm.fit=glm(mpg~poly(disp+hp+wt+acc,2), data=auto)
cv.err5[2]=cv.glm(auto,glm.fit,K=5)$delta[1]

will fit a 2nd-degree polynomial relating mpg to the remaining four variables, estimate its error with 5-fold cross-validation, and store the result in the cv.err5 array. cv.glm returns the estimated cross-validation error and its adjusted value in a component called delta; see the help page for cv.glm for more information. Once you have fit the polynomials of degree 1 to 8, you can plot the cross-validation error curve:

degree=1:8
plot(degree,cv.err5,type='b')

For your assignment, please create an R Markdown document where you load the auto data set, perform the polynomial fits, and plot the resulting 5-fold cross-validation curve. Your output should show the characteristic U-shape illustrating the tradeoff between bias and variance.
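For reference, a minimal sketch of a single cv.glm call, assuming (as in the example above) a data frame named auto with columns disp, hp, wt, and acc; the delta component holds both the raw and the bias-adjusted estimate:

library(boot)
glm.fit <- glm(mpg ~ poly(disp + hp + wt + acc, 2), data = auto)   # 2nd-degree fit
cv.glm(auto, glm.fit, K = 5)$delta   # [1] raw 5-fold CV error, [2] bias-adjusted estimate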
library("stats")
library("boot")
autodata <- read.table('auto-mpg.data', col.names = c('displacement', 'horsepower', 'weight', 'acceleration', 'mpg'))
head(autodata)
## displacement horsepower weight acceleration mpg
## 1 307 130 3504 12.0 18
## 2 350 165 3693 11.5 15
## 3 318 150 3436 11.0 18
## 4 304 150 3433 12.0 16
## 5 302 140 3449 10.5 17
## 6 429 198 4341 10.0 15
# Fit polynomials of degree 1 through 10 and record the raw 5-fold CV error
# estimate (delta[1]) for each degree. cv.glm assigns folds at random, so the
# numbers below vary slightly from run to run unless a seed is set beforehand.
glmData <- autodata
cvData <- autodata
cv.err <- c()
degree <- 1:10
for (i in degree) {
  # poly() of the summed predictors, following the form given in the assignment
  glm.fit <- glm(mpg ~ poly(displacement + horsepower + weight + acceleration, i), data = glmData)
  cv.err[i] <- cv.glm(cvData, glm.fit, K = 5)$delta[1]
}
cv.err
## [1] 18.53532 16.93100 16.78358 17.18662 16.98080 17.20325 16.79852
## [8] 17.09996 17.21927 17.42376
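Finally, plot the cross-validation error against polynomial degree to complete the assignment. A minimal sketch using base R graphics, reusing the degree and cv.err objects created above (the axis labels are my own additions):

plot(degree, cv.err, type = 'b', xlab = 'Polynomial degree', ylab = '5-fold CV error')
which.min(cv.err)   # degree with the lowest estimated CV error

In this run the error drops sharply from degree 1 to degrees 2-3 and then drifts upward at the highest degrees, giving the roughly U-shaped curve the assignment asks for: underfitting (high bias) on the left, overfitting (high variance) on the right.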