Using the stats and boot libraries in R, perform a cross-validation experiment to observe the bias-variance tradeoff. You’ll use the auto data set from previous assignments. This dataset has 392 observations across 5 variables.

library(stats)
library(boot)

We want to fit polynomial models of various degrees using the glm function in R and then measure the cross-validation error using the cv.glm function. Fit polynomial models that predict mpg from the other four variables (acceleration, weight, horsepower, and displacement) using glm. For example:

glm.fit=glm(mpg~poly(disp+hp+wt+acc,2), data=auto)
cv.err5[2]=cv.glm(auto,glm.fit,K=5)$delta[1]

will fit a 2nd-degree polynomial between mpg and the remaining 4 variables and perform 5-fold cross-validation. The result is stored in the cv.err5 array. cv.glm returns the estimated cross-validation error and its adjusted value in a vector called delta.
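To make the structure of delta concrete, here is a minimal sketch. It assumes R's built-in mtcars data as a stand-in for the auto data (the assignment's data file is not shown here), with qsec playing the role of acc. It also shows that poly(disp+hp+wt+qsec, 2) is a polynomial in the sum of the four predictors, so the fit has only an intercept plus one coefficient per degree.

```r
library(boot)

set.seed(1)  # K-fold splits are random; fix the seed for repeatability

# Stand-in data (assumption): mtcars replaces the auto data set,
# with qsec used in place of acc.
fit <- glm(mpg ~ poly(disp + hp + wt + qsec, 2), data = mtcars)

# The four predictors are summed before poly() is applied, so the model
# has an intercept plus one coefficient per polynomial degree.
n_coef <- length(coef(fit))

# delta is a numeric vector of length two:
#   delta[1] = raw K-fold cross-validation estimate of prediction error
#   delta[2] = bias-adjusted estimate
delta <- cv.glm(mtcars, fit, K = 5)$delta
print(n_coef)
print(delta)
```

Only delta[1] is stored in the loop for this assignment; the adjusted value in delta[2] is available if a bias-corrected estimate is wanted.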

# Run the polynomial model for degrees 1-8
cv_err <- matrix(ncol = 1, nrow = 8)
for (i in 1:8) {
  glm_fit <- glm(mpg ~ poly(disp + hp + wt + acc, i), data = auto_data)
  # Store the 5-fold cross-validation error for degree i in cv_err
  cv_err[i, ] <- cv.glm(auto_data, glm_fit, K = 5)$delta[1]
}
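For intuition about what the K=5 call is doing, the following is a rough hand-rolled sketch of 5-fold cross-validation for a single degree. It is not the internals of cv.glm, only an approximation of the same idea, and it again assumes mtcars as a stand-in for the auto data (qsec in place of acc). The names dat, fold, and fold_mse are illustrative.

```r
set.seed(1)

# Stand-in data (assumption): mtcars replaces the auto data set.
dat <- mtcars
k <- 5

# Randomly assign each row to one of k folds of near-equal size.
fold <- sample(rep(1:k, length.out = nrow(dat)))

fold_mse <- numeric(k)
for (j in 1:k) {
  train <- dat[fold != j, ]
  test  <- dat[fold == j, ]
  # Fit on k-1 folds, then score squared error on the held-out fold.
  fit <- glm(mpg ~ poly(disp + hp + wt + qsec, 2), data = train)
  pred <- predict(fit, newdata = test)
  fold_mse[j] <- mean((test$mpg - pred)^2)
}

# Weight each fold's error by its size to get one overall CV estimate.
fold_n <- as.numeric(table(fold))
cv_mse <- sum(fold_mse * fold_n) / nrow(dat)
print(cv_mse)
```

Because the fold assignment is random, the estimate changes from run to run unless a seed is set, which is also true of the cv.glm results above.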

Once you have fit the various polynomials from degree 1 to 8, you can plot the cross-validation error as a function of degree:

degree=1:8
plot(degree,cv.err5,type='b')

# Plot the cross validation error
title <- 'Est cross validation error for polynomials of degree 1-8'
# Pass the degrees as the first positional (x) argument; "deg" is not a plot() parameter
plot(1:8, cv_err, type='b', ylab='Est MSE', xlab='Polynomial Degree', main=title, col='blue')

Conclusion

We conclude that the best bias-variance tradeoff occurs when the polynomial degree is 2 or 3. The degree-1 model has high bias, while the higher-degree models (2 to 8) all have low bias, so degrees above 3 add flexibility (and potential variance) without a corresponding reduction in bias.
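The reasoning behind this kind of conclusion can be sketched on simulated data, where the true relationship is known. The example below is an illustration under stated assumptions, not a rerun of the auto analysis: the data frame sim, the quadratic signal, and the noise level are all made up for the purpose. Training error can only fall as the degree grows, while the cross-validation error is what reveals underfitting at degree 1.

```r
library(boot)

set.seed(1)

# Simulated data (assumption, not the auto data): quadratic signal plus noise.
n <- 200
x <- runif(n, -2, 2)
sim <- data.frame(x = x, y = 1 + 2 * x - 1.5 * x^2 + rnorm(n))

degrees <- 1:8
train_mse <- numeric(length(degrees))
cv_mse <- numeric(length(degrees))
for (d in degrees) {
  fit <- glm(y ~ poly(x, d), data = sim)
  # Error on the same data used to fit the model
  train_mse[d] <- mean(residuals(fit, type = "response")^2)
  # 5-fold cross-validation estimate of error on held-out data
  cv_mse[d] <- cv.glm(sim, fit, K = 5)$delta[1]
}

# Degree with the smallest cross-validation error
best_degree <- degrees[which.min(cv_mse)]
print(data.frame(degree = degrees, train_mse, cv_mse))
print(best_degree)
```

Comparing the two columns side by side is one way to separate bias (large error at low degree in both columns) from variance (cross-validation error that stops improving, or worsens, while training error keeps shrinking).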