This week’s assignment covers exercises 9(d) and 11 from ISLR (James et al., 2017); see the reference at the end for the full citation.
The chart below compares the lasso MSE with the linear model and ridge regression MSEs for the College data set.
## [1] 1200910
##
## Call:
## lm(formula = Apps ~ ., data = College.train)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5189.4  -385.6    12.1   295.4  7845.2
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.723e+02  5.599e+02  -0.844  0.39946
## PrivateYes  -5.327e+02  1.951e+02  -2.730  0.00663 **
## Accept       1.694e+00  4.703e-02  36.030  < 2e-16 ***
## Enroll      -1.339e+00  2.566e-01  -5.218 3.02e-07 ***
## Top10perc    4.944e+01  7.777e+00   6.357 6.05e-10 ***
## Top25perc   -1.626e+01  6.317e+00  -2.574  0.01045 *
## F.Undergrad  9.878e-02  4.768e-02   2.072  0.03900 *
## P.Undergrad -4.058e-03  5.214e-02  -0.078  0.93800
## Outstate    -9.403e-02  2.550e-02  -3.688  0.00026 ***
## Room.Board   2.064e-01  6.704e-02   3.079  0.00223 **
## Books        1.642e-01  3.904e-01   0.421  0.67424
## Personal     3.746e-02  8.460e-02   0.443  0.65814
## PhD         -8.184e+00  6.908e+00  -1.185  0.23687
## Terminal    -2.250e+00  7.405e+00  -0.304  0.76137
## S.F.Ratio    2.031e+01  1.741e+01   1.167  0.24407
## perc.alumni  2.403e+00  5.734e+00   0.419  0.67539
## Expend       6.168e-02  1.451e-02   4.250 2.71e-05 ***
## Grad.Rate    6.578e+00  3.951e+00   1.665  0.09684 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1013 on 370 degrees of freedom
## Multiple R-squared: 0.9432, Adjusted R-squared: 0.9406
## F-statistic: 361.7 on 17 and 370 DF, p-value: < 2.2e-16
## [1] 1200546
## [1] 1200861
The calculated MSE for the lasso method is 1.2008607 × 10^6 (about 1,200,861).
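For reference, the MSE reported here is presumably the usual mean squared prediction error. The notation below is introduced for illustration and is not taken from the report: n is the number of observations scored, y_i the observed number of applications (Apps), and ŷ_i the lasso prediction. Taking the square root puts the error back on the scale of the response:

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\qquad
\sqrt{\mathrm{MSE}} = \sqrt{1.2008607 \times 10^{6}} \approx 1095.8
```

Read this way, a typical lasso prediction is off by roughly 1,100 applications.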
The following setup runs several different methods so that their results can be compared.
To evaluate model performance, it is useful to plot the MSE of each method:
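One way such a setup could look is sketched below. It is a minimal sketch, not the code used for this report. It assumes the Boston data from the MASS package with `crim` as the response (the data set ISLR exercise 11 refers to), an arbitrary seed, a half-and-half train/test split, and three methods (linear model, ridge, lasso). The names `test.mse`, `methods`, and `mse.by.method` are hypothetical.

```r
# Sketch only: data set, seed, split, and method list are assumptions.
library(MASS)    # Boston data (assumed data set for exercise 11)
library(glmnet)  # ridge (alpha = 0) and lasso (alpha = 1)

set.seed(1)  # arbitrary seed, not taken from the report
train <- sample(nrow(Boston), nrow(Boston) %/% 2)

# glmnet needs a numeric predictor matrix; drop the intercept column
x <- model.matrix(crim ~ ., data = Boston)[, -1]
y <- Boston$crim

# Helper: mean squared error on the held-out rows
test.mse <- function(pred) mean((y[-train] - pred)^2)

# Each entry fits on the training rows and returns test-set predictions
methods <- list(
  linear = function() {
    fit <- lm(crim ~ ., data = Boston[train, ])
    predict(fit, newdata = Boston[-train, ])
  },
  ridge = function() {
    cv <- cv.glmnet(x[train, ], y[train], alpha = 0)
    as.numeric(predict(cv, newx = x[-train, ], s = "lambda.min"))
  },
  lasso = function() {
    cv <- cv.glmnet(x[train, ], y[train], alpha = 1)
    as.numeric(predict(cv, newx = x[-train, ], s = "lambda.min"))
  }
)

# Named vector with one test MSE per method
mse.by.method <- sapply(methods, function(f) test.mse(f()))
print(round(mse.by.method, 2))
```

Keeping each method behind a function that returns test-set predictions means another method can be added as one more list entry without touching the comparison code.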
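A plot of this kind could be drawn as in the sketch below, which uses base R only. As example input it reuses the three MSE values printed in the College output above. Only the lasso label is confirmed by the text; assigning the other two values to the linear model and to ridge is a guess, and `mse.values` and `spread.pct` are hypothetical names.

```r
# Example input: the three MSE values printed in the College output.
# Only the lasso label is confirmed by the text; the other two are a guess.
mse.values <- c(linear = 1200910, ridge = 1200546, lasso = 1200861)

# A dot chart keeps small differences visible, unlike bars that start at zero
dotchart(sort(mse.values), xlab = "MSE", main = "MSE by method", pch = 19)

# Percentage gap between the best and the worst method
spread.pct <- 100 * (max(mse.values) - min(mse.values)) / min(mse.values)
print(round(spread.pct, 3))
```

With these values the gap between the best and worst method is only about 0.03 percent, so the axis range matters more than the chart type when reading the plot.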
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2017). An introduction to statistical learning: With applications in R (corrected 8th printing). New York, NY: Springer.