The questions for this week’s assignment are 9(d) and 11 from ISLR; see References for the citation.

Week 9 Review for comparison of week 10 results:

Week 10

Problem 9 - D

The output below shows the lasso MSE in comparison with the linear model and ridge regression MSEs for the College data set.

## [1] 1200910
## 
## Call:
## lm(formula = Apps ~ ., data = College.train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5189.4  -385.6    12.1   295.4  7845.2 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -4.723e+02  5.599e+02  -0.844  0.39946    
## PrivateYes  -5.327e+02  1.951e+02  -2.730  0.00663 ** 
## Accept       1.694e+00  4.703e-02  36.030  < 2e-16 ***
## Enroll      -1.339e+00  2.566e-01  -5.218 3.02e-07 ***
## Top10perc    4.944e+01  7.777e+00   6.357 6.05e-10 ***
## Top25perc   -1.626e+01  6.317e+00  -2.574  0.01045 *  
## F.Undergrad  9.878e-02  4.768e-02   2.072  0.03900 *  
## P.Undergrad -4.058e-03  5.214e-02  -0.078  0.93800    
## Outstate    -9.403e-02  2.550e-02  -3.688  0.00026 ***
## Room.Board   2.064e-01  6.704e-02   3.079  0.00223 ** 
## Books        1.642e-01  3.904e-01   0.421  0.67424    
## Personal     3.746e-02  8.460e-02   0.443  0.65814    
## PhD         -8.184e+00  6.908e+00  -1.185  0.23687    
## Terminal    -2.250e+00  7.405e+00  -0.304  0.76137    
## S.F.Ratio    2.031e+01  1.741e+01   1.167  0.24407    
## perc.alumni  2.403e+00  5.734e+00   0.419  0.67539    
## Expend       6.168e-02  1.451e-02   4.250 2.71e-05 ***
## Grad.Rate    6.578e+00  3.951e+00   1.665  0.09684 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1013 on 370 degrees of freedom
## Multiple R-squared:  0.9432, Adjusted R-squared:  0.9406 
## F-statistic: 361.7 on 17 and 370 DF,  p-value: < 2.2e-16
## [1] 1200546
## [1] 1200861

The calculated MSE for the lasso method is 1.2008607 × 10^6 (approximately 1,200,861, the last value printed above).
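
For reference, a lasso test MSE of this kind can be computed roughly as follows. This is a minimal sketch, not the code used for this report: it assumes the ISLR and glmnet packages, and the seed, the 50/50 split, and the object names (x_train, cv_lasso, lasso_mse, and so on) are illustrative assumptions, so it will not reproduce the figures above exactly.

```r
# Sketch only: assumes the ISLR and glmnet packages are installed.
library(ISLR)    # provides the College data set
library(glmnet)  # lasso and ridge regression

set.seed(1)  # assumed seed; the seed used for the report is not stated

# Assumed 50/50 split: 388 of 777 rows for training, which is consistent
# with the 370 residual degrees of freedom in the lm() summary above.
train_idx <- sample(nrow(College), floor(nrow(College) / 2))
College.train <- College[train_idx, ]
College.test  <- College[-train_idx, ]

# glmnet needs a numeric predictor matrix; drop the intercept column.
x_train <- model.matrix(Apps ~ ., data = College.train)[, -1]
x_test  <- model.matrix(Apps ~ ., data = College.test)[, -1]
y_train <- College.train$Apps
y_test  <- College.test$Apps

# alpha = 1 selects the lasso penalty; lambda is chosen by cross-validation.
cv_lasso <- cv.glmnet(x_train, y_train, alpha = 1)

# Predict on the held-out set at the CV-selected lambda and compute the MSE.
lasso_pred <- predict(cv_lasso, s = "lambda.min", newx = x_test)
lasso_mse  <- mean((y_test - lasso_pred)^2)
lasso_mse

# Number of non-zero coefficient estimates (excluding the intercept).
lasso_coef <- as.matrix(coef(cv_lasso, s = "lambda.min"))
sum(lasso_coef[-1, 1] != 0)
```

Changing alpha = 1 to alpha = 0 in the cv.glmnet() call gives the corresponding ridge regression fit.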


Problem 11

A:

The following setup runs several different methods so that their results can be compared.

To evaluate model performance, it is useful to plot the MSE for each method.
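
A comparison of this kind might be set up along the following lines. This is a hedged sketch, not the code behind the results discussed here: it assumes the MASS and glmnet packages, takes the Boston data set with crim as the response (as in ISLR exercise 11), and uses an assumed seed and 50/50 split. Best subset selection (for example via leaps::regsubsets) is omitted for brevity, and test_mse is a hypothetical helper.

```r
# Sketch only: assumes the MASS and glmnet packages are installed.
library(MASS)    # provides the Boston data set
library(glmnet)  # ridge and lasso regression

set.seed(1)  # assumed seed

# Assumed 50/50 train/test split
train_idx <- sample(nrow(Boston), floor(nrow(Boston) / 2))
x <- model.matrix(crim ~ ., data = Boston)[, -1]
y <- Boston$crim

# Hypothetical helper: mean squared error on the held-out rows
test_mse <- function(pred) mean((y[-train_idx] - pred)^2)

# Least squares baseline
lm_fit  <- lm(crim ~ ., data = Boston, subset = train_idx)
lm_pred <- predict(lm_fit, newdata = Boston[-train_idx, ])

# Ridge (alpha = 0) and lasso (alpha = 1), lambda chosen by cross-validation
cv_ridge <- cv.glmnet(x[train_idx, ], y[train_idx], alpha = 0)
cv_lasso <- cv.glmnet(x[train_idx, ], y[train_idx], alpha = 1)
ridge_pred <- predict(cv_ridge, s = "lambda.min", newx = x[-train_idx, ])
lasso_pred <- predict(cv_lasso, s = "lambda.min", newx = x[-train_idx, ])

# Collect the test MSEs in a named vector and plot them side by side
mse <- c(
  "Least squares" = test_mse(lm_pred),
  "Ridge"         = test_mse(ridge_pred),
  "Lasso"         = test_mse(lasso_pred)
)
print(mse)

barplot(mse, ylab = "Test MSE", main = "Test MSE by method")
```

Because the split is random, the ranking of methods can change with the seed; averaging over several splits would give a steadier comparison.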

B:

Best subset selection produced the highest MSE. The lasso and ridge regression MSEs were roughly equivalent, with ridge regression slightly lower, so the ridge regression model was selected.

C:

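The ridge regression model involves no dimension reduction. Ridge shrinks the coefficient estimates toward zero but does not set any of them exactly to zero, so the selected model retains all of the predictors.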
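
One way to check this is to count the non-zero coefficient estimates at the cross-validated lambda. The sketch below is illustrative only: it assumes the MASS and glmnet packages and the Boston data with crim as the response, fits on the full data set for simplicity, and count_nonzero is a hypothetical helper.

```r
# Sketch only: assumes the MASS and glmnet packages are installed.
library(MASS)    # provides the Boston data set
library(glmnet)  # ridge and lasso regression

set.seed(1)  # assumed seed (affects the CV folds)

x <- model.matrix(crim ~ ., data = Boston)[, -1]
y <- Boston$crim

cv_ridge <- cv.glmnet(x, y, alpha = 0)  # ridge penalty
cv_lasso <- cv.glmnet(x, y, alpha = 1)  # lasso penalty

# Hypothetical helper: number of non-zero slopes at the CV-selected lambda
count_nonzero <- function(cv_fit) {
  beta <- as.matrix(coef(cv_fit, s = "lambda.min"))
  sum(beta[-1, 1] != 0)  # drop the intercept row
}

n_ridge <- count_nonzero(cv_ridge)
n_lasso <- count_nonzero(cv_lasso)
c(predictors = ncol(x), ridge = n_ridge, lasso = n_lasso)
```

The ridge count should equal the number of predictors, while the lasso count may be smaller because the lasso penalty can set coefficients exactly to zero.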


References

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2017). An introduction to statistical learning: With applications in R (corrected 8th printing). New York, NY: Springer.
