A. Ridge Regression
library(glmnet)
## Loading required package: Matrix
## Loading required package: foreach
## Loaded glmnet 2.0-16
library(ISLR)
x<-model.matrix(Salary~., data=Hitters)[,-1]
sum(is.na(Hitters$Salary))
## [1] 59
y<-na.omit(Hitters$Salary)
- Conduct regularization with Hitters data. ‘Salary’ is our dependent variable
- model.matrix() : Creates design matrix. Transforms categorical variable to dummy variables.
- Link : Regression Design Matrix => ‘fomula’, ‘model.matrix’
- Exclude fist column of ‘Hitters’ data which is names of each Hitters.
- na.omit(Hitters$Salary) : Get rid of NA values before conducting regularization
grid<-10^seq(10, -2, length=100)
ridge.mod<-glmnet(x, y, alpha=0, lambda=grid, standardize=FALSE)
dim(coef(ridge.mod))
## [1] 20 100
- alpha = 0 to conduct ridge regression. alpha=1 to conduct Lasso.
- grid : Sequence of possible lambda(smaller grid result in ridge regression coefficients similart to that from least squares method)
- ‘standardize’ argument : Ridge regression and lasso are not scale variant as least squares method. ‘standardize’ argument is for standardizing scales for each predictors by adjusting their standard deviation to 1.
- ‘glmnet’ yields a 20*100 matrix. 20 rows correspond to 19 predictors + intercept. 100 columns for each value of lambda
(ridge.mod$lambda[50]==grid[50])
## [1] TRUE
(round(ridge.mod$lambda[50]))
## [1] 11498
coef(ridge.mod)[,50]
## (Intercept) AtBat Hits HmRun Runs
## 104.38902910 -1.66435939 5.33480831 -0.06859588 0.11401450
## RBI Walks Years CAtBat CHits
## 0.55372167 4.62295068 -0.05020915 -0.25901652 0.51066051
## CHmRun CRuns CRBI CWalks LeagueN
## 0.25163392 1.21728524 0.61983092 -0.57672473 0.41004271
## DivisionW PutOuts Assists Errors NewLeagueN
## -1.08214727 0.29365870 0.34306579 -1.59745909 0.29541082
sqrt(sum(coef(ridge.mod)[-1,50]^2))
## [1] 7.726629
- Regression coefficients should be smaller with larger lambda, thus bigger shrinkage penalty.
- ‘sqrt(sum(coef(ridge.mod)[-1,50]^2))’ measures ell 2 norm
round(ridge.mod$lambda[60])
## [1] 705
coef(ridge.mod)[,60]
## (Intercept) AtBat Hits HmRun Runs
## 123.363539847 -2.098526582 7.356526935 1.741212494 -1.670398714
## RBI Walks Years CAtBat CHits
## 0.055540776 5.921229148 -1.164624039 -0.201686479 0.210091798
## CHmRun CRuns CRBI CWalks LeagueN
## -0.002858872 1.495860845 0.718620328 -0.753891109 5.372857892
## DivisionW PutOuts Assists Errors NewLeagueN
## -15.239439937 0.292496398 0.383770604 -2.833804744 3.585811274
sqrt(sum(coef(ridge.mod)[-1,60]^2))
## [1] 19.6571
- As lambda decreased(smaller shrinkage penalty) ell 2 norm increased.
predict(ridge.mod, s=50, type="coefficients")[1:20,]
## (Intercept) AtBat Hits HmRun Runs RBI
## 147.7502425 -2.0383105 7.5228185 3.5411339 -2.2167514 -0.6616756
## Walks Years CAtBat CHits CHmRun CRuns
## 6.1302988 -2.2594989 -0.1865956 0.1479243 -0.1432158 1.5164526
## CRBI CWalks LeagueN DivisionW PutOuts Assists
## 0.7912658 -0.7968943 25.8294222 -79.1662310 0.2856645 0.3810299
## Errors NewLeagueN
## -3.1539824 5.6526915
set.seed(1)
train<-sample(1:263, nrow(x)/2)
test=(-train)
y.test<-y[test]
ridge.mod<-glmnet(x[train,],y[train], alpha=0, lambda=grid, thresh=1e-12)
ridge.pred<-predict(ridge.mod, s=4, newx=x[test,])
mean((ridge.pred-y.test)^2)
## [1] 101036.8
mean((mean(y[train])-y.test)^2)
## [1] 193253.1
ridge.pred<-predict(ridge.mod, s=1e10, newx=x[test,])
mean((ridge.pred-y.test)^2)
## [1] 193253.1
- 1e10 refers to 10^10
- Fitting with lambda = 4 had yield much smaller test MSE compared to when lambda=10^10. This implies that ridge regression obtain smaller variance of regression coefficients with sacrifice of test MSE
- Use argument ‘exact=T’ within predict() function to set lambda=0. The result, however, yeilds slighty different output from least squares method because glmnet approximates numerical values in some parts.
ridge.pred<-predict(ridge.mod, newx=x[test,], exact=T)
mean((ridge.pred-y.test)^2)
## [1] 149173.9
lm(y~x, subset=train)
##
## Call:
## lm(formula = y ~ x, subset = train)
##
## Coefficients:
## (Intercept) xAtBat xHits xHmRun xRuns
## 299.42849 -2.54027 8.36682 11.64512 -9.09923
## xRBI xWalks xYears xCAtBat xCHits
## 2.44105 9.23440 -22.93673 -0.18154 -0.11598
## xCHmRun xCRuns xCRBI xCWalks xLeagueN
## -1.33888 3.32838 0.07536 -1.07841 59.76065
## xDivisionW xPutOuts xAssists xErrors xNewLeagueN
## -98.86233 0.34087 0.34165 -0.64207 -0.67442
predict(ridge.mod, exact=T, type="coefficients")[1:20,1:10]
## 20 x 10 sparse Matrix of class "dgCMatrix"
## [[ suppressing 10 column names 's0', 's1', 's2' ... ]]
##
## (Intercept) 5.479846e+02 5.479845e+02 5.479844e+02 5.479843e+02
## AtBat 5.192368e-08 6.864003e-08 9.073807e-08 1.199504e-07
## Hits 1.790580e-07 2.367041e-07 3.129088e-07 4.136470e-07
## HmRun 8.926999e-07 1.180096e-06 1.560018e-06 2.062251e-06
## Runs 3.141580e-07 4.152984e-07 5.489999e-07 7.257455e-07
## RBI 3.533162e-07 4.670632e-07 6.174300e-07 8.162059e-07
## Walks 4.732016e-07 6.255446e-07 8.269331e-07 1.093157e-06
## Years 1.198689e-06 1.584595e-06 2.094742e-06 2.769125e-06
## CAtBat 3.981558e-09 5.263384e-09 6.957883e-09 9.197911e-09
## CHits 1.501273e-08 1.984594e-08 2.623516e-08 3.468133e-08
## CHmRun 9.980324e-08 1.319340e-07 1.744089e-07 2.305583e-07
## CRuns 3.232510e-08 4.273187e-08 5.648901e-08 7.467513e-08
## CRBI 2.741340e-08 3.623890e-08 4.790568e-08 6.332848e-08
## CWalks 3.625065e-08 4.792122e-08 6.334902e-08 8.374366e-08
## LeagueN 3.471972e-06 4.589742e-06 6.067369e-06 8.020705e-06
## DivisionW -8.090096e-06 -1.069463e-05 -1.413767e-05 -1.868917e-05
## PutOuts 2.727067e-08 3.605021e-08 4.765626e-08 6.299876e-08
## Assists 1.288616e-08 1.703474e-08 2.251892e-08 2.976869e-08
## Errors 2.804015e-07 3.706743e-07 4.900095e-07 6.477636e-07
## NewLeagueN 3.730442e-06 4.931425e-06 6.519053e-06 8.617804e-06
##
## (Intercept) 5.479842e+02 5.479840e+02 5.479838e+02 5.479835e+02
## AtBat 1.585673e-07 2.096166e-07 2.771007e-07 3.663106e-07
## Hits 5.468169e-07 7.228595e-07 9.555774e-07 1.263217e-06
## HmRun 2.726174e-06 3.603841e-06 4.764064e-06 6.297810e-06
## Runs 9.593926e-07 1.268260e-06 1.676565e-06 2.216319e-06
## RBI 1.078976e-06 1.426342e-06 1.885540e-06 2.492572e-06
## Walks 1.445089e-06 1.910322e-06 2.525332e-06 3.338339e-06
## Years 3.660619e-06 4.839121e-06 6.397030e-06 8.456492e-06
## CAtBat 1.215909e-08 1.607360e-08 2.124835e-08 2.808905e-08
## CHits 4.584667e-08 6.060658e-08 8.011830e-08 1.059116e-07
## CHmRun 3.047844e-07 4.029070e-07 5.326190e-07 7.040906e-07
## CRuns 9.871611e-08 1.304968e-07 1.725091e-07 2.280467e-07
## CRBI 8.371650e-08 1.106683e-07 1.462968e-07 1.933957e-07
## CWalks 1.107042e-07 1.463443e-07 1.934585e-07 2.557406e-07
## LeagueN 1.060290e-05 1.401641e-05 1.852887e-05 2.449408e-05
## DivisionW -2.470598e-05 -3.265985e-05 -4.317440e-05 -5.707401e-05
## PutOuts 8.328065e-08 1.100921e-07 1.455352e-07 1.923890e-07
## Assists 3.935244e-08 5.202160e-08 6.876946e-08 9.090913e-08
## Errors 8.563051e-07 1.131985e-06 1.496416e-06 1.978173e-06
## NewLeagueN 1.139223e-05 1.505985e-05 1.990824e-05 2.631752e-05
##
## (Intercept) 5.479831e+02 5.479825e+02
## AtBat 4.842408e-07 6.401373e-07
## Hits 1.669897e-06 2.207504e-06
## HmRun 8.325329e-06 1.100559e-05
## Runs 2.929842e-06 3.873075e-06
## RBI 3.295031e-06 4.355833e-06
## Walks 4.413086e-06 5.833836e-06
## Years 1.117898e-05 1.477793e-05
## CAtBat 3.713204e-08 4.908633e-08
## CHits 1.400088e-07 1.850833e-07
## CHmRun 9.307656e-07 1.230416e-06
## CRuns 3.014641e-07 3.985174e-07
## CRBI 2.556575e-07 3.379639e-07
## CWalks 3.380738e-07 4.469132e-07
## LeagueN 3.237973e-05 4.280410e-05
## DivisionW -7.544847e-05 -9.973843e-05
## PutOuts 2.543268e-07 3.362049e-07
## Assists 1.201764e-07 1.588660e-07
## Errors 2.615026e-06 3.456907e-06
## NewLeagueN 3.479020e-05 4.599058e-05
set.seed(1)
cv.out<-cv.glmnet(x[train,],y[train], alpha=0)
plot(cv.out)

bestlam<-cv.out$lambda.min
bestlam
## [1] 211.7416
ridge.pred<-predict(ridge.mod, s=bestlam, newx<-x[test,])
mean((ridge.pred-y.test))^2
## [1] 1732.389
out<-glmnet(x,y,alpha=0)
predict(out, type="coefficients", s=bestlam)[1:20,]
## (Intercept) AtBat Hits HmRun Runs
## 9.88487157 0.03143991 1.00882875 0.13927624 1.11320781
## RBI Walks Years CAtBat CHits
## 0.87318990 1.80410229 0.13074383 0.01113978 0.06489843
## CHmRun CRuns CRBI CWalks LeagueN
## 0.45158546 0.12900049 0.13737712 0.02908572 27.18227527
## DivisionW PutOuts Assists Errors NewLeagueN
## -91.63411282 0.19149252 0.04254536 -1.81244470 7.21208394
B. The Lasso
lasso.mod<-glmnet(x[train,],y[train],alpha=1, lambda=grid)
plot(lasso.mod)

- Simple put alpha=1 when using glmnet to conduct Lasso
- Plotting Lasso model shows that depending on lambda some coefficients shrink exactly to 0
set.seed(1)
cv.out<-cv.glmnet(x[train,],y[train], alpha=1)
plot(cv.out)

bestlam<-cv.out$lambda.min
lasso.pred<-predict(lasso.mod, s=bestlam, newx<-x[test,])
mean((lasso.pred-y.test)^2)
## [1] 100743.4
out<-glmnet(x,y,alpha=1, lambda=grid)
lasso.coef<-predict(out, type="coefficients", s=bestlam)[1:20,]
lasso.coef
## (Intercept) AtBat Hits HmRun Runs
## 18.5394844 0.0000000 1.8735390 0.0000000 0.0000000
## RBI Walks Years CAtBat CHits
## 0.0000000 2.2178444 0.0000000 0.0000000 0.0000000
## CHmRun CRuns CRBI CWalks LeagueN
## 0.0000000 0.2071252 0.4130132 0.0000000 3.2666677
## DivisionW PutOuts Assists Errors NewLeagueN
## -103.4845458 0.2204284 0.0000000 0.0000000 0.0000000