LASSO Regression
The Least Absolute Shrinkage and Selection Operator (LASSO) fits a regression model penalized with the L1-norm, the sum of the absolute values of the coefficients. This shrinks the coefficient values (and so the complexity of the model), and allows coefficients with only a minor effect on the response to become exactly zero.
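To make the "become zero" behaviour concrete, here is a minimal base-R sketch. It assumes the special case of orthonormal predictors, where minimizing 1/2 * RSS + lambda * sum(|beta|) reduces to soft-thresholding the least-squares coefficients. The function name `soft_threshold`, the example coefficients, and the value of `lambda` are illustrative assumptions and are not part of the lars package.

```r
# Soft-thresholding: the closed-form LASSO solution when predictors are
# orthonormal (an assumption made purely for illustration).
soft_threshold <- function(b, lambda) {
  sign(b) * pmax(abs(b) - lambda, 0)
}

# hypothetical least-squares coefficients
ols <- c(3.0, -1.5, 0.2, -0.1)
lambda <- 0.5

# coefficients smaller than lambda in absolute value are set exactly to zero
lasso <- soft_threshold(ols, lambda)

# the L1-norm that the LASSO penalizes
l1_penalty <- sum(abs(lasso))

print(lasso)
print(l1_penalty)
```

In this sketch the two large coefficients are pulled toward zero by `lambda`, while the two small ones are removed from the model entirely.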
# load the package
library(lars)
## Loaded lars 1.2
# load data
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])
# fit model
fit <- lars(x, y, type="lasso")
# summarize the fit
summary(fit)
## LARS/LASSO
## Call: lars(x = x, y = y, type = "lasso")
##    Df     Rss        Cp
## 0   1 185.009 1976.7120
## 1   2   6.642   59.4712
## 2   3   3.883   31.7832
## 3   4   3.468   29.3165
## 4   5   1.563   10.8183
## 5   4   1.339    6.4068
## 6   5   1.024    5.0186
## 7   6   0.998    6.7388
## 8   7   0.907    7.7615
## 9   6   0.847    5.1128
## 10  7   0.836    7.0000
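# select a step to predict at: this takes the Df value from the row with the minimum RSS
# (note that Df is not the step number; which.min(fit$RSS) would index the minimum-RSS step itself)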
best_step <- fit$df[which.min(fit$RSS)]
# make predictions
predictions <- predict(fit, x, s=best_step, type="fit")$fit
# summarize accuracy: mean squared error (MSE) on the training data
mse <- mean((y - predictions)^2)
print(mse)
## [1] 0.06400169
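To see which predictors the LASSO has dropped at the chosen step, the coefficients can be inspected directly. The following is a sketch that repeats the setup above so it runs on its own; it assumes the lars package is installed and uses `coef()` with `mode="step"` so that `s` is interpreted the same way as in the `predict()` call. The variable name `coefs` is an illustrative choice.

```r
# load the package and data, then refit as in the example above
library(lars)
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])
fit <- lars(x, y, type = "lasso")

# same step selection as in the example above
best_step <- fit$df[which.min(fit$RSS)]

# coefficients at that step; entries equal to zero are excluded from the model
coefs <- coef(fit, s = best_step, mode = "step")
print(coefs)

# names of the predictors whose coefficient is exactly zero at this step
print(names(coefs)[coefs == 0])
```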
Ridge Regression
Ridge Regression fits a linear regression model penalized with the squared L2-norm, the sum of the squared coefficients. This shrinks the coefficient values (and so the complexity of the model), pulling coefficients that contribute little to the response close to zero, but not exactly to zero.
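The difference between "close to zero" and "exactly zero" can be sketched in base R. The example assumes orthonormal predictors, where minimizing RSS + lambda * sum(beta^2) simply rescales each least-squares coefficient by 1 / (1 + lambda). The function name `ridge_shrink`, the example coefficients, and the value of `lambda` are illustrative assumptions and are not part of glmnet.

```r
# Ridge shrinkage in the orthonormal case (an assumption made for illustration):
# every coefficient is scaled by the same factor 1 / (1 + lambda).
ridge_shrink <- function(b, lambda) {
  b / (1 + lambda)
}

# hypothetical least-squares coefficients
ols <- c(3.0, -1.5, 0.2, -0.1)
lambda <- 0.5

ridge <- ridge_shrink(ols, lambda)

# the squared L2-norm that ridge regression penalizes
l2_penalty <- sum(ridge^2)

print(ridge)
print(l2_penalty)

# no coefficient is set exactly to zero, however small it becomes
print(any(ridge == 0))
```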
# load the package
library(glmnet)
## Loading required package: Matrix
## Loading required package: foreach
## Loaded glmnet 2.0-5
# load data
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])
# fit model
fit <- glmnet(x, y, family="gaussian", alpha=0, lambda=0.001)
# summarize the fit
summary(fit)
##           Length Class     Mode
## a0        1      -none-    numeric
## beta      6      dgCMatrix S4
## df        1      -none-    numeric
## dim       2      -none-    numeric
## lambda    1      -none-    numeric
## dev.ratio 1      -none-    numeric
## nulldev   1      -none-    numeric
## npasses   1      -none-    numeric
## jerr      1      -none-    numeric
## offset    1      -none-    logical
## call      6      -none-    call
## nobs      1      -none-    numeric
# make predictions
predictions <- predict(fit, x, type="link")
# summarize accuracy: mean squared error (MSE) on the training data
mse <- mean((y - predictions)^2)
print(mse)
## [1] 0.05919831
Elastic Net Regression
Elastic Net fits a regression model penalized with both the L1-norm and the L2-norm. This shrinks coefficients (as in ridge regression) and also sets some coefficients to zero (as in LASSO).
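How the two penalties are combined can be sketched in base R. The function below follows the penalty form described in the glmnet documentation, lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(|beta|)), so that alpha = 0 gives a pure ridge penalty and alpha = 1 a pure LASSO penalty. The function name `elastic_net_penalty` and the example coefficient vector are illustrative assumptions.

```r
# Elastic net penalty as a weighted mix of the L2 and L1 penalties.
# alpha controls the mix: 0 is ridge only, 1 is LASSO only.
elastic_net_penalty <- function(beta, lambda, alpha) {
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

# hypothetical coefficient vector
beta <- c(2, -1, 0.5)
lambda <- 0.001

ridge_only <- elastic_net_penalty(beta, lambda, alpha = 0)
lasso_only <- elastic_net_penalty(beta, lambda, alpha = 1)
mixed      <- elastic_net_penalty(beta, lambda, alpha = 0.5)

print(c(ridge = ridge_only, mixed = mixed, lasso = lasso_only))
```

With alpha = 0.5 the penalty is the average of the ridge-only and LASSO-only values.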
# load the package
library(glmnet)
# load data
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])
# fit model
fit <- glmnet(x, y, family="gaussian", alpha=0.5, lambda=0.001)
# summarize the fit
summary(fit)
##           Length Class     Mode
## a0        1      -none-    numeric
## beta      6      dgCMatrix S4
## df        1      -none-    numeric
## dim       2      -none-    numeric
## lambda    1      -none-    numeric
## dev.ratio 1      -none-    numeric
## nulldev   1      -none-    numeric
## npasses   1      -none-    numeric
## jerr      1      -none-    numeric
## offset    1      -none-    logical
## call      6      -none-    call
## nobs      1      -none-    numeric
# make predictions
predictions <- predict(fit, x, type="link")
# summarize accuracy: mean squared error (MSE) on the training data
mse <- mean((y - predictions)^2)
print(mse)
## [1] 0.0590839