R - Regression with Penalties

LASSO Regression
Ridge Regression
Elastic Net Regression

LASSO Regression

Least Absolute Shrinkage and Selection Operator (LASSO) creates a regression model that is penalized with the L1-norm, which is the sum of the absolute values of the coefficients. This shrinks the coefficient values (and the complexity of the model), allowing coefficients with a minor effect on the response to become exactly zero.
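
To see why an L1 penalty can set coefficients exactly to zero, consider the special case of orthonormal predictors, where the LASSO solution reduces to soft-thresholding the least-squares coefficients. The sketch below is illustrative only: the function name `soft_threshold`, the example coefficients, and the threshold value are assumptions and are not part of the lars package.

```r
# Soft-thresholding: shrink each coefficient toward zero by lambda and
# set it to exactly zero when its magnitude is at or below lambda.
soft_threshold <- function(beta, lambda) {
  sign(beta) * pmax(abs(beta) - lambda, 0)
}

# hypothetical least-squares coefficients
beta_ols <- c(3.0, -0.5, 1.2)

# apply a threshold of 1
beta_lasso <- soft_threshold(beta_ols, lambda = 1)
print(beta_lasso)
```

The second coefficient, whose magnitude is below the threshold, becomes zero, while the other two are pulled toward zero by the same amount.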

# load the package
library(lars)

## Loaded lars 1.2

# load data
data(longley)
x <- as.matrix(longley[,1:6])
y <- as.matrix(longley[,7])

# fit model
fit <- lars(x, y, type="lasso")

# summarize the fit
summary(fit)

## LARS/LASSO
## Call: lars(x = x, y = y, type = "lasso")
##    Df     Rss        Cp
## 0   1 185.009 1976.7120
## 1   2   6.642   59.4712
## 2   3   3.883   31.7832
## 3   4   3.468   29.3165
## 4   5   1.563   10.8183
## 5   4   1.339    6.4068
## 6   5   1.024    5.0186
## 7   6   0.998    6.7388
## 8   7   0.907    7.7615
## 9   6   0.847    5.1128
## 10  7   0.836    7.0000

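# select the step with the minimum Mallows' Cp
# (training RSS never increases along the path, so its minimum is just the last step;
# in its default mode="step", predict() reads s as a position along the path,
# counting the null model as 1)
best_step <- unname(which.min(fit$Cp))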

# make predictions
predictions <- predict(fit, x, s=best_step, type="fit")$fit

# summarize accuracy (mean squared error on the training data)
mse <- mean((y - predictions)^2)
print(mse)

## [1] 0.06400169
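
The quantity `mean((y - predictions)^2)` is the mean squared error (MSE). The root mean squared error (RMSE) is its square root and is expressed in the units of the response. A minimal sketch of the difference, using hypothetical helper names and made-up vectors rather than the Longley fit:

```r
# mean squared error: average of the squared residuals
mean_sq_error <- function(actual, predicted) {
  mean((actual - predicted)^2)
}

# root mean squared error: square root of the MSE
root_mean_sq_error <- function(actual, predicted) {
  sqrt(mean_sq_error(actual, predicted))
}

# made-up values for illustration
actual <- c(1, 2, 3, 4)
predicted <- c(1.5, 2, 2.5, 4)

mse_value <- mean_sq_error(actual, predicted)
rmse_value <- root_mean_sq_error(actual, predicted)
print(c(mse = mse_value, rmse = rmse_value))
```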

Ridge Regression

Ridge Regression creates a linear regression model that is penalized with the squared L2-norm, which is the sum of the squared coefficients. This shrinks the coefficient values (and the complexity of the model), pushing coefficients that contribute little to the response close to zero without setting them exactly to zero.
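
The shrinkage can be seen directly from the closed-form ridge estimate, (X'X + lambda * I)^-1 X'y. The sketch below uses base R only and is an illustration under stated assumptions: the helper name `ridge_coef` and the lambda grid are made up, the predictors are standardized and the response centered so no intercept is needed, and `lambda` here is not on the same scale as the `lambda` argument of glmnet.

```r
# load data (longley ships with R's datasets package)
data(longley)
x <- scale(as.matrix(longley[, 1:6]))   # standardize the predictors
y <- longley[, 7] - mean(longley[, 7])  # center the response

# closed-form ridge estimate: (X'X + lambda * I)^-1 X'y
ridge_coef <- function(x, y, lambda) {
  p <- ncol(x)
  drop(solve(crossprod(x) + lambda * diag(p), crossprod(x, y)))
}

# the size of the coefficient vector shrinks as the penalty grows
lambdas <- c(0.1, 1, 10, 100)
norms <- sapply(lambdas, function(l) sqrt(sum(ridge_coef(x, y, l)^2)))
print(data.frame(lambda = lambdas, l2_norm = norms))
```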

# load the package
library(glmnet)

## Loading required package: Matrix

## Loading required package: foreach

## Loaded glmnet 2.0-5

# load data
data(longley)

x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])

# fit model
fit <- glmnet(x, y, family="gaussian", alpha=0, lambda=0.001)

# summarize the fit
summary(fit)

##           Length Class     Mode   
## a0        1      -none-    numeric
## beta      6      dgCMatrix S4     
## df        1      -none-    numeric
## dim       2      -none-    numeric
## lambda    1      -none-    numeric
## dev.ratio 1      -none-    numeric
## nulldev   1      -none-    numeric
## npasses   1      -none-    numeric
## jerr      1      -none-    numeric
## offset    1      -none-    logical
## call      6      -none-    call   
## nobs      1      -none-    numeric

# make predictions
predictions <- predict(fit, x, type="link")

# summarize accuracy (mean squared error on the training data)
mse <- mean((y - predictions)^2)
print(mse)

## [1] 0.05919831
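
Because `summary()` on a glmnet object only lists its components, the fitted coefficients are easier to read with `coef()`. The following sketch repeats the ridge fit from this section and prints the intercept and slopes; the variable name `coefs` is an assumption for illustration.

```r
# load the package
library(glmnet)

# load data
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])

# same ridge fit as in this section
fit <- glmnet(x, y, family = "gaussian", alpha = 0, lambda = 0.001)

# intercept and slopes as a sparse one-column matrix
coefs <- coef(fit)
print(coefs)

# number of nonzero slopes at this lambda
print(fit$df)
```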

Elastic Net Regression

Elastic Net creates a regression model that is penalized with both the L1-norm and the L2-norm. This shrinks coefficients (as in ridge regression) and sets some coefficients to zero (as in LASSO).
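
In glmnet the two penalties are combined through the mixing parameter `alpha`: the penalty is lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))), so `alpha=0` gives ridge and `alpha=1` gives LASSO. The sketch below just evaluates that expression; the function name `enet_penalty` and the coefficient vector are hypothetical.

```r
# glmnet-style elastic net penalty for a coefficient vector
enet_penalty <- function(beta, lambda, alpha) {
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

# hypothetical coefficients
beta <- c(2, -1, 0)

# pure ridge term, pure LASSO term, and an equal mix
ridge_pen <- enet_penalty(beta, lambda = 1, alpha = 0)
lasso_pen <- enet_penalty(beta, lambda = 1, alpha = 1)
mixed_pen <- enet_penalty(beta, lambda = 1, alpha = 0.5)
print(c(ridge = ridge_pen, lasso = lasso_pen, mixed = mixed_pen))
```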

# load the package
library(glmnet)

# load data
data(longley)

x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])

# fit model
fit <- glmnet(x, y, family="gaussian", alpha=0.5, lambda=0.001)

# summarize the fit
summary(fit)

##           Length Class     Mode   
## a0        1      -none-    numeric
## beta      6      dgCMatrix S4     
## df        1      -none-    numeric
## dim       2      -none-    numeric
## lambda    1      -none-    numeric
## dev.ratio 1      -none-    numeric
## nulldev   1      -none-    numeric
## npasses   1      -none-    numeric
## jerr      1      -none-    numeric
## offset    1      -none-    logical
## call      6      -none-    call   
## nobs      1      -none-    numeric

# make predictions
predictions <- predict(fit, x, type="link")

# summarize accuracy (mean squared error on the training data)
mse <- mean((y - predictions)^2)
print(mse)

## [1] 0.0590839
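
The fits above use a fixed `lambda=0.001` and measure error on the training data. One common alternative is to let `cv.glmnet()` choose lambda by cross-validation. The sketch below is an assumption-laden illustration rather than part of the original analysis: the seed, the choice of 4 folds, and the variable names are made up, and with only 16 observations the cross-validated error will be noisy.

```r
# load the package
library(glmnet)

# load data
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])

# fix the random fold assignment so the run is repeatable
set.seed(1)

# 4-fold cross-validation over glmnet's default lambda sequence
cv_fit <- cv.glmnet(x, y, family = "gaussian", alpha = 0.5, nfolds = 4)

# lambda with the lowest cross-validated error
print(cv_fit$lambda.min)

# predictions at that lambda
cv_predictions <- predict(cv_fit, newx = x, s = "lambda.min")
```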


MKumar

June 27, 2016
