Ridge Regression Method

Ridge regression is a way of fitting a linear model when the number of predictors exceeds the number of observations or when the predictors are highly correlated with each other. It adds a penalty on the size of the coefficients, so the coefficients of less useful predictors are shrunk toward zero, which helps avoid overfitting. The ridge estimator is therefore a shrinkage estimator: it shrinks the parameter estimates.
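To make the shrinkage idea concrete, here is a minimal sketch of the textbook closed-form ridge estimator on simulated data. The data, the helper name ridge_coef and the lambda values are my own assumptions for illustration; this is not what caret runs internally, and its lambda scale need not match.

```r
# Closed-form ridge estimator on simulated, nearly collinear predictors.
# Everything here (data, ridge_coef, lambda values) is illustrative only.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.05)      # x2 is almost a copy of x1
X  <- scale(cbind(x1, x2))          # centre and scale the predictors
y  <- x1 + rnorm(n)
y  <- y - mean(y)                   # centre the response, so no intercept is needed

ridge_coef <- function(X, y, lambda) {
  # beta_hat = (X'X + lambda * I)^(-1) X'y
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}

lambdas <- c(0, 1, 10, 100)
coefs <- sapply(lambdas, function(l) ridge_coef(X, y, l))
rownames(coefs) <- colnames(X)
colnames(coefs) <- paste0("lambda=", lambdas)

# Euclidean length of the coefficient vector for each lambda
norms <- sqrt(colSums(coefs^2))
print(round(coefs, 3))
print(round(norms, 3))
```

The length of the coefficient vector gets smaller as lambda grows, which is the shrinkage described above.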

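In the following example, I prepare the permeability dataset, split it into training and test sets, fit a ridge regression model on the training data, and then evaluate it on the test data. In the run shown below, tuning on Rsquared selected lambda = 1, with a cross-validated RMSE of 16.21 and Rsquared of 0.54 on the training data, and an RMSE of 18.78 on the test data. The lowest cross-validated RMSE, 12.24, occurs at lambda = 0.2. X245, X244 and X253 are among the eight predictors tied for the highest importance score. Because the split is random, these numbers change from run to run.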
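The tuning step can be pictured as a plain k-fold cross-validation over a grid of lambda values. The sketch below uses base R and simulated data only; the data, the grid, the fold count and the helper ridge_coef are assumptions of mine, and the lambda scale is not the one used by caret's 'ridge' method.

```r
# k-fold cross-validation over a lambda grid, base R only (illustrative).
set.seed(42)
n <- 60
p <- 8
X <- matrix(rnorm(n * p), n, p)
y <- drop(X[, 1:2] %*% c(2, -1)) + rnorm(n)   # only two predictors matter

ridge_coef <- function(X, y, lambda) {
  # textbook closed form: (X'X + lambda * I)^(-1) X'y
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}

k       <- 5
fold    <- sample(rep(seq_len(k), length.out = n))   # random fold labels
lambdas <- c(0.01, 0.1, 1, 10, 100)

cv_rmse <- sapply(lambdas, function(l) {
  fold_rmse <- sapply(seq_len(k), function(f) {
    tr  <- fold != f
    # centre and scale using statistics from the training folds only
    mu  <- colMeans(X[tr, ])
    sdv <- apply(X[tr, ], 2, sd)
    Xtr <- scale(X[tr, ],  center = mu, scale = sdv)
    Xte <- scale(X[!tr, ], center = mu, scale = sdv)
    ybar <- mean(y[tr])
    beta <- ridge_coef(Xtr, y[tr] - ybar, l)
    pred <- ybar + drop(Xte %*% beta)
    sqrt(mean((y[!tr] - pred)^2))                    # RMSE on the held-out fold
  })
  mean(fold_rmse)                                    # average over the k folds
})

best_lambda <- lambdas[which.min(cv_rmse)]
print(data.frame(lambda = lambdas, RMSE = cv_rmse))
```

Here the winner is picked by the smallest RMSE; choosing by the largest Rsquared instead can select a different lambda.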

library(AppliedPredictiveModeling)
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
data(permeability)

# Preparation
low_frequency <- nearZeroVar(fingerprints) # indices of near-zero-variance (low-frequency) predictors
X <- fingerprints[,-low_frequency] # remove those predictors
# report the counts from the data instead of hard-coding them
print(paste0(ncol(X), " columns are left after removing ", length(low_frequency), " columns using nearZeroVar function"))
## [1] "388 columns are left after removing 719 columns using nearZeroVar function"
# Splitting the data into training and test
splitt <- createDataPartition(permeability, p=0.8, list=FALSE)

# Training
X_train <- X[splitt, ]
y_train <- permeability[splitt, ]

# Test
X_test <- X[-splitt, ]
y_test <- permeability[-splitt, ]

# Ridge Method Fit
ridge_fit <- train(X_train, y_train, method='ridge', metric='Rsquared',
                   tuneGrid = data.frame(.lambda= seq(0,1, by=0.1)),
                   trControl = trainControl(method = 'cv'), preProcess = c('center','scale'))
## Warning: model fit failed for Fold06: lambda=0.0 Error in if (zmin < gamhat) { : missing value where TRUE/FALSE needed
## Warning: model fit failed for Fold09: lambda=0.0 Error in if (zmin < gamhat) { : missing value where TRUE/FALSE needed
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
## There were missing values in resampled performance measures.
ridge_fit
## Ridge Regression 
## 
## 133 samples
## 388 predictors
## 
## Pre-processing: centered (388), scaled (388) 
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 120, 119, 120, 119, 120, 120, ... 
## Resampling results across tuning parameters:
## 
##   lambda  RMSE          Rsquared   MAE         
##   0.0     2.719855e+15  0.2900293  7.543520e+14
##   0.1     1.239681e+01  0.4812332  9.491152e+00
##   0.2     1.223696e+01  0.5061374  9.436151e+00
##   0.3     1.241176e+01  0.5182994  9.646498e+00
##   0.4     1.275267e+01  0.5255242  9.947672e+00
##   0.5     1.317413e+01  0.5307187  1.029576e+01
##   0.6     1.368476e+01  0.5343161  1.067867e+01
##   0.7     1.425292e+01  0.5369683  1.109728e+01
##   0.8     1.486969e+01  0.5389557  1.155213e+01
##   0.9     1.552540e+01  0.5404674  1.209416e+01
##   1.0     1.621485e+01  0.5416396  1.269090e+01
## 
## Rsquared was used to select the optimal model using the largest value.
## The final value used for the model was lambda = 1.
# Plot
plot(ridge_fit)

# important variables
varImp(ridge_fit)
## loess r-squared variable importance
## 
##   only 20 most important variables shown (out of 388)
## 
##      Overall
## X254  100.00
## X253  100.00
## X240  100.00
## X239  100.00
## X246  100.00
## X244  100.00
## X157  100.00
## X245  100.00
## X129   76.47
## X266   66.08
## X262   66.08
## X260   63.21
## X265   63.21
## X255   61.69
## X247   61.69
## X235   55.56
## X372   55.30
## X373   55.30
## X138   53.79
## X133   53.79
# Checking the accuracy on test dataset
postResample(predict(ridge_fit, X_test), obs=y_test)
##       RMSE   Rsquared        MAE 
## 18.7840977  0.4729294 15.4331342
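
For reference, the three numbers above can be reproduced by hand. The sketch below is my own helper (the name metrics and the toy vectors are assumptions), written under the assumption that Rsquared is the squared correlation between predictions and observations, which is caret's default as far as I know.

```r
# Hand-rolled versions of RMSE, Rsquared and MAE (illustrative helper).
metrics <- function(pred, obs) {
  c(RMSE     = sqrt(mean((pred - obs)^2)),   # root mean squared error
    Rsquared = cor(pred, obs)^2,             # squared correlation (assumed default)
    MAE      = mean(abs(pred - obs)))        # mean absolute error
}

# Toy vectors, not taken from the permeability data
obs  <- c(1, 2, 3, 4)
pred <- c(1.5, 1.5, 3.5, 3.5)
m <- metrics(pred, obs)
print(m)
```

With the fitted model, metrics(predict(ridge_fit, X_test), y_test) should give values comparable to the postResample call.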