STA581-07 - Elastic Net, Fused Lasso, and Group Lasso
A. Elastic Net
library(glmnet)
## Loading required package: Matrix
## Loaded glmnet 4.1-1
Data yang digunakan sebagai ilustrasi adalah data hiv
, yang bisa didownload pada link berikut link
load("D:/GDrive/02-Kuliah/STA581-Sains Data/GENAP/Bahan/hiv.rda")
class(hiv.train) # The data are stored as a list
## [1] "list"
names(hiv.train) # The names of the list elements are x and y
## [1] "x" "y"
dim(hiv.train$x) # The explanatory data consists of 704 observations of 208 binary mutation variables
## [1] 704 208
Fitting model menggunakan elastic net dapat dilakukan dengan fungsi glmnet()
. Berikut adalah contoh pemodelan elastic net dengan \(\alpha=0.2\), ini dapat pula ditentukan berdasarkan validasi silang.
<- glmnet(hiv.train$x,hiv.train$y, alpha=0.2) #Elastic Net penalty
fit plot(fit)
Berikut ini adalah contoh syntax untuk melakukan validasi silang dengan fungsi cv.glmnet()
pada berbagai nilai α.
for (i in 0:10) {
assign(paste("fit", i, sep=""), cv.glmnet(hiv.train$x,hiv.train$y, type.measure="mse", alpha=i/10))
}
Selanjutnya dapat dihitung Mean Squared Error (MSE) dari prediksi menggunakan data uji, seperti berikut ini.
<- predict(fit0, s=fit0$lambda.1se, newx=hiv.test$x)
yhat0 <- predict(fit1, s=fit1$lambda.1se, newx=hiv.test$x)
yhat1 <- predict(fit2, s=fit2$lambda.1se, newx=hiv.test$x)
yhat2 <- predict(fit3, s=fit3$lambda.1se, newx=hiv.test$x)
yhat3 <- predict(fit4, s=fit4$lambda.1se, newx=hiv.test$x)
yhat4 <- predict(fit5, s=fit5$lambda.1se, newx=hiv.test$x)
yhat5 <- predict(fit6, s=fit6$lambda.1se, newx=hiv.test$x)
yhat6 <- predict(fit7, s=fit7$lambda.1se, newx=hiv.test$x)
yhat7 <- predict(fit8, s=fit8$lambda.1se, newx=hiv.test$x)
yhat8 <- predict(fit9, s=fit9$lambda.1se, newx=hiv.test$x)
yhat9 <- predict(fit10, s=fit10$lambda.1se, newx=hiv.test$x)
yhat10
<- mean((hiv.test$y - yhat0)^2)
mse0 <- mean((hiv.test$y - yhat1)^2)
mse1 <- mean((hiv.test$y - yhat2)^2)
mse2 <- mean((hiv.test$y - yhat3)^2)
mse3 <- mean((hiv.test$y - yhat4)^2)
mse4 <- mean((hiv.test$y - yhat5)^2)
mse5 <- mean((hiv.test$y - yhat6)^2)
mse6 <- mean((hiv.test$y - yhat7)^2)
mse7 <- mean((hiv.test$y - yhat8)^2)
mse8 <- mean((hiv.test$y - yhat9)^2)
mse9 <- mean((hiv.test$y - yhat10)^2) mse10
<-seq(0,1,by=0.1)
alpha<-c(mse0,mse1,mse2,mse3,mse4,mse5,mse6,mse7,mse8,mse9,mse10)
msecbind(alpha,mse)
## alpha mse
## [1,] 0.0 0.10400213
## [2,] 0.1 0.07609390
## [3,] 0.2 0.07182980
## [4,] 0.3 0.06966223
## [5,] 0.4 0.06902739
## [6,] 0.5 0.06862972
## [7,] 0.6 0.06812711
## [8,] 0.7 0.06896511
## [9,] 0.8 0.06951287
## [10,] 0.9 0.07124925
## [11,] 1.0 0.06742550
Ternyata MSE terkecil diperoleh ketika digunakan nilai α=0.6. Artinya, berdasarkan kriteria tersebut, pendekatan Elastic Net lebih baik dibandingkan dengan lasso dan ridge.
Prediksi dapat dilakukan dengan predict()
. Berikut ini juga diperlihatkan plot antara hasil cross validation dan prediksi.
<- glmnet(hiv.train$x,hiv.train$y, alpha=0.9) #Elastic Net penalty
fit <- cv.glmnet(hiv.train$x,hiv.train$y, alpha=0.9) #10-fold cross-validation
cv.fit plot(cv.fit)
<- predict(fit, hiv.test$x) #predict the test data
pred.y <- apply((pred.y - hiv.test$y)^2,2,mean)
mean.test.error lines(log(fit$lambda), mean.test.error, col="blue",pch="*")
legend("topleft",legend=c("10-fold Cross Validation","Test HIV Data"),pch="*",
col=c("red","blue"), cex=0.75)
B. Group Lasso
Sebagai ilustrasi, kita akan menggunakan data bardet
, yang merupakan ekspresi gen (20 gen dari 120 sampel). Peubah respon (y) adalah level expression dari gen TRIM32, yang menyebabkan sindrom Bardet-Biedl. Data diperoleh dari Scheetz et al. (2006).
<-read.csv("https://raw.githubusercontent.com/raoy/data/master/bardet2")
bardetdim(bardet)
## [1] 120 101
Terdapat beberapa package yang dapat digunakan untuk melakukan pemodelan dengan regularisasi group lasso, di antaranya adalah grpreg
, grplasso
, oem
. Selanjutnya package yang akan digunakan adalah package grpreg
. Sebelum melakukan pemodelan, perlu ditentukan terlebih dahulu pengelompokkan peubah penjelas yang akan digunakan.
library(grpreg)
# define group index
<- rep(1:20,each=5)
group1 group1
## [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5
## [26] 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 10 10 10 10 10
## [51] 11 11 11 11 11 12 12 12 12 12 13 13 13 13 13 14 14 14 14 14 15 15 15 15 15
## [76] 16 16 16 16 16 17 17 17 17 17 18 18 18 18 18 19 19 19 19 19 20 20 20 20 20
Setelah itu, kita dapat menggunakan fungsi grpreg untuk melakukan pemodelan dengan pendekatan group lasso.
<- grpreg(bardet[,-1], bardet$Y,group1,penalty="grLasso")
fit plot(fit)
Warna yang berbeda menunjukkan group yang berbeda pula. Untuk mengetahui lebih spesifik nilai dari penduga koefisien, dapat digunakan fungsi coef(). Misalnya kita mengetahui nilai penduga koefisien model pada saat lambda=0.03.
coef(fit, lambda=0.03)
## (Intercept) X1 X2 X3 X4
## 8.1438890477 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X5 X6 X7 X8 X9
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X10 X11 X12 X13 X14
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X15 X16 X17 X18 X19
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X20 X21 X22 X23 X24
## 0.0000000000 -0.0184176366 -0.0096718123 -0.0707926448 -0.0066209284
## X25 X26 X27 X28 X29
## -0.1819705554 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X30 X31 X32 X33 X34
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X35 X36 X37 X38 X39
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X40 X41 X42 X43 X44
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X45 X46 X47 X48 X49
## 0.0000000000 0.0203444929 0.0742885085 0.0709400111 0.0848818199
## X50 X51 X52 X53 X54
## 0.0718342645 0.1102619926 0.0686899478 0.1064836267 0.1173297411
## X55 X56 X57 X58 X59
## 0.1002532155 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X60 X61 X62 X63 X64
## 0.0000000000 0.1392457931 0.0998493873 0.1169540317 0.1252844903
## X65 X66 X67 X68 X69
## 0.1640337431 0.0004270203 0.0015217266 0.0015146601 0.0018883848
## X70 X71 X72 X73 X74
## 0.0014696660 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X75 X76 X77 X78 X79
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X80 X81 X82 X83 X84
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X85 X86 X87 X88 X89
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X90 X91 X92 X93 X94
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X95 X96 X97 X98 X99
## 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## X100
## 0.0000000000
Perhatikan pada output terlihat cukup banyak peubah yang memiliki koefisien nol. Cobalah jika Anda periksa seperti apa koefisien model jika lambda bernilai 0.01?
Umumnya, kita dapat menggunakan cross validation untuk menentukan parameter yang optimal.
<-cv.grpreg(bardet[,-1], bardet$Y,group1,penalty="grLasso")
cvfitplot(cvfit)
summary(cvfit)
## grLasso-penalized linear regression with n=120, p=100
## At minimum cross-validation error (lambda=0.0095):
## -------------------------------------------------
## Nonzero coefficients: 45
## Nonzero groups: 9
## Cross-validation error of 0.01
## Maximum R-squared: 0.61
## Maximum signal-to-noise ratio: 1.54
## Scale estimate (sigma) at lambda.min: 0.090
Koefisien pada saat lambda meminimumkan cross-validation error dapat diperoleh dengan fungsi coef().
coef(cvfit)
## (Intercept) X1 X2 X3 X4 X5
## 7.879572508 -0.005005817 -0.008519942 -0.000718317 -0.018101787 -0.018872884
## X6 X7 X8 X9 X10 X11
## 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## X12 X13 X14 X15 X16 X17
## 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## X18 X19 X20 X21 X22 X23
## 0.000000000 0.000000000 0.000000000 -0.017196640 0.002960264 -0.057029136
## X24 X25 X26 X27 X28 X29
## -0.017210694 -0.114548736 0.000000000 0.000000000 0.000000000 0.000000000
## X30 X31 X32 X33 X34 X35
## 0.000000000 -0.006342348 0.034353982 0.027131325 0.040785231 0.028409820
## X36 X37 X38 X39 X40 X41
## 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## X42 X43 X44 X45 X46 X47
## 0.000000000 0.000000000 0.000000000 0.000000000 -0.054924666 0.123656714
## X48 X49 X50 X51 X52 X53
## 0.101302289 0.120669993 0.098134405 0.139935286 0.142638763 0.208805634
## X54 X55 X56 X57 X58 X59
## 0.262704889 0.174710112 0.000000000 0.000000000 0.000000000 0.000000000
## X60 X61 X62 X63 X64 X65
## 0.000000000 0.288652483 0.245996772 0.260940338 0.291623201 0.430000541
## X66 X67 X68 X69 X70 X71
## 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.013354869
## X72 X73 X74 X75 X76 X77
## 0.004054467 0.013770985 0.006908622 0.012286604 0.000000000 0.000000000
## X78 X79 X80 X81 X82 X83
## 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## X84 X85 X86 X87 X88 X89
## 0.000000000 0.000000000 -0.043718871 -0.027069955 -0.083466946 0.116490444
## X90 X91 X92 X93 X94 X95
## -0.097636203 -0.007785767 0.011527997 -0.092468176 0.175382021 -0.114007348
## X96 X97 X98 X99 X100
## 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
Perhatikan pula pada output di atas bahwa koefisien yang tidak nol dan yang nol sesuai dengan group yang telah ditentukan sebelumnya. Artinya, jika koefisien suatu peubah bernilai nol, maka ini akan berlaku untuk satu group peubah penjelas.
$lambda.min cvfit
## [1] 0.009478626
Berikut ini adalah ilustrasi untuk melakukan prediksi menggunakan model yang telah dibangun sebelumnya.
predict(fit, as.matrix(bardet[1,-1]), type="response", lambda=cvfit$lambda.min)
## [1] 8.374372
C. Fused Lasso
Terdapat beberapa package yang dapat digunakan untuk pemodelan dengan fused lasso, beberapa diantaranya adalah package genlasso
, lqa
, HDPenReg
, extlasso
, dan SQOPT
. Pada ilustrasi yang digunakan saat ini akan digunakan package extlasso
.
library(extlasso)
Sebagai ilustrasi, akan digunakan data simulasi berukuran 50 x 100.
<-read.csv("https://raw.githubusercontent.com/raoy/data/master/sim.csv")
simdim(sim)
## [1] 50 101
Pemodelan menggunakan fused lasso dapat dilakukan dengan fungsi fusedlasso
.
<-fusedlasso(as.matrix(sim[,-1]), as.matrix(sim[,1]),lambda1=0.1,lambda2=1)
f1coef(f1)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 0.997937 0.8803637 1.052084 1.01772 1.339677 1.127577 1.027618 1.053068
## [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
## [1,] 1.019508 0.9932071 1.838558 1.896914 1.912778 1.874429 1.901119 0.9001264
## [,17] [,18] [,19] [,20] [,21] [,22] [,23]
## [1,] 0.8474181 0.9143455 0.7735903 0.8006348 0.01230639 0.01441455 0.01234222
## [,24] [,25] [,26] [,27] [,28] [,29]
## [1,] 0.01384604 2.61722 5.585786e-05 -7.911689e-06 -2.990406e-05 -3.559699e-05
## [,30] [,31] [,32] [,33] [,34]
## [1,] -4.434609e-05 -0.0001171155 -0.0001556465 -8.223635e-05 0.0001260992
## [,35] [,36] [,37] [,38] [,39] [,40]
## [1,] 0.0001211622 0.0003083896 0.000341257 0.0298891 0.03309726 0.02766942
## [,41] [,42] [,43] [,44] [,45] [,46] [,47]
## [1,] 0.9050983 0.9150022 0.9788225 1.001846 0.931141 -0.004743413 -0.005370364
## [,48] [,49] [,50] [,51] [,52] [,53]
## [1,] -0.03002758 -0.04784368 -0.05129934 -0.05684536 -0.05441156 0.006660317
## [,54] [,55] [,56] [,57] [,58] [,59]
## [1,] 0.007487174 0.007604095 0.007902436 0.05840701 0.04802768 0.05120171
## [,60] [,61] [,62] [,63] [,64] [,65]
## [1,] 0.05329837 0.05054208 0.05001177 0.05686708 0.05702977 -0.0356354
## [,66] [,67] [,68] [,69] [,70]
## [1,] -0.02998434 -0.03134273 -0.03285301 -0.03779224 -0.0006809816
## [,71] [,72] [,73] [,74] [,75]
## [1,] -0.0002931374 -0.0003214001 -0.000201296 -0.0001023558 -2.648721e-05
## [,76] [,77] [,78] [,79] [,80]
## [1,] -2.767988e-05 -2.201799e-05 -0.0001167154 -0.0005135863 -0.0007512672
## [,81] [,82] [,83] [,84] [,85]
## [1,] -0.0005629916 -0.0005212105 -0.000427499 -0.0003293806 -0.000254109
## [,86] [,87] [,88] [,89] [,90]
## [1,] -0.0002564317 -0.0003231959 -0.0003530182 -0.0003531786 -0.0003155528
## [,91] [,92] [,93] [,94] [,95]
## [1,] -0.0001762505 -4.664911e-05 0.0001587278 0.0001215214 0.0002798337
## [,96] [,97] [,98] [,99] [,100]
## [1,] 0.02666538 0.03329582 0.02483119 0.02758941 0.03957153
## attr(,"scaled:scale")
## X1 X2 X3 X4 X5 X6 X7 X8
## 6.939902 7.866348 6.581477 6.803221 5.167865 6.140369 6.737816 6.575499
## X9 X10 X11 X12 X13 X14 X15 X16
## 6.793027 6.973019 7.477864 7.274640 7.214235 6.415380 6.325191 6.414886
## X17 X18 X19 X20 X21 X22 X23 X24
## 6.813532 6.314449 6.494148 6.273874 7.672459 6.523397 7.620340 6.789309
## X25 X26 X27 X28 X29 X30 X31 X32
## 8.161741 7.834052 6.530196 6.736080 7.299615 7.362834 6.807923 6.827046
## X33 X34 X35 X36 X37 X38 X39 X40
## 6.390859 6.473684 6.889220 7.322770 8.037910 7.068715 6.402240 7.704442
## X41 X42 X43 X44 X45 X46 X47 X48
## 6.218902 6.607825 6.894281 6.735905 7.246899 7.656877 6.835065 6.335380
## X49 X50 X51 X52 X53 X54 X55 X56
## 7.665685 7.150558 6.440146 6.716975 6.822126 6.399245 6.710955 6.603881
## X57 X58 X59 X60 X61 X62 X63 X64
## 5.857500 7.171870 6.753479 6.486864 6.847070 6.916320 6.084934 6.059492
## X65 X66 X67 X68 X69 X70 X71 X72
## 6.717182 7.996874 7.637940 7.240682 6.264713 5.661580 8.033317 6.724174
## X73 X74 X75 X76 X77 X78 X79 X80
## 7.334116 6.455271 6.946565 7.253525 6.664404 6.011731 6.884126 5.599166
## X81 X82 X83 X84 X85 X86 X87 X88
## 7.521062 6.636285 6.991143 6.718125 7.692717 7.080031 6.270202 7.223226
## X89 X90 X91 X92 X93 X94 X95 X96
## 7.100539 8.695310 7.671459 7.179124 7.224380 7.051190 7.455627 7.737688
## X97 X98 X99 X100
## 6.195473 8.273287 7.441367 5.179346
str(f1)
## List of 7
## $ beta0 : num [1, 1] -0.0149
## $ coef : num [1, 1:100] 0.998 0.88 1.052 1.018 1.34 ...
## ..- attr(*, "scaled:scale")= Named num [1:100] 6.94 7.87 6.58 6.8 5.17 ...
## .. ..- attr(*, "names")= chr [1:100] "X1" "X2" "X3" "X4" ...
## $ lambda1 : num 0.1
## $ lambda2 : num 1
## $ L1norm : num 32.7
## $ lambda.iter: int 15572
## $ of.value : num 109
## - attr(*, "class")= chr "extlasso"
Penjelasan selengkapnya tentang pendekatan fused lasso dapat diperoleh pada Tibshirani et al.(2005).
Refences
Breheny P. (n.d.). Getting started with grpreg. GitHub Pages. https://pbreheny.github.io/grpreg/articles/getting-started.html
Hastie T. (2013, May 9). glmnet: Lasso and elastic-net regularization in R. Revolutions. https://blog.revolutionanalytics.com/2013/05/hastie-glmnet.html
Post J. (2014, September 29). LASSO, Ridge, and Elastic Net. https://www4.stat.ncsu.edu/~post/josh/LASSO_Ridge_Elastic_Net_-_Examples.html
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(1), 91-108. https://doi.org/10.1111/j.1467-9868.2005.00490.x
Tripathy A. (2013, July 14). Regularization – Predictive Modeling Beyond Ordinary Least Squares Fit. ShatterLine Blog. https://shatterline.com/blog/2013/07/
Xiaotong C., Chen G., & Chong W. (n.d.). Statistical Learning and Data Mining Codes. Biostatistics - Academic Divisions - School of Public Health - University of Minnesota. https://www.biostat.umn.edu/~weip/course/dm/examples/
IPB University, rahmaanisa@apps.ipb.ac.id︎↩︎
IPB University, gdito@apps.ipb.ac.id↩︎
Badan Informasi Geospasial, abdul.aziz@big.go.id↩︎