Building a Random Forest Model with mlr3
The mlr3 package in R is a comprehensive and flexible framework for
building and evaluating machine learning models. Unlike classic
approaches in R such as caret, or calling model functions directly such
as randomForest(), mlr3 offers a modular and consistent structure for
the entire machine learning pipeline, from data selection and
preprocessing through training and evaluation to hyperparameter
tuning.
In this article we focus on how to use mlr3 to build a machine
learning model. Random Forest is used as the illustration, but in
principle the same approach can easily be applied to other models.
Installation and Setup
Before starting, make sure you have up-to-date versions of R and
RStudio. Install the mlr3 package together with several important
extensions that enrich the model-building process. Not all of these
packages are used in this article, but readers can of course explore
them on their own.
The packages are:
mlr3: the core package of the mlr3 ecosystem, providing the basic structures for tasks, learners, resampling, and evaluation.
mlr3learners: a collection of popular learners (machine learning algorithms) ready to use, including Random Forest (classif.ranger).
mlr3verse: a bundle of packages commonly used together with mlr3, such as mlr3pipelines, mlr3tuning, mlr3filters, and others, which makes it easier to manage and use them together.
mlr3viz: visualization functions for evaluating model performance, including ROC curves, confusion matrices, and more.
data.table: efficient data manipulation. It is very fast and works well with mlr3, especially for loading and preprocessing data.
mlr3tuning: the module for systematic hyperparameter tuning with methods such as grid search, random search, and Bayesian optimization.
mlr3pipelines: enables complex preprocessing and modeling pipelines, including operations such as normalization, feature selection, and model stacking.
paradox: defines hyperparameter search spaces (parameter spaces) as objects that can be used by mlr3tuning.
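For reference, all of these packages are available on CRAN, so a single install.packages() call is enough; the sketch below also includes ranger, since the Random Forest learner used later is backed by that package.
# Install the mlr3 ecosystem packages used or mentioned in this article
install.packages(c(
  "mlr3", "mlr3learners", "mlr3verse", "mlr3viz", "data.table",
  "mlr3tuning", "mlr3pipelines", "paradox",
  "ranger"  # backend for the Random Forest learner (classif.ranger)
))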
Basic Structure of mlr3
The mlr3 ecosystem is built around three core components:
Task: represents the dataset and the goal of the analysis. In supervised learning, a task contains the target variable and the predictor features.
Learner: an object representing a specific machine learning algorithm. A learner has parameters and methods for training and prediction.
Resampling: a method for validating the model, covering data splits such as cross-validation, holdout, bootstrap, and others.
In addition, several other components are also important:
Measure: evaluation metrics such as accuracy, AUC, log loss, and so on.
Tuning: the mechanism for searching for optimal hyperparameters.
Pipeline: for preprocessing and complex predictive workflows.
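As a quick illustration of these components (a minimal sketch using a built-in toy task, independent of the heart-failure example below), each one can be created with a short sugar function:
library(mlr3)
# Task: a built-in binary classification dataset
task_demo <- tsk("sonar")
# Learner: a decision tree (rpart) learner shipped with mlr3
learner_demo <- lrn("classif.rpart")
# Resampling: 3-fold cross-validation
resampling_demo <- rsmp("cv", folds = 3)
# Measure: classification accuracy
measure_demo <- msr("classif.acc")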
Preparing the Data and the Task
The first step in mlr3 is to create a TaskClassif object that
represents the classification goal. Make sure the classification
target is of type factor. The data used here is the Heart Failure
dataset, which can be downloaded from the following page: Heart
Failure Prediction Dataset.
library(mlr3)
library(mlr3verse)
# loading dataset
data_heart <- read.csv('https://raw.githubusercontent.com/sainsdataid/dataset/main/heart.csv')
# convert the target variable to a factor
data_heart$HeartDisease <- as.factor(data_heart$HeartDisease)
str(data_heart)
# Create the classification task
task <- TaskClassif$new(id = "heart", backend = data_heart, target = "HeartDisease")
task
##
## ── <TaskClassif> (918x12) ──────────────────────────────────────────────────────
## • Target: HeartDisease
## • Target classes: 0 (positive class, 45%), 1 (55%)
## • Properties: twoclass
## • Features (11):
## • int (5): Age, Cholesterol, FastingBS, MaxHR, RestingBP
## • chr (5): ChestPainType, ExerciseAngina, RestingECG, ST_Slope, Sex
## • dbl (1): Oldpeak
Choosing a Learner
The next step is to choose the algorithm, i.e. the learner. mlr3 has a
dictionary of learners that can be accessed through the mlr_learners
object (extended by packages such as mlr3learners). In this example we
will use the ranger package to build the random forest model.
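The code constructing the learner is not shown above; a minimal sketch consistent with the later steps (the AUC metric used below requires probability predictions) looks like this:
library(mlr3learners)  # registers classif.ranger in the learner dictionary
# Inspect the available learner keys
head(mlr_learners$keys())
# Random Forest learner based on ranger;
# predict_type = "prob" enables probability-based metrics such as AUC
learner <- lrn("classif.ranger", predict_type = "prob")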
Model Validation and Evaluation
Model evaluation is done by splitting the data into training and test sets, then training on the training data and evaluating on the test data.
set.seed(42)
# Split the data into train and test sets (80-20)
split <- partition(task, ratio = 0.8)
train_set <- split$train
test_set <- split$test
# Train the model on the training data
learner$train(task, row_ids = train_set)
# Predict on the test data
prediction <- learner$predict(task, row_ids = test_set)
# Evaluate with several metrics
prediction$score(list(
msr("classif.acc"),
msr("classif.auc")
))
## classif.acc classif.auc
## 0.8967391 0.9546218
In addition to the holdout (train-test) approach, we can also evaluate the model with cross-validation (CV), for example 5-fold CV. This technique is particularly useful when the data are limited and we want a more generalizable estimate of performance.
resampling <- rsmp("cv", folds = 5)
resampling$instantiate(task)
# Fit the model with cross-validation
rr <- resample(task, learner, resampling, store_models = TRUE)
## INFO [14:29:25.708] [mlr3] Applying learner 'classif.ranger' on task 'heart' (iter 1/5)
## INFO [14:29:25.918] [mlr3] Applying learner 'classif.ranger' on task 'heart' (iter 2/5)
## INFO [14:29:26.169] [mlr3] Applying learner 'classif.ranger' on task 'heart' (iter 3/5)
## INFO [14:29:26.343] [mlr3] Applying learner 'classif.ranger' on task 'heart' (iter 4/5)
## INFO [14:29:26.519] [mlr3] Applying learner 'classif.ranger' on task 'heart' (iter 5/5)
# Aggregate the scores across the folds
rr$aggregate(list(msr("classif.acc"), msr("classif.auc")))
## classif.acc classif.auc
## 0.8725410 0.9293403
Hyperparameter Tuning
In the previous section we built the random forest model with its
default settings (as provided by the ranger package). In practice, we
can of course set the hyperparameter values ourselves, or search over
them to obtain optimal model performance.
The random forest algorithm implemented in the ranger package has
several important hyperparameters that can be adjusted to optimize
model performance. Three of them are:
num.trees: the number of trees in the ensemble. More trees usually make the model more stable, but also slower.
min.node.size: the minimum number of observations required in each terminal node. Smaller values produce deeper trees and increase the risk of overfitting.
mtry: the number of features randomly sampled as split candidates at each split. This affects the diversity among trees and is a key parameter of Random Forest.
These three parameters (and the other ranger parameters) are available
and can be tuned through the classif.ranger learner in mlr3.
For tuning, mlr3 provides the tnr() function (tuner), which is the
interface to various hyperparameter search methods. Commonly used
options include:
grid_search: searches systematically by exploring combinations of values in a predefined parameter grid.
random_search: samples parameter combinations at random from the search space; suitable for large or irregular spaces.
irace, mbo, hyperband: more advanced search methods such as iterated racing, Bayesian optimization, and bandit-based search. They require additional packages, but can yield even better results, especially for more complex hyperparameter spaces.
The complete list of available tuning methods can be obtained with the
command mlr_tuners$keys().
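For instance, with mlr3tuning attached (mlr3verse attaches it automatically), the registered tuners can be listed directly:
# Show all tuner keys registered in the mlr_tuners dictionary
mlr_tuners$keys()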
library(paradox)
## Loading required package: paradox
set.seed(42)
# Define the lower and upper bounds of each hyperparameter
search_space <- ps(
num.trees = p_int(lower = 100, upper = 500),
min.node.size = p_int(lower = 1, upper = 10),
mtry = p_int(lower = 2, upper = 10)
)
# Create a tuner using grid search
# resolution = 5 -> the grid for each hyperparameter consists of 5 values
tuner <- tnr("grid_search", resolution = 5)
# Create a tuning instance that optimizes a single metric (here, accuracy)
instance <- TuningInstanceSingleCrit$new(
task = task$clone()$filter(rows = train_set), # tune on the training data only
learner = learner,
resampling = rsmp("cv", folds = 3), # 3-fold cross-validation
measure = msr("classif.acc"), # evaluation metric: accuracy
search_space = search_space,
terminator = trm("evals", n_evals = 20) # stop after 20 evaluated configurations
)
## TuningInstanceSingleCrit is deprecated. Use TuningInstanceBatchSingleCrit instead.
# Run the tuning; the log and the best configuration below are produced by this call
tuner$optimize(instance)
## INFO [14:29:27.848] [bbotk] Starting to optimize 3 parameter(s) with '<OptimizerBatchGridSearch>' and '<TerminatorEvals> [n_evals=20, k=0]'
## ... (log output for the 20 evaluated configurations omitted) ...
## INFO [14:29:35.843] [bbotk] Finished optimizing after 20 evaluation(s)
## INFO [14:29:35.844] [bbotk] Result:
## INFO [14:29:35.847] [bbotk] num.trees min.node.size mtry learner_param_vals x_domain classif.acc
## INFO [14:29:35.847] [bbotk] <int> <int> <int> <list> <list> <num>
## INFO [14:29:35.847] [bbotk] 400 1 4 <list[4]> <list[3]> 0.862362
## num.trees min.node.size mtry learner_param_vals x_domain classif.acc
## <int> <int> <int> <list> <list> <num>
## 1: 400 1 4 <list[4]> <list[3]> 0.862362
Training the Model with the Optimal Hyperparameters
The tuning process produces the parameter combination with the best performance under the given constraints. Next, we can assign it back to the learner, retrain the model on the full training set, and evaluate it on the test set.
# Apply the best parameters found by tuning
learner$param_set$values <- instance$result_learner_param_vals
# Show the optimal combination
instance$result_learner_param_vals
## $num.threads
## [1] 1
##
## $num.trees
## [1] 400
##
## $min.node.size
## [1] 1
##
## $mtry
## [1] 4
# Retrain the model on the training data with the best parameters
learner$train(task, row_ids = train_set)
# Predict on the test data
prediction <- learner$predict(task, row_ids = test_set)
# Evaluate the final model on the test data
prediction$score(list(
msr("classif.acc"),
msr("classif.auc")
))
## classif.acc classif.auc
## 0.9076087 0.9498384
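As a closing sketch (not part of the walkthrough above), the final prediction object can also be summarized as a confusion matrix, and mlr3viz, listed in the setup section, can draw the ROC curve; note that the autoplot() call additionally requires the precrec package.
# Confusion matrix of the final model on the test set
prediction$confusion
# ROC curve of the final model (requires mlr3viz and precrec)
library(mlr3viz)
autoplot(prediction, type = "roc")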