1. Introduction
Background
Diamonds are among the most valuable, beautiful, and rare creations of nature. Diamond is the hardest naturally occurring substance and is composed of pure carbon. It is also the most popular gemstone and has a number of important industrial applications. Diamond is 58 times harder than the second-hardest mineral on Earth.
The word "diamond" comes from the Greek adamas, meaning 'unconquerable'. This fits its nature as the hardest natural stone on Earth. In Latin it is also called adamare, which means to love wholeheartedly.
Diamonds are found in three types of deposits: alluvial gravels, glacial fields, and kimberlite pipes. Kimberlite pipes (such as those at Kimberley, South Africa) form when magma intrudes into the Earth's crust, carrying diamonds along with other rocks and minerals from the mantle. The pipes themselves are often less than 100 million years old. However, the diamonds they carry formed 1 to 3.3 billion years ago at depths of more than about 75 miles (120 km). Diamonds in alluvial gravels and glacial deposits were released from the kimberlite matrix by fluvial and glacial erosion, and then redeposited in rivers or in glacial till (sediment transported away from its source by a glacier).
Predicting the price of a diamond from several indicator variables matters because it helps buyers avoid being misled about the quality of the diamond they purchase. In this project we therefore build a model to predict diamond prices from selected characteristics.
Goal
The goal of this project is to build a model that predicts the price of a diamond from its characteristics.
Algorithm
The modelling workflow consists of the following steps:
| 1. Data pre-processing |
| 2. Model building |
| 3. Hyperparameter tuning |
| 4. Model selection |
| 5. Prediction |
2. Packages
The following packages are used in this project.
library(mlr3tuning)
library(gam)          # gam 1.20
library(rpart)
library(pROC)
library(tidyverse)    # tidyverse 1.3.1 (ggplot2 3.3.5, tibble 3.1.3, tidyr 1.1.3, readr 2.0.1, purrr 0.3.4, dplyr 1.0.7, stringr 1.4.0, forcats 0.5.1)
library(mlr3verse)
library(mlr3extralearners)
library(precrec)
library(adabag)
library(ROCR)
library(ROCit)
library(magrittr)
library(visdat)
library(naniar)
library(UpSetR)
library(laeken)
library(vcd)
library(VIM)
library(sm)           # sm 2.2-5.7
library(ggplot2)
library(dplyr)
library(mlbench)
library(caret)
library(mlr3fselect)
library(DataExplorer)
library(skimr)
library(corrplot)     # corrplot 0.92
library(leaps)
library(olsrr)
library(kableExtra)   # table display
library(agricolae)    # assumption checks
library(lmtest)       # assumption checks
library(car)          # assumption checks
library(tseries)      # assumption checks
library(glmnet)       # glmnet 4.1-2
library(glmnetUtils)
library(broom)
library(ggpubr)
library(modelr)
library(rpart.plot)
library(mice)
library(randomForest) # randomForest 4.6-14
library(ROSE)         # ROSE 0.0-4
3. Data Pre-processing
Before modelling, we first prepare the data to be used (training and testing sets) and clean it where necessary.
Dataset
The dataset used in this exercise is the diamonds data, which records the specifications of each diamond together with its price.
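# read the CSV file (read.csv2 expects ";" as the field separator and "," as the decimal mark)
data3 <- read.csv2("D:/Magister IPB/Kuliah/Semester 2/STA582_Pembelajaran_Mesin_Statistika/Kuliah/Pertemuan 5/diamonds.csv", stringsAsFactors = TRUE)
# drop the first column
data3 <- data3[, -1]
dim(data3)
## [1] 53940 10
str(data3)
## 'data.frame': 53940 obs. of 10 variables: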
## $ carat : num 0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
## $ cut : Factor w/ 5 levels "Fair","Good",..: 3 4 2 4 2 5 5 5 1 5 ...
## $ color : Factor w/ 7 levels "D","E","F","G",..: 2 2 2 6 7 7 6 5 2 5 ...
## $ clarity: Factor w/ 8 levels "I1","IF","SI1",..: 4 3 5 6 4 8 7 3 6 5 ...
## $ depth : num 61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
## $ table : num 55 61 65 58 58 57 57 55 61 61 ...
## $ price : int 326 326 327 334 335 336 336 337 337 338 ...
## $ x : num 3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
## $ y : num 3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
## $ z : num 2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...
glimpse(data3)
## Rows: 53,940
## Columns: 10
## $ carat <dbl> 0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24, 0.26, 0.22, 0.23, 0.~
## $ cut <fct> Ideal, Premium, Good, Premium, Good, Very Good, Very Good, Ver~
## $ color <fct> E, E, E, I, J, J, I, H, E, H, J, J, F, J, E, E, I, J, J, J, I,~
## $ clarity <fct> SI2, SI1, VS1, VS2, SI2, VVS2, VVS1, SI1, VS2, VS1, SI1, VS1, ~
## $ depth <dbl> 61.5, 59.8, 56.9, 62.4, 63.3, 62.8, 62.3, 61.9, 65.1, 59.4, 64~
## $ table <dbl> 55, 61, 65, 58, 58, 57, 57, 55, 61, 61, 55, 56, 61, 54, 62, 58~
## $ price <int> 326, 326, 327, 334, 335, 336, 336, 337, 337, 338, 339, 340, 34~
## $ x <dbl> 3.95, 3.89, 4.05, 4.20, 4.34, 3.94, 3.95, 4.07, 3.87, 4.00, 4.~
## $ y <dbl> 3.98, 3.84, 4.07, 4.23, 4.35, 3.96, 3.98, 4.11, 3.78, 4.05, 4.~
## $ z <dbl> 2.43, 2.31, 2.31, 2.63, 2.75, 2.48, 2.47, 2.53, 2.49, 2.39, 2.~
In total the data has 53,940 rows and 10 columns (1 response column and 9 predictor columns). Three variables are factors and seven are numeric (six stored as double and one, price, as integer).
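The column-type count above can be checked programmatically rather than read off the `str()` output. The following is a minimal sketch on a made-up three-column data frame (the `toy` object and its values are illustrative assumptions, not the diamonds data); the same two lines would apply to `data3`.

```r
# Toy data frame standing in for the diamonds data (values are made up).
toy <- data.frame(
  carat = c(0.23, 0.21, 0.29),
  cut   = factor(c("Ideal", "Premium", "Good")),
  price = c(326L, 326L, 334L)
)

# First class of every column, then a count per class.
col_types   <- vapply(toy, function(col) class(col)[1], character(1))
type_counts <- table(col_types)

print(col_types)
print(type_counts)
```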
Training and testing data
Next we split the dataset in two: the training data is used to build the model and the testing data is used to evaluate it.
# split the data into training (80%) and testing (20%) sets
set.seed(134)
Sample <- sample(1:53940, 43152)
# the 43,152 sampled rows form the training set; the remaining rows form the testing set
training <- data3[Sample, ]
testing <- data3[-Sample, ]
dim(testing)
## [1] 10788 10
dim(training)
## [1] 43152 10
The training data comprises 80% of the full dataset, and the remaining 20% is held out for testing. The training set therefore has 43,152 rows and 10 columns (1 response variable and 9 predictor variables).
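As a quick check on a split like this one, the sketch below performs an 80/20 split on a small simulated data frame and confirms the row counts and that the two sets do not overlap. The object names (`toy`, `train_toy`, `test_toy`) and the simulated prices are assumptions made for illustration only.

```r
set.seed(134)

# Simulated stand-in for the diamonds data: 1,000 rows with made-up prices.
n   <- 1000
toy <- data.frame(id = seq_len(n), price = runif(n, min = 300, max = 18000))

# Sample 80% of the row indices for training; the rest become the test set.
train_idx <- sample(seq_len(n), size = round(0.8 * n))
train_toy <- toy[train_idx, ]
test_toy  <- toy[-train_idx, ]

# Sanity checks: sizes add up and no row appears in both sets.
print(c(train = nrow(train_toy), test = nrow(test_toy)))
print(length(intersect(train_toy$id, test_toy$id)))
```

Deriving the sizes from `nrow()` rather than hard-coding them keeps the split correct if the number of rows changes.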
Data exploration and cleaning
Before modelling, we check the data for missing values and inspect the relationship between each predictor and the response.
# missing-value summary per variable
sum_mis3 <- miss_var_summary(training)
sum_mis_plot3 <- head(sum_mis3, 11)
sum_mis_plot3
# distribution of price in the training data before transformation
ggpubr::gghistogram(data = training, x = "price", fill = "darkgreen") + scale_y_continuous(expand = c(0, 0))
## Warning: Using `bins = 30` by default. Pick better value with the argument
## `bins`.
plot_intro(training)
# data exploration: scatterplots of each variable against price
DataExplorer::plot_scatterplot(data = training,
                               by = "price", nrow = 3, ncol = 3, geom_point_args = list(color = "Steelblue"))
# boxplot of price before transformation
boxplot(training$price)
Because the response is not normally distributed, we transform it with the natural logarithm (R's log() uses base e by default), as follows.
# transform the price variable
training <- training %>%
  mutate(price = log(price))
# distribution of price in the training data after transformation
ggpubr::gghistogram(data = training, x = "price", fill = "darkgreen") + scale_y_continuous(expand = c(0, 0))
## Warning: Using `bins = 30` by default. Pick better value with the argument
## `bins`.
# boxplot of price after transformation
boxplot(training$price)
The response variable in the testing data must be transformed in the same way.
testing <- testing %>%
  mutate(price = log(price))
ggpubr::gghistogram(data = testing, x = "price", fill = "darkgreen") + scale_y_continuous(expand = c(0, 0))
## Warning: Using `bins = 30` by default. Pick better value with the argument
## `bins`.
Based on the output above, there are no missing values in the data, and several of the numeric predictors show a roughly linear relationship with the response. The response was log-transformed because it is not normally distributed.
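Because the model is fitted to log(price), its predictions are on the log scale and need `exp()` to be read as prices again. The sketch below illustrates this round trip on made-up numbers; the `price` vector and the prediction offsets are hypothetical and are not results from this project.

```r
# Made-up prices, for illustration only.
price <- c(326, 1000, 5000, 18000)

# log() is the natural logarithm; log2() would be the base-2 logarithm.
log_price <- log(price)

# Hypothetical predictions on the log scale (small offsets from the truth).
pred_log <- log_price + c(0.05, -0.10, 0.02, 0.00)

# Back-transform to the original price scale with exp().
pred_price <- exp(pred_log)

print(round(pred_price, 1))
print(all.equal(exp(log_price), price))
```

An offset of 0.05 on the log scale corresponds to a multiplicative error of exp(0.05), roughly 5%, on the price scale, which is why errors measured on log(price) read as relative rather than absolute errors.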
4. Modelling
With the training data ready, we move on to modelling. The method tried here is random forest regression.
Random Forest
We first build a random forest model and then assess its accuracy with the predict function, using RMSE and MAE as the reference metrics.
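For reference, the error metrics can be written out in a few lines of base R. This is a sketch of the standard definitions, not the mlr3 implementation; MAPE is included because the tuning step later in this report uses `regr.mape` as its measure. The function names and the example vectors are assumptions made for illustration.

```r
# Standard definitions of three regression error metrics.
rmse <- function(truth, response) sqrt(mean((truth - response)^2))
mae  <- function(truth, response) mean(abs(truth - response))
mape <- function(truth, response) mean(abs((truth - response) / truth))

# Made-up values: errors are -10, 20 and 0.
truth    <- c(100, 200, 400)
response <- c(110, 180, 400)

print(rmse(truth, response))
print(mae(truth, response))
print(mape(truth, response))
```

RMSE penalises large errors more heavily than MAE, while MAPE expresses the error relative to the true value and is undefined when a true value is zero.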
Defining the data
First we define the regression task on the training data, with price as the target.
task_price = TaskRegr$new(id = "price", backend = training, target = "price")
Choosing the model
Because we are using a random forest, the learner passed to lrn is regr.ranger. The following lists the hyperparameters available for this learner.
as.data.table(lrn("regr.ranger")$param_set)
In this project, the hyperparameters tuned are mtry, max.depth, and num.trees.
Checking variable importance
We also want the variable importance of the fitted model, which is requested when the learner is created, as follows.
model_rf <- lrn("regr.ranger", importance = "impurity")
The variable importance values are shown below.
model_rf$train(task = task_price)
model_rf$model$variable.importance
## carat clarity color cut depth table x
## 3252.58073 275.36238 147.23905 21.37164 38.31621 48.89532 2573.28311
## y z
## 3332.65329 1402.01260
Converted to a data frame, the output looks as follows.
importance <- data.frame(Predictors = names(model_rf$model$variable.importance),
                         impurity = model_rf$model$variable.importance
)
rownames(importance) <- NULL
importance %>% arrange(desc(impurity))
The table above lists the variables in decreasing order of importance; the top five are y, carat, x, z, and clarity.
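The ranking can be reproduced from the importance values printed earlier without refitting the model. The sketch below copies those values into a named vector (the object names `imp`, `imp_sorted`, and `share` are illustrative assumptions) and also expresses each value as a share of the total.

```r
# Impurity importance values copied from the output printed above.
imp <- c(carat = 3252.58073, clarity = 275.36238, color = 147.23905,
         cut = 21.37164, depth = 38.31621, table = 48.89532,
         x = 2573.28311, y = 3332.65329, z = 1402.01260)

# Sort in decreasing order and express each value as a share of the total.
imp_sorted <- sort(imp, decreasing = TRUE)
share      <- imp_sorted / sum(imp_sorted)
top5       <- names(imp_sorted)[1:5]

print(top5)
print(round(share, 3))
```

The four size-related variables (y, carat, x, z) account for most of the total impurity importance, which is worth keeping in mind because they are strongly related to one another.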
Defining the hyperparameter tuning
Next we build the model using cross-validation so that the resulting model does not overfit.
# search space: integer ranges for the three tuned hyperparameters
param_bound_rf <- ParamSet$new(params =
  list(ParamInt$new("mtry",
                    lower = 2,
                    upper = 9),
       ParamInt$new("max.depth",
                    lower = 2,
                    upper = 9),
       ParamInt$new("num.trees",
                    lower = 500,
                    upper = 2000)
  )
)
Set the number of iterations.
terminate = trm("evals", n_evals = 20)
terminate$param_set
## <ParamSet>
## id class lower upper nlevels default value
## 1: n_evals ParamInt 0 Inf Inf 100 20
## 2: k ParamInt 0 Inf Inf 0 0
Next we choose the optimisation method; this project uses random search.
tuner <- tnr("random_search")
tuner$param_set
## <ParamSet>
## id class lower upper nlevels default value
## 1: batch_size ParamInt -Inf Inf Inf 1 1
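To make the idea of random search concrete, the sketch below draws 20 candidate configurations uniformly from the same integer ranges as the search space defined above and keeps the one with the lowest score. It is a base R illustration of the technique, not what `tnr("random_search")` runs internally; in particular `toy_score` is a made-up stand-in for the cross-validated error and carries no meaning for the diamonds data.

```r
set.seed(1)
n_evals <- 20

# Draw n integers uniformly (with replacement) from lower..upper.
sample_int <- function(lower, upper, n) sample(lower:upper, n, replace = TRUE)

# One row per candidate configuration.
candidates <- data.frame(
  mtry      = sample_int(2, 9, n_evals),
  max.depth = sample_int(2, 9, n_evals),
  num.trees = sample_int(500, 2000, n_evals)
)

# Hypothetical score to minimise; a real run would use the resampled error.
toy_score <- function(mtry, max.depth, num.trees) {
  abs(mtry - 3) + abs(max.depth - 9) + num.trees / 10000
}
candidates$score <- with(candidates, toy_score(mtry, max.depth, num.trees))

# Keep the configuration with the lowest score.
best <- candidates[which.min(candidates$score), ]
print(best)
```

Unlike a grid search, the number of configurations tried is fixed by the evaluation budget (here 20) rather than by the size of the search space.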
Choosing the resampling method (inner resampling)
Define the number of cross-validation folds.
resample_inner = rsmp("cv", folds = 3)
# wrap the learner in an AutoTuner that tunes it by random search on the inner folds
model_rf_tune <- AutoTuner$new(learner = model_rf,
                               measure = msr("regr.mape"),
                               terminator = terminate,
                               resampling = resample_inner,
                               search_space = param_bound_rf,
                               tuner = tuner,
                               store_models = TRUE
)
Choosing the resampling method (outer resampling)
resample_outer = rsmp("cv", folds = 3)
set.seed(123)
resample_outer$instantiate(task = task_price)
Model comparison
We now compare the baseline model with the model obtained from hyperparameter tuning.
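Before running the comparison, it helps to estimate how many model fits the nested resampling setup implies, since that drives the run time. The arithmetic below is a sketch under the settings used in this report (3 outer folds, 3 inner folds, 20 evaluations) and assumes the tuned learner is refitted once with the best configuration on each outer training set; the variable names are illustrative.

```r
outer_folds <- 3   # folds in the outer cross-validation
inner_folds <- 3   # folds in the inner cross-validation
n_evals     <- 20  # configurations tried by the random search

# Baseline learner: one fit per outer fold.
fits_baseline <- outer_folds

# Tuned learner: per outer fold, every configuration is fitted on each inner
# fold, plus one final refit with the best configuration.
fits_tuned_per_outer <- n_evals * inner_folds + 1
fits_tuned           <- outer_folds * fits_tuned_per_outer

total_fits <- fits_baseline + fits_tuned
print(c(baseline = fits_baseline, tuned = fits_tuned, total = total_fits))
```

The benchmark itself has 2 learners times 3 outer folds, that is 6 outer resampling iterations; nearly all of the computing time goes into the inner tuning loops of the tuned learner.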
model_price <- list(model_rf,
                    model_rf_tune
)
design <- benchmark_grid(tasks = task_price,
                         learners = model_price,
                         resamplings = resample_outer
)
lgr::get_logger("bbotk")$set_threshold("warn")
bmr = benchmark(design, store_models = TRUE)
## INFO [22:44:43.875] [mlr3] Running benchmark with 6 resampling iterations
## INFO [22:44:43.979] [mlr3] Applying learner 'regr.ranger.tuned' on task 'price' (iter 2/3)
## INFO [22:44:44.147] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:44:44.153] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:44:48.328] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:44:52.363] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:44:56.410] [mlr3] Finished benchmark
## INFO [22:50:36.381] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:50:38.671] [mlr3] Finished benchmark
## INFO [22:50:39.648] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:50:39.658] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:50:41.234] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:50:42.832] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:50:44.392] [mlr3] Finished benchmark
## INFO [22:50:44.575] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:50:44.584] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:50:46.663] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:50:48.708] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:50:50.799] [mlr3] Finished benchmark
## INFO [22:50:51.011] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:50:51.017] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:50:51.731] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:50:52.455] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:50:53.175] [mlr3] Finished benchmark
## INFO [22:50:53.351] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:50:53.361] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:50:53.771] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:50:54.166] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:50:54.563] [mlr3] Finished benchmark
## INFO [22:50:54.763] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:50:54.771] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:50:56.055] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:50:57.421] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:50:58.888] [mlr3] Finished benchmark
## INFO [22:50:59.084] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:50:59.097] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:51:01.761] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:51:04.524] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:51:07.052] [mlr3] Finished benchmark
## INFO [22:51:07.247] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:51:07.254] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:51:08.141] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:51:08.998] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:51:09.855] [mlr3] Finished benchmark
## INFO [22:51:10.032] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:51:10.040] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:51:13.677] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:51:17.440] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:51:21.174] [mlr3] Finished benchmark
## INFO [22:51:21.453] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:51:21.462] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:51:22.740] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:51:23.993] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:51:25.294] [mlr3] Finished benchmark
## INFO [22:51:25.488] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:51:25.499] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:51:27.326] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:51:29.148] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:51:31.098] [mlr3] Finished benchmark
## INFO [22:51:31.317] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:51:31.325] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:51:37.321] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:51:43.529] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:51:49.661] [mlr3] Finished benchmark
## INFO [22:51:49.960] [mlr3] Running benchmark with 3 resampling iterations
## INFO [22:51:49.968] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 3/3)
## INFO [22:51:55.372] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 1/3)
## INFO [22:52:00.757] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:52:06.304] [mlr3] Finished benchmark
## INFO [22:52:09.840] [mlr3] Applying learner 'regr.ranger' on task 'price' (iter 2/3)
## INFO [22:52:15.744] [mlr3] Finished benchmark
Model Comparison Results
# Aggregate the benchmark results using MAPE as the performance measure
result = bmr$aggregate(msr("regr.mape"))
result
Best Hyperparameters
# Return the tuning result of the i-th learner in the benchmark
get_param_res <- function(i){
  as.data.table(bmr)$learner[[i]]$tuning_result
}
# Collect the tuning results of all six learners into one table
best_rf_param = map_dfr(1:6, get_param_res)
best_rf_param
best_rf_param %>% slice_min(regr.mape)
The best model is the one with the smallest MAPE (the regr.mape column), namely the model built with mtry = 9 and max.depth = 9.
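The selection above ranks the tuned configurations by MAPE and keeps the smallest one. As a minimal sketch of that idea in base R, the snippet below computes MAPE by hand, assuming the usual definition (the mean of the absolute errors relative to the true values), and then picks the row with the lowest value. The vectors and the `tuning_toy` table are hypothetical illustration values, not results from this project.

```r
# Hypothetical toy values, NOT taken from the diamonds data
truth    <- c(100, 200, 400)
response <- c(110, 180, 400)

# MAPE: mean of the absolute errors relative to the true values
mape <- function(truth, response) {
  mean(abs((truth - response) / truth))
}

mape_value <- mape(truth, response)
print(mape_value)

# Hypothetical tuning table with the same column names as best_rf_param
tuning_toy <- data.frame(
  mtry      = c(3, 6, 9),
  max.depth = c(5, 7, 9),
  regr.mape = c(0.012, 0.010, 0.008)  # made-up scores for illustration only
)

# Keep the configuration with the smallest MAPE (what slice_min() does above)
best_toy <- tuning_toy[which.min(tuning_toy$regr.mape), ]
print(best_toy)
```

Because MAPE is a relative error, it does not depend on the scale of the target, which makes it convenient for comparing configurations. It is undefined when a true value is zero.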
# Store the tuned mtry and max.depth of the best configuration
best_rf_param_value <- c(best_rf_param %>%
                           slice_min(regr.mape) %>%
                           pull(mtry),
                         best_rf_param %>%
                           slice_min(regr.mape) %>%
                           pull(max.depth)
)
5. Predict Data
Next, we try the best model from the hyperparameter tuning on new data drawn at random from the training data.
# dummy data
set.seed(123)
data_price_baru <- training %>% slice_sample(n=1000)
head(data_price_baru,6)
We then train the best model and use it to predict both the sampled data and the testing data.
# Random forest learner with the best mtry and max.depth found by tuning
model_rf_best <- lrn("regr.ranger",
                     mtry=best_rf_param_value[1],
                     max.depth=best_rf_param_value[2]
)
model_rf_best$train(task = task_price)
# Predictions for the 1000 rows sampled from the training data
prediksi_rf_new <- model_rf_best$predict_newdata(newdata = data_price_baru)
as.data.table(prediksi_rf_new)
# Predictions for the testing data
prediksi_rf_new2 <- model_rf_best$predict_newdata(newdata = testing)
as.data.table(prediksi_rf_new2)
truth <- prediksi_rf_new2$truth
respon <- prediksi_rf_new2$response
These values are still on the transformed scale. To obtain the actual prices for the testing data, we convert them back to the original scale as follows.
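# Invert the natural-log transformation
# (2.718282 approximates e; exp(truth) is the exact form)
truth_real <- 2.718282^(truth)
respon_real <- 2.718282^(respon)
hasil_akhir <- as.data.table(cbind(truth_real,respon_real))
hasil_akhir
# Mean absolute error on the original price scale
MAE_Testing <- mean(abs(truth_real-respon_real))
MAE_Testing
## [1] 347.3942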
The MAE on the testing data, measured on the original price scale, is therefore about 347.39.
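The back-transformation step can also be written with `exp()`, which is the exact inverse of the natural log. The following is a minimal sketch under the assumption that the target was transformed with `log()`. All numbers are hypothetical toy values chosen so the arithmetic is easy to check, not prices from the testing data.

```r
# Hypothetical true prices on the original scale (toy values)
price_true <- c(500, 1500, 6000)

# Hypothetical model output: predictions made on the natural-log scale
log_true <- log(price_true)
log_pred <- log(c(550, 1400, 6300))

# Invert the log transformation with exp()
truth_real  <- exp(log_true)
respon_real <- exp(log_pred)

# MAE on the log scale versus MAE on the original price scale
mae_log  <- mean(abs(log_true - log_pred))
mae_real <- mean(abs(truth_real - respon_real))

print(mae_log)
print(mae_real)
```

The two errors are not interchangeable: the log-scale MAE measures relative deviation, while the back-transformed MAE is expressed in price units, so it is the one to report when the question is how far off a predicted price is.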
6. Conclusion
The following conclusions can be drawn from the simulation above:
- The random forest model gives reasonably good predictions.
- The method does a reasonably good job of explaining the relationship between the response variable and the predictors.
- Hyperparameter tuning is a very important step.
- The more closely a model fits the training data, the more likely it is to overfit and perform worse on the test data.
- The model depends heavily on the number of trees and on the tree depth.