This is a study of how to find the optimal cutoff when modeling predicted probabilities with logistic regression. The analysis uses diabetes data on Pima Indian women.
First, load the required packages.
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
##
## Attaching package: 'MLmetrics'
## The following object is masked from 'package:base':
##
## Recall
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
Load the data, convert it to a data frame, and inspect the variable types.
##
## -- Column specification --------------------------------------------------------
## cols(
## Pregnancies = col_double(),
## Glucose = col_double(),
## BloodPressure = col_double(),
## SkinThickness = col_double(),
## Insulin = col_double(),
## BMI = col_double(),
## DiabetesPedigreeFunction = col_double(),
## Age = col_double(),
## Outcome = col_double()
## )
## 'data.frame': 768 obs. of 9 variables:
## $ Pregnancies : num 6 1 8 1 0 5 3 10 2 8 ...
## $ Glucose : num 148 85 183 89 137 116 78 115 197 125 ...
## $ BloodPressure : num 72 66 64 66 40 74 50 0 70 96 ...
## $ SkinThickness : num 35 29 0 23 35 0 32 0 45 0 ...
## $ Insulin : num 0 0 0 94 168 0 88 0 543 0 ...
## $ BMI : num 33.6 26.6 23.3 28.1 43.1 25.6 31 35.3 30.5 0 ...
## $ DiabetesPedigreeFunction: num 0.627 0.351 0.672 0.167 2.288 ...
## $ Age : num 50 31 32 21 33 30 26 29 53 54 ...
## $ Outcome : num 1 0 1 0 1 0 1 0 1 1 ...
## - attr(*, "spec")=
## .. cols(
## .. Pregnancies = col_double(),
## .. Glucose = col_double(),
## .. BloodPressure = col_double(),
## .. SkinThickness = col_double(),
## .. Insulin = col_double(),
## .. BMI = col_double(),
## .. DiabetesPedigreeFunction = col_double(),
## .. Age = col_double(),
## .. Outcome = col_double()
## .. )
All variables are numeric; the meaning of each variable is as follows.
To interpret the parameters on a common scale, standardize every variable except Outcome.
diabetes_3 <- scale(diabetes2[-9])
diabetes_s <- data.frame(diabetes2$Outcome, diabetes_3)
names(diabetes_s)[1] <- "Outcome"
str(diabetes_s)
## 'data.frame': 768 obs. of 9 variables:
## $ Outcome : num 1 0 1 0 1 0 1 0 1 1 ...
## $ Pregnancies : num 0.64 -0.844 1.233 -0.844 -1.141 ...
## $ Glucose : num 0.848 -1.123 1.942 -0.998 0.504 ...
## $ BloodPressure : num 0.15 -0.16 -0.264 -0.16 -1.504 ...
## $ SkinThickness : num 0.907 0.531 -1.287 0.154 0.907 ...
## $ Insulin : num -0.692 -0.692 -0.692 0.123 0.765 ...
## $ BMI : num 0.204 -0.684 -1.103 -0.494 1.409 ...
## $ DiabetesPedigreeFunction: num 0.468 -0.365 0.604 -0.92 5.481 ...
## $ Age : num 1.4251 -0.1905 -0.1055 -1.0409 -0.0205 ...
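As a small illustration of what `scale()` does to each column, the sketch below standardizes a toy data frame (made-up values, not the diabetes data) and compares the result with the formula (x - mean) / sd written out by hand. The names `toy`, `z_scale`, and `z_manual` are hypothetical.

```r
# Toy data (hypothetical values) standing in for the predictor columns
toy <- data.frame(a = c(1, 2, 3, 4, 5), b = c(10, 20, 40, 80, 160))

# scale() centers each column on its mean and divides by its sample sd
z_scale <- scale(toy)

# The same transformation written out by hand
z_manual <- sapply(toy, function(x) (x - mean(x)) / sd(x))

# Each standardized column has mean 0 and standard deviation 1
col_means <- colMeans(z_scale)
col_sds <- apply(z_scale, 2, sd)
same <- isTRUE(all.equal(as.numeric(z_scale), as.numeric(z_manual)))
```

After this transformation, one unit of any predictor means one standard deviation of that predictor, which is what makes the coefficients comparable across variables.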
Split the data into training and test sets at a 7:3 ratio.
set.seed(100)
index <- sample(1:nrow(diabetes_s), size = nrow(diabetes_s) * 0.7)
training <- diabetes_s[index, ]
test <- diabetes_s[-index, ]
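# Reorder the columns (Outcome first)
# training <- data.frame(training %>% select(Outcome), training %>% select(-Outcome))
# test <- data.frame(test %>% select(Outcome), test %>% select(-Outcome))
Fit a logistic regression model with Outcome (diabetes status) as the dependent variable, using glm(). With family = "binomial", the dependent variable should be coded as 0/1 (a two-level factor is also accepted).
Use summary() to view a summary of the results.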
##
## Call:
## glm(formula = formula(training), family = "binomial", data = training)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.4625 -0.7341 -0.4391 0.7865 2.7690
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.82075 0.11306 -7.259 3.89e-13 ***
## Pregnancies 0.33920 0.12962 2.617 0.00887 **
## Glucose 1.10363 0.13743 8.030 9.71e-16 ***
## BloodPressure -0.28441 0.11622 -2.447 0.01440 *
## SkinThickness -0.02732 0.13040 -0.209 0.83406
## Insulin -0.15986 0.11693 -1.367 0.17158
## BMI 0.58535 0.13895 4.213 2.52e-05 ***
## DiabetesPedigreeFunction 0.26060 0.11516 2.263 0.02364 *
## Age 0.23079 0.12991 1.777 0.07564 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 697.86 on 536 degrees of freedom
## Residual deviance: 523.53 on 528 degrees of freedom
## AIC: 541.53
##
## Number of Fisher Scoring iterations: 5
SkinThickness and Insulin do not appear to be significant at the 0.1 level.
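The fitting code itself is not shown above, so here is a minimal sketch of the same kind of fit on simulated data. Everything in it (the data frame `sim`, the predictors `x1` and `x2`, and the true coefficients) is an assumption made for illustration, not the diabetes data.

```r
set.seed(1)
n <- 500
x1 <- rnorm(n)                       # informative predictor, already on a unit scale
x2 <- rnorm(n)                       # pure noise predictor
p <- plogis(-0.8 + 1.1 * x1)         # true probabilities (assumed values)
y <- rbinom(n, size = 1, prob = p)   # 0/1 response
sim <- data.frame(y, x1, x2)

# Logistic regression: binomial family with the default logit link
fit <- glm(y ~ x1 + x2, family = "binomial", data = sim)
coefs <- summary(fit)$coefficients   # estimate, std. error, z value, p-value

# exp(coefficient) is the odds ratio for a one-unit increase in the predictor
odds_ratio <- exp(coef(fit))
```

Read the same way, the Glucose estimate of 1.10363 in the summary above corresponds to an odds ratio of about exp(1.10363) ≈ 3.0 per one standard deviation of Glucose, since the predictors were standardized.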
Obtain the final model by backward stepwise selection.
## Start: AIC=541.53
## Outcome ~ Pregnancies + Glucose + BloodPressure + SkinThickness +
## Insulin + BMI + DiabetesPedigreeFunction + Age
##
## Df Deviance AIC
## - SkinThickness 1 523.57 539.57
## - Insulin 1 525.39 541.39
## <none> 523.53 541.53
## - Age 1 526.69 542.69
## - DiabetesPedigreeFunction 1 528.79 544.79
## - BloodPressure 1 529.70 545.70
## - Pregnancies 1 530.51 546.51
## - BMI 1 543.36 559.36
## - Glucose 1 604.75 620.75
##
## Step: AIC=539.57
## Outcome ~ Pregnancies + Glucose + BloodPressure + Insulin + BMI +
## DiabetesPedigreeFunction + Age
##
## Df Deviance AIC
## <none> 523.57 539.57
## - Insulin 1 526.08 540.08
## - Age 1 526.87 540.87
## - DiabetesPedigreeFunction 1 528.79 542.79
## - BloodPressure 1 530.22 544.22
## - Pregnancies 1 530.51 544.51
## - BMI 1 544.81 558.81
## - Glucose 1 608.19 622.19
##
## Call:
## glm(formula = Outcome ~ Pregnancies + Glucose + BloodPressure +
## Insulin + BMI + DiabetesPedigreeFunction + Age, family = "binomial",
## data = training)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.4473 -0.7375 -0.4403 0.7840 2.7631
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.8213 0.1130 -7.266 3.71e-13 ***
## Pregnancies 0.3380 0.1296 2.609 0.00909 **
## Glucose 1.1082 0.1358 8.161 3.32e-16 ***
## BloodPressure -0.2891 0.1140 -2.536 0.01121 *
## Insulin -0.1697 0.1070 -1.587 0.11261
## BMI 0.5768 0.1326 4.350 1.36e-05 ***
## DiabetesPedigreeFunction 0.2585 0.1146 2.256 0.02409 *
## Age 0.2342 0.1290 1.816 0.06935 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 697.86 on 536 degrees of freedom
## Residual deviance: 523.57 on 529 degrees of freedom
## AIC: 539.57
##
## Number of Fisher Scoring iterations: 5
The procedure stops at the model where the AIC (Akaike Information Criterion) no longer decreases: Outcome ~ Pregnancies + Glucose + BloodPressure + Insulin + BMI + DiabetesPedigreeFunction + Age. Insulin is not significant at the 0.1 level, but removing it would actually increase the model's AIC (from 539.57 to 540.08).
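The call that produced the trace above is not shown. A minimal sketch of backward elimination with `step()` from the stats package, on simulated data, might look like this; the data frame `sim` and its columns are hypothetical.

```r
set.seed(2)
n <- 400
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# x3 has no effect on y in this simulated setup
sim$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * sim$x1 + 0.6 * sim$x2))

full <- glm(y ~ x1 + x2 + x3, family = "binomial", data = sim)

# Backward elimination: drop one term at a time while AIC keeps decreasing
final <- step(full, direction = "backward", trace = 0)

# For ungrouped 0/1 data, AIC = residual deviance + 2 * (number of coefficients)
aic_by_hand <- deviance(final) + 2 * length(coef(final))
```

The same identity can be checked against the output above: 523.53 + 2 × 9 = 541.53 for the full model and 523.57 + 2 × 8 = 539.57 for the selected one.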
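Apply the final model to the test set and extract the top 10 women by predicted probability. Woman no. 446 has the highest probability, about 0.98. All ten of them turn out to actually have diabetes.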
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
##
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
test_pred1 <- predict(pigma1, newdata=test,type="response")
head(sort(test_pred1, decreasing = TRUE),n=10) %>%
  kbl() %>%
  kable_paper(full_width = F) %>%
  column_spec(1:2, width = "10em", bold = T, background = "lightyellow")
| | x |
|---|---|
| 446 | 0.9847286 |
| 207 | 0.9416228 |
| 580 | 0.9300168 |
| 246 | 0.9247962 |
| 762 | 0.9213761 |
| 760 | 0.9107953 |
| 44 | 0.8979082 |
| 260 | 0.8817136 |
| 5 | 0.8676898 |
| 57 | 0.8670064 |
head(arrange(data.frame(test_pred1,test$Outcome),desc(test_pred1)),n=10) %>%
  kbl() %>%
  kable_paper(full_width = F) %>%
  column_spec(1:2, width = "10em", bold = T, background = "lightyellow")
| test_pred1 | test.Outcome |
|---|---|
| 0.9847286 | 1 |
| 0.9416228 | 1 |
| 0.9300168 | 1 |
| 0.9247962 | 1 |
| 0.9213761 | 1 |
| 0.9107953 | 1 |
| 0.8979082 | 1 |
| 0.8817136 | 1 |
| 0.8676898 | 1 |
| 0.8670064 | 1 |
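The ROC() function in the Epi package draws the ROC curve and suggests an optimal cutoff.
The plot suggests an optimal cut of 0.338; that is, it recommends suspecting diabetes when the predicted probability is 0.338 or higher.
The AUC here is 0.862, which is a good level.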
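To make the idea of an "optimal cutoff" on a ROC curve concrete, here is a base-R sketch on simulated scores. It picks the cutoff that maximizes sensitivity + specificity (Youden's J), a common choice for such plots; that this is the criterion behind the 0.338 reported here is an assumption, and the data (`y`, `score`) are hypothetical.

```r
set.seed(3)
n <- 300
y <- rbinom(n, size = 1, prob = 0.35)                       # hypothetical 0/1 outcomes
score <- plogis(rnorm(n, mean = ifelse(y == 1, 0.5, -1)))   # hypothetical predicted probabilities

# Candidate cutoffs: every distinct score; predict positive when score >= cutoff
cuts <- sort(unique(score))
sens <- sapply(cuts, function(t) mean(score[y == 1] >= t))
spec <- sapply(cuts, function(t) mean(score[y == 0] < t))

# Cutoff that maximizes sensitivity + specificity (Youden's J = sens + spec - 1)
youden <- sens + spec - 1
best_cut <- cuts[which.max(youden)]

# AUC as the probability that a random positive outscores a random negative
# (ties counted as one half)
pos <- score[y == 1]
neg <- score[y == 0]
auc <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
```

Youden's J weights false positives and false negatives equally; when missing a diabetic patient is costlier than a false alarm, a lower cutoff than the one it selects may be preferable.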
Model accuracy functions from the MLmetrics package
## [1] 0.861572
## [1] 0.7820513
## y_pred
## y_true 0 1
## 0 137 16
## 1 33 45
## [1] 0.647482
## [1] 0.5897772
## [1] 0.6564269
## [1] 0.7394827
## [1] 0.723144
## [1] 57.91855
## [1] 1.809571
## [1] 0.4399205
## [1] 0.7787172
## [1] 0.7377049
## [1] 0.5769231
## [1] 0.5769231
## [1] 0.8954248
## [1] 0.2121212
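The calls behind these numbers are not shown, but several of them can be reproduced by hand from the confusion matrix printed above. The sketch below uses only those four counts; the variable names are chosen for this illustration.

```r
# Counts copied from the confusion matrix above (rows = y_true, columns = y_pred)
tn <- 137; fp <- 16
fn <- 33;  tp <- 45

precision   <- tp / (tp + fp)
recall      <- tp / (tp + fn)          # also called sensitivity
specificity <- tn / (tn + fp)
f1          <- 2 * precision * recall / (precision + recall)
error_rate  <- (fp + fn) / (tn + fp + fn + tp)

# Collect the metrics, rounded to 7 decimals for comparison with the output
metrics <- round(c(precision = precision, recall = recall,
                   specificity = specificity, f1 = f1,
                   error_rate = error_rate), 7)
```

These values match five of the printed lines: 0.7377049 (precision), 0.5769231 (recall and sensitivity), 0.8954248 (specificity), 0.647482 (F1), and 0.2121212 (misclassification rate).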
To assess how well a logistic regression separates the two classes, the K-S statistic is computed from the TPR and FPR curves. The main idea is to achieve a large separation between these two curves. We can then pick the probability threshold that corresponds to the maximum separation. If the model is ideal, its K-S value will be equal to 1.
\[\text{KS value} = \max_{t}\,\bigl(\mathrm{TPR}(t) - \mathrm{FPR}(t)\bigr)\]
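The formula can be sketched in base R with empirical CDFs. The scores below are simulated (the names `y`, `score`, `F_pos`, and `F_neg` are hypothetical), and `ks.test()` from the stats package is used only as a cross-check of the statistic.

```r
set.seed(4)
n <- 300
y <- rbinom(n, size = 1, prob = 0.35)                       # hypothetical 0/1 outcomes
score <- plogis(rnorm(n, mean = ifelse(y == 1, 0.5, -1)))   # hypothetical predicted probabilities

# Empirical CDFs of the scores within each class
F_neg <- ecdf(score[y == 0])
F_pos <- ecdf(score[y == 1])

# At cutoff t (predict positive when score > t):
#   TPR(t) = 1 - F_pos(t) and FPR(t) = 1 - F_neg(t),
#   so TPR(t) - FPR(t) = F_neg(t) - F_pos(t)
cuts <- sort(unique(score))
gap <- F_neg(cuts) - F_pos(cuts)
ks_value <- max(gap)
ks_cut <- cuts[which.max(gap)]

# Two-sample K-S statistic, max |F_pos - F_neg|; it equals ks_value whenever
# the largest gap is in the direction of positives scoring higher
ks_ref <- unname(ks.test(score[y == 1], score[y == 0])$statistic)
```

Because TPR − FPR equals sensitivity + specificity − 1, the threshold that attains the K-S value is also the one that maximizes Youden's J on the ROC curve.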
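options(digits=10)
# F1 score on a grid of thresholds: 0.01, 0.11, ..., 0.91 (10 values)
f1_scores <- sapply(seq(0.01,0.99,0.1), function(x) { F1_Score(test$Outcome, ifelse(test_pred1 > x, 1, 0), positive = 1)})
# Best grid index divided by 10; note that index i corresponds to the grid value 0.01 + 0.1*(i-1), not i/10
opt1 <- which.max(f1_scores)/10
# Optimize the threshold with DEoptim
library(DEoptim)
## Loading required package: parallel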
##
## DEoptim package
## Differential Evolution algorithm in R
## Authors: D. Ardia, K. Mullen, B. Peterson and J. Ulrich
do1 <- DEoptim(function(x) { -F1_Score(test$Outcome, ifelse(test_pred1 > x, 1, 0), positive = 1)}, 0.01, 0.99)
## Iteration: 1 bestvalit: -0.696970 bestmemit: 0.279652
## Iteration: 2 bestvalit: -0.700508 bestmemit: 0.280257
## Iteration: 3 bestvalit: -0.704082 bestmemit: 0.281149
## Iteration: 4 bestvalit: -0.704082 bestmemit: 0.281149
## Iteration: 5 bestvalit: -0.704082 bestmemit: 0.281149
## Iteration: 6 bestvalit: -0.704082 bestmemit: 0.281149
## Iteration: 7 bestvalit: -0.704082 bestmemit: 0.281149
## Iteration: 8 bestvalit: -0.711864 bestmemit: 0.323928
## Iteration: 9 bestvalit: -0.711864 bestmemit: 0.323928
## Iteration: 10 bestvalit: -0.711864 bestmemit: 0.325900
## Iteration: 11 bestvalit: -0.711864 bestmemit: 0.325900
## Iteration: 12 bestvalit: -0.711864 bestmemit: 0.325644
## Iteration: 13 bestvalit: -0.711864 bestmemit: 0.324416
## Iteration: 14 bestvalit: -0.711864 bestmemit: 0.323523
## Iteration: 15 bestvalit: -0.711864 bestmemit: 0.323738
## Iteration: 16 bestvalit: -0.711864 bestmemit: 0.325195
## Iteration: 17 bestvalit: -0.711864 bestmemit: 0.325716
## Iteration: 18 bestvalit: -0.711864 bestmemit: 0.323337
## Iteration: 19 bestvalit: -0.711864 bestmemit: 0.322842
## Iteration: 20 bestvalit: -0.711864 bestmemit: 0.323345
## Iteration: 21 bestvalit: -0.711864 bestmemit: 0.322842
## Iteration: 22 bestvalit: -0.711864 bestmemit: 0.324363
## Iteration: 23 bestvalit: -0.711864 bestmemit: 0.324932
## Iteration: 24 bestvalit: -0.711864 bestmemit: 0.324682
## Iteration: 25 bestvalit: -0.711864 bestmemit: 0.322546
## Iteration: 26 bestvalit: -0.711864 bestmemit: 0.322500
## Iteration: 27 bestvalit: -0.711864 bestmemit: 0.322205
## Iteration: 28 bestvalit: -0.711864 bestmemit: 0.323716
## Iteration: 29 bestvalit: -0.711864 bestmemit: 0.324613
## Iteration: 30 bestvalit: -0.711864 bestmemit: 0.325175
## Iteration: 31 bestvalit: -0.711864 bestmemit: 0.325135
## Iteration: 32 bestvalit: -0.711864 bestmemit: 0.325669
## Iteration: 33 bestvalit: -0.711864 bestmemit: 0.323997
## Iteration: 34 bestvalit: -0.711864 bestmemit: 0.322964
## Iteration: 35 bestvalit: -0.711864 bestmemit: 0.323789
## Iteration: 36 bestvalit: -0.711864 bestmemit: 0.323569
## Iteration: 37 bestvalit: -0.711864 bestmemit: 0.324925
## Iteration: 38 bestvalit: -0.711864 bestmemit: 0.325895
## Iteration: 39 bestvalit: -0.711864 bestmemit: 0.325507
## Iteration: 40 bestvalit: -0.711864 bestmemit: 0.325152
## Iteration: 41 bestvalit: -0.711864 bestmemit: 0.323984
## Iteration: 42 bestvalit: -0.711864 bestmemit: 0.323450
## Iteration: 43 bestvalit: -0.711864 bestmemit: 0.324389
## Iteration: 44 bestvalit: -0.711864 bestmemit: 0.325840
## Iteration: 45 bestvalit: -0.711864 bestmemit: 0.324341
## Iteration: 46 bestvalit: -0.711864 bestmemit: 0.322394
## Iteration: 47 bestvalit: -0.711864 bestmemit: 0.324433
## Iteration: 48 bestvalit: -0.711864 bestmemit: 0.324099
## Iteration: 49 bestvalit: -0.711864 bestmemit: 0.324483
## Iteration: 50 bestvalit: -0.711864 bestmemit: 0.324071
## Iteration: 51 bestvalit: -0.711864 bestmemit: 0.323691
## Iteration: 52 bestvalit: -0.711864 bestmemit: 0.323602
## Iteration: 53 bestvalit: -0.711864 bestmemit: 0.324376
## Iteration: 54 bestvalit: -0.711864 bestmemit: 0.323546
## Iteration: 55 bestvalit: -0.711864 bestmemit: 0.323681
## Iteration: 56 bestvalit: -0.711864 bestmemit: 0.323665
## Iteration: 57 bestvalit: -0.711864 bestmemit: 0.323930
## Iteration: 58 bestvalit: -0.711864 bestmemit: 0.325176
## Iteration: 59 bestvalit: -0.711864 bestmemit: 0.326339
## Iteration: 60 bestvalit: -0.711864 bestmemit: 0.326145
## Iteration: 61 bestvalit: -0.711864 bestmemit: 0.325733
## Iteration: 62 bestvalit: -0.711864 bestmemit: 0.326101
## Iteration: 63 bestvalit: -0.711864 bestmemit: 0.325552
## Iteration: 64 bestvalit: -0.711864 bestmemit: 0.325362
## Iteration: 65 bestvalit: -0.711864 bestmemit: 0.325400
## Iteration: 66 bestvalit: -0.711864 bestmemit: 0.325718
## Iteration: 67 bestvalit: -0.711864 bestmemit: 0.325821
## Iteration: 68 bestvalit: -0.711864 bestmemit: 0.326133
## Iteration: 69 bestvalit: -0.711864 bestmemit: 0.326343
## Iteration: 70 bestvalit: -0.711864 bestmemit: 0.326396
## Iteration: 71 bestvalit: -0.711864 bestmemit: 0.326410
## Iteration: 72 bestvalit: -0.711864 bestmemit: 0.326398
## Iteration: 73 bestvalit: -0.711864 bestmemit: 0.326197
## Iteration: 74 bestvalit: -0.711864 bestmemit: 0.326063
## Iteration: 75 bestvalit: -0.711864 bestmemit: 0.326295
## Iteration: 76 bestvalit: -0.711864 bestmemit: 0.326276
## Iteration: 77 bestvalit: -0.711864 bestmemit: 0.325884
## Iteration: 78 bestvalit: -0.711864 bestmemit: 0.325901
## Iteration: 79 bestvalit: -0.711864 bestmemit: 0.325910
## Iteration: 80 bestvalit: -0.711864 bestmemit: 0.325483
## Iteration: 81 bestvalit: -0.711864 bestmemit: 0.325741
## Iteration: 82 bestvalit: -0.711864 bestmemit: 0.325227
## Iteration: 83 bestvalit: -0.711864 bestmemit: 0.324729
## Iteration: 84 bestvalit: -0.711864 bestmemit: 0.324997
## Iteration: 85 bestvalit: -0.711864 bestmemit: 0.325542
## Iteration: 86 bestvalit: -0.711864 bestmemit: 0.326215
## Iteration: 87 bestvalit: -0.711864 bestmemit: 0.326227
## Iteration: 88 bestvalit: -0.711864 bestmemit: 0.326406
## Iteration: 89 bestvalit: -0.711864 bestmemit: 0.325871
## Iteration: 90 bestvalit: -0.711864 bestmemit: 0.326403
## Iteration: 91 bestvalit: -0.711864 bestmemit: 0.326197
## Iteration: 92 bestvalit: -0.711864 bestmemit: 0.325068
## Iteration: 93 bestvalit: -0.711864 bestmemit: 0.325280
## Iteration: 94 bestvalit: -0.711864 bestmemit: 0.324954
## Iteration: 95 bestvalit: -0.711864 bestmemit: 0.325693
## Iteration: 96 bestvalit: -0.711864 bestmemit: 0.326480
## Iteration: 97 bestvalit: -0.711864 bestmemit: 0.326403
## Iteration: 98 bestvalit: -0.711864 bestmemit: 0.326013
## Iteration: 99 bestvalit: -0.711864 bestmemit: 0.326052
## Iteration: 100 bestvalit: -0.711864 bestmemit: 0.325581
## Iteration: 101 bestvalit: -0.711864 bestmemit: 0.325701
## Iteration: 102 bestvalit: -0.711864 bestmemit: 0.325262
## Iteration: 103 bestvalit: -0.711864 bestmemit: 0.325002
## Iteration: 104 bestvalit: -0.711864 bestmemit: 0.324573
## Iteration: 105 bestvalit: -0.711864 bestmemit: 0.325124
## Iteration: 106 bestvalit: -0.711864 bestmemit: 0.326555
## Iteration: 107 bestvalit: -0.711864 bestmemit: 0.324467
## Iteration: 108 bestvalit: -0.711864 bestmemit: 0.325056
## Iteration: 109 bestvalit: -0.711864 bestmemit: 0.326055
## Iteration: 110 bestvalit: -0.711864 bestmemit: 0.325075
## Iteration: 111 bestvalit: -0.711864 bestmemit: 0.325190
## Iteration: 112 bestvalit: -0.711864 bestmemit: 0.324334
## Iteration: 113 bestvalit: -0.711864 bestmemit: 0.324354
## Iteration: 114 bestvalit: -0.711864 bestmemit: 0.323688
## Iteration: 115 bestvalit: -0.711864 bestmemit: 0.323545
## Iteration: 116 bestvalit: -0.711864 bestmemit: 0.323720
## Iteration: 117 bestvalit: -0.711864 bestmemit: 0.324625
## Iteration: 118 bestvalit: -0.711864 bestmemit: 0.325117
## Iteration: 119 bestvalit: -0.711864 bestmemit: 0.324098
## Iteration: 120 bestvalit: -0.711864 bestmemit: 0.324146
## Iteration: 121 bestvalit: -0.711864 bestmemit: 0.324367
## Iteration: 122 bestvalit: -0.711864 bestmemit: 0.324571
## Iteration: 123 bestvalit: -0.711864 bestmemit: 0.324968
## Iteration: 124 bestvalit: -0.711864 bestmemit: 0.324373
## Iteration: 125 bestvalit: -0.711864 bestmemit: 0.324474
## Iteration: 126 bestvalit: -0.711864 bestmemit: 0.324245
## Iteration: 127 bestvalit: -0.711864 bestmemit: 0.324687
## Iteration: 128 bestvalit: -0.711864 bestmemit: 0.324516
## Iteration: 129 bestvalit: -0.711864 bestmemit: 0.323246
## Iteration: 130 bestvalit: -0.711864 bestmemit: 0.323058
## Iteration: 131 bestvalit: -0.711864 bestmemit: 0.323584
## Iteration: 132 bestvalit: -0.711864 bestmemit: 0.322117
## Iteration: 133 bestvalit: -0.711864 bestmemit: 0.323167
## Iteration: 134 bestvalit: -0.711864 bestmemit: 0.322042
## Iteration: 135 bestvalit: -0.711864 bestmemit: 0.322620
## Iteration: 136 bestvalit: -0.711864 bestmemit: 0.322838
## Iteration: 137 bestvalit: -0.711864 bestmemit: 0.322462
## Iteration: 138 bestvalit: -0.711864 bestmemit: 0.322588
## Iteration: 139 bestvalit: -0.711864 bestmemit: 0.322412
## Iteration: 140 bestvalit: -0.711864 bestmemit: 0.323052
## Iteration: 141 bestvalit: -0.711864 bestmemit: 0.323398
## Iteration: 142 bestvalit: -0.711864 bestmemit: 0.323433
## Iteration: 143 bestvalit: -0.711864 bestmemit: 0.324099
## Iteration: 144 bestvalit: -0.711864 bestmemit: 0.324589
## Iteration: 145 bestvalit: -0.711864 bestmemit: 0.325024
## Iteration: 146 bestvalit: -0.711864 bestmemit: 0.324930
## Iteration: 147 bestvalit: -0.711864 bestmemit: 0.326161
## Iteration: 148 bestvalit: -0.711864 bestmemit: 0.325443
## Iteration: 149 bestvalit: -0.711864 bestmemit: 0.326061
## Iteration: 150 bestvalit: -0.711864 bestmemit: 0.325910
## Iteration: 151 bestvalit: -0.711864 bestmemit: 0.326007
## Iteration: 152 bestvalit: -0.711864 bestmemit: 0.325748
## Iteration: 153 bestvalit: -0.711864 bestmemit: 0.325616
## Iteration: 154 bestvalit: -0.711864 bestmemit: 0.324924
## Iteration: 155 bestvalit: -0.711864 bestmemit: 0.325464
## Iteration: 156 bestvalit: -0.711864 bestmemit: 0.325850
## Iteration: 157 bestvalit: -0.711864 bestmemit: 0.325365
## Iteration: 158 bestvalit: -0.711864 bestmemit: 0.324911
## Iteration: 159 bestvalit: -0.711864 bestmemit: 0.324319
## Iteration: 160 bestvalit: -0.711864 bestmemit: 0.323922
## Iteration: 161 bestvalit: -0.711864 bestmemit: 0.325651
## Iteration: 162 bestvalit: -0.711864 bestmemit: 0.324660
## Iteration: 163 bestvalit: -0.711864 bestmemit: 0.324016
## Iteration: 164 bestvalit: -0.711864 bestmemit: 0.325222
## Iteration: 165 bestvalit: -0.711864 bestmemit: 0.325240
## Iteration: 166 bestvalit: -0.711864 bestmemit: 0.324788
## Iteration: 167 bestvalit: -0.711864 bestmemit: 0.324226
## Iteration: 168 bestvalit: -0.711864 bestmemit: 0.323734
## Iteration: 169 bestvalit: -0.711864 bestmemit: 0.324975
## Iteration: 170 bestvalit: -0.711864 bestmemit: 0.325915
## Iteration: 171 bestvalit: -0.711864 bestmemit: 0.324503
## Iteration: 172 bestvalit: -0.711864 bestmemit: 0.325339
## Iteration: 173 bestvalit: -0.711864 bestmemit: 0.325879
## Iteration: 174 bestvalit: -0.711864 bestmemit: 0.326430
## Iteration: 175 bestvalit: -0.711864 bestmemit: 0.325121
## Iteration: 176 bestvalit: -0.711864 bestmemit: 0.324163
## Iteration: 177 bestvalit: -0.711864 bestmemit: 0.323344
## Iteration: 178 bestvalit: -0.711864 bestmemit: 0.324485
## Iteration: 179 bestvalit: -0.711864 bestmemit: 0.324055
## Iteration: 180 bestvalit: -0.711864 bestmemit: 0.324129
## Iteration: 181 bestvalit: -0.711864 bestmemit: 0.323975
## Iteration: 182 bestvalit: -0.711864 bestmemit: 0.325306
## Iteration: 183 bestvalit: -0.711864 bestmemit: 0.323537
## Iteration: 184 bestvalit: -0.711864 bestmemit: 0.324565
## Iteration: 185 bestvalit: -0.711864 bestmemit: 0.325058
## Iteration: 186 bestvalit: -0.711864 bestmemit: 0.325017
## Iteration: 187 bestvalit: -0.711864 bestmemit: 0.323400
## Iteration: 188 bestvalit: -0.711864 bestmemit: 0.323753
## Iteration: 189 bestvalit: -0.711864 bestmemit: 0.323197
## Iteration: 190 bestvalit: -0.711864 bestmemit: 0.323221
## Iteration: 191 bestvalit: -0.711864 bestmemit: 0.324594
## Iteration: 192 bestvalit: -0.711864 bestmemit: 0.325178
## Iteration: 193 bestvalit: -0.711864 bestmemit: 0.322712
## Iteration: 194 bestvalit: -0.711864 bestmemit: 0.322144
## Iteration: 195 bestvalit: -0.711864 bestmemit: 0.323798
## Iteration: 196 bestvalit: -0.711864 bestmemit: 0.322502
## Iteration: 197 bestvalit: -0.711864 bestmemit: 0.323403
## Iteration: 198 bestvalit: -0.711864 bestmemit: 0.323832
## Iteration: 199 bestvalit: -0.711864 bestmemit: 0.322878
## Iteration: 200 bestvalit: -0.711864 bestmemit: 0.324062
## List of 2
## $ optim :List of 4
## ..$ bestmem: Named num 0.324
## .. ..- attr(*, "names")= chr "par1"
## ..$ bestval: num -0.712
## ..$ nfeval : int 402
## ..$ iter : int 200
## $ member:List of 6
## ..$ lower : Named num 0.01
## .. ..- attr(*, "names")= chr "par1"
## ..$ upper : Named num 0.99
## .. ..- attr(*, "names")= chr "par1"
## ..$ bestmemit: num [1:200, 1] 0.235 0.28 0.28 0.281 0.281 ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:200] "1" "2" "3" "4" ...
## .. .. ..$ : chr "par1"
## ..$ bestvalit: num [1:200] -0.685 -0.697 -0.701 -0.704 -0.704 ...
## ..$ pop : num [1:10, 1] 0.322 0.323 0.323 0.323 0.323 ...
## ..$ storepop : list()
## - attr(*, "class")= chr "DEoptim"
do1op <-do1$optim
opt2 <- do1op$bestmem
opt1; F1_Score(test$Outcome, ifelse(test_pred1 > opt1, 1, 0), positive = 1)
## [1] 0.4
## [1] 0.6709677419
## par1
## 0.3240623685
## [1] 0.7118644068
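In the grid search above, the best index is divided by 10, so the printed 0.4 stands for grid index 4, whose actual threshold is 0.31. This partly explains why the grid result (F1 ≈ 0.671, evaluated at 0.4) falls short of the DEoptim result (F1 ≈ 0.712 at about 0.324). A sketch that indexes the grid directly is given below; it uses simulated scores and a small hand-written helper `f_beta()` (hypothetical, not the MLmetrics function) so that it runs without extra packages.

```r
# F-beta from 0/1 vectors; beta = 1 gives F1 (helper written for this sketch)
f_beta <- function(y_true, y_pred, beta = 1) {
  tp <- sum(y_true == 1 & y_pred == 1)
  fp <- sum(y_true == 0 & y_pred == 1)
  fn <- sum(y_true == 1 & y_pred == 0)
  (1 + beta^2) * tp / ((1 + beta^2) * tp + beta^2 * fn + fp)
}

set.seed(5)
n <- 300
y <- rbinom(n, size = 1, prob = 0.35)                       # hypothetical 0/1 outcomes
score <- plogis(rnorm(n, mean = ifelse(y == 1, 0.5, -1)))   # hypothetical predicted probabilities

grid <- seq(0.01, 0.99, by = 0.1)   # 0.01, 0.11, ..., 0.91
f1_grid <- sapply(grid, function(t) f_beta(y, ifelse(score > t, 1, 0)))

best_index <- which.max(f1_grid)
best_cut <- grid[best_index]        # the threshold that was actually evaluated
```

Passing `beta = 3` to the helper weights recall more heavily than precision, which pushes the best threshold lower; a finer grid such as `seq(0.01, 0.99, by = 0.01)` narrows the gap to a continuous optimizer.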
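options(digits=10)
# F3 score (beta = 3 weights recall more heavily) on the same grid of thresholds: 0.01, 0.11, ..., 0.91
f3_scores <- sapply(seq(0.01,0.99,0.1), function(x) { FBeta_Score(test$Outcome, ifelse(test_pred1 > x, 1, 0), positive = 1, beta=3)})
# Best grid index divided by 10; note that index i corresponds to the grid value 0.01 + 0.1*(i-1), not i/10
opt31 <- which.max(f3_scores)/10
# Optimize the threshold with DEoptim
library(DEoptim)
do31 <- DEoptim(function(x) { -FBeta_Score(test$Outcome, ifelse(test_pred1 > x, 1, 0), positive = 1, beta=3)}, 0.01, 0.99)
## Iteration: 1 bestvalit: -0.885478 bestmemit: 0.188257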
## Iteration: 2 bestvalit: -0.885478 bestmemit: 0.188257
## Iteration: 3 bestvalit: -0.885478 bestmemit: 0.188257
## Iteration: 4 bestvalit: -0.885478 bestmemit: 0.188257
## Iteration: 5 bestvalit: -0.885478 bestmemit: 0.188257
## Iteration: 6 bestvalit: -0.885478 bestmemit: 0.190232
## Iteration: 7 bestvalit: -0.885478 bestmemit: 0.190232
## Iteration: 8 bestvalit: -0.885478 bestmemit: 0.186295
## Iteration: 9 bestvalit: -0.885478 bestmemit: 0.189858
## Iteration: 10 bestvalit: -0.885478 bestmemit: 0.188652
## Iteration: 11 bestvalit: -0.885478 bestmemit: 0.189658
## Iteration: 12 bestvalit: -0.885478 bestmemit: 0.185848
## Iteration: 13 bestvalit: -0.885478 bestmemit: 0.186624
## Iteration: 14 bestvalit: -0.885478 bestmemit: 0.187726
## Iteration: 15 bestvalit: -0.885478 bestmemit: 0.189718
## Iteration: 16 bestvalit: -0.885478 bestmemit: 0.187841
## Iteration: 17 bestvalit: -0.885478 bestmemit: 0.187340
## Iteration: 18 bestvalit: -0.885478 bestmemit: 0.186795
## Iteration: 19 bestvalit: -0.885478 bestmemit: 0.189100
## Iteration: 20 bestvalit: -0.885478 bestmemit: 0.189929
## Iteration: 21 bestvalit: -0.885478 bestmemit: 0.186816
## Iteration: 22 bestvalit: -0.885478 bestmemit: 0.187013
## Iteration: 23 bestvalit: -0.885478 bestmemit: 0.187439
## Iteration: 24 bestvalit: -0.885478 bestmemit: 0.185933
## Iteration: 25 bestvalit: -0.885478 bestmemit: 0.186000
## Iteration: 26 bestvalit: -0.885478 bestmemit: 0.187319
## Iteration: 27 bestvalit: -0.885478 bestmemit: 0.187242
## Iteration: 28 bestvalit: -0.885478 bestmemit: 0.186894
## Iteration: 29 bestvalit: -0.885478 bestmemit: 0.186878
## Iteration: 30 bestvalit: -0.885478 bestmemit: 0.187294
## Iteration: 31 bestvalit: -0.885478 bestmemit: 0.187912
## Iteration: 32 bestvalit: -0.885478 bestmemit: 0.186248
## Iteration: 33 bestvalit: -0.885478 bestmemit: 0.185897
## Iteration: 34 bestvalit: -0.885478 bestmemit: 0.187331
## Iteration: 35 bestvalit: -0.885478 bestmemit: 0.186789
## Iteration: 36 bestvalit: -0.885478 bestmemit: 0.186155
## Iteration: 37 bestvalit: -0.885478 bestmemit: 0.187422
## Iteration: 38 bestvalit: -0.885478 bestmemit: 0.188888
## Iteration: 39 bestvalit: -0.885478 bestmemit: 0.190200
## Iteration: 40 bestvalit: -0.885478 bestmemit: 0.188910
## Iteration: 41 bestvalit: -0.885478 bestmemit: 0.188213
## Iteration: 42 bestvalit: -0.885478 bestmemit: 0.187403
## Iteration: 43 bestvalit: -0.885478 bestmemit: 0.187110
## Iteration: 44 bestvalit: -0.885478 bestmemit: 0.189488
## Iteration: 45 bestvalit: -0.885478 bestmemit: 0.189450
## Iteration: 46 bestvalit: -0.885478 bestmemit: 0.189954
## Iteration: 47 bestvalit: -0.885478 bestmemit: 0.189512
## Iteration: 48 bestvalit: -0.885478 bestmemit: 0.188763
## Iteration: 49 bestvalit: -0.885478 bestmemit: 0.189860
## Iteration: 50 bestvalit: -0.885478 bestmemit: 0.188220
## Iteration: 51 bestvalit: -0.885478 bestmemit: 0.187832
## Iteration: 52 bestvalit: -0.885478 bestmemit: 0.187108
## Iteration: 53 bestvalit: -0.885478 bestmemit: 0.187276
## Iteration: 54 bestvalit: -0.885478 bestmemit: 0.187788
## Iteration: 55 bestvalit: -0.885478 bestmemit: 0.188230
## Iteration: 56 bestvalit: -0.885478 bestmemit: 0.188073
## Iteration: 57 bestvalit: -0.885478 bestmemit: 0.188145
## Iteration: 58 bestvalit: -0.885478 bestmemit: 0.188444
## Iteration: 59 bestvalit: -0.885478 bestmemit: 0.188746
## Iteration: 60 bestvalit: -0.885478 bestmemit: 0.189360
## Iteration: 61 bestvalit: -0.885478 bestmemit: 0.188760
## Iteration: 62 bestvalit: -0.885478 bestmemit: 0.188098
## Iteration: 63 bestvalit: -0.885478 bestmemit: 0.189048
## Iteration: 64 bestvalit: -0.885478 bestmemit: 0.188188
## Iteration: 65 bestvalit: -0.885478 bestmemit: 0.187100
## Iteration: 66 bestvalit: -0.885478 bestmemit: 0.186766
## Iteration: 67 bestvalit: -0.885478 bestmemit: 0.189301
## Iteration: 68 bestvalit: -0.885478 bestmemit: 0.186991
## Iteration: 69 bestvalit: -0.885478 bestmemit: 0.187510
## Iteration: 70 bestvalit: -0.885478 bestmemit: 0.186787
## Iteration: 71 bestvalit: -0.885478 bestmemit: 0.186764
## Iteration: 72 bestvalit: -0.885478 bestmemit: 0.187042
## Iteration: 73 bestvalit: -0.885478 bestmemit: 0.185813
## Iteration: 74 bestvalit: -0.885478 bestmemit: 0.187419
## Iteration: 75 bestvalit: -0.885478 bestmemit: 0.187322
## Iteration: 76 bestvalit: -0.885478 bestmemit: 0.187066
## Iteration: 77 bestvalit: -0.885478 bestmemit: 0.187164
## Iteration: 78 bestvalit: -0.885478 bestmemit: 0.187973
## Iteration: 79 bestvalit: -0.885478 bestmemit: 0.188400
## Iteration: 80 bestvalit: -0.885478 bestmemit: 0.187791
## Iteration: 81 bestvalit: -0.885478 bestmemit: 0.188431
## Iteration: 82 bestvalit: -0.885478 bestmemit: 0.186991
## Iteration: 83 bestvalit: -0.885478 bestmemit: 0.186940
## Iteration: 84 bestvalit: -0.885478 bestmemit: 0.187963
## Iteration: 85 bestvalit: -0.885478 bestmemit: 0.186232
## Iteration: 86 bestvalit: -0.885478 bestmemit: 0.186928
## Iteration: 87 bestvalit: -0.885478 bestmemit: 0.186716
## Iteration: 88 bestvalit: -0.885478 bestmemit: 0.186756
## Iteration: 89 bestvalit: -0.885478 bestmemit: 0.185613
## Iteration: 90 bestvalit: -0.885478 bestmemit: 0.186804
## Iteration: 91 bestvalit: -0.885478 bestmemit: 0.187950
## Iteration: 92 bestvalit: -0.885478 bestmemit: 0.188202
## Iteration: 93 bestvalit: -0.885478 bestmemit: 0.188508
## Iteration: 94 bestvalit: -0.885478 bestmemit: 0.187620
## Iteration: 95 bestvalit: -0.885478 bestmemit: 0.190038
## Iteration: 96 bestvalit: -0.885478 bestmemit: 0.187163
## Iteration: 97 bestvalit: -0.885478 bestmemit: 0.190231
## Iteration: 98 bestvalit: -0.885478 bestmemit: 0.189554
## Iteration: 99 bestvalit: -0.885478 bestmemit: 0.188839
## Iteration: 100 bestvalit: -0.885478 bestmemit: 0.188518
## Iteration: 101 bestvalit: -0.885478 bestmemit: 0.188402
## Iteration: 102 bestvalit: -0.885478 bestmemit: 0.188049
## Iteration: 103 bestvalit: -0.885478 bestmemit: 0.186071
## Iteration: 104 bestvalit: -0.885478 bestmemit: 0.186292
## Iteration: 105 bestvalit: -0.885478 bestmemit: 0.187587
## Iteration: 106 bestvalit: -0.885478 bestmemit: 0.186796
## Iteration: 107 bestvalit: -0.885478 bestmemit: 0.186629
## Iteration: 108 bestvalit: -0.885478 bestmemit: 0.187000
## Iteration: 109 bestvalit: -0.885478 bestmemit: 0.188011
## Iteration: 110 bestvalit: -0.885478 bestmemit: 0.189103
## Iteration: 111 bestvalit: -0.885478 bestmemit: 0.189825
## Iteration: 112 bestvalit: -0.885478 bestmemit: 0.187615
## Iteration: 113 bestvalit: -0.885478 bestmemit: 0.187517
## Iteration: 114 bestvalit: -0.885478 bestmemit: 0.187852
## Iteration: 115 bestvalit: -0.885478 bestmemit: 0.186794
## Iteration: 116 bestvalit: -0.885478 bestmemit: 0.186289
## Iteration: 117 bestvalit: -0.885478 bestmemit: 0.187640
## Iteration: 118 bestvalit: -0.885478 bestmemit: 0.186341
## Iteration: 119 bestvalit: -0.885478 bestmemit: 0.187836
## Iteration: 120 bestvalit: -0.885478 bestmemit: 0.187037
## Iteration: 121 bestvalit: -0.885478 bestmemit: 0.188042
## Iteration: 122 bestvalit: -0.885478 bestmemit: 0.187884
## Iteration: 123 bestvalit: -0.885478 bestmemit: 0.187494
## Iteration: 124 bestvalit: -0.885478 bestmemit: 0.186650
## Iteration: 125 bestvalit: -0.885478 bestmemit: 0.186830
## Iteration: 126 bestvalit: -0.885478 bestmemit: 0.187063
## Iteration: 127 bestvalit: -0.885478 bestmemit: 0.185884
## Iteration: 128 bestvalit: -0.885478 bestmemit: 0.185595
## Iteration: 129 bestvalit: -0.885478 bestmemit: 0.186183
## Iteration: 130 bestvalit: -0.885478 bestmemit: 0.186546
## Iteration: 131 bestvalit: -0.885478 bestmemit: 0.186853
## Iteration: 132 bestvalit: -0.885478 bestmemit: 0.186785
## Iteration: 133 bestvalit: -0.885478 bestmemit: 0.186851
## Iteration: 134 bestvalit: -0.885478 bestmemit: 0.186737
## Iteration: 135 bestvalit: -0.885478 bestmemit: 0.186841
## Iteration: 136 bestvalit: -0.885478 bestmemit: 0.187498
## Iteration: 137 bestvalit: -0.885478 bestmemit: 0.186999
## Iteration: 138 bestvalit: -0.885478 bestmemit: 0.187146
## Iteration: 139 bestvalit: -0.885478 bestmemit: 0.187107
## Iteration: 140 bestvalit: -0.885478 bestmemit: 0.186569
## Iteration: 141 bestvalit: -0.885478 bestmemit: 0.185865
## Iteration: 142 bestvalit: -0.885478 bestmemit: 0.185919
## Iteration: 143 bestvalit: -0.885478 bestmemit: 0.186468
## Iteration: 144 bestvalit: -0.885478 bestmemit: 0.186365
## Iteration: 145 bestvalit: -0.885478 bestmemit: 0.186484
## Iteration: 146 bestvalit: -0.885478 bestmemit: 0.186542
## Iteration: 147 bestvalit: -0.885478 bestmemit: 0.186890
## Iteration: 148 bestvalit: -0.885478 bestmemit: 0.186811
## Iteration: 149 bestvalit: -0.885478 bestmemit: 0.186352
## Iteration: 150 bestvalit: -0.885478 bestmemit: 0.185968
## Iteration: 151 bestvalit: -0.885478 bestmemit: 0.186181
## Iteration: 152 bestvalit: -0.885478 bestmemit: 0.186187
## Iteration: 153 bestvalit: -0.885478 bestmemit: 0.186098
## Iteration: 154 bestvalit: -0.885478 bestmemit: 0.186470
## Iteration: 155 bestvalit: -0.885478 bestmemit: 0.186204
## Iteration: 156 bestvalit: -0.885478 bestmemit: 0.186541
## Iteration: 157 bestvalit: -0.885478 bestmemit: 0.186444
## Iteration: 158 bestvalit: -0.885478 bestmemit: 0.186892
## Iteration: 159 bestvalit: -0.885478 bestmemit: 0.186394
## Iteration: 160 bestvalit: -0.885478 bestmemit: 0.186503
## Iteration: 161 bestvalit: -0.885478 bestmemit: 0.186052
## Iteration: 162 bestvalit: -0.885478 bestmemit: 0.186890
## Iteration: 163 bestvalit: -0.885478 bestmemit: 0.186101
## Iteration: 164 bestvalit: -0.885478 bestmemit: 0.186438
## Iteration: 165 bestvalit: -0.885478 bestmemit: 0.186370
## Iteration: 166 bestvalit: -0.885478 bestmemit: 0.185594
## Iteration: 167 bestvalit: -0.885478 bestmemit: 0.186019
## Iteration: 168 bestvalit: -0.885478 bestmemit: 0.185984
## Iteration: 169 bestvalit: -0.885478 bestmemit: 0.186128
## Iteration: 170 bestvalit: -0.885478 bestmemit: 0.185468
## Iteration: 171 bestvalit: -0.885478 bestmemit: 0.186559
## Iteration: 172 bestvalit: -0.885478 bestmemit: 0.185539
## Iteration: 173 bestvalit: -0.885478 bestmemit: 0.186304
## Iteration: 174 bestvalit: -0.885478 bestmemit: 0.186011
## Iteration: 175 bestvalit: -0.885478 bestmemit: 0.186882
## Iteration: 176 bestvalit: -0.885478 bestmemit: 0.185891
## Iteration: 177 bestvalit: -0.885478 bestmemit: 0.186664
## Iteration: 178 bestvalit: -0.885478 bestmemit: 0.187427
## Iteration: 179 bestvalit: -0.885478 bestmemit: 0.186277
## Iteration: 180 bestvalit: -0.885478 bestmemit: 0.185649
## Iteration: 181 bestvalit: -0.885478 bestmemit: 0.185690
## Iteration: 182 bestvalit: -0.885478 bestmemit: 0.186006
## Iteration: 183 bestvalit: -0.885478 bestmemit: 0.185857
## Iteration: 184 bestvalit: -0.885478 bestmemit: 0.186576
## Iteration: 185 bestvalit: -0.885478 bestmemit: 0.186252
## Iteration: 186 bestvalit: -0.885478 bestmemit: 0.185739
## Iteration: 187 bestvalit: -0.885478 bestmemit: 0.186501
## Iteration: 188 bestvalit: -0.885478 bestmemit: 0.185695
## Iteration: 189 bestvalit: -0.885478 bestmemit: 0.185833
## Iteration: 190 bestvalit: -0.885478 bestmemit: 0.185923
## Iteration: 191 bestvalit: -0.885478 bestmemit: 0.185851
## Iteration: 192 bestvalit: -0.885478 bestmemit: 0.186290
## Iteration: 193 bestvalit: -0.885478 bestmemit: 0.186280
## Iteration: 194 bestvalit: -0.885478 bestmemit: 0.186247
## Iteration: 195 bestvalit: -0.885478 bestmemit: 0.186127
## Iteration: 196 bestvalit: -0.885478 bestmemit: 0.186664
## Iteration: 197 bestvalit: -0.885478 bestmemit: 0.186104
## Iteration: 198 bestvalit: -0.885478 bestmemit: 0.186940
## Iteration: 199 bestvalit: -0.885478 bestmemit: 0.187664
## Iteration: 200 bestvalit: -0.885478 bestmemit: 0.188206
## List of 2
## $ optim :List of 4
## ..$ bestmem: Named num 0.188
## .. ..- attr(*, "names")= chr "par1"
## ..$ bestval: num -0.885
## ..$ nfeval : int 402
## ..$ iter : int 200
## $ member:List of 6
## ..$ lower : Named num 0.01
## .. ..- attr(*, "names")= chr "par1"
## ..$ upper : Named num 0.99
## .. ..- attr(*, "names")= chr "par1"
## ..$ bestmemit: num [1:200, 1] 0.188 0.188 0.188 0.188 0.188 ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:200] "1" "2" "3" "4" ...
## .. .. ..$ : chr "par1"
## ..$ bestvalit: num [1:200] -0.885 -0.885 -0.885 -0.885 -0.885 ...
## ..$ pop : num [1:10, 1] 0.187 0.186 0.187 0.186 0.188 ...
## ..$ storepop : list()
## - attr(*, "class")= chr "DEoptim"
do31op <- do31$optim
opt32 <- do31op$bestmem
opt31; FBeta_Score(test$Outcome, ifelse(test_pred1 > opt31, 1, 0), positive = 1, beta=3)
## [1] 0.3
## [1] 0.7911001236
opt32; FBeta_Score(test$Outcome, ifelse(test_pred1 > opt32, 1, 0), positive = 1, beta=3)
## par1
## 0.1882058738
## [1] 0.8854781582
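The comparison above rests on how the F-beta score trades precision against recall. As a minimal sketch, assuming the standard definition F_beta = (1 + beta^2) * precision * recall / (beta^2 * precision + recall), the helper `fbeta()` and the toy vectors below are hypothetical and are not the diabetes test set. They only illustrate why beta = 3 tends to reward a lower cutoff.

```r
# F-beta from confusion counts; beta > 1 weights recall more than precision.
# fbeta() is a hypothetical helper written for this illustration.
fbeta <- function(y_true, y_pred, beta = 1, positive = 1) {
  tp <- sum(y_pred == positive & y_true == positive)
  fp <- sum(y_pred == positive & y_true != positive)
  fn <- sum(y_pred != positive & y_true == positive)
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  (1 + beta^2) * precision * recall / (beta^2 * precision + recall)
}

# Toy labels and predicted probabilities (assumed values, for illustration only)
y_true <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
prob   <- c(0.90, 0.60, 0.35, 0.20, 0.70, 0.30, 0.25, 0.15, 0.10, 0.05)

# A low cutoff catches every positive at the cost of more false positives
f3_low  <- fbeta(y_true, ifelse(prob > 0.18, 1, 0), beta = 3)
# A cutoff of 0.5 misses half of the positives
f3_high <- fbeta(y_true, ifelse(prob > 0.50, 1, 0), beta = 3)

c(low = f3_low, high = f3_high)
```

With beta = 3, recall counts nine times as much as precision in the denominator, so the lower cutoff scores higher on this toy data even though it admits more false positives.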
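options(digits=10)
# Grid search: evaluate the single-cutoff AUC at 0.01, 0.11, ..., 0.91
AUCV <- sapply(seq(0.01,0.99,0.1), function(x) { MLmetrics::AUC(ifelse(test_pred1 > x, 1, 0), test$Outcome)})
# Caution: which.max() returns an index, and index i corresponds to the cutoff
# 0.01 + 0.1 * (i - 1), so dividing by 10 only approximates the grid value
opt_AUC1 <- which.max(AUCV)/10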
# optimize() function
optimize(function(x) { -MLmetrics::AUC(ifelse(test_pred1 > x, 1, 0), test$Outcome)}, interval = c(0, 1), tol = 0.0001)
## $minimum
## [1] 0.2762629248
##
## $objective
## [1] -0.7820512821
# optim() function
q <- optim(par = 0.5, function(x) { -MLmetrics::AUC(ifelse(test_pred1 > x, 1, 0), test$Outcome)}, lower = 0.01, upper = 0.99, method = "L-BFGS-B")
str(q)
## List of 5
## $ par : num 0.5
## $ value : num -0.736
## $ counts : Named int [1:2] 1 1
## ..- attr(*, "names")= chr [1:2] "function" "gradient"
## $ convergence: int 0
## $ message : chr "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"
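The `optim()` result above stays at its starting value of 0.5 after one function and one gradient evaluation. A likely explanation is that the objective is a step function of the cutoff, so the finite-difference gradient used by L-BFGS-B is exactly zero unless a predicted probability falls within the small differencing step around the start. The sketch below reproduces that behaviour on hypothetical toy data; `neg_bal_acc()` is an assumed stand-in for the negated single-cutoff AUC, not the author's objective.

```r
# Toy scores and labels (assumed values, not the diabetes test set)
prob   <- seq(0.05, 0.95, by = 0.10)
y_true <- c(0, 0, 0, 1, 1, 1, 0, 1, 1, 1)

# Negative balanced accuracy of the hard 0/1 predictions at cutoff x.
# The value changes only when x crosses one of the scores in `prob`.
neg_bal_acc <- function(x) {
  pred <- ifelse(prob > x, 1, 0)
  sens <- mean(pred[y_true == 1] == 1)
  spec <- mean(pred[y_true == 0] == 0)
  -(sens + spec) / 2
}

res <- optim(par = 0.5, fn = neg_bal_acc, method = "L-BFGS-B",
             lower = 0.01, upper = 0.99)

res$par            # the optimizer does not move away from the start
res$counts         # number of function and gradient evaluations
neg_bal_acc(0.3)   # a different cutoff gives a better (more negative) value
```

Because the numerical gradient is zero at the start, the convergence test is satisfied immediately, which says nothing about whether 0.5 is a good cutoff.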
# opm() function
library(optimr)
opm(par = 0.5, function(x) { -MLmetrics::AUC(ifelse(test_pred1 > x, 1, 0), test$Outcome)}, lower = 0.01, upper = 0.99, method ="L-BFGS-B")
# Optimize the threshold with DEoptim
library(DEoptim)
do_AUC <- DEoptim(function(x) { -MLmetrics::AUC(ifelse(test_pred1 > x, 1, 0), test$Outcome)}, 0.01, 0.99)
## Iteration: 1 bestvalit: -0.769608 bestmemit: 0.295822
## Iteration: 2 bestvalit: -0.779789 bestmemit: 0.332443
## Iteration: 3 bestvalit: -0.779789 bestmemit: 0.332443
## Iteration: 4 bestvalit: -0.779789 bestmemit: 0.327710
## Iteration: 5 bestvalit: -0.783057 bestmemit: 0.336734
## Iteration: 6 bestvalit: -0.783057 bestmemit: 0.336734
## Iteration: 7 bestvalit: -0.786325 bestmemit: 0.337501
## Iteration: 8 bestvalit: -0.786325 bestmemit: 0.337501
## Iteration: 9 bestvalit: -0.789593 bestmemit: 0.337997
## Iteration: 10 bestvalit: -0.789593 bestmemit: 0.337997
## Iteration: 11 bestvalit: -0.789593 bestmemit: 0.337997
## Iteration: 12 bestvalit: -0.789593 bestmemit: 0.337910
## Iteration: 13 bestvalit: -0.789593 bestmemit: 0.338601
## Iteration: 14 bestvalit: -0.789593 bestmemit: 0.338358
## Iteration: 15 bestvalit: -0.789593 bestmemit: 0.338547
## Iteration: 16 bestvalit: -0.789593 bestmemit: 0.338739
## Iteration: 17 bestvalit: -0.789593 bestmemit: 0.338337
## Iteration: 18 bestvalit: -0.789593 bestmemit: 0.338708
## Iteration: 19 bestvalit: -0.789593 bestmemit: 0.338283
## Iteration: 20 bestvalit: -0.789593 bestmemit: 0.338091
## Iteration: 21 bestvalit: -0.789593 bestmemit: 0.338133
## Iteration: 22 bestvalit: -0.789593 bestmemit: 0.338332
## Iteration: 23 bestvalit: -0.789593 bestmemit: 0.338770
## Iteration: 24 bestvalit: -0.789593 bestmemit: 0.338676
## Iteration: 25 bestvalit: -0.789593 bestmemit: 0.338466
## Iteration: 26 bestvalit: -0.789593 bestmemit: 0.338072
## Iteration: 27 bestvalit: -0.789593 bestmemit: 0.338263
## Iteration: 28 bestvalit: -0.789593 bestmemit: 0.338252
## Iteration: 29 bestvalit: -0.789593 bestmemit: 0.338414
## Iteration: 30 bestvalit: -0.789593 bestmemit: 0.338257
## Iteration: 31 bestvalit: -0.789593 bestmemit: 0.338404
## Iteration: 32 bestvalit: -0.789593 bestmemit: 0.338440
## Iteration: 33 bestvalit: -0.789593 bestmemit: 0.338362
## Iteration: 34 bestvalit: -0.789593 bestmemit: 0.338251
## Iteration: 35 bestvalit: -0.789593 bestmemit: 0.338239
## Iteration: 36 bestvalit: -0.789593 bestmemit: 0.338125
## Iteration: 37 bestvalit: -0.789593 bestmemit: 0.338192
## Iteration: 38 bestvalit: -0.789593 bestmemit: 0.338509
## Iteration: 39 bestvalit: -0.789593 bestmemit: 0.338362
## Iteration: 40 bestvalit: -0.789593 bestmemit: 0.338286
## Iteration: 41 bestvalit: -0.789593 bestmemit: 0.338403
## Iteration: 42 bestvalit: -0.789593 bestmemit: 0.338496
## Iteration: 43 bestvalit: -0.789593 bestmemit: 0.338328
## Iteration: 44 bestvalit: -0.789593 bestmemit: 0.338455
## Iteration: 45 bestvalit: -0.789593 bestmemit: 0.338300
## Iteration: 46 bestvalit: -0.789593 bestmemit: 0.338384
## Iteration: 47 bestvalit: -0.789593 bestmemit: 0.338441
## Iteration: 48 bestvalit: -0.789593 bestmemit: 0.338423
## Iteration: 49 bestvalit: -0.789593 bestmemit: 0.338497
## Iteration: 50 bestvalit: -0.789593 bestmemit: 0.338302
## Iteration: 51 bestvalit: -0.789593 bestmemit: 0.338355
## Iteration: 52 bestvalit: -0.789593 bestmemit: 0.338066
## Iteration: 53 bestvalit: -0.789593 bestmemit: 0.338217
## Iteration: 54 bestvalit: -0.789593 bestmemit: 0.338265
## Iteration: 55 bestvalit: -0.789593 bestmemit: 0.338267
## Iteration: 56 bestvalit: -0.789593 bestmemit: 0.338790
## Iteration: 57 bestvalit: -0.789593 bestmemit: 0.338475
## Iteration: 58 bestvalit: -0.789593 bestmemit: 0.337725
## Iteration: 59 bestvalit: -0.789593 bestmemit: 0.338337
## Iteration: 60 bestvalit: -0.789593 bestmemit: 0.338183
## Iteration: 61 bestvalit: -0.789593 bestmemit: 0.337654
## Iteration: 62 bestvalit: -0.789593 bestmemit: 0.337697
## Iteration: 63 bestvalit: -0.789593 bestmemit: 0.337735
## Iteration: 64 bestvalit: -0.789593 bestmemit: 0.337901
## Iteration: 65 bestvalit: -0.789593 bestmemit: 0.337700
## Iteration: 66 bestvalit: -0.789593 bestmemit: 0.337690
## Iteration: 67 bestvalit: -0.789593 bestmemit: 0.337740
## Iteration: 68 bestvalit: -0.789593 bestmemit: 0.337717
## Iteration: 69 bestvalit: -0.789593 bestmemit: 0.337913
## Iteration: 70 bestvalit: -0.789593 bestmemit: 0.337749
## Iteration: 71 bestvalit: -0.789593 bestmemit: 0.338006
## Iteration: 72 bestvalit: -0.789593 bestmemit: 0.338103
## Iteration: 73 bestvalit: -0.789593 bestmemit: 0.338381
## Iteration: 74 bestvalit: -0.789593 bestmemit: 0.338038
## Iteration: 75 bestvalit: -0.789593 bestmemit: 0.337883
## Iteration: 76 bestvalit: -0.789593 bestmemit: 0.338067
## Iteration: 77 bestvalit: -0.789593 bestmemit: 0.338347
## Iteration: 78 bestvalit: -0.789593 bestmemit: 0.338714
## Iteration: 79 bestvalit: -0.789593 bestmemit: 0.338667
## Iteration: 80 bestvalit: -0.789593 bestmemit: 0.338399
## Iteration: 81 bestvalit: -0.789593 bestmemit: 0.338751
## Iteration: 82 bestvalit: -0.789593 bestmemit: 0.338951
## Iteration: 83 bestvalit: -0.789593 bestmemit: 0.338863
## Iteration: 84 bestvalit: -0.789593 bestmemit: 0.338896
## Iteration: 85 bestvalit: -0.789593 bestmemit: 0.338682
## Iteration: 86 bestvalit: -0.789593 bestmemit: 0.338633
## Iteration: 87 bestvalit: -0.789593 bestmemit: 0.338398
## Iteration: 88 bestvalit: -0.789593 bestmemit: 0.338768
## Iteration: 89 bestvalit: -0.789593 bestmemit: 0.338480
## Iteration: 90 bestvalit: -0.789593 bestmemit: 0.338893
## Iteration: 91 bestvalit: -0.789593 bestmemit: 0.338902
## Iteration: 92 bestvalit: -0.789593 bestmemit: 0.338754
## Iteration: 93 bestvalit: -0.789593 bestmemit: 0.338745
## Iteration: 94 bestvalit: -0.789593 bestmemit: 0.338624
## Iteration: 95 bestvalit: -0.789593 bestmemit: 0.338539
## Iteration: 96 bestvalit: -0.789593 bestmemit: 0.338525
## Iteration: 97 bestvalit: -0.789593 bestmemit: 0.338412
## Iteration: 98 bestvalit: -0.789593 bestmemit: 0.338447
## Iteration: 99 bestvalit: -0.789593 bestmemit: 0.338465
## Iteration: 100 bestvalit: -0.789593 bestmemit: 0.338512
## Iteration: 101 bestvalit: -0.789593 bestmemit: 0.338567
## Iteration: 102 bestvalit: -0.789593 bestmemit: 0.338318
## Iteration: 103 bestvalit: -0.789593 bestmemit: 0.338217
## Iteration: 104 bestvalit: -0.789593 bestmemit: 0.338368
## Iteration: 105 bestvalit: -0.789593 bestmemit: 0.338294
## Iteration: 106 bestvalit: -0.789593 bestmemit: 0.338873
## Iteration: 107 bestvalit: -0.789593 bestmemit: 0.338145
## Iteration: 108 bestvalit: -0.789593 bestmemit: 0.338934
## Iteration: 109 bestvalit: -0.789593 bestmemit: 0.338040
## Iteration: 110 bestvalit: -0.789593 bestmemit: 0.338035
## Iteration: 111 bestvalit: -0.789593 bestmemit: 0.338506
## Iteration: 112 bestvalit: -0.789593 bestmemit: 0.338274
## Iteration: 113 bestvalit: -0.789593 bestmemit: 0.338395
## Iteration: 114 bestvalit: -0.789593 bestmemit: 0.338709
## Iteration: 115 bestvalit: -0.789593 bestmemit: 0.338654
## Iteration: 116 bestvalit: -0.789593 bestmemit: 0.338373
## Iteration: 117 bestvalit: -0.789593 bestmemit: 0.338291
## Iteration: 118 bestvalit: -0.789593 bestmemit: 0.338527
## Iteration: 119 bestvalit: -0.789593 bestmemit: 0.338282
## Iteration: 120 bestvalit: -0.789593 bestmemit: 0.338507
## Iteration: 121 bestvalit: -0.789593 bestmemit: 0.338325
## Iteration: 122 bestvalit: -0.789593 bestmemit: 0.338509
## Iteration: 123 bestvalit: -0.789593 bestmemit: 0.338341
## Iteration: 124 bestvalit: -0.789593 bestmemit: 0.338603
## Iteration: 125 bestvalit: -0.789593 bestmemit: 0.338251
## Iteration: 126 bestvalit: -0.789593 bestmemit: 0.337970
## Iteration: 127 bestvalit: -0.789593 bestmemit: 0.338458
## Iteration: 128 bestvalit: -0.789593 bestmemit: 0.338302
## Iteration: 129 bestvalit: -0.789593 bestmemit: 0.338519
## Iteration: 130 bestvalit: -0.789593 bestmemit: 0.338378
## Iteration: 131 bestvalit: -0.789593 bestmemit: 0.338072
## Iteration: 132 bestvalit: -0.789593 bestmemit: 0.338078
## Iteration: 133 bestvalit: -0.789593 bestmemit: 0.338286
## Iteration: 134 bestvalit: -0.789593 bestmemit: 0.338489
## Iteration: 135 bestvalit: -0.789593 bestmemit: 0.338856
## Iteration: 136 bestvalit: -0.789593 bestmemit: 0.338849
## Iteration: 137 bestvalit: -0.789593 bestmemit: 0.338339
## Iteration: 138 bestvalit: -0.789593 bestmemit: 0.338008
## Iteration: 139 bestvalit: -0.789593 bestmemit: 0.337632
## Iteration: 140 bestvalit: -0.789593 bestmemit: 0.337765
## Iteration: 141 bestvalit: -0.789593 bestmemit: 0.337665
## Iteration: 142 bestvalit: -0.789593 bestmemit: 0.337655
## Iteration: 143 bestvalit: -0.789593 bestmemit: 0.337753
## Iteration: 144 bestvalit: -0.789593 bestmemit: 0.337698
## Iteration: 145 bestvalit: -0.789593 bestmemit: 0.337725
## Iteration: 146 bestvalit: -0.789593 bestmemit: 0.337689
## Iteration: 147 bestvalit: -0.789593 bestmemit: 0.337728
## Iteration: 148 bestvalit: -0.789593 bestmemit: 0.337679
## Iteration: 149 bestvalit: -0.789593 bestmemit: 0.337688
## Iteration: 150 bestvalit: -0.789593 bestmemit: 0.337715
## Iteration: 151 bestvalit: -0.789593 bestmemit: 0.337656
## Iteration: 152 bestvalit: -0.789593 bestmemit: 0.337641
## Iteration: 153 bestvalit: -0.789593 bestmemit: 0.337656
## Iteration: 154 bestvalit: -0.789593 bestmemit: 0.337642
## Iteration: 155 bestvalit: -0.789593 bestmemit: 0.337646
## Iteration: 156 bestvalit: -0.789593 bestmemit: 0.337659
## Iteration: 157 bestvalit: -0.789593 bestmemit: 0.337657
## Iteration: 158 bestvalit: -0.789593 bestmemit: 0.337640
## Iteration: 159 bestvalit: -0.789593 bestmemit: 0.337641
## Iteration: 160 bestvalit: -0.789593 bestmemit: 0.337639
## Iteration: 161 bestvalit: -0.789593 bestmemit: 0.337646
## Iteration: 162 bestvalit: -0.789593 bestmemit: 0.337647
## Iteration: 163 bestvalit: -0.789593 bestmemit: 0.337661
## Iteration: 164 bestvalit: -0.789593 bestmemit: 0.337648
## Iteration: 165 bestvalit: -0.789593 bestmemit: 0.337639
## Iteration: 166 bestvalit: -0.789593 bestmemit: 0.337644
## Iteration: 167 bestvalit: -0.789593 bestmemit: 0.337647
## Iteration: 168 bestvalit: -0.789593 bestmemit: 0.337649
## Iteration: 169 bestvalit: -0.789593 bestmemit: 0.337667
## Iteration: 170 bestvalit: -0.789593 bestmemit: 0.337667
## Iteration: 171 bestvalit: -0.789593 bestmemit: 0.337657
## Iteration: 172 bestvalit: -0.789593 bestmemit: 0.337658
## Iteration: 173 bestvalit: -0.789593 bestmemit: 0.337659
## Iteration: 174 bestvalit: -0.789593 bestmemit: 0.337660
## Iteration: 175 bestvalit: -0.789593 bestmemit: 0.337665
## Iteration: 176 bestvalit: -0.789593 bestmemit: 0.337672
## Iteration: 177 bestvalit: -0.789593 bestmemit: 0.337661
## Iteration: 178 bestvalit: -0.789593 bestmemit: 0.337664
## Iteration: 179 bestvalit: -0.789593 bestmemit: 0.337655
## Iteration: 180 bestvalit: -0.789593 bestmemit: 0.337653
## Iteration: 181 bestvalit: -0.789593 bestmemit: 0.337662
## Iteration: 182 bestvalit: -0.789593 bestmemit: 0.337653
## Iteration: 183 bestvalit: -0.789593 bestmemit: 0.337651
## Iteration: 184 bestvalit: -0.789593 bestmemit: 0.337661
## Iteration: 185 bestvalit: -0.789593 bestmemit: 0.337652
## Iteration: 186 bestvalit: -0.789593 bestmemit: 0.337636
## Iteration: 187 bestvalit: -0.789593 bestmemit: 0.337632
## Iteration: 188 bestvalit: -0.789593 bestmemit: 0.337646
## Iteration: 189 bestvalit: -0.789593 bestmemit: 0.337644
## Iteration: 190 bestvalit: -0.789593 bestmemit: 0.337646
## Iteration: 191 bestvalit: -0.789593 bestmemit: 0.337659
## Iteration: 192 bestvalit: -0.789593 bestmemit: 0.337649
## Iteration: 193 bestvalit: -0.789593 bestmemit: 0.337659
## Iteration: 194 bestvalit: -0.789593 bestmemit: 0.337651
## Iteration: 195 bestvalit: -0.789593 bestmemit: 0.337644
## Iteration: 196 bestvalit: -0.789593 bestmemit: 0.337647
## Iteration: 197 bestvalit: -0.789593 bestmemit: 0.337648
## Iteration: 198 bestvalit: -0.789593 bestmemit: 0.337651
## Iteration: 199 bestvalit: -0.789593 bestmemit: 0.337643
## Iteration: 200 bestvalit: -0.789593 bestmemit: 0.337634
## List of 2
## $ optim :List of 4
## ..$ bestmem: Named num 0.338
## .. ..- attr(*, "names")= chr "par1"
## ..$ bestval: num -0.79
## ..$ nfeval : int 402
## ..$ iter : int 200
## $ member:List of 6
## ..$ lower : Named num 0.01
## .. ..- attr(*, "names")= chr "par1"
## ..$ upper : Named num 0.99
## .. ..- attr(*, "names")= chr "par1"
## ..$ bestmemit: num [1:200, 1] 0.386 0.296 0.332 0.332 0.328 ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:200] "1" "2" "3" "4" ...
## .. .. ..$ : chr "par1"
## ..$ bestvalit: num [1:200] -0.755 -0.77 -0.78 -0.78 -0.78 ...
## ..$ pop : num [1:10, 1] 0.338 0.338 0.338 0.338 0.338 ...
## ..$ storepop : list()
## - attr(*, "class")= chr "DEoptim"
do_AUCop <- do_AUC$optim
opt_AUC2 <- do_AUCop$bestmem
opt_AUC1; MLmetrics::AUC(ifelse(test_pred1 > opt_AUC1, 1, 0), test$Outcome)
## [1] 0.4
## [1] 0.7516339869
opt_AUC2; MLmetrics::AUC(ifelse(test_pred1 > opt_AUC2, 1, 0), test$Outcome)
## par1
## 0.3376342009
## [1] 0.7895927602
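Maximizing AUC with the methods above does not seem to yield a reliable optimum. The objective here is the AUC of hard 0/1 predictions at a single cutoff, which is a step function of the cutoff, and the optimizers disagree: the grid search gives 0.4 (AUC 0.752), optimize() gives 0.276 (AUC 0.782), optim() with L-BFGS-B stays at its starting value of 0.5 (AUC 0.736) after a single function and gradient evaluation, and DEoptim gives 0.338 (AUC 0.790).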
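Since the hard predictions change only when the cutoff crosses an observed predicted probability, one alternative is to evaluate the metric once in every gap between neighbouring scores and keep the best. This is a swapped-in technique, not one of the methods used above. The names `bal_acc()` and `best_cutoff()` and the toy data are hypothetical, and balanced accuracy stands in for the single-cutoff AUC.

```r
# Balanced accuracy of hard 0/1 predictions (mean of sensitivity and specificity)
bal_acc <- function(y_true, pred) {
  sens <- mean(pred[y_true == 1] == 1)
  spec <- mean(pred[y_true == 0] == 0)
  (sens + spec) / 2
}

# Exhaustive search: midpoints between sorted unique scores cover every
# distinct non-trivial labelling that a cutoff can produce.
best_cutoff <- function(prob, y_true, metric = bal_acc) {
  s     <- sort(unique(prob))
  cands <- (head(s, -1) + tail(s, -1)) / 2
  vals  <- sapply(cands, function(x) metric(y_true, ifelse(prob > x, 1, 0)))
  i     <- which.max(vals)            # index into cands, not a cutoff itself
  list(cutoff = cands[i], value = vals[i])
}

# Toy scores and labels (assumed values, not the diabetes test set)
prob   <- seq(0.05, 0.95, by = 0.10)
y_true <- c(0, 0, 0, 1, 1, 1, 0, 1, 1, 1)

res <- best_cutoff(prob, y_true)
res$cutoff
res$value
```

Any cutoff inside the same gap gives the same predictions, so the reported midpoint is one representative of an interval of equally good cutoffs. Note that the index returned by `which.max()` is mapped back through `cands` rather than rescaled.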
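AIC = -2 log(likelihood) + 2p, where p is the number of variables in the model. A model with more variables (larger p) attains a higher likelihood, and the 2p term penalizes that growth. Minimizing AIC therefore selects the model that makes the likelihood as large as possible with as few variables as possible (parsimonious and explainable).↩︎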
The receiver operating characteristic (ROC) curve comes from signal detection theory and plots the hit rate (y-axis; true positive rate, sensitivity) against the false alarm rate (x-axis; false positive rate, 1 - specificity).↩︎
AUC (area under the curve) is the area under the ROC curve and serves as a measure of model accuracy. Rule-of-thumb ranges for AUC: 0.90-1 = excellent, 0.80-0.90 = good, 0.70-0.80 = fair, 0.60-0.70 = poor, 0.50-0.60 = fail.↩︎
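To make the ROC and AUC definitions in these notes concrete, here is a small base R sketch that builds the ROC points by sweeping the cutoff over every observed score and then computes the area two ways: with the trapezoid rule, and with the rank (Mann-Whitney) formula. The helper names `roc_points()`, `trap_auc()`, and `rank_auc()` and the toy data are hypothetical.

```r
# ROC points: one (FPR, TPR) pair per cutoff, from "predict nothing" to "predict all"
roc_points <- function(prob, y_true) {
  cuts <- c(Inf, sort(unique(prob), decreasing = TRUE))
  tpr  <- sapply(cuts, function(x) mean(prob[y_true == 1] >= x))
  fpr  <- sapply(cuts, function(x) mean(prob[y_true == 0] >= x))
  data.frame(cutoff = cuts, fpr = fpr, tpr = tpr)
}

# Area under the ROC points by the trapezoid rule
trap_auc <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Same area from ranks: the share of (positive, negative) pairs ranked correctly
rank_auc <- function(prob, y_true) {
  r  <- rank(prob)                       # average ranks handle ties
  n1 <- sum(y_true == 1)
  n0 <- sum(y_true == 0)
  (sum(r[y_true == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Toy scores and labels (assumed values, not the diabetes test set)
prob   <- seq(0.05, 0.95, by = 0.10)
y_true <- c(0, 0, 1, 0, 1, 0, 1, 1, 0, 1)

roc <- roc_points(prob, y_true)
auc_trap <- trap_auc(roc)
auc_rank <- rank_auc(prob, y_true)
c(trapezoid = auc_trap, rank = auc_rank)
```

Both computations use the continuous scores, so the result describes the whole ROC curve rather than the performance at any one cutoff.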