This is a summary of a set of 12 experiments run with a LONI pipeline workflow file that performs 3000 independent jobs, each applying the CBDA-SL and the knockoff filter feature mining strategies. Each experiment has a total of 9000 jobs and is uniquely identified by 6 input arguments: number of jobs [M], % of missing values [misValperc], min [Kcol_min] and max [Kcol_max] % for the Feature Sampling Range (FSR), and min [Nrow_min] and max [Nrow_max] % for the Subject Sampling Range (SSR).
This document reports the final results, experiment by experiment. See https://drive.google.com/file/d/0B5sz_T_1CNJQWmlsRTZEcjBEOEk/view?ths=true for general documentation of the CBDA-SL project and the GitHub repository https://github.com/SOCR/CBDA for some of the code.
Features selected by both the knockoff filter and the CBDA-SL algorithms appear as spikes in the histograms below. For each experiment, the top-ranked features are listed; the number of features listed is set to 15 here.
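The six input arguments can be read as a small parameter grid. The sketch below enumerates the 12 combinations in the order the experiments are numbered in this report; the values are copied from the experiment headers, while the function name `experiment_grid` and the dictionary layout are illustrative assumptions, not part of the CBDA code.

```python
from itertools import product

# Values read off the experiment headers in this report.
M = 9000                                    # jobs per experiment
MIS_VAL_PERC = [0, 20]                      # % of missing values
NROW_RANGES = [(30, 60), (60, 80)]          # SSR: (Nrow_min, Nrow_max) in %
KCOL_RANGES = [(1, 5), (5, 15), (15, 30)]   # FSR: (Kcol_min, Kcol_max) in %


def experiment_grid():
    """Return one dict per experiment, in the numbering used in this report.

    Hypothetical helper: the FSR varies fastest, then the SSR,
    then the missing-value percentage.
    """
    grid = []
    combos = product(MIS_VAL_PERC, NROW_RANGES, KCOL_RANGES)
    for number, (mis, nrow, kcol) in enumerate(combos, start=1):
        grid.append({
            "experiment": number,
            "M": M,
            "misValperc": mis,
            "Kcol_min": kcol[0],
            "Kcol_max": kcol[1],
            "Nrow_min": nrow[0],
            "Nrow_max": nrow[1],
        })
    return grid


if __name__ == "__main__":
    for row in experiment_grid():
        print(row)
```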
## [1] EXPERIMENT 1
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 0 1 5 30 60
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count  Density
##       32   128 3.465079  70   179 4.809242       70   267 6.046196
##       70   114 3.086086  32   125 3.358409       40   240 5.434783
##       60    99 2.680022  30   111 2.982268      100   219 4.959239
##       30    89 2.409312  90   100 2.686728       30   216 4.891304
##       90    85 2.301029 100    94 2.525524       60   198 4.483696
##       80    83 2.246887  60    92 2.471789       80   183 4.144022
##       57    76 2.057390  10    87 2.337453       10   161 3.645833
##       21    69 1.867894  80    75 2.015046       90   158 3.577899
##      100    66 1.786681  21    66 1.773240       50   151 3.419384
##       22    63 1.705468  57    64 1.719506       32   118 2.672101
##       79    61 1.651326  22    61 1.638904       65    92 2.083333
##       39    59 1.597185  39    59 1.585169       20    79 1.788949
##       24    57 1.543043  82    58 1.558302       49    72 1.630435
##       15    54 1.461830  79    55 1.477700       94    61 1.381341
##       10    53 1.434759  61    54 1.450833       21    55 1.245471
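The Density column is not defined in the text. The printed values are consistent with each count expressed as a percentage of the total count over all features. The sketch below checks that reading against the first three Accuracy rows of Experiment 1. The total of 3694 is not printed anywhere in the report: it is inferred from the first row (128 is 3.465079% of about 3694) and should be treated as an assumption, as should the function name `density`.

```python
def density(count, total):
    """Express a selection count as a percentage of the total count."""
    return 100.0 * count / total


# ASSUMPTION: total Accuracy count for Experiment 1, inferred from the
# first table row rather than stated in the report.
ACCURACY_TOTAL = 3694

# (feature, count, reported density) copied from the Experiment 1 table.
rows = [
    (32, 128, 3.465079),
    (70, 114, 3.086086),
    (60, 99, 2.680022),
]

if __name__ == "__main__":
    for feature, count, reported in rows:
        computed = density(count, ACCURACY_TOTAL)
        print(feature, count, round(computed, 6), reported)
```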
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 7
## [1] "MSE Count"
## [1] 7
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
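The three summary counts above match the number of nonzero features that appear among the 15 top-ranked features of each column in the table. The sketch below reproduces them for Experiment 1; the feature lists are copied from the table, and the helper name `recovered` is a hypothetical label, not a function from the CBDA code.

```python
# Nonzero (true) features, as listed in the report.
NONZERO_FEATURES = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100}

# Top-15 features per ranking, copied from the Experiment 1 table.
top15_accuracy = [32, 70, 60, 30, 90, 80, 57, 21, 100, 22, 79, 39, 24, 15, 10]
top15_mse = [70, 32, 30, 90, 100, 60, 10, 80, 21, 57, 22, 39, 82, 79, 61]
top15_knockoff = [70, 40, 100, 30, 60, 80, 10, 90, 50, 32, 65, 20, 49, 94, 21]


def recovered(top_features, true_features=NONZERO_FEATURES):
    """Count how many top-ranked features are true nonzero features."""
    return sum(1 for feature in top_features if feature in true_features)


if __name__ == "__main__":
    print("Accuracy Count:", recovered(top15_accuracy))
    print("MSE Count:", recovered(top15_mse))
    print("Knockoff Count:", recovered(top15_knockoff))
```

Run on the Experiment 1 lists, this gives 7, 7 and 10, the same values as the report's Accuracy, MSE and Knockoff counts.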
##
##
##
##
##
##
## [1] EXPERIMENT 2
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 0 5 15 30 60
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count    Density
##       70   215 1.973201  70   282 2.512921       70   586 12.5724094
##       32   205 1.881424  60   218 1.942613      100   541 11.6069513
##       60   204 1.872247  30   214 1.906968       40   527 11.3065866
##       80   187 1.716226  32   209 1.862413       30   521 11.1778588
##       10   175 1.606094  90   208 1.853502       60   376  8.0669384
##       30   174 1.596916  10   197 1.755480       80   299  6.4149324
##       90   169 1.551028  80   193 1.719836       90   267  5.7283845
##       57   151 1.385830 100   191 1.702014       10   261  5.5996567
##       21   143 1.312408  50   156 1.390127       50   204  4.3767432
##       41   138 1.266520  57   155 1.381215       32   133  2.8534649
##       22   137 1.257342  79   140 1.247549       65    79  1.6949153
##      100   136 1.248164  39   139 1.238638       20    76  1.6305514
##       46   134 1.229809  15   131 1.167350       49    60  1.2872774
##       79   131 1.202276  21   131 1.167350       21    44  0.9440034
##       82   129 1.183921  20   130 1.158439       94    41  0.8796396
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 7
## [1] "MSE Count"
## [1] 9
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 3
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 0 15 30 30 60
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count    Density
##       70   420 1.788680  70   484 2.012139       70  1024 16.8089297
##       80   369 1.571483  10   398 1.654610      100   912 14.9704531
##       60   363 1.545931 100   379 1.575622       40   845 13.8706500
##       10   342 1.456497  60   375 1.558992       30   825 13.5423506
##       90   332 1.413909  80   370 1.538206       60   568  9.3237032
##       32   324 1.379839  50   366 1.521576       80   402  6.5988181
##      100   322 1.371321  32   358 1.488318       90   348  5.7124097
##       30   306 1.303181  90   353 1.467531       10   295  4.8424163
##       50   300 1.277629  30   333 1.384385       50   250  4.1037426
##       21   278 1.183936  20   273 1.134946       32   134  2.1996060
##       22   266 1.132831  79   271 1.126632       20    73  1.1982928
##       57   263 1.120055  22   268 1.114160       65    54  0.8864084
##       79   262 1.115796  21   264 1.097531       49    35  0.5745240
##       72   260 1.107278  39   261 1.085059       94    35  0.5745240
##       41   259 1.103019  41   260 1.080901       21    30  0.4924491
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 8
## [1] "MSE Count"
## [1] 9
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 4
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 0 1 5 60 80
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count  Density
##       32   133 3.488067  70   149 3.872141      100   337 6.643012
##       60    91 2.386572  30   119 3.092516       30   325 6.406466
##       21    76 1.993181  90   113 2.936590       70   315 6.209344
##       70    73 1.914503  32   108 2.806653       40   307 6.051646
##       30    69 1.809599  60    90 2.338877       60   281 5.539129
##       57    68 1.783373 100    88 2.286902       90   253 4.987187
##       90    68 1.783373  21    80 2.079002       80   249 4.908338
##       15    63 1.652242  80    79 2.053015       50   216 4.257836
##       80    63 1.652242  10    75 1.949064       10   208 4.100138
##       61    62 1.626016  57    66 1.715177       32   162 3.193377
##       34    61 1.599790  34    62 1.611227       20   129 2.542874
##       39    61 1.599790  22    58 1.507277       65   110 2.168342
##       22    58 1.521112  61    58 1.507277       49    99 1.951508
##       83    58 1.521112  39    57 1.481289       21    97 1.912084
##       41    56 1.468660  78    57 1.481289       94    84 1.655825
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 5
## [1] "MSE Count"
## [1] 7
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 5
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 0 5 15 60 80
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count    Density
##       60   183 1.666667  70   238 2.085524      100   831 12.3275478
##       80   178 1.621129  80   215 1.883982       70   799 11.8528408
##       55   172 1.566485  60   208 1.822643       30   758 11.2446225
##       21   164 1.493625 100   205 1.796355       40   754 11.1852841
##       10   156 1.420765  10   201 1.761304       60   665  9.8650052
##       67   153 1.393443  90   196 1.717490       80   525  7.7881620
##       22   149 1.357013  32   174 1.524711       10   453  6.7200712
##       32   149 1.357013  21   167 1.463372       90   434  6.4382139
##       15   146 1.329690  55   166 1.454609       50   405  6.0080107
##       79   145 1.320583  22   157 1.375745       32   191  2.8334075
##       34   143 1.302368  79   143 1.253067       20   125  1.8543243
##       83   141 1.284153  83   143 1.253067       65   100  1.4834594
##       70   140 1.275046  76   139 1.218016       94    67  0.9939178
##       90   139 1.265938  62   135 1.182965       21    65  0.9642486
##       62   136 1.238616  30   133 1.165440       49    64  0.9494140
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 5
## [1] "MSE Count"
## [1] 7
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 6
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 0 15 30 60 80
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count    Density
##       80   360 1.535640  70   447 1.872800       70  1694 15.4986276
##       70   355 1.514311  80   403 1.688453      100  1577 14.4281793
##       90   320 1.365013  90   378 1.583710       40  1511 13.8243367
##       55   314 1.339419 100   356 1.491537       30  1481 13.5498628
##       32   309 1.318091  32   347 1.453829       60  1157 10.5855444
##       34   302 1.288231  60   335 1.403553       80   893  8.1701738
##       76   290 1.237043  50   327 1.370035       90   774  7.0814273
##       10   289 1.232777  10   326 1.365845       10   632  5.7822507
##       21   279 1.190121  55   317 1.328138       50   589  5.3888381
##       60   279 1.190121  21   288 1.206637       32   233  2.1317475
##       62   273 1.164527  30   287 1.202447       20   108  0.9881061
##       79   272 1.160261  34   284 1.189878       65    76  0.6953339
##       22   270 1.151730  76   279 1.168929       49    45  0.4117109
##       41   269 1.147464  41   273 1.143791       94    32  0.2927722
##       69   268 1.143198  62   269 1.127032       21    26  0.2378774
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 5
## [1] "MSE Count"
## [1] 8
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 7
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 20 1 5 30 60
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count  Density
##       60   131 3.580213  70   177 4.770889       70   249 5.856068
##       32   128 3.498224  30   129 3.477089      100   247 5.809031
##       70   123 3.361574 100   116 3.126685       30   237 5.573848
##       30    97 2.650998  32   114 3.072776       60   213 5.009407
##       80    74 2.022410  60   109 2.938005       40   195 4.586077
##       90    73 1.995081  10   103 2.776280       80   168 3.951082
##       57    71 1.940421  90    81 2.183288       10   162 3.809972
##      100    70 1.913091  57    74 1.994609       90   141 3.316087
##       39    65 1.776442  39    73 1.967655       50   114 2.681091
##       10    60 1.639792  50    59 1.590296       32   107 2.516463
##       21    53 1.448483  21    55 1.482480       20    86 2.022578
##       22    53 1.448483  80    54 1.455526       65    81 1.904986
##       35    50 1.366494  79    50 1.347709       49    66 1.552211
##       76    48 1.311834  82    50 1.347709       48    61 1.434619
##       79    48 1.311834  83    49 1.320755       94    61 1.434619
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 7
## [1] "MSE Count"
## [1] 8
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 8
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 20 5 15 30 60
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count   Density
##       70   256 2.315485  70   320 2.855358       70   606 12.752525
##       60   231 2.089363  60   227 2.025520       30   526 11.069024
##       80   203 1.836107  80   227 2.025520       40   514 10.816498
##       32   193 1.745658  90   211 1.882752      100   493 10.374579
##       90   185 1.673300  10   207 1.847060       60   402  8.459596
##       10   175 1.582851  30   198 1.766753       80   315  6.628788
##       30   168 1.519537  32   194 1.731061       90   250  5.260943
##       21   156 1.410999 100   192 1.713215       10   245  5.155724
##       57   137 1.239146  21   158 1.409833       50   207  4.356061
##      100   137 1.239146  50   142 1.267065       32   152  3.198653
##       22   133 1.202967  40   137 1.222450       20   100  2.104377
##       83   133 1.202967  41   137 1.222450       65    72  1.515152
##       41   132 1.193922  55   134 1.195681       94    55  1.157407
##       55   132 1.193922  39   132 1.177835       49    51  1.073232
##       78   132 1.193922  19   131 1.168912       21    48  1.010101
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 7
## [1] "MSE Count"
## [1] 9
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 9
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 20 15 30 30 60
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count    Density
##       70   395 1.677781  70   449 1.863844       70   928 15.7261481
##       60   363 1.541860  60   384 1.594022      100   907 15.3702762
##       80   362 1.537612 100   384 1.594022       30   813 13.7773259
##       30   329 1.397443  80   374 1.552511       40   781 13.2350449
##       90   329 1.397443  90   366 1.519303       60   552  9.3543467
##      100   310 1.316740  30   356 1.477792       80   447  7.5749873
##       32   306 1.299749  10   349 1.448734       90   362  6.1345535
##       10   298 1.265769  32   334 1.386467       10   268  4.5416031
##       21   274 1.163828  50   311 1.290992       50   238  4.0332147
##       39   266 1.129848  40   298 1.237028       32   127  2.1521776
##       79   264 1.121352  39   278 1.154006       65    67  1.1354008
##       22   263 1.117105  79   273 1.133250       20    55  0.9320454
##       41   263 1.117105  94   273 1.133250       49    37  0.6270124
##       20   260 1.104362  20   270 1.120797       21    31  0.5253347
##       50   259 1.100115  22   264 1.095890       94    29  0.4914421
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 9
## [1] "MSE Count"
## [1] 10
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 10
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 20 1 5 60 80
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count  Density
##       32   121 3.125807  70   159 4.048892       70   357 6.828615
##       60    84 2.169982  90   112 2.852050      100   337 6.446060
##       21    79 2.040816  32    99 2.521008       30   320 6.120888
##       57    74 1.911651  80    98 2.495544       40   300 5.738332
##       90    73 1.885818  60    96 2.444614       60   279 5.336649
##       80    71 1.834151  10    95 2.419149       90   247 4.724560
##       39    70 1.808318  30    95 2.419149       10   230 4.399388
##       70    66 1.704986 100    89 2.266361       80   218 4.169855
##       34    64 1.653320  21    82 2.088108       50   207 3.959449
##       30    63 1.627486  57    69 1.757066       32   193 3.691660
##       82    62 1.601653  39    63 1.604278       65   125 2.390972
##       24    59 1.524154  83    58 1.476954       20   108 2.065800
##       67    58 1.498321  82    54 1.375095       94    85 1.625861
##       22    56 1.446655  34    53 1.349631       49    81 1.549350
##       83    56 1.446655  67    53 1.349631       21    78 1.491966
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 5
## [1] "MSE Count"
## [1] 7
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 11
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 20 5 15 60 80
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count    Density
##       21   185 1.681360  70   241 2.119799       70   862 12.7609178
##       80   174 1.581387  80   220 1.935087      100   813 12.0355292
##       90   159 1.445060 100   212 1.864720       30   742 10.9844560
##       10   155 1.408707  10   202 1.776761       40   714 10.5699482
##       32   155 1.408707  90   191 1.680007       60   629  9.3116210
##       60   154 1.399618  60   189 1.662415       80   568  8.4085862
##       22   150 1.363265  21   180 1.583253       90   467  6.9133975
##       55   150 1.363265  32   176 1.548069       10   449  6.6469282
##       83   150 1.363265  55   171 1.504090       50   382  5.6550703
##       39   146 1.326911  83   161 1.416132       32   241  3.5677276
##       70   146 1.326911  30   154 1.354561       20   140  2.0725389
##       76   143 1.299646  22   151 1.328173       65   111  1.6432272
##       67   142 1.290557  41   149 1.310581       49    77  1.1398964
##       41   140 1.272380  79   142 1.249010       94    74  1.0954848
##       15   138 1.254203  67   141 1.240215       21    57  0.8438194
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 5
## [1] "MSE Count"
## [1] 7
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
##
##
##
##
##
##
## [1] EXPERIMENT 12
## M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9000 20 15 30 60 80
## [1] 10 20 30 40 50 60 70 80 90 100
## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
## Accuracy Count  Density MSE Count  Density Knockoff Count    Density
##       80   359 1.544750  70   406 1.704593       70  1687 15.5197792
##       90   316 1.359725  80   403 1.691998       30  1582 14.5538178
##       21   301 1.295181 100   364 1.528256      100  1540 14.1674333
##       70   301 1.295181  10   345 1.448484       40  1478 13.5970561
##       60   298 1.282272  90   340 1.427492       60  1137 10.4599816
##       32   297 1.277969  50   325 1.364514       80   937  8.6200552
##       62   296 1.273666  60   325 1.364514       90   691  6.3569457
##       10   288 1.239243  21   320 1.343522       10   619  5.6945722
##       15   282 1.213425  32   307 1.288941       50   604  5.5565777
##       79   276 1.187608  30   292 1.225964       32   215  1.9779209
##       39   274 1.179002  20   287 1.204971       20   110  1.0119595
##       55   273 1.174699  62   286 1.200773       65    65  0.5979761
##       99   273 1.174699  99   275 1.154589       49    46  0.4231831
##      100   273 1.174699  39   274 1.150390       94    33  0.3035879
##       76   266 1.144578  55   274 1.150390       21    29  0.2667893
## [1] "M" "Top-ranked"
## [1] 9000 1000
## [1] "Accuracy Count"
## [1] 6
## [1] "MSE Count"
## [1] 9
## [1] "Knockoff Count"
## [1] 10
## [1] "Nonzero Features"
## [1] 10 20 30 40 50 60 70 80 90 100
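To compare the 12 experiments at a glance, the per-experiment summary counts can be averaged by method and by Subject Sampling Range. The sketch below only aggregates numbers already printed in this report; the variable and function names are illustrative, and the grouping by SSR is one possible choice among several.

```python
from statistics import mean

# (Accuracy, MSE, Knockoff) counts copied from experiments 1-12 above.
counts = [
    (7, 7, 10), (7, 9, 10), (8, 9, 10),    # experiments 1-3
    (5, 7, 10), (5, 7, 10), (5, 8, 10),    # experiments 4-6
    (7, 8, 10), (7, 9, 10), (9, 10, 10),   # experiments 7-9
    (5, 7, 10), (5, 7, 10), (6, 9, 10),    # experiments 10-12
]

# (Nrow_min, Nrow_max) of each experiment, from the experiment headers.
nrow_range = ([(30, 60)] * 3 + [(60, 80)] * 3) * 2


def mean_counts(rows):
    """Average the Accuracy, MSE and Knockoff counts over a set of rows."""
    accuracy, mse, knockoff = zip(*rows)
    return mean(accuracy), mean(mse), mean(knockoff)


overall = mean_counts(counts)
by_ssr = {
    ssr: mean_counts([c for c, r in zip(counts, nrow_range) if r == ssr])
    for ssr in sorted(set(nrow_range))
}

if __name__ == "__main__":
    print("overall (Accuracy, MSE, Knockoff):", overall)
    for ssr, means in by_ssr.items():
        print("SSR", ssr, "->", means)
```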