General information for reading the plots and tables that follow

This is a summary of a set of 30 experiments I ran on Cranium using a single pipe workflow file that performs the following tasks:

* Loads a text file with arguments. Each line is an experiment with the specs: M [jobs], misValperc [missing value %], min [Kcol_min] and max [Kcol_max] for the FSR (Feature Sampling Range, %), and min [Nrow_min] and max [Nrow_max] for the SSR (Subject Sampling Range, %).
* Loads a dataset for machine learning.
* Sets the number of jobs/iteration of the CBDA-SL algorithm [j_global].
* Sets the experiment to run [i_exp, in a sequence constrained by the max number of jobs that can be submitted on Cranium through the LONI pipeline as a guest].
* Sets the working directory where every workspace is saved.
* Sets the R script/scripts to be run.
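
As a rough illustration of the first and fourth tasks, the sketch below reads a small argument table and picks out one experiment. The in-line text, the `exp` column, and the whitespace-separated layout are assumptions made for the example; only the six spec names and the three rows of values come from the results shown in this document.

```r
# Hypothetical argument file contents. The real file layout is not shown in
# this document; the `exp` column is added here only to label the rows.
args_text <- "exp M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
1 3000 0 5 15 60 80
5 3000 40 5 15 60 80
6 3000 0 5 15 100 100"

# Parse the text as a whitespace-separated table with a header row
specs <- read.table(text = args_text, header = TRUE)

# Select the specs of the experiment to run
i_exp <- 5
spec <- specs[specs$exp == i_exp, ]
print(spec)
```

With a real file on disk, `read.table(text = args_text, ...)` would be replaced by a call on the file path, with `sep` adjusted to whatever delimiter the file uses.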

This document reports the final results, by experiment. See https://drive.google.com/file/d/0B5sz_T_1CNJQWmlsRTZEcjBEOEk/view?ths=true for some general documentation of the CBDA-SL project and the GitHub repository https://github.com/SOCR/CBDA for the code [still in progress]. The test dataset is defined as follows:

# Problem parameters
n = 300          # number of observations
p = 100          # number of variables
nonzero=c(1,seq(10,p,10))  # variables with nonzero coefficients (fix location)
k = length(nonzero)      # number of variables with nonzero coefficients
amplitude = 3.5  # signal amplitude (for noise level = 1)

X1 = matrix(rnorm(n*p), nrow=n, ncol=p)
beta = amplitude * (1:p %in% nonzero)
y.sample <- function() X1 %*% beta + rnorm(n)
Ytemp = y.sample()
X2 <- cbind(Ytemp,X1)
# Here I write the data in a text file [not executed]
#write.table(X2,"C:/Users/simeonem/Documents/CBDA-SL/Cranium/Gaussian_dataset.txt",sep=",")
# Here I load the dataset [not executed]
#Gaussian_dataset = read.csv("C:/Users/simeonem/Documents/CBDA-SL/Cranium/Gaussian_dataset.txt",header = TRUE)
# Here the X and Y matrix/vector are set for the CBDA-SL algorithm to proceed [not executed]
#Ytemp <- Gaussian_dataset[,1]
#Xtemp <- Gaussian_dataset[,-1]
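
To make the FSR, SSR and missing-value specs concrete, here is a minimal sketch of how one job might draw its subsample from a dataset of this shape. It is one plausible reading of the spec names, not code taken from the CBDA repository: drawing the percentages uniformly, the rounding, the point at which missing values are injected, and the seed are all assumptions.

```r
set.seed(1)  # assumption: added only so the sketch is reproducible

n <- 300     # number of observations, as in the test dataset
p <- 100     # number of variables, as in the test dataset
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Specs of experiment 1, copied from the results below
misValperc <- 0
Kcol_min <- 5;  Kcol_max <- 15   # FSR, in % of the features
Nrow_min <- 60; Nrow_max <- 80   # SSR, in % of the subjects

# Assumption: each job draws its sampling percentages uniformly in the range
k_frac <- runif(1, Kcol_min, Kcol_max) / 100
n_frac <- runif(1, Nrow_min, Nrow_max) / 100

# Sample feature columns and subject rows without replacement
cols <- sort(sample(p, round(k_frac * p)))
rows <- sort(sample(n, round(n_frac * n)))
X_sub <- X[rows, cols, drop = FALSE]

# Assumption: misValperc % of the cells are set to NA completely at random
n_miss <- round(misValperc / 100 * length(X_sub))
if (n_miss > 0) X_sub[sample(length(X_sub), n_miss)] <- NA

dim(X_sub)
```

Under these assumptions a job in experiment 1 sees between 180 and 240 subjects and between 5 and 15 features.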

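Thus, both the knockoff filter and the CBDA-SL algorithm should select the 11 features with nonzero coefficients: 1, 10, 20, 30, 40, 50, 60, 70, 80, 90 and 100. In each results table below, the left three columns give the top-ranked CBDA-SL features with their selection frequency and density, and the right two columns give the top-ranked knockoff features with their density. I don’t list the False Discovery Rates.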
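
Since the true features are known, the share of false discoveries in any top-ranked list can be checked directly. The sketch below does this for the two lists of experiment 1, copied from the first results table below. The helper `false_discovery_pct` is a hypothetical name introduced for the example. The two unlabeled values printed after each results table agree with this computation for the CBDA and knockoff lists respectively, but the source does not label them, so that reading is an assumption.

```r
p <- 100
nonzero <- c(1, seq(10, p, 10))   # true features, as defined above

# Hypothetical helper: percentage of the selected features that are not
# among the true features
false_discovery_pct <- function(selected, truth) {
  100 * sum(!(selected %in% truth)) / length(selected)
}

# Top-ranked features for experiment 1, copied from the first results table
cbda_top     <- c(59, 92, 31, 42, 52, 54, 63, 77, 87, 27, 29)
knockoff_top <- c(30, 40, 20, 10, 80, 100, 60, 70, 1, 50, 90)

false_discovery_pct(cbda_top, nonzero)      # 100
false_discovery_pct(knockoff_top, nonzero)  # 0
```

For experiment 1, none of the 11 top-ranked CBDA features is a true feature, while all 11 knockoff features are.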

## [1] 1
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 1 3000          0        5       15       60       80

## [1] "CBDA-SL results"
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 1 3000          0        5       15       60       80
##  CBDA Frequency Density  Knockoff Density 
##  59   12        2.424242  30      5.028448
##  92   11        2.222222  40      4.936183
##  31    9        1.818182  20      4.813163
##  42    9        1.818182  10      4.582500
##  52    9        1.818182  80      4.567123
##  54    9        1.818182 100      4.459480
##  63    9        1.818182  60      4.336460
##  77    9        1.818182  70      4.321083
##  87    9        1.818182   1      4.167307
##  27    8        1.616162  50      4.151930
##  29    8        1.616162  90      3.829002
## [1] 100
## [1] 0
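
The CBDA Density column is not defined in the text. A quick check on the experiment 1 table above suggests it is the frequency expressed as a percentage of a total count, with an implied total of 495. The sketch below only verifies that arithmetic on the printed values; what the total counts is not stated in this document and is left open.

```r
# Frequency and Density columns of the CBDA side, experiment 1
freq            <- c(12, 11, 9, 9, 9, 9, 9, 9, 9, 8, 8)
density_printed <- c(2.424242, 2.222222, 1.818182, 1.818182, 1.818182,
                     1.818182, 1.818182, 1.818182, 1.818182, 1.616162,
                     1.616162)

# If density = 100 * frequency / total, every row implies the same total
implied_total <- 100 * freq / density_printed
round(implied_total)

# Recompute the density from the implied total and compare with the table
density <- 100 * freq / 495
max(abs(density - density_printed))
```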
## [1] 5
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 5 3000         40        5       15       60       80

## [1] "CBDA-SL results"
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 5 3000         40        5       15       60       80
##  CBDA Frequency Density  Knockoff Density 
##  13   15        2.862595  70      5.053838
##  26   10        1.908397  30      4.810698
##  87   10        1.908397  40      4.637027
##  93   10        1.908397  50      4.150747
##  16    9        1.717557  10      4.081278
##  53    9        1.717557  80      4.046544
##  55    8        1.526718  90      3.942341
##  60    8        1.526718  20      3.751303
##  79    8        1.526718  60      3.647100
##  81    8        1.526718   1      3.299757
##  95    8        1.526718 100      2.830844
## [1] 90.90909
## [1] 0
## [1] 6
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 6 3000          0        5       15      100      100

## [1] "CBDA-SL results"
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 6 3000          0        5       15      100      100
##  CBDA Frequency Density  Knockoff Density 
##  43   12        2.564103  70      4.957116
##  26   10        2.136752  80      4.739061
##  92   10        2.136752  10      4.535543
##  18    9        1.923077  40      4.506469
##  31    9        1.923077  30      4.419247
##  48    9        1.923077  50      4.419247
##  80    8        1.709402 100      4.390173
##  84    8        1.709402  60      4.244803
##  5     7        1.495726  20      4.186655
##  14    7        1.495726  90      4.186655
##  22    7        1.495726   1      3.765082
## [1] 90.90909
## [1] 0
## [1] 7
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 7 3000         10        5       15      100      100

## [1] "CBDA-SL results"
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 7 3000         10        5       15      100      100
##  CBDA Frequency Density  Knockoff Density 
##  4    10        1.980198  30      4.933273
##  23   10        1.980198  70      4.798321
##  5     9        1.782178  50      4.708352
##  13    9        1.782178  80      4.423452
##  89    9        1.782178 100      4.288499
##  54    8        1.584158  40      4.273504
##  58    8        1.584158  90      4.213525
##  62    8        1.584158  60      4.183536
##  65    8        1.584158   1      4.168541
##  68    8        1.584158  20      4.033588
##  79    8        1.584158  10      3.958614
## [1] 100
## [1] 0
## [1] 8
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 8 3000         20        5       15      100      100

## [1] "CBDA-SL results"
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 8 3000         20        5       15      100      100
##  CBDA Frequency Density  Knockoff Density 
##  94   10        2.109705  30      4.718257
##  20    8        1.687764  50      4.688204
##  38    8        1.687764  40      4.583020
##  52    8        1.687764  70      4.583020
##  61    8        1.687764  20      4.537941
##  68    8        1.687764  10      4.492863
##  74    8        1.687764  80      4.477836
##  78    8        1.687764  60      4.297521
##  82    8        1.687764  90      4.282494
##  1     7        1.476793 100      4.042074
##  4     7        1.476793   1      3.906837
## [1] 81.81818
## [1] 0
## [1] 9
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9 3000         30        5       15      100      100

## [1] "CBDA-SL results"
##      M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 9 3000         30        5       15      100      100
##  CBDA Frequency Density  Knockoff Density 
##  89   11        2.165354  30      4.959561
##  85   10        1.968504  70      4.593316
##  2     9        1.771654  60      4.547535
##  70    9        1.771654  90      4.547535
##  98    9        1.771654  10      4.410194
##  10    8        1.574803  50      4.303373
##  14    8        1.574803  80      4.288112
##  37    8        1.574803  40      4.272852
##  39    8        1.574803   1      4.028689
##  59    8        1.574803  20      3.982909
##  60    8        1.574803 100      3.830307
## [1] 72.72727
## [1] 0
## [1] 10
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 10 3000         40        5       15      100      100

## [1] "CBDA-SL results"
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 10 3000         40        5       15      100      100
##  CBDA Frequency Density  Knockoff Density 
##  3    11        2.315789  40      4.837959
##  90   10        2.105263  30      4.791441
##  40    9        1.894737  90      4.388277
##  5     8        1.684211  70      4.341758
##  42    8        1.684211  60      4.295240
##  71    8        1.684211  80      3.923089
##  81    8        1.684211  20      3.845557
##  31    7        1.473684  10      3.799039
##  33    7        1.473684  50      3.752520
##  38    7        1.473684   1      3.674988
##  44    7        1.473684 100      3.659482
## [1] 81.81818
## [1] 0
## [1] 11
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 11 3000          0       15       30       60       80

## [1] "CBDA-SL results"
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 11 3000          0       15       30       60       80
##  CBDA Frequency Density  Knockoff Density 
##  4    18        1.620162  30      6.570039
##  5    17        1.530153  50      6.398398
##  56   17        1.530153  40      6.303042
##  59   17        1.530153  60      6.255364
##  69   17        1.530153  90      6.179079
##  74   17        1.530153  70      6.131401
##  2    16        1.440144  20      5.978831
##  48   16        1.440144  10      5.835797
##  77   16        1.440144  80      5.749976
##  10   15        1.350135 100      5.578335
##  11   15        1.350135   1      5.559264
## [1] 90.90909
## [1] 0
## [1] 12
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 12 3000         10       15       30       60       80

## [1] "CBDA-SL results"
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 12 3000         10       15       30       60       80
##  CBDA Frequency Density  Knockoff Density 
##  3    18        1.588703  30      6.576629
##  27   17        1.500441  40      6.410256
##  59   17        1.500441  70      6.302603
##  82   17        1.500441  90      6.302603
##  26   16        1.412180  50      6.146017
##  34   16        1.412180  80      5.960070
##  64   16        1.412180  60      5.871991
##  65   16        1.412180  20      5.862204
##  7    15        1.323919  10      5.695831
##  36   15        1.323919   1      5.255432
##  41   15        1.323919 100      4.922685
## [1] 100
## [1] 0
## [1] 13
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 13 3000         20       15       30       60       80

## [1] "CBDA-SL results"
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 13 3000         20       15       30       60       80
##  CBDA Frequency Density  Knockoff Density 
##  47   19        1.636520  70      6.895165
##  35   18        1.550388  30      6.473012
##  43   18        1.550388  40      6.473012
##  50   18        1.550388  50      5.859885
##  59   18        1.550388  60      5.849834
##  1    17        1.464255  80      5.789527
##  58   17        1.464255  10      5.538245
##  71   17        1.464255  90      5.538245
##  4    16        1.378122  20      5.186451
##  33   16        1.378122   1      5.025631
##  68   16        1.378122 100      4.623580
## [1] 81.81818
## [1] 0
## [1] 14
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 14 3000         30       15       30       60       80

## [1] "CBDA-SL results"
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 14 3000         30       15       30       60       80
##  CBDA Frequency Density  Knockoff Density 
##  39   21        1.865009  70      7.062720
##  76   18        1.598579  30      6.752858
##  14   17        1.509769  40      6.186558
##  95   17        1.509769  90      6.111764
##  27   16        1.420959  80      5.577519
##  28   16        1.420959  50      5.566834
##  86   16        1.420959  10      5.278342
##  94   16        1.420959  60      4.925740
##  96   16        1.420959  20      4.915055
##  13   15        1.332149   1      4.754781
##  34   15        1.332149 100      4.306016
## [1] 100
## [1] 0
## [1] 15
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 15 3000         40       15       30       60       80

## [1] "CBDA-SL results"
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 15 3000         40       15       30       60       80
##  CBDA Frequency Density  Knockoff Density 
##  16   21        1.905626  70      7.469022
##  17   20        1.814882  30      7.033043
##  48   17        1.542650  40      6.620009
##  75   17        1.542650  90      5.828362
##  3    16        1.451906  50      5.610372
##  19   16        1.451906  80      5.564479
##  31   15        1.361162  20      5.059660
##  38   15        1.361162  60      4.703993
##  39   15        1.361162  10      4.290959
##  41   15        1.361162   1      3.946765
##  95   15        1.361162 100      3.843506
## [1] 100
## [1] 0
## [1] 30
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 30 3000         40       30       50      100      100

## [1] "CBDA-SL results"
##       M misValperc Kcol_min Kcol_max Nrow_min Nrow_max
## 30 3000         40       30       50      100      100
##  CBDA Frequency Density  Knockoff Density 
##  76   28        1.430761  70      8.339091
##  23   27        1.379663  30      8.021280
##  2    26        1.328564  40      7.793285
##  12   26        1.328564  50      7.171480
##  37   26        1.328564  80      6.839851
##  9    24        1.226367  90      6.819124
##  52   24        1.226367  60      6.591129
##  62   24        1.226367  10      6.100594
##  65   24        1.226367  20      5.782783
##  79   24        1.226367   1      5.534061
##  82   24        1.226367 100      4.843167
## [1] 100
## [1] 0