Some useful information

This is a summary of a set of 9 experiments I ran on Cranium using a single pipe workflow file that performs 3000 independent jobs, each applying both the CBDA-SL and the knockoff filter feature mining strategies. Each experiment has a total of 9000 jobs and is uniquely identified by 6 input arguments: the number of jobs [M], the % of missing values [misValperc], the min [Kcol_min] and max [Kcol_max] % for the FSR (Feature Sampling Range), and the min [Nrow_min] and max [Nrow_max] % for the SSR (Subject Sampling Range).
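The six arguments can be collected in one table, and the two sampling ranges can be illustrated with a small sketch. The parameter values below are copied from the experiment headers in this report. The helper `draw_subsample()` is hypothetical and is not the CBDA-SL code: it assumes that each job draws its row and column fractions uniformly within the SSR and FSR, and the data dimensions in the example call (200 rows, 66 columns) are placeholders.

```r
# Parameter sets copied from the experiment headers in this report.
experiments <- data.frame(
  experiment = 1:9,
  M          = 9000,
  misValperc = 0,
  Kcol_min   = c(5, 5, 15, 15, 30, 30, 5, 15, 30),
  Kcol_max   = c(15, 15, 30, 30, 50, 50, 15, 30, 50),
  Nrow_min   = c(60, 100, 60, 100, 60, 100, 40, 40, 40),
  Nrow_max   = c(80, 100, 80, 100, 80, 100, 60, 60, 60)
)

# Hypothetical helper (an assumption, not the CBDA-SL implementation):
# draw the subject (row) and feature (column) indices for a single job,
# with both fractions sampled uniformly inside the SSR and FSR.
draw_subsample <- function(n_rows, n_cols, p) {
  col_frac <- runif(1, p$Kcol_min, p$Kcol_max) / 100
  row_frac <- runif(1, p$Nrow_min, p$Nrow_max) / 100
  list(
    rows = sort(sample.int(n_rows, max(1, round(row_frac * n_rows)))),
    cols = sort(sample.int(n_cols, max(1, round(col_frac * n_cols))))
  )
}

# Example: one job under the Experiment 1 settings (placeholder dimensions).
set.seed(1)
s <- draw_subsample(n_rows = 200, n_cols = 66, p = experiments[1, ])
print(experiments)
print(c(n_rows_used = length(s$rows), n_cols_used = length(s$cols)))
```

Laid out this way, the 9 experiments form a 3 x 3 grid: three FSR settings (5-15, 15-30, 30-50) crossed with three SSR settings (40-60, 60-80, 100-100).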

This document has the final results, by experiment. See https://drive.google.com/file/d/0B5sz_T_1CNJQWmlsRTZEcjBEOEk/view?ths=true for general documentation of the CBDA-SL project, and GitHub https://github.com/SOCR/CBDA for some of the code [still in progress].

# # Here I load the dataset [not executed]
# ADNI_dataset = read.csv("C:/Users/simeonem/Documents/CBDA-SL/Cranium/ADNI_dataset.txt",header = TRUE)

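Features selected by both the knockoff filter and the CBDA-SL algorithm appear as spikes in the histograms for each experiment. No False Discovery Rates are reported, since we don’t have information on the “true” features. For each experiment, the table lists the top features selected by each method (the top 15 here). In each table, the CBDA and Knockoff columns give the indices of the selected features, Frequency is the number of times CBDA-SL selected the feature, and each Density column is the corresponding selection density in percent.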
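The Frequency and Density columns can be reproduced from a plain vector of selected feature indices. In the tables, the CBDA Density values are consistent with 100 x Frequency / (total number of selections); for example, a frequency of 28 at density 7.734807 in Experiment 1 corresponds to 362 selections overall. The sketch below assumes that definition. The function name `top_features()` and the toy input are mine, not part of the CBDA-SL code.

```r
# Rank features by how often they were selected and report the top entries.
# `selected` is a vector of feature indices, one entry per selection event.
top_features <- function(selected, top = 15) {
  freq <- sort(table(selected), decreasing = TRUE)
  out <- data.frame(
    Feature   = as.integer(names(freq)),
    Frequency = as.integer(freq),
    # Density: share of all selection events, in percent (assumed definition).
    Density   = 100 * as.numeric(freq) / length(selected)
  )
  head(out, top)
}

# Toy input (not real results): feature 6 selected 3 times, 4 twice, 2 and 9 once.
toy <- c(6, 6, 6, 4, 4, 2, 9)
tab <- top_features(toy, top = 15)
print(tab)
```

With the real selections from an experiment in place of `toy`, `top = 15` gives a table with the same layout as the CBDA columns in this report.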

## [1] EXPERIMENT 1
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0          5         15         60         80

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density 
##  6    28        7.734807  4       5.284801
##  4    26        7.182320  6       5.051403
##  2    12        3.314917  2       5.045846
##  5    10        2.762431  9       4.901361
##  9    10        2.762431  5       4.617949
##  12    9        2.486188 60       3.834398
##  27    9        2.486188  7       3.695471
##  66    9        2.486188 59       3.473187
##  7     8        2.209945  3       3.295360
##  26    8        2.209945  8       2.133926
##  53    8        2.209945 56       1.956099
##  13    7        1.933702 16       1.922756
##  55    7        1.933702 63       1.833843
##  60    7        1.933702 55       1.767158
##  1     6        1.657459 15       1.539316
## 
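To put a number on the agreement between the two methods in Experiment 1, the two top-15 feature lists from the table above can be intersected. This is only a small sketch: the vectors are transcribed by hand from the table, and the variable names are mine rather than part of the CBDA-SL code.

```r
# Top-15 feature indices for Experiment 1, transcribed from the table above.
cbda_top     <- c(6, 4, 2, 5, 9, 12, 27, 66, 7, 26, 53, 13, 55, 60, 1)
knockoff_top <- c(4, 6, 2, 9, 5, 60, 7, 59, 3, 8, 56, 16, 63, 55, 15)

# Features that both methods place in their top 15 (kept in CBDA-SL order).
shared <- intersect(cbda_top, knockoff_top)
print(shared)
print(length(shared))
```

For Experiment 1 this gives 8 shared features: 6, 4, 2, 5, 9, 7, 55 and 60. The same two-line comparison applies to any of the other experiments.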
## [1] EXPERIMENT 2
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0          5         15        100        100

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density 
##  4    26        7.027027  5       4.777456
##  6    26        7.027027  2       4.603634
##  2    18        4.864865  4       4.587832
##  9    11        2.972973  6       4.545694
##  29    9        2.432432  9       4.540427
##  39    9        2.432432  7       4.087437
##  3     8        2.162162 60       3.929418
##  5     8        2.162162  3       3.592310
##  21    7        1.891892 59       3.450092
##  37    7        1.891892  8       2.633658
##  41    7        1.891892 56       2.207006
##  42    7        1.891892 55       2.159600
##  64    7        1.891892 16       1.864630
##  1     6        1.621622 58       1.801422
##  16    6        1.621622 36       1.759284
## 
## [1] EXPERIMENT 3
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0         15         30         60         80

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density  
##  2    32        4.102564  2       10.005750
##  4    30        3.846154  6        9.708645
##  6    29        3.717949  4        9.569676
##  7    19        2.435897  9        8.534598
##  9    18        2.307692  5        7.221583
##  8    16        2.051282  7        5.458118
##  13   16        2.051282 60        4.331992
##  24   16        2.051282 59        3.675484
##  66   16        2.051282  3        3.445467
##  41   15        1.923077  8        2.597278
##  49   15        1.923077 36        1.408856
##  55   15        1.923077 56        1.293847
##  61   15        1.923077 16        1.231551
##  14   14        1.794872 63        1.231551
##  56   14        1.794872 15        1.126126
## 
## [1] EXPERIMENT 4
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0         15         30        100        100

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density 
##  4    31        4.116866  6       9.478456
##  6    25        3.320053  4       9.217455
##  2    24        3.187251  9       9.070928
##  7    19        2.523240  2       9.006823
##  9    18        2.390438  5       7.408764
##  11   17        2.257636  7       5.916022
##  25   16        2.124834 60       4.290489
##  41   15        1.992032  3       3.974541
##  5    14        1.859230 59       3.795961
##  35   14        1.859230  8       3.365539
##  55   14        1.859230 16       1.712533
##  57   14        1.859230 36       1.579743
##  13   13        1.726428 15       1.501900
##  22   13        1.726428 56       1.492742
##  39   13        1.726428 55       1.401163
## 
## [1] EXPERIMENT 5
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0         30         50         60         80

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density  
##  4    40        2.923977  4       14.575684
##  6    36        2.631579  2       14.456042
##  2    31        2.266082  6       14.390033
##  55   30        2.192982  9       10.932794
##  23   28        2.046784  5        8.139775
##  14   27        1.973684  7        4.509262
##  66   27        1.973684 60        2.661001
##  17   26        1.900585  8        2.636247
##  42   26        1.900585  3        2.198936
##  32   25        1.827485 59        2.087545
##  35   25        1.827485 65        1.505838
##  5    24        1.754386 36        1.097405
##  7    24        1.754386 39        1.027270
##  9    24        1.754386 16        0.899377
##  26   24        1.754386 63        0.820991
## 
## [1] EXPERIMENT 6
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0         30         50        100        100

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density   
##  4    38        2.783883  6       14.4776594
##  6    33        2.417582  2       14.2828749
##  2    31        2.271062  4       14.0483384
##  9    31        2.271062  9       12.1561457
##  51   28        2.051282  5        8.2723803
##  42   27        1.978022  7        5.5215456
##  36   24        1.758242  8        3.4146923
##  40   24        1.758242 60        2.7627604
##  43   24        1.758242  3        2.2857370
##  46   24        1.758242 59        2.0710765
##  59   24        1.758242 65        1.9080935
##  14   23        1.684982 39        1.4429957
##  16   23        1.684982 16        1.0176499
##  20   23        1.684982 36        0.9023692
##  24   23        1.684982 44        0.8586421
## 
## [1] EXPERIMENT 7
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0          5         15         40         60

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density 
##  6    27        7.417582  6       5.399143
##  2    25        6.868132  4       5.074731
##  4    19        5.219780  2       4.958869
##  9    11        3.021978  5       4.918318
##  51   11        3.021978  9       4.906732
##  26    9        2.472527 60       3.765496
##  30    9        2.472527 59       3.493222
##  32    9        2.472527  7       3.354188
##  16    8        2.197802  3       3.336809
##  41    8        2.197802  8       1.946472
##  5     7        1.923077 63       1.807438
##  20    7        1.923077 56       1.714749
##  27    7        1.923077 16       1.622060
##  1     6        1.648352 64       1.552543
##  21    6        1.648352 36       1.488819
## 
## [1] EXPERIMENT 8
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0         15         30         40         60

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density  
##  4    30        4.115226  6       10.199869
##  6    26        3.566529  2        9.746765
##  2    23        3.155007  4        9.676283
##  9    20        2.743484  9        8.226351
##  11   19        2.606310  5        7.440971
##  12   16        2.194787  7        4.344762
##  5    15        2.057613 60        4.002417
##  16   15        2.057613 59        3.483864
##  55   15        2.057613  3        3.217037
##  27   14        1.920439  8        2.205105
##  44   14        1.920439 36        1.369380
##  54   14        1.920439 63        1.293863
##  63   14        1.920439 16        1.263656
##  38   13        1.783265 56        1.263656
##  57   13        1.783265 15        1.077380
## 
## [1] EXPERIMENT 9
##          M misValperc   Kcol_min   Kcol_max   Nrow_min   Nrow_max 
##       9000          0         30         50         40         60

## [1] "TABLE with CBDA-SL & KNOCKOFF FILTER RESULTS"
##  CBDA Frequency Density  Knockoff Density   
##  4    39        2.895323  6       15.6680411
##  6    39        2.895323  2       15.3609966
##  2    34        2.524128  4       14.5012720
##  9    31        2.301411  9        9.2332661
##  11   27        2.004454  5        7.5664532
##  60   26        1.930215  7        3.9740328
##  21   25        1.855976 60        2.3730152
##  41   25        1.855976  3        2.0835161
##  18   24        1.781737  8        2.0089482
##  24   24        1.781737 59        1.9212212
##  27   24        1.781737 65        1.2237916
##  30   24        1.781737 36        1.0264058
##  7    23        1.707498 39        1.0220195
##  16   23        1.707498 16        0.9079744
##  17   23        1.707498 63        0.7500658
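Reading across the nine tables, features 2, 4 and 6 sit at or near the top of both rankings in every experiment. The sketch below checks, per experiment, whether the top three entries of each method are exactly these three features. The top-3 vectors are transcribed by hand from the tables above; the names `core` and `is_core()` are mine and serve only this illustration.

```r
# Top-3 feature indices per experiment (1 to 9), transcribed from the tables.
cbda_top3 <- list(
  c(6, 4, 2), c(4, 6, 2), c(2, 4, 6),
  c(4, 6, 2), c(4, 6, 2), c(4, 6, 2),
  c(6, 2, 4), c(4, 6, 2), c(4, 6, 2)
)
knockoff_top3 <- list(
  c(4, 6, 2), c(5, 2, 4), c(2, 6, 4),
  c(6, 4, 9), c(4, 2, 6), c(6, 2, 4),
  c(6, 4, 2), c(6, 2, 4), c(6, 2, 4)
)

# Candidate "core" set suggested by the tables.
core <- c(2, 4, 6)
is_core <- function(x) setequal(x, core)

agreement <- data.frame(
  experiment = 1:9,
  cbda       = sapply(cbda_top3, is_core),
  knockoff   = sapply(knockoff_top3, is_core)
)
print(agreement)
```

The CBDA-SL top three match the set {2, 4, 6} in all 9 experiments, and the knockoff top three match it in 7 of 9. The exceptions are Experiment 2, where feature 5 ranks first and feature 6 fourth, and Experiment 4, where feature 9 ranks third and feature 2 fourth.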