Document

The main information can be found on this page: https://www.synapse.org/#!Synapse:syn20940518/wiki/600165

Step 1: Create an Rscript to build the model

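Example of a model created. The output below shows the class of the saved object (a list), the names of its elements (one model per inhibitor), and a summary of the first model: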

## [1] "list"
##   [1] "17-AAG (Tanespimycin)"      "A-674563"                  
##   [3] "ABT-737"                    "Afatinib (BIBW-2992)"      
##   [5] "Alisertib (MLN8237)"        "AT7519"                    
##   [7] "Axitinib (AG-013736)"       "AZD1480"                   
##   [9] "Barasertib (AZD1152-HQPA)"  "Bay 11-7085"               
##  [11] "BEZ235"                     "BI-2536"                   
##  [13] "BMS-345541"                 "Bortezomib (Velcade)"      
##  [15] "Bosutinib (SKI-606)"        "Cabozantinib"              
##  [17] "Canertinib (CI-1033)"       "Cediranib (AZD2171)"       
##  [19] "CHIR-99021"                 "CI-1040 (PD184352)"        
##  [21] "Crenolanib"                 "Crizotinib (PF-2341066)"   
##  [23] "CYT387"                     "Dasatinib"                 
##  [25] "DBZ"                        "Doramapimod (BIRB 796)"    
##  [27] "Dovitinib (CHIR-258)"       "Elesclomol"                
##  [29] "Entospletinib (GS-9973)"    "Entrectinib"               
##  [31] "Erlotinib"                  "Flavopiridol"              
##  [33] "Foretinib (XL880)"          "GDC-0879"                  
##  [35] "GDC-0941"                   "Gefitinib"                 
##  [37] "Gilteritinib (ASP-2215)"    "Go6976"                    
##  [39] "GSK-1838705A"               "GSK-1904529A"              
##  [41] "GSK690693"                  "GW-2580"                   
##  [43] "Ibrutinib (PCI-32765)"      "Idelalisib"                
##  [45] "Imatinib"                   "INK-128"                   
##  [47] "JAK Inhibitor I"            "JNJ-28312141"              
##  [49] "JNJ-38877605"               "JNJ-7706621"               
##  [51] "JQ1"                        "KI20227"                   
##  [53] "KU-55933"                   "KW-2449"                   
##  [55] "Lapatinib"                  "Lenalidomide"              
##  [57] "Lenvatinib"                 "Lestaurtinib (CEP-701)"    
##  [59] "Linifanib (ABT-869)"        "Lovastatin"                
##  [61] "LY-333531"                  "Masitinib (AB-1010)"       
##  [63] "MGCD-265"                   "Midostaurin"               
##  [65] "MK-2206"                    "MLN120B"                   
##  [67] "MLN8054"                    "Motesanib (AMG-706)"       
##  [69] "Neratinib (HKI-272)"        "NF-kB Activation Inhibitor"
##  [71] "Nilotinib"                  "Nutlin 3a"                 
##  [73] "NVP-ADW742"                 "NVP-TAE684"                
##  [75] "Palbociclib"                "Panobinostat"              
##  [77] "Pazopanib (GW786034)"       "PD173955"                  
##  [79] "Pelitinib (EKB-569)"        "PHA-665752"                
##  [81] "PHT-427"                    "PI-103"                    
##  [83] "PLX-4720"                   "Ponatinib (AP24534)"       
##  [85] "PP242"                      "PRT062607"                 
##  [87] "Quizartinib (AC220)"        "RAF265 (CHIR-265)"         
##  [89] "Rapamycin"                  "Regorafenib (BAY 73-4506)" 
##  [91] "Roscovitine (CYC-202)"      "Ruxolitinib (INCB018424)"  
##  [93] "S31-201"                    "Saracatinib (AZD0530)"     
##  [95] "SB-431542"                  "Selinexor"                 
##  [97] "Selumetinib (AZD6244)"      "SGX-523"                   
##  [99] "SNS-032 (BMS-387032)"       "Sorafenib"                 
## [101] "SR9011"                     "Staurosporine"             
## [103] "STO609"                     "SU11274"                   
## [105] "Sunitinib"                  "Tandutinib (MLN518)"       
## [107] "TG100-115"                  "TG101348"                  
## [109] "Tivozanib (AV-951)"         "Tofacitinib (CP-690550)"   
## [111] "Tozasertib (VX-680)"        "Trametinib (GSK1120212)"   
## [113] "Vandetanib (ZD6474)"        "Vargetef"                  
## [115] "Vatalanib (PTK787)"         "Vemurafenib (PLX-4032)"    
## [117] "Venetoclax"                 "Vismodegib (GDC-0449)"     
## [119] "Volasertib (BI-6727)"       "VX-745"                    
## [121] "XAV-939"                    "YM-155"
## Random Forest 
## 
##  184 samples
## 1000 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 3 times) 
## Summary of sample sizes: 165, 165, 164, 165, 166, 167, ... 
## Resampling results:
## 
##   RMSE      Rsquared   MAE     
##   38.41165  0.3336447  31.06616
## 
## Tuning parameter 'mtry' was held constant at a value of 31.63858
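
The summary above suggests how each per-drug model may have been fit: a caret random forest, 10-fold cross-validation repeated 3 times, and a single fixed value of mtry. The following is a minimal sketch under those assumptions, not the author's training script. The helper train_drug_model, its arguments, and the choice of mtry as the square root of the number of predictors are hypothetical, and the data are simulated.

```r
library(caret)  # method = "rf" also needs the randomForest package installed

# Hypothetical helper (not from the original script): fit one random forest
# for one drug. `x` is a samples-by-genes data frame, `y` the AUC per sample.
train_drug_model <- function(x, y, folds = 10, repeats = 3) {
  ctrl <- trainControl(method = "repeatedcv", number = folds, repeats = repeats)
  # A one-row tuning grid keeps mtry constant instead of searching over it.
  # sqrt(number of predictors) is an assumed choice.
  grid <- data.frame(mtry = floor(sqrt(ncol(x))))
  train(x = x, y = y, method = "rf", trControl = ctrl, tuneGrid = grid)
}

# Simulated data standing in for z-scored expression and measured AUCs.
set.seed(1)
x <- as.data.frame(matrix(rnorm(40 * 9), nrow = 40,
                          dimnames = list(paste0("s", 1:40), paste0("g", 1:9))))
y <- 100 + 20 * x$g1 + rnorm(40)

# One model per drug, stored in a named list like the one printed above.
models <- list("Drug A" = train_drug_model(x, y, folds = 5, repeats = 1))
```

A list built this way could be written with saveRDS() and loaded again with readRDS() at prediction time.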

Step 2: Create an Rscript to make predictions using the model

Subchallenge 1 (Source: https://www.synapse.org/#!Synapse:syn20940518/wiki/600161)

Predict ex-vivo drug sensitivity using clinical and genomic features of tumors. Drug sensitivity is measured as the area under the dose response curve (AUC).

Assessment: Each drug will be evaluated using a Spearman correlation ([0, 1]), and the Spearman correlations will be averaged to produce a final metric. (Davidson-Pilon et al. 2019)
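
To check predictions locally before submitting, the metric described above can be approximated in base R. This is a sketch, not the official scoring code: the function name score_sc1 and the example values are made up, and it assumes both tables have the columns lab_id, inhibitor, and auc.

```r
# Hypothetical scorer: mean of per-inhibitor Spearman correlations.
# `pred` and `truth` are data frames with columns lab_id, inhibitor, auc.
score_sc1 <- function(pred, truth) {
  merged <- merge(pred, truth, by = c("lab_id", "inhibitor"),
                  suffixes = c("_pred", "_true"))
  per_drug <- sapply(split(merged, merged$inhibitor), function(d) {
    cor(d$auc_pred, d$auc_true, method = "spearman")
  })
  mean(per_drug)
}

# Made-up example: two drugs, four specimens each.
truth <- data.frame(
  lab_id    = rep(c("s1", "s2", "s3", "s4"), times = 2),
  inhibitor = rep(c("Drug A", "Drug B"), each = 4),
  auc       = c(10, 20, 30, 40, 10, 20, 30, 40)
)
pred <- truth
pred$auc <- c(1, 2, 3, 4,   # Drug A: same ranking as the truth
              1, 2, 4, 3)   # Drug B: last two specimens swapped
score <- score_sc1(pred, truth)  # 0.9, the mean of 1 and 0.8
```

Because the correlation is rank-based, only the ordering of the predicted AUCs within each drug matters, not their scale.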

Output: The Docker container must write a CSV file at /output/predictions.csv with one row per (specimen, inhibitor) pair and three columns:

  • lab_id: specimen identifier
  • inhibitor: name of the drug
  • auc: predicted AUC

The list of all inhibitors can be pulled from aucs.csv.
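
The output requirements above can be checked before building the container. The sketch below is one possible check in base R. The function check_predictions is hypothetical, and it assumes you already have the specimen identifiers and inhibitor names as character vectors (for example, taken from the input files and from aucs.csv).

```r
# Hypothetical checker for a predictions table.
# Returns counts of problems; all zeros means the table looks complete.
check_predictions <- function(pred, lab_ids, inhibitors) {
  required <- c("lab_id", "inhibitor", "auc")
  if (!all(required %in% names(pred))) {
    stop("missing columns: ",
         paste(setdiff(required, names(pred)), collapse = ", "))
  }
  # Every (specimen, inhibitor) pair that should be present.
  expected <- expand.grid(lab_id = lab_ids, inhibitor = inhibitors,
                          stringsAsFactors = FALSE)
  key <- function(d) paste(d$lab_id, d$inhibitor, sep = "\t")
  c(
    missing_pairs   = sum(!key(expected) %in% key(pred)),
    duplicate_pairs = sum(duplicated(key(pred))),
    bad_auc         = sum(!is.finite(pred$auc))
  )
}

# Made-up example: two specimens, two drugs, one missing AUC value.
lab_ids    <- c("s1", "s2")
inhibitors <- c("Drug A", "Drug B")
pred <- expand.grid(lab_id = lab_ids, inhibitor = inhibitors,
                    stringsAsFactors = FALSE)
pred$auc <- c(100, 120, 90, NA)
report <- check_predictions(pred, lab_ids, inhibitors)
```

Here report flags one bad AUC value and no missing or duplicated pairs.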

run_model_SC1.R code

#------------------------------------------------
# DREAM challenge SC1 prediction Rscript
# - Description:
#    Reads input files in /input/ folder
#    Uses models to predict drug scores
#    Outputs prediction scores in /output/ folder
# - Author: Tiago C. Silva
# - Date: 02/12/2020
#------------------------------------------------

#--------------------------------------
# Loading libraries
#--------------------------------------
library(readr, quietly = TRUE)
library(plyr, quietly = TRUE)
library(dplyr, quietly = TRUE)
library(caret, quietly = TRUE)
library(matrixStats, quietly = TRUE)

#--------------------------------------
# INPUT
#--------------------------------------
# Reading the data
input.dir <- "/input/"
input.files <- dir(input.dir,full.names = TRUE)
input.dfs <- lapply(input.files, function(f) read_csv(f,col_types = readr::cols()))
# Name each table after its file, dropping only a literal ".csv" suffix
names(input.dfs) <- gsub("\\.csv$","",basename(input.files))

# Process the input the same way as the training data
rna <- input.dfs$rnaseq[,-c(1:2)] %>% t %>% as.data.frame() 
colnames(rna) <- input.dfs$rnaseq$Symbol
rna.var.order.idx <- rna %>% as.matrix %>% colVars %>% order(decreasing = TRUE)
rna.zscore <- apply(rna, 2,scale) %>% as.data.frame()
rownames(rna.zscore) <- rownames(rna)

#--------------------------------------
# MODEL
#--------------------------------------
model <- readRDS("/usr/local/bin/modelsSC1_rna_seq_only.rds")

# Prediction: apply each drug's model to the z-scored expression data
output_df <- plyr::adply(names(model),1,function(drug){
    auc <- predict(model[[drug]], rna.zscore) 
    data.frame("lab_id" = names(auc),
               "inhibitor" = rep(drug,length(auc)), 
               "auc" = auc)
},.progress = "time",.inform = TRUE,.id = NULL)

#--------------------------------------
# Output
#--------------------------------------
output.dir <- "/output/"
write_csv(output_df, file.path(output.dir,"predictions.csv"))
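
The preprocessing in the script above (drop the two annotation columns, transpose so specimens become rows, then z-score each gene) can be tried on a toy table in base R. This is only an illustration: the values and the column name Gene are made up, and only Symbol and the position of the first two columns come from the script.

```r
# Toy stand-in for rnaseq.csv: two annotation columns, then one column per specimen.
rnaseq <- data.frame(
  Gene   = c("ENSG1", "ENSG2"),      # made-up identifiers
  Symbol = c("GENE_A", "GENE_B"),
  s1 = c(1, 10), s2 = c(2, 20), s3 = c(3, 60),
  stringsAsFactors = FALSE
)

# Specimens in rows, genes in columns.
rna <- as.data.frame(t(rnaseq[, -c(1:2)]))
colnames(rna) <- rnaseq$Symbol

# Center and scale each gene across the specimens in this input.
rna.zscore <- as.data.frame(apply(rna, 2, scale))
rownames(rna.zscore) <- rownames(rna)
```

Note that the z-scores are computed from the input specimens themselves, so they are undefined when the input holds a single specimen or a gene with no variation.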

Step 3: Create the Docker image

Your Dockerfile will create the environment you need to run the prediction model. You will need to install the required libraries, add the models to it, and create an Rscript to predict and output the data.

Observation: FROM r-base will provide R 3.6.2.

Step 4: Build, test, and send to Synapse

The submit.sh script is an example that makes it easier to:

  • build the Docker image with the command: ./submit.sh build
  • test the Docker image with the command: ./submit.sh test
  • send the Docker image to Synapse with the command: ./submit.sh send

SYNAPSE_PROJECT_ID should be modified by each student. To make the bash script executable, run chmod +x submit.sh.

Step 5: Submit the Docker image in Synapse

Select the Docker image you submitted and click on Submit Docker Repository to Challenge.

Step 6: Check results

You will receive an email reporting success or failure (with logs).

You should be able to check your results at: https://www.synapse.org/#!Synapse:syn20940518/wiki/600158

