Installation

GitHub Packages

The changes in the development branch of rapiclient have not yet been merged into the CRAN release. Until they are, install the development version of rapiclient from GitHub, together with the AnVIL Bioconductor package.

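# BiocManager::install() accepts GitHub "owner/repo" specifications
# (it relies on the remotes package for these)
library(BiocManager)
# development versions of AnVIL and rapiclient
install(c("Bioconductor/AnVIL", "Bioconductor/AnVIL_rapiclient"))
# cBioPortalData from the "apiclient" branch
install("waldronlab/cBioPortalData", ref = "apiclient")
library(cBioPortalData)
library(AnVIL)
library(rapiclient)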

API representation

Obtaining the cBioPortal API representation object

(cbio <- cBioPortal())
## service: cBioPortal
## tags(); use service$<tab completion>:
## # A tibble: 56 x 3
##    tag          operation               summary                              
##    <chr>        <chr>                   <chr>                                
##  1 Cancer Types getAllCancerTypesUsing… Get all cancer types                 
##  2 Cancer Types getCancerTypeUsingGET   Get a cancer type                    
##  3 Clinical At… fetchClinicalAttribute… Fetch clinical attributes            
##  4 Clinical At… getAllClinicalAttribut… Get all clinical attributes in the s…
##  5 Clinical At… getAllClinicalAttribut… Get all clinical attributes          
##  6 Clinical At… getClinicalAttributeCo… Get counts for clinical attributes a…
##  7 Clinical At… getClinicalAttributeIn… Get specified clinical attribute     
##  8 Clinical Da… fetchAllClinicalDataIn… Fetch clinical data by patient IDs o…
##  9 Clinical Da… fetchClinicalDataUsing… Fetch clinical data by patient IDs o…
## 10 Clinical Da… getAllClinicalDataInSt… Get all clinical data in a study     
## # … with 46 more rows
## tag values:
##   Cancer Types, Clinical Attributes, Clinical Data, Clinical Events,
##   Copy Number Segments, Discrete Copy Number Alterations, Gene
##   Panels, Genes, Molecular Data, Molecular Profiles, Mutations,
##   Patients, Sample Lists, Samples, Studies
## schemas():
##   CancerStudy, CancerStudyTags, ClinicalAttribute,
##   ClinicalAttributeCount, ClinicalAttributeCountFilter
##   # ... with 41 more elements

Operations

List the available tags, operations, and summaries as a tibble:

tags(cbio)
## # A tibble: 56 x 3
##    tag          operation               summary                              
##    <chr>        <chr>                   <chr>                                
##  1 Cancer Types getAllCancerTypesUsing… Get all cancer types                 
##  2 Cancer Types getCancerTypeUsingGET   Get a cancer type                    
##  3 Clinical At… fetchClinicalAttribute… Fetch clinical attributes            
##  4 Clinical At… getAllClinicalAttribut… Get all clinical attributes in the s…
##  5 Clinical At… getAllClinicalAttribut… Get all clinical attributes          
##  6 Clinical At… getClinicalAttributeCo… Get counts for clinical attributes a…
##  7 Clinical At… getClinicalAttributeIn… Get specified clinical attribute     
##  8 Clinical Da… fetchAllClinicalDataIn… Fetch clinical data by patient IDs o…
##  9 Clinical Da… fetchClinicalDataUsing… Fetch clinical data by patient IDs o…
## 10 Clinical Da… getAllClinicalDataInSt… Get all clinical data in a study     
## # … with 46 more rows
head(tags(cbio)$operation)
## [1] "getAllCancerTypesUsingGET"              
## [2] "getCancerTypeUsingGET"                  
## [3] "fetchClinicalAttributesUsingPOST"       
## [4] "getAllClinicalAttributesInStudyUsingGET"
## [5] "getAllClinicalAttributesUsingGET"       
## [6] "getClinicalAttributeCountsUsingPOST"
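The operation names printed above appear to end in the HTTP method they use (UsingGET, UsingPOST). The following is a minimal sketch, assuming that this naming pattern holds for the whole table, of how the method could be split off with base R. The vector is copied from the output above so that the example runs without a network connection; in a live session it would come from tags(cbio)$operation.

```r
# Operation names copied from the head(tags(cbio)$operation) output above.
ops <- c(
    "getAllCancerTypesUsingGET",
    "getCancerTypeUsingGET",
    "fetchClinicalAttributesUsingPOST",
    "getAllClinicalAttributesInStudyUsingGET",
    "getAllClinicalAttributesUsingGET",
    "getClinicalAttributeCountsUsingPOST"
)

# Assumption: everything after the last "Using" is the HTTP method.
http_method <- sub("^.*Using", "", ops)

# Count the operations per method.
method_counts <- table(http_method)
method_counts
```

Grouping operations this way can help to tell simple lookups (GET) apart from operations that expect a request body (POST).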

Studies

Get the table of available studies:

getStudies(cbio)
## # A tibble: 248 x 12
##    name  shortName description publicStudy groups status importDate
##    <chr> <chr>     <chr>       <lgl>       <chr>   <int> <chr>     
##  1 Brea… Breast (… "TCGA Brea… TRUE        PUBLIC      0 2019-02-1…
##  2 Pros… Prostate… "Copy-numb… TRUE        PUBLIC      0 2017-11-1…
##  3 Hepa… HCC (MSK… MSK-IMPACT… TRUE        ""          0 2019-03-2…
##  4 Head… Head & n… "TCGA Head… TRUE        PUBLIC      0 2019-02-1…
##  5 Colo… Colorect… Targeted s… TRUE        PUBLIC      0 2019-02-1…
##  6 Neur… NBL (Col… Whole-geno… TRUE        ""          0 2019-02-2…
##  7 Ovar… Ovarian … "Ovarian S… TRUE        PUBLI…      0 2019-01-3…
##  8 Pros… MSK-IMPA… Targeted s… TRUE        PUBLIC      0 2019-02-2…
##  9 Live… Liver (T… "TCGA Live… TRUE        PUBLIC      0 2018-10-2…
## 10 Aden… ACBC (MS… Whole exom… TRUE        ACYC;…      0 2019-03-2…
## # … with 238 more rows, and 5 more variables: allSampleCount <int>,
## #   studyId <chr>, cancerTypeId <chr>, pmid <chr>, citation <chr>
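The studyId column of this table is the identifier that the functions in the following sections take as input. As a rough sketch of how one might search the table by study name, the example below uses a toy data frame in place of the getStudies(cbio) result. The rows and the helper name findStudyIds are hypothetical; only the column names name and studyId are taken from the output above.

```r
# Toy stand-in for getStudies(cbio); every row here is hypothetical.
studies <- data.frame(
    name = c("Example adrenal study", "Example breast study",
        "Example liver study"),
    studyId = c("study_a", "study_b", "study_c"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: return the studyId values whose study name
# matches a (case-insensitive) regular expression.
findStudyIds <- function(studies, pattern) {
    hits <- grepl(pattern, studies[["name"]], ignore.case = TRUE)
    studies[["studyId"]][hits]
}

findStudyIds(studies, "breast")
```

With a live connection, the same helper could be applied to the tibble returned by getStudies(cbio).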

Clinical Data

Obtain the clinical data for a particular study:

clinicalData(cbio, "acc_tcga")
## # A tibble: 92 x 20
##    uniqueSampleKey uniquePatientKey sampleId patientId studyId CANCER_TYPE
##    <chr>           <chr>            <chr>    <chr>     <chr>   <chr>      
##  1 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
##  2 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
##  3 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
##  4 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
##  5 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
##  6 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
##  7 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
##  8 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
##  9 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 10 VENHQS1PUi1BNU… VENHQS1PUi1BNUp… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## # … with 82 more rows, and 14 more variables: CANCER_TYPE_DETAILED <chr>,
## #   DAYS_TO_COLLECTION <chr>, FRACTION_GENOME_ALTERED <chr>, IS_FFPE <chr>,
## #   MUTATION_COUNT <chr>, OCT_EMBEDDED <chr>, ONCOTREE_CODE <chr>,
## #   OTHER_SAMPLE_ID <chr>, PATHOLOGY_REPORT_FILE_NAME <chr>,
## #   PATHOLOGY_REPORT_UUID <chr>, SAMPLE_INITIAL_WEIGHT <chr>,
## #   SAMPLE_TYPE <chr>, SAMPLE_TYPE_ID <chr>, VIAL_NUMBER <chr>
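Note that every column in the output above is of type character, including columns such as MUTATION_COUNT and FRACTION_GENOME_ALTERED that look numeric by name. A minimal sketch of converting selected columns is shown below. It uses a toy data frame with hypothetical values rather than the real clinicalData(cbio, "acc_tcga") result, and the choice of columns to convert is an assumption.

```r
# Toy stand-in for the clinical data table; all values are hypothetical.
clin <- data.frame(
    sampleId = c("sample_1", "sample_2", "sample_3"),
    MUTATION_COUNT = c("12", "7", NA),
    FRACTION_GENOME_ALTERED = c("0.25", "0.5", "0.125"),
    stringsAsFactors = FALSE
)

# Columns assumed to hold numbers stored as text.
numeric_cols <- c("MUTATION_COUNT", "FRACTION_GENOME_ALTERED")

# Convert only those columns; identifiers stay as character.
clin[numeric_cols] <- lapply(clin[numeric_cols], as.numeric)
str(clin)
```

Converting by explicit column name, rather than guessing types for the whole table, keeps identifier columns such as sampleId untouched.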

Molecular Profiles

A table of molecular profiles for a particular study can be obtained by running the following:

mols <- molecularProfiles(cbio, "acc_tcga")
mols[["molecularProfileId"]]
## [1] "acc_tcga_rppa"                          
## [2] "acc_tcga_rppa_Zscores"                  
## [3] "acc_tcga_gistic"                        
## [4] "acc_tcga_rna_seq_v2_mrna"               
## [5] "acc_tcga_rna_seq_v2_mrna_median_Zscores"
## [6] "acc_tcga_linear_CNA"                    
## [7] "acc_tcga_methylation_hm450"             
## [8] "acc_tcga_mutations"
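Because the profile identifiers are plain character strings, they can be filtered with base R before requesting data. The sketch below copies the identifiers from the output above, so it runs offline; in a live session the vector would be mols[["molecularProfileId"]]. Treating the "acc_tcga_" prefix as the study ID is an assumption based on these printed values.

```r
# Profile identifiers copied from the output above.
ids <- c(
    "acc_tcga_rppa",
    "acc_tcga_rppa_Zscores",
    "acc_tcga_gistic",
    "acc_tcga_rna_seq_v2_mrna",
    "acc_tcga_rna_seq_v2_mrna_median_Zscores",
    "acc_tcga_linear_CNA",
    "acc_tcga_methylation_hm450",
    "acc_tcga_mutations"
)

# Keep only the profiles whose identifier ends in "Zscores".
zscore_ids <- grep("Zscores$", ids, value = TRUE)
zscore_ids

# Assumption: the leading "acc_tcga_" is the study ID; drop it to get
# a shorter label for each profile.
profile_suffix <- sub("^acc_tcga_", "", ids)
profile_suffix
```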

Molecular Profile Data

Data for a molecular profile can be retrieved once the Entrez gene IDs and sample IDs of interest are known:

molecularSlice(cbio, profileId = "acc_tcga_rna_seq_v2_mrna",
    entrezGeneIds = c(1, 2),
    sampleIds = c("TCGA-OR-A5J1-01",  "TCGA-OR-A5J2-01")
)
## # A tibble: 2 x 3
##   entrezGeneId `TCGA-OR-A5J1-01` `TCGA-OR-A5J2-01`
##          <int>             <dbl>             <dbl>
## 1            1              16.3              9.60
## 2            2           10374.            9845.
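The result has one row per gene and one column per sample, which is close to the layout of an expression matrix. As a sketch of one possible follow-up step, the example below rebuilds the table from the rounded values printed above (the full-precision values are not shown there) and converts it to a numeric matrix with gene IDs as row names. The log2(x + 1) transformation is an illustrative choice, not something the package applies.

```r
# Rounded values copied from the printed molecularSlice() output above.
slice <- data.frame(
    entrezGeneId = c(1L, 2L),
    `TCGA-OR-A5J1-01` = c(16.3, 10374),
    `TCGA-OR-A5J2-01` = c(9.60, 9845),
    check.names = FALSE  # keep the sample barcodes as column names
)

# Drop the ID column and use it as row names instead.
expr <- as.matrix(slice[, -1])
rownames(expr) <- slice[["entrezGeneId"]]
expr

# Illustrative transformation only: log2 with a pseudocount of 1.
log_expr <- log2(expr + 1)
log_expr
```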

All available genes

A table of all genes provided by the API service, including Hugo symbols and Entrez gene IDs, can be obtained with the geneTable function:

geneTable(cbio)
## # A tibble: 100,000 x 6
##    entrezGeneId hugoGeneSymbol       type  length cytoband chromosome
##           <int> <chr>                <chr>  <int> <chr>    <chr>     
##  1       -82128 MIR-548O-2/548O-3P   miRNA      0 <NA>     <NA>      
##  2       -82127 MIR-548O-2/548O-5P   miRNA      0 <NA>     <NA>      
##  3       -82126 MIR-219/219A-1-3P    miRNA      0 <NA>     <NA>      
##  4       -82125 MIR-219/219          miRNA      0 <NA>     <NA>      
##  5       -82124 MIR-219/219A-5P      miRNA      0 <NA>     <NA>      
##  6       -82123 MIR-219A-1/219A-1-3P miRNA      0 <NA>     <NA>      
##  7       -82122 MIR-219A-1/219       miRNA      0 <NA>     <NA>      
##  8       -82121 MIR-219A-1/219A-5P   miRNA      0 <NA>     <NA>      
##  9       -82120 MIR-3687-2/3687      miRNA      0 <NA>     <NA>      
## 10       -82119 MIR-1273G/1273G-3P   miRNA      0 <NA>     <NA>      
## # … with 99,990 more rows

The geneTable function uses the getAllGenesUsingGET operation from the API.
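Since molecularSlice takes Entrez gene IDs, the gene table can serve as a lookup from Hugo symbols to IDs. The sketch below uses the first four rows of the output above as a stand-in for the full geneTable(cbio) result; the helper name symbolToEntrez is hypothetical and not part of the package.

```r
# First rows copied from the geneTable(cbio) output above.
genes <- data.frame(
    entrezGeneId = c(-82128L, -82127L, -82126L, -82125L),
    hugoGeneSymbol = c("MIR-548O-2/548O-3P", "MIR-548O-2/548O-5P",
        "MIR-219/219A-1-3P", "MIR-219/219"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: look up Entrez IDs for a vector of Hugo symbols.
# Symbols that are not in the table map to NA.
symbolToEntrez <- function(genes, symbols) {
    idx <- match(symbols, genes[["hugoGeneSymbol"]])
    setNames(genes[["entrezGeneId"]][idx], symbols)
}

res <- symbolToEntrez(genes, c("MIR-219/219", "NOT_A_GENE"))
res
```

Returning NA for unknown symbols, rather than dropping them, keeps the result aligned with the input vector.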

Samples

Sample List Identifiers

To display all available sample list identifiers for a particular study ID, one can use the sampleLists function:

sampleLists(cbio, "acc_tcga")
## [1] "acc_tcga_methylation_hm450" "acc_tcga_methylation_all"  
## [3] "acc_tcga_sequenced"         "acc_tcga_cnaseq"           
## [5] "acc_tcga_cna"               "acc_tcga_rna_seq_v2_mrna"  
## [7] "acc_tcga_all"               "acc_tcga_3way_complete"    
## [9] "acc_tcga_rppa"

Sample Identifiers

One can obtain the barcodes (identifiers) of the samples in a specific sample list. In this case, we want all the copy number alteration samples:

samplesInSampleLists(cbio, "acc_tcga_cna")
## CharacterList of length 1
## [["acc_tcga_cna"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01

The function returns a CharacterList with one element per input sample list identifier, so several sample lists can be requested at once:

samplesInSampleLists(cbio, c("acc_tcga_cna", "acc_tcga_cnaseq"))
## CharacterList of length 2
## [["acc_tcga_cna"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01
## [["acc_tcga_cnaseq"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01
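The sample identifiers shown above follow the TCGA barcode layout. As a minimal sketch, assuming that the first 12 characters identify the patient and that the two digits after the third hyphen encode the sample type (both are conventions of TCGA barcodes, not something the API output states), the parts can be extracted with substr. Only the three barcodes visible in the output are used.

```r
# The three barcodes visible in the output above; the elided ones are omitted.
sample_ids <- c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01", "TCGA-PK-A5HC-01")

# Assumption: characters 1-12 form the patient-level barcode.
patient_ids <- substr(sample_ids, 1, 12)
patient_ids

# Assumption: characters 14-15 are the sample type code.
sample_type <- substr(sample_ids, 14, 15)
sample_type
```

Patient-level identifiers derived this way could be compared with the patientId column of the clinical data table.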

sessionInfo

sessionInfo()
## R Under development (unstable) (2019-03-04 r76198)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] rapiclient_0.1.2.0002-3     cBioPortalData_0.99.52     
##  [3] MultiAssayExperiment_1.9.18 SummarizedExperiment_1.13.0
##  [5] DelayedArray_0.9.9          BiocParallel_1.17.19       
##  [7] matrixStats_0.54.0          Biobase_2.43.1             
##  [9] GenomicRanges_1.35.1        GenomeInfoDb_1.19.3        
## [11] IRanges_2.17.5              S4Vectors_0.21.24          
## [13] BiocGenerics_0.29.2         AnVIL_0.0.11               
## [15] dplyr_0.8.0.1               BiocStyle_2.11.0           
## 
## loaded via a namespace (and not attached):
##  [1] httr_1.4.0               tidyr_0.8.3             
##  [3] bit64_0.9-7              jsonlite_1.6            
##  [5] splines_3.6.0            assertthat_0.2.1        
##  [7] TCGAutils_1.3.33         BiocManager_1.30.4      
##  [9] BiocFileCache_1.7.10     blob_1.1.1              
## [11] Rsamtools_1.99.6         GenomeInfoDbData_1.2.1  
## [13] RTCGAToolbox_2.13.10     progress_1.2.0          
## [15] yaml_2.2.0               pillar_1.3.1            
## [17] RSQLite_2.1.1            lattice_0.20-38         
## [19] glue_1.3.1               limma_3.39.19           
## [21] digest_0.6.18            XVector_0.23.2          
## [23] rvest_0.3.3              htmltools_0.3.6         
## [25] Matrix_1.2-17            XML_3.98-1.19           
## [27] pkgconfig_2.0.2          biomaRt_2.39.4          
## [29] bookdown_0.9             zlibbioc_1.29.0         
## [31] purrr_0.3.2              RCircos_1.2.1           
## [33] tibble_2.1.1             GenomicFeatures_1.35.11 
## [35] cli_1.1.0                survival_2.44-1.1       
## [37] RJSONIO_1.3-1.1          magrittr_1.5            
## [39] crayon_1.3.4             memoise_1.1.0           
## [41] evaluate_0.13            fansi_0.4.0             
## [43] xml2_1.2.0               prettyunits_1.0.2       
## [45] tools_3.6.0              data.table_1.12.2       
## [47] hms_0.4.2                formatR_1.6             
## [49] stringr_1.4.0            Biostrings_2.51.5       
## [51] AnnotationDbi_1.45.1     lambda.r_1.2.3          
## [53] compiler_3.6.0           rlang_0.3.4             
## [55] GenomicDataCommons_1.7.3 futile.logger_1.4.3     
## [57] grid_3.6.0               RCurl_1.95-4.12         
## [59] rappdirs_0.3.1           bitops_1.0-6            
## [61] rmarkdown_1.12           DBI_1.0.0               
## [63] curl_3.3                 R6_2.4.0                
## [65] GenomicAlignments_1.19.1 rtracklayer_1.43.4      
## [67] knitr_1.22               utf8_1.1.4              
## [69] bit_1.1-14               futile.options_1.0.1    
## [71] readr_1.3.1              stringi_1.4.3           
## [73] RaggedExperiment_1.7.5   Rcpp_1.0.1              
## [75] dbplyr_1.4.0             tidyselect_0.2.5        
## [77] xfun_0.6