Changes in the development branch of rapiclient have yet to be merged into the
CRAN package. Please use the devel version of rapiclient along with the AnVIL
Bioconductor package.
library(BiocManager)
install(c("Bioconductor/AnVIL", "Bioconductor/AnVIL_rapiclient"))
install("waldronlab/cBioPortalData", ref = "apiclient")
library(cBioPortalData)
library(AnVIL)
library(rapiclient)
Obtaining the cBioPortal API representation object
(cbio <- cBioPortal())
## service: cBioPortal
## tags(); use service$<tab completion>:
## # A tibble: 56 x 3
## tag operation summary
## <chr> <chr> <chr>
## 1 Cancer Types getAllCancerTypesUsing… Get all cancer types
## 2 Cancer Types getCancerTypeUsingGET Get a cancer type
## 3 Clinical At… fetchClinicalAttribute… Fetch clinical attributes
## 4 Clinical At… getAllClinicalAttribut… Get all clinical attributes in the s…
## 5 Clinical At… getAllClinicalAttribut… Get all clinical attributes
## 6 Clinical At… getClinicalAttributeCo… Get counts for clinical attributes a…
## 7 Clinical At… getClinicalAttributeIn… Get specified clinical attribute
## 8 Clinical Da… fetchAllClinicalDataIn… Fetch clinical data by patient IDs o…
## 9 Clinical Da… fetchClinicalDataUsing… Fetch clinical data by patient IDs o…
## 10 Clinical Da… getAllClinicalDataInSt… Get all clinical data in a study
## # … with 46 more rows
## tag values:
## Cancer Types, Clinical Attributes, Clinical Data, Clinical Events,
## Copy Number Segments, Discrete Copy Number Alterations, Gene
## Panels, Genes, Molecular Data, Molecular Profiles, Mutations,
## Patients, Sample Lists, Samples, Studies
## schemas():
## CancerStudy, CancerStudyTags, ClinicalAttribute,
## ClinicalAttributeCount, ClinicalAttributeCountFilter
## # ... with 41 more elements
Check available tags, operations, and descriptions as a tibble:
tags(cbio)
## # A tibble: 56 x 3
## tag operation summary
## <chr> <chr> <chr>
## 1 Cancer Types getAllCancerTypesUsing… Get all cancer types
## 2 Cancer Types getCancerTypeUsingGET Get a cancer type
## 3 Clinical At… fetchClinicalAttribute… Fetch clinical attributes
## 4 Clinical At… getAllClinicalAttribut… Get all clinical attributes in the s…
## 5 Clinical At… getAllClinicalAttribut… Get all clinical attributes
## 6 Clinical At… getClinicalAttributeCo… Get counts for clinical attributes a…
## 7 Clinical At… getClinicalAttributeIn… Get specified clinical attribute
## 8 Clinical Da… fetchAllClinicalDataIn… Fetch clinical data by patient IDs o…
## 9 Clinical Da… fetchClinicalDataUsing… Fetch clinical data by patient IDs o…
## 10 Clinical Da… getAllClinicalDataInSt… Get all clinical data in a study
## # … with 46 more rows
head(tags(cbio)$operation)
## [1] "getAllCancerTypesUsingGET"
## [2] "getCancerTypeUsingGET"
## [3] "fetchClinicalAttributesUsingPOST"
## [4] "getAllClinicalAttributesInStudyUsingGET"
## [5] "getAllClinicalAttributesUsingGET"
## [6] "getClinicalAttributeCountsUsingPOST"
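Since `tags()` returns an ordinary tibble, the operations can be narrowed to a single tag with standard subsetting. The sketch below makes no API call: it rebuilds the first six rows shown above as a plain data.frame (an offline stand-in, with the `summary` column left out) and pulls the operations filed under `Clinical Attributes`. The object names `tag_tbl`, `clin_ops`, and `ops_per_tag` are illustrative only.

```r
# Offline stand-in for the first six rows of tags(cbio); no API call is made.
tag_tbl <- data.frame(
  tag = c(rep("Cancer Types", 2), rep("Clinical Attributes", 4)),
  operation = c(
    "getAllCancerTypesUsingGET",
    "getCancerTypeUsingGET",
    "fetchClinicalAttributesUsingPOST",
    "getAllClinicalAttributesInStudyUsingGET",
    "getAllClinicalAttributesUsingGET",
    "getClinicalAttributeCountsUsingPOST"
  ),
  stringsAsFactors = FALSE
)

# Keep only the operations filed under one tag.
clin_ops <- tag_tbl$operation[tag_tbl$tag == "Clinical Attributes"]
clin_ops

# Count how many operations each tag has.
ops_per_tag <- table(tag_tbl$tag)
ops_per_tag
```

With a live connection, the same subsetting should apply to the result of `tags(cbio)` directly.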
Get the list of studies available:
getStudies(cbio)
## # A tibble: 248 x 12
## name shortName description publicStudy groups status importDate
## <chr> <chr> <chr> <lgl> <chr> <int> <chr>
## 1 Brea… Breast (… "TCGA Brea… TRUE PUBLIC 0 2019-02-1…
## 2 Pros… Prostate… "Copy-numb… TRUE PUBLIC 0 2017-11-1…
## 3 Hepa… HCC (MSK… MSK-IMPACT… TRUE "" 0 2019-03-2…
## 4 Head… Head & n… "TCGA Head… TRUE PUBLIC 0 2019-02-1…
## 5 Colo… Colorect… Targeted s… TRUE PUBLIC 0 2019-02-1…
## 6 Neur… NBL (Col… Whole-geno… TRUE "" 0 2019-02-2…
## 7 Ovar… Ovarian … "Ovarian S… TRUE PUBLI… 0 2019-01-3…
## 8 Pros… MSK-IMPA… Targeted s… TRUE PUBLIC 0 2019-02-2…
## 9 Live… Liver (T… "TCGA Live… TRUE PUBLIC 0 2018-10-2…
## 10 Aden… ACBC (MS… Whole exom… TRUE ACYC;… 0 2019-03-2…
## # … with 238 more rows, and 5 more variables: allSampleCount <int>,
## # studyId <chr>, cancerTypeId <chr>, pmid <chr>, citation <chr>
Obtain the clinical data for a particular study:
clinicalData(cbio, "acc_tcga")
## # A tibble: 92 x 20
## uniqueSampleKey uniquePatientKey sampleId patientId studyId CANCER_TYPE
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 2 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 3 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 4 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 5 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 6 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 7 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 8 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 9 VENHQS1PUi1BNU… VENHQS1PUi1BNUo… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## 10 VENHQS1PUi1BNU… VENHQS1PUi1BNUp… TCGA-OR… TCGA-OR-… acc_tc… Adrenocort…
## # … with 82 more rows, and 14 more variables: CANCER_TYPE_DETAILED <chr>,
## # DAYS_TO_COLLECTION <chr>, FRACTION_GENOME_ALTERED <chr>, IS_FFPE <chr>,
## # MUTATION_COUNT <chr>, OCT_EMBEDDED <chr>, ONCOTREE_CODE <chr>,
## # OTHER_SAMPLE_ID <chr>, PATHOLOGY_REPORT_FILE_NAME <chr>,
## # PATHOLOGY_REPORT_UUID <chr>, SAMPLE_INITIAL_WEIGHT <chr>,
## # SAMPLE_TYPE <chr>, SAMPLE_TYPE_ID <chr>, VIAL_NUMBER <chr>
A table of molecular profiles for a particular study can be obtained by running the following:
mols <- molecularProfiles(cbio, "acc_tcga")
mols[["molecularProfileId"]]
## [1] "acc_tcga_rppa"
## [2] "acc_tcga_rppa_Zscores"
## [3] "acc_tcga_gistic"
## [4] "acc_tcga_rna_seq_v2_mrna"
## [5] "acc_tcga_rna_seq_v2_mrna_median_Zscores"
## [6] "acc_tcga_linear_CNA"
## [7] "acc_tcga_methylation_hm450"
## [8] "acc_tcga_mutations"
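The profile identifiers follow a readable naming pattern, so a subset can be picked out by pattern matching before requesting data. This is a minimal sketch on the eight identifiers printed above, copied into a character vector so that it runs without the API; the variable names are illustrative.

```r
# Profile IDs copied from the output above; no API call is made.
profile_ids <- c(
  "acc_tcga_rppa",
  "acc_tcga_rppa_Zscores",
  "acc_tcga_gistic",
  "acc_tcga_rna_seq_v2_mrna",
  "acc_tcga_rna_seq_v2_mrna_median_Zscores",
  "acc_tcga_linear_CNA",
  "acc_tcga_methylation_hm450",
  "acc_tcga_mutations"
)

# Profiles whose ID ends in "Zscores".
zscore_ids <- grep("Zscores$", profile_ids, value = TRUE)
zscore_ids

# Profiles whose ID contains the literal substring "rna_seq".
rnaseq_ids <- grep("rna_seq", profile_ids, fixed = TRUE, value = TRUE)
rnaseq_ids
```

In a live session the vector would come from `mols[["molecularProfileId"]]` instead of being typed in.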
The data for a molecular profile can be obtained with prior knowledge of
available entrezGeneIds:
molecularSlice(cbio, profileId = "acc_tcga_rna_seq_v2_mrna",
entrezGeneIds = c(1, 2),
sampleIds = c("TCGA-OR-A5J1-01", "TCGA-OR-A5J2-01")
)
## # A tibble: 2 x 3
## entrezGeneId `TCGA-OR-A5J1-01` `TCGA-OR-A5J2-01`
## <int> <dbl> <dbl>
## 1 1 16.3 9.60
## 2 2 10374. 9845.
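The result has one row per gene and one column per sample, so it converts readily to a numeric matrix for downstream work. The sketch below is an offline stand-in: it retypes the table above using the rounded values as printed (a real call returns full-precision numbers) and moves the gene identifiers into the row names. The names `slice` and `expr` are illustrative.

```r
# Stand-in for the molecularSlice() result above, using the rounded values
# exactly as printed; no API call is made.
slice <- data.frame(
  entrezGeneId = c(1L, 2L),
  `TCGA-OR-A5J1-01` = c(16.3, 10374),
  `TCGA-OR-A5J2-01` = c(9.60, 9845),
  check.names = FALSE  # keep the hyphenated sample barcodes as column names
)

# Drop the ID column and use it as row names: a genes-by-samples matrix.
expr <- as.matrix(slice[, -1])
rownames(expr) <- slice$entrezGeneId
expr

# Values can then be looked up by gene ID and sample barcode.
expr["2", "TCGA-OR-A5J1-01"]
```

Setting `check.names = FALSE` matters here because R would otherwise rewrite the hyphens in the barcodes as dots.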
A list of all the genes provided by the API service, including Hugo symbols
and Entrez gene IDs, can be obtained with the geneTable function:
geneTable(cbio)
## # A tibble: 100,000 x 6
## entrezGeneId hugoGeneSymbol type length cytoband chromosome
## <int> <chr> <chr> <int> <chr> <chr>
## 1 -82128 MIR-548O-2/548O-3P miRNA 0 <NA> <NA>
## 2 -82127 MIR-548O-2/548O-5P miRNA 0 <NA> <NA>
## 3 -82126 MIR-219/219A-1-3P miRNA 0 <NA> <NA>
## 4 -82125 MIR-219/219 miRNA 0 <NA> <NA>
## 5 -82124 MIR-219/219A-5P miRNA 0 <NA> <NA>
## 6 -82123 MIR-219A-1/219A-1-3P miRNA 0 <NA> <NA>
## 7 -82122 MIR-219A-1/219 miRNA 0 <NA> <NA>
## 8 -82121 MIR-219A-1/219A-5P miRNA 0 <NA> <NA>
## 9 -82120 MIR-3687-2/3687 miRNA 0 <NA> <NA>
## 10 -82119 MIR-1273G/1273G-3P miRNA 0 <NA> <NA>
## # … with 99,990 more rows
geneTable uses the getAllGenesUsingGET operation from the API.
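One use for the gene table is translating Hugo symbols into the Entrez gene IDs that `molecularSlice` expects. The sketch below works on a three-row toy table, not on the real result. The miRNA row is copied from the output above; the `A1BG` and `A2M` rows are not shown in that output and are assumed here from the NCBI Entrez Gene assignments for IDs 1 and 2.

```r
# Toy stand-in for geneTable(cbio) with only the two columns needed here.
# The A1BG and A2M rows are assumptions (NCBI Entrez Gene IDs 1 and 2),
# not rows taken from the printed output.
genes <- data.frame(
  entrezGeneId = c(-82128L, 1L, 2L),
  hugoGeneSymbol = c("MIR-548O-2/548O-3P", "A1BG", "A2M"),
  stringsAsFactors = FALSE
)

# Symbols to translate; match() preserves the order of the request.
wanted <- c("A2M", "A1BG")
ids <- genes$entrezGeneId[match(wanted, genes$hugoGeneSymbol)]
ids

# A symbol missing from the table yields NA rather than an error.
missing_id <- genes$entrezGeneId[match("NOT_A_GENE", genes$hugoGeneSymbol)]
missing_id
```

Checking the result for `NA` before passing it on avoids sending unknown genes to the service.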
To display all available sample list identifiers for a particular study ID,
one can use the sampleLists function:
sampleLists(cbio, "acc_tcga")
## [1] "acc_tcga_methylation_hm450" "acc_tcga_methylation_all"
## [3] "acc_tcga_sequenced" "acc_tcga_cnaseq"
## [5] "acc_tcga_cna" "acc_tcga_rna_seq_v2_mrna"
## [7] "acc_tcga_all" "acc_tcga_3way_complete"
## [9] "acc_tcga_rppa"
One can obtain the barcodes (identifiers) for each sample using a specific sample list identifier. In this case, we want all the copy number alteration samples:
samplesInSampleLists(cbio, "acc_tcga_cna")
## CharacterList of length 1
## [["acc_tcga_cna"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01
The function returns a CharacterList with one element per sample list
identifier supplied:
samplesInSampleLists(cbio, c("acc_tcga_cna", "acc_tcga_cnaseq"))
## CharacterList of length 2
## [["acc_tcga_cna"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01
## [["acc_tcga_cnaseq"]] TCGA-OR-A5J1-01 TCGA-OR-A5J2-01 ... TCGA-PK-A5HC-01
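A list-like result with one element per sample list makes it easy to compare lists with set operations. The sketch below uses a plain named list with hypothetical placeholder IDs (`S1` to `S4`) and hypothetical list names, since the full barcode vectors are elided in the output above. It assumes the real `CharacterList` can be turned into such a list with `as.list()`.

```r
# Hypothetical stand-in for a two-element result; the IDs and their
# membership are invented placeholders, not real sample barcodes.
sample_lists <- list(
  list_a = c("S1", "S2", "S3"),
  list_b = c("S2", "S3", "S4")
)

# Number of samples in each list.
lengths(sample_lists)

# Samples present in every list.
shared <- Reduce(intersect, sample_lists)
shared

# Samples found in the first list but not in the second.
only_a <- setdiff(sample_lists$list_a, sample_lists$list_b)
only_a
```

`Reduce(intersect, ...)` generalizes to any number of sample lists, which is why it is used here instead of a single `intersect()` call.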
sessionInfo()
## R Under development (unstable) (2019-03-04 r76198)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] rapiclient_0.1.2.0002-3 cBioPortalData_0.99.52
## [3] MultiAssayExperiment_1.9.18 SummarizedExperiment_1.13.0
## [5] DelayedArray_0.9.9 BiocParallel_1.17.19
## [7] matrixStats_0.54.0 Biobase_2.43.1
## [9] GenomicRanges_1.35.1 GenomeInfoDb_1.19.3
## [11] IRanges_2.17.5 S4Vectors_0.21.24
## [13] BiocGenerics_0.29.2 AnVIL_0.0.11
## [15] dplyr_0.8.0.1 BiocStyle_2.11.0
##
## loaded via a namespace (and not attached):
## [1] httr_1.4.0 tidyr_0.8.3
## [3] bit64_0.9-7 jsonlite_1.6
## [5] splines_3.6.0 assertthat_0.2.1
## [7] TCGAutils_1.3.33 BiocManager_1.30.4
## [9] BiocFileCache_1.7.10 blob_1.1.1
## [11] Rsamtools_1.99.6 GenomeInfoDbData_1.2.1
## [13] RTCGAToolbox_2.13.10 progress_1.2.0
## [15] yaml_2.2.0 pillar_1.3.1
## [17] RSQLite_2.1.1 lattice_0.20-38
## [19] glue_1.3.1 limma_3.39.19
## [21] digest_0.6.18 XVector_0.23.2
## [23] rvest_0.3.3 htmltools_0.3.6
## [25] Matrix_1.2-17 XML_3.98-1.19
## [27] pkgconfig_2.0.2 biomaRt_2.39.4
## [29] bookdown_0.9 zlibbioc_1.29.0
## [31] purrr_0.3.2 RCircos_1.2.1
## [33] tibble_2.1.1 GenomicFeatures_1.35.11
## [35] cli_1.1.0 survival_2.44-1.1
## [37] RJSONIO_1.3-1.1 magrittr_1.5
## [39] crayon_1.3.4 memoise_1.1.0
## [41] evaluate_0.13 fansi_0.4.0
## [43] xml2_1.2.0 prettyunits_1.0.2
## [45] tools_3.6.0 data.table_1.12.2
## [47] hms_0.4.2 formatR_1.6
## [49] stringr_1.4.0 Biostrings_2.51.5
## [51] AnnotationDbi_1.45.1 lambda.r_1.2.3
## [53] compiler_3.6.0 rlang_0.3.4
## [55] GenomicDataCommons_1.7.3 futile.logger_1.4.3
## [57] grid_3.6.0 RCurl_1.95-4.12
## [59] rappdirs_0.3.1 bitops_1.0-6
## [61] rmarkdown_1.12 DBI_1.0.0
## [63] curl_3.3 R6_2.4.0
## [65] GenomicAlignments_1.19.1 rtracklayer_1.43.4
## [67] knitr_1.22 utf8_1.1.4
## [69] bit_1.1-14 futile.options_1.0.1
## [71] readr_1.3.1 stringi_1.4.3
## [73] RaggedExperiment_1.7.5 Rcpp_1.0.1
## [75] dbplyr_1.4.0 tidyselect_0.2.5
## [77] xfun_0.6