1 Setup

This vignette runs the same analysis as TCGA_5studies.Rmd, but with a PCAmodel that has fewer clusters.

1.1 Load packages

suppressPackageStartupMessages({
  library(PCAGenomicSignatures)
  library(dplyr)
  library(ggplot2)
})

1.2 TCGA datasets

load("data/TCGA_validationDatasets.rda")
datasets <- TCGA_validationDatasets[1:5]

1.3 PCAmodel

This PCAmodel was built from the same 536 studies as the default model, but only the top 10 PCs were collected for model building.

PCAmodel
## class: PCAGenomicSignatures 
## dim: 13934 2382 
## metadata(6): cluster size ... MeSH_freq updateNote
## assays(1): model
## rownames(13934): CASKIN1 DDX3Y ... CTC-457E21.9 AC007966.1
## rowData names(0):
## colnames(2382): RAV1 RAV2 ... RAV2381 RAV2382
## colData names(4): RAV studies silhouetteWidth gsea
## trainingData(2): PCAsummary MeSH
## trainingData names(536): DRP000987 SRP059172 ... SRP166108 SRP188526
updateNote(PCAmodel)
## [1] "536 refine.bio studies/ use top 10 PCs/ top 90% varying genes"
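
To make "top 10 PCs" concrete, the sketch below extracts the first 10 principal components from a single simulated expression matrix with base R's `prcomp`. The matrix `toy_expr`, its dimensions, and all variable names are illustrative assumptions. This is not the package's model-building pipeline, only a rough picture of the per-study step that the update note refers to.

```r
# Illustrative only: simulated data, not one of the 536 training studies.
set.seed(1)
n_genes   <- 200   # hypothetical number of genes
n_samples <- 30    # hypothetical number of samples
toy_expr <- matrix(
  rnorm(n_genes * n_samples),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)

# prcomp() expects observations in rows, so transpose to samples x genes.
pca <- prcomp(t(toy_expr), center = TRUE, scale. = FALSE)

# Keep the gene loadings of the first 10 PCs (a genes x 10 matrix).
n_pcs <- 10
top_loadings <- pca$rotation[, seq_len(n_pcs)]
dim(top_loadings)

# If 10 PCs are kept per study, 536 studies contribute this many PCs in total.
total_pcs <- 536 * n_pcs
total_pcs
```

Keeping fewer PCs per study shrinks the pool of loadings that go into clustering, which is consistent with this model having fewer RAVs than the default one.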

2 Multi-comparison (only TCGA)

2.1 heatmapTable (all)

# This step takes a little time because of the size of the datasets.
val_all <- validate(datasets, PCAmodel)
heatmapTable(val_all, scoreCutoff = 0.68) 

RAV117 and RAV435 appear to be specific to BRCA, while RAV585 is strongly associated with colon/rectal cancers. The signal is subtle, but RAV345 and RAV655 are potentially associated with both BRCA and UCEC (uterine cancer).
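
As a rough illustration of what a score cutoff does, the base R sketch below filters a small matrix of validation scores, keeping only RAVs that reach the cutoff in at least one dataset. The matrix `toy_scores`, its orientation (RAVs in rows, datasets in columns), and its values are made up for illustration. How `heatmapTable` applies `scoreCutoff` internally is an assumption here, not something taken from the package source.

```r
# Hypothetical validation scores (RAVs in rows, datasets in columns).
# These numbers are made up and are not the results shown in this vignette.
toy_scores <- matrix(
  c(0.75, 0.10, 0.12,
    0.20, 0.71, 0.69,
    0.30, 0.25, 0.40),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("RAV_a", "RAV_b", "RAV_c"),
                  c("BRCA", "COAD", "READ"))
)

score_cutoff <- 0.68

# Keep a RAV if its score reaches the cutoff in at least one dataset.
keep <- apply(toy_scores, 1, function(x) any(x >= score_cutoff))
filtered <- toy_scores[keep, , drop = FALSE]
filtered
```

In this toy example `RAV_c` is dropped because none of its scores reaches 0.68. Lowering the cutoff lets more RAVs into the table, at the price of weaker associations.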

2.2 heatmapTable - COAD

val_coad <- validate(datasets[["COAD"]], PCAmodel)
heatmapTable(val_coad) 

2.3 heatmapTable - READ

val_read <- validate(datasets[["READ"]], PCAmodel)
heatmapTable(val_read) 

2.4 heatmapTable - BRCA

val_brca <- validate(datasets[["BRCA"]], PCAmodel)
heatmapTable(val_brca)

heatmapTable(val_brca, num.out = 7) 

2.5 MeSH terms and associated studies

2.5.1 BRCA-associated

RAV117 and RAV435 are specific to BRCA.

drawWordcloud(PCAmodel, 117)

drawWordcloud(PCAmodel, 435)

ind <- 117
findStudiesInCluster(PCAmodel, ind, studyTitle = TRUE)
##      studyName
## 279  ERP016798
## 977  SRP023262
## 4275 SRP111343
##                                                                                                      title
## 279                                              Whole transcriptome profiling of 63 breast cancer tumours
## 977  A shared transcriptional program in early breast neoplasias despite genetic and clinical distinctions
## 4275                             RNAseq analysis of chemotherapy and radiation therapy-naïve breast tumors
subsetEnrichedPathways(PCAmodel, ind) %>% as.data.frame
##                                                RAV117
## Up_1                      SMID_BREAST_CANCER_BASAL_DN
## Up_2                  SMID_BREAST_CANCER_LUMINAL_B_UP
## Up_3                   VANTVEER_BREAST_CANCER_ESR1_UP
## Up_4                      DOANE_BREAST_CANCER_ESR1_UP
## Up_5   LIEN_BREAST_CARCINOMA_METAPLASTIC_VS_DUCTAL_DN
## Up_6        CHARAFE_BREAST_CANCER_LUMINAL_VS_BASAL_UP
## Up_7            SMID_BREAST_CANCER_RELAPSE_IN_BONE_UP
## Up_8  CHARAFE_BREAST_CANCER_LUMINAL_VS_MESENCHYMAL_UP
## Up_9           SMID_BREAST_CANCER_RELAPSE_IN_BRAIN_DN
## Up_10                 POOLA_INVASIVE_BREAST_CANCER_DN

ind <- 435
findStudiesInCluster(PCAmodel, ind, studyTitle = TRUE)
##      studyName
## 773  SRP014428
## 4275 SRP111343
## 5720 SRP158730
## 5843 SRP163173
## 5960 SRP169094
##                                                                                                    title
## 773           Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells
## 4275                           RNAseq analysis of chemotherapy and radiation therapy-naïve breast tumors
## 5720                 Separation of breast cancer and organ microenvironment transcriptomes in metastases
## 5843                                Integrative epigenetic taxonomy of primary prostate cancer [RNA-Seq]
## 5960 On-Treatment Biomarkers Improve Prediction of Response to Neoadjuvant Chemotherapy in Breast Cancer
subsetEnrichedPathways(PCAmodel, ind) %>% as.data.frame
##                                                RAV435
## Up_1  CHARAFE_BREAST_CANCER_LUMINAL_VS_MESENCHYMAL_UP
## Up_2   LIEN_BREAST_CARCINOMA_METAPLASTIC_VS_DUCTAL_DN
## Up_3                      DOANE_BREAST_CANCER_ESR1_UP
## Up_4        CHARAFE_BREAST_CANCER_LUMINAL_VS_BASAL_UP
## Up_5                         LIM_MAMMARY_STEM_CELL_DN
## Up_6      ROSTY_CERVICAL_CANCER_PROLIFERATION_CLUSTER
## Up_7                  COLDREN_GEFITINIB_RESISTANCE_DN
## Up_8                   DOANE_BREAST_CANCER_CLASSES_UP
## Up_9                   VANTVEER_BREAST_CANCER_ESR1_UP
## Up_10           SMID_BREAST_CANCER_RELAPSE_IN_BONE_UP

2.5.2 COAD/READ-associated

ind <- 585
findStudiesInCluster(PCAmodel, ind, studyTitle = TRUE)
##      studyName
## 1074 SRP029880
## 3037 SRP077046
## 5411 SRP149847
##                                                                                                                                                                                                          title
## 1074                                                                                                                                           Gene expression profiling study by RNA-seq in colorectal cancer
## 3037 A functional genomics predictive network model identifies regulators of inflammatory bowel disease: Mount Sinai Hospital (MSH) Population Specimen Collection and Profiling of Inflammatory Bowel Disease
## 5411                                                                               Differences in tissue immune cell populations following hematopoietic stem cell transplantation in Crohn's disease patients
subsetEnrichedPathways(PCAmodel, ind) %>% as.data.frame
##                                                        RAV585
## Up_1                 SCHUETZ_BREAST_CANCER_DUCTAL_INVASIVE_UP
## Up_2               VECCHI_GASTRIC_CANCER_ADVANCED_VS_EARLY_UP
## Up_3                                 LIM_MAMMARY_STEM_CELL_UP
## Up_4          CHARAFE_BREAST_CANCER_LUMINAL_VS_MESENCHYMAL_DN
## Up_5           ANASTASSIOU_MULTICANCER_INVASIVENESS_SIGNATURE
## Up_6                       LINDGREN_BLADDER_CANCER_CLUSTER_2B
## Up_7                 PICCALUGA_ANGIOIMMUNOBLASTIC_LYMPHOMA_UP
## Up_8                        SMID_BREAST_CANCER_NORMAL_LIKE_UP
## Up_9                               MCLACHLAN_DENTAL_CARIES_UP
## Up_10 TURASHVILI_BREAST_LOBULAR_CARCINOMA_VS_DUCTAL_NORMAL_UP
drawWordcloud(PCAmodel, ind)

3 Multi-comparison (TCGA + others)

3.1 heatmapTable (all)

Here, we added the SLE-WB microarray dataset and 4 colon cancer microarray datasets to the 5 TCGA datasets, and set scoreCutoff to 0.68 instead of the default 0.7.

## Warning: NaNs produced

## Warning: NaNs produced
names(new_datasets)
##  [1] "COAD"     "BRCA"     "LUAD"     "READ"     "UCEC"     "SLE"     
##  [7] "GSE14095" "GSE17536" "GSE2109"  "GSE39582"

Based on this multi-dataset validation table:
- RAV13 and RAV765 are SLE-specific.
- Unlike the 20PC PCAmodel, which has a RAV associated with both COAD and READ, this 10PC PCAmodel doesn’t seem to have a RAV shared by COAD and READ.
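
One way to read "specific" versus "shared" off a validation table is to count how many datasets each RAV reaches the cutoff in. The sketch below does this in base R on made-up scores. The matrix `toy_scores`, the RAV labels, and the rule itself (exactly one dataset means specific, two or more means shared) are assumptions for illustration, not the criterion used to draw the conclusions above.

```r
# Made-up scores for illustration; not output from validate().
toy_scores <- matrix(
  c(0.80, 0.15, 0.10,
    0.12, 0.74, 0.72,
    0.20, 0.30, 0.71),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("RAV_x", "RAV_y", "RAV_z"),
                  c("SLE", "COAD", "READ"))
)
score_cutoff <- 0.68

# Logical matrix: does each RAV reach the cutoff in each dataset?
passes <- toy_scores >= score_cutoff
n_hits <- rowSums(passes)

# RAVs reaching the cutoff in exactly one dataset, labeled with that dataset.
specific <- names(n_hits)[n_hits == 1]
specific_to <- apply(passes[specific, , drop = FALSE], 1,
                     function(p) colnames(toy_scores)[p])

# RAVs reaching the cutoff in two or more datasets.
shared <- names(n_hits)[n_hits >= 2]

specific_to
shared
```

With these toy values, `RAV_x` is specific to SLE, `RAV_z` is specific to READ, and `RAV_y` is shared by COAD and READ.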

3.2 MeSH terms and associated studies

3.2.1 SLE-associated

ind <- 13
findStudiesInCluster(PCAmodel, ind, studyTitle = TRUE)
##      studyName
## 16   DRP001953
## 524  ERP114104
## 1749 SRP051848
## 2024 SRP059039
## 3329 SRP089814
## 4244 SRP110609
## 4533 SRP118733
## 4977 SRP132018
## 5135 SRP136057
## 5140 SRP136108
## 5452 SRP150419
## 5462 SRP150595
##                                                                                                                              title
## 16                         Interactive Transcriptome Analysis of Malaria Patients and Infecting Plasmodium falciparum in Indonesia
## 524                                                                   Altered Gene Expression in Antipsychotic Induced Weight Gain
## 1749                                    Gene Networks Specific for Innate Immunity Define Post-traumatic Stress Disorder [RNA-Seq]
## 2024                 Elucidating the etiology and molecular pathogenicity of infectious diarrhea by high throughput RNA sequencing
## 3329 Differentially Expressed Gene Transcripts Using RNA Sequencing from the Blood of Immunosuppressed Kidney Allograft Recipients
## 4244                 RNA-sequencing analysis of response to P.falciparum infection in Fulani and Mossi ethnic groups, Burkina Faso
## 4533                                                      Transcriptomic analysis of Multiple Myeloma bone marrow microenvironment
## 4977                                                                In-vitro stimulation of healthy donor blood with IL-3 cytokine
## 5135                           Whole Blood Transcriptome Profiling in Juvenile Idiopathic Arthritis and Inflammatory Bowel Disease
## 5140                                               RNA-seq of nine primary human cell types exposed in vitro to methylprednisolone
## 5452                                                                              Haemopedia: Human Haematopoietic Gene Expression
## 5462                                                                                 Homo sapiens Transcriptome or Gene expression
subsetEnrichedPathways(PCAmodel, ind) %>% as.data.frame
##                                                                      RAV13
## Up_1                                                         KEGG_RIBOSOME
## Up_2  REACTOME_SRP_DEPENDENT_COTRANSLATIONAL_PROTEIN_TARGETING_TO_MEMBRANE
## Up_3                            REACTOME_EUKARYOTIC_TRANSLATION_ELONGATION
## Up_4                                  REACTOME_SELENOAMINO_ACID_METABOLISM
## Up_5                                              REACTOME_RRNA_PROCESSING
## Up_6                                                  REACTOME_TRANSLATION
## Up_7                                                     MANALO_HYPOXIA_DN
## Up_8                                       CAIRO_HEPATOBLASTOMA_CLASSES_UP
## Up_9                                              PUJANA_BRCA2_PCC_NETWORK
## Up_10                                        WONG_EMBRYONIC_STEM_CELL_CORE
drawWordcloud(PCAmodel, ind)

ind <- 765
findStudiesInCluster(PCAmodel, ind, studyTitle = TRUE)
##      studyName
## 1740 SRP051688
## 1904 SRP056840
## 2209 SRP062966
## 2699 SRP071965
## 4031 SRP105369
## 4914 SRP131037
## 5462 SRP150595
## 5481 SRP150872
##                                                                                                                                                                    title
## 1740                                                      A Cell-based Systems Biology Assessment of Human Blood to Monitor Immune Responses After Influenza Vaccination
## 1904                                                                                      Renal systems biology of patients with systemic inflammatory response syndrome
## 2209                                                                                                                                                   SLE lupus RNA-seq
## 2699                                                                                     A blood RNA signature for tuberculosis disease risk: a prospective cohort study
## 4031 Transcriptome analysis of G protein-coupled receptors in distinct genetic subgroups of acute myeloid leukemia: identification of potential disease-specific targets
## 4914              Using Next-Generation Sequencing Transcriptomics to Determine Markers of Post-traumatic Symptoms  - preliminary findings from a post-deployment cohort
## 5462                                                                                                                       Homo sapiens Transcriptome or Gene expression
## 5481                                                                                          Discovering in vivo cytokine eQTL interactions from a lupus clinical trial
subsetEnrichedPathways(PCAmodel, ind) %>% as.data.frame
##                                                      RAV765
## Up_1                      REACTOME_NEUTROPHIL_DEGRANULATION
## Up_2                 THEILGAARD_NEUTROPHIL_AT_SKIN_WOUND_DN
## Up_3  ALTEMEIER_RESPONSE_TO_LPS_WITH_MECHANICAL_VENTILATION
## Up_4                       VERHAAK_AML_WITH_NPM1_MUTATED_UP
## Up_5                       VERHAAK_GLIOBLASTOMA_MESENCHYMAL
## Up_6                       HAHTOLA_MYCOSIS_FUNGOIDES_CD4_UP
## Up_7                      BROWN_MYELOID_CELL_DEVELOPMENT_UP
## Up_8             TAKEDA_TARGETS_OF_NUP98_HOXA9_FUSION_8D_DN
## Up_9                   LENAOUR_DENDRITIC_CELL_MATURATION_DN
## Up_10      SMIRNOV_CIRCULATING_ENDOTHELIOCYTES_IN_CANCER_UP
drawWordcloud(PCAmodel, ind)