March 11, 2020

Overview

  • Motivation and Background
  • Current Challenges and Solution Strategies
  • Methods and Software
  • Kidney Cancer Results
  • Conclusion and Next Steps

Background and Motivation

Motivation

  • Each year, 18 million new cancer cases are diagnosed, and nearly 10 million people die from cancer (WHO, 2018)
  • A person dies of cancer every 3.3 seconds.
  • Cancer is the second leading cause of death in the US (CDC, 2017)
  • Cancer mortality shows disparities across groups, which vary by cancer type (NCI, 2019):
    • women
    • minorities
    • the indigent
    • the elderly
  • Mortality is affected by society, but incidence is driven by genetics

Primer on Genetics and Cancer

  • The Central Dogma of molecular biology states that DNA (genes) encodes RNA, RNA encodes proteins, and proteins govern the behavior of the cell (thereby governing the tissue) (Clancy et al., 2008)
  • Cancers are primarily caused by multiple mutations in genes (Knudson hypothesis) belonging to certain biological processes, such as apoptosis (programmed cell death) or proliferation (ACS, 2014)
  • Many cancers are caused by multiple mutations of multiple genes, all working in concert to advance the disease state (Sugimura et al., 1992)

Challenges and Strategies

Challenges

While discovering single-gene cancer drivers, such as TP53 (NCBI, 2011), is important, this approach has a few challenges:

  • Cancers are often caused by concurrent abnormalities in multiple genes
  • Gene knockdown experiments test single genes only, not combinations of genes
  • Drug trials often find redundancy in cancer-driving genes
  • Single-gene testing of 20,000-25,000 human genes has very low statistical power after controlling for the false discovery rate
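
A minimal sketch of the last challenge, assuming the Benjamini-Hochberg step-up procedure as the false discovery rate control (the source does not name a specific procedure). The p-values are made up for illustration: one signal at p = 1e-5 among evenly spread null p-values.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return indices of hypotheses rejected by the BH step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_k = rank
    # Reject every hypothesis ranked at or below that k.
    return sorted(order[:largest_k])


def one_signal_among_nulls(signal_p, m):
    """Hypothetical data: one signal p-value plus m - 1 evenly spread nulls."""
    nulls = [(i + 1) / m for i in range(m - 1)]
    return [signal_p] + nulls


# Same signal, tested among ~20,000 genes versus 1,329 pathways.
gene_level = benjamini_hochberg(one_signal_among_nulls(1e-5, 20000))
pathway_level = benjamini_hochberg(one_signal_among_nulls(1e-5, 1329))
```

In this toy setup the signal is rejected among 1,329 tests but not among 20,000, because the strictest BH threshold is q/m: 0.05/20,000 = 2.5e-6 versus 0.05/1,329 ≈ 3.8e-5.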

Solutions

To overcome these challenges:

  1. Group genes by their biological pathways (NIH NHGRI, 2015).
    • Depending on the grouping, there are anywhere from 50-5000 pathways to consider.
    • In cancer research, we usually care about
      • The C2 Canonical Pathways collection (Broad Institute) in the Molecular Signatures Database (1,329 pathways), or
      • The WikiPathways collection (approximately 500 pathways) (Slenter et al., 2018).
  2. For each of the pathways selected, test a summary of the pathway for the presence of a statistically significant relationship with some outcome (survival time, tumor size, or cancer subtype).
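
A minimal sketch of step 1, assuming the pathway collection arrives as GMT text (the tab-delimited format MSigDB distributes: name, description, then gene symbols). The pathway names, gene symbols, expression values, and the `min_genes` cutoff are all hypothetical.

```python
def parse_gmt(text):
    """Parse GMT-formatted text into {pathway_name: [gene, ...]}."""
    pathways = {}
    for line in text.strip().splitlines():
        fields = line.split("\t")
        # Field 0 is the pathway name, field 1 a description, the rest genes.
        name, genes = fields[0], fields[2:]
        pathways[name] = [g for g in genes if g]
    return pathways


def subset_by_pathway(expression, pathways, min_genes=2):
    """Keep, per pathway, only the genes that were actually measured."""
    subsets = {}
    for name, genes in pathways.items():
        measured = {g: expression[g] for g in genes if g in expression}
        # Drop pathways with too few measured genes to summarize.
        if len(measured) >= min_genes:
            subsets[name] = measured
    return subsets


# Hypothetical toy collection: names and memberships are made up.
TOY_GMT = "PATHWAY_A\tdemo\tGENE1\tGENE2\tGENE3\nPATHWAY_B\tdemo\tGENE3\tGENE9\n"

# Hypothetical expression values: gene -> one value per sample.
TOY_EXPRESSION = {
    "GENE1": [2.1, 3.4, 1.9],
    "GENE2": [0.5, 0.7, 0.4],
    "GENE3": [5.0, 4.8, 5.3],
}

pathway_data = subset_by_pathway(TOY_EXPRESSION, parse_gmt(TOY_GMT))
```

Each entry of `pathway_data` is the gene-by-sample block that a pathway summary in step 2 would be computed from.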

Methods and Software

Methods to Summarize Pathways

SuperPCA

Supervised PCA (SuperPCA; Chen et al., 2008; Chen et al., 2010):

  • ranks each feature in pathway \(i\) by its univariate relationship with the outcome of interest (survival time, tumor size, cancer subtype, etc.), then
  • extracts principal components from the most relevant features
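
The two steps above can be sketched as follows. This is an illustration under simplifying assumptions, not the published SuperPCA algorithm or the pathwayPCA implementation: it ranks features by absolute Pearson correlation with a continuous outcome (a stand-in for the univariate model), and it extracts only the first principal component by power iteration. The gene names, data, and `top_k` value are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_principal_component(columns, n_iter=100):
    """Return (loadings, scores) of PC1 for a list of feature columns."""
    n, p = len(columns[0]), len(columns)
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    # Sample covariance matrix of the centered features.
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            for j in range(p)] for i in range(p)]
    # Power iteration; assumes the start is not orthogonal to the top eigenvector.
    w = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(w[j] * centered[j][s] for j in range(p)) for s in range(n)]
    return w, scores


def super_pca_pc1(features, outcome, top_k=2):
    """Rank features by |correlation| with the outcome, then take PC1 of the top_k."""
    ranked = sorted(features,
                    key=lambda g: abs(pearson(features[g], outcome)),
                    reverse=True)
    chosen = ranked[:top_k]
    _, pc1_scores = first_principal_component([features[g] for g in chosen])
    return chosen, pc1_scores


# Hypothetical pathway with three genes measured on six samples.
TOY_OUTCOME = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
TOY_FEATURES = {
    "GENE_A": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],  # tracks the outcome
    "GENE_B": [6.2, 4.8, 4.1, 3.0, 2.1, 0.9],  # tracks it inversely
    "GENE_C": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],  # mostly noise
}

selected, scores = super_pca_pc1(TOY_FEATURES, TOY_OUTCOME, top_k=2)
```

The resulting `scores` vector (one value per sample) is the pathway summary that would then enter the outcome model. The sign of a principal component is arbitrary, so only the magnitude of its association is meaningful here.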

AES-PCA

Execution

Example Results: Kidney Renal Papillary Cell Carcinoma

Pathway Associations with Cancer Survival

  • Many cancers have pronounced survival disparities by gender (Dorak and Karpuzoglu, 2012)
  • Renal cancers have a known gender effect (ACS, 2017)
  • We found a potential association between survival outcomes and the interaction of gender and the first principal component of pathway WP1559.
  • This pathway involves transcription factors related to cardiac hypertrophy (thickening of the heart muscle).
  • A recent paper in Cardiorenal Medicine shows a strong relationship between kidney diseases and cardiac hypertrophy (Di Lullo et al., 2015).
  • Our Cox Proportional Hazards model was

\[ h(t) = h_0(t)\exp\left[\beta_1\text{PC}_1 + \beta_2\text{male} + \beta_3(\text{PC}_1\times\text{male})\right] \]
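
One way to read the interaction term, assuming \(\text{male}\) is an indicator coded 1 for male and 0 for female (the coding is not stated here): the hazard ratio for a one-unit increase in \(\text{PC}_1\) is the ratio of \(h(t)\) at \(\text{PC}_1 = x + 1\) to \(h(t)\) at \(\text{PC}_1 = x\), so the baseline hazard and the \(\beta_2\) term cancel.

\[ \text{HR}_{\text{female}} = \frac{h_0(t)\exp\left[\beta_1(x + 1)\right]}{h_0(t)\exp\left[\beta_1 x\right]} = \exp(\beta_1), \qquad \text{HR}_{\text{male}} = \frac{h_0(t)\exp\left[\beta_1(x + 1) + \beta_2 + \beta_3(x + 1)\right]}{h_0(t)\exp\left[\beta_1 x + \beta_2 + \beta_3 x\right]} = \exp(\beta_1 + \beta_3) \]

Under this coding, \(\beta_3 \neq 0\) means the association between the pathway summary and survival differs by gender.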

Kidney Cancer Survival

Conclusion

Review

  • Cancers are deadly diseases with massive genetic variability.
  • The vast number of single genes and multi-gene interactions reduces the accuracy of many statistical tests.
  • The pathwayPCA software calculates genetic data summaries that can be used for more accurate statistical testing.
  • This software is open source, so that anyone in the scientific community can use it free of charge.

Acknowledgements

  • Chen and Wang Translational Bio Lab: Steven Chen, Lily Wang, Lizhong Liu, Antonio Colaprico, James Ban, Jenny Zhang, Zhen Gao, Lissette Gomez, and Shirley Sun
    • NIH/NCI R01 CA158472; NIH/NCI R01 CA200987; NIH/NCI U24 CA210954
  • My mentors: Zoran Bursac, Steven Chen, and Lily Wang

Thank You!

Questions?



Be a good steward of science!