The following is an analysis of ChIP-Seq data for the ChIP-Seq training Mini-Project. The suggested datasets were downloaded from ENCODE and include ChIP-Seq assays targeting NFE2L2 (Experiment ‘ENCSR584GHV’) and BRCA1 (Experiment ‘ENCSR343RJR’). The analysis was carried out in R, primarily using the ChIPseeker and ChIPpeakAnno packages.
After the BED files were loaded, the overlapping binding sites of the two datasets were assessed and plotted in the Venn diagram below. Only a small number of binding sites overlap; the majority of peaks are unique to one dataset or the other.
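As a rough illustration of what the Venn diagram counts, the following Python sketch tallies shared and unique peaks for two toy peak lists. It is not the R code used in this report: the function names, the half-open `(chrom, start, end)` representation, and the coordinates are all hypothetical, and ChIPpeakAnno's own overlap handling (for example, how it merges connected peaks) may count differently.

```python
def overlaps_any(peak, others):
    """True if `peak` overlaps at least one interval in `others`.

    Peaks are hypothetical half-open (chrom, start, end) tuples.
    """
    chrom, start, end = peak
    return any(c == chrom and start < e and s < end for c, s, e in others)


def venn_counts(peaks_a, peaks_b):
    """Count peaks in each set that do or do not overlap the other set."""
    shared_a = sum(overlaps_any(p, peaks_b) for p in peaks_a)
    shared_b = sum(overlaps_any(p, peaks_a) for p in peaks_b)
    return {
        "a_only": len(peaks_a) - shared_a,
        "a_shared": shared_a,
        "b_only": len(peaks_b) - shared_b,
        "b_shared": shared_b,
    }


# Toy coordinates, invented for illustration only (not ENCODE data).
nfe2l2_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
brca1_peaks = [("chr1", 150, 250), ("chr2", 300, 400)]

counts = venn_counts(nfe2l2_peaks, brca1_peaks)
print(counts)
```

The all-against-all comparison is quadratic in the number of peaks, which is fine for a toy example; real peak sets would normally be handled with sorted or indexed intervals.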
Next, a profile of the signal values of the consensus peaks was plotted relative to adjacent transcription start sites (TSS) and is displayed in the following two plots. Although there are some similarities between the two datasets, they are not highly congruent. In the metaplot on the right, the BRCA1 peaks appear to be predominantly distributed over the TSS. NFE2L2 peaks are also concentrated around the TSS, but many counts align to regions upstream and downstream of it.
## >> preparing features information... 2021-03-22 2:10:40 PM
## >> identifying nearest features... 2021-03-22 2:10:42 PM
## >> calculating distance from peak to TSS... 2021-03-22 2:10:43 PM
## >> assigning genomic annotation... 2021-03-22 2:10:43 PM
## >> adding gene annotation... 2021-03-22 2:11:43 PM
## >> assigning chromosome lengths 2021-03-22 2:11:43 PM
## >> done... 2021-03-22 2:11:44 PM
## >> preparing features information... 2021-03-22 2:11:45 PM
## >> identifying nearest features... 2021-03-22 2:11:45 PM
## >> calculating distance from peak to TSS... 2021-03-22 2:11:46 PM
## >> assigning genomic annotation... 2021-03-22 2:11:46 PM
## >> adding gene annotation... 2021-03-22 2:11:57 PM
## >> assigning chromosome lengths 2021-03-22 2:11:57 PM
## >> done... 2021-03-22 2:11:57 PM
The difference in distribution between the two samples is especially pronounced in the bar plot below, which displays the percentage of binding sites by distance to the TSS. Again, the majority of the BRCA1 counts are centered around the TSS, whereas the NFE2L2 counts are more widely spread. A large percentage of NFE2L2 binding sites are located 10-100 kb upstream or downstream of the TSS.
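The binning behind a plot of this kind can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the report's R code: it works on a single chromosome, measures the absolute distance from a peak midpoint to the nearest TSS, and uses bin edges chosen to include the 10-100 kb bin mentioned above. The bin edges, function names, and coordinates are hypothetical, and ChIPseeker's exact distance definition may differ.

```python
from bisect import bisect_left, bisect_right

# Assumed bin edges in base pairs; only the 10-100 kb bin comes from the text.
BIN_EDGES = [1_000, 3_000, 5_000, 10_000, 100_000]
BIN_LABELS = ["0-1kb", "1-3kb", "3-5kb", "5-10kb", "10-100kb", ">100kb"]


def nearest_tss_distance(position, sorted_tss):
    """Absolute distance from `position` to the closest TSS (sorted input)."""
    i = bisect_left(sorted_tss, position)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(position - tss) for tss in candidates)


def distance_percentages(peak_midpoints, tss_positions):
    """Percentage of peaks falling into each distance-to-TSS bin."""
    sorted_tss = sorted(tss_positions)
    counts = [0] * len(BIN_LABELS)
    for position in peak_midpoints:
        distance = nearest_tss_distance(position, sorted_tss)
        counts[bisect_right(BIN_EDGES, distance)] += 1
    total = len(peak_midpoints)
    return {label: 100.0 * n / total for label, n in zip(BIN_LABELS, counts)}


# Toy positions on one chromosome, invented for illustration only.
tss_positions = [10_000, 500_000]
peak_midpoints = [10_200, 12_500, 60_000, 480_000]

percentages = distance_percentages(peak_midpoints, tss_positions)
for label in BIN_LABELS:
    print(f"{label:>9}: {percentages[label]:.1f}%")
```

A dataset dominated by promoter-proximal binding would put most of its mass in the first bin, while a dataset with many distal sites would shift toward the 10-100 kb and >100 kb bins.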
Next, the peaks were annotated, and the proportions of genomic features are visualized in the following bar plot. The majority of the BRCA1 peaks bind to promoter regions proximal to the TSS, whereas the NFE2L2 peaks primarily bind to intronic and intergenic regions.
Finally, the peaks were assigned to genes, and a KEGG pathway enrichment analysis was carried out. The plot below shows that the pathways shared between the two datasets include pathways relating to cancer and hormone regulation. In the shared pathways, the adjusted p-values differed, but the gene ratios were similar in magnitude.
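To make the two quantities in the enrichment plot concrete, the sketch below computes a gene ratio and a one-sided hypergeometric p-value for a single pathway, then applies a Benjamini-Hochberg adjustment across several pathways. This is an assumption about how such values are typically derived in over-representation analysis, not a description of the exact settings used in this report; all gene counts and p-values are invented for illustration.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k pathway genes among n input
    genes, when K of the N background genes belong to the pathway."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: 20 peak-associated genes, 5 of them in a pathway
# that contains 10 of the 100 background genes.
k, n, K, N = 5, 20, 10, 100
gene_ratio = k / n
p_value = hypergeom_pvalue(k, n, K, N)
print(f"gene ratio = {gene_ratio:.2f}, p = {p_value:.4g}")

# Hypothetical raw p-values for three pathways.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
print(adjusted)
```

Because the gene ratio depends only on k and n while the p-value also depends on pathway and background size, two datasets can show similar gene ratios for a pathway and still differ in adjusted p-value, as observed above.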
However, the majority of the enriched pathways were not shared between the two sets. The NFE2L2 target genes were enriched in pathways involved in cell adhesion and connection and in pathways related to cardiovascular disease. This is consistent with NFE2L2’s role as a transcription factor involved in the response to injury and inflammation. In the BRCA1 dataset, by contrast, the common themes among the enriched pathways are transcription, post-transcriptional processing, and genetic information processing more broadly. This is unsurprising, considering that BRCA1 is a tumor suppressor gene heavily involved in maintaining genomic stability.
The ChIP-Seq data targeting NFE2L2 and BRCA1 showed marked differences in binding sites, peak annotations, and pathway enrichment. However, according to their gene interaction summary on BioGRID (“BRCA1 - NFE2L2 Interaction Summary | BioGRID”, 2021), both are involved in regulating antioxidant signaling and cell survival (CITE). Further study could include comparing additional ChIP-Seq experiments targeting other tumor suppressors, such as p53, and analyzing them for interactions and binding similarities.
BRCA1 - NFE2L2 Interaction Summary | BioGRID. (2021). Retrieved 22 March 2021, from https://thebiogrid.org/interaction/918021