This demo shows the new annotation functions to be included in genomation. I tried to keep the function names consistent with the old ones. Although, some class and function names are idiotic we have to keep the convention, otherwise it is a lot of work to change them, and the alternatives could be equially unsatisfactory.
The functions can be used to annotate target regions or a list of target regions with a given set of genomic features. The genomic features to be used should be in named GRangesList format.
We will get p300, SP1 and Nanog peaks,and chromHMM annotations from ENCODE. We wil use the H1-Esc cells. The aim is to annotate the peaks with chromHMM annotations.
p300.url="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/wgEncodeAwgTfbsHaibH1hescP300V0416102UniPk.narrowPeak.gz"
Nanog.url="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/wgEncodeAwgTfbsHaibH1hescNanogsc33759V0416102UniPk.narrowPeak.gz"
SP1.url="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/wgEncodeAwgTfbsHaibH1hescSp1Pcr1xUniPk.narrowPeak.gz"
chrHMM.url="http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmH1hescHMM.bed.gz"
#http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgSegmentation/wgEncodeAwgSegmentationChromhmmH1hesc.bed.gz"
# get chromHMM annotation and make a list out of it
library(genomation)
## Loading required package: grid
chrHMM=readBed(chrHMM.url)
chrHMM.list=GenomicRanges::split(chrHMM, chrHMM$name,drop=TRUE)
# get peaks
p300=readBed(p300.url)
SP1=readBed(SP1.url)
NANOG=readBed(Nanog.url)
here are the new functions located at https://gist.github.com/al2na/c368ae8de776b7d95aec2ee9f9bb9777. we will source them using devtools::source_gist().
library(devtools)
source_gist("https://gist.github.com/al2na/c368ae8de776b7d95aec2ee9f9bb9777")
## Sourcing https://gist.githubusercontent.com/al2na/c368ae8de776b7d95aec2ee9f9bb9777/raw/09c5bf471d2e6744891bcf9d0fe5739c3cc8a64d/annotateWithFeatures.R
## SHA-1 hash of file is 67bf5ef0819289ffc3dba86200aa0abf8735539c
annotateWithFeatures() will calculate the percentage of overlaps in a given GRanges object with a GRangesList object where each element correspods to a different category of genomic features such as promoters, exons and introns.
heatmapTargetAnnotation() will plot a heatmap of annotations returned from annotateWithFeatures() or can return the matrix that is used the create the heatmap.
We can annote a given peak set like this:
peak2ann=annotateWithFeatures(p300,chrHMM.list)
peak2ann
## summary of target set annotation with feature annotation:
## Rows in target set: 8934
## ----------------------------
## percentage of target elements overlapping with features:
## 1_Active_Promoter 10_Txn_Elongation 11_Weak_Txn 12_Repressed
## 28.16 0.07 2.10 1.33
## 13_Heterochrom/lo 14_Repetitive/CNV 15_Repetitive/CNV 2_Weak_Promoter
## 2.74 0.08 0.18 23.86
## 3_Poised_Promoter 4_Strong_Enhancer 5_Strong_Enhancer 6_Weak_Enhancer
## 8.69 11.91 16.33 32.65
## 7_Weak_Enhancer 8_Insulator 9_Txn_Transition
## 8.03 5.72 1.00
##
## percentage of feature elements overlapping with target:
## 1_Active_Promoter 10_Txn_Elongation 11_Weak_Txn 12_Repressed
## 19.81 0.04 0.16 0.68
## 13_Heterochrom/lo 14_Repetitive/CNV 15_Repetitive/CNV 2_Weak_Promoter
## 0.27 0.19 0.57 6.76
## 3_Poised_Promoter 4_Strong_Enhancer 5_Strong_Enhancer 6_Weak_Enhancer
## 6.45 20.36 11.49 3.46
## 7_Weak_Enhancer 8_Insulator 9_Txn_Transition
## 0.53 0.84 0.71
##
We can annotate the GRangesList of different peak sets like this:
peak.list=GenomicRanges::GRangesList(p300=p300,SP1=SP1,NANOG=NANOG)
peak2ann.l=annotateWithFeatures(peak.list,chrHMM.list)
## Working on: p300
## Working on: SP1
## Working on: NANOG
We can make a heatmap using heatmapTargetAnnotation() function. The function also returns a ggplot2 object, so it can be further manipulated via ggplot2.
heatmapTargetAnnotation(peak2ann.l)
We can also get the matrix that is used to make the heatmap and use it in other heatmap functions.
mat=heatmapTargetAnnotation(peak2ann.l,plot=FALSE)
library(gplots)
##
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
##
## lowess
heatmap.2(mat,col="topo.colors",cellnote=round(mat),
notecol="black",trace="none",cexCol=0.5,cexRow=0.8)