From My Blogger Hunting for Differentially Methylated Genomic Regions
It’s a sunny day and I want to go hunting!
So I decide to hunt for some differentially methylated genomic regions between samples from cancer and normal tissues. More specifically, I want to identify regions with the genomic profile deviating from the normal value.
There already exists a Bioconductor package for me – bumphunter.
# biocLite("bumphunter")
library(bumphunter)
The bump hunting algorithm includes 4 steps:
Fit a regression model between logit-transformed methylation measurements aginst status for individuals at each location. The estimated slope \(\hat{\beta}(t_j)\) is retained for step 2.
The estimate \(\hat{\beta}(t_j)\) is smoothed in clusters using loess so that 0 indicates the unmethylated.
The candidate differential methylated regions are identified with an absolute \(\hat{\beta}(t_j)\) above a predefined threshold.
Permutation procedures that construct null distributions for each candidate region.
load("dna1.rda")
# Design matrix: each row for one sample with columns representing its status (cancer or not)
design <- model.matrix(~pd$Status)
# Specify the chromosome at each position
chr <- as.factor(seqnames(gr))
# Specify chromosomal positions
pos <- start(gr)
Perform clustering on each chromosome independently. Group genomic locations within predefined maxGap to each other into one cluster.
cl <- clusterMaker(chr, pos, maxGap=500)
cl provides a set of cluster IDs. Genomic positions assigned with the same ID belong to the same cluster.
result <- bumphunter(meth,X,chr=chr,pos=pos,cluster=cl,cutoff=0.1, nullMethod="permutation", smooth = TRUE, B=200, smoothFunction=loessByCluster)
save(result, file="bump.RData")
The main output I’m interested in is the table of candidate regions – my preys.
tab<-result$table