From My Blogger Hunting for Differentially Methylated Genomic Regions

It’s a sunny day and I want to go hunting!

So I decide to hunt for some differentially methylated genomic regions between samples from cancer and normal tissues. More specifically, I want to identify regions with the genomic profile deviating from the normal value.

There already exists a Bioconductor package for me – bumphunter.

# biocLite("bumphunter")
library(bumphunter)

Algorithm

The bump hunting algorithm includes 4 steps:

  1. Fit a regression model between logit-transformed methylation measurements aginst status for individuals at each location. The estimated slope \(\hat{\beta}(t_j)\) is retained for step 2.

  2. The estimate \(\hat{\beta}(t_j)\) is smoothed in clusters using loess so that 0 indicates the unmethylated.

  3. The candidate differential methylated regions are identified with an absolute \(\hat{\beta}(t_j)\) above a predefined threshold.

  4. Permutation procedures that construct null distributions for each candidate region.

Load Data

load("dna1.rda")

# Design matrix: each row for one sample with columns representing its status (cancer or not)
design <- model.matrix(~pd$Status)

# Specify the chromosome at each position
chr <- as.factor(seqnames(gr))

# Specify chromosomal positions
pos <- start(gr)

Cluster

Perform clustering on each chromosome independently. Group genomic locations within predefined maxGap to each other into one cluster.

cl <- clusterMaker(chr, pos, maxGap=500)

cl provides a set of cluster IDs. Genomic positions assigned with the same ID belong to the same cluster.

Bump Hunting

result <- bumphunter(meth,X,chr=chr,pos=pos,cluster=cl,cutoff=0.1, nullMethod="permutation", smooth = TRUE, B=200, smoothFunction=loessByCluster)

save(result, file="bump.RData")

The main output I’m interested in is the table of candidate regions – my preys.

tab<-result$table