library(diffloop)
library(diffloopdata)
library(ggplot2)
library(GenomicRanges)
library(ggrepel)
library(DESeq2)
We designed diffloop
to have analogous functionality to workflows like DESeq
and edgeR
. While these tools are used to uncovered genes that are differentially expressed, diffloop
is designed to identify features from chormatin topology that are differential between cell conditions. Our package does not process data from raw sequencing reads and thus is reliant on other tools to process FASTQ files. This vignette walks through the various steps involved in an analysis that identifies differential loops and annotates them with biological information after accessing the samples for quality control.
The raw FASTQ read files for the POL2 data were preprocessed with the [mango] (https://github.com/dphansti/mango) package, an R
package that relies on bowtie
, bedtools
, and MACS2
to align reads, call anchor peaks, and summarize PETs per sample in the ChIA-PET experiment.
head -3 *.interactions.all.mango
chr1 32705184 32708527 chr1 32712484 32715408 4 1
chr1 32705184 32708527 chr1 32755920 32759477 3 1
chr1 32705184 32708527 chr1 32799616 32803060 6 1
To read in these files into diffloop, use the loopsMake.mango
function which specifies the .mango
extension of these files.
Another possible format reads .bedpe
files, which are very similar to the output files produced by mango. To use the loopsMake
function from a different preprocessing step, have files X.loop_counts.bedpe
,Y.loop_counts.bedpe
, Z.loop_counts.bedpe
in bed_dir
for samples = (X,Y,Z)
where the first lines should resemble:
head -3 *.loop_counts.bedpe
1 10002272 10004045 10 120968807 120969483 . 1
1 10002272 10004045 10 99551498 99552470 . 1
1 10002272 10004045 1 10002272 10004045 . 17
where the first three columns specify the position (chr:start:stop) of the first anchor, the second three columns specify the position (chr:start:stop) of the secnd anchor, the 7th column is “.” and the 8th column is the number of paired-end reads support that particular PET.
Here, we show two functions of loading data into diffloop
from two different sources of ChIA-PET data.
We read in data from the a different pipeline using the loopsMake
function. The example below uses sample data included in the diffloopdata
package. It contains interactions involving chromosome 1 from a set of six Cohesin ChIA-PET samples.\(^{1,2}\) Users would normally set bed_dir
to the real data directory. Since these samples were processed with a tool other than mango
, we use the generic loopsMake
function that reads .bedpe
files. We load only two samples for a faster i/o.
bed_dir <- system.file("extdata", "esc_jurkat", package="diffloopdata")
bed_dir
## [1] "/Library/Frameworks/R.framework/Versions/3.3/Resources/library/diffloopdata/extdata/esc_jurkat"
samples <- c("naive_esc_1", "jurkat_1")
full <- loopsMake(bed_dir, samples)
celltypes <- c("naive", "jurkat")
full <- updateLDGroups(full, celltypes)
head(full, 4)
## An object of class "loops"
## Slot "anchors":
## GRanges object with 4 ranges and 0 metadata columns:
## seqnames ranges strand
## <Rle> <IRanges> <Rle>
## [1] 1 [713970, 714336] *
## [2] 1 [804939, 805810] *
## [3] 1 [839766, 841153] *
## [4] 1 [847968, 848693] *
## -------
## seqinfo: 51 sequences from an unspecified genome; no seqlengths
##
## Slot "interactions":
## left right
## [1,] 1 1
## [2,] 1 717
## [3,] 1 14435
## [4,] 1 39892
##
## Slot "counts":
## naive_esc_1 jurkat_1
## [1,] 2 2
## [2,] 0 1
## [3,] 1 0
## [4,] 1 0
##
## Slot "colData":
## sizeFactor groups
## naive_esc_1 1.1303883 naive
## jurkat_1 0.8846517 jurkat
##
## Slot "rowData":
## loopWidth
## 1 0
## 2 8801861
## 3 29973879
## 4 50813612
This constitutes a standard workflow to read in data, annotate sample groups, etc.
Similarly, we read in POL2 ChIA-PET data from the ENCODE project process using the loopsMake.mango
function as these samples were processed using the mango
\(^{3}\) preprocessing pipeline.
bed_dir <- system.file("extdata", "pol2", package="diffloopdata")
bed_dir
## [1] "/Library/Frameworks/R.framework/Versions/3.3/Resources/library/diffloopdata/extdata/pol2"
full <- loopsMake.mango(bed_dir)
celltypes <- c("hct116", "k562", "k562", "k562", "mcf7", "mcf7", "mcf7")
full <- updateLDGroups(full, celltypes)
dim(full)
## anchors interactions samples colData rowData
## 1 32206 122628 7 2 1
head(full, 4)
## An object of class "loops"
## Slot "anchors":
## GRanges object with 4 ranges and 0 metadata columns:
## seqnames ranges strand
## <Rle> <IRanges> <Rle>
## [1] 1 [711560, 715580] *
## [2] 1 [761688, 763941] *
## [3] 1 [777171, 779423] *
## [4] 1 [804264, 806568] *
## -------
## seqinfo: 23 sequences from an unspecified genome; no seqlengths
##
## Slot "interactions":
## left right
## [1,] 1 2
## [2,] 1 7
## [3,] 1 11
## [4,] 1 15
##
## Slot "counts":
## hct116 k562_c k562_r1 k562_r2 mcf7_c mcf7_r1 mcf7_r2
## [1,] 0 2 0 0 0 0 0
## [2,] 0 0 0 0 1 0 1
## [3,] 0 0 0 0 1 0 1
## [4,] 0 3 1 2 0 0 0
##
## Slot "colData":
## sizeFactor groups
## hct116 0.266789 hct116
## k562_c 2.805204 k562
## k562_r1 1.169931 k562
## k562_r2 1.892062 k562
##
## Slot "rowData":
## loopWidth
## 1 49244
## 2 146766
## 3 188453
## 4 222317
By default, we use 500 bp as a value to merge nearby anchors into a single peak from the loopsMake.mango
function. This is due to each sample having been processed individually.
In the loopsMake
function, the default is zero and should be parameterized based on the resolution of the ChIA-PET Experiment and whether or not a common set of peaks was used when calling interactions between samples.
Alternatively, one could run full <- loopsMake(bed_dir)
and the samples
would be inferred from the directory contents (as we did in the loopsMake.mango
call). The default call to loopsMake
does not merge nearby anchors but instead keeps them distinct. If we want to group nearby anchors after the initial data read, one can run the mergeAnchors
command with the appropriate gap value.
From here on, we will focus on just the POL2 ChIA-PET data.
While mango
removes self-ligated loops before the final output step, our import function can create new self-ligations through the process of merging loop anchors. We first remove those self-ligated loops before reading in the regions of altered copy number variation from diffloopdata
. The removeRegion()
function can be called to filter loops that have one or more anchors in a particular region that may be blacklisted. While CNVs likely have some interesting effect on topology that should be explored in the future, we simply remove these regions to avoid a confounded signal where we mis-attribute an apparent difference in topology to a difference better explained by altered CNV.
# remove and loops that merged together from import
full <- subsetLoops(full, full@rowData$loopWidth >= 5000)
# Remove regions of CNV
CNVdir <- system.file("extdata", "cnv", package="diffloopdata")
k562.cnv <- makeGRangesFromDataFrame(setNames(read.table(paste0(CNVdir, "/K562-CNV.bedLogR"))[,1:4],
c("chr", "start", "end", "type")),
keep.extra.columns = TRUE)
mcf7.cnv <- makeGRangesFromDataFrame(setNames(read.table(paste0(CNVdir, "/MCF7-CNV.bedLogR"))[,1:4],
c("chr", "start", "end", "type")),
keep.extra.columns = TRUE)
cnv.regions <- union(mcf7.cnv[mcols(mcf7.cnv)$type != "normal"], k562.cnv[mcols(k562.cnv)$type != "normal"])
noCNV <- removeRegion(full, rmchr(cnv.regions))
dim(noCNV)
## anchors interactions samples colData rowData
## 1 19898 79289 7 2 1
After removing these regions with CNV alteration, we then will filter loops that are deemed biased from the mango
Step 5 function. The diffloop
implementation aggregates counts across all samples when apply the mango correction model. Here, we filter all loops with an FDR > 0.01 as implemented in the original mango
software.
noCNV_corrected <- mangoCorrection(noCNV, FDR = 0.01)
dim(noCNV_corrected)
## anchors interactions samples colData rowData
## 1 13079 37265 7 2 3
For a final quality-control stop, we filter loops that are present strongly (>= 5 PETs) in one replicate but absent (== 0 PETs) in the other replicate. There appears to be some artifact of ChIA-PET that has some extreme inconsistency between replicates. We filter these out during the QC step. Presently, it is not clear what these non-replicated loops can be attributed to, but we are actively investigating it. We use the subsetLoops
function with a boolean parameter to filter these loops with severe discordance between samples.
cm <- noCNV_corrected@counts
k_dis <- ((cm[,3]>=5 &cm[,4]==0)|(cm[,4]>=5&cm[,3]==0))
m_dis <- ((cm[,6]>=5 &cm[,7]==0)|(cm[,7]>=5&cm[,6]==0))
qc_filt <- subsetLoops(noCNV_corrected, !(k_dis | m_dis))
dim(qc_filt)
## anchors interactions samples colData rowData
## 1 12993 36565 7 2 3
Now that the quality control filtering steps are complete, we can begin to look at the quality of our samples. Below is the distribution of PETs per sample as a function of the distance separating the anchors.
p1 <- loopDistancePlot(qc_filt)
p1
Loop distance proportions
While the MCF-7 and K562 samples are very consistent, the HCT116 sample has a slightly different behavior. This can be attributed to the low sequencing counts (and hence low number of unique loops) that are present in this sample, which we can discern from the loopMetrics
function. If self-ligation loops or intrachromosomal loops are also present in the loops
object, then those statistics are also listed using this function.
loopMetrics(qc_filt)
## hct116 k562_c k562_r1 k562_r2 mcf7_c mcf7_r1 mcf7_r2
## unique 1993 140420 54615 88381 63320 23525 36917
When the data is imported using either the loopsMake
or the loopsMake.mango
function, the size factors are automatically computed
pcp1dat <- qc_filt
pcp1dat@colData$sizeFactor <- 1
samples <- c("HCT116", "K562_comb", "K562_r1", "K562_r2", "MCF7_comb", "MCF7_r1", "MCF_r2")
pcp1 <- pcaPlot(pcp1dat) + geom_text_repel(aes(label=samples)) +
scale_x_continuous(limits = c(-140, 230)) + ggtitle("PC Plot with no Size Factor Correction") +
theme(legend.position="none")
pcp1
PC Plot 1
The second principal component scales roughly with the total sequencing depth when size factors are not introduced, and the samples are very diffuse. Hence, we adjust be sweeping the size factor of each sample across the counts matrix and run the PC decomposition once again. Note: this is the default behavior, and we only generated the poor PC plot as a proof-of-concept for the use of the size factors in our analyses.
samples <- c("HCT116", "K562_comb", "K562_r1", "K562_r2", "MCF7_comb", "MCF7_r1", "MCF_r2")
pcp2 <-pcaPlot(qc_filt) + geom_text_repel(aes(label=samples)) +
scale_x_continuous(limits = c(-120, 120)) + ggtitle("PC Plot with Size Factor Correction") +
theme(legend.position="none")
pcp2
PC Plot 2
With the utilization of the size factor, we observe much clearner clusters of our samples and break the correlation between the PCs and the total sequencing depth.
The addition of the combined K562 and MCF7 repilcates as well as the HCT116 sample was used primarily to provide more data points on the PC plot. However, these samples shouldn’t be used (combined; singleton) when performing a differential analysis. Thus, we will only use the two replicates of K562 and MCF7 to do further differential analyses, annotation, and visualization. This can be achieved using the [
command. When applied to a loops
S4 object, the columns are the samples and the rows are the loops. Thus, we can isolate the four samples of interest for a 2x2 analysis
km_filt <- qc_filt[,c(3,4,6,7)]
dim(km_filt)
## anchors interactions samples colData rowData
## 1 12347 35363 4 2 3
Currently in diffloop
, we support two models of accessing differential features. The first is based on an over-dispersed Poisson regression, which was initially made popular in the edgeR
package.\(^4\) The quickAssoc
function implements this association measure with automatic correction for size factors and compares two groups annotated in the colData
slot.
km_res <- quickAssoc(km_filt)
head(km_res@rowData)
## loopWidth mango.FDR mango.P logFC logCPM F
## 1 188453 5.540304e-03 2.498863e-03 2.7413570 5.555985 0.6653664
## 2 222317 1.487267e-10 1.630218e-11 -3.3582021 5.842449 1.7935108
## 3 495096 2.412769e-04 7.391462e-05 -2.0302862 5.553936 0.2981673
## 4 36083 1.659347e-10 1.826372e-11 -0.7914776 7.026072 1.1064598
## 5 95337 9.541060e-07 1.707521e-07 -4.2861041 6.189492 4.7575034
## 6 115381 0.000000e+00 0.000000e+00 -0.4085001 6.289984 0.1283866
## PValue FDR
## 1 0.41467451 0.6564917
## 2 0.18050330 0.6564917
## 3 0.58503539 0.6564917
## 4 0.29922105 0.6564917
## 5 0.02917411 0.2789260
## 6 0.72011185 0.7789467
The second model uses a mean-variance relationship of the log-counts of the loops, which generates a precision weight for each loop and enters uses these values in the limma-voom empirical Bayes analysis pipeline.\(^5\) The quickAssocVoom
function has a similar feel as the previous function call and again requires two groups as inputs in the colData
slot. For each of these association tests, we flag the robust
parameters in the model workflows as TRUE
to represent the best fits for the low counts data associated with the loops data presently analyzed.
km_res2 <- quickAssocVoom(km_res)
For specifying models more complicated than comparing two groups, see the documentation of the loopAssoc
function.
We can compare the significance metrics per-loop of each of these two tests and can see that they give very similar results overall.
fdrmat <- -log10(km_res2@rowData[,c(8,13)])
fdrmatdf <- setNames(data.frame(fdrmat), c("edgeR", "Voom"))
qplot(fdrmatdf$Voom, fdrmatdf$edgeR)+labs(title = "diffloop: edgeR versus Voom",
x = "-log10(FDR) Voom", y = "-log10(FDR) edgeR") + scale_y_continuous(limits = c(0, 25)) +
scale_x_continuous(limits = c(0, 25)) + theme_bw()
## Warning: Removed 36 rows containing missing values (geom_point).
Comparison of Association Methods
cor(fdrmatdf$Voom, fdrmatdf$edgeR)
## [1] 0.8909443
cor(fdrmatdf$Voom, fdrmatdf$edgeR, method = "spearman")
## [1] 0.5758859
While there are some subtle differences between the two methods for calling differential loops, they overall are in strong agreement with few, if any, outliers. For the remainder of the vignette, we will use the results from the edgeR
assocation when refering to a differential loop
.
Critical functionality in downstream diffloop
analyses requires the annotation of loops. In particular, we annotate loops based on their assumed transcriptional activity by annotating the loops
object with regions of promoters and enhancers. Here, we import enhancers defined by H3K27ac peaks in either cancer sample and use the RefSeq hg19 annotation transcription start sites padded by 1kb as the promoter regions.
h3dir <- system.file("extdata", "annotation", package="diffloopdata")
kh3 <- paste0(h3dir, "/", "K562_H3K27ac.bed")
mh3 <- paste0(h3dir, "/", "MCF7_H3K27ac.bed")
h3k27ac.k <- rmchr(padGRanges(bedToGRanges(kh3), pad = 1000))
h3k27ac.m <- rmchr(padGRanges(bedToGRanges(mh3), pad = 1000))
enhancer <- union(h3k27ac.m, h3k27ac.k)
promoter <- padGRanges(getHumanTSS(), pad = 1000)
km_anno <- annotateLoops(km_res, enhancer = enhancer, promoter = promoter)
Thus, out of the 3.536310^{4} loops, we annotate 29285 of them to be enhancer-promoter loops. These numbers may change with variable methods of calling enhancers, promoters, or the protein that was ChIP-ed against.
Next, we create a correlation of the logFC of loops (computed in the quickAssoc
step) with the maximum change in methylation at either loop anchor between the two condtions. Admittedly, this is a fairly specialized analysis and the code only uses a handful of native functions in diffloop
, but we demonstrate this example nonetheless to provide an example of how one may correlate epigenetic features with changing loops. Note the use of annotateAnchors.bed
and one may use the annotateAnchors.bigwig
function to similarly compute average epigenetic values over the anchor locii in a particular loops
object.
methyl_dir <- system.file("extdata", "450k-methyl", package="diffloopdata")
k_450k <- paste0(methyl_dir, "/", "K562-450k.bedgraph")
m_450k <- paste0(methyl_dir, "/", "MCF7-450k.bedgraph")
km_res <- annotateAnchors.bed(km_res, k_450k)
km_res <- annotateAnchors.bed(km_res, m_450k)
# Annotate anchors; link anchors together and take the max
mcf7methyl <- mcols(km_res@anchors)$MCF7.450k
k562methyl <- mcols(km_res@anchors)$K562.450k
log2FC.methyl <- log2(mcf7methyl/k562methyl)
change.methyl <- mcf7methyl - k562methyl
mcols(km_res@anchors) <- as.data.frame(cbind(mcols(km_res@anchors),
data.frame(cbind(log2FC.methyl, change.methyl))))
big.meth.summary <- summary(km_res)
# Keep with largest absolute value
idx <- as.logical(abs(big.meth.summary$log2FC.methyl_1) < abs(big.meth.summary$log2FC.methyl_2))
idx[is.na(idx) | is.nan(idx)] <- FALSE
diff.methyl.max <- as.matrix((sapply(1:length(idx), function(i) {
if (idx[i]) { big.meth.summary[i, ]$change.methyl_2
} else {
big.meth.summary[i, ]$change.methyl_1}
})))
big.meth.summary$diff.methyl.max <- as.numeric(diff.methyl.max)
methyl.df <- big.meth.summary[, c("logFC", "diff.methyl.max", "FDR")]
methyl.df <- methyl.df[complete.cases(methyl.df) & !is.infinite(methyl.df$diff.methyl.max), ]
methyl.df$diff.methyl.max.bin <- cut(methyl.df$diff.methyl.max, seq(-1, 1, 0.25))
After processing that data and creating the integrated data frame, we can visualize a violin plot of the binned methylation values against the logFC of the loops.
ggplot(methyl.df[complete.cases(methyl.df), ], aes(diff.methyl.max.bin, logFC)) +
geom_violin(aes(diff.methyl.max.bin, logFC), fill = "firebrick", trim = TRUE, scale = "width") +
geom_hline(yintercept = 0) + theme_bw() +
ggtitle("Differential loops stratified by differential methylation") +
xlab("Max change in anchor methylation") + ylab("log FC Loops")
Here, we employ the DESeq2
pipeline to process our RNA-Seq data and create a summary table. This summary table can then be linked to our loops object in the subsequent steps.
# Just turning the crank on the DESeq2 pipeline
rna_dir <- system.file("extdata", "rna_seq", package="diffloopdata")
files <- grep("counts", list.files(rna_dir), value = TRUE)
condition <- c("k562", "k562", "k562", "mcf7", "mcf7", "mcf7")
names <- c("k1", "k2", "k3", "m1", "m2", "m3")
sampleTable <- data.frame(sampleName = names, fileName = files, condition = condition)
dds <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = rna_dir, design = ~condition)
dds <- dds[rowSums(counts(dds)) > 1, ] #remove zero counts
dds$condition <- factor(dds$condition, levels = c("k562", "mcf7"))
dds <- DESeq(dds)
res <- results(dds)
res <- res[complete.cases(res), ]
resDESeq2 <- res[order(res$pvalue), ]
With the DESeq2
data in a table, we can annotate enhancer-promoter loops with their corresponding gene expression values. To achieve this, sequential execution of the keepEPloops
and annotateLoops.dge
functions are required. This will add data in the rowData
slot that has the transcriptional information associated with the particular loop when multiple = FALSE
, as we show below.
km_res.ep <- keepEPloops(km_anno, enhancer, promoter)
km.linked <- annotateLoops.dge(km_res.ep, resDESeq2, multiple = FALSE)
If multiple = TRUE
, then loops that bridge multiple promoters will be annotated with transcription values. To achieve this, the returning functional value cannot be a loops
object since we have to duplicate the loop to accommodate the multiple promoters.
In this code chunk, we also set up a data frame for plotting global trends, including the methylation values at distal regions and promoter regions.
df <- summary(km.linked)
df$logFC_bin <- cut(df$logFC, seq(-12, 12, 3))
df$log2FoldChange_bin <- cut(df$log2FoldChange, seq(-20, 20, 5))
We use our large data frame to visualize the relationship between increasing strength of enhancer-promoter loops and the increasing relative expression of transcripts that are linked to the loops.
loopViolin <- ggplot(df[complete.cases(df) & df$padj < 0.1, ], aes(logFC_bin, log2FoldChange)) +
geom_violin(aes(logFC_bin, log2FoldChange), fill = "dodgerblue", trim = TRUE, scale = "width") +
geom_hline(yintercept = 0) + theme_bw() +
labs(title = "Differential expression stratified by E-P Loops", x = "log FC Loops",
y = "log FC Gene Expression")
loopViolin
The plot clearly shows a trend where increasing the logFC of the loops results in a corresponding logFC increase in the corresponding transcript that is the target of the enhancer-promoter loop.
We can visualize regions of differential topology using the loopPlot
function. Below are two examples of significantly altered topology between the K562 and MCF7 samples.
chr1reg <- GRanges(seqnames=c("1"),ranges=IRanges(start=c(11840000),end=c(11980000)))
p1 <- loopPlot(km_anno, chr1reg)
This plot highlights the MTHFR gene, which is significantly up-regulated in the K562 cancer cell line. We can see much stronger (line width is linearly proportional to the number of PETs supporting the interaction) E-P loops leading to the promoter of MTHFR as well as multiple enhancer-enhancer loops further localizing enhancer regions in three-dimensional proximity to the MTHFR promoter locus.
We note the color scheme where red loops bridge enhancers and promoters, purple loops bridge two enhancers, orange loops bridge two promoters, blue loops bridge regions of CTCF occupancy, and black have no special color annotation based on the data present.
chr14reg <- GRanges(seqnames=c("14"),ranges=IRanges(start=c(35800000),end=c(35890000)))
p14 <- loopPlot(km_anno, chr14reg)
The second plot highlights the NFKBIA gene, which is upregulated in breast cancers like MCF7. This loop plot shows several strong enhancer-promoter loops localizing downstream enhancers to the NFKBIA promoter. This change in topology helps explain the upregulation of this gene in MCF7 relative to K562.
This vignette was written to demonstrate a common workflow that performs significant quality control measures on a set of loops and ultimately integrates extensive epigenetic and transcriptional data to identify and annotate regions of differential topology. The visualizations provide both large-scale trends (e.g. correlations between logFC of loops and the transcription FC) and anecdotal regions of variable topology, as shown in the plots above. While diffloop
provides an environment to visualize these loop plots statically, we note our shiny app, DNAlandscapeR that provides an interactive framework for visualizing topology samples alongside other genomic and epigenomic data.
All data used in these analyses were processed from GEO and are contained in the diffloopdata
package. Please feel free to contact Caleb (the maintainer) with any questions concerning the data or this workflow.
sessionInfo()
## R version 3.3.1 (2016-06-21)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.11.6 (El Capitan)
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats4 parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] DESeq2_1.12.4 SummarizedExperiment_1.2.3
## [3] Biobase_2.32.0 ggrepel_0.5
## [5] GenomicRanges_1.24.3 GenomeInfoDb_1.8.7
## [7] IRanges_2.6.1 S4Vectors_0.10.3
## [9] BiocGenerics_0.18.0 ggplot2_2.1.0
## [11] diffloopdata_1.1.0 diffloop_1.1.11
##
## loaded via a namespace (and not attached):
## [1] edgeR_3.14.0 splines_3.3.1
## [3] foreach_1.4.3 Formula_1.2-1
## [5] assertthat_0.1 statmod_1.4.26
## [7] latticeExtra_0.6-28 Rsamtools_1.24.0
## [9] yaml_2.1.13 RSQLite_1.0.0
## [11] lattice_0.20-34 limma_3.28.21
## [13] chron_2.3-47 digest_0.6.10
## [15] RColorBrewer_1.1-2 XVector_0.12.1
## [17] colorspace_1.2-6 htmltools_0.3.5
## [19] Matrix_1.2-7.1 plyr_1.8.4
## [21] XML_3.98-1.4 biomaRt_2.28.0
## [23] genefilter_1.54.2 zlibbioc_1.18.0
## [25] xtable_1.8-2 scales_0.4.0
## [27] BiocParallel_1.6.6 tibble_1.2
## [29] annotate_1.50.0 pbapply_1.2-2
## [31] lazyeval_0.2.0 nnet_7.3-12
## [33] survival_2.39-5 magrittr_1.5
## [35] evaluate_0.9 foreign_0.8-67
## [37] tools_3.3.1 data.table_1.9.6
## [39] formatR_1.4 matrixStats_0.50.2
## [41] stringr_1.1.0 munsell_0.4.3
## [43] locfit_1.5-9.1 cluster_2.0.4
## [45] AnnotationDbi_1.34.4 Biostrings_2.40.2
## [47] compiler_3.3.1 grid_3.3.1
## [49] RCurl_1.95-4.8 iterators_1.0.8
## [51] labeling_0.3 bitops_1.0-6
## [53] rmarkdown_1.0 gtable_0.2.0
## [55] codetools_0.2-14 DBI_0.5-1
## [57] reshape2_1.4.1 R6_2.1.3
## [59] GenomicAlignments_1.8.4 gridExtra_2.2.1
## [61] zoo_1.7-13 knitr_1.14
## [63] dplyr_0.5.0 rtracklayer_1.32.2
## [65] Sushi_1.10.0 Hmisc_3.17-4
## [67] readr_1.0.0 stringi_1.1.1
## [69] Rcpp_0.12.7 geneplotter_1.50.0
## [71] rpart_4.1-10 acepack_1.3-3.3
\(^1\)Hnisz, Denes, et al. “Activation of proto-oncogenes by disruption of chromosome neighborhoods.” Science (2016).
\(^2\)Ji, Xiong, et al. “3D Chromosome Regulatory Landscape of Human Pluripotent Cells.” Cell stem cell (2015). \(^3\)Phanstiel, D, et al. “Mango: A bias correcting ChIA-PET analysis pipeline.” Bioinformatics (2015). \(^4\)Robinson, M, et al. “edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.” Bioinformatics (2010). \(^5\)Law, C, et al. “Voom: precision weights unlock linear model analysis tools for RNA-seq read counts.” Genome Biology (2014).