Summary

(This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. This markdown file will produce a document with both graphs and the code used to produce them. For more details on using R Markdown see http://rmarkdown.rstudio.com.)

We have 4 sets of data of relative gene expression for paired groups (normoxia = control, hypoxia = test). Our ultimate goal is to visually compare relative expression for MSC (high) with MSC (low) and ACP (high) with ACP (low).

Outlined below are a few different ways to make heatmaps in R from these data.

I have included both non-scaled and row-scaled heatmaps. Row-scaled heatmaps scale values within each row (gene). There are other options for scaling available too.

Importing the data

First, I created a single tab-separated text file from the excel document with all data in one place. We import it into a data frame (table) and check that it worked correctly by visualizing the first 5 rows of the table:

# Import the table
geneExp <- read.table("~/Dropbox/Projects/HeatmapForDevon/relGeneExpression_singleTable.txt",
                      header=T, sep="\t")
head(geneExp)
##      gene             type fc_acp_hi fc_acp_lo fc_msc_hi fc_msc_lo
## 1     18S Selected Control    1.5487    0.8587    0.8159    1.0861
## 2     ADM           Target    7.2128    3.5313    2.9678    5.9351
## 3 ANGPTL4           Target    7.3992    4.1145    3.6495    1.0488
## 4    ARNT           Target    0.6213    0.8117    0.8680    0.6749
## 5   ARNT2           Target    0.2274    0.1781    1.0229    0.3850
## 6  ATP1B1           Target    0.7140    0.6371    0.8053    0.6838
# Extract just the numeric data into a matrix with named rows by gene
rownames(geneExp) <- geneExp$gene
geneExp_matrix <- as.matrix(geneExp[3:6])
head(geneExp_matrix)
##         fc_acp_hi fc_acp_lo fc_msc_hi fc_msc_lo
## 18S        1.5487    0.8587    0.8159    1.0861
## ADM        7.2128    3.5313    2.9678    5.9351
## ANGPTL4    7.3992    4.1145    3.6495    1.0488
## ARNT       0.6213    0.8117    0.8680    0.6749
## ARNT2      0.2274    0.1781    1.0229    0.3850
## ATP1B1     0.7140    0.6371    0.8053    0.6838

Heatmaps

Default heatmap function

Here, we use the default ‘heatmap’ function in R, which outputs a heatmap (with marginal dendrograms) organized using hierarchical clustering on both axes. Every aspect is customizable. For more information on what parameters you can customize, type “?heatmap” in your R console and it will pop up a help page which details every parameter you can change.

Let’s take a look at the default heatmap. The data is stored in columns 3-6 of our table, and needs to be converted to a matrix in order to work as input. We also give it our vector of gene names.

heatmap(geneExp_matrix)

We can also try keeping everything in order and not clustering the samples by suppressing the clustering functions (Rowv and Colv).

heatmap(geneExp_matrix,
        Rowv=NA, Colv=NA)

What about changing our color scheme? We can use a package called RColorBrewer to change our color palette. There are loads of color schemes to choose from, see http://davetang.org/muse/2010/12/06/making-a-heatmap-with-r/display_brewer_all/ :

# Check if the RColorBrewer library is installed. If not, installs it, then loads it
if (!require("RColorBrewer")) {
   install.packages("RColorBrewer", dependencies = TRUE)
   library(RColorBrewer)
}
heatmap(geneExp_matrix,
        Rowv=NA, Colv=NA, col=rev(brewer.pal(9,"RdBu")))

Gplot heatmap function

Here, we use a different heatmap function from the gplots package

if (!require("gplots")) {
   install.packages("gplots", dependencies = TRUE)
   library(gplots)
}
# Default unscaled heatmap
heatmap.2(geneExp_matrix, col=rev(brewer.pal(9,"RdBu")))

# Scaled within rows, no column clustering
heatmap.2(geneExp_matrix, col=rev(brewer.pal(9,"RdBu")), scale="row", Colv = "NA")

NMF heatmap function

Here, we use ‘aheatmap’ to draw an annotated heatmap plotting z-scores of rows and annotating rows by type (so it’s easy to see control genes).

if (!require("NMF")) {
   install.packages("NMF", dependencies = TRUE)
   library(NMF)
}
# Non-scaled
aheatmap(geneExp_matrix, color = "-RdBu", annColors = "Set2", annRow=geneExp$type)

# Scaled within rows, not clustering columns
aheatmap(geneExp_matrix, color = rev(brewer.pal(9,"RdBu")), scale="row", annColors = "Set2", annRow=geneExp$type, Colv = NA)

# We can also plot both cell types, side by side, preserving order, still scaling by row
par(mfrow=c(1,2))
aheatmap(geneExp_matrix[,1:2], color = rev(brewer.pal(9,"RdBu")), scale="row", annColors = "Set2", annRow=geneExp$type, Colv = NA, Rowv=NA)
aheatmap(geneExp_matrix[,3:4], color = rev(brewer.pal(9,"RdBu")), scale="row", annColors = "Set2", annRow=geneExp$type, Colv = NA, Rowv=NA)

# and now not scaling by row
par(mfrow=c(1,2))
aheatmap(geneExp_matrix[,1:2], color = rev(brewer.pal(9,"RdBu")), scale="none", annColors = "Set2", annRow=geneExp$type, Colv = NA, Rowv=NA)
aheatmap(geneExp_matrix[,3:4], color = rev(brewer.pal(9,"RdBu")), scale="none", annColors = "Set2", annRow=geneExp$type, Colv = NA, Rowv=NA)