Goal: produce a heatmap with ggplot2 that is similar to the one in the lattice tutorial.
Lattice tutorial
ggplot2 tutorial
Load the required libraries and data
library(ggplot2)
library(RColorBrewer)
prDat <- read.table("GSE4051_data.tsv")
prDes <- readRDS("GSE4051_design.rds")
Set seed so that the “random” sample we select is the same as the one from the tutorial
set.seed(1)
Choose 50 probes (out of the 30k)
yo <- sample(1:nrow(prDat), size = 50)
hDat <- prDat[yo, ]
colnames(hDat) <- with(prDes, paste(devStage, gType, sidChar, sep = "_"))
Transform the data into long format that will be easy to plot
Note that heatmap() expects a matrix, but (as far as I can tell) ggplot2 expects a data frame, so we need to convert the 2D matrix of values into a list of tuples of the form (sample, probe, expression).
prDatTall <- data.frame(sample = rep(colnames(hDat), each = nrow(hDat)),
                        probe = rownames(hDat),
                        expression = unlist(hDat))
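To make the reshaping step concrete, here is a minimal sketch on a toy data frame. The object names `toy` and `toyTall` and all values are made up for illustration; only base R is used, and the probe names are repeated explicitly rather than relying on recycling.

```r
# Toy stand-in for hDat: 2 probes (rows) x 3 samples (columns); values are made up.
toy <- data.frame(s1 = c(1, 2), s2 = c(3, 4), s3 = c(5, 6),
                  row.names = c("probeA", "probeB"))

toyTall <- data.frame(
  sample = rep(colnames(toy), each = nrow(toy)),   # each sample name once per probe
  probe = rep(rownames(toy), times = ncol(toy)),   # probe names cycled once per sample
  expression = unlist(toy, use.names = FALSE)      # values read column by column
)

# One row per (sample, probe, expression) tuple
print(toyTall)
```

Each of the 2 x 3 cells of the wide table becomes one row of the tall table, which is the shape ggplot2 works with.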
Create a blue -> purple colour palette
jBuPuFun <- colorRampPalette(brewer.pal(n = 9, "BuPu"))
paletteSize <- 256
jBuPuPalette <- jBuPuFun(paletteSize)
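As a small illustration of how colorRampPalette() works, the sketch below uses two plain colour names instead of the BuPu palette so that it runs with base R alone. The names `toyFun` and `toyPalette` and the choice of anchor colours are assumptions made for this example.

```r
# colorRampPalette() returns a function; calling that function with n
# gives n colours interpolated between the anchor colours.
toyFun <- colorRampPalette(c("white", "purple"))  # illustrative anchors, not BuPu
toyPalette <- toyFun(5)

# A character vector of 5 hex colour codes, from white to purple
print(toyPalette)
```

The same pattern applies above: jBuPuFun is the interpolating function, and jBuPuFun(paletteSize) expands the 9 BuPu colours into 256.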
Plot the heatmap.
Note that geom_tile() is the geom used to draw heatmaps.
ggplot(prDatTall, aes(x = probe, y = sample, fill = expression)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5)) +
  geom_tile() +
  scale_fill_gradient2(low = jBuPuPalette[1],
                       mid = jBuPuPalette[paletteSize / 2],
                       high = jBuPuPalette[paletteSize],
                       midpoint = (max(prDatTall$expression) + min(prDatTall$expression)) / 2,
                       name = "Expression")
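See the scale_fill_gradient2 documentation for an explanation of the parameters used above: low, mid, and high are the colours at the bottom, middle, and top of the scale, and midpoint is the data value that is mapped to mid.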
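The midpoint used in the plot is the centre of the value range, which is not the same thing as the mean of the data. A quick sketch with made-up numbers (the names `expr` and `rangeMid` are hypothetical) shows the difference:

```r
expr <- c(6.5, 7.0, 7.2, 12.5)            # made-up expression values

rangeMid <- (max(expr) + min(expr)) / 2   # same formula as in the plot call
print(rangeMid)                           # centre of the range
print(mean(range(expr)))                  # equivalent shorthand
print(mean(expr))                         # mean of the values, pulled toward the bulk of the data
```

Using the centre of the range places the mid colour halfway between the smallest and largest values, whatever the distribution of the data in between.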
The heatmap() function (which comes from base R's stats package rather than from lattice) expects a matrix, so we transpose the data into a samples-by-probes matrix and then plot it
hDat <- as.matrix(t(hDat))
heatmap(hDat, Rowv = NA, Colv = NA, scale = "none", col = jBuPuPalette)
If you pick any sample/probe pair and compare its tile in the two heatmaps, the colours should be similar. Note that the heatmaps themselves are not exactly congruent because the samples in our ggplot2 version are not in chronological order, whereas in the heatmap() version they are. Fixing this is left as an exercise :)
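As a hint for the exercise: ggplot2 draws a discrete axis in the order of the factor levels, and character values are ordered alphabetically by default. One possible approach, sketched here on made-up data (the sample labels and the names `sampleOrder` and `toy` are hypothetical), is to convert the column to a factor with explicit levels:

```r
# Made-up sample labels, listed in the order we want them drawn.
sampleOrder <- c("E16_wt_1", "P2_wt_2", "P10_wt_3", "4_weeks_wt_4")
toy <- data.frame(sample = rep(sampleOrder, each = 2),
                  probe = c("probeA", "probeB"),   # recycled once per sample
                  expression = 1:8)

# Alphabetical order, which is not the chronological one
print(sort(unique(toy$sample)))

# Fix the order by making sample a factor whose levels follow sampleOrder.
toy$sample <- factor(toy$sample, levels = sampleOrder)
print(levels(toy$sample))
```

Applied to prDatTall, the same idea would use the column names of hDat in their original order as the levels, assuming that order is the chronological one.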