Keywords

data science, visualization, R language, ggplot2, gridExtra

Description

R plotting package ggplot2 gives a wide variety of options for data presentation in different kind of charts. But sometimes we need something more advance, on example - combining different chart types into one figure. In this article we will describe one of the possible ways to do this - by using gridExtra package.

Intro

While writing my report on Exploration of Ukrainian EIT-2016 results, in last section EIT results by age, I found that I need a chart that would show each Subject result for each Age of EIT participants, and also aggregated results for both Subjects and Ages. For this purpose heatmap with barplots on both axes was what I actually needed.

Load packages

To run code from this article you will need to install and load following packages:

library(reshape2)
library(ggplot2)
library(gridExtra)
library(grid)
library(cowplot)

Using gridExtra:::grid.arrange

Method grid.arrange from gridExtra package is very handy, when it comes to arrange several graphic objects (chart, text, image, etc) into single figure in terms of rows and columns of imaginable grid. Read more in Arranging multiple grobs on a page by Baptiste Auguie. On example, what you can do is put four charts in a 2x2 grid:

set.seed(777)
v <- sample(seq(1:100), size=1000, replace=TRUE)
m <- matrix(v, ncol = 4)
df <- data.frame(m)
colnames(df) <- c("Val", "X", "Y", "Z")
df.melt <- melt(df, id.vars=c("Val"))

histogram <- qplot(v, geom="histogram") 
heatmap <- qplot(x=Val, y=variable, data=df.melt, fill=value, geom="tile")
lines <- qplot(x=Val, y=Y, data=df, geom="line")
scatterplot <- qplot(Val, value, data = df.melt)
    
grid.arrange(histogram, heatmap, lines, scatterplot, nrow = 2, ncol = 2)

So the plan is to prepare heatmap and two barplots with apropriate formatting (y barplot rotated, both barplots with flipped axes, font sizes, etc), and arrange charts together. Additional feature will be to put heatmap colorbar legend separately in upper right corner.

Prepare demo data

To follow with our task we need to prepare some random data first:

set.seed(777)
x.values <- seq(1:10)
y.values <-  LETTERS[seq(1:10)]
v <- sample(seq(1:10), size=100, replace=TRUE)
m <- matrix(v, nrow = 10, ncol = 10)
df <- data.frame(m)
colnames(df) <- y.values
df$x.values <- x.values
print(df)
##     A B  C D  E  F  G  H I  J x.values
## 1   7 8  4 9  1  7  8  2 4  8        1
## 2   5 7  7 8  3  4  2  8 5  1        2
## 3   4 6  6 9  6  1  9 10 4  3        3
## 4  10 6  2 4  4  2  1  5 5  9        4
## 5   7 9  1 3  3  5  9  4 3  3        5
## 6   1 2  8 4 10  2  1 10 5  9        6
## 7   4 5  5 8  9  1  2  3 3  4        7
## 8   2 9  6 3 10 10  3  7 3  3        8
## 9  10 4 10 7 10  3 10  7 8 10        9
## 10  3 4  8 7  2  1  7  4 9 10       10

Also melt it to use within heatmap:

df.melted <- melt(df, id.vars=c("x.values"))
summary(df.melted)
##     x.values       variable      value      
##  Min.   : 1.0   A      :10   Min.   : 1.00  
##  1st Qu.: 3.0   B      :10   1st Qu.: 3.00  
##  Median : 5.5   C      :10   Median : 5.00  
##  Mean   : 5.5   D      :10   Mean   : 5.47  
##  3rd Qu.: 8.0   E      :10   3rd Qu.: 8.00  
##  Max.   :10.0   F      :10   Max.   :10.00  
##                 (Other):40

Prepare heatmap

Now we can build up our heatmap chart, with horizontal legend:

hm <- ggplot(data = df.melted, aes(x = factor(x.values), y = variable, fill = value)) + geom_tile() + 
    scale_fill_distiller(name = "Legend title", palette = "Reds", direction = 1, na.value = "transparent") +
    scale_x_discrete(breaks = unique(df.melted$x.values), labels = unique(df.melted$x.values)) + theme_gray() +
    theme(legend.position = "bottom", legend.direction = "horizontal",
          legend.title = element_text(size = 15), legend.key.size = unit(1,"cm"),
          legend.text = element_text(size = 7)) +
    guides(fill = guide_colorbar(title.position = "top", title.hjust = 0.5))
hm

Extract heatmap legend

To extract heatmap legend we will iterate over GRafical OBjects, searching for “guide-box” name - that would be a legend grob:

# Substract legend for heatmap
tmp <- ggplot_gtable(ggplot_build(hm))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]

After this we will produce new heatmap without legend and axes by applying appropriate theme:

# Remove legend from heatmap
hm.clean <- hm +
    theme(axis.title.y = element_blank(), axis.text.y = element_blank(),
          axis.ticks.y = element_blank(), axis.title.x = element_blank(),
          axis.text.x = element_blank(), axis.ticks.x = element_blank(),
          legend.position="none")
hm.clean

Prepare x axis barplot

Let’s prepare data for x axis barplot - just aggregate melted data mean by x.values column:

df.melted.x.avg <- aggregate(df.melted$value, by = list(df.melted$x.values), FUN = mean)
colnames(df.melted.x.avg) <- c("XCategory", "ValueAvg")
print(df.melted.x.avg)
##    XCategory ValueAvg
## 1          1      5.8
## 2          2      5.0
## 3          3      5.8
## 4          4      4.8
## 5          5      4.7
## 6          6      5.2
## 7          7      4.4
## 8          8      5.6
## 9          9      7.9
## 10        10      5.5

Now, let’s create a barplot with fill by ValueAvg, with XCategory on x axis, with no y axis nor legend:

bp.x <- ggplot(data = df.melted.x.avg, aes(x = factor(XCategory), y = ValueAvg)) + 
    geom_bar(stat = "identity", aes(fill = ValueAvg)) + theme_gray() +
    theme(axis.title.y = element_blank(), axis.text.y = element_blank(), 
          axis.ticks.y = element_blank(), axis.text.x = element_text(size = 15), 
          axis.title.x = element_text(size = 20, margin = margin(10,0,0,0)),
          legend.position = "none") +
    scale_fill_distiller(name = "Value", palette = "Reds", direction = 1) + 
    labs(x = "X Category")
bp.x

Let’s flip x axis using switch_axis_position method from cowplot package (read about it more in Changing the axis positions by Claus O. Wilke):

bp.x.flip <- switch_axis_position(bp.x, "x")

Prepare y axis barplot

Let’s prepare data for y axis barplot - the same way aggregate melted data mean by y.values column:

df.melted.y.avg <- aggregate(df.melted$value, by = list(df.melted$variable), FUN = mean)
colnames(df.melted.y.avg) <- c("YCategory", "ValueAvg")
print(df.melted.y.avg)
##    YCategory ValueAvg
## 1          A      5.3
## 2          B      6.0
## 3          C      5.7
## 4          D      6.2
## 5          E      5.8
## 6          F      3.6
## 7          G      5.2
## 8          H      6.0
## 9          I      4.9
## 10         J      6.0

Let’s create a barplot with fill by ValueAvg, with YCategory on y axis, with no x axis nor legend, and whole chart rotated using coord_flip function from ggplot2 package:

bp.y <- ggplot(data = df.melted.y.avg, aes(x = YCategory, y = ValueAvg)) + 
    geom_bar(stat = "identity", aes(fill = ValueAvg)) + coord_flip() + theme_gray() +
    theme(axis.title.x = element_blank(), axis.text.x = element_blank(),
          axis.ticks.x = element_blank(), axis.text.y = element_text(size = 15), 
          axis.title.y = element_text(size = 20, margin = margin(0,10,0,0), angle = -90),
          legend.position="none") +
    scale_fill_distiller(name = "Value", palette = "Reds", direction = 1) + 
    labs(x = "Y Category")
bp.y

Let’s flip y axis of y axis barplot:

bp.y.flip <- switch_axis_position(bp.y, "y")

Putting all together

When all job is done, what was left is to arrange charts and add some main title at the top of the figure:

grob.title <- textGrob("Main Title", hjust = 0.5, vjust = 0.5, gp = gpar(fontsize = 20))
grid.arrange(bp.x.flip, legend, hm.clean, bp.y.flip, nrow = 2, ncol = 2, 
             widths = c(30, 40), heights = c(40, 60), top = grob.title)