data science, visualization, R language, ggplot2, gridExtra
R plotting package ggplot2 gives a wide variety of options for data presentation in different kind of charts. But sometimes we need something more advance, on example - combining different chart types into one figure. In this article we will describe one of the possible ways to do this - by using gridExtra package.
While writing my report on Exploration of Ukrainian EIT-2016 results, in last section EIT results by age, I found that I need a chart that would show each Subject result for each Age of EIT participants, and also aggregated results for both Subjects and Ages. For this purpose heatmap with barplots on both axes was what I actually needed.
To run code from this article you will need to install and load following packages:
library(reshape2)
library(ggplot2)
library(gridExtra)
library(grid)
library(cowplot)
Method grid.arrange from gridExtra package is very handy, when it comes to arrange several graphic objects (chart, text, image, etc) into single figure in terms of rows and columns of imaginable grid. Read more in Arranging multiple grobs on a page by Baptiste Auguie. On example, what you can do is put four charts in a 2x2 grid:
set.seed(777)
v <- sample(seq(1:100), size=1000, replace=TRUE)
m <- matrix(v, ncol = 4)
df <- data.frame(m)
colnames(df) <- c("Val", "X", "Y", "Z")
df.melt <- melt(df, id.vars=c("Val"))
histogram <- qplot(v, geom="histogram")
heatmap <- qplot(x=Val, y=variable, data=df.melt, fill=value, geom="tile")
lines <- qplot(x=Val, y=Y, data=df, geom="line")
scatterplot <- qplot(Val, value, data = df.melt)
grid.arrange(histogram, heatmap, lines, scatterplot, nrow = 2, ncol = 2)
So the plan is to prepare heatmap and two barplots with apropriate formatting (y barplot rotated, both barplots with flipped axes, font sizes, etc), and arrange charts together. Additional feature will be to put heatmap colorbar legend separately in upper right corner.
To follow with our task we need to prepare some random data first:
set.seed(777)
x.values <- seq(1:10)
y.values <- LETTERS[seq(1:10)]
v <- sample(seq(1:10), size=100, replace=TRUE)
m <- matrix(v, nrow = 10, ncol = 10)
df <- data.frame(m)
colnames(df) <- y.values
df$x.values <- x.values
print(df)
## A B C D E F G H I J x.values
## 1 7 8 4 9 1 7 8 2 4 8 1
## 2 5 7 7 8 3 4 2 8 5 1 2
## 3 4 6 6 9 6 1 9 10 4 3 3
## 4 10 6 2 4 4 2 1 5 5 9 4
## 5 7 9 1 3 3 5 9 4 3 3 5
## 6 1 2 8 4 10 2 1 10 5 9 6
## 7 4 5 5 8 9 1 2 3 3 4 7
## 8 2 9 6 3 10 10 3 7 3 3 8
## 9 10 4 10 7 10 3 10 7 8 10 9
## 10 3 4 8 7 2 1 7 4 9 10 10
Also melt it to use within heatmap:
df.melted <- melt(df, id.vars=c("x.values"))
summary(df.melted)
## x.values variable value
## Min. : 1.0 A :10 Min. : 1.00
## 1st Qu.: 3.0 B :10 1st Qu.: 3.00
## Median : 5.5 C :10 Median : 5.00
## Mean : 5.5 D :10 Mean : 5.47
## 3rd Qu.: 8.0 E :10 3rd Qu.: 8.00
## Max. :10.0 F :10 Max. :10.00
## (Other):40
Now we can build up our heatmap chart, with horizontal legend:
hm <- ggplot(data = df.melted, aes(x = factor(x.values), y = variable, fill = value)) + geom_tile() +
scale_fill_distiller(name = "Legend title", palette = "Reds", direction = 1, na.value = "transparent") +
scale_x_discrete(breaks = unique(df.melted$x.values), labels = unique(df.melted$x.values)) + theme_gray() +
theme(legend.position = "bottom", legend.direction = "horizontal",
legend.title = element_text(size = 15), legend.key.size = unit(1,"cm"),
legend.text = element_text(size = 7)) +
guides(fill = guide_colorbar(title.position = "top", title.hjust = 0.5))
hm
To extract heatmap legend we will iterate over GRafical OBjects, searching for “guide-box” name - that would be a legend grob:
# Substract legend for heatmap
tmp <- ggplot_gtable(ggplot_build(hm))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
After this we will produce new heatmap without legend and axes by applying appropriate theme:
# Remove legend from heatmap
hm.clean <- hm +
theme(axis.title.y = element_blank(), axis.text.y = element_blank(),
axis.ticks.y = element_blank(), axis.title.x = element_blank(),
axis.text.x = element_blank(), axis.ticks.x = element_blank(),
legend.position="none")
hm.clean
Let’s prepare data for x axis barplot - just aggregate melted data mean by x.values column:
df.melted.x.avg <- aggregate(df.melted$value, by = list(df.melted$x.values), FUN = mean)
colnames(df.melted.x.avg) <- c("XCategory", "ValueAvg")
print(df.melted.x.avg)
## XCategory ValueAvg
## 1 1 5.8
## 2 2 5.0
## 3 3 5.8
## 4 4 4.8
## 5 5 4.7
## 6 6 5.2
## 7 7 4.4
## 8 8 5.6
## 9 9 7.9
## 10 10 5.5
Now, let’s create a barplot with fill by ValueAvg, with XCategory on x axis, with no y axis nor legend:
bp.x <- ggplot(data = df.melted.x.avg, aes(x = factor(XCategory), y = ValueAvg)) +
geom_bar(stat = "identity", aes(fill = ValueAvg)) + theme_gray() +
theme(axis.title.y = element_blank(), axis.text.y = element_blank(),
axis.ticks.y = element_blank(), axis.text.x = element_text(size = 15),
axis.title.x = element_text(size = 20, margin = margin(10,0,0,0)),
legend.position = "none") +
scale_fill_distiller(name = "Value", palette = "Reds", direction = 1) +
labs(x = "X Category")
bp.x
Let’s flip x axis using switch_axis_position method from cowplot package (read about it more in Changing the axis positions by Claus O. Wilke):
bp.x.flip <- switch_axis_position(bp.x, "x")
Let’s prepare data for y axis barplot - the same way aggregate melted data mean by y.values column:
df.melted.y.avg <- aggregate(df.melted$value, by = list(df.melted$variable), FUN = mean)
colnames(df.melted.y.avg) <- c("YCategory", "ValueAvg")
print(df.melted.y.avg)
## YCategory ValueAvg
## 1 A 5.3
## 2 B 6.0
## 3 C 5.7
## 4 D 6.2
## 5 E 5.8
## 6 F 3.6
## 7 G 5.2
## 8 H 6.0
## 9 I 4.9
## 10 J 6.0
Let’s create a barplot with fill by ValueAvg, with YCategory on y axis, with no x axis nor legend, and whole chart rotated using coord_flip function from ggplot2 package:
bp.y <- ggplot(data = df.melted.y.avg, aes(x = YCategory, y = ValueAvg)) +
geom_bar(stat = "identity", aes(fill = ValueAvg)) + coord_flip() + theme_gray() +
theme(axis.title.x = element_blank(), axis.text.x = element_blank(),
axis.ticks.x = element_blank(), axis.text.y = element_text(size = 15),
axis.title.y = element_text(size = 20, margin = margin(0,10,0,0), angle = -90),
legend.position="none") +
scale_fill_distiller(name = "Value", palette = "Reds", direction = 1) +
labs(x = "Y Category")
bp.y
Let’s flip y axis of y axis barplot:
bp.y.flip <- switch_axis_position(bp.y, "y")
When all job is done, what was left is to arrange charts and add some main title at the top of the figure:
grob.title <- textGrob("Main Title", hjust = 0.5, vjust = 0.5, gp = gpar(fontsize = 20))
grid.arrange(bp.x.flip, legend, hm.clean, bp.y.flip, nrow = 2, ncol = 2,
widths = c(30, 40), heights = c(40, 60), top = grob.title)