Often, one wants to place two or more plots side by side to show different aspects of the same story in a compelling way. This is the scenario that patchwork was built to solve. At its heart, patchwork is a package that extends ggplot2’s use of the + operator to work between multiple plots, and adds further operators for specialized compositions and for working with compositions of plots.
As an example of the most basic use of patchwork, we’ll use the following four plots of the mpg dataset.
library(ggplot2)
p1 <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy))
p2 <- ggplot(mpg) +
  geom_bar(aes(x = as.character(year), fill = drv), position = "dodge") +
  labs(x = "year")
p3 <- ggplot(mpg) +
  geom_density(aes(x = hwy, fill = drv), colour = NA) +
  facet_grid(rows = vars(drv))
p4 <- ggplot(mpg) +
  stat_summary(aes(x = drv, y = hwy, fill = drv), geom = "col", fun.data = mean_se) +
  stat_summary(aes(x = drv, y = hwy), geom = "errorbar", fun.data = mean_se, width = 0.5)
library(patchwork)
p1 + p2
+ does not specify any specific layout, only that the plots should be
displayed together. In the absence of a layout the same algorithm that
governs the number of rows and columns in facet_wrap() will decide the
number of rows and columns. This means that adding 3 plots together will
create a 1x3 grid while adding 4 plots together will create a 2x2
grid.
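The default grid shape can be inspected directly. The sketch below assumes that ggplot2's exported wrap_dims() helper reflects the facet_wrap() default that the paragraph above refers to; it only prints the number of rows and columns for a given number of plots and does not draw anything.

```r
library(ggplot2)

# wrap_dims(n) returns c(rows, columns) for n panels when neither
# dimension is supplied, i.e. the default facet_wrap() grid shape.
for (n in 2:6) {
  dims <- wrap_dims(n)
  cat(n, "plots:", dims[1], "row(s) x", dims[2], "column(s)\n")
}
```

For three and four plots this should report the 1x3 and 2x2 grids described above.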
Motive of the graphs (p1 + p2):
* Overall purpose: to explore and compare characteristics of the cars in the mpg dataset.
* Left plot: shows the relationship between engine size (displ) and highway fuel efficiency (hwy).
* Right plot: compares the number of car models by drive type (drv) in the two model years.
* Why these plots together? They give a quick overview of both a continuous relationship (engine size vs. fuel efficiency) and a categorical distribution (drive type by year) in the same dataset.

Summary: the graphs help you understand how engine size relates to fuel efficiency, and how the mix of drivetrains in the dataset has changed (or not) between 1999 and 2008.
p1 + p2 + p3 + p4
As can be seen from the two examples above, patchwork takes care of
aligning the different parts of the plots with each other. You can see
that all plotting regions are aligned, even in the presence of faceting.
Further, you can see that the y-axis titles in the two left-most plots
are aligned despite the axis text in the bottom left plot being
wider.
Overall motive of the graphs: these four plots together provide an overview of the relationships between engine size, fuel efficiency, drive type, and year in the mpg dataset:
* Scatter plot: shows the general trend between engine size and fuel efficiency.
* Bar plot (year vs. drv): shows how the types of drivetrains are distributed across years.
* Density plots: show how fuel efficiency is distributed within each drive type.
* Mean bar plot: summarizes the average fuel efficiency for each drive type, with error bars for the uncertainty.

Purpose: to help you understand how car characteristics (engine size, drive type, year) relate to fuel efficiency, and how these characteristics are distributed in the dataset. This is useful for exploring trends, making comparisons, and generating hypotheses about which factors influence fuel economy in cars.
What are error bars? Error bars are graphical representations of the variability or uncertainty in your data. Depending on the statistic they display, they show either how spread out the observations are or how precisely a summary such as the mean has been estimated.

What do the error bars mean here? In the bar plot (p4), the error bar on the bar for each drv (drive type) shows the uncertainty around the mean highway miles per gallon (hwy) for that group. Because p4 uses fun.data = mean_se, each error bar spans the mean plus and minus one standard error. In general, error bars can represent:
* Standard error of the mean (SEM): how much the sample mean is expected to vary around the true population mean.
* Confidence interval (often 95%): a range that is expected to contain the true mean with the stated level of confidence.
* Standard deviation: the spread of the data themselves (less common for bar plots of means).

Because the interpretation depends on which of these is drawn, a figure should always state what its error bars represent.

Why use error bars?
* To show reliability: they help you judge whether differences between groups (e.g., drive types) could be due to random variation.
* To visualize uncertainty: they remind us that the mean is an estimate, not an exact value.
* To compare groups: they give a first visual impression of whether group means differ by more than their uncertainty.

How to interpret error bars:
* Length of the error bar: for a standard error or confidence interval, a short bar means the mean is estimated precisely (low variability, a large sample, or both), and a long bar means it is estimated imprecisely. For a standard deviation, the length shows the spread of the data and does not shrink as the sample grows.
* Comparing groups: if 95% confidence intervals of two groups do not overlap, the difference between the means is statistically significant at roughly the 5% level, but overlapping intervals do not rule out a significant difference. Non-overlapping standard-error bars, as in p4, are not sufficient evidence on their own; a formal test is needed.
* What the error bar represents: a standard error or confidence interval shows the uncertainty in the estimate of the mean; a standard deviation shows the spread of the data, not the uncertainty in the mean.
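To see the numbers behind the error bars in p4, the same summary function can be applied by hand. This is a minimal sketch: hwy_by_drv and se_table are illustrative names, and the grouping is done with base R rather than with whatever stat_summary() does internally.

```r
library(ggplot2)

# Split highway mileage by drive type and apply the summary function
# that p4 passes to stat_summary(). mean_se() returns the mean (y)
# and the mean minus/plus one standard error (ymin, ymax).
hwy_by_drv <- split(mpg$hwy, mpg$drv)
se_table <- do.call(rbind, lapply(hwy_by_drv, mean_se))

# Add the group sizes: larger groups tend to have smaller standard errors.
se_table$n <- lengths(hwy_by_drv)
se_table
```

Each row corresponds to one drive type; ymin and ymax are the lower and upper ends of the error bar drawn for that group.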
Often the automatically created grid is not what you want, and it is of course possible to control it. The most direct and powerful way to do this is to add a plot_layout() specification to the plot:
p1 + p2 + p3 + plot_layout(ncol = 2)
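Beyond the number of columns, plot_layout() also takes widths, heights, and byrow arguments. The sketch below uses simplified stand-ins for p1, p2, and p3 so that it runs on its own; wide_left is an illustrative name.

```r
library(ggplot2)
library(patchwork)

# Simplified stand-ins for the plots defined earlier.
p1 <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))
p2 <- ggplot(mpg) + geom_bar(aes(x = as.character(year), fill = drv), position = "dodge")
p3 <- ggplot(mpg) + geom_density(aes(x = hwy, fill = drv), colour = NA)

# Two columns, the first twice as wide as the second.
# byrow = FALSE fills the grid column by column instead of row by row.
wide_left <- p1 + p2 + p3 +
  plot_layout(ncol = 2, widths = c(2, 1), byrow = FALSE)
wide_left
```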
A common scenario is wanting to force a single row or column. patchwork
provides two operators, | and / respectively, to facilitate this (under
the hood they simply set number of rows or columns in the layout to
1).
p1 / p2
# Basically the same as using `+` but the intent is clearer
p3 | p4
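Since the two operators are described as shorthand for a one-row or one-column layout, the same arrangements can be spelled out with plot_layout(). This is a sketch with simplified stand-ins for p1 and p2; stacked and beside are illustrative names.

```r
library(ggplot2)
library(patchwork)

# Simplified stand-ins for the plots defined earlier.
p1 <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))
p2 <- ggplot(mpg) + geom_bar(aes(x = as.character(year), fill = drv), position = "dodge")

# One column: the same arrangement as p1 / p2
stacked <- p1 + p2 + plot_layout(ncol = 1)

# One row: the same arrangement as p1 | p2
beside <- p1 + p2 + plot_layout(nrow = 1)

stacked
beside
```

The operator forms are shorter and nest more naturally, which is why they are usually preferred for simple rows and columns.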
patchwork allows nesting layouts, which means that it is possible to create some very intricate layouts using just these two operators:
# Use | to arrange plots in a row. Use / to arrange plots in a column.
p3 | (p2 / (p1 | p4))
For layouts that these operators cannot express, plot_layout() also accepts a textual design, in which each letter marks the cells occupied by one plot and # marks an empty cell:
layout <- "
AAB
C#B
CDD
"
p1 + p2 + p3 + p4 + plot_layout(design = layout)
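The same design can also be built programmatically with patchwork's area() function, which takes the top, left, bottom, and right cell indices of each plot. The sketch below reproduces the AAB / C#B / CDD design under that assumption, using simplified stand-ins for the four plots; composed is an illustrative name.

```r
library(ggplot2)
library(patchwork)

# Simplified stand-ins for the plots defined earlier.
p1 <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))
p2 <- ggplot(mpg) + geom_bar(aes(x = as.character(year), fill = drv), position = "dodge")
p3 <- ggplot(mpg) + geom_density(aes(x = hwy, fill = drv), colour = NA)
p4 <- ggplot(mpg) + geom_boxplot(aes(x = drv, y = hwy))

# area(t, l, b, r): top, left, bottom, right cell indices (inclusive).
layout <- c(
  area(t = 1, l = 1, b = 1, r = 2),  # A: row 1, columns 1-2
  area(t = 1, l = 3, b = 2, r = 3),  # B: rows 1-2, column 3
  area(t = 2, l = 1, b = 3, r = 1),  # C: rows 2-3, column 1
  area(t = 3, l = 2, b = 3, r = 3)   # D: row 3, columns 2-3
)

composed <- p1 + p2 + p3 + p4 + plot_layout(design = layout)
composed
```

The text form is easier to read at a glance, while area() is convenient when the cell indices are computed rather than typed.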
Each character (A, B, C, D, #) in the design string represents a cell in a grid. Each letter refers to one plot, assigned in the order the plots are added (A is p1, B is p2, C is p3, D is p4). The # symbol marks an empty cell (no plot). The rows and columns are defined by the arrangement of the characters.

Visual representation:

|   | 1 | 2 | 3 |
|---|---|---|---|
| 1 | A | A | B |
| 2 | C | # | B |
| 3 | C | D | D |

* A occupies the first two cells of the first row.
* B occupies the top right and middle right cells.
* C occupies the left column in rows 2 and 3.
* D occupies the two bottom right cells.
* # is an empty space (no plot).

What is the motive of this layout?
* Custom arrangement: it lets you create complex layouts in which plots span several cells and cells can be left empty, not just simple grids.
* Highlighting relationships: you can make some plots larger (by spanning multiple cells) or position them to emphasize relationships or importance.
* Professional presentation: it is useful for publication-quality figures where you want precise control over plot arrangement.

Why and when to use it?
* When you need more than a simple grid: if you want some plots to be bigger, want empty spaces, or want an arrangement that tells a better story.
* For composite figures: when preparing figures for papers, posters, or presentations that combine multiple plots in a visually appealing and informative way.
* To match journal requirements: some journals require specific figure layouts.

As has been apparent in the last couple of plots, the legend often becomes redundant between plots. While it is possible to remove the legend in all but one plot before assembling them, patchwork provides something easier for the common case:
p1 + p2 + p3 + plot_layout(ncol = 2, guides = "collect")
Electing to collect guides will take all guides and put them together at
the position governed by the global theme. Further, it will remove any
duplicate guide leaving only unique guides in the plot. The duplication
detection looks at the appearance of the guide, and not the underlying
scale it comes from. Thus, it will only remove guides that are exactly
alike. If you want to optimize space use by putting guides in an empty
area of the layout, you can specify a plotting area for collected
guides:
p1 + p2 + p3 + guide_area() + plot_layout(ncol = 2, guides = "collect")
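Because the collected guides are positioned by the global theme, their placement can be changed by applying a theme to the whole assembly. A minimal sketch, using simplified stand-ins for p1, p2, and p3 and the illustrative name legend_below:

```r
library(ggplot2)
library(patchwork)

# Simplified stand-ins for the plots defined earlier.
p1 <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))
p2 <- ggplot(mpg) + geom_bar(aes(x = as.character(year), fill = drv), position = "dodge")
p3 <- ggplot(mpg) + geom_density(aes(x = hwy, fill = drv), colour = NA)

# Collect the guides, then move the shared legend below the plots.
# `&` binds more loosely than `+`, so the theme applies to the whole assembly.
legend_below <- p1 + p2 + p3 +
  plot_layout(ncol = 2, guides = "collect") &
  theme(legend.position = "bottom")
legend_below
```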
# Modifying subplots
One of the tenets of patchwork is that the plots
remain as standard ggplot objects until rendered. This means that they
are amenable to modification after they have been assembled. The
specific plots can be retrieved and set with [[]] indexing:
p12 <- p1 + p2
p12[[2]] <- p12[[2]] + theme_light()
p12
Often though, it is necessary to modify all subplots at once, e.g. to give them a common theme. patchwork provides the & operator for this scenario:
p1 + p4 & theme_minimal()
This can also be used to give plots a common axis if they share the same
aesthetic on that axis:
p1 + p4 & scale_y_continuous(limits = c(0, 45))
# Adding annotation
Once plots have been assembled, they start to form a single unit. This also means that titles, subtitles, and captions will often pertain to the full ensemble and not individual plots. Titles etc. can be added to patchwork plots using the plot_annotation() function.
p34 <- p3 + p4 + plot_annotation(
title = "A closer look at the effect of drive train in cars",
caption = "Source: mpg dataset in ggplot2"
)
p34
The titles are formatted according to the theme specification in the
plot_annotation() call.
p34 + plot_annotation(theme = theme_gray(base_family = "mono"))
As the global theme often follows the theme of the subplots, using &
along with a theme object will modify the global theme as well as the
themes of the subplots:
p34 & theme_gray(base_family = "mono")
Another type of annotation, known especially in scientific literature, is to add tags to each subplot that will then be used to identify them in the text and caption. ggplot2 has the tag element for exactly this, and patchwork offers functionality to set it automatically using the tag_levels argument. It can generate automatic levels in Latin characters, Arabic numerals, or Roman numerals:
p123 <- p1 | (p2 / p3)
p123 + plot_annotation(tag_levels = "I") # Uppercase Roman numerals
An additional feature is that it is possible to use nesting to define
new tagging levels:
p123[[2]] <- p123[[2]] + plot_layout(tag_level = "new")
p123 + plot_annotation(tag_levels = c("I", "a"))
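The other tag styles work the same way. The sketch below uses simplified stand-ins for the three plots and shows lowercase letters, Arabic numerals, and a tag with the tag_prefix and tag_suffix arguments of plot_annotation(); the object names are illustrative.

```r
library(ggplot2)
library(patchwork)

# Simplified stand-ins for the plots defined earlier.
p1 <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))
p2 <- ggplot(mpg) + geom_bar(aes(x = as.character(year), fill = drv), position = "dodge")
p3 <- ggplot(mpg) + geom_density(aes(x = hwy, fill = drv), colour = NA)
p123 <- p1 | (p2 / p3)

# Lowercase Latin letters: a, b, c
tags_letters <- p123 + plot_annotation(tag_levels = "a")

# Arabic numerals: 1, 2, 3
tags_numbers <- p123 + plot_annotation(tag_levels = "1")

# Uppercase letters wrapped in a prefix and suffix: (A), (B), (C)
tags_wrapped <- p123 +
  plot_annotation(tag_levels = "A", tag_prefix = "(", tag_suffix = ")")

tags_wrapped
```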
As can be seen, patchwork offers a wide range of possibilities when it
comes to arranging plots, and the API scales with the level of
complexity of the assembly, from simply using + to place multiple plots
in the same area, to using nesting, layouts, and annotations to create
advanced custom layouts.
While a lot of the functionality in patchwork is concerned with aligning plots in a grid, it also allows you to make insets, i.e. small plots placed on top of another plot. The functionality for this is wrapped in the inset_element() function which serves to mark the given plot as an inset to be placed on the preceding plot, along with recording the wanted placement etc. The basic usage is like this:
p1 + inset_element(p2, left = 0.5, bottom = 0.4, right = 0.9, top = 0.95)
The position is specified by giving the left, right, top, and bottom
location of the inset. The default is to use npc units, which go from 0
to 1 in the given area, but any grid::unit() can be used by giving it
explicitly. The location is by default set relative to the panel area, but this
can be changed with the align_to argument. Combining all this, we can
place an inset exactly 15 mm from the top right corner like this:
p1 +
  inset_element(
    p2,
    left = 0.4,
    bottom = 0.4,
    right = unit(1, "npc") - unit(15, "mm"),
    top = unit(1, "npc") - unit(15, "mm"),
    align_to = "full"
  )
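To make the effect of align_to concrete, the sketch below places the same inset with identical npc coordinates relative to the panel and relative to the full plot area. It uses simplified stand-ins for p1 and p2, and the object names are illustrative.

```r
library(ggplot2)
library(patchwork)

# Simplified stand-ins for the plots defined earlier.
p1 <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))
p2 <- ggplot(mpg) + geom_bar(aes(x = as.character(year)))

# Coordinates measured within the panel (the default): the inset
# stays inside the plotting region of p1.
inset_in_panel <- p1 +
  inset_element(p2, left = 0.6, bottom = 0.6, right = 1, top = 1,
                align_to = "panel")

# The same coordinates measured over the full area, which also
# covers the axes and titles of p1.
inset_in_full <- p1 +
  inset_element(p2, left = 0.6, bottom = 0.6, right = 1, top = 1,
                align_to = "full")

inset_in_panel
inset_in_full
```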
Insets are not confined to ggplots. Any graphics supported by
wrap_elements() can be used, including patchworks:
p24 <- p2 / p4 + plot_layout(guides = "collect")
p1 + inset_element(p24, left = 0.5, bottom = 0.05, right = 0.95, top = 0.9)
A nice feature of insets is that they behave as standard patchwork subplots until they are rendered. This means that they are amenable to modifications after assembly, e.g. using &:
p12 <- p1 + inset_element(p2, left = 0.5, bottom = 0.5, right = 0.9, top = 0.95)
p12 & theme_bw()
And auto tagging works as expected as well:
p12 + plot_annotation(tag_levels = "A")