When we visualize data, our primary goal is often to uncover patterns, trends, and relationships. However, datasets are rarely homogeneous; they often contain distinct subgroups, such as different categories, time periods, or experimental conditions. Visualizing the entire dataset as a single entity can obscure important details or, worse, lead to misleading conclusions, a phenomenon famously illustrated by Simpson’s Paradox.
Separating out data by these subgroups in our visualizations is crucial for several reasons. It allows us to reveal subgroup-specific patterns, as different groups within your data might behave very differently. These unique trends might be averaged out or hidden in an aggregated view. Furthermore, by plotting subgroups side-by-side or distinguishing them with visual cues like different colors or shapes, we can effectively make comparisons of their characteristics, distributions, or trends. For complex datasets with many observations or multiple variables, breaking down the visualization into smaller, more focused parts can significantly improve clarity, making it much easier to understand and interpret. Finally, separating data helps in avoiding misinterpretation, as aggregated data can hide underlying variations. For example, a positive trend observed in an overall dataset might actually mask a negative trend within one or more significant subgroups.
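To make the last point concrete, here is a minimal sketch using made-up numbers (the `toy` data frame and its two groups are purely hypothetical, not drawn from any dataset used in this lesson). Each group has a negative trend, yet a single model fit to the pooled data reports a positive one.

```r
# Hypothetical toy data: within each group, y falls as x rises
group_a <- data.frame(group = "A", x = 1:5,  y = c(10, 9, 8, 7, 6))
group_b <- data.frame(group = "B", x = 6:10, y = c(20, 19, 18, 17, 16))
toy <- rbind(group_a, group_b)

# Slope of a linear model fit to the pooled data
overall_slope <- coef(lm(y ~ x, data = toy))[["x"]]

# Slopes of separate models fit to each subgroup
slope_a <- coef(lm(y ~ x, data = group_a))[["x"]]
slope_b <- coef(lm(y ~ x, data = group_b))[["x"]]

overall_slope  # positive (about 1.27)
slope_a        # negative (-1)
slope_b        # negative (-1)
```

Group B sits higher on both axes than group A, so the pooled fit mostly tracks the gap between the groups and not the trend inside either one.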
R’s ggplot2 package offers powerful and flexible ways to achieve this separation, primarily through two main strategies:
1. Mapping variables to aesthetics: Using visual properties like color, shape, size, or linetype to distinguish groups within a single plot.
2. Faceting: Creating multiple subplots (small multiples), where each subplot displays a different subset of the data.
Let’s explore these methods.
One common and intuitive way to separate data is by mapping a categorical variable from your dataset to an aesthetic property of the geoms in your plot. Aesthetics like color, shape, size, or linetype can be used to visually differentiate groups.
In the example below, we want to see if the relationship between GDP per capita and life expectancy differs across continents. We achieve this by mapping the continent variable to the color aesthetic within aes(). When this mapping is done, ggplot2 automatically assigns a unique color to each continent, applies these colors to the points (geom_point) and the smoothed lines (geom_smooth), and generates a legend to identify which color corresponds to which continent.
p <- ggplot(data = gapminder %>% filter(year==1987),
mapping = aes(x = gdpPercap,
y = lifeExp,
color = continent))
p + geom_point(alpha=0.2) +
geom_smooth(method = "lm", se=F, formula = y ~ x) +
scale_x_log10(labels = scales::dollar_format(accuracy = 1)) +
labs(x = "GDP Per Capita", y = "Life Expectancy in Years",
title = "Economic Growth and Life Expectancy by Continent",
subtitle = "Data points are country-years; lines are per-continent linear models",
caption = "Source: Gapminder.")
This single plot now clearly shows five distinct lines, one for each continent, allowing us to compare their respective trends. For instance, we can observe differences in the slope or intercept of the relationship between GDP and life expectancy across continents.
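Color is only one of the aesthetics listed earlier. As a minimal sketch of the others, the block below maps a grouping variable to shape and linetype. It uses the built-in mtcars data and not Gapminder, purely so that it runs without extra packages; the object names `cars` and `p_shape` are illustrative.

```r
library(ggplot2)

# mtcars ships with R; cyl is numeric, so convert it to a factor
# so that ggplot2 treats it as a discrete grouping variable
cars <- transform(mtcars, cyl = factor(cyl))

p_shape <- ggplot(cars, aes(x = wt, y = mpg, shape = cyl, linetype = cyl)) +
  geom_point() +
  # one fitted line per cylinder group, distinguished by line type
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "black") +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon",
       shape = "Cylinders", linetype = "Cylinders")

p_shape
```

Shape and linetype remain readable in grayscale print, which is one reason to consider them alongside or in place of color.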
Another powerful method for separating data is faceting. Faceting creates a grid of plots (often called “small multiples”), where each plot displays a subset of the data corresponding to a level of one or more categorical variables. This is particularly useful when you want to see the same type of plot for different groups side-by-side, making comparisons straightforward, especially if using aesthetics for separation would lead to a cluttered plot.
Here’s an example from Jonathan Rodden’s work on left vote share and population density in majoritarian democracies. Notice how much information it presents in a very compact but clear fashion. And notice how it’s obviously a ggplot…
facet_wrap()
facet_wrap() is typically used to create a grid of plots based on a single categorical variable. It “wraps” a sequence of panels into a 2D grid. Let’s use it to create separate plots for each continent, showing the relationship between GDP per capita and life expectancy.
In this case, the color aesthetic is removed from the main aes() mapping (or not set globally), and instead, facet_wrap(~ continent) is added. This tells ggplot2 to create one panel for each unique value in the continent column.
# Define the base plot without color aesthetic for faceting
# Filter once and reuse the filtered data in the plot
p_base_data <- gapminder %>% filter(year == 1987)
p_base <- ggplot(data = p_base_data,
                 mapping = aes(x = gdpPercap, y = lifeExp))
p_base + geom_point(alpha = 0.2) +
geom_smooth(method = "lm", formula = y ~ x) +
scale_x_log10(labels = scales::dollar_format(accuracy = 1)) +
facet_wrap(~ continent) +
labs(x = "GDP Per Capita", y = "Life Expectancy in Years",
title = "Economic Growth and Life Expectancy (Faceted by Continent)",
subtitle = "Data points are country-years; one plot per continent",
caption = "Source: Gapminder.")
## Warning in qt((1 - level)/2, df): NaNs produced
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
In this faceted plot, facet_wrap(~ continent) creates a separate plot panel for each continent. Each plot shows all data points for that continent and a single linear model fit to that continent’s data. By default, all facets share the same x and y scales, which is crucial for making valid comparisons of trends and data distributions across the facets.
levels(gapminder$continent)
## [1] "Africa" "Americas" "Asia" "Europe" "Oceania"
forcats::fct_reorder()
By default, facet_wrap() orders the facets alphabetically by the levels of the faceting variable (if it’s a character string) or by the factor levels (if it’s a factor). Sometimes, you might want to order the facets in a more meaningful way, for example, by some summary statistic of the data within each facet.
The forcats package (part of the tidyverse) provides functions that can make factor manipulation, including reordering, more concise. The fct_reorder() function is particularly useful here. It reorders the levels of a factor based on values of another variable (typically after a summary function like mean or median is applied).
Let’s reorder the continent facets by their average life expectancy in 1987, from highest to lowest (descending order), using fct_reorder():
# Reorder continent factor directly within the mutate step using fct_reorder
# We want to order 'continent' by 'lifeExp', using the mean of lifeExp for ordering.
# The .desc = TRUE argument ensures descending order.
p_base_data <- gapminder %>% filter(year==1987)
p_base_data_forcats_ordered <- p_base_data %>%
filter(!is.na(lifeExp)) %>%
mutate(continent = fct_reorder(continent, lifeExp, .fun = mean, .na_rm = TRUE, .desc = TRUE))
# Create the plot
ggplot(data = p_base_data_forcats_ordered,
mapping = aes(x = gdpPercap, y = lifeExp)) +
geom_point(alpha = 0.2) +
geom_smooth(method = "lm", formula = y ~ x) +
scale_x_log10(labels = scales::dollar_format(accuracy = 1)) +
facet_wrap(~ continent) +
labs(x = "GDP Per Capita", y = "Life Expectancy in Years",
title = "Economic Growth and Life Expectancy (Faceted by Continent, using forcats)",
subtitle = "Facets ordered by average life expectancy in 1987 (highest to lowest)",
caption = "Source: Gapminder.") +
theme_minimal()
## Warning in qt((1 - level)/2, df): NaNs produced
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
As you can see, fct_reorder() makes the code for reordering factors based on summary statistics more compact and often more readable once you are familiar with its syntax. It handles the summarization and re-leveling internally. For simple manual ordering where the order isn’t data-driven, directly using factor(variable, levels = c("level3", "level1", "level2")) remains the most straightforward approach.
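The two approaches can be compared on a small made-up example. The data frame `toy` and its columns `grp` and `val` are hypothetical and exist only to make the resulting level order easy to check by hand.

```r
library(forcats)

# Hypothetical toy data: group means are a = 2, b = 11, c = 6
toy <- data.frame(
  grp = factor(c("a", "a", "b", "b", "c", "c")),
  val = c(1, 3, 10, 12, 5, 7)
)

# Data-driven ordering: levels sorted by the mean of val, highest first
by_mean <- fct_reorder(toy$grp, toy$val, .fun = mean, .desc = TRUE)
levels(by_mean)  # "b" "c" "a"

# Manual ordering: spell out the levels in the order you want
manual <- factor(toy$grp, levels = c("c", "a", "b"))
levels(manual)   # "c" "a" "b"
```

In both cases only the level order changes; the underlying values are untouched, so the same rows end up in the same facets, just arranged differently.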
You can customize the appearance and behavior of facets to improve readability and aesthetics.
You can control the layout of the facets created by facet_wrap() using the nrow or ncol parameters. This determines how many rows or columns the grid of plots will have.
# Using the data with forcats-ordered continents for consistency
ggplot(data = p_base_data_forcats_ordered,
mapping = aes(x = gdpPercap, y = lifeExp)) +
geom_point(alpha = 0.2) +
geom_smooth(method = "lm", formula = y ~ x) +
scale_x_log10(labels = scales::dollar_format(accuracy = 1)) +
facet_wrap(~ continent, ncol = 2) +
labs(x = "GDP Per Capita", y = "Life Expectancy in Years",
title = "Economic Growth and Life Expectancy (2 Columns)",
subtitle = "Facets ordered by average life expectancy",
caption = "Source: Gapminder.") +
theme_minimal()
## Warning in qt((1 - level)/2, df): NaNs produced
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
Here, ncol = 2 arranges the five continent facets into a grid with two columns (and three rows, in this case).
Sometimes, the range of data can vary significantly between facets. While shared scales are good for direct comparison, forcing all facets to use the same scales might make it difficult to see patterns within individual facets if one group has a much wider data range than others. In such cases, it can be useful to allow each facet to have its own scale.
The scales argument in facet_wrap() (and facet_grid()) controls this. The default, "fixed", means all panels share the same scales. Setting scales = "free" allows each panel to use its own scales for both x and y axes. Alternatively, "free_x" lets each panel use its own x-axis scale while sharing the y-axis scale, and "free_y" does the opposite.
# Using the data with forcats-ordered continents
ggplot(data = p_base_data_forcats_ordered,
mapping = aes(x = gdpPercap, y = lifeExp)) +
geom_point(alpha = 0.2) +
geom_smooth(method = "lm", formula = y ~ x) +
scale_x_log10(labels = scales::dollar_format(accuracy = 1)) +
facet_wrap(~ continent, scales = "free") +
labs(x = "GDP Per Capita", y = "Life Expectancy in Years",
title = "Economic Growth and Life Expectancy (Free Scales)",
subtitle = "Facets ordered by average life expectancy; scales vary by continent",
caption = "Source: Gapminder.") +
theme_minimal()
## Warning in qt((1 - level)/2, df): NaNs produced
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
With scales = "free", each facet’s axes are scaled to fit its own data range. This can help reveal details within each specific continent’s plot, but makes direct visual comparison of slopes or absolute values across continents more challenging. The choice depends on the analytical goal.
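A middle ground is to free only one axis. The sketch below uses "free_y" on the built-in mtcars data (chosen only so the example runs without extra packages; `cars` and `p_free_y` are illustrative names). The x-axis stays shared across panels while each panel gets its own y-axis range.

```r
library(ggplot2)

# Convert cyl to a factor so each cylinder count gets its own panel
cars <- transform(mtcars, cyl = factor(cyl))

p_free_y <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point(alpha = 0.6) +
  # x scale is shared; y scale is fitted to each panel's own data
  facet_wrap(~ cyl, scales = "free_y") +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

p_free_y
```

With this setting, horizontal positions remain comparable across panels, but the vertical positions do not, so read the y-axis labels of each panel before comparing heights.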
Let’s apply these concepts to the Statehouse Democracy dataset. First, we set up the data as in previous lessons. This involves selecting relevant variables, filtering out Alaska and Hawaii, and creating some new variables like polconserv and region.name.
states <- get_cspp_data(vars=c("pid","ideo","statemin","pollib_median","region"),years=seq(1976,2020),core=F) %>%
select(-stateno,-state_fips,-state_icpsr) %>%
filter(!st%in%c("AK","HI")) %>%
mutate(
pid = round(pid,2),
ideo = round(ideo,2),
pollib_median = round(pollib_median,2),
polconserv = -(pollib_median),
year=as.character(year),
ymd=lubridate::ymd(paste0(year, "-01-01")),
region.name = case_when(
region == 1 ~ "south",
region == 2 ~ "west",
region == 3 ~ "midwest",
region == 4 ~ "northeast",
TRUE ~ NA_character_
)) %>%
relocate(region.name, .after=state)
Previously, we might have created entirely separate plots for different years. An alternative is to combine data from multiple years onto a single plot. By mapping the year variable to the color aesthetic (after converting year to a factor so ggplot2 treats it as a discrete categorical variable), we can directly compare the relationship between partisanship (pid) and ideology (ideo) for the years 1976 and 2011 on the same set of axes. This allows us to visually assess the overall relationship in each year, whether the slope or intercept of this relationship has changed, and how individual states compare across these time points.
states
## # A tibble: 2,205 × 11
## st state region.name year pid ideo statemin pollib_median region
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <int>
## 1 AL Alabama south 1976 0.23 -0.21 NA -1.43 1
## 2 AL Alabama south 1977 0.25 -0.08 NA -1.42 1
## 3 AL Alabama south 1978 0.44 -0.18 NA -1.47 1
## 4 AL Alabama south 1979 0.38 -0.21 NA -1.51 1
## 5 AL Alabama south 1980 0.38 -0.23 3.1 -1.74 1
## 6 AL Alabama south 1981 0.25 -0.31 3.35 -1.7 1
## 7 AL Alabama south 1982 0.29 -0.26 3.35 -1.76 1
## 8 AL Alabama south 1983 0.29 -0.27 3.35 -1.82 1
## 9 AL Alabama south 1984 0.18 -0.15 3.35 -1.78 1
## 10 AL Alabama south 1985 -0.06 -0.23 3.35 -1.78 1
## # ℹ 2,195 more rows
## # ℹ 2 more variables: polconserv <dbl>, ymd <date>
ggplot(data=states %>% filter(year %in% c("1976","2011")),
aes(x=pid,y=ideo,color=as.factor(year))) +
geom_smooth(method="lm",formula = y ~ x, se=F) +
geom_text(aes(label=st)) +
labs(x="Net Partisanship (Democratic PID)",y="Net Ideology (Liberalism)", color = "Year") +
scale_x_continuous(labels = scales::percent) +
scale_y_continuous(labels = scales::percent) +
theme_bw()
The plot above shows two distinct lines, one for 1976 and one for 2011. However, the state labels (geom_text) are very crowded, making them difficult to read. This illustrates a common challenge when plotting many data points with labels on a single panel.
While the combined plot with colors is useful for seeing overarching shifts, the overlapping text labels reduce its clarity. Faceting by year provides an alternative that can alleviate this clutter. By adding facet_wrap(~year), we create separate panels for 1976 and 2011. This approach offers several advantages: each year gets its own dedicated space, making it easier to examine the specific pattern within that year; text labels are less likely to overlap; and ggplot2 ensures consistent scales across facets by default, crucial for valid comparisons.
ggplot(states %>% filter(year %in% c("1976","2011")),
aes(x=pid,y=ideo)) +
geom_smooth(method="lm",formula = y ~ x, se=F) +
geom_text(aes(label=st)) +
labs(x="Net Partisanship (Democratic PID)",y="Net Ideology (Liberalism)") +
scale_x_continuous(labels = scales::percent) +
scale_y_continuous(labels = scales::percent) +
facet_wrap(~year)
The labels still overlap within each facet, so we need to make them smaller and repel them from one another. geom_text_repel from the ggrepel package is excellent for this.
ggplot(states %>% filter(year %in% c("1976","2011")),
aes(x=pid,y=ideo)) +
geom_smooth(method="lm",formula = y ~ x, se=F) +
ggrepel::geom_text_repel(aes(label=st), size = 3) +
labs(x="Net Partisanship (Democratic PID)",y="Net Ideology (Liberalism)",
title="State Partisanship vs. Ideology", subtitle="Faceted by Year (1976 & 2011)") +
scale_x_continuous(labels = scales::percent) +
scale_y_continuous(labels = scales::percent) +
facet_wrap(~year)+
theme_bw()
## Warning: ggrepel: 6 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
This version is much clearer, with each year’s data neatly presented in its own panel and labels positioned to minimize overlap.
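The ggrepel warning above suggests increasing max.overlaps. As a small sketch of what that looks like, the block below sets max.overlaps = Inf so that no label is dropped. It uses the built-in mtcars data and not the states data so it runs on its own; `cars`, `model`, and `p_labels` are illustrative names.

```r
library(ggplot2)
library(ggrepel)

# Store the row names (car models) in a column so they can be used as labels
cars <- transform(mtcars, model = rownames(mtcars))

p_labels <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  # max.overlaps = Inf keeps every label, even in crowded regions;
  # seed fixes the random starting positions so the layout is reproducible
  geom_text_repel(aes(label = model), size = 3, max.overlaps = Inf, seed = 42)

p_labels
```

Keeping every label can make dense regions harder to read, so whether to raise max.overlaps or accept a few unlabeled points is a judgment call for each figure.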
1. Compare the relationship between partisanship (pid) and state minimum wage (statemin) for the years 1980 and 2011.
2. Compare the relationship between partisanship (pid) and state policy liberalism (pollib_median) across three different years: 1980, 1990, and 2000.
states %>% filter(year %in% c("1976","2011")) %>% select(st, statemin)
## # A tibble: 98 × 2
## st statemin
## <chr> <dbl>
## 1 AL NA
## 2 AL 7.25
## 3 AR NA
## 4 AR 6.25
## 5 AZ NA
## 6 AZ 7.35
## 7 CA NA
## 8 CA 8
## 9 CO NA
## 10 CO 7.36
## # ℹ 88 more rows
ggplot(states %>% filter(year %in% c("1983","2011")),
aes(x=pid,y=statemin)) +
geom_smooth(method="lm",formula = y ~ x, se=F) +
ggrepel::geom_text_repel(aes(label=st), size = 3) +
labs(x="Net Partisanship (Democratic PID)",y="Minimum Wage",
title="State Partisanship vs. Minimum Wage", subtitle="Faceted by Year (1983 & 2011)") +
scale_x_continuous(labels = scales::percent) +
facet_wrap(~year)+
theme_bw()
## Warning: ggrepel: 28 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
## Warning: ggrepel: 34 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
We can further enhance this by combining faceting with aesthetics. For instance, in addition to faceting by year, we can map the region.name variable to the color aesthetic for the text labels. This will show us if the relationship between partisanship and ideology differs by region within each year. The smoothed line will represent the overall trend for all states within that year’s facet.
# Combining faceting by year with color aesthetic for region (on text only)
ggplot(states %>% filter(year %in% c("1976","2011")),
aes(x=pid, y=ideo)) +
geom_smooth(method="lm", formula = y ~ x, se=F) +
ggrepel::geom_text_repel(aes(label=st, color=region.name), size = 3) +
labs(x="Net Partisanship (Democratic PID)",y="Net Ideology (Liberalism)",
title="State Partisanship vs. Ideology by Region",
subtitle="Faceted by Year (1976 & 2011), Text Colored by Region",
color = "Region") +
scale_x_continuous(labels = scales::percent) +
scale_y_continuous(labels = scales::percent) +
facet_wrap(~year)+
theme_bw()
## Warning: ggrepel: 3 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
## Warning: ggrepel: 11 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
This combined approach allows for a richer comparison, showing both year-over-year changes (via facets) and regional differences within those years (via text color), with a single trend line per year.
By using faceting, we can effectively compare these relationships across multiple years, with each year presented clearly in its own panel. This makes it easier to identify changes in trends or patterns over time.
While facet_wrap() is excellent for faceting by a single variable, ggplot2 also offers facet_grid(). facet_grid() creates a 2D grid of panels defined by one or two categorical variables, with rows corresponding to the levels of one variable and columns to the levels of another (e.g., facet_grid(var1 ~ var2)). This can be very useful for more structured comparisons across two dimensions.
Let’s switch from facet_wrap() to facet_grid()
First let’s look at a new data set, the 2016 General Social Survey (GSS). Notice how many of the variables are now categorical, as opposed to continuous in the Gapminder data.
head(gss_sm,n=20)
## # A tibble: 20 × 32
## year id ballot age childs sibs degree race sex region income16
## <dbl> <dbl> <labelled> <dbl> <dbl> <labe> <fct> <fct> <fct> <fct> <fct>
## 1 2016 1 1 47 3 2 Bache… White Male New E… $170000…
## 2 2016 2 2 61 0 3 High … White Male New E… $50000 …
## 3 2016 3 3 72 2 3 Bache… White Male New E… $75000 …
## 4 2016 4 1 43 4 3 High … White Fema… New E… $170000…
## 5 2016 5 3 55 2 2 Gradu… White Fema… New E… $170000…
## 6 2016 6 2 53 2 2 Junio… White Fema… New E… $60000 …
## 7 2016 7 1 50 2 2 High … White Male New E… $170000…
## 8 2016 8 3 23 3 6 High … Other Fema… Middl… $30000 …
## 9 2016 9 1 45 3 5 High … Black Male Middl… $60000 …
## 10 2016 10 3 71 4 1 Junio… White Male Middl… $60000 …
## 11 2016 11 2 33 5 4 High … Black Fema… Middl… under $…
## 12 2016 12 1 86 4 4 High … White Fema… Middl… under $…
## 13 2016 13 2 32 3 3 High … Black Male Middl… $8 000 …
## 14 2016 14 3 60 5 6 High … Black Fema… Middl… $12500 …
## 15 2016 15 2 76 7 0 Lt Hi… White Male New E… $40000 …
## 16 2016 16 3 33 2 1 High … White Fema… New E… $50000 …
## 17 2016 17 3 56 6 3 High … White Male New E… $50000 …
## 18 2016 18 2 62 5 8 Lt Hi… Other Fema… New E… $5 000 …
## 19 2016 19 2 31 0 2 Gradu… Black Male New E… $35000 …
## 20 2016 20 1 43 2 0 High … Black Male New E… $25000 …
## # ℹ 21 more variables: relig <fct>, marital <fct>, padeg <fct>, madeg <fct>,
## # partyid <fct>, polviews <fct>, happy <fct>, partners <fct>, grass <fct>,
## # zodiac <fct>, pres12 <labelled>, wtssall <dbl>, income_rc <fct>,
## # agegrp <fct>, ageq <fct>, siblings <fct>, kids <fct>, religion <fct>,
## # bigregion <fct>, partners_rc <fct>, obama <dbl>
Let’s create a scatterplot of the relationship between respondent age and number of children. The small multiples we are doing are a cross-classification between bigregion and degree. We add a smoother and an alpha for the actual points.
R’s formula notation is used for the cross-classification (bigregion ~ degree): the variable on the left defines the rows and the variable on the right defines the columns.
ggplot drops observations with missing values, with a warning, when it encounters them. Here we subset the data beforehand to remove them explicitly, so no such warning appears.
p <- ggplot(data = subset(gss_sm,!is.na(age) & !is.na(childs) & !is.na(bigregion) & !is.na(degree)),
mapping = aes(x = age, y = childs))
p + geom_point(alpha = 0.2) +
geom_smooth(method="lm", formula = y ~ x) +
facet_grid(bigregion ~ degree) +
theme_bw()
We can reverse the order of the cross-classification.
p + geom_point(alpha = 0.2) +
geom_smooth(method="lm",formula = y ~ x) +
facet_grid(degree ~ bigregion)
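facet_grid() also works with a single variable: a dot on one side of the formula means "no faceting in this direction". The sketch below uses the built-in mtcars data, not the GSS, so it runs without extra packages; `cars`, `p_cols`, and `p_rows` are illustrative names.

```r
library(ggplot2)

# Convert the grouping variables to factors; in mtcars, am is 0 = automatic, 1 = manual
cars <- transform(
  mtcars,
  cyl = factor(cyl),
  am  = factor(am, labels = c("automatic", "manual"))
)

base_plot <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point(alpha = 0.6)

# One row of panels, one column per cylinder count;
# label_both prints "cyl: 4" in the strip instead of just "4"
p_cols <- base_plot + facet_grid(. ~ cyl, labeller = label_both)

# One column of panels, one row per transmission type
p_rows <- base_plot + facet_grid(am ~ .)

p_cols
p_rows
```

Unlike facet_wrap(), this never wraps panels onto a new line, so it suits variables with only a few levels.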
Both mapping to aesthetics (like color or shape) and faceting are powerful ways to show data for different subgroups. How do you choose which one to use?
Generally, use aesthetics like color or shape when you have a relatively small number of groups (e.g., 2-5) and want to directly compare them on the same set of axes. This is effective for highlighting group differences within a single, unified plot, such as comparing the slopes of regression lines.
Conversely, use faceting when dealing with a larger number of groups where different colors or shapes would become confusing, or when each group needs its own dedicated space for clarity. This is the idea of “small multiples,” useful for presenting the same type of plot for many segments of your data consistently, especially if overplotting is an issue in a single panel.
Sometimes, you can even combine both strategies: use facets to separate by one variable (e.g., different years) and then use aesthetics like color or shape to distinguish subgroups within each facet (e.g., different regions within each year). We’ll see an example of this in the Case Study section.
patchwork
While ggplot2’s faceting system is powerful for creating small multiples of similar plots, sometimes you need more flexibility to combine different types of plots or arrange them in more complex layouts. This is where the patchwork package comes in. patchwork provides an intuitive way to combine separate ggplot objects into a single figure using arithmetic operators.
Using patchwork is beneficial because it allows you to combine disparate plots, such as placing a scatter plot next to a bar chart, which isn’t directly achievable with faceting. It enables complex layouts, including nested arrangements and control over relative widths/heights, and you can add overall annotations. Importantly, patchwork maintains ggplot objects; you create individual plots as usual, and patchwork handles the assembly.
Let’s create two different plots using the gapminder dataset and then combine them.
Prepare data for 2007
# Filter data for 2007
data_2007 <- gapminder %>%
filter(year == 2007)
# Create summarized data for continent averages
avg_life_exp_data <- data_2007 %>%
group_by(continent) %>%
summarise(avg_lifeExp = mean(lifeExp), .groups = 'drop')
Plot 1: Scatter plot of GDP vs. Life Expectancy for Asia in 2007
plot1_asia_scatter <- ggplot(data = data_2007 %>% filter(continent == "Asia"),aes(x = gdpPercap, y = lifeExp)) +
geom_point(aes(size = pop), color = "steelblue", alpha = 0.7) +
ggrepel::geom_text_repel(aes(label = country), size = 3) +
scale_x_log10(labels = scales::dollar_format(accuracy = 1)) +
scale_size_continuous(name = "Population", labels = scales::comma) +
labs(title = "Asia: GDP vs. Life Expectancy (2007)",
x = "GDP Per Capita (log scale)",
y = "Life Expectancy") +
theme_minimal()
plot1_asia_scatter # Optionally print individual plot
## Warning: ggrepel: 1 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
Plot 2: Bar chart of average Life Expectancy by Continent in 2007
plot2_avg_life_exp_bar <- avg_life_exp_data %>%
ggplot(aes(x = reorder(continent, avg_lifeExp), y = avg_lifeExp, fill = continent)) +
geom_col(show.legend = FALSE) +
coord_flip() +
labs(title = "Average Life Expectancy by Continent (2007)",
x = "Continent",
y = "Average Life Expectancy (Years)") +
theme_minimal()
# print(plot2_avg_life_exp_bar) # Optionally print individual plot
Now, let’s combine these two plots using patchwork.
- plot1 + plot2 will place them side-by-side.
- plot1 / plot2 will place plot1 above plot2.
# Combine plots side-by-side
plot1_asia_scatter + plot2_avg_life_exp_bar
## Warning: ggrepel: 3 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
We can also arrange them vertically:
# Combine plots vertically
plot1_asia_scatter / plot2_avg_life_exp_bar
## Warning: ggrepel: 4 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
patchwork also allows for more complex arrangements. For instance, (plot1 | plot2) is another way to place plots side-by-side, and you can control layouts with plot_layout() for things like number of columns/rows, or relative widths/heights. You can also add overall titles and annotations to the assembled patchwork using plot_annotation().
Example with plot_layout and plot_annotation:
# More complex layout: plot1 takes up more space
plot1_asia_scatter + plot2_avg_life_exp_bar +
plot_layout(widths = c(2, 1)) +
plot_annotation(
title = "Combined Gapminder Insights (2007)",
caption = "Data source: Gapminder package"
)
## Warning: ggrepel: 3 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
# Ensure the plot object is the last thing to be evaluated in the chunk
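Nested arrangements combine the | and / operators with parentheses. The following sketch places two plots side by side above a third, wider plot. It uses the built-in mtcars data so it runs without Gapminder, and the plot names `p_scatter`, `p_hist`, and `p_box` are illustrative.

```r
library(ggplot2)
library(patchwork)

# Three simple plots built from the built-in mtcars data
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)

p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(x = "cyl")

# Top row: scatter and histogram side by side; bottom row: boxplot spanning the width
combined <- (p_scatter | p_hist) / p_box +
  plot_layout(heights = c(1, 2)) +        # bottom row twice as tall as the top row
  plot_annotation(tag_levels = "A")       # tag the panels A, B, C

combined
```

The parentheses matter: they group the first two plots into a single row before that row is stacked on top of the third plot.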