\(\underline{\textbf{Chapter Description}}\)
Facets let you split plots into multiple panes, each displaying subsets of the dataset.
Here you’ll learn how to wrap facets and arrange them in a grid, as well as how to provide custom labels.
library(tidyverse)
# Tools for Working with Categorical Variables (Factors)
library(forcats)
# Used to load the Vocab dataset
library(carData)
# Modified version of mtcars
mtcars <- read.csv("~/Desktop/R/Datacamp/Data Visualization/Datasets/mtcars.csv", stringsAsFactors=FALSE)
mtcars <- mtcars %>%
mutate(fam = as.factor(am), fcyl = as.factor(cyl), car = model, fvs = as.factor(vs)) %>%
mutate(fcyl_fam = interaction(fcyl, fam, sep=":"))
# Used in multiple sections in this chapter
mtcars
model mpg cyl disp hp drat wt qsec vs am gear carb fcyl
1 Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 6
2 Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 6
3 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 4
4 Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 6
5 Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 8
6 Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 6
7 Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 8
8 Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 4
9 Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 4
10 Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 6
11 Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 6
12 Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 8
13 Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 8
14 Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 8
15 Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 8
16 Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 8
17 Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 8
18 Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 4
19 Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 4
20 Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 4
21 Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 4
22 Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 8
23 AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 8
24 Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 8
25 Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 8
26 Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 4
27 Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 4
28 Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 4
29 Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 8
30 Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 6
31 Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 8
32 Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 4
fam car fvs fcyl_fam
1 1 Mazda RX4 0 6:1
2 1 Mazda RX4 Wag 0 6:1
3 1 Datsun 710 1 4:1
4 0 Hornet 4 Drive 1 6:0
5 0 Hornet Sportabout 0 8:0
6 0 Valiant 1 6:0
7 0 Duster 360 0 8:0
8 0 Merc 240D 1 4:0
9 0 Merc 230 1 4:0
10 0 Merc 280 1 6:0
11 0 Merc 280C 1 6:0
12 0 Merc 450SE 0 8:0
13 0 Merc 450SL 0 8:0
14 0 Merc 450SLC 0 8:0
15 0 Cadillac Fleetwood 0 8:0
16 0 Lincoln Continental 0 8:0
17 0 Chrysler Imperial 0 8:0
18 1 Fiat 128 1 4:1
19 1 Honda Civic 1 4:1
20 1 Toyota Corolla 1 4:1
21 0 Toyota Corona 1 4:0
22 0 Dodge Challenger 0 8:0
23 0 AMC Javelin 0 8:0
24 0 Camaro Z28 0 8:0
25 0 Pontiac Firebird 0 8:0
26 1 Fiat X1-9 1 4:1
27 1 Porsche 914-2 0 4:1
28 1 Lotus Europa 1 4:1
29 1 Ford Pantera L 0 8:1
30 1 Ferrari Dino 0 6:1
31 1 Maserati Bora 0 8:1
32 1 Volvo 142E 1 4:1
# Used in Section "Facet Wrap and Margins", Sub-Section "Wrapping for Many Levels"
data("Vocab")
Lecture Slides 1-12
Faceting splits the data up into groups, according to a categorical
variable, then plots each group in its own panel. For splitting the data
by one or two categorical variables, facet_grid() is
best.
Given categorical variables A and B, the
code pattern is
plot +
facet_grid(rows = vars(A), cols = vars(B))
This draws a panel for each pairwise combination of the values of
A and B.
Here, we’ll use the mtcars data set to practice.
Although cyl and am are not encoded as factor
variables in the data set, ggplot2 will coerce variables to
factors when used in facets.
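As a rough check on the "one panel per pairwise combination" rule, the sketch below counts the panels to expect before plotting. It assumes the built-in `datasets::mtcars` stands in for the modified CSV loaded in these notes; the names `n_rows`, `n_cols`, and `n_panels` are illustrative only.

```r
library(ggplot2)

# Assumption: built-in mtcars, where am and cyl are plain numeric columns
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # ggplot2 treats am and cyl as categorical when faceting
  facet_grid(rows = vars(am), cols = vars(cyl))

# facet_grid() lays out one panel per (row level, column level) pair
n_rows   <- length(unique(mtcars$am))   # distinct am values
n_cols   <- length(unique(mtcars$cyl))  # distinct cyl values
n_panels <- n_rows * n_cols

n_panels
```

Printing `p` draws the grid; the count is simply the product of the number of distinct values in each faceting variable.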
Facet the plot in a grid, with each am value in its own row.
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet rows by am
facet_grid(rows = vars(am))
Facet the plot in a grid, with each cyl value in its own column.
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet columns by cyl
facet_grid(cols = vars(cyl))
Facet the plot in a grid, with each am value in its own row and each cyl value in its own column.
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet rows by am and columns by cyl
facet_grid(rows = vars(am), cols = vars(cyl))
In addition to aesthetics, facets are another way of encoding factor (i.e. categorical) variables. They can be used to reduce the complexity of plots with many variables.
Our goal is a single plot that contains 7 variables.
Two variables are mapped onto the color aesthetic, using hue and
lightness. To achieve this we combined fcyl and
fam into a single interaction
variable, fcyl_fam. This will allow us to take advantage of
Color Brewer’s Paired color palette.
Map fcyl_fam onto the color
aesthetic.
Add a scale_color_brewer() layer and set
"Paired" as the palette.
# See the interaction column
mtcars$fcyl_fam
[1] 6:1 6:1 4:1 6:0 8:0 6:0 8:0 4:0 4:0 6:0 6:0 8:0 8:0 8:0 8:0 8:0 8:0 4:1 4:1
[20] 4:1 4:0 8:0 8:0 8:0 8:0 4:1 4:1 4:1 8:1 6:1 8:1 4:1
Levels: 4:0 6:0 8:0 4:1 6:1 8:1
# Color the points by fcyl_fam
ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl_fam)) +
geom_point() +
# Use a paired color palette
scale_color_brewer(palette = "Paired")
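The level order of an interaction decides which Paired colors end up next to each other. The sketch below, which assumes the built-in `datasets::mtcars` and uses illustrative variable names, compares the default level order (the one printed above) with `lex.order = TRUE`. The second variant is one possible adjustment, not something these notes do: it places the two transmission types of each cylinder count on consecutive levels, so each light/dark pair of the palette would cover one cylinder count.

```r
# Assumption: built-in mtcars; fcyl and fam are rebuilt here for illustration
fcyl <- factor(mtcars$cyl)
fam  <- factor(mtcars$am)

# Default: the first factor (cyl) varies fastest across the levels
default_order <- interaction(fcyl, fam, sep = ":")
levels(default_order)

# lex.order = TRUE: the last factor (am) varies fastest instead,
# so the levels for one cylinder count sit side by side
paired_order <- interaction(fcyl, fam, sep = ":", lex.order = TRUE)
levels(paired_order)
```

Because scale_color_brewer() assigns palette colors in level order, swapping in `paired_order` changes only which combination receives which color, not the data being plotted.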
Map disp, the engine displacement, onto the size aesthetic.
# Update the plot to map disp to size
ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl_fam, size = disp)) +
geom_point() +
scale_color_brewer(palette = "Paired")
Add a facet_grid() layer, faceting the plot according to gear on rows and vs on columns.
# Update the plot
ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl_fam, size = disp)) +
geom_point() +
scale_color_brewer(palette = "Paired") +
# Grid facet on gear and vs
facet_grid(rows = vars(gear), cols = vars(vs))
The last plot you’ve created contains 7 variables (4 categorical, 3 continuous).
Useful combinations of aesthetics and facets help to achieve this.
As well as the vars() notation for specifying which
variables should be used to split the dataset into facets, there is also
a traditional formula notation. The three cases are shown in the
table.
| Modern notation | Formula notation |
|---|---|
| facet_grid(rows = vars(A)) | facet_grid(A ~ .) |
| facet_grid(cols = vars(B)) | facet_grid(. ~ B) |
| facet_grid(rows = vars(A), cols = vars(B)) | facet_grid(A ~ B) |
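To see that the two notations describe the same layout, the sketch below builds one plot with each and compares the variable names stored in the facet specification. It assumes the built-in `datasets::mtcars`. The `$params$rows` and `$params$cols` fields are internal details of the facet object and may differ between ggplot2 versions, so treat the comparison as illustrative rather than as a supported API.

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Same faceting, written in each notation
f_vars    <- facet_grid(rows = vars(am), cols = vars(cyl))
f_formula <- facet_grid(am ~ cyl)

p_vars    <- base_plot + f_vars
p_formula <- base_plot + f_formula

# Compare the row and column variable names recorded by each specification
# (internal fields; an assumption about the current ggplot2 implementation)
rows_match <- identical(names(f_vars$params$rows), names(f_formula$params$rows))
cols_match <- identical(names(f_vars$params$cols), names(f_formula$params$cols))
```

Printing `p_vars` and `p_formula` side by side is the more direct check: the panels should be arranged identically.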
Rework the previous plots, this time using formula notation.
Facet the plot in a grid, with each am value in its own row.
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet rows by am using formula notation
facet_grid(am ~ .)
Facet the plot in a grid, with each cyl value in its own column.
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet columns by cyl using formula notation
facet_grid(. ~ cyl)
Facet the plot in a grid, with each am value in its own row and each cyl value in its own column.
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet rows by am and columns by cyl using formula notation
facet_grid(am ~ cyl)
Although many ggplots still use the traditional formula notation, using vars() is now preferred.
Lecture Slides 13-31
If your factor levels are not clear, your facet labels may be
confusing. You can assign proper labels in your original data
before plotting (see next exercise), or you can use the
labeller argument in the facet layer.
The default value is:
label_value: Default, displays only the value
Common alternatives are:
label_both: Displays both the value and the variable name
label_context: Displays only the values or both the values and variables depending on whether multiple factors are faceted
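The labeller functions can also be called directly on a data frame of facet values, which makes it easy to preview the strip text without drawing a plot. This is a minimal sketch; `facet_values`, `value_labels`, and `both_labels` are illustrative names.

```r
library(ggplot2)

# A one-column data frame of facet values, in the shape a labeller receives
facet_values <- data.frame(cyl = c(4, 6, 8))

# label_value keeps only the value
value_labels <- unname(unlist(label_value(facet_values)))
value_labels

# label_both prefixes each value with the variable name
both_labels <- unname(unlist(label_both(facet_values)))
both_labels
```

In a plot, the same functions are passed unevaluated, for example `labeller = label_both`.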
Add a facet_grid() layer and facet columns by cyl using vars(). No labeller is specified, so the default applies.
# Plot mpg against wt
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# The default is label_value
facet_grid(cols = vars(cyl))
Pass label_both to the labeller argument and check the output.
# Plot mpg against wt
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Displaying both the values and the variables
facet_grid(cols = vars(cyl), labeller = label_both)
Pass label_context to the labeller argument and check the output.
# Plot mpg against wt
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Label context
facet_grid(cols = vars(cyl), labeller = label_context)
To see how label_context behaves with multiple factors, let’s facet by one more variable: vs.
# Plot mpg against wt
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Two variables
facet_grid(cols = vars(vs, cyl), labeller = label_context)
If you want to change the order of your facets, it’s best to properly define your factor variables before plotting.
Let’s see this in action with the mtcars transmission
variable am. In this case, 0 = "automatic" and
1 = "manual".
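Here, we’ll make am a factor variable and relabel the
numbers to proper names. By default the levels follow the sorted order of the
underlying values (0, then 1), which for these labels is also alphabetical.
To rearrange them we can set the levels argument explicitly, or call fct_rev()
from the forcats package to reverse the order.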
Explicitly label the 0 and 1 values of the am column as "automatic" and "manual", respectively.
# Make factor, set proper labels explicitly
mtcars$fam <- factor(mtcars$am,
labels = c(`0` = "automatic", `1` = "manual"))
# Default order is alphabetical
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
facet_grid(cols = vars(fam))
Set the order of the factor explicitly with the levels and labels arguments. Recall that 1 is "manual" and 0 is "automatic".
# Make factor, set proper labels explicitly, and
# manually set the label order
mtcars$fam <- factor(mtcars$am,
levels = c(1, 0),
labels = c("manual", "automatic"))
# View again
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
facet_grid(cols = vars(fam))
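The text mentions fct_rev() as another way to reverse the facet order. A minimal sketch of that route, assuming the built-in `datasets::mtcars` and using illustrative names `fam` and `fam_rev`:

```r
library(forcats)

# Label the transmission type; levels follow the order 0, 1
fam <- factor(mtcars$am, levels = c(0, 1), labels = c("automatic", "manual"))
levels(fam)

# Reverse the level order without changing any of the values
fam_rev <- fct_rev(fam)
levels(fam_rev)
```

Faceting by `fam_rev` would draw the "manual" panel first, the same result as setting `levels = c(1, 0)` by hand.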
Lecture Slides 32-43
By default every facet of a plot has the same axes. If the data ranges vary wildly between facets, it can be clearer if each facet has its own scale.
This is achieved with the scales argument to
facet_grid().
"fixed" (default): axes are shared between facets.
"free": each facet has its own axes.
"free_x": each facet has its own x-axis, but the y-axis is shared.
"free_y": each facet has its own y-axis, but the x-axis is shared.
When faceting by columns, "free_y" has no effect, but we
can adjust the x-axis. In contrast, when faceting by rows,
"free_x" has no effect, but we can adjust the
y-axis.
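Whether freeing an axis is worthwhile depends on how different the per-facet data ranges are. The sketch below, assuming the built-in `datasets::mtcars` and an illustrative name `wt_range_by_cyl`, computes the range of wt within each cyl group; these are the x-axis limits that `scales = "free_x"` would adapt to when faceting columns by cyl.

```r
library(ggplot2)

# Range of the x variable (wt) within each facet group (cyl)
wt_range_by_cyl <- sapply(split(mtcars$wt, mtcars$cyl), range)
rownames(wt_range_by_cyl) <- c("min", "max")
wt_range_by_cyl

# With free x scales, each column's axis covers only its own group's range
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(cols = vars(cyl), scales = "free_x")
```

The 8-cylinder cars span a noticeably heavier range than the 4-cylinder cars, which is why a shared x-axis leaves much of each panel empty.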
Facet the plot by columns according to cyl.
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Facet columns by cyl
facet_grid(cols = vars(cyl))
Free the x-axis scales.
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Update the faceting to free the x-axis scales
facet_grid(cols = vars(cyl), scales = "free_x")
Facet rows by cyl (rather than
columns).
Free the y-axis scales (instead of
x).
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
# Swap cols for rows; free the y-axis scales
facet_grid(rows = vars(cyl), scales = "free_y")
Shared scales make it easy to compare between facets, but can be confusing if the data ranges are very different.
When you have a categorical variable with many levels which are not all present in each sub-group of another variable, it’s usually desirable to drop the unused levels.
By default, each facet of a plot is the same size. This behavior can
be changed with the space argument, which works in the
same way as scales:
"free_x" allows different sized facets on the x-axis
"free_y" allows different sized facets on the y-axis
"free" allows different sizes in both directions.
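The effect of space is easiest to anticipate by counting how many categories each facet has to show. The sketch below assumes the built-in `datasets::mtcars`, whose car names are stored as row names, so it builds a small illustrative data frame `cars` instead of using the `car` column from the modified CSV.

```r
library(ggplot2)

# Assumption: built-in mtcars; car names come from the row names
cars <- data.frame(
  car  = rownames(mtcars),
  mpg  = mtcars$mpg,
  gear = mtcars$gear
)

# Number of cars, and therefore y-axis categories, in each gear facet
cars_per_gear <- table(cars$gear)
cars_per_gear

# Free y scales drop cars absent from a facet; free y space sizes each
# panel according to the categories that remain
p <- ggplot(cars, aes(x = mpg, y = car)) +
  geom_point() +
  facet_grid(rows = vars(gear), scales = "free_y", space = "free_y")
```

With both arguments freed, the 5-gear panel, which holds the fewest cars, is drawn shorter than the 3-gear panel rather than being stretched to equal height.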
Facet the plot by rows according to gear using
vars().
ggplot(mtcars, aes(x = mpg, y = car, color = fam)) +
geom_point() +
# Facet rows by gear
facet_grid(rows = vars(gear)) # This looks messy
Set both the scales and space arguments in facet_grid() to "free_y".
ggplot(mtcars, aes(x = mpg, y = car, color = fam)) +
geom_point() +
# Free the y scales and space
facet_grid(rows = vars(gear), scales = "free_y", space = "free_y") # Looks much cleaner now
Freeing the y-scale to remove blank lines helps focus attention on the actual data present.
Lecture Slides 44-53
facet_grid() is fantastic for categorical variables with
a small number of levels. Although it is possible to facet variables
with many levels, the resulting plot will be very wide or very tall,
which can make it difficult to view.
The solution is to use facet_wrap() which separates
levels along one axis but wraps all the subsets across a given number of
rows or columns.
For this plot, we’ll use the Vocab dataset that we’ve
already seen. The base layer is provided.
Since we have many years, it doesn’t make sense to use
facet_grid(), so let’s try facet_wrap()
instead.
Add a facet_wrap() layer and specify the year variable with an argument using the vars() function.
ggplot(Vocab, aes(x = education, y = vocabulary)) +
stat_smooth(method = "lm", se = FALSE) +
# Create facets, wrapping by year, using vars()
facet_wrap(vars(year))
Add a facet_wrap() layer and specify the year variable with a formula notation (~).
ggplot(Vocab, aes(x = education, y = vocabulary)) +
stat_smooth(method = "lm", se = FALSE) +
# Create facets, wrapping by year, using a formula
facet_wrap(~ year)
Add a facet_wrap() layer and specify the year variable in formula notation, with ncol set to 11.
ggplot(Vocab, aes(x = education, y = vocabulary)) +
stat_smooth(method = "lm", se = FALSE) +
# Update the facet layout, using 11 columns
facet_wrap(~ year, ncol = 11)
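When only ncol is given, the number of rows follows from the number of panels. The sketch below uses ggplot2's wrap_dims() helper to work out the resulting grid. To stay self-contained it assumes the `mpg` dataset that ships with ggplot2 rather than Vocab, and the choice of 3 columns is arbitrary.

```r
library(ggplot2)

# Assumption: ggplot2's bundled mpg data stands in for Vocab
n_panels <- length(unique(mpg$class))  # one panel per vehicle class

# Grid dimensions (rows, columns) for that many panels in 3 columns
dims <- wrap_dims(n_panels, ncol = 3)
dims

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  # Wrap the panels into 3 columns; rows are added as needed
  facet_wrap(vars(class), ncol = 3)
```

The same arithmetic applies to the Vocab plot: the row count is the number of distinct years divided by ncol, rounded up.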
Facets are great for seeing subsets in a variable, but sometimes you want to see both those subsets and all values in a variable.
Here, the margins argument to facet_grid()
is your friend.
FALSE (default): no margins.
TRUE: add margins to every variable being faceted
by.
c("variable1", "variable2"): only add margins to the
variables listed.
To make it easier to follow the facets, we’ve created two factor
variables with proper labels: fam for the transmission type and fvs for the
engine type.
Zoom the graphics window to better view your plots.
Facet rows by fvs and fam, and columns by gear.
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
# Facet rows by fvs and fam, and cols by gear
facet_grid(rows = vars(fvs, fam), cols = vars(gear))
Add all possible margins to the plot.
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
# Update the facets to add margins
facet_grid(rows = vars(fvs, fam), cols = vars(gear), margins = TRUE)
Update the facets to only show margins on "fam".
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
# Update the facets to only show margins on fam
facet_grid(rows = vars(fvs, fam), cols = vars(gear), margins = "fam")
Update the facets to only show margins on "gear" and "fvs".
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
# Update the facets to only show margins on gear and fvs
facet_grid(rows = vars(fvs, fam), cols = vars(gear), margins = c("gear", "fvs"))