Chapter Description

Facets let you split plots into multiple panes, each displaying subsets of the dataset.
Here you’ll learn how to wrap facets, arrange them in a grid, and provide custom labels.

library(tidyverse)

# Tools for Working with Categorical Variables (Factors)
library(forcats)

# Used to load the Vocab dataset
library(carData)
# Modified version of mtcars
mtcars <- read.csv("~/Desktop/R/Datacamp/Data Visualization/Datasets/mtcars.csv", stringsAsFactors=FALSE)
mtcars <- mtcars %>% 
  mutate(fam = as.factor(am), fcyl = as.factor(cyl), car = model, fvs = as.factor(vs)) %>% 
  mutate(fcyl_fam = interaction(fcyl, fam, sep=":"))

# Used in multiple sections in this chapter
mtcars
                 model  mpg cyl  disp  hp drat    wt  qsec vs am gear carb fcyl
1            Mazda RX4 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4    6
2        Mazda RX4 Wag 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4    6
3           Datsun 710 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1    4
4       Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1    6
5    Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2    8
6              Valiant 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1    6
7           Duster 360 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4    8
8            Merc 240D 24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2    4
9             Merc 230 22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2    4
10            Merc 280 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4    6
11           Merc 280C 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4    6
12          Merc 450SE 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3    8
13          Merc 450SL 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3    8
14         Merc 450SLC 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3    8
15  Cadillac Fleetwood 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4    8
16 Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4    8
17   Chrysler Imperial 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4    8
18            Fiat 128 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1    4
19         Honda Civic 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2    4
20      Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1    4
21       Toyota Corona 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1    4
22    Dodge Challenger 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2    8
23         AMC Javelin 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2    8
24          Camaro Z28 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4    8
25    Pontiac Firebird 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2    8
26           Fiat X1-9 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1    4
27       Porsche 914-2 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2    4
28        Lotus Europa 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2    4
29      Ford Pantera L 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4    8
30        Ferrari Dino 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6    6
31       Maserati Bora 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8    8
32          Volvo 142E 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2    4
   fam                 car fvs fcyl_fam
1    1           Mazda RX4   0      6:1
2    1       Mazda RX4 Wag   0      6:1
3    1          Datsun 710   1      4:1
4    0      Hornet 4 Drive   1      6:0
5    0   Hornet Sportabout   0      8:0
6    0             Valiant   1      6:0
7    0          Duster 360   0      8:0
8    0           Merc 240D   1      4:0
9    0            Merc 230   1      4:0
10   0            Merc 280   1      6:0
11   0           Merc 280C   1      6:0
12   0          Merc 450SE   0      8:0
13   0          Merc 450SL   0      8:0
14   0         Merc 450SLC   0      8:0
15   0  Cadillac Fleetwood   0      8:0
16   0 Lincoln Continental   0      8:0
17   0   Chrysler Imperial   0      8:0
18   1            Fiat 128   1      4:1
19   1         Honda Civic   1      4:1
20   1      Toyota Corolla   1      4:1
21   0       Toyota Corona   1      4:0
22   0    Dodge Challenger   0      8:0
23   0         AMC Javelin   0      8:0
24   0          Camaro Z28   0      8:0
25   0    Pontiac Firebird   0      8:0
26   1           Fiat X1-9   1      4:1
27   1       Porsche 914-2   0      4:1
28   1        Lotus Europa   1      4:1
29   1      Ford Pantera L   0      8:1
30   1        Ferrari Dino   0      6:1
31   1       Maserati Bora   0      8:1
32   1          Volvo 142E   1      4:1
# Used in Section "Facet Wrap and Margins", Sub-Section "Wrapping for Many Levels"
data("Vocab")

The Facet Layer

Lecture Slides 1-12


Facet Layer Basics

Faceting splits the data up into groups, according to a categorical variable, then plots each group in its own panel. For splitting the data by one or two categorical variables, facet_grid() is best.

Given categorical variables A and B, the code pattern is

plot +
  facet_grid(rows = vars(A), cols = vars(B))

This draws a panel for each pairwise combination of the values of A and B.

Here, we’ll use the mtcars data set to practice. Although cyl and am are not encoded as factor variables in the data set, ggplot2 will coerce variables to factors when used in facets.
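
As a quick check of the "one panel per combination" rule, the sketch below counts the panels that ggplot2 lays out. It assumes the built-in datasets::mtcars rather than the modified CSV loaded above, and it reads the panel layout table of a built plot through ggplot_build(); the names p and panel_layout are only illustrative.

```r
library(ggplot2)

# Assumption: built-in mtcars, not the modified CSV used in this chapter
p <- ggplot(datasets::mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(rows = vars(am), cols = vars(cyl))

# One row per panel, with its ROW and COL position in the grid
panel_layout <- ggplot_build(p)$layout$layout

nrow(panel_layout)      # 2 am values x 3 cyl values = 6 panels
max(panel_layout$ROW)   # 2 rows, one per am value
max(panel_layout$COL)   # 3 columns, one per cyl value
```

Note that am and cyl are plain numeric columns here; the facet layer still treats each distinct value as its own group.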

Exercise 1

  • Facet the plot in a grid, with each am value in its own row.
ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  
  # Facet rows by am
  facet_grid(rows = vars(am))

Exercise 2

  • Facet the plot in a grid, with each cyl value in its own column.
ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  
  # Facet columns by cyl
  facet_grid(cols = vars(cyl))

Exercise 3

  • Facet the plot in a grid, with each am value in its own row and each cyl value in its own column.
ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  
  # Facet rows by am and columns by cyl
  facet_grid(rows = vars(am), cols = vars(cyl))

Concluding Remarks

  • Compare the different plots that result and see which one makes most sense.



Many Variables

In addition to aesthetics, facets are another way of encoding factor (i.e. categorical) variables. They can be used to reduce the complexity of plots with many variables.

Our goal is a single plot that encodes 7 variables.

Two variables are mapped onto the color aesthetic, using hue and lightness. To achieve this we combined fcyl and fam into a single interaction variable, fcyl_fam. This will allow us to take advantage of Color Brewer’s Paired color palette.
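
To make the interaction variable concrete, here is a minimal sketch of how interaction() combines two factors into one. It assumes the built-in datasets::mtcars as a stand-in for the modified data set; the local names cars, fcyl, fam and fcyl_fam mirror the columns created at the top of this chapter.

```r
# Assumption: built-in mtcars stands in for the modified data set
cars <- datasets::mtcars
fcyl <- factor(cars$cyl)
fam  <- factor(cars$am)

# Every cyl/am combination becomes one level, such as "6:1"
fcyl_fam <- interaction(fcyl, fam, sep = ":")

levels(fcyl_fam)
# "4:0" "6:0" "8:0" "4:1" "6:1" "8:1"
```

The first factor varies fastest in the level order, which is why the three am = 0 levels come before the three am = 1 levels.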

Exercise 1

  • Map fcyl_fam onto the color aesthetic.

  • Add a scale_color_brewer() layer and set "Paired" as the palette.

# See the interaction column
mtcars$fcyl_fam
 [1] 6:1 6:1 4:1 6:0 8:0 6:0 8:0 4:0 4:0 6:0 6:0 8:0 8:0 8:0 8:0 8:0 8:0 4:1 4:1
[20] 4:1 4:0 8:0 8:0 8:0 8:0 4:1 4:1 4:1 8:1 6:1 8:1 4:1
Levels: 4:0 6:0 8:0 4:1 6:1 8:1
# Color the points by fcyl_fam
ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl_fam)) +
  geom_point() +
  # Use a paired color palette
  scale_color_brewer(palette = "Paired")

Exercise 2

  • Map disp, the engine displacement (in cubic inches), onto the size aesthetic.
# Update the plot to map disp to size
ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl_fam, size = disp)) +
  geom_point() +
  scale_color_brewer(palette = "Paired")

Exercise 3

  • Add a facet_grid() layer, faceting the plot according to gear on rows and vs on columns.
# Update the plot
ggplot(mtcars, aes(x = wt, y = mpg, color = fcyl_fam, size = disp)) +
  geom_point() +
  scale_color_brewer(palette = "Paired") +
  # Grid facet on gear and vs
  facet_grid(rows = vars(gear), cols = vars(vs))

Concluding Remarks

  • The last plot you’ve created contains 7 variables (4 categorical, 3 continuous).

  • Useful combinations of aesthetics and facets help to achieve this.



Formula Notation

As well as the vars() notation for specifying which variables should be used to split the dataset into facets, there is also a traditional formula notation. The three cases are shown in the table.

Modern notation                               Formula notation
facet_grid(rows = vars(A))                    facet_grid(A ~ .)
facet_grid(cols = vars(B))                    facet_grid(. ~ B)
facet_grid(rows = vars(A), cols = vars(B))    facet_grid(A ~ B)
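
The equivalence of the two notations can be checked by comparing the grids they produce. This is a sketch under assumptions: it uses the built-in datasets::mtcars, and grid_dims() is a hypothetical helper written for this example, not part of ggplot2.

```r
library(ggplot2)

base_plot <- ggplot(datasets::mtcars, aes(wt, mpg)) + geom_point()

# Hypothetical helper: number of panel rows and columns in a built plot
grid_dims <- function(p) {
  lay <- ggplot_build(p)$layout$layout
  c(rows = max(lay$ROW), cols = max(lay$COL))
}

dims_vars    <- grid_dims(base_plot + facet_grid(rows = vars(am), cols = vars(cyl)))
dims_formula <- grid_dims(base_plot + facet_grid(am ~ cyl))

identical(dims_vars, dims_formula)  # TRUE: both give 2 rows by 3 columns
```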

Rework the plots from Facet Layer Basics, this time using formula notation.

Exercise 1

  • Facet the plot in a grid, with each am value in its own row.
ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  # Facet rows by am using formula notation
  facet_grid(am ~ .)

Exercise 2

  • Facet the plot in a grid, with each cyl value in its own column.
ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  # Facet columns by cyl using formula notation
  facet_grid(. ~ cyl)

Exercise 3

  • Facet the plot in a grid, with each am value in its own row and each cyl value in its own column.
ggplot(mtcars, aes(wt, mpg)) + 
  geom_point() +
  # Facet rows by am and columns by cyl using formula notation
  facet_grid(am ~ cyl)

Concluding Remarks

  • While many ggplots still use the traditional formula notation, using vars() is now preferred.



Facet Labels and Order

Lecture Slides 13-31


Labeling Facets

If your factor levels are not clear, your facet labels may be confusing. You can assign proper labels in your original data before plotting (see next exercise), or you can use the labeller argument in the facet layer.

The default value is

  • label_value: Default, displays only the value

Common alternatives are:

  • label_both: Displays both the value and the variable name

  • label_context: Displays only the values or both the values and variables depending on whether multiple factors are faceted
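
Beyond these built-in labellers, fully custom strip text is possible. The sketch below uses ggplot2's labeller() function with a named character vector as a lookup table; the label wording in cyl_names is invented for illustration, and the data is the built-in datasets::mtcars rather than the modified CSV.

```r
library(ggplot2)

# Illustrative lookup table: facet value -> text shown in the strip
cyl_names <- c(`4` = "4 cylinders", `6` = "6 cylinders", `8` = "8 cylinders")

p <- ggplot(datasets::mtcars, aes(wt, mpg)) +
  geom_point() +
  # labeller() applies the lookup table to the cyl facet only
  facet_grid(cols = vars(cyl), labeller = labeller(cyl = cyl_names))

p
```

A lookup table keeps the underlying data untouched, which is convenient when the same column is also used numerically elsewhere.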

Exercise 1

  • Add a facet_grid() layer and facet columns by cyl using vars(). No labeller is specified, so the default applies.
# Plot wt by mpg
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # The default is label_value
  facet_grid(cols = vars(cyl))

Exercise 2

  • Apply label_both to the labeller argument and check the output.
# Plot wt by mpg
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  
  # Displaying both the values and the variables
  facet_grid(cols = vars(cyl), labeller = label_both)

Exercise 3

  • Apply label_context to the labeller argument and check the output.
# Plot wt by mpg
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  
  # Label context
  facet_grid(cols = vars(cyl), labeller = label_context)

Exercise 4

  • Keep label_context and facet by one more variable: vs.
# Plot wt by mpg
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  
  # Two variables
  facet_grid(cols = vars(vs, cyl), labeller = label_context)

Concluding Remarks

  • Make sure there is no ambiguity in interpreting plots by using proper labels.



Setting Order

If you want to change the order of your facets, it’s best to properly define your factor variables before plotting.

Let’s see this in action with the mtcars transmission variable am. In this case, 0 = "automatic" and 1 = "manual".

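Here, we’ll make am a factor variable and relabel the numbers with proper names. By default, the levels follow the sorted values of am (0, then 1), so "automatic" comes before "manual"; here that happens to match alphabetical order.
To rearrange them, either set the levels argument explicitly, as in Exercise 2, or call fct_rev() from the forcats package to reverse the order.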
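
Facet order follows factor level order, so it helps to inspect the levels directly. This is a small sketch, assuming the built-in datasets::mtcars; the names am, fam and fam_rev are local to the example.

```r
library(forcats)

am <- datasets::mtcars$am

# Levels are listed explicitly; labels are matched to them by position
fam <- factor(am, levels = c(0, 1), labels = c("automatic", "manual"))
levels(fam)       # "automatic" "manual"

# fct_rev() reverses the level order without changing any observation
fam_rev <- fct_rev(fam)
levels(fam_rev)   # "manual" "automatic"
```

Faceting by fam_rev instead of fam would therefore draw the "manual" panel first.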

Exercise 1

  • Explicitly label the 0 and 1 values of the am column as "automatic" and "manual", respectively.
# Make factor, set proper labels explicitly
# (labels are matched by position to the sorted levels 0, 1)
mtcars$fam <- factor(mtcars$am, 
                     labels = c(`0` = "automatic", `1` = "manual"))

# Default order follows the sorted values of am (0, then 1)
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(cols = vars(fam))

Exercise 2

  • Define a specific order using separate levels and labels arguments. Recall that 1 is "manual" and 0 is "automatic".
# Make factor, set proper labels explicitly, and
# manually set the level order
mtcars$fam <- factor(mtcars$am,
                     levels = c(1, 0),
                     labels = c("manual", "automatic"))

# View again
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(cols = vars(fam))

Concluding Remarks

  • Arrange your facets in an intuitive order for your data.



Facet Plotting Spaces

Lecture Slides 32-43


Variable Plotting Spaces Part 1: Continuous Variables

By default every facet of a plot has the same axes. If the data ranges vary wildly between facets, it can be clearer if each facet has its own scale.

This is achieved with the scales argument to facet_grid().

  • "fixed" (default): axes are shared between facets.

  • "free": each facet has its own axes.

  • "free_x": each facet has its own x-axis, but the y-axis is shared.

  • "free_y": each facet has its own y-axis, but the x-axis is shared.

facet_grid() keeps one y-axis per row of panels and one x-axis per column. So when faceting by columns only, "free_y" has no effect, but the x-axis can vary from panel to panel. Conversely, when faceting by rows only, "free_x" has no effect, but the y-axis can vary.
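
The effect of the scales argument can be checked by counting how many distinct position scales the panels use. This is a sketch under assumptions: it uses the built-in datasets::mtcars, n_scales() is a hypothetical helper, and it reads the SCALE_X and SCALE_Y columns of the layout table returned by ggplot_build().

```r
library(ggplot2)

base_plot <- ggplot(datasets::mtcars, aes(wt, mpg)) + geom_point()

# Hypothetical helper: count distinct scales along one direction
n_scales <- function(p, column) {
  length(unique(ggplot_build(p)$layout$layout[[column]]))
}

fixed  <- base_plot + facet_grid(cols = vars(cyl))
free_x <- base_plot + facet_grid(cols = vars(cyl), scales = "free_x")
free_y <- base_plot + facet_grid(cols = vars(cyl), scales = "free_y")

n_scales(fixed,  "SCALE_X")   # 1: all columns share one x scale
n_scales(free_x, "SCALE_X")   # 3: one x scale per column
n_scales(free_y, "SCALE_Y")   # 1: a single row of panels has only one y scale to free
```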

Exercise 1

  • Update the plot to facet columns by cyl.
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() + 
  # Facet columns by cyl 
  facet_grid(cols = vars(cyl))

Exercise 2

  • Update the faceting to free the x-axis scales.
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() + 
  # Update the faceting to free the x-axis scales
  facet_grid(cols = vars(cyl), scales = "free_x")

Exercise 3

  • Facet rows by cyl (rather than columns).

  • Free the y-axis scales (instead of x).

ggplot(mtcars, aes(wt, mpg)) +
  geom_point() + 
  # Swap cols for rows; free the y-axis scales
  facet_grid(rows = vars(cyl), scales = "free_y")

Concluding Remarks

  • Shared scales make it easy to compare between facets, but can be confusing if the data ranges are very different.

    • In that case, use free scales.



Variable Plotting Spaces Part 2: Categorical Variables

When you have a categorical variable with many levels which are not all present in each sub-group of another variable, it’s usually desirable to drop the unused levels.

By default, each facet of a plot is the same size. This behavior can be changed with the space argument, which works in the same way as scales:

  • "free_x" allows different sized facets on the x-axis

  • "free_y" allows different sized facets on the y-axis

  • "free" allows different sizes in both directions.
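
To see why free space matters for a per-car plot faceted by gear, it helps to count how many cars fall in each facet. The sketch assumes the built-in datasets::mtcars, where each row is one car; cars_per_gear is an illustrative name.

```r
# Assumption: built-in mtcars, where each row is one car
cars_per_gear <- table(datasets::mtcars$gear)
cars_per_gear
# gear 3: 15 cars, gear 4: 12 cars, gear 5: 5 cars

# Share of all 32 cars in each gear group
cars_per_gear / sum(cars_per_gear)
```

With equally sized facets, the 5 cars with gear 5 get as much room as the 15 cars with gear 3. Freeing both the y scale and the y space lets panel height roughly follow these counts.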

Exercise 1

  • Facet the plot by rows according to gear using vars().

    • Notice that every car is listed in every facet, resulting in many lines without data.
ggplot(mtcars, aes(x = mpg, y = car, color = fam)) +
  geom_point() +
  
  # Facet rows by gear
  facet_grid(rows = vars(gear))     # This looks messy

Exercise 2

  • To remove blank lines, set both the scales and space arguments in facet_grid() to "free_y".
ggplot(mtcars, aes(x = mpg, y = car, color = fam)) +
  geom_point() +
  
  # Free the y scales and space
  facet_grid(rows = vars(gear), scales = "free_y", space = "free_y")     # Looks much cleaner now

Concluding Remarks

  • Freeing the y-scale to remove blank lines helps focus attention on the actual data present.



Facet Wrap and Margins

Lecture Slides 44-53


Wrapping for Many Levels

facet_grid() is fantastic for categorical variables with a small number of levels. Although it is possible to facet variables with many levels, the resulting plot will be very wide or very tall, which can make it difficult to view.

The solution is to use facet_wrap() which separates levels along one axis but wraps all the subsets across a given number of rows or columns.

For this plot, we’ll use the Vocab dataset that we’ve already seen. The base layer is provided.

Since we have many years, it doesn’t make sense to use facet_grid(), so let’s try facet_wrap() instead.
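
Before turning to Vocab, the wrapping behavior can be seen on a smaller case. This sketch assumes the built-in datasets::mtcars and facets by carb, a column chosen here only because it has six distinct values; wrap_layout is an illustrative name.

```r
library(ggplot2)

# carb has 6 distinct values in the built-in mtcars (1, 2, 3, 4, 6, 8)
p <- ggplot(datasets::mtcars, aes(wt, mpg)) +
  geom_point() +
  # Wrap the 6 panels into 3 columns, which gives 2 rows
  facet_wrap(vars(carb), ncol = 3)

wrap_layout <- ggplot_build(p)$layout$layout
c(rows = max(wrap_layout$ROW), cols = max(wrap_layout$COL))
```

Unlike facet_grid(), the row and column positions carry no meaning of their own here; the panels simply fill the grid in order.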

Exercise 1

  • Add a facet_wrap() layer, specifying the year variable with the vars() function.
ggplot(Vocab, aes(x = education, y = vocabulary)) +
  stat_smooth(method = "lm", se = FALSE) +
  
  # Create facets, wrapping by year, using vars()
  facet_wrap(vars(year))

Exercise 2

  • Add a facet_wrap() layer and specify the year variable with a formula notation (~).
ggplot(Vocab, aes(x = education, y = vocabulary)) +
  stat_smooth(method = "lm", se = FALSE) +
  
  # Create facets, wrapping by year, using a formula
  facet_wrap(~ year)

Exercise 3

  • Add a facet_wrap() layer using formula notation as before, with ncol set to 11.
ggplot(Vocab, aes(x = education, y = vocabulary)) +
  stat_smooth(method = "lm", se = FALSE) +
  
  # Update the facet layout, using 11 columns
  facet_wrap(~ year, ncol = 11)

Concluding Remarks

  • Start experimenting with facets in your own plots.



Margin Plots

Facets are great for seeing subsets in a variable, but sometimes you want to see both those subsets and all values in a variable.

Here, the margins argument to facet_grid() is your friend.

  • FALSE (default): no margins.

  • TRUE: add margins to every variable being faceted by.

  • c("variable1", "variable2"): only add margins to the variables listed.
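
The three settings above differ in how many extra "(all)" panels they add. The sketch below counts panels for each case; it assumes the built-in datasets::mtcars and a simpler grid (am by cyl) than the exercises, and n_panels() is a hypothetical helper.

```r
library(ggplot2)

base_plot <- ggplot(datasets::mtcars, aes(wt, mpg)) + geom_point()

# Hypothetical helper: total number of panels in a built plot
n_panels <- function(p) nrow(ggplot_build(p)$layout$layout)

no_margins  <- base_plot + facet_grid(rows = vars(am), cols = vars(cyl))
all_margins <- base_plot + facet_grid(rows = vars(am), cols = vars(cyl), margins = TRUE)
am_margin   <- base_plot + facet_grid(rows = vars(am), cols = vars(cyl), margins = "am")

n_panels(no_margins)    # 2 x 3 = 6
n_panels(all_margins)   # (2 + 1) x (3 + 1) = 12
n_panels(am_margin)     # (2 + 1) x 3 = 9
```

Each margin adds one "(all)" row or column that pools the data across that variable.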

To make it easier to follow the facets, we’ve created two factor variables with proper labels: fam for the transmission type and fvs for the engine type.

Zoom the graphics window to better view your plots.

Exercise 1

  • Update the plot to facet the rows by fvs and fam, and columns by gear.
ggplot(mtcars, aes(x = wt, y = mpg)) + 
  geom_point() +
  
  # Facet rows by fvs and fam, and cols by gear
  facet_grid(rows = vars(fvs, fam), cols = vars(gear))

Exercise 2

  • Add all possible margins to the plot.
ggplot(mtcars, aes(x = wt, y = mpg)) + 
  geom_point() +
  
  # Update the facets to add margins
  facet_grid(rows = vars(fvs, fam), cols = vars(gear), margins = TRUE)

Exercise 3

  • Update the facets to only show margins on "fam".
ggplot(mtcars, aes(x = wt, y = mpg)) + 
  geom_point() +
  # Update the facets to only show margins on fam
  facet_grid(rows = vars(fvs, fam), cols = vars(gear), margins = "fam")

Exercise 4

  • Update the facets to only show margins on "gear" and "fvs".
ggplot(mtcars, aes(x = wt, y = mpg)) + 
  geom_point() +
  
  # Update the facets to only show margins on gear and fvs
  facet_grid(rows = vars(fvs, fam), cols = vars(gear), margins = c("gear", "fvs"))

Concluding Remarks

  • It can be really helpful to show the full margin plots!