Plotting with ggplot2

library(tidyverse)

Practice with irises

ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point()

Start with just data and coordinates

ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) 

Add points to show the data, then color them by species and fit a linear model

ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point()

ggplot(iris, aes(x = Petal.Length, y = Petal.Width, color = Species)) +
  geom_point() 

ggplot(iris, aes(x = Petal.Length, y = Petal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm")

ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point(aes(color = Species)) +
  geom_smooth(method = "lm")
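
The last two plots differ only in where color is mapped, and that changes how many lines geom_smooth() fits. Here is a minimal sketch of that difference, assuming layer_data() from ggplot2 is an acceptable way to peek at what each layer computed. The names p_global, p_layer, n_global and n_layer are illustrative and not part of the original walkthrough.

```r
library(ggplot2)

# Color mapped globally: every layer, including geom_smooth(), is grouped by Species
p_global <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Color mapped only in geom_point(): geom_smooth() sees a single group
p_layer <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point(aes(color = Species)) +
  geom_smooth(method = "lm", formula = y ~ x)

# Layer 2 is geom_smooth(); count the groups it fitted
n_global <- length(unique(layer_data(p_global, 2)$group))
n_layer  <- length(unique(layer_data(p_layer, 2)$group))
```

With the global mapping there is one fitted line per species; with the layer-level mapping there is a single line through all of the points.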

Activity 1


Looking at a new dataset

diamonds

Start simple, single variable

Histograms

ggplot(diamonds, aes(x = carat)) +
  geom_histogram()

Adjust the number of bins as necessary

ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 40)

ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 10)

ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 15)
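
The histograms above control the number of bins. geom_histogram() also accepts a binwidth argument, which fixes the width of each bar in the units of the x variable instead. This is a small sketch; the value 0.1 and the names p_binwidth and bar_widths are arbitrary choices for illustration.

```r
library(ggplot2)

# binwidth is in the units of the x variable (carats here); 0.1 is an arbitrary choice
p_binwidth <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(binwidth = 0.1)

# Inspect the widths of the bars that were actually computed
bar_widths <- layer_data(p_binwidth, 1)
range(bar_widths$xmax - bar_widths$xmin)
```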

Quick color adjustment

Color is different from fill: color sets the outline of each bar, fill sets the interior

ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 15, color = "black", fill = "steelblue")

Activity 2


Categorical Variables

Bar plots: geom_bar() defaults to stat = "count"

ggplot(diamonds, aes(x = cut)) +
  geom_bar()

ggplot(diamonds, aes(x = cut)) +
  geom_bar(stat = "count")
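
When the counts have already been computed, the counting step can be skipped. A minimal sketch, assuming dplyr's count() is used to build the summary table; cut_counts and p_counts are illustrative names.

```r
library(ggplot2)
library(dplyr)

# Pre-compute the number of diamonds per cut; count() names the new column n
cut_counts <- count(diamonds, cut)

# geom_col() plots the y values as given, equivalent to geom_bar(stat = "identity")
p_counts <- ggplot(cut_counts, aes(x = cut, y = n)) +
  geom_col()
```

This should draw the same bars as the geom_bar() calls above, only from summarized data.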

Boxplots

ggplot(diamonds, aes(x = cut, y = price)) +
  geom_boxplot()

Violin plots

ggplot(diamonds, aes(x = cut, y = price)) +
  geom_violin()

Other features of ggplot2

Starting with a basic scatterplot

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point()

Plotting in Layers

Points created first, line drawn on top

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point() +
  geom_smooth()

Line created first, points drawn on top

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_smooth() + 
  geom_point() 

Using Color

This applies the color mapping to every layer: aesthetics set in the global ggplot() call carry over to all geom_* layers

ggplot(diamonds, aes(x = carat, y = price, color = cut)) +
  geom_point() +
  geom_smooth()

If we want to color the points based on the cut of the diamond:

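This doesn’t work. Outside aes(), cut is not looked up in the data, so R finds the base function cut() instead of the column and the plot fails with an error.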

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(color = cut) +
  geom_smooth()

This does! Mapping a column of the data to color requires the aes() function.

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = cut)) +
  geom_smooth()

This changes all of the points to a single color

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(color = "orange") +
  geom_smooth()

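This maps color to a constant with the single value "orange". The string is treated as data, so ggplot2 assigns it a color from its default discrete palette and adds a legend. The points are not drawn in orange.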

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = "orange")) +
  geom_smooth()
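
The difference between setting a color and mapping one can be checked by looking at the colors ggplot2 actually assigns. This is a sketch under the assumption that layer_data() is a reasonable way to inspect them; it uses a small subset and drops geom_smooth() to keep it quick. The names small, p_set, p_mapped, set_colour and mapped_colour are illustrative.

```r
library(ggplot2)

small <- head(diamonds, 200)  # small subset so the check runs quickly

# Setting: a constant outside aes() is used as the literal color
p_set <- ggplot(small, aes(x = carat, y = price)) +
  geom_point(color = "orange")

# Mapping: the string inside aes() is treated as a data value with one level
p_mapped <- ggplot(small, aes(x = carat, y = price)) +
  geom_point(aes(color = "orange"))

# Colors assigned to the points in each plot
set_colour    <- unique(layer_data(p_set, 1)$colour)
mapped_colour <- unique(layer_data(p_mapped, 1)$colour)
```

set_colour is the string "orange", while mapped_colour is whatever the default discrete palette assigns to its first level.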

Colors can be continuous

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) 

Labels, Legends, and Themes

Titles and labels

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  )

Legend / color customization

We can customize the colors used in our plots. For instance, scale_color_continuous() lets us set the colors at the low and high ends of the scale.

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  ) +
  scale_color_continuous(
    low = "yellow",
    high = "blue"
  )
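
Other continuous color scales follow the same pattern. In this hedged sketch, scale_color_gradient() takes low and high explicitly, and scale_color_viridis_c() swaps in a ready-made palette. The names base_plot, p_gradient and p_viridis, and the choice of the "magma" option, are illustrative.

```r
library(ggplot2)

base_plot <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table))

# Two-color gradient with explicit end points
p_gradient <- base_plot +
  scale_color_gradient(low = "yellow", high = "blue")

# Viridis palette for continuous data; option = "magma" is an arbitrary pick
p_viridis <- base_plot +
  scale_color_viridis_c(option = "magma")
```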

Using built-in themes

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  ) +
  scale_color_continuous(
    low = "yellow",
    high = "blue"
  ) +
  theme_dark()

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  ) +
  scale_color_continuous(
    low = "yellow",
    high = "blue"
  ) +
  theme_bw()
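
Built-in themes can be adjusted further. A minimal sketch, assuming a larger base font and a legend below the plot are wanted; both values are arbitrary, and p_theme is an illustrative name. The theme() call goes after the built-in theme so that its settings are not overwritten.

```r
library(ggplot2)

p_theme <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  theme_bw(base_size = 14) +          # larger base font; 14 is an arbitrary value
  theme(legend.position = "bottom")   # move the legend below the panel
```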

Facets / Small Multiples

Facet with one variable using facet_wrap()

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  ) +
  scale_color_continuous(
    low = "yellow",
    high = "blue"
  ) +
  theme_bw() +
  facet_wrap( ~ clarity)

Facet in specific directions with facet_grid()

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  ) +
  scale_color_continuous(
    low = "yellow",
    high = "blue"
  ) +
  theme_bw() +
  facet_grid(cut ~ .)

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  ) +
  scale_color_continuous(
    low = "yellow",
    high = "blue"
  ) +
  theme_bw() +
  facet_grid(. ~ cut)

Facet in both directions across different variables

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  ) +
  scale_color_continuous(
    low = "yellow",
    high = "blue"
  ) +
  theme_bw() +
  facet_grid(color ~ cut)

Activity 3


Saving

It’s easy to save a plot, especially once you’ve stored it in a variable (like p).

p <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(color = table)) +
  labs(
    x = "Carat",
    y = "Price (USD)",
    title = "Price vs. Carat",
    subtitle = "Data from ggplot2",
    caption = "This is a caption",
    color = "Table of Diamond"
  ) +
  scale_color_continuous(
    low = "yellow",
    high = "blue"
  ) +
  theme_bw() +
  facet_grid(color ~ cut)

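ggsave() picks the file format from the filename extension. Here are three (the ./images directory should exist beforehand, and SVG output needs the svglite package installed):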

ggsave(plot = p, filename = "./images/diamonds.png")
ggsave(plot = p, filename = "./images/diamonds.svg")
ggsave(plot = p, filename = "./images/diamonds.eps")

We can define further options, like dpi, within the ggsave() command. Remember that width and height are defined in inches by default.

ggsave(plot = p, filename = "./images/diamonds.png", dpi = 1200, width = 8, height = 6)
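
The size can also be given in other units. A small sketch, assuming centimetres are preferred and the output folder may not exist yet. It writes to a temporary directory rather than ./images so that it can be run anywhere; out_dir, out_file and p_small are illustrative names.

```r
library(ggplot2)

# A small plot keeps the example quick to render
p_small <- ggplot(head(diamonds, 500), aes(x = carat, y = price)) +
  geom_point()

# Create the output folder first; tempdir() is used here only for illustration
out_dir <- file.path(tempdir(), "images")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
out_file <- file.path(out_dir, "diamonds.png")

# units switches width and height from the default inches to centimetres
ggsave(plot = p_small, filename = out_file,
       width = 20, height = 15, units = "cm", dpi = 300)

file.exists(out_file)
```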