This example plots Anscombe’s quartet using the tidy tools of dplyr, tidyr, and ggplot2:
library(ggplot2)
library(dplyr)
library(tidyr)
The dataset comes built into R:
head(anscombe)
##   x1 x2 x3 x4   y1   y2    y3   y4
## 1 10 10 10  8 8.04 9.14  7.46 6.58
## 2  8  8  8  8 6.95 8.14  6.77 5.76
## 3 13 13 13  8 7.58 8.74 12.74 7.71
## 4  9  9  9  8 8.81 8.77  7.11 8.84
## 5 11 11 11  8 8.33 9.26  7.81 8.47
## 6 14 14 14  8 9.96 8.10  8.84 7.04
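But it stores each set in its own pair of columns (x1 and y1, x2 and y2, and so on), so it must first be reshaped into a tidy form with one row per observation per set: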
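anscombe_tidy <- anscombe %>%
  # number the rows so each observation stays identifiable after reshaping
  mutate(observation = seq_len(n())) %>%
  # stack x1..x4 and y1..y4 into key/value pairs
  gather(key, value, -observation) %>%
  # split e.g. "x1" after its first character into variable "x" and set 1
  separate(key, c("variable", "set"), 1, convert = TRUE) %>%
  # label the sets with Roman numerals
  mutate(set = c("I", "II", "III", "IV")[set]) %>%
  # give x and y their own columns again
  spread(variable, value)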
head(anscombe_tidy)
##   observation set  x    y
## 1           1   I 10 8.04
## 2           1  II 10 9.14
## 3           1 III 10 7.46
## 4           1  IV  8 6.58
## 5           2   I  8 6.95
## 6           2  II  8 8.14
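The same reshaping can probably be written more compactly with pivot_longer(), which newer tidyr releases offer as the successor to gather() and spread(). The following is a minimal sketch, assuming tidyr 1.0.0 or later; the name anscombe_pivot is only illustrative, and the result is a tibble rather than a plain data frame:

```r
library(dplyr)
library(tidyr)

anscombe_pivot <- anscombe %>%
  # number the rows so each observation stays identifiable
  mutate(observation = row_number()) %>%
  pivot_longer(
    -observation,
    # ".value" turns the first character (x or y) into output columns;
    # the second character becomes the "set" column
    names_to = c(".value", "set"),
    names_pattern = "(.)(.)"
  ) %>%
  # "set" arrives as the strings "1".."4"; convert it to Roman numerals
  mutate(set = c("I", "II", "III", "IV")[as.integer(set)])

head(anscombe_pivot)
```

The special ".value" name is what lets x and y stay as separate columns in a single step, so no spread() is needed afterwards.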
Now it can be plotted:
ggplot(anscombe_tidy, aes(x, y)) +
  geom_point() +
  facet_wrap(~ set) +
  geom_smooth(method = "lm", se = FALSE)
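The lines drawn by geom_smooth(method = "lm") come from an ordinary least-squares fit within each facet. As a rough check, the same fits can be computed directly; this sketch uses base R on the original anscombe data frame so that it runs on its own, and the names fit_one and fits are illustrative only:

```r
# Fit y ~ x separately for each of the four sets and collect the coefficients
fit_one <- function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  coef(lm(y ~ x))
}

fits <- sapply(1:4, fit_one)
colnames(fits) <- c("I", "II", "III", "IV")

# Rows are the intercept and the slope; columns are the sets
round(fits, 2)
```

All four columns should come out at roughly an intercept of 3 and a slope of 0.5, which is why the fitted lines look the same in every panel even though the point clouds differ.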