Q: How do you put two color scales on a ggplot2 chart?
A: This is probably a bad idea - the mappings between data and aesthetics that ggplot2 enforces are for your own good.
Q: but I want to anyway
A: OK, fine
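The trick is to tag your data with the color you want, then use scale_color_identity(), which tells ggplot2 to treat the values in that column as literal colors instead of mapping them through a palette.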
library(ggplot2)
library(dplyr)
library(magrittr)
Read in some data:
ed_data <- read.csv(
  file = 'https://gist.githubusercontent.com/sfsheath/07f47f65073fa3f47286/raw/0bf450a6a5344ebd22e63cdfda9730d44c867ec3/mydata100.csv',
  stringsAsFactors = FALSE
) %>%
  dplyr::select(-X) %>%
  dplyr::mutate(
    start_rank = rank(pretest, ties.method = 'first')
  )
head(ed_data)
##   gender workshop q1 q2 q3 q4 pretest posttest start_rank
## 1 Female        R  4  3  4  5      72       80         22
## 2   Male     SPSS  3  4  3  4      70       75         12
## 3   <NA>     <NA>  3  2 NA  3      74       78         40
## 4 Female     SPSS  5  4  5  3      80       82         79
## 5 Female    Stata  4  4  3  4      75       81         48
## 6 Female     SPSS  5  4  3  5      72       77         23
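The start_rank column is what gives each student a separate row on the y axis. A small sketch with made-up scores (not taken from the dataset) shows why ties.method = 'first' matters here: tied scores still get distinct ranks, in order of appearance.

```r
# Hypothetical pretest scores; two students are tied at 72
scores <- c(72, 70, 74, 72)

# ties.method = 'first' breaks ties by position, so every rank is unique
ranks_first <- rank(scores, ties.method = 'first')
ranks_first
## [1] 2 1 4 3

# The default ('average') gives both tied students rank 2.5,
# which would draw their segments on top of each other
ranks_default <- rank(scores)
ranks_default
## [1] 2.5 1.0 4.0 2.5
```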
Let’s say you wanted to show growth from pretest to posttest:
p <- ggplot(
  data = ed_data,
  aes(
    x = pretest,
    xend = posttest,
    y = start_rank,
    yend = start_rank
  )
) +
  geom_segment() +
  # start point
  geom_point(
    aes(x = pretest, y = start_rank)
  ) +
  # end point
  geom_point(
    aes(x = posttest, y = start_rank)
  ) +
  theme_bw()
p
Now let’s say you wanted to encode the points with meaning: whether each score falls in the top or bottom quartile, for example.
ed_data <- ed_data %>%
  dplyr::mutate(
    end_rank = rank(posttest, ties.method = 'first'),
    start_quartile = ntile(pretest, 4),
    end_quartile = ntile(posttest, 4)
  )
head(ed_data)
##   gender workshop q1 q2 q3 q4 pretest posttest start_rank end_rank
## 1 Female        R  4  3  4  5      72       80         22       36
## 2   Male     SPSS  3  4  3  4      70       75         12        8
## 3   <NA>     <NA>  3  2 NA  3      74       78         40       27
## 4 Female     SPSS  5  4  5  3      80       82         79       49
## 5 Female    Stata  4  4  3  4      75       81         48       40
## 6 Female     SPSS  5  4  3  5      72       77         23       19
##   start_quartile end_quartile
## 1              1            2
## 2              1            1
## 3              2            2
## 4              4            2
## 5              2            2
## 6              1            1
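For readers who have not used it, dplyr's ntile() ranks the values and splits them into the requested number of roughly equal-sized buckets, so with 4 buckets a value of 1 means the bottom quartile and 4 means the top. A minimal sketch on made-up scores (not from the dataset):

```r
library(dplyr)

# Hypothetical scores, already sorted so the buckets are easy to read
scores <- c(55, 60, 65, 70, 75, 80, 85, 90)

# Eight values into four buckets: two values per quartile
quartile <- ntile(scores, 4)
quartile
## [1] 1 1 2 2 3 3 4 4
```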
To my knowledge, ggplot2 won’t let you apply one color scale to the start points and another to the end points, because a plot gets a single scale per aesthetic. But if you’re willing to do a little legwork, you can pre-process your data to compute the colors yourself and use scale_color_identity() to plot them as-is.
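# Look up a color for each quartile: red for the bottom quartile,
# gray for the middle two, blue for the top
color_mapping <- function(quartile) {
  colors <- c('red2', 'gray50', 'gray50', 'blue')
  colors[quartile]
}
ed_data <- ed_data %>%
  dplyr::mutate(
    # color_mapping() is vectorized, so no rowwise() step is needed
    start_color = color_mapping(start_quartile),
    end_color = color_mapping(end_quartile)
  )
head(ed_data) %>% print.data.frame()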
##   gender workshop q1 q2 q3 q4 pretest posttest start_rank end_rank
## 1 Female        R  4  3  4  5      72       80         22       36
## 2   Male     SPSS  3  4  3  4      70       75         12        8
## 3   <NA>     <NA>  3  2 NA  3      74       78         40       27
## 4 Female     SPSS  5  4  5  3      80       82         79       49
## 5 Female    Stata  4  4  3  4      75       81         48       40
## 6 Female     SPSS  5  4  3  5      72       77         23       19
##   start_quartile end_quartile start_color end_color
## 1              1            2        red2    gray50
## 2              1            1        red2      red2
## 3              2            2      gray50    gray50
## 4              4            2        blue    gray50
## 5              2            2      gray50    gray50
## 6              1            1        red2      red2
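Because color_mapping() just indexes into a vector, it can be handed a whole column at once. The sketch below repeats the function so it runs on its own and feeds it a made-up vector of quartiles. It also shows that a missing quartile comes back as a missing color, so it is worth checking for missing scores before plotting.

```r
# Same lookup as above, repeated so this snippet is self-contained
color_mapping <- function(quartile) {
  colors <- c('red2', 'gray50', 'gray50', 'blue')
  colors[quartile]
}

# Hypothetical quartiles, including one missing value
mapped <- color_mapping(c(1, 2, 3, 4, NA))
mapped
## [1] "red2"   "gray50" "gray50" "blue"   NA
```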
p <- ggplot(
  data = ed_data,
  aes(
    x = pretest,
    xend = posttest,
    y = start_rank,
    yend = start_rank
  )
) +
  geom_segment() +
  # start point
  geom_point(
    aes(x = pretest, y = start_rank, color = start_color)
  ) +
  # end point
  geom_point(
    aes(x = posttest, y = start_rank, color = end_color)
  ) +
  theme_bw() +
  scale_color_identity()
p
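One side effect worth knowing about: scale_color_identity() draws no legend by default (its guide argument defaults to "none"). If you want one, you can ask for it and supply your own labels. The sketch below uses a small made-up data frame in place of ed_data, and the legend title and labels are placeholders I chose for illustration. The legend only explains what each color means; it cannot tell start points from end points, since both layers still share one scale.

```r
library(ggplot2)

# Toy data standing in for ed_data (values are made up for illustration)
toy <- data.frame(
  pretest     = c(60, 70, 80),
  posttest    = c(75, 72, 90),
  start_rank  = 1:3,
  start_color = c('red2', 'gray50', 'blue'),
  end_color   = c('gray50', 'gray50', 'blue')
)

p_legend <- ggplot(toy, aes(y = start_rank)) +
  geom_segment(aes(x = pretest, xend = posttest, yend = start_rank)) +
  geom_point(aes(x = pretest, color = start_color)) +
  geom_point(aes(x = posttest, color = end_color)) +
  theme_bw() +
  scale_color_identity(
    name   = 'Quartile',                        # placeholder legend title
    breaks = c('red2', 'gray50', 'blue'),       # the literal colors in the data
    labels = c('Bottom', 'Middle two', 'Top'),  # what each color stands for
    guide  = 'legend'                           # turn the legend on
  )
p_legend
```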