I’ll admit it: like many R users, I’m guilty of sticking with the default colors when plotting graphs with ggplot2. There’s nothing wrong with that, but why not have some fun while we’re at it?
Data visualisation is about creating a story that will engage your stakeholders. One way to make plots more impactful is by using colors wisely.
If you can relate and would like to find ways to make your plots more unique, I invite you to read further.
Let’s have a look at the variety of palettes available in R and how they can bring your visualisations to life.
In this vignette, we will use ggplot2 to create our graphs and select colors using the RColorBrewer and viridis packages.
First things first: load the required libraries (you may have to install them first if you haven’t used them yet).
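The code chunk for this step is not shown here, so the following is a minimal sketch of what loading the libraries might look like. The package list is inferred from the functions used later in this vignette (`grid.arrange()`, `left_join()`, `map_data()`), so treat it as an assumption rather than the exact original setup.

```r
# Install once if needed (assumed package list, inferred from the code in this vignette):
# install.packages(c("ggplot2", "RColorBrewer", "viridis", "gridExtra", "dplyr", "maps"))

library(ggplot2)       # plotting
library(RColorBrewer)  # ColorBrewer palettes
library(viridis)       # viridis color maps
library(gridExtra)     # grid.arrange() for placing plots side by side
library(dplyr)         # left_join() for the map example
# map_data("state") additionally needs the maps package to be installed
```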
The RColorBrewer package contains three types of palettes, which we can display using the code below:
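The code for this step is missing from this copy. A likely candidate is RColorBrewer's `display.brewer.all()`, which draws every palette in a single chart:

```r
library(RColorBrewer)

# Draw all ColorBrewer palettes in one chart:
# sequential at the top, qualitative in the middle, diverging at the bottom
display.brewer.all()
```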
An additional argument lets you view only the colorblind-friendly palettes:
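Again the original chunk is not shown. A sketch using the `colorblindFriendly` argument of `display.brewer.all()`, plus the matching column of `brewer.pal.info` in case you prefer a list of names to a picture:

```r
library(RColorBrewer)

# Same chart, restricted to the palettes flagged as colorblind friendly
display.brewer.all(colorblindFriendly = TRUE)

# The same flag is stored as a logical column in brewer.pal.info
cb_friendly <- rownames(brewer.pal.info)[brewer.pal.info$colorblind]
cb_friendly
```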
Here’s how to select a palette based on your data (in order based on the above picture):
Sequential: for representing values going from “low” (or uninteresting) to “high” (or interesting)
Qualitative: for categorical or nominal data (factors)
Diverging: for numerical data that can be positive or negative with darker colors on each extreme and lighter ones in the middle.
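If you would rather look up the type of a palette than read it off the chart, the `brewer.pal.info` data frame that ships with RColorBrewer records it. A small sketch:

```r
library(RColorBrewer)

# brewer.pal.info has one row per palette, with its category
# ("seq", "qual" or "div") and its maximum number of colors
info <- brewer.pal.info

# Palette names grouped by category
split(rownames(info), info$category)

# Category and size of three palettes used later in this vignette
info[c("Dark2", "Purples", "PRGn"), c("category", "maxcolors")]
```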
The viridis palette was originally developed for matplotlib (a Python package) and is essentially a sequence running from blues to yellows:
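The picture of the palette is not included here. As a stand-in, this sketch prints a few viridis colors as hex codes and previews them with base graphics (the choice of ten colors is arbitrary):

```r
library(viridis)

# Ten evenly spaced colors from the default viridis map, as hex codes
# (the trailing "FF" is the alpha channel, i.e. fully opaque)
cols <- viridis(10)
cols

# Quick preview of the colors as a strip of bars
barplot(rep(1, length(cols)), col = cols, border = NA, axes = FALSE)
```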
For this example on how to add a palette to your plot, we’re using the “mpg” dataset that ships with ggplot2, which contains “Fuel economy data from 1999 to 2008 for 38 popular models of cars”.
With the code below, we’re plotting the density of city mileage grouped by number of cylinders, with cylinders treated as a categorical variable:
library(ggplot2)
library(gridExtra)

# Same density plot four times, each with a different qualitative palette
p1 <- ggplot(mpg, aes(cty)) +
  geom_density(aes(fill = factor(cyl)), alpha = 0.8) +
  labs(title = "Dark2",
       x = "City Mileage",
       fill = "# Cylinders") +
  scale_fill_brewer(palette = "Dark2")
p2 <- ggplot(mpg, aes(cty)) +
  geom_density(aes(fill = factor(cyl)), alpha = 0.8) +
  labs(title = "Pastel2",
       x = "City Mileage",
       fill = "# Cylinders") +
  scale_fill_brewer(palette = "Pastel2")
p3 <- ggplot(mpg, aes(cty)) +
  geom_density(aes(fill = factor(cyl)), alpha = 0.8) +
  labs(title = "Accent",
       x = "City Mileage",
       fill = "# Cylinders") +
  scale_fill_brewer(palette = "Accent")
p4 <- ggplot(mpg, aes(cty)) +
  geom_density(aes(fill = factor(cyl)), alpha = 0.8) +
  labs(title = "Set1",
       x = "City Mileage",
       fill = "# Cylinders") +
  scale_fill_brewer(palette = "Set1")

# Arrange the four plots in a 2 x 2 grid
grid.arrange(p1, p2, p3, p4, nrow = 2)
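The four plots differ only in the palette name, so the repetition can be factored out. Here is a sketch of one way to do it, using a hypothetical helper called `density_with_palette()` (my own name, not part of any package):

```r
library(ggplot2)
library(gridExtra)

# Hypothetical helper: the same density plot, with the palette passed in by name
density_with_palette <- function(palette) {
  ggplot(mpg, aes(cty)) +
    geom_density(aes(fill = factor(cyl)), alpha = 0.8) +
    labs(title = palette, x = "City Mileage", fill = "# Cylinders") +
    scale_fill_brewer(palette = palette)
}

palettes <- c("Dark2", "Pastel2", "Accent", "Set1")
plots <- lapply(palettes, density_with_palette)

# Arrange the list of plots in a 2 x 2 grid
grid.arrange(grobs = plots, nrow = 2)
```

Trying another palette then only means adding its name to `palettes`.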
This time we will manually select two colors from a palette using their hex codes.
We’re using a simple data frame that we create ourselves:
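The data frame itself is not shown in this copy. The plotting code further down only needs columns named `x`, `y` and `pos`, so the sketch below builds a hypothetical `df` with made-up values, where `pos` flags whether `y` is positive:

```r
# Hypothetical data (values are made up for illustration)
df <- data.frame(
  x = 1:10,
  y = c(-4, -2.5, -1, 3, 5, 2, -3, 1.5, 4, -0.5)
)

# Two-level flag used for the fill aesthetic: one color per sign
df$pos <- df$y >= 0

head(df)
```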
# Display the hex codes and select two colors for the next graph: "#9970AB" and "#5AAE61"
brewer.pal(n = 11, name = "PRGn")
[1] "#40004B" "#762A83" "#9970AB" "#C2A5CF" "#E7D4E8" "#F7F7F7" "#D9F0D3"
[8] "#A6DBA0" "#5AAE61" "#1B7837" "#00441B"
Keep in mind that every palette has a maximum number of colors (11 for “PRGn”, for example). If you apply a full palette to data that needs more colors than that, the remaining categories will not be colored at all.
ggplot(df, aes(x = x, y = y, fill = pos)) +
  geom_bar(stat = "identity", position = "identity") +
  scale_fill_manual(values = c("#9970AB", "#5AAE61"))
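One common workaround for the palette size limit mentioned above (not part of the original example) is to interpolate extra colors with base R's `colorRampPalette()`. A sketch, assuming 15 categories need a color:

```r
library(RColorBrewer)

# Build an interpolating function from the 11 PRGn colors,
# then ask it for 15 colors (a hypothetical number of categories)
prgn_ramp <- colorRampPalette(brewer.pal(n = 11, name = "PRGn"))
prgn_15 <- prgn_ramp(15)
prgn_15

# The result can be passed to scale_fill_manual(values = prgn_15)
```

Interpolated colors sit closer together than the originals, so neighbouring categories become harder to tell apart as the number grows.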
We’re creating a simple dataset again to plot some bars.
tooth <- data.frame(dose = c("D0.5", "D1", "D2"),
                    len = c(4.2, 10, 29.5))
g1 <- ggplot(tooth, aes(x = dose, y = len, fill = dose)) +
  geom_bar(stat = "identity") + scale_fill_brewer(palette = "Purples")
g2 <- ggplot(tooth, aes(x = dose, y = len, fill = dose)) +
  geom_bar(stat = "identity") + scale_fill_brewer(palette = "Oranges")
g3 <- ggplot(tooth, aes(x = dose, y = len, fill = dose)) +
  geom_bar(stat = "identity") + scale_fill_brewer(palette = "Blues")
g4 <- ggplot(tooth, aes(x = dose, y = len, fill = dose)) +
  geom_bar(stat = "identity") + scale_fill_brewer(palette = "Greens")
grid.arrange(g1, g2, g3, g4, nrow = 2)
In this section we will color a map of the US by state using USArrests, which is also a built-in R dataset. It contains statistics on different types of crime; for this example we will focus on assault only. The figures are arrests per 100 000 residents.
library(dplyr)  # for left_join(); map_data() also needs the maps package installed

# Prepare the data: lower-case state names to match the map data
arrests <- USArrests
arrests$region <- tolower(rownames(USArrests))
# Retrieve the states map data and merge with crime data
states_map <- map_data("state")
arrests_map <- left_join(states_map, arrests, by = "region")
# Create the map
ggplot(arrests_map, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = Assault), color = "white") +
  labs(title = "Assault arrests (per 100 000) per state") +
  scale_fill_viridis_c(option = "D", direction = 1)
Reverse the order of the colors by changing direction = 1 to direction = -1:
ggplot(arrests_map, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = Assault), color = "white") +
  labs(title = "Assault arrests (per 100 000 inhabitants) per state") +
  scale_fill_viridis_c(option = "D", direction = -1)
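The `option` argument selects one of the color maps that ship with viridis. As far as I know, the letters "A" to "E" are shorthand for "magma", "inferno", "plasma", "viridis" and "cividis" (newer versions add a few more). A quick way to compare them without redrawing the map is to print a few colors from each:

```r
library(viridis)

# Five colors from each of five color maps, one column per map
maps <- c(magma = "A", inferno = "B", plasma = "C", viridis = "D", cividis = "E")
sapply(maps, function(opt) viridis(5, option = opt))

# direction = -1 gives the same colors in reverse order
viridis(5, direction = -1)
rev(viridis(5))
```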
In addition to trying different palettes, I recommend first doing some research on which colors best suit the context of your graph.
Playing with contrast and highlighting only one bar or a few points of your plot is also a very useful way to draw attention to important information, but it is not covered in this article.
That’s it for this vignette, but there are many more ways to play with colors in your visuals.
R Documentation
Peng, R. D. (n.d.). 10 Plotting and Color in R | Exploratory Data Analysis with R. Retrieved August 13, 2020, from https://bookdown.org/rdpeng/exdata/
ggplot2 Reference and Examples (Part 2)—Colours. (n.d.). Retrieved August 13, 2020, from http://rstudio-pubs-static.s3.amazonaws.com/5312_98fc1aba2d5740dd849a5ab797cc2c8d.html
The A - Z Of Rcolorbrewer Palette You Must Know. (2018, November 18). Datanovia. https://www.datanovia.com/en/blog/the-a-z-of-rcolorbrewer-palette/
Top 50 ggplot2 Visualizations—The Master List (With Full R Code). (n.d.). Retrieved August 13, 2020, from http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html