Colour

The aim of this document is to give a simple understanding of colour and how to apply it in R using ggplot2. We also explore colour packages and custom palettes.

Choosing appropriate colours when we plot our data helps us to interpret and, more importantly, to describe a relationship visually. A poorly chosen set of colours can limit the reader's interpretation and fail to convey the message effectively.

How to select colour

R offers several ways to specify a colour, and packages can then be layered on top of them. This vignette walks through these in turn, from base R to add-on packages.

First let’s load the ggplot2 package and pick a built-in dataset that allows us to demonstrate how to specify colour. ggplot2 lets us change aspects of the chart using its fill and color arguments. Note that ggplot2 accepts both spellings, color and colour, and they produce the same output.

library(ggplot2)

cars <- ggplot(mtcars, aes(x = mpg))

  • Name
cars + geom_density(color="yellow", fill="grey")

  • rgb()
cars + geom_density(fill=rgb(0.5, 0.5, 0.5))

  • Number
cars + geom_density(fill=colors()[333])

  • Hex code
cars + geom_density(colour="#FFFF00", fill="#808080")

  • Palettes and/or Scales

Here is what you can do using base R. Five base R functions each generate a vector of n contiguous colours:

rainbow(n), heat.colors(n), terrain.colors(n), topo.colors(n), and cm.colors(n).

# Use rainbow
barplot(1:9, col=rainbow(9))

# Use heat.colors
barplot(1:9, col=heat.colors(9))

# Use terrain.colors
barplot(1:9, col=terrain.colors(9))

# Use topo.colors
barplot(1:9, col=topo.colors(9))

# Use cm.colors
barplot(1:9, col=cm.colors(9))
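The name, rgb(), number, and hex forms listed earlier are different ways of describing a colour, and the palette functions above simply return character vectors of hex codes. The following is a minimal sketch using base R only; the variable names are illustrative.

```r
# Convert between the colour representations used so far (base R only).

# rgb() takes red, green and blue on a 0-1 scale and returns a hex string.
mid_grey <- rgb(0.5, 0.5, 0.5)
mid_grey
# "#808080"

# col2rgb() goes the other way: name or hex code -> red/green/blue (0-255).
col2rgb("yellow")   # 255, 255, 0, the same colour as "#FFFF00"
col2rgb("grey")     # 190, 190, 190: R's named "grey" is lighter than "#808080"

# colors() is a character vector of names, so a number is just an index into it.
colour_name <- colors()[333]
colour_name

# The palette functions return hex codes that can be inspected or subset.
heat_palette <- heat.colors(9)
length(heat_palette)
head(heat_palette, 3)
```

Note that the named colour "grey" and the hex code "#808080" are close but not identical, so the Name and Hex code plots above differ slightly in fill.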

Here is what we can do with palettes and scales in ggplot2. Do not worry about the details yet; later sections of the document explain more as we introduce palette packages.

library("ggplot2")
# Box plot
box_plot <- ggplot(iris, aes(Species, Sepal.Length)) + 
  geom_boxplot(aes(fill = Species)) +
  theme_minimal() +
  theme(legend.position = "bottom")

# Scatter plot
scatter_plot <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + 
  geom_point(aes(color = Species)) +
  theme_minimal()+
  theme(legend.position = "bottom")

# grey
box_plot + scale_fill_grey(start = 0.8, end = 0.2) 

scatter_plot + scale_color_grey(start = 0.8, end = 0.2) 

# hue
box_plot + scale_fill_hue(h = c(175, 255))

scatter_plot + scale_color_hue(h = c(150, 255))  

# RColorBrewer
box_plot + scale_fill_brewer(palette = "Blues")

scatter_plot + scale_color_brewer(palette = "Blues")
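To see which colours a scale will actually assign, the palette generators in the scales package can be called directly. This is a sketch that assumes scales is installed (it normally comes with ggplot2) and that its hue_pal(), grey_pal(), and brewer_pal() generators take the same arguments as the ggplot2 scales used above.

```r
library(scales)

# Each generator returns a function; call it with the number of colours needed.
default_hues <- hue_pal()(3)                  # ggplot2's default discrete colours
narrow_hues  <- hue_pal(h = c(175, 255))(3)   # same hue range as scale_fill_hue() above
greys        <- grey_pal(start = 0.8, end = 0.2)(3)
blues        <- brewer_pal(palette = "Blues")(3)

default_hues
# "#F8766D" "#00BA38" "#619CFF"

# Preview the other vectors as colour swatches.
show_col(c(narrow_hues, greys, blues))
```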

Many R packages provide palettes and scales, and we explore some of them in the next sections of the document. For each package, it is worth learning the syntax and the options it makes available.

RColorBrewer Package

This is an example of one of the packages available on CRAN that contain useful colour palettes. It can be used in conjunction with other functions, so please explore colorRampPalette() and smoothScatter().

The package provides three types of palette, each suited to a type of data:

  • Sequential - Used for ordered or numerical data that run from low to high. The colours go from light to dark as you move from left to right.
  • Qualitative - Used to represent categorical or nominal data.
  • Diverging - A good way to think about this palette is negative values on the left and positive values on the right. The colours are darker at each end and lighter (typically white) in the centre.

Please install and/or load the RColorBrewer package to get started:

Note: this is not the first time we have used the package, even though we are only now loading it. The scale_*_brewer() functions used earlier work because ggplot2 can access the ColorBrewer palettes through its own dependencies.

# install.packages("RColorBrewer")
library(RColorBrewer)

The available palettes are listed in the documentation. However, the display.brewer.all() function plots all of them along with their names. In the graph below we see the sequential palettes, then the qualitative palettes, and finally the diverging palettes.

display.brewer.all()
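To check which palettes belong to each of the three types without reading them off the plot, the brewer.pal.info data frame shipped with RColorBrewer can be queried. This is a small sketch; the variable names are illustrative.

```r
library(RColorBrewer)

# brewer.pal.info has one row per palette: the maximum number of colours,
# the category ("seq", "qual" or "div") and a colour-blind-friendly flag.
head(brewer.pal.info)

# Palette names by type.
sequential_names <- rownames(brewer.pal.info)[brewer.pal.info$category == "seq"]
diverging_names  <- rownames(brewer.pal.info)[brewer.pal.info$category == "div"]
sequential_names

# Plot a single type, or only the palettes flagged as colour-blind friendly.
display.brewer.all(type = "div")
display.brewer.all(colorblindFriendly = TRUE)
```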

Usage in ggplot2. Two color scale functions are available in ggplot2:

  • scale_fill_brewer()
  • scale_color_brewer()

box_plot + scale_fill_brewer(palette = "Oranges")

scatter_plot + scale_color_brewer(palette = "Oranges")

Usage in base plots:

barplot(1:9, col = brewer.pal(n = 9, name = "Oranges"))
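ColorBrewer palettes are limited in size (the sequential ones stop at 9 colours), which is where the colorRampPalette() and smoothScatter() functions mentioned at the start of this section come in. The sketch below is one possible way to combine them; it assumes the KernSmooth package, which smoothScatter() relies on, is installed, and the choice of mtcars columns is purely illustrative.

```r
library(RColorBrewer)

# colorRampPalette() returns a function that interpolates between the colours given.
orange_ramp <- colorRampPalette(brewer.pal(n = 9, name = "Oranges"))

# Ask for more colours than the original palette provides.
oranges_20 <- orange_ramp(20)
barplot(1:20, col = oranges_20)

# smoothScatter() accepts such a function through its colramp argument.
# Starting the ramp at white is an assumption so that empty regions stay blank.
blue_ramp <- colorRampPalette(c("white", brewer.pal(n = 9, name = "Blues")))
smoothScatter(mtcars$mpg, mtcars$wt, colramp = blue_ramp)
```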

viridis Package

Originally designed for Python’s matplotlib library, this colour scale is designed to be colourful, perceptually uniform, robust to colour blindness, and to produce good results in greyscale printing. The authors also describe it as "Pretty, oh so pretty".

The color palettes are provided as ggplot2 scale functions:

library(ggplot2)
library(viridis)
# Scatter
ggplot(iris, aes(Sepal.Length, Sepal.Width))+
  geom_point(aes(color = Sepal.Length)) +
  scale_color_viridis(option = "D")+
  theme_minimal() +
  theme(legend.position = "bottom")

Usage in base plots:

barplot(1:9, col = viridis(9))
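The scatter plot above maps a continuous variable. For a categorical variable such as Species, scale_color_viridis() needs discrete = TRUE, and the option argument switches between the colour maps in the package. A minimal sketch, with "C" and "magma" chosen only as examples:

```r
library(ggplot2)
library(viridis)

# discrete = TRUE is needed when the mapped variable is categorical.
viridis_species <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  scale_color_viridis(discrete = TRUE, option = "C") +
  theme_minimal() +
  theme(legend.position = "bottom")
viridis_species

# The same option argument works for the palette function used in base plots.
magma_9 <- viridis(9, option = "magma")
barplot(1:9, col = magma_9)
```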

See the viridis package vignette for more detail.

ggsci Package

This package is commonly known for its scientific journal colour palettes. It is a collection of high-quality colour palettes inspired by colours used in scientific journals, data visualisation libraries, science fiction movies, and TV shows.

The color palettes are provided as ggplot2 scale functions:

library(ggplot2)
library(ggsci)

ggplot(iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species)) +
  geom_point(size=6) +
  scale_color_simpsons() +
  scale_fill_simpsons() +
  theme(legend.position = "bottom")

Usage in base plots:

barplot(1:9, col = pal_simpsons()(9))

See the ggsci package vignette for more detail.

Custom palettes

The ability to build custom palettes was the inspiration for this explanation of colour. Custom palettes have many applications and are very relevant in practice.

Load the library

library(ggplot2)

Plot a chart using a base dataset

p <- ggplot(iris, aes(Sepal.Length, Sepal.Width))+
  geom_point(aes(color = Species)) +
  theme(legend.position = "bottom")

Let’s imagine that we work at Mastercard and are presenting information to our stakeholders. It would have more impact if we plotted our analysis using the company's colour scheme. This is how we could do it:

mastercard <- c("#EB001B", "#FF5F00", "#F79E1B", "#231F20")
p + scale_color_manual(values = mastercard)
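With an unnamed vector, scale_color_manual() assigns the values in the order of the factor levels, so with three species only the first three of the four colours are used. A named vector makes the mapping explicit. The sketch below assumes an arbitrary assignment of colours to species, chosen only for illustration.

```r
library(ggplot2)

# Hypothetical assignment of brand colours to species, for illustration only.
species_colours <- c(
  setosa     = "#EB001B",
  versicolor = "#FF5F00",
  virginica  = "#F79E1B"
)

p_named <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  scale_color_manual(values = species_colours) +
  theme(legend.position = "bottom")
p_named
```

Naming the values keeps each group's colour stable even if the order of the factor levels changes or a level is dropped from the data.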

This is only intended as an introduction, and the best way to implement a house style is by using themes. I would suggest reading about the ggthemes package.
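As a rough illustration of the theme-based approach, the palette and the styling can be wrapped in small helper functions so that every chart picks them up in one step. scale_color_brand() and theme_brand() are hypothetical names invented for this sketch; they are not part of ggplot2 or ggthemes.

```r
library(ggplot2)

mastercard <- c("#EB001B", "#FF5F00", "#F79E1B", "#231F20")

# Hypothetical helper: a colour scale that always uses the brand palette.
scale_color_brand <- function(...) {
  scale_color_manual(values = mastercard, ...)
}

# Hypothetical helper: a house theme built on top of theme_minimal().
theme_brand <- function(base_size = 11) {
  theme_minimal(base_size = base_size) +
    theme(
      legend.position = "bottom",
      plot.title = element_text(colour = mastercard[4], face = "bold")
    )
}

brand_plot <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  labs(title = "Sepal dimensions by species") +
  scale_color_brand() +
  theme_brand()
brand_plot
```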

References

Plotting and Color in R; Roger D. Peng https://bookdown.org/rdpeng/exdata/plotting-and-color-in-r.html

R Markdown: The Definitive Guide; Yihui Xie, J. J. Allaire, Garrett Grolemund https://bookdown.org/yihui/rmarkdown/

MasterCard Logo Colors With Hex & RGB Codes; https://www.schemecolor.com/mastercard.php