1.0 Overview

A ternary plot displays the distribution and variability of three-part compositional data, for example the proportions of young, economically active and aged population, or of sand, silt and clay in soil. The plot is a triangle whose sides are scaled from 0 to 1, and each side represents one of the three components.
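Because each side runs from 0 to 1, a ternary plot works with shares rather than raw counts. The following is a minimal sketch of that conversion in base R; the counts are made-up illustrative numbers, not values from the dataset used later.

```r
# Hypothetical counts for one area (illustrative numbers only)
counts <- c(young = 200, active = 500, old = 300)

# Divide by the total so that the three parts sum to 1
shares <- counts / sum(counts)

print(shares)
```

Each area then becomes a single point in the triangle, positioned by its three shares.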

In this hands-on exercise, you will learn how to build ternary plots programmatically in R to visualise and analyse the population structure of Singapore.

The hands-on exercise consists of four steps:

  • Install and launch the tidyverse and ggtern packages.
  • Derive three new measures using the mutate() function of the dplyr package.
  • Build a static ternary plot using the ggtern() function of the ggtern package.
  • Build an interactive ternary plot using the plot_ly() function of the Plotly R package.

2.0 Installing and launching R packages

Two main R packages will be used in this hands-on exercise:

  • ggtern, a ggplot2 extension specially designed to plot ternary diagrams. The package will be used to plot static ternary plots.
  • Plotly R, an R package for creating interactive web-based graphs via plotly’s JavaScript graphing library, plotly.js. The plotly R library contains the ggplotly() function, which converts ggplot2 figures into Plotly objects.

We also need to ensure that selected tidyverse packages, namely readr, dplyr and tidyr, are installed and loaded.

The code chunk below accomplishes this.

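# Packages required for this exercise
packages <- c('ggtern', 'plotly', 'readr', 'dplyr', 'tidyr')

# Install any package that is missing, then load it
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}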

3.0 Data Preparation

3.1 The data

For this hands-on exercise, the Singapore Residents by Planning Area/Subzone, Age Group, Sex and Type of Dwelling, June 2000-2018 data will be used. The file is called respopagsex2000to2018_tidy.csv and is in CSV format.

3.2 Importing Data

To import respopagsex2000to2018_tidy.csv into R, the read_csv() function of the readr package will be used.

#Reading the data into R environment
pop_data <- read_csv("data/respopagsex2000to2018_tidy.csv") 
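The later steps rely on the columns Year, AG and Population, so it can help to confirm they are present straight after the import. The sketch below uses a small stand-in data frame so that it runs without the CSV file; the helper name missing_columns(), the age-group labels and the values are hypothetical.

```r
# Stand-in for the imported data: the values are made up, and only the
# column names Year, AG and Population are taken from the later code
toy_pop <- data.frame(
  Year = c(2018, 2018),
  AG = c("0_to_4", "5_to_9"),
  Population = c(120, 95)
)

# Hypothetical helper: return the required column names that are absent
missing_columns <- function(data, required) {
  setdiff(required, names(data))
}

# An empty result means every required column is present
missing_columns(toy_pop, c("Year", "AG", "Population"))
```

The same call could be made on pop_data once the real file has been read.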

3.3 Preparing the Data

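Next, use the spread() function of the tidyr package to turn the AG values into columns, then use the mutate() function of the dplyr package to derive three new measures, namely young, active and old, together with their total.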

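#Deriving the young, economically active and old measures
agpop_mutated <- pop_data %>%
  spread(AG, Population) %>%              # one column per AG value
  mutate(YOUNG = rowSums(.[4:8])) %>%     # columns are selected by position
  mutate(ACTIVE = rowSums(.[9:16])) %>%
  mutate(OLD = rowSums(.[17:21])) %>%
  mutate(TOTAL = rowSums(.[22:24])) %>%   # YOUNG + ACTIVE + OLD
  filter(Year == 2018) %>%
  filter(TOTAL > 0)                       # drop rows with zero total population
#Plain data frame copy of the result
df <- data.frame(agpop_mutated)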
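Selecting columns by position, as in .[4:8], depends on the column order of the reshaped data. A minimal sketch of the same derivation using column names is shown below on toy data. The subzone names, age-group labels, groupings and counts are all hypothetical, and pivot_wider() is used as the newer tidyr equivalent of spread().

```r
library(dplyr)
library(tidyr)

# Toy long-format data that only mimics the shape of the real file
toy <- tibble(
  SZ = rep(c("A", "B"), each = 4),
  AG = rep(c("0_to_4", "5_to_9", "25_to_29", "65_to_69"), times = 2),
  Population = c(10, 20, 50, 5, 0, 0, 0, 0)
)

# Hypothetical groupings of age-group labels, selected by name
young_cols  <- c("0_to_4", "5_to_9")
active_cols <- c("25_to_29")
old_cols    <- c("65_to_69")

toy_wide <- toy %>%
  pivot_wider(names_from = AG, values_from = Population) %>%
  mutate(
    YOUNG  = rowSums(across(all_of(young_cols))),
    ACTIVE = rowSums(across(all_of(active_cols))),
    OLD    = rowSums(across(all_of(old_cols))),
    TOTAL  = YOUNG + ACTIVE + OLD
  ) %>%
  filter(TOTAL > 0)   # subzone B has no population and is dropped

toy_wide
```

With named groupings, the sums stay correct even if the column order of the input changes.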

4.0 Plotting Ternary Diagrams with R

4.1 Plotting a static ternary diagram

Use the ggtern() function of the ggtern package to create a simple ternary plot.

#Building the static ternary plot
ggtern(data = agpop_mutated, aes(x = YOUNG, y = ACTIVE, z = OLD)) +
  geom_point()

#Building the static ternary plot with a title and the theme_rgbw() theme
ggtern(data = agpop_mutated, aes(x = YOUNG, y = ACTIVE, z = OLD)) +
  geom_point() +
  labs(title = "Population structure, 2018") +
  theme_rgbw()

#theme_bw() can be used in place of theme_rgbw()
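To see how three shares become one position in the triangle, the sketch below maps a composition to x/y coordinates in base R. The function name tern_to_xy() and the corner layout (first part at the top, second at the bottom left, third at the bottom right) are assumptions chosen for illustration, and this is not necessarily how ggtern computes positions internally.

```r
# Hypothetical helper: map a three-part composition to x/y coordinates in an
# equilateral triangle with corners (0, 0), (1, 0) and (0.5, sqrt(3) / 2)
tern_to_xy <- function(top, left, right) {
  total <- top + left + right
  top_share   <- top / total
  right_share <- right / total
  # Weighted average of the three corner positions; the bottom-left corner
  # sits at the origin, so the left share adds nothing to x or y
  data.frame(
    x = 0.5 * top_share + right_share,
    y = sqrt(3) / 2 * top_share
  )
}

# The three pure compositions and an equal three-way mix
xy <- tern_to_xy(
  top   = c(1, 0, 0, 1),
  left  = c(0, 1, 0, 1),
  right = c(0, 0, 1, 1)
)
xy
```

The pure compositions land on the corners, and the equal mix lands at the centre of the triangle.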

4.2 Plotting an interactive ternary diagram

The code below creates an interactive ternary plot using the plot_ly() function of Plotly R.

# reusable function for axis formatting
axis <- function(txt) {
  list(
    title = txt, tickformat = "%", tickfont = list(size = 10)
  )
}

ternaryAxes <- list(
  aaxis = axis("Active"), 
  baxis = axis("Young"), 
  caxis = axis("Old")
)

# Initiating a plotly visualization 
plot_ly(
  agpop_mutated, 
  a = ~ACTIVE, 
  b = ~YOUNG, 
  c = ~OLD, 
  #text = ~YOUNG,
  type = "scatterternary",
  color = I("red")
)%>%
  layout(
    ternary = ternaryAxes
  )
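The text argument is commented out in the call above. One possible use for it is a hover label that shows all three shares for each point. The sketch below builds such labels in base R on toy data; the SZ column, the subzone names and the counts are hypothetical.

```r
# Toy data in the shape of agpop_mutated (names and values are made up)
toy <- data.frame(
  SZ     = c("Subzone A", "Subzone B"),
  YOUNG  = c(200, 50),
  ACTIVE = c(500, 100),
  OLD    = c(300, 50)
)

total <- toy$YOUNG + toy$ACTIVE + toy$OLD

# One label per row; <br> starts a new line inside a plotly hover box
toy$label <- sprintf(
  "%s<br>Young: %.1f%%<br>Active: %.1f%%<br>Old: %.1f%%",
  toy$SZ,
  100 * toy$YOUNG / total,
  100 * toy$ACTIVE / total,
  100 * toy$OLD / total
)

toy$label
```

A column built this way could then be passed to plot_ly() as text = ~label, together with hoverinfo = "text", so that only the label is shown on hover.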