These documents provide some examples of triad data visualization, primarily using the packages ggtern, compositions, and robCompositions. The ggtern package is an extension of R's ggplot2 package offering many specialized visualization options for presenting 3-part compositional data within ternary plots. The author, Nicholas Hamilton, offers many examples with accompanying code. The compositions and robCompositions packages provide custom functions to transform, analyse, and model compositional data (data with multiple variables that measure distinct parts of a whole).

This code and documentation were developed to accompany in-person tutorials for an individual needing to learn just enough R to understand, create, and edit visualizations of their dissertation data. I wrote these notes primarily to serve as reminders of coding topics covered in our tutoring sessions.

NB: R software and packages are regularly updated, so version notes are provided at the end of this document along with instructions on how to roll back to older versions, if necessary.

Reload your data

# Remember to change the file path to your own directory
demodata1 <- read.csv("E:/P_Teaching/DemoSheets/triad_data1.csv")
str(demodata1)
## 'data.frame':    62 obs. of  23 variables:
##  $ ObsID    : int  27 30 2 37 25 5 10 28 4 52 ...
##  $ StartTime: Factor w/ 47 levels "2/3/2016 19:13",..: 10 11 9 43 23 12 47 4 27 38 ...
##  $ EndTime  : Factor w/ 47 levels "2/3/2016 19:13",..: 11 12 10 43 24 13 47 5 28 39 ...
##  $ T1A      : num  9.65 13.07 45.19 42.77 24.14 ...
##  $ T1B      : num  33.2 78.4 42 40.2 17.4 ...
##  $ T1C      : num  57.12 8.53 12.81 16.98 58.42 ...
##  $ T2A      : num  9.6 22.6 46.9 42.9 60.1 ...
##  $ T2B      : num  47.3 58.7 39.4 45.1 23.1 ...
##  $ T2C      : num  43.1 18.8 13.8 12 16.7 ...
##  $ T3A      : num  24.18 16.48 10.8 17.39 9.93 ...
##  $ T3B      : num  54.7 43.8 81.5 70.5 85.4 ...
##  $ T3C      : num  21.1 39.77 7.73 12.11 4.65 ...
##  $ D1X      : num  0.317 0.289 0.247 0.202 0.488 ...
##  $ D1Y      : num  0.683 0.711 0.753 0.798 0.512 ...
##  $ D2X      : num  0.372 0.546 0.82 0.233 0.909 ...
##  $ D2Y      : num  0.6278 0.4543 0.1798 0.7666 0.0915 ...
##  $ D3X      : num  0.665 0.558 0.587 0.699 0.27 ...
##  $ D3Y      : num  0.335 0.442 0.413 0.301 0.73 ...
##  $ F1       : Factor w/ 5 levels "Jupiter","Mars",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ F2       : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 1 1 1 1 1 ...
##  $ F3       : Factor w/ 2 levels "A","B": 1 2 2 2 2 1 1 2 2 2 ...
##  $ L1XRight : num  0.306 0.351 0.727 0.301 0.724 ...
##  $ L1YTop   : num  0.303 0.5 0.645 0.274 0.772 ...

Remember that you can view individual components of your data, such as a single column or the first few rows, by typing commands at the console.
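A few base R commands that serve this purpose are sketched here. So that the example runs on its own, it uses a small stand-in data frame (demo_small, a hypothetical name) holding the first three rows of four columns, with values copied from the str() output above. With your real data you would type demodata1 in place of demo_small.

```r
# Stand-in for demodata1: first three rows of four columns,
# values copied from the str() output above
demo_small <- data.frame(
  ObsID = c(27L, 30L, 2L),
  T1A   = c(9.65, 13.07, 45.19),
  T1B   = c(33.2, 78.4, 42.0),
  T1C   = c(57.12, 8.53, 12.81)
)

names(demo_small)        # column names
head(demo_small, 2)      # first two rows
demo_small$T1A           # one column, selected by name
demo_small[, 2:4]        # columns 2 to 4, selected by position
summary(demo_small$T1B)  # quick numeric summary of one column

# Each row of a triad should total about 100
t1_sums <- rowSums(demo_small[, c("T1A", "T1B", "T1C")])
t1_sums
```

The row totals are a handy sanity check before any compositional analysis: a row that is far from 100 usually points to a data entry or import problem.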

Are there groups in the triad data?

Visually, data can appear dispersed or clustered within the space of the triad graph. The package compositions provides the acomp function, which closes the data and marks them as compositions to be analysed in Aitchison (log-ratio) geometry. The function dist calculates point-to-point distance matrices (once compositions is loaded, its own version of dist is called, which works on the log-ratio coordinates of acomp data), and the function hclust from the package stats generates cluster dendrograms to explore data structure.
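To give a feel for what a distance in log-ratio geometry means, here is a minimal sketch in base R only. It computes the centred log-ratio (clr) of a few toy compositions by hand and then takes ordinary Euclidean distances between the results. The values in parts are hypothetical, and the sketch illustrates the idea rather than reproducing the internals of the compositions package.

```r
# Toy three-part compositions (hypothetical values; each row sums to 100)
parts <- rbind(
  c(10, 30, 60),
  c(45, 42, 13),
  c(25, 15, 60)
)
colnames(parts) <- c("A", "B", "C")

# Close each row so it sums to 1
closed <- parts / rowSums(parts)

# Centred log-ratio: log of each part minus the row mean of the logs
clr <- log(closed) - rowMeans(log(closed))

# Euclidean distance between clr rows (the Aitchison distance)
aitchison_dist <- stats::dist(clr, method = "euclidean")
aitchison_dist

# Same calculation on the unclosed data, to show that only the
# ratios between parts matter, not the total
clr_raw <- log(parts) - rowMeans(log(parts))
raw_dist <- stats::dist(clr_raw, method = "euclidean")
```

Because the clr subtracts the row mean of the logs, rescaling a row (percentages versus proportions) leaves the distances unchanged.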

Chapter 6.2 in Analyzing Compositional Data with R provides guidance for identifying natural groups within compositional data. A cluster dendrogram is useful to visualize the similarities and differences among samples. Clusters can be defined at any level, from each individual point to the entire set of points, so there is no single correct answer. Break points can be selected based on past research (known important thresholds) or simply as the division representing the greatest dissimilarity between clusters (the longest vertical distance between successive breaks). In the example below, the dendrogram is cut into four clusters at a height of 5.

library(compositions)
library(ggtern)

#Mark the triad data as compositions (Aitchison geometry) with acomp
t1_acomp <- acomp(demodata1[,4:6], parts=c("T1A", "T1B", "T1C"))
#Calculate the distance matrix
t1_dist <- dist(t1_acomp, method="euclidean")
#Define clusters based on Ward's method (note: you need "ward.D2" NOT "ward.D")
t1_clust <- hclust(t1_dist, method="ward.D2")
#Plot the cluster dendrogram
plot(t1_clust)

#Choose a height to cut the dendrogram and define clusters
#You can select the height interactively with the "locator" function or assign a value directly
#t1_h <- locator(1)$y
t1_h <- 5
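If you would rather pick the cut from the greatest dissimilarity between clusters than by eye, one possible approach is to look for the biggest jump between consecutive merge heights. The sketch below uses base R only and made-up data (toy, three tight groups of five points) in place of the triad data. The names toy, toy_clust, and cut_h are hypothetical.

```r
# Toy data standing in for transformed triads (hypothetical values):
# three well-separated groups of five points each
set.seed(1)
toy <- rbind(
  matrix(rnorm(10, mean = 0, sd = 0.2), ncol = 2),
  matrix(rnorm(10, mean = 3, sd = 0.2), ncol = 2),
  matrix(rnorm(10, mean = 6, sd = 0.2), ncol = 2)
)
toy_clust <- hclust(stats::dist(toy), method = "ward.D2")

# Merge heights are stored in increasing order; a big jump between
# consecutive heights marks a natural place to cut
heights <- toy_clust$height
gaps <- diff(heights)
biggest <- which.max(gaps)

# Cut halfway up the biggest jump
cut_h <- mean(heights[c(biggest, biggest + 1)])
toy_ids <- cutree(toy_clust, h = cut_h)
table(toy_ids)
```

This is only a rule of thumb. With real data the biggest jump is often the very last merge, which gives just two clusters, so compare the suggested height against the dendrogram and against any thresholds known from past research.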

After selecting a height to define clusters on the dendrogram, the cluster IDs can be added to the triad data. Then the triad can be plotted with the data colored according to cluster ID using R's default color scheme.

#Cut the dendrogram and assign cluster ids
t1_clustids <- cutree(t1_clust, h=t1_h)
#Convert the acomp data object to a dataframe
t1_clusters<-as.data.frame(t1_acomp)
#Add the vector of cluster IDs to the dataframe as a new column
t1_clusters<-cbind(t1_clusters,as.vector(t1_clustids))
#Name the new data column "T1Cluster"
names(t1_clusters)[4]<-"T1Cluster"

#Plot the triad data with ggtern and color the points based on their cluster id, T1Cluster
ggtern(data=t1_clusters,aes(x=T1A, y=T1B, z=T1C))+
  geom_point(aes(colour=as.factor(T1Cluster)))+
  ggtitle("Clusters in my Triad")+
  xlab("A")+
  ylab("B")+
  zlab("C")+
  guides(color = "none")
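Before plotting, it can help to check how many points landed in each cluster. The sketch below also shows a one-step alternative to the cbind and names lines: assigning the IDs directly to a new, named factor column. It runs on a toy data frame (toy_clusters) with hypothetical values and hypothetical cluster IDs (ids) shaped like the output of cutree.

```r
# Hypothetical cluster IDs, in the form cutree() returns (one per row)
ids <- c(1, 1, 2, 3, 2, 4, 1, 3)

# Toy stand-in for the three triad columns (hypothetical values)
toy_clusters <- data.frame(
  T1A = c(10, 12, 45, 70, 40, 20, 11, 65),
  T1B = c(30, 28, 42, 20, 45, 20, 33, 25),
  T1C = c(60, 60, 13, 10, 15, 60, 56, 10)
)

# Add the IDs as a named factor column in one step
toy_clusters$T1Cluster <- factor(ids)

# How many points fall in each cluster?
cluster_sizes <- table(toy_clusters$T1Cluster)
cluster_sizes
```

Storing the IDs as a factor up front means the plotting code no longer needs as.factor() or factor() inside aes().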

Often you will NOT want to use the default color scheme. The package RColorBrewer offers a variety of alternative color palettes. Or you can define the colors directly using manual scales. Both methods are illustrated below, along with some additional code to provide a legend.

#Create a custom color scale
library(RColorBrewer)
#Define a new variable and assign it the values from one of the RColorBrewer palettes
AltPalette <- brewer.pal(9,"Set1")
#Look at the colors
image(1:9,1,as.matrix(1:9),col=AltPalette,xlab="Set 1 (unique)", ylab="",xaxt="n",yaxt="n",bty="n")

ggtern(data=t1_clusters,aes(x=T1A, y=T1B, z=T1C))+
    geom_point(aes(colour=factor(T1Cluster)))+
    ggtitle("My Triad with RColorBrewer")+
    xlab("A Part")+
    ylab("B Part")+
    zlab("C Part")+
    #Apply the custom colors using the RColorBrewer palette and provide a legend name and labels
    scale_colour_manual(values = AltPalette, name="Ward's D Method",
                        labels=c("Cluster 1", "Cluster 2", "Cluster 3", "Cluster 4"))+
    theme_bw() #Change to black and white theme to remove gray background

# Or define colors directly with manual scale controls
ggtern(data=t1_clusters,aes(x=T1A, y=T1B, z=T1C))+
    geom_point(aes(colour=factor(T1Cluster)))+
    #Apply the custom colors using html color codes and provide legend name and labels
    scale_colour_manual(values=c("#173AE8", "#2AD1BB", "#199C31", "#064F10"), name="Ward's D Method",
                        labels=c("Cluster 1", "Cluster 2", "Cluster 3", "Cluster 4"))+
    ggtitle("My Triad with Custom Color Scale")+
    xlab("A Part")+
    ylab("B Part")+
    zlab("C Part")+
    theme_bw() #Change to black and white theme to remove gray background
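When colors are given as a plain vector, they are handed out in the order of the factor levels. If you want a particular cluster to always get a particular color, one option is to name the color vector by cluster ID. The sketch below builds such a vector in base R. The names cluster_levels and cluster_cols are hypothetical, and the colors are the same four HTML codes used above.

```r
# Cluster IDs as they appear in factor(T1Cluster)
cluster_levels <- c("1", "2", "3", "4")

# Name each color by the cluster it belongs to
cluster_cols <- setNames(
  c("#173AE8", "#2AD1BB", "#199C31", "#064F10"),
  cluster_levels
)
cluster_cols

# Look up the color for a single cluster
cluster_cols["3"]

# In the plot, pass the named vector as the values, for example:
#   scale_colour_manual(values = cluster_cols, name = "Ward's D Method")
```

With a named vector, ggplot2 matches colors to clusters by name, so the mapping stays the same even if a cluster is missing from a subset of the data.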

Session and Package Information

I created and tested these examples with:
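To print the same kind of report for your own session, base R offers sessionInfo() and packageVersion(). The sketch below queries the stats package only because it ships with R and is therefore always available. Substitute "ggtern", "ggplot2", or "compositions" to check those packages on your machine.

```r
# R version string for the current session
R.version.string

# Version of a single package ("stats" is used here because it always exists)
stats_version <- packageVersion("stats")
stats_version

# Full report: R version, operating system, and attached packages
info <- sessionInfo()
print(info)
```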

If you need to install older versions of ggplot2 and ggtern, enter these commands (substitute the version numbers you are seeking):
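One way to do this, sketched under the assumption that the remotes package is installed, is its install_version() function. The version strings below are placeholders, not recommendations: replace "0.0.0" with the versions you need. The run_install switch is a hypothetical safety catch so that nothing is installed until you set it to TRUE.

```r
# One-time setup, if remotes is not yet installed:
# install.packages("remotes")

# PLACEHOLDER versions: replace "0.0.0" with the versions you are seeking.
# ggplot2 is listed first because ggtern depends on it.
wanted <- c(ggplot2 = "0.0.0", ggtern = "0.0.0")

# Safety catch: set to TRUE when you are ready to install
run_install <- FALSE

for (pkg in names(wanted)) {
  if (run_install) {
    # upgrade = "never" asks remotes not to update other installed packages
    remotes::install_version(pkg, version = wanted[[pkg]], upgrade = "never")
  } else {
    message("Would install ", pkg, " version ", wanted[[pkg]])
  }
}
```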

After installation, you should restart RStudio.

Contact Information

For more information about this R script and associated data support consulting services, contact Dr. Ashton Drew.
