Welcome to the first ever WildEco Tea Break!

I have prepared some code and data for you to play with hierarchical clustering analysis. Hierarchical clustering is a data analysis technique that groups similar data points into clusters. It is a powerful tool for exploratory data analysis and can be used to identify patterns in your data. I have used it to cluster the species in a community based on their life-history traits, to identify the functional groups in that community. There are many ways to do a clustering analysis, and many flavours of hierarchical clustering. We will focus on Ward’s hierarchical clustering because it is the only one I have used XD. The clustering can be done with the FD package, which is also the package for calculating functional diversity, so I thought this might be useful for some of you.

Note: I am not an expert in this method, so I could be wrong. Please do your own research before using it.

A little background on hierarchical clustering

Let’s assume you have six species (data points), each with distinctive traits. To do a hierarchical clustering, you first need to compute a distance matrix between all the data points.

Hierarchical clustering starts by treating each observation as a separate cluster. Then, it repeatedly executes the following two steps: (1) identify the two clusters that are closest together, and (2) merge the two most similar clusters. This iterative process continues until all the clusters are merged together.

At the end, the main output of hierarchical clustering is a tree-like diagram called a dendrogram, which shows how far your data points are from each other.

Then, you can cut the dendrogram at a certain height (based on how many clusters you need) to get the clusters.
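The three steps above (distance matrix, repeated merging, cutting the tree) can be sketched on a toy data set. The six "species" and their two traits below are made up for illustration, and Euclidean distance is used only because every toy trait is numeric.

```r
# Toy example: six made-up "species" described by two numeric traits
traits <- data.frame(
  trait_a = c(1.0, 1.2, 5.0, 5.3, 9.0, 9.4),
  trait_b = c(2.0, 2.1, 6.0, 6.2, 1.0, 1.3),
  row.names = paste0("sp", 1:6)
)

# Step 1: pairwise distance matrix (Euclidean, because all traits are numeric)
d_toy <- dist(traits)

# Step 2: agglomerative clustering; each step merges the two closest clusters
hc_toy <- hclust(d_toy, method = "ward.D2")

# hc_toy$merge records which clusters were joined at each of the n - 1 steps,
# hc_toy$height records the dissimilarity at which each merge happened
print(hc_toy$merge)
print(round(hc_toy$height, 2))

# Step 3: cut the tree into three clusters
groups_toy <- cutree(hc_toy, k = 3)
print(groups_toy)
```

Negative numbers in `hc_toy$merge` refer to single observations and positive numbers to clusters formed at an earlier step, so the matrix reads as a log of the merging process.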

The million-dollar question: how many clusters should I have?

For other clustering methods like K-means, you need to specify the number of clusters in advance. With hierarchical clustering, you can instead cut the dendrogram at different heights to get different numbers of clusters. So, how do you decide how many clusters you should have? The correct answer to this question is: it depends.

But there are several methods to help you decide. One of the most common is the elbow method, a graphical way to choose the number of clusters. You plot the within-cluster sum of squares (WCSS) against the number of clusters and look for the “elbow” in the plot, the point where the decrease in WCSS slows down. That point is a reasonable choice for the number of clusters. A related approach plots the average silhouette score against the number of clusters; there you usually pick the number of clusters with the highest score rather than an elbow.

The within-cluster sum of squares (WCSS) is the sum of the squared distances between each member of a cluster and the cluster centroid (average), added up over all clusters. The silhouette score measures how similar an object is to its own cluster (cohesion) compared with other clusters (separation); it ranges from -1 to 1, and higher values mean better-separated clusters.
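To make the two definitions concrete, here is a minimal sketch that computes both for one fixed grouping. The coordinates and group labels are made up, and `silhouette()` comes from the cluster package.

```r
library(cluster)  # silhouette()

# Made-up data: two numeric traits for six observations, already grouped
x <- matrix(c(1.0, 2.0,
              1.2, 2.1,
              5.0, 6.0,
              5.3, 6.2,
              9.0, 1.0,
              9.4, 1.3), ncol = 2, byrow = TRUE)
groups <- c(1L, 1L, 2L, 2L, 3L, 3L)

# WCSS: squared distance of every point to its own cluster centroid, summed
wcss <- sum(sapply(unique(groups), function(g) {
  members  <- x[groups == g, , drop = FALSE]
  centroid <- colMeans(members)
  sum(sweep(members, 2, centroid)^2)
}))
print(wcss)

# Silhouette width per observation: (b - a) / max(a, b), where a is the mean
# distance to its own cluster and b the mean distance to the nearest other one
sil <- silhouette(groups, dist(x))
print(mean(sil[, "sil_width"]))
```

Because the toy clusters are tight and far apart, the WCSS is small and the mean silhouette width is close to 1.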

There are also many ways for hierarchical clustering to calculate the distance between clusters. This is called linkage. The most common methods are single linkage, complete linkage, average linkage, and Ward’s method. Ward’s method tends to give the most balanced clusters: at each step it merges the two clusters whose merger gives the smallest increase in total within-cluster variance. So we will be using Ward’s method in this tutorial.
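As a small worked example of what the linkage rules measure, the sketch below takes two made-up clusters on a single trait axis and computes the between-cluster distance under each rule. The Ward line uses the sum-of-squares form of the criterion; `hclust()` reports its merge heights on a different scale, so treat this as an illustration of the idea rather than a reproduction of its output.

```r
# Two made-up clusters on a single trait axis
cluster_a <- c(1, 2, 3)
cluster_b <- c(6, 8)

# All pairwise distances between a member of A and a member of B
pairwise <- abs(outer(cluster_a, cluster_b, "-"))

single_link   <- min(pairwise)   # distance between the closest pair
complete_link <- max(pairwise)   # distance between the farthest pair
average_link  <- mean(pairwise)  # mean over all pairs

# Ward: increase in within-cluster sum of squares if A and B were merged
ss <- function(v) sum((v - mean(v))^2)
ward_increase <- ss(c(cluster_a, cluster_b)) - ss(cluster_a) - ss(cluster_b)

print(c(single = single_link, complete = complete_link,
        average = average_link, ward = ward_increase))
```

Single linkage only looks at the nearest neighbours, which is why it tends to chain clusters together, while Ward penalises merges that would make a cluster much more spread out.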

Note: Most of the code was adapted from this paper: Darling, E. S., Alvarez‐Filip, L., Oliver, T. A., McClanahan, T. R., & Côté, I. M. (2012). Evaluating life‐history strategies of reef corals from species traits. Ecology Letters, 15(12), 1378-1386.

Setting up the environment

Okay, let’s start by loading the required packages and the data. I am running R version 4.4.1 (2024-06-14).

library(dplyr)
library(FD)
library(cluster)
library(factoextra)
library(dendextend)
library(ggplot2)
library(ggrepel)
library(vegan) # provides adonis2(), used later; loaded explicitly for clarity

The data is a list of species and their life-history traits. The file, species_traits.csv, can be downloaded here; make sure the file name in read.csv() below matches the name you saved it under. Let’s load the data and take a quick look.

# Load the data
species_traits <- read.csv("trait_data.csv")
head(species_traits)
##                 species body_mass      activity_pattern litter_size hab_breadth
## 1   Cabassous centralis      3.16             Nocturnal         1.0           7
## 2        Cuniculus paca      8.28             Nocturnal         1.5           1
## 3   Dasyprocta punctata      3.60               Diurnal         1.9           5
## 4  Dasypus novemcinctus      4.20 Nocturnal/Crepuscular         4.0          16
## 5 Didelphis marsupialis      1.09 Nocturnal/Crepuscular         5.1           4
## 6          Eira barbara      3.91               Diurnal         2.0           5
##     diet_PC1 gen_length     habitat         social
## 1  7.7576388  1859.1198 terrestrial Solitary/pairs
## 2 -3.4901222  2097.4631 terrestrial Solitary/pairs
## 3 -2.6445504  1683.3608 terrestrial Solitary/pairs
## 4  7.7576388  1825.0000 terrestrial Solitary/pairs
## 5  0.4515587   494.7575  scansorial Solitary/pairs
## 6  0.4682619  2686.6021  scansorial     Coalitions

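We have got one row per species and eight traits: body mass (body_mass, in kg), activity pattern (activity_pattern), litter size (litter_size), habitat breadth (hab_breadth), diet (diet_PC1), generation length (gen_length, in days), habitat type (habitat), and social structure (social).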

To simplify the analysis, we will convert the categorical data into 0/1 and inspect the distributions of the continuous traits.

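# Convert categorical data into 0/1
# Labels are case-sensitive: the data use "Nocturnal" (capital N); a lowercase
# "nocturnal" would leave those species coded 0, which is what the str() output below shows
species_traits <- species_traits %>%
  mutate(activity_pattern = ifelse(activity_pattern %in% c("Nocturnal", "Nocturnal/Crepuscular"), 1, 0),
         habitat = ifelse(habitat == "terrestrial", 1, 0),
         social = ifelse(social == "Solitary/pairs", 1, 0))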
str(species_traits)
## 'data.frame':    36 obs. of  9 variables:
##  $ species         : chr  "Cabassous centralis" "Cuniculus paca" "Dasyprocta punctata" "Dasypus novemcinctus" ...
##  $ body_mass       : num  3.16 8.28 3.6 4.2 1.09 3.91 11.9 3.25 22.8 4.03 ...
##  $ activity_pattern: num  0 0 0 1 1 0 1 0 0 0 ...
##  $ litter_size     : num  1 1.5 1.9 4 5.1 2 1.63 1.5 1.2 4.17 ...
##  $ hab_breadth     : int  7 1 5 16 4 5 7 5 4 2 ...
##  $ diet_PC1        : num  7.758 -3.49 -2.645 7.758 0.452 ...
##  $ gen_length      : num  1859 2097 1683 1825 495 ...
##  $ habitat         : num  1 1 1 1 0 0 0 0 1 0 ...
##  $ social          : num  1 1 1 1 1 0 1 1 1 0 ...
# Histogram plot of the traits
par(mfrow=c(2,3))
hist(species_traits$body_mass, main = "Body mass", xlab = "Body mass (kg)", col = "lightblue")
hist(species_traits$litter_size, main = "Litter size", xlab = "Litter size", col = "lightblue")
hist(species_traits$hab_breadth, main = "Habitat breadth", xlab = "Habitat breadth", col = "lightblue")
hist(species_traits$diet_PC1, main = "Diet PC1", xlab = "Diet PC1", col = "lightblue")
hist(species_traits$gen_length, main = "Generation length", xlab = "Generation length (days)", col = "lightblue")

Body mass, litter size, and habitat breadth are right-skewed, so let’s log-transform these variables and look again. You might also need to check the correlations between the variables and remove highly correlated ones.

# Log-transform the variables
species_traits$body_mass <- log(species_traits$body_mass)
species_traits$litter_size <- log(species_traits$litter_size)
species_traits$hab_breadth <- log(species_traits$hab_breadth)

par(mfrow=c(1,3))
hist(species_traits$body_mass, main = "Log body mass", xlab = "Log body mass (kg)", col = "lightblue")
hist(species_traits$litter_size, main = "Log litter size", xlab = "Log litter size", col = "lightblue")
hist(species_traits$hab_breadth, main = "Log habitat breadth", xlab = "Log habitat breadth", col = "lightblue")
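The correlation check mentioned above could look something like the sketch below. It runs on made-up data so that it is self-contained; the column names only mimic the real ones, and the 0.7 cut-off is an assumption, not a rule from the original analysis.

```r
set.seed(1)
n <- 30
body_mass  <- rnorm(n)
gen_length <- body_mass + rnorm(n, sd = 0.2)  # built to track body_mass closely
litter     <- rnorm(n)
toy_traits <- data.frame(body_mass, gen_length, litter)

# Pairwise correlation matrix (Spearman is rank-based, so less affected by skew)
cor_mat <- cor(toy_traits, method = "spearman")

# List trait pairs whose absolute correlation exceeds the chosen threshold
threshold <- 0.7  # assumed cut-off, adjust to your own question
high <- which(abs(cor_mat) > threshold & upper.tri(cor_mat), arr.ind = TRUE)
flagged <- data.frame(
  trait_1 = rownames(cor_mat)[high[, "row"]],
  trait_2 = colnames(cor_mat)[high[, "col"]],
  rho     = cor_mat[high]
)
print(flagged)
```

For the real data you would pass the continuous trait columns of `species_traits` to `cor()` and then decide, trait pair by trait pair, which one to keep.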

Distance matrix

Okay, looks good. Now the first step of the analysis is to calculate the distance matrix between the species (based on their traits). The most basic method is to calculate the Euclidean distance using the dist() function. But as we have categorical/factor/binary variables (and NAs), we can use the Gower dissimilarity index to compare the multivariate trait distance between species, because it allows for mixed types of data and missing values and can weight individual traits differently.

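Remember that Euclidean distance is sensitive to the scale of the variables, so standardize the variables first. Gower dissimilarity rescales each numeric trait by its range, so standardizing does no harm there, and it keeps the traits comparable for any Euclidean-based step.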

# Standardize the variables
species_traits$body_mass <- as.numeric(scale(species_traits$body_mass))
species_traits$litter_size <- as.numeric(scale(species_traits$litter_size))
species_traits$gen_length <- as.numeric(scale(species_traits$gen_length))
species_traits$hab_breadth <- as.numeric(scale(species_traits$hab_breadth))
species_traits$diet_PC1 <- as.numeric(scale(species_traits$diet_PC1))

# Assign the species name as the row names
rownames(species_traits) <- species_traits$species
species_traits <- species_traits[,-1]

# Calculate the distance matrix
d <- FD::gowdis(species_traits)
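To show what the Gower dissimilarity does with numeric traits, here is a hand-rolled sketch of the simplest case (numeric or 0/1 traits, no missing values, equal weights). The helper `gower_numeric()` and the toy values are made up for illustration; for real analyses use `FD::gowdis()` as above.

```r
# Gower dissimilarity for numeric traits without missing values:
# mean over traits of |x_i - x_j| / range(trait)
gower_numeric <- function(df) {
  scaled <- sapply(df, function(v) (v - min(v)) / (max(v) - min(v)))
  # Manhattan distance on the range-scaled traits, averaged over the traits
  dist(scaled, method = "manhattan") / ncol(df)
}

toy <- data.frame(
  body_mass = c(3.2, 8.3, 3.6, 4.2),   # made-up values
  nocturnal = c(1, 1, 0, 1)            # binary trait coded 0/1
)

d_raw    <- gower_numeric(toy)
d_scaled <- gower_numeric(transform(toy, body_mass = as.numeric(scale(body_mass))))

print(round(d_raw, 3))
# z-scoring is a linear rescaling, so the range-scaled values do not change
print(all.equal(as.numeric(d_raw), as.numeric(d_scaled)))
```

Every trait contributes a value between 0 and 1, which is why Gower can mix traits measured in very different units without one of them dominating.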

The hierarchical clustering

Now we can do the hierarchical clustering using the hclust() function, using Ward’s linkage to calculate the distance between the clusters.

# Hierarchical clustering

hc <- hclust(d, method = "ward.D2")
hc
## 
## Call:
## hclust(d = d, method = "ward.D2")
## 
## Cluster method   : ward.D2 
## Number of objects: 36
plot(hc, cex = 0.6, hang = -1)

You can also try different linkage methods (use method = "single", "complete", or "average") and see what the dendrogram looks like.
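If you want a number to go with the visual comparison, one common option (not part of the original analysis) is the cophenetic correlation, which measures how well the heights in the tree reproduce the input distances. The sketch below uses made-up data and only base R.

```r
set.seed(42)
# Made-up trait table: 12 observations, 3 numeric traits
toy   <- matrix(rnorm(36), ncol = 3)
d_toy <- dist(toy)

methods <- c("single", "complete", "average", "ward.D2")

# Cophenetic correlation: correlation between the original distances and the
# heights at which each pair of observations is first joined in the tree
coph <- sapply(methods, function(m) {
  hc_m <- hclust(d_toy, method = m)
  cor(as.numeric(d_toy), as.numeric(cophenetic(hc_m)))
})
print(round(coph, 3))
```

A higher value means the dendrogram distorts the distance matrix less, but it says nothing about whether the resulting groups are ecologically meaningful, so treat it as one input among several.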

How many clusters should we have?

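# Elbow method
# within-cluster sum of squares (WCSS)
# Note: fviz_nbclust() is given the trait table here, so by default it works
# from Euclidean distances on these columns, not from the Gower matrix d
fviz_nbclust(species_traits, FUN = hcut, method = "wss") + theme_minimal()

# Silhouette score
fviz_nbclust(species_traits, FUN = hcut, method = "silhouette") + theme_minimal()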
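If you would rather compute the silhouette curve from a distance matrix you have already built (for example a Gower matrix), a minimal sketch is to loop over the candidate numbers of clusters yourself. The data below are made up, with three deliberately well-separated groups, and `silhouette()` is from the cluster package.

```r
library(cluster)  # silhouette()

set.seed(7)
# Made-up data: three well-separated groups of 10 points on two traits
toy <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 6, sd = 0.3), ncol = 2)
)
d_toy  <- dist(toy)               # any dist object works here
hc_toy <- hclust(d_toy, method = "ward.D2")

# Mean silhouette width for each candidate number of clusters
ks <- 2:8
avg_sil <- sapply(ks, function(k) {
  sil <- silhouette(cutree(hc_toy, k = k), d_toy)
  mean(sil[, "sil_width"])
})
names(avg_sil) <- ks

print(round(avg_sil, 3))
best_k <- ks[which.max(avg_sil)]
print(best_k)
```

With real trait data the curve is rarely this clear-cut, so it is worth looking at the whole curve rather than only the maximum.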

In Darling et al. (2012), they used a non-parametric multivariate analysis of variance (MANOVA, often called PERMANOVA) to evaluate different groupings of species by comparing the coefficients of determination (R2). The grouping with the highest R2 value was selected as the best model. We can try it as well.

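# Non-parametric MANOVA (PERMANOVA) using the adonis2() function from the vegan package

parnames <- c("no_clusters","adonis_R2")
npar <- length(parnames)
M0 <- matrix(nrow=9,ncol=npar)   
colnames(M0) <- parnames
for(i in (2:10)){
  # cutree() returns integer labels; convert them to a factor so that adonis2()
  # treats the clusters as categories rather than as a numeric covariate
  # (the output shown below was produced with the integer labels, so expect different R2 values)
  groups <- factor(cutree(hc, i))
  cluster_name <- paste("cluster", i, sep = "")
  adonis_model <- adonis2(d ~ groups)
  R2 <- adonis_model$R2[1]   # R2 of the grouping
  j <- i - 1
  M0[j,1] <- cluster_name
  M0[j,2] <- R2
}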

M0
##       no_clusters adonis_R2           
##  [1,] "cluster2"  "0.350364083459131" 
##  [2,] "cluster3"  "0.315845173513951" 
##  [3,] "cluster4"  "0.18411250016616"  
##  [4,] "cluster5"  "0.0758295452832858"
##  [5,] "cluster6"  "0.102974881823282" 
##  [6,] "cluster7"  "0.104848705213415" 
##  [7,] "cluster8"  "0.096909518140697" 
##  [8,] "cluster9"  "0.0910638057802914"
##  [9,] "cluster10" "0.0936239200252919"
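To see where an R2 like this comes from, the sketch below computes it directly from a distance matrix as 1 - SS_within / SS_total, where each sum of squares is built from squared pairwise distances. This is my understanding of the partition that PERMANOVA reports for a single grouping factor; the helper `r2_from_dist()` and the data are made up for illustration and are not part of vegan.

```r
# R2 of a grouping computed directly from a distance matrix:
# R2 = 1 - SS_within / SS_total, with sums of squared pairwise distances
r2_from_dist <- function(d, groups) {
  d2 <- as.matrix(d)^2
  n  <- nrow(d2)
  ss_total <- sum(d2[upper.tri(d2)]) / n
  ss_within <- sum(sapply(unique(groups), function(g) {
    idx <- which(groups == g)
    sub <- d2[idx, idx, drop = FALSE]
    sum(sub[upper.tri(sub)]) / length(idx)
  }))
  1 - ss_within / ss_total
}

set.seed(3)
toy    <- matrix(rnorm(40), ncol = 2)   # made-up traits for 20 observations
d_toy  <- dist(toy)
hc_toy <- hclust(d_toy, method = "ward.D2")

r2 <- sapply(2:6, function(k) r2_from_dist(d_toy, cutree(hc_toy, k = k)))
names(r2) <- paste0("k", 2:6)
print(round(r2, 3))
```

In this toy run the R2 can only go up as the tree is cut into more groups, because splitting a cluster never increases the within-group sum of squares. That is worth keeping in mind when using "highest R2" as a selection rule, since it will tend to favour more clusters.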

Now we can cut the tree

There is no “correct” number of clusters. It all depends on the question you are asking. We can now cut the dendrogram at different heights to get different numbers of clusters.

plot(hc, cex = 0.6, hang = -1)
rect.hclust(hc, k = 3, border = 2:5) # k is the number of clusters
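Cutting "by number of clusters" and cutting "at a height" are two views of the same operation. The sketch below, on made-up data, extracts the group memberships with `cutree()` both ways and checks that they agree.

```r
set.seed(11)
toy    <- matrix(rnorm(30), ncol = 2)   # made-up traits for 15 observations
hc_toy <- hclust(dist(toy), method = "ward.D2")

# Cut by number of clusters
by_k <- cutree(hc_toy, k = 3)

# Cut by height: any height between the merge that leaves three clusters and
# the merge that leaves two clusters gives the same three groups
n     <- length(hc_toy$height)          # number of merges (observations - 1)
h_cut <- mean(hc_toy$height[c(n - 2, n - 1)])
by_h  <- cutree(hc_toy, h = h_cut)

print(table(by_k))
print(identical(by_k, by_h))
```

The vector returned by `cutree()` is what you would join back to your species table to label each species with its functional group.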

Now make a better plot of the dendrogram with the cluster colors.

# Better plot
dend <- as.dendrogram(hc)
dend1 <- color_branches(dend, k = 3) # give colors to the branches
dend2 <- color_labels(dend, k = 3) # give colors to the labels
plot(dend1, main = "Colored branches")

plot(dend2, main = "Colored labels")

# give colors to both branches and labels
dend <- set(dend1, "labels_cex", 1.4) %>% set("branches_lwd", 3) # set the size of the labels and branches
dend <- color_labels(dend, k = 3) 

par(mar = c(4,4,4,14),cex=0.6) # set the margin of plot
plot(dend, horiz = TRUE)

The wrapper function in the FD package

The FD package can calculate several functional diversity metrics of a community at once. Some metrics (e.g. functional group richness) are, to some extent, a special case of hierarchical clustering, because they also need the distances between species in n-dimensional trait space. So let’s try this method.

Note: dbFD() will pause and ask for input in the console. Type “g” to choose by number of groups, then type the number of groups you want, so that the function can finish executing.

There are also a bunch of other parameters in the dbFD() function, and more information in the output. You can explore them a bit by yourself.

# Hierarchical clustering using the FD package
# When prompted in the console, enter g (cut by number of groups), then the number of groups (3 in the output below)

# corr = "cailliez" is often needed; see details in the help page
# calc.FGR = TRUE is required to return functional group richness
# clust.type = "ward.D2" is the linkage method
# print.pco = TRUE also returns the PCoA eigenvalues and axes for your traits
# calc.CWM = FALSE skips the community-weighted means (we only have 1 community)

ex <- FD::dbFD(d, corr = "cailliez", calc.FGR = TRUE, clust.type = "ward.D2", print.pco = TRUE, calc.CWM = FALSE)
ex
## $nbsp
## Community1 
##         36 
## 
## $sing.sp
## Community1 
##         36 
## 
## $FRic
##   Community1 
## 1.419352e-50 
## 
## $qual.FRic
## [1] 1
## 
## $FEve
## Community1 
##  0.9568357 
## 
## $FDiv
## Community1 
##  0.8937894 
## 
## $FDis
## Community1 
##  0.2382625 
## 
## $RaoQ
## Community1 
## 0.06414295 
## 
## $FGR
## Community1 
##          3 
## 
## $spfgr
##     Cabassous centralis          Cuniculus paca     Dasyprocta punctata 
##                       1                       1                       1 
##    Dasypus novemcinctus   Didelphis marsupialis            Eira barbara 
##                       2                       3                       3 
##      Leopardus pardalis        Leopardus wiedii           Mazama temama 
##                       3                       3                       1 
##            Nasua narica  Odocoileus virginianus           Pecari tajacu 
##                       3                       1                       1 
##     Procyon cancrivorus       Puma yagouaroundi Sylvilagus brasiliensis 
##                       3                       3                       1 
##         Tapirus bairdii     Atelocynus microtis     Dasyprocta leporina 
##                       1                       1                       1 
##        Dasypus kappleri        Mazama americana       Mazama nemorivaga 
##                       2                       1                       1 
## Myrmecophaga tridactyla             Nasua nasua           Panthera onca 
##                       1                       3                       3 
##      Priodontes maximus           Puma concolor      Tapirus terrestris 
##                       1                       3                       1 
##          Tayassu pecari   Dasyprocta fuliginosa        Myoprocta pratti 
##                       1                       1                       1 
##     Sciurus igniventris       Sciurus spadiceus      Leopardus tigrinus 
##                       3                       3                       3 
##       Myoprocta acouchy       Tamandua mexicana   Tamandua tetradactyla 
##                       1                       2                       2 
## 
## $gr.abun
##            group1 group2 group3
## Community1     19      4     13
## 
## $x.values
##  [1] 0.0686607141 0.0546721056 0.0390995662 0.0315568827 0.0183976183
##  [6] 0.0128749469 0.0114649503 0.0078721898 0.0070387615 0.0066047061
## [11] 0.0059486220 0.0044421239 0.0043743650 0.0042304024 0.0041787293
## [16] 0.0037602736 0.0037380802 0.0035491811 0.0034972365 0.0034329629
## [21] 0.0033480894 0.0033176984 0.0032657440 0.0032553132 0.0032230538
## [26] 0.0031541897 0.0029960000 0.0029605646 0.0026706396 0.0024665133
## [31] 0.0017590416 0.0016303226 0.0013315424 0.0006909042
## 
## $x.axes
##                                  A1           A2            A3          A4
## Cabassous centralis      0.25427275 -0.039195530 -0.0619296928 -0.01512962
## Cuniculus paca           0.08117389  0.128600189 -0.2195170236 -0.15736182
## Dasyprocta punctata      0.07587876  0.072287723 -0.1363313853 -0.05332341
## Dasypus novemcinctus     0.32542766 -0.344442032  0.0006023044  0.33390513
## Didelphis marsupialis   -0.18664389 -0.582362568 -0.0667595466  0.14467075
## Eira barbara            -0.42189469  0.070160095  0.1605934945  0.17086455
## Leopardus pardalis      -0.04592023 -0.376432690  0.2869257608  0.01824218
## Leopardus wiedii        -0.20754011 -0.154694490  0.0706525972 -0.17696152
## Mazama temama            0.16997297  0.161308185 -0.0087147871 -0.17600120
## Nasua narica            -0.50260936  0.025322693  0.0825797395  0.24816761
## Odocoileus virginianus   0.03360181  0.472635329  0.2161138453  0.18474939
## Pecari tajacu           -0.01899664  0.398252770  0.1652031700  0.23067960
## Procyon cancrivorus     -0.21175544 -0.235481810  0.0858095351 -0.08855461
## Puma yagouaroundi       -0.26689996 -0.139728861 -0.0552846463 -0.19277599
## Sylvilagus brasiliensis  0.05974635  0.104006927 -0.2520044965 -0.11002139
## Tapirus bairdii          0.30191572  0.265522682  0.2751953114 -0.23460089
## Atelocynus microtis      0.05760103  0.028436881 -0.1168567511 -0.01462438
## Dasyprocta leporina      0.03027599  0.102282793 -0.2678638536 -0.11860227
## Dasypus kappleri         0.26648417 -0.361154218 -0.2802359114  0.26443697
## Mazama americana         0.14935943  0.187472560 -0.1569173471 -0.21594752
## Mazama nemorivaga        0.15151880  0.142188649 -0.1066792046 -0.17507671
## Myrmecophaga tridactyla  0.34209995  0.068718856  0.2455771217 -0.05967434
## Nasua nasua             -0.48244249  0.018284166  0.1075331422  0.24757947
## Panthera onca           -0.17053351 -0.060348069  0.3935210498 -0.19177730
## Priodontes maximus       0.25194671  0.015713018  0.0278989450 -0.05720149
## Puma concolor           -0.20187729 -0.114892859  0.3451994739 -0.11432058
## Tapirus terrestris       0.49176340  0.001537126  0.3039221757 -0.02874446
## Tayassu pecari          -0.02333350  0.369987703  0.2019841753  0.24662259
## Dasyprocta fuliginosa    0.06799021  0.073061941 -0.1224949683 -0.05402914
## Myoprocta pratti        -0.15654054  0.293133799 -0.3264538027  0.23720119
## Sciurus igniventris     -0.31747772 -0.093363666 -0.1821099652 -0.22766915
## Sciurus spadiceus       -0.32101931 -0.094030208 -0.1850875673 -0.23251707
## Leopardus tigrinus      -0.27647699 -0.161855204 -0.0121535104 -0.15319024
## Myoprocta acouchy       -0.14693797  0.320532247 -0.3380751657  0.21583977
## Tamandua mexicana        0.40603164 -0.288485759 -0.0750732337  0.14616929
## Tamandua tetradactyla    0.44183841 -0.272978369  0.0012310177  0.15897660
##                                   A5            A6            A7           A8
## Cabassous centralis      0.321760714 -0.0835388797  0.0799821735 -0.076289676
## Cuniculus paca          -0.044775289  0.1408549543  0.0156368891  0.069913995
## Dasyprocta punctata      0.029809545 -0.1877566820 -0.0710405071  0.001766896
## Dasypus novemcinctus     0.037532932 -0.0231653700 -0.2028394417 -0.052469957
## Didelphis marsupialis   -0.260551124 -0.1589364283 -0.0838637396 -0.006090866
## Eira barbara             0.088085039 -0.1138110385  0.1822984209  0.033082480
## Leopardus pardalis      -0.185722867  0.0380410600  0.1950192315  0.091378835
## Leopardus wiedii         0.106312776 -0.1267371476  0.1143588401  0.092870100
## Mazama temama           -0.032771119 -0.0389287266 -0.0152104452  0.220240430
## Nasua narica            -0.002197978  0.1330453492 -0.0026917291  0.023488487
## Odocoileus virginianus  -0.082974772 -0.0244301561 -0.0553051849  0.135306539
## Pecari tajacu           -0.027641229 -0.0085818367 -0.0005886284 -0.024232458
## Procyon cancrivorus      0.204863942 -0.0001449719 -0.1054292911  0.066426407
## Puma yagouaroundi        0.045548359  0.2044516373  0.0460453014  0.027608049
## Sylvilagus brasiliensis -0.124205850 -0.2601949624 -0.1229683657 -0.047860415
## Tapirus bairdii         -0.119678684 -0.0294449787 -0.0257887906 -0.103958685
## Atelocynus microtis      0.047760583 -0.0763166689 -0.2264467374  0.074068646
## Dasyprocta leporina     -0.029333372  0.0564484417  0.0039500598 -0.053886226
## Dasypus kappleri        -0.062869212  0.3120558543 -0.1235067703 -0.008195299
## Mazama americana        -0.087026079  0.1808960862 -0.0113253608  0.096181439
## Mazama nemorivaga       -0.013631505  0.0670002901  0.0708706328  0.133062902
## Myrmecophaga tridactyla  0.248662566  0.1213732338  0.0223083277 -0.179579428
## Nasua nasua              0.033848251  0.0267106884 -0.0206053472  0.075110866
## Panthera onca           -0.019517495  0.0343455985 -0.0461937166 -0.177018373
## Priodontes maximus       0.272813187  0.0526903634 -0.0407277485  0.059810802
## Puma concolor            0.028099635  0.0553498708 -0.2461805612  0.020494167
## Tapirus terrestris      -0.403192144 -0.0035902601  0.0735693456 -0.035845552
## Tayassu pecari           0.024304063  0.0244630970  0.0005907890 -0.055884746
## Dasyprocta fuliginosa    0.036043077 -0.1754141517 -0.0680851359 -0.002052385
## Myoprocta pratti        -0.018429682  0.0096389239  0.0713803897 -0.095728355
## Sciurus igniventris     -0.081880038  0.0318551578  0.0563109452 -0.125946795
## Sciurus spadiceus       -0.082212906  0.0320235867  0.0579729331 -0.131079493
## Leopardus tigrinus       0.068523415 -0.0907006309  0.0301594733 -0.014122635
## Myoprocta acouchy       -0.060196999 -0.0023471804  0.0720837359 -0.076191951
## Tamandua mexicana        0.075913689 -0.0693124028  0.2058106218  0.063430505
## Tamandua tetradactyla    0.068926571 -0.0478917202  0.1704493911 -0.017808250
##                                    A9          A10          A11          A12
## Cabassous centralis      0.0674823544  0.094535348 -0.059047849 -0.054108344
## Cuniculus paca          -0.0768530394 -0.045353494 -0.073977106 -0.067110049
## Dasyprocta punctata     -0.1125443808 -0.097728716 -0.046244885 -0.016275483
## Dasypus novemcinctus    -0.1148950294  0.091339266 -0.123442548 -0.012968631
## Didelphis marsupialis    0.1203237224  0.013456302  0.076168182  0.003980995
## Eira barbara            -0.0562681121  0.004423401  0.071713704  0.011357602
## Leopardus pardalis      -0.1320438657 -0.083685856  0.011472534  0.002278741
## Leopardus wiedii         0.0071027525  0.017496067 -0.038220048 -0.110907048
## Mazama temama            0.0702258033 -0.009623676  0.031609039  0.107999403
## Nasua narica             0.1566618243 -0.008910168 -0.122653913 -0.018474974
## Odocoileus virginianus  -0.0911699053  0.231844276  0.012382248 -0.071341119
## Pecari tajacu           -0.0746285964 -0.107786496 -0.079132008  0.139761745
## Procyon cancrivorus      0.0129629941 -0.025444653 -0.040044294  0.157515925
## Puma yagouaroundi       -0.0389778677 -0.039654122  0.059352196 -0.121381325
## Sylvilagus brasiliensis  0.1328696480  0.073600675  0.074397614  0.002593397
## Tapirus bairdii          0.1253175535  0.007057251 -0.074197796 -0.054048853
## Atelocynus microtis      0.0256408749 -0.215140047  0.078616878 -0.083719133
## Dasyprocta leporina     -0.1061774703 -0.017846064 -0.114790829 -0.104440026
## Dasypus kappleri         0.0348477225 -0.040902137  0.033453502  0.011123774
## Mazama americana         0.0036176614  0.088399884  0.107702592  0.017008476
## Mazama nemorivaga        0.1333515654 -0.046730065 -0.137546170  0.053595191
## Myrmecophaga tridactyla  0.1167505173 -0.014663292 -0.022048171  0.023414015
## Nasua nasua              0.1310329248  0.024462247 -0.068274796 -0.014361928
## Panthera onca           -0.0098303453 -0.076166701  0.099411539 -0.014593610
## Priodontes maximus      -0.0312281002  0.055862468  0.204037727  0.065683227
## Puma concolor           -0.0955448044  0.137729288 -0.011709550 -0.013003195
## Tapirus terrestris       0.0318132205 -0.002161617 -0.005284483  0.002001522
## Tayassu pecari          -0.0215292903 -0.138169015  0.110039602 -0.021741340
## Dasyprocta fuliginosa   -0.1274417657 -0.094679374 -0.075187363  0.041601871
## Myoprocta pratti         0.0180926022  0.017903477  0.098499824 -0.049198847
## Sciurus igniventris     -0.0616072624  0.070571693 -0.014196422  0.095932149
## Sciurus spadiceus       -0.0677607169  0.072922310 -0.016120661  0.112784916
## Leopardus tigrinus       0.0453287994 -0.008293668  0.005788948 -0.070162749
## Myoprocta acouchy       -0.0005683746  0.032708899  0.021049439  0.044783880
## Tamandua mexicana        0.0315821289  0.001980817  0.006907814  0.015594185
## Tamandua tetradactyla   -0.0459357426  0.036645490  0.019515508 -0.011174359
##                                  A13           A14           A15          A16
## Cabassous centralis      0.007214052 -0.0019481211 -0.0510365837  0.025539949
## Cuniculus paca           0.011214614  0.1170856380  0.0056082178  0.153490326
## Dasyprocta punctata     -0.084758134 -0.0067154895 -0.0036906914 -0.043598841
## Dasypus novemcinctus     0.013182740 -0.0192825369  0.0590892607  0.021861695
## Didelphis marsupialis   -0.028615951 -0.0459527373 -0.0374421690  0.017524392
## Eira barbara            -0.015625884  0.0705785184 -0.0642183894 -0.075476739
## Leopardus pardalis       0.080165645  0.0355447047  0.0538154455  0.039696168
## Leopardus wiedii        -0.101986247 -0.0604578814 -0.0008341314  0.172882589
## Mazama temama            0.008089715 -0.0229008991  0.0915976253  0.013628282
## Nasua narica            -0.066974770 -0.0490827131  0.0307683717  0.048364244
## Odocoileus virginianus   0.003542065  0.0667315414 -0.0209380927 -0.048378237
## Pecari tajacu            0.088009294  0.0267933489  0.0520580612  0.064526067
## Procyon cancrivorus     -0.081933187  0.1633740484 -0.0537848665 -0.038934972
## Puma yagouaroundi        0.041935147  0.1260497981 -0.0001812359 -0.092683789
## Sylvilagus brasiliensis  0.104809008  0.1401448902  0.1125054981  0.044606351
## Tapirus bairdii         -0.092853840  0.0560162816 -0.0606576614 -0.005153009
## Atelocynus microtis      0.065089653  0.0057078656 -0.1510834838  0.018283677
## Dasyprocta leporina     -0.029698389 -0.0403634467  0.1023257260 -0.060138673
## Dasypus kappleri        -0.036496259  0.0305072598  0.0184458975 -0.020126842
## Mazama americana        -0.089398773 -0.0747685492  0.0052755986 -0.024444036
## Mazama nemorivaga        0.131800309 -0.1106857902 -0.1015360716 -0.062135093
## Myrmecophaga tridactyla  0.050525734  0.0351728327  0.0395603885 -0.009847935
## Nasua nasua             -0.020922467  0.0086137054  0.0541432872 -0.002084089
## Panthera onca           -0.060495901 -0.0100818846  0.0339499262 -0.006645440
## Priodontes maximus      -0.038875804 -0.0679157977  0.0768447484  0.065070946
## Puma concolor            0.092807471 -0.0845349327 -0.0998835024 -0.021887435
## Tapirus terrestris      -0.041353866 -0.0009042045 -0.0155160344 -0.026659967
## Tayassu pecari           0.003479422 -0.0935036566  0.0175763215  0.027654513
## Dasyprocta fuliginosa   -0.086290110 -0.0540717484  0.0197857179 -0.067904253
## Myoprocta pratti         0.026208354 -0.0200721531 -0.0623904585  0.024442261
## Sciurus igniventris      0.014044567 -0.0242773225 -0.0402409556  0.028543755
## Sciurus spadiceus        0.019090091 -0.0315004776 -0.0475092232  0.042789713
## Leopardus tigrinus       0.124535868 -0.0728373410  0.1517324468 -0.117901283
## Myoprocta acouchy       -0.034067595  0.0058049910 -0.0318431970 -0.039440652
## Tamandua mexicana       -0.076256124  0.0084513339 -0.0056749142 -0.095764640
## Tamandua tetradactyla    0.100859555 -0.0047190750 -0.0766208766  0.050300998
##                                   A17          A18          A19           A20
## Cabassous centralis     -0.0374385926  0.017985186 -0.072796758 -0.0647390024
## Cuniculus paca           0.0290780745  0.062456760  0.018399758  0.0944900331
## Dasyprocta punctata      0.0533371593 -0.155778366 -0.069960795  0.0470800407
## Dasypus novemcinctus    -0.0005190699  0.013489430  0.020675808  0.0075800951
## Didelphis marsupialis   -0.0027912089 -0.042495261 -0.032520306  0.0041888188
## Eira barbara            -0.0124263280 -0.049094343  0.029611200 -0.0103572449
## Leopardus pardalis       0.0092680863  0.017580647  0.043433200 -0.0212740678
## Leopardus wiedii        -0.0142934748 -0.060512766  0.005576638  0.0276455464
## Mazama temama           -0.1893543933 -0.043520295 -0.037769944 -0.0802229253
## Nasua narica             0.0948876933 -0.081211513  0.071774116 -0.1292803266
## Odocoileus virginianus   0.0312429127  0.005648724  0.032722909 -0.0056802642
## Pecari tajacu            0.0085789457 -0.077620937 -0.061881674  0.0277051874
## Procyon cancrivorus      0.0220120128  0.092456621  0.043293585 -0.0929570072
## Puma yagouaroundi        0.0185799701 -0.074538625 -0.075279889 -0.0117862896
## Sylvilagus brasiliensis  0.0671198426 -0.025078598  0.049341966 -0.0261142064
## Tapirus bairdii          0.0205597945  0.065693941 -0.020174580 -0.0094564576
## Atelocynus microtis     -0.0696529144  0.016530654  0.036479195  0.0061465942
## Dasyprocta leporina     -0.0665676733  0.008894238  0.044241330 -0.1032882692
## Dasypus kappleri         0.0054728285 -0.017184642 -0.021174873  0.0023752624
## Mazama americana        -0.0192033202 -0.002406550 -0.036222043 -0.0247959860
## Mazama nemorivaga        0.1441921426  0.001904591  0.013610153  0.0307135691
## Myrmecophaga tridactyla -0.0919546189 -0.107490511  0.065572395  0.0663218686
## Nasua nasua             -0.0835973384  0.129309848 -0.111033338  0.1497747657
## Panthera onca            0.0514733963  0.019509439 -0.058802546  0.0023422972
## Priodontes maximus       0.1422879962  0.019564472  0.007958800  0.0463878429
## Puma concolor           -0.0412362440 -0.053014343  0.020926111  0.0326884618
## Tapirus terrestris      -0.0038624024  0.009818076  0.006694461  0.0118982249
## Tayassu pecari          -0.0173593217  0.075993447  0.043266886 -0.0588829176
## Dasyprocta fuliginosa    0.0261078569  0.088518876  0.009598807  0.0028428817
## Myoprocta pratti        -0.0366072408  0.033308521  0.105734006  0.0009031674
## Sciurus igniventris     -0.0269829167  0.007709790  0.013130300  0.0042129844
## Sciurus spadiceus       -0.0429455966  0.022914124  0.047793568  0.0305229653
## Leopardus tigrinus       0.0287440798  0.074430096 -0.007040091  0.0188598381
## Myoprocta acouchy        0.0141895489 -0.029038055 -0.119515378  0.0283501820
## Tamandua mexicana       -0.0258570903 -0.017029863  0.126419791  0.1103498402
## Tamandua tetradactyla    0.0155174041  0.052297186 -0.132082770 -0.1145455027
##                                  A21          A22          A23           A24
## Cabassous centralis     -0.034220463 -0.028838251  0.046111844  0.0159120499
## Cuniculus paca          -0.047746899 -0.015432257 -0.049825542 -0.1114700944
## Dasyprocta punctata      0.152437595  0.044070106  0.096309336 -0.0989259371
## Dasypus novemcinctus    -0.011556041 -0.004102748  0.017109316 -0.0082922857
## Didelphis marsupialis   -0.023332220 -0.005305673 -0.027850648 -0.0061835102
## Eira barbara            -0.037895696 -0.013514108 -0.034528505  0.0125215044
## Leopardus pardalis       0.034588132  0.008780282  0.026622142  0.0300107420
## Leopardus wiedii        -0.008586119  0.004869026 -0.020488958  0.0292330668
## Mazama temama           -0.011531094 -0.010732888 -0.019423591 -0.0725035369
## Nasua narica            -0.046270222 -0.034186087 -0.003979322 -0.0205687121
## Odocoileus virginianus   0.019960773 -0.003089210 -0.033419789 -0.0258313284
## Pecari tajacu           -0.065051624 -0.010494174  0.074041113  0.1437377569
## Procyon cancrivorus      0.067919962  0.032550549  0.011493341 -0.0092267491
## Puma yagouaroundi       -0.051141840 -0.028938494  0.036553052 -0.0010158217
## Sylvilagus brasiliensis  0.006332384  0.001180250  0.005562022  0.0074955868
## Tapirus bairdii          0.007010295  0.003129666  0.024257618  0.0593211077
## Atelocynus microtis     -0.008521271 -0.007227047 -0.051598791  0.0987245083
## Dasyprocta leporina      0.096637987  0.045690897 -0.081132878  0.1341882835
## Dasypus kappleri         0.002216078 -0.005039280  0.002990296 -0.0124395515
## Mazama americana        -0.045206138 -0.017694962  0.117022617  0.0346109834
## Mazama nemorivaga        0.042514273  0.020573521 -0.010510987 -0.0141290865
## Myrmecophaga tridactyla -0.011666864  0.001816018 -0.040149631 -0.0608170305
## Nasua nasua              0.072413095  0.037888430  0.042448703  0.0288050465
## Panthera onca           -0.006905797 -0.011183542 -0.005734167 -0.0088708801
## Priodontes maximus       0.030405803  0.018190949 -0.047741521  0.0843344107
## Puma concolor           -0.020998298  0.002772084 -0.012172965 -0.0101995398
## Tapirus terrestris      -0.002093375  0.001476193 -0.010014003 -0.0007581996
## Tayassu pecari           0.066835883  0.020783556 -0.038422435 -0.1210489132
## Dasyprocta fuliginosa   -0.183755748 -0.065590954  0.013358239 -0.0499877635
## Myoprocta pratti        -0.000585820  0.027976691  0.168224392 -0.0108481218
## Sciurus igniventris     -0.065701420  0.238134930 -0.021535811 -0.0170001494
## Sciurus spadiceus        0.110423840 -0.211503924  0.003022033  0.0061896510
## Leopardus tigrinus      -0.024830938 -0.018901827  0.009391971 -0.0308494516
## Myoprocta acouchy       -0.009352695 -0.025227827 -0.177428103  0.0030746240
## Tamandua mexicana       -0.015060059  0.004128567 -0.012875158  0.0356610114
## Tamandua tetradactyla    0.022314542  0.002991537  0.004314770 -0.0328536700
##                                  A25          A26           A27           A28
## Cabassous centralis     -0.095560928 -0.009619963  0.0078237269  0.1575278214
## Cuniculus paca          -0.042836947 -0.069512593  0.0579166307 -0.0408483176
## Dasyprocta punctata     -0.055253948 -0.014180766  0.0053100695 -0.0073597361
## Dasypus novemcinctus    -0.002825655 -0.001370710 -0.0103229065  0.0344622058
## Didelphis marsupialis    0.052374954 -0.032222515  0.0382860703 -0.0278415546
## Eira barbara             0.003202505 -0.023948080  0.0462127669 -0.0260034902
## Leopardus pardalis      -0.081483446  0.071244360 -0.0434443069  0.0703618144
## Leopardus wiedii         0.061896480  0.047555662 -0.0439870400 -0.0475781505
## Mazama temama            0.023734759 -0.091300247  0.0515637599  0.0582280000
## Nasua narica            -0.059472993 -0.020611871 -0.0069229631 -0.0007549098
## Odocoileus virginianus   0.011191768 -0.066995453 -0.1125626662  0.0192181773
## Pecari tajacu           -0.002861298 -0.023031136  0.0299942747 -0.0361198610
## Procyon cancrivorus      0.003904705  0.005683367 -0.0093662223 -0.0711959371
## Puma yagouaroundi        0.097506893  0.057742730  0.0559164252  0.0631118512
## Sylvilagus brasiliensis  0.024418278  0.107116114 -0.0200281101  0.0291592604
## Tapirus bairdii         -0.046650775  0.016187499  0.1261883152 -0.0199210537
## Atelocynus microtis     -0.074540099 -0.027674568 -0.0876836994  0.0039990109
## Dasyprocta leporina      0.079134211 -0.018350424  0.0204826479 -0.0255819444
## Dasypus kappleri        -0.008060801 -0.012863321  0.0006967447  0.0180580259
## Mazama americana        -0.082657328  0.100710899 -0.0886781026 -0.0854022029
## Mazama nemorivaga        0.083339956 -0.004441645 -0.0315024787  0.0493206668
## Myrmecophaga tridactyla  0.043099267  0.034803825 -0.0863062954 -0.0810717479
## Nasua nasua              0.050048820  0.044960833 -0.0274848765  0.0208273325
## Panthera onca            0.037788508 -0.128118491 -0.1025719705  0.0661383577
## Priodontes maximus       0.026717552 -0.037810648  0.0728319983  0.0190514507
## Puma concolor           -0.008046162  0.055559669  0.1113673906 -0.0258740592
## Tapirus terrestris       0.013098104 -0.023581600  0.0074955587 -0.0051521526
## Tayassu pecari          -0.002261261  0.128497390  0.0531001808  0.0388390309
## Dasyprocta fuliginosa    0.093664938  0.056570986 -0.0302842885  0.0137306541
## Myoprocta pratti         0.068766895 -0.086537447  0.0358789567 -0.0185779590
## Sciurus igniventris     -0.037101947 -0.002754652 -0.0282695219  0.0457437162
## Sciurus spadiceus       -0.001814443 -0.003464452 -0.0136617119  0.0273698310
## Leopardus tigrinus      -0.125947985 -0.072686114  0.0188291651 -0.1063585159
## Myoprocta acouchy       -0.067006112  0.043254225 -0.0029956606 -0.0119212291
## Tamandua mexicana       -0.036643117  0.006384936  0.0277924559  0.0238831194
## Tamandua tetradactyla    0.057136653 -0.005195801 -0.0216143168 -0.1214675050
##                                  A29           A30           A31           A32
## Cabassous centralis     -0.002161443  0.1133039878 -0.0025771340  1.180957e-02
## Cuniculus paca          -0.041536590  0.0583815436 -0.0153766798 -5.689220e-02
## Dasyprocta punctata      0.037412728 -0.0040854428 -0.0097625731 -1.246538e-02
## Dasypus novemcinctus    -0.136215553 -0.1025526145  0.0912262547 -7.855424e-03
## Didelphis marsupialis   -0.020295205  0.0411223770 -0.0148400711 -4.741069e-02
## Eira barbara             0.003545354  0.0544120800  0.1408966367 -8.068511e-02
## Leopardus pardalis       0.080263567 -0.0327297028  0.0123351200  2.339046e-03
## Leopardus wiedii        -0.007901588  0.0072383319  0.0390109555  9.704220e-02
## Mazama temama            0.003006491 -0.0579465056  0.0088751967  3.272872e-03
## Nasua narica             0.021189704 -0.0467597596 -0.0529199338 -2.862287e-02
## Odocoileus virginianus   0.051538447 -0.0081729178 -0.0289706785  1.396507e-02
## Pecari tajacu           -0.031941647  0.0421656213 -0.0330160707  1.577526e-02
## Procyon cancrivorus     -0.055585778  0.0341828221 -0.0131775237  6.450321e-02
## Puma yagouaroundi       -0.063567830 -0.0717198415 -0.0250967610  3.250453e-02
## Sylvilagus brasiliensis  0.005294934  0.0309203827 -0.0082061146 -1.989422e-02
## Tapirus bairdii          0.065883441 -0.1091569579  0.0367413437 -2.433693e-02
## Atelocynus microtis      0.002418018 -0.0454745950 -0.0008308601 -3.226050e-03
## Dasyprocta leporina      0.001162195  0.0626404922 -0.0153929655 -4.176523e-02
## Dasypus kappleri         0.130208040  0.0687478373  0.0519503405  3.921724e-02
## Mazama americana        -0.074648788  0.0264164966  0.0198562915 -5.839077e-02
## Mazama nemorivaga       -0.036446225  0.0157852203  0.0482369058 -8.919992e-03
## Myrmecophaga tridactyla  0.041686442 -0.0119135025  0.0177751131 -1.410828e-02
## Nasua nasua              0.027791143  0.0002445566 -0.0129000581 -3.465971e-02
## Panthera onca           -0.056340879  0.0177171953 -0.0306630685 -3.174432e-02
## Priodontes maximus       0.040970903 -0.0323500266  0.0087726118 -2.407804e-03
## Puma concolor            0.027669105  0.0544003061 -0.0550701514  9.411696e-03
## Tapirus terrestris      -0.042766870  0.0716432193  0.0159136239  8.468717e-02
## Tayassu pecari          -0.059230540  0.0299257296 -0.0072167488 -4.809324e-06
## Dasyprocta fuliginosa    0.096614999 -0.0019283696 -0.0122017628 -1.606182e-03
## Myoprocta pratti         0.005726016 -0.0267192298  0.0012785390  5.527009e-02
## Sciurus igniventris      0.018818296 -0.0388729345 -0.0010233177 -7.258401e-03
## Sciurus spadiceus        0.015958618 -0.0314783965 -0.0008282172 -7.101404e-03
## Leopardus tigrinus       0.000956804  0.0004646082  0.0173508024  5.772476e-02
## Myoprocta acouchy       -0.027000052 -0.0439289016 -0.0075612100  6.565107e-02
## Tamandua mexicana       -0.053718620 -0.0193351130 -0.1101586969 -2.343851e-02
## Tamandua tetradactyla    0.031242365 -0.0445879962 -0.0524291382 -4.037950e-02
##                                   A33           A34
## Cabassous centralis     -0.0357972754  0.0130776817
## Cuniculus paca          -0.0147815442  0.0102234879
## Dasyprocta punctata     -0.0204839198 -0.0087258401
## Dasypus novemcinctus     0.0031507844 -0.0007219882
## Didelphis marsupialis   -0.0511733572  0.0601205114
## Eira barbara             0.0147863250 -0.0282026397
## Leopardus pardalis      -0.0228329059  0.0606039781
## Leopardus wiedii         0.0571224003 -0.0035284591
## Mazama temama            0.0206200970  0.0024719133
## Nasua narica            -0.0308731058 -0.0332392772
## Odocoileus virginianus   0.0007389784  0.0102771495
## Pecari tajacu            0.0217530463  0.0016583253
## Procyon cancrivorus     -0.0104798630  0.0218710021
## Puma yagouaroundi       -0.0318479004 -0.0158798931
## Sylvilagus brasiliensis  0.0380675978 -0.0209993364
## Tapirus bairdii          0.0374137560  0.0225560629
## Atelocynus microtis     -0.0244947549 -0.0280625955
## Dasyprocta leporina     -0.0125259590  0.0128966658
## Dasypus kappleri         0.0797546604 -0.0174644906
## Mazama americana        -0.0018816918  0.0120936776
## Mazama nemorivaga        0.0227784676  0.0206100670
## Myrmecophaga tridactyla -0.0519079391  0.0266038585
## Nasua nasua             -0.0251968372 -0.0236333296
## Panthera onca            0.0759489610  0.0153527753
## Priodontes maximus      -0.0656505358 -0.0105885086
## Puma concolor            0.0311077437  0.0052463273
## Tapirus terrestris      -0.0790629201 -0.0605304560
## Tayassu pecari           0.0206335301 -0.0066680805
## Dasyprocta fuliginosa   -0.0224876382  0.0021554078
## Myoprocta pratti        -0.0075577742  0.0354259113
## Sciurus igniventris     -0.0042417257 -0.0334856847
## Sciurus spadiceus       -0.0038991199 -0.0311260502
## Leopardus tigrinus       0.0075257447 -0.0077455236
## Myoprocta acouchy        0.0040116448  0.0420374030
## Tamandua mexicana        0.0615687233 -0.0131035908
## Tamandua tetradactyla    0.0201943069 -0.0315764619

The cluster results are stored in the spfgr element of the dbFD output list. You can save them as a data frame for further use.

# save the results
cluster_g3 <- as.data.frame(ex$spfgr)
cluster_g3$species <- row.names(cluster_g3)
names(cluster_g3) <- c("cluster",  "species")
# write.csv(cluster_g3, "cluster_results.csv")
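
Before moving on, it can be worth checking how many species landed in each cluster and which ones they are. Here is a minimal sketch, assuming a data frame shaped like cluster_g3 (one cluster column, one species column). The object cluster_toy and its cluster labels are made up for illustration; they are not the tutorial's results.

```r
# Toy stand-in for cluster_g3 (cluster labels here are invented)
cluster_toy <- data.frame(
  cluster = c(1, 2, 2, 3, 1),
  species = c("Cuniculus paca", "Panthera onca", "Puma concolor",
              "Tapirus bairdii", "Dasyprocta punctata")
)

# Number of species in each cluster
cluster_sizes <- table(cluster_toy$cluster)
cluster_sizes

# Species names grouped by cluster
species_by_cluster <- split(cluster_toy$species, cluster_toy$cluster)
species_by_cluster
```

With the real data you would swap cluster_toy for cluster_g3.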

Some more evaluation of the clustering and more plots

For example, we can check how much variation each axis explains, as in a PCA.

# Variation in 1st axis
(ex$x.values[1])/sum(ex$x.values)
## [1] 0.2046738
# in 2nd axis
(ex$x.values[2])/sum(ex$x.values)
## [1] 0.1629746
# Variation explained by the first 3 axes combined
sum(ex$x.values[1:3])/sum(ex$x.values)
## [1] 0.4842021
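
If you want the same numbers for every axis at once, cumsum() saves some typing. This is only a sketch: the vector eig below holds invented eigenvalues standing in for ex$x.values, so the printed proportions are not the tutorial's.

```r
# Hypothetical eigenvalues standing in for ex$x.values
eig <- c(4, 3, 2, 1)

# Proportion of variation explained by each axis
prop <- eig / sum(eig)

# Cumulative proportion explained by the first k axes
cum_prop <- cumsum(prop)

# One row per axis
round(data.frame(axis = seq_along(eig), prop = prop, cum_prop = cum_prop), 3)
```

With the real output you would set eig <- ex$x.values; the third entry of cum_prop should then match the three-axis total computed above.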

Darling et al. (2012) also showed a way to better visualize the functional groups. They ran a Principal Coordinates Analysis (PCoA) ordination on the Gower dissimilarity matrix to show the life-history groups and species traits in multivariate space. Some of the code for plotting is adapted from here.

Here is the code to do this:

############
# PCoA
# Use the Cailliez correction to avoid negative eigenvalues
spe.gow.pcoa_cal <- pcoa(d, correction = "cailliez")

# number of traits
n <- ncol(species_traits)

# The principal coordinates with positive eigenvalues
points.stand <- scale(spe.gow.pcoa_cal$vectors)

# Covariance matrix
S <- cov(species_traits, points.stand)

# Select only positive eigenvalues
pos_eigen <-  spe.gow.pcoa_cal$values$Eigenvalues[seq(ncol(S))]

# Standardize value of covariance 
U <- S %*% diag((pos_eigen/(n - 1))^(-0.5))
colnames(U) <- colnames(spe.gow.pcoa_cal$vectors)

# Add values of covariances inside object
spe.gow.pcoa_cal$U <- U

spe.gow.pcoa_cal$vectors <- as.data.frame(spe.gow.pcoa_cal$vectors)

# Take a quick look if needed
# plot_pcoa <-  ggplot(spe.gow.pcoa_cal$vectors, aes(Axis.1, Axis.2)) +
#  geom_point()
# plot_pcoa


# Create a data frame for the arrows that indicate the direction of the traits

arrows_df <- as.data.frame(spe.gow.pcoa_cal$U/4) # divide by 4 to make the arrows shorter

# give arrow names
arrows_df$variable <-  c("Body mass","Nocturnality" ,"Litter size",
                         "Habitat breadth","Diet_PC1","Life span",
                         "Climbing","Social")

# add cluster information
spe.gow.pcoa_cal$cluster <- cluster_g3$cluster

# add species names
spe.gow.pcoa_cal$vectors$species <- row.names(spe.gow.pcoa_cal$vectors)

#######################################################
# The big ggplot
#######################################################

#spe.gow.pcoa_cal$vectors$species
#spe.gow.pcoa_cal$cluster

plot_pcoa <- 
  ggplot(spe.gow.pcoa_cal$vectors, aes(Axis.1, Axis.2)) +
  geom_point(aes(colour=as.factor(spe.gow.pcoa_cal$cluster)), size=I(3)) + # add points
  geom_text(aes(label = row.names(spe.gow.pcoa_cal$vectors)),size = 4.5,#,nudge_x = 0.05,nudge_y =0.02
            position=position_jitter(width=0.055,height=0.05))+ # tune label position with jitter and nudge 
  geom_segment(data = as.data.frame(spe.gow.pcoa_cal$U/3.5), # adjust arrows length
               x = 0, y = 0, alpha = 0.7, # start of the arrow
               mapping = aes(xend = Axis.1, yend = Axis.2), 
               # Add arrow head
               arrow = arrow(length = unit(3, "mm"))) + xlim(-0.4, 0.6)+
  
  # Label the arrows with the trait names
  ggrepel::geom_label_repel(data = arrows_df, aes(label = variable),  size = 7,colour = "chocolate")+
  
  # add lines
  geom_hline(yintercept = 0, linetype = 2) +
  geom_vline(xintercept = 0, linetype = 2) +
  
  # Keep coord equal for each axis
  coord_equal() +
  # Labs
  labs(colour='Cluster',
       # Add Explained Variance per axis
       x = paste0("Axis 1 (", round(spe.gow.pcoa_cal$values$Rel_corr_eig[1] * 100, 2), "%)"),
       y = paste0("Axis 2 (", round(spe.gow.pcoa_cal$values$Rel_corr_eig[2] * 100, 2), "%)")) +
  scale_colour_discrete(labels=c( "Cluster 1", "Cluster 2", "Cluster 3"))+
  # Theme change
  theme_bw()+
  theme(legend.title = element_text( size = 20),legend.text = element_text( size = 16),
        text = element_text(size = 20))

plot_pcoa

Now it’s your turn to try this with your own data!

If you have a vector of scientific names for your species ready, you can select a few traits and try clustering your own community. Go ahead and check the metadata of the COMBINE trait database for trait information and download the spreadsheet for trait data here.

You can:

  1. Run through the code and see if you have any questions.

  2. Take a look at the COMBINE data and find the traits you are interested in.

  3. Subset the trait data you are interested in and use left_join() to merge the trait data with the species names.

  4. There is a file that has all the code I used in the tutorial; you can download it here.

  5. See if you get any interesting results.
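
To make step 3 a bit more concrete, here is a rough sketch of the merge. Everything in it is a stand-in: the column names (species, body_mass_g, litter_size) and the trait values are invented for illustration and are not the real COMBINE column names or data, so check the metadata for the actual ones. It uses base R's merge() with all.x = TRUE, which keeps every species in your list in the same way a left join does.

```r
# Your species list (hypothetical example)
my_species <- data.frame(
  species = c("Panthera onca", "Cuniculus paca", "Tapirus bairdii")
)

# Toy trait table; column names and values are invented for illustration
traits <- data.frame(
  species     = c("Cuniculus paca", "Panthera onca", "Puma concolor"),
  body_mass_g = c(8000, 80000, 50000),
  litter_size = c(1, 2, 3)
)

# Keep only the traits of interest, then keep every species in my_species
traits_subset <- traits[, c("species", "body_mass_g")]
species_traits_toy <- merge(my_species, traits_subset,
                            by = "species", all.x = TRUE)

# Species with no match in the trait table show up as NA
species_traits_toy[!complete.cases(species_traits_toy), ]
```

If you prefer dplyr, left_join(my_species, traits_subset, by = "species") should give the same rows. Either way, look at the NA rows first: they are usually spelling or synonym mismatches in the scientific names.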