I have prepared some code and data for you to play with hierarchical
clustering analysis. Hierarchical clustering is a data analysis
technique that groups similar data points into clusters. It is a
powerful tool for exploratory data analysis and can be used to identify
patterns in your data. I have used it to cluster the species in a
community based on their life-history traits, to identify the
functional groups in the community. There are many ways to do a
clustering analysis, and many variants of hierarchical clustering. We
will focus on Ward’s hierarchical clustering because it is the only one
I have used XD. The package FD can be used to do the clustering, and it
is also the package for calculating functional diversity, so I thought
this might be useful for some of you.
Note: I am not an expert in this method, so I could be wrong; please do your own research before using it.
Let’s assume you have six species (data points), each with distinctive traits. To do a hierarchical clustering, you first need to compute a distance matrix between all the data points.
Hierarchical clustering starts by treating each observation as a separate cluster. Then, it repeatedly executes the following two steps: (1) identify the two clusters that are closest together, and (2) merge the two most similar clusters. This iterative process continues until all the clusters are merged together.
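The two steps can be watched on a toy example. This is only a minimal sketch in base R: the six points and their two traits (`toy`, `trait1`, `trait2`) are made up for illustration, and I use Euclidean distance with Ward’s linkage just to have something concrete.

```r
# Six made-up "species" with two made-up numeric traits (illustration only)
toy <- data.frame(
  trait1 = c(1.0, 1.2, 0.9, 5.0, 5.3, 9.0),
  trait2 = c(2.0, 2.1, 1.8, 6.0, 6.2, 1.0),
  row.names = paste0("sp", 1:6)
)

toy_dist <- dist(toy)                            # pairwise Euclidean distances
toy_hc   <- hclust(toy_dist, method = "ward.D2") # agglomerative clustering

# One row per merge step: negative numbers are single observations,
# positive numbers refer to clusters formed at an earlier step
toy_hc$merge

# Dissimilarity at which each merge happened
toy_hc$height
```

With six observations there are five merges, and the last row of `merge` joins everything into one cluster.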
At the end, the main output of hierarchical clustering is a tree-like diagram called a dendrogram, which shows how far your data points are from each other.
Then you can cut the dendrogram at a certain height (chosen according to how many clusters you need) to get the clusters.
For other clustering methods such as K-means, you need to specify the number of clusters in advance. With hierarchical clustering, you can cut the dendrogram at different heights to get different numbers of clusters. So how do you decide how many clusters you should have? The correct answer to this question is: it depends.
But there are several methods to help you decide. One of the most
common is the elbow method, a graphical way to choose the number of
clusters. You plot the number of clusters against the within-cluster
sum of squares (WCSS) and look for the “elbow” in the plot, the point
where the rate of decrease in WCSS slows down. That point is a
reasonable choice for the number of clusters. You can make the same kind
of plot with the silhouette score, in which case you look for the number
of clusters with the highest average score.
Within-cluster sum of squares (WCSS) is the sum of the squared
distances between each member of a cluster and its centroid (average).
Silhouette score is a measure of how similar an object is to its own
cluster (cohesion) compared to other clusters (separation).
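To make the two scores concrete, here is a small sketch that computes them for a range of cluster numbers. The data (`toy`) and the helper `wcss()` are made up for illustration; `silhouette()` comes from the cluster package, which this tutorial loads anyway.

```r
library(cluster)  # silhouette()

# Eight made-up points forming three loose groups (illustration only)
toy <- cbind(
  trait1 = c(1.0, 1.2, 0.9, 5.0, 5.3, 5.1, 9.0, 9.2),
  trait2 = c(2.0, 2.1, 1.8, 6.0, 6.2, 5.9, 1.0, 1.3)
)
toy_dist <- dist(toy)
toy_hc   <- hclust(toy_dist, method = "ward.D2")

# WCSS: squared distance of every point to the centroid of its own cluster, summed up
wcss <- function(x, groups) {
  sum(sapply(split(as.data.frame(x), groups), function(g) {
    sum(sweep(as.matrix(g), 2, colMeans(g))^2)
  }))
}

ks <- 2:5
scores <- data.frame(
  k = ks,
  wcss = sapply(ks, function(k) wcss(toy, cutree(toy_hc, k))),
  silhouette = sapply(ks, function(k) {
    mean(silhouette(cutree(toy_hc, k), toy_dist)[, "sil_width"])
  })
)
scores
```

WCSS keeps shrinking as k grows, which is why you look for the bend rather than the minimum; the average silhouette width is read the other way round, higher is better.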
There are also many ways for hierarchical clustering to calculate the
distance between clusters. This is called linkage. The most common
methods are single linkage, complete linkage, average linkage, and
Ward’s method. Ward’s method tends to give the most balanced clusters:
at each step it merges the pair of clusters that leads to the smallest
increase in total within-cluster variance, which keeps the clusters
compact and well separated. So we will be using Ward’s method in this
tutorial.
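If you want to see how much the linkage choice matters, one simple check is to cut each tree into the same number of clusters and compare the memberships. A minimal sketch on made-up data (`toy` is hypothetical, and k = 3 is an arbitrary choice):

```r
# Six made-up "species" with two made-up numeric traits (illustration only)
toy <- data.frame(
  trait1 = c(1.0, 1.2, 0.9, 5.0, 5.3, 9.0),
  trait2 = c(2.0, 2.1, 1.8, 6.0, 6.2, 1.0),
  row.names = paste0("sp", 1:6)
)
toy_dist <- dist(toy)
linkages <- c("single", "complete", "average", "ward.D2")

# Cluster membership at k = 3 under each linkage (one column per method)
membership <- sapply(linkages, function(m) {
  cutree(hclust(toy_dist, method = m), k = 3)
})
membership

# Height of the final merge under each linkage; the height scales are not comparable
sapply(linkages, function(m) max(hclust(toy_dist, method = m)$height))
```

On real trait data the columns of `membership` will often disagree for some species, which is a useful reminder that the clusters depend on the method.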
Note: Most of the code was adapted from this paper: Darling, E. S., Alvarez‐Filip, L., Oliver, T. A., McClanahan, T. R., & Côté, I. M. (2012). Evaluating life‐history strategies of reef corals from species traits. Ecology Letters, 15(12), 1378-1386.
Okay, let’s start by loading the required packages and the data. I am running R version 4.4.1 (2024-06-14).
library(dplyr)
library(FD)
library(vegan) # adonis2(); also attached as a dependency of FD
library(cluster)
library(factoextra)
library(dendextend)
library(ggplot2)
library(ggrepel)
The data is a list of species and their life-history traits. The file
is species_traits.csv and can be downloaded here.
Let’s load the data and take a quick look (make sure the file name in read.csv() matches the file you downloaded).
# Load the data
species_traits <- read.csv("trait_data.csv")
head(species_traits)
## species body_mass activity_pattern litter_size hab_breadth
## 1 Cabassous centralis 3.16 Nocturnal 1.0 7
## 2 Cuniculus paca 8.28 Nocturnal 1.5 1
## 3 Dasyprocta punctata 3.60 Diurnal 1.9 5
## 4 Dasypus novemcinctus 4.20 Nocturnal/Crepuscular 4.0 16
## 5 Didelphis marsupialis 1.09 Nocturnal/Crepuscular 5.1 4
## 6 Eira barbara 3.91 Diurnal 2.0 5
## diet_PC1 gen_length habitat social
## 1 7.7576388 1859.1198 terrestrial Solitary/pairs
## 2 -3.4901222 2097.4631 terrestrial Solitary/pairs
## 3 -2.6445504 1683.3608 terrestrial Solitary/pairs
## 4 7.7576388 1825.0000 terrestrial Solitary/pairs
## 5 0.4515587 494.7575 scansorial Solitary/pairs
## 6 0.4682619 2686.6021 scansorial Coalitions
We have got:
species: The name of the species, 36 species in total
body_mass: The body mass of the species in kg, range = (0.51, 240), mean = 25.31
activity_pattern: Categorical. Nocturnal (8), Diurnal (11), Nocturnal/Crepuscular (7), Crepuscular (2), or Non-restricted (8)
litter_size: The average litter size of the species, range = (1, 5.6), mean = 2.1
hab_breadth: Number of habitat types the species can be found in, range = (1, 26), mean = 6.06
diet_PC1: The first principal component of the diet data, range = (-6.18, 7.76), mean = 0
gen_length: The generation length of the species (days), range = (494.76, 4124.53), mean = 2315.64
habitat: Categorical. Terrestrial or scansorial (can climb trees), 23 terrestrial species and 13 scansorial species
social: Categorical. Solitary or in groups, 28 solitary/pairs, 1 in coalitions, and 3 and 4 species in two other group categories
To simplify the analysis, we will convert the categorical data into 0/1 and inspect the distribution of the continuous traits.
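# Convert categorical data into 0/1
# Note: == is case-sensitive. Rows labelled "Nocturnal" (capital N, as in the
# head() output) do not match "nocturnal", so as written only
# "Nocturnal/Crepuscular" is coded 1. Change "nocturnal" to "Nocturnal" if
# strictly nocturnal species should be coded 1 as well.
species_traits <- species_traits %>%
  mutate(activity_pattern = ifelse(activity_pattern == "nocturnal" | activity_pattern == "Nocturnal/Crepuscular", 1, 0),
         habitat = ifelse(habitat == "terrestrial", 1, 0),
         social = ifelse(social == "Solitary/pairs", 1, 0))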
str(species_traits)
## 'data.frame': 36 obs. of 9 variables:
## $ species : chr "Cabassous centralis" "Cuniculus paca" "Dasyprocta punctata" "Dasypus novemcinctus" ...
## $ body_mass : num 3.16 8.28 3.6 4.2 1.09 3.91 11.9 3.25 22.8 4.03 ...
## $ activity_pattern: num 0 0 0 1 1 0 1 0 0 0 ...
## $ litter_size : num 1 1.5 1.9 4 5.1 2 1.63 1.5 1.2 4.17 ...
## $ hab_breadth : int 7 1 5 16 4 5 7 5 4 2 ...
## $ diet_PC1 : num 7.758 -3.49 -2.645 7.758 0.452 ...
## $ gen_length : num 1859 2097 1683 1825 495 ...
## $ habitat : num 1 1 1 1 0 0 0 0 1 0 ...
## $ social : num 1 1 1 1 1 0 1 1 1 0 ...
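A cross-tabulation is a cheap way to check a 0/1 recoding. In the str() output above, activity_pattern starts with 0 for Cabassous centralis even though head() lists it as Nocturnal, so it is worth looking at. The sketch below uses only the six rows printed by head(); `traits_head`, `coded_lower` and `coded_exact` are names I made up for the illustration.

```r
# First six rows, copied from the head() output above
traits_head <- data.frame(
  species = c("Cabassous centralis", "Cuniculus paca", "Dasyprocta punctata",
              "Dasypus novemcinctus", "Didelphis marsupialis", "Eira barbara"),
  activity_pattern = c("Nocturnal", "Nocturnal", "Diurnal",
                       "Nocturnal/Crepuscular", "Nocturnal/Crepuscular", "Diurnal")
)

# String matching is case-sensitive: "nocturnal" does not match "Nocturnal"
coded_lower <- ifelse(traits_head$activity_pattern %in%
                        c("nocturnal", "Nocturnal/Crepuscular"), 1, 0)
coded_exact <- ifelse(traits_head$activity_pattern %in%
                        c("Nocturnal", "Nocturnal/Crepuscular"), 1, 0)

# Cross-tabulate the original labels against each coding to see what ended up where
table(traits_head$activity_pattern, coded_lower)
table(traits_head$activity_pattern, coded_exact)
```

Which coding you want depends on whether strictly nocturnal species should count as 1; the table just makes the choice visible.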
# Histogram plot of the traits
par(mfrow=c(2,3))
hist(species_traits$body_mass, main = "Body mass", xlab = "Body mass (kg)", col = "lightblue")
hist(species_traits$litter_size, main = "Litter size", xlab = "Litter size", col = "lightblue")
hist(species_traits$hab_breadth, main = "Habitat breadth", xlab = "Habitat breadth", col = "lightblue")
hist(species_traits$diet_PC1, main = "Diet PC1", xlab = "Diet PC1", col = "lightblue")
hist(species_traits$gen_length, main = "Generation length", xlab = "Generation length (days)", col = "lightblue")
The body mass, litter size, and habitat breadth are right-skewed, so let’s log-transform these variables and look again. You might also need to check the correlations between the variables and remove highly correlated ones.
# Log-transform the variables
species_traits$body_mass <- log(species_traits$body_mass)
species_traits$litter_size <- log(species_traits$litter_size)
species_traits$hab_breadth <- log(species_traits$hab_breadth)
par(mfrow=c(1,3))
hist(species_traits$body_mass, main = "Log body mass", xlab = "Log body mass (kg)", col = "lightblue")
hist(species_traits$litter_size, main = "Log litter size", xlab = "Log litter size", col = "lightblue")
hist(species_traits$hab_breadth, main = "Log habitat breadth", xlab = "Log habitat breadth", col = "lightblue")
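For the correlation check mentioned above, something like the following would do. It is a sketch on the six rows printed by head() earlier, so the numbers mean little; the 0.7 cut-off and the choice of Spearman correlation are my assumptions, not rules.

```r
# Continuous traits of the first six species, copied from the head() output
traits_head <- data.frame(
  body_mass   = c(3.16, 8.28, 3.60, 4.20, 1.09, 3.91),
  litter_size = c(1.0, 1.5, 1.9, 4.0, 5.1, 2.0),
  hab_breadth = c(7, 1, 5, 16, 4, 5),
  diet_PC1    = c(7.7576388, -3.4901222, -2.6445504, 7.7576388, 0.4515587, 0.4682619),
  gen_length  = c(1859.1198, 2097.4631, 1683.3608, 1825.0000, 494.7575, 2686.6021)
)

# Rank-based correlation, so the skewed traits do not need transforming first
trait_cor <- cor(traits_head, method = "spearman")
round(trait_cor, 2)

# List trait pairs whose absolute correlation exceeds the (arbitrary) cut-off
high <- which(abs(trait_cor) > 0.7 & upper.tri(trait_cor), arr.ind = TRUE)
data.frame(
  trait_a = rownames(trait_cor)[high[, "row"]],
  trait_b = colnames(trait_cor)[high[, "col"]],
  rho     = trait_cor[high]
)
```

With the full table you would run the same lines on the continuous columns of species_traits.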
Okay, looks good. Now the first step of the analysis is to calculate
the distance matrix between the species (based on their traits). The
most basic method is to calculate the Euclidean distance using the
dist() function. But as we have categorical/factor/binary
variables (and NAs), we can use the Gower dissimilarity
index to compare the multivariate trait distance between
species, because it allows for mixed types of data and missing values
and can weight individual traits differently.
Remember that distances such as Euclidean are sensitive to the scale of the variables, so standardize the continuous variables first. Gower rescales each numeric trait by its range, so standardizing does no harm there, and the standardized table is used again for the elbow plots later.
# Standardize the variables
species_traits$body_mass <- as.numeric(scale(species_traits$body_mass))
species_traits$litter_size <- as.numeric(scale(species_traits$litter_size))
species_traits$gen_length <- as.numeric(scale(species_traits$gen_length))
species_traits$hab_breadth <- as.numeric(scale((species_traits$hab_breadth)))
species_traits$diet_PC1 <- as.numeric(scale(species_traits$diet_PC1))
# Assign the species name as the row names
rownames(species_traits) <- species_traits$species
species_traits <- species_traits[,-1]
# Calculate the distance matrix
d <- FD::gowdis(species_traits)
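To see what a Gower dissimilarity does with mixed and missing values, here is a hand-rolled sketch. It is not FD’s implementation: `gower_sketch()` and the three-species table `toy` are made up for illustration, it assumes every trait is numeric or already coded 0/1, and it gives all traits equal weight (gowdis() also handles factors, ordinal traits and custom weights).

```r
# Three made-up species; traits are numeric or already coded 0/1 (illustration only)
toy <- data.frame(
  mass     = c(1, 3, 5),
  litter   = c(2, NA, 4),   # one missing value
  arboreal = c(0, 1, 1),
  row.names = c("spA", "spB", "spC")
)

# Simplified Gower dissimilarity with equal trait weights:
# per trait, |x_i - x_j| / range; then average over the traits observed in both species
gower_sketch <- function(x) {
  x   <- as.matrix(x)
  rng <- apply(x, 2, function(col) diff(range(col, na.rm = TRUE)))
  n   <- nrow(x)
  out <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      per_trait <- abs(x[i, ] - x[j, ]) / rng   # NA if either value is missing
      out[i, j] <- mean(per_trait, na.rm = TRUE)
    }
  }
  as.dist(out)
}

gower_sketch(toy)

# Compare with the result after rescaling the traits with scale()
all.equal(as.vector(gower_sketch(toy)), as.vector(gower_sketch(scale(toy))))
```

For spA and spB the litter value is missing, so their distance is the mean of 0.5 (mass: |1 - 3| / 4) and 1 (arboreal), i.e. 0.75. Because each trait is divided by its own range, rescaling a trait does not change the result.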
Now we can do the hierarchical clustering using the
hclust() function, using Ward’s linkage to calculate the
distance between the clusters.
# Hierarchical clustering
hc <- hclust(d, method = "ward.D2")
hc
##
## Call:
## hclust(d = d, method = "ward.D2")
##
## Cluster method : ward.D2
## Number of objects: 36
plot(hc, cex = 0.6, hang = -1)
You can also try different linkage methods (use method =
"single", "complete", or "average") and
see what the dendrogram looks like.
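# Elbow method
# Note: called this way, fviz_nbclust() clusters the trait table itself
# (hcut() uses Euclidean distance by default), not the Gower matrix d
# within-cluster sum of squares (WCSS)
fviz_nbclust(species_traits, FUN = hcut, method = "wss") + theme_minimal()
# Silhouette score
fviz_nbclust(species_traits, FUN = hcut, method = "silhouette") + theme_minimal()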
In Darling et al. (2012), they used non-parametric multivariate analysis of variance (PERMANOVA) to evaluate different groupings of species by comparing the coefficients of determination (R2). The grouping with the highest R2 value was selected as the best. We can try it as well.
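## PERMANOVA using the adonis2() function from the vegan package
## (vegan is attached when FD is loaded)
parnames <- c("no_clusters", "adonis_R2")
npar <- length(parnames)
M0 <- matrix(nrow = 9, ncol = npar)
colnames(M0) <- parnames
for (i in 2:10) {
  # cutree() returns integer labels, so `groups` enters the model as a numeric
  # predictor here; use factor(groups) to compare the clusters as categories
  # (the R2 values will then differ from the ones printed below)
  groups <- cutree(hc, i)
  cluster_name <- paste("cluster", i, sep = "")
  adonis_model <- adonis2(d ~ groups)
  R2 <- adonis_model$R2[1]
  j <- i - 1
  M0[j, 1] <- cluster_name   # storing text makes the whole matrix character
  M0[j, 2] <- R2
}
M0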
## no_clusters adonis_R2
## [1,] "cluster2" "0.350364083459131"
## [2,] "cluster3" "0.315845173513951"
## [3,] "cluster4" "0.18411250016616"
## [4,] "cluster5" "0.0758295452832858"
## [5,] "cluster6" "0.102974881823282"
## [6,] "cluster7" "0.104848705213415"
## [7,] "cluster8" "0.096909518140697"
## [8,] "cluster9" "0.0910638057802914"
## [9,] "cluster10" "0.0936239200252919"
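To see where an R2 like this comes from, you can compute it directly from a distance matrix as 1 - SS_within / SS_total. The sketch below does that in base R on made-up data. `r2_from_dist()` and `toy` are hypothetical names, the groups are treated as categories (which is what factor(groups) would give you), and there is no permutation test, so this is not a replacement for adonis2().

```r
# Eight made-up points forming three loose groups (illustration only)
toy <- cbind(
  trait1 = c(1.0, 1.2, 0.9, 5.0, 5.3, 5.1, 9.0, 9.2),
  trait2 = c(2.0, 2.1, 1.8, 6.0, 6.2, 5.9, 1.0, 1.3)
)
toy_dist <- dist(toy)
toy_hc   <- hclust(toy_dist, method = "ward.D2")

# R2 = 1 - SS_within / SS_total, with both sums of squares taken from the distances:
# SS = (sum of squared pairwise distances) / (number of objects)
r2_from_dist <- function(d, groups) {
  m <- as.matrix(d)^2
  n <- nrow(m)
  ss_total  <- sum(m[upper.tri(m)]) / n
  ss_within <- sum(sapply(split(seq_len(n), groups), function(idx) {
    sub <- m[idx, idx, drop = FALSE]
    sum(sub[upper.tri(sub)]) / length(idx)
  }))
  1 - ss_within / ss_total
}

ks <- 2:5
data.frame(
  k  = ks,
  R2 = sapply(ks, function(k) r2_from_dist(toy_dist, cutree(toy_hc, k)))
)
```

In this Euclidean toy example every extra cluster can only reduce the within-group sum of squares, so R2 never decreases as k grows. Read on its own, “highest R2” would therefore favour more clusters, and it helps to look at where the gain levels off.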
There is no “correct” number of clusters. It all depends on the question you are asking. We can now cut the dendrogram at different heights to get different numbers of clusters.
plot(hc, cex = 0.6, hang = -1)
rect.hclust(hc, k = 3, border = 2:5) # k is the number of clusters
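To get the cluster memberships themselves rather than just the rectangles, cutree() cuts the tree either by number of clusters (k) or by height (h). A minimal sketch on made-up data (`toy` is hypothetical; with the real analysis you would call cutree(hc, k = 3)):

```r
# Six made-up "species" with two made-up numeric traits (illustration only)
toy <- data.frame(
  trait1 = c(1.0, 1.2, 0.9, 5.0, 5.3, 9.0),
  trait2 = c(2.0, 2.1, 1.8, 6.0, 6.2, 1.0),
  row.names = paste0("sp", 1:6)
)
toy_hc <- hclust(dist(toy), method = "ward.D2")

groups_k <- cutree(toy_hc, k = 3)   # ask for a number of clusters
split(names(groups_k), groups_k)    # species names in each cluster
table(groups_k)                     # cluster sizes

# Or cut at a height: with six observations, any height between the 3rd and
# 4th merge leaves three clusters
h_cut    <- mean(toy_hc$height[3:4])
groups_h <- cutree(toy_hc, h = h_cut)
all(groups_k == groups_h)
```

The table of cluster sizes is the same kind of information that dbFD() reports as $gr.abun further down.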
Now make a better plot of the dendrogram with the cluster colors.
# Better plot
dend <- as.dendrogram(hc)
dend1 <- color_branches(dend, k = 3) # give colors to the branches
dend2 <- color_labels(dend, k = 3) # give colors to the labels
plot(dend1, main = "Colored branches")
plot(dend2, main = "Colored labels")
# give colors to both branches and labels
dend <- set(dend1, "labels_cex", 1.4) %>% set("branches_lwd", 3) # set the size of the labels and branches
dend <- color_labels(dend, k = 3)
par(mar = c(4,4,4,14),cex=0.6) # set the margin of plot
plot(dend, horiz = TRUE)
The FD package can calculate several functional diversity metrics of a community at once. Some metrics (e.g. functional group richness) are, to some extent, a special case of hierarchical clustering, because they also need the distances between species in n-dimensional trait space. So let’s try this method.
Note: dbFD() will prompt you in the console. Enter “g” to choose by number of groups, then enter the number of groups you want; the function only finishes running after that.
There are also a bunch of parameters in the dbFD()
function and more information in the output. You can explore a bit by
yourself.
# Hierarchical clustering using FD package
# In the console, enter g (choose by number of groups), then the number of groups (the output below has 3)
# corr = "cailliez" is often needed, see details in the help page
# calc.FGR = TRUE is a must-have to return functional group richness
# clust.type = "ward.D2" is the linkage method
# print.pco = TRUE to return the PCoA eigenvalues and axes for your traits too
# calc.CWM = FALSE to not calculate the community weighted mean (we only have 1 community)
ex <- FD::dbFD(d, corr = "cailliez", calc.FGR = TRUE, clust.type = "ward.D2", print.pco = TRUE, calc.CWM = FALSE)
ex
## $nbsp
## Community1
## 36
##
## $sing.sp
## Community1
## 36
##
## $FRic
## Community1
## 1.419352e-50
##
## $qual.FRic
## [1] 1
##
## $FEve
## Community1
## 0.9568357
##
## $FDiv
## Community1
## 0.8937894
##
## $FDis
## Community1
## 0.2382625
##
## $RaoQ
## Community1
## 0.06414295
##
## $FGR
## Community1
## 3
##
## $spfgr
## Cabassous centralis Cuniculus paca Dasyprocta punctata
## 1 1 1
## Dasypus novemcinctus Didelphis marsupialis Eira barbara
## 2 3 3
## Leopardus pardalis Leopardus wiedii Mazama temama
## 3 3 1
## Nasua narica Odocoileus virginianus Pecari tajacu
## 3 1 1
## Procyon cancrivorus Puma yagouaroundi Sylvilagus brasiliensis
## 3 3 1
## Tapirus bairdii Atelocynus microtis Dasyprocta leporina
## 1 1 1
## Dasypus kappleri Mazama americana Mazama nemorivaga
## 2 1 1
## Myrmecophaga tridactyla Nasua nasua Panthera onca
## 1 3 3
## Priodontes maximus Puma concolor Tapirus terrestris
## 1 3 1
## Tayassu pecari Dasyprocta fuliginosa Myoprocta pratti
## 1 1 1
## Sciurus igniventris Sciurus spadiceus Leopardus tigrinus
## 3 3 3
## Myoprocta acouchy Tamandua mexicana Tamandua tetradactyla
## 1 2 2
##
## $gr.abun
## group1 group2 group3
## Community1 19 4 13
##
## $x.values
## [1] 0.0686607141 0.0546721056 0.0390995662 0.0315568827 0.0183976183
## [6] 0.0128749469 0.0114649503 0.0078721898 0.0070387615 0.0066047061
## [11] 0.0059486220 0.0044421239 0.0043743650 0.0042304024 0.0041787293
## [16] 0.0037602736 0.0037380802 0.0035491811 0.0034972365 0.0034329629
## [21] 0.0033480894 0.0033176984 0.0032657440 0.0032553132 0.0032230538
## [26] 0.0031541897 0.0029960000 0.0029605646 0.0026706396 0.0024665133
## [31] 0.0017590416 0.0016303226 0.0013315424 0.0006909042
##
## $x.axes
## A1 A2 A3 A4
## Cabassous centralis 0.25427275 -0.039195530 -0.0619296928 -0.01512962
## Cuniculus paca 0.08117389 0.128600189 -0.2195170236 -0.15736182
## Dasyprocta punctata 0.07587876 0.072287723 -0.1363313853 -0.05332341
## Dasypus novemcinctus 0.32542766 -0.344442032 0.0006023044 0.33390513
## Didelphis marsupialis -0.18664389 -0.582362568 -0.0667595466 0.14467075
## Eira barbara -0.42189469 0.070160095 0.1605934945 0.17086455
## Leopardus pardalis -0.04592023 -0.376432690 0.2869257608 0.01824218
## Leopardus wiedii -0.20754011 -0.154694490 0.0706525972 -0.17696152
## Mazama temama 0.16997297 0.161308185 -0.0087147871 -0.17600120
## Nasua narica -0.50260936 0.025322693 0.0825797395 0.24816761
## Odocoileus virginianus 0.03360181 0.472635329 0.2161138453 0.18474939
## Pecari tajacu -0.01899664 0.398252770 0.1652031700 0.23067960
## Procyon cancrivorus -0.21175544 -0.235481810 0.0858095351 -0.08855461
## Puma yagouaroundi -0.26689996 -0.139728861 -0.0552846463 -0.19277599
## Sylvilagus brasiliensis 0.05974635 0.104006927 -0.2520044965 -0.11002139
## Tapirus bairdii 0.30191572 0.265522682 0.2751953114 -0.23460089
## Atelocynus microtis 0.05760103 0.028436881 -0.1168567511 -0.01462438
## Dasyprocta leporina 0.03027599 0.102282793 -0.2678638536 -0.11860227
## Dasypus kappleri 0.26648417 -0.361154218 -0.2802359114 0.26443697
## Mazama americana 0.14935943 0.187472560 -0.1569173471 -0.21594752
## Mazama nemorivaga 0.15151880 0.142188649 -0.1066792046 -0.17507671
## Myrmecophaga tridactyla 0.34209995 0.068718856 0.2455771217 -0.05967434
## Nasua nasua -0.48244249 0.018284166 0.1075331422 0.24757947
## Panthera onca -0.17053351 -0.060348069 0.3935210498 -0.19177730
## Priodontes maximus 0.25194671 0.015713018 0.0278989450 -0.05720149
## Puma concolor -0.20187729 -0.114892859 0.3451994739 -0.11432058
## Tapirus terrestris 0.49176340 0.001537126 0.3039221757 -0.02874446
## Tayassu pecari -0.02333350 0.369987703 0.2019841753 0.24662259
## Dasyprocta fuliginosa 0.06799021 0.073061941 -0.1224949683 -0.05402914
## Myoprocta pratti -0.15654054 0.293133799 -0.3264538027 0.23720119
## Sciurus igniventris -0.31747772 -0.093363666 -0.1821099652 -0.22766915
## Sciurus spadiceus -0.32101931 -0.094030208 -0.1850875673 -0.23251707
## Leopardus tigrinus -0.27647699 -0.161855204 -0.0121535104 -0.15319024
## Myoprocta acouchy -0.14693797 0.320532247 -0.3380751657 0.21583977
## Tamandua mexicana 0.40603164 -0.288485759 -0.0750732337 0.14616929
## Tamandua tetradactyla 0.44183841 -0.272978369 0.0012310177 0.15897660
## A5 A6 A7 A8
## Cabassous centralis 0.321760714 -0.0835388797 0.0799821735 -0.076289676
## Cuniculus paca -0.044775289 0.1408549543 0.0156368891 0.069913995
## Dasyprocta punctata 0.029809545 -0.1877566820 -0.0710405071 0.001766896
## Dasypus novemcinctus 0.037532932 -0.0231653700 -0.2028394417 -0.052469957
## Didelphis marsupialis -0.260551124 -0.1589364283 -0.0838637396 -0.006090866
## Eira barbara 0.088085039 -0.1138110385 0.1822984209 0.033082480
## Leopardus pardalis -0.185722867 0.0380410600 0.1950192315 0.091378835
## Leopardus wiedii 0.106312776 -0.1267371476 0.1143588401 0.092870100
## Mazama temama -0.032771119 -0.0389287266 -0.0152104452 0.220240430
## Nasua narica -0.002197978 0.1330453492 -0.0026917291 0.023488487
## Odocoileus virginianus -0.082974772 -0.0244301561 -0.0553051849 0.135306539
## Pecari tajacu -0.027641229 -0.0085818367 -0.0005886284 -0.024232458
## Procyon cancrivorus 0.204863942 -0.0001449719 -0.1054292911 0.066426407
## Puma yagouaroundi 0.045548359 0.2044516373 0.0460453014 0.027608049
## Sylvilagus brasiliensis -0.124205850 -0.2601949624 -0.1229683657 -0.047860415
## Tapirus bairdii -0.119678684 -0.0294449787 -0.0257887906 -0.103958685
## Atelocynus microtis 0.047760583 -0.0763166689 -0.2264467374 0.074068646
## Dasyprocta leporina -0.029333372 0.0564484417 0.0039500598 -0.053886226
## Dasypus kappleri -0.062869212 0.3120558543 -0.1235067703 -0.008195299
## Mazama americana -0.087026079 0.1808960862 -0.0113253608 0.096181439
## Mazama nemorivaga -0.013631505 0.0670002901 0.0708706328 0.133062902
## Myrmecophaga tridactyla 0.248662566 0.1213732338 0.0223083277 -0.179579428
## Nasua nasua 0.033848251 0.0267106884 -0.0206053472 0.075110866
## Panthera onca -0.019517495 0.0343455985 -0.0461937166 -0.177018373
## Priodontes maximus 0.272813187 0.0526903634 -0.0407277485 0.059810802
## Puma concolor 0.028099635 0.0553498708 -0.2461805612 0.020494167
## Tapirus terrestris -0.403192144 -0.0035902601 0.0735693456 -0.035845552
## Tayassu pecari 0.024304063 0.0244630970 0.0005907890 -0.055884746
## Dasyprocta fuliginosa 0.036043077 -0.1754141517 -0.0680851359 -0.002052385
## Myoprocta pratti -0.018429682 0.0096389239 0.0713803897 -0.095728355
## Sciurus igniventris -0.081880038 0.0318551578 0.0563109452 -0.125946795
## Sciurus spadiceus -0.082212906 0.0320235867 0.0579729331 -0.131079493
## Leopardus tigrinus 0.068523415 -0.0907006309 0.0301594733 -0.014122635
## Myoprocta acouchy -0.060196999 -0.0023471804 0.0720837359 -0.076191951
## Tamandua mexicana 0.075913689 -0.0693124028 0.2058106218 0.063430505
## Tamandua tetradactyla 0.068926571 -0.0478917202 0.1704493911 -0.017808250
## A9 A10 A11 A12
## Cabassous centralis 0.0674823544 0.094535348 -0.059047849 -0.054108344
## Cuniculus paca -0.0768530394 -0.045353494 -0.073977106 -0.067110049
## Dasyprocta punctata -0.1125443808 -0.097728716 -0.046244885 -0.016275483
## Dasypus novemcinctus -0.1148950294 0.091339266 -0.123442548 -0.012968631
## Didelphis marsupialis 0.1203237224 0.013456302 0.076168182 0.003980995
## Eira barbara -0.0562681121 0.004423401 0.071713704 0.011357602
## Leopardus pardalis -0.1320438657 -0.083685856 0.011472534 0.002278741
## Leopardus wiedii 0.0071027525 0.017496067 -0.038220048 -0.110907048
## Mazama temama 0.0702258033 -0.009623676 0.031609039 0.107999403
## Nasua narica 0.1566618243 -0.008910168 -0.122653913 -0.018474974
## Odocoileus virginianus -0.0911699053 0.231844276 0.012382248 -0.071341119
## Pecari tajacu -0.0746285964 -0.107786496 -0.079132008 0.139761745
## Procyon cancrivorus 0.0129629941 -0.025444653 -0.040044294 0.157515925
## Puma yagouaroundi -0.0389778677 -0.039654122 0.059352196 -0.121381325
## Sylvilagus brasiliensis 0.1328696480 0.073600675 0.074397614 0.002593397
## Tapirus bairdii 0.1253175535 0.007057251 -0.074197796 -0.054048853
## Atelocynus microtis 0.0256408749 -0.215140047 0.078616878 -0.083719133
## Dasyprocta leporina -0.1061774703 -0.017846064 -0.114790829 -0.104440026
## Dasypus kappleri 0.0348477225 -0.040902137 0.033453502 0.011123774
## Mazama americana 0.0036176614 0.088399884 0.107702592 0.017008476
## Mazama nemorivaga 0.1333515654 -0.046730065 -0.137546170 0.053595191
## Myrmecophaga tridactyla 0.1167505173 -0.014663292 -0.022048171 0.023414015
## Nasua nasua 0.1310329248 0.024462247 -0.068274796 -0.014361928
## Panthera onca -0.0098303453 -0.076166701 0.099411539 -0.014593610
## Priodontes maximus -0.0312281002 0.055862468 0.204037727 0.065683227
## Puma concolor -0.0955448044 0.137729288 -0.011709550 -0.013003195
## Tapirus terrestris 0.0318132205 -0.002161617 -0.005284483 0.002001522
## Tayassu pecari -0.0215292903 -0.138169015 0.110039602 -0.021741340
## Dasyprocta fuliginosa -0.1274417657 -0.094679374 -0.075187363 0.041601871
## Myoprocta pratti 0.0180926022 0.017903477 0.098499824 -0.049198847
## Sciurus igniventris -0.0616072624 0.070571693 -0.014196422 0.095932149
## Sciurus spadiceus -0.0677607169 0.072922310 -0.016120661 0.112784916
## Leopardus tigrinus 0.0453287994 -0.008293668 0.005788948 -0.070162749
## Myoprocta acouchy -0.0005683746 0.032708899 0.021049439 0.044783880
## Tamandua mexicana 0.0315821289 0.001980817 0.006907814 0.015594185
## Tamandua tetradactyla -0.0459357426 0.036645490 0.019515508 -0.011174359
## A13 A14 A15 A16
## Cabassous centralis 0.007214052 -0.0019481211 -0.0510365837 0.025539949
## Cuniculus paca 0.011214614 0.1170856380 0.0056082178 0.153490326
## Dasyprocta punctata -0.084758134 -0.0067154895 -0.0036906914 -0.043598841
## Dasypus novemcinctus 0.013182740 -0.0192825369 0.0590892607 0.021861695
## Didelphis marsupialis -0.028615951 -0.0459527373 -0.0374421690 0.017524392
## Eira barbara -0.015625884 0.0705785184 -0.0642183894 -0.075476739
## Leopardus pardalis 0.080165645 0.0355447047 0.0538154455 0.039696168
## Leopardus wiedii -0.101986247 -0.0604578814 -0.0008341314 0.172882589
## Mazama temama 0.008089715 -0.0229008991 0.0915976253 0.013628282
## Nasua narica -0.066974770 -0.0490827131 0.0307683717 0.048364244
## Odocoileus virginianus 0.003542065 0.0667315414 -0.0209380927 -0.048378237
## Pecari tajacu 0.088009294 0.0267933489 0.0520580612 0.064526067
## Procyon cancrivorus -0.081933187 0.1633740484 -0.0537848665 -0.038934972
## Puma yagouaroundi 0.041935147 0.1260497981 -0.0001812359 -0.092683789
## Sylvilagus brasiliensis 0.104809008 0.1401448902 0.1125054981 0.044606351
## Tapirus bairdii -0.092853840 0.0560162816 -0.0606576614 -0.005153009
## Atelocynus microtis 0.065089653 0.0057078656 -0.1510834838 0.018283677
## Dasyprocta leporina -0.029698389 -0.0403634467 0.1023257260 -0.060138673
## Dasypus kappleri -0.036496259 0.0305072598 0.0184458975 -0.020126842
## Mazama americana -0.089398773 -0.0747685492 0.0052755986 -0.024444036
## Mazama nemorivaga 0.131800309 -0.1106857902 -0.1015360716 -0.062135093
## Myrmecophaga tridactyla 0.050525734 0.0351728327 0.0395603885 -0.009847935
## Nasua nasua -0.020922467 0.0086137054 0.0541432872 -0.002084089
## Panthera onca -0.060495901 -0.0100818846 0.0339499262 -0.006645440
## Priodontes maximus -0.038875804 -0.0679157977 0.0768447484 0.065070946
## Puma concolor 0.092807471 -0.0845349327 -0.0998835024 -0.021887435
## Tapirus terrestris -0.041353866 -0.0009042045 -0.0155160344 -0.026659967
## Tayassu pecari 0.003479422 -0.0935036566 0.0175763215 0.027654513
## Dasyprocta fuliginosa -0.086290110 -0.0540717484 0.0197857179 -0.067904253
## Myoprocta pratti 0.026208354 -0.0200721531 -0.0623904585 0.024442261
## Sciurus igniventris 0.014044567 -0.0242773225 -0.0402409556 0.028543755
## Sciurus spadiceus 0.019090091 -0.0315004776 -0.0475092232 0.042789713
## Leopardus tigrinus 0.124535868 -0.0728373410 0.1517324468 -0.117901283
## Myoprocta acouchy -0.034067595 0.0058049910 -0.0318431970 -0.039440652
## Tamandua mexicana -0.076256124 0.0084513339 -0.0056749142 -0.095764640
## Tamandua tetradactyla 0.100859555 -0.0047190750 -0.0766208766 0.050300998
## A17 A18 A19 A20
## Cabassous centralis -0.0374385926 0.017985186 -0.072796758 -0.0647390024
## Cuniculus paca 0.0290780745 0.062456760 0.018399758 0.0944900331
## Dasyprocta punctata 0.0533371593 -0.155778366 -0.069960795 0.0470800407
## Dasypus novemcinctus -0.0005190699 0.013489430 0.020675808 0.0075800951
## Didelphis marsupialis -0.0027912089 -0.042495261 -0.032520306 0.0041888188
## Eira barbara -0.0124263280 -0.049094343 0.029611200 -0.0103572449
## Leopardus pardalis 0.0092680863 0.017580647 0.043433200 -0.0212740678
## Leopardus wiedii -0.0142934748 -0.060512766 0.005576638 0.0276455464
## Mazama temama -0.1893543933 -0.043520295 -0.037769944 -0.0802229253
## Nasua narica 0.0948876933 -0.081211513 0.071774116 -0.1292803266
## Odocoileus virginianus 0.0312429127 0.005648724 0.032722909 -0.0056802642
## Pecari tajacu 0.0085789457 -0.077620937 -0.061881674 0.0277051874
## Procyon cancrivorus 0.0220120128 0.092456621 0.043293585 -0.0929570072
## Puma yagouaroundi 0.0185799701 -0.074538625 -0.075279889 -0.0117862896
## Sylvilagus brasiliensis 0.0671198426 -0.025078598 0.049341966 -0.0261142064
## Tapirus bairdii 0.0205597945 0.065693941 -0.020174580 -0.0094564576
## Atelocynus microtis -0.0696529144 0.016530654 0.036479195 0.0061465942
## Dasyprocta leporina -0.0665676733 0.008894238 0.044241330 -0.1032882692
## Dasypus kappleri 0.0054728285 -0.017184642 -0.021174873 0.0023752624
## Mazama americana -0.0192033202 -0.002406550 -0.036222043 -0.0247959860
## Mazama nemorivaga 0.1441921426 0.001904591 0.013610153 0.0307135691
## Myrmecophaga tridactyla -0.0919546189 -0.107490511 0.065572395 0.0663218686
## Nasua nasua -0.0835973384 0.129309848 -0.111033338 0.1497747657
## Panthera onca 0.0514733963 0.019509439 -0.058802546 0.0023422972
## Priodontes maximus 0.1422879962 0.019564472 0.007958800 0.0463878429
## Puma concolor -0.0412362440 -0.053014343 0.020926111 0.0326884618
## Tapirus terrestris -0.0038624024 0.009818076 0.006694461 0.0118982249
## Tayassu pecari -0.0173593217 0.075993447 0.043266886 -0.0588829176
## Dasyprocta fuliginosa 0.0261078569 0.088518876 0.009598807 0.0028428817
## Myoprocta pratti -0.0366072408 0.033308521 0.105734006 0.0009031674
## Sciurus igniventris -0.0269829167 0.007709790 0.013130300 0.0042129844
## Sciurus spadiceus -0.0429455966 0.022914124 0.047793568 0.0305229653
## Leopardus tigrinus 0.0287440798 0.074430096 -0.007040091 0.0188598381
## Myoprocta acouchy 0.0141895489 -0.029038055 -0.119515378 0.0283501820
## Tamandua mexicana -0.0258570903 -0.017029863 0.126419791 0.1103498402
## Tamandua tetradactyla 0.0155174041 0.052297186 -0.132082770 -0.1145455027
## A21 A22 A23 A24
## Cabassous centralis -0.034220463 -0.028838251 0.046111844 0.0159120499
## Cuniculus paca -0.047746899 -0.015432257 -0.049825542 -0.1114700944
## Dasyprocta punctata 0.152437595 0.044070106 0.096309336 -0.0989259371
## Dasypus novemcinctus -0.011556041 -0.004102748 0.017109316 -0.0082922857
## Didelphis marsupialis -0.023332220 -0.005305673 -0.027850648 -0.0061835102
## Eira barbara -0.037895696 -0.013514108 -0.034528505 0.0125215044
## Leopardus pardalis 0.034588132 0.008780282 0.026622142 0.0300107420
## Leopardus wiedii -0.008586119 0.004869026 -0.020488958 0.0292330668
## Mazama temama -0.011531094 -0.010732888 -0.019423591 -0.0725035369
## Nasua narica -0.046270222 -0.034186087 -0.003979322 -0.0205687121
## Odocoileus virginianus 0.019960773 -0.003089210 -0.033419789 -0.0258313284
## Pecari tajacu -0.065051624 -0.010494174 0.074041113 0.1437377569
## Procyon cancrivorus 0.067919962 0.032550549 0.011493341 -0.0092267491
## Puma yagouaroundi -0.051141840 -0.028938494 0.036553052 -0.0010158217
## Sylvilagus brasiliensis 0.006332384 0.001180250 0.005562022 0.0074955868
## Tapirus bairdii 0.007010295 0.003129666 0.024257618 0.0593211077
## Atelocynus microtis -0.008521271 -0.007227047 -0.051598791 0.0987245083
## Dasyprocta leporina 0.096637987 0.045690897 -0.081132878 0.1341882835
## Dasypus kappleri 0.002216078 -0.005039280 0.002990296 -0.0124395515
## Mazama americana -0.045206138 -0.017694962 0.117022617 0.0346109834
## Mazama nemorivaga 0.042514273 0.020573521 -0.010510987 -0.0141290865
## Myrmecophaga tridactyla -0.011666864 0.001816018 -0.040149631 -0.0608170305
## Nasua nasua 0.072413095 0.037888430 0.042448703 0.0288050465
## Panthera onca -0.006905797 -0.011183542 -0.005734167 -0.0088708801
## Priodontes maximus 0.030405803 0.018190949 -0.047741521 0.0843344107
## Puma concolor -0.020998298 0.002772084 -0.012172965 -0.0101995398
## Tapirus terrestris -0.002093375 0.001476193 -0.010014003 -0.0007581996
## Tayassu pecari 0.066835883 0.020783556 -0.038422435 -0.1210489132
## Dasyprocta fuliginosa -0.183755748 -0.065590954 0.013358239 -0.0499877635
## Myoprocta pratti -0.000585820 0.027976691 0.168224392 -0.0108481218
## Sciurus igniventris -0.065701420 0.238134930 -0.021535811 -0.0170001494
## Sciurus spadiceus 0.110423840 -0.211503924 0.003022033 0.0061896510
## Leopardus tigrinus -0.024830938 -0.018901827 0.009391971 -0.0308494516
## Myoprocta acouchy -0.009352695 -0.025227827 -0.177428103 0.0030746240
## Tamandua mexicana -0.015060059 0.004128567 -0.012875158 0.0356610114
## Tamandua tetradactyla 0.022314542 0.002991537 0.004314770 -0.0328536700
## A25 A26 A27 A28
## Cabassous centralis -0.095560928 -0.009619963 0.0078237269 0.1575278214
## Cuniculus paca -0.042836947 -0.069512593 0.0579166307 -0.0408483176
## Dasyprocta punctata -0.055253948 -0.014180766 0.0053100695 -0.0073597361
## Dasypus novemcinctus -0.002825655 -0.001370710 -0.0103229065 0.0344622058
## Didelphis marsupialis 0.052374954 -0.032222515 0.0382860703 -0.0278415546
## Eira barbara 0.003202505 -0.023948080 0.0462127669 -0.0260034902
## Leopardus pardalis -0.081483446 0.071244360 -0.0434443069 0.0703618144
## Leopardus wiedii 0.061896480 0.047555662 -0.0439870400 -0.0475781505
## Mazama temama 0.023734759 -0.091300247 0.0515637599 0.0582280000
## Nasua narica -0.059472993 -0.020611871 -0.0069229631 -0.0007549098
## Odocoileus virginianus 0.011191768 -0.066995453 -0.1125626662 0.0192181773
## Pecari tajacu -0.002861298 -0.023031136 0.0299942747 -0.0361198610
## Procyon cancrivorus 0.003904705 0.005683367 -0.0093662223 -0.0711959371
## Puma yagouaroundi 0.097506893 0.057742730 0.0559164252 0.0631118512
## Sylvilagus brasiliensis 0.024418278 0.107116114 -0.0200281101 0.0291592604
## Tapirus bairdii -0.046650775 0.016187499 0.1261883152 -0.0199210537
## Atelocynus microtis -0.074540099 -0.027674568 -0.0876836994 0.0039990109
## Dasyprocta leporina 0.079134211 -0.018350424 0.0204826479 -0.0255819444
## Dasypus kappleri -0.008060801 -0.012863321 0.0006967447 0.0180580259
## Mazama americana -0.082657328 0.100710899 -0.0886781026 -0.0854022029
## Mazama nemorivaga 0.083339956 -0.004441645 -0.0315024787 0.0493206668
## Myrmecophaga tridactyla 0.043099267 0.034803825 -0.0863062954 -0.0810717479
## Nasua nasua 0.050048820 0.044960833 -0.0274848765 0.0208273325
## Panthera onca 0.037788508 -0.128118491 -0.1025719705 0.0661383577
## Priodontes maximus 0.026717552 -0.037810648 0.0728319983 0.0190514507
## Puma concolor -0.008046162 0.055559669 0.1113673906 -0.0258740592
## Tapirus terrestris 0.013098104 -0.023581600 0.0074955587 -0.0051521526
## Tayassu pecari -0.002261261 0.128497390 0.0531001808 0.0388390309
## Dasyprocta fuliginosa 0.093664938 0.056570986 -0.0302842885 0.0137306541
## Myoprocta pratti 0.068766895 -0.086537447 0.0358789567 -0.0185779590
## Sciurus igniventris -0.037101947 -0.002754652 -0.0282695219 0.0457437162
## Sciurus spadiceus -0.001814443 -0.003464452 -0.0136617119 0.0273698310
## Leopardus tigrinus -0.125947985 -0.072686114 0.0188291651 -0.1063585159
## Myoprocta acouchy -0.067006112 0.043254225 -0.0029956606 -0.0119212291
## Tamandua mexicana -0.036643117 0.006384936 0.0277924559 0.0238831194
## Tamandua tetradactyla 0.057136653 -0.005195801 -0.0216143168 -0.1214675050
## A29 A30 A31 A32
## Cabassous centralis -0.002161443 0.1133039878 -0.0025771340 1.180957e-02
## Cuniculus paca -0.041536590 0.0583815436 -0.0153766798 -5.689220e-02
## Dasyprocta punctata 0.037412728 -0.0040854428 -0.0097625731 -1.246538e-02
## Dasypus novemcinctus -0.136215553 -0.1025526145 0.0912262547 -7.855424e-03
## Didelphis marsupialis -0.020295205 0.0411223770 -0.0148400711 -4.741069e-02
## Eira barbara 0.003545354 0.0544120800 0.1408966367 -8.068511e-02
## Leopardus pardalis 0.080263567 -0.0327297028 0.0123351200 2.339046e-03
## Leopardus wiedii -0.007901588 0.0072383319 0.0390109555 9.704220e-02
## Mazama temama 0.003006491 -0.0579465056 0.0088751967 3.272872e-03
## Nasua narica 0.021189704 -0.0467597596 -0.0529199338 -2.862287e-02
## Odocoileus virginianus 0.051538447 -0.0081729178 -0.0289706785 1.396507e-02
## Pecari tajacu -0.031941647 0.0421656213 -0.0330160707 1.577526e-02
## Procyon cancrivorus -0.055585778 0.0341828221 -0.0131775237 6.450321e-02
## Puma yagouaroundi -0.063567830 -0.0717198415 -0.0250967610 3.250453e-02
## Sylvilagus brasiliensis 0.005294934 0.0309203827 -0.0082061146 -1.989422e-02
## Tapirus bairdii 0.065883441 -0.1091569579 0.0367413437 -2.433693e-02
## Atelocynus microtis 0.002418018 -0.0454745950 -0.0008308601 -3.226050e-03
## Dasyprocta leporina 0.001162195 0.0626404922 -0.0153929655 -4.176523e-02
## Dasypus kappleri 0.130208040 0.0687478373 0.0519503405 3.921724e-02
## Mazama americana -0.074648788 0.0264164966 0.0198562915 -5.839077e-02
## Mazama nemorivaga -0.036446225 0.0157852203 0.0482369058 -8.919992e-03
## Myrmecophaga tridactyla 0.041686442 -0.0119135025 0.0177751131 -1.410828e-02
## Nasua nasua 0.027791143 0.0002445566 -0.0129000581 -3.465971e-02
## Panthera onca -0.056340879 0.0177171953 -0.0306630685 -3.174432e-02
## Priodontes maximus 0.040970903 -0.0323500266 0.0087726118 -2.407804e-03
## Puma concolor 0.027669105 0.0544003061 -0.0550701514 9.411696e-03
## Tapirus terrestris -0.042766870 0.0716432193 0.0159136239 8.468717e-02
## Tayassu pecari -0.059230540 0.0299257296 -0.0072167488 -4.809324e-06
## Dasyprocta fuliginosa 0.096614999 -0.0019283696 -0.0122017628 -1.606182e-03
## Myoprocta pratti 0.005726016 -0.0267192298 0.0012785390 5.527009e-02
## Sciurus igniventris 0.018818296 -0.0388729345 -0.0010233177 -7.258401e-03
## Sciurus spadiceus 0.015958618 -0.0314783965 -0.0008282172 -7.101404e-03
## Leopardus tigrinus 0.000956804 0.0004646082 0.0173508024 5.772476e-02
## Myoprocta acouchy -0.027000052 -0.0439289016 -0.0075612100 6.565107e-02
## Tamandua mexicana -0.053718620 -0.0193351130 -0.1101586969 -2.343851e-02
## Tamandua tetradactyla 0.031242365 -0.0445879962 -0.0524291382 -4.037950e-02
## A33 A34
## Cabassous centralis -0.0357972754 0.0130776817
## Cuniculus paca -0.0147815442 0.0102234879
## Dasyprocta punctata -0.0204839198 -0.0087258401
## Dasypus novemcinctus 0.0031507844 -0.0007219882
## Didelphis marsupialis -0.0511733572 0.0601205114
## Eira barbara 0.0147863250 -0.0282026397
## Leopardus pardalis -0.0228329059 0.0606039781
## Leopardus wiedii 0.0571224003 -0.0035284591
## Mazama temama 0.0206200970 0.0024719133
## Nasua narica -0.0308731058 -0.0332392772
## Odocoileus virginianus 0.0007389784 0.0102771495
## Pecari tajacu 0.0217530463 0.0016583253
## Procyon cancrivorus -0.0104798630 0.0218710021
## Puma yagouaroundi -0.0318479004 -0.0158798931
## Sylvilagus brasiliensis 0.0380675978 -0.0209993364
## Tapirus bairdii 0.0374137560 0.0225560629
## Atelocynus microtis -0.0244947549 -0.0280625955
## Dasyprocta leporina -0.0125259590 0.0128966658
## Dasypus kappleri 0.0797546604 -0.0174644906
## Mazama americana -0.0018816918 0.0120936776
## Mazama nemorivaga 0.0227784676 0.0206100670
## Myrmecophaga tridactyla -0.0519079391 0.0266038585
## Nasua nasua -0.0251968372 -0.0236333296
## Panthera onca 0.0759489610 0.0153527753
## Priodontes maximus -0.0656505358 -0.0105885086
## Puma concolor 0.0311077437 0.0052463273
## Tapirus terrestris -0.0790629201 -0.0605304560
## Tayassu pecari 0.0206335301 -0.0066680805
## Dasyprocta fuliginosa -0.0224876382 0.0021554078
## Myoprocta pratti -0.0075577742 0.0354259113
## Sciurus igniventris -0.0042417257 -0.0334856847
## Sciurus spadiceus -0.0038991199 -0.0311260502
## Leopardus tigrinus 0.0075257447 -0.0077455236
## Myoprocta acouchy 0.0040116448 0.0420374030
## Tamandua mexicana 0.0615687233 -0.0131035908
## Tamandua tetradactyla 0.0201943069 -0.0315764619
The cluster assignments are stored in the spfgr element of the list
returned by dbFD. You can save them as a data frame for further use.
# save the results
cluster_g3 <- as.data.frame(ex$spfgr)
cluster_g3$species <- row.names(cluster_g3)
names(cluster_g3) <- c("cluster", "species")
# write.csv(cluster_g3, "cluster_results.csv")
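Once the assignments are in a data frame, a quick tabulation shows how many species fall in each functional group. The sketch below is only an illustration: it uses a small made-up named vector (toy_spfgr) in place of ex$spfgr, so the group memberships are invented and are not the results of this tutorial.

```r
# Toy stand-in for ex$spfgr: a named vector of group memberships
# (memberships are invented for illustration only)
toy_spfgr <- c("Cuniculus paca" = 1, "Panthera onca" = 2,
               "Puma concolor" = 2, "Tapirus bairdii" = 3,
               "Eira barbara" = 2)

# Same layout as cluster_g3: one row per species, with its cluster
toy_clusters <- data.frame(cluster = toy_spfgr,
                           species = names(toy_spfgr),
                           row.names = NULL)

# Number of species per cluster
cluster_sizes <- table(toy_clusters$cluster)
cluster_sizes

# Species belonging to each cluster
members <- split(toy_clusters$species, toy_clusters$cluster)
members
```

With your own results, replace toy_clusters with cluster_g3 in the last two steps.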
For example, we can check how much variation each axis explains, as in a PCA.
# Variation in 1st axis
(ex$x.values[1])/sum(ex$x.values)
## [1] 0.2046738
# in 2nd axis
(ex$x.values[2])/sum(ex$x.values)
## [1] 0.1629746
# Variation explained by the first 3 axes combined
sum(ex$x.values[1:3])/sum(ex$x.values)
## [1] 0.4842021
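The same proportions can be computed for every axis at once instead of one at a time. A minimal sketch, using a made-up vector of eigenvalues (toy_eig) in place of ex$x.values, so the numbers are illustrative only:

```r
# Hypothetical eigenvalues standing in for ex$x.values
toy_eig <- c(4, 3, 2, 1)

# Proportion of variation captured by each axis
prop <- toy_eig / sum(toy_eig)

# Cumulative proportion: element k is the share captured by the first k axes
cum_prop <- cumsum(prop)

round(prop, 3)
round(cum_prop, 3)
```

The third element of cum_prop corresponds to the "first 3 axes" calculation above.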
Darling et al. (2012) also showed a way to visualize the functional groups more clearly. They ran a Principal Coordinates Analysis (PCoA) ordination of the Gower dissimilarity matrix to show the life-history groups and species traits in multivariate space. Some of the plotting code is adapted from here.
Here is the code to do this:
############
# PCoA
# Use the Cailliez correction to avoid negative eigenvalues
spe.gow.pcoa_cal <- pcoa(d, correction = "cailliez")
# number of traits
n <- ncol(species_traits)
# The principal coordinates with positive eigenvalues
points.stand <- scale(spe.gow.pcoa_cal$vectors)
# Covariance matrix
S <- cov(species_traits, points.stand)
# Select only positive eigenvalues
pos_eigen <- spe.gow.pcoa_cal$values$Eigenvalues[seq(ncol(S))]
# Standardize value of covariance
U <- S %*% diag((pos_eigen/(n - 1))^(-0.5))
colnames(U) <- colnames(spe.gow.pcoa_cal$vectors)
# Add values of covariances inside object
spe.gow.pcoa_cal$U <- U
spe.gow.pcoa_cal$vectors <- as.data.frame(spe.gow.pcoa_cal$vectors)
# Take a quick look if needed
# plot_pcoa <- ggplot(spe.gow.pcoa_cal$vectors, aes(Axis.1, Axis.2)) +
# geom_point()
# plot_pcoa
# Create a data frame for the arrows that indicate the direction of the traits
arrows_df <- as.data.frame(spe.gow.pcoa_cal$U/4) # divide by 4 to make the arrows shorter
# give arrow names
arrows_df$variable <- c("Body mass","Nocturnality" ,"Litter size",
"Habitat breadth","Diet_PC1","Life span",
"Climbing","Social")
# add cluster information
spe.gow.pcoa_cal$cluster <- cluster_g3$cluster
# add species names
spe.gow.pcoa_cal$vectors$species <- row.names(spe.gow.pcoa_cal$vectors)
#######################################################
# The big ggplot
#######################################################
#spe.gow.pcoa_cal$vectors$species
#spe.gow.pcoa_cal$cluster
plot_pcoa <-
  ggplot(spe.gow.pcoa_cal$vectors, aes(Axis.1, Axis.2)) +
  # Species points, coloured by cluster
  geom_point(aes(colour = as.factor(spe.gow.pcoa_cal$cluster)), size = 3) +
  # Species labels; tune their position with jitter (or nudge_x / nudge_y)
  geom_text(aes(label = row.names(spe.gow.pcoa_cal$vectors)), size = 4.5,
            position = position_jitter(width = 0.055, height = 0.05)) +
  # Trait arrows starting at the origin; change the divisor to adjust their length
  geom_segment(data = as.data.frame(spe.gow.pcoa_cal$U / 3.5),
               x = 0, y = 0, alpha = 0.7, # start of the arrow
               mapping = aes(xend = Axis.1, yend = Axis.2),
               arrow = arrow(length = unit(3, "mm"))) + # arrow head
  xlim(-0.4, 0.6) +
  # Trait names as labels for the arrows
  ggrepel::geom_label_repel(data = arrows_df, aes(label = variable),
                            size = 7, colour = "chocolate") +
  # Dashed reference lines through the origin
  geom_hline(yintercept = 0, linetype = 2) +
  geom_vline(xintercept = 0, linetype = 2) +
  # Keep the same scale on both axes
  coord_equal() +
  labs(colour = "Cluster",
       # Explained variance per axis
       x = paste0("Axis 1 (", round(spe.gow.pcoa_cal$values$Rel_corr_eig[1] * 100, 2), "%)"),
       y = paste0("Axis 2 (", round(spe.gow.pcoa_cal$values$Rel_corr_eig[2] * 100, 2), "%)")) +
  scale_colour_discrete(labels = c("Cluster 1", "Cluster 2", "Cluster 3")) +
  # Theme settings
  theme_bw() +
  theme(legend.title = element_text(size = 20),
        legend.text = element_text(size = 16),
        text = element_text(size = 20))
plot_pcoa
If you have a vector of scientific names for your species ready, you can select a few traits and try clustering your own community. Go ahead and check the metadata of the COMBINE trait database for trait information and download the spreadsheet for trait data here.
You can:
Run through the code and see if you have any questions.
Take a look at the COMBINE data and find the traits you are interested in.
Subset the trait data you are interested in and use left_join() to merge it with your species names.
There is a file that has all the code I used in this tutorial; you can download it here.
See if you get any interesting results.
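The subsetting and left_join() step in the list above can be sketched as follows. This is a minimal example assuming the dplyr package is installed. The data frames, column names (species, body_mass_g, litter_size) and trait values are all made up for illustration, so check the COMBINE metadata for the real column names before adapting it.

```r
library(dplyr)

# Species in your community (hypothetical example list)
my_species <- data.frame(
  species = c("Cuniculus paca", "Panthera onca", "Puma concolor")
)

# Toy stand-in for the trait spreadsheet; names and values are invented
traits <- data.frame(
  species     = c("Panthera onca", "Cuniculus paca", "Tapirus bairdii"),
  body_mass_g = c(80000, 8000, 250000),
  litter_size = c(2, 1, 1),
  other_trait = c("a", "b", "c")
)

# Keep only the traits of interest, then attach them to the species list
my_traits <- left_join(
  my_species,
  select(traits, species, body_mass_g, litter_size),
  by = "species"
)
my_traits

# Species without a match in the trait table end up with NA
my_traits[!complete.cases(my_traits), ]
```

left_join() keeps every row of the species list, so unmatched names (often spelling or synonym differences) show up as NA rather than silently disappearing.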