1 Packages needed

library(BiodiversityR) # also loads vegan
library(pvclust)
library(ggplot2)
library(ggsci)
library(ggraph)
library(tidygraph)

2 Introduction

The main objective of this document is to show a pathway of how dendrograms for results that were obtained via pvclust can be added to ordination graphs in ggplot2. This documents builds on previous document where I showed how to generate ordination graphs with vegan, BiodiversityR and ggplot2 and how to create dendrograms for pvclust results with ggraph.

3 Step 1: Conduct the pvclust analysis

3.1 Load the data

The dune meadow data sets have been used as case study in Data Analysis in Community and Landscape Ecology, the Tree Diversity Analysis manual on community ecology and biodiversity analysis (this manual can be downloaded for free from this website ) and various recent examples of generating ordination graphs, species accumulation curves and Renyi diversity profiles.

data(dune)
str(dune)

## 'data.frame':    20 obs. of  30 variables:
##  $ Achimill: num  1 3 0 0 2 2 2 0 0 4 ...
##  $ Agrostol: num  0 0 4 8 0 0 0 4 3 0 ...
##  $ Airaprae: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Alopgeni: num  0 2 7 2 0 0 0 5 3 0 ...
##  $ Anthodor: num  0 0 0 0 4 3 2 0 0 4 ...
##  $ Bellpere: num  0 3 2 2 2 0 0 0 0 2 ...
##  $ Bromhord: num  0 4 0 3 2 0 2 0 0 4 ...
##  $ Chenalbu: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Cirsarve: num  0 0 0 2 0 0 0 0 0 0 ...
##  $ Comapalu: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Eleopalu: num  0 0 0 0 0 0 0 4 0 0 ...
##  $ Elymrepe: num  4 4 4 4 4 0 0 0 6 0 ...
##  $ Empenigr: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Hyporadi: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Juncarti: num  0 0 0 0 0 0 0 4 4 0 ...
##  $ Juncbufo: num  0 0 0 0 0 0 2 0 4 0 ...
##  $ Lolipere: num  7 5 6 5 2 6 6 4 2 6 ...
##  $ Planlanc: num  0 0 0 0 5 5 5 0 0 3 ...
##  $ Poaprat : num  4 4 5 4 2 3 4 4 4 4 ...
##  $ Poatriv : num  2 7 6 5 6 4 5 4 5 4 ...
##  $ Ranuflam: num  0 0 0 0 0 0 0 2 0 0 ...
##  $ Rumeacet: num  0 0 0 0 5 6 3 0 2 0 ...
##  $ Sagiproc: num  0 0 0 5 0 0 0 2 2 0 ...
##  $ Salirepe: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Scorautu: num  0 5 2 2 3 3 3 3 2 3 ...
##  $ Trifprat: num  0 0 0 0 2 5 2 0 0 0 ...
##  $ Trifrepe: num  0 5 2 1 2 5 2 2 3 6 ...
##  $ Vicilath: num  0 0 0 0 0 0 0 0 0 1 ...
##  $ Bracruta: num  0 0 2 2 2 6 2 2 2 2 ...
##  $ Callcusp: num  0 0 0 0 0 0 0 0 0 0 ...

data(dune.env)
summary(dune.env)

##        A1         Moisture Management       Use    Manure
##  Min.   : 2.800   1:7      BF:3       Hayfield:7   0:6   
##  1st Qu.: 3.500   2:4      HF:5       Haypastu:8   1:3   
##  Median : 4.200   4:2      NM:6       Pasture :5   2:4   
##  Mean   : 4.850   5:7      SF:6                    3:4   
##  3rd Qu.: 5.725                                    4:3   
##  Max.   :11.500

3.2 Conduct a Hellinger transformation

The pvclust function can not operate with vegan::vegdist. We will therefore use the Euclidean distance. Prior to calculating the clustering results, the community data set is transformed by the Hellinger method as recommended in this article.

# script generated by the BiodiversityR GUI from the constrained ordination menu
dune.Hellinger <- disttransform(dune, method='hellinger')

3.3 pvclust analysis

To obtain a dendrogram that represents clustering of sites, the community data set needs to be transposed. Among the results of pvclust are a hclust object and a edges object.

set.seed(2)
dune.pv <- pvclust(t(dune.Hellinger), 
                   method.hclust="mcquitty",
                   method.dist="euclidean",
                   nboot=10000)

## Bootstrap (r = 0.5)... Done.
## Bootstrap (r = 0.6)... Done.
## Bootstrap (r = 0.7)... Done.
## Bootstrap (r = 0.8)... Done.
## Bootstrap (r = 0.9)... Done.
## Bootstrap (r = 1.0)... Done.
## Bootstrap (r = 1.1)... Done.
## Bootstrap (r = 1.2)... Done.
## Bootstrap (r = 1.3)... Done.
## Bootstrap (r = 1.4)... Done.

attributes(dune.pv)

## $names
## [1] "hclust"  "edges"   "count"   "msfit"   "nboot"   "r"       "store"  
## [8] "version"
## 
## $class
## [1] "pvclust"

It is possible now to plot the dendrogram for pvlust. The rectangles correspond to thresholds applied to the AU (Approximately Unbiased) p-value (this is a better approximation to unbiased p-values than the BP value).

plot(dune.pv)
pvrect(dune.pv, alpha=0.89, pv="au")

In the step 3, the data on edges from pvclust will be added to data on branch nodes for the ordination graph.

dune.pv$edges

##            si        au        bp       se.si       se.au       se.bp
## 1  0.67367308 0.8900162 0.6500593 0.009770065 0.004678329 0.001583062
## 2  0.62470794 0.8646679 0.6509359 0.010472806 0.005409584 0.001580079
## 3  0.52107100 0.8635731 0.4838776 0.012291046 0.005388623 0.001645255
## 4  0.52113062 0.8525799 0.5174110 0.012057944 0.005657406 0.001645565
## 5  0.95626554 0.9895152 0.8144038 0.002496067 0.000769902 0.001345803
## 6  0.58608873 0.8704015 0.5612605 0.011148662 0.005181026 0.001638187
## 7  0.05836824 0.6989306 0.3388617 0.015113634 0.008862102 0.001550838
## 8  0.00000000 0.6973572 0.2073154 0.000000000 0.009848418 0.001326916
## 9  0.49243363 0.8666564 0.4367195 0.013007827 0.005366611 0.001630304
## 10 0.00000000 0.8071869 0.1465370 0.000000000 0.008704205 0.001155411
## 11 0.29408270 0.8664202 0.2572468 0.018146901 0.005953144 0.001429867
## 12 0.42408147 0.8980210 0.2800115 0.016452438 0.004870102 0.001469841
## 13 0.00000000 0.7942302 0.1805106 0.000000000 0.008490440 0.001256977
## 14 0.71627120 0.9296202 0.5444732 0.009239222 0.003338046 0.001648334
## 15 0.67282027 0.8904846 0.6462837 0.009792951 0.004663206 0.001587233
## 16 0.00000000 0.7942395 0.1463469 0.000000000 0.009031488 0.001154938
## 17 0.00000000 0.6579862 0.1959493 0.000000000 0.010479640 0.001300267
## 18 0.00000000 0.5398721 0.1435282 0.000000000 0.012104246 0.001152906
## 19 1.00000000 1.0000000 1.0000000 0.000000000 0.000000000 0.000000000
##              v         c         pchi
## 1  -0.80604734 0.4205668 0.3088314904
## 2  -0.74469145 0.3568429 0.4865154078
## 3  -0.52804525 0.5684691 0.4163645695
## 4  -0.54560966 0.5019529 0.0003789444
## 5  -1.60138616 0.7071441 0.1291560223
## 6  -0.64122844 0.4870626 0.0199835695
## 7  -0.05287773 0.4684495 0.2447419111
## 8   0.14947842 0.6662930 0.2247349650
## 9  -0.47571606 0.6350079 0.0258432352
## 10  0.09191271 0.9594893 0.6549884988
## 11 -0.22888526 0.8807420 0.0312624364
## 12 -0.34377404 0.9265815 0.4122681792
## 13  0.04611690 0.8673041 0.5504418008
## 14 -0.79233900 0.6806293 0.5457562329
## 15 -0.80220781 0.4269015 0.7693126115
## 16  0.11550532 0.9367252 0.9145713157
## 17  0.22460298 0.6315762 0.9794099838
## 18  0.48224497 0.5823565 0.7408072082
## 19  0.00000000 0.0000000 0.0000000000

4 Step 2: Add a dendrogram to an ordination graph in vegan

First a redundancy analysis is done via vegan::rda.

# script generated by the BiodiversityR GUI from the constrained ordination menu
Ordination.model2 <- rda(dune.Hellinger ~ Management, data=dune.env, scaling="species")
summary(Ordination.model2)

## 
## Call:
## rda(formula = dune.Hellinger ~ Management, data = dune.env, scaling = "species") 
## 
## Partitioning of variance:
##               Inertia Proportion
## Total          0.5559     1.0000
## Constrained    0.1667     0.2999
## Unconstrained  0.3892     0.7001
## 
## Eigenvalues, and their contribution to the variance 
## 
## Importance of components:
##                          RDA1    RDA2    RDA3    PC1     PC2     PC3     PC4
## Eigenvalue            0.09377 0.05304 0.01988 0.1279 0.05597 0.04351 0.03963
## Proportion Explained  0.16869 0.09542 0.03575 0.2300 0.10069 0.07827 0.07129
## Cumulative Proportion 0.16869 0.26411 0.29986 0.5299 0.63054 0.70881 0.78010
##                           PC5     PC6     PC7     PC8     PC9     PC10     PC11
## Eigenvalue            0.03080 0.02120 0.01623 0.01374 0.01138 0.009469 0.007651
## Proportion Explained  0.05541 0.03814 0.02919 0.02471 0.02047 0.017034 0.013764
## Cumulative Proportion 0.83551 0.87365 0.90284 0.92755 0.94802 0.965056 0.978820
##                           PC12     PC13     PC14     PC15      PC16
## Eigenvalue            0.003957 0.003005 0.002485 0.001670 0.0006571
## Proportion Explained  0.007117 0.005406 0.004470 0.003004 0.0011821
## Cumulative Proportion 0.985937 0.991344 0.995814 0.998818 1.0000000
## 
## Accumulated constrained eigenvalues
## Importance of components:
##                          RDA1    RDA2    RDA3
## Eigenvalue            0.09377 0.05304 0.01988
## Proportion Explained  0.56255 0.31822 0.11923
## Cumulative Proportion 0.56255 0.88077 1.00000
## 
## Scaling 2 for species and site scores
## * Species are scaled proportional to eigenvalues
## * Sites are unscaled: weighted dispersion equal on all dimensions
## * General scaling constant of scores:  1.80276 
## 
## 
## Species scores
## 
##               RDA1      RDA2      RDA3       PC1       PC2       PC3
## Achimill  0.036568  0.127978  0.016399 -0.188968 -0.070111  0.175399
## Agrostol -0.003429 -0.270611 -0.018285  0.361207  0.005321 -0.014204
## Airaprae -0.127487 -0.011618 -0.005281 -0.120126  0.103133  0.060671
## Alopgeni  0.235229 -0.190348  0.027102  0.163865  0.129547 -0.078901
## Anthodor -0.085534  0.115742 -0.079991 -0.240230  0.117203  0.201272
## Bellpere  0.033345  0.041954  0.083723 -0.081623 -0.079661 -0.076104
## Bromhord  0.093316  0.120369  0.065335 -0.058227 -0.047198  0.043097
## Chenalbu  0.016343 -0.027339  0.008563  0.006896  0.028380  0.004920
## Cirsarve  0.019792 -0.033109  0.010370 -0.009773 -0.008401 -0.025099
## Comapalu -0.110016 -0.010026 -0.004558  0.091463 -0.045993  0.026311
## Eleopalu -0.160847 -0.069849 -0.041472  0.352544 -0.115164  0.126956
## Elymrepe  0.173955 -0.047695 -0.002020 -0.108520 -0.206622 -0.115376
## Empenigr -0.047886 -0.004364 -0.001984 -0.029817  0.061163 -0.018612
## Hyporadi -0.130851  0.036162  0.047182 -0.127285  0.148136  0.027482
## Juncarti -0.057086 -0.003074 -0.101556  0.265930 -0.061713 -0.009618
## Juncbufo  0.102986 -0.052188 -0.062882  0.033316  0.152206 -0.038314
## Lolipere  0.260211  0.153503  0.052002 -0.192751 -0.234269 -0.146056
## Planlanc -0.018339  0.192788 -0.064491 -0.222661  0.025888  0.091459
## Poaprat   0.183546  0.093202  0.019068 -0.196708 -0.136129 -0.115068
## Poatriv   0.377233 -0.046485 -0.039312 -0.019700 -0.010085  0.010877
## Ranuflam -0.113470 -0.073249 -0.022781  0.249877 -0.046445  0.073742
## Rumeacet  0.118187  0.070051 -0.198589 -0.056786  0.065922  0.042193
## Sagiproc  0.076848 -0.060054  0.017557  0.035039  0.222856 -0.152162
## Salirepe -0.197203 -0.017971 -0.008170 -0.014209  0.015031 -0.104243
## Scorautu -0.146131  0.134240  0.015763 -0.059562  0.117768 -0.088453
## Trifprat  0.061485  0.069094 -0.135080 -0.066713 -0.001921  0.069306
## Trifrepe  0.010076  0.151446  0.031054  0.039480  0.082043 -0.081712
## Vicilath -0.014220  0.076123  0.084100 -0.026628  0.009338 -0.057100
## Bracruta -0.057834  0.021502 -0.054847  0.083143  0.070717 -0.115855
## Callcusp -0.107307 -0.059711  0.009214  0.169330 -0.068364  0.107835
## 
## 
## Site scores (weighted sums of species scores)
## 
##       RDA1    RDA2     RDA3      PC1      PC2       PC3
## 1   0.5831  0.1826  0.52222 -0.59008 -0.95148 -0.002365
## 2   0.4752  0.3624  0.90879  0.05773 -0.27927 -0.053465
## 3   0.5157 -0.3323  0.54177 -0.17847 -0.28527 -0.368932
## 4   0.4399 -0.3608  0.65164 -0.15066 -0.12951 -0.386916
## 5   0.2767  0.7135 -0.68027 -0.32785 -0.13648  0.331094
## 6   0.1630  0.7846 -0.99354 -0.24794  0.06413  0.280023
## 7   0.3075  0.7908 -0.69144 -0.29554  0.01116  0.283797
## 8   0.1248 -0.4586 -0.03533  0.58291 -0.02785 -0.296149
## 9   0.4391 -0.3568 -0.47994  0.28843  0.08904 -0.598766
## 10  0.1626  0.8889  0.52513 -0.10281 -0.02239  0.398688
## 11 -0.1034  0.6728  0.63968  0.04508  0.30166 -0.345223
## 12  0.2387 -0.6802 -0.39191  0.13764  0.90201 -0.091365
## 13  0.3253 -0.7706  0.08451  0.12875  0.52983  0.091846
## 14 -0.6531 -0.4778  0.18077  0.45837 -0.26343  0.275022
## 15 -0.6814 -0.5359 -0.46471  0.55931 -0.24900  0.020745
## 16 -0.2720 -1.1011 -0.44903  0.65282 -0.06558  0.757733
## 17 -0.5168  0.5924 -0.02957 -0.74414  0.25120  0.742874
## 18 -0.3035  0.6161  0.46554 -0.42677 -0.21642 -0.831729
## 19 -0.7255  0.1808  0.15665 -0.38151  0.78259 -0.238142
## 20 -0.7960 -0.7107 -0.46098  0.53475 -0.30493  0.031229
## 
## 
## Site constraints (linear combinations of constraining variables)
## 
##       RDA1     RDA2     RDA3      PC1      PC2       PC3
## 1   0.3051 -0.51040  0.15987 -0.59008 -0.95148 -0.002365
## 2   0.1781  0.64135  0.69120  0.05773 -0.27927 -0.053465
## 3   0.3051 -0.51040  0.15987 -0.17847 -0.28527 -0.368932
## 4   0.3051 -0.51040  0.15987 -0.15066 -0.12951 -0.386916
## 5   0.2622  0.29468 -0.57610 -0.32785 -0.13648  0.331094
## 6   0.2622  0.29468 -0.57610 -0.24794  0.06413  0.280023
## 7   0.2622  0.29468 -0.57610 -0.29554  0.01116  0.283797
## 8   0.2622  0.29468 -0.57610  0.58291 -0.02785 -0.296149
## 9   0.2622  0.29468 -0.57610  0.28843  0.08904 -0.598766
## 10  0.1781  0.64135  0.69120 -0.10281 -0.02239  0.398688
## 11  0.1781  0.64135  0.69120  0.04508  0.30166 -0.345223
## 12  0.3051 -0.51040  0.15987  0.13764  0.90201 -0.091365
## 13  0.3051 -0.51040  0.15987  0.12875  0.52983  0.091846
## 14 -0.6127 -0.05584 -0.02538  0.45837 -0.26343  0.275022
## 15 -0.6127 -0.05584 -0.02538  0.55931 -0.24900  0.020745
## 16  0.3051 -0.51040  0.15987  0.65282 -0.06558  0.757733
## 17 -0.6127 -0.05584 -0.02538 -0.74414  0.25120  0.742874
## 18 -0.6127 -0.05584 -0.02538 -0.42677 -0.21642 -0.831729
## 19 -0.6127 -0.05584 -0.02538 -0.38151  0.78259 -0.238142
## 20 -0.6127 -0.05584 -0.02538  0.53475 -0.30493  0.031229
## 
## 
## Biplot scores for constraining variables
## 
##                 RDA1     RDA2     RDA3 PC1 PC2 PC3
## ManagementHF  0.3756  0.42205 -0.82512   0   0   0
## ManagementNM -0.9950 -0.09068 -0.04122   0   0   0
## ManagementSF  0.4955 -0.82890  0.25963   0   0   0
## 
## 
## Centroids for factor constraints
## 
##                 RDA1     RDA2     RDA3 PC1 PC2 PC3
## ManagementBF  0.1781  0.64135  0.69120   0   0   0
## ManagementHF  0.2622  0.29468 -0.57610   0   0   0
## ManagementNM -0.6127 -0.05584 -0.02538   0   0   0
## ManagementSF  0.3051 -0.51040  0.15987   0   0   0

These results can be plotted via the ordiplot function.

plot2 <- ordiplot(Ordination.model2, choices=c(1,2))

A dendrogram can be added to the ordination diagram via the vegan::ordicluster function.

ordiplot(plot2)
cluster.data <- ordicluster(plot2, cl=as.hclust(dune.pv$hclust))

The data generated by ordicluster contain information to plot scores and segments:

str(cluster.data)

## List of 2
##  $ scores  : num [1:19, 1:3] 0.235 0.249 0.478 0.227 -0.203 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:3] "x" "y" "w"
##  $ segments:'data.frame':    19 obs. of  5 variables:
##   ..$ x1 : num [1:19] 0.163 0.277 0.516 0.163 -0.103 ...
##   ..$ y1 : num [1:19] 0.785 0.713 -0.332 0.889 0.673 ...
##   ..$ x2 : num [1:19] 0.308 0.235 0.44 0.249 -0.304 ...
##   ..$ y2 : num [1:19] 0.791 0.788 -0.361 0.763 0.616 ...
##   ..$ col: chr [1:19] "#000000" "#000000" "#000000" "#000000" ...
##  - attr(*, "class")= chr "ordicluster"

As shown in the document on how to generate ordination graphs with vegan, BiodiversityR and ggplot2, functions such as BiodiversityR::sites.long and BiodiversityR::axis.long extract data that can be plotted with ggplot2.

sites.long2 <- sites.long(plot2)
head(sites.long2)

##       axis1      axis2 labels
## 1 0.5831201  0.1825554      1
## 2 0.4751761  0.3624230      2
## 3 0.5157071 -0.3323011      3
## 4 0.4398671 -0.3607994      4
## 5 0.2767400  0.7134658      5
## 6 0.1630074  0.7845782      6

axis.long2 <- axis.long(Ordination.model2, choices=c(1, 2))
axis.long2

##   axis     ggplot        label
## 1    1 xlab.label RDA1 (16.9%)
## 2    2 ylab.label  RDA2 (9.5%)

In a similar way, data on the branch nodes and edges can be extracted. The ID field is created to link information from pvclust (section 4).

branches.scores <- data.frame(cluster.data$scores)
branches.scores$ID <- c(1:nrow(branches.scores))
head(branches.scores)

##            x          y w ID
## 1  0.2352643  0.7876763 2  1
## 2  0.2490896  0.7629395 3  2
## 3  0.4777871 -0.3465502 2  3
## 4  0.2274723  0.7944191 4  4
## 5 -0.2034553  0.6444570 2  5
## 6  0.2819862 -0.7254065 2  6

edges.scores <- data.frame(cluster.data$segments)
head(edges.scores)

##           x1         y1         x2         y2     col
## 1  0.1630074  0.7845782  0.3075213  0.7907744 #000000
## 2  0.2767400  0.7134658  0.2352643  0.7876763 #000000
## 3  0.5157071 -0.3323011  0.4398671 -0.3607994 #000000
## 4  0.1626206  0.8888581  0.2490896  0.7629395 #000000
## 5 -0.1033942  0.6727705 -0.3035164  0.6161435 #000000
## 6  0.2387007 -0.6802038  0.3252716 -0.7706092 #000000

Now the ordination with the dendrogram can be created in ggplot2.

BioR.theme <- theme(
        panel.background = element_blank(),
        panel.border = element_blank(),
        panel.grid = element_blank(),
        axis.line = element_line("gray25"),
        text = element_text(size = 12),
        axis.text = element_text(size = 10, colour = "gray25"),
        axis.title = element_text(size = 14, colour = "gray25"),
        legend.title = element_text(size = 14),
        legend.text = element_text(size = 14),
        legend.key = element_blank())

attach(dune.env)

plotgg2 <- ggplot() + 
    geom_vline(xintercept = c(0), color = "grey70", linetype = 2) +
    geom_hline(yintercept = c(0), color = "grey70", linetype = 2) +  
    xlab(axis.long2[1, "label"]) +
    ylab(axis.long2[2, "label"]) +  
    scale_x_continuous(sec.axis = dup_axis(labels=NULL, name=NULL)) +
    scale_y_continuous(sec.axis = dup_axis(labels=NULL, name=NULL)) +    
    geom_point(data=sites.long2, 
               aes(x=axis1, y=axis2, colour=Management, shape=Management), 
               alpha=0.9, size=5) +
    geom_segment(data=edges.scores,
               aes(x=x1, y=y1, xend=x2, yend=y2),
               colour="red", size=0.8) +
    geom_point(data=branches.scores, 
               aes(x=x, y=y), 
               colour="black", size=1.5) +
    BioR.theme +
    ggsci::scale_colour_npg() +
    coord_fixed(ratio=1)

plotgg2

5 Step 3: Add information for nodes

Information is added first about the merging process and the clustering height. In the notations used by hclust, negative numbers refer to agglomerations of singletons, whereas positive numbers refer to non-singletons. The information that is added thus shows that row 1 is a branch node at height 0.3939… that has ID 1 (as also in the pvclust dendrogram above) and that corresponds to the merging of sites 6 and 7. The branch node with ID 2 corresponds to the merger of site 5 with the non-singleton 1.

merge.data <- data.frame(cbind(dune.pv$hclust$merge, dune.pv$hclust$height))
names(merge.data) <- c("m1", "m2", "height")
merge.data$ID <- c(1:nrow(merge.data))
 
branches.scores2 <- left_join(branches.scores, merge.data, by="ID")
head(branches.scores2)

##            x          y w ID  m1  m2    height
## 1  0.2352643  0.7876763 2  1  -6  -7 0.3939843
## 2  0.2490896  0.7629395 3  2  -5   1 0.5001389
## 3  0.4777871 -0.3465502 2  3  -3  -4 0.5400578
## 4  0.2274723  0.7944191 4  4 -10   2 0.6026141
## 5 -0.2034553  0.6444570 2  5 -11 -18 0.6075805
## 6  0.2819862 -0.7254065 2  6 -12 -13 0.6159341

Via the ID field (this also is the sequence of hierarchical clustering), probability values are now added.

edges.pv <- data.frame(dune.pv$edges)
edges.pv$ID <- c(1:nrow(edges.pv))

branches.scores3 <- left_join(branches.scores2, edges.pv, by="ID")
head(branches.scores3)

##            x          y w ID  m1  m2    height        si        au        bp
## 1  0.2352643  0.7876763 2  1  -6  -7 0.3939843 0.6736731 0.8900162 0.6500593
## 2  0.2490896  0.7629395 3  2  -5   1 0.5001389 0.6247079 0.8646679 0.6509359
## 3  0.4777871 -0.3465502 2  3  -3  -4 0.5400578 0.5210710 0.8635731 0.4838776
## 4  0.2274723  0.7944191 4  4 -10   2 0.6026141 0.5211306 0.8525799 0.5174110
## 5 -0.2034553  0.6444570 2  5 -11 -18 0.6075805 0.9562655 0.9895152 0.8144038
## 6  0.2819862 -0.7254065 2  6 -12 -13 0.6159341 0.5860887 0.8704015 0.5612605
##         se.si       se.au       se.bp          v         c         pchi
## 1 0.009770065 0.004678329 0.001583062 -0.8060473 0.4205668 0.3088314904
## 2 0.010472806 0.005409584 0.001580079 -0.7446914 0.3568429 0.4865154078
## 3 0.012291046 0.005388623 0.001645255 -0.5280452 0.5684691 0.4163645695
## 4 0.012057944 0.005657406 0.001645565 -0.5456097 0.5019529 0.0003789444
## 5 0.002496067 0.000769902 0.001345803 -1.6013862 0.7071441 0.1291560223
## 6 0.011148662 0.005181026 0.001638187 -0.6412284 0.4870626 0.0199835695

As the information on edges is in the same sequence as the scores, the information on the probabilities can easily be added:

edges.scores2 <- data.frame(edges.scores, au=branches.scores3$au)
head(edges.scores2)

##           x1         y1         x2         y2     col        au
## 1  0.1630074  0.7845782  0.3075213  0.7907744 #000000 0.8900162
## 2  0.2767400  0.7134658  0.2352643  0.7876763 #000000 0.8646679
## 3  0.5157071 -0.3323011  0.4398671 -0.3607994 #000000 0.8635731
## 4  0.1626206  0.8888581  0.2490896  0.7629395 #000000 0.8525799
## 5 -0.1033942  0.6727705 -0.3035164  0.6161435 #000000 0.9895152
## 6  0.2387007 -0.6802038  0.3252716 -0.7706092 #000000 0.8704015

6 Step 4: Generate the dendrogram with ggraph

Now the information on the Approximately Unbiased p-value can be used to differentiate between significant and non-significant nodes at a threshold of 0.89.

plotgg2 <- ggplot() + 
    geom_vline(xintercept = c(0), color = "grey70", linetype = 2) +
    geom_hline(yintercept = c(0), color = "grey70", linetype = 2) +  
    xlab(axis.long2[1, "label"]) +
    ylab(axis.long2[2, "label"]) +  
    scale_x_continuous(sec.axis = dup_axis(labels=NULL, name=NULL)) +
    scale_y_continuous(sec.axis = dup_axis(labels=NULL, name=NULL)) +    
    geom_point(data=sites.long2, 
               aes(x=axis1, y=axis2, shape=Management), 
               colour="darkolivegreen4", alpha=0.9, size=5) +
    geom_segment(data=edges.scores2,
               aes(x=x1, y=y1, xend=x2, yend=y2, colour=au>=0.89, 
                   size=as.character(round(au, digits=1))),
               show.legend=FALSE) +
    geom_point(data=branches.scores3, 
               aes(x=x, y=y, colour=au>=0.89), 
               size=2) +
    geom_text(data=sites.long2,
              aes(x=axis1, y=axis2, label=labels)) +
    BioR.theme +
    ggsci::scale_colour_npg() +
    scale_size_manual(values=c(0.7, 0.8, 0.9, 1.5, 1.5)) +
    coord_fixed(ratio=1)

plotgg2

7 Session Information

sessionInfo()

## R version 4.0.2 (2020-06-22)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United Kingdom.1252 
## [2] LC_CTYPE=English_United Kingdom.1252   
## [3] LC_MONETARY=English_United Kingdom.1252
## [4] LC_NUMERIC=C                           
## [5] LC_TIME=English_United Kingdom.1252    
## 
## attached base packages:
## [1] tcltk     stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] tidygraph_1.2.0      ggraph_2.0.4         ggsci_2.9           
##  [4] ggplot2_3.3.2        pvclust_2.2-0        BiodiversityR_2.12-2
##  [7] rgl_0.100.54         vegan3d_1.1-2        vegan_2.5-6         
## [10] lattice_0.20-41      permute_0.9-5       
## 
## loaded via a namespace (and not attached):
##   [1] minqa_1.2.4             colorspace_1.4-1        ellipsis_0.3.1         
##   [4] class_7.3-17            rio_0.5.16              htmlTable_2.0.0        
##   [7] base64enc_0.1-3         rstudioapi_0.11         farver_2.0.3           
##  [10] graphlayouts_0.7.1      ggrepel_0.8.2           splines_4.0.2          
##  [13] knitr_1.28              effects_4.1-4           polyclip_1.10-0        
##  [16] Formula_1.2-3           jsonlite_1.6.1          nloptr_1.2.2.1         
##  [19] cluster_2.1.0           Rcmdr_2.6-2             png_0.1-7              
##  [22] ggforce_0.3.2           shiny_1.4.0.2           compiler_4.0.2         
##  [25] backports_1.1.7         Matrix_1.2-18           fastmap_1.0.1          
##  [28] survey_4.0              tweenr_1.0.1            later_1.1.0.1          
##  [31] tcltk2_1.2-11           acepack_1.4.1           htmltools_0.5.0        
##  [34] tools_4.0.2             igraph_1.2.6            gtable_0.3.0           
##  [37] glue_1.4.1              dplyr_1.0.2             Rcpp_1.0.4.6           
##  [40] carData_3.0-4           cellranger_1.1.0        vctrs_0.3.4            
##  [43] nlme_3.1-148            crosstalk_1.1.0.1       xfun_0.15              
##  [46] stringr_1.4.0           openxlsx_4.1.5          lme4_1.1-23            
##  [49] mime_0.9                miniUI_0.1.1.1          lifecycle_0.2.0        
##  [52] RcmdrMisc_2.7-0         statmod_1.4.34          MASS_7.3-51.6          
##  [55] zoo_1.8-8               scales_1.1.1            hms_0.5.3              
##  [58] promises_1.1.1          parallel_4.0.2          sandwich_2.5-1         
##  [61] RColorBrewer_1.1-2      yaml_2.2.1              curl_4.3               
##  [64] gridExtra_2.3           rpart_4.1-15            latticeExtra_0.6-29    
##  [67] stringi_1.4.6           nortest_1.0-4           e1071_1.7-3            
##  [70] checkmate_2.0.0         boot_1.3-25             zip_2.0.4              
##  [73] manipulateWidget_0.10.1 rlang_0.4.8             pkgconfig_2.0.3        
##  [76] evaluate_0.14           purrr_0.3.4             labeling_0.3           
##  [79] htmlwidgets_1.5.1       tidyselect_1.1.0        magrittr_1.5           
##  [82] R6_2.4.1                generics_0.1.0          Hmisc_4.4-0            
##  [85] DBI_1.1.0               pillar_1.4.4            haven_2.3.1            
##  [88] foreign_0.8-80          withr_2.2.0             mgcv_1.8-31            
##  [91] survival_3.1-12         scatterplot3d_0.3-41    abind_1.4-5            
##  [94] nnet_7.3-14             tibble_3.0.1            crayon_1.3.4           
##  [97] car_3.0-8               relimp_1.0-5            rmarkdown_2.3          
## [100] viridis_0.5.1           jpeg_0.1-8.1            grid_4.0.2             
## [103] readxl_1.3.1            data.table_1.12.8       forcats_0.5.0          
## [106] digest_0.6.25           webshot_0.5.2           xtable_1.8-4           
## [109] tidyr_1.1.2             httpuv_1.5.4            munsell_0.5.0          
## [112] viridisLite_0.3.0       mitools_2.4

Adding dendrograms to ordination graphs with pvclust, vegan, BiodiversityR and ggplot2

Roeland Kindt

10/12/2020