Click here for other works of the author on RPubs

Load packages

library(vegan)
library(cluster)
library(knitr)

Q1. Use the “dominant” copepod species data (from HW1) and environmental data (from HW3). Apply constrained ordination: 1)determine an appropriate response model; 2)select a set of independent variables; 3) conduct ordination (RDA or CCA); 4) interpret the results. Try different plotting methods; e.g., to display your cluster analysis results on the triplot and interpret how the groups vary with respect to environmental variables.

Import data

Import copepod data, environmental data and extract dominant species data. Also extract group from the result of hierarchical clustering in previous homework.

#import copepod data
copdata = read.table("copepod_composition.txt", header = T)

#import species name
species_name <- read.table("copepodSPlist.txt", sep = "\t")

#assign copepod species name
rownames(copdata) <- as.character(unlist(species_name))

#convert species frequency into percentage
copdata = copdata / 100

#load and extract environmental data
envdata = read.table("enviANDdensity.txt", header = T)
rownames(envdata) <- envdata[, 1]
envdata = envdata[, -1]

#find dominant species
dom = apply(copdata >= 0.02, 2, which)
dom = sort(unique(unlist(dom)))
dom = t(copdata[dom, ])

#normalize species data
dom_log = log(dom + 1)
dom_norm = dom_log / rowSums(dom_log)

#calculate distance matrix
dist_dom = vegdist(dom, method = "jaccard")

#perform hierarchical clustring
hc_dom = agnes(dist_dom, method = "ward")
group = cutree(hc_dom, 3)
group = factor(group, labels = c("Spring", "Summer", "Winter"))

Determine an appropriate response model

Using the gradient(axis) length from DCA, determine whether linear or unimodel model is more appropriate.

dom_dca <- decorana(dom_log)
dom_dca
## 
## Call:
## decorana(veg = dom_log) 
## 
## Detrended correspondence analysis with 26 segments.
## Rescaling of axes with 4 iterations.
## 
##                   DCA1   DCA2   DCA3    DCA4
## Eigenvalues     0.5343 0.2656 0.2181 0.11820
## Decorana values 0.5382 0.2616 0.1193 0.06233
## Axis lengths    3.8675 2.3294 1.5956 1.74621

It appears that a unimodel is more likely, redundancy analysis will be used.

Explore magnitude and dependency of independent variables

Calculate the inertia proportion and VIF in RDA of each independent variable.

con_eig = numeric()
for(i in 1:ncol(envdata)){
    m = rda(dom_norm ~ envdata[, i])
    con_eig[i] = m$CCA$eig / m$tot.chi
}
kable(cbind(con_eig, vif.cca(rda(dom_norm ~ ., data = data.frame(envdata)))), col.names = c("Inertia proportion", "VIF"), digits = 3)
Inertia proportion VIF
Depth 0.079 4.298
Temperature 0.303 332.962
Salinity 0.095 21.043
Fluorescence 0.047 17.872
DissolvedOxygen 0.290 308.418
maxT 0.307 96.952
maxS 0.086 17.003
maxF 0.051 19.787
MaxDO 0.213 16.095
FishDensity 0.095 2.410
CopepodDensity 0.101 3.505

Variable selection

Exclude variables with a VIF larger than 10 to avoid multicollinearity. Variables excluded are Dissolved Oxygen, Temperature and MaxF. The inertia proportion and VIF in RDA of other independent variables are shown below.

envdata_uncor = envdata[, -c(2, 5, 6)]
kable(cbind(con_eig[-c(5, 2, 6)], vif.cca(rda(dom_norm ~ . -DissolvedOxygen - Temperature - maxF, data = data.frame(envdata)))), col.names = c("Inertia proportion", "VIF"), digits = 3)
Inertia proportion VIF
Depth 0.079 3.146
Salinity 0.095 8.462
Fluorescence 0.047 1.318
maxT 0.086 7.757
maxS 0.051 6.780
MaxDO 0.213 8.631
FishDensity 0.095 2.170
CopepodDensity 0.101 1.879

Conduct redundancy analysis

dom_rda <- rda(dom_norm ~ ., data = envdata_uncor)
summary(dom_rda)
## 
## Call:
## rda(formula = dom_norm ~ Depth + Salinity + Fluorescence + maxS +      maxF + MaxDO + FishDensity + CopepodDensity, data = envdata_uncor) 
## 
## Partitioning of variance:
##               Inertia Proportion
## Total         0.10516     1.0000
## Constrained   0.05619     0.5344
## Unconstrained 0.04897     0.4656
## 
## Eigenvalues, and their contribution to the variance 
## 
## Importance of components:
##                         RDA1     RDA2     RDA3     RDA4     RDA5     RDA6
## Eigenvalue            0.0316 0.009497 0.005765 0.003391 0.002735 0.001682
## Proportion Explained  0.3005 0.090310 0.054820 0.032250 0.026010 0.015990
## Cumulative Proportion 0.3005 0.390800 0.445620 0.477870 0.503880 0.519870
##                           RDA7      RDA8     PC1    PC2      PC3      PC4
## Eigenvalue            0.001031 0.0004915 0.01508 0.0113 0.005853 0.004189
## Proportion Explained  0.009800 0.0046700 0.14336 0.1075 0.055660 0.039830
## Cumulative Proportion 0.529680 0.5343500 0.67771 0.7852 0.840840 0.880670
##                            PC5      PC6      PC7      PC8       PC9
## Eigenvalue            0.002989 0.002277 0.001425 0.001231 0.0009821
## Proportion Explained  0.028420 0.021660 0.013550 0.011700 0.0093400
## Cumulative Proportion 0.909090 0.930750 0.944300 0.956000 0.9653400
##                            PC10      PC11      PC12      PC13      PC14
## Eigenvalue            0.0008502 0.0006587 0.0004515 0.0003584 0.0002923
## Proportion Explained  0.0080900 0.0062600 0.0042900 0.0034100 0.0027800
## Cumulative Proportion 0.9734200 0.9796900 0.9839800 0.9873900 0.9901700
##                            PC15      PC16      PC17      PC18      PC19
## Eigenvalue            0.0002147 0.0002144 0.0001516 0.0001064 9.793e-05
## Proportion Explained  0.0020400 0.0020400 0.0014400 0.0010100 9.300e-04
## Cumulative Proportion 0.9922100 0.9942500 0.9956900 0.9967000 9.976e-01
##                            PC20      PC21      PC22      PC23      PC24
## Eigenvalue            8.487e-05 5.921e-05 5.392e-05 2.566e-05 0.0000192
## Proportion Explained  8.100e-04 5.600e-04 5.100e-04 2.400e-04 0.0001800
## Cumulative Proportion 9.984e-01 9.990e-01 9.995e-01 9.998e-01 0.9999400
##                            PC25
## Eigenvalue            6.005e-06
## Proportion Explained  6.000e-05
## Cumulative Proportion 1.000e+00
## 
## Accumulated constrained eigenvalues
## Importance of components:
##                         RDA1     RDA2     RDA3     RDA4     RDA5     RDA6
## Eigenvalue            0.0316 0.009497 0.005765 0.003391 0.002735 0.001682
## Proportion Explained  0.5623 0.169020 0.102590 0.060350 0.048680 0.029930
## Cumulative Proportion 0.5623 0.731360 0.833950 0.894300 0.942980 0.972900
##                           RDA7      RDA8
## Eigenvalue            0.001031 0.0004915
## Proportion Explained  0.018350 0.0087500
## Cumulative Proportion 0.991250 1.0000000
## 
## Scaling 2 for species and site scores
## * Species are scaled proportional to eigenvalues
## * Sites are unscaled: weighted dispersion equal on all dimensions
## * General scaling constant of scores:  1.364865 
## 
## 
## Species scores
## 
##                                               RDA1       RDA2       RDA3
## Acartia negligence                      -1.613e-02  0.0066936  0.0166971
## Acartia pacifica                        -2.440e-02 -0.2003945 -0.1056387
## Calanus sinicus                          1.162e-01 -0.0089688  0.1765352
## Canthocalanus pauper                    -9.446e-02  0.0471382  0.0161558
## Cosmocalanus darwinii                   -2.292e-03 -0.0021161  0.0071617
## Undinula vulgaris                       -1.823e-02  0.0125965  0.0098149
## Clausocalanus furcatus                  -2.372e-02  0.0161094  0.0505102
## Clausocalanus minor                     -4.567e-03 -0.0004279  0.0340161
## Subeucalanus pileatus                   -1.545e-02 -0.0009436  0.0092279
## Subeucalanus subcrassus                 -5.152e-03 -0.0173128  0.0140830
## Subeucalanus copepodid                  -9.980e-03  0.0097841 -0.0117043
## Euchaeta concinna                       -4.878e-03 -0.0071807  0.0177724
## Euchaeta copepodid                      -3.977e-02 -0.2677182  0.0467150
## Acrocalanus gibber                      -7.064e-02  0.0265168  0.0112827
## Acrocalanus gracilis                    -1.661e-02  0.0102458 -0.0005696
## Calocalanus gracilis                     8.131e-05  0.0012499 -0.0005341
## Calocalanus pavoninus                   -1.600e-02  0.0089847  0.0016528
## Calocalanus plumulosus                  -2.982e-03  0.0036022 -0.0021647
## Calocalanus styliremis                  -8.187e-03  0.0047680  0.0044886
## Paracalanus aculeatus                    1.688e-02 -0.1194493  0.0764932
## Paracalanus pavus                        5.733e-01 -0.0217272 -0.0924690
## Paracalanus serrulus                    -4.384e-02  0.0341911 -0.0240809
## Parvocalanus crassirostris              -2.297e-01  0.0054013 -0.0682110
## Scolecithricella longispinosa           -1.676e-02 -0.0499967  0.0025899
## Temora stylifera                         6.431e-03  0.0045742 -0.0023920
## Temora turbinata                        -1.290e-01  0.0570917  0.0031643
## Oithona attenuata                       -7.444e-02  0.0214674 -0.0200723
## Oithona brevicornis                     -2.208e-02  0.0048290 -0.0513257
## Oithona fallax                          -1.187e-02 -0.0093674  0.0152255
## Oithona plumifera                       -2.470e-02  0.0237048  0.0163934
## Oithona similis                          1.993e-01  0.0445355 -0.0220831
## Euterpina acutifrons                    -5.246e-02  0.0427563 -0.1404274
## Corycaeus (Ditrichocorycaeus)  affinis   2.561e-01  0.0991534  0.0540791
## Corycaeus (Ditrichocorycaeus) dahli     -1.824e-02  0.0457757 -0.0077550
## Corycaeus (Ditrichocorycaeus) lubbocki  -2.238e-02  0.0007678 -0.0732986
## Corycaeus (Ditrichocorycaeus) subtilis  -6.012e-03  0.0070995 -0.0020088
## Corycaeus (Onychocorycaeus) giesbrechti -1.525e-02  0.0146091  0.0015415
## Farranula gibbula                       -9.276e-03  0.0029722  0.0041894
## Corycaeidae copepodid                    2.043e-02 -0.0023643 -0.0058098
## Oncaea conifera                         -3.699e-02  0.0383077  0.0135828
## Oncaea media                            -3.157e-03  0.0040357  0.0020102
## Oncaea mediterranea                     -1.607e-03 -0.0001428  0.0114126
## Oncaea venusta                          -9.757e-02  0.1091487  0.0137496
##                                               RDA4       RDA5       RDA6
## Acartia negligence                       0.0061294 -1.271e-02  0.0004109
## Acartia pacifica                        -0.0507664  5.902e-02  0.0549766
## Calanus sinicus                         -0.0129707  3.529e-02 -0.0551542
## Canthocalanus pauper                    -0.0046357  1.759e-03  0.0285574
## Cosmocalanus darwinii                    0.0091265 -1.089e-02 -0.0052489
## Undinula vulgaris                        0.0016305 -4.718e-03  0.0142715
## Clausocalanus furcatus                   0.0446816  6.290e-04  0.0100279
## Clausocalanus minor                      0.0287467 -8.337e-03 -0.0240464
## Subeucalanus pileatus                    0.0076074  1.844e-02 -0.0163972
## Subeucalanus subcrassus                 -0.0018486 -2.563e-03 -0.0047724
## Subeucalanus copepodid                   0.0020443 -1.713e-02  0.0121885
## Euchaeta concinna                        0.0018575  1.240e-03 -0.0075792
## Euchaeta copepodid                      -0.0133739  1.414e-02 -0.0022115
## Acrocalanus gibber                      -0.0112257 -3.412e-02  0.0168901
## Acrocalanus gracilis                    -0.0055522 -7.555e-05  0.0054843
## Calocalanus gracilis                     0.0102284 -9.225e-03  0.0003199
## Calocalanus pavoninus                   -0.0015867  1.563e-04  0.0121208
## Calocalanus plumulosus                   0.0006416 -7.128e-03  0.0024492
## Calocalanus styliremis                   0.0011451  1.980e-04  0.0044314
## Paracalanus aculeatus                    0.0390604 -5.551e-02 -0.0147536
## Paracalanus pavus                        0.0418777 -8.503e-02  0.0154238
## Paracalanus serrulus                    -0.0121168  1.499e-02  0.0126946
## Parvocalanus crassirostris              -0.0966554 -1.034e-01 -0.0429043
## Scolecithricella longispinosa           -0.0019227  1.964e-02 -0.0151500
## Temora stylifera                         0.0060545 -1.856e-02 -0.0044696
## Temora turbinata                        -0.0631756 -3.710e-02  0.0266675
## Oithona attenuata                       -0.0369333 -1.812e-02  0.0054565
## Oithona brevicornis                      0.0225786  3.963e-02 -0.0372480
## Oithona fallax                           0.0078700  1.226e-02 -0.0104882
## Oithona plumifera                        0.0573811 -3.896e-02 -0.0099797
## Oithona similis                         -0.0551523  8.818e-02  0.0201849
## Euterpina acutifrons                     0.0455449  3.786e-02 -0.0775412
## Corycaeus (Ditrichocorycaeus)  affinis  -0.1349144  2.563e-02 -0.0239997
## Corycaeus (Ditrichocorycaeus) dahli     -0.0133767 -9.218e-03  0.0421101
## Corycaeus (Ditrichocorycaeus) lubbocki   0.0166318  4.578e-02 -0.0557842
## Corycaeus (Ditrichocorycaeus) subtilis   0.0090116  2.674e-03  0.0057958
## Corycaeus (Onychocorycaeus) giesbrechti -0.0007218  2.328e-03  0.0070142
## Farranula gibbula                        0.0072724 -4.273e-03  0.0094762
## Corycaeidae copepodid                    0.0251405 -5.910e-03  0.0115081
## Oncaea conifera                          0.0254727  2.434e-02  0.0438139
## Oncaea media                             0.0088167 -2.986e-03  0.0034507
## Oncaea mediterranea                      0.0100475  1.847e-03 -0.0009639
## Oncaea venusta                           0.0803297  3.992e-02  0.0429679
## 
## 
## Site scores (weighted sums of species scores)
## 
##          RDA1     RDA2      RDA3      RDA4       RDA5      RDA6
## p1   0.378183  0.33566  0.409134 -0.990324  0.3092341 -0.570460
## p3   0.494807  0.15886 -0.128825 -0.252071  0.0363804  0.075256
## p4   0.550405  0.19149 -0.175446 -0.386636 -0.2197150  0.016142
## p6   0.530685  0.08469 -0.238932  0.130117 -0.1515723  0.213461
## p13  0.268846  0.08685 -0.082499  0.233543  0.0386053  0.254653
## p16  0.486329 -0.11902 -0.347987  0.526109 -0.8392949  0.271107
## p19  0.188306 -0.01875 -0.117454  0.413577 -0.3884204  0.076249
## p21  0.318032  0.14689  0.263248 -0.396238  0.4371574 -0.207531
## p23  0.033390 -0.03511  0.091596  0.463644 -0.2351448  0.006492
## p25  0.208133  0.00444 -0.024171  0.375773 -0.4026001 -0.088705
## s18  0.008868  0.06460 -0.195732  0.184048 -0.0002522  0.253895
## s19 -0.063310  0.07659 -0.224811  0.129697 -0.0419520 -0.187762
## s20 -0.117986  0.09805 -0.521952  0.069970  0.1508586 -0.937519
## s22 -0.208821  0.11436 -0.462668  0.001804  0.4306807 -0.702021
## s23 -0.179750  0.09621 -0.267371  0.112627  0.5875032 -0.039737
## s25 -0.152502  0.18045 -0.185277 -0.041166 -0.0155947  0.042104
## s27 -0.292515  0.21675 -0.064658 -0.344936 -0.3413938 -0.005944
## s29 -0.394324  0.21210 -0.189847 -0.820066 -0.6689440 -0.310707
## sA  -0.165784  0.21211  0.028269  0.166720  0.0203732  0.278107
## sB  -0.171299  0.24621  0.169263  0.550501  0.2827636  0.528109
## sC  -0.232010  0.21650 -0.035076 -0.343489 -0.3288231  0.255799
## sD  -0.195314  0.21300 -0.057387 -0.095009 -0.2796094  0.180029
## sE  -0.115673  0.24522  0.012515  0.238684  0.0330773  0.440605
## sF  -0.261980  0.32328  0.064630  0.158064  0.4075442  0.721562
## sG  -0.441491  0.18153 -0.326229 -0.992307 -0.9680952 -0.720177
## w22 -0.013037 -0.73337 -0.003308 -0.017838  0.3519897  0.222538
## w23 -0.105439 -0.86856 -0.310856 -0.305174  0.7628166  0.814855
## w25  0.101813 -0.60280  0.044396  0.186810 -0.1423778  0.130233
## w27 -0.069752 -0.45113  0.299674  0.150726  0.0877136  0.045221
## w29  0.097248 -0.22048  0.122250  0.133192 -0.2263512  0.169066
## wA  -0.072200 -0.31488  0.716717  0.040845  0.2538619 -0.474141
## wB  -0.136782 -0.06896  0.735362  0.227126  0.5009640 -0.477626
## wC  -0.149859  0.02885  0.491629  0.314842  0.2094188 -0.051301
## wD  -0.125217 -0.30163  0.511803  0.176835  0.3491984 -0.221853
## 
## 
## Site constraints (linear combinations of constraining variables)
## 
##          RDA1      RDA2      RDA3     RDA4      RDA5     RDA6
## p1   0.377157  0.166911  0.132759 -0.24429 -0.108799 -0.08927
## p3   0.381919  0.170042  0.146875 -0.29547  0.085654 -0.09851
## p4   0.380880  0.224745  0.114751 -0.19421  0.118073 -0.05643
## p6   0.578294  0.010030 -0.061755 -0.28611  0.270135  0.12358
## p13  0.255788 -0.040458 -0.017813  0.21671  0.240762  0.21960
## p16  0.279525 -0.133501 -0.220250  0.57402 -0.275382  0.31218
## p19  0.126637  0.056486 -0.039387  0.15467 -0.432181 -0.14747
## p21  0.400553 -0.001720 -0.080404 -0.07853  0.203105  0.18838
## p23  0.008925  0.008747 -0.085838  0.31268 -0.461344 -0.02594
## p25  0.180296  0.103965 -0.025466  0.04236 -0.464902 -0.26394
## s18 -0.107637  0.089693 -0.266981  0.08832  0.334592 -0.41621
## s19 -0.054866 -0.033759 -0.072282 -0.06643 -0.135943 -0.22891
## s20 -0.027403 -0.088096 -0.352210  0.20898  0.231276 -0.54919
## s22 -0.240147  0.062207 -0.504207  0.15198  0.393756 -0.17527
## s23 -0.090744  0.349879 -0.406773  0.09601  0.134167 -0.24323
## s25 -0.133238  0.216673 -0.241368  0.21807 -0.147707  0.47054
## s27 -0.297878 -0.122106  0.048753 -0.12748 -0.058612 -0.02126
## s29 -0.147523  0.329732 -0.116815 -0.34985 -0.265984  0.02499
## sA  -0.136856  0.118901  0.366882  0.12152  0.297438  0.03695
## sB   0.005854  0.125703  0.190425  0.48458  0.158777  0.43340
## sC  -0.401663  0.116530  0.182832 -0.23436  0.032714  0.17057
## sD  -0.150819  0.008598  0.182098  0.26707  0.040635  0.12860
## sE  -0.410558  0.213039 -0.102801 -0.34153  0.011258  0.18992
## sF  -0.164057  0.505893 -0.124254 -0.25060 -0.076425  0.39898
## sG  -0.226431 -0.312398 -0.009409 -0.24386 -0.297112 -0.03580
## w22  0.065072 -0.627123 -0.300520 -0.37275  0.157212  0.27240
## w23 -0.050870 -0.539990 -0.145100 -0.06726  0.136293  0.23280
## w25 -0.121154 -0.408422  0.037758 -0.04960 -0.029172  0.05016
## w27 -0.121082 -0.287972  0.163004 -0.04620 -0.200849 -0.07712
## w29  0.145690 -0.145179  0.039354 -0.08763 -0.370019 -0.22791
## wA  -0.117656 -0.022858  0.487927  0.06768  0.202597 -0.09245
## wB  -0.066314  0.018854  0.462148  0.05868  0.102581 -0.15166
## wC  -0.040468 -0.056224  0.318434  0.13148  0.001981 -0.17145
## wD  -0.079226 -0.076822  0.299634  0.14136  0.171424 -0.18103
## 
## 
## Biplot scores for constraining variables
## 
##                   RDA1     RDA2    RDA3     RDA4      RDA5     RDA6
## Depth          -0.3525  0.10485  0.8121  0.25295  0.290818  0.08751
## Salinity       -0.4467  0.45777  0.2589  0.41626 -0.412253 -0.38119
## Fluorescence   -0.2130  0.11410 -0.6226  0.18357  0.429798 -0.56574
## maxS           -0.4065  0.42489  0.2747  0.64093 -0.154977 -0.26194
## maxF           -0.1999  0.19859 -0.6715  0.38093  0.346589 -0.38204
## MaxDO           0.8190 -0.03405  0.3453  0.02071  0.387484  0.18734
## FishDensity    -0.4420  0.52026 -0.3978  0.12597 -0.197432  0.08749
## CopepodDensity -0.3840  0.68551 -0.4708 -0.09476  0.007639  0.36062

Our RDA model accounts for 53% of inertia (variance).

Visulize result of redundancy analysis with triplots

Groups we get from hierarchical clustering will be shown with ellipse.

#triplot without names of species and sites
ordiplot(dom_rda)
ordiellipse(dom_rda,group,conf = 0.95, lty = 2, label = T)

#triplot with names of species and sites
ordiplot(dom_rda, type = "text")
ordiellipse(dom_rda,group,conf = 0.95, lty = 2)

We can see that the \(1^{st}\) axis of RDA has almost the same direction as MaxDO with Depth and Fluorescence having oppisite direction with it. It also seems that the \(1^{st}\) axis seperates sites in spring from others and the \(2^{nd}\) axis seperates winter from others. However, overlaps do exist and this means these RDA axis might not reflect season differences too well.

Q2. Use the “dominant” copepod species data (from HW1) and environmental data (from HW3). Apply partial RDA (or CCA) to quantify how variance of species community is attributed to biological and physical habitat effects. In enviANDdensity.xls, columns B-J are physical variables, and columns K and L are biological variables.

Extract biological and physical hibitat variables

phy = envdata[, 1:9]
bio = envdata[, 10:11]

Conduct partial RDA for biological and physical hibitat effects

all_rda <- rda(dom_norm ~ ., data = envdata)
phy_rda <- rda(dom_norm, phy, bio)
bio_rda <- rda(dom_norm, bio, phy)
phy_r2 = phy_rda$CCA$tot.chi / phy_rda$tot.chi
bio_r2 = bio_rda$CCA$tot.chi / bio_rda$tot.chi
joint_r2 = all_rda$CCA$tot.chi / all_rda$tot.chi - phy_r2 - bio_r2

Physical variables along account for 53% of variance, biological variables along account for 4% of variance and the joint effect of these variables account for 9% of variance. These variance are not adjusted by number of variables.

Visulize result of variance partitioning

Graphical results are based on adjusted \(R^2\) (variance)

vp = varpart(dom_norm, phy, bio)
plot(vp, bg = c("red", "blue"), Xnames = c("Physical", "Biological"), alpha = 80)

Physical variables account for much more variance than biological variables.