Click here for other works of the author on RPubs
library(vegan)
library(cluster)
library(knitr)
Import copepod data, environmental data and extract dominant species data. Also extract group from the result of hierarchical clustering in previous homework.
#import copepod data
copdata = read.table("copepod_composition.txt", header = T)
#import species name
species_name <- read.table("copepodSPlist.txt", sep = "\t")
#assign copepod species name
rownames(copdata) <- as.character(unlist(species_name))
#convert species frequency into percentage
copdata = copdata / 100
#load and extract environmental data
envdata = read.table("enviANDdensity.txt", header = T)
rownames(envdata) <- envdata[, 1]
envdata = envdata[, -1]
#find dominant species
dom = apply(copdata >= 0.02, 2, which)
dom = sort(unique(unlist(dom)))
dom = t(copdata[dom, ])
#normalize species data
dom_log = log(dom + 1)
dom_norm = dom_log / rowSums(dom_log)
#calculate distance matrix
dist_dom = vegdist(dom, method = "jaccard")
#perform hierarchical clustring
hc_dom = agnes(dist_dom, method = "ward")
group = cutree(hc_dom, 3)
group = factor(group, labels = c("Spring", "Summer", "Winter"))
Using the gradient(axis) length from DCA, determine whether linear or unimodel model is more appropriate.
dom_dca <- decorana(dom_log)
dom_dca
##
## Call:
## decorana(veg = dom_log)
##
## Detrended correspondence analysis with 26 segments.
## Rescaling of axes with 4 iterations.
##
## DCA1 DCA2 DCA3 DCA4
## Eigenvalues 0.5343 0.2656 0.2181 0.11820
## Decorana values 0.5382 0.2616 0.1193 0.06233
## Axis lengths 3.8675 2.3294 1.5956 1.74621
It appears that a unimodel is more likely, redundancy analysis will be used.
Calculate the inertia proportion and VIF in RDA of each independent variable.
con_eig = numeric()
for(i in 1:ncol(envdata)){
m = rda(dom_norm ~ envdata[, i])
con_eig[i] = m$CCA$eig / m$tot.chi
}
kable(cbind(con_eig, vif.cca(rda(dom_norm ~ ., data = data.frame(envdata)))), col.names = c("Inertia proportion", "VIF"), digits = 3)
| Inertia proportion | VIF | |
|---|---|---|
| Depth | 0.079 | 4.298 |
| Temperature | 0.303 | 332.962 |
| Salinity | 0.095 | 21.043 |
| Fluorescence | 0.047 | 17.872 |
| DissolvedOxygen | 0.290 | 308.418 |
| maxT | 0.307 | 96.952 |
| maxS | 0.086 | 17.003 |
| maxF | 0.051 | 19.787 |
| MaxDO | 0.213 | 16.095 |
| FishDensity | 0.095 | 2.410 |
| CopepodDensity | 0.101 | 3.505 |
Exclude variables with a VIF larger than 10 to avoid multicollinearity. Variables excluded are Dissolved Oxygen, Temperature and MaxF. The inertia proportion and VIF in RDA of other independent variables are shown below.
envdata_uncor = envdata[, -c(2, 5, 6)]
kable(cbind(con_eig[-c(5, 2, 6)], vif.cca(rda(dom_norm ~ . -DissolvedOxygen - Temperature - maxF, data = data.frame(envdata)))), col.names = c("Inertia proportion", "VIF"), digits = 3)
| Inertia proportion | VIF | |
|---|---|---|
| Depth | 0.079 | 3.146 |
| Salinity | 0.095 | 8.462 |
| Fluorescence | 0.047 | 1.318 |
| maxT | 0.086 | 7.757 |
| maxS | 0.051 | 6.780 |
| MaxDO | 0.213 | 8.631 |
| FishDensity | 0.095 | 2.170 |
| CopepodDensity | 0.101 | 1.879 |
dom_rda <- rda(dom_norm ~ ., data = envdata_uncor)
summary(dom_rda)
##
## Call:
## rda(formula = dom_norm ~ Depth + Salinity + Fluorescence + maxS + maxF + MaxDO + FishDensity + CopepodDensity, data = envdata_uncor)
##
## Partitioning of variance:
## Inertia Proportion
## Total 0.10516 1.0000
## Constrained 0.05619 0.5344
## Unconstrained 0.04897 0.4656
##
## Eigenvalues, and their contribution to the variance
##
## Importance of components:
## RDA1 RDA2 RDA3 RDA4 RDA5 RDA6
## Eigenvalue 0.0316 0.009497 0.005765 0.003391 0.002735 0.001682
## Proportion Explained 0.3005 0.090310 0.054820 0.032250 0.026010 0.015990
## Cumulative Proportion 0.3005 0.390800 0.445620 0.477870 0.503880 0.519870
## RDA7 RDA8 PC1 PC2 PC3 PC4
## Eigenvalue 0.001031 0.0004915 0.01508 0.0113 0.005853 0.004189
## Proportion Explained 0.009800 0.0046700 0.14336 0.1075 0.055660 0.039830
## Cumulative Proportion 0.529680 0.5343500 0.67771 0.7852 0.840840 0.880670
## PC5 PC6 PC7 PC8 PC9
## Eigenvalue 0.002989 0.002277 0.001425 0.001231 0.0009821
## Proportion Explained 0.028420 0.021660 0.013550 0.011700 0.0093400
## Cumulative Proportion 0.909090 0.930750 0.944300 0.956000 0.9653400
## PC10 PC11 PC12 PC13 PC14
## Eigenvalue 0.0008502 0.0006587 0.0004515 0.0003584 0.0002923
## Proportion Explained 0.0080900 0.0062600 0.0042900 0.0034100 0.0027800
## Cumulative Proportion 0.9734200 0.9796900 0.9839800 0.9873900 0.9901700
## PC15 PC16 PC17 PC18 PC19
## Eigenvalue 0.0002147 0.0002144 0.0001516 0.0001064 9.793e-05
## Proportion Explained 0.0020400 0.0020400 0.0014400 0.0010100 9.300e-04
## Cumulative Proportion 0.9922100 0.9942500 0.9956900 0.9967000 9.976e-01
## PC20 PC21 PC22 PC23 PC24
## Eigenvalue 8.487e-05 5.921e-05 5.392e-05 2.566e-05 0.0000192
## Proportion Explained 8.100e-04 5.600e-04 5.100e-04 2.400e-04 0.0001800
## Cumulative Proportion 9.984e-01 9.990e-01 9.995e-01 9.998e-01 0.9999400
## PC25
## Eigenvalue 6.005e-06
## Proportion Explained 6.000e-05
## Cumulative Proportion 1.000e+00
##
## Accumulated constrained eigenvalues
## Importance of components:
## RDA1 RDA2 RDA3 RDA4 RDA5 RDA6
## Eigenvalue 0.0316 0.009497 0.005765 0.003391 0.002735 0.001682
## Proportion Explained 0.5623 0.169020 0.102590 0.060350 0.048680 0.029930
## Cumulative Proportion 0.5623 0.731360 0.833950 0.894300 0.942980 0.972900
## RDA7 RDA8
## Eigenvalue 0.001031 0.0004915
## Proportion Explained 0.018350 0.0087500
## Cumulative Proportion 0.991250 1.0000000
##
## Scaling 2 for species and site scores
## * Species are scaled proportional to eigenvalues
## * Sites are unscaled: weighted dispersion equal on all dimensions
## * General scaling constant of scores: 1.364865
##
##
## Species scores
##
## RDA1 RDA2 RDA3
## Acartia negligence -1.613e-02 0.0066936 0.0166971
## Acartia pacifica -2.440e-02 -0.2003945 -0.1056387
## Calanus sinicus 1.162e-01 -0.0089688 0.1765352
## Canthocalanus pauper -9.446e-02 0.0471382 0.0161558
## Cosmocalanus darwinii -2.292e-03 -0.0021161 0.0071617
## Undinula vulgaris -1.823e-02 0.0125965 0.0098149
## Clausocalanus furcatus -2.372e-02 0.0161094 0.0505102
## Clausocalanus minor -4.567e-03 -0.0004279 0.0340161
## Subeucalanus pileatus -1.545e-02 -0.0009436 0.0092279
## Subeucalanus subcrassus -5.152e-03 -0.0173128 0.0140830
## Subeucalanus copepodid -9.980e-03 0.0097841 -0.0117043
## Euchaeta concinna -4.878e-03 -0.0071807 0.0177724
## Euchaeta copepodid -3.977e-02 -0.2677182 0.0467150
## Acrocalanus gibber -7.064e-02 0.0265168 0.0112827
## Acrocalanus gracilis -1.661e-02 0.0102458 -0.0005696
## Calocalanus gracilis 8.131e-05 0.0012499 -0.0005341
## Calocalanus pavoninus -1.600e-02 0.0089847 0.0016528
## Calocalanus plumulosus -2.982e-03 0.0036022 -0.0021647
## Calocalanus styliremis -8.187e-03 0.0047680 0.0044886
## Paracalanus aculeatus 1.688e-02 -0.1194493 0.0764932
## Paracalanus pavus 5.733e-01 -0.0217272 -0.0924690
## Paracalanus serrulus -4.384e-02 0.0341911 -0.0240809
## Parvocalanus crassirostris -2.297e-01 0.0054013 -0.0682110
## Scolecithricella longispinosa -1.676e-02 -0.0499967 0.0025899
## Temora stylifera 6.431e-03 0.0045742 -0.0023920
## Temora turbinata -1.290e-01 0.0570917 0.0031643
## Oithona attenuata -7.444e-02 0.0214674 -0.0200723
## Oithona brevicornis -2.208e-02 0.0048290 -0.0513257
## Oithona fallax -1.187e-02 -0.0093674 0.0152255
## Oithona plumifera -2.470e-02 0.0237048 0.0163934
## Oithona similis 1.993e-01 0.0445355 -0.0220831
## Euterpina acutifrons -5.246e-02 0.0427563 -0.1404274
## Corycaeus (Ditrichocorycaeus) affinis 2.561e-01 0.0991534 0.0540791
## Corycaeus (Ditrichocorycaeus) dahli -1.824e-02 0.0457757 -0.0077550
## Corycaeus (Ditrichocorycaeus) lubbocki -2.238e-02 0.0007678 -0.0732986
## Corycaeus (Ditrichocorycaeus) subtilis -6.012e-03 0.0070995 -0.0020088
## Corycaeus (Onychocorycaeus) giesbrechti -1.525e-02 0.0146091 0.0015415
## Farranula gibbula -9.276e-03 0.0029722 0.0041894
## Corycaeidae copepodid 2.043e-02 -0.0023643 -0.0058098
## Oncaea conifera -3.699e-02 0.0383077 0.0135828
## Oncaea media -3.157e-03 0.0040357 0.0020102
## Oncaea mediterranea -1.607e-03 -0.0001428 0.0114126
## Oncaea venusta -9.757e-02 0.1091487 0.0137496
## RDA4 RDA5 RDA6
## Acartia negligence 0.0061294 -1.271e-02 0.0004109
## Acartia pacifica -0.0507664 5.902e-02 0.0549766
## Calanus sinicus -0.0129707 3.529e-02 -0.0551542
## Canthocalanus pauper -0.0046357 1.759e-03 0.0285574
## Cosmocalanus darwinii 0.0091265 -1.089e-02 -0.0052489
## Undinula vulgaris 0.0016305 -4.718e-03 0.0142715
## Clausocalanus furcatus 0.0446816 6.290e-04 0.0100279
## Clausocalanus minor 0.0287467 -8.337e-03 -0.0240464
## Subeucalanus pileatus 0.0076074 1.844e-02 -0.0163972
## Subeucalanus subcrassus -0.0018486 -2.563e-03 -0.0047724
## Subeucalanus copepodid 0.0020443 -1.713e-02 0.0121885
## Euchaeta concinna 0.0018575 1.240e-03 -0.0075792
## Euchaeta copepodid -0.0133739 1.414e-02 -0.0022115
## Acrocalanus gibber -0.0112257 -3.412e-02 0.0168901
## Acrocalanus gracilis -0.0055522 -7.555e-05 0.0054843
## Calocalanus gracilis 0.0102284 -9.225e-03 0.0003199
## Calocalanus pavoninus -0.0015867 1.563e-04 0.0121208
## Calocalanus plumulosus 0.0006416 -7.128e-03 0.0024492
## Calocalanus styliremis 0.0011451 1.980e-04 0.0044314
## Paracalanus aculeatus 0.0390604 -5.551e-02 -0.0147536
## Paracalanus pavus 0.0418777 -8.503e-02 0.0154238
## Paracalanus serrulus -0.0121168 1.499e-02 0.0126946
## Parvocalanus crassirostris -0.0966554 -1.034e-01 -0.0429043
## Scolecithricella longispinosa -0.0019227 1.964e-02 -0.0151500
## Temora stylifera 0.0060545 -1.856e-02 -0.0044696
## Temora turbinata -0.0631756 -3.710e-02 0.0266675
## Oithona attenuata -0.0369333 -1.812e-02 0.0054565
## Oithona brevicornis 0.0225786 3.963e-02 -0.0372480
## Oithona fallax 0.0078700 1.226e-02 -0.0104882
## Oithona plumifera 0.0573811 -3.896e-02 -0.0099797
## Oithona similis -0.0551523 8.818e-02 0.0201849
## Euterpina acutifrons 0.0455449 3.786e-02 -0.0775412
## Corycaeus (Ditrichocorycaeus) affinis -0.1349144 2.563e-02 -0.0239997
## Corycaeus (Ditrichocorycaeus) dahli -0.0133767 -9.218e-03 0.0421101
## Corycaeus (Ditrichocorycaeus) lubbocki 0.0166318 4.578e-02 -0.0557842
## Corycaeus (Ditrichocorycaeus) subtilis 0.0090116 2.674e-03 0.0057958
## Corycaeus (Onychocorycaeus) giesbrechti -0.0007218 2.328e-03 0.0070142
## Farranula gibbula 0.0072724 -4.273e-03 0.0094762
## Corycaeidae copepodid 0.0251405 -5.910e-03 0.0115081
## Oncaea conifera 0.0254727 2.434e-02 0.0438139
## Oncaea media 0.0088167 -2.986e-03 0.0034507
## Oncaea mediterranea 0.0100475 1.847e-03 -0.0009639
## Oncaea venusta 0.0803297 3.992e-02 0.0429679
##
##
## Site scores (weighted sums of species scores)
##
## RDA1 RDA2 RDA3 RDA4 RDA5 RDA6
## p1 0.378183 0.33566 0.409134 -0.990324 0.3092341 -0.570460
## p3 0.494807 0.15886 -0.128825 -0.252071 0.0363804 0.075256
## p4 0.550405 0.19149 -0.175446 -0.386636 -0.2197150 0.016142
## p6 0.530685 0.08469 -0.238932 0.130117 -0.1515723 0.213461
## p13 0.268846 0.08685 -0.082499 0.233543 0.0386053 0.254653
## p16 0.486329 -0.11902 -0.347987 0.526109 -0.8392949 0.271107
## p19 0.188306 -0.01875 -0.117454 0.413577 -0.3884204 0.076249
## p21 0.318032 0.14689 0.263248 -0.396238 0.4371574 -0.207531
## p23 0.033390 -0.03511 0.091596 0.463644 -0.2351448 0.006492
## p25 0.208133 0.00444 -0.024171 0.375773 -0.4026001 -0.088705
## s18 0.008868 0.06460 -0.195732 0.184048 -0.0002522 0.253895
## s19 -0.063310 0.07659 -0.224811 0.129697 -0.0419520 -0.187762
## s20 -0.117986 0.09805 -0.521952 0.069970 0.1508586 -0.937519
## s22 -0.208821 0.11436 -0.462668 0.001804 0.4306807 -0.702021
## s23 -0.179750 0.09621 -0.267371 0.112627 0.5875032 -0.039737
## s25 -0.152502 0.18045 -0.185277 -0.041166 -0.0155947 0.042104
## s27 -0.292515 0.21675 -0.064658 -0.344936 -0.3413938 -0.005944
## s29 -0.394324 0.21210 -0.189847 -0.820066 -0.6689440 -0.310707
## sA -0.165784 0.21211 0.028269 0.166720 0.0203732 0.278107
## sB -0.171299 0.24621 0.169263 0.550501 0.2827636 0.528109
## sC -0.232010 0.21650 -0.035076 -0.343489 -0.3288231 0.255799
## sD -0.195314 0.21300 -0.057387 -0.095009 -0.2796094 0.180029
## sE -0.115673 0.24522 0.012515 0.238684 0.0330773 0.440605
## sF -0.261980 0.32328 0.064630 0.158064 0.4075442 0.721562
## sG -0.441491 0.18153 -0.326229 -0.992307 -0.9680952 -0.720177
## w22 -0.013037 -0.73337 -0.003308 -0.017838 0.3519897 0.222538
## w23 -0.105439 -0.86856 -0.310856 -0.305174 0.7628166 0.814855
## w25 0.101813 -0.60280 0.044396 0.186810 -0.1423778 0.130233
## w27 -0.069752 -0.45113 0.299674 0.150726 0.0877136 0.045221
## w29 0.097248 -0.22048 0.122250 0.133192 -0.2263512 0.169066
## wA -0.072200 -0.31488 0.716717 0.040845 0.2538619 -0.474141
## wB -0.136782 -0.06896 0.735362 0.227126 0.5009640 -0.477626
## wC -0.149859 0.02885 0.491629 0.314842 0.2094188 -0.051301
## wD -0.125217 -0.30163 0.511803 0.176835 0.3491984 -0.221853
##
##
## Site constraints (linear combinations of constraining variables)
##
## RDA1 RDA2 RDA3 RDA4 RDA5 RDA6
## p1 0.377157 0.166911 0.132759 -0.24429 -0.108799 -0.08927
## p3 0.381919 0.170042 0.146875 -0.29547 0.085654 -0.09851
## p4 0.380880 0.224745 0.114751 -0.19421 0.118073 -0.05643
## p6 0.578294 0.010030 -0.061755 -0.28611 0.270135 0.12358
## p13 0.255788 -0.040458 -0.017813 0.21671 0.240762 0.21960
## p16 0.279525 -0.133501 -0.220250 0.57402 -0.275382 0.31218
## p19 0.126637 0.056486 -0.039387 0.15467 -0.432181 -0.14747
## p21 0.400553 -0.001720 -0.080404 -0.07853 0.203105 0.18838
## p23 0.008925 0.008747 -0.085838 0.31268 -0.461344 -0.02594
## p25 0.180296 0.103965 -0.025466 0.04236 -0.464902 -0.26394
## s18 -0.107637 0.089693 -0.266981 0.08832 0.334592 -0.41621
## s19 -0.054866 -0.033759 -0.072282 -0.06643 -0.135943 -0.22891
## s20 -0.027403 -0.088096 -0.352210 0.20898 0.231276 -0.54919
## s22 -0.240147 0.062207 -0.504207 0.15198 0.393756 -0.17527
## s23 -0.090744 0.349879 -0.406773 0.09601 0.134167 -0.24323
## s25 -0.133238 0.216673 -0.241368 0.21807 -0.147707 0.47054
## s27 -0.297878 -0.122106 0.048753 -0.12748 -0.058612 -0.02126
## s29 -0.147523 0.329732 -0.116815 -0.34985 -0.265984 0.02499
## sA -0.136856 0.118901 0.366882 0.12152 0.297438 0.03695
## sB 0.005854 0.125703 0.190425 0.48458 0.158777 0.43340
## sC -0.401663 0.116530 0.182832 -0.23436 0.032714 0.17057
## sD -0.150819 0.008598 0.182098 0.26707 0.040635 0.12860
## sE -0.410558 0.213039 -0.102801 -0.34153 0.011258 0.18992
## sF -0.164057 0.505893 -0.124254 -0.25060 -0.076425 0.39898
## sG -0.226431 -0.312398 -0.009409 -0.24386 -0.297112 -0.03580
## w22 0.065072 -0.627123 -0.300520 -0.37275 0.157212 0.27240
## w23 -0.050870 -0.539990 -0.145100 -0.06726 0.136293 0.23280
## w25 -0.121154 -0.408422 0.037758 -0.04960 -0.029172 0.05016
## w27 -0.121082 -0.287972 0.163004 -0.04620 -0.200849 -0.07712
## w29 0.145690 -0.145179 0.039354 -0.08763 -0.370019 -0.22791
## wA -0.117656 -0.022858 0.487927 0.06768 0.202597 -0.09245
## wB -0.066314 0.018854 0.462148 0.05868 0.102581 -0.15166
## wC -0.040468 -0.056224 0.318434 0.13148 0.001981 -0.17145
## wD -0.079226 -0.076822 0.299634 0.14136 0.171424 -0.18103
##
##
## Biplot scores for constraining variables
##
## RDA1 RDA2 RDA3 RDA4 RDA5 RDA6
## Depth -0.3525 0.10485 0.8121 0.25295 0.290818 0.08751
## Salinity -0.4467 0.45777 0.2589 0.41626 -0.412253 -0.38119
## Fluorescence -0.2130 0.11410 -0.6226 0.18357 0.429798 -0.56574
## maxS -0.4065 0.42489 0.2747 0.64093 -0.154977 -0.26194
## maxF -0.1999 0.19859 -0.6715 0.38093 0.346589 -0.38204
## MaxDO 0.8190 -0.03405 0.3453 0.02071 0.387484 0.18734
## FishDensity -0.4420 0.52026 -0.3978 0.12597 -0.197432 0.08749
## CopepodDensity -0.3840 0.68551 -0.4708 -0.09476 0.007639 0.36062
Our RDA model accounts for 53% of inertia (variance).
Groups we get from hierarchical clustering will be shown with ellipse.
#triplot without names of species and sites
ordiplot(dom_rda)
ordiellipse(dom_rda,group,conf = 0.95, lty = 2, label = T)
#triplot with names of species and sites
ordiplot(dom_rda, type = "text")
ordiellipse(dom_rda,group,conf = 0.95, lty = 2)
We can see that the \(1^{st}\) axis of RDA has almost the same direction as MaxDO with Depth and Fluorescence having oppisite direction with it. It also seems that the \(1^{st}\) axis seperates sites in spring from others and the \(2^{nd}\) axis seperates winter from others. However, overlaps do exist and this means these RDA axis might not reflect season differences too well.
phy = envdata[, 1:9]
bio = envdata[, 10:11]
all_rda <- rda(dom_norm ~ ., data = envdata)
phy_rda <- rda(dom_norm, phy, bio)
bio_rda <- rda(dom_norm, bio, phy)
phy_r2 = phy_rda$CCA$tot.chi / phy_rda$tot.chi
bio_r2 = bio_rda$CCA$tot.chi / bio_rda$tot.chi
joint_r2 = all_rda$CCA$tot.chi / all_rda$tot.chi - phy_r2 - bio_r2
Physical variables along account for 53% of variance, biological variables along account for 4% of variance and the joint effect of these variables account for 9% of variance. These variance are not adjusted by number of variables.
Graphical results are based on adjusted \(R^2\) (variance)
vp = varpart(dom_norm, phy, bio)
plot(vp, bg = c("red", "blue"), Xnames = c("Physical", "Biological"), alpha = 80)
Physical variables account for much more variance than biological variables.