Multivariate analysis

Dataset organisation

Data are loaded in a matrix configured for multivariate analyses (i.e. observations in rows, variables in columns). We load the output of preprocessing scripts and removed the polygons that contained undetermined data and only bare susbtratum for 2015, 2018 & 2020. The column bare substratum was removed. Each matrix is stored for each year separately. Each row corresponds to a group of polygons. Polygons were assigned to a given group according to their proximity and the total area they sum up together to reach approximately the size of the largest polygon drawn in 3D (i.e. 1.7 m²). The histogram below describes the surfaces of each group of polygons computed.

In each of these matrices, data are organised as follow: 1. each row corresponds to a group to which is assigned two IDs, the sum of all polygons assigned to that group, coordinates of the group centroid (i.e. mean of polygons’ centroid weighted by their specific area) and the side.

head(Matrix_RDA_group_2015[,1:7],10)

##    Original_ID ID_tot     Area Centroid_x Centroid_y Centroid_z Side
## 1          260      1 1.983091   14.00839  -9.669603  11.564273   ES
## 2          257      2 1.799056   13.45408  -8.645261  11.254791   ES
## 3          254      3 1.931393   14.21711  -9.652415  10.361668   ES
## 4          249      4 1.565404   13.92157  -8.293676   9.816375   ES
## 5          247      5 1.754464   14.34365  -9.351373   9.201676   ES
## 6          243      6 1.579983   13.81203  -9.786435   8.293649   ES
## 7          256      7 1.422055   12.71541  -8.121112  11.275338   ES
## 8          244      8 1.593894   13.02906  -7.269152   9.229354   ES
## 9          242      9 1.351696   14.04972  -7.976499   8.601580   ES
## 10         253     10 1.723426   12.20112  -8.564667  10.476200   ES

each group (i.e. row) contains the sum of the area [m²] of the group-specific polygons annotated for a given assemblage/substratum cover. We decided to divide the surface of each assemblage by the total surface of each group to get relative abundances (does that make sense?).

Matrix_RDA_group_2015[,8:20] = Matrix_RDA_group_2015[,8:20]/Matrix_RDA_group_2015$Area
Matrix_RDA_group_2018[,8:20] = Matrix_RDA_group_2018[,8:20]/Matrix_RDA_group_2018$Area
Matrix_RDA_group_2020[,8:20] = Matrix_RDA_group_2020[,8:20]/Matrix_RDA_group_2020$Area
head(Matrix_RDA_group_2015[,8:20],10)

##    Large B. azoricus Medium B. azoricus Small B. azoricus
## 1         0.62784992         0.32455845        0.02801790
## 2         0.04803632         0.63404183        0.27305581
## 3         0.57379727         0.05802035        0.00000000
## 4         0.00000000         0.90909750        0.09090250
## 5         0.57154545         0.25070580        0.00000000
## 6         0.52378556         0.00000000        0.00000000
## 7         0.00000000         0.86843126        0.13156874
## 8         0.00000000         0.82819410        0.17180590
## 9         0.12713548         0.73445941        0.02618499
## 10        0.31732690         0.39826262        0.25230251
##    white hydrothermal deposit Empty shell Hydrozoan sp. Amorphous sulphur
## 1                  0.00000000           0             0                 0
## 2                  0.00000000           0             0                 0
## 3                  0.03633720           0             0                 0
## 4                  0.00000000           0             0                 0
## 5                  0.06488362           0             0                 0
## 6                  0.37820810           0             0                 0
## 7                  0.00000000           0             0                 0
## 8                  0.00000000           0             0                 0
## 9                  0.00000000           0             0                 0
## 10                 0.00000000           0             0                 0
##    Zoanthid sp. Microbial mat S. mesatlantica Phymorhynchus sp. M. fortunata
## 1             0     0.7730584       0.0000000                 0   0.00000000
## 2             0     0.2750248       0.0000000                 0   0.00000000
## 3             0     0.2902998       0.8042291                 0   0.00000000
## 4             0     0.6192059       0.0000000                 0   0.00000000
## 5             0     0.6483385       0.6497420                 0   0.00000000
## 6             0     0.0000000       0.4005854                 0   0.02822113
## 7             0     0.2086696       0.0000000                 0   0.00000000
## 8             0     0.1308291       0.0000000                 0   0.00000000
## 9             0     0.7608667       0.5473211                 0   0.00000000
## 10            0     0.7836366       0.6733563                 0   0.00000000
##    P. smaragdina
## 1              0
## 2              0
## 3              0
## 4              0
## 5              0
## 6              0
## 7              0
## 8              0
## 9              0
## 10             0

Environmental data were also included. They correspond to the sum of the inverse cubic distance (sum(1/dist^3)) from the group centroid to all environmental features observed annotated in 3D (i.e. primary and secondary fluid exits, diffusion, zones, flanges, topographic change). The values correspond therefore to the proximity of one of these features being a proxy for the influence of these features: the greater the value, the more it is close to one of these features. The smaller, the less it is likely to be influenced.

head(Matrix_RDA_group_2015[,21:25],10)

##    Main smoker Secondary smoker Diffusion zone      Flange Topographic change
## 1    0.1833138        0.2992664     0.08255605 0.003294837          1.0446882
## 2    0.1417707        0.2918170     0.09660745 0.003617790          0.8658826
## 3    0.3986712        0.7386391     0.19708229 0.003215369          1.1793672
## 4    0.2654811        0.5492500     0.22769658 0.003641328          0.9972748
## 5    0.6415041        2.7474817     0.68290602 0.003212541          1.8679686
## 6    0.4921507       11.8343732     5.14803919 0.002969702          5.3476480
## 7    0.0982040        0.2340368     0.08625154 0.003779044          0.6673483
## 8    0.1548733        0.2941247     0.17854449 0.003925524          0.7718310
## 9    0.3327573        0.7531478     0.40910400 0.003624202          1.2978646
## 10   0.1068804        0.5303387     0.17617608 0.003493348          1.2980379

Exploratory analyses of fauna spatial distribution (PCA)

We perform a PCA with cover data for 2015. These data were log-transformed and then Chord-transformed

cov_2015 = Matrix_RDA_group_2015[,8:20]
env_2015 = Matrix_RDA_group_2015[,21:25]
cov_chord_2015 = decostand(log1p(cov_2015), "normalize")
cov_chord_pca_2015 = rda(cov_chord_2015)
#screeplot(cov_chord_pca, bstick=TRUE)
proportion_explained_2015 = sum(summary(cov_chord_pca_2015, display=NULL)[["cont"]][["importance"]][2,1:2])*100
cleanplot.pca(cov_chord_pca_2015)
mtext(paste0("Proportion of variance explained ", round(proportion_explained_2015), " %"), side=3)

The first two PCA axes demonstrate the spatial zonation in vent assemblages. The quartile II is mostly influenced by S. mesatlantica and less importantly by Large B. azoricus and by the white material therefore grouping features living in warm environmental conditions, closer to the hydrothermal vents (Cuvelier et al. 2009; Husson et al. 2017; Girard et al. 2020). The lower quartiles (III and IV) group empty mussel shells and the smaller-size mussel and zoanthid assemblages, living in colder locations further away from the vent fluid exits (Husson et al. 2017; Girard et al. 2020). Finally, the quartile I groups assemblages of medium mussels and microbial mats which occur in intermediate conditions of hydrothermal vent fluid exposure (Husson et al. 2017). In summary, these PCs clearly reflect the knowledge on the spatial distriubtion of the vent fauna we have at ET.

Still, we check for the two next axes to see how the data distribute in these PCs.

cleanplot.pca(cov_chord_pca_2015, ax1 = 3, ax2 = 4)
proportion_explained_temp =  sum(summary(cov_chord_pca_2015, display=NULL)[["cont"]][["importance"]][2,3:4])*100
mtext(paste0("Proportion of variance explained ", round(proportion_explained_temp), " %"), side=3)

However, it does not remain informative as for the first two axes as it discriminates the distribution of colder assemblages.

Exploratory analyses of fauna spatial distribution in relation to environmental variables (RDA)

We perform a RDA to check for any influence of environmental variables.

cov_chord_rda_2015 = rda(cov_chord_2015, env_2015)
triplot.rda(cov_chord_rda_2015, ax1=1, ax2=2)

## -----------------------------------------------------------------------
## Site constraints (lc) selected. To obtain site scores that are weighted
## sums of species scores (default in vegan), argument site.sc must be set
## to wa.
## -----------------------------------------------------------------------
## 
## 
## No factor, hence levels cannot be plotted with symbols;
## 'plot.centr' is set to FALSE

We clearly see the influence of the hydrothermal activity on the distribution of assemblages. The flanges appear to have a different effect mostly explaining the distribution of intermediate to warm hydrothermal assemblages (i.e. medium top large-mussel assemblage). Cold assemblages locating on the other side of the triplot, their distribution in the ET hdyrothermal edifice may be restricted to areas with low hydrothermal vent fluid exposure. The RDA model is significant.

anova(cov_chord_rda_2015)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(X = cov_chord_2015, Y = env_2015)
##           Df Variance      F Pr(>F)    
## Model      5  0.08829 8.6724  0.001 ***
## Residual 194  0.39500                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Exploration of fauna spatial distribution across sampling times independently

We redo PCA and RDA analyses for 2018 and 2020 to compare among the three time replicates how these relationships changed in time.

PCA biplots are presented as a series starting from the top for 2015 then 2018 and at the bottom, 2020.

par(mfrow=c(3,1))
cleanplot.pca(cov_chord_pca_2015)

## Warning in arrows(0, 0, spe.sc[, ax1], spe.sc[, ax2], length = ahead, angle =
## aangle, : zero-length arrow is of indeterminate angle and so skipped

mtext(paste0("Proportion of variance explained ", round(proportion_explained_2015), " %"), side=3)
cleanplot.pca(cov_chord_pca_2018)

## Warning in arrows(0, 0, spe.sc[, ax1], spe.sc[, ax2], length = ahead, angle =
## aangle, : zero-length arrow is of indeterminate angle and so skipped

proportion_explained_2018 = sum(summary(cov_chord_pca_2018, display=NULL)[["cont"]][["importance"]][2,1:2])*100
mtext(paste0("Proportion of variance explained ", round(proportion_explained_2018), " %"), side=3)
cleanplot.pca(cov_chord_pca_2020)
proportion_explained_2020 = sum(summary(cov_chord_pca_2020, display=NULL)[["cont"]][["importance"]][2,1:2])*100
mtext(paste0("Proportion of variance explained ", round(proportion_explained_2020), " %"), side=3)

We clearly see the stability among the years as the PCs are redrawn almost exactly the same year after year, explaining the same share of the variance (~55.5%). This demonstrates that stability predominates in the dataset. We may need to divide our dataset to better look for variability patterns if there are any of interest (see spatial exploration of the data with 3D mapping plots). Variability is however observed for microbial mats. As expected these microbial assemblages can undergo rapid variations through time (Van Audenhaege et al., in review). As a recall, our exploratory analyses depicted a large decline in microbial mat cover from 2015 to 2020.

We now proceed to RDA comparisons among years independently.

par(mfrow=c(3,1))
triplot.rda(cov_chord_rda_2015)

## -----------------------------------------------------------------------
## Site constraints (lc) selected. To obtain site scores that are weighted
## sums of species scores (default in vegan), argument site.sc must be set
## to wa.
## -----------------------------------------------------------------------
## 
## 
## No factor, hence levels cannot be plotted with symbols;
## 'plot.centr' is set to FALSE

mtext(paste0("Proportion of variance explained: Constrained ", round(sum(summary(cov_chord_rda_2015, display=NULL)[["cont"]][["importance"]][2,1:2])*100), " %", "and unconstrained ", round(sum(summary(cov_chord_rda_2015, display=NULL)[["cont"]][["importance"]][2,ncol(env_2015)+1:ncol(env_2015)+2])*100), " [%]"), side=3)
triplot.rda(cov_chord_rda_2018)

## -----------------------------------------------------------------------
## Site constraints (lc) selected. To obtain site scores that are weighted
## sums of species scores (default in vegan), argument site.sc must be set
## to wa.
## -----------------------------------------------------------------------
## 
## 
## No factor, hence levels cannot be plotted with symbols;
## 'plot.centr' is set to FALSE

mtext(paste0("Proportion of variance explained: Constrained ", round(sum(summary(cov_chord_rda_2018, display=NULL)[["cont"]][["importance"]][2,1:2])*100), " %", "and unconstrained ", round(sum(summary(cov_chord_rda_2018, display=NULL)[["cont"]][["importance"]][2,ncol(env_2018)+1:ncol(env_2018)+2])*100), " [%]"), side=3)
triplot.rda(cov_chord_rda_2020)

## -----------------------------------------------------------------------
## Site constraints (lc) selected. To obtain site scores that are weighted
## sums of species scores (default in vegan), argument site.sc must be set
## to wa.
## -----------------------------------------------------------------------
## 
## 
## No factor, hence levels cannot be plotted with symbols;
## 'plot.centr' is set to FALSE

mtext(paste0("Proportion of variance explained: Constrained ", round(sum(summary(cov_chord_rda_2020, display=NULL)[["cont"]][["importance"]][2,1:2])*100), " %", "and unconstrained ", round(sum(summary(cov_chord_rda_2020, display=NULL)[["cont"]][["importance"]][2,ncol(env_2020)+1:ncol(env_2020)+2])*100), " [%]"), side=3)

Same observation here, with slight variation on the influence of flanges onto each assemblage. Flanges were indeed observed building up considerably over time, therefore changing their position in space through time.

Spatial scales involved in the spatial distribution of the fauna

We now redo the analyses but this time, we will test the effect of spatial scales to explain faunal distribution. To do so, we will include the use of dbMEMs (Legendre and Legendre 2012).

We compute a matrix of distance (dist.xyz) among group centroids. To truncate the matrix, we then define a treshold equivalent to four times the maximum distance computed by a minimum spanning tree (MST).

Centroids = Matrix_RDA_group_2015[,4:6]
dist_all_mst = as.dist(dist.xyz(Centroids,Centroids))
mst_ = mst(dist_all_mst)
max_treshold_dbMEM = max(mst_*dist.xyz(Centroids,Centroids))
dist_all = dist.xyz(Centroids,Centroids)
dist_all[which(dist_all>max_treshold_dbMEM)]=max_treshold_dbMEM*4

We now plot in 3D the spatial relationships determined by the truncated matrix that will be used to compute dbMEMs.

if(plot3d==T){plot3d( 
  x=Matrix_RDA_group_2015$Centroid_x, y=Matrix_RDA_group_2015$Centroid_y, z=Matrix_RDA_group_2015$Centroid_z, 
  col = "green", 
  type = 's', 
  radius = .25,
  xlab="x [m]", ylab="y [m]", zlab="z [m]")
for (i in 1:nrow(dist_all)){
  for (j in 1:ncol(dist_all)){
    if (dist_all[i,j]<=max_treshold_dbMEM){
      x = c(Matrix_RDA_group_2015$Centroid_x[i],Matrix_RDA_group_2015$Centroid_x[j])
      y = c(Matrix_RDA_group_2015$Centroid_y[i],Matrix_RDA_group_2015$Centroid_y[j])
      z = c(Matrix_RDA_group_2015$Centroid_z[i],Matrix_RDA_group_2015$Centroid_z[j])
    segments3d(x = x, y = y, z = z, radius =8, col="red")}
  }
}
#legend3d("topright", legend = paste('Year', c('2015', '2018', '2020')), pch = 16, col = mycolors, cex=2.5, inset=c(0.02))
arrow3d(c(0,0,0), c(0,5,0), type = "rotation",  col = "green")
arrow3d(c(0,0,0), c(-4,0,0), type = "rotation",  col = "red", cex=4)
arrow3d(c(0,0,0), c(0,0,4), type = "rotation",  col = "blue")
text3d(c(0,6,0), text="N")
text3d(c(-5,0,0), text="W")
text3d(c(0,0,5), text="Z(+)")
title3d(position="topcenter",main = ("Spatial relationship among group centroids used for dbMEMs"))}

We now compute the dbMEMs (Ecology with R, p. 318)

dist_dbmem=as.dist(dist_all)
tr100.dbmem.tmp = dbmem(dist_dbmem, MEM.autocor = "positive")

## Warning in dudi.pco(matdist, scannf = FALSE, nf = 2): Non euclidean distance

tr100.dbmem <- as.data.frame(tr100.dbmem.tmp) # Display the eigenvalues 
length(attributes(tr100.dbmem.tmp)$values)

## [1] 30

30 dbMEMs were retrieved. We look for significant spatial patterns modeled by the dbMEMs by testing them with an ANOVA .

dbmem_raw = tr100.dbmem
rda1 = rda(cov_chord_2015~.,dbmem_raw)
anova(rda1)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = cov_chord_2015 ~ MEM1 + MEM2 + MEM3 + MEM4 + MEM5 + MEM6 + MEM7 + MEM8 + MEM9 + MEM10 + MEM11 + MEM12 + MEM13 + MEM14 + MEM15 + MEM16 + MEM17 + MEM18 + MEM19 + MEM20 + MEM21 + MEM22 + MEM23 + MEM24 + MEM25 + MEM26 + MEM27 + MEM28 + MEM29 + MEM30, data = dbmem_raw)
##           Df Variance      F Pr(>F)    
## Model     30  0.09785 1.4301  0.001 ***
## Residual 169  0.38544                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(rda1.R2a = RsquareAdj(rda1)$adj.r.squared)

## [1] 0.06089191

The model is significant, although the R² is low. We plot a scalogram to screen for significant dbMEMs according to the spatial scale they model (using the function built in Ecology with R, 2018)

source("C:/Lucky Strike/3D analysis/RDA/NEwR-2ed_code_data/NEwR-2ed_code_data 2/NEwR-2ed_code_data/NEwR2-Functions/scalog.R")
scalog(rda1)

The results of the scalogram show that the first six dbMEMs are significant. At broader scales, we still observe two unsignificant peaks at dbMEMs 22 & 29. After a forward selection, we investigate how these dbMEMs model those spatial scales explaining significantly the distribution of the fauna.

dbmem.fwd = forward.sel(cov_chord_2015, as.matrix(dbmem_raw),nperm=9999)

## Testing variable 1
## Testing variable 2
## Testing variable 3
## Testing variable 4
## Testing variable 5
## Testing variable 6
## Testing variable 7
## Procedure stopped (alpha criteria): pvalue for variable 7 is 0.058400 (> 0.050000)

(nb.sig.dbmem <- nrow(dbmem.fwd))

## [1] 6

(dbmem.sign <- sort(dbmem.fwd[ ,2]))

## [1] 1 2 3 4 5 6

dbmem.red <- dbmem_raw[ ,c(dbmem.sign)]

6 dbMEMs were found significant. We plot them from a downward-looking perspective. QUESTION: What results should we trust the best, the scalogram or the forward-selection ?

source("C:/Lucky Strike/3D analysis/RDA/NEwR-2ed_code_data/NEwR-2ed_code_data 2/NEwR-2ed_code_data/NEwR2-Functions/sr.value.R")
par(mfrow = c(2, 4))
for(i in 1 : ncol(dbmem.red))
{
  sr.value(Centroids[,1:2], 
           dbmem.red[ ,i], 
           #             sub = paste("dbMEM",i), 
           sub = paste("dbMEM",dbmem.sign[i]), 
           csub = 2)
}

These dbMEMs show different spatial patterns of faunal distribution. dbMEM1 describes broad-scale differences in terms of spatial organisation: the periphery and the summit. dbMEM2&3 describe medium-scale differences between northern and the summit communities. dbMEMs5&6 describe more differences at finer scales.

We now proceed to the delimitation of the dbMEMs This section may be useless.

# Interpretation of significant dbmems at broad and medium scales # 1
mite.dbmem.broad <- 
    rda(cov_chord_2015 ~ ., data = dbmem_raw[ ,c(1,2,3)])
anova(mite.dbmem.broad)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = cov_chord_2015 ~ MEM1 + MEM2 + MEM3, data = dbmem_raw[, c(1, 2, 3)])
##           Df Variance      F Pr(>F)    
## Model      3  0.02650 3.7901  0.001 ***
## Residual 196  0.45679                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(axes.broad <- anova(mite.dbmem.broad, by = "axis"))

## Permutation test for rda under reduced model
## Forward tests for axes
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = cov_chord_2015 ~ MEM1 + MEM2 + MEM3, data = dbmem_raw[, c(1, 2, 3)])
##           Df Variance      F Pr(>F)    
## RDA1       1  0.01798 7.7153  0.001 ***
## RDA2       1  0.00621 2.6659  0.080 .  
## RDA3       1  0.00231 0.9891  0.374    
## Residual 196  0.45679                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Number of significant axes
(nb.ax.broad <- 
    length(which(axes.broad[ , ncol(axes.broad)] <=  0.05)))

## [1] 1

# Plot of the two significant canonical axes
mite.dbmembroad.axes <- 
  scores(mite.dbmem.broad, 
         choices = c(1,2), 
         display = "lc", 
         scaling = 1)
par(mfrow = c(1, 2))
sr.value(Centroids[,1:2], mite.dbmembroad.axes[ ,1])
sr.value(Centroids[,1:2], mite.dbmembroad.axes[ ,2])

# Interpretation of significant dbmems at broad and medium scales # 1
mite.dbmem.fine <- 
    rda(cov_chord_2015 ~ ., data = dbmem_raw[ ,c(4,5,6,22)])
anova(mite.dbmem.fine)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = cov_chord_2015 ~ MEM4 + MEM5 + MEM6 + MEM22, data = dbmem_raw[, c(4, 5, 6, 22)])
##           Df Variance      F Pr(>F)    
## Model      4  0.03302 3.5746  0.001 ***
## Residual 195  0.45028                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(axes.fine <- anova(mite.dbmem.fine, by = "axis"))

## Permutation test for rda under reduced model
## Forward tests for axes
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = cov_chord_2015 ~ MEM4 + MEM5 + MEM6 + MEM22, data = dbmem_raw[, c(4, 5, 6, 22)])
##           Df Variance      F Pr(>F)   
## RDA1       1  0.01796 7.7797  0.002 **
## RDA2       1  0.00743 3.2164  0.073 . 
## RDA3       1  0.00604 2.6140  0.081 . 
## RDA4       1  0.00159 0.6881  0.651   
## Residual 195  0.45028                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Number of significant axes
(nb.ax.fine <- 
    length(which(axes.fine[ , ncol(axes.fine)] <=  0.05)))

## [1] 1

# Plot of the two significant canonical axes
mite.dbmemfine.axes <- 
  scores(mite.dbmem.fine, 
         choices = c(1,2), 
         display = "lc", 
         scaling = 1)
par(mfrow = c(1, 2))
sr.value(Centroids[,1:2], mite.dbmemfine.axes[ ,1])
sr.value(Centroids[,1:2], mite.dbmemfine.axes[ ,2])

Full analysis of the 2015 dataset by variance partitioning

Detrending the data to remove any linear effect

mite.h = cov_chord_2015
mite.XY.rda <- rda(mite.h, Centroids)
anova(mite.XY.rda)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(X = mite.h, Y = Centroids)
##           Df Variance      F Pr(>F)    
## Model      3  0.05966 9.2007  0.001 ***
## Residual 196  0.42364                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

As a linear trend is present, we separate this effect by keeping only the residuals.

mite.h.det <- resid(lm(as.matrix(mite.h) ~ ., data = Centroids))

We now make a forward selection to only keep the environmental variables that affect significantly the spatial distribution.

mite.xy = Centroids
mite.env = env_2015
primary_smokers <- model.matrix( ~ mite.env[ ,1])[ ,-1]
secondary_smokers <- model.matrix( ~ mite.env[ ,2])[ ,-1]
diffusion_zone = model.matrix( ~ mite.env[ ,3])[ ,-1]
flange = model.matrix( ~ mite.env[ ,4])[ ,-1]
topographic_change = model.matrix( ~ mite.env[ ,5])[ ,-1]
mite.env2 <- cbind(primary_smokers,secondary_smokers, diffusion_zone, flange,topographic_change)
colnames(mite.env2) <- 
  c("Primary smokers", "Secondary smokers", "Diffusion zone", "Flange", "Topographic change")
mite.env2 = as.data.frame(mite.env2)
mite.env.rda <- rda(mite.h ~., mite.env2)
(mite.env.R2a <- RsquareAdj(mite.env.rda)$adj.r.squared)

## [1] 0.1616181

mite.env.fwd <- 
  forward.sel(mite.h, mite.env2,
              adjR2thresh = mite.env.R2a, 
              nperm = 9999)

## Testing variable 1
## Testing variable 2
## Testing variable 3
## Procedure stopped (adjR2thresh criteria) adjR2cum = 0.161675 with 3 variables (> 0.161618)

env.sign <- sort(mite.env.fwd$order)
env.red <- mite.env2[ ,c(env.sign)]

Secondary smokers, diffusion zone and flanges were signficant and kept for the variance partitioning of the data.

mite.dbmem = dbmem_raw
mite.dbmem.rda <- rda(mite.h ~., mite.dbmem)
anova(mite.dbmem.rda)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = mite.h ~ MEM1 + MEM2 + MEM3 + MEM4 + MEM5 + MEM6 + MEM7 + MEM8 + MEM9 + MEM10 + MEM11 + MEM12 + MEM13 + MEM14 + MEM15 + MEM16 + MEM17 + MEM18 + MEM19 + MEM20 + MEM21 + MEM22 + MEM23 + MEM24 + MEM25 + MEM26 + MEM27 + MEM28 + MEM29 + MEM30, data = mite.dbmem)
##           Df Variance      F Pr(>F)   
## Model     30  0.09785 1.4301  0.002 **
## Residual 169  0.38544                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Since the analysis is significant, compute the adjusted R2
# and run a forward selection of the dbMEM variables
(mite.dbmem.R2a <- 
    RsquareAdj(mite.dbmem.rda)$adj.r.squared)

## [1] 0.06089191

(mite.dbmem.fwd <- 
    forward.sel(mite.h, 
                as.matrix(mite.dbmem), 
                adjR2thresh = mite.dbmem.R2a, nperm=9999))

## Testing variable 1
## Testing variable 2
## Testing variable 3
## Procedure stopped (adjR2thresh criteria) adjR2cum = 0.063708 with 3 variables (> 0.060892)

##   variables order        R2     R2Cum   AdjR2Cum        F pvalue
## 1      MEM5     5 0.0332816 0.0332816 0.02839918 6.816624  2e-04
## 2      MEM1     1 0.0248216 0.0581032 0.04854079 5.191498  2e-04

# Number of significant dbMEM
(nb.sig.dbmem <- nrow(mite.dbmem.fwd))

## [1] 2

# Identify the significant dbMEM sorted in increasing order
(dbmem.sign <- sort(mite.dbmem.fwd$order))

## [1] 1 5

# Write the significant dbMEM to a new object (reduced set)
dbmem.red <- mite.dbmem[ ,c(dbmem.sign)]

If we only use undetrended data, we get 2 significant dbMEMs now with this treshold of R². We plot them in space to display how they actually model spatial relationships among sampling points.

mite.red.axes <- 
  scores(mite.dbmem.rda, 
         choices = c(1,2), 
         display = "lc", 
         scaling = 1)
par(mfrow = c(1, 2))
sr.value(Centroids[,1:2], dbmem.red[ ,1])
sr.value(Centroids[,1:2], dbmem.red[ ,2])

These dbMEMs depict differences in: dbMEM1: Periphery vs. summit; dbMEM5: Position and orientation on the summit. Perhaps, we may want to include a dbMEM like the dbMEM22 that represents better finer-scale patterns, close to the vent fluid exits. We will also try with detrended data.

mite.dbmem = dbmem_raw
mite.det.dbmem.rda <- rda(mite.h.det ~., mite.dbmem)
anova(mite.det.dbmem.rda)

## Permutation test for rda under reduced model
## Permutation: free
## Number of permutations: 999
## 
## Model: rda(formula = mite.h.det ~ MEM1 + MEM2 + MEM3 + MEM4 + MEM5 + MEM6 + MEM7 + MEM8 + MEM9 + MEM10 + MEM11 + MEM12 + MEM13 + MEM14 + MEM15 + MEM16 + MEM17 + MEM18 + MEM19 + MEM20 + MEM21 + MEM22 + MEM23 + MEM24 + MEM25 + MEM26 + MEM27 + MEM28 + MEM29 + MEM30, data = mite.dbmem)
##           Df Variance      F Pr(>F)  
## Model     30  0.07507 1.2132  0.042 *
## Residual 169  0.34857                
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Since the analysis is significant, compute the adjusted R2
# and run a forward selection of the dbMEM variables

(mite.det.dbmem.R2a <- 
    RsquareAdj(mite.det.dbmem.rda)$adj.r.squared)

## [1] 0.03114347

#(mite.det.dbmem.fwd <- 
#    forward.sel(mite.h.det, 
#                as.matrix(mite.dbmem), 
#                adjR2thresh = mite.det.dbmem.R2a))

# Number of significant dbMEM
#(nb.sig.dbmem <- nrow(mite.det.dbmem.fwd))

# Identify the significant dbMEM sorted in increasing order
#(dbmem.sign <- sort(mite.det.dbmem.fwd$order))

# Write the significant dbMEM to a new object (reduced set)
#dbmem.red_detrended <- mite.dbmem[ ,c(dbmem.sign)]

If we only use undetrended data, we get 1 significant dbMEM. We now split by scales (i.e. broader to finer with the two significant dbMEMs) to compute the RDA results with significant dbmems used with undetrended data.

dbmem.broad <- dbmem.red[ , 1]
dbmem.fine <- dbmem.red[ , 2]

To analyse the variance partitioning, we perform partial RDA to assess the contribution of each factor and their combination (trend, environmental factors and space divided in broader and finer scales).

# Tests of the unique fractions [a], [b], [c] and [d]
# Fraction [a], pure environmental
env_stat = anova(
  #rda(mite.h, env.red, cbind(mite.xy, dbmem.broad, dbmem.fine))
  rda(mite.h, env.red, cbind(mite.xy, dbmem.broad, dbmem.fine))
)
if(env_stat$`Pr(>F)`[1]<=0.05&&env_stat$`Pr(>F)`[1]>0.01){text_env = "Environment*"
}else if(env_stat$`Pr(>F)`[1]<=0.01&&env_stat$`Pr(>F)`[1]>0.001){text_env = "Environment**"
}else if(env_stat$`Pr(>F)`[1]<=0.001){text_env = "Environment***"
}else{text_env = "Environment"}
# Fraction [b], pure trend
linear_stat = anova(
  #rda(mite.h, mite.xy, cbind(env.red, dbmem.broad, dbmem.fine))
  rda(mite.h, mite.xy, cbind(env.red, dbmem.broad, dbmem.fine))
)
if(linear_stat$`Pr(>F)`[1]<=0.05&&linear_stat$`Pr(>F)`[1]>0.01){text_linear = "Linear trend*"
}else if(linear_stat$`Pr(>F)`[1]<=0.01&&linear_stat$`Pr(>F)`[1]>0.001){text_linear = "Linear trend**"
}else if(linear_stat$`Pr(>F)`[1]<=0.001){text_linear = "Linear trend***"
}else{text_linear = "Linear trend"}
     # Fraction [c], pure broad scale spatial
broad_stat = anova(rda
     # (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
     (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
)
if(broad_stat$`Pr(>F)`[1]<=0.05&&broad_stat$`Pr(>F)`[1]>0.01){text_broad = "Broader scale*"
}else if(broad_stat$`Pr(>F)`[1]<=0.01&&broad_stat$`Pr(>F)`[1]>0.001){text_broad = "Broader scale**"
}else if(broad_stat$`Pr(>F)`[1]<=0.001){text_broad = "Broader scale***"
}else{text_broad = "Broader scale"}
# Fraction [d], pure fine scale spatial
fine_stat = anova(rda
      # (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
      (mite.h, dbmem.fine, cbind(env.red, mite.xy, dbmem.broad))
)
if(fine_stat$`Pr(>F)`[1]<=0.05 && fine_stat$`Pr(>F)`[1]>0.01){text_fine = "Finer scale*"
}else if(fine_stat$`Pr(>F)`[1]<=0.01&&fine_stat$`Pr(>F)`[1]>0.001){text_fine = "Finer scale**"
}else if(fine_stat$`Pr(>F)`[1]<=0.001){text_fine = "Finer scale***"
}else{text_fine = "Finer scale"}
## 5. Mite - environment - trend - dbMEM variation partitioning
mite.varpart <- 
    #varpart(mite.h, env.red, mite.xy, dbmem.broad, dbmem.fine))
    varpart(mite.h, env.red, mite.xy, dbmem.broad, dbmem.fine)

# Show the symbols of the fractions and plot their values
#par(mfrow = c(1,2))
#showvarparts(4, bg = c("red", "blue", "yellow", "green"))
plot(mite.varpart, 
     digits = 2, 
     bg = c("red", "blue", "yellow", "green"),
     Xnames = c(text_env, text_linear, text_broad, text_fine)
)
mtext(paste0(expression("p-values: * \u2264 0.05, ** \u2264 0.01, *** \u2264 0.001")), side=1, line = 1)

The Euler diagram shows that all fractions explain 21.9% of the variance. The remaining variance (78.1%) illustrates the importance of other factors in explaining the spatial distribution of the ET vent fauna. The environmental variables explain significantly 7.3% of the variance. 4.2% of the variance is explained with a combination of environmental factors and linear trend. Linear trend in the data explains significantly 5.6% of the data variance depicting a low but present gradient in the response variables as well as in the environmental response variables (4.2%). dbMEMs explain poorly the data. While broad scale is not significant, the fine-scale dbMEM explains significantly only 2.1% of the variance. This is surprising for both scales. This broad-scale dbMEM may not model well the distant exposure modulated by the current, the topography and the hydrothermal vent fluid exits. The fine-scale dbMEM may represent better the local influence of vent fluid exits, although a small R² resulted from their interaction (0.7%). This dbMEM may therefore not be modeling this effect in the data.

We may have taken the most restrictive criterium to choose our dbMEMs, perhaps, we could be less restrictive and only use the scalogram ?

Comparison of data partitioning among 2015, 2018, 2020

The code is not presented as repetitive with the section above.

mite.red.axes <- 
  scores(mite.dbmem.rda, 
         choices = c(1,2), 
         display = "lc", 
         scaling = 1)
par(mfrow = c(1, 3))
sr.value(Centroids[,1:2], dbmem.red[ ,1])
sr.value(Centroids[,1:2], dbmem.red[ ,2])
sr.value(Centroids[,1:2], dbmem.red[ ,3])

# Tests of the unique fractions [a], [b], [c] and [d]
# Fraction [a], pure environmental
env_stat = anova(
  #rda(mite.h, env.red, cbind(mite.xy, dbmem.broad, dbmem.fine))
  rda(mite.h, env.red, cbind(mite.xy, dbmem.broad, dbmem.fine))
)
if(env_stat$`Pr(>F)`[1]<=0.05&&env_stat$`Pr(>F)`[1]>0.01){text_env = "Environment*"
}else if(env_stat$`Pr(>F)`[1]<=0.01&&env_stat$`Pr(>F)`[1]>0.001){text_env = "Environment**"
}else if(env_stat$`Pr(>F)`[1]<=0.001){text_env = "Environment***"
}else{text_env = "Environment"}
# Fraction [b], pure trend
linear_stat = anova(
  #rda(mite.h, mite.xy, cbind(env.red, dbmem.broad, dbmem.fine))
  rda(mite.h, mite.xy, cbind(env.red, dbmem.broad, dbmem.fine))
)
if(linear_stat$`Pr(>F)`[1]<=0.05&&linear_stat$`Pr(>F)`[1]>0.01){text_linear = "Linear trend*"
}else if(linear_stat$`Pr(>F)`[1]<=0.01&&linear_stat$`Pr(>F)`[1]>0.001){text_linear = "Linear trend**"
}else if(linear_stat$`Pr(>F)`[1]<=0.001){text_linear = "Linear trend***"
}else{text_linear = "Linear trend"}
     # Fraction [c], pure broad scale spatial
broad_stat = anova(rda
     # (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
     (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
)
if(broad_stat$`Pr(>F)`[1]<=0.05&&broad_stat$`Pr(>F)`[1]>0.01){text_broad = "Broader scale*"
}else if(broad_stat$`Pr(>F)`[1]<=0.01&&broad_stat$`Pr(>F)`[1]>0.001){text_broad = "Broader scale**"
}else if(broad_stat$`Pr(>F)`[1]<=0.001){text_broad = "Broader scale***"
}else{text_broad = "Broader scale"}
# Fraction [d], pure fine scale spatial
fine_stat = anova(rda
      # (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
      (mite.h, dbmem.fine, cbind(env.red, mite.xy, dbmem.broad))
)
if(fine_stat$`Pr(>F)`[1]<=0.05 && fine_stat$`Pr(>F)`[1]>0.01){text_fine = "Finer scale*"
}else if(fine_stat$`Pr(>F)`[1]<=0.01&&fine_stat$`Pr(>F)`[1]>0.001){text_fine = "Finer scale**"
}else if(fine_stat$`Pr(>F)`[1]<=0.001){text_fine = "Finer scale***"
}else{text_fine = "Finer scale"}
## 5. Mite - environment - trend - dbMEM variation partitioning
mite.varpart <- 
    #varpart(mite.h, env.red, mite.xy, dbmem.broad, dbmem.fine))
    varpart(mite.h, env.red, mite.xy, dbmem.broad, dbmem.fine)

# Show the symbols of the fractions and plot their values
#par(mfrow = c(1,2))
#showvarparts(4, bg = c("red", "blue", "yellow", "green"))
plot(mite.varpart, 
     digits = 2, 
     bg = c("red", "blue", "yellow", "green"),
     Xnames = c(text_env, text_linear, text_broad, text_fine)
)
mtext(paste0(expression("p-values: * \u2264 0.05, ** \u2264 0.01, *** \u2264 0.001")), side=1, line = 1)

mite.red.axes <- 
  scores(mite.dbmem.rda, 
         choices = c(1,2), 
         display = "lc", 
         scaling = 1)
par(mfrow = c(1, 4))
sr.value(Centroids[,1:2], dbmem.red[ ,1])
sr.value(Centroids[,1:2], dbmem.red[ ,2])
sr.value(Centroids[,1:2], dbmem.red[ ,3])
sr.value(Centroids[,1:2], dbmem.red[ ,4])

# Tests of the unique fractions [a], [b], [c] and [d]
# Fraction [a], pure environmental
env_stat = anova(
  #rda(mite.h, env.red, cbind(mite.xy, dbmem.broad, dbmem.fine))
  rda(mite.h, env.red, cbind(mite.xy, dbmem.broad, dbmem.fine))
)
if(env_stat$`Pr(>F)`[1]<=0.05&&env_stat$`Pr(>F)`[1]>0.01){text_env = "Environment*"
}else if(env_stat$`Pr(>F)`[1]<=0.01&&env_stat$`Pr(>F)`[1]>0.001){text_env = "Environment**"
}else if(env_stat$`Pr(>F)`[1]<=0.001){text_env = "Environment***"
}else{text_env = "Environment"}
# Fraction [b], pure trend
linear_stat = anova(
  #rda(mite.h, mite.xy, cbind(env.red, dbmem.broad, dbmem.fine))
  rda(mite.h, mite.xy, cbind(env.red, dbmem.broad, dbmem.fine))
)
if(linear_stat$`Pr(>F)`[1]<=0.05&&linear_stat$`Pr(>F)`[1]>0.01){text_linear = "Linear trend*"
}else if(linear_stat$`Pr(>F)`[1]<=0.01&&linear_stat$`Pr(>F)`[1]>0.001){text_linear = "Linear trend**"
}else if(linear_stat$`Pr(>F)`[1]<=0.001){text_linear = "Linear trend***"
}else{text_linear = "Linear trend"}
     # Fraction [c], pure broad scale spatial
broad_stat = anova(rda
     # (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
     (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
)
if(broad_stat$`Pr(>F)`[1]<=0.05&&broad_stat$`Pr(>F)`[1]>0.01){text_broad = "Broader scale*"
}else if(broad_stat$`Pr(>F)`[1]<=0.01&&broad_stat$`Pr(>F)`[1]>0.001){text_broad = "Broader scale**"
}else if(broad_stat$`Pr(>F)`[1]<=0.001){text_broad = "Broader scale***"
}else{text_broad = "Broader scale"}
# Fraction [d], pure fine scale spatial
fine_stat = anova(rda
      # (mite.h, dbmem.broad, cbind(env.red, mite.xy, dbmem.fine))
      (mite.h, dbmem.fine, cbind(env.red, mite.xy, dbmem.broad))
)
if(fine_stat$`Pr(>F)`[1]<=0.05 && fine_stat$`Pr(>F)`[1]>0.01){text_fine = "Finer scale*"
}else if(fine_stat$`Pr(>F)`[1]<=0.01&&fine_stat$`Pr(>F)`[1]>0.001){text_fine = "Finer scale**"
}else if(fine_stat$`Pr(>F)`[1]<=0.001){text_fine = "Finer scale***"
}else{text_fine = "Finer scale"}
## 5. Mite - environment - trend - dbMEM variation partitioning
mite.varpart <- 
    #varpart(mite.h, env.red, mite.xy, dbmem.broad, dbmem.fine))
    varpart(mite.h, env.red, mite.xy, dbmem.broad, dbmem.fine)

# Show the symbols of the fractions and plot their values
#par(mfrow = c(1,2))
#showvarparts(4, bg = c("red", "blue", "yellow", "green"))
plot(mite.varpart, 
     digits = 2, 
     bg = c("red", "blue", "yellow", "green"),
     Xnames = c(text_env, text_linear, text_broad, text_fine)
)
mtext(paste0(expression("p-values: * \u2264 0.05, ** \u2264 0.01, *** \u2264 0.001")), side=1, line = 1)

Comparing these results with those of 2015, we observe that the variance explained remains highly stable. The changes occur when determining the dbMEMs used as inputs for modeling the broader-scale components as a larger number of dbMEMs were selected by the forward-selection procedure (2 for 2015, 3 for 2018 and 4 for 2020). The dbMEM the broader-scale dbMEM selected for 2018 & 2020 and not in 2015 appears to represent the orientation in active areas of the edifice, notably on the summit (dbMEM 2 in 2020). This may explain the significance of broad-scale dbMEMs in the variance partitioning plots of 2018 and 2020. In my opinion, we should keep the three dbMEMs selected for the 2018 dataset and probably include the dbMEM 22 for finer metre-scale patterns (see scalogram).

MANOVA tests

We first test for the influence of each factor for the 2015 dataset

model = cov_chord_2015 ~ env_2015$`Main smoker`*env_2015$`Secondary smoker`*env_2015$`Diffusion zone`*env_2015$Flange*env_2015$`Topographic change`
adonis2(model, method = "euc", by = "term")

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = model, method = "euc", by = "term")
##                                                                                                                             Df
## env_2015$`Main smoker`                                                                                                       1
## env_2015$`Secondary smoker`                                                                                                  1
## env_2015$`Diffusion zone`                                                                                                    1
## env_2015$Flange                                                                                                              1
## env_2015$`Topographic change`                                                                                                1
## env_2015$`Main smoker`:env_2015$`Secondary smoker`                                                                           1
## env_2015$`Main smoker`:env_2015$`Diffusion zone`                                                                             1
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                                        1
## env_2015$`Main smoker`:env_2015$Flange                                                                                       1
## env_2015$`Secondary smoker`:env_2015$Flange                                                                                  1
## env_2015$`Diffusion zone`:env_2015$Flange                                                                                    1
## env_2015$`Main smoker`:env_2015$`Topographic change`                                                                         1
## env_2015$`Secondary smoker`:env_2015$`Topographic change`                                                                    1
## env_2015$`Diffusion zone`:env_2015$`Topographic change`                                                                      1
## env_2015$Flange:env_2015$`Topographic change`                                                                                1
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                 1
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange                                                           1
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                             1
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                        1
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Topographic change`                                             1
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                               1
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                          1
## env_2015$`Main smoker`:env_2015$Flange:env_2015$`Topographic change`                                                         1
## env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                                                    1
## env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                                                      1
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                 1
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                   1
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                             1
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                               1
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                          1
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`   1
## Residual                                                                                                                   168
## Total                                                                                                                      199
##                                                                                                                            SumOfSqs
## env_2015$`Main smoker`                                                                                                        3.701
## env_2015$`Secondary smoker`                                                                                                   2.709
## env_2015$`Diffusion zone`                                                                                                     2.672
## env_2015$Flange                                                                                                               8.266
## env_2015$`Topographic change`                                                                                                 0.221
## env_2015$`Main smoker`:env_2015$`Secondary smoker`                                                                            0.485
## env_2015$`Main smoker`:env_2015$`Diffusion zone`                                                                              0.558
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                                         1.388
## env_2015$`Main smoker`:env_2015$Flange                                                                                        1.080
## env_2015$`Secondary smoker`:env_2015$Flange                                                                                   0.464
## env_2015$`Diffusion zone`:env_2015$Flange                                                                                     0.958
## env_2015$`Main smoker`:env_2015$`Topographic change`                                                                          0.415
## env_2015$`Secondary smoker`:env_2015$`Topographic change`                                                                     0.226
## env_2015$`Diffusion zone`:env_2015$`Topographic change`                                                                       0.594
## env_2015$Flange:env_2015$`Topographic change`                                                                                 0.470
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                  1.101
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange                                                            0.198
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                              0.740
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                         0.865
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Topographic change`                                              1.117
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                                0.240
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                           0.762
## env_2015$`Main smoker`:env_2015$Flange:env_2015$`Topographic change`                                                          0.469
## env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                                                     0.711
## env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                                                       0.507
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                  0.615
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                    0.399
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                              0.533
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                                0.925
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                           0.415
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`    0.220
## Residual                                                                                                                     62.154
## Total                                                                                                                        96.176
##                                                                                                                                 R2
## env_2015$`Main smoker`                                                                                                     0.03849
## env_2015$`Secondary smoker`                                                                                                0.02817
## env_2015$`Diffusion zone`                                                                                                  0.02779
## env_2015$Flange                                                                                                            0.08594
## env_2015$`Topographic change`                                                                                              0.00229
## env_2015$`Main smoker`:env_2015$`Secondary smoker`                                                                         0.00504
## env_2015$`Main smoker`:env_2015$`Diffusion zone`                                                                           0.00580
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                                      0.01444
## env_2015$`Main smoker`:env_2015$Flange                                                                                     0.01122
## env_2015$`Secondary smoker`:env_2015$Flange                                                                                0.00482
## env_2015$`Diffusion zone`:env_2015$Flange                                                                                  0.00996
## env_2015$`Main smoker`:env_2015$`Topographic change`                                                                       0.00432
## env_2015$`Secondary smoker`:env_2015$`Topographic change`                                                                  0.00235
## env_2015$`Diffusion zone`:env_2015$`Topographic change`                                                                    0.00617
## env_2015$Flange:env_2015$`Topographic change`                                                                              0.00488
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                               0.01145
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange                                                         0.00206
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                           0.00769
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                      0.00899
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Topographic change`                                           0.01161
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                             0.00250
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                        0.00792
## env_2015$`Main smoker`:env_2015$Flange:env_2015$`Topographic change`                                                       0.00487
## env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                                                  0.00739
## env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                                                    0.00527
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                               0.00639
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                 0.00415
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                           0.00554
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                             0.00961
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                        0.00431
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change` 0.00228
## Residual                                                                                                                   0.64626
## Total                                                                                                                      1.00000
##                                                                                                                                  F
## env_2015$`Main smoker`                                                                                                     10.0049
## env_2015$`Secondary smoker`                                                                                                 7.3232
## env_2015$`Diffusion zone`                                                                                                   7.2236
## env_2015$Flange                                                                                                            22.3417
## env_2015$`Topographic change`                                                                                               0.5966
## env_2015$`Main smoker`:env_2015$`Secondary smoker`                                                                          1.3104
## env_2015$`Main smoker`:env_2015$`Diffusion zone`                                                                            1.5087
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                                       3.7525
## env_2015$`Main smoker`:env_2015$Flange                                                                                      2.9180
## env_2015$`Secondary smoker`:env_2015$Flange                                                                                 1.2531
## env_2015$`Diffusion zone`:env_2015$Flange                                                                                   2.5883
## env_2015$`Main smoker`:env_2015$`Topographic change`                                                                        1.1229
## env_2015$`Secondary smoker`:env_2015$`Topographic change`                                                                   0.6098
## env_2015$`Diffusion zone`:env_2015$`Topographic change`                                                                     1.6049
## env_2015$Flange:env_2015$`Topographic change`                                                                               1.2697
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                2.9764
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange                                                          0.5354
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                            2.0003
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                       2.3374
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Topographic change`                                            3.0179
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                              0.6497
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                         2.0596
## env_2015$`Main smoker`:env_2015$Flange:env_2015$`Topographic change`                                                        1.2664
## env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                                                   1.9214
## env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                                                     1.3712
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                1.6612
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                  1.0785
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                            1.4411
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                              2.4990
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                         1.1217
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`  0.5935
## Residual                                                                                                                          
## Total                                                                                                                             
##                                                                                                                            Pr(>F)
## env_2015$`Main smoker`                                                                                                      0.001
## env_2015$`Secondary smoker`                                                                                                 0.001
## env_2015$`Diffusion zone`                                                                                                   0.002
## env_2015$Flange                                                                                                             0.001
## env_2015$`Topographic change`                                                                                               0.698
## env_2015$`Main smoker`:env_2015$`Secondary smoker`                                                                          0.260
## env_2015$`Main smoker`:env_2015$`Diffusion zone`                                                                            0.204
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                                       0.002
## env_2015$`Main smoker`:env_2015$Flange                                                                                      0.015
## env_2015$`Secondary smoker`:env_2015$Flange                                                                                 0.278
## env_2015$`Diffusion zone`:env_2015$Flange                                                                                   0.027
## env_2015$`Main smoker`:env_2015$`Topographic change`                                                                        0.322
## env_2015$`Secondary smoker`:env_2015$`Topographic change`                                                                   0.678
## env_2015$`Diffusion zone`:env_2015$`Topographic change`                                                                     0.165
## env_2015$Flange:env_2015$`Topographic change`                                                                               0.255
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                0.009
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange                                                          0.757
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                            0.073
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                       0.045
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Topographic change`                                            0.010
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                              0.668
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                         0.068
## env_2015$`Main smoker`:env_2015$Flange:env_2015$`Topographic change`                                                        0.259
## env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                                                   0.085
## env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                                                     0.231
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                0.142
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                  0.383
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                            0.201
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                              0.024
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                         0.310
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`  0.704
## Residual                                                                                                                         
## Total                                                                                                                            
##                                                                                                                               
## env_2015$`Main smoker`                                                                                                     ***
## env_2015$`Secondary smoker`                                                                                                ***
## env_2015$`Diffusion zone`                                                                                                  ** 
## env_2015$Flange                                                                                                            ***
## env_2015$`Topographic change`                                                                                                 
## env_2015$`Main smoker`:env_2015$`Secondary smoker`                                                                            
## env_2015$`Main smoker`:env_2015$`Diffusion zone`                                                                              
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                                                      ** 
## env_2015$`Main smoker`:env_2015$Flange                                                                                     *  
## env_2015$`Secondary smoker`:env_2015$Flange                                                                                   
## env_2015$`Diffusion zone`:env_2015$Flange                                                                                  *  
## env_2015$`Main smoker`:env_2015$`Topographic change`                                                                          
## env_2015$`Secondary smoker`:env_2015$`Topographic change`                                                                     
## env_2015$`Diffusion zone`:env_2015$`Topographic change`                                                                       
## env_2015$Flange:env_2015$`Topographic change`                                                                                 
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`                                               ** 
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange                                                            
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                           .  
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                                      *  
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Topographic change`                                           ** 
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                                
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                                        .  
## env_2015$`Main smoker`:env_2015$Flange:env_2015$`Topographic change`                                                          
## env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                                                  .  
## env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                                                       
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange                                  
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$`Topographic change`                    
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$Flange:env_2015$`Topographic change`                              
## env_2015$`Main smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                             *  
## env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`                           
## env_2015$`Main smoker`:env_2015$`Secondary smoker`:env_2015$`Diffusion zone`:env_2015$Flange:env_2015$`Topographic change`    
## Residual                                                                                                                      
## Total                                                                                                                         
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We now test for interaction with time and space Not sure about the model expression of the statistics here

cov_chord_all = rbind(cov_chord_2015,cov_chord_2018, cov_chord_2020)
env_all = rbind(env_2015, env_2018, env_2020)
Sites = as.factor(rep(seq(1:nrow(cov_chord_2015)),3))
Time = c(rep(2015,nrow(cov_chord_2015)),rep(2018,nrow(cov_chord_2018)),rep(2020,nrow(cov_chord_2020)))
Side = rep(Matrix_RDA_group_2015$Side,3)
model = cov_chord_all ~ (env_all$`Main smoker`*Time)+(env_all$`Secondary smoker`*Time)+(env_all$`Diffusion zone`*Time)+(env_all$Flange+env_all$`Topographic change`*Time)
adonis2(model, method = "euc", by = "term", permutations = 499)

## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 499
## 
## adonis2(formula = model, permutations = 499, method = "euc", by = "term")
##                                    Df SumOfSqs      R2       F Pr(>F)   
## env_all$`Main smoker`               1    7.180 0.02316 16.3198  0.002 **
## Time                                1    3.936 0.01270  8.9459  0.002 **
## env_all$`Secondary smoker`          1    9.199 0.02967 20.9071  0.002 **
## env_all$`Diffusion zone`            1    1.016 0.00328  2.3094  0.038 * 
## env_all$Flange                      1   24.101 0.07775 54.7777  0.002 **
## env_all$`Topographic change`        1    1.925 0.00621  4.3760  0.004 **
## env_all$`Main smoker`:Time          1    0.524 0.00169  1.1907  0.292   
## Time:env_all$`Secondary smoker`     1    2.050 0.00661  4.6598  0.002 **
## Time:env_all$`Diffusion zone`       1    0.833 0.00269  1.8923  0.098 . 
## Time:env_all$`Topographic change`   1    0.084 0.00027  0.1903  0.984   
## Residual                          589  259.151 0.83597                  
## Total                             599  310.000 1.00000                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Spatially-constrained hierarchical clustering (Bray-Curtis distance, ward distance)

E = matrix(ncol=2)
for (i in (1:(ncol(mst_)-1))){
  for (j in ((1+i):(nrow(mst_)))){
    #print(paste0(as.character(i),"-",as.character(j)))
    if(mst_[j,i]==1){E=rbind(E,c(j,i))}}}
E = E[-1,]

diss_eucl_cov_chord_2015 = vegdist(cov_2015, method="bray")
#diss_eucl_cov_chord_2015 = dist(cov_chord_2015, method="euclidean")
cclust_2015 = constr.hclust(diss_eucl_cov_chord_2015, method = "ward.D2", links = E, coords=Centroids)

k=5
stats:::plot.hclust(cclust_2015, hang=-1)
rect.hclust(cclust_2015, k=k, border="red")

plot(cclust_2015, links=TRUE, k=k, col=seq(1,k))
legend(x=30,y=4, legend = c(paste0(rep("g",k),seq(1,k))), col=seq(1,k), pch=19)

# We can redo that with transitions of assemblages in each group....

We randomly chosen 5 groups to plot the clusters in space. According to that spatial distribution, we can already see that the spatial distribution of the groups are explained by microbial mats in the g5. Then, this distribution is explained by different compositions from empty mussel shells to zoanthids. We explain the differences of these groups according to Simper analyses.

group = cutree(cclust_2015, k)
simper(cov_2015, group, permutations = 999)

## cumulative contributions of most influential species:
## 
## $`1_2`
##    S. mesatlantica Medium B. azoricus      Microbial mat  Large B. azoricus 
##          0.2876011          0.5055950          0.6702926          0.8179775 
## 
## $`1_3`
##       Empty shell   S. mesatlantica     Microbial mat Large B. azoricus 
##         0.2530395         0.4807190         0.6257961         0.7538165 
## 
## $`1_4`
##    S. mesatlantica       Zoanthid sp.      Microbial mat Medium B. azoricus 
##          0.2426589          0.4241419          0.5827847          0.7324369 
## 
## $`1_5`
##       Empty shell   S. mesatlantica     Microbial mat Large B. azoricus 
##         0.2622493         0.4810949         0.6339730         0.7572220 
## 
## $`2_3`
##        Empty shell Medium B. azoricus      Microbial mat    S. mesatlantica 
##          0.2935777          0.5388948          0.6894564          0.8043913 
## 
## $`2_4`
##       Zoanthid sp. Medium B. azoricus      Microbial mat    S. mesatlantica 
##          0.2334370          0.4520952          0.6290334          0.7621893 
## 
## $`2_5`
##        Empty shell Medium B. azoricus      Microbial mat 
##          0.3120701          0.5373474          0.7021933 
## 
## $`3_4`
##        Empty shell       Zoanthid sp. Medium B. azoricus 
##          0.3595336          0.6103882          0.8079715 
## 
## $`3_5`
##        Empty shell  Small B. azoricus Medium B. azoricus 
##          0.4094079          0.6059561          0.7562709 
## 
## $`4_5`
##        Empty shell       Zoanthid sp. Medium B. azoricus 
##          0.3751849          0.6321336          0.8148523

I am aware that hierarchical clustering may be difficult to explain as it needs to delimit the contribution of each assemblage one by one for each pair of groups separately. However, that may be the most interesting way to decompose the variability that we observe at ET. In the end, we could build a spatially-constrained hierarchical clustering considering a matrix that contains the temporal information! What we could do is to use the balance from 2015 to 2020 for each assemblage (+xm² or -Xm²), or even the surface provided by a given assemblage transition (i.e. categories will be 1a.1b.1b like a sequence of assemblage transition, but there may be a lot of zeros then). We need to talk about this with Pierre.

E = matrix(ncol=2)
for (i in (1:(ncol(mst_)-1))){
  for (j in ((1+i):(nrow(mst_)))){
    #print(paste0(as.character(i),"-",as.character(j)))
    if(mst_[j,i]==1){E=rbind(E,c(j,i))}}}
E = E[-1,]

diss_eucl_cov_chord_2015 = vegdist(cov_2015, method="bray")
#diss_eucl_cov_chord_2015 = dist(cov_chord_2015, method="euclidean")
cclust_2015 = constr.hclust(diss_eucl_cov_chord_2015, method = "ward.D2", links = E, coords=Centroids)


stats:::plot.hclust(cclust_2015, hang=-1)
rect.hclust(cclust_2015, k=10, border="red")

plot(cclust_2015, links=TRUE, k=10)

# We can redo that with transitions of assemblages in each group....

Multivariate analysis - R markdown

Loïc Van Audenhaege

28/03/2022