Introduction

Community ecology is the study of assemblages consisting of two or more populations, and their interactions with one another and their environment. Many different aspects of ecology and evolution fall under the umbrella of community ecology, but during this session we will focus on diversity. Understanding the distribution of diversity across ecosystems, and the environmental factors influencing diversity and community composition, are critical components of ecology. Conceptually, the diversity of ecosystems can be broken down into three spatial scales (Whittaker, 1972):


Alpha diversity
The diversity of a local community (a single site or sample).

Beta diversity
The diversity between local communities. Classically defined as the relationship between gamma diversity and alpha diversity, and in its simplest iteration can be calculated as gamma diversity divided by alpha diversity (though beta diversity is commonly explored with more complex metrics).

Gamma diversity
The total diversity of the ecosystem (considering all sites or samples).


beta1

beta2

Fig. 1. The relationship between alpha, beta, and gamma diversity under scenarios of minimum and maximum differentiation. In the minimum differentiation scenario, each local community harbours five species (alpha diversity = 5), a total of five species are present across the whole ecosystem (gamma diversity = 5), and there is no differentiation between local communities (beta diversity = 1). In the maximum differentiation scenario each local community harbours five species, but all local communities harbour a distinct cohort of species therefore a total of fifteen species are present in the ecosystem (gamma diversity = 15) and there is full differentiation between each local community (beta diversity = 3).


Here we will explore diversity across multiple scales using a case study from the Plymouth Sound to familiarise you with key concepts in community ecology. We will explore classic measures of alpha and beta diversity, and use statistical analyses to test how environmental conditions such as heavy-metal contamination and nitrate enrichment affect diversity. This will equip you to perform your own diversity analysis on bacterial community data from the Plymouth Sound.



map
Fig. 2. Sampling locations in the Plymouth Sound. WM: West Mud, MS: Mallard Shoal, SL: Sutton Lock, JC: Jennycliff Bay, IB: Inner Breakwater, CB: Cawsand Bay.


Load required packages

library(vegan)
library(ggplot2)
library(viridis)
library(cowplot)


Upload raw data

meio.data <- read.csv("meiofauna_data.csv")
meio.data <- meio.data[, 2:107]
env.data <- read.csv("env_data_alt.csv")


Alpha diversity

Alpha diversity refers to the diversity of a local community (i.e. a single site or sample). The simplest form of alpha diversity is species richness (as represented in Fig. 1) which is the number of species observed in the local community. However, there are many different metrics which can be used to quantify alpha diversity in community ecology, that can broadly be broken down into three categories: measures of species richness, measures of species evenness, and overall diversity metrics considering both richness and evenness. Here we will explore three of the most common.

Shannon’s Index H’
Shannon’s H’ is the most frequently used metric to quantify overall diversity of a site. Shannon’s H’ takes into account both the total number of species observed at a site, and how evenly distributed species abundance is at each site. Shannon’s H’ is a measure of how difficult it is to predict the identity of a randomly chosen indvidual from the community, and is calculated as follows: \[\begin{align} H'=-\sum^S_{i = 1}p_i~ln~p_i \end{align}\] Where \(S\) is species richness, \(P_i\) is the proportion of the community made up on species \(i\), and \(ln\) is the natural log.

Richness
Richness is simply the number of species at a specific site. Despite the simplicity of richness, multiple measures exist, ranging from the number of species observed at the site through to estimates of true richness based on models (e.g. Chao1 richness).

Pielou’s Evenness J’
Evenness is an important component of diversity, and reflects how evenly distributed abundances are between all species present at a site. A common measure of eveness is Pielou’s Evenness. Pielou’s Evenness can be easily derived from Shannon’s H’ as follows:
\[J' = \frac{H'}{H'_{max}} \] Where \(H'_{max}\) is the maximum possible \(H'\) value for a community:
\[\begin{align} H'_{max} = - \sum^S_{i = 1} \frac{1}{S} ~ ln ~ \frac{1}{S} = ln ~ S \end{align}\] Where \(S\) is the number of species observed, and \(ln\) is the natural log.


Calculate alpha diversity

# Shannon's H'
H <- diversity(meio.data)

# Observed Richness
richness <- specnumber(meio.data)  

# Pielou's Evenness
evenness <- H/log(richness)
  
# Create alpha diversity dataframe including environmental data
alpha <- cbind(shannon = H, richness = richness, pielou = evenness, env.data)
head(alpha)
##    shannon richness    pielou             site depth sediment  NO3  Cu   Hg
## 1 2.453751       34 0.6958309      Cawsand Bay    18     Silt 0.87  32 0.09
## 2 1.613972       16 0.5821174 Inner Breakwater     6     Sand 0.90 119 1.15
## 3 2.852518       35 0.8023176  Jenny Cliff Bay     9     Silt 0.64  18 0.01
## 4 2.421241       19 0.8223099    Mallard Shoal     4     Silt 0.70  38 0.02
## 5 1.777476       31 0.5176128         West Mud     3     Sand 1.10  81 0.82
## 6 2.121223       26 0.6510620      Sutton Lock     8     Sand 0.50  65 0.60


Plot alpha diversity

plot.shan <- ggplot(alpha, aes(x = site, y = shannon, colour = site)) +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  ylab("Shannon's H'") + 
  xlab("") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.4))

plot.rich <-ggplot(alpha, aes(x = site, y = richness, colour = site)) +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  ylab("Species Richness") +
  xlab("") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.4))

plot.even <- ggplot(alpha, aes(x = site, y = evenness, colour = site)) +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  ylab("Pielou's Evenness") +
  xlab("") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.4))

legend <- get_legend(plot.even)

plot_grid(plot.shan + theme(legend.position = "none"), plot.rich + theme(legend.position = "none"), plot.even + theme(legend.position = "none"),ncol = 3)

These graphs illustrate the differences between alpha diversity metrics and the relationship between species richness, Pielou’s evenness, and Shannon’s H’. Lets focus on two examples: Cawsand Bay and Mallard Shoal. These two sites have similar Shannon’s H’ scores, however, this is resultant of distinct mechanisms. Cawsand Bay has a large number of species (high species richness) but is relatively uneven (moderate Pielou’s Evenness), resulting in a moderate Shannon’s H’. Mallard Shoal, in contrast, has a small number of species but is very even, also resulting in a moderate Shannon’s H’. Alpha diversity is structured very differently at Cawsand Bay compared with Mallard Shoal, but if we had used Shannon’s H’ alone we may not have identified these important differences.

Relationships between alpha diversity and environment

Simple linear models

summary(lm(shannon ~ depth, alpha))
summary(lm(shannon ~ NO3, alpha))
summary(lm(shannon ~ Cu, alpha))
summary(lm(shannon ~ Hg, alpha))
## 
## Call:
## lm(formula = shannon ~ Hg, data = alpha)
## 
## Residuals:
##        1        2        3        4        5        6 
## -0.07859  0.04493  0.24747 -0.17472 -0.09146  0.05236 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.61413    0.09721  26.891 1.14e-05 ***
## Hg          -0.90878    0.15489  -5.867  0.00421 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1666 on 4 degrees of freedom
## Multiple R-squared:  0.8959, Adjusted R-squared:  0.8699 
## F-statistic: 34.43 on 1 and 4 DF,  p-value: 0.004213

The linear model outputs demonstrate that there is a significant negative relationship between both Copper and Mercury concentration, and Shannon’s H’ across the Plymouth Sound. We can then fit the significant regression lines to our plots using the geom_smooth function in the r package ggplot2.


depth.reg <- ggplot(alpha, aes(x = depth, y = shannon, colour = site)) +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  xlab("Depth (m)") + 
  ylab("Shannon's H'") +
  ylim(1.1,3) +
  theme_bw()

NO3.reg <- ggplot(alpha, aes(x = NO3, y = shannon, colour = site)) +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  xlab(bquote(Nitrate ~ (mg~kg^-1))) + 
  ylab("") +
  ylim(1.1,3) +
  theme_bw()

cu.reg <- ggplot(alpha, aes(x = Cu, y = shannon, colour = site)) +
  geom_smooth(method = "lm", colour = "black", fill = "grey90") +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  xlab(bquote(Copper ~ (mg~kg^-1))) + 
  ylab("") +
  ylim(1.1,3) +
  theme_bw()

hg.reg <- ggplot(alpha, aes(x = Hg, y = shannon, colour = site)) +
  geom_smooth(method = "lm", colour = "black", fill = "grey85") +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  xlab(bquote(Mercury ~ (mg~kg^-1))) + 
  ylab("") +
  ylim(1.1,3) +
  theme_bw()

legend <- get_legend(hg.reg)

plot_grid(depth.reg + theme(legend.position = 'none'), NO3.reg + theme(legend.position = 'none'), cu.reg + theme(legend.position = 'none'), hg.reg+ theme(legend.position = 'none'), legend, rel_heights = c(1,1,1,1,0.5),  ncol = 5)



Analysis of Variance (ANOVA)

anova <- aov(shannon ~ sediment, alpha)
summary(anova)
##             Df Sum Sq Mean Sq F value Pr(>F)  
## sediment     1 0.8176  0.8176   13.11 0.0223 *
## Residuals    4 0.2494  0.0624                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
sed.plot <- ggplot(alpha, aes(x = sediment, y = shannon, fill = sediment)) +
  geom_boxplot(aes(fill = sediment)) +
  geom_point(size = 3, aes(colour = site)) + 
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  scale_fill_manual(values = c("grey60", "grey90"), guide = FALSE) +
  ylab("Shannon's H'") + 
  xlab('')+
  theme_bw() 
 sed.plot

The one-way ANOVA reveals a significant relationship between sediment type (a categorical variable) and Shannon’s H’, as greater meiofaunal diversity was observed in silt sediments compared with sand sediments throughout the Plymouth Sound.



Beta Diversity

In this section, we will explore the differentiation between local communities across the Plymouth Sound. That is, to compare the dissimilarity in community composition between each pair of sites. Such analyses are challenging with large species by site matrices like our meio.data, and consequently we us dissimilarity metrics to summarise differences between pairs of communities. There are a large number of dissimilarity metrics which are used by ecologists, which each have their own idiosyncrasies and applications. In the session we will calculate and compare two of the most common metrics, Bray-Curtis dissimilarity and Jaccard dissimilarity.


Bray-Curtis Dissimilarity
Bray-Curtis dissimilarity is an abundance-based dissimilarity metric. It is calculated as follows:

\[BC_{jk} = \frac{\sum |x_{ij} - x_{ik}|}{\sum (x_{ij}+ x_{ik})} \] Where \(BC_{jk}\) is the Bray-Curtis dissimilarity between sites \(j\) and \(k\), \(x_{ij}\) is the abundance of species \(i\) in community \(j\) and \(x_{ik}\) represents the abundance of species \(i\) in community \(k\).

Jaccard Dissimilarity
Jaccard dissimilarity is a binary dissimilarity metricl, which explores the intersection over union of sites. Jaccard dissimilarity is calculated as follows: \[JAC_{jk} = 1- \frac{|x_{j} \cap x_{k}|}{|x_{j} \cup x_{k}|} = 1- \frac{|x_{j} \cap x_{k}|}{|x_j| + |x_k| - |x_{j} \cap x_{k}|}\] Where \(JAC_{jk}\) is the Jaccard dissimilarity between sites \(j\) and \(k\), \(|x_{j} \cap x_{k}|\) is the number of species shared between sites \(j\) and \(k\), and \(|x_{j} \cup x_{k}|\) is the number of species in site \(j\), plus the number of species in site \(k\), minus the number of species shared between the sites (or in simpler terms, the total number of species considering both sites).



Calculate pairwise dissimilarity

meio.mdf <- as.matrix.data.frame(meio.data)
rownames(meio.mdf) <- env.data$site

meio.bray <- vegdist(meio.mdf, method = "bray")
meio.bray
##                  Cawsand Bay Inner Breakwater Jenny Cliff Bay Mallard Shoal  West Mud
## Inner Breakwater   0.9267176                                                         
## Jenny Cliff Bay    0.7335542        0.9532976                                        
## Mallard Shoal      0.9596577        0.8998273       0.9394366                        
## West Mud           0.6964623        0.9674981       0.8112900     0.9516616          
## Sutton Lock        0.8507216        0.9824561       0.9057592     0.9576516 0.9169350
meio.jac <- vegdist(meio.mdf, method = "jaccard", binary = T)
meio.jac
##                  Cawsand Bay Inner Breakwater Jenny Cliff Bay Mallard Shoal  West Mud
## Inner Breakwater   0.9130435                                                         
## Jenny Cliff Bay    0.7678571        0.9375000                                        
## Mallard Shoal      0.9183673        0.9393939       0.6829268                        
## West Mud           0.7962963        0.9782609       0.6530612     0.7500000          
## Sutton Lock        0.8461538        0.9230769       0.8039216     0.8750000 0.7872340

It is important to consider the differences in the underlying formula for calculating Bray-Curtis dissimilarity and Jaccard dissimilarity, and the consequent differences in output pairwise dissimilarity matrices. Jaccard dissimilarity simply compares the overlap in species presence/absence between pairs of communities, consequently two communities with identical species present will have a Jaccard score of 0 regardless of the relative abundace of these organisms in the community. For example, through the lens of Jaccard dissimilarity, a community with 10 frogs, 1 cat, and 1 fox would be considered identical to a community with 1 frog, 3 cats, and 8 foxes. In contrast, Bray-Curtis dissimilarity considers the abundance of each species in each community and consequently would differentiate these communities. In some cases, it is useful to compare the outputs of both dissimilarity metrics to gain deeper insights into the structure of communities in an ecosystem.

While summary insights into community dissimilarity are valuable for understanding the factors which determine the structure of communities through time and space, it is important to consider which species are driving these trends. We can identify the species which contribute most to Bray-Curtis dissimilarity between each pair of sites using SIMPER analysis.


SIMPER
SIMPER is a method for identifying which species contribute most to beta diversity (Bray-Curtis dissimilarity). SIMPER identifies the most abundant and/or most variables species in the dataset.

\[BC_{ijk} = \frac{|x_{ij}-x_{ik}|}{\sum(x_{ij}+x_{ik})}\]

Where \(x_{ij}\) is the abundance of species \(i\) in community \(j\), and \(x_{ik}\) is the abundance of species \(i\) in community \(k\). \(BC_{ijk}\) is the contribution of species \(i\) to overall Bray-Curtis dissimilarity between communities \(j\) and \(k\). The overall Bray-Curtis dissimilarity between communities \(j\) and \(k\) is the sum of individual species contributions. Note that SIMPER is severely biased towards highly abundant species.


We can use the simper function in the R package vegan to calculate the contribution of each species to Bray-Curtis dissimilarity between sites in the Plymouth Sound. The simper function will automatically perform pairwise comparisons between sites, and return cumulative contribution scores up to a maximum of 0.7 (70% of total dissimilarity). You can cross reference the species identified with your taxonomy table and discuss the relevant biology and ecology of these organisms.


simper(meio.data, env.data$site, permutations = 999)
##     Cawsand_Inner                                                                        
## otu "OTU_21"      "OTU_73"    "OTU_63"    "OTU_71"    "OTU_3162"  "OTU_104"   "OTU_47"   
## cs  "0.2731253"   "0.4094143" "0.4942529" "0.5604817" "0.6119321" "0.6584565" "0.7033388"



Ordination

Principal Coordinates Analysis (PCoA)

Now that we have calculated our dissimilarity matrices, we can using ordination to visualise the structure of these communities. Ordination by definition is the representation of large numbers of sites or species in low dimensional space, such that any underlying relationships in these data become visually apparent. Below we will use functions in the R package vegan to perform and plot principal coordinates analysis for both Bray-Curtis and Jaccard dissimilarity matrices to visualise meiofaunal community structure across the Plymouth Sound.

# calculate principal coordinates analysis (Bray-Curtis)
pcoa.meio.bray <- cmdscale(meio.bray, k = 2, eig = T)

# extract axis positions for each site from cmdscale object and create a dataframe for plotting
pcoa.meio.bray.plotting <- as.data.frame(pcoa.meio.bray$points)
colnames(pcoa.meio.bray.plotting) <- c("axis_1", "axis_2")
pcoa.meio.bray.plotting$site <- rownames(pcoa.meio.bray.plotting)

# calculate the proportion of variance in the data which is explained by the first two PCoA axes
pcoa.meio.bray$eig[1]/(sum(pcoa.meio.bray$eig))
## [1] 0.3047715
pcoa.meio.bray$eig[2]/(sum(pcoa.meio.bray$eig))
## [1] 0.2235254
# create a PCoA plot
pcoa.meio.bray.plot <- ggplot(pcoa.meio.bray.plotting, aes(x = axis_1, y = axis_2, colour = site)) +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  theme_bw() + 
  xlab("PCoA 1 (30.8%)") +
  ylab("PCoA 2 (22.3%)") +
  annotate(geom = 'text', label = 'Bray-Curtis', x = Inf, y = -Inf, hjust = 1.15, vjust = -1)


# repeat process with Jaccard dissimilarity matrix
pcoa.meio.jac <- cmdscale(meio.jac, k = 2, eig = T)

pcoa.meio.jac.plotting <- as.data.frame(pcoa.meio.jac$points)
colnames(pcoa.meio.jac.plotting) <- c("axis_1", "axis_2")
pcoa.meio.jac.plotting$site <- rownames(pcoa.meio.jac.plotting)

pcoa.meio.jac$eig[1]/(sum(pcoa.meio.jac$eig))
## [1] 0.3166126
pcoa.meio.jac$eig[2]/(sum(pcoa.meio.jac$eig))
## [1] 0.2390876
pcoa.meio.jac.plot <- ggplot(pcoa.meio.jac.plotting, aes(x = axis_1, y = axis_2, colour = site)) +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  theme_bw() + 
  xlab("PCoA 1 (31.6%)") +
  ylab("PCoA 2 (24.0%)") +
  annotate(geom = 'text', label = 'Jaccard', x = Inf, y = -Inf, hjust = 1.215, vjust = -1)


# extract plot legend
legend <- get_legend(pcoa.meio.jac.plot)

# plot Bray-Curtis PCoA and Jaccard PCoA side by side
plot_grid(pcoa.meio.bray.plot + theme(legend.position = 'none'), pcoa.meio.jac.plot + theme(legend.position = 'none'), legend, ncol = 3, rel_widths = c(1,1,0.5))

PERMANOVA

PERMANOVA (permutational multivariate analysis of variance; Anderson 2001) is non-parametric multivariate statistical test used to quantify the impact of both continuous and categorical variables on dissimilarity between communities. While it is valid to construct a PERMANOVA model which includes continuous variables, in this instance we will use a simple PERMANOVA model to test the effect of sediment type on meiofaunal community composition. The input for PERMANOVA is a dissimilarity matrix (Bray-Curtis dissimilarity in this case), and corresponding environmental data. A resultant p-value < 0.05 indicates that centroid position and/or dispersion differs between the groups in the model.
As PERMANOVA is affected by both centroid position and disperison, we perform a homogeneity of dispersions analysis using betadisper to establish whether dispersion is homogeneous between groups (in this case, silt and sand sediment associated communities). We will then perform PERMANOVA analysis using the adonis function in the R package vegan.

# Homogeneity of dispersion test
permutest(betadisper(meio.bray, env.data$sediment))
## 
## Permutation test for homogeneity of multivariate dispersions
## Permutation: free
## Number of permutations: 719
## 
## Response: Distances
##           Df   Sum Sq   Mean Sq      F N.Perm Pr(>F)
## Groups     1 0.003431 0.0034315 0.3266    719 0.6014
## Residuals  4 0.042029 0.0105073
# PERMANOVA analysis
adonis(meio.bray ~ sediment, data = env.data, permutations = 999)
## 
## Call:
## adonis(formula = meio.bray ~ sediment, data = env.data, permutations = 999) 
## 
## Permutation: free
## Number of permutations: 719
## 
## Terms added sequentially (first to last)
## 
##           Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)
## sediment   1   0.33396 0.33396 0.78833 0.16463      1
## Residuals  4   1.69454 0.42364         0.83537       
## Total      5   2.02851                 1.00000

The betadisper analysis shows that there is no signficiant difference in dispersion between sand sediments and silt sediments. The PERMANOVA analysis show that there is no significant difference in centroid position or dispersion between sediment types.

Distance-Based Redundancy Analysis (dbRDA)

Distance-Based Redundancy Analysis (dbRDA; Legendre and Anderson, 1999) is an extension of Redundancy Analysis (RDA) which allows the use of non-euclidean distance matrices as inputs (e.g. Bray-Curtis dissimilarity). The method works by first calculating a PCoA from the dissimilarity matrix, and then subjecting the PCoA eigenvalues (which represent dissimilarities in euclidean space) to RDA. The method aims to detect linear relationships between environmental variables and these dissimilarities.
dbRDA differs from PCoA in that it is a constrained analysis. While PCoA axes are generated to explain maximum variation in the distance matrix, dbRDA canonical axes are constructed as linear combinations of environmental variables. Consequently these axes are constrained to the environmental variables in the model, and the ordination will be distinct from the PCoA. dbRDA allows us to visualise how environmental variables contrain variation in community composition between our sites.
As dbRDA considers multiple environmental variables which are measured using different techniques, it is important to normalise environmental variables such that they can be compared concurrently. This is achieved using a z-score transformation. Moreover, dbRDA is sensitive to multicolinearity (i.e. high correlation between environmental variables). We can test whether any of our environmental variables are correlated by calculating variance inflation factors (VIFs) and then removing selected terms from the model until all VIF scores are < 10.

Z-Score Transformation
The z-score transformation normalises environmental variables by their standard deviation from the mean using the following formula:

\[Z_i = \frac {x_i - \overline{x}}{S}\] Where \(Z_i\) is the z-score for variable \(x\) at site \(i\), \(x_i\) is the value of variable \(x\) at site \(i\), \(\overline{x}\) is the mean of variable \(x\) across all sites, and \(S\) is the standard deviation of variable \(x\) across all sites.


Z-Score transform your environmental data

env.data.z <- env.data
env.data.z$depth <- (env.data.z$depth - mean(env.data.z$depth))/sd(env.data.z$depth)
env.data.z$NO3 <- (env.data.z$NO3 - mean(env.data.z$NO3))/sd(env.data.z$NO3)
env.data.z$Cu <- (env.data.z$Cu - mean(env.data.z$Cu))/sd(env.data.z$Cu)
env.data.z$Hg <- (env.data.z$Hg - mean(env.data.z$Hg))/sd(env.data.z$Hg)

Perform dbRDA

# construct full model and calculate VIF
dbRDA.full <- capscale(meio.bray ~ depth+NO3+Cu+Hg,
env.data.z)
vif.cca(dbRDA.full)
##     depth       NO3        Cu        Hg 
##  1.274408  1.248604 23.680195 22.889961
# construct reduced model and calculate VIF
dbRDA.mat <- capscale(meio.bray ~ depth+NO3+Hg,
env.data.z)
vif.cca(dbRDA.mat)
##    depth      NO3       Hg 
## 1.230496 1.239103 1.491538
# test overall significance of the analysis
anova(dbRDA.mat)
## Permutation test for capscale under reduced model
## Permutation: free
## Number of permutations: 719
## 
## Model: capscale(formula = meio.bray ~ depth + NO3 + Hg, data = env.data.z)
##          Df SumOfSqs      F Pr(>F)
## Model     3  1.14806 0.8693 0.7819
## Residual  2  0.88044
# test significance of each environmental variable
anova(dbRDA.mat, by = "terms")
## Permutation test for capscale under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 719
## 
## Model: capscale(formula = meio.bray ~ depth + NO3 + Hg, data = env.data.z)
##          Df SumOfSqs      F Pr(>F)
## depth     1  0.32584 0.7402 0.8417
## NO3       1  0.41418 0.9408 0.5847
## Hg        1  0.40804 0.9269 0.6069
## Residual  2  0.88044
# summary of dbRDA model to extract total variance constrained and axis scores
summary(dbRDA.mat)
## 
## Call:
## capscale(formula = meio.bray ~ depth + NO3 + Hg, data = env.data.z) 
## 
## Partitioning of squared Bray distance:
##               Inertia Proportion
## Total          2.0285      1.000
## Constrained    1.1481      0.566
## Unconstrained  0.8804      0.434
## 
## Eigenvalues, and their contribution to the squared Bray distance 
## 
## Importance of components:
##                         CAP1   CAP2   CAP3   MDS1   MDS2
## Eigenvalue            0.4366 0.3948 0.3166 0.5243 0.3561
## Proportion Explained  0.2152 0.1946 0.1561 0.2585 0.1756
## Cumulative Proportion 0.2152 0.4099 0.5660 0.8244 1.0000
## 
## Accumulated constrained eigenvalues
## Importance of components:
##                         CAP1   CAP2   CAP3
## Eigenvalue            0.4366 0.3948 0.3166
## Proportion Explained  0.3803 0.3439 0.2758
## Cumulative Proportion 0.3803 0.7242 1.0000
## 
## Scaling 2 for species and site scores
## * Species are scaled proportional to eigenvalues
## * Sites are unscaled: weighted dispersion equal on all dimensions
## * General scaling constant of scores:  1.784582 
## 
## 
## Species scores
## 
##      CAP1 CAP2 CAP3 MDS1 MDS2
## Dim1                         
## Dim2                         
## Dim3                         
## Dim4                         
## Dim5                         
## 
## 
## Site scores (weighted sums of species scores)
## 
##                       CAP1     CAP2    CAP3     MDS1    MDS2
## Cawsand Bay      -0.653940  0.52258 -1.1851 -0.09111  0.4015
## Inner Breakwater  0.453642  0.95241  0.9518 -1.01815 -0.4491
## Jenny Cliff Bay  -0.188139 -0.51652 -0.6921  0.29011 -1.3971
## Mallard Shoal     0.004171 -1.29984  0.7392 -0.82329  0.6974
## West Mud         -0.939952  0.31442  0.5930  0.94169  0.1454
## Sutton Lock       1.324219  0.02693 -0.4069  0.70075  0.6020
## 
## 
## Site constraints (linear combinations of constraining variables)
## 
##                      CAP1     CAP2    CAP3     MDS1    MDS2
## Cawsand Bay      -0.72388  0.57179 -1.2781 -0.09111  0.4015
## Inner Breakwater  0.38504  0.97402  0.5645 -1.01815 -0.4491
## Jenny Cliff Bay   0.05169 -0.68590 -0.3804  0.29011 -1.3971
## Mallard Shoal    -0.20449 -1.16885  0.2860 -0.82329  0.6974
## West Mud         -0.83750  0.26554  0.9870  0.94169  0.1454
## Sutton Lock       1.32913  0.04341 -0.1789  0.70075  0.6020
## 
## 
## Biplot scores for constraining variables
## 
##          CAP1   CAP2    CAP3 MDS1 MDS2
## depth -0.1369 0.2983 -0.9446    0    0
## NO3   -0.7609 0.5042  0.4086    0    0
## Hg     0.2525 0.7214  0.6448    0    0

Plot dbRDA

smry <- summary(dbRDA.mat)
scrs <- scores(dbRDA.mat)
df1  <- data.frame(smry$sites[,1:2]) # site scores for RDA1 and RDA2
df1$site <- rownames(df1)  #add site names
df2  <- data.frame(smry$biplot[,1:2])  # mapping environmental variables

rda.plot <- ggplot(df1, aes(x=CAP1, y=CAP2, colour = site)) + 
  geom_segment(data=df2, aes(x=0, xend=CAP1, y=0, yend=CAP2), 
               color="grey50", arrow=arrow(length=unit(0.01,"npc"))) +
  geom_text(data=df2, aes(x=CAP1,y=CAP2,label=rownames(df2),
            hjust=0.5*(1-sign(CAP1)),vjust=0.5*(1-sign(CAP2))), 
            color="grey50", size=3) +
  geom_point(size = 3) +
  scale_colour_viridis_d(option = "magma", begin = 0.2, end = 0.8) +
  geom_text(aes(label=rownames(df1),
            hjust=0,vjust=1.5), colour = "black",size=3) +
  geom_hline(yintercept=0, linetype="dotted") +
  geom_vline(xintercept=0, linetype="dotted") +
  xlim(-1.05, 1.65) +
  ylim(-1.5, 1.05) +
  xlab("RDA1 (21.6%)") + # this percentage comes from the CAP1 'importance of components:' proportion explained, which can be found in summary(dbRDA.mat) 
  ylab("RDA2 (19.2%)") + # this percentage comes from the CAP2 'importance of components:' proportion explained, which can be found in summary(dbRDA.mat) 
  coord_fixed() +
  theme_bw() 
rda.plot

It is critical to remember that we have reduced our model due to colinearity between Hg and Cu, but that these factors are confounded. The relationship between constrained variance and unconstrained variance informs us of how well the beta-diversity of the Plymouth Sound can be explained by the environmental factors in our model. While the measured environmental parameters, namely Copper and Mercury concentrations, were able to explain trends in Shannon’s H’ observed throughout the Plymouth Sound, the environmental parameters were unable to explain a significant proportion of variance in meiofauna community composition (beta-diversity) throughout the region.