Note that when parameterized as ~ 1, capscale performs an unconstrained analysis called Principal Coordinates Analysis, abbreviated PCoA.
pacman::p_load(vegan, tidyverse)
data(dune, dune.env)
ord <- capscale(dune ~ 1, data=dune)
Eigenvalues are used to assess (1) how well the ordination fits, and (2) how much variation is explained by each axis (and overall).
ord
## Call: capscale(formula = dune ~ 1, data = dune)
##
## Inertia Rank
## Total 84.12
## Unconstrained 84.12 19
## Inertia is mean squared Euclidean distance
## Species scores projected from 'dune'
##
## Eigenvalues for unconstrained axes:
## MDS1 MDS2 MDS3 MDS4 MDS5 MDS6 MDS7 MDS8
## 24.795 18.147 7.629 7.153 5.695 4.333 3.199 2.782
## (Showing 8 of 19 unconstrained eigenvalues)
Simply looking at the eigenvalues doesn’t do much, though. To assess how good the ordination is, we look at the rate of change in eigenvalues using a screeplot. Desirable qualities of the screeplot include (1) a major kink in the line that (2) occurs as low as possible. Eigenvalues are plotted as “inertia.”
screeplot(ord, type="line")
There is a clear kink in the inertia plot at MDS3, but note that MDS4 has equal inertia and should probably be considered, as well.
eigenvals(ord) %>%
summary() -> ev
MDS1 | MDS2 | MDS3 | MDS4 | MDS5 | MDS6 | MDS7 | |
---|---|---|---|---|---|---|---|
Eigenvalue | 24.8 | 18.15 | 7.63 | 7.15 | 5.7 | 4.33 | 3.2 |
Proportion Explained | 0.29 | 0.22 | 0.09 | 0.09 | 0.07 | 0.05 | 0.04 |
Cumulative Proportion | 0.29 | 0.51 | 0.6 | 0.69 | 0.75 | 0.81 | 0.84 |
To write this up in a Results section, one would simply say,
Cumulatively, four axes of the PCoA represented 69% of variation in the Bray-Curtis distance matrix. First, second, third, and fourth axes individually explained 29%, 22%, 9%, and 9% respectively.
Here we test two environmental variables:
envfit knows how to interpret these variables and test them as vectors or factors automatically.
Also in this example, we nest by use using the strata argument, which can be used to control for repeated measures or random error terms.
The choices argument has been set to include the first four axes of the ordination, which are necessary to explain a sufficient proportion (69%) of the variation.
dune.fit <- envfit(ord ~ Management + A1, dune.env,
choices=c(1:4),
strata = dune.env$Use)
dune.fit
##
## ***VECTORS
##
## MDS1 MDS2 MDS3 MDS4 r2 Pr(>r)
## A1 0.86580 0.16093 -0.23320 0.41245 0.3393 0.142
## Blocks: strata
## Permutation: free
## Number of permutations: 999
##
## ***FACTORS:
##
## Centroids:
## MDS1 MDS2 MDS3 MDS4
## ManagementBF -1.5335 0.1574 1.0042 -0.5583
## ManagementHF -0.8985 -0.2111 -0.5008 1.3197
## ManagementNM 0.9626 1.5060 -0.0934 -0.6189
## ManagementSF 0.5529 -1.4088 0.0086 -0.2017
##
## Goodness of fit:
## r2 Pr(>r)
## Management 0.3803 0.001 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Blocks: strata
## Permutation: free
## Number of permutations: 999
Results from envfit indicate A1 does not have a significant correlation with variation but at least one of the Management groups is significantly different from at least one other.
Determining which groups are actually different requires a post-hoc pairwise test, which is available in the RVAideMemoire package:
pacman::p_load(RVAideMemoire)
pw <- pairwise.factorfit(ord,dune.env$Management,
strata=dune.env$Use,
choices=c(1:4), nperm=999)
p.value:
BF | HF | NM | |
---|---|---|---|
HF | 0.662 | NA | NA |
NM | 0.016 | 0.016 | NA |
SF | 0.0435 | 0.0672 | 0.006 |
p.adjust.method: fdr
These results indicate HF overlaps with BF and SF but the community composition of the other pairs are statistically-significantly different.
These graphs connect site scores by Use:
plot(ord, display = "sites")
ordispider(ord, dune.env$Use, display = "sites",
col=c(unique(as.numeric(dune.env$Use))))
plot(ord, type = "n")
ordispider(ord, dune.env$Use, display = "sites",
col=c("blue", "orange", "black"),lwd=2)
Extract the data from vegan objects:
# Extract scores from ord results
dune.spp <- scores(ord, scaling = 1)$species %>%
as.data.frame() %>%
rownames_to_column("species")
dune.sites <- scores(ord, scaling = 2)$sites %>%
as.data.frame()
# fit env variables: vectors
dune.vec <- scores(dune.fit, "vectors") %>%
round(., 3) %>%
as.data.frame()
# Env factor variables: get group cluster centroids
# This is totally hack-y but trust me
{
pdf(file=NULL)
plot(ord)
spid <- ordispider(ord, dune.env$Management, display = "sites")
w <- dev.off() # simply 'gobbles' message; prevents showing in output
}
dune.cen <- cbind(spid)
colnames(dune.cen) <- c("x.cen", "y.cen")
# Create the final table for site score plotting
dune.man<- data.frame(Management=dune.env$Management,
dune.sites,
dune.cen)
str(dune.man)
## 'data.frame': 20 obs. of 5 variables:
## $ Management: Factor w/ 4 levels "BF","HF","NM",..: 4 1 4 4 2 2 2 2 2 1 ...
## $ MDS1 : num -0.8568 -1.6448 -0.4401 0.0479 -1.6245 ...
## $ MDS2 : num -0.172 -1.23 -2.383 -2.046 0.29 ...
## $ x.cen : num 0.553 -1.533 0.553 0.553 -0.899 ...
## $ y.cen : num -1.409 0.157 -1.409 -1.409 -0.211 ...
Plot the ordination with ggplot:
# Vector Expansion Factor
VEF = 4
# Site scores plot
sites.gg <-
ggplot() + theme_bw(16) +
labs(x="MDS Axis 1",y="MDS Axis 2") +
geom_segment(data=dune.man,
aes(xend=MDS1,
yend=MDS2,
x=x.cen,
y=y.cen,
color=Management)) +
geom_point(data=dune.man,
aes(x=MDS1, y=MDS2, bg=Management),
colour="black", pch=21, size=3, stroke=2) +
geom_segment(data=dune.vec, aes(x=0, y=0,
xend=MDS1*VEF,
yend=MDS2*VEF),
arrow=arrow(length = unit(0.03, "npc")),
lwd=1.5) +
geom_text(data=dune.vec,
aes(x=MDS1*(VEF*1.2), y=MDS2*(VEF*1.2),
label=rownames(dune.vec)),
nudge_y = -0.03, size=6, fontface="bold") +
geom_point(data=dune.spp, aes(x=MDS1, y=MDS2),
shape="+", color="grey30", size=5) +
theme(panel.grid=element_blank(),
legend.position = "bottom",
legend.direction="horizontal") +
guides(color=guide_legend(nrow=2, byrow=TRUE))
# Species scores plot
species.gg <-
ggplot(data=dune.spp, aes(x=MDS1, y=MDS2)) + theme_bw(16) +
labs(x="MDS Axis 1",y="MDS Axis 2") +
coord_cartesian(xlim=c(-3.5,3.5),
ylim=c(-3.5, 2.5)) +
theme(panel.grid=element_blank(),
plot.margin = unit(c(0.4,0.5,3,0.5), "cm")) +
geom_label(aes(label=species),
label.padding=unit(0.1,"lines"),
label.size = 0, fontface="bold")
gridExtra::grid.arrange(species.gg, sites.gg, nrow=1)