Doing ordinations

Fit ordination

Note that when parameterized as ~ 1, capscale performs an unconstrained analysis called Principal Coordinates Analysis, abbreviated PCoA.

pacman::p_load(vegan, tidyverse)
data(dune, dune.env)
ord <- capscale(dune ~ 1, data=dune)

Assess and report ordinations

Eigenvalues are used to assess (1) how well the ordination fits, and (2) how much variation is explained by each axis (and overall).

ord
## Call: capscale(formula = dune ~ 1, data = dune)
## 
##               Inertia Rank
## Total           84.12     
## Unconstrained   84.12   19
## Inertia is mean squared Euclidean distance 
## Species scores projected from 'dune' 
## 
## Eigenvalues for unconstrained axes:
##   MDS1   MDS2   MDS3   MDS4   MDS5   MDS6   MDS7   MDS8 
## 24.795 18.147  7.629  7.153  5.695  4.333  3.199  2.782 
## (Showing 8 of 19 unconstrained eigenvalues)

Visualize: The screeplot

Simply looking at the eigenvalues doesn’t do much, though. To assess how good the ordination is, we look at the rate of change in eigenvalues using a screeplot. Desirable qualities of the screeplot include (1) a major kink in the line that (2) occurs as low as possible. Eigenvalues are plotted as “inertia.”

screeplot(ord, type="line")
There is a clear kink in the inertia plot at MDS3, but note that MDS4 has equal inertia and should probably be considered, as well.

There is a clear kink in the inertia plot at MDS3, but note that MDS4 has equal inertia and should probably be considered, as well.

Quantify: Summarize eigenvalues & variation explained

eigenvals(ord) %>%
  summary() -> ev
In addition to raw eigenvalues, summary() gives the proportion explained by each axis and the cumulative proportion explained. A ‘good’ ordination should explain at least 70% of the variation; here, that requires four axes to get a good fit. This actually isn’t a good ordination, with MDS1 explaining only 29% of variation. Ideally the first axis is > 50%.
  MDS1 MDS2 MDS3 MDS4 MDS5 MDS6 MDS7
Eigenvalue 24.8 18.15 7.63 7.15 5.7 4.33 3.2
Proportion Explained 0.29 0.22 0.09 0.09 0.07 0.05 0.04
Cumulative Proportion 0.29 0.51 0.6 0.69 0.75 0.81 0.84

Reporting

To write this up in a Results section, one would simply say,

Cumulatively, four axes of the PCoA represented 69% of variation in the Bray-Curtis distance matrix. First, second, third, and fourth axes individually explained 29%, 22%, 9%, and 9% respectively.

Test and display environmental variables

Testing vectors and groups with envfit

Here we test two environmental variables:

  • A1, a continuous vector
  • Management, a categorical factor

envfit knows how to interpret these variables and test them as vectors or factors automatically.

Also in this example, we nest by use using the strata argument, which can be used to control for repeated measures or random error terms.

The choices argument has been set to include the first four axes of the ordination, which are necessary to explain a sufficient proportion (69%) of the variation.

dune.fit <- envfit(ord ~ Management + A1, dune.env, 
                   choices=c(1:4), 
                   strata = dune.env$Use) 
dune.fit
## 
## ***VECTORS
## 
##        MDS1     MDS2     MDS3     MDS4     r2 Pr(>r)
## A1  0.86580  0.16093 -0.23320  0.41245 0.3393  0.142
## Blocks:  strata 
## Permutation: free
## Number of permutations: 999
## 
## ***FACTORS:
## 
## Centroids:
##                 MDS1    MDS2    MDS3    MDS4
## ManagementBF -1.5335  0.1574  1.0042 -0.5583
## ManagementHF -0.8985 -0.2111 -0.5008  1.3197
## ManagementNM  0.9626  1.5060 -0.0934 -0.6189
## ManagementSF  0.5529 -1.4088  0.0086 -0.2017
## 
## Goodness of fit:
##                r2 Pr(>r)    
## Management 0.3803  0.001 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Blocks:  strata 
## Permutation: free
## Number of permutations: 999

Results from envfit indicate A1 does not have a significant correlation with variation but at least one of the Management groups is significantly different from at least one other.

Determining which groups are actually different requires a post-hoc pairwise test, which is available in the RVAideMemoire package:

pacman::p_load(RVAideMemoire)
pw <- pairwise.factorfit(ord,dune.env$Management, 
                         strata=dune.env$Use, 
                         choices=c(1:4), nperm=999) 
  • method: factor fitting to an ordination
  • data.name: ord by dune.env$Management 999 permutations
  • p.value:

      BF HF NM
    HF 0.662 NA NA
    NM 0.016 0.016 NA
    SF 0.0435 0.0672 0.006
  • p.adjust.method: fdr

These results indicate HF overlaps with BF and SF but the community composition of the other pairs are statistically-significantly different.

Default group plotting

These graphs connect site scores by Use:

plot(ord, display = "sites")
ordispider(ord, dune.env$Use, display = "sites", 
           col=c(unique(as.numeric(dune.env$Use))))

Customize colors

plot(ord, type = "n")
ordispider(ord, dune.env$Use, display = "sites", 
           col=c("blue", "orange", "black"),lwd=2)

GGplotting ordinations

Extract the data from vegan objects:

# Extract scores from ord results
  dune.spp <- scores(ord, scaling = 1)$species %>%
                as.data.frame() %>%
                  rownames_to_column("species")
  dune.sites <- scores(ord, scaling = 2)$sites %>%
                  as.data.frame() 

  # fit env variables: vectors
      dune.vec <- scores(dune.fit, "vectors") %>%
                    round(., 3)  %>%
                      as.data.frame() 
 
  # Env factor variables: get group cluster centroids
  # This is totally hack-y but trust me
    {
      pdf(file=NULL) 
      plot(ord) 
      spid <- ordispider(ord, dune.env$Management, display = "sites")
      w <- dev.off() # simply 'gobbles' message; prevents showing in output 
    }
    dune.cen <- cbind(spid)
    colnames(dune.cen) <- c("x.cen", "y.cen")

# Create the final table for site score plotting
  dune.man<- data.frame(Management=dune.env$Management, 
                        dune.sites, 
                        dune.cen)
  str(dune.man)
## 'data.frame':    20 obs. of  5 variables:
##  $ Management: Factor w/ 4 levels "BF","HF","NM",..: 4 1 4 4 2 2 2 2 2 1 ...
##  $ MDS1      : num  -0.8568 -1.6448 -0.4401 0.0479 -1.6245 ...
##  $ MDS2      : num  -0.172 -1.23 -2.383 -2.046 0.29 ...
##  $ x.cen     : num  0.553 -1.533 0.553 0.553 -0.899 ...
##  $ y.cen     : num  -1.409 0.157 -1.409 -1.409 -0.211 ...

Plot the ordination with ggplot:

# Vector Expansion Factor
    VEF = 4

# Site scores plot
sites.gg <-
  ggplot() + theme_bw(16) + 
    labs(x="MDS Axis 1",y="MDS Axis 2") + 
    geom_segment(data=dune.man, 
                 aes(xend=MDS1,
                     yend=MDS2, 
                     x=x.cen, 
                     y=y.cen, 
                     color=Management)) +
    geom_point(data=dune.man,
               aes(x=MDS1, y=MDS2, bg=Management), 
               colour="black", pch=21, size=3, stroke=2) +
    geom_segment(data=dune.vec, aes(x=0, y=0, 
                                    xend=MDS1*VEF, 
                                    yend=MDS2*VEF),
                 arrow=arrow(length = unit(0.03, "npc")),
                 lwd=1.5) + 
    geom_text(data=dune.vec, 
                   aes(x=MDS1*(VEF*1.2), y=MDS2*(VEF*1.2), 
                       label=rownames(dune.vec)), 
                   nudge_y = -0.03, size=6, fontface="bold")  + 
    geom_point(data=dune.spp, aes(x=MDS1, y=MDS2), 
                      shape="+", color="grey30", size=5) +
    theme(panel.grid=element_blank(),
          legend.position = "bottom", 
          legend.direction="horizontal") +
      guides(color=guide_legend(nrow=2, byrow=TRUE))

# Species scores plot
species.gg <-
  ggplot(data=dune.spp, aes(x=MDS1, y=MDS2)) + theme_bw(16) + 
            labs(x="MDS Axis 1",y="MDS Axis 2") + 
            coord_cartesian(xlim=c(-3.5,3.5), 
                            ylim=c(-3.5, 2.5)) +
            theme(panel.grid=element_blank(), 
                  plot.margin = unit(c(0.4,0.5,3,0.5), "cm")) +
     geom_label(aes(label=species),   
               label.padding=unit(0.1,"lines"),
               label.size = 0, fontface="bold")

gridExtra::grid.arrange(species.gg, sites.gg, nrow=1)