By C Nell
Do groups differ in composition?
Does community structure vary among regions or over time?
Do environmental variables explain community patterns?
Which species are responsible for differences among groups?
Multivariate analysis of ecological communities with vegan
install.packages('vegan')
library(vegan) ##Community ecology: ordination, disversity & dissimilarities
Bird abundances from 32 different plots (rows), 12 of which have 1 tree species (DIVERSITY = M) and 20 with 4 tree species (DIVERSITY = P).
Tree composition: there are a total of 6 possible tree species (treecomp), each signified with a letter A to F. Bird abundances are totalled according to their feeding guild (columns).
Get data from internet:
birds<-read.csv('https://raw.githubusercontent.com/collnell/lab-demo/master/bird_by_fg.csv')
trees<-read.csv('https://raw.githubusercontent.com/collnell/lab-demo/master/tree_comp.csv')
Or from your computer:
setwd("/Users/colleennell/Dropbox/Projects/Mexico/R") #change to data folder
birds<-read.csv('bird_by_fg.csv')
head(birds)
## DIVERSITY PLOT CA FR GR HE IN NE OM
## 1 M 3 0 0 0 0 2 0 0
## 2 M 9 0 0 2 0 6 0 4
## 3 M 12 0 0 0 0 2 0 2
## 4 M 17 0 0 0 0 7 0 4
## 5 M 20 0 0 0 0 1 0 4
## 6 M 21 0 0 3 0 14 0 7
trees<-read.csv('tree_comp.csv')
head(trees)
## PLOT comp A B C D E F row col
## 1 3 D 0 0 0 1 0 0 3 1
## 2 9 A 1 0 0 0 0 0 2 2
## 3 12 E 0 0 0 0 1 0 5 2
## 4 17 F 0 0 0 0 0 1 3 3
## 5 20 A 1 0 0 0 0 0 6 3
## 6 21 B 0 1 0 0 0 0 7 3
Questions: Is C. pentandara (B) associated with variation in bird species composition? Does feeding guild composition differ between monoculture and polyculture plots?
Parametric test for differences between independent groups for multiple continuous dependent variables. Like ANOVA for many response variables. Requires variables to be fewer than number of smaples.
Is C. pentandara (B) associated with variation in bird species composition? Or D & F (both Fabaceae)?
bird.matrix<-as.matrix(birds[,3:9])##response variables in a sample x species matrix
trees$B<-as.factor(trees$B)
bird.manova<-manova(bird.matrix~as.factor(B), data=trees) ##manova test
summary(bird.manova)
## Df Pillai approx F num Df den Df Pr(>F)
## as.factor(B) 1 0.39147 2.2056 7 24 0.07027 .
## Residuals 30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show univariate results:
summary.aov(bird.manova)
Problem: Most ecological data is overdispersed, has many 0’s or rare species, unequal sample sizes.
Solution: Dissimilarity coefficients, permutation tests
Non-paramentric, based on dissimilarities. Allows for partitioning of variability, similar to ANOVA, allowing for complex design (multiple factors, nested design, interactions, covariates). Uses permutation to compute F-statistic (pseudo-F).
Interactive app demonstrating permutation tests
Based on Legendre & Anderson (1999, Ecological Monographs) and Anderson (2001, Austral Ecology).
Null hypothesis: Groups do not differ in spread or positioni n multivaraite space.
Use square root or proportions to minimize influence of most abundant groups.
bird.mat<-sqrt(bird.matrix)#square root transform
#bird.prop<-decostand(bird.matrix, method="total")
Quantify pairwise compositional dissimilarity between sites based on species occurances.
- Bray-Curtis dissimilarity (abundance weighted)
- Jaccard (presence/absence)
- Gower’s (non-continuous variables)
Dissimilarity: 0 = sites are indentical, 1 = sites do not share any species
Create a dissimilarity matrix:
bird.dist<-vegdist(bird.mat, method='bray')
Do monoculture and polyculture plots differ in feeding guild composition?
set.seed(36) #reproducible results
bird.div<-adonis2(bird.dist~DIVERSITY, data=birds, permutations = 999, method="bray", strata="PLOT")
bird.div
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = bird.dist ~ DIVERSITY, data = birds, permutations = 999, method = "bray", strata = "PLOT")
## Df SumOfSqs F Pr(>F)
## DIVERSITY 1 0.32857 4.1585 0.008 **
## Residual 30 2.37033
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
strata = ‘exchangeable units’ for permutation. Important for nested design.
The average distance to group centroid. Used as a measure of multivariate beta diversity.
dispersion<-betadisper(bird.dist, group=birds$DIVERSITY)
permutest(dispersion)
##
## Permutation test for homogeneity of multivariate dispersions
## Permutation: free
## Number of permutations: 999
##
## Response: Distances
## Df Sum Sq Mean Sq F N.Perm Pr(>F)
## Groups 1 0.00369 0.0036924 0.2231 999 0.638
## Residuals 30 0.49659 0.0165530
plot(dispersion, hull=FALSE, ellipse=TRUE) ##sd ellipse
Non-metric multi-dimensional scaling. Unconstrained ordination. See (https://jonlefcheck.net/2012/10/24/nmds-tutorial-in-r/).
The goal of NMDS is to represent the original position of communities in multidimensional space as accurately as possible using a reduced number of dimensions that can be easily visualized. NMDS uses rank orders to preserve distances among objects thus can accomodate a variety of data types.
Configure samples in 2-dimensional space:
birdMDS<-metaMDS(bird.mat, distance="bray", k=2, trymax=35, autotransform=TRUE) ##k is the number of dimensions
birdMDS ##metaMDS takes eaither a distance matrix or your community matrix (then requires method for 'distance=')
stressplot(birdMDS)
Stress: similarity of observed distance to ordination distance. < 0.15 to indidates acceptable fit.
install.packages('ggplot2') ##plotting package
library(ggplot2)
##pull points from MDS
NMDS1 <- birdMDS$points[,1] ##also found using scores(birdMDS)
NMDS2 <- birdMDS$points[,2]
bird.plot<-cbind(birds, NMDS1, NMDS2, trees)
#plot ordination
p<-ggplot(bird.plot, aes(NMDS1, NMDS2, color=DIVERSITY))+
geom_point(position=position_jitter(.1), shape=3)+##separates overlapping points
stat_ellipse(type='t',size =1)+ ##draws 95% confidence interval ellipses
theme_minimal()
p
#plot ordination
p<-ggplot(bird.plot, aes(NMDS1, NMDS2, color=DIVERSITY))+
stat_ellipse(type='t',size =1)+
theme_minimal()+geom_text(data=bird.plot,aes(NMDS1, NMDS2, label=comp), position=position_jitter(.35))+
annotate("text", x=min(NMDS1), y=min(NMDS2), label=paste('Stress =',round(birdMDS$stress,3))) #add stress to plot
p
Which envornmental variables are correlated with the ordination?
fit<-envfit(birdMDS, bird.mat)
arrow<-data.frame(fit$vectors$arrows,R = fit$vectors$r, P = fit$vectors$pvals)
arrow$FG <- rownames(arrow)
arrow.p<-filter(arrow, P <= 0.05)
p<-ggplot(data=bird.plot, aes(NMDS1, NMDS2))+
geom_point(data=bird.plot, aes(NMDS1, NMDS2, color=DIVERSITY),position=position_jitter(.1))+##separates overlapping points
stat_ellipse(aes(fill=DIVERSITY), alpha=.2,type='t',size =1, geom="polygon")+ ##changes shading on ellipses
theme_minimal()+
geom_segment(data=arrow.p, aes(x=0, y=0, xend=NMDS1, yend=NMDS2, label=FG, lty=FG), arrow=arrow(length=unit(.2, "cm")*arrow.p$R)) ##add arrows (scaled by R-squared value)
p
Show gradient of insectivores:
ordisurf(birdMDS, bird.mat[,'IN'], bubble=TRUE)##bubble size reflects abundance of insectivores
##
## Family: gaussian
## Link function: identity
##
## Formula:
## y ~ s(x1, x2, k = 10, bs = "tp", fx = FALSE)
##
## Estimated degrees of freedom:
## 5.6 total = 6.6
##
## REML score: 35.72947
Resources:
GUSTA ME - Provides several ‘wizards’ in choosing the correct statistical test, walkthrough examples of multivariate analyses, and guide to the major types of analyses.