trait modelling using mvabund

Prepared for Eco-Stats March lab, at the last minute

In this lab we will use the latest version of the mvabund package (3.10.1) available on R-Forge. So start by downloading it:

install.packages("mvabund", repos="http://R-Forge.R-project.org")

## Installing package into '\\INFPWFS111.ad.unsw.edu.au/Staff012$/z3103495/R/win-library/3.1'
## (as 'lib' is unspecified)

library(mvabund)

## Warning: package 'mvabund' was built under R version 3.1.3

What is this trait modelling thing about?

Often ecologists collect data (especially abundance or presence-absence data) simultaneously across many taxa, with the intention of studying what occurs where (and why). This tutorial focusses on the why - methods to help us move towards a functional explanation of community abundances. McGill et al (2006) and Shipley (2010) argue passionately for the need for this.

A common strategy in any field looking at “why” is to look for predictor variables that can explain the response. In the case of studying why some taxa are abundant at a site while others are not, the relevant predictors are species traits. These come in a matrix, different traits in different columns, different taxa in different rows.

Here is a schematic diagram (from Warton et al 2015) describing how trait modelling fits in with other things you might know about - SDM, multivariate analysis…

Fourth corner diagram

Basically we are just adding an extra type of predictor to the model (traits). More details below.

Single site example (“CATS”)

Shipley et al (2006) proposed CATS as a method to study why (relative) abundances at a single site are larger for some taxa than for others, by relating abundances to a matrix of traits. Shipley used maximum entropy, but Warton et al (2015) recently showed this is equivalent to a Poisson GLM (“CATS regression”). So we can take count data at a site and use a GLM to regress it against traits.

Consider for example ant abundances at a site somewhere in SE Australia from Gibb et al (2015). Abundances are counts, we will use the first row of antTraits$abund, and regress as a function of several trait variables (stored as a data frame in antTraits$traits, log-transformed and regressed against body length):

data(antTraits)
y=c(t(antTraits$abund[1,])) #coercing the first row of antTraits$abund into a vector
singleSite = data.frame(y, antTraits$traits,check.names=F) #make a data frame from site abundances and traits
head(singleSite) #here is what the dataset looks like

##                             y Femur.length No.spines Pilosity Polymorphism
## Amblyopone.australis        0  -0.38278140         0        2            1
## Aphaenogaster.longiceps     0   0.15408686         2        1            0
## Camponotus.cinereus.amperei 0   0.01135992         0        1            1
## Camponotus.claripes         2   0.14233775         0        1            1
## Camponotus.consobrinus      1   0.12496098         0        1            1
## Camponotus.nigriceps        6   0.09985137         0        1            1
##                             Webers.length
## Amblyopone.australis            1.0942247
## Aphaenogaster.longiceps         0.6740655
## Camponotus.cinereus.amperei     0.6940704
## Camponotus.claripes             0.8847158
## Camponotus.consobrinus          1.2145690
## Camponotus.nigriceps            1.4607617

ft=manyglm(y~.,data=singleSite, family="negative.binomial") #fit a negative binomial regression
summary(ft,nBoot=1) #find which traits are important

## 
## Test statistics:
##               wald value Pr(>wald)
## (Intercept)        1.192         1
## Femur.length       0.826         1
## No.spines          1.611         1
## Pilosity1          0.001         1
## Pilosity2          0.607         1
## Pilosity3          1.880         1
## Polymorphism1      0.906         1
## Polymorphism2      0.134         1
## Webers.length      1.618         1
## Arguments: P-value calculated using 0 resampling iterations via pit.trap resampling.
## 
## Test statistic:  3.954, p-value: 1 
## Arguments: P-value calculated using 0 resampling iterations via pit.trap resampling.

plot(ft) #fit looks OK

Of the six traits in this model, the strongest effects seem to be due to Weber’s length (body length) and Pilosity (hairiness). nBoot=1 doesn’t do resampling - note that a key assumption for resampling is violated in this dataset: the species abundances are not independent! So I’m going to eyeball and wave my arms instead.

par(mfrow=c(1,2))
plot( log(singleSite$y+1)~singleSite$Weber )
plot( log(singleSite$y+1)~singleSite$Pilosity )

It looks like any pattern for body length is not overly strong. A suggestion that small ants (log(Weber’s length)<-0.5) are absent at the site. Hairy buggers (Pilosity=3) seem to have high abundance here though.

Try again on a different site

y=c(t(antTraits$abund[2,])) #coercing the first row of antTraits$abund into a vector
anotherSite = data.frame(y, antTraits$traits,check.names=F) #make a data frame from site abundances and traits

ft=manyglm(y~.,data=anotherSite, family="negative.binomial") #fit a negative binomial regression
summary(ft,nBoot=1) #find which traits are important

## 
## Test statistics:
##               wald value Pr(>wald)
## (Intercept)        1.044         1
## Femur.length       0.915         1
## No.spines          1.055         1
## Pilosity1          0.390         1
## Pilosity2          0.447         1
## Pilosity3          0.999         1
## Polymorphism1      1.967         1
## Polymorphism2      1.312         1
## Webers.length      1.774         1
## Arguments: P-value calculated using 0 resampling iterations via pit.trap resampling.
## 
## Test statistic:  3.436, p-value: 1 
## Arguments: P-value calculated using 0 resampling iterations via pit.trap resampling.

par(mfrow=c(1,2))
plot( log(anotherSite$y+1)~anotherSite$Weber )
plot( log(anotherSite$y+1)~anotherSite$Pilosity )

Pilosity seems to be on again but any association with Weber’s length is difficult to detect. We could try another site, and another, and we will keep finding we get different results, and want to try and settle on what the broader picture is.

Multiple sites - the “fourth corner problem”

Gibb et al (2015) actually sampled at 30 sites, and abundances across these sites form a matrix (different sites in different rows, different taxa in different columns). If there are predictors describing key differences across sites (“environmental variables”) this gives us a third matrix, as in the diagram at the top of the page. We would like to study how the abundances vary across sites and taxa as a function of environment and traits. Or more to the point, which traits best explain variation across taxa in their environmental response.

One of the early attempts to look at this problem called it the “fourth corner problem” (Legendre et al 1996), seeing the issue as one where we have three matrices and what we care about is the fourth (the one that relates environment to traits). Labelled (a) in the below diagram, where L=abundance, R=environmental variables, Q=traits (taking the notation from Doledec et al 1996).

Fourth corner problem	fourth corner regression model

But the problem is more naturally understood using regression (as in Brown et al 2014), where the goal is to predict abundance, and we have two types of predictors - environment and trait. What we are most interested in is the interaction between environmental variables and traits (because this tells us how environmental response varies as traits vary). The env*trait interaction coefficients that arise in such a model can be understood as the fourth corner matrix. This is conceptualised in (b) above (which comes from Fig 1 from Brown et al 2014):

A new function for fitting fourth corner models in the mvabund package on R is traitglm. It takes three arguments: L; R; and Q, and uses these to construct a regression model for abundance (L) as a function of additive terms for each variable in R and Q, quadratic terms for any continuous covariates (such as Weber’s length), and interaction terms between environmental and trait variables. These interactions are what we really care about (how environmental response changes as traits change). All variables are standardised by default.

The data need to be entered in such a way that the rows of L and R must match (and correspond to the sites sampled), and the columns of L must match the rows of Q (and correspond to the different taxa observed). This means that as compared to the above diagram, Q needs to be turned on its side, as in the antTraits dataset.

ft=traitglm(antTraits$abund,antTraits$env,antTraits$traits)
ft$fourth #print fourth corner terms

##                Bare.ground Canopy.cover   Shrub.cover Volume.lying.CWD
## Femur.length  -0.005345114  -0.03372640 -0.0846686767      0.166704472
## No.spines     -0.044762960  -0.05617177 -0.0889573035     -0.119947322
## Pilosity0     -0.068000805  -0.09410584 -0.1107630310     -0.147687738
## Pilosity1     -0.023416487   0.09962158 -0.0008981637      0.039695356
## Pilosity2      0.041131267   0.05653293  0.0545829023      0.022215263
## Pilosity3      0.037427051  -0.16844261  0.0172086375      0.030054684
## Polymorphism0 -0.056821173  -0.03255994 -0.0232985333      0.010504951
## Polymorphism1  0.052033214   0.02485483  0.0129314617     -0.009613700
## Polymorphism2  0.014540899   0.01574084  0.0185111477     -0.002697445
## Webers.length  0.088389977   0.08653766 -0.0484200426     -0.014818162
##               Feral.mammal.dung
## Femur.length      -0.0259056583
## No.spines          0.0396076363
## Pilosity0         -0.0255683063
## Pilosity1         -0.0096296249
## Pilosity2          0.0001542087
## Pilosity3          0.0383154261
## Polymorphism0     -0.0407112737
## Polymorphism1      0.0415889141
## Polymorphism2      0.0039852641
## Webers.length     -0.0548418395

Because all predictors are standardised, you can interpret the size of coefficients as a measure of importance. As interaction terms, the fourth coefficients each have an interpretation as the amount by which a unit (1 sd) change in the trait variable changes the slope of the relationship between abundance and a given environmental variable. (But not that there are quadratic terms in the model too, which makes it a little messier to interpret, a plot would be a good idea.) So for example, one of the strongest interactions seems to be between Pilosity and Canopy cover - when there is more canopy cover, ants tend to have lower pilosity scores (less hairy).

You can change the type of model being fitted by traitglm using the method argument and other usual arguments such as family. The default method used for the fit is a manyglm model, which enables use of the anova function for resampling-based hypothesis testing for a environment*trait interaction. By default, manyglm assumes you have counts and fits a negative binomial regression. If you had presence/absence for example you could add the argument family="binomial" to your traitglm call.

There is currently no formula argument - it is hoped that soon there will be, so that you can specify exactly which env and trait variables to use, instead of using all of them, and to specify exactly what function of them to use in the model. But not today!

The anova function takes a long time to run on fourth corner models so I’ll sub-sample the data first, to just the species found at more than 20% of sites, and a reduced number of predictor variables:

abSum = apply(antTraits$abund>0,2,mean) # stores the proportion of values that are non-zero for each species.
ab = antTraits$abund[,abSum>0.2] # just keep columns of $abund that are more than 20% presence
tr = antTraits$traits[abSum>0.2,] # and make sure that change is also made to rows of the traits data frame.

# now fit the fourth corner model, only as a function of a couple of traits and env variables:
ftSmall=traitglm(ab,antTraits$env[,1:3],data.frame(tr$Weber,tr$Femur))
anova(ftSmall)

## Resampling begins for test 1.
##  Resampling run 0 finished. Time elapsed: 0.00 minutes...
## Time elapsed: 0 hr 0 min 11 sec

## Analysis of Deviance Table
## 
## env.plus.trait: l ~ .
## env.times.trait: l ~ .
## 
## Multivariate test:
##                           Res.Df Df.diff   Dev Pr(>Dev)   
## Main effects only            922                          
## env:trait (fourth corner)    916       6 21.05     0.01 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Arguments: P-value calculated using 99 resampling iterations via resampling.

We can see from these results that there is evidence of a environment:trait interaction term, i.e. these traits explain a significant amount of the variation in environmental response across species.

The idea of using resampling-bsaed inference in a regression context to test for fourth corner interactions was first illustrated in Warton et al (2015), under the guise of multi-site CATS regression. The anova.traitglm function uses “row resampling”, resampling sites but keeping all species from a site together in the resample. This ensures that we can make inferences that are valid if the abundances can be assumed to be independent across sites, even though we know that the abundances are not independent across species. Resampling means that it can take a while though.

Note the number of bootstrap resamples used - only 99 by default, which is only correct to within 2% for a P-value near 0.05. The reason only 99 resamples are used is that this takes a fair while to run, but you should repeat with nBoot=1000 when you want a more accurate P-value, after a test run to see exactly how long you will have to wait!

GLMs and fourth corner models using a LASSO penalty

Another recent addition to the mvabund package is some functions for fitting GLMs using a LASSO penalty, the main one being the glm1path function. This simplifies interpretation because it automatically does model selection, setting to zero any interaction coefficients that don’t help reduce BIC:

ft1=traitglm(antTraits$abund,antTraits$env,antTraits$traits,method="glm1path")
ft1$fourth #notice LASSO penalty has shrunk many interactions to zero

##               Bare.ground Canopy.cover Shrub.cover Volume.lying.CWD
## Femur.length   0.00000000  0.000000000 -0.05653626       0.10635948
## No.spines     -0.02193309 -0.003940441 -0.03615544      -0.08549197
## Pilosity0     -0.04688064 -0.076900343 -0.07239145      -0.12969223
## Pilosity1      0.00000000  0.036098724  0.00000000       0.00000000
## Pilosity2      0.01457037  0.000000000  0.03473728       0.00000000
## Pilosity3      0.03314056 -0.158031996  0.00000000       0.00000000
## Polymorphism0 -0.03477743 -0.001193911  0.00000000       0.00000000
## Polymorphism1  0.03887527  0.005790553  0.00000000       0.00000000
## Polymorphism2  0.00000000  0.000000000  0.00000000       0.00000000
## Webers.length  0.05834639  0.057169217 -0.02282845       0.00000000
##               Feral.mammal.dung
## Femur.length       0.0000000000
## No.spines          0.0167645908
## Pilosity0          0.0000000000
## Pilosity1          0.0000000000
## Pilosity2          0.0000000000
## Pilosity3          0.0004473121
## Polymorphism0      0.0000000000
## Polymorphism1      0.0000000000
## Polymorphism2      0.0000000000
## Webers.length     -0.0188204072

Notice all the zeros in the fourth corner matrix now.

There is also a cv.glm1path function, to use cross-validation to choose the best model, but the standard error calculation is giving some funny answers at the moment so check with David if you are going to use it with the 1se rule.

To construct a pretty picture of the fourth corner coefficients, along the lines of Gibb et al (2015):

library(lattice)
a        = max( abs(ft1$fourth.corner) )
colort   = colorRampPalette(c("blue","white","red")) 
plot.4th = levelplot(t(as.matrix(ft1$fourth.corner)), xlab="Environmental Variables",
 ylab="Species traits", col.regions=colort(100), at=seq(-a, a, length=100),
 scales = list( x= list(rot = 45)))
print(plot.4th)

Fitting a single multi-species SDM

You can also use this function to fit a single predictive model for all species at all sites, assuming different environmental response for difference species, but not attempting to explain these using traits. This can be understood as like a multivariate SDM. This is done by using traitglm function without a trait argument, because by default, a species ID factor is used as Q.

ftspp1=traitglm(antTraits$abund,antTraits$env,method="glm1path")

## No traits matrix entered, so will fit SDMs with different env response for each spp

ftspp1$fourth #notice LASSO penalty has shrunk many interactions to zero

##                                        Bare.ground Canopy.cover
## names.L.Amblyopone.australis           0.000000000  0.000000000
## names.L.Aphaenogaster.longiceps        0.000000000  0.195903772
## names.L.Camponotus.cinereus.amperei    0.000000000  0.051089268
## names.L.Camponotus.claripes            0.085404350  0.101102800
## names.L.Camponotus.consobrinus         0.053094170 -0.046531404
## names.L.Camponotus.nigriceps           0.000000000  0.112896362
## names.L.Camponotus.nigroaeneus         0.093879869  0.074060218
## names.L.Cardiocondyla.nuda.atalanta   -0.010988317  0.000000000
## names.L.Crematogaster.sp..A            0.000000000  0.036135506
## names.L.Heteroponera.sp..A             0.000000000  0.000000000
## names.L.Iridomyrmex.bicknelli          0.029544801 -0.105223246
## names.L.Iridomyrmex.dromus             0.036160843  0.000000000
## names.L.Iridomyrmex.mjobergi          -0.030370598  0.000000000
## names.L.Iridomyrmex.purpureus          0.016079976  0.000000000
## names.L.Iridomyrmex.rufoniger          0.000000000 -0.125993541
## names.L.Iridomyrmex.suchieri           0.000000000 -0.057124677
## names.L.Iridomyrmex.suchieroides       0.000000000 -0.031790430
## names.L.Melophorus.sp..E               0.012687320  0.000000000
## names.L.Melophorus.sp..F              -0.013298385 -0.032551264
## names.L.Melophorus.sp..H              -0.040156289  0.000000000
## names.L.Meranoplus.sp..A               0.000000000 -0.037315624
## names.L.Monomorium.leae                0.000000000  0.080980647
## names.L.Monomorium.rothsteini         -0.059078261 -0.151110963
## names.L.Monomorium.sydneyense         -0.009400791 -0.016848224
## names.L.Myrmecia.pilosula.complex      0.000000000  0.000000000
## names.L.Notoncus.capitatus            -0.021835901  0.047607685
## names.L.Notoncus.ectatommoides         0.000000000  0.000000000
## names.L.Nylanderia.sp..A               0.000000000 -0.111945793
## names.L.Ochetellus.glaber              0.000000000  0.000000000
## names.L.Paraparatrechina.sp..B         0.000000000  0.137431642
## names.L.Pheidole.sp..A                 0.000000000  0.000000000
## names.L.Pheidole.sp..B                 0.000000000  0.079300183
## names.L.Pheidole.sp..E                -0.006205572  0.000000000
## names.L.Pheidole.sp..J                 0.000000000  0.000000000
## names.L.Polyrhachis.sp..A              0.000000000  0.013672230
## names.L.Rhytidoponera.metallica.sp..A -0.011215687 -0.020809462
## names.L.Rhytidoponera.sp..B           -0.020977054  0.000000000
## names.L.Solenopsis.sp..A               0.000000000  0.000000000
## names.L.Stigmacros.sp..A               0.000000000  0.058630996
## names.L.Tapinoma.sp..A                -0.013093979  0.005962769
## names.L.Tetramorium.sp..A             -0.052186664 -0.076548652
##                                        Shrub.cover Volume.lying.CWD
## names.L.Amblyopone.australis           0.000000000      0.000000000
## names.L.Aphaenogaster.longiceps        0.000000000      0.047952476
## names.L.Camponotus.cinereus.amperei    0.000000000      0.011740122
## names.L.Camponotus.claripes            0.000000000      0.016963383
## names.L.Camponotus.consobrinus         0.068554436      0.044222192
## names.L.Camponotus.nigriceps           0.000000000      0.000000000
## names.L.Camponotus.nigroaeneus         0.000000000      0.000000000
## names.L.Cardiocondyla.nuda.atalanta    0.000000000     -0.025172727
## names.L.Crematogaster.sp..A            0.000000000      0.159996746
## names.L.Heteroponera.sp..A             0.046517208     -0.005829482
## names.L.Iridomyrmex.bicknelli          0.019233536      0.000000000
## names.L.Iridomyrmex.dromus             0.000000000      0.085909562
## names.L.Iridomyrmex.mjobergi          -0.023773178      0.003080628
## names.L.Iridomyrmex.purpureus         -0.150719515      0.116695075
## names.L.Iridomyrmex.rufoniger          0.009426445      0.000000000
## names.L.Iridomyrmex.suchieri           0.050960000      0.000000000
## names.L.Iridomyrmex.suchieroides       0.000000000      0.000000000
## names.L.Melophorus.sp..E               0.000000000      0.000000000
## names.L.Melophorus.sp..F              -0.020970544      0.000000000
## names.L.Melophorus.sp..H               0.000000000      0.000000000
## names.L.Meranoplus.sp..A              -0.019577256     -0.047790840
## names.L.Monomorium.leae                0.039234381      0.000000000
## names.L.Monomorium.rothsteini          0.000000000     -0.047369444
## names.L.Monomorium.sydneyense          0.000000000     -0.036404068
## names.L.Myrmecia.pilosula.complex      0.000000000      0.000000000
## names.L.Notoncus.capitatus             0.020338143      0.024038756
## names.L.Notoncus.ectatommoides        -0.038764547     -0.052386610
## names.L.Nylanderia.sp..A              -0.014956588      0.000000000
## names.L.Ochetellus.glaber              0.000000000      0.000000000
## names.L.Paraparatrechina.sp..B         0.000000000      0.000000000
## names.L.Pheidole.sp..A                -0.015260326      0.000000000
## names.L.Pheidole.sp..B                 0.000000000      0.034265287
## names.L.Pheidole.sp..E                 0.030685939      0.005728487
## names.L.Pheidole.sp..J                 0.000000000      0.000000000
## names.L.Polyrhachis.sp..A              0.024642150      0.000000000
## names.L.Rhytidoponera.metallica.sp..A  0.000000000     -0.017581313
## names.L.Rhytidoponera.sp..B           -0.073423897     -0.118284536
## names.L.Solenopsis.sp..A               0.000000000      0.000000000
## names.L.Stigmacros.sp..A               0.026612308      0.007229789
## names.L.Tapinoma.sp..A                 0.073579369      0.000000000
## names.L.Tetramorium.sp..A              0.000000000     -0.020434751
##                                       Feral.mammal.dung
## names.L.Amblyopone.australis                0.000000000
## names.L.Aphaenogaster.longiceps            -0.076741544
## names.L.Camponotus.cinereus.amperei         0.000000000
## names.L.Camponotus.claripes                 0.000000000
## names.L.Camponotus.consobrinus              0.000000000
## names.L.Camponotus.nigriceps                0.000000000
## names.L.Camponotus.nigroaeneus              0.000000000
## names.L.Cardiocondyla.nuda.atalanta         0.047867029
## names.L.Crematogaster.sp..A                 0.000000000
## names.L.Heteroponera.sp..A                  0.000000000
## names.L.Iridomyrmex.bicknelli               0.013391980
## names.L.Iridomyrmex.dromus                 -0.034889324
## names.L.Iridomyrmex.mjobergi                0.000000000
## names.L.Iridomyrmex.purpureus               0.000000000
## names.L.Iridomyrmex.rufoniger               0.000000000
## names.L.Iridomyrmex.suchieri                0.000000000
## names.L.Iridomyrmex.suchieroides            0.010135444
## names.L.Melophorus.sp..E                    0.056305877
## names.L.Melophorus.sp..F                    0.000000000
## names.L.Melophorus.sp..H                    0.000000000
## names.L.Meranoplus.sp..A                    0.018181708
## names.L.Monomorium.leae                     0.000000000
## names.L.Monomorium.rothsteini               0.007043668
## names.L.Monomorium.sydneyense              -0.013329988
## names.L.Myrmecia.pilosula.complex           0.000000000
## names.L.Notoncus.capitatus                  0.000000000
## names.L.Notoncus.ectatommoides             -0.015422250
## names.L.Nylanderia.sp..A                   -0.017632939
## names.L.Ochetellus.glaber                   0.000000000
## names.L.Paraparatrechina.sp..B              0.000000000
## names.L.Pheidole.sp..A                      0.000000000
## names.L.Pheidole.sp..B                     -0.009424855
## names.L.Pheidole.sp..E                      0.000000000
## names.L.Pheidole.sp..J                      0.042361417
## names.L.Polyrhachis.sp..A                   0.000000000
## names.L.Rhytidoponera.metallica.sp..A       0.000000000
## names.L.Rhytidoponera.sp..B                 0.000000000
## names.L.Solenopsis.sp..A                    0.000000000
## names.L.Stigmacros.sp..A                    0.000000000
## names.L.Tapinoma.sp..A                      0.000000000
## names.L.Tetramorium.sp..A                   0.027725589

And we can see from this which species are responding to the environmental gradient and how.

library(lattice)
a        = max( abs(ftspp1$fourth.corner) )
colort   = colorRampPalette(c("blue","white","red")) 
plot.spp = levelplot(t(as.matrix(ftspp1$fourth.corner)), xlab="Environmental Variables",
 ylab="Species traits", col.regions=colort(100), at=seq(-a, a, length=100),
 scales = list( x= list(rot = 45)))
print(plot.spp)

You could actually include both traits and a spp ID factor in Q, if you use a LASSO penalty. This would allow species terms to “soak up” additional variation across species not explained by traits. Jamil et al (2013) argue strongly for including such a term in the model, Brown et al (2014) tried it and were somewhat non-plussed.

References

Brown, A.M., Warton, D.I., Andrew, N.R., Binns, M., Cassis, G. & Gibb, H. (2014) The fourth-corner solution – using predictive models to understand how species traits interact with the environment. Methods in Ecology and Evolution, 5, 344–352.

Doledec, S., Chessel,D., terBraak, C.J.F.&Champely, S. (1996) Matching species traits to environmental variables: a new three-table ordination method. Environmental and Ecological Statistics, 3, 143–166

Jamil, T., Ozinga, W.A., Kleyer, M. & ter Braak, C.J. (2013) Selecting traits that explain species–environment relationships: a generalized linear mixed model approach. Journal of Vegetation Science, 24, 988–1000.

Legendre, P., Galzin, R. & Harmelin-Vivien, M.L. (1997) Relating behavior to habitat: solutions to the fourth-corner problem. Ecology, 78, 547–562.

McGill, B.J., Enquist, B.J.,Weiher, E.&Westoby,M. (2006) Rebuilding community ecology from functional traits. Trends in Ecology and Evolution, 20, 178–185.

Shipley, B., Vile, D. & Garnier, E. (2006) From plant traits to plant communities: a statistical mechanistic approach to biodiversity. Science, 314, 812–814.

Shipley, B. (2010) From Plant Traits to Vegetation Structure: Chance and Selection in the Assembly of Ecological Communities. Cambridge University Press, Cambridge. In this book, Shipley described communitry ecologists who are not using trait varibles as acting like “demented accountants”! (documenting lots of different patterns in different species without thinking about why they are different…)

Warton, D.I., Shipley, B., & Hastie, T. (2015) CATS regression – a model-based approach to studying trait-based community assembly. Methods in Ecology and Evolution.

Pollock, L.J., Morris, W.K. & Vesk, P.A. (2012) The role of functional traits in species distributions revealed through a hierarchical model. Ecography, 35, 716–725