Diversity interaction models a worked example

Introduction to DI models

Diversity-Interactions (DI) models are a modeling framework that estimates the effects of species identity and diversity on ecosystem function. They Predict the diversity–function relationship across different types of communities. DI models separately estimate the contributions of different species when they interact, and are usually an analysis of simplex design.

A DI model has the following 3 main elements:

Fig1: DI Model Explained

How DI models are used in practice

The models are a family of models and not one single model can explain its full functionality. One example of a DI model is displayed below:

Fig2: DI Model Syntax

How biological assumptions are tested in DI models

The below flow diagram shows the hierarchy of the models, where: * The different models are in boxes (models 1 to 8) * The arrows indicate where a model is hierarchical to the one below * The letters show the alternative biological hypotheses A–G that represent different assumptions about the relationship between diversity and ecosystem function

Fig3

The simulated dataset

Let’s load in some data for this worked example:

simData <- read.csv("C:/Users/hiran/Desktop/PHD docs/Year 1 Sept 2020/Markdown file for DI models- groupwork/simulated_dataset.csv", header = T)

Exploring the data

Descriptive statistics are the very fist step when analysing data. They are used to describe the basic features of the data in a study and provide simple summaries about the sample and the measures used. R provides a wide range of functions for obtaining summary statistics. One method of obtaining descriptive statistics is to use the summary() function or pairs() to get some graphs.

summary(simData)

##    Community        Richness    Treatment               p1        
##  Min.   : 1.00   Min.   :1.0   Length:100         Min.   :0.0000  
##  1st Qu.: 5.00   1st Qu.:2.0   Class :character   1st Qu.:0.0000  
##  Median :17.50   Median :2.0   Mode  :character   Median :0.1000  
##  Mean   :16.26   Mean   :2.4                      Mean   :0.2500  
##  3rd Qu.:26.00   3rd Qu.:3.0                      3rd Qu.:0.3333  
##  Max.   :31.00   Max.   :4.0                      Max.   :1.0000  
##        p2               p3               p4            Response     
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   : 4.869  
##  1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.: 9.752  
##  Median :0.1000   Median :0.1000   Median :0.1000   Median :12.028  
##  Mean   :0.2500   Mean   :0.2500   Mean   :0.2500   Mean   :11.581  
##  3rd Qu.:0.3333   3rd Qu.:0.3333   3rd Qu.:0.3333   3rd Qu.:13.701  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :19.896

This data has:

4 species (where: P1 and P2 are grasses ; P3 and P4 are legumes)
Response variable is yield
There are 2 treatments (Treatment A=50 observations ; Treatment B-50 observations)

Model types explained

These models are Built using the glm() function in R
The term y represents the Yield, β the Identity effects, and δ the Diversity effects
The species used are named p1, p2, p3, and p4, where p1 and p2 are grasses and p3 and p4 are legumes.
The treatments in Model 8 refer to managements
Models 1 through 6 were built first, with models 7, 8, and their opposite/sub models being built after hypothesis testing

Important calculations before starting the modeling process

We will need to add the below necessary columns:

Calculate S first, where S equals total number of species considered in experiment

S=4

Now calculate E, which is Evenness, Species evenness refers to how close in numbers each species in an environment is.

simData$E = (2*S / (S-1)) * (simData$p1 * simData$p2 + simData$p1 * simData$p3 + simData$p1 * simData$p4 + simData$p2 * simData$p3 + simData$p2 * simData$p4 + simData$p3 * simData$p4)

And then calculate functional group proportions

simData$Grass = simData$p1 + simData$p2
simData$Legume = simData$p3 + simData$p4

And finally calculate species specific interaction terms

simData$p1INT = simData$p1 * (1-simData$p1)
simData$p2INT = simData$p2 * (1-simData$p2)
simData$p3INT = simData$p3 * (1-simData$p3)
simData$p4INT = simData$p4 * (1-simData$p4)

Code for the models

Model 1 - Null Model

This is the simplest model and assumes that all species perform in the same manner and do not interact.

model1 <- glm(Response ~ Richness, data = simData)

Model 2 - Species Identity Effects

Allows a separate identity effect for each species. When comparing models 1 and 2, we can check to see if the estimates of the identity effects for both grasses are below the average and if the estimates for both legumes are above the average. We can test whether these differences are significant by conducting an F test between models 1 and 2.

model2 <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + Richness, data = simData)

Model 3 - Separate Pairwise Interactions

This model includes a separate interaction coefficient for each pair of species. When we have four species there are six separate pairwise interaction terms. When comparing models 2 and 3 will allow us to check if the six interaction terms are needed. This tests whether species interact to produce diversity effects.

model3 <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + p1:p2 + p1:p3 + p1:p4 + p2:p3 + p2:p4 + p3:p4 + Richness, data = simData)

Model 4 - Evenness Model

This includes a single pairwise interaction coefficient. If the separate pairwise interactions do not differ much in magnitude, a single interaction term may adequately describe the diversity effect. By comparing models 3 and 4 will test whether one interaction term is sufficient to describe the diversity effect.

model4 <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + E + Richness, data = simData)

Model 5 - Additive species-specific contributions to interactions

Model 5 includes a separate coefficient for the contribution of each species to interaction. The strength of a pairwise interaction is then estimated as the sum of the interaction strength for each of the interacting species.This model is a more parsimonious description of the pattern of pairwise interaction than model 3. In this example there are 4 interaction terms to estimate compared with 6 for model 3. A comparison of these models with an F test will indicate whether the more parsimonious description is sufficient to describe the interaction pattern.

model5 <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + p1INT + p2INT + p3INT + p4INT + Richness, data = simData)

Model 6 - Functional Group Effects

This model estimates interactions between and within functional groups. In this experiment, we have two functional groups, grasses and legumes. A comparison of this model with model 3 tests whether patterns in the interspecific interactions relate to functional groupings.

model6 <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + Grass + Grass:Legume + Legume + Richness, data = simData)
model6_grass <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + Grass + Grass:p3 + Grass:p4 + p3:p4 + Richness, data = simData)

Model 7 - Functional Redundancy Model

This Model fits the same species identity coefficient and interaction coefficients for species of the same functional group. A comparison of this model with model 3 tests whether both the identity effects and interaction coefficients for the two grasses are equal using a F-test.

model7_grass <- glm(Response ~ 0 + Grass + p3 + p4 + Grass:p3 + Grass:p4 + p3:p4 + Richness, data = simData)
model7_legume <- glm(Response ~ 0 + p1 + p2 + Legume + p1:p2 + p1:Legume + p2:Legume + Richness, data = simData)

Model 8 -Functional Redundancy Model, crossed with treatment factor

This model extends from model 7 and tt gives separate estimates of identity and interaction effects for each N level.

model8_grass <- glm(Response ~ 0 + Grass + p3 + p4 + Grass:p3 + Grass:p4 + p3:p4 + Richness +
                      Grass:Treatment + p3:Treatment + p4:Treatment + Grass:p3:Treatment + Grass:p4:Treatment + p3:p4:Treatment + Richness:Treatment, 
                    data = simData)
model8 <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + p1:p2 + p1:p3 + p1:p4 + p2:p3 + p2:p4 + p3:p4 + Richness +
                p1:Treatment + p2:Treatment + p3:Treatment + p4:Treatment + p1:p2:Treatment + p1:p3:Treatment + p1:p4:Treatment + p2:p3:Treatment + p2:p4:Treatment + p3:p4:Treatment + Richness:Treatment, 
              data = simData)

model8_final <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + p1:p2 + p1:p3 + p1:p4 + p2:p3 + p2:p4 +
                      p1:Treatment + p2:Treatment + p4:Treatment + p1:p4:Treatment + p2:p3:Treatment + p3:p4:Treatment + Richness:Treatment, 
                    data = simData)

Hypothesis testing

The below table shows the outcome of the hypothesis testing conducted using the biological assumptions discussed in Fig3 above.

Model selection

Note: N=0 when Treatment=‘A’ & N=1 when Treatment=‘B’

The AIC function calculates Akaike’s ‘An Information Criterion’ for one or several fitted model objects for which a log-likelihood value can be obtained. In R the syntax is for Model 1 for example would be:

AIC(model1)

The below table discusses the results obtained from model selection:

Further the visual representation is shown below:

We therefore determined Model 8 without insignificant parameters that were highlighted in the testing was our best model.

model8_final <- glm(Response ~ 0 + p1 + p2 + p3 + p4 + p1:p2 + p1:p3 + p1:p4 + p2:p3 + p2:p4 +
                      p1:Treatment + p2:Treatment + p4:Treatment + p1:p4:Treatment + p2:p3:Treatment + p3:p4:Treatment + Richness:Treatment, 
                    data = simData)

Results

The best model was a full pairwise model crossed with Treatment factor.
Grass p2 and legume p4 had higher yields in monocultures compared to grass p1 and legume p3.
Treatment B had a significantly higher yield compared to Treatment A.
The within group interaction between the two legumes wasn’t significant for Treatment A.
The Richness variable wasn’t significant on its own, but its interaction with Treatment was!

Things don’t have to be this hard

The above steps are useful but fortunately there is a package that can make life so much easier. The DImodels package is designed to make fitting Diversity-Interactions models simpler. A worked example using this package can be found here:

https://cran.r-project.org/web/packages/DImodels/vignettes/DImodels_Vignette.html

References

Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.