library(here)
library(tidyverse)
library(glmmTMB)
library(ggeffects)
collembola <- read.csv(here("Week_06", "Data", "collembola_revised_2025.csv"))
#categorical data should be factors
collembola$temp.factor <- as.factor(collembola$temp.factor) #chr -> factor
collembola$predators <- as.factor(collembola$predators) #chr -> factor
1. First use an appropriate ‘standard’ model for counts (i.e., not zero-inflated) to ask whether the abundance of Collembola is affected by temperature, by the presence of one or more predators, and whether the effect of predators depends on temperature. Use the function glmmTMB() to fit the model, because later we will compare this model to others fit with this function. Plot fitted effects and perform likelihood ratio tests on the relevant terms. To perform marginal tests you will want to compare pairs of models using the function anova(). Previously we used the function Anova() (yes, the capitalized and uncapitalized functions are different) to automate this process, but this function returns a less accurate test for glmmTMB (i.e., a Wald test). So, for each term you want to test, compare a model with this term to a model without this term. How do you interpret the results at this stage?
# In the attached dataset we will use the columns total.prey (total Collembola count), temp.factor (the temperature treatment), and predators (the predator treatment).
standardmod = glmmTMB(total.prey ~ temp.factor*predators,
data = collembola,
family = "poisson")
#restricted models
tempmod = glmmTMB(total.prey ~ temp.factor,
data = collembola,
family = "poisson")
predmod = glmmTMB(total.prey ~ predators,
data = collembola,
family = "poisson")
summary(standardmod)
## Family: poisson ( log )
## Formula: total.prey ~ temp.factor * predators
## Data: collembola
##
## AIC BIC logLik -2*log(L) df.resid
## 34028.0 34053.1 -17002.0 34004.0 48
##
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.69440 0.02594 219.51 < 2e-16 ***
## temp.factor17_20 0.61442 0.03220 19.08 < 2e-16 ***
## temp.factor22_25 0.04023 0.03632 1.11 0.268
## predatorsHA+HM 0.63887 0.03207 19.92 < 2e-16 ***
## predatorsHM 0.33291 0.03399 9.79 < 2e-16 ***
## predatorsnone 0.25936 0.03453 7.51 5.84e-14 ***
## temp.factor17_20:predatorsHA+HM -0.91079 0.04324 -21.06 < 2e-16 ***
## temp.factor22_25:predatorsHA+HM -0.63179 0.04813 -13.13 < 2e-16 ***
## temp.factor17_20:predatorsHM -1.08955 0.04791 -22.74 < 2e-16 ***
## temp.factor22_25:predatorsHM -1.67558 0.06540 -25.62 < 2e-16 ***
## temp.factor17_20:predatorsnone 0.69416 0.04119 16.85 < 2e-16 ***
## temp.factor22_25:predatorsnone 0.68510 0.04572 14.99 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#calculate overdispersion for a Poisson model without glmmTMB
pmodel <- glm(total.prey ~ temp.factor * predators, data = collembola, family = "poisson")
dispersion <- sum(residuals(pmodel, type ="pearson")^2)/(length(pmodel$y)-length(pmodel$coefficients))
print(dispersion)
## [1] 641.9313
An effects plot will show us the predicted probabilities of prey with two different categorical predictors.
plot(ggeffect(standardmod, terms = c('temp.factor', 'predators')))
The relevant terms for LRTs on this model include anova().
anova(standardmod, tempmod, test = "LRT")
## Data: collembola
## Models:
## tempmod: total.prey ~ temp.factor, zi=~0, disp=~1
## standardmod: total.prey ~ temp.factor * predators, zi=~0, disp=~1
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## tempmod 3 43569 43575 -21782 43563
## standardmod 12 34028 34053 -17002 34004 9559.1 9 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(standardmod, predmod, test = "LRT")
## Data: collembola
## Models:
## predmod: total.prey ~ predators, zi=~0, disp=~1
## standardmod: total.prey ~ temp.factor * predators, zi=~0, disp=~1
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## predmod 4 39338 39346 -19665 39330
## standardmod 12 34028 34053 -17002 34004 5325.7 8 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Looking at the likelihood ratio tests, the effect of temperature on total prey seems to be an important predictor given that the AIC reduces from 43569 to 34028. Predators also appear to reduce the AIC - we interpret that both predators, temperature, and the interaction between them improves the model.
2. A large proportion of the data are zeros, and it may be the case that processes controlling abundance are different from processes controlling ‘extra’ zeros – if, in fact, there are extra zeros. Use glmmTMB to fit zero-inflated count model(s). You should decide how many to fit, and which kind of probability distribution(s) to use. Use AIC to do model selection, to determine whether incorporating zero inflation improves model fit. Using the best model (i.e., the lowest AIC), perform marginal likelihood ratio tests on the predictors, again using anova() to compare pairs of models. How have the results changed from #1?
#The quasipoisson and negative binomial both ‘account’ for overdispersion, in slightly different ways, and so they should yield results that do not have artificially small p-values. But if you think the extra zeros may be caused by different processes than the rest of the count data, then the extra zeros may be obscuring the patterns you want to quantify.
# If you want to test for zero-inflation, fit a zero-inflated Poisson model. It is possible that the count data is still overdispersed, even after accounting for extra zeros. You can calculate a dispersion parameter using the pearson residuals from the zero-inflated Poisson. If there is overdispersion, fit a zero-inflated negative binomial model
poissmod = glmmTMB(total.prey ~ temp.factor * predators,
data = collembola,
family = "poisson")
poissmod0 = glmmTMB(total.prey ~ temp.factor*predators,
data = collembola,
ziformula = ~ 1, # where we put predictors for the extra zeros. In this case there are no predictors, which means extra zeros occur with some constant probability, to be estimated from the data.
family = "poisson")
nbmod0 = glmmTMB(total.prey ~ temp.factor*predators,
data = collembola,
ziformula = ~ 1, # where we put predictors for the extra zeros. In this case there are no predictors, which means extra zeros occur with some constant probability, to be estimated from the data.
family = "nbinom2")
summary(poissmod)
## Family: poisson ( log )
## Formula: total.prey ~ temp.factor * predators
## Data: collembola
##
## AIC BIC logLik -2*log(L) df.resid
## 34028.0 34053.1 -17002.0 34004.0 48
##
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.69440 0.02594 219.51 < 2e-16 ***
## temp.factor17_20 0.61442 0.03220 19.08 < 2e-16 ***
## temp.factor22_25 0.04023 0.03632 1.11 0.268
## predatorsHA+HM 0.63887 0.03207 19.92 < 2e-16 ***
## predatorsHM 0.33291 0.03399 9.79 < 2e-16 ***
## predatorsnone 0.25936 0.03453 7.51 5.84e-14 ***
## temp.factor17_20:predatorsHA+HM -0.91079 0.04324 -21.06 < 2e-16 ***
## temp.factor22_25:predatorsHA+HM -0.63179 0.04813 -13.13 < 2e-16 ***
## temp.factor17_20:predatorsHM -1.08955 0.04791 -22.74 < 2e-16 ***
## temp.factor22_25:predatorsHM -1.67558 0.06540 -25.62 < 2e-16 ***
## temp.factor17_20:predatorsnone 0.69416 0.04119 16.85 < 2e-16 ***
## temp.factor22_25:predatorsnone 0.68510 0.04572 14.99 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(poissmod0)
## Family: poisson ( log )
## Formula: total.prey ~ temp.factor * predators
## Zero inflation: ~1
## Data: collembola
##
## AIC BIC logLik -2*log(L) df.resid
## 6478.2 6505.4 -3226.1 6452.2 47
##
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.91755 0.02594 228.11 < 2e-16 ***
## temp.factor17_20 0.61442 0.03220 19.08 < 2e-16 ***
## temp.factor22_25 1.42653 0.03632 39.27 < 2e-16 ***
## predatorsHA+HM 0.41573 0.03207 12.97 < 2e-16 ***
## predatorsHM 0.10977 0.03399 3.23 0.001241 **
## predatorsnone 0.03621 0.03453 1.05 0.294255
## temp.factor17_20:predatorsHA+HM -0.68764 0.04324 -15.90 < 2e-16 ***
## temp.factor22_25:predatorsHA+HM -1.50726 0.04813 -31.32 < 2e-16 ***
## temp.factor17_20:predatorsHM -0.17327 0.04791 -3.62 0.000298 ***
## temp.factor22_25:predatorsHM -1.45243 0.06540 -22.21 < 2e-16 ***
## temp.factor17_20:predatorsnone 1.20499 0.04119 29.26 < 2e-16 ***
## temp.factor22_25:predatorsnone 0.21510 0.04572 4.71 2.54e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.6190 0.2707 -2.287 0.0222 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(nbmod0)
## Family: nbinom2 ( log )
## Formula: total.prey ~ temp.factor * predators
## Zero inflation: ~1
## Data: collembola
##
## AIC BIC logLik -2*log(L) df.resid
## 673.6 702.9 -322.8 645.6 46
##
##
## Dispersion parameter for nbinom2 family (): 2.12
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.91754 0.34454 17.175 <2e-16 ***
## temp.factor17_20 0.61443 0.48694 1.262 0.2070
## temp.factor22_25 1.42652 0.76909 1.855 0.0636 .
## predatorsHA+HM 0.41574 0.46205 0.900 0.3682
## predatorsHM 0.10977 0.46219 0.238 0.8123
## predatorsnone 0.03622 0.46223 0.078 0.9375
## temp.factor17_20:predatorsHA+HM -0.68765 0.67112 -1.025 0.3055
## temp.factor22_25:predatorsHA+HM -1.50727 0.91886 -1.640 0.1009
## temp.factor17_20:predatorsHM -0.17328 0.75423 -0.230 0.8183
## temp.factor22_25:predatorsHM -1.45253 1.07754 -1.348 0.1777
## temp.factor17_20:predatorsnone 1.20499 0.69970 1.722 0.0850 .
## temp.factor22_25:predatorsnone 0.21510 0.96061 0.224 0.8228
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.6191 0.2707 -2.287 0.0222 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The AIC for a standard poisson model is 34028.0, a zero-inflated poisson model is 6478.2, while the AIC for a zero-inflated negative binomial model is 673.6. Given the lower AIC from the negative binomial, we will proceed with this model.
nbmodtemp = glmmTMB(total.prey ~ predators,
data = collembola,
ziformula = ~ 1,
family = "nbinom2")
nbmodpred = glmmTMB(total.prey ~ temp.factor,
data = collembola,
ziformula = ~ 1,
family = "nbinom2")
anova(nbmodtemp, nbmod0)
## Data: collembola
## Models:
## nbmodtemp: total.prey ~ predators, zi=~1, disp=~1
## nbmod0: total.prey ~ temp.factor * predators, zi=~1, disp=~1
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## nbmodtemp 6 673.84 686.41 -330.92 661.84
## nbmod0 14 673.56 702.88 -322.78 645.56 16.281 8 0.03853 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(nbmodpred, nbmod0)
## Data: collembola
## Models:
## nbmodpred: total.prey ~ temp.factor, zi=~1, disp=~1
## nbmod0: total.prey ~ temp.factor * predators, zi=~1, disp=~1
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## nbmodpred 5 671.18 681.65 -330.59 661.18
## nbmod0 14 673.56 702.88 -322.78 645.56 15.623 9 0.0752 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Given that the negative binomial with the zero-inflation was the was the best model, we observe that temperature has a significant effect on total prey.
3. If your results look the way mine do it seems like the three treatments with predators (HA, HM, HA+HM) may not be very different in their effect on the prey. Let’s test the effect of predators again, but now treating all treatments with predators as the same. Create a new column that recodes the predator treatment so that it only has two levels: predators or no predators. Fit a model using the new predator predictor instead of the old one, and plot fitted effects and perform likelihood ratio tests as before (you only need to use the ‘best’ probability distribution model determined in #2). How do these results compare to what you saw previously? Why do you think the results have changed? How do you interpret these patterns?
collembola$pred <- ifelse(collembola$predators %in% c("HA", "HA+HM", "HM"), "predators", "no predators")
nbmod0pred = glmmTMB(total.prey ~ temp.factor * pred,
data = collembola,
ziformula = ~ 1,
family = "nbinom2")
summary(nbmod0pred)
## Family: nbinom2 ( log )
## Formula: total.prey ~ temp.factor * pred
## Zero inflation: ~1
## Data: collembola
##
## AIC BIC logLik -2*log(L) df.resid
## 665.4 682.2 -324.7 649.4 52
##
##
## Dispersion parameter for nbinom2 family (): 1.94
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.9538 0.3221 18.486 < 2e-16 ***
## temp.factor17_20 1.8194 0.5253 3.464 0.000533 ***
## temp.factor22_25 1.6416 0.6017 2.728 0.006364 **
## predpredators 0.1672 0.3752 0.446 0.655916
## temp.factor17_20:predpredators -1.5221 0.6039 -2.521 0.011717 *
## temp.factor22_25:predpredators -1.2089 0.7089 -1.705 0.088126 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.6191 0.2707 -2.287 0.0222 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
nbmod0ppred = glmmTMB(total.prey ~ pred,
data = collembola,
ziformula = ~ 1,
family = "nbinom2")
nbmod0tpred = glmmTMB(total.prey ~ temp.factor,
data = collembola,
ziformula = ~ 1,
family = "nbinom2")
anova(nbmod0tpred, nbmod0pred)
## Data: collembola
## Models:
## nbmod0tpred: total.prey ~ temp.factor, zi=~1, disp=~1
## nbmod0pred: total.prey ~ temp.factor * pred, zi=~1, disp=~1
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## nbmod0tpred 5 671.18 681.65 -330.59 661.18
## nbmod0pred 8 665.44 682.20 -324.72 649.44 11.739 3 0.008333 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(nbmod0ppred, nbmod0pred)
## Data: collembola
## Models:
## nbmod0ppred: total.prey ~ pred, zi=~1, disp=~1
## nbmod0pred: total.prey ~ temp.factor * pred, zi=~1, disp=~1
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## nbmod0ppred 4 670.45 678.83 -331.22 662.45
## nbmod0pred 8 665.44 682.20 -324.72 649.44 13.005 4 0.01125 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
When we consider all possible predator treatments together, the effect on total prey is significant. However, when we look at the results based on the type of predator, it does not appear the effect is significant.
4.Finally, we have not considered that ‘extra’ zeros could themselves vary across the experimental treatments. Use an appropriate zero-inflated model, now including the same predictors we have used for the count portion of the model (you can use the simplified predators predictor with two levels). Use likelihood ratio tests to test these terms. The various packages for plotting fitted effects do not (to my knowledge) have a convenient way to plot zero-inflated effects (although this could be done ‘manually’ by extracting model predictions). Instead, look at the model coefficients returned by summary() to interpret what’s going on with the extra zeros. It may be easier to interpret the coefficients if you use a model that removes the non-significant terms. What is the pattern in the extra zeros according to this model? How does this pattern differ from the patterns in the count model?
#zero-inflated model w simplified predators predictor
zimod = glmmTMB(total.prey ~ temp.factor * pred,
data = collembola,
ziformula = ~ temp.factor * pred,
family = "nbinom2")
#zero-inflated simple model
zimodtemp = glmmTMB(total.prey ~ temp.factor * pred,
data = collembola,
ziformula = ~ temp.factor,
family = "nbinom2")
zimodpred = glmmTMB(total.prey ~ temp.factor * pred,
data = collembola,
ziformula = ~ pred,
family = "nbinom2")
summary(zimod)
## Family: nbinom2 ( log )
## Formula: total.prey ~ temp.factor * pred
## Zero inflation: ~temp.factor * pred
## Data: collembola
##
## AIC BIC logLik -2*log(L) df.resid
## 656.7 684.0 -315.4 630.7 47
##
##
## Dispersion parameter for nbinom2 family (): 1.94
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.9538 0.3221 18.485 < 2e-16 ***
## temp.factor17_20 1.8194 0.5253 3.464 0.000533 ***
## temp.factor22_25 1.6416 0.6017 2.728 0.006366 **
## predpredators 0.1671 0.3752 0.445 0.655969
## temp.factor17_20:predpredators -1.5221 0.6039 -2.520 0.011721 *
## temp.factor22_25:predpredators -1.2089 0.7089 -1.705 0.088148 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -19.15 6435.58 -0.003 0.998
## temp.factor17_20 18.74 6435.58 0.003 0.998
## temp.factor22_25 19.55 6435.58 0.003 0.998
## predpredators 16.51 6435.58 0.003 0.998
## temp.factor17_20:predpredators -16.80 6435.58 -0.003 0.998
## temp.factor22_25:predpredators -16.22 6435.58 -0.003 0.998
summary(zimodtemp)
## Family: nbinom2 ( log )
## Formula: total.prey ~ temp.factor * pred
## Zero inflation: ~temp.factor
## Data: collembola
##
## AIC BIC logLik -2*log(L) df.resid
## 651.5 672.4 -315.7 631.5 50
##
##
## Dispersion parameter for nbinom2 family (): 1.94
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.9538 0.3221 18.485 < 2e-16 ***
## temp.factor17_20 1.8194 0.5253 3.464 0.000533 ***
## temp.factor22_25 1.6416 0.6017 2.728 0.006367 **
## predpredators 0.1671 0.3752 0.445 0.655998
## temp.factor17_20:predpredators -1.5221 0.6039 -2.520 0.011723 *
## temp.factor22_25:predpredators -1.2089 0.7089 -1.705 0.088158 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.945 1.026 -2.869 0.00412 **
## temp.factor17_20 2.326 1.128 2.061 0.03929 *
## temp.factor22_25 3.564 1.128 3.158 0.00159 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(zimodpred)
## Family: nbinom2 ( log )
## Formula: total.prey ~ temp.factor * pred
## Zero inflation: ~pred
## Data: collembola
##
## AIC BIC logLik -2*log(L) df.resid
## 667.4 686.3 -324.7 649.4 51
##
##
## Dispersion parameter for nbinom2 family (): 1.94
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.9538 0.3221 18.486 < 2e-16 ***
## temp.factor17_20 1.8194 0.5253 3.464 0.000532 ***
## temp.factor22_25 1.6416 0.6017 2.728 0.006364 **
## predpredators 0.1672 0.3752 0.446 0.655912
## temp.factor17_20:predpredators -1.5221 0.6039 -2.521 0.011717 *
## temp.factor22_25:predpredators -1.2089 0.7089 -1.705 0.088127 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.6931 0.5477 -1.266 0.206
## predpredators 0.0984 0.6301 0.156 0.876
anova(zimodpred, zimod)
## Data: collembola
## Models:
## zimodpred: total.prey ~ temp.factor * pred, zi=~pred, disp=~1
## zimod: total.prey ~ temp.factor * pred, zi=~temp.factor * pred, disp=~1
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## zimodpred 9 667.42 686.27 -324.71 649.42
## zimod 13 656.75 683.98 -315.37 630.75 18.67 4 0.0009122 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(zimodtemp, zimod)
## Data: collembola
## Models:
## zimodtemp: total.prey ~ temp.factor * pred, zi=~temp.factor, disp=~1
## zimod: total.prey ~ temp.factor * pred, zi=~temp.factor * pred, disp=~1
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## zimodtemp 10 651.49 672.43 -315.74 631.49
## zimod 13 656.75 683.98 -315.38 630.75 0.7371 3 0.8644
The pattern in the extra zeros according to this model comes from the temperature. The model coefficients mean the log-odds of observing an extra zero, which is -2.945 for the model with ~temp.factor as the zero inflation. This also means that odds of observing an extra zero is e^-2.945, or 0.05.