A script detailing the use of information-theoretic model averaging as a model selection method.

# Dependencies ------------------------------------------------------------
library(lme4)
library(MuMIn)
#Model Data
dat <- read.csv('./bat_data.csv')

The data represents the number of acoustic bat passes that occurred in separate 50x50m grid squares across a given area. In the same grid squares, the percentage cover of several habitat variables were also calculated. From this we can construct a simple distribution model that examines bat-habitat relationships.

detections date agr_cover bldngs_cover grass_cover water_cover wood_cover
0 2014_7 -0.85 -0.62 -0.81 -0.07 -0.37
0 2014_7 -0.85 1.33 0.60 -0.07 -0.37
0 2011_8 -0.85 2.99 -0.52 -0.07 -0.37
1 2015_7 -0.85 1.43 0.97 -0.07 -0.37
0 2011_7 0.60 0.03 -0.39 -0.07 -0.37
0 2013_8 -0.76 -0.61 2.77 -0.07 -0.37
2 2015_8 -0.85 1.10 -0.62 -0.07 -0.37
0 2014_8 -0.85 -0.36 -0.46 -0.07 2.80
0 2013_7 -0.85 -0.62 -0.98 -0.07 5.43
0 2015_8 -0.85 2.01 0.23 -0.07 -0.37

Data were collected in each grid square across a five year period. To account for this nested sampling design we will construct a generalized linear mixed model that includes a temporal random effect (the survey date).

#First run a global model including all habitat variables
f1 <- detections ~ agr_cover + bldngs_cover + grass_cover + water_cover + wood_cover + (1|date)
m1 <- glmer(f1, dat, family = poisson, na.action = na.fail)
summary(m1)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: 
## detections ~ agr_cover + bldngs_cover + grass_cover + water_cover +  
##     wood_cover + (1 | date)
##    Data: dat
## 
##      AIC      BIC   logLik deviance df.resid 
##   3739.4   3789.9  -1862.7   3725.4     9993 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.7601 -0.2219 -0.2086 -0.1906 21.6675 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  date   (Intercept) 0.006712 0.08192 
## Number of obs: 10000, groups:  date, 10
## 
## Fixed effects:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -3.13718    0.05544  -56.58  < 2e-16 ***
## agr_cover     0.09746    0.08095    1.20  0.22858    
## bldngs_cover -0.12870    0.07109   -1.81  0.07023 .  
## grass_cover   0.03055    0.06981    0.44  0.66171    
## water_cover   0.09399    0.01552    6.06 1.39e-09 ***
## wood_cover    0.16417    0.05225    3.14  0.00168 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) agr_cv bldng_ grss_c wtr_cv
## agr_cover   -0.052                            
## bldngs_covr  0.085  0.576                     
## grass_cover -0.031  0.728  0.393              
## water_cover -0.102  0.360  0.263  0.328       
## wood_cover  -0.103  0.665  0.478  0.577  0.255

The global model indicates that the percentage cover of water and woodland have a significant positive effect on the number of bat passes recorded. The percentage cover of buildings has a significant negative effect.

A possible model selection method is to reduce the complexity of the global model and simply select the single model with the lowest AIC.

#Begin stepwise reduction of the global model
#Create alternative formulas
f2 <- detections ~ agr_cover + bldngs_cover + water_cover + wood_cover + (1|date)
f3 <- detections ~ agr_cover + grass_cover + water_cover + wood_cover + (1|date)
f4 <- detections ~ bldngs_cover + grass_cover + water_cover + wood_cover + (1|date)

#fit alternative models
m2 <- glmer(f2, dat, family = poisson)
m3 <- glmer(f3, dat, family = poisson)
m4 <- glmer(f4, dat, family = poisson)

#compare the AIC of the four different models
AIC(m1,m2,m3,m4)
##    df      AIC
## m1  7 3739.430
## m2  6 3737.621
## m3  6 3740.867
## m4  6 3738.912

None of the four models that have been fit have a noticeably lower AIC. As such, stepwise reduction may not be an appropriate model selection method as choosing one of the model over another could introduce uncertainty and lead to some important predictors being discarded.

To overcome this, an information-theoretic approach can be adopted to construct an average model where the parameter estimate for each predictor variable is calculated as an average across a set of models that are deemed informative (based on a pre-specified criterion).

To begin with we need to compile a comprehensive set of models in which each possible combination of the predictors specified in the global model are represented. This can be done using the MuMIn package.

#Use 'dredge' to generate comprehensive model set from the global model (m1)
m1 <- glmer(f1, dat, family = poisson, na.action = na.fail)
dr1 <- dredge(m1)

Models are then ranked in terms of AIC values and the ∆AIC value (the difference between a given model and the model with the lowest AIC) is then also calculated for each model.

(Intercept) agr_cover bldngs_cover grass_cover water_cover wood_cover df logLik AICc delta weight
27 -3.14 NA -0.18 NA 0.09 0.13 5 -1863.65 3737.30 0.00 0.33
28 -3.14 0.07 -0.14 NA 0.09 0.15 6 -1862.81 3737.63 0.33 0.28
31 -3.14 NA -0.18 -0.03 0.09 0.12 6 -1863.46 3738.92 1.62 0.14
32 -3.14 0.10 -0.13 0.03 0.09 0.16 7 -1862.71 3739.44 2.14 0.11
26 -3.13 0.13 NA NA 0.10 0.18 5 -1865.31 3740.62 3.32 0.06
30 -3.13 0.19 NA 0.09 0.10 0.21 6 -1864.43 3740.88 3.57 0.05
11 -3.13 NA -0.21 NA 0.09 NA 4 -1868.46 3744.92 7.61 0.01
15 -3.13 NA -0.20 -0.05 0.09 NA 5 -1867.84 3745.69 8.38 0.00
25 -3.12 NA NA NA 0.09 0.15 4 -1868.85 3745.71 8.41 0.00
12 -3.13 0.00 -0.21 NA 0.09 NA 5 -1868.46 3746.92 9.62 0.00
16 -3.13 -0.05 -0.22 -0.08 0.08 NA 6 -1867.49 3746.99 9.69 0.00
29 -3.12 NA NA -0.03 0.09 0.15 5 -1868.62 3747.24 9.94 0.00
19 -3.13 NA -0.19 NA NA 0.13 4 -1873.94 3755.88 18.58 0.00
10 -3.11 0.07 NA NA 0.10 NA 4 -1874.44 3756.89 19.58 0.00
23 -3.13 NA -0.19 -0.04 NA 0.12 5 -1873.55 3757.10 19.80 0.00
9 -3.11 NA NA NA 0.09 NA 3 -1875.61 3757.23 19.92 0.00
20 -3.13 0.04 -0.17 NA NA 0.14 5 -1873.62 3757.24 19.94 0.00
13 -3.11 NA NA -0.06 0.09 NA 4 -1874.76 3757.52 20.22 0.00
14 -3.11 0.05 NA -0.03 0.09 NA 5 -1874.30 3758.60 21.30 0.00
24 -3.13 0.02 -0.18 -0.03 NA 0.13 6 -1873.51 3759.04 21.74 0.00
18 -3.12 0.11 NA NA NA 0.18 4 -1877.20 3762.40 25.10 0.00
8 -3.12 -0.10 -0.25 -0.12 NA NA 5 -1876.74 3763.48 26.18 0.00
3 -3.12 NA -0.22 NA NA NA 3 -1878.94 3763.88 26.58 0.00
22 -3.12 0.14 NA 0.04 NA 0.20 5 -1876.99 3763.99 26.69 0.00
7 -3.12 NA -0.21 -0.06 NA NA 4 -1877.99 3763.99 26.69 0.00
4 -3.12 -0.02 -0.23 NA NA NA 4 -1878.82 3765.64 28.34 0.00
17 -3.11 NA NA NA NA 0.15 3 -1879.90 3765.81 28.50 0.00
21 -3.11 NA NA -0.05 NA 0.15 4 -1879.43 3766.85 29.55 0.00
5 -3.10 NA NA -0.08 NA NA 3 -1885.74 3777.49 40.19 0.00
1 -3.10 NA NA NA NA NA 2 -1887.03 3778.06 40.75 0.00
2 -3.10 0.06 NA NA NA NA 3 -1886.33 3778.65 41.35 0.00
6 -3.10 0.02 NA -0.07 NA NA 4 -1885.68 3779.37 42.07 0.00

According to Burnham et al. 2011, any model with a ∆AIC of >7 is deemed uninformative so can be discarded. Average parameter estimates can then be calculated across the remaining informative models.

#Subset list of all models to get informative models (∆AIC < 7)
inf_mod <- subset(dr1, delta < 7)
(Intercept) agr_cover bldngs_cover grass_cover water_cover wood_cover df logLik AICc delta weight
27 -3.14 NA -0.18 NA 0.09 0.13 5 -1863.65 3737.30 0.00 0.33
28 -3.14 0.07 -0.14 NA 0.09 0.15 6 -1862.81 3737.63 0.33 0.28
31 -3.14 NA -0.18 -0.03 0.09 0.12 6 -1863.46 3738.92 1.62 0.15
32 -3.14 0.10 -0.13 0.03 0.09 0.16 7 -1862.71 3739.44 2.14 0.11
26 -3.13 0.13 NA NA 0.10 0.18 5 -1865.31 3740.62 3.32 0.06
30 -3.13 0.19 NA 0.09 0.10 0.21 6 -1864.43 3740.88 3.57 0.06
#Calculate average parameter estimate across all informative models
avg_mod <- model.avg(inf_mod)

Average coefficients can be calculated in two ways:

  1. Conditional average: parameter estimates are only calculated over models in which the given parameter appears.

  2. Full average: parameter estimates are calculated across all informative models and where a given parameter is missing a value of zero is substituted.

As we aim to determine which habitat types are the most important predictors of bat abundance, we want to calculate the full average instead of the conditional average (Burnham & Anderson 2002; Nakagawa & Freckleton 2011)

An information-theoretic approach is also particularly relevant for this dataset given that none of the informative models showed a noticeably high AIC weight (< 0.34) and all covariates were included at least once in the set of informative models (Ferguson-Gow et al. 2014).

#Extract coefficients
coef <- coefficients(avg_mod, full = TRUE)
##  (Intercept) bldngs_cover  water_cover   wood_cover    agr_cover 
## -3.135572586 -0.140864217  0.091127923  0.146107210  0.050543851 
##  grass_cover 
##  0.003885834

The calculation of average models means that it becomes difficult to invoke the usual p-values to test the significance of a particular variable. Instead the 95% confidence interval can be calculated for each predictor. Any predictor variable whose confidence interval includes zero can then be deemed uninformative whereas those whose confidence interval does not include zero can be said to have a noticeable effect on the response variable (loosely equivalent to significance using p-values tested at the 5% significance level) (Grueber et al. 2011).

#Get confidence 95% confidence intervals of avg_mod
ci1 <- confint(avg_mod, full=TRUE)
##                    2.5 %      97.5 %
## (Intercept)  -3.24417035 -3.02697482
## bldngs_cover -0.29952677  0.01779834
## water_cover   0.06140874  0.12084711
## wood_cover    0.05076912  0.24144530
## agr_cover    -0.08998938  0.19107709
## grass_cover  -0.07875849  0.08653016

The relative importance of each predictor variable can then be calculated as the sum of the AIC weights of each informative model that a given predictor is included in.

#Calculate importance (cumulative AIC) weights of each variable
weights <- avg_mod$importance
##                      water_cover wood_cover bldngs_cover agr_cover
## Importance:          1.00        1.00       0.88         0.52     
## N containing models:    6           6          4            4     
##                      grass_cover
## Importance:          0.32       
## N containing models:    3

The combined outputs of the average model therefore suggest:

References

Burnham, K.P. & Anderson, D.R. (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer Science & Business Media

Burnham, K.P., Anderson, D.R. & Huyvaert, K.P. (2011) AIC model selection and multimodel inference in behavioral ecology: Some background, observations, and comparisons. Behavioral Ecology and Sociobiology, 65, 23–35.

Ferguson-Gow, H., Sumner, S., Bourke, Andrew F. G. & Jones, K.E. (2014) Colony size predicts division of labour in attine ants, 281, 20141411.

Grueber, C.E., Nakagawa, S., Laws, R.J., Jamieson, I.G., 2011. Multimodel inference in ecology and evolution: Challenges and solutions. J. Evol. Biol. 24, 699–711.

Nakagawa, S. & Freckleton, R.P. (2011) Model averaging, missing data and multiple imputation: A case study for behavioural ecology. Behavioral Ecology and Sociobiology, 65, 103–116.