Multiple_Toad_Regression

GLMs with multiple covariates

We will be exploring how to include multiple covariates in a GLM using a dataset from the following published paper: https://doi.org/10.1002/2688-8319.70062

These data represent what factors predict whether wildlife detection dogs could successfully find endangered natterjack toads (pictured above).

We will begin by reading in the data:

toad_search<-read.csv("dog_evaluation.csv")

Here is an explanation of the covariates:

Dog - dog used in test: dog 1, 2, 3 or 4 (for more information see table 1 in the manuscript)
TrainingDay – indicates how many days of training the dog has had in total
Temperature - temperature in °C measured with a TFA Dostmann thermometer. Temperature data of the fieldwork obtained from the meteorological station in Pegau, using daily average temperatures
Insect presence - quantity of flying insects present: none, few, or many, referring to no, single, or clouds of individuals, respectively
Leash – indicates whether the dog was without, on a short or long leash during the test
SearchArea - size of the area searched by the dog in m2
Blindness – referring to the blindness of the test. Not blind: the handler and the assistant knew the position of the natterjack toad in the test; single blind: only the assistant but not the handler knew the position of the natterjack toad in the test; double blind: neither the assistant nor the handler knew the position of the natterjack toad in the test
FoundFirstAttempt – the toad was found by the dog at the first attempt, i.e. after covering the search area once
DogBehaviour - cooperative: dog obviously showed targeted searching with an upright posture and without panting during the search; distracted: dog did not clearly show targeted searching and also showed other behaviour (this could include listening to noises, observing people/dogs or obviously sniffing other traces such as those of other dogs or game); fatigued: dog showed targeted searching but without an upright posture and / or with panting during the search as well as a slower movement; unresponsive: dog would stop searching at all (Grimm-Seyfarth, 2022)

Our response variable will be “FoundFirstAttempt.” This is a binary variable, so we will use a binomial regression to model the data.

As always, we want to explore the data a bit before running models.

mean(toad_search$FoundFirstAttempt)

[1] 0.7236842

Looks like there was a 72% success rate overall. good dogs!

I also like to explore correlations between our predictor variables (at least those that are numeric).

numeric_covs<-toad_search[,c("TrainingDay","Temperature","SearchArea")]
cor(numeric_covs)

            TrainingDay Temperature SearchArea
TrainingDay  1.00000000  0.02326628  0.3867033
Temperature  0.02326628  1.00000000  0.1679018
SearchArea   0.38670332  0.16790175  1.0000000

We will discuss more about this later in class, but if two covariates are very closely related to each other (for example, correlation>90%), it can be difficult to estimate an effect of each covariate. That doesn’t seem to be the case here.

We will also look at some plots of raw data. If I was doing this for real, I would spend a lot of time exploring the data visually. But for this activity, we’ll just look at an example of one plot.

plot(jitter(toad_search$FoundFirstAttempt)~toad_search$TrainingDay,pch=19,col="brown3")

Note because these are binary data, I used jitter to spread the points out a bit. There doesn’t seem to be a strong relationship here. Again, if you were doing this for real, you would spend a lot of time exploring all the relationships, looking for obvious patterns (and using plots to double check if there is anything weird with the data).

To improve model fit and interpretation, we will standardize the covariates by centering around the mean and dividing by two standard deviations. See: https://sites.stat.columbia.edu/gelman/research/published/standardizing7.pdf for an explanation.

I usually just use a function to do the rescaling. But you could also use a function in a package, like: https://search.r-project.org/CRAN/refmans/arm/html/rescale.html

rescale=function(x) {return((x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE)))}

numeric_covs_std<-apply(numeric_covs,2,rescale)

head(numeric_covs_std)

     TrainingDay Temperature SearchArea
[1,]  -0.7436893   0.1253719 -0.6081917
[2,]  -0.7436893   0.1253719 -0.6081917
[3,]  -0.7436893   0.1253719 -0.5469936
[4,]  -0.4816974  -0.7227606 -0.4857954
[5,]  -0.4816974  -0.7227606 -0.3633990
[6,]  -0.4816974  -0.7227606 -0.2410027

colnames(numeric_covs_std)<-paste(colnames(numeric_covs_std),"std",sep="")

toadly_ready<-data.frame(numeric_covs_std,toad_search)

In the code above, I am using an apply function to apply the rescale function to each of the columns (that is the number 2 in the call to apply). The output is each of the numeric variables, standardized and centered around the mean.

Finally, I am renaming the rescaled covariates using paste and putting them back in the original dataframe.

Now we will run a model:

library("brms")

Loading required package: Rcpp

Loading 'brms' package (version 2.23.0). Useful instructions
can be found by typing help('brms'). A more detailed introduction
to the package is available through vignette('brms_overview').


Attaching package: 'brms'

The following object is masked from 'package:stats':

    ar

toads<-brm(FoundFirstAttempt~TrainingDaystd+Temperaturestd+Leash+DogBehaviour,family="bernoulli",data=toadly_ready)

Compiling Stan program...

WARNING: Rtools is required to build R packages, but is not currently installed.

Please download and install the appropriate version of Rtools for 4.5.2 from
https://cran.r-project.org/bin/windows/Rtools/.

Trying to compile a simple C file

Running "C:/PROGRA~1/R/R-45~1.2/bin/x64/Rcmd.exe" SHLIB foo.c
using C compiler: 'gcc.exe (GCC) 14.3.0'
gcc  -I"C:/PROGRA~1/R/R-45~1.2/include" -DNDEBUG   -I"C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/Rcpp/include/"  -I"C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/RcppEigen/include/"  -I"C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/RcppEigen/include/unsupported"  -I"C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/BH/include" -I"C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/StanHeaders/include/src/"  -I"C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/StanHeaders/include/"  -I"C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/RcppParallel/include/" -DRCPP_PARALLEL_USE_TBB=1 -I"C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -D_HAS_AUTO_PTR_ETC=0  -include "C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/StanHeaders/include/stan/math/prim/fun/Eigen.hpp"  -std=c++1y    -I"C:/rtools45/x86_64-w64-mingw32.static.posix/include"      -O2 -Wall -std=gnu2x  -mfpmath=sse -msse2 -mstackrealign   -c foo.c -o foo.o
cc1.exe: warning: command-line option '-std=c++14' is valid for C++/ObjC++ but not for C
In file included from C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/Core:19,
                 from C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/Dense:1,
                 from C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22,
                 from <command-line>:
C:/Users/trevorcaughlin/AppData/Local/R/win-library/4.5/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: cmath: No such file or directory
  679 | #include <cmath>
      |          ^~~~~~~
compilation terminated.
make: *** [C:/PROGRA~1/R/R-45~1.2/etc/x64/Makeconf:289: foo.o] Error 1

WARNING: Rtools is required to build R packages, but is not currently installed.

Please download and install the appropriate version of Rtools for 4.5.2 from
https://cran.r-project.org/bin/windows/Rtools/.

Start sampling


SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 5.5e-05 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.55 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.208 seconds (Warm-up)
Chain 1:                0.26 seconds (Sampling)
Chain 1:                0.468 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 8e-06 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 0.217 seconds (Warm-up)
Chain 2:                0.21 seconds (Sampling)
Chain 2:                0.427 seconds (Total)
Chain 2: 

SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 3.4e-05 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.34 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 3: 
Chain 3:  Elapsed Time: 0.215 seconds (Warm-up)
Chain 3:                0.23 seconds (Sampling)
Chain 3:                0.445 seconds (Total)
Chain 3: 

SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 2.3e-05 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 0.22 seconds (Warm-up)
Chain 4:                0.186 seconds (Sampling)
Chain 4:                0.406 seconds (Total)
Chain 4:

Note I am specifying family="bernoulli" as the special case of the binomial distribution where Ntrials=1 (since toads are either found or not found). You could also say family="binomial" but it is a bit less efficient.

We can explore the output of the model using:

summary(toads)

 Family: bernoulli 
  Links: mu = logit 
Formula: FoundFirstAttempt ~ TrainingDaystd + Temperaturestd + Leash + DogBehaviour 
   Data: toadly_ready (Number of observations: 228) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept                  2.46      0.65     1.29     3.79 1.00     2057
TrainingDaystd             0.43      0.40    -0.34     1.23 1.00     2761
Temperaturestd            -0.15      0.37    -0.89     0.55 1.00     3651
Leashnoleash              -0.61      0.63    -1.88     0.57 1.00     2148
Leashshortleash           -1.59      0.76    -3.08    -0.12 1.00     2070
DogBehaviourdistracted    -3.62      0.69    -5.06    -2.36 1.00     3913
DogBehaviourfatigued      -1.04      0.41    -1.87    -0.24 1.00     3258
                       Tail_ESS
Intercept                  2244
TrainingDaystd             2906
Temperaturestd             2986
Leashnoleash               2352
Leashshortleash            2581
DogBehaviourdistracted     2786
DogBehaviourfatigued       2893

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

mcmc_plot(toads)

Remember, with the effects here, each slope is interpretable as the effect when all other covariates are at zero. Because of our rescaling, this means other covariates are at zero when they are at their mean value.

It is also useful to use conditional_effects(...) to explore each covariate. I wouldn’t use these default graphs for a paper (they are kind of ugly) but they’re a great first step to explore results.

We can immediately see in the coefficient plot which covariates have a strong effect and which have a weak effect. Note that effect strength is how far away from zero each covariate is. This is separate from uncertainty–how wide the posterior distribution for each covariate is. You can have a strong effect with high uncertainty and weak effects with low uncertainty.

We can use all the tricks we have been developing to extract information about the effects of each covariate.

toad_draws<-data.frame(toads)
length(which(toad_draws$b_TrainingDaystd>0))/length(toad_draws$b_TrainingDaystd)

[1] 0.85575

Let’s say that we wanted to evaluate whether days trained has an impact on toad search success. This seems useful to explore with a model, because spending days training dogs is probably a lot of work.

From the code chunk above, the probability that training days has a positive effect on finding toads is 83%.

min_traindays=plogis(toad_draws$b_Intercept+toad_draws$b_TrainingDaystd*min(toadly_ready$TrainingDaystd))

max_traindays=plogis(toad_draws$b_Intercept+toad_draws$b_TrainingDaystd*max(toadly_ready$TrainingDaystd))

effect_traindays<-max_traindays-min_traindays

hist(effect_traindays)

quantile(effect_traindays,c(0.05,0.5,0.95))

         5%         50%         95% 
-0.03598798  0.04186844  0.13014668

We can also look at an effect size by subtracting the predicted effect at the maximum number of training days from the minimum number. Looks like there’s about a 4% increase in finding toads when going from the minimum training days to the maximum training days. But the 90% CI are wide (-0.04 to 0.13%). Remember that the interpretation of this effect is for a dog that is average in every other way (and at the baseline levels of the categorical covariates).

Looking at the coefficient plot, it looks like dog behavior has the strongest effect overall. Let’s repeat this activity for that covariate.

We’ll start with a reminder of the levels for this factor. Looks like “cooperative” is our alphabetically first level, which matches the coefficient plot. Cooperative dogs are the intercept here, and everything else is being compared to them. Let’s look at how a distracted dog compares to a cooperative dog.

unique(toadly_ready$DogBehaviour)

[1] "cooperative" "distracted"  "fatigued"

length(which(toad_draws$b_DogBehaviourdistracted<0))/length(toad_draws$b_DogBehaviourdistracted)

[1] 1

100% of the posterior draws are negative. So we would write something like the following in a paper, “There is a >99% probability that distracted dogs have lower success in finding toads.”

We can also look at effect size:

cooperative_dog=plogis(toad_draws$b_Intercept)

distracted_dog=plogis(toad_draws$b_Intercept+toad_draws$b_DogBehaviourdistracted)

effect_distraction<-cooperative_dog-distracted_dog

hist(effect_distraction)

quantile(effect_distraction,c(0.05,0.5,0.95))

       5%       50%       95% 
0.3860801 0.6593081 0.8173268

This is a much stronger effect. For a dog that is average in every other way, being distracted decreases the probability of finding a dog by a median of 66%. There is a 90% probability that this effect is a decrease between 39% and 82% .

Finally, we can use Bayes R2 to evaluate model fit.

bayes_R2(toads)

    Estimate  Est.Error      Q2.5    Q97.5
R2 0.2546617 0.03434762 0.1789316 0.313297

R2 is about 26%. Not bad for a behavior study!