Signal detection theory in ecological biology
Hanley et al. (2017)
NOTE: in progress.
Hanley et al. (2017) investigated the effect of egg colour on egg rejection from nests of 2 bird species, using 2 different spectra of egg colours. The authors placed 1 model egg (each of different colours) in 82 blackbird and 52 robin nests. Egg colours were uniquely positioned either along a gradient of blue-green to brown colour variation representative of natural avian eggshell colours or along an alternative orthogonal gradient varying from green to purple. Aim is to compare how rejection varies between the two spectra of colours.
# Preprocess data
hanley <- read_csv("../data/Hanley_2017.csv") %>%
clean_names() %>%
select(-x1) %>%
mutate(across(response, recode,
eject = 1,
accept = 0)) # Missing data
count(hanley, is.na(response))# A tibble: 2 × 2
`is.na(response)` n
<lgl> <int>
1 FALSE 101
2 TRUE 33
# Preview data
glimpse(hanley)Rows: 134
Columns: 11
$ species <chr> "blackbird", "blackbird", "blackbird", "blackbird", "blackb…
$ u <dbl> 0.04361443, 0.03912587, 0.04067170, 0.03920246, 0.03757227,…
$ s <dbl> 0.2948303, 0.2183655, 0.2039089, 0.3400397, 0.2091810, 0.23…
$ m <dbl> 0.4008509, 0.3687523, 0.3317988, 0.4073794, 0.3711084, 0.36…
$ l <dbl> 0.2607044, 0.3737564, 0.4236206, 0.2133784, 0.3821383, 0.35…
$ lum <dbl> 2203.486, 1633.383, 2032.755, 2702.872, 1784.996, 1872.785,…
$ response <dbl> 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1,…
$ flush <chr> "N", "Y", "N", "Y", "N", "Y", "Y", "Y", "N", "Y", "Y", "Y",…
$ clutch <dbl> 4, 5, 4, 3, 5, 4, 3, 4, 4, 4, 4, 4, 5, 4, 4, 5, 5, 5, 6, 5,…
$ layingdate <dbl> 103, 105, 106, 102, 105, 105, 104, 103, 107, 109, 111, 108,…
$ age <dbl> 2, -1, 2, 7, 2, 3, 6, 6, 2, 1, 0, 3, 0, 6, 3, 2, 0, 1, 5, 0…
# Model specifications
model_0 <- glm(response ~ 1, data = hanley, family = binomial(link = "probit"))
model_1 <- glm(response ~ species, data = hanley, family = binomial(link = "probit"))
model_2 <- glm(response ~ species + u, data = hanley, family = binomial(link = "probit"))A model comparison using a \(\chi^2\) test revealed support for the model with all predictors.
models <- anova(model_0, model_1, model_2, test = "Chisq")The results are shown in Figure ??.
Hanley et al. (2019)
Hanley et al. (2019) investigated the effect of both model egg colour and spotting on rejection from chalk-browed mockingbirds nests. The authors manipulated the colour of the model eggs along a spectrum from blue-green to brown and either with or without spots. The mockingbirds’ eggs are spotted. A single egg is then placed in a nest, and recorded if “accepted” or “ejected.”
# Data processing
hanley <- read_csv("../data/Hanley_2019.csv") %>%
clean_names() %>%
mutate(across(spot, recode_factor, "y" = 0,
"n" = 1),
across(response, recode, "accept" = 0,
"eject" = 1),
prop_cowbird_eggs =
cowbird_eggs / (cowbird_eggs + mockingbird_eggs),
prop_cowbird_eggs =
ifelse(mockingbird_eggs == 0, 1, prop_cowbird_eggs),
prop_cowbird_eggs =
ifelse(cowbird_eggs == 0, 0, prop_cowbird_eggs))# Preview data
glimpse(hanley)Rows: 70
Columns: 7
$ response <dbl> 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1,…
$ chromatic_contrast_jnd <dbl> 3.576661, 4.757981, 4.161816, 1.817076, 1.89454…
$ directional_colour_jnd <dbl> 3.576661, -4.757981, 4.161816, 1.817076, 1.8945…
$ spot <fct> 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0,…
$ mockingbird_eggs <dbl> 3, 3, 3, 2, 2, 2, 1, 2, 2, 1, 2, 3, 2, 3, 2, 2,…
$ cowbird_eggs <dbl> 2, 0, 0, 0, 1, 0, 2, 2, 0, 2, 1, 1, 1, 0, 0, 0,…
$ prop_cowbird_eggs <dbl> 0.4000000, 0.0000000, 0.0000000, 0.0000000, 0.3…
# Model specifications
model_0 <- glm(response ~ 1, data = hanley, family = binomial(link = "probit"))
model_1 <- glm(response ~ spot, data = hanley, family = binomial(link = "probit"))
model_2 <- glm(response ~ spot + prop_cowbird_eggs, data = hanley, family = binomial(link = "probit"))
model_3 <- glm(response ~ spot + directional_colour_jnd + chromatic_contrast_jnd + prop_cowbird_eggs, data = hanley, family = binomial(link = "probit"))
model_4 <- glm(response ~ spot * (directional_colour_jnd + chromatic_contrast_jnd + prop_cowbird_eggs), data = hanley, family = binomial(link = "probit"))A model comparison using a \(\chi^2\) test revealed support for the model with independent predictor effects.
models <- anova(model_0, model_1, model_2, model_3, model_4, test = "Chisq")| Model | Resid. Df | Resid. Dev | Df | Deviance | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|
| 1 | 69 | 96.53 | – | – | – |
| 2 | 68 | 91.71 | 1 | 4.82 | <0.05 |
| 3 | 67 | 86.41 | 1 | 5.3 | <0.05 |
| 4 | 65 | 75.43 | 2 | 10.98 | <0.05 |
| 5 | 62 | 75.08 | 3 | 0.35 | 0.95 |
The results are shown in Figure 1. On the left is a reproduction from Hanley et al. (2019); the results of the SDT model are shown on the right. These results show that unspotted eggs were more likely to be rejected, as well as eggs that are on the brown side of the colour spectrum.
Figure 1: STD results (right). Note: For this inferred visualisation chronomatic contrast was set to the median of the contrast scale and proportion of cowbird eggs was set to 0. The model including all interactions was used for visualisation to allow the model slopes for directional colour to vary by whether the egg was spotted or not.
Figure 2 shows that this effect is roughly similar regardless of how many eggs were either cowbird or mockingbird eggs.
Figure 2: STD results relative to the proportion of cowbird eggs in the nest.
Noh et al. (2018)
Noh et al. (2018) investigated factors of how hosts recognise parsiting chicks. The authors located 54 nests, some parasitised by cuckoos and some unparasitised. They then monitored ejection of chicks after manipulating some nests with (1) cross-fostered cuckoo eggs resulting in some naturally parasitised nests and some artificially parasitised nests, (2) delay hatching of cuckoo to ensure cuckoo hatches after the host’s own chicks, (3) trimming the feathers of some chicks.
# Preprocess data
noh <- read_csv("../data/Noh_2018.csv") %>%
clean_names() %>%
mutate(response = recode(reject, "y" = 1, "n" = 0),
across(species, recode_factor, "h" = 0, "c" = 1),
across(trim, recode_factor, "y" = "trimmed feathers", "n" = "natural feathers"),
across(hatch_first, ~ifelse(hatch_first == "y" & species == "1", "cuckoo", hatch_first)),
across(hatch_first, ~ifelse(hatch_first == "n" & species == "1", "host", hatch_first)),
across(hatch_first, ~ifelse(hatch_first == "y" & species == "0", "host", hatch_first)),
across(hatch_first, ~ifelse(hatch_first == "n" & species == "0", "cuckoo", hatch_first)),
across(c(hatch_first, cuckoo_visit_to_nest, minority), factor),
across(days_it_was_remvoed, replace_na, 0)) %>%
select(-reject, days_to_removal = days_it_was_remvoed) # Preview data
glimpse(noh)Rows: 125
Columns: 10
$ nest_id <chr> "LBG002", "LBG002", "LBG002", "LBG002", "LBG003",…
$ chick_id <chr> "c1", "h1", "h2", "h3", "c2", "c3", "h4", "h5", "…
$ species <fct> 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0…
$ hatch_first <fct> cuckoo, cuckoo, cuckoo, cuckoo, host, host, cucko…
$ trim <fct> natural feathers, natural feathers, natural feath…
$ cuckoo_visit_to_nest <fct> y, y, y, y, y, n, n, n, n, n, n, n, n, n, n, n, n…
$ minority <fct> y, n, n, n, n, y, n, n, n, n, y, n, n, n, n, y, n…
$ days_to_removal <dbl> 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0…
$ hatching_date <dbl> 1, 1, 2, 2, 19, 24, 24, 24, 41, 44, 43, 44, 50, 5…
$ response <dbl> 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0…
# should be 32 rows, some some cells are missing
count(noh, species, hatch_first, trim, cuckoo_visit_to_nest, minority)# A tibble: 22 × 6
species hatch_first trim cuckoo_visit_to_nest minority n
<fct> <fct> <fct> <fct> <fct> <int>
1 0 cuckoo trimmed feathers n n 2
2 0 cuckoo trimmed feathers n y 4
3 0 cuckoo trimmed feathers y n 2
4 0 cuckoo natural feathers n n 28
5 0 cuckoo natural feathers y n 11
6 0 host trimmed feathers n n 6
7 0 host trimmed feathers n y 6
8 0 host trimmed feathers y n 1
9 0 host natural feathers n n 20
10 0 host natural feathers n y 1
# … with 12 more rows
# Model specifications
model_0 <- glmer(response ~ 1 + (1|nest_id), family = binomial("probit"), data = noh)
model_1 <- glmer(response ~ species + (1|nest_id), family = binomial("probit"), data = noh)
model_2 <- glmer(response ~ species + hatch_first + trim + minority + cuckoo_visit_to_nest + (1|nest_id), family = binomial("probit"), data = noh)
model_3 <- glmer(response ~ species * hatch_first + trim + minority + cuckoo_visit_to_nest + (1|nest_id), family = binomial("probit"), data = noh)
model_4 <- glmer(response ~ species * (hatch_first + trim) + minority + cuckoo_visit_to_nest + (1|nest_id), family = binomial("probit"), data = noh)
model_5 <- glmer(response ~ species * (hatch_first + trim + minority) + cuckoo_visit_to_nest + (1|nest_id), family = binomial("probit"), data = noh)
model_6 <- glmer(response ~ species * (hatch_first + trim + minority + cuckoo_visit_to_nest) + (1|nest_id), family = binomial("probit"), data = noh)model_1 <- glmer(response ~ trim * cuckoo_visit_to_nest + (1|nest_id), family = binomial("probit"), data = noh)
summary(model_1)Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( probit )
Formula: response ~ trim * cuckoo_visit_to_nest + (1 | nest_id)
Data: noh
AIC BIC logLik deviance df.resid
110.2 124.3 -50.1 100.2 120
Scaled residuals:
Min 1Q Median 3Q Max
-1.7321 -0.1857 -0.1857 0.5774 5.3852
Random effects:
Groups Name Variance Std.Dev.
nest_id (Intercept) 0 0
Number of obs: 125, groups: nest_id, 54
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.3488 0.2732 1.277 0.2018
trimnatural feathers -2.1827 0.4148 -5.261 1.43e-07
cuckoo_visit_to_nesty 0.3257 0.4789 0.680 0.4964
trimnatural feathers:cuckoo_visit_to_nesty 1.1359 0.6165 1.842 0.0654
(Intercept)
trimnatural feathers ***
cuckoo_visit_to_nesty
trimnatural feathers:cuckoo_visit_to_nesty .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) trmntf cck___
trmntrlfthr -0.659
cck_vst_t_n -0.570 0.376
tfthrs:c___ 0.443 -0.673 -0.777
optimizer (Nelder_Mead) convergence code: 0 (OK)
boundary (singular) fit: see ?isSingular
A model comparison using a \(\chi^2\) test revealed that the model with all independent effects performed the best.
models <- anova(model_0, model_1, model_2, model_3, model_4, model_5, model_6, test = "Chisq")| Model | npar | AIC | BIC | logLik | deviance | \(\chi^2\) | Df | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 154.09 | 159.74 | -75.04 | 150.09 | – | – | – |
| 1 | 5 | 110.2 | 124.34 | -50.1 | 100.2 | 49.89 | 3 | <0.05 |
| 2 | 7 | 88.21 | 108.01 | -37.11 | 74.21 | 25.99 | 2 | <0.05 |
| 3 | 8 | 89.78 | 112.41 | -36.89 | 73.78 | 0.43 | 1 | 0.5113 |
| 4 | 9 | 88.69 | 114.14 | -35.34 | 70.69 | 3.09 | 1 | 0.0786 |
| 5 | 10 | 89.86 | 118.15 | -34.93 | 69.86 | 0.82 | 1 | 0.3641 |
| 6 | 11 | 90.16 | 121.27 | -34.08 | 68.16 | 1.71 | 1 | 0.1914 |
Due to the number of predictors it might be worth investigating only particular factors and their interactions.
# Model specifications
model_2 <- glmer(response ~ species + hatch_first + (1|nest_id), family = binomial("probit"), data = noh)
model_3 <- glmer(response ~ species + hatch_first + trim + (1|nest_id), family = binomial("probit"), data = noh)
model_4 <- glmer(response ~ species + hatch_first + trim + minority + (1|nest_id), family = binomial("probit"), data = noh)
model_5 <- glmer(response ~ species + hatch_first + trim + minority + cuckoo_visit_to_nest + (1|nest_id), family = binomial("probit"), data = noh)Species, trim and hatch first appear to me relevant factors.
models <- anova(model_0, model_1, model_2, model_3, model_4, model_5, test = "Chisq")| Model | npar | AIC | BIC | logLik | deviance | \(\chi^2\) | Df | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 154.09 | 159.74 | -75.04 | 150.09 | – | – | – |
| 2 | 4 | 126.41 | 137.73 | -59.21 | 118.41 | 31.68 | 2 | <0.05 |
| 1 | 5 | 110.2 | 124.34 | -50.1 | 100.2 | 18.21 | 1 | <0.05 |
| 3 | 5 | 86.4 | 100.54 | -38.2 | 76.4 | 23.8 | 0 | – |
| 4 | 6 | 88.15 | 105.12 | -38.08 | 76.15 | 0.25 | 1 | 0.617 |
| 5 | 7 | 88.21 | 108.01 | -37.11 | 74.21 | 1.94 | 1 | 0.164 |
Focusing on a possible interaction of species and whether the chick had trimmed or natural feathers revealed no evidence.
model_2 <- glmer(response ~ species + trim + hatch_first + minority + cuckoo_visit_to_nest + (1|nest_id), family = binomial("probit"), data = noh)
model_3 <- glmer(response ~ species * trim + hatch_first + minority + cuckoo_visit_to_nest + (1|nest_id), family = binomial("probit"), data = noh)models <- anova(model_2, model_3, test = "Chisq")| Model | npar | AIC | BIC | logLik | deviance | \(\chi^2\) | Df | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|---|---|
| 2 | 7 | 88.21 | 108.01 | -37.11 | 74.21 | – | – | – |
| 3 | 8 | 87.09 | 109.72 | -35.54 | 71.09 | 3.12 | 1 | 0.0771 |
The SDT results are summarised in Figure 3 showing the ejection rate in z-scores. The results suggest a higher ejection rate for cuckoo chicks than for host chicks and for chicks with trimmed feathers compared to chicks with natural feathers.
Figure 3: SDT results
Torres & Tsutsui (2016)
Torres & Tsutsui (2016) investigated aggression between enslaved and free-living ants. The authors put 3 ants of a focal colony in an arena, then added 3 intruder ants, and recorded if the focal ant initiate aggressions.
NOTE: I’m not quite sure here if the question is whether the focal or the intruder initiates aggression. We don’t know if the intruder did, as far as I can tell, we only know if the focal ant did.
They manipulated the identity of the focal worker (enslaved or free-living) and the distance of the foreign worker (originating from a nest close or far from the focal nest).
# Data preprocessing
torres <- read_csv("../data/Torres_2016.csv") %>%
clean_names() %>%
select(-treatment) %>%
rename(id_foc = focal,
focal = id_focal,
id_for = foreign,
foreign = idforeign) %>%
mutate(across(c(focal, foreign), recode,
NC = "same colony",
FL = "free living",
En = "slaves"),
across(distance, replace_na, "same"),
across(distance, recode,
"C" = "close",
"F" = "far away"),
across(c(focal, foreign, distance), factor)) # Data preview
glimpse(torres)Rows: 360
Columns: 6
$ id_foc <chr> "N11", "N11", "N11", "N11", "N11", "N11", "N11", "N11", "N11…
$ focal <fct> slaves, slaves, slaves, slaves, slaves, slaves, slaves, slav…
$ id_for <chr> "N12", "N12", "N12", "N12", "N12", "N12", "N12", "N12", "N12…
$ foreign <fct> free living, free living, free living, free living, free liv…
$ distance <fct> close, close, close, close, close, close, close, close, clos…
$ agr_focal <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
# Levels of factors and number of observations
count(torres, focal, foreign, distance)# A tibble: 5 × 4
focal foreign distance n
<fct> <fct> <fct> <int>
1 free living free living close 60
2 free living free living far away 60
3 same colony same colony same 120
4 slaves free living close 60
5 slaves free living far away 60
# Notice that foreign can be fully explained by focal.torres <- mutate(torres, COND = str_c(focal, "_", distance),
across(COND, factor))
# Hypothesis matrix
cmat <- MASS::fractions(matrix(c(
# Main effects
1, 1, 0,-1,-1, # Slaves vs free
1, 1,-2, 1, 1, # Nestmates vs non-nestmates
1,-1, 0, 0, 0, # free living: close vs far
0, 0, 0, 1,-1 # slaves: close vs far
),
nrow=length(levels(torres$COND)),
byrow=F))
rown <- levels(torres$COND)
rownames(cmat) <- rown
coln <- c(" slaves vs free-living", " nm vs nnm",
" free-living: distance", " slaves: distance")
colnames(cmat)<- coln
# transpose and inverse matrix to get contrast between the expected levels
inv.cmat<-MASS::fractions(t(MASS::ginv(cmat)))
rownames(inv.cmat) <- rown
colnames(inv.cmat) <- coln
# Assign contrasts
contrasts(torres$COND) <- inv.cmat
contrasts(torres$COND) slaves vs free-living nm vs nnm free-living: distance
free living_close 1/4 1/8 1/2
free living_far away 1/4 1/8 -1/2
same colony_same 0 -1/4 0
slaves_close -1/4 1/8 0
slaves_far away -1/4 1/8 0
slaves: distance
free living_close 0
free living_far away 0
same colony_same 0
slaves_close 1/2
slaves_far away -1/2
# Add a random 1 to the same colony data.
#torres[torres$focal == "same colony" &
# torres$distance == "same",][sample(120, size = 1),]$agr_focal <- 1model_cc <- glmer(agr_focal ~ COND + (1|id_foc/id_for), family = binomial("probit"), data = torres)Simpler models
model_0 <- glmer(agr_focal ~ 1 + (1|id_foc/id_for), family = binomial("probit"), data = torres)
model_1 <- glmer(agr_focal ~ focal + (1|id_foc/id_for), family = binomial("probit"), data = torres)
model_2 <- glmer(agr_focal ~ focal + distance + (1|id_foc/id_for), family = binomial("probit"), data = torres)
#model_3 <- glmer(agr_focal ~ focal * distance + (1|id_foc/id_for), family = binomial("probit"), data = torres)A model comparison using a \(\chi^2\) test revealed that the model with the focal ant by distance interaction outperforms all other models.
models <- anova(model_0, model_1, model_2, model_cc, test = "Chisq")| Model | npar | AIC | BIC | logLik | deviance | \(\chi^2\) | Df | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 278.83 | 290.49 | -136.42 | 272.83 | – | – | – |
| 1 | 5 | 241.47 | 260.9 | -115.73 | 231.47 | 41.37 | 2 | <0.05 |
| 2 | 6 | 243.23 | 266.55 | -115.62 | 231.23 | 0.23 | 1 | 0.629 |
| c | 7 | 241.33 | 268.53 | -113.66 | 227.33 | 3.91 | 1 | <0.05 |
Note that because of the lack of variability in the nest-mate group, some model estimates were dropped from the model (couldn’t be estimated). Other estimates, in particular the response criteria showed extremely large confidence intervals indicating a failure to render model estimates. We will get back to this below using Bayesian inference.
| Predictor | Estimate | lower | upper | z-score | Pr(>|z|) |
|---|---|---|---|---|---|
| (Intercept) | -2.67 | -505.47 | 500.14 | -0.01 | 0.99 |
| slaves vs free-living | 1.92 | 0.86 | 2.99 | 3.54 | <0.05 |
| nm vs nnm | 14.29 | -4008.16 | 4036.74 | 0.01 | 0.99 |
| free-living: distance | 0.18 | -0.28 | 0.65 | 0.78 | 0.44 |
| slaves: distance | -0.64 | -1.33 | 0.05 | -1.82 | 0.07 |
The SDT results are summarised in Figure 4 illustrating the aggressivity rate in z-scores. Note, for visualisation we removed ants from the same colony (negative control condition) as the lack of variance in the data resulted in large confidence intervals. Other than that we observe a higher aggression rate initiated by free-living ants compared to enslaved ants. Distance of invader colony was relevant for enslaved ants but not for free-living ants. Enslaved ants displayed more aggression towards intruder ants from far away colonies.
Figure 4: SDT model results
The nest-mate group exhibited a floor effect. In this sample, and in many previous samples, there were no observations of aggression towards nest-mates. This makes it difficult to get a good parameter estimate estimate for the response criterion. Also, this doesn’t mean that ants never exhibit aggression towards nest-mates. It is merely very rare and might only show in much larger samples. The glmer rendered extremely large CIs due to the lack of variability in the nest-mate group and related coefficients were dropped from the glmer.
The Bayesian modeling framework provides a solution to the lack of variability in the nest-mate group. We can use a prior on the response criterion (aggression towards nest-mates) that is indicating that the aggression rate is very low. We estimated the cell means shown above from a Bayesian model with a intercept prior (criterion) that is distributed normally with a logit mean of 6 (which is in proportion \(\approx\) .0025) with a standard deviation of 1. Priors on slops were weakly informative; we used a Student-t distribution with a mean of 2, a standard deviation of 1 and 7 degrees of freedom. Details of the code are omitted here. We largely reproduced the results for the cell means.
Figure 5: Bayesian SDT model results
In comparison to the frequentist glmer model above, the Bayesian model successfully returned parameter estimates for response criteria and signal strength.
| Predictor | Estimate | lower | upper |
|---|---|---|---|
| Intercept | -2.54 | -3.85 | -1.61 |
| slaves vs free living | 2.19 | 0.86 | 3.64 |
| nm vs nnm | 9.05 | 2.98 | 22.48 |
| free living:distance | 0.35 | -0.29 | 1.14 |
| slaves:distance | -0.51 | -1.46 | 0.49 |
Saar et al. (2018)
We probably would need the actual aggression index rather than the average.
read_csv("../data/Saar_2018.csv") %>%
clean_names()# A tibble: 63 × 5
species colony_id encounter_type agression_index interaction_time
<chr> <dbl> <chr> <dbl> <dbl>
1 M. arenarius 6 con 0.174 65.8
2 M. arenarius 21 con 0.783 133.
3 M. arenarius 23 con 0 0
4 M. arenarius 24 con 0.466 149.
5 M. arenarius 25 con 0.476 57.4
6 M. arenarius 26 con 0.129 142.
7 M. arenarius 27 con 0.309 53.8
8 M. arenarius 28 con 0.694 167.
9 M. arenarius 31 con 0.746 153.
10 M. arenarius 210 con 0.0732 94.4
# … with 53 more rows
Csata et al. (2017)
Csata et al. (2017) investigated the effect of infection by a parasite on recognition behaviour in both worker and queen ants.
# Preprocess data
cstata <- read_csv("../data/Csata_2017.csv") %>%
clean_names() %>%
mutate(pair_id = as.numeric(factor(str_c(colony_id_1, colony_id_2)))) %>%
select(-starts_with("colony")) %>%
pivot_longer(-c(pair_id, contains("status"), ant_type)) %>%
mutate(attack_by_queen = ifelse(ant_type == "queen" & str_detect(name, "2"), 1, 0),
across(name, stringr::str_sub, start = 1, end = -2)) %>%
pivot_wider(names_from = name,
values_from = value) %>%
unnest() %>%
mutate(across(starts_with("status"), recode_factor,
"Uninfected" = 0,
"Infected" = 1),
across(attack_by_queen, factor),
across(ant_type, recode_factor,
"queen" = 1,
"worker" = 0)) %>%
rename(worker_infected = status1,
ant_2_infected = status2,
ant_2_is_queen = ant_type)We rearranged the data for an analysis in SDT. Importantly, we collapsed the two data sets where the foreign ant (ant 2) can either be a worker or a queen. This factor is included in the analysis.
# Data preview
glimpse(cstata)Rows: 296
Columns: 7
$ worker_infected <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ ant_2_infected <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ ant_2_is_queen <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ pair_id <dbl> 67, 67, 67, 67, 67, 67, 56, 56, 56, 56, 56, 56, 54, 54…
$ attack_by_queen <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ total <dbl> 14, 9, 18, 21, 3, 10, 7, 6, 3, 2, 2, 0, 10, 9, 5, 2, 5…
$ adverse <dbl> 4, 1, 0, 0, 0, 1, 2, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, …
model_0 <- glmer(cbind(adverse, total - adverse) ~ 1 + (1|pair_id), family = binomial("probit"), data = cstata)
model_1 <- glmer(cbind(adverse, total - adverse) ~ worker_infected + (1|pair_id), family = binomial("probit"), data = cstata)
model_2 <- glmer(cbind(adverse, total - adverse) ~ worker_infected + attack_by_queen + ant_2_infected + ant_2_is_queen + (1|pair_id), family = binomial("probit"), data = cstata)
model_3 <- glmer(cbind(adverse, total - adverse) ~ worker_infected * ( attack_by_queen + ant_2_infected + ant_2_is_queen) + (1|pair_id), family = binomial("probit"), data = cstata)
model_4 <- glmer(cbind(adverse, total - adverse) ~ (worker_infected + attack_by_queen + ant_2_infected + ant_2_is_queen)^2 + (1|pair_id), family = binomial("probit"), data = cstata)
model_5 <- glmer(cbind(adverse, total - adverse) ~ worker_infected * attack_by_queen * ant_2_infected * ant_2_is_queen + (1|pair_id), family = binomial("probit"), data = cstata)A model comparison using a \(\chi^2\) test revealed that the model with all interactions outperforms all other models.
models <- anova(model_0, model_1, model_2, model_3, model_4, model_5, test = "Chisq")| Model | npar | AIC | BIC | logLik | deviance | \(\chi^2\) | Df | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 1005.21 | 1012.59 | -500.6 | 1001.21 | – | – | – |
| 1 | 3 | 988.13 | 999.2 | -491.07 | 982.13 | 19.07 | 1 | <0.05 |
| 2 | 6 | 873.43 | 895.57 | -430.71 | 861.43 | 120.71 | 3 | <0.05 |
| 3 | 9 | 875.14 | 908.35 | -428.57 | 857.14 | 4.29 | 3 | 0.232 |
| 4 | 11 | 877.25 | 917.85 | -427.63 | 855.25 | 1.89 | 2 | 0.389 |
| 5 | 13 | 873.81 | 921.79 | -423.91 | 847.81 | 7.44 | 2 | <0.05 |
The SDT results are summarised in Figure 6 illustrating the aggressivity rate in z-scores. The results show that in particular uninfected ants are aggressive towards uninfected foreign queens. Infected host ants are no more aggressive towards infected than uninfected foreign queens. Generally there is little aggressive towards foreign workers, and host ants display more aggression than foreign queens but not compared to foreign workers.
Figure 6: SDT model results
Beros et al. (2015)
Beros et al. (2015) investigated how infection by a parasite effects aggression in ants. In particular the tested how tapeworm-infected workers affect colony aggression by manipulating their presence in ant colonies and whether their presence resulted in behavioural alterations in their nest-mates.
The authors presented colonies with infected workers, uninfected workers from infected colonies, uninfected workers from uninfected colonies, and uninfected ants from a different species (T. affinis), and recorded colony aggression. They presented intruder ants both before and after colony ‘manipulation,’ where each colony was assigned one of 6 treatments to manipulated colony infection status.
# Data preprocessing
beros <- read_csv("../data/Beros_2015.csv") %>%
clean_names() %>%
mutate(id = as.numeric(factor(unique_id)),
across(treatment, recode, "1" = "no change-infected",
"2" = "no change-uninfected",
"3" = "removal-infected",
"4" = "addition-infected",
"5" = "removal-uninfected",
"6" = "addition-uninfected")) %>%
select(-rel_agg, -unique_id, -opponent) %>%
separate(treatment, into = c("change", "health"), sep = "-") %>%
rename(aggressions = num_br_worker_agg_interactions,
total = total_interactions)# Data preview
glimpse(beros)Rows: 912
Columns: 7
$ colony_id <chr> "A13", "A2", "D86", "C70", "D74", "A37", "D82", "A3", "D85…
$ change <chr> "no change", "no change", "removal", "removal", "addition"…
$ health <chr> "infected", "uninfected", "infected", "uninfected", "infec…
$ pre_post <chr> "pre", "pre", "pre", "pre", "pre", "pre", "pre", "pre", "p…
$ aggressions <dbl> 13, 10, 15, 10, 16, 27, 13, 3, 1, 8, 26, 27, 13, 9, 11, 8,…
$ total <dbl> 15, 24, 58, 32, 27, 30, 25, 26, 47, 18, 29, 45, 34, 14, 18…
$ id <dbl> 20, 32, 184, 124, 156, 52, 168, 48, 180, 24, 84, 104, 160,…
# Model specifications
model_0 <- glmer(cbind(aggressions, total - aggressions) ~ 1 + (1|colony_id/id), family = binomial("probit"), data = beros)
model_1 <- glmer(cbind(aggressions, total - aggressions) ~ health + (1|colony_id), family = binomial("probit"), data = beros)
model_2 <- glmer(cbind(aggressions, total - aggressions) ~ health + change + pre_post + (1|colony_id/id), family = binomial("probit"), data = beros)
model_3 <- glmer(cbind(aggressions, total - aggressions) ~ (health + change + pre_post)^2 + (1|colony_id/id), family = binomial("probit"), data = beros)
model_4 <- glmer(cbind(aggressions, total - aggressions) ~ health * change * pre_post + (1|colony_id/id), family = binomial("probit"), data = beros)A model comparison using a \(\chi^2\) test revealed that the model with all interactions outperforms all other models.
models <- anova(model_0, model_1, model_2, model_3, model_4, test = "Chisq")| Model | npar | AIC | BIC | logLik | deviance | \(\chi^2\) | Df | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|---|---|
| model_0 | 3 | 7269.8 | 7284.25 | -3631.9 | 7263.8 | – | – | – |
| model_1 | 3 | 9006.51 | 9020.96 | -4500.25 | 9000.51 | 0 | 0 | – |
| model_2 | 7 | 7259.59 | 7293.3 | -3622.8 | 7245.59 | 1754.92 | 4 | <0.05 |
| model_3 | 12 | 7245.59 | 7303.38 | -3610.79 | 7221.59 | 24 | 5 | <0.05 |
| model_4 | 14 | 7207.28 | 7274.7 | -3589.64 | 7179.28 | 42.31 | 2 | <0.05 |
The SDT results are summarised in Figure 7 illustrating the aggressivity rate in z-scores. The results suggest more aggressive behaviours when uninfected ants were added, removed or stayed in the colonies compared to infected ants. Adding infected ants did not change the aggression in the colony but removing infected ants increased the number of aggressions.
Figure 7: SDT model results
DaCosta & Sorenson (2014)
- SC: sympatric conspecific; same species + living in same place / same time.
- AC: allopatric conspecific; same species + living somewhere else.
- SH: sympatric heterospecific; other species + living in same place / same time.
Load data and ensure that SC is first condition of Treatment factor.
costa <- read_csv("../data/DaCosta_2014.csv") %>%
select(-Date,-contains("ranks")) %>%
mutate(Treatment = factor(Treatment, levels = c("SC", "AC", "SH")))Some variables show a large number of zero observations.
# A tibble: 9 × 2
variable prop_of_zeros
<chr> <dbl>
1 Singing 87
2 Absent 76
3 Mimicry 74
4 CloseDur 48
5 Chatter 22
6 Hops 15
7 Flights 7
8 Lclose 0
9 Lresp 0
Variables with high proportion of zeros were transformed to binary outcomes. Lclose had a high proportion of values of 300 msecs (48 %) which corresponds to trial end (5 mins). Therefore Lclose outcomes were treated as binary (300 = 0, 1 otherwise). For Lresp observations with the value 300 (4 %) were replaced with NA. All variables that were not treated as binary were separated into categories (breaks argument below) and treated as ordinal.
breaks <- 10
costa <- costa %>%
mutate(across(Lresp, ~na_if(.,300)), # 300 secs is end of trial
across(Lclose, ~. != 300), # transform to binary variable
across(c(Absent, Mimicry, Singing, CloseDur), ~. == 0), # all binary
across(where(is.numeric), cut, breaks = breaks, labels = seq(breaks)), # categorise scales
across(Lresp:Singing, as.numeric))The treatment condition SC (same home, same species) was coded as baseline condition.
different home (AC) different species (SH)
SC 0 0
AC 1 0
SH 0 1
To keep the code compact helper functions were defined in the background that applied
glm(value ~ Treatment, data = data, family = binomial(link = "probit"))
to binomial outcome variables and
clm(value ~ Treatment, data = data, threshold = "flexible", link = "probit")
to ordinal (categorical) outcomes (clm and clmm2 below are from the ordinal package). A \(\chi^2\)-test was used to compare the model with the treatment predictor to a null model.
| Outcome variable | d’ | SE | Pr(>|z|) | d’ | SE | Pr(>|z|) | LRT | AIC | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|---|---|---|
| Lresp | 0.21 | 0.38 | 0.577 | -1.02 | 0.46 | 0.027 | 8.35 | 124 | 0.015 |
| Lclose | 0.00 | 0.42 | 1.000 | 0.14 | 0.42 | 0.739 | 0.15 | 77 | 0.928 |
| Flights | 0.04 | 0.34 | 0.898 | 0.00 | 0.34 | 0.997 | 0.02 | 241 | 0.989 |
| Hops | -0.05 | 0.36 | 0.887 | -0.78 | 0.38 | 0.039 | 5.26 | 169 | 0.072 |
| CloseDur | 0.00 | 0.42 | 1.000 | -0.14 | 0.42 | 0.739 | 0.15 | 77 | 0.928 |
| Absent | 0.20 | 0.48 | 0.674 | -0.33 | 0.45 | 0.457 | 1.40 | 62 | 0.495 |
| Chatter | 0.24 | 0.36 | 0.511 | 0.90 | 0.36 | 0.013 | 6.80 | 217 | 0.033 |
| Mimicry | 0.00 | 0.47 | 1.000 | -0.33 | 0.45 | 0.457 | 0.75 | 64 | 0.686 |
| Singing | -4.53 | 369.70 | 0.990 | -5.16 | 369.70 | 0.989 | 7.83 | 44 | 0.020 |
Manubay & Powell (2020)
Recruitment rate and raid speed were transformed to categorical ordered responses with breaks number of levels. Baseline levels were set for prey and nest.
breaks <- 10
manubay <- read_csv("../data/Manubay_Powell_2020.csv") %>%
clean_names() %>%
mutate(across(c(raid_speed:recruitment_rate), cut, breaks = breaks, labels = seq(breaks)),
across(c(raid_speed:recruitment_rate), as.numeric),
across(c(raid_speed:recruitment_rate), as.ordered),
treatment = recode_factor(treatment, e = "prey", c = "non prey"),
treatment = relevel(factor(treatment), ref = "prey"),
odor = relevel(factor(tolower(odor)), ref = "nest"),
across(colony_number, factor))Raid speed models:
m_interaction <- clmm2(raid_speed ~ treatment * odor,
random = colony_number,
data = manubay,
link = "probit",
Hess = TRUE)
m_maineffects <- clmm2(raid_speed ~ treatment + odor,
random = colony_number,
data = manubay,
link = "probit",
Hess = TRUE)
m_treatment <- clmm2(raid_speed ~ treatment,
random = colony_number,
data = manubay,
link = "probit",
Hess = TRUE)
m_0 <- clmm2(raid_speed ~ 1,
random = colony_number,
data = manubay,
link = "probit",
Hess = TRUE)raid_speed_models <- anova(m_0, m_treatment,
m_maineffects, m_interaction,
test = "Chisq")For the analysis of the recruitment_rate, the condition non prey will be ignored because
with(manubay, table(recruitment_rate, treatment)) treatment
recruitment_rate prey non prey
1 96 241
2 74 2
3 58 0
4 30 0
5 23 0
6 11 0
7 2 0
8 3 0
10 2 0
Recruitment rate models:
manubay_rr <- filter(manubay, treatment == "prey")
m_odor <- clmm2(recruitment_rate ~ odor,
random = colony_number,
data = manubay_rr,
link = "probit",
Hess = TRUE)
m_0 <- clmm2(recruitment_rate ~ 1,
random = colony_number,
data = manubay_rr,
link = "probit",
Hess = TRUE)recruit_rate_models <- anova(m_0, m_odor, test = "Chisq")| Model | AIC | LRT | Pr(>\(\chi^2\)) |
|---|---|---|---|
| Raid speed | |||
| random effects only | 2144 | – | – |
| treatment | 1724 | 419.64 | <0.05 |
| treatment + odor | 1640 | 84 | <0.05 |
| treatment \(\times\) odor | 1495 | 145.46 | <0.05 |
| Recruitment rate | |||
| random effects only | 1011 | – | – |
| odor | 823 | 187.86 | <0.05 |
| Predictor | d’ | SE | Pr(>|z|) |
|---|---|---|---|
| Raid speed | |||
| non prey | 5.51 | 0.42 | <0.001 |
| alarm | 2.24 | 0.40 | <0.001 |
| dead | 3.26 | 0.40 | <0.001 |
| live | 2.43 | 0.41 | <0.001 |
| non prey\(\times\)alarm | -2.86 | 0.44 | <0.001 |
| non prey\(\times\)dead | -3.36 | 0.44 | <0.001 |
| non prey\(\times\)live | -3.93 | 0.46 | <0.001 |
| Recruitment rate | |||
| alarm | -2.67 | 0.22 | <0.001 |
| dead | -1.66 | 0.18 | <0.001 |
| live | -0.92 | 0.18 | <0.001 |
Corral-López et al. (2017)
Load and preprocess data. The outcome variable choose_left was set depending on the base level of colourful_left: guppy choosing the left guppy was coded choose_left = 1, and 0 for choosing right; the position of the colourful guppy was coded colourful_left = 1, otherwise 0 for right.
corral <- read_csv("../data/Corral-Lopez_2017.csv") %>%
mutate(# if guppy chose left (1) or chose right (0) (nb: assuming guppy tries to chose colourful)
choose_left = as.numeric(timeleft > timeright),
# if, in reality, colourful was on the left (1) or the right (0)
colourful_left = case_when(timeleft == time_attractive ~ 1,
timeleft == time_nonattractive ~ 0),
size_group = case_when(bs == 2 ~ "large",
bs == 1 ~ "med",
bs == 0 ~ "small"),
size_group = factor(size_group, levels = c("small", "med", "large"))) %>%
select(id = femaleid, choose_left, colourful_left, size_group)# Assign contrast
contrasts(corral$size_group) <- contr.sum(3)
colnames(contrasts(corral$size_group)) <- c("small vs med", "med vs large")
contrasts(corral$size_group) small vs med med vs large
small 1 0
med 0 1
large -1 -1
Models:
m_interaction <- glm(choose_left ~ colourful_left*size_group,
data = corral,
family = binomial(link = "probit"))
m_maineffects <- glm(choose_left ~ colourful_left + size_group,
data = corral,
family = binomial(link = "probit"))
m_colourful <- glm(choose_left ~ colourful_left,
data = corral,
family = binomial(link = "probit"))
m_0 <- glm(choose_left ~ 1,
data = corral,
family = binomial(link = "probit"))Compare models:
modelcomp <- anova(m_0, m_colourful, m_maineffects, m_interaction, test = "Chisq")| Model | Residual deviance | Deviance | Pr(>\(\chi^2\)) |
|---|---|---|---|
| Random effects only | 119.07 | – | – |
| Colourful left | 116.04 | 3.03 | 0.082 |
| Colourful left + Size | 115.97 | 0.07 | 0.967 |
| Colourful left \(\times\) Size | 111.4 | 4.58 | 0.101 |
| Predictor | d’ | SE | Pr(>|z|) |
|---|---|---|---|
| colourful_left | 0.69 | 0.32 | 0.029 |
| small vs med | 0.36 | 0.27 | 0.181 |
| med vs large | 0.00 | 0.27 | 0.999 |
| colourful_left\(\times\)small vs med | -0.83 | 0.40 | 0.037 |
| colourful_left\(\times\)med vs large | 0.02 | 0.40 | 0.961 |
Russell et al. (2020)
Data processing:
russell_e2 <- read_csv("../data/Russell_2019_exp_2.csv") %>% mutate(exp = 2)
russell_e3 <- read_csv("../data/Russell_2019_exp_3.csv") %>% mutate(exp = 3)
russell <- bind_rows(russell_e2, russell_e3) %>%
mutate(trial_type = case_when(M_approach == 0 &
F_approach == 0 &
All_correct == 1 &
F_correct == 0 &
M_correct == 1 ~ "hit",
M_approach == 0 &
F_approach == 1 &
All_correct == 1 &
F_correct == 1 &
M_correct == 0 ~ "CR",
M_approach == 1 &
F_approach == 0 &
All_correct == 0 &
F_correct == 0 &
M_correct == 0 ~ "miss",
M_approach == 0 &
F_approach == 0 &
All_correct == 0 &
F_correct == 0 &
M_correct == 0 ~ "FA"),
# Recode whether a trial involved a Male or Female flower.
encounter = case_when(trial_type %in% c("hit", "miss") ~ "Male",
trial_type %in% c("CR", "FA") ~ "Female"),
# Recode whether the bee landed (1) or not (0)
response = case_when(trial_type %in% c("hit", "FA") ~ 1,
trial_type %in% c("CR", "miss") ~ 0),
visits_scaled = scales::rescale(NumVisits),
across(c(Treatment, encounter), factor)) %>%
select(exp, bee = BeeID, visits_scaled, encounter, response, treatment = Treatment)# Contrast codes (encounter doesn't need specification)
contrasts(russell$encounter) Male
Female 0
Male 1
# Sum contrast for treatment
contrasts(russell$treatment) <- contr.sum(3)
colnames(contrasts(russell$treatment)) <- c("25 vs 50", "50 vs 75")
contrasts(russell$treatment) 25 vs 50 50 vs 75
25_Male 1 0
50_Male 0 1
75_Male -1 -1
The glmer function of the lme4 package was used to model the binomial outcome variable for each experiment separately. As before model complexity was increased incrementally and compared.
m_int2 <- list()
m_int1 <- list()
m_mes <- list()
m_vis <- list()
m_enc <- list()
m_0 <- list()
model_comp <- list()
for(e in c(2,3)){
idx <- e - 1
tmp <- filter(russell, exp == e)
m_int2[[idx]] <- glmer(response ~ encounter * (visits_scaled + treatment) + (1 | bee),
family=binomial(link="probit"), data = tmp)
m_int1[[idx]] <- glmer(response ~ encounter * visits_scaled + treatment + (1 | bee),
family=binomial(link="probit"), data = tmp)
m_mes[[idx]] <- glmer(response ~ encounter + visits_scaled + treatment + (1 | bee),
family=binomial(link="probit"), data = tmp)
m_vis[[idx]] <- glmer(response ~ encounter + visits_scaled + (1 | bee),
family=binomial(link="probit"), data = tmp)
m_enc[[idx]] <- glmer(response ~ encounter + (1 | bee),
family=binomial(link="probit"), data = tmp)
m_0[[idx]] <- glmer(response ~ 1 + (1 | bee),
family=binomial(link="probit"), data = tmp)
model_comp[[idx]] <- anova(m_int2[[idx]], m_int1[[idx]],
m_mes[[idx]], m_vis[[idx]],
m_enc[[idx]], m_0[[idx]], test = "Chisq")
}| Model | AIC | Deviance | Pr(>\(\chi^2\)) | AIC | Deviance | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|
| Random effects only | 1981.28 | 1977.28 | – | 609.56 | 605.56 | – |
| Encounter | 1820.72 | 1814.72 | <0.05 | 543.61 | 537.61 | <0.05 |
| Encounter + Visits | 1663.3 | 1655.3 | <0.05 | 505.33 | 497.33 | <0.05 |
| Encounter + Visists + Treatment | 1664.38 | 1652.38 | 0.232 | 505.6 | 495.6 | 0.188 |
| Encounter \(\times\) Visits + Treatment | 1658.28 | 1644.28 | <0.05 | 507 | 495 | 0.441 |
| Encounter \(\times\) (Visits + Treatment) | 1659.08 | 1641.08 | 0.202 | 502.32 | 488.32 | <0.05 |
| Predictor | d’ | SE | Pr(>|z|) | d’ | SE | Pr(>|z|) |
|---|---|---|---|---|---|---|
| encounter: Male | 0.62 | 0.16 | < 0.001 | 1.76 | 0.43 | < 0.001 |
| visits_scaled | -1.88 | 0.17 | < 0.001 | -3.21 | 0.6 | < 0.001 |
| treatment: 25 vs 50 | -0.12 | 0.08 | 0.129 | – | – | – |
| treatment: 25 vs 75 | – | – | – | -0.02 | 0.28 | 0.949 |
| treatment: 50 vs 75 | 0.13 | 0.08 | 0.113 | – | – | – |
| encounter: Male\(\times\)visits_scaled | 0.75 | 0.26 | 0.004 | 0.3 | 1.06 | 0.78 |
| encounter: Male\(\times\)treatment: 25 vs 75 | – | – | – | -0.91 | 0.37 | 0.013 |
| encounter: Male\(\times\)treatment: 25 vs 50 | 0.22 | 0.13 | 0.079 | – | – | – |
| encounter: Male\(\times\)treatment: 50 vs 75 | -0.09 | 0.11 | 0.403 | – | – | – |
Hemingway et al. (2019)
# Prep bat data
hemingway_bats <- read_csv("../data/Hemingway_2019_bats.csv") %>%
clean_names() %>%
rename(num_of_speakers = num_spkrs,
stim = stim_w_wc) %>%
mutate(across(c(num_of_speakers, stim), factor),
stim = recode_factor(stim, `0` = "whine only",
`1` = "whine-chuck"),
species = "bats") %>%
select(species, ppt = bat, num_of_speakers, stim, latency)
# Prep frog data
hemingway_frogs <- read_csv("../data/Hemingway_2019_frogs.csv") %>%
clean_names() %>%
rename(stim = stimulus,
latency = total_trial_time_to_choice,
num_of_speakers = num_spkrs) %>%
mutate(across(c(stim, num_of_speakers), factor),
frog_toeclip = as.character(frog_toeclip),
stim = recode_factor(stim, w = "whine only", wc = "whine-chuck"),
species = "frogs") %>%
select(species, ppt = frog_toeclip, num_of_speakers, stim, latency)
hemingway <- bind_rows(hemingway_bats, hemingway_frogs) Models for bat data:
# Get bat data
data <- filter(hemingway, species == "bats") %>%
mutate(lat_bins = ordered(cut(latency, breaks = breaks, labels = seq(breaks))),
ppt = factor(ppt))
m_0 <- clmm2(lat_bins ~ 1,
random = ppt,
data = data,
link = "probit",
Hess = TRUE)
m_stim <- clmm2(lat_bins ~ stim,
random = ppt,
data = data,
link = "probit",
Hess = TRUE)
m_mes <- clmm2(lat_bins ~ stim + num_of_speakers,
random = ppt,
data = data,
link = "probit",
Hess = TRUE)
m_int_bats <- clmm2(lat_bins ~ stim * num_of_speakers,
random = ppt,
data = data,
link = "probit",
Hess = TRUE)model_comparison_bats <- anova(m_0, m_stim, m_mes, m_int_bats, test = "Chisq")Models for frog data:
# Get frog data
data <- filter(hemingway, species == "frogs") %>%
mutate(lat_bins = ordered(cut(latency, breaks = breaks, labels = seq(breaks))),
ppt = factor(ppt))
m_0 <- clm(lat_bins ~ 1,
data = data,
link = "probit",
Hess = TRUE)
m_stim <- clm(lat_bins ~ stim,
data = data,
link = "probit",
Hess = TRUE)
m_mes <- clm(lat_bins ~ stim + num_of_speakers,
data = data,
link = "probit",
Hess = TRUE)
m_int_frogs <- clm(lat_bins ~ stim * num_of_speakers,
data = data,
link = "probit",
Hess = TRUE)model_comparison_frogs <- anova(m_0, m_stim, m_mes, m_int_frogs, test = "Chisq")| Predictors | d’ | SE | Pr(>|z|) | Deviance | LR test | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|
| Bats | ||||||
| Stimulus | -0.62 | 0.24 | <0.05 | 557.81 | 1.98 | 0.159 |
| Stimulus + No. of speakers | -0.02 | 0.23 | 0.915 | 554.09 | 3.72 | 0.054 |
| Stimulus \(\times\) No. of speakers | 0.72 | 0.33 | <0.05 | 549.36 | 4.73 | 0.03 |
| Frogs | ||||||
| Stimulus | -0.41 | 0.36 | 0.256 | 241.79 | 5.07 | 0.024 |
| Stimulus + No. of speakers | 0.82 | 0.35 | <0.05 | 234.68 | 7.11 | 0.008 |
| Stimulus \(\times\) No. of speakers | -0.32 | 0.49 | 0.518 | 234.26 | 0.42 | 0.518 |
Cole & Endler (2018)
- Outcome: relative area of long (
rellw) and short (relsw) wave length - Generation information missing (data not available for all their analysis)?
- I’m not clear about what many of these variables means!!!
- Short vs long wave length could be treated as predictor? What does the outcome mean?
cole <- read_csv("../data/Cole_2018.csv") %>%
clean_names() %>%
rename(female = run_female) %>%
mutate(treatment = recode(treatment, "C" = "control",
"B" = "blue",
"R" = "red"),
treatment = factor(treatment, levels = c("control", "blue", "red")),
across(c(time, rellw, relsw), cut, breaks, seq(breaks))) %>%
pivot_longer(rellw:relsw, names_to = "wl_type", values_to = "wavelength") %>%
mutate(wl_type = recode(wl_type, relsw = "short", rellw = "long"),
across(everything(), factor))Not so sure about this yet
contrasts(cole$wl_type) <- c(1, 0) # long, short
# contrasts(cole$treatment) # control is baseline
# contrasts(cole$rep) # 1 is baseline
link <- "probit" # convergence problems!!!
m_int_3 <- clmm2(wavelength ~ wl_type * treatment * rep,
random = female,
data = cole,
Hess = TRUE,
link = link)
m_int_2 <- clmm2(wavelength ~ (wl_type + treatment + rep)^2,
random = female,
data = cole,
Hess = TRUE,
link = link)
m_mes <- clmm2(wavelength ~ wl_type + treatment + rep,
random = female,
data = cole,
Hess = TRUE,
link = link)
m_2 <- clmm2(wavelength ~ wl_type + treatment,
random = female,
data = cole,
Hess = TRUE,
link = link)
m_1 <- clmm2(wavelength ~ wl_type,
random = female,
data = cole,
Hess = TRUE,
link = link)
m_0 <- clmm2(wavelength ~ 1,
random = female,
data = cole,
Hess = TRUE,
link = link)Model comparisons:
model_comps <- anova(m_int_3, m_int_2, m_mes, m_2, m_1, m_0, test = "Chisq")MASS::dropterm(m_int_3, test = "Chisq")Single term deletions
Model:
location : wavelength ~ wl_type * treatment * rep
Df AIC LRT Pr(Chi)
<none> 3128.7
wl_type:treatment:rep 2 3128.1 3.4323 0.1798
| Predictors | d’ | SE | Pr(>|z|) | Deviance | LR test | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|
| Wave length | -0.52 | 0.19 | 0.007 | 3097.96 | 17.41 | <0.001 |
| Treatment: blue | -0.09 | 0.19 | 0.613 | 3095.29 | 2.68 | 0.262 |
| Treatment: red | -0.02 | 0.18 | 0.905 | – | – | – |
| Rep 2 | -0.25 | 0.19 | 0.189 | 3095.21 | 0.07 | 0.79 |
| Wave length \(\times\) Blue | 0.02 | 0.26 | 0.947 | 3090.08 | 5.13 | 0.4 |
| Wave length \(\times\) Red | 0.31 | 0.26 | 0.232 | – | – | – |
| Wave length \(\times\) Rep | 0.58 | 0.27 | 0.032 | – | – | – |
| Blue \(\times\) Rep 2 | 0.35 | 0.26 | 0.189 | – | – | – |
| Blue \(\times\) Rep 2 | 0.2 | 0.26 | 0.442 | – | – | – |
| Wave length \(\times\) Blue \(\times\) Rep 2 | -0.6 | 0.37 | 0.107 | 3086.65 | 3.43 | 0.18 |
| Wave length \(\times\) Red \(\times\) Rep 2 | -0.6 | 0.37 | 0.106 | – | – | – |
Caspers et al. (2017)
caspers <- read_csv("../data/Caspers_2017.csv") %>%
clean_names() %>%
rename(parent = experiment,
experiment = egg_treatment,
total_duration = duration_begging) %>%
mutate(parent = case_when(parent %in% c(1,3) ~ "mother",
parent %in% c(2,4) ~ "father"),
experiment = recode(experiment,
`0` = "Exp. 1 (no egg treatment)",
`1` = "Exp. 2 (cross-fostered egg)")) %>%
pivot_longer(genetic:non_genetic, names_to = "related", values_to = "duration") %>%
# chick is nested in cage so chick is really chick_id + cage
mutate(chick = as.numeric(factor(paste0(chick_id, cage))),
related = factor(related, levels = c("non_genetic", "genetic"))) %>%
select(experiment, chick, related, parent, duration, total_duration) What do I do with durations of 0 (hence no begging)? That’s just missing data, right? Problem is this would be roughly 50% of the data in experiment 1.
# Calculate the prop of zero observations
caspers %>%
mutate(total_duration = ifelse(total_duration == 0, NA, total_duration),
duration = ifelse(total_duration == 0, NA, duration)) %>%
count(experiment, related, parent, dur = is.na(duration)) %>%
group_by(experiment, related, parent) %>%
mutate(total = sum(n),
prop_zeros = n / total * 100) %>%
mutate(across(prop_zeros, round, 0)) %>%
filter(dur) %>% select(-dur:-total)# A tibble: 8 × 4
# Groups: experiment, related, parent [8]
experiment related parent prop_zeros
<chr> <fct> <chr> <dbl>
1 Exp. 1 (no egg treatment) non_genetic father 43
2 Exp. 1 (no egg treatment) non_genetic mother 37
3 Exp. 1 (no egg treatment) genetic father 43
4 Exp. 1 (no egg treatment) genetic mother 37
5 Exp. 2 (cross-fostered egg) non_genetic father 12
6 Exp. 2 (cross-fostered egg) non_genetic mother 13
7 Exp. 2 (cross-fostered egg) genetic father 12
8 Exp. 2 (cross-fostered egg) genetic mother 13
The total duration is the sum of duration of genetic and duration of non_genetic. So I tried both, the duration and the relative duration (duration / total_duration). The problem with the relative duration is that duration / total_duration is going to render a lot of NaNs (because \(\frac{0}{0}\) is a problem). Removing that many data doesn’t see a good option.
Models for experiment 1 (eggs without treatment).
exp1 <- caspers %>%
filter(experiment == unique(experiment)[1]) %>%
mutate(across(duration, cut, breaks, seq(breaks)),
across(c(duration, chick), factor))
m_int_1 <- clmm2(duration ~ related * parent,
random = chick,
data = exp1,
Hess = TRUE,
link = "probit")
m_mes <- clmm2(duration ~ related + parent,
random = chick,
data = exp1,
Hess = TRUE,
link = "probit")
m_rel <- clmm2(duration ~ related,
random = chick,
data = exp1,
Hess = TRUE,
link = "probit")
m_0 <- clmm2(duration ~ 1,
random = chick,
data = exp1,
Hess = TRUE,
link = "probit")Model comparisons:
model_comp_1 <- anova(m_int_1, m_mes, m_rel, m_0, test = "Chisq")Models for experiment 2 (cross-fostered eggs).
exp2 <- caspers %>%
filter(experiment == unique(experiment)[2]) %>%
mutate(across(duration, cut, breaks, seq(breaks)),
across(c(duration, chick), factor))
m_int_2 <- clmm2(duration ~ related * parent,
random = chick,
data = exp2,
Hess = TRUE,
link = "probit")
m_mes <- clmm2(duration ~ related + parent,
random = chick,
data = exp2,
Hess = TRUE,
link = "probit")
m_rel <- clmm2(duration ~ related,
random = chick,
data = exp2,
Hess = TRUE,
link = "probit")
m_0 <- clmm2(duration ~ 1,
random = chick,
data = exp2,
Hess = TRUE,
link = "probit")Model comparisons:
model_comp_2 <- anova(m_int_2, m_mes, m_rel, m_0, test = "Chisq")| Predictors | d’ | SE | Pr(>|z|) | Deviance | LR test | Pr(>\(\chi^2\)) |
|---|---|---|---|---|---|---|
| Experiment 1 | ||||||
| Gen. related | 1.33 | 0.46 | 0.004 | 354.53 | 23.02 | <0.001 |
| Gen. related + Parent | 1.3 | 0.76 | 0.087 | 351.13 | 3.39 | 0.065 |
| Gen. related \(\times\) Parent | -0.09 | 0.54 | 0.874 | 351.11 | 0.03 | 0.874 |
| Experiment 2 | ||||||
| Gen. related | 0.08 | 0.38 | 0.823 | 240.4 | 4.62 | 0.032 |
| Gen. related + Parent | -0.37 | 0.54 | 0.495 | 240.11 | 0.29 | 0.591 |
| Gen. related \(\times\) Parent | 1.14 | 0.56 | 0.041 | 235.9 | 4.21 | 0.04 |