PTWL connectivity analysis

Author

Danswell Starrs

Background

This file outlines the analysis of the Pink tailed worm lizard (PTWL) dataset.

This data was provided by Richard Milner. It consists of 3 files:

  • PTWL catch data

  • Ant data

  • Site characteristics data

Research Questions

The following questions are to be answered with the data:

  1. Compare ant number/colonies etc between control sites and treatment sites (islands). Do the islands provide similar number of ants/colonies to known habitat?
  2. Compare ant number/colonies between treatment sites (islands) in former pine forest and outside former pine forest or maybe with control sites. Do ant numbers/colonies differ between islands within former pine plantation and outside former pine plantation?
  3. Look to see if there is a relationship between ant numbers and PTWL detection rates. That is, do the islands with no or low PTWL detections rates have lower numbers of ants compared to islands with high detection rates or controls?
  4. Is there a relationship between time taken for first colonisation and the distance to natural habitat? (PTWL habitat patch data file and see attached email from Will)
  5. Is there relationship between number of PTWL in 2021 per island and distance to known habitat. (PTWL habitat patch data file)
  6. Is there a relationship between number of PTWL in 2021 per island and occurrence of island in or outside former pine plantation? (PTWL habitat patch data file)

Preliminary examination of data

Show the code
dat_PTWL <- read.csv("PTWL_connectivity_capture_data.csv", header = T)
dat_ants <- read.csv("PTWL_connectivity_ant_data.csv", header = T)
dat_patch <- read.csv("PTWL_connectivity_patch_data.csv", header = T)

PTWL dataset

'data.frame':   80 obs. of  25 variables:
 $ Capture_number          : num  1 2 3 4 5 6 7 8 9 10 ...
 $ Unique_capturenumber    : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Site                    : chr  "1" "1" "1" "1" ...
 $ Distance_nearest_habitat: int  30 30 30 30 30 30 30 30 30 30 ...
 $ Former_pine_plantation  : chr  "N" "N" "N" "N" ...
 $ LH_stage                : chr  "Adult" "Adult" "Adult" "Subadult" ...
 $ Rock_size_length_width  : chr  "25/40" "24/40" "20/40" "10/30" ...
 $ Ants_present            : chr  "11-50" "11-50" "11-50" "1-10" ...
 $ Ant_eggs_present        : chr  "Y" "Y" "-" "Y" ...
 $ Ant_larvae_present      : chr  "Y" "Y" "-" "-" ...
 $ Ant_burrow_seen         : chr  "Y" "Y" "Y" "Y" ...
 $ Comments                : chr  "" "Captured with individual GPS 1482" "Skin observed under rock " "" ...
 $ Easting                 : num  684210 684210 684211 684209 684210 ...
 $ Northing                : num  6093430 6093430 6093429 6093428 6093428 ...
 $ GPS_reference           : chr  "1481" "1482" "1483" "1484" ...
 $ X                       : logi  NA NA NA NA NA NA ...
 $ Common_name             : chr  "Pink-tailed Worm-lizard" "Pink-tailed Worm-lizard" "Pink-tailed Worm-lizard" "Pink-tailed Worm-lizard" ...
 $ Species_name            : chr  "Aprasia parapulchella" "Aprasia parapulchella" "Aprasia parapulchella" "Aprasia parapulchella" ...
 $ X.1                     : logi  NA NA NA NA NA NA ...
 $ X.2                     : logi  NA NA NA NA NA NA ...
 $ X.3                     : logi  NA NA NA NA NA NA ...
 $ X.4                     : logi  NA NA NA NA NA NA ...
 $ X.5                     : logi  NA NA NA NA NA NA ...
 $ X.6                     : logi  NA NA NA NA NA NA ...
 $ X.7                     : logi  NA NA NA NA NA NA ...
 [1] "1"         "2"         "3"         "4"         "5"         "7"        
 [7] "8"         "10"        "11"        "Control 3"
[1]  30  55 110 135 185 230 195  65  NA
[1] "N" "Y"
[1] "11-50" "1-10"  "0"     "50+"  
[1] "Y" "-"
[1] "Y" "-"

Ants dataset

'data.frame':   1400 obs. of  7 variables:
 $ Site                  : chr  "1" "1" "1" "1" ...
 $ Former_pine_plantation: chr  "N" "N" "N" "N" ...
 $ Sampled_rock          : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Ants_present          : chr  "(1-10)" "0" "0" "0" ...
 $ Ant_eggs_present      : chr  "N" "N" "N" "N" ...
 $ Ant_larvae_present    : chr  "N" "N" "N" "N" ...
 $ Ant_burrow_seen       : chr  "Y" "N" "N" "N" ...
 [1] "1"         "2"         "3"         "4"         "5"         "6"        
 [7] "7"         "8"         "9"         "10"        "11"        "Control 1"
[13] "Control 2" "Control 3"
[1] "N" "Y" "" 
  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
 [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
 [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
 [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
 [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
 [91]  91  92  93  94  95  96  97  98  99 100
[1] "(1-10)"  "0"       "(11-50)" "(>50)"  
[1] "N" "Y"
[1] "N" "Y"
[1] "Y" "N"

Site characteristics dataset

Show the code
#str(dat_patch)

dat_patch %>% 
  ggplot() + 
  geom_point(aes(x = Patch_number, y = Distance_PTWL_habitat), size = 2) + 
  labs(x = "Patch number", y = "Distance to nearest PTWL habitat (m)") + 
  scale_x_continuous(n.breaks = 11) +
  theme_classic() + 
  theme(axis.title = element_text(size = 14),
        axis.text = element_text(color = "black", size = 12))


This plot shows that patch1 and 11 are the closest to their respective patches of PTWL habitat, whereas patch 6 is the furthest from its nearest patch. Could colour code for prior land use if figure is going into the paper.

Question 1

  1. Compare ant number/colonies etc between control sites and treatment sites (islands). Do the islands provide similar number of ants/colonies to known habitat?
Show the code
a <- dat_ants %>% 
  group_by(Site, Ants_present) %>% 
  summarise(Count = n()) %>% 
  mutate(Ants_present = factor(Ants_present, levels = c("0",
                                                        "(1-10)",
                                                        "(11-50)",
                                                        "(>50)"))) %>% 
  mutate(Site = factor(Site, levels = c("Control 1",
                                        "Control 2",
                                        "Control 3",
                                        "1",
                                        "2",
                                        "3",
                                        "4",
                                        "5",
                                        "6",
                                        "7",
                                        "8",
                                        "9",
                                        "10",
                                        "11"))) %>% 
  ggplot() + 
  geom_col(aes(x = Site, y = Count, group = Ants_present, fill = Ants_present), position = "stack") + 
  labs(x = "Patch Number", y = "% of rocks", fill = "# Ants", tag = "A") + 
  scale_fill_grey() + 
  theme_classic() + 
  theme(axis.title.y = element_text(size = 14),
        axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_text(size = 12, color = "black"))
a

This plot shows percentage of rocks in each patch and the categorical number of ants (either, 0, 1-10, 11-50 or > 50 ants) under the rocks. We see that in patch 5, ~ 75% of rocks had no ants. Conversely, Control 2 had > 50 ants under ~ 35% of rocks.

Show the code
b <- dat_ants %>% 
  filter(Ant_eggs_present != "") %>% 
  group_by(Site, Ant_eggs_present) %>% 
  summarise(Count = n()) %>% 
  mutate(Site = factor(Site, levels = c("Control 1",
                                        "Control 2",
                                        "Control 3",
                                        "1",
                                        "2",
                                        "3",
                                        "4",
                                        "5",
                                        "6",
                                        "7",
                                        "8",
                                        "9",
                                        "10",
                                        "11"))) %>% 
  ggplot() + 
  geom_col(aes(x = Site, y = Count, group = Ant_eggs_present, fill = Ant_eggs_present), position = "stack") + 
  labs(x = "Patch Number", y = "Percentage", fill = "Ant eggs", tag = "B") + 
  scale_fill_grey() + 
  theme_classic() + 
  theme(axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank())
b

This plot shows the percentage of rocks in each plot that had ant eggs present. We see in all patches, greater than 50% of rocks did not have ant eggs present.

Show the code
c <- dat_ants %>% 
  group_by(Site, Ant_larvae_present) %>% 
  summarise(Count = n()) %>% 
  mutate(Site = factor(Site, levels = c("Control 1",
                                        "Control 2",
                                        "Control 3",
                                        "1",
                                        "2",
                                        "3",
                                        "4",
                                        "5",
                                        "6",
                                        "7",
                                        "8",
                                        "9",
                                        "10",
                                        "11"))) %>% 
  ggplot() + 
  geom_col(aes(x = Site, y = Count, group = Ant_larvae_present, fill = Ant_larvae_present), position = "stack") + 
  labs(x = "Patch Number", y = "% of rocks", fill = "Ant larvae", tag = "C") + 
  scale_fill_grey() + 
  theme_classic() + 
  theme(axis.title = element_text(size = 14),
        axis.text.x = element_text(size = 12, color = "black", angle = 45, hjust = 1),
        axis.text.y = element_text(size = 12, color = "black"))
c

This plot shows the percentage of rocks in each patch that had ant larvae present. We see in call cases, greater than 75% of rocks in each patch did not have ant larvae present.

Show the code
d <- dat_ants %>% 
  group_by(Site, Ant_burrow_seen) %>% 
  summarise(Count = n()) %>% 
  mutate(Site = factor(Site, levels = c("Control 1",
                                        "Control 2",
                                        "Control 3",
                                        "1",
                                        "2",
                                        "3",
                                        "4",
                                        "5",
                                        "6",
                                        "7",
                                        "8",
                                        "9",
                                        "10",
                                        "11"))) %>% 
  ggplot() + 
  geom_col(aes(x = Site, y = Count, group = Ant_burrow_seen, fill = Ant_burrow_seen), position = "stack") + 
  labs(x = "Patch Number", y = "Percentage", fill = "Ant burrows", tag = "D") + 
  scale_fill_grey() + 
  theme_classic() + 
  theme(axis.title.y = element_blank(),
        axis.title.x = element_text(size = 14),
        axis.text.x = element_text(size = 12, color = "black", angle = 45, hjust = 1),
        axis.text.y = element_blank())
d

In the case of ant burrows observed, it appears that in plots 6 - 11, there were far fewer ant burrows seen than in other plots.

Show the code
#Some patchwork magic to stick the plots together

Fig_1 <- (a+b)/(c+d)

Fig_1

Statistical test

To analyse the ants present data, I will use the R package ordinal. Why?

The reason is that this data is ranked ordinal data. Eg. the data are categories, but they are also in a logical order. This can be analysed using cumulative link modelling. The R package I like to use is clm. As there are no mixed effects, the specific function is: clm

A key assumption made here is that there is proportional odds between the categories. There is suggestion that violation of the assumption doesn’t greatly alter interpretation: https://www.fharrell.com/post/po/

I construct a model in the form of Ants ~ Patch. Nice and simple. I then perform a Tukey post-hoc test to explore between-patch differences.

Also see: https://rcompanion.org/handbook/G_07.html

Also see: https://stats.oarc.ucla.edu/r/dae/ordinal-logistic-regression/

Ants present

Show the code
#re-shape the data for analysis in ordinal
dat_ants_ordinal <- dat_ants %>% 
  dplyr::select(Site, Former_pine_plantation, Ants_present)

dat_ants_ordinal$Ants_present[dat_ants_ordinal$Ants_present == "(1-10)"] <- "1"
dat_ants_ordinal$Ants_present[dat_ants_ordinal$Ants_present == "(11-50)"] <- "2"
dat_ants_ordinal$Ants_present[dat_ants_ordinal$Ants_present == "(>50)"] <- "3"

unique(dat_ants_ordinal$Ants_present)
[1] "1" "0" "2" "3"
Show the code
dat_ants_ordinal$Ants_present <- as.factor(dat_ants_ordinal$Ants_present)

dat_ants_ordinal$Site <- as.factor(dat_ants_ordinal$Site)
dat_ants_ordinal$Former_pine_plantation <- as.factor(dat_ants_ordinal$Former_pine_plantation)
Show the code
#Null model
Mod0 <- clm(Ants_present~1, data = dat_ants_ordinal)
summary(Mod0)
formula: Ants_present ~ 1
data:    dat_ants_ordinal

 link  threshold nobs logLik   AIC     niter max.grad cond.H 
 logit flexible  1400 -1846.42 3698.84 4(0)  8.50e-09 7.5e+00

Threshold coefficients:
    Estimate Std. Error z value
0|1 -0.41142    0.05459  -7.537
1|2  0.67286    0.05651  11.908
2|3  1.60601    0.07163  22.420
Show the code
Mod1 <- clm(Ants_present ~ Site, data = dat_ants_ordinal)
summary(Mod1)
formula: Ants_present ~ Site
data:    dat_ants_ordinal

 link  threshold nobs logLik   AIC     niter max.grad cond.H 
 logit flexible  1400 -1768.42 3568.84 4(0)  3.26e-09 4.9e+02

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
Site10          0.5206     0.2503   2.080 0.037549 *  
Site11          1.0680     0.2577   4.144 3.41e-05 ***
Site2           0.4169     0.2607   1.599 0.109818    
Site3           0.6275     0.2612   2.403 0.016279 *  
Site4           0.1478     0.2585   0.572 0.567353    
Site5          -1.4244     0.3037  -4.691 2.72e-06 ***
Site6           0.8490     0.2542   3.340 0.000839 ***
Site7           0.7656     0.2709   2.826 0.004715 ** 
Site8           0.5321     0.2585   2.059 0.039513 *  
Site9           0.8712     0.2613   3.333 0.000858 ***
SiteControl 1  -0.4956     0.2753  -1.800 0.071787 .  
SiteControl 2   0.9154     0.2716   3.370 0.000751 ***
SiteControl 3  -0.3556     0.2680  -1.327 0.184534    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
0|1  -0.1176     0.1883  -0.625
1|2   1.0628     0.1907   5.574
2|3   2.0479     0.1970  10.394
Show the code
anova(Mod1, type = "II")
Type II Analysis of Deviance Table with Wald chi-square tests

     Df  Chisq Pr(>Chisq)    
Site 13 135.89  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
#With emmeans
marginal <- emmeans(Mod1, "Site")
  
pairs(marginal,
      adjust="tukey")
 contrast              estimate    SE  df z.ratio p.value
 1 - 10                 -0.5206 0.250 Inf  -2.080  0.7156
 1 - 11                 -1.0680 0.258 Inf  -4.144  0.0027
 1 - 2                  -0.4169 0.261 Inf  -1.599  0.9469
 1 - 3                  -0.6275 0.261 Inf  -2.403  0.4782
 1 - 4                  -0.1478 0.258 Inf  -0.572  1.0000
 1 - 5                   1.4244 0.304 Inf   4.691  0.0002
 1 - 6                  -0.8490 0.254 Inf  -3.340  0.0522
 1 - 7                  -0.7656 0.271 Inf  -2.826  0.2114
 1 - 8                  -0.5321 0.258 Inf  -2.059  0.7298
 1 - 9                  -0.8712 0.261 Inf  -3.333  0.0533
 1 - Control 1           0.4956 0.275 Inf   1.800  0.8765
 1 - Control 2          -0.9154 0.272 Inf  -3.370  0.0474
 1 - Control 3           0.3556 0.268 Inf   1.327  0.9889
 10 - 11                -0.5475 0.242 Inf  -2.262  0.5837
 10 - 2                  0.1037 0.246 Inf   0.422  1.0000
 10 - 3                 -0.1069 0.246 Inf  -0.435  1.0000
 10 - 4                  0.3727 0.244 Inf   1.530  0.9624
 10 - 5                  1.9449 0.292 Inf   6.666  <.0001
 10 - 6                 -0.3284 0.238 Inf  -1.377  0.9845
 10 - 7                 -0.2450 0.256 Inf  -0.956  0.9996
 10 - 8                 -0.0116 0.243 Inf  -0.048  1.0000
 10 - 9                 -0.3506 0.246 Inf  -1.425  0.9792
 10 - Control 1          1.0162 0.262 Inf   3.880  0.0079
 10 - Control 2         -0.3949 0.257 Inf  -1.536  0.9611
 10 - Control 3          0.8762 0.254 Inf   3.449  0.0368
 11 - 2                  0.6511 0.253 Inf   2.574  0.3572
 11 - 3                  0.4405 0.253 Inf   1.742  0.9009
 11 - 4                  0.9202 0.251 Inf   3.664  0.0177
 11 - 5                  2.4924 0.299 Inf   8.344  <.0001
 11 - 6                  0.2190 0.245 Inf   0.892  0.9998
 11 - 7                  0.3024 0.263 Inf   1.151  0.9972
 11 - 8                  0.5359 0.251 Inf   2.139  0.6740
 11 - 9                  0.1968 0.253 Inf   0.779  1.0000
 11 - Control 1          1.5636 0.269 Inf   5.806  <.0001
 11 - Control 2          0.1526 0.263 Inf   0.580  1.0000
 11 - Control 3          1.4237 0.262 Inf   5.444  <.0001
 2 - 3                  -0.2106 0.257 Inf  -0.820  0.9999
 2 - 4                   0.2690 0.254 Inf   1.058  0.9988
 2 - 5                   1.8413 0.301 Inf   6.125  <.0001
 2 - 6                  -0.4321 0.250 Inf  -1.732  0.9050
 2 - 7                  -0.3487 0.267 Inf  -1.308  0.9903
 2 - 8                  -0.1153 0.254 Inf  -0.454  1.0000
 2 - 9                  -0.4543 0.257 Inf  -1.769  0.8901
 2 - Control 1           0.9125 0.272 Inf   3.358  0.0493
 2 - Control 2          -0.4985 0.267 Inf  -1.866  0.8451
 2 - Control 3           0.7725 0.264 Inf   2.923  0.1675
 3 - 4                   0.4796 0.255 Inf   1.883  0.8364
 3 - 5                   2.0518 0.301 Inf   6.809  <.0001
 3 - 6                  -0.2215 0.250 Inf  -0.888  0.9998
 3 - 7                  -0.1381 0.267 Inf  -0.518  1.0000
 3 - 8                   0.0953 0.254 Inf   0.375  1.0000
 3 - 9                  -0.2437 0.257 Inf  -0.949  0.9996
 3 - Control 1           1.1231 0.272 Inf   4.123  0.0030
 3 - Control 2          -0.2880 0.267 Inf  -1.078  0.9985
 3 - Control 3           0.9831 0.265 Inf   3.713  0.0148
 4 - 5                   1.5722 0.298 Inf   5.268  <.0001
 4 - 6                  -0.7012 0.248 Inf  -2.832  0.2085
 4 - 7                  -0.6177 0.265 Inf  -2.333  0.5301
 4 - 8                  -0.3843 0.252 Inf  -1.525  0.9634
 4 - 9                  -0.7233 0.255 Inf  -2.837  0.2058
 4 - Control 1           0.6435 0.269 Inf   2.388  0.4889
 4 - Control 2          -0.7676 0.265 Inf  -2.891  0.1811
 4 - Control 3           0.5035 0.262 Inf   1.922  0.8148
 5 - 6                  -2.2734 0.296 Inf  -7.693  <.0001
 5 - 7                  -2.1900 0.310 Inf  -7.067  <.0001
 5 - 8                  -1.9565 0.299 Inf  -6.549  <.0001
 5 - 9                  -2.2955 0.302 Inf  -7.610  <.0001
 5 - Control 1          -0.9288 0.312 Inf  -2.972  0.1482
 5 - Control 2          -2.3398 0.311 Inf  -7.532  <.0001
 5 - Control 3          -1.0687 0.306 Inf  -3.488  0.0323
 6 - 7                   0.0834 0.260 Inf   0.321  1.0000
 6 - 8                   0.3169 0.247 Inf   1.282  0.9919
 6 - 9                  -0.0222 0.250 Inf  -0.089  1.0000
 6 - Control 1           1.3446 0.266 Inf   5.058  <.0001
 6 - Control 2          -0.0664 0.260 Inf  -0.255  1.0000
 6 - Control 3           1.2047 0.258 Inf   4.669  0.0003
 7 - 8                   0.2335 0.264 Inf   0.883  0.9998
 7 - 9                  -0.1056 0.267 Inf  -0.396  1.0000
 7 - Control 1           1.2612 0.282 Inf   4.476  0.0006
 7 - Control 2          -0.1498 0.276 Inf  -0.542  1.0000
 7 - Control 3           1.1212 0.274 Inf   4.085  0.0035
 8 - 9                  -0.3390 0.254 Inf  -1.333  0.9885
 8 - Control 1           1.0278 0.270 Inf   3.811  0.0103
 8 - Control 2          -0.3833 0.265 Inf  -1.447  0.9763
 8 - Control 3           0.8878 0.262 Inf   3.388  0.0449
 9 - Control 1           1.3668 0.273 Inf   5.012  <.0001
 9 - Control 2          -0.0443 0.267 Inf  -0.166  1.0000
 9 - Control 3           1.2268 0.265 Inf   4.629  0.0003
 Control 1 - Control 2  -1.4110 0.283 Inf  -4.994  0.0001
 Control 1 - Control 3  -0.1400 0.278 Inf  -0.503  1.0000
 Control 2 - Control 3   1.2711 0.275 Inf   4.619  0.0003

P value adjustment: tukey method for comparing a family of 14 estimates 
Show the code
Tukey_A <- multcomp::cld(marginal, Letters = letters)

#nominal_test(Mod1)

#scale_test(Mod1)


####

The statistical test shows that there is a significant effect of patch on ant (see Chisq test result, P < 0.001). The Tukey post-hoc test supports that between all the patches with the exception of Patch #5, there are no meaningful differences. Eg. Patch 1, 2 3, 4, 6, 7, 8, 9, 10, 11 all share the letter ‘d’ indicating no support for them being different (but not neccessarily the same)!

Ant eggs

The ant eggs dataset is a little different to the ant numbers, as this data is binary (presence/absence). This is a more straightforward statistical test. We will use a glm model, with a logit-link = binomial.

The same model will be applied: Ant_eggs ~ Patch.

Show the code
#re-shape the data for analysis in ordinal
dat_ants_eggs_ordinal <- dat_ants %>% 
  dplyr::select(Site, Former_pine_plantation, Ant_eggs_present)

dat_ants_eggs_ordinal$Ant_eggs_present[dat_ants_eggs_ordinal$Ant_eggs_present == "N"] <- "0"
dat_ants_eggs_ordinal$Ant_eggs_present[dat_ants_eggs_ordinal$Ant_eggs_present == "Y"] <- "1"

unique(dat_ants_eggs_ordinal$Ant_eggs_present)
[1] "0" "1"
Show the code
dat_ants_eggs_ordinal$Ant_eggs_present <- as.numeric(dat_ants_eggs_ordinal$Ant_eggs_present)
#dat_ants_eggs_ordinal$Ants_present <- as.factor(dat_ants_eggs_ordinal$Ants_present)

dat_ants_eggs_ordinal$Site <- as.factor(dat_ants_eggs_ordinal$Site)
dat_ants_eggs_ordinal$Former_pine_plantation <- as.factor(dat_ants_eggs_ordinal$Former_pine_plantation)
Show the code
#Null model
Mod0 <- glm(Ant_eggs_present~1, family = binomial, data = dat_ants_eggs_ordinal)
#summary(Mod0)


Mod1 <- glm(Ant_eggs_present~Site-1, family = binomial, data = dat_ants_eggs_ordinal)
summary(Mod1)

Call:
glm(formula = Ant_eggs_present ~ Site - 1, family = binomial, 
    data = dat_ants_eggs_ordinal)

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
Site1          -1.5163     0.2603  -5.826 5.69e-09 ***
Site10         -2.3136     0.3494  -6.621 3.56e-11 ***
Site11         -0.5754     0.2083  -2.762  0.00575 ** 
Site2          -1.7346     0.2801  -6.194 5.87e-10 ***
Site3          -0.4895     0.2060  -2.376  0.01749 *  
Site4          -2.9444     0.4588  -6.417 1.39e-10 ***
Site5          -3.8918     0.7142  -5.449 5.07e-08 ***
Site6          -1.3863     0.2500  -5.545 2.94e-08 ***
Site7          -0.8954     0.2204  -4.063 4.85e-05 ***
Site8          -1.6582     0.2728  -6.079 1.21e-09 ***
Site9          -0.6633     0.2111  -3.142  0.00168 ** 
SiteControl 1  -1.0986     0.2309  -4.757 1.96e-06 ***
SiteControl 2  -0.6190     0.2097  -2.953  0.00315 ** 
SiteControl 3  -2.3136     0.3494  -6.621 3.56e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1940.8  on 1400  degrees of freedom
Residual deviance: 1301.3  on 1386  degrees of freedom
AIC: 1329.3

Number of Fisher Scoring iterations: 6
Show the code
anova(Mod1)
Analysis of Deviance Table

Model: binomial, link: logit

Response: Ant_eggs_present

Terms added sequentially (first to last)

     Df Deviance Resid. Df Resid. Dev
NULL                  1400     1940.8
Site 14   639.56      1386     1301.2
Show the code
#With emmeans
marginal <- emmeans(Mod1, "Site")
  
pairs(marginal,
      adjust="tukey")
 contrast              estimate    SE  df z.ratio p.value
 1 - 10                  0.7973 0.436 Inf   1.830  0.8629
 1 - 11                 -0.9410 0.333 Inf  -2.822  0.2130
 1 - 2                   0.2183 0.382 Inf   0.571  1.0000
 1 - 3                  -1.0268 0.332 Inf  -3.093  0.1076
 1 - 4                   1.4281 0.528 Inf   2.707  0.2744
 1 - 5                   2.3755 0.760 Inf   3.125  0.0986
 1 - 6                  -0.1301 0.361 Inf  -0.360  1.0000
 1 - 7                  -0.6210 0.341 Inf  -1.821  0.8672
 1 - 8                   0.1419 0.377 Inf   0.376  1.0000
 1 - 9                  -0.8531 0.335 Inf  -2.545  0.3764
 1 - Control 1          -0.4177 0.348 Inf  -1.200  0.9957
 1 - Control 2          -0.8973 0.334 Inf  -2.685  0.2875
 1 - Control 3           0.7973 0.436 Inf   1.830  0.8629
 10 - 11                -1.7383 0.407 Inf  -4.273  0.0016
 10 - 2                 -0.5790 0.448 Inf  -1.293  0.9913
 10 - 3                 -1.8241 0.406 Inf  -4.497  0.0006
 10 - 4                  0.6308 0.577 Inf   1.094  0.9983
 10 - 5                  1.5782 0.795 Inf   1.985  0.7776
 10 - 6                 -0.9273 0.430 Inf  -2.158  0.6601
 10 - 7                 -1.4183 0.413 Inf  -3.433  0.0388
 10 - 8                 -0.6554 0.443 Inf  -1.479  0.9716
 10 - 9                 -1.6503 0.408 Inf  -4.043  0.0042
 10 - Control 1         -1.2150 0.419 Inf  -2.901  0.1770
 10 - Control 2         -1.6946 0.407 Inf  -4.159  0.0026
 10 - Control 3          0.0000 0.494 Inf   0.000  1.0000
 11 - 2                  1.1592 0.349 Inf   3.321  0.0553
 11 - 3                 -0.0858 0.293 Inf  -0.293  1.0000
 11 - 4                  2.3691 0.504 Inf   4.701  0.0002
 11 - 5                  3.3165 0.744 Inf   4.458  0.0007
 11 - 6                  0.8109 0.325 Inf   2.492  0.4135
 11 - 7                  0.3200 0.303 Inf   1.055  0.9988
 11 - 8                  1.0829 0.343 Inf   3.155  0.0906
 11 - 9                  0.0879 0.297 Inf   0.296  1.0000
 11 - Control 1          0.5232 0.311 Inf   1.682  0.9225
 11 - Control 2          0.0437 0.296 Inf   0.148  1.0000
 11 - Control 3          1.7383 0.407 Inf   4.273  0.0016
 2 - 3                  -1.2451 0.348 Inf  -3.581  0.0236
 2 - 4                   1.2098 0.538 Inf   2.251  0.5919
 2 - 5                   2.1572 0.767 Inf   2.812  0.2182
 2 - 6                  -0.3483 0.375 Inf  -0.928  0.9997
 2 - 7                  -0.8392 0.356 Inf  -2.355  0.5137
 2 - 8                  -0.0764 0.391 Inf  -0.195  1.0000
 2 - 9                  -1.0713 0.351 Inf  -3.055  0.1194
 2 - Control 1          -0.6360 0.363 Inf  -1.752  0.8970
 2 - Control 2          -1.1156 0.350 Inf  -3.189  0.0822
 2 - Control 3           0.5790 0.448 Inf   1.293  0.9913
 3 - 4                   2.4549 0.503 Inf   4.881  0.0001
 3 - 5                   3.4023 0.743 Inf   4.577  0.0004
 3 - 6                   0.8967 0.324 Inf   2.768  0.2407
 3 - 7                   0.4058 0.302 Inf   1.345  0.9875
 3 - 8                   1.1687 0.342 Inf   3.419  0.0406
 3 - 9                   0.1737 0.295 Inf   0.589  1.0000
 3 - Control 1           0.6091 0.309 Inf   1.968  0.7879
 3 - Control 2           0.1295 0.294 Inf   0.441  1.0000
 3 - Control 3           1.8241 0.406 Inf   4.497  0.0006
 4 - 5                   0.9474 0.849 Inf   1.116  0.9979
 4 - 6                  -1.5581 0.523 Inf  -2.982  0.1445
 4 - 7                  -2.0491 0.509 Inf  -4.026  0.0045
 4 - 8                  -1.2862 0.534 Inf  -2.410  0.4730
 4 - 9                  -2.2811 0.505 Inf  -4.517  0.0005
 4 - Control 1          -1.8458 0.514 Inf  -3.593  0.0226
 4 - Control 2          -2.3254 0.504 Inf  -4.610  0.0003
 4 - Control 3          -0.6308 0.577 Inf  -1.094  0.9983
 5 - 6                  -2.5055 0.757 Inf  -3.311  0.0571
 5 - 7                  -2.9964 0.747 Inf  -4.009  0.0048
 5 - 8                  -2.2336 0.765 Inf  -2.921  0.1683
 5 - 9                  -3.2285 0.745 Inf  -4.335  0.0012
 5 - Control 1          -2.7932 0.751 Inf  -3.721  0.0144
 5 - Control 2          -3.2728 0.744 Inf  -4.397  0.0009
 5 - Control 3          -1.5782 0.795 Inf  -1.985  0.7776
 6 - 7                  -0.4909 0.333 Inf  -1.473  0.9724
 6 - 8                   0.2719 0.370 Inf   0.735  1.0000
 6 - 9                  -0.7230 0.327 Inf  -2.210  0.6225
 6 - Control 1          -0.2877 0.340 Inf  -0.845  0.9999
 6 - Control 2          -0.7673 0.326 Inf  -2.352  0.5162
 6 - Control 3           0.9273 0.430 Inf   2.158  0.6601
 7 - 8                   0.7628 0.351 Inf   2.175  0.6477
 7 - 9                  -0.2321 0.305 Inf  -0.761  1.0000
 7 - Control 1           0.2032 0.319 Inf   0.637  1.0000
 7 - Control 2          -0.2763 0.304 Inf  -0.909  0.9998
 7 - Control 3           1.4183 0.413 Inf   3.433  0.0388
 8 - 9                  -0.9949 0.345 Inf  -2.885  0.1841
 8 - Control 1          -0.5596 0.357 Inf  -1.566  0.9548
 8 - Control 2          -1.0392 0.344 Inf  -3.021  0.1308
 8 - Control 3           0.6554 0.443 Inf   1.479  0.9716
 9 - Control 1           0.4353 0.313 Inf   1.391  0.9831
 9 - Control 2          -0.0443 0.298 Inf  -0.149  1.0000
 9 - Control 3           1.6503 0.408 Inf   4.043  0.0042
 Control 1 - Control 2  -0.4796 0.312 Inf  -1.538  0.9609
 Control 1 - Control 3   1.2150 0.419 Inf   2.901  0.1770
 Control 2 - Control 3   1.6946 0.407 Inf   4.159  0.0026

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 14 estimates 
Show the code
Tukey_B <- multcomp::cld(marginal, Letters = letters)

##Model checking with DHARMa
#simulationOutput <- simulateResiduals(fittedModel = Mod1, n = 1000, seed = 123)
#plot(simulationOutput)
#all fine

Calculate R2 - formula: 1 - (Residual Deviance/Null Deviance)

0.3295033

The Tukey post-hoc tests shows some more differences between the patches, with Patches #3 and #5 being completely unique, which is not surprising, as these patches show the highest and lowest proportion of ant eggs present.

Ant larvae

This analysis on ant larvae is pretty much identical to the ant eggs analysis above. This data is also binary (presence/absence). This is a more straightforward statistical test. We will use a glm model, with a logit-link.

The same model will be applied: Ant_larvae ~ Patch.

Show the code
#re-shape the data for analysis in ordinal
dat_ants_larvae_ordinal <- dat_ants %>% 
  dplyr::select(Site, Former_pine_plantation, Ant_larvae_present)

dat_ants_larvae_ordinal$Ant_larvae_present[dat_ants_larvae_ordinal$Ant_larvae_present == "N"] <- "0"
dat_ants_larvae_ordinal$Ant_larvae_present[dat_ants_larvae_ordinal$Ant_larvae_present == "Y"] <- "1"

unique(dat_ants_larvae_ordinal$Ant_larvae_present)
[1] "0" "1"
Show the code
dat_ants_larvae_ordinal$Ant_larvae_present <- as.numeric(dat_ants_larvae_ordinal$Ant_larvae_present)


dat_ants_larvae_ordinal$Site <- as.factor(dat_ants_larvae_ordinal$Site)
dat_ants_larvae_ordinal$Former_pine_plantation <- as.factor(dat_ants_larvae_ordinal$Former_pine_plantation)
Show the code
#Null model
Mod0 <- glm(Ant_larvae_present~1, family = binomial, data = dat_ants_larvae_ordinal)
#summary(Mod0)


Mod1 <- glm(Ant_larvae_present~Site-1, family = binomial, data = dat_ants_larvae_ordinal)
summary(Mod1)

Call:
glm(formula = Ant_larvae_present ~ Site - 1, family = binomial, 
    data = dat_ants_larvae_ordinal)

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
Site1          -1.9924     0.3077  -6.475 9.50e-11 ***
Site10         -2.7515     0.4211  -6.535 6.38e-11 ***
Site11         -1.3249     0.2455  -5.397 6.79e-08 ***
Site2          -2.1972     0.3333  -6.592 4.35e-11 ***
Site3          -1.9924     0.3077  -6.475 9.50e-11 ***
Site4          -3.1781     0.5103  -6.228 4.73e-10 ***
Site5         -17.5661   395.6180  -0.044    0.965    
Site6          -2.3136     0.3494  -6.621 3.56e-11 ***
Site7          -1.4500     0.2549  -5.688 1.28e-08 ***
Site8          -2.9444     0.4588  -6.417 1.39e-10 ***
Site9          -1.9924     0.3077  -6.475 9.50e-11 ***
SiteControl 1  -2.4423     0.3686  -6.626 3.45e-11 ***
SiteControl 2  -1.5163     0.2603  -5.826 5.69e-09 ***
SiteControl 3  -1.9924     0.3077  -6.475 9.50e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1940.81  on 1400  degrees of freedom
Residual deviance:  887.82  on 1386  degrees of freedom
AIC: 915.82

Number of Fisher Scoring iterations: 16
Show the code
anova(Mod1)
Analysis of Deviance Table

Model: binomial, link: logit

Response: Ant_larvae_present

Terms added sequentially (first to last)

     Df Deviance Resid. Df Resid. Dev
NULL                  1400    1940.81
Site 14     1053      1386     887.82
Show the code
#With emmeans
marginal <- emmeans(Mod1, "Site")
  
pairs(marginal,
      adjust="tukey")
 contrast              estimate      SE  df z.ratio p.value
 1 - 10                  0.7591   0.521 Inf   1.456  0.9751
 1 - 11                 -0.6675   0.394 Inf  -1.696  0.9180
 1 - 2                   0.2048   0.454 Inf   0.451  1.0000
 1 - 3                   0.0000   0.435 Inf   0.000  1.0000
 1 - 4                   1.1856   0.596 Inf   1.990  0.7747
 1 - 5                  15.5736 395.618 Inf   0.039  1.0000
 1 - 6                   0.3212   0.466 Inf   0.690  1.0000
 1 - 7                  -0.5424   0.400 Inf  -1.357  0.9864
 1 - 8                   0.9520   0.552 Inf   1.723  0.9082
 1 - 9                   0.0000   0.435 Inf   0.000  1.0000
 1 - Control 1           0.4499   0.480 Inf   0.937  0.9997
 1 - Control 2          -0.4761   0.403 Inf  -1.181  0.9963
 1 - Control 3           0.0000   0.435 Inf   0.000  1.0000
 10 - 11                -1.4266   0.487 Inf  -2.927  0.1661
 10 - 2                 -0.5543   0.537 Inf  -1.032  0.9991
 10 - 3                 -0.7591   0.521 Inf  -1.456  0.9751
 10 - 4                  0.4265   0.662 Inf   0.645  1.0000
 10 - 5                 14.8145 395.618 Inf   0.037  1.0000
 10 - 6                 -0.4379   0.547 Inf  -0.800  0.9999
 10 - 7                 -1.3015   0.492 Inf  -2.644  0.3121
 10 - 8                  0.1929   0.623 Inf   0.310  1.0000
 10 - 9                 -0.7591   0.521 Inf  -1.456  0.9751
 10 - Control 1         -0.3092   0.560 Inf  -0.552  1.0000
 10 - Control 2         -1.2352   0.495 Inf  -2.495  0.4112
 10 - Control 3         -0.7591   0.521 Inf  -1.456  0.9751
 11 - 2                  0.8723   0.414 Inf   2.107  0.6967
 11 - 3                  0.6675   0.394 Inf   1.696  0.9180
 11 - 4                  1.8531   0.566 Inf   3.272  0.0642
 11 - 5                 16.2411 395.618 Inf   0.041  1.0000
 11 - 6                  0.9887   0.427 Inf   2.315  0.5435
 11 - 7                  0.1251   0.354 Inf   0.353  1.0000
 11 - 8                  1.6195   0.520 Inf   3.112  0.1021
 11 - 9                  0.6675   0.394 Inf   1.696  0.9180
 11 - Control 1          1.1174   0.443 Inf   2.523  0.3917
 11 - Control 2          0.1914   0.358 Inf   0.535  1.0000
 11 - Control 3          0.6675   0.394 Inf   1.696  0.9180
 2 - 3                  -0.2048   0.454 Inf  -0.451  1.0000
 2 - 4                   0.9808   0.610 Inf   1.609  0.9442
 2 - 5                  15.3688 395.618 Inf   0.039  1.0000
 2 - 6                   0.1164   0.483 Inf   0.241  1.0000
 2 - 7                  -0.7472   0.420 Inf  -1.781  0.8852
 2 - 8                   0.7472   0.567 Inf   1.318  0.9896
 2 - 9                  -0.2048   0.454 Inf  -0.451  1.0000
 2 - Control 1           0.2451   0.497 Inf   0.493  1.0000
 2 - Control 2          -0.6809   0.423 Inf  -1.610  0.9440
 2 - Control 3          -0.2048   0.454 Inf  -0.451  1.0000
 3 - 4                   1.1856   0.596 Inf   1.990  0.7747
 3 - 5                  15.5736 395.618 Inf   0.039  1.0000
 3 - 6                   0.3212   0.466 Inf   0.690  1.0000
 3 - 7                  -0.5424   0.400 Inf  -1.357  0.9864
 3 - 8                   0.9520   0.552 Inf   1.723  0.9082
 3 - 9                   0.0000   0.435 Inf   0.000  1.0000
 3 - Control 1           0.4499   0.480 Inf   0.937  0.9997
 3 - Control 2          -0.4761   0.403 Inf  -1.181  0.9963
 3 - Control 3           0.0000   0.435 Inf   0.000  1.0000
 4 - 5                  14.3880 395.618 Inf   0.036  1.0000
 4 - 6                  -0.8644   0.619 Inf  -1.398  0.9824
 4 - 7                  -1.7280   0.570 Inf  -3.029  0.1278
 4 - 8                  -0.2336   0.686 Inf  -0.340  1.0000
 4 - 9                  -1.1856   0.596 Inf  -1.990  0.7747
 4 - Control 1          -0.7357   0.629 Inf  -1.169  0.9967
 4 - Control 2          -1.6617   0.573 Inf  -2.901  0.1770
 4 - Control 3          -1.1856   0.596 Inf  -1.990  0.7747
 5 - 6                 -15.2524 395.618 Inf  -0.039  1.0000
 5 - 7                 -16.1161 395.618 Inf  -0.041  1.0000
 5 - 8                 -14.6216 395.618 Inf  -0.037  1.0000
 5 - 9                 -15.5736 395.618 Inf  -0.039  1.0000
 5 - Control 1         -15.1237 395.618 Inf  -0.038  1.0000
 5 - Control 2         -16.0497 395.618 Inf  -0.041  1.0000
 5 - Control 3         -15.5736 395.618 Inf  -0.039  1.0000
 6 - 7                  -0.8636   0.432 Inf  -1.997  0.7702
 6 - 8                   0.6308   0.577 Inf   1.094  0.9983
 6 - 9                  -0.3212   0.466 Inf  -0.690  1.0000
 6 - Control 1           0.1287   0.508 Inf   0.253  1.0000
 6 - Control 2          -0.7973   0.436 Inf  -1.830  0.8629
 6 - Control 3          -0.3212   0.466 Inf  -0.690  1.0000
 7 - 8                   1.4944   0.525 Inf   2.847  0.2012
 7 - 9                   0.5424   0.400 Inf   1.357  0.9864
 7 - Control 1           0.9923   0.448 Inf   2.214  0.6191
 7 - Control 2           0.0663   0.364 Inf   0.182  1.0000
 7 - Control 3           0.5424   0.400 Inf   1.357  0.9864
 8 - 9                  -0.9520   0.552 Inf  -1.723  0.9082
 8 - Control 1          -0.5021   0.589 Inf  -0.853  0.9999
 8 - Control 2          -1.4281   0.527 Inf  -2.707  0.2744
 8 - Control 3          -0.9520   0.552 Inf  -1.723  0.9082
 9 - Control 1           0.4499   0.480 Inf   0.937  0.9997
 9 - Control 2          -0.4761   0.403 Inf  -1.181  0.9963
 9 - Control 3           0.0000   0.435 Inf   0.000  1.0000
 Control 1 - Control 2  -0.9260   0.451 Inf  -2.052  0.7343
 Control 1 - Control 3  -0.4499   0.480 Inf  -0.937  0.9997
 Control 2 - Control 3   0.4761   0.403 Inf   1.181  0.9963

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 14 estimates 
Show the code
Tukey_C <- multcomp::cld(marginal, Letters = letters)

##Model checking with DHARMa
#simulationOutput <- simulateResiduals(fittedModel = Mod1, n = 1000, seed = 123)
#plot(simulationOutput)
#all fine

Calculate R2 - formula: 1 - (Residual Deviance/Null Deviance)

0.5425518

In the case of the ant larvae data, the post-hoc test did not identify any significant differences between the groups.

Ant burrows

Show the code
#re-shape the data for analysis in ordinal
dat_ants_burrow_ordinal <- dat_ants %>% 
  dplyr::select(Site, Former_pine_plantation, Ant_burrow_seen)

dat_ants_burrow_ordinal$Ant_burrow_seen[dat_ants_burrow_ordinal$Ant_burrow_seen == "N"] <- "0"
dat_ants_burrow_ordinal$Ant_burrow_seen[dat_ants_burrow_ordinal$Ant_burrow_seen == "Y"] <- "1"

unique(dat_ants_burrow_ordinal$Ant_burrow_seen)
[1] "1" "0"
Show the code
dat_ants_burrow_ordinal$Ant_burrow_seen <- as.numeric(dat_ants_burrow_ordinal$Ant_burrow_seen)


dat_ants_burrow_ordinal$Site <- as.factor(dat_ants_burrow_ordinal$Site)
dat_ants_burrow_ordinal$Former_pine_plantation <- as.factor(dat_ants_burrow_ordinal$Former_pine_plantation)
Show the code
#Null model
Mod0 <- glm(Ant_burrow_seen~1, family = binomial, data = dat_ants_burrow_ordinal)
#summary(Mod0)


Mod1 <- glm(Ant_burrow_seen~Site-1, family = binomial, data = dat_ants_burrow_ordinal)
summary(Mod1)

Call:
glm(formula = Ant_burrow_seen ~ Site - 1, family = binomial, 
    data = dat_ants_burrow_ordinal)

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
Site1           0.1201     0.2004   0.600 0.548747    
Site10          0.9445     0.2227   4.241 2.23e-05 ***
Site11          1.2083     0.2376   5.085 3.68e-07 ***
Site2           0.4895     0.2060   2.376 0.017492 *  
Site3           0.7538     0.2144   3.516 0.000438 ***
Site4           0.4055     0.2041   1.986 0.046993 *  
Site5          -1.1527     0.2341  -4.923 8.53e-07 ***
Site6           1.0460     0.2280   4.588 4.48e-06 ***
Site7           0.9946     0.2252   4.416 1.01e-05 ***
Site8           0.6190     0.2097   2.953 0.003151 ** 
Site9           0.8001     0.2162   3.700 0.000215 ***
SiteControl 1   0.2007     0.2010   0.998 0.318122    
SiteControl 2   0.6190     0.2097   2.953 0.003151 ** 
SiteControl 3  -0.4895     0.2060  -2.376 0.017492 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1940.8  on 1400  degrees of freedom
Residual deviance: 1752.2  on 1386  degrees of freedom
AIC: 1780.2

Number of Fisher Scoring iterations: 4
Show the code
anova(Mod1)
Analysis of Deviance Table

Model: binomial, link: logit

Response: Ant_burrow_seen

Terms added sequentially (first to last)

     Df Deviance Resid. Df Resid. Dev
NULL                  1400     1940.8
Site 14   188.59      1386     1752.2
Show the code
#With emmeans
marginal <- emmeans(Mod1, "Site")
  
pairs(marginal,
      adjust="tukey")
 contrast              estimate    SE  df z.ratio p.value
 1 - 10                 -0.8243 0.300 Inf  -2.752  0.2496
 1 - 11                 -1.0882 0.311 Inf  -3.501  0.0310
 1 - 2                  -0.3694 0.287 Inf  -1.285  0.9917
 1 - 3                  -0.6336 0.293 Inf  -2.159  0.6593
 1 - 4                  -0.2853 0.286 Inf  -0.998  0.9994
 1 - 5                   1.2728 0.308 Inf   4.130  0.0029
 1 - 6                  -0.9258 0.304 Inf  -3.050  0.1208
 1 - 7                  -0.8745 0.301 Inf  -2.901  0.1770
 1 - 8                  -0.4989 0.290 Inf  -1.720  0.9092
 1 - 9                  -0.6800 0.295 Inf  -2.307  0.5498
 1 - Control 1          -0.0805 0.284 Inf  -0.284  1.0000
 1 - Control 2          -0.4989 0.290 Inf  -1.720  0.9092
 1 - Control 3           0.6097 0.287 Inf   2.122  0.6865
 10 - 11                -0.2638 0.326 Inf  -0.810  0.9999
 10 - 2                  0.4549 0.303 Inf   1.499  0.9681
 10 - 3                  0.1907 0.309 Inf   0.617  1.0000
 10 - 4                  0.5390 0.302 Inf   1.784  0.8837
 10 - 5                  2.0971 0.323 Inf   6.490  <.0001
 10 - 6                 -0.1015 0.319 Inf  -0.318  1.0000
 10 - 7                 -0.0502 0.317 Inf  -0.158  1.0000
 10 - 8                  0.3254 0.306 Inf   1.064  0.9987
 10 - 9                  0.1443 0.310 Inf   0.465  1.0000
 10 - Control 1          0.7438 0.300 Inf   2.479  0.4225
 10 - Control 2          0.3254 0.306 Inf   1.064  0.9987
 10 - Control 3          1.4340 0.303 Inf   4.727  0.0002
 11 - 2                  0.7188 0.314 Inf   2.285  0.5659
 11 - 3                  0.4545 0.320 Inf   1.420  0.9798
 11 - 4                  0.8028 0.313 Inf   2.563  0.3646
 11 - 5                  2.3610 0.334 Inf   7.077  <.0001
 11 - 6                  0.1623 0.329 Inf   0.493  1.0000
 11 - 7                  0.2137 0.327 Inf   0.653  1.0000
 11 - 8                  0.5893 0.317 Inf   1.860  0.8483
 11 - 9                  0.4082 0.321 Inf   1.271  0.9926
 11 - Control 1          1.0076 0.311 Inf   3.238  0.0713
 11 - Control 2          0.5893 0.317 Inf   1.860  0.8483
 11 - Control 3          1.6979 0.314 Inf   5.399  <.0001
 2 - 3                  -0.2642 0.297 Inf  -0.889  0.9998
 2 - 4                   0.0841 0.290 Inf   0.290  1.0000
 2 - 5                   1.6422 0.312 Inf   5.266  <.0001
 2 - 6                  -0.5564 0.307 Inf  -1.811  0.8718
 2 - 7                  -0.5051 0.305 Inf  -1.655  0.9313
 2 - 8                  -0.1295 0.294 Inf  -0.441  1.0000
 2 - 9                  -0.3106 0.299 Inf  -1.040  0.9990
 2 - Control 1           0.2889 0.288 Inf   1.004  0.9993
 2 - Control 2          -0.1295 0.294 Inf  -0.441  1.0000
 2 - Control 3           0.9791 0.291 Inf   3.360  0.0489
 3 - 4                   0.3483 0.296 Inf   1.177  0.9965
 3 - 5                   1.9065 0.317 Inf   6.005  <.0001
 3 - 6                  -0.2922 0.313 Inf  -0.934  0.9997
 3 - 7                  -0.2409 0.311 Inf  -0.775  1.0000
 3 - 8                   0.1347 0.300 Inf   0.449  1.0000
 3 - 9                  -0.0463 0.304 Inf  -0.152  1.0000
 3 - Control 1           0.5531 0.294 Inf   1.882  0.8366
 3 - Control 2           0.1347 0.300 Inf   0.449  1.0000
 3 - Control 3           1.2433 0.297 Inf   4.182  0.0023
 4 - 5                   1.5581 0.311 Inf   5.016  <.0001
 4 - 6                  -0.6405 0.306 Inf  -2.093  0.7064
 4 - 7                  -0.5892 0.304 Inf  -1.938  0.8056
 4 - 8                  -0.2136 0.293 Inf  -0.730  1.0000
 4 - 9                  -0.3947 0.297 Inf  -1.327  0.9889
 4 - Control 1           0.2048 0.286 Inf   0.715  1.0000
 4 - Control 2          -0.2136 0.293 Inf  -0.730  1.0000
 4 - Control 3           0.8950 0.290 Inf   3.086  0.1097
 5 - 6                  -2.1986 0.327 Inf  -6.728  <.0001
 5 - 7                  -2.1473 0.325 Inf  -6.609  <.0001
 5 - 8                  -1.7717 0.314 Inf  -5.637  <.0001
 5 - 9                  -1.9528 0.319 Inf  -6.127  <.0001
 5 - Control 1          -1.3534 0.309 Inf  -4.386  0.0010
 5 - Control 2          -1.7717 0.314 Inf  -5.637  <.0001
 5 - Control 3          -0.6631 0.312 Inf  -2.126  0.6831
 6 - 7                   0.0513 0.320 Inf   0.160  1.0000
 6 - 8                   0.4269 0.310 Inf   1.378  0.9844
 6 - 9                   0.2458 0.314 Inf   0.782  1.0000
 6 - Control 1           0.8453 0.304 Inf   2.781  0.2339
 6 - Control 2           0.4269 0.310 Inf   1.378  0.9844
 6 - Control 3           1.5355 0.307 Inf   4.997  0.0001
 7 - 8                   0.3756 0.308 Inf   1.221  0.9950
 7 - 9                   0.1945 0.312 Inf   0.623  1.0000
 7 - Control 1           0.7940 0.302 Inf   2.630  0.3210
 7 - Control 2           0.3756 0.308 Inf   1.221  0.9950
 7 - Control 3           1.4842 0.305 Inf   4.862  0.0001
 8 - 9                  -0.1811 0.301 Inf  -0.601  1.0000
 8 - Control 1           0.4184 0.290 Inf   1.440  0.9772
 8 - Control 2           0.0000 0.296 Inf   0.000  1.0000
 8 - Control 3           1.1086 0.294 Inf   3.771  0.0119
 9 - Control 1           0.5994 0.295 Inf   2.031  0.7485
 9 - Control 2           0.1811 0.301 Inf   0.601  1.0000
 9 - Control 3           1.2897 0.299 Inf   4.318  0.0013
 Control 1 - Control 2  -0.4184 0.290 Inf  -1.440  0.9772
 Control 1 - Control 3   0.6902 0.288 Inf   2.398  0.4816
 Control 2 - Control 3   1.1086 0.294 Inf   3.771  0.0119

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 14 estimates 
Show the code
Tukey_D <- multcomp::cld(marginal, Letters = letters)

##Model checking with DHARMa
#simulationOutput <- simulateResiduals(fittedModel = Mod1, n = 1000, seed = 123)
#plot(simulationOutput)
#all fine

Calculate R2 - formula: 1 - (Residual Deviance/Null Deviance)

0.0971764

The analysis shows there is some structuring between patches with regards to ant burrows, but its not hugely compelling.

Show the code
#Produce table of Tukey results

Table_1 <- bind_rows(Tukey_A, Tukey_B, Tukey_C, Tukey_D, .id = "Metric")

Table_1 <- Table_1 %>% 
  dplyr::select(Metric, Site, .group) %>% 
  pivot_wider(., id_cols = Site, names_from = Metric, values_from = .group)

colnames(Table_1) <- c("Patch", "No. Ants", "Ant eggs", "Ant larvae", "Ant burrows")

Question 2

  1. Compare ant number/colonies between treatment sites (islands) in former pine forest and outside former pine forest or maybe with control sites. Do ant numbers/colonies differ between islands within former pine plantation and outside former pine plantation?

This can be looked at in 2 ways:

  • Control sites vs treatment sites

  • Control sites vs treatment sites (pine forest) and treatment sites (non-pine forest)

Part A - look at number of ants

Show the code
#New variable: Control sites and new habitat sites

dat_ants_ordinal <- dat_ants_ordinal %>% 
  mutate(Treatment = case_when(Site == "Control 1" ~ "Control",
                               Site == "Control 2" ~ "Control",
                               Site == "Control 3" ~ "Control",
                               .default = "Treatment"))

#New variable: Control sites and new habitat sites. Sites 6 - 11 are ex-pine forest

dat_ants_ordinal <- dat_ants_ordinal %>% 
  mutate(Treatment2 = case_when(Site == "Control 1" ~ "Control",Site == "Control 2" ~ "Control",
          Site == "Control 3" ~ "Control",
          Site == "6" ~ "Pine",
          Site == "7" ~ "Pine",
          Site == "8" ~ "Pine",
          Site == "9" ~ "Pine",
          Site == "10" ~ "Pine",
          Site == "11" ~ "Pine",
                               .default = "Non_pine"))
Show the code
Mod2 <- clm(Ants_present ~ Treatment, data = dat_ants_ordinal)
summary(Mod2)
formula: Ants_present ~ Treatment
data:    dat_ants_ordinal

 link  threshold nobs logLik   AIC     niter max.grad cond.H 
 logit flexible  1400 -1840.67 3689.35 4(0)  1.19e-08 4.9e+01

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
TreatmentTreatment   0.4180     0.1241   3.367  0.00076 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
0|1 -0.07901    0.11304  -0.699
1|2  1.01403    0.11675   8.686
2|3  1.94826    0.12533  15.545
Show the code
anova(Mod2, type = "II")
Type II Analysis of Deviance Table with Wald chi-square tests

          Df  Chisq Pr(>Chisq)    
Treatment  1 11.336  0.0007601 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
#With emmeans
marginal2 <- emmeans(Mod2, "Treatment")
  
pairs(marginal2,
      adjust="tukey")
 contrast            estimate    SE  df z.ratio p.value
 Control - Treatment   -0.418 0.124 Inf  -3.367  0.0008
Show the code
multcomp::cld(marginal2, Letters = letters)
 Treatment emmean     SE  df asymp.LCL asymp.UCL .group
 Control   -0.961 0.1131 Inf    -1.183    -0.739  a    
 Treatment -0.543 0.0554 Inf    -0.652    -0.435   b   

Confidence level used: 0.95 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping letter,
      then we cannot show them to be different.
      But we also did not show them to be the same. 
Show the code
#nominal_test(Mod2)
#scale_test(Mod2)



Mod3 <- clm(Ants_present ~ Treatment2, data = dat_ants_ordinal)
summary(Mod3)
formula: Ants_present ~ Treatment2
data:    dat_ants_ordinal

 link  threshold nobs logLik   AIC     niter max.grad cond.H 
 logit flexible  1400 -1818.14 3646.28 4(0)  1.14e-08 6.0e+01

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
Treatment2Non_pine 0.008263   0.139224   0.059    0.953    
Treatment2Pine     0.745677   0.133704   5.577 2.45e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
0|1 -0.09109    0.11338  -0.803
1|2  1.03108    0.11735   8.786
2|3  1.98065    0.12620  15.695
Show the code
anova(Mod3, type = "II")
Type II Analysis of Deviance Table with Wald chi-square tests

           Df  Chisq Pr(>Chisq)    
Treatment2  2 55.871  7.374e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
#With emmeans
marginal3 <- emmeans(Mod3, "Treatment2")
  
pairs(marginal3,
      adjust="tukey")
 contrast           estimate    SE  df z.ratio p.value
 Control - Non_pine -0.00826 0.139 Inf  -0.059  0.9981
 Control - Pine     -0.74568 0.134 Inf  -5.577  <.0001
 Non_pine - Pine    -0.73741 0.111 Inf  -6.662  <.0001

P value adjustment: tukey method for comparing a family of 3 estimates 
Show the code
multcomp::cld(marginal3, Letters = letters)
 Treatment2 emmean     SE  df asymp.LCL asymp.UCL .group
 Control    -0.974 0.1135 Inf    -1.196   -0.7511  a    
 Non_pine   -0.965 0.0853 Inf    -1.132   -0.7982  a    
 Pine       -0.228 0.0722 Inf    -0.369   -0.0864   b   

Confidence level used: 0.95 
P value adjustment: tukey method for comparing a family of 3 estimates 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping letter,
      then we cannot show them to be different.
      But we also did not show them to be the same. 
Show the code
#nominal_test(Mod3)
#scale_test(Mod3)
Show the code
sjPlot::tab_model(Mod3)
  Ants_present
Predictors Odds Ratios CI p
0|1 0.91 0.73 – 1.14 0.422
1|2 2.80 2.23 – 3.53 <0.001
2|3 7.25 5.66 – 9.28 <0.001
Treatment2 [Non_pine] 1.01 0.77 – 1.32 0.953
Treatment2 [Pine] 2.11 1.62 – 2.74 <0.001
Observations 1400
R2 Nagelkerke 0.043
Show the code
Table_2 <- tab_model(Mod3)

Conclusion There is a significant difference between control sites, and island sites with regards to the abundance of ants. However, when island sites are partitioned into whether they were previously pine plantation or not, it is revealed that sites that were previously pine plantation are significantly different from both the control and island sites that were not pine plantation. [What direction is this going? eg. more ants in pine?]. The R2 of this model is absolute junk (0.04) so I wouldn’t read too much into this result. This is largely confirmed by the previous analysis at the patch level, which showed there were no meaningful differences between the patches anyway.

Part B - Ant eggs

Show the code
#re-shape the data for analysis in ordinal
dat_Q2B <- dat_ants %>% 
  dplyr::select(Site, Former_pine_plantation, Ant_eggs_present)

dat_Q2B$Ant_eggs_present[dat_Q2B$Ant_eggs_present == "N"] <- "0"
dat_Q2B$Ant_eggs_present[dat_Q2B$Ant_eggs_present == "Y"] <- "1"

unique(dat_Q2B$Ant_eggs_present)
[1] "0" "1"
Show the code
dat_Q2B$Ant_eggs_present <- as.factor(dat_Q2B$Ant_eggs_present)

dat_Q2B$Site <- as.factor(dat_Q2B$Site)
dat_Q2B$Former_pine_plantation <- as.factor(dat_Q2B$Former_pine_plantation)
Show the code
dat_Q2B <- dat_Q2B %>% 
  mutate(Treatment = case_when(Site == "Control 1" ~ "Control",
                               Site == "Control 2" ~ "Control",
                               Site == "Control 3" ~ "Control",
                               .default = "Treatment"))

#New variable: Control sites and new habitat sites. Sites 6 - 11 are ex-pine forest

dat_Q2B <- dat_Q2B %>% 
  mutate(Treatment2 = case_when(Site == "Control 1" ~ "Control",Site == "Control 2" ~ "Control",
          Site == "Control 3" ~ "Control",
          Site == "6" ~ "Pine",
          Site == "7" ~ "Pine",
          Site == "8" ~ "Pine",
          Site == "9" ~ "Pine",
          Site == "10" ~ "Pine",
          Site == "11" ~ "Pine",
                               .default = "Non_pine"))
Show the code
Mod_Q2B1 <- glm(Ant_eggs_present ~ Treatment, family = "binomial", data = dat_Q2B)
summary(Mod_Q2B1)

Call:
glm(formula = Ant_eggs_present ~ Treatment, family = "binomial", 
    data = dat_Q2B)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)         -1.2083     0.1372  -8.807   <2e-16 ***
TreatmentTreatment  -0.1667     0.1564  -1.065    0.287    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1431.1  on 1399  degrees of freedom
Residual deviance: 1430.0  on 1398  degrees of freedom
AIC: 1434

Number of Fisher Scoring iterations: 4
Show the code
Mod_Q2B2 <- glm(Ant_eggs_present ~ Treatment2-1, family = "binomial", data = dat_Q2B)
summary(Mod_Q2B2)

Call:
glm(formula = Ant_eggs_present ~ Treatment2 - 1, family = "binomial", 
    data = dat_Q2B)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
Treatment2Control  -1.20831    0.13719  -8.807   <2e-16 ***
Treatment2Non_pine -1.68830    0.12325 -13.698   <2e-16 ***
Treatment2Pine     -1.15268    0.09559 -12.059   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1940.8  on 1400  degrees of freedom
Residual deviance: 1417.8  on 1397  degrees of freedom
AIC: 1423.8

Number of Fisher Scoring iterations: 4
Show the code
Marginal_Q2B2 <- emmeans(Mod_Q2B2, "Treatment2")
  
pairs(Marginal_Q2B2,
      adjust="tukey")
 contrast           estimate    SE  df z.ratio p.value
 Control - Non_pine   0.4800 0.184 Inf   2.603  0.0251
 Control - Pine      -0.0556 0.167 Inf  -0.333  0.9408
 Non_pine - Pine     -0.5356 0.156 Inf  -3.434  0.0017

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 3 estimates 
Show the code
multcomp::cld(Marginal_Q2B2, Letters = letters)
 Treatment2 emmean     SE  df asymp.LCL asymp.UCL .group
 Non_pine    -1.69 0.1232 Inf     -1.93    -1.447  a    
 Control     -1.21 0.1372 Inf     -1.48    -0.939   b   
 Pine        -1.15 0.0956 Inf     -1.34    -0.965   b   

Results are given on the logit (not the response) scale. 
Confidence level used: 0.95 
Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 3 estimates 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping letter,
      then we cannot show them to be different.
      But we also did not show them to be the same. 
Show the code
##Model checking with DHARMa
#simulationOutput <- simulateResiduals(fittedModel = Mod_Q2B2, n = 1000, seed = 123)
#plot(simulationOutput)
#all fine

TMod_Q2B2 <- summary(Mod_Q2B2)

TQ2B2 <- data.frame(cbind(TMod_Q2B2$coefficients, multcomp::cld(Marginal_Q2B2, Letters = letters)$.group))

TQ2B2$Coefficient <- rownames(TQ2B2)
Show the code
sjPlot::tab_model(Mod_Q2B2)
  Ant_eggs_present
Predictors Odds Ratios CI p
Treatment2 [Control] 0.30 0.23 – 0.39 <0.001
Treatment2 [Non_pine] 0.18 0.14 – 0.23 <0.001
Treatment2 [Pine] 0.32 0.26 – 0.38 <0.001
Observations 1400
R2 Tjur 0.009
Show the code
#R2
1 - (1417.8/1940.8)
[1] 0.2694765

Conclusion There is no overall difference between control sites and island sites with respect to the presence of ant eggs, however if island sites are split by previous land use (pine plantation or not) then there is a significant difference between sites that were not previously pine plantations, and those that were. This is not an intuitive result. Model is of low value, statistically however (R2 = 0.27). Again, this is largely consistent with the previous analysis that found only minor differences between patches.

Part C - Ant larvae

Show the code
#re-shape the data for analysis
dat_Q2C <- dat_ants %>% 
  dplyr::select(Site, Former_pine_plantation, Ant_larvae_present)

dat_Q2C$Ant_larvae_present[dat_Q2C$Ant_larvae_present == "N"] <- "0"
dat_Q2C$Ant_larvae_present[dat_Q2C$Ant_larvae_present == "Y"] <- "1"

unique(dat_Q2C$Ant_larvae_present)
[1] "0" "1"
Show the code
dat_Q2C$Ant_larvae_present <- as.factor(dat_Q2C$Ant_larvae_present)

dat_Q2C$Site <- as.factor(dat_Q2C$Site)
dat_Q2C$Former_pine_plantation <- as.factor(dat_Q2C$Former_pine_plantation)
Show the code
dat_Q2C <- dat_Q2C %>% 
  mutate(Treatment = case_when(Site == "Control 1" ~ "Control",
                               Site == "Control 2" ~ "Control",
                               Site == "Control 3" ~ "Control",
                               .default = "Treatment"))

#New variable: Control sites and new habitat sites. Sites 6 - 11 are ex-pine forest

dat_Q2C <- dat_Q2C %>% 
  mutate(Treatment2 = case_when(Site == "Control 1" ~ "Control",Site == "Control 2" ~ "Control",
          Site == "Control 3" ~ "Control",
          Site == "6" ~ "Pine",
          Site == "7" ~ "Pine",
          Site == "8" ~ "Pine",
          Site == "9" ~ "Pine",
          Site == "10" ~ "Pine",
          Site == "11" ~ "Pine",
                               .default = "Non_pine"))
Show the code
Mod_Q2C1 <- glm(Ant_larvae_present ~ Treatment, family = "binomial", data = dat_Q2C)
summary(Mod_Q2C1)

Call:
glm(formula = Ant_larvae_present ~ Treatment, family = "binomial", 
    data = dat_Q2C)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)         -1.9308     0.1736 -11.123   <2e-16 ***
TreatmentTreatment  -0.2665     0.2006  -1.328    0.184    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 944.89  on 1399  degrees of freedom
Residual deviance: 943.18  on 1398  degrees of freedom
AIC: 947.18

Number of Fisher Scoring iterations: 4
Show the code
Mod_Q2C2 <- glm(Ant_larvae_present ~ Treatment2-1, family = "binomial", data = dat_Q2C)
summary(Mod_Q2C2)

Call:
glm(formula = Ant_larvae_present ~ Treatment2 - 1, family = "binomial", 
    data = dat_Q2C)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
Treatment2Control   -1.9308     0.1736  -11.12   <2e-16 ***
Treatment2Non_pine  -2.4980     0.1688  -14.80   <2e-16 ***
Treatment2Pine      -1.9924     0.1256  -15.86   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1940.8  on 1400  degrees of freedom
Residual deviance:  937.2  on 1397  degrees of freedom
AIC: 943.2

Number of Fisher Scoring iterations: 5
Show the code
Marginal_Q2C2 <- emmeans(Mod_Q2C2, "Treatment2")
  
pairs(Marginal_Q2C2,
      adjust="tukey")
 contrast           estimate    SE  df z.ratio p.value
 Control - Non_pine   0.5672 0.242 Inf   2.343  0.0501
 Control - Pine       0.0617 0.214 Inf   0.288  0.9554
 Non_pine - Pine     -0.5055 0.210 Inf  -2.403  0.0430

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 3 estimates 
Show the code
multcomp::cld(Marginal_Q2C2, Letters = letters)
 Treatment2 emmean    SE  df asymp.LCL asymp.UCL .group
 Non_pine    -2.50 0.169 Inf     -2.83     -2.17  a    
 Pine        -1.99 0.126 Inf     -2.24     -1.75   b   
 Control     -1.93 0.174 Inf     -2.27     -1.59  ab   

Results are given on the logit (not the response) scale. 
Confidence level used: 0.95 
Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 3 estimates 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping letter,
      then we cannot show them to be different.
      But we also did not show them to be the same. 
Show the code
#Control site straddles both pine and non-pine sites

##Model checking with DHARMa
#simulationOutput <- simulateResiduals(fittedModel = Mod_Q2C2, n = 1000, seed = 123)
#plot(simulationOutput)
#all fine


TMod_Q2C2 <- summary(Mod_Q2C2)

TQ2C2 <- data.frame(cbind(TMod_Q2C2$coefficients, multcomp::cld(Marginal_Q2C2, Letters = letters)$.group))

TQ2C2$Coefficient <- rownames(TQ2C2)
Show the code
sjPlot::tab_model(Mod_Q2C2)
  Ant_larvae_present
Predictors Odds Ratios CI p
Treatment2 [Control] 0.15 0.10 – 0.20 <0.001
Treatment2 [Non_pine] 0.08 0.06 – 0.11 <0.001
Treatment2 [Pine] 0.14 0.11 – 0.17 <0.001
Observations 1400
R2 Tjur 0.005
Show the code
#R2
1 - (937.2/1940.8)
[1] 0.5171063

Conclusion There is no overall difference between control sites and island sites with respect to the presence of ant larvae, however if island sites are split by previous land use (pine plantation or not) then there is a significant difference between sites that were not previously pine plantations, and those that were. This is not an intuitive result. In terms of Tukey HSD, the effect is marginal, however model R2 is ok (R2 = 0.51).

Part D - Ant burrows

Show the code
#re-shape the data for analysis
dat_Q2D <- dat_ants %>% 
  dplyr::select(Site, Former_pine_plantation, Ant_burrow_seen)

dat_Q2D$Ant_burrow_seen[dat_Q2D$Ant_burrow_seen == "N"] <- "0"
dat_Q2D$Ant_burrow_seen[dat_Q2D$Ant_burrow_seen == "Y"] <- "1"

unique(dat_Q2D$Ant_burrow_seen)
[1] "1" "0"
Show the code
dat_Q2D$Ant_burrow_seen <- as.factor(dat_Q2D$Ant_burrow_seen)

dat_Q2D$Site <- as.factor(dat_Q2D$Site)
dat_Q2D$Former_pine_plantation <- as.factor(dat_Q2D$Former_pine_plantation)
Show the code
dat_Q2D <- dat_Q2D %>% 
  mutate(Treatment = case_when(Site == "Control 1" ~ "Control",
                               Site == "Control 2" ~ "Control",
                               Site == "Control 3" ~ "Control",
                               .default = "Treatment"))

#New variable: Control sites and new habitat sites. Sites 6 - 11 are ex-pine forest

dat_Q2D <- dat_Q2D %>% 
  mutate(Treatment2 = case_when(Site == "Control 1" ~ "Control",Site == "Control 2" ~ "Control",
          Site == "Control 3" ~ "Control",
          Site == "6" ~ "Pine",
          Site == "7" ~ "Pine",
          Site == "8" ~ "Pine",
          Site == "9" ~ "Pine",
          Site == "10" ~ "Pine",
          Site == "11" ~ "Pine",
                               .default = "Non_pine"))
Show the code
Mod_Q2D1 <- glm(Ant_burrow_seen ~ Treatment, family = "binomial", data = dat_Q2D)
summary(Mod_Q2D1)

Call:
glm(formula = Ant_burrow_seen ~ Treatment, family = "binomial", 
    data = dat_Q2D)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)          0.1068     0.1156   0.923 0.355840    
TreatmentTreatment   0.4411     0.1315   3.355 0.000795 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1871.6  on 1399  degrees of freedom
Residual deviance: 1860.4  on 1398  degrees of freedom
AIC: 1864.4

Number of Fisher Scoring iterations: 4
Show the code
Mod_Q2D2 <- glm(Ant_burrow_seen ~ Treatment2-1, family = "binomial", data = dat_Q2D)
summary(Mod_Q2D2)

Call:
glm(formula = Ant_burrow_seen ~ Treatment2 - 1, family = "binomial", 
    data = dat_Q2D)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
Treatment2Control   0.10677    0.11563   0.923    0.356    
Treatment2Non_pine  0.13621    0.08965   1.519    0.129    
Treatment2Pine      0.92799    0.09060  10.243   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1940.8  on 1400  degrees of freedom
Residual deviance: 1821.2  on 1397  degrees of freedom
AIC: 1827.2

Number of Fisher Scoring iterations: 4
Show the code
Marginal_Q2D2 <- emmeans(Mod_Q2D2, "Treatment2")
  
pairs(Marginal_Q2D2,
      adjust="tukey")
 contrast           estimate    SE  df z.ratio p.value
 Control - Non_pine  -0.0294 0.146 Inf  -0.201  0.9779
 Control - Pine      -0.8212 0.147 Inf  -5.590  <.0001
 Non_pine - Pine     -0.7918 0.127 Inf  -6.212  <.0001

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 3 estimates 
Show the code
multcomp::cld(Marginal_Q2D2, Letters = letters)
 Treatment2 emmean     SE  df asymp.LCL asymp.UCL .group
 Control     0.107 0.1156 Inf   -0.1199     0.333  a    
 Non_pine    0.136 0.0897 Inf   -0.0395     0.312  a    
 Pine        0.928 0.0906 Inf    0.7504     1.106   b   

Results are given on the logit (not the response) scale. 
Confidence level used: 0.95 
Results are given on the log odds ratio (not the response) scale. 
P value adjustment: tukey method for comparing a family of 3 estimates 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping letter,
      then we cannot show them to be different.
      But we also did not show them to be the same. 
Show the code
##Model checking with DHARMa
#simulationOutput <- simulateResiduals(fittedModel = Mod_Q2D2, n = 1000, seed = 123)
#plot(simulationOutput)
#all fine

TMod_Q2D2 <- summary(Mod_Q2D2)

TQ2D2 <- data.frame(cbind(TMod_Q2D2$coefficients, multcomp::cld(Marginal_Q2D2, Letters = letters)$.group))

TQ2D2$Coefficient <- rownames(TQ2D2)
Show the code
sjPlot::tab_model(Mod_Q2D2)
  Ant_burrow_seen
Predictors Odds Ratios CI p
Treatment2 [Control] 1.11 0.89 – 1.40 0.356
Treatment2 [Non_pine] 1.15 0.96 – 1.37 0.129
Treatment2 [Pine] 2.53 2.12 – 3.03 <0.001
Observations 1400
R2 Tjur 0.035
Show the code
#R2
1 - (1821.2/1940.8)
[1] 0.06162407
Show the code
#Combine summary tables

Table_3 <- bind_rows(TQ2B2, TQ2C2, TQ2D2, .id = "Metric")
rownames(Table_3) <- NULL

Table_3 <- Table_3 %>% 
  dplyr::select(Metric, Coefficient, Estimate, Std..Error, Pr...z.., "Tukey" = V5)

Table_3$Estimate <- as.numeric(Table_3$Estimate)
Table_3$Std..Error <- as.numeric(Table_3$Std..Error)
Table_3$Pr...z.. <- as.numeric(Table_3$Pr...z..)

Table_3 <- Table_3 %>% 
  mutate_if(is.numeric, round, digits=3)

Table_3$Metric[Table_3$Metric == "1"] <- "Ant eggs"
Table_3$Metric[Table_3$Metric == "2"] <- "Ant larvae"
Table_3$Metric[Table_3$Metric == "3"] <- "Ant burrows"

knitr::kable(Table_3)
Metric Coefficient Estimate Std..Error Pr…z.. Tukey
Ant eggs Treatment2Control -1.208 0.137 0.000 a
Ant eggs Treatment2Non_pine -1.688 0.123 0.000 b
Ant eggs Treatment2Pine -1.153 0.096 0.000 b
Ant larvae Treatment2Control -1.931 0.174 0.000 a
Ant larvae Treatment2Non_pine -2.498 0.169 0.000 b
Ant larvae Treatment2Pine -1.992 0.126 0.000 ab
Ant burrows Treatment2Control 0.107 0.116 0.356 a
Ant burrows Treatment2Non_pine 0.136 0.090 0.129 a
Ant burrows Treatment2Pine 0.928 0.091 0.000 b

Conclusion: There is a difference between control sites and island sites with respect to the presence of ant burrows, however if island sites are split by previous land use (pine plantation or not) then there is a significant difference between sites that were previously pine plantations, and those that were not. Model is of no value (R2 = 0.06)

[Say something like “while there were statistically significant difference, (and the direction), the overall proportion of variation explained by the models is low (figures), hence the ecologcal relevance of these results is likely minimal.]

Question 3

  1. Look to see if there is a relationship between ant numbers and PTWL detection rates. That is, do the islands with no or low PTWL detection rates have lower numbers of ants compared to islands with high detection rates or controls?

To do this, first need to create an index of capture rates for each site, to use as the response variable. Then calculate an index for ant colonies.

Show the code
dat_PTWL_summary <- dat_PTWL %>% 
  group_by(Site) %>% 
  summarise(N_PTWL = n(),
            Distance = mean(Distance_nearest_habitat, na.rm = T))

dat_PTWL_summary$Distance[dat_PTWL_summary$Site == "Control 3"] <- 0
Show the code
#dat_ant_summary <- dat_ants_ordinal %>% 
#  group_by(Site) %>% 
#  summarise(Ants1 = sum(if_else(Ants_present == 0, 0, 1)),
#            Ants2 = sum(if_else(Ants_present == 0| Ants_present ==1, 0, 1)),
#            Ants3 = mean(as.numeric(Ants_present)))


dat_ant_summary <- dat_ants %>% 
  dplyr::select(Site, Former_pine_plantation, Ants_present, Ant_eggs_present, Ant_larvae_present, Ant_burrow_seen)

dat_ant_summary$Ants_present[dat_ant_summary$Ants_present == "(1-10)"] <- "1"
dat_ant_summary$Ants_present[dat_ant_summary$Ants_present == "(11-50)"] <- "2"
dat_ant_summary$Ants_present[dat_ant_summary$Ants_present == "(>50)"] <- "3"

dat_ant_summary <- dat_ant_summary %>% 
  group_by(Site) %>% 
  summarise(Ants1 = sum(if_else(Ants_present == 0, 0, 1)),
            Ants2 = sum(if_else(Ants_present == 0| Ants_present ==1, 0, 1)),
            Ants3 = mean(as.numeric(Ants_present, na.rm = T)),
            Prop_ant_eggs = sum(if_else(Ant_eggs_present == "Y", 1, 0))/n(),
            Prop_ant_larvae = sum(if_else(Ant_larvae_present == "Y", 1, 0))/n(),
            Prop_ant_burrow = sum(if_else(Ant_burrow_seen == "Y", 1, 0))/n())
Show the code
#Join datasets
dat_Q3 <- full_join(dat_PTWL_summary, dat_ant_summary, by = "Site")

dat_Q3$N_PTWL[is.na(dat_Q3$N_PTWL)] <- 0
dat_Q3$Distance[dat_Q3$Site == 6] <- 240
dat_Q3$Distance[dat_Q3$Site == 9] <- 175

#Remove control sites
dat_Q3 <- dat_Q3 %>% 
  filter(Site != "Control 1" &
           Site != "Control 2" &
           Site != "Control 3")
Show the code
#Plots

dat_Q3 %>% 
  ggplot() + 
  geom_point(aes(x = Ants1, y = N_PTWL)) + 
  labs(x = "Number of ant colonies per site", y = "Number of PTWL") + 
  geom_smooth(aes(x = Ants1, y = N_PTWL), method = "lm") + 
  theme_bw()

Show the code
#Plots

dat_Q3 %>% 
  ggplot() + 
  geom_point(aes(x = Ants2, y = N_PTWL)) + 
  labs(x = "Number of ant colonies > 1-10 per site", y = "Number of PTWL") + 
  geom_smooth(aes(x = Ants2, y = N_PTWL), method = "lm") + 
  theme_bw()

Show the code
#Plots

dat_Q3 %>% 
  ggplot() + 
  geom_point(aes(x = Ants3, y = N_PTWL)) + 
  labs(x = "Mean ant site score", y = "Number of PTWL") + 
  geom_smooth(aes(x = Ants3, y = N_PTWL), method = "lm") + 
  theme_bw()

Show the code
dat_Q3 %>% 
  ggplot() + 
  geom_point(aes(x = Prop_ant_eggs, y = N_PTWL)) + 
  labs(x = "Proportion of rocks with ant eggs", y = "Number of PTWL") + 
  geom_smooth(aes(x = Prop_ant_eggs, y = N_PTWL), method = "lm") + 
  theme_bw()

Show the code
dat_Q3 %>% 
  ggplot() + 
  geom_point(aes(x = Prop_ant_larvae, y = N_PTWL)) + 
  labs(x = "Proportion of rocks with ant larvae", y = "Number of PTWL") + 
  geom_smooth(aes(x = Prop_ant_larvae, y = N_PTWL), method = "lm") + 
  theme_bw()

Show the code
dat_Q3 %>% 
  ggplot() + 
  geom_point(aes(x = Prop_ant_burrow, y = N_PTWL)) + 
  labs(x = "Proportion of rocks with ant burrows", y = "Number of PTWL") + 
  geom_smooth(aes(x = Prop_ant_burrow, y = N_PTWL), method = "lm") + 
  theme_bw()

No obvious effect of ant colonies on PTWL detections. Given that the dominant process is probably distance from habitat patch, it does not make a lot of sense to look at this in isolation of distance from habitat. Further, given the minimal differences in ants between patches, there is limited reason to expect to see any effects here.

Question 4

  1. Is there a relationship between time taken for first colonisation and the distance to natural habitat? (PTWL habitat patch data file and see attached email from Will)
Show the code
dat_patch_Q4 <- dat_patch %>% 
  dplyr::select(Distance_PTWL_habitat, Year_first_detection) %>% 
  na.omit(.)
Show the code
Fig_2 <- dat_patch_Q4 %>% 
  ggplot()+
  geom_point(aes(x = Distance_PTWL_habitat,
                 y = Year_first_detection)) + 
  geom_smooth(aes(x = Distance_PTWL_habitat,
                 y = Year_first_detection), method = "lm", color = "black") + 
  labs(x = "Distance to nearest PTWL habitat (m)",
       y = "Years to first detection") + 
  theme_bw() + 
  theme(axis.title = element_text(size = 14),
        axis.text = element_text(size = 12, color = "black"))

Fig_2

Show the code
Mod6 <- lm(Year_first_detection ~ Distance_PTWL_habitat, data = dat_patch_Q4)
Mod6a <-lm(Year_first_detection ~ 1, data = dat_patch_Q4)

summary(Mod6)

Call:
lm(formula = Year_first_detection ~ Distance_PTWL_habitat, data = dat_patch_Q4)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.6239 -0.5020 -0.2508  0.2238  1.5331 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)  
(Intercept)           0.979494   0.527151   1.858    0.100  
Distance_PTWL_habitat 0.011018   0.003407   3.234    0.012 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7542 on 8 degrees of freedom
Multiple R-squared:  0.5666,    Adjusted R-squared:  0.5125 
F-statistic: 10.46 on 1 and 8 DF,  p-value: 0.01198
Show the code
anova(Mod6a, Mod6)
Analysis of Variance Table

Model 1: Year_first_detection ~ 1
Model 2: Year_first_detection ~ Distance_PTWL_habitat
  Res.Df     RSS Df Sum of Sq      F  Pr(>F)  
1      9 10.5000                              
2      8  4.5502  1    5.9498 10.461 0.01198 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
##Model checking with DHARMa
#plotResiduals(Mod6)
#all fine

#Probably should treat as a count process (not continuous?)

R2 = 0.5666476

Show the code
sjPlot::tab_model(Mod6)
  Year_first_detection
Predictors Estimates CI p
(Intercept) 0.98 -0.24 – 2.20 0.100
Distance PTWL habitat 0.01 0.00 – 0.02 0.012
Observations 10
R2 / R2 adjusted 0.567 / 0.512

Conclusion: There is clearly a strong linear relationship between distance to nearest habitat and time time taken to colonise. The R2 is good (0.57).

Question 5

  1. Is there relationship between number of PTWL in 2021 per island and distance to known habitat. (PTWL habitat patch data file)
Show the code
dat_Q3 %>% 
  ggplot() + 
  geom_point(aes(x = Distance, y = N_PTWL)) +
  geom_smooth(aes(x = Distance, y = N_PTWL), method = "lm") + 
  labs(x = "Distance from nearest PTWL habitat (m)",
       y = "Number of PTWL detected (n)") + 
  theme_bw()

Show the code
#Note this figure non-sensical as treats data as continuous
Show the code
#Test effect of ants and distance on PTWL

Mod4 <- glm(N_PTWL ~ Distance + Ants1, family = "poisson", data = dat_Q3)
summary(Mod4)

Call:
glm(formula = N_PTWL ~ Distance + Ants1, family = "poisson", 
    data = dat_Q3)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  4.035784   0.541511   7.453 9.14e-14 ***
Distance    -0.011774   0.001882  -6.255 3.97e-10 ***
Ants1       -0.012051   0.008162  -1.476     0.14    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 92.022  on 10  degrees of freedom
Residual deviance: 45.914  on  8  degrees of freedom
AIC: 83.236

Number of Fisher Scoring iterations: 5
Show the code
#Ants is not significant

#Mod5 <- glm(N_PTWL ~ Distance, family = "poisson", data = dat_Q3)
#summary(Mod5)
#anova(Mod5)

#Distance may need to be standardised

#testResiduals(Mod5)
#hmm, zero-inflated? Fit ZIP model using GLMMTMB

Mod5_0 <- glmmTMB(N_PTWL ~ 1, family = nbinom1,  data = dat_Q3)

Mod5a <- glmmTMB(N_PTWL ~ Distance, family = nbinom1,  data = dat_Q3)
summary(Mod5a)
 Family: nbinom1  ( log )
Formula:          N_PTWL ~ Distance
Data: dat_Q3

      AIC       BIC    logLik -2*log(L)  df.resid 
     64.2      65.4     -29.1      58.2         8 


Dispersion parameter for nbinom1 family (): 4.48 

Conditional model:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.274386   0.443688    7.38 1.58e-13 ***
Distance    -0.011637   0.003865   -3.01  0.00261 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
anova(Mod5_0, Mod5a)
Data: dat_Q3
Models:
Mod5_0: N_PTWL ~ 1, zi=~0, disp=~1
Mod5a: N_PTWL ~ Distance, zi=~0, disp=~1
       Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)   
Mod5_0  2 69.295 70.091 -32.647   65.295                            
Mod5a   3 64.226 65.419 -29.113   58.226 7.0692      1   0.007842 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Show the code
testResiduals(Mod5a)

$uniformity

    Exact one-sample Kolmogorov-Smirnov test

data:  simulationOutput$scaledResiduals
D = 0.14866, p-value = 0.9393
alternative hypothesis: two-sided


$dispersion

    DHARMa nonparametric dispersion test via sd of residuals fitted vs.
    simulated

data:  simulationOutput
dispersion = 0.90855, p-value = 0.888
alternative hypothesis: two.sided


$outliers

    DHARMa bootstrapped outlier test

data:  simulationOutput
outliers at both margin(s) = 0, observations = 11, p-value = 1
alternative hypothesis: two.sided
 percent confidence interval:
 0.00000000 0.09090909
sample estimates:
outlier frequency (expected: 0.00909090909090909 ) 
                                                 0 
Show the code
#Great result. All fine

#simulate(Mod5a, nsim = 10)

my_seq <- data.frame(seq(1,250, by = 1))
colnames(my_seq) <- c("Distance")

#predict(Mod5a, data = dat_Q3)

Pred1 <- predict(Mod5a, newdata = my_seq, type = "response", 
      se.fit = TRUE, cov.fit = FALSE)

Pred1 <-do.call(cbind.data.frame, Pred1)
Pred1$Distance <- my_seq$Distance
Pred1$se_lower <- Pred1$fit-Pred1$se.fit
Pred1$se_upper <- Pred1$fit+Pred1$se.fit

Fig_3 <- Pred1 %>% 
  ggplot() + 
  geom_line(aes(x = Distance, y = fit), color = "black") + 
  geom_ribbon(aes(x = Distance, ymin = se_lower, ymax = se_upper), alpha = 0.5) + 
  labs(x = "Distance from nearest PTWL habitat (m)", y = "Number of PTWL (N) ± SE") + 
  theme_bw() + 
  theme(axis.title = element_text(size = 14),
        axis.text = element_text(size = 12, color = "black"))
Fig_3

Modelled relationship between distance and number of PTWL

Show the code
sjPlot::tab_model(Mod5a)
Random effect variances not available. Returned R2 does not account for random effects.
  N_PTWL
Predictors Incidence Rate Ratios CI p
(Intercept) 26.43 11.08 – 63.05 <0.001
Distance 0.99 0.98 – 1.00 0.003
Observations 11
R2 conditional / R2 marginal NA / 0.728

Conclusion: Clear relationship between number of PTWL and distance to nearest habitat, with more PTWL detected at sites closer to existing habitat.

Question 6

  1. Is there a relationship between number of PTWL in 2021 per island and occurrence of island in or outside former pine plantation? (PTWL habitat patch data file)

To do this: 1. Drop the control sites 2. Code treatment sites by previous pine plantation and non-pine plantation 3. Run regression with Treatment site (Pine/Non-pine) and distance from habitat 4. Check for significant effect of treatment

Show the code
dat_Q6 <- dat_Q3 %>% 
  dplyr::select(N_PTWL, Site, Distance) %>% 
  mutate(Treatment = case_when(
          Site == "6" ~ "Pine",
          Site == "7" ~ "Pine",
          Site == "8" ~ "Pine",
          Site == "9" ~ "Pine",
          Site == "10" ~ "Pine",
          Site == "11" ~ "Pine",
                               .default = "Non-pine"))
Show the code
Mod7 <- glm(N_PTWL ~ Treatment * Distance, family = "poisson", data = dat_Q6)
summary(Mod7)

Call:
glm(formula = N_PTWL ~ Treatment * Distance, family = "poisson", 
    data = dat_Q6)

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)    
(Intercept)             3.326960   0.228058  14.588  < 2e-16 ***
TreatmentPine          -2.392699   0.920410  -2.600 0.009333 ** 
Distance               -0.008227   0.002423  -3.396 0.000685 ***
TreatmentPine:Distance  0.005717   0.005632   1.015 0.309999    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 92.022  on 10  degrees of freedom
Residual deviance: 22.254  on  7  degrees of freedom
AIC: 61.577

Number of Fisher Scoring iterations: 5
Show the code
#Interaction not significant

Mod8_0 <- glm(N_PTWL ~ 1, family = "poisson", data = dat_Q6)

Mod8 <- glm(N_PTWL ~ Treatment + Distance, family = "poisson", data = dat_Q6)
summary(Mod8)

Call:
glm(formula = N_PTWL ~ Treatment + Distance, family = "poisson", 
    data = dat_Q6)

Coefficients:
               Estimate Std. Error z value Pr(>|z|)    
(Intercept)    3.239786   0.214476  15.106  < 2e-16 ***
TreatmentPine -1.588140   0.365243  -4.348 1.37e-05 ***
Distance      -0.007146   0.002140  -3.340 0.000838 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 92.022  on 10  degrees of freedom
Residual deviance: 23.316  on  8  degrees of freedom
AIC: 60.638

Number of Fisher Scoring iterations: 5
Show the code
anova(Mod8_0, Mod8)
Analysis of Deviance Table

Model 1: N_PTWL ~ 1
Model 2: N_PTWL ~ Treatment + Distance
  Resid. Df Resid. Dev Df Deviance
1        10     92.022            
2         8     23.316  2   68.707
Show the code
#testResiduals(Mod8)
#Spot on

R2 = 0.3804308

Show the code
#Plot

Fig_4 <- plot_model(Mod8, type = "pred", terms = c("Distance", "Treatment"), colors = "bw") + 
  labs(x = "Distance to nearest PTWL habitat (m)",
       y = "Predicted count of PTWL (n) ± SE",
       title = "") +
  theme_bw() + 
  theme(axis.title = element_text(size = 14),
        axis.text = element_text(color = "black", size = 12))

Fig_4

Show the code
sjPlot::tab_model(Mod8)
  N_PTWL
Predictors Incidence Rate Ratios CI p
(Intercept) 25.53 16.51 – 38.33 <0.001
Treatment [Pine] 0.20 0.09 – 0.40 <0.001
Distance 0.99 0.99 – 1.00 0.001
Observations 11
R2 Nagelkerke 0.998

Conclusion: There is a significant effect of whether the island sites were previously pine plantation or not. However, there is no significant interaction effect between previous land use and distance to nearest habitat.

Overall it is evident that the approach to place ‘islands’ of habitat to facilitate connectivity for PTWL as a restoration technique is valid. Over 10 years, PTWL successfully bridged a 250m habitat connectivity gap by utilising ‘islands’ of habitat that had been artificially created. The characteristics of the islands (in terms of key prey species, ants) were not meaningfully different from natural habitats, or differed meaningfully between past- pine forest or non-pine forest land.

It is evident that the number of PTWL in ex pine-forest is lower than non-pine forest habitats. As the cause is not likely to be related to the island habitats, it is plausible that other traits of the environment differ significantly. For example, perhaps predation rates are greater in ex pine forest habitat, or there are other landscape traits that disuade dispersal by PTWL.

Write-up for publication

Statistical analysis

Ant populations among habitat island patches

The number of ants under rocks (being a ranked categorical variable, with ant counts being recorded as one of either 0, 1-10, 11-50, or > 50) was analysed with ordinal logistic regression models with proportional odds. Ant eggs, ant larvae and ant burrows were recorded as presence/absence, and as such were analysed with binomial logistic regression. A model with a 14-level factor (Individual patches, number 1 - 11, and three control patches) were included as an explanatory variable to examine variation between patches. Post-hoc Tukey tests were applied to examine individual differences between patches in each model, respectively.

Ant populations and prior land use

As there were minimal differences between patches, data were combined patches with identical prior land use (Control plots, ex-pine forest plantation, non-pine forest plantation) to explore prior land use history on ant colonies. To achieve this, a second model was developed in which patches were combined based on their historical land use to form a 3 level factor (Control, Ex-pine forest, Non-pine forest). Post-hoc Tukey tests were applied to examine individual differences between patches and land use categories in each model, respectively.

PTWL data

A preliminary examination found no significant relationships between any of the ant metrics, and PTWL counts. As such, the analysis focused on exploring the primary effects of time and distance on PTWL. The number of years to first detection of PTWL in a patch, and the linear distance to nearest known PTWL habitat was examined by linear regression, with number of years as the response variable, and distance to nearest known PTWL habitat as the explanatory variable. To examine if distance had an effect on the number of PTWL detected, a generalized linear model (family = negative binomial) with distance to nearest known PTWL habitat included as an explanatory factor. A poisson distribution was not employed due to over-dispersion in the model residuals. Finally, to consider if there were effects of prior land use on number of PTWL detected, a generalised linear model (with poisson distribution) was fitted with both distance to nearest known PTWL habitat and prior land use of the island patch. The interaction effect between distance and prior land use was not significant, so the interaction term was removed.

R packages

All analyses were undertaken in R version 4.3.3. Ordinal logistic regression was undertaken using the R package Ordinal (version 2022.11-16). Binomial logistic regression was performed with the R function ‘glm’. Negative binomial generalised linear models were performed in glmmTMB (version 1.1.11). Post-hoc tests were performed using emmeans (version 1.8.0) and multcomp (version 1.4-20). Model diagnostics were examined using the R packages DHARMa (version 0.4.7) and sjPlot (version 2.8.15). Plots were produced using ggplot2 (version 3.4.3) and sjPlot. Model pseudo-R2 values were calculated for glm models using the formula: R2 = 1 - (Residual Deviance/Null Deviance).

Results

Ant populations by patch

Overall there were minor and inconsistent responses between the 4 metrics of ant populations and the island patches (Figure 1; Table 1). In all cases, models were significant but explained a small proportion of the variation (as indicated by R2 values). Tukey post-hoc tests provided limited evidence of significant difference between patches (Table 1).

With regards to the number of ants, there were significant differences between patches. However, these were relatively minor. No patches were found to be completely unique and differ significantly from all other patches. Indeed, 9 of the 11 island patches did not differ significantly from each other, as revealed by a Tukey post-hoc test (Table 1).

The presence of ant eggs also varied significantly, however the variation between patches was not marked. 8 of 11 island patches did not differ significantly from each other as revealed by a Tukey post-hoc test. Control site #2 differed significantly from Control site #3, but did not differ from Control site #1. Likewise Control site #1 did not differ significantly from the other two control sites (Table 1).

Ant larvae did differ significantly between patches, however the difference appears to be largely driven by a single patch (patch #5) (Table 1).

Likewise, there were minimal differences between the presence of ant burrows between patches. While there was a significant effect, a total of 9 of 11 patches were found not to differ significantly from eachother as revealed by a Tukey post-hoc test. In all, Patch #5 was the most different from other patches, consistently between the 4 ant colony metrics (Figure 1, Table 1).

Show the code
Fig_1

Figure 1: Percentage of rocks in each patch under which were ants present (A), ant eggs (B), ant larvae (C) and ant burrows (D). Ants present is present as 4 categories (0, 1-10, 11-50, > 50, darkest colour to lightest colour). Ant eggs, ant larvae and ant burrows are presented as present (white) or absent (black).

Ant populations by historical land use

The number of ants varied significantly between control patches, and the 11 island patches examined in this study (Chisq (1) = 11.336, P < 0.001). When split into ex-pine plantation and non-pine plantation, there was significant difference in ant numbers between ex-pine plantations, and non-pine plantations and control sites (Chisq (2) = 55.87, P < 0.001). The number of ants in the ex-pine plantation differed significantly from the control and non-pine plantation patches, however, overall model explanatory power was low (R2 = 0.043, Table 2).

The presence of ant eggs also varied significantly between control patches, and the 11 island patches examined in this study (TEST) . However, when broken down further into ex-pine plantation and non-pine plantation land use, it was found that patches in the non-pine plantation varied significantly from the control and ex-pine plantation patches, which themselves did not differ significantly (Table 3).

The presence of ant larvae also varied significantly between control patches and the 11 island patches examined in this study (TEST). However, the Tukey post-hoc test revealed that ex-pine plantations and non-pine plantation patches differed significantly, neither was significantly different from the control patches (Table 3).

The presence of ant burrows varied significantly between control patches and the 11 island patches examined in this study (TEST). There was no significant difference between the control patches and patches in the non-pine plantation land use, however patches in the ex-pine plantation were significantly different from the other 2 groups (Table 3).

PTWL

There was a significant effect of distance from nearest patch of PTWL habitat, and the number of years to detect PTWL in the island patches (F1,8 = 10.46, P =0.012; Figure 2, Table 4). There was also a strong negative relationship between number of PTWL detected and distance from nearest patch of PTWL habitat (Chisq (1) = 7.07, P = 0.008; Figure 3, Table 5). Finally, there was a significant relationship between number of PTWL detected, and distance and prior land use (Figure 4, Table 6).

Show the code
Fig_2

Figure 2. Relationship betweeen years to first detection of PTWL in an island patch, and distance to nearest known PTWL habitat.

Show the code
Fig_3

Figure 3. Relationship between number of PTWL detected in each island patch and distance to nearest known PTWL habitat. [Not sure much point in presenting this given below figure]

Show the code
Fig_4

Figure 4. Relationship between number of PTWL detected in a patch, and distance to nearest known PTWL habitat. There is a significant difference due to prior land use at the patch.

Table 1. Summary of Tukey HSD groupings between patches with respect to number of ants, presence of ant eggs, presence of ant larvae and presence of ant burrows. Letters denote similarity between patches.

Show the code
knitr::kable(Table_1)
Patch No. Ants Ant eggs Ant larvae Ant burrows
5 a a a a
Control 1 ab bcd a bcd
Control 3 bc ab a ab
1 bcd abcd a bc
4 bcde a a bcd
2 cdef abc a cd
10 def ab a cd
8 def abc a cd
3 def d a cd
7 def cd a cd
6 def abcd a cd
9 def cd a cd
Control 2 ef cd a cd
11 f cd a d

Table 2. Summary of cumulative link model of number of ants in island patches by prior land-use (control, ex-pine plantation and non-pine plantation).

Show the code
sjPlot::tab_model(Mod3)
  Ants_present
Predictors Odds Ratios CI p
0|1 0.91 0.73 – 1.14 0.422
1|2 2.80 2.23 – 3.53 <0.001
2|3 7.25 5.66 – 9.28 <0.001
Treatment2 [Non_pine] 1.01 0.77 – 1.32 0.953
Treatment2 [Pine] 2.11 1.62 – 2.74 <0.001
Observations 1400
R2 Nagelkerke 0.043

Table 3. Model coefficients for generalised linear models of ant eggs, ant larvae and ant burrows in island patches grouped by prior land use (control, ex-pine plantation and non-pine plantation. Tukey post-hoc letters…

Show the code
knitr::kable(Table_3)
Metric Coefficient Estimate Std..Error Pr…z.. Tukey
Ant eggs Treatment2Control -1.208 0.137 0.000 a
Ant eggs Treatment2Non_pine -1.688 0.123 0.000 b
Ant eggs Treatment2Pine -1.153 0.096 0.000 b
Ant larvae Treatment2Control -1.931 0.174 0.000 a
Ant larvae Treatment2Non_pine -2.498 0.169 0.000 b
Ant larvae Treatment2Pine -1.992 0.126 0.000 ab
Ant burrows Treatment2Control 0.107 0.116 0.356 a
Ant burrows Treatment2Non_pine 0.136 0.090 0.129 a
Ant burrows Treatment2Pine 0.928 0.091 0.000 b

Table 4. Summary statistics of linear model of Year to first detection of PTWL and distance to nearest PTWL habitat

Show the code
sjPlot::tab_model(Mod6)
  Year_first_detection
Predictors Estimates CI p
(Intercept) 0.98 -0.24 – 2.20 0.100
Distance PTWL habitat 0.01 0.00 – 0.02 0.012
Observations 10
R2 / R2 adjusted 0.567 / 0.512

Table 5. Summary statistics of generalised linear model of number of PTWL and distance to nearest PTWL habitat

Show the code
sjPlot::tab_model(Mod5a)
Random effect variances not available. Returned R2 does not account for random effects.
  N_PTWL
Predictors Incidence Rate Ratios CI p
(Intercept) 26.43 11.08 – 63.05 <0.001
Distance 0.99 0.98 – 1.00 0.003
Observations 11
R2 conditional / R2 marginal NA / 0.728

Table 6. Summary statistics of generalied linear model of number of PTWL and distance to nearest PTWL habitat and prior land use

Show the code
sjPlot::tab_model(Mod8)
  N_PTWL
Predictors Incidence Rate Ratios CI p
(Intercept) 25.53 16.51 – 38.33 <0.001
Treatment [Pine] 0.20 0.09 – 0.40 <0.001
Distance 0.99 0.99 – 1.00 0.001
Observations 11
R2 Nagelkerke 0.998