1. Context

1.1 Summary

On this webpage we will look on how sand mining in Stradbroke Island has affected native ants, and we will (statitiscally) explore if rehabilation of mine sties has had any effect on the ants’ abundance and diversity

Ant in the sand “Sand mining”

Figure 1. Ants vs Sand Minining, Who will win? Only time and stats will tell if Sand Mine Rehabilitation is Effective for increasing native ant diversity and abundance

1.2 Sand Mining in Stradbroke

Stradbroke Island, also known as Minjerribah, is located near the Moreton Bay, just 30 km away southeast of Brisbane CBD. Sand Mining on Stradbroke began in the 1950’s. Concerns about environmental impacts began a decade later, with cases of dredge mining. This type of mining levels high sand dunes and stripping native vegetation. Due to the high economic importance of minerals found in Stradbroke’s sands, like Silica, Rutile, Zircon, Ilimenite, sand mining kept going.

Silbelco Australia operated 3 mine sites, including the Enterprise Mine, which operated close to Ramsar-Listed wetlands but had a small buffer zone. After several environmental inicidents and legal disputes, Tte Enterpise Mine closed in 2019. Rehabilitation of the area is still underway.

Figure 2. Location of Stradbroke Ilsand and Beachside Scenery of Stradbroke Figure 2. Location of Stradbroke Ilsand and Beachside Scenery of Stradbroke

Figure 2. Location of Stradbroke Island and Beachside Scenery of Stradbroke

2. Analysis

To measure the efficacy of mine rehabilitation efforts, students analyze data collected from the sites.

2.1 Data Collection

Students of the Masters of Conservation Biology program traveled to Stradbroke Island to survey ant fauna and their response towards mine rehabilitation.

Each site is classified according to the year in which the rehabilitation happened, There are four distinct possibilities for the year of rehabilitation. At each of these possibilities, three representative sites are randomly chosen by the students, and paired to a nearby control site that has never been sandmined. That makes six sites at each “year of rehabilitation”.

Ants are collected using pitfall traps (See Figure 3). Each study site contains 10 traps, separated by a meter from each other. Ants are collected after one night of trapping.

Figure 3. Basic architecture of insect pitfalltrap (c) Bedfordshire Natural History Society

2.2 Species Abundance and Diversity

2.2.1 Species Overview

Fisrt lets look on how each species is represented in control and rehabilitation sites.

library("viridis")
## Loading required package: viridisLite
par(mar=c(11,5,2,2))
Xlocations <- barplot(Abundance_at_Control_Sites, beside=T, names.arg=c(names(Species_Names)), ylab="Abundance", xlab="", main="Graph1: Abundance of Ants - Control Sites", ylim=c(0,1500), las=2, cex.names=0.8, col="magma" (5))    
text("<-Nylanderia obscura", y=300, x=33)
text("<-Aphaenogaster longiceps", y=1400, x=12)

Xlocations <- barplot(Abundance_at_Mine_Sites, beside=T, names.arg=c(names(Species_Names)), ylab="Abundance", xlab="", main="Graph 2: Abundance of Ants - Mine Sites", ylim=c(0,1500), las=2, cex.names=0.8, col="magma"(5))  
text("<-Nylanderia obscura", y=1400, x=33)
text("<-Aphaenogaster longiceps", y=300, x=12)

In both of these graphs we can see that Aphaenogaster longiceps and Nylanderia obscura are amongst the most common abundant species across all sites. Aphaenogaster longiceps is more common in control sites while Nylanderia obscura is more common in the Mine Sites. The “overabundance” of Aphaenogaster longiceps could be explained by ant interspecific competition (Majer, 1985). This hints that either this two species are naturally the most common ones, but it could aslo suggest that the effects of rehabilitation are not signficant to restoring ant diversity.

2.2.1.1 Aphaenogaster longiceps and Nylanderia obscura

For our two most common species we can see that the majority of observations lean towards zero, hinting that despite their supposed abundance, the records for species is quite small.

2.2.2 Ant diversity vs Abundance

Lets take a closer look on how Species diversity compares to abundance

Ants<- read.csv("Ants.txt")
attach(Ants)
## The following objects are masked from Ants_2015_2019:
## 
##     Rehab_Year, Rep, Rep_Year, Samples, Site, Treatment
library("viridis")
par(mar=c(5.1, 6.1, 4.1, 4.1))
plot(Abundance,Diversity,main="Abundance vs Diversity",font=3,cex=1,cex.lab=1,pch=19,col=magma(50))

In this graph we can observe that Ant Diversity is clusterred towards lower Abundance. While this could be attributed to interspecific competition (Majer, 1985), we would need to transform the data to have a bettter understanding of the different effects (Site, Rehabilitation Year etc) are doing to Ant Diversity.

2.2.2.2 QQNORM and Residuals

Here we will use scatter plots to see if the assumptions of our F-test have been violated.

Ants<- read.csv("Ants.txt")  
attach(Ants) # make 'Ants' the default data set 
library("viridis")
fit1 <- lm(Abundance ~ Diversity)  # fit the regression line of best fit through the data 
plot(fit1$fitted.values,fit1$residuals, col=magma(5))

qqnorm(fit1$residuals, col=magma(500))

Here we can observe that the dots on “qqnomr” plot do not form a straight line. This is telling us that are residuals are not gaussian, indicating that there is no statistical association between diversity and abundance.

2.2.3 Transforming the data

Here we wil transform our Abundance data with “Log-Tansformation”. Since cannot calculate “Log 0” We add the value of +1 incase any of our exisitng values are zero . Let’s have a look at our new graph…

fit2 <- (log(Abundance+1) ~ Diversity)
plot(fit2,bty="n",main="Log Transformed Abundance vs Diversity",font=3,cex=1,cex.lab=1,pch=19,col=magma(50))

It seems that diversity and Abundance are positively associated with each other, but the graph does not gives us concrete evidence of such (maybe a small P value will confirm our theory)

2.2.3 .1 QQNORM AND RESIDUALS

Now lets do some diagnostics…

fit2 <- lm(log(Abundance+1) ~ Diversity) # try log-transforming 'Abundance' 
plot(fit2$fitted.values,fit2$residuals, col=magma(500))

qqnorm(fit2$residuals, col=magma(500))

Again our qqplot does not form a straight line, indicating that our residuals are not gaussian. Our log-transformed graph at the beginning of this section is indicating us a possible association between Species and Diversity, we will now use an ANOVA to verify this. However, the scatter of our QQPLOT shows us that we need to be careful and we can only take P-Values that are not borderline (i.e. close to 0.05)

Ants<- read.csv("Ants.txt")  
attach(Ants) # make 'Ants' the default data set
anova(fit2)
## Analysis of Variance Table
## 
## Response: log(Abundance + 1)
##            Df Sum Sq Mean Sq F value    Pr(>F)    
## Diversity   1  96.06  96.057  83.975 < 2.2e-16 ***
## Residuals 789 902.52   1.144                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(fit2)
## 
## Call:
## lm(formula = log(Abundance + 1) ~ Diversity)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.0270 -1.0270 -0.3338  0.5825  5.3497 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.02698    0.04413  23.274   <2e-16 ***
## Diversity    1.01184    0.11042   9.164   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.07 on 789 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.09619,    Adjusted R-squared:  0.09505 
## F-statistic: 83.97 on 1 and 789 DF,  p-value: < 2.2e-16

If we look at our Pvalue and our regression coefficient, we can observe that Ant Abundance and Diversity have a strong positive association. Even tho we used a log-transformation on Abundance, we can assume sand mining has the same effect because the “Abdunadce” and “logAbundance+1” have a one to one correspondance.

2.3 Do rehabilitation efforts help?

If we plot the results from our last Anova, we have a an overlap between the abundance of ants in control and rehabilitated site…

library("plotrix")
## Warning: package 'plotrix' was built under R version 4.1.1
Ants<- read.csv("Ants.txt")  
attach(Ants) # make 'Ants' the default data set
 groups <- split(log(Abundance+1),Treatment) # split data into "Treatment" groups 
n <- sapply(groups,length) # calculate the sample size of each group 
MSE <- deviance(fit2)/fit2$df.residual # calculate the residual error variance 
se <- sqrt(MSE/n) # convert it to a standard error 
t <- qt(0.975,df=fit2$df.residual) # use the t-distribution to define a 95% CI 
size = t*se # set the size of each error bar 
means <- sapply(groups ,mean, na.rm=TRUE) # calculate a mean at each level of Treatment 
xloc = seq(1:length(groups)) # set locations on X-axis for the two treatments 
    plotCI( xloc, means, uiw=size, gap=0.1, xlim=c(min(xloc)-0.5,max(xloc)+0.5), 
           ylab=list("log(Abundance+1)",cex=1.75), xlab=list("Treatment",cex=1.75), xaxt="n") 
    # plot means and error bars 
    axis(side=1, at=xloc, labels=names(groups), cex.axis=1.5) # label the x-axis 

This is telling us that Ant Abundance is not changing across sites, perphas hinting that the effect of Rehabilitation Efforts is not helping restore and Abundance. Since we already know that Ant Abundance is positively associated with Diversity, we can assume that changes in Ant abundance can reflect changes in Ant Diversity.

2.3.1 ANOVA

In the ANOVA table below we an observe that there is a highly significant interaction between the treatment and year of rehabilitation (P= 2.870e-05) .

Ants<- read.csv("Ants.txt")  
attach(Ants) # make 'Ants' the default data set 
full_model <- lm(log(Abundance+1) ~ Treatment * Rep_Year * Rehab_Year )
anova(full_model)
## Analysis of Variance Table
## 
## Response: log(Abundance + 1)
##                                Df Sum Sq Mean Sq F value    Pr(>F)    
## Treatment                       1   1.46   1.460  1.3883  0.239054    
## Rep_Year                        2  27.23  13.614 12.9456 2.955e-06 ***
## Rehab_Year                      3  16.63   5.545  5.2724  0.001327 ** 
## Treatment:Rep_Year              2  22.30  11.150 10.6024 2.870e-05 ***
## Treatment:Rehab_Year            3  95.60  31.867 30.3026 < 2.2e-16 ***
## Rep_Year:Rehab_Year             6  14.64   2.441  2.3209  0.031521 *  
## Treatment:Rep_Year:Rehab_Year   6  14.12   2.353  2.2375  0.037897 *  
## Residuals                     767 806.60   1.052                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Since Rehabilitation_Year was had a strong effect on abundance, lets see how that relationship looks in a graph

Ants<- read.csv("Ants.txt")  
attach(Ants) # make 'Ants' the default data set
library(plotrix)
full_model <- lm(log(Abundance+1) ~ Treatment * Rep_Year * Rehab_Year )
groups <- split(log(Abundance+1) , paste(Rehab_Year, Treatment)) # split data into groups 
n <- sapply(groups,length)              # calculate a sample size within each group 
means <- sapply(groups ,mean,na.rm=TRUE)  # calculate a mean  within each group 
MSE <- deviance(full_model)/df.residual(full_model) # best estimate of the residual variance 
se <- sqrt(MSE/n)           # calculate the standard error of each mean 
t <- qt(0.975,df=full_model$df.residual)    # define 95% CI using t-distribution 
width =  t*se               # set the width of each error bar 
xloc <- seq(1:length(groups))       # X-axis locations 
plotCI(xloc, means, uiw=width, xaxt="n",xlim=c(0.5,max(xloc)+0.5),ylab=list("mean log(Abundance+1)",cex=1.25),,cex=1.5,
       xlab=list("Rehabilitation Year",cex=1.75),col=c("blue","purple")) 
axis(side=1, at=c(1.5,3.5,5.5,7.5), labels=unique(Rehab_Year), cex.axis=1.5) # add X-axis labels 
legend("topleft",legend=unique(Treatment),pch=1,col=c("blue","purple"),cex=1.5 ) 
for (i in seq(1,length(groups),2)){ 
  segments(xloc[i],means[i],xloc[i+1],means[i+1],lty="dashed",lwd=1.2) } 

This is telling us that Ant Abundance is strongly affected by Treatment and Rehabilitation Year. This means that the longer the period has been rehabilitated, the more Ants we can observe, This is also telling us that Rehabilitation Alone has no effect and the best "Rehabilitation: is to give the Ants time to recolonize.

If we want to sure that this statistical relationship is true, we need to consider Sites into the analysis. We havent compared if variance between sites has caused changes in our P-Values. Perhaps by analyzing our data using mixed models, we can state that “Sites” are a “random explanatory variable”.

Let’s get started (trust me is that the last bit of hard code, congrats on making it this far!)

Ants<- read.csv("Ants.txt")  
attach(Ants) # make 'Ants' the default data set 
library(lmerTest)
## Warning: package 'lmerTest' was built under R version 4.1.1
## Warning: package 'lme4' was built under R version 4.1.1
mixed_model <- lmer(log(Abundance+1) ~ Treatment * Rehab_Year * Rep_Year + (1|Site) , data=Ants ) 
anova(mixed_model)
## Type III Analysis of Variance Table with Satterthwaite's method
##                                Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)
## Treatment                      0.2015  0.2015     1  15.98  0.2150   0.64917
## Rehab_Year                     2.4200  0.8067     3  15.98  0.8603   0.48172
## Rep_Year                      26.9714 13.4857     2 750.99 14.3830 7.417e-07
## Treatment:Rehab_Year          13.9610  4.6537     3  15.98  4.9633   0.01271
## Treatment:Rep_Year            22.6802 11.3401     2 750.99 12.0946 6.764e-06
## Rehab_Year:Rep_Year           14.7683  2.4614     6 750.99  2.6252   0.01589
## Treatment:Rehab_Year:Rep_Year 14.1676  2.3613     6 750.99  2.5184   0.02026
##                                  
## Treatment                        
## Rehab_Year                       
## Rep_Year                      ***
## Treatment:Rehab_Year          *  
## Treatment:Rep_Year            ***
## Rehab_Year:Rep_Year           *  
## Treatment:Rehab_Year:Rep_Year *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Now lets look at the residuals and their coefficients

summary(mixed_model)$coefficients  
##                                                 Estimate Std. Error       df
## (Intercept)                                   1.55197342  0.2896300  26.6036
## TreatmentMine                                -1.15127520  0.4095986  26.6036
## Rehab_YearY_1980                             -0.23097124  0.4095986  26.6036
## Rehab_YearY_1996                             -0.18529235  0.4095986  26.6036
## Rehab_YearY_1997                             -0.52041120  0.4095986  26.6036
## Rep_YearR2016                                -0.51581115  0.2383803 750.9804
## Rep_YearR2017                                 0.21663969  0.2383803 750.9804
## TreatmentMine:Rehab_YearY_1980                0.72763685  0.5792599  26.6036
## TreatmentMine:Rehab_YearY_1996                0.82506659  0.5792599  26.6036
## TreatmentMine:Rehab_YearY_1997                1.24534357  0.5792599  26.6036
## TreatmentMine:Rep_YearR2016                   0.85668872  0.3371206 750.9804
## TreatmentMine:Rep_YearR2017                   0.12571638  0.3371206 750.9804
## Rehab_YearY_1980:Rep_YearR2016                0.32293538  0.3371206 750.9804
## Rehab_YearY_1996:Rep_YearR2016                0.03271149  0.3371206 750.9804
## Rehab_YearY_1997:Rep_YearR2016                0.15914137  0.3371206 750.9804
## Rehab_YearY_1980:Rep_YearR2017                0.54980275  0.3371206 750.9804
## Rehab_YearY_1996:Rep_YearR2017               -0.30896232  0.3371206 750.9804
## Rehab_YearY_1997:Rep_YearR2017               -0.07270681  0.3371206 750.9804
## TreatmentMine:Rehab_YearY_1980:Rep_YearR2016 -0.84575391  0.4767606 750.9804
## TreatmentMine:Rehab_YearY_1996:Rep_YearR2016  0.08538906  0.4777077 750.9988
## TreatmentMine:Rehab_YearY_1997:Rep_YearR2016  0.60521730  0.4767606 750.9804
## TreatmentMine:Rehab_YearY_1980:Rep_YearR2017 -0.57509931  0.4767606 750.9804
## TreatmentMine:Rehab_YearY_1996:Rep_YearR2017  0.16244479  0.4767606 750.9804
## TreatmentMine:Rehab_YearY_1997:Rep_YearR2017  1.05669878  0.4767606 750.9804
##                                                  t value     Pr(>|t|)
## (Intercept)                                   5.35846967 1.218355e-05
## TreatmentMine                                -2.81073986 9.158600e-03
## Rehab_YearY_1980                             -0.56389650 5.775478e-01
## Rehab_YearY_1996                             -0.45237541 6.546659e-01
## Rehab_YearY_1997                             -1.27053941 2.148917e-01
## Rep_YearR2016                                -2.16381629 3.079309e-02
## Rep_YearR2017                                 0.90879867 3.637480e-01
## TreatmentMine:Rehab_YearY_1980                1.25614910 2.199815e-01
## TreatmentMine:Rehab_YearY_1996                1.42434601 1.659716e-01
## TreatmentMine:Rehab_YearY_1997                2.14988726 4.082106e-02
## TreatmentMine:Rep_YearR2016                   2.54119337 1.124755e-02
## TreatmentMine:Rep_YearR2017                   0.37291213 7.093191e-01
## Rehab_YearY_1980:Rep_YearR2016                0.95792231 3.384102e-01
## Rehab_YearY_1996:Rep_YearR2016                0.09703199 9.227269e-01
## Rehab_YearY_1997:Rep_YearR2016                0.47206062 6.370207e-01
## Rehab_YearY_1980:Rep_YearR2017                1.63087836 1.033353e-01
## Rehab_YearY_1996:Rep_YearR2017               -0.91647406 3.597125e-01
## Rehab_YearY_1997:Rep_YearR2017               -0.21567002 8.293035e-01
## TreatmentMine:Rehab_YearY_1980:Rep_YearR2016 -1.77395942 7.647500e-02
## TreatmentMine:Rehab_YearY_1996:Rep_YearR2016  0.17874752 8.581842e-01
## TreatmentMine:Rehab_YearY_1997:Rep_YearR2016  1.26943655 2.046787e-01
## TreatmentMine:Rehab_YearY_1980:Rep_YearR2017 -1.20626441 2.280954e-01
## TreatmentMine:Rehab_YearY_1996:Rep_YearR2017  0.34072615 7.334050e-01
## TreatmentMine:Rehab_YearY_1997:Rep_YearR2017  2.21641394 2.696211e-02
summary(mixed_model)$sigma^2 
## [1] 0.9376152

By declaring “Site” as a “random explanatory variable” , we have increased the statistical significance of our variables. Our residual variance has been reduced, and our new model considers that some of the “noise” in the data is due to the “random explanatory variables”. In this case, Site has been treated as an independent level of replication, so our new code has removed “Site” variation from residual variance.

Compared to previous ANOVA Tables, we can see that the interaction between Treatment:Rehabilitation_Year has become more significant (P: 6.764e-06), indicating that even with adjusting “Site” as a r“random explanatory variable”, the interaction between Treatment:Rehabilitation_Year is highly signficant

3. Conclusions

First of all :CONGRATUALTIONS! You were brave enough to follow me in this journey of Ants (but mostly Stats) Now that we’ve made this far…What can we conclude? Firstly that we need a good comprehension of stats. The Ants of Stradie need us to do the hard stats. We can condense the data and prove that the rehabilitation of mine sites hasnt helped our eight-legged friends.

We learnt that we need to be careful into considering different levels of replication. Including “Sites” as random explanatory variable helped us to show a stronger relationship between “Treatment:Rehabilitation_Year” and “LogAntAbundance+1” . Perhaps the rehabilitate has not been implemented equally across all sites, or maybe different ants species respond differently to ecological disturbances (we did see that Aphaenogaster longiceps and Nylanderia obscura were the most common ants ). This indicates that ant ecology and/or behaviour influences diversity and abundance on sites (Majer, 1985)

Where off to next? Maybe we could look at the response from different species across all the sites (perhaps use your most eager student…or the one you hate the most). Maybe if we have data of the Response of each species to rehabilitation efforts (and treatment, and treamtent:rehabilitation_year, etc) we can conclude how effective ant restoration has been in Stradie.

Thank you for time, now go have break, take a walk, have a cuppa and help spread the better use of stats (and Rmarkdown!)

(

via GIPHY

)

References

MAJER, J.D. (1985), Recolonization by ants of rehabilitated mineral sand mines on North Stradbroke Island, Queensland, with particular reference to seed removal. Australian Journal of Ecology, 10: 31-48. https://doi.org/10.1111/j.1442-9993.1985.tb00861.x