Equilibrium Theory of Island Biogeography
Working in pairs, you’ll turn over stones, count the number of species present, and record the size of each stone.
Easy, right?
Do you need to identify the species present?
Be clear about the analysis before you start collecting data.
What kind of variable is each one, and how will I test \(y\sim x\)?
“Random sampling” and “avoiding bias” are drilled into you.
Ensure all stone sizes are evenly (equally) represented in your sample.
Each group to do \(N = 30\) stones
Survey (eyeball) the sizes available (and manageable)
Roughly call them small (S), medium (M) and large (L) and get data for \(N=10\) each, aiming for variation within.
Record number of species and size of the stone (not just S, M, L):
Beyond “it’s correlated,” what do we expect the relationship to look like?
The most common assumption is that the number of species \(S\) follows a power relationship with area \(A\):
\[S = cA^z\]
which would give a straight line in a log-log plot.
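To see why the log-log plot is a straight line, take logarithms of both sides of the power relationship (any base works, as long as it is the same on both sides):

\[\log S = \log c + z\log A\]

This is the equation of a line in \(\log A\), with slope \(z\) and intercept \(\log c\).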
Done for you – I use package readxl (after renaming your sheets A, B, C…)
library(readxl)
dat.A <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "A")
dat.B <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "B")
dat.C <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "C")
dat.D <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "D")
dat.E <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "E")[, 1:5]
dat.F <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "F")
dat.G <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "G")
dat.H <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "H")
dat.I <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "I")
dat.J <- read_xlsx("species-area data 2024 fin.xlsx", sheet = "J")
SpA.data <- rbind(dat.A, dat.B, dat.C, dat.D, dat.E, dat.F, dat.G, dat.H, dat.I, dat.J)
# write out CSV file that you will be using
write.csv(SpA.data, file = "Species-Area class data 2024.csv", row.names = FALSE)
All you need to do is read in the CSV file, which has the aggregated class data in a single ‘table’ (data frame).
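A minimal sketch of the reading step, assuming the CSV sits in your working directory. The helper `read_class_data` is a hypothetical name introduced here for illustration; the only columns it checks for are `n.species` and `area`, which are the two used in the models later on.

```r
# Hypothetical helper (not part of the original script):
# read the class CSV and check that the two columns used later are present.
read_class_data <- function(path) {
  dat <- read.csv(path)
  stopifnot(all(c("n.species", "area") %in% names(dat)))
  dat
}

# Only attempt the read if the file is actually there
csv.file <- "Species-Area class data 2024.csv"
if (file.exists(csv.file)) {
  SpA.data <- read_class_data(csv.file)
  str(SpA.data)   # quick look at column types and number of rows
}
```

Checking the column names up front gives an early, readable error if a sheet was exported with different headers.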
Not too bad… we were hoping for \(N\sim 30\times 10 = 300\) :)
How do our S, M, L classes come out in terms of area?
How do S, M, L compare between groups?
With log scale \(x\)-axis:
\(\log 0\) is undefined: add 1 to all counts.
Recall that rank correlation is unaffected by log transformation: the logarithm is ‘rank-preserving’ i.e. does not change the rank order of the values. The whole point of using a rank correlation is that it does not assume a linear relationship.
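A quick way to convince yourself of this is to compute the Spearman correlation on the raw and on the log-transformed values. The sketch below uses simulated data (not the class data); the power-law parameters and sample size are arbitrary choices for illustration.

```r
# Simulated (not class) data: areas, and species counts drawn from a
# power law with Poisson noise. All parameter values are made up.
set.seed(1)
area <- runif(50, min = 10, max = 1000)
n.species <- rpois(50, lambda = 0.5 * area^0.25)

# Spearman rank correlation on the raw scale ...
rho.raw <- cor(area, n.species, method = "spearman")
# ... and after log-transforming both variables (adding 1 to the counts)
rho.log <- cor(log(area), log(n.species + 1), method = "spearman")

# The two values agree because log() and "+ 1" preserve rank order
all.equal(rho.raw, rho.log)
```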
As an alternative to the Spearman rank correlation test, we can fit a linear model on the log-log scale:
Call:
lm(formula = log(n.species + 1) ~ log(area), data = SpA.data)

Residuals:
     Min       1Q   Median       3Q      Max
-1.40189 -0.30246 -0.00392  0.34005  1.31639

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.62283    0.11206  -5.558 6.53e-08 ***
log(area)    0.25679    0.01984  12.946  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4662 on 271 degrees of freedom
Multiple R-squared: 0.3821, Adjusted R-squared: 0.3798
F-statistic: 167.6 on 1 and 271 DF, p-value: < 2.2e-16
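The fitted coefficients map back onto the power relationship: the slope estimates \(z\) and the intercept estimates \(\log c\). Because the response was \(\log(S + 1)\), the back-transformed curve describes \(S + 1\), so 1 has to be subtracted again. A rough sketch, with the coefficients copied from the output above and a hypothetical stone area of 100 (in whatever units the class data use):

```r
# Coefficients copied from the lm() summary above
b0 <- -0.62283   # intercept, estimates log(c)
z  <-  0.25679   # slope, estimates the exponent z

# Back-transform the intercept to the multiplicative constant c
c.hat <- exp(b0)

# Fitted value for a stone of area A (A = 100 is a hypothetical example).
# The model was fitted to log(S + 1), so subtract the 1 at the end.
A <- 100
S.hat <- c.hat * A^z - 1

c(c.hat = c.hat, S.hat = S.hat)
```

Note that back-transforming a fit made on the log scale gives something closer to a typical (median-like) value than to the arithmetic mean count.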
Recall that for count data, we shouldn’t really be fitting a Linear Model (LM), because the residuals won’t be normally distributed. This is particularly so if the counts are low (small numbers of species). Count data are usually best modelled using a Poisson distribution, so we could try a Poisson Generalised Linear Model (GLM).
I’ll be fitting \(Y \sim \log X\), i.e., log-transform area but not the species counts.
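It may help to write out what this model implies. Assuming the default log link of the Poisson family, and writing the expected count as \(E[S]\) with intercept \(\beta_0\) and slope \(\beta_1\) (symbols introduced here for illustration), a fit against \(\log_{10} A\) is

\[\log E[S] = \beta_0 + \beta_1 \log_{10} A\]

and since \(\log_{10} A = \ln A / \ln 10\), exponentiating gives

\[E[S] = e^{\beta_0} A^{\beta_1/\ln 10}\]

So the Poisson GLM is still a power relationship \(S = cA^z\), with \(c = e^{\beta_0}\) and \(z = \beta_1/\ln 10\); only the error model differs from the log-log linear model.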
Call:
glm(formula = n.species ~ log10(area), family = poisson(), data = SpA.data)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.8671  -0.9467  -0.1739   0.5052   3.2596

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.34346    0.26994  -8.681   <2e-16 ***
log10(area)  1.09712    0.09794  11.202   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 433.22 on 272 degrees of freedom
Residual deviance: 285.12 on 271 degrees of freedom
AIC: 789.35

Number of Fisher Scoring iterations: 5
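Two quick calculations help when reading this output; the numbers are copied from the summary above. The first converts the \(\log_{10}\) slope into a power-law exponent (assuming the default log link). The second is a common rule-of-thumb check for overdispersion, where a ratio of residual deviance to residual degrees of freedom near 1 suggests the Poisson assumption is not badly violated.

```r
# Slope on log10(area), copied from the glm() summary above
b1 <- 1.09712

# With a log link, the implied power-law exponent is b1 / ln(10)
z.glm <- b1 / log(10)

# Rough overdispersion check: residual deviance / residual df
dispersion <- 285.12 / 271

# Proportion of the null deviance accounted for by log10(area)
dev.explained <- 1 - 285.12 / 433.22

round(c(z.glm = z.glm, dispersion = dispersion, dev.explained = dev.explained), 3)
```

The exponent implied by the GLM (about 0.48) is not directly comparable to the slope of the linear model above, because that model was fitted to \(\log(S + 1)\) rather than to the counts themselves.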
The line shows the predictions from the Poisson model (GLM). Looks reasonable…
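One way such a prediction line can be produced is sketched below. It uses simulated data rather than the class data, and the object names `sim`, `fit` and `new` are made up for this example; the plotting choices are likewise an assumption, not necessarily how the figure here was drawn.

```r
# Simulated stand-in for the class data (parameter values are arbitrary)
set.seed(42)
sim <- data.frame(area = exp(runif(100, log(10), log(2000))))
sim$n.species <- rpois(100, lambda = 0.1 * sim$area^0.45)

# Poisson GLM of counts on log10(area), as in the model above
fit <- glm(n.species ~ log10(area), family = poisson(), data = sim)

# Predict on a grid of areas, evenly spaced on the log scale
new <- data.frame(area = exp(seq(log(10), log(2000), length.out = 50)))
# type = "response" returns expected counts rather than log-scale values
new$pred <- predict(fit, newdata = new, type = "response")

# Scatter plot with a log-scale x-axis, plus the fitted curve
plot(n.species ~ area, data = sim, log = "x")
lines(pred ~ area, data = new)
```

Using `type = "response"` matters here: the default for `predict()` on a GLM is the link scale, which for a Poisson model would be the log of the expected count.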