See my webpage for more R-related material https://brouwern.wordpress.com/main-page/
This is a very short outline for using the iNEXT and vegan packages to calcualte the chao1 species richness estimate. It also has notes on a difference between iNEXT and vegan’s output, which I believe has to do with the use of a small sample size correction by iNEXT.
library(vegan)
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.4-0
library(iNEXT)
I’ll use the spider data from iNEXT as an example (origianly from Ellison et al. 2010, Sackett et al. 2011, probably a subset of Table 3 in Sackett.).
Note that iNEXT presents data in as vectors in a list, not as a dataframe. Below I work with the data both as a list and as a dataframe. I am not sure if it matters; I thought I had previously had problems with dataframes but I was able to work with dataframes below. I think that if data is in list-vector format there might be issues with zerios, but I am not sure.
data(spider)
spider
## $Girdled
## [1] 46 22 17 15 15 9 8 6 6 4 2 2 2 2 1 1 1 1 1 1 1 1 1
## [24] 1 1 1
##
## $Logged
## [1] 88 22 16 15 13 10 8 8 7 7 7 5 4 4 4 3 3 3 3 2 2 2 2
## [24] 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Let’s say you need to get data into iNEXT yourself. You probably have your data in a dataframe. In what follows, I will take the sample iNEXT::spider data, turn it into a dataframe, then re-format it for iNEXT. I’m going to make up artitrary species designations, and insert some zeros into the girdled data, which has fewer observations than the “Logged” data. NOTE: this DOES NOT reflect the structure of the original data!
Details done matter.
girdled <- data.frame(spp.i = 1:length(spider$Girdled),
Girdled = spider$Girdled)
logged <- data.frame(spp.i = 1:length(spider$Logged),
Logged = spider$Logged)
df <- merge(girdled,logged, by = "spp.i", all = T)
df[is.na(df)] <- 0
Here’s the fake spider dataframe I made. This is probably a common way to have your data set up for an analysis. Note that several species that occurr at the logged site do not show up in the girdled site.
head(df)
## spp.i Girdled Logged
## 1 1 46 88
## 2 2 22 22
## 3 3 17 16
## 4 4 15 15
## 5 5 15 13
## 6 6 9 10
tail(df)
## spp.i Girdled Logged
## 32 32 0 1
## 33 33 0 1
## 34 34 0 1
## 35 35 0 1
## 36 36 0 1
## 37 37 0 1
iNEXT will run on a single column of a dataframe
ChaoRichness(df$Girdled)
## Observed Estimator Est_s.e. 95% Lower 95% Upper
## 1 26 43.893 14.306 30.511 96.971
iNEXT will run aon multiple columns of a dataframe if you specify the columns with the data.
ChaoRichness(df[,c("Girdled","Logged")])
## Observed Estimator Est_s.e. 95% Lower 95% Upper
## Girdled 26 43.893 14.306 30.511 96.971
## Logged 37 61.403 18.532 43.502 128.583
The iNEXT function itself will also work.
i.out <- iNEXT(df[,c("Girdled","Logged")], q=0, datatype="abundance")
library(ggplot2)
ggiNEXT(i.out,)
Run the iNEXT function on the spider data
spider.girdled <- iNEXT(spider$Girdled, q=0, datatype="abundance")
Look at Chao1 estiamte from iNEXt. The output comes in a list, and the Chao1 point estimates are in “AsyEst”, which is think means “asymptotic estimate”.
#Called from results of iNEXT()
spider.girdled$AsyEst[1,]
## Observed Estimator Est_s.e. 95% Lower 95% Upper
## 26.000 43.893 14.306 30.511 96.971
There’s also a function ChaoRichness() that gives it to you directly.
#Called directly with ChaoRichness()
i.next.out <-ChaoRichness(spider$Girdled)
Vegan can calculate “Extrapolted Species Richness in a Speices Pool” using the estimateR() function, which is “based on abundances (counts) on single sample site.”
Chao1 from vegan
vegan.out <-estimateR(spider$Girdled)
The documentation for estiamteR states: “The abundance-based estimates in estimateR use counts (numbers of individuals) of species in a single site. … The two variants of extrapolated richness in estimateR are bias-corrected Chao and ACE (O’Hara 2005, Chiu et al. 2014). The Chao estimate is similar as the bias corrected one above, but a_i refers to the number of species with abundance i instead of number of sites, and the small-sample correction is not used.” (emphasis mine)
For the related vegan::specpool() function vegan gives the following equations
w/ the note “specpool normally uses basic Chao equation, but when there are no doubletons (a2=0) it switches to bias-corrected version. In that case the Chao equation simplifies to S_0 + (N-1)/N * a1*(a1-1)/2."
i.next.out[,1:2]
## Observed Estimator
## 1 26 43.893
round(vegan.out[1:2],1)
## S.obs S.chao1
## 26.0 39.2