1 Packages needed

library(BiodiversityR) # also loads vegan
library(ggplot2)
library(ggsci)
library(readxl)

2 Introduction

The main objective of this document is to give some examples of how data on species accumulation curves, obtained via vegan and BiodiversityR, can be plotted via ggplot2. I expect that researchers and students will modify the scripts given here to create their own graphs. This obviously requires some working knowledge of the ggplot2 package, for which I will not give details here on specific functions, arguments and choices that I selected (as an introduction to ggplot2, I would especially recommend a two-part online workshop given by Thomas Lin Pedersen).

I will also not dive deeply in the theory of species accumulation curves. Most important details and key references are available from Chapter 4 - Analysis of species richness of the Tree Diversity Analysis book that I wrote with Ric Coe. This manual on community ecology and biodiversity analysis can be downloaded for free and is also the recommended citation for BiodiversityR.

Another method by which somebody may become familiar with plotting species accumulation curves via ggplot2 is from the Graphical User Interface of BiodiversityR, available from function BiodiversityRGUI

In the two examples below, the same work flow is used of:

Step 1. Obtain the species accumulation result via function accumresult
Step 2. Extract data for plotting via function accumcomp.long
Step 3. Use the extracted data to generate a plot with ggplot2

The theme for ggplot2 plotting will be modified as follows, resulting in a more empty plotting canvas than the default for ggplot2.

BioR.theme <- theme(
        panel.background = element_blank(),
        panel.border = element_blank(),
        panel.grid = element_blank(),
        axis.line = element_line("gray25"),
        text = element_text(size = 12),
        axis.text = element_text(size = 10, colour = "gray25"),
        axis.title = element_text(size = 14, colour = "gray25"),
        legend.title = element_text(size = 14),
        legend.text = element_text(size = 14),
        legend.key = element_blank())

3 Example 1: Tree loci

Typical for community ecology analysis in vegan and BiodiversityR, example data are included in two data sets, the community data set and the environmental data set. It is expected that the rows of these data sets correspond to the same sample sites.

Data set warcom is a community data set, where variables (columns) typically correspond to different species. For this data set, however, the columns correspond to different loci, but the same general principles for species accumulation curves apply of calculating the pooled number of species (here:loci) for different sampling efforts (here: the number of pooled individual trees).

data(warcom)
str(warcom)

## 'data.frame':    100 obs. of  185 variables:
##  $ locus001: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus002: int  0 1 0 0 0 0 1 0 0 0 ...
##  $ locus003: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus004: int  0 1 1 1 1 1 1 1 0 1 ...
##  $ locus005: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus006: int  0 1 0 0 0 0 0 0 0 0 ...
##  $ locus007: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus008: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus009: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus010: int  0 1 0 1 0 0 1 0 0 1 ...
##  $ locus011: int  0 0 1 0 0 0 0 0 0 0 ...
##  $ locus012: int  1 0 1 1 1 1 1 1 0 0 ...
##  $ locus013: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus014: int  0 1 1 0 0 0 1 0 0 0 ...
##  $ locus015: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus016: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus017: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus018: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus019: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus020: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus021: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus022: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus023: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus024: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus025: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus026: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus027: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus028: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus029: int  0 1 1 0 0 0 1 0 1 0 ...
##  $ locus030: int  0 1 1 1 0 0 1 1 1 1 ...
##  $ locus031: int  0 1 1 0 0 1 1 0 1 1 ...
##  $ locus032: int  0 0 0 0 0 0 1 0 0 0 ...
##  $ locus033: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus034: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus035: int  0 1 0 0 0 0 1 0 0 0 ...
##  $ locus036: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus037: int  0 0 1 0 0 0 0 0 0 0 ...
##  $ locus038: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus039: int  0 0 1 1 1 0 1 0 0 0 ...
##  $ locus040: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus041: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus042: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus043: int  1 1 0 0 0 1 0 1 0 1 ...
##  $ locus044: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus045: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus046: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus047: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus048: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus049: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus050: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus051: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus052: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus053: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus054: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus055: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus056: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus057: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus058: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus059: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus060: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus061: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus062: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus063: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus064: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus065: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus066: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus067: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus068: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus069: int  0 0 0 1 0 0 0 0 0 0 ...
##  $ locus070: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus071: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus072: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus073: int  1 0 1 1 0 1 0 1 1 0 ...
##  $ locus074: int  1 1 1 0 1 1 1 1 1 1 ...
##  $ locus075: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus076: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus077: int  0 1 0 1 1 1 1 0 1 1 ...
##  $ locus078: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus079: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus080: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus081: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus082: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus083: int  1 1 1 0 1 1 1 1 1 1 ...
##  $ locus084: int  0 0 0 1 0 0 0 0 0 0 ...
##  $ locus085: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus086: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus087: int  1 1 1 1 1 1 1 1 1 0 ...
##  $ locus088: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus089: int  1 1 1 1 1 1 1 1 1 0 ...
##  $ locus090: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ locus091: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus092: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ locus093: int  1 1 1 1 1 1 1 1 1 0 ...
##  $ locus094: int  1 1 0 0 0 0 0 0 1 0 ...
##  $ locus095: int  1 1 1 0 1 1 1 1 1 1 ...
##  $ locus096: int  0 0 0 1 0 0 0 0 0 0 ...
##  $ locus097: int  1 1 0 1 0 1 0 1 1 1 ...
##  $ locus098: int  1 0 0 0 1 0 0 0 0 0 ...
##  $ locus099: int  1 1 1 0 0 1 0 1 0 1 ...
##   [list output truncated]

Data set warenv is an environmental data set, where variables (columns) correspond to different descriptors (typically continuous and categorical variables) of the sample sites. One of the variables is Population, a categorical variable that describes from which forest population trees were sampled. The sampling design was balanced with 20 trees sampled from each population.

data(warenv)
summary(warenv)

##     population popshort      country   rift.valley
##  Kibale  :20   KKIT:20   Kenya   :60   east:60    
##  Kitale  :20   KLAI:20   Tanzania:20   west:40    
##  Laikipia:20   KMAR:20   Uganda  :20              
##  Lushoto :20   TLUS:20                            
##  Mara    :20   UKIB:20

3.1 Step 1

Next we obtain the BiodiversityR::accumcomp result. Among the options, an exact algorithm is used to calculate accumulated species richness. Option conditioned=FALSE switches estimation of standard deviation to a formula based on the estimation of extrapolated species richness. The function returns data on species accumulation separately for each different level of the selected factor (here: population).

Accum.1 <- accumcomp(warcom, y=warenv, factor='population', 
    method='exact', conditioned=FALSE, plotit=FALSE)
Accum.1

## , ,  = Sites
## 
##           obs
## population 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
##   Kibale   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
##   Kitale   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
##   Laikipia 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
##   Lushoto  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
##   Mara     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
## 
## , ,  = Richness
## 
##           obs
## population     1        2        3        4        5        6        7        8
##   Kibale   33.45 48.93684 58.82719 66.22559 72.21478 77.28818 81.71249 85.64720
##   Kitale   48.25 60.60000 67.22193 71.84871 75.45788 78.43524 80.97348 83.18608
##   Laikipia 37.90 53.64211 62.92456 69.22229 73.83714 77.39997 80.25867 82.62047
##   Lushoto  32.80 44.23158 50.96316 55.61981 59.10262 61.83775 64.05866 65.90712
##   Mara     37.70 50.14737 57.10702 61.79608 65.29444 68.07113 70.36694 72.32107
##           obs
## population        9       10       11       12        13        14        15
##   Kibale   89.19454 92.42348 95.38272 98.10823 100.62771 102.96334 105.13351
##   Kitale   85.14759 86.91055 88.51362 89.98616  91.35083  92.62531  93.82359
##   Laikipia 84.61668 86.33470 87.83484 89.16002  90.34150  91.40255  92.36087
##   Lushoto  67.47507 68.82459 69.99884 71.02867  71.93680  72.74058  73.45369
##   Mara     74.02111 75.52546 76.87461 78.09751  79.21518  80.24314  81.19285
##           obs
## population        16        17        18     19  20
##   Kibale   107.15397 109.03860 110.80000 112.45 114
##   Kitale    94.95686  96.03421  97.06316  98.05  99
##   Laikipia  93.23013  94.02105  94.74211  95.40  96
##   Lushoto   74.08731  74.65088  75.15263  75.60  76
##   Mara      82.07286  82.88947  83.64737  84.35  85
## 
## , ,  = sd
## 
##           obs
## population        1        2        3        4        5        6        7
##   Kibale   3.161357 3.798334 3.956127 3.971783 3.932587 3.871210 3.802779
##   Kitale   3.840989 3.994426 3.890245 3.752951 3.617703 3.490893 3.373018
##   Laikipia 2.976409 3.334598 3.357741 3.289549 3.190260 3.082316 2.975506
##   Lushoto  2.923180 3.070267 3.002921 2.892519 2.774193 2.659405 2.552739
##   Mara     3.204175 3.439760 3.407811 3.311262 3.198890 3.085490 2.976467
##           obs
## population        8        9       10       11       12       13       14
##   Kibale   3.734879 3.671377 3.614223 3.564396 3.522440 3.488748 3.463717
##   Kitale   3.263691 3.162707 3.070162 2.986387 2.911856 2.847125 2.792786
##   Laikipia 2.874397 2.781211 2.697020 2.622288 2.557145 2.501548 2.455372
##   Lushoto  2.456240 2.370638 2.295874 2.231437 2.176616 2.130673 2.092945
##   Mara     2.874251 2.780172 2.695047 2.619382 2.553463 2.497408 2.451214
##           obs
## population       15       16       17       18       19       20
##   Kibale   3.447817 3.441610 3.445757 3.460994 3.488122 3.527993
##   Kitale   2.749433 2.717641 2.697933 2.690766 2.696509 2.715440
##   Laikipia 2.418474 2.390733 2.372091 2.362573 2.362312 2.371572
##   Lushoto  2.062908 2.040195 2.024607 2.016107 2.014813 2.021004
##   Mara     2.414813 2.388118 2.371068 2.363672 2.366039 2.378402

3.2 Step 2

To plot data via ggplot2, species accumulation data is rendered in the long format via function BiodiversityR::accumcomp.long. The label.freq option modifies the variable labelit that can be used to select how often a symbol will be plotted along the species accumulation curve.

accum.long1 <- accumcomp.long(Accum.1, ci=NA, label.freq=5)
head(accum.long1)

##   Grouping Obs Sites Richness       SD      LWR      UPR labelit
## 1     Mara   1     1 37.70000 3.204175 31.01621 44.38379    TRUE
## 2     Mara   2     2 50.14737 3.439760 42.97215 57.32258   FALSE
## 3     Mara   3     3 57.10702 3.407811 49.99845 64.21559   FALSE
## 4     Mara   4     4 61.79608 3.311262 54.88891 68.70325   FALSE
## 5     Mara   5     5 65.29444 3.198890 58.62167 71.96721   FALSE
## 6     Mara   6     6 68.07113 3.085490 61.63491 74.50735    TRUE

3.3 Step 3

Now we can generate the plot…

plotgg1 <- ggplot(data=accum.long1, aes(x = Sites, y = Richness, ymax = UPR, ymin = LWR)) + 
    scale_x_continuous(expand=c(0, 1), sec.axis = dup_axis(labels=NULL, name=NULL)) +
    scale_y_continuous(sec.axis = dup_axis(labels=NULL, name=NULL)) +
    geom_line(aes(colour=Grouping), size=2) +
    geom_point(data=subset(accum.long1, labelit==TRUE), 
               aes(colour=Grouping, shape=Grouping), size=5) +
    geom_ribbon(aes(colour=Grouping), alpha=0.2, show.legend=FALSE) + 
    BioR.theme +
    scale_colour_npg() +
    labs(x = "Trees", y = "Loci", colour = "Population", shape = "Population")

… and see the results.

plotgg1

As one alternative to plotting, the confidence intervals can be plotted as less opaque colours. Now the fill colour of the confidence intervals is coloured via the after_scale function introduced in ggplot2 version 3.3.0.

plotgg1 <- ggplot(data=accum.long1, aes(x = Sites, y = Richness, ymax = UPR, ymin = LWR)) + 
    scale_x_continuous(expand=c(0, 1), sec.axis = dup_axis(labels=NULL, name=NULL)) +
    scale_y_continuous(sec.axis = dup_axis(labels=NULL, name=NULL)) +
    geom_line(aes(colour=Grouping), size=2) +
    geom_point(data=subset(accum.long1, labelit==TRUE), 
               aes(colour=Grouping, shape=Grouping), size=5) +
    geom_ribbon(aes(colour=Grouping, fill=after_scale(alpha(colour, 0.3))), 
                show.legend=FALSE) + 
    BioR.theme +
    scale_colour_npg() +
    labs(x = "Trees", y = "Loci", colour = "Population", shape = "Population")

… which gives these results:

plotgg1

4 Example 2: Tree inventories from Panama

The second data sets document tree species composition and site characteristics for 1-ha plots sampled in Panama. The data set can be be downloaded from the supplementary data provided by Condit et al. 2002. Beta-diversity in tropical forest trees. Science 295: 666-669. (The same data set is used in a Rpub where I showed how to generate ordination diagrams with vegan, BiodiversityR and ggplot2 ).

As I had some problems with accessing the downloaded Excel file directly with readxl, I saved the file in the .xlsx format on a local directory in my computer.

Condit.down <- "C:\\BiodiversityR_test\\Condit.xlsx"

The following script reads and formats the community data set.

Condit.1 <- as.data.frame(readxl::read_excel(Condit.down, sheet=1))
spec.columns <- as.character(Condit.1[, 1])
Condit.spec <- as.data.frame(t(Condit.1[, -1]))
names(Condit.spec) <- spec.columns

The community data set has 100 sites (rows) and 802 species (columns):

dim(Condit.spec)

## [1] 100 802

Next the environmental data set is read:

Condit.2 <- as.data.frame(readxl::read_excel(Condit.down, sheet=2))

## New names:
## * `` -> ...8

site.rows <- as.character(Condit.2[, 1])
Condit.env <- Condit.2[, 2:7]
rownames(Condit.env) <- site.rows
Condit.env$Age.cat <- factor(Condit.env$age, levels=c("1", "2", "3"))
Condit.env$Area <- rep(10000, nrow(Condit.env))

In the script above, I created two new variables: Age.cat, a categorical variable corresponding to age category; and Area that gives the sampling area in m². The reason for including the second variable was to show how the x-axis can be rescaled to different measures of sampling effort (this would be especially important if these differ among the sites).

summary(Condit.env)

##     EW coord         NS coord           precip          elev      
##  Min.   :600714   Min.   : 962862   Min.   :1887   Min.   : 10.0  
##  1st Qu.:625929   1st Qu.:1011569   1st Qu.:2516   1st Qu.:120.0  
##  Median :626404   Median :1011819   Median :2530   Median :120.0  
##  Mean   :631536   Mean   :1013061   Mean   :2581   Mean   :150.4  
##  3rd Qu.:637892   3rd Qu.:1012908   3rd Qu.:2530   3rd Qu.:120.0  
##  Max.   :688165   Max.   :1045987   Max.   :4002   Max.   :830.0  
##                                     NA's   :2      NA's   :2      
##       age         geology          Age.cat        Area      
##  Min.   :1.00   Length:100         1   :18   Min.   :10000  
##  1st Qu.:2.00   Class :character   2   :15   1st Qu.:10000  
##  Median :3.00   Mode  :character   3   :65   Median :10000  
##  Mean   :2.48                      NA's: 2   Mean   :10000  
##  3rd Qu.:3.00                                3rd Qu.:10000  
##  Max.   :3.00                                Max.   :10000  
##  NA's   :2

Some further changes were made to the data sets, first to ensure that they had the same rownames (I checked first that the order of sites was the same).

check.datasets(Condit.spec, Condit.env)

## Warning: rownames for community and environmental datasets are different

# same rownames
rownames(Condit.spec) <- rownames(Condit.env)

A next change was to remove cases with missing values.

Condit.env <- Condit.env[complete.cases(Condit.env), ]
Condit.spec <- same.sites(Condit.spec, Condit.env)
check.datasets(Condit.spec, Condit.env)

## OK

And a final change was to only use one (plot B27) of the 1-ha subplots of a 50-ha plot, in order to have a more even spatial distribution among the sites.

rows.include <- c(28, 51:98)
rownames(Condit.spec)[rows.include]

##  [1] "B27" "p1"  "p2"  "p3"  "p4"  "p5"  "p6"  "p7"  "p8"  "p9"  "p10" "p11"
## [13] "p12" "p13" "p14" "p15" "p16" "p17" "p18" "p19" "p20" "p21" "p22" "p23"
## [25] "p24" "p25" "p26" "p27" "p28" "p29" "p30" "p31" "p32" "p33" "p34" "p35"
## [37] "p36" "p37" "p38" "p39" "C1"  "C2"  "C3"  "C4"  "S0"  "S1"  "S2"  "S3" 
## [49] "S4"

Condit.spec2 <- Condit.spec[rows.include, ]
Condit.env2 <- Condit.env[rows.include, ]
check.datasets(Condit.spec2, Condit.env2)

## OK

4.1 Step 1

Now that we have a community and environmental data set in the proper format, we can proceed with step 1. Note that this time the argument of scale refers to the Area variable.

Accum.2 <- accumcomp(Condit.spec2, y=Condit.env2, factor='Age.cat', scale='Area',
    method='exact', conditioned=FALSE, plotit=FALSE)
Accum.2

## , ,  = Sites
## 
##        obs
## Age.cat     1     2     3     4     5     6     7     8     9    10     11
##       1 10000 20000 30000 40000 50000 60000 70000 80000 90000 1e+05 110000
##       2 10000 20000 30000 40000 50000 60000 70000 80000 90000 1e+05 110000
##       3 10000 20000 30000 40000 50000 60000 70000 80000 90000 1e+05 110000
##        obs
## Age.cat     12     13     14     15     16     17     18
##       1 120000 130000 140000 150000 160000 170000 180000
##       2 120000 130000 140000     NA     NA     NA     NA
##       3 120000 130000 140000 150000 160000 170000     NA
## 
## , ,  = Richness
## 
##        obs
## Age.cat        1        2        3        4        5        6        7        8
##       1 71.44444 123.2288 163.3235 195.9516 223.5049 247.4329 268.6660 287.8280
##       2 73.00000 121.9780 157.0330 183.3437 203.9226 220.6037 234.5268 246.4159
##       3 90.82353 158.2059 211.7471 256.3668 294.7959 328.6773 359.0709 386.7007
##        obs
## Age.cat        9       10       11       12       13       14       15       16
##       1 305.3511 321.5424 336.6246 350.7619 364.0774 376.6650 388.5980 399.9346
##       2 256.7418 265.8172 273.8544 281.0000 287.3571 293.0000       NA       NA
##       3 412.0854 435.6114 457.5755 478.2117 497.7080 516.2176 533.8676 550.7647
##        obs
## Age.cat       17  18
##       1 410.7222 421
##       2       NA  NA
##       3 567.0000  NA
## 
## , ,  = sd
## 
##        obs
## Age.cat        1        2        3        4        5        6        7        8
##       1 3.505466 5.226408 6.150961 6.671719 6.968705 7.136218 7.228716 7.279710
##       2 3.370460 4.806843 5.458557 5.723399 5.795832 5.777650 5.723221 5.661988
##       3 3.905818 5.809059 6.857071 7.482286 7.874308 8.128551 8.298916 8.419040
##        obs
## Age.cat        9       10       11       12       13       14       15       16
##       1 7.310675 7.335760 7.364528 7.403639 7.457936 7.531146 7.626336 7.746207
##       2 5.610003 5.576163 5.565795 5.582799 5.630981 5.714841       NA       NA
##       3 8.511777 8.593760 8.677713 8.773697 8.889807 9.032598 9.207380 9.418436
##        obs
## Age.cat       17       18
##       1 7.893275 8.069984
##       2       NA       NA
##       3 9.669246       NA

4.2 Step 2

Step 2 is the same as before:

accum.long2 <- accumcomp.long(Accum.2, ci=NA, label.freq=1)
head(accum.long2)

##   Grouping Obs Sites  Richness       SD       LWR       UPR labelit
## 1        3   1 10000  90.82353 3.905818  82.61771  99.02935    TRUE
## 2        3   2 20000 158.20588 5.809059 146.00150 170.41026    TRUE
## 3        3   3 30000 211.74706 6.857071 197.34089 226.15323    TRUE
## 4        3   4 40000 256.36681 7.482286 240.64711 272.08651    TRUE
## 5        3   5 50000 294.79590 7.874308 278.25259 311.33920    TRUE
## 6        3   6 60000 328.67728 8.128551 311.59983 345.75473    TRUE

4.3 Step 3

And step 3 is also similar (except for removing the argument of expand for the x-axis and label names). Note also that the same variable of Grouping can be used.

plotgg2 <- ggplot(data=accum.long2, aes(x = Sites, y = Richness, ymax = UPR, ymin = LWR)) + 
    scale_x_continuous(sec.axis = dup_axis(labels=NULL, name=NULL)) +
    scale_y_continuous(sec.axis = dup_axis(labels=NULL, name=NULL)) +
    geom_line(aes(colour=Grouping), size=2) +
    geom_point(data=subset(accum.long2, labelit==TRUE), 
               aes(colour=Grouping, shape=Grouping), size=5) +
    geom_ribbon(aes(colour=Grouping, fill=after_scale(alpha(colour, alpha=0.3))), 
                show.legend=FALSE) + 
    BioR.theme +
    scale_colour_npg() +
    labs(x = expression("m"^2), y = "Species", colour = "Age", shape = "Age")

This is now the resulting plot.

plotgg2

5 Session Information

sessionInfo()

## R version 4.0.2 (2020-06-22)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 18363)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United Kingdom.1252 
## [2] LC_CTYPE=English_United Kingdom.1252   
## [3] LC_MONETARY=English_United Kingdom.1252
## [4] LC_NUMERIC=C                           
## [5] LC_TIME=English_United Kingdom.1252    
## 
## attached base packages:
## [1] tcltk     stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] readxl_1.3.1         ggsci_2.9            ggplot2_3.3.2       
## [4] BiodiversityR_2.12-2 rgl_0.100.54         vegan3d_1.1-2       
## [7] vegan_2.5-6          lattice_0.20-41      permute_0.9-5       
## 
## loaded via a namespace (and not attached):
##   [1] minqa_1.2.4             colorspace_1.4-1        ellipsis_0.3.1         
##   [4] class_7.3-17            rio_0.5.16              htmlTable_2.0.0        
##   [7] base64enc_0.1-3         rstudioapi_0.11         farver_2.0.3           
##  [10] splines_4.0.2           knitr_1.28              effects_4.1-4          
##  [13] Formula_1.2-3           jsonlite_1.6.1          nloptr_1.2.2.1         
##  [16] cluster_2.1.0           Rcmdr_2.6-2             png_0.1-7              
##  [19] shiny_1.4.0.2           compiler_4.0.2          backports_1.1.7        
##  [22] Matrix_1.2-18           fastmap_1.0.1           survey_4.0             
##  [25] later_1.1.0.1           tcltk2_1.2-11           acepack_1.4.1          
##  [28] htmltools_0.5.0         tools_4.0.2             gtable_0.3.0           
##  [31] glue_1.4.1              dplyr_1.0.2             Rcpp_1.0.4.6           
##  [34] carData_3.0-4           cellranger_1.1.0        vctrs_0.3.4            
##  [37] nlme_3.1-148            crosstalk_1.1.0.1       xfun_0.15              
##  [40] stringr_1.4.0           openxlsx_4.1.5          lme4_1.1-23            
##  [43] mime_0.9                miniUI_0.1.1.1          lifecycle_0.2.0        
##  [46] RcmdrMisc_2.7-0         statmod_1.4.34          MASS_7.3-51.6          
##  [49] zoo_1.8-8               scales_1.1.1            hms_0.5.3              
##  [52] promises_1.1.1          parallel_4.0.2          sandwich_2.5-1         
##  [55] RColorBrewer_1.1-2      yaml_2.2.1              curl_4.3               
##  [58] gridExtra_2.3           rpart_4.1-15            latticeExtra_0.6-29    
##  [61] stringi_1.4.6           nortest_1.0-4           e1071_1.7-3            
##  [64] checkmate_2.0.0         boot_1.3-25             zip_2.0.4              
##  [67] manipulateWidget_0.10.1 rlang_0.4.8             pkgconfig_2.0.3        
##  [70] evaluate_0.14           purrr_0.3.4             htmlwidgets_1.5.1      
##  [73] labeling_0.3            tidyselect_1.1.0        magrittr_1.5           
##  [76] R6_2.4.1                generics_0.1.0          Hmisc_4.4-0            
##  [79] DBI_1.1.0               pillar_1.4.4            haven_2.3.1            
##  [82] foreign_0.8-80          withr_2.2.0             mgcv_1.8-31            
##  [85] survival_3.1-12         scatterplot3d_0.3-41    abind_1.4-5            
##  [88] nnet_7.3-14             tibble_3.0.1            crayon_1.3.4           
##  [91] car_3.0-8               relimp_1.0-5            rmarkdown_2.3          
##  [94] jpeg_0.1-8.1            grid_4.0.2              data.table_1.12.8      
##  [97] forcats_0.5.0           digest_0.6.25           webshot_0.5.2          
## [100] xtable_1.8-4            httpuv_1.5.4            munsell_0.5.0          
## [103] mitools_2.4

Species Accumulation Curves with vegan, BiodiversityR and ggplot2

Roeland Kindt

19/11/2020