Example workflow

There are four steps to a wrappeR workflow: 1) create a roster to specify which filters should be applied to the occupancy model outputs; 2) apply those filters; 3) create the MSI using the filtered outputs; and, optionally, 4) summarise the outputs in the format that e.g. JNCC require. Note that I don’t explain all arguments to the wrappeR functions in this document, but they can be found in the package documentation.

Step 1: Create a roster

It may be necessary to filter the occupancy model outputs before they can be used to create a MSI. For example, you may want to clip the individual species’ time series such that they only contribute to the MSI after the first year for which they have any records, and before the final year for which they have any records. Or you may want to select a subset of species, such as pollinators or priority species, for inclusion. Later I will introduce the function applyFilters which, as you can probably guess, applies filters to the occupancy model outputs. applyFilters works on one taxonomic group at a time. If an identical set of filters were to be applied to each taxonomic group, then it would be simple to use e.g. lapply to apply the function to all groups of interest. However, you may want to apply different filters to different groups; for example, you may want outputs for the whole of the UK for one group, but only Great Britain for another which has not been recorded in Northern Ireland. To enable the use of lapply where multiple arguments (filters) vary, we create a roster that can be passed to applyFilters. The roster is a list of 1-row dataframes in which each column corresponds to a filter. Rosters can be created using the createRoster function:

roster <- createRoster(index = 1:2,
                       modPath = rep("/data-s3/occmods/", 2), 
                       metaPath = rep("/data-s3/metadata/", 2),
                       ver = rep("2017_Charlie", 2),
                       indicator = rep("all", 2),
                       region = rep("GB", 2),
                       nSamps = rep(1000, 2),
                       minObs = rep(50, 2),
                       write = rep(FALSE, 2), 
                       outPath = rep(0, 2),
                       clipBy = rep("species", 2),
                       group = c("Ants","Bees"),
                       t0 = rep(1970, 2), 
                       tn = rep(2015, 2))

str(roster)

## List of 2
##  $ 1:'data.frame':   1 obs. of  15 variables:
##   ..$ index    : int 1
##   ..$ modPath  : Factor w/ 1 level "/data-s3/occmods/": 1
##   ..$ datPath  : Factor w/ 2 levels "/data-s3/occmods/Ants/2017_Charlie/",..: 1
##   ..$ metaPath : Factor w/ 1 level "/data-s3/metadata/": 1
##   ..$ ver      : Factor w/ 1 level "2017_Charlie": 1
##   ..$ group    : Factor w/ 2 levels "Ants","Bees": 1
##   ..$ indicator: Factor w/ 1 level "all": 1
##   ..$ region   : Factor w/ 1 level "GB": 1
##   ..$ nSamps   : num 1000
##   ..$ minObs   : num 50
##   ..$ write    : logi FALSE
##   ..$ outPath  : num 0
##   ..$ clipBy   : Factor w/ 1 level "species": 1
##   ..$ t0       : num 1970
##   ..$ tn       : num 2015
##  $ 2:'data.frame':   1 obs. of  15 variables:
##   ..$ index    : int 2
##   ..$ modPath  : Factor w/ 1 level "/data-s3/occmods/": 1
##   ..$ datPath  : Factor w/ 2 levels "/data-s3/occmods/Ants/2017_Charlie/",..: 2
##   ..$ metaPath : Factor w/ 1 level "/data-s3/metadata/": 1
##   ..$ ver      : Factor w/ 1 level "2017_Charlie": 1
##   ..$ group    : Factor w/ 2 levels "Ants","Bees": 2
##   ..$ indicator: Factor w/ 1 level "all": 1
##   ..$ region   : Factor w/ 1 level "GB": 1
##   ..$ nSamps   : num 1000
##   ..$ minObs   : num 50
##   ..$ write    : logi FALSE
##   ..$ outPath  : num 0
##   ..$ clipBy   : Factor w/ 1 level "species": 1
##   ..$ t0       : num 1970
##   ..$ tn       : num 2015

Step 2: Filter the model outputs

Each argument passed to createRoster should be a vector with a length equal to the number of taxonomic groups to be included in the MSI. This way roster can be passed to the applyFilters function which will filter the outputs for each taxonomic group according to the other columns in roster:

filterDat <- rbind.fill(lapply(roster, 
       wrappeR::applyFilters,
       parallel = TRUE))

At this stage we have produced a set of filtered occupancy model outputs for the chosen taxa in the following format:

str(filterDat)

## 'data.frame':    244000 obs. of  49 variables:
##  $ year_1970: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1971: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1972: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1973: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1974: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1975: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1976: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1977: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1978: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1979: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1980: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_1981: num  0.0271 0.0185 0.0279 0.0767 0.0767 ...
##  $ year_1982: num  0.0418 0.0172 0.0324 0.0771 0.0771 ...
##  $ year_1983: num  0.0377 0.0135 0.0213 0.0767 0.0767 ...
##  $ year_1984: num  0.02788 0.00984 0.02091 0.07708 0.07667 ...
##  $ year_1985: num  0.0254 0.0123 0.0144 0.0763 0.0767 ...
##  $ year_1986: num  0.0246 0.0139 0.0164 0.0767 0.0767 ...
##  $ year_1987: num  0.02542 0.00902 0.01886 0.07667 0.07667 ...
##  $ year_1988: num  0.0316 0.0103 0.0205 0.0767 0.0767 ...
##  $ year_1989: num  0.0398 0.0168 0.0213 0.0767 0.0771 ...
##  $ year_1990: num  0.0549 0.0217 0.0258 0.0767 0.0771 ...
##  $ year_1991: num  0.0504 0.0209 0.0258 0.0779 0.0771 ...
##  $ year_1992: num  0.05 0.016 0.0344 0.0767 0.0767 ...
##  $ year_1993: num  0.0402 0.0234 0.0295 0.0767 0.0771 ...
##  $ year_1994: num  0.0316 0.0254 0.032 0.0767 0.0767 ...
##  $ year_1995: num  0.0291 0.023 0.0365 0.0767 0.0767 ...
##  $ year_1996: num  0.0287 0.0189 0.034 0.0767 0.0767 ...
##  $ year_1997: num  0.0283 0.0213 0.0295 0.0771 0.0767 ...
##  $ year_1998: num  0.0287 0.0246 0.0353 0.0767 0.0767 ...
##  $ year_1999: num  0.0418 0.0312 0.0402 0.0767 0.0775 ...
##  $ year_2000: num  0.0373 0.0267 0.0324 0.0771 0.0771 ...
##  $ year_2001: num  0.0316 0.0193 0.0193 0.0767 0.0771 ...
##  $ year_2002: num  0.0295 0.0209 0.0209 0.0783 0.0767 ...
##  $ year_2003: num  0.0267 0.0164 0.0164 0.0771 0.0767 ...
##  $ year_2004: num  0.018 0.0172 0.0176 0.0783 0.0787 ...
##  $ year_2005: num  0.0152 0.0172 0.0189 0.0779 0.0808 ...
##  $ year_2006: num  0.00984 0.01435 0.01886 0.07954 0.07708 ...
##  $ year_2007: num  0.00984 0.01353 0.01435 0.07749 0.07872 ...
##  $ year_2008: num  0.018 0.0152 0.0234 0.0771 0.0783 ...
##  $ year_2009: num  0.0205 0.0172 0.0205 0.0775 0.0771 ...
##  $ year_2010: num  0.0193 0.0152 0.0172 0.0775 0.0775 ...
##  $ year_2011: num  0.01845 0.00943 0.02214 0.07708 0.07708 ...
##  $ year_2012: num  0.0234 0.0115 0.018 0.0783 0.0771 ...
##  $ year_2013: num  0.0312 0.0205 0.0201 0.0779 0.0767 ...
##  $ year_2014: num  0.0328 0.0152 0.0262 0.0779 0.0775 ...
##  $ year_2015: num  NA NA NA NA NA NA NA NA NA NA ...
##  $ year_2016: num  0.023 0.0201 0.0201 0.0767 0.0779 ...
##  $ iteration: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ species  : chr  "formica aquilonia" "formica aquilonia" "formica aquilonia" "formica aquilonia" ...

Step 3: Produce a multispecies indicator

Now we can pass the filtered outputs to the calcMSI function which produces a MSI using a user-defined method. First let’s use the standard lambda indicator method:

ind <- calcMSI(dat = filterDat, 
               method = "lambda", 
               write = FALSE, 
               outPath = NULL)

##  num [1:244, 1:48, 1:1000] NA 0.239 NA NA NA ...
##  - attr(*, "dimnames")=List of 3
##   ..$ : chr [1:244] "1" "1001" "2001" "3001" ...
##   ..$ : chr [1:48] "1970" "1971" "1972" "1973" ...
##   ..$ : chr [1:1000] "1" "2" "3" "4" ...
## NULL

str(ind)

## List of 5
##  $ Summary :'data.frame':    47 obs. of  5 variables:
##   ..$ indicator     : num [1:47] 100 96.5 94.1 95.1 97.1 ...
##   ..$ lower         : num [1:47] 100 91.4 88.2 88.6 90 ...
##   ..$ upper         : num [1:47] 100 101 100 101 104 ...
##   ..$ year          : num [1:47] 1 2 3 4 5 6 7 8 9 10 ...
##   ..$ Species_Number: num [1:47] 0 120 165 195 205 219 223 225 230 232 ...
##  $ MetaData:List of 5
##   ..$ summary       :'data.frame':   47 obs. of  5 variables:
##   .. ..$ indicator     : num [1:47] 100 96.5 94.1 95.1 97.1 ...
##   .. ..$ lower         : num [1:47] 100 91.4 88.2 88.6 90 ...
##   .. ..$ upper         : num [1:47] 100 101 100 101 104 ...
##   .. ..$ year          : num [1:47] 1 2 3 4 5 6 7 8 9 10 ...
##   .. ..$ Species_Number: num [1:47] 0 120 165 195 205 219 223 225 230 232 ...
##   ..$ LogLambda     : num [1:244, 1:47, 1:1000] NA NA NA NA NA NA NA NA NA NA ...
##   .. ..- attr(*, "dimnames")=List of 3
##   .. .. ..$ : chr [1:244] "1" "2" "3" "4" ...
##   .. .. ..$ : chr [1:47] "1" "2" "3" "4" ...
##   .. .. ..$ : NULL
##   ..$ ind_data      : num [1:47, 1:1000] 100 91.5 88.5 90.7 92.4 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:47] "1" "2" "3" "4" ...
##   .. .. ..$ : NULL
##   ..$ species_change:'data.frame':   244 obs. of  2 variables:
##   .. ..$ percent_change_year: num [1:244] -0.758 -2.392 2.487 -1 1.061 ...
##   .. ..$ category           : Ord.factor w/ 5 levels "strong decrease"<..: 3 2 4 3 3 5 2 3 3 2 ...
##   ..$ good_years    : logi [1:244, 1:47] NA TRUE NA NA NA NA ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : chr [1:244] "1" "2" "3" "4" ...
##   .. .. ..$ : chr [1:47] "1" "2" "3" "4" ...
##  $ st      :List of 2
##   ..$ species_assessment   :'data.frame':    244 obs. of  2 variables:
##   .. ..$ percent_change_year: num [1:244] 0.4606 -6.1172 -10.6827 -2.5933 -0.0941 ...
##   .. ..$ category           : Ord.factor w/ 5 levels "strong decrease"<..: 3 1 1 2 3 4 1 3 1 2 ...
##   ..$ indicator_asssessment:'data.frame':    1 obs. of  4 variables:
##   .. ..$ start_index: num 106
##   .. ..$ end_lower  : num 94
##   .. ..$ end_upper  : num 116
##   .. ..$ assessment : Factor w/ 1 level "stable": 1
##  $ lt      :List of 2
##   ..$ species_assessment   :'data.frame':    244 obs. of  2 variables:
##   .. ..$ percent_change_year: num [1:244] -0.758 -2.392 2.487 -1 1.061 ...
##   .. ..$ category           : Ord.factor w/ 5 levels "strong decrease"<..: 3 2 4 3 3 5 2 3 3 2 ...
##   ..$ indicator_asssessment:'data.frame':    1 obs. of  4 variables:
##   .. ..$ start_index: num 100
##   .. ..$ end_lower  : num 94
##   .. ..$ end_upper  : num 116
##   .. ..$ assessment : Factor w/ 1 level "stable": 1
##  $ final   :List of 2
##   ..$ species_assessment   :'data.frame':    244 obs. of  2 variables:
##   .. ..$ percent_change_year: num [1:244] -1.796 -3.208 1.378 0.642 -0.872 ...
##   .. ..$ category           : Ord.factor w/ 5 levels "strong decrease"<..: 2 1 4 3 3 2 2 5 3 3 ...
##   ..$ indicator_asssessment:'data.frame':    1 obs. of  4 variables:
##   .. ..$ start_index: num 104
##   .. ..$ end_lower  : num 94
##   .. ..$ end_upper  : num 116
##   .. ..$ assessment : Factor w/ 1 level "stable": 1

calcMSI produces a list with five elements: a simple summary of the indicator, the associated metadata, and short term (final five years), long term (whole time series) and final year assessments of distribution change for each species. The metadata are the objects returned by BRCindicators::lambda_indicator or BRCindicators::bma when method = “lambda” or “bma”, respectively.

When using the bma method you can specify any of the arguments that are passed to BRCindicators::bma in your call to calcMSI. As an example I will specify plot = FALSE, which prevents model diagnostics being plotted, and m.scale = “logit” to tell the function that our outputs are on the log odds scale:

BMA <- calcMSI(dat = filterDat, 
               method = "bma", 
               write = FALSE, 
               outPath = NULL,
               plotLabel = "Ants and bees",
               bmaInd = "prime",
               plot = FALSE, 
               m.scale = "logit")

Step 4: Summarise the indicator for external use

We currently produce UK- and England-level MSIs for JNCC who require the data in a particular format, and it is likely that we will be asked to produce future indicators in a similar format. The function summariseMSI will plot the indicator and summarise it in three tables that are required as “data sheets” by JNCC. I will apply it to the lambda indicator we generated earlier:

sumStats <- summariseMSI(minYear = 1970, 
                         maxYear  = 2016, 
                         indicator = ind, 
                         method = "lambda",
                         plotType = "indicator",
                         label = "Ants and Bees")

## TableGrob (1 x 2) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]

str(sumStats)

## List of 3
##  $ :'data.frame':    47 obs. of  5 variables:
##   ..$ indicator     : num [1:47] 100 96.5 94.1 95.1 97.1 ...
##   ..$ lower         : num [1:47] 100 91.4 88.2 88.6 90 ...
##   ..$ upper         : num [1:47] 100 101 100 101 104 ...
##   ..$ year          : num [1:47] 1970 1971 1972 1973 1974 ...
##   ..$ Species_Number: num [1:47] 0 120 165 195 205 219 223 225 230 232 ...
##  $ :'data.frame':    5 obs. of  4 variables:
##   ..$ Trend     : Factor w/ 5 levels "strong decrease",..: 1 2 3 4 5
##   ..$ Long term : int [1:5] 24 51 96 45 28
##   ..$ Short term: int [1:5] 80 32 51 21 60
##   ..$ Final year: int [1:5] 92 27 32 19 74
##  $ :'data.frame':    244 obs. of  3 variables:
##   ..$ Long term : num [1:244] -0.758 -2.392 2.487 -1 1.061 ...
##   ..$ Short term: num [1:244] 0.4606 -6.1172 -10.6827 -2.5933 -0.0941 ...
##   ..$ Final year: num [1:244] -1.796 -3.208 1.378 0.642 -0.872 ...

We can then write the outputs of sumStats to a three-tab excel file as JNCC require:

for (i in 1:3) { 
   
  write.xlsx(sumStats[i], file=paste0("/data-s3/", 
                               "ants_bees_", "sumStats.xlsx"),  
             sheetName=paste(i),  
             append=T, 
             row.names = F) 
}

How to use the occupancy indicator pipeline on datalabs

Rob Boyd

11/24/2020

Introduction

Example workflow

Step 1: Create a roster

Step 2: Filter the model outputs

Step 3: Produce a multispecies indicator

Step 4: Summarise the indicator for external use