DiversityOccupancy Quickstart

Installing DiversityOccupancy

Requirements

To use the newest version of this package you need R version 3.3.0 or newer (use the function sessionInfo() in your R session to check your current version).

Installing the package

Install from cran repository

install.packages("DiversityOccupancy")

Objectives of the Package

The objective of this package is to simultaneously model factors associated with occupancy and abundance of individual species using a detection history file, and to use predicted abundances to calculate species diversity for each sampling site. The package then models factor(s) associated with among-site species diversity, which can then be combined with spatial data to identify areas that contain both high abundance of species of conservation concern and high species diversity.

Use of the package

In order to calculate abundance and alpha diversity we need at least three files:

Detection history of multiple species

A data frame consisting on the detection history of at least two species. As an example DiversityOccupancy has the dataset IslandBirds which contains the detection history of 5 species in the Pohnpei Island for 4 consecutive days (Columns) in 120 different sites (Rows). The data set includes a 1 for each time a species was detected, and a 0 for each time it was not detected.

A detection for the first two species is presented below:

library(DiversityOccupancy)

## Loading required package: MuMIn

## Loading required package: unmarked

## Loading required package: reshape

## Loading required package: lattice

## Loading required package: Rcpp

data("IslandBirds")
kable(head(IslandBirds[1:8]))

CICA.1	CICA.2	CICA.3	CICA.4	CIRW.1	CIRW.2	CIRW.3	CIRW.4
0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0
0	0	0	0	0	0	0	0
0	0	0	0	0	0	1	0

Site covariates

Site covariates are presented in a data frame consisting of measurements taken at each site. The covariates are used singly and in combination to model occupancy or abundance, and they should be variables that are stable within the scope of the length of the study. In DiversityOccupancy there is an example concordant with the IslandBirds dataset called siteCov:

data("siteCov")
head(siteCov)

Elev	AgroFo	SecVec	Upland
214.6	0.0453677	0.3096148	0.6450175
254.7	0.0000000	0.0130532	0.9869468
321.5	0.0000000	0.0090735	0.9909265
68.2	0.0000000	0.0000000	1.0000000
346.5	0.0000000	0.1069723	0.8930277
74.5	0.0000000	0.0000000	1.0000000

Detection covariates

A list of data frames, in which each data frame includes a daily measurement of variables with the potential to affect detection probabilities. It is important that each element (data frame) of the list has a name, so that it can be called to fit the occupancy model. These variables are used to model the probability of detection.

DiversityOccupancy has a dataset called Daily_Cov which ilustrates how the Daily covariates have to be structured:

#All the items of the ist must have names
names(Daily_Cov)

## [1] "Day"    "Wind"   "Obs"    "Time"   "Rain"   "Noise"  "Clouds"

#here we see the first dataframe of the Dailycov dataset
head(Daily_Cov[[1]])

##    Day1 Day2 Day3 Day4
## 3    25   27   33   42
## 5    25   27   33   42
## 7    25   27   33   41
## 11   19   22   25   51
## 13   19   22   25   51
## 16   19   22   25   51

Fiting models for abundance and predicting alpha diversity

In this example we will fit and model the abundance for 5 bird species and calculate alpha diversity from those results.

BirdDiversity <- diversityoccu(pres = IslandBirds, sitecov = siteCov,
obscov = Daily_Cov,spp =  5, form = ~ Day + Wind + Time ~ Elev + Wetland + Upland, dredge = FALSE)

The resulting object of class diversityoccupancy has the following elements

names(BirdDiversity)

## [1] "Covs"      "models"    "Diversity" "species"

If you need to see the parameters of the model of one of the species, you call the species number with the element$models. For example extract the model for the second species:

BirdDiversity$models[[2]]

## 
## Call:
## occuRN(formula = form, data = data2)
## 
## Abundance:
##              Estimate       SE      z P(>|z|)
## (Intercept)   0.66046 7.41e-01  0.891  0.3731
## Elev         -0.00398 1.73e-03 -2.305  0.0212
## Wetland     -47.31278 1.72e+02 -0.276  0.7828
## Upland       -0.42087 5.56e-01 -0.756  0.4494
## 
## Detection:
##             Estimate      SE      z P(>|z|)
## (Intercept) -2.36626 1.80923 -1.308  0.1909
## Day          0.03480 0.01801  1.932  0.0534
## Wind        -0.10959 0.19166 -0.572  0.5675
## Time        -0.00114 0.00375 -0.305  0.7605
## 
## AIC: 208.1134

The Best.model parameter for a diversityoccupancy object shows us a table with the abundance and alpha diversity calculated for each sampled point:

BirdDiversity$Best.model

Automatic model selection for abundance

If the option of dredge is set to “TRUE”, then diversityoccu attempts to fit all first order models, and it selects the one with the lowest AICc value, for each species. Be aware that processing times rapidly increases with added numbers of parameters, and that processing can require many hours or days for complex data sets. The following graph and table shows the processing time for the BatOccu dataset.

From now on we will work with automatically selected models for bat abundance and diversity using an information theoretic approach (AICc).

birdmodel.selected <- diversityoccu(pres = IslandBirds, sitecov = siteCov,
obscov = Daily_Cov,spp =  5, form = ~ Day + Wind + Time ~ Elev + Wetland + Upland, dredge = TRUE)

Below we present an example of an analysis with the full model (includes all variables) and subsequently results from a model selection analysis, both of them only for the second species:

BirdDiversity$models[[2]]

## 
## Call:
## occuRN(formula = form, data = data2)
## 
## Abundance:
##              Estimate       SE      z P(>|z|)
## (Intercept)   0.66046 7.41e-01  0.891  0.3731
## Elev         -0.00398 1.73e-03 -2.305  0.0212
## Wetland     -47.31278 1.72e+02 -0.276  0.7828
## Upland       -0.42087 5.56e-01 -0.756  0.4494
## 
## Detection:
##             Estimate      SE      z P(>|z|)
## (Intercept) -2.36626 1.80923 -1.308  0.1909
## Day          0.03480 0.01801  1.932  0.0534
## Wind        -0.10959 0.19166 -0.572  0.5675
## Time        -0.00114 0.00375 -0.305  0.7605
## 
## AIC: 208.1134

birdmodel.selected$models[[2]]

## 
## Call:
## occuRN(formula = ~Day + 1 ~ Elev + Wetland + 1, data = data2)
## 
## Abundance:
##             Estimate      SE      z P(>|z|)
## (Intercept)  0.50953 0.66082  0.771 0.44067
## Elev        -0.00446 0.00164 -2.725 0.00643
## Wetland     -5.05318 2.71875 -1.859 0.06308
## 
## Detection:
##             Estimate     SE     z  P(>|z|)
## (Intercept)  -3.1681 0.8008 -3.96 7.62e-05
## Day           0.0324 0.0174  1.86 6.29e-02
## 
## AIC: 204.0382

Model selection for alpha diversity modeling

The next step is to select the best model for predicting alpha diversity using the model.diversity function. The function takes a diversityoccupancy object, and fits all possible glm models and ranks them by AICc. Other than the diversityoccupancy object, there are three other parameters to select: 1) Method, which can be either “h” which fits every possible model, or “g”, which uses genetic algorithms to select models (recommended for large candidate sets); 2) Delta, which allows the user to identify an AICc delta threshold, which returns all models with AICc values below the threshold; 3) Squared, which includes only linear combinations when set to FALSE (Default), and both linear and quadratic (second order) if set to TRUE.

glm.BirdDiverse <- model.diversity(birdmodel.selected, method = "g", delta = 2, squared = TRUE)

To see the top models extract the Table element of the modelselection object

glm.BirdDiverse$Table

model	aicc	weights	Delta.AICc
Diversity ~ 1 + Elev + AgroFo + Wetland + I(Elev^2) + I(AgroFo^2) + I(SecVec^2) + I(Wetland^2) + I(Upland^2)	-681.3992	0.3120093	0.0000000
Diversity ~ 1 + Elev + AgroFo + Wetland + I(Elev^2) + I(AgroFo^2) + I(Wetland^2) + I(Upland^2)	-681.1351	0.2734182	0.2640605
Diversity ~ 1 + Elev + AgroFo + Wetland + Upland + I(Elev^2) + I(AgroFo^2) + I(Wetland^2) + I(Upland^2)	-680.7723	0.2280579	0.6268668
Diversity ~ 1 + Elev + AgroFo + SecVec + Wetland + I(Elev^2) + I(AgroFo^2) + I(Wetland^2) + I(Upland^2)	-680.3701	0.1865146	1.0290469

The responseplot.diver function takes a modeldiversity object and one of the variables used to predict the alpha diversity index, and makes a plot showing the response of the diversity index against the selected variable. This function automatically limits the values of that variable to the maximum and minimum values of the dataset. It also shows the standard deviation of the estimation.

responseplot.diver(glm.BirdDiverse, Elev)

Also since the returned plot is a ggplot type object, it can be easily modified following ggplot2 grammar of graphics.

library(ggplot2)
k <- responseplot.diver(glm.BirdDiverse, Elev)
k

k + theme_dark()

k + ylim(c(0,50))

Selecting conservation areas based on alpha diversity and abundance of species of conservation concern

Of the 5 species modeled, there are at least two of conservation concern, the second and third of our list. Since we already have models relating site characteristics to species abundance and a model relating site characteristics to alpha diversity, with the use of a spatial raster layers (rasterstack) with the modeled variables we can choose an area with high species diversity and/or abundance.

The function diversity predict has the option of creating a kml file of the selected area, but in order to do that, your raster layers need to be in latitude longitude format. Here we will keep the Birdstack file in UTM format and reproject it to latitude longitude and call it Birdstack2 in order to show both posible outcomes.

library(raster)

## Loading required package: sp

## 
## Attaching package: 'sp'

## The following object is masked from 'package:unmarked':
## 
##     coordinates

## 
## Attaching package: 'raster'

## The following objects are masked from 'package:unmarked':
## 
##     getData, projection

data(Birdstack)
plot(Birdstack)

newproj <- '+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0'
library(rgdal)

## rgdal: version: 1.1-10, (SVN revision 622)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 1.10.1, released 2013/08/26
##  Path to GDAL shared files: /usr/share/gdal/1.10
##  Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
##  Path to PROJ.4 shared files: (autodetected)
##  Linking to sp version: 1.2-3

Birdstack2 <- stack(projectRaster(Birdstack, crs=newproj))

In order to find areas of high conservation value, we use the predict.diversity function. We need both a diversityoccupancy and a modeldiversity class object, used in the model, and diverse parameters respectfully, a spatial representation of site covariates as raster files (rasterstack), with the variables in the new.data parameter, a boolean vector in the species parameter indicating which species shall be considered (T or F), and the quantile.nth parameter, which indicates a quantile threshold that is used for abundance and/or richness to indicate conservation value (areas above the threshold will be returned). If you want to get the kml files you would set the kml parameter as TRUE, and state a name for the kml file.

library(rgdal)
Selected.area <- diversity.predict(model = birdmodel.selected, diverse = glm.BirdDiverse, new.data = Birdstack2, quantile.nth = 0.65, species =
c(F,T,T,F,F), kml = TRUE, name = "Selected_Bird_Area.kml")

If KML is set to TRUE, this function automatically creates a KML file that will be stored in the home directory of your session with the name you give it.

plot(Selected.area$diversity.raster)

plot(Selected.area$species)

On the other hand if you set the kml as FALSE, you would keep the original projection of your raster layers.

Selected.area2 <- diversity.predict(model = birdmodel.selected, diverse = glm.BirdDiverse, new.data = Birdstack, quantile.nth = 0.65, species =
c(F,T,F,F,F), kml = FALSE)

plot(Selected.area2$diversity.raster)

plot(Selected.area2$species)