Occupancy Analysis (single species and multiple seasons)-Wolf

Author

Antonio Uzal

Published

October 1, 2024

Occupancy Analysis

Firstly, install required packages, if they are not already in your computer, and load them:

#Load Packages
list.of.packages <- c(
                      "unmarked",       # occupancy modeling packate
                      "tidyr",         # A package for data manipulation
                      "dplyr",         # A package for data manipulation
                      "vegan",         # tools for descriptive parameters in ecology
                      "AICcmodavg",    # package to implement data averaging
                      "readr",          # package to read tables
                      "ggplot2",       # visusalisations
                      "gridExtra",     # multiple grid based
                      "kableExtra",     # HTML tables
                      "knitr")         # dynamic reporting (nice tables) 
                              

# Check you have them in your library
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
# install any missing packages and load them
if(length(new.packages)) install.packages(new.packages,repos = "http://cran.us.r-project.org")
lapply(list.of.packages, require, character.only = TRUE)

Step 1: Read detection history & covariates

We continue using the example describe in LINK LINK LINK. We already have 2 dataframes:

  • wolf_detection contains detection history and detection/site covariates over the four seasons. and was saved as wolf_detection.csv.

  • effort_df contains the effort covariate (detection covariate); saved as effort_data.csv.

wolf_detection <- read.csv("wolf_detection.csv",
                 header = TRUE, 
                 sep = ",", 
                 stringsAsFactors = FALSE) 

effort_df <- read.csv("effort_data.csv",
                 header = TRUE, 
                 sep = ",", 
                 stringsAsFactors = FALSE)  

# Match effort_df to wolf_detection based on deployment_id, to make sure detections, covariates and effort are in the same order and match
effort_aligned <- effort_df[match(wolf_detection$deployment_id, effort_df$deployment_id), ]

# Create data frame with detection histories columns 7 -> 38 (SO1-SO32); remember that we only make use of 32 sessions, 8 per season
y.cross <- wolf_detection[,7:38] 

# Create data frame with site covariates, which are found in column 4-6
siteCovs <- wolf_detection[,4:6] 

# elevation and ruggedness need to be standardised
siteCovs$elevation <- scale(siteCovs$elevation)
siteCovs$ruggedness <- scale(siteCovs$ruggedness)

# Create a list for yearly site covariates
#seasons <- list(season = rep(c("summer", "autumn", "winter", "spring")))
#seasons <- matrix(seasons, nrow(wolf_detection),4,byrow=TRUE)
# Define seasons as a matrix with each column representing a season
season_matrix <- matrix(c("summer", "autumn", "winter", "spring"), 
                        nrow = nrow(y.cross), # as many rows as deployments
                        ncol = 4, # same number than seasons
                        byrow = TRUE)

# Convert to a list for yearlySiteCovs
seasons_list <- list(season = season_matrix)

# Read in & create data frame with effort (columns 2-33) ; covering the 21 sessions
effort <- effort_aligned[, 2:33]

# Check dimensions (should match)
dim(y.cross)
[1] 135  32
dim(effort)
[1] 135  32
dim(season_matrix)
[1] 135   4

Step 2: create & visualise unmarked data frame

Setting up the season periods

A quick look up on Excel and previous outputs from R code will show you that the earliest deployment date was the 15th July 2021 and Beth deployed cameras up to the 6th April 2022.

We will be extracting data for the four seasons:

  • Summer: 15th July to 2nd of October - Eight sessions

  • Autumn: 3rd of October to 21st of December - Eight sessions

  • Winter: 22nd of December to 11th of March - Eight sessions

  • Spring: 12th of March to 30th of May - Eight sessions

Please note that the approach here to classify the seasons is different than that used in the previous tutorial. Here, we set up the seasons based on the period when the camera was active, rather than when the camera was deployed. The dates are also arbitrary and decided for training purposes; you might choose different dates!

Making the secondary periods (“sessions”) the same (n=8)across all primary periods (“seasons”) makes analyses much simpler.

Join all dataframes in an single unmarked dataframe

The unmarked frame has five key arguments:

  1. “y” (the occupancy data; detection histories), where rows correspond to different sites and columns correspond to different sessions. The entries typically represent the presence (1) or absence (0) of the species detected.

  2. “yearlySiteCovs” (covariates which are site specific but change every year (e.g., vegetation succession)), in our case, the only thing that changes is the season, as a proxy for some potential changes (e.g. snow, temperature…). This is a list of matrices (or data frames) representing site covariates that vary by primary period (e.g., season). In this case, the list contains the season_matrix, indicating which season each site belongs to for each session.

  3. “siteCovs” (covariates which are site specific but remain stable over time (i.e., elevation, ruggedness, feature type)),

  4. “obsCovs” (covariates which are observation-specific… in our case we have selected the effort).

  5. numPrimary: This indicates the number of primary periods (in our case, seasons), which allows the model to account for temporal variation in occupancy.

# create single unmarked data frame with detection histories, site covs and sample covs #

# Create the unmarkedMultFrame object for occupancy modeling
wolf <- unmarkedMultFrame(
  y = y.cross,
  siteCovs = siteCovs,
  yearlySiteCovs = seasons_list,  # a list for site variables that differ between primary survey periods (seasons)
  obsCovs = list(effort = effort_aligned[, 2:33]),  # Ensure this matches y.cross dimensions
  numPrimary = 4  # Number of unique primary periods (4 seasons)
)

# get a breakdown of no. of sites, no. of detection/non-detections,
# site specific details etc.

summary(wolf)
unmarkedFrame Object

135 sites
Maximum number of observations per site: 32 
Mean number of observations per site: 6.71 
Number of primary survey periods: 4 
Number of secondary survey periods: 8 
Sites with at least one detection: 72 

Tabulation of y observations:
   0    1 <NA> 
 762  144 3414 

Site-level covariates:
     feature_type     elevation.V1        ruggedness.V1    
 Road dirt :86    Min.   :-1.5029743   Min.   :-1.9093948  
 Trail game:49    1st Qu.:-1.0180183   1st Qu.:-0.5454748  
                  Median :-0.0400238   Median :-0.0788717  
                  Mean   : 0.0000000   Mean   : 0.0000000  
                  3rd Qu.: 0.9480739   3rd Qu.: 0.4229789  
                  Max.   : 1.6411568   Max.   : 2.1585435  

Observation-level covariates:
     effort      
 Min.   : 1.000  
 1st Qu.: 9.000  
 Median :10.000  
 Mean   : 8.724  
 3rd Qu.:10.000  
 Max.   :10.000  
 NA's   :3414    

Yearly-site-level covariates:
    season   
 autumn:135  
 spring:135  
 summer:135  
 winter:135  

Output tells us that over the 135 deployments, 72 had detections of wolf.

  • Maximum number of observations per site: 32

  • Mean number of observations per site: 6.71

  • Number of primary survey periods: 4

  • Number of secondary survey periods: 8

Plotting detections vs non detections

# plot of detection/non-detection

plot(wolf)

This outputs show us the detection history per deployment (1 = detection; 0 = no detection, blank = no data). There were wolf detections throughout the whole year.

Check for correlation between covariates

cor.test(siteCovs$elevation, siteCovs$ruggedness)

    Pearson's product-moment correlation

data:  siteCovs$elevation and siteCovs$ruggedness
t = 10.123, df = 133, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.5522925 0.7455498
sample estimates:
      cor 
0.6596914 

The correlation between elevation and ruggedness is close to the threshold of 0.7 (r=0.659). We will keep both for now, although you will find in the literature that others might decide removing one of them.

Creating a range of models

The model function (“occu”) has three main arguments:
~“psi” (ψ), ~“gamma” (γ), ~“epsilon” (ε), ~p”, and “data”. 

  • is first-season’s occupancy. If you want to keep any parameter constant,  place a 1 here.

  • Colonisation rate is γ 

  • Extinction rate is ε

  • Detection probability is p

  • “data” is the unmarked dataframe

The null model is the model with all constant parameters:

model <- colext (occupancy, colonisation, extinction, detection, data)

m0 <- colext (ψ = ~1, γ= ~ 1, ε = ~ 1, p= ~ 1, data = wolf)

This can give you the first set of estimates:

# Fit the model
m0 <- colext(~1, # occupancy constant
             ~1, # colonisation constant
             ~1, # extinction constant
             ~1, # detection constant
             data = wolf)

# Extract parameter estimates 
occupancy_est <- backTransform(m0, type = "psi")  # Occupancy probability
colonization_est <- backTransform(m0, type = "col")  # Colonisation probability
extinction_est <- backTransform(m0, type = "ext")  # Extinction probability
detection_est <- backTransform(m0, type = "det")  # Detection probability

# Print the estimates
cat("Occupancy probability (psi):", occupancy_est@estimate, "\n")
Occupancy probability (psi): 0.2955842 
cat("Colonization probability (gamma):", colonization_est@estimate, "\n")
Colonization probability (gamma): 0.6368276 
cat("Extinction probability (epsilon):", extinction_est@estimate, "\n")
Extinction probability (epsilon): 0.2492083 
cat("Detection probability (p):", detection_est@estimate, "\n")
Detection probability (p): 0.2696603 

And now we fit a series of models that represents different hypotheses about the colonisation-extinction dynamics of wolf in the study area. Perhaps in our case we should not be talking about colonisation/extinction, as the occupancy dynamics of wolf in the area depend on their management, and simply they do not use the study area during winter and spring. During these seasons the pastures are not available due to snow or their forage quality is too low for the shepherd to move the wolf to the area.

Let’s start first by fitting models for detection probability, keeping ψ, γ, and ε constant

# model with detection dependent on effort

m1 <- colext(~1, # occupancy constant
             ~1, # colonisation constant
             ~1, # extinction constant
             ~effort, # probability of detection is affected by effort
             data = wolf)

m2 <- colext(~1, ~1, ~1, ~feature_type, data = wolf)

m3 <- colext(~1, ~1, ~1, ~effort+feature_type, data = wolf)

We can compare these 3 models against the null model very quickly:

dlist<-fitList(Null = m0,m1=m1, m2=m2, m3=m3)
selmod<-modSel(dlist,nullmod="Null")
selmod
     nPars    AIC delta  AICwt cumltvWt   Rsq
m1       5 593.46  0.00 0.5124     0.51 0.071
m3       6 593.64  0.18 0.4676     0.98 0.083
Null     4 601.24  7.79 0.0104     0.99 0.000
m2       5 601.42  7.96 0.0096     1.00 0.014

The output suggest that effort might be a predictor of detection probability as it is chosen in the most parsimonious model and again in the next best model (m3).

When looking at the 95% confidence intervals of the effort beta coefficient (0.190) they do not overlap 0, so this is telling us that the parameter effort is a significant predictor of detection probability. See the following outcome.

coef(m1)
  psi(Int)   col(Int)   ext(Int)     p(Int)  p(effort) 
-0.5756099  0.4642960 -1.1250026 -2.7645433  0.1908941 
confint(m1, type="det")
               0.025     0.975
p(Int)    -4.0804922 -1.448594
p(effort)  0.0548083  0.326980

Subsequent models will include effort a a covariate in the detection component.

Next, we are going to continue by modelling first-year occupancy (ψ) by adding the covariates that might affect this (SiteCovs).

m4 <- colext(~elevation, # occupancy affected by elevation
             ~1, # colonisation constant
             ~1, # extinction constant
             ~effort, # detection prob affected by effort
             data = wolf)

m5 <- colext(~ruggedness, ~1, ~1, ~effort, data = wolf)

m6 <- colext(~feature_type, ~1, ~1, ~effort, data = wolf)

m7 <- colext(~elevation+ruggedness, ~1, ~1, ~effort, data = wolf)

m8 <- colext(~elevation+feature_type, ~1, ~1, ~effort, data = wolf)

m9 <- colext(~ruggedness+feature_type, ~1, ~1, ~effort, data = wolf)

m10 <- colext(~elevation+ruggedness+feature_type, ~1, ~1, ~effort, data = wolf)

And again, we can compare all these models against the null model:

dlist<-fitList(Null = m0,m1=m1, m2=m2, m3=m3, m4=m4, m5=m5, 
               m6=m6, m7=m7, m8=m8, m9=m9, m10=m10 ) 
selmod<-modSel(dlist,nullmod="Null") 
selmod
     nPars    AIC delta  AICwt cumltvWt   Rsq
m7       7 590.55  0.00 0.2604     0.26 0.118
m6       6 591.42  0.87 0.1686     0.43 0.099
m10      8 591.65  1.09 0.1507     0.58 0.124
m9       7 591.94  1.38 0.1305     0.71 0.109
m8       7 592.63  2.07 0.0923     0.80 0.104
m1       5 593.46  2.90 0.0609     0.86 0.071
m3       6 593.64  3.09 0.0556     0.92 0.083
m5       6 593.75  3.19 0.0528     0.97 0.083
m4       6 595.17  4.61 0.0259     1.00 0.073
Null     4 601.24 10.69 0.0012     1.00 0.000
m2       5 601.42 10.87 0.0011     1.00 0.014

The output shows the most parsimonious model is m7, however m6 is very close in terms of AIC, and even m10 and m9 are within 2 delta-AIC so they could be considered equally good.

The most parsimonious model (m7) contains both elevation and ruggedness as predictors of occupancy, so let’s focus on these two covariates.

However, when we look at the beta coefficients and their 95% confidence intervals, it shows that only elevation can be a significant predictor of occupancy, as its beta coefficient (-42.36) 95% confidence interval does not overal zero. See below:

coef(m7)
       psi(Int)  psi(elevation) psi(ruggedness)        col(Int)        ext(Int) 
     -5.5819496     -42.3673275      54.1234413      -0.0352738      -1.8827624 
         p(Int)       p(effort) 
     -2.9178441       0.2023893 
confint(m7, type="psi")
                     0.025     0.975
psi(Int)         -23.27146  12.10756
psi(elevation)  -341.74553 257.01088
psi(ruggedness) -393.05067 501.29756

Based on this output we are going to take forward the covariate elevation as a predictor of occupancy, whilst we take effort as a predictor of detection probability. The next step would be to model the two parameters special to multi-season occupancy models: colonisation (gamma γ) and extinction (epsilon ε). For this we could use season as the variable affecting colonisation, extinction, or both:

m11 <- colext(
  ~ elevation,   # Initial occupancy modeled as a function of ruggedness
  ~ season,       # Colonization probability varies by season
  ~ 1,            # Extinction probability is constant across seasons
  ~ effort,       # Detection probability is modeled as a function of effort
  data = wolf    # Use the unmarkedMultFrame object created earlier
)
m12 <- colext(~elevation, ~1, ~season, ~effort, data = wolf) # seasons as predictor of extinction
m13 <- colext(~elevation, ~season, ~season, ~effort, data = wolf) # seasons as predictor of colonisation AND extinction

And we can compare models

dlist<-fitList(Null = m0,m1=m1, m2=m2, m3=m3, m4=m4, m5=m5, 
               m6=m6, m7=m7, m8=m8, m9=m9, m10=m10,
               m11=m11, m12=m12, m13=m13) 
selmod<-modSel(dlist,nullmod="Null") 
selmod
     nPars    AIC delta   AICwt cumltvWt   Rsq
m13     10 587.77  0.00 0.41931     0.42 0.174
m7       7 590.55  2.79 0.10407     0.52 0.118
m12      8 590.76  3.00 0.09374     0.62 0.130
m11      8 590.91  3.14 0.08723     0.70 0.129
m6       6 591.42  3.66 0.06738     0.77 0.099
m10      8 591.65  3.88 0.06023     0.83 0.124
m9       7 591.94  4.17 0.05215     0.88 0.109
m8       7 592.63  4.86 0.03689     0.92 0.104
m1       5 593.46  5.69 0.02435     0.95 0.071
m3       6 593.64  5.87 0.02222     0.97 0.083
m5       6 593.75  5.98 0.02109     0.99 0.083
m4       6 595.17  7.40 0.01036     1.00 0.073
Null     4 601.24 13.48 0.00050     1.00 0.000
m2       5 601.42 13.66 0.00045     1.00 0.014

The output indicates that the most parsimonious model (m13) includes season as predictor of both colonisation and extinction, elevation as predictor of occupancy and effort as predictor of detection probability.

The 95% confidence intervals obtained below tell us:

  • psi(elevation): The coefficient for elevation is 1.13, with a 95% CI [0.06, 2.19], indicating that initial occupancy increases with elevation, and this effect is statistically significant (CI does not include zero).

  • p(effort): Effort has a positive coefficient of 0.20 with a CI [0.06, 0.33], showing that detection probability increases with more effort, which is statistically significant.

  • col(seasonsummer): Colonization during summer has a high positive coefficient (16.19) but with a very wide CI [-65.14, 97.52], suggesting that while there might be higher colonization in summer, the estimate is uncertain.

  • col(seasonwinter): Similarly, the colonization in winter (5.97) also has a wide CI [-60.83, 72.78], indicating high uncertainty and a lack of statistical significance.

  • ext(seasonsummer): The extinction rate in summer (1.15) has a CI [-1.45, 3.74], spanning zero, so it is not significant.

  • ext(seasonwinter): Winter extinction is highly variable (-8.02) with an extremely wide CI [-68.96, 52.92], indicating low reliability and likely insignificance.

coef(m13)
         psi(Int)    psi(elevation)          col(Int) col(seasonsummer) 
       -0.5915752         1.1279126        -7.1852380        16.1883587 
col(seasonwinter)          ext(Int) ext(seasonsummer) ext(seasonwinter) 
        5.9730702        -1.1702680         1.1456950        -8.0207685 
           p(Int)         p(effort) 
       -2.8344810         0.1987172 
confint(m13, type="psi")  # for occupancy (elevation)
                     0.025     0.975
psi(Int)       -2.04915265 0.8660022
psi(elevation)  0.06138207 2.1944432
confint(m13, type="col")  # For colonization
                      0.025    0.975
col(Int)          -73.91776 59.54728
col(seasonsummer) -65.14468 97.52139
col(seasonwinter) -60.83265 72.77879
confint(m13, type="ext")  # For extinction
                       0.025       0.975
ext(Int)           -2.282809 -0.05772694
ext(seasonsummer)  -1.445463  3.73685250
ext(seasonwinter) -68.959480 52.91794271
confint(m13, type="det")  # For detection (effort)
                0.025      0.975
p(Int)    -4.15239745 -1.5165645
p(effort)  0.06250073  0.3349337

Outcome:

  • Elevation positively influences initial occupancy: Sites at higher elevations have a higher probability of being initially occupied.

  • Detection probability is low but increases with effort: The model suggests that more survey effort improves detection.

  • Seasonal effects on colonization and extinction show high variability and large confidence intervals, suggesting the model does not have enough data to confidently estimate seasonal impacts on these parameters.

You can visualise this:

Elevation as predictor of initial occupancy

# Load necessary libraries
library(ggplot2)
library(dplyr)

# Load your data to determine the range of standardized elevation
min_elevation <- min(siteCovs$elevation, na.rm = TRUE)  
max_elevation <- max(siteCovs$elevation, na.rm = TRUE)

# Generate standardized elevation values for prediction
elevation_vals <- seq(min_elevation, max_elevation, length.out = 100)

# Calculate the mean effort across all columns for each row
effort_means <- rowMeans(effort, na.rm = TRUE)

# Use the overall mean effort for predictions
mean_effort <- mean(effort_means, na.rm = TRUE)

# Create a data frame for predictions, keeping effort constant
elev_data <- data.frame(elevation = elevation_vals, season = "default_season", effort = mean_effort)

# Make predictions for occupancy based on standardized elevation values with confidence intervals
occupancy_preds <- predict(m13, newdata = elev_data, type = "psi", se.fit = TRUE)

# Add predictions and confidence intervals to data frame for plotting
elev_data$Occupancy <- occupancy_preds$Predicted
elev_data$CI_Lower <- occupancy_preds$Predicted - 1.96 * occupancy_preds$SE
elev_data$CI_Upper <- occupancy_preds$Predicted + 1.96 * occupancy_preds$SE

# Plot the effect of standardized elevation on occupancy with confidence intervals
occupancy_plot <- ggplot(elev_data, aes(x = elevation, y = Occupancy)) +
  geom_line(color = "blue") +
  geom_ribbon(aes(ymin = CI_Lower, ymax = CI_Upper), alpha = 0.2, fill = "blue") +
  labs(title = "Effect of Standardized Elevation on Occupancy Probability",
       x = "Standardized Elevation", y = "Occupancy Probability") +
  theme_minimal()

# Display the plot
occupancy_plot

Effort as predictor of initial detection probability

# Load necessary libraries
library(ggplot2)
library(dplyr)

# Calculate the mean effort across all columns for each row
effort_means <- rowMeans(effort, na.rm = TRUE)

# Generate effort values for prediction
effort_vals <- seq(min(effort_means, na.rm = TRUE), max(effort_means, na.rm = TRUE), length.out = 100)

# Use the overall mean elevation for predictions
mean_elevation <- mean(siteCovs$elevation, na.rm = TRUE)

# Create a data frame for predictions, keeping elevation constant
effort_data <- data.frame(elevation = mean_elevation, season = "default_season", effort = effort_vals)

# Make predictions for detection probability based on effort with confidence intervals
detection_preds <- predict(m13, newdata = effort_data, type = "det", se.fit = TRUE)

# Add predictions and confidence intervals to data frame for plotting
effort_data$Detection_Probability <- detection_preds$Predicted
effort_data$CI_Lower <- detection_preds$Predicted - 1.96 * detection_preds$SE
effort_data$CI_Upper <- detection_preds$Predicted + 1.96 * detection_preds$SE

# Plot the effect of effort on detection probability with confidence intervals
detection_plot <- ggplot(effort_data, aes(x = effort, y = Detection_Probability)) +
  geom_line(color = "blue") +
  geom_ribbon(aes(ymin = CI_Lower, ymax = CI_Upper), alpha = 0.2, fill = "blue") + 
  labs(title = "Effect of Effort on Detection Probability",
       x = "Effort", y = "Detection Probability") +
  theme_minimal()

# Display plot
print(detection_plot)