Basic Camera Trap descriptors - Mara Workshop

Author

Antonio Uzal

Published

January 1, 2026


Introduction

This tutorial covers basic manipulation of data and calculations directed at obtaining basic Camera Trapping descriptors.

The case study is based on the current data obtained from an area that includes Enarau and adjacent landscape

The dataset

Obtained via Wildlife Insights. The download output includes 5 files:

  • projects.csv > description of the project

  • cameras.csv > description of cameras (make, model, serial number and ID)

  • deployments.csv > description of deployments, including location, start and end date, camera ID used and landscape feature where the deployment took place (e.g. trail game)

  • images_2010431.csv > raw camera trap data for single images

  • sequences.csv > raw camera trap data for sequences of images

Import datasets

Show Code
deployment <- read.csv("deployments.csv",
                 header = TRUE, 
                 sep = ",", 
                 stringsAsFactors = FALSE)
camtraps <- read.csv("sequences.csv", 
                 header = TRUE, 
                 sep = ",", 
                 stringsAsFactors = FALSE)

Obtaining descriptors

Firstly, install required packages, if they are not already in your computer, and load them:

Show Code
#Load Packages
list.of.packages <- c(
                      "leaflet",       # creates interactive maps
                      "plotly",        # creates interactive plots   
                      "kableExtra",    # Creates interactive tables 
                      "tidyr",         # A package for data manipulation
                      "dplyr",         # A package for data manipulation
                      "viridis",       # Generates colors for plots  
                      "corrplot",      # Plots pairwise correlations
                      "lubridate",     # Easy manipulation of date objects
                      "taxize",        # Package to check taxonomy 
                      "sf",            # Package for spatial data analysis 
                      "reshape",       # allows restructuring of data
                      "vegan",         # tools for descriptive parameters in ecology
                      "plotrix",       # visualises circular data (activity)
                      "chron",         # handles chronological objects
                      "knitr")         # dynamic reporting (nice tables) 
                              

# Check you have them in your library
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
# install any missing packages and load them
if(length(new.packages)) install.packages(new.packages,repos = "http://cran.us.r-project.org")
lapply(list.of.packages, require, character.only = TRUE)

Checking data

Quick set of functions to make sure everything looks like it should.

Show Code
names(deployment) # return the name of variables contained in dataset
 [1] "project_id"                 "deployment_id"             
 [3] "placename"                  "longitude"                 
 [5] "latitude"                   "start_date"                
 [7] "end_date"                   "bait_type"                 
 [9] "bait_description"           "feature_type"              
[11] "feature_type_methodology"   "camera_id"                 
[13] "camera_name"                "quiet_period"              
[15] "camera_functioning"         "sensor_height"             
[17] "height_other"               "sensor_orientation"        
[19] "orientation_other"          "plot_treatment"            
[21] "plot_treatment_description" "detection_distance"        
[23] "subproject_name"            "subproject_design"         
[25] "event_name"                 "event_description"         
[27] "event_type"                 "recorded_by"               
[29] "remarks"                   
Show Code
names(camtraps)
 [1] "project_id"              "deployment_id"          
 [3] "sequence_id"             "is_blank"               
 [5] "identified_by"           "wi_taxon_id"            
 [7] "class"                   "order"                  
 [9] "family"                  "genus"                  
[11] "species"                 "common_name"            
[13] "uncertainty"             "start_time"             
[15] "end_time"                "group_size"             
[17] "age"                     "sex"                    
[19] "animal_recognizable"     "individual_id"          
[21] "individual_animal_notes" "behavior"               
[23] "highlighted"             "markings"               
[25] "cv_confidence"           "license"                

We can also look at the structure of each of the variables using the str function. Output not shown here.

Show Code
str(deployment)
str(camtraps)

Let’s obtain some numerical outputs:

Number of photos

Show Code
number <- nrow(camtraps) # number of rows in the dataset
print(paste("Number of photographs:", number))
[1] "Number of photographs: 1219"

Notice that the number differs from the Wildlife Insights summary, this might be because photographs with humans were filtered out.

Species

How many different species have been captured and which ones? in the dataset the full species name is divided in genus and species so we need to combine this first

Show Code
# Combine genus and species into a full species name
camtraps$full_species <- paste(camtraps$genus, camtraps$species, sep = " ")

# Extract unique species
unique_species <- unique(camtraps$full_species)

# Count the number of unique species
print(paste("Number of unique species:",length(unique_species)))
[1] "Number of unique species: 40"
Show Code
# View the unique species
print(unique_species)
 [1] " "                          "Bos taurus"                
 [3] "Equus quagga"               "Madoqua "                  
 [5] "Tragelaphus scriptus"       "Papio anubis"              
 [7] "Phacochoerus africanus"     "Lepus capensis"            
 [9] "Civettictis civetta"        "Loxodonta africana"        
[11] "Leptailurus serval"         "Connochaetes taurinus"     
[13] "Ovis aries"                 "Capra aegagrus hircus"     
[15] "Orycteropus afer"           "Ardeotis kori"             
[17] "Canis familiaris"           "Numida meleagris"          
[19] "Genetta genetta"            "Canis "                    
[21] "Sylvicapra grimmia"         "Crocuta crocuta"           
[23] "Potamochoerus larvatus"     "Madoqua guentheri"         
[25] "Chlorocebus pygerythrus"    "Ichneumia albicauda"       
[27] "Hystrix africaeaustralis"   "Mungos mungo"              
[29] "Hylochoerus meinertzhageni" "No CV Result No CV Result" 
[31] "Capra "                     "Equus asinus"              
[33] "Mellivora capensis"         "Otolemur crassicaudatus"   
[35] "Equus "                     "Felis catus"               
[37] "Oreotragus oreotragus"      "Hystrix "                  
[39] "Caracal caracal"            "Aepyceros melampus"        

The number of species (n=40) differs from the WI summary (n=57). This might be because there are some species that have not been catalogued yet. WI summary includes them, whilst we don’t do it.

There were some photographs that containeded no species (i.e. only ” ” is returned). How often does this happen?

Show Code
# Count how many times the genus has not been identified, and therefor the photograph does not contain relevant species
empty_genus <- sum(camtraps$genus == "")

# View the count
print(empty_genus)
[1] 73
Show Code
# Proportion of dataset
proportion <- (empty_genus/nrow(camtraps))*100
print(paste("Percentage of photographs without animal captures:",proportion))
[1] "Percentage of photographs without animal captures: 5.98851517637408"

This has happened in >5% of the photographs. Remember that photographs containing humans were already filtered out by WI. Looking at the dataset, this is mainly related to the camera being triggered by passing vehicles.

Deployments

We would like to know if all deployments were successful (i.e. the cameras took photos, regardless if animals were detected)

Show Code
#count number of deployments
unique_deployments <- unique(deployment$deployment_id)
num_all_deployments <- length(unique_deployments)
#count number of successful deployments
successful_deployments<- unique(camtraps$deployment_id)
num_successful_deployments <-length(successful_deployments)
#how many were not successful?
total<-num_all_deployments - num_successful_deployments
print(paste("Number of deployments with no captures:", total))
[1] "Number of deployments with no captures: 4"

Four camera stations had no photos, which could be due to malfunction, damage or lost, or simply nothing triggered the sensors. You might want to double check this by downloading again the dataset without filtering humans as the first photograph should contain the researcher holding a white board with deployment info.

Start and End of sampling

First we need to make sure that the start_date and end_date columns are properly formatted as date-time objects.

lubridate is an awesome package that allows fixing all kind of issues with dates and times. ynd_hms() from lubridate automatically recognises and converts dates with timezones:

Show Code
library(lubridate)

# Ensure character if needed (e.g., factors)
deployment$start_date <- as.character(deployment$start_date)
deployment$end_date   <- as.character(deployment$end_date)

# Parse as local Nairobi time
deployment$start_date <- ymd_hms(deployment$start_date, tz = "Africa/Nairobi")
deployment$end_date   <- ymd_hms(deployment$end_date,   tz = "Africa/Nairobi")

# Optional: keep a UTC copy for joins, modelling, or cross-site comparisons
deployment$start_date_utc <- with_tz(deployment$start_date, "UTC")
deployment$end_date_utc   <- with_tz(deployment$end_date,   "UTC")

# Sanity checks
stopifnot(!any(is.na(deployment$start_date)))
stopifnot(!any(is.na(deployment$end_date)))


# Check for NAs after conversion to ensure it has been successful
print(sum(is.na(deployment$start_date)))  
[1] 0
Show Code
# Should return 0 if conversion is successful

Now we can work out the start and end of the sampling period across all deployments

Show Code
# Find the earliest start date
earliest_start_date <- min(deployment$start_date, na.rm = TRUE)

# Find the latest end date
latest_end_date <- max(deployment$end_date, na.rm = TRUE)

# View the results
print(paste("Earliest start date:", earliest_start_date))
[1] "Earliest start date: 2025-09-30"
Show Code
print(paste("Latest end date:", latest_end_date))
[1] "Latest end date: 2025-12-24"

Deployment interval

It is useful to know the survey effort per station/deployment and identify sources of imbalance or plainly, when things have gone wrong. For this,, we can calculate the deployment interval: the number of days between start and end date.

Show Code
# Calculate the interval for each deployment
deployment$interval <- interval(deployment$start_date, deployment$end_date)

# If you want to calculate the duration in days, hours, etc.
deployment$duration_days <- as.numeric(deployment$interval) / (60 * 60 * 24)  # Convert to days

# Check the results
print(head(deployment[, c("start_date", "end_date", "duration_days")],5))
  start_date   end_date duration_days
1 2025-10-01 2025-10-21            20
2 2025-10-20 2025-10-26             6
3 2025-10-06 2025-10-20            14
4 2025-09-30 2025-10-20            20
5 2025-12-10 2025-12-24            14
Show Code
# Or you can summarise the range of values obtained
summary(deployment$duration_days)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00   14.00   20.00   18.61   21.00   54.00 

The average interval was around 19 days, but the variation is quite large… are we sure there was a camera deployed for 54 days?.

Sampling Effort

This is an important parameter, that will allow us to obtain others, such as the naive occupancy. Sampling effort is considered here as the number of camera days, so that is a simple line of coding:

Show Code
# Sum the duration_days to calculate total camera days (sampling effort)
total_camera_days <- sum(deployment$duration_days, na.rm = TRUE)

# Round the result to NO decimal places
total_camera_days_rounded <- round(total_camera_days, 0)

# View the rounded result
print(paste("Total camera days (sampling effort):", total_camera_days_rounded))
[1] "Total camera days (sampling effort): 856"

Camera Locations

A common mistake in camera trap data sets is that the locations are not where they are supposed to be. The safest way to check your data is to plot it.

The package leaflet is awesome for rapid and interactive maps and plots. The next R code produces the simplest version of a leaflet map. Notice you can zoom in and out!

Show Code
m <- leaflet() %>%             # call leaflet
        addTiles() %>%         # add the default basemap
        addCircleMarkers(      # Add circles for stations
          lng=deployment$longitude, lat=deployment$latitude) 
m                              # return the map
Tip

You can change the base map and use some satellite imagery

Show Code
m <- leaflet() %>%             # call leaflet
        addProviderTiles(providers$Esri.WorldImagery) %>% #Add Esri Wrold imagery  
        addCircleMarkers(      # Add circles for stations
          lng=deployment$longitude, lat=deployment$latitude) 
m                              # return the map

Relative Index of Abundance

The RIA is obtained by calculating the number of independent captures per species considering the sampling effort previously obtained (total_camera_days_rounded). We will exclude all blank records (i.e. those that do not have any genus identified). Remember that you need to think carefully what you consider as independent captures. For instance, you might want to consider independent captures only when there is a 30 minutes interval in between two captures of the same species, to avoid ‘double counting’ individuals that might stay around the camera, for example grazing. Examples below consider this (i.e. ‘threshold’).

A first step is dealing with the timestamp of the sequence. We will consider the timestamp of the sequence as the date and time of the first photograph of the sequence.

Show Code
# Because we are using sequences, we will use as timestamp for the event the time of the first photograph of the series
  
  camtraps$timestamp <- camtraps$start_time
  
  #  Convert timestamp column in camtraps to POSIXct if needed
  camtraps$timestamp <- as.POSIXct(camtraps$timestamp, format = "%Y-%m-%d %H:%M:%S")

Firstly, we can obtain a RAI for each species at each deployment/station:

Show Code
# Define a custom function event.sp to calculate trapping events and Relative Index of Abundance (RIA)
event.sp <- function(camtraps, 
                     deployment, 
                     total_camera_days_rounded, 
                     time_threshold = NULL) {
  # event.sp: This is the name of the function being defined. 
    # It is a convention in R to use the function_name 
    # <- function(...) syntax to create a new function.
  
  #camtraps: This parameter expects a dataset 
    #(typically a data frame) containing information about 
    #camera trap photographs. 
  
  #deployment: This parameter is intended for another dataset,
    #which contains information about camera trap deployments.
    #This dataset will often be used to summarise or correlate
    #the data from the camtraps dataset.
  
  #total_camera_days_rounded: This parameter is passed to the
    #function and represents a numerical value indicating the
    #total camera days, rounded to the nearest integer. This
    #value is important for calculations (e.g.Relative
    #Index of Abundance -RIA), as it reflects the effort over
    #which species observations occurred.
  
  #time_threshold = NULL: This parameter is optional and has a
    #default value of NULL. If a value is provided when calling
    #the function, it will be used as the time threshold for
    #determining whether consecutive observations of the same
    #species are considered separate events. For example, if you
    #set a time threshold (e.g., 30 minutes), only observations
    #separated by more than 30 minutes would be counted as
    #separate trapping events.
  
  # Check for missing values in genus
  missing_genus <- sum(is.na(camtraps$genus) | camtraps$genus == "")
  
  # Print count of missing genus values
  if (missing_genus > 0) {
    cat("Warning: Found", missing_genus, "missing or empty values in 'genus'. Rows with missing genus will be excluded.\n")
  }
  
  # Filter out rows with missing or empty genus
  camtraps <- camtraps %>%
    filter(!is.na(genus) & genus != "")
  
  # Combine 'genus' and 'species' into a new 'full_species' column in the camtraps dataset
  camtraps <- camtraps %>%
    mutate(
      full_species = paste(genus, species, sep = " "),  # Combine genus and species
      full_species = trimws(full_species)  # Remove any leading/trailing whitespace
    ) 
  
  
  # Calculate trapping events either with or without a time threshold
   # the code first checks if time_threshold is provided:
      #Without a time threshold: All detections of a species within a deployment are counted as a single trapping event.
    #With a time threshold: The time between detections is used to determine whether a new trapping event should be counted based on the time gap between detections.
  
  if (is.null(time_threshold)) { # time_threshold not provided
    trapping_events <- camtraps %>% 
      group_by(full_species, deployment_id) %>% # The code groups camtraps by full_species and deployment_id 
      summarise(trapping_event_count = n(), .groups = 'drop') # counts records for each species-deployment combination
   
  } else { # there is a time threshold
    # Sort by deployment, species, and timestamp
    camtraps <- camtraps %>%
      arrange(deployment_id, full_species, timestamp)
    
    # Calculate time difference between consecutive events
    trapping_events <- camtraps %>%
      group_by(full_species, deployment_id) %>%
      mutate(time_diff = c(NA, diff(as.numeric(timestamp)))) %>%
      # diff function calculates the difference in seconds 
      # betweenc onsecutive timestamp values for each group, 
      # resulting in time_diff
      # The c(NA, diff(...)) syntax is used to add NA at the
      # beginning, as the first row has no previous timestamp 
      # to compare to.
      mutate(new_event = ifelse(is.na(time_diff) | time_diff > time_threshold, 1, 0)) %>%
      # A new trapping event (new_event = 1) is flagged if
      # time_diff is NA (the first record) or if time_diff
      # exceeds time_threshold.
      # Otherwise, new_event is set to 0.
      summarise(trapping_event_count = sum(new_event), .groups = 'drop')
  }   # For each group, the code sums new_event values, giving
      # the count of unique trapping events based on the time
      # threshold.
    
  
  # Calculate RIA using the total camera days rounded
  trapping_events <- trapping_events %>%
    mutate(RIA = trapping_event_count / total_camera_days_rounded)  # Calculate RIA

  # Return the trapping events summary with RIA
  return(trapping_events)
}
Show Code
# Example usage
# Assuming total_camera_days_rounded is defined elsewhere in your code

# Without time threshold
trapping_events_no_threshold <- event.sp(camtraps, deployment, total_camera_days_rounded) 
Warning: Found 73 missing or empty values in 'genus'. Rows with missing genus will be excluded.
Show Code
        # notice time_threshold is not included as the default is NULL

# With a 30-minute threshold (in seconds)
trapping_events_with_threshold <- event.sp(camtraps, deployment, total_camera_days_rounded, time_threshold = 60 * 60)
Warning: Found 73 missing or empty values in 'genus'. Rows with missing genus will be excluded.
Show Code
# View the results
print(trapping_events_no_threshold)
# A tibble: 198 × 4
   full_species       deployment_id  trapping_event_count     RIA
   <chr>              <chr>                         <int>   <dbl>
 1 Aepyceros melampus 2025_164_R1                       1 0.00117
 2 Ardeotis kori      133 9/30/2025                     1 0.00117
 3 Ardeotis kori      2025_133_R1                       2 0.00234
 4 Bos taurus         126 10/2/2025                     9 0.0105 
 5 Bos taurus         128 10/13/2025                    1 0.00117
 6 Bos taurus         132 9/30/2025                    40 0.0467 
 7 Bos taurus         133 9/30/2025                    46 0.0537 
 8 Bos taurus         134 9/30/2025                     1 0.00117
 9 Bos taurus         135 10/2/2025                    73 0.0853 
10 Bos taurus         138 10/13/2025                    5 0.00584
# ℹ 188 more rows
Show Code
print(trapping_events_with_threshold)
# A tibble: 198 × 4
   full_species       deployment_id  trapping_event_count     RIA
   <chr>              <chr>                         <dbl>   <dbl>
 1 Aepyceros melampus 2025_164_R1                       1 0.00117
 2 Ardeotis kori      133 9/30/2025                     1 0.00117
 3 Ardeotis kori      2025_133_R1                       2 0.00234
 4 Bos taurus         126 10/2/2025                     7 0.00818
 5 Bos taurus         128 10/13/2025                    1 0.00117
 6 Bos taurus         132 9/30/2025                     3 0.00350
 7 Bos taurus         133 9/30/2025                    15 0.0175 
 8 Bos taurus         134 9/30/2025                     1 0.00117
 9 Bos taurus         135 10/2/2025                    27 0.0315 
10 Bos taurus         138 10/13/2025                    3 0.00350
# ℹ 188 more rows

But typically we are more interested in the overall results, the RAI for each species, across the study area:

Show Code
# Assuming trapping_events_no_threshold is obtained from the event.sp function and contains the trapping event data
   # if you have set up a threshold, like above you should use
   # trapping_events_with_threshold in this R chunk

# Summarize results across all deployments
species_summary <- trapping_events_no_threshold %>%
  group_by(full_species) %>%
  summarise(
    total_trapping_event_count = sum(trapping_event_count, na.rm = TRUE),  # Sum of trapping events
    total_RAI = sum(RIA, na.rm = TRUE),  # Sum of RAI
    .groups = 'drop'  # Drop grouping after summarization
  ) %>%
  arrange(desc(total_trapping_event_count))  # Optional: sort by total trapping events

# Print a nicely formatted table

kable(species_summary, 
      format = "html",   # C
      col.names = c("Species", "Total Trapping Events", "Total RAI"), 
      caption = "Summary of Trapping Events and RAI by Species")
Summary of Trapping Events and RAI by Species
Species Total Trapping Events Total RAI
Bos taurus 488 0.5700935
Capra aegagrus hircus 133 0.1553738
Equus quagga 92 0.1074766
Chlorocebus pygerythrus 81 0.0946262
Tragelaphus scriptus 80 0.0934579
Ovis aries 57 0.0665888
Papio anubis 24 0.0280374
Sylvicapra grimmia 23 0.0268692
Connochaetes taurinus 22 0.0257009
Madoqua 18 0.0210280
No CV Result No CV Result 17 0.0198598
Crocuta crocuta 14 0.0163551
Numida meleagris 13 0.0151869
Orycteropus afer 12 0.0140187
Phacochoerus africanus 10 0.0116822
Genetta genetta 7 0.0081776
Potamochoerus larvatus 7 0.0081776
Oreotragus oreotragus 6 0.0070093
Civettictis civetta 5 0.0058411
Canis familiaris 4 0.0046729
Lepus capensis 4 0.0046729
Ardeotis kori 3 0.0035047
Hystrix africaeaustralis 3 0.0035047
Otolemur crassicaudatus 3 0.0035047
Capra 2 0.0023364
Equus asinus 2 0.0023364
Ichneumia albicauda 2 0.0023364
Leptailurus serval 2 0.0023364
Madoqua guentheri 2 0.0023364
Aepyceros melampus 1 0.0011682
Canis 1 0.0011682
Caracal caracal 1 0.0011682
Equus 1 0.0011682
Felis catus 1 0.0011682
Hylochoerus meinertzhageni 1 0.0011682
Hystrix 1 0.0011682
Loxodonta africana 1 0.0011682
Mellivora capensis 1 0.0011682
Mungos mungo 1 0.0011682

Looking at these results we could make the rather superficial (and most likely erroneous) assumption that red deer is the most abundant species in the study area. This obviously would not consider all sort of factors affecting detectability, which were discussed during the module sessions.

You will see below how the RAI changes dramatically if you use a threshold/interval of 60 minutes between detections to be considered independent:

Summary of Trapping Events and RAI by Species
Species Total Trapping Events Total RAI
Bos taurus 194 0.2266355
Capra aegagrus hircus 74 0.0864486
Tragelaphus scriptus 71 0.0829439
Equus quagga 54 0.0630841
Ovis aries 40 0.0467290
Chlorocebus pygerythrus 24 0.0280374
Sylvicapra grimmia 21 0.0245327
Madoqua 18 0.0210280
No CV Result No CV Result 15 0.0175234
Crocuta crocuta 13 0.0151869
Papio anubis 13 0.0151869
Orycteropus afer 11 0.0128505
Genetta genetta 7 0.0081776
Phacochoerus africanus 7 0.0081776
Numida meleagris 6 0.0070093
Civettictis civetta 5 0.0058411
Connochaetes taurinus 5 0.0058411
Oreotragus oreotragus 5 0.0058411
Potamochoerus larvatus 5 0.0058411
Canis familiaris 4 0.0046729
Ardeotis kori 3 0.0035047
Hystrix africaeaustralis 3 0.0035047
Lepus capensis 3 0.0035047
Otolemur crassicaudatus 3 0.0035047
Capra 2 0.0023364
Equus asinus 2 0.0023364
Ichneumia albicauda 2 0.0023364
Leptailurus serval 2 0.0023364
Madoqua guentheri 2 0.0023364
Aepyceros melampus 1 0.0011682
Canis 1 0.0011682
Caracal caracal 1 0.0011682
Equus 1 0.0011682
Felis catus 1 0.0011682
Hylochoerus meinertzhageni 1 0.0011682
Hystrix 1 0.0011682
Loxodonta africana 1 0.0011682
Mellivora capensis 1 0.0011682
Mungos mungo 1 0.0011682

Naive Occupancy

The calculation of naive occupancy (deployment/stations positive to the presence of each species on the total number of deployment/stations over the study area) is relatively simple and you could do it by hand or on Excel. You can also do it using R!.

Show Code
# Load required libraries
library(knitr)  # For kable function

# Calculate Naive Occupancy
naive_occupancy <- camtraps %>%
  # Combine 'genus' and 'species' into a new 'full_species' column
  mutate(full_species = paste(genus, species, sep = " ")) %>%
  # Filter out rows with missing genus
  filter(!is.na(genus) & genus != "") %>%
  # Count unique deployments for each species
  group_by(full_species, deployment_id) %>%
  summarise(detected = n() > 0, .groups = 'drop') %>%  # Create a presence indicator
  group_by(full_species) %>%
  summarise(
    number_of_positive_deployments = sum(detected),  # Count of unique deployments where species was detected
    .groups = 'drop'  # Drop grouping after summarization
  ) %>%
  # Join with the deployment dataset to get total number of deployments
  left_join(deployment %>% summarise(total_deployments = n_distinct(deployment_id)), by = character()) %>%
  # Finalize naive occupancy calculation
  mutate(naive_occupancy = number_of_positive_deployments / total_deployments) %>%
  select(species = full_species, number_of_positive_deployments, total_deployments, naive_occupancy) %>%
  arrange(desc(naive_occupancy))  # Optional: sort by naive occupancy

# Print the naive occupancy results with formatted table

# Use kable to print the table in HTML format
kable(naive_occupancy, format = "html", caption = "Naive Occupancy Summary")
Naive Occupancy Summary
species number_of_positive_deployments total_deployments naive_occupancy
Bos taurus 29 46 0.6304348
Tragelaphus scriptus 20 46 0.4347826
Capra aegagrus hircus 17 46 0.3695652
Equus quagga 15 46 0.3260870
Ovis aries 10 46 0.2173913
Madoqua 9 46 0.1956522
No CV Result No CV Result 9 46 0.1956522
Sylvicapra grimmia 8 46 0.1739130
Chlorocebus pygerythrus 7 46 0.1521739
Crocuta crocuta 7 46 0.1521739
Genetta genetta 6 46 0.1304348
Orycteropus afer 6 46 0.1304348
Civettictis civetta 5 46 0.1086957
Papio anubis 5 46 0.1086957
Canis familiaris 4 46 0.0869565
Numida meleagris 4 46 0.0869565
Hystrix africaeaustralis 3 46 0.0652174
Lepus capensis 3 46 0.0652174
Phacochoerus africanus 3 46 0.0652174
Potamochoerus larvatus 3 46 0.0652174
Ardeotis kori 2 46 0.0434783
Capra 2 46 0.0434783
Equus asinus 2 46 0.0434783
Ichneumia albicauda 2 46 0.0434783
Leptailurus serval 2 46 0.0434783
Madoqua guentheri 2 46 0.0434783
Aepyceros melampus 1 46 0.0217391
Canis 1 46 0.0217391
Caracal caracal 1 46 0.0217391
Connochaetes taurinus 1 46 0.0217391
Equus 1 46 0.0217391
Felis catus 1 46 0.0217391
Hylochoerus meinertzhageni 1 46 0.0217391
Hystrix 1 46 0.0217391
Loxodonta africana 1 46 0.0217391
Mellivora capensis 1 46 0.0217391
Mungos mungo 1 46 0.0217391
Oreotragus oreotragus 1 46 0.0217391
Otolemur crassicaudatus 1 46 0.0217391

Results seem to indicate that the most widespread species are foxes and red deer (over 70% of cameras confirm their presence), followed by wolves and brown bears (over 50%)…but again, results are impacted by multiple factors that we have not controlled for.

You can also visualise naive occupancy on a very basic map using the package camtrapR:

Show Code
library(camtrapR)
par(mar = c(5, 5, 5, 10))
BasicMap<-detectionMaps(CTtable =deployment, 
                         recordTable =camtraps,
                         Xcol ="longitude", 
                         Ycol ="latitude", 
                         stationCol ="deployment_id", 
                         speciesCol ="full_species",
                         printLabels =FALSE, #no need to show labels
                         richnessPlot=TRUE,
                         addLegend=FALSE)

Species Accumulation Curve

Resources always limit our projects (e.g. workforce, camera traps, transport, etc). That is why it is important that we try to work out the sampling effort required to characterise the animal community of our study area to make sure our resources are only deployed when needed.

In camera trapping, typically, the sampling effort is measured by the number of camera trap days. We would like to work out how many camera trap days are required to detect the majority of species in our study area. Please note that this only applies for projects focused on producing an inventory.

Show Code
# Step 1: Ensure date columns are Date types
deployment <- deployment %>%
  mutate(start_date = as.Date(start_date),
         end_date = as.Date(end_date))

The mutate function is used to convert the start_date and end_date columns in the deployment dataset to Date types. This ensures that date calculations and comparisons can be performed accurately.

Show Code
# Step 2: Create a date sequence for each deployment
deployment_dates <- deployment %>%
  select(deployment_id, start_date, end_date) %>%
  # Create a data frame to hold all the dates
  rowwise() %>%
  do(data.frame(deployment_id = .$deployment_id, 
                 date = seq(.$start_date, .$end_date, by = "day"))) %>%
  ungroup()  # Expand the rows for each date
  • The code selects relevant columns (deployment_id, start_date, and end_date) from the deployment dataset.

  • The rowwise function allows operations to be performed on each row individually.

  • The do function creates a new data frame for each row, generating a sequence of dates from start_date to end_date using seq(). This captures all days for each deployment.

  • Finally, ungroup() is called to revert the data back to a regular data frame structure after generating the sequences.

Show Code
# Step 3: Combine genus and species into full_species
camtraps <- camtraps %>%
  mutate(full_species = paste(genus, species, sep = " "),
         timestamp = as.Date(timestamp)) %>%  # Convert timestamp to Date
  select(deployment_id, full_species, timestamp)
  • The camtraps dataset is modified to create a new column full_species, which combines the genus and species columns into a single string, facilitating easier identification of species.

  • The timestamp column is also converted to Date format.

  • The select function is used to keep only relevant columns for further analysis: deployment_id, full_species, and timestamp.

Show Code
# Step 4: Count unique species detected per day for each deployment
species_per_day <- deployment_dates %>%
  left_join(camtraps, by = c("deployment_id", "date" = "timestamp")) %>%
  group_by(deployment_id, date) %>%
  summarise(species_detected = list(unique(full_species)), .groups = 'drop') %>%
  ungroup()
  • The deployment_dates dataset is left-joined with the camtraps dataset using deployment_id and matching date with timestamp.

  • The result is grouped by deployment_id and date, allowing for summarization of the species detected each day.

  • The summarise function collects the unique species detected for each day into a list called species_detected.

  • The .groups = 'drop' argument prevents additional grouping metadata from being carried over to the next steps.

  • Finally, ungroup() is called to revert the dataset to a regular data frame.

Show Code
# Step 5: Calculate cumulative number of new species detected
# Initialize variables
cumulative_species <- numeric(nrow(species_per_day))

# Loop through each date and calculate cumulative new species
detected_species <- character(0)  # Start with no species detected

for (i in seq_len(nrow(species_per_day))) {
  # Get newly detected species for the day
  new_species <- species_per_day$species_detected[[i]]
  
  # Update the cumulative list
  detected_species <- union(detected_species, new_species)
  
  # Store the cumulative count
  cumulative_species[i] <- length(detected_species)
}

# Combine with dates
species_per_day <- species_per_day %>%
  mutate(cumulative_species = cumulative_species)
  • An empty numeric vector cumulative_species is initialized to store the cumulative counts of species.

  • A character vector detected_species is initialized to keep track of species detected up to the current date.

  • A loop iterates over each row of species_per_day. For each date:

    • The unique species detected on that day is retrieved.

    • The union function updates the detected_species list to include any new species detected.

    • The cumulative count of detected species is updated and stored in cumulative_species.

  • After the loop, the cumulative_species vector is added to the species_per_day dataset.

Show Code
# Step 6: Count camera days
camera_days <- species_per_day %>%
  mutate(camera_day = row_number())

The camera_days dataset is created by adding a new column camera_day to species_per_day. This column is generated using row_number(), which gives a sequential count of camera trap days.

Show Code
# Step 7: Plot the results
ggplot(camera_days, aes(x = camera_day, y = cumulative_species)) +
  geom_line(color = "blue", linewidth = 1) +  # Use `linewidth` instead of `size`
  geom_point(color = "red") +
  labs(
    x = "Number of Camera Days",
    y = "Cumulative Number of New Species Detected",
    title = "Species Accumulation Curve"
  ) +
  theme_minimal()

  • The ggplot function creates a plot using camera_days as the dataset.

  • The x-axis represents the total number of camera days, while the y-axis shows the cumulative number of new species detected.

  • geom_line draws a blue line representing the cumulative species count over time, and geom_point adds red points to indicate specific data points.

  • The labs function labels the axes and sets the title of the plot.

  • theme_minimal() applies a clean and simple theme to the plot for better aesthetics.

The result is a Species Accumulation Curve showing that after 2200 camera traps there is no much point increasing the sampling effort.