Activity Analyses in R

Author

Antonio Uzal

Published

October 1, 2024

Activity Analyses using Camera Trapping data

Background and data source strategy

There is an excellent paper that summarises the potential of camera trapping to investigate animal activity patterns and temporal niche partitioning that you should read:(Frey et al. 2017) . The paper describes the advantages of using camera trapping, but also highlights the challenges and limitations of methodologies. One of the most interesting questions that activity analyses based on camera trapping data are able to explore is how sympatric species partition their activities to promote coexistence (i.e. niche partitioning). Furthemore, we can also investigate how environmental stimuli, such as predation risk, resource availability and anthropogenic pressures, may provoke changes to activity patterns.

From the perspective of data source strategy, it is important to note that (Peral, Landman, and Kerley 2022) highlighted that “…. nearly 70% of studies using such data (camera traps) to estimate activity patterns apply a time-to-independence data filter to discard appreciable periods of sampling effort. This treatment of activity as a discrete event emerged from the use of camera trap data to estimate animal abundances, but does not reflect the continuous nature of behavior, and may bias resulting estimates of activity patterns. rinting of code (only output is displayed).”

This passage highlights a common approach in wildlife studies that use camera trap data to assess animal activity patterns. Specifically:

  • Nearly 70% of studies apply a “time-to-independence” filter. This filter treats animal detections as “independent” if they occur after a certain time interval, excluding repeated detections of the same animal within short timeframes.

  • This practice originated in studies estimating animal abundance, where discrete events (individual sightings) are ideal for counting individuals without repeated observations inflating counts.

However, this filtering approach may bias estimates of activity patterns because:

  • Animal activity is continuous, not discrete; filtering for independent detections ignores behaviours where animals linger in an area or frequently revisit, leading to gaps in data that don’t accurately reflect true activity levels.

  • By discarding potentially informative data, researchers may overlook behavioral nuances, underestimating certain aspects of activity that are integral to understanding animal routines or habitat use.

The authors conclude “…the application of time-to-independence data filters in camera trap-based estimates of activity patterns is not valid and should not be used”. Hence, we will be using raw data to conduct the analyses.

Packages

There are two key packages used in activity analyses using camera trapping data:

They both rely on using the timestamps in camera trap detections to obtain activity indices which can be compared between species, treatments or seasons.

#Load Packages
list.of.packages <- c("activity",
                      "overlap",
                      "tidyverse",
                      "dplyr",
                      "lubridate") 
                          

# Check you have them in your library
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
# install any missing packages and load them
if(length(new.packages)) install.packages(new.packages,repos = "http://cran.us.r-project.org")
lapply(list.of.packages, require, character.only = TRUE)

Import the data

By now you should be very familiar with the dataset obtained by Dr. Beth Smith in Romania.

img <- read.csv("images_LGDs.csv",
                header=T)
deployment <- read.csv("deployments.csv",
                       header=T)

Accounting for sunrise and sunset

If we are going to consider changes in diel activity depending on sunlight, as it is very common for prey and predators, it would not make much sense to use fix times across the different seasons. Although at low latitudes there is no much variation in sunrise and sunset times, in high latitudes, like in the UK (or Romania!) there is a very significant variation in sunrise/sunset times through the year. In winter some animals’ activity is constrained to a much shorter day length. If we want to make comparisons across different areas at different latitudes or across different periods, we will need to consider the solar time, rather than the clock time.

Have a look at this paper, where this is explained: Comparing diel activity patterns of wildlife across latitudes and seasons: Time transformations using day length (Vazquez et al. 2019)

The average anchoring method

Instead of using the ‘human’ 24h clock, we can instead express animal activity relative to an important anchor point in the day (e.g. sunrise).

Then, for each observation, we calculate the time difference (offset) from this average reference time. For instance, if sunrise averages at 6:30 AM and an observation is at 7:00 AM, the “anchored” time would be +30 minutes relative to sunrise.

This allow us comparing the anchored times of observations across locations and dates, minimising the confounding effects of geographic location or seasonal changes in daylight hours.

Note

The transformation is not necessary at latitudes below 20°, or in studies with a duration of less than a month (below 40° latitude), as day length doesn’t chnage substantially.

  1. Adding latitude and longitude to the dataset
# Add lat/long to the camera trapping raw data using the deployment data
img <- img %>%
  left_join(deployment %>% select(deployment_id, latitude, longitude), 
            by = "deployment_id") # This will add latitude and longitude columns to img based on matching deployment_id
  1. Ensuring the timestamp is in the correct format

    # original timestamp structure
    str(img$timestamp)
     chr [1:71768] "2021-07-25 21:48:01" "2021-07-25 21:48:04" ...
    # Convert from the existing format ("%Y-%m-%d %H:%M:%S") to POSIXct
    img$timestamp <- as.POSIXct(img$timestamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
    # Confirm the structure
    str(img$timestamp)
     POSIXct[1:71768], format: "2021-07-25 21:48:01" "2021-07-25 21:48:04" "2021-07-25 21:48:04" ...
  2. Calculate solar time

# calculate solar time 
tmp <- solartime ( img$timestamp, # the date time column 
                   img$latitude,  # Latitude
                   img$longitude, # Longitude
                   tz=+2,         # an offset in numeric hours to UTC (Romania is 2 hours ahead)
                   format="%Y:%m:%d %H:%M:%S")

The above code has created a temporal dataframe, containing the timestamp as input, the clock as the local clock time and solar as the solar time, calculated based on the sun’s poisition relative to the latitude and longitude provided.

We will use solar time, but it is always useful to keep both the solar and the clock time and see if there are any differences or implications

# add solar and clock times to the dataset
img$solar <- tmp$solar
img$clock <- tmp$clock
# plot the relationship between solar and clock times
plot(img$solar, img$clock)

The output shows that there is no much difference between these two methods in our study, but anyway, please do use the solar time!

You will see a 0 to 6 range. This is is often used in the average anchoring method typically represents a simplified time scale based on the time after sunrise or time before sunset. This approach translates clock time into a standardized framework that reflects periods of the day relevant to animal behaviour, especially for species whose activity is closely tied to daylight.

Here’s why this scale is used:

  • Normalisation Across Daylight Periods: By anchoring to sunrise or sunset, times of day are standardised to relative hours (in our case, 0 to 6 hours after sunrise), which accounts for changes in sunrise and sunset times across seasons and locations. This makes the time relative to daylight hours instead of fixed clock times.

  • Focus on Active Periods: Many species have distinct activity patterns tied to periods right after sunrise or before sunset (often within a few hours of these events). By focusing on a 0–6 scale, researchers emphasise the time window most relevant to animal behavior, while reducing the influence of less-active periods (like midday for nocturnal animals).

  • Ecological Relevance: The 0–6 scale can correspond to typical daylight durations across many regions. For instance, if sunrise is at 6:00 AM, then 6 hours after (noon) represents the midpoint of the day. This relative scale thus captures half of the daylight period, which can be ecologically meaningful for understanding crepuscular (dawn/dusk-active) or diurnal (day-active) species.

However, this is done just to see if there are significant differences; do not worry, we will be analysing activity patterns over 24 hours

Analyses using Activity

Single species activity

Unlike regular data distributions, time data (e.g., hours) is circular, meaning 24:00 is right next to 00:00. Therefore, activity data needs to be fit using circular statistics to correctly represent this continuous, cyclic nature. The activity package uses a Kernel Density Estimation (KDE) as a non-parametric way to estimate the probability density function (PDF) of the time of day, based on observations.

# Count the number of detections of "Grey Wolf"
wolf_count <- sum(img$common_name == "Grey Wolf")
wolf_count # 2049 detections
[1] 2045
# Fit an activity model
wolf_act <- fitact(img$solar[img$common_name == "Grey Wolf"], 
                   sample = "model", # Use a model-based sample approach for large datasets (>100)
                   reps = 100)       #Uses 100 bootstrap replicates to estimate confidence intervals, typically you should use 1000 but here we use 100 to speed up analyses

plot(wolf_act)

This is what we have done:

  1. Estimated the Circular Kernel Density

    • Fitting the Model: The function fitact() estimates this circular density model, considering wolves’ observed active periods. Here, it models img$solar[img$common_name=="Grey Wolf"], which represents the timestamps of grey wolf observations after converting to solar time.
  2. Confidence Interval Estimation and Sampling

    • sample = “model”: When sample="model", the function assumes a larger dataset and fits a statistical model to predict the distribution more reliably. This is suitable when sample sizes exceed 100-200 observations, as is typical for robust modeling.

    • sample = “data”: With smaller sample sizes (under 100), sample="data" is generally preferred because it relies more on the actual observed values than on modeled estimations, reducing overfitting or overly smoothed results.

  3. Repetitions for Confidence Intervals (reps=1000)

    • Bootstrap Repetitions: By setting reps=1000, the function uses 1000 bootstrap samples to estimate the confidence intervals around the KDE. This means it resamples from the data 1000 times, each time fitting a new kernel density to generate a distribution of estimates, which it then uses to build a more reliable confidence interval.
  4. Plotting the Activity Model

    • The plot(wolf_act) command visualizes the KDE, showing peaks and dips in wolf activity over a 24-hour solar time cycle. The peaks indicate periods when wolf sightings are most frequent, which can correlate with high activity periods like dawn or dusk. The confidence intervals around these peaks give a measure of uncertainty in the activity estimates.

The KDE curve seems to indicate that wolves are more active during the night, with a drop in activity from 12 to 16h.

We can also look at the raw data coming from the KDE:

# Extract the activity information from the wolf_act KDE
act_info <- slot(wolf_act, "act")
act_info
       act         se   lcl.2.5%  ucl.97.5% 
0.51721242 0.01985988 0.49364762 0.57362815 

The output tell us that wolves are active around 51% (SE 2.1%, 49-57%) of the daily cycle.

We can repeat this for another species, let’s go for wolf’s prey species:

# Count the number of detections of "European Roe Deer"
roe_count <- sum(img$common_name == "European Roe Deer")
roe_count # 808 detections
[1] 788
# Fit an activity model
roe_act <- fitact(img$solar[img$common_name == "European Roe Deer"], 
                   sample = "model", 
                   reps = 100)       
plot(roe_act)

# Extract the activity information from the roe_act KDE
act_info <- slot(roe_act, "act")
act_info
       act         se   lcl.2.5%  ucl.97.5% 
0.31043283 0.01584527 0.32828495 0.38704302 

Very contrasting distribution of activity, which seems to indicate that roe deer are mainly diurnal. They seem to be active around 31% of the daily cycle.

and..

# Count the number of detections of "Domestic Sheep"
sheep_count <- sum(img$common_name == "Domestic Sheep")
sheep_count # 5695 detections
[1] 5695
# Fit an activity model
sheep_act <- fitact(img$solar[img$common_name == "Domestic Sheep"], 
                   sample = "model", 
                   reps = 100)       
plot(sheep_act)

# Extract the activity information from the sheep_act KDE
act_info <- slot(sheep_act, "act")
act_info
        act          se    lcl.2.5%   ucl.97.5% 
0.147226441 0.004009036 0.165061617 0.179544282 

This output reflects the context of livestock management as sheep only enter the area covered by the camera traps during daylight and return to their sheep folds. In a sense, this is a very artificial outcome and reflects more husbandry practices than ecology!

and…

# Count the number of detections of "Red Deer"
red_count <- sum(img$common_name == "Red Deer")
red_count # 22719 detections
[1] 22719
# Fit an activity model
red_act <- fitact(img$solar[img$common_name == "Red Deer"], 
                   sample = "model", 
                   reps = 100)       
plot(red_act)

# Extract the activity information from the sheep_act KDE
act_info <- slot(red_act, "act")
act_info
        act          se    lcl.2.5%   ucl.97.5% 
0.217084267 0.003937462 0.256286512 0.269628056 

Another contrasting activity curve, indicating low levels of activity during daylight and a dramatic increase during the early hours of the night.

Comparing species activity

A first step would be to plot specie’s activities on the same axis:

# plot multiple species together
# let's make sure that the figure accommodate the maximum y-values for both activity curves. Set y-axis limits between 0 and 0.20. You might need to plot this a few times until you get the optimum y max value
y_lim <- c(0, 0.20)

# Plot the wolf activity
plot(wolf_act, yunit = "density", data = "none", las = 1, lwd = 2,
     ylim = y_lim,  # Set y-axis limits
     tline = list(lwd = 2), # Thick line 
     cline = list(lty = 0)) # Suppress confidence intervals

# Plot the roe deer activity
plot(roe_act, yunit = "density", data = "none", add = TRUE, 
     tline = list(col = "red", lwd = 2),
     cline = list(lty = 0))

# Plot the red deer activity
plot(red_act, yunit = "density", data = "none", add = TRUE, 
     tline = list(col = "blue", lwd = 2),
     cline = list(lty = 0))

# Add a legend to the plot for the three species
legend("topleft", c("Wolf", "Roe deer", "Red deer"), 
       col = c("black", "red", "blue"), lty = 1, lwd = 2)

We can make these comparisons numerically, using the coefficient of overlap (∆) - developed by Ridout and Linkie (Ridout and Linkie 2009). The coefficient ranges from 0 (no overlap) to 1 (complete overlap).

We can implement for a two species comparison as follows:

# wolf vs roe deer
# Note reps reduced to speed up running time - people typically use 1000.
compareCkern(wolf_act, roe_act, reps = 100)
       obs       null     seNull      pNull 
0.39917839 0.95487715 0.01101581 0.00000000 
  • obs = 0.399 observed overlap index;

  • null = 0.955 mean null overlap index;

  • seNull = 0.008 standard error of the null distribution;

  • pNull = very very low! probability observed index arose by chance.

    Which suggests there is low overlap between wolves and red deer (∆~0.40) - and that it did NOT come about by chance.

# wolf vs red deer
# Note reps reduced to speed up running time - people typically use 1000.
compareCkern(wolf_act, red_act, reps = 100)
        obs        null      seNull       pNull 
0.714845846 0.976023251 0.006563472 0.000000000 

Output indicates a moderate overlap between wolves and red deer (∆=0.71).

# roe vs red deer
# Note reps reduced to speed up running time - people typically use 1000.
compareCkern(roe_act, red_act, reps = 100)
        obs        null      seNull       pNull 
0.428288663 0.966245007 0.007735171 0.000000000 

The output indicates a low overlap between roe and red deer (∆=0.43).

There is a statistically more sound method to compare activity patterns using a Wald Test for the statistical difference between two or more activity level estimates,

compareAct(list(wolf_act,roe_act))
    Difference         SE        W            p
1v2  0.2067796 0.02540644 66.24111 4.440892e-16

The test indicates an estimated difference in activity between wolves and roe deer of approximately 0.21. This value indicates that wolves exhibit a significantly higher level of activity than roe deer (W=48.43, p<0.001).

compareAct(list(wolf_act,red_act))
    Difference         SE        W p
1v2  0.3001281 0.02024644 219.7436 0

The test indicates an estimated difference in activity between wolves and red deer of approximately 0.30. This value indicates that wolves exhibit a significantly higher level of activity than red deer (W=184,34, p<0.001).

compareAct(list(roe_act,red_act))
    Difference         SE        W          p
1v2 0.09334856 0.01632716 32.68842 1.0818e-08

The test indicates an estimated difference in activity between roe and red deer of approximately 0.09. This value indicates that roe deer exhibit a significantly higher level of activity than red deer (W=20.32, p<0.001).

Comparing activity patterns in relation to other variables

Instead or in addition to comparing activity patterns between species, it is also possible to make comparison between categories or levels of a given variable, for instance the habitat classes where the animals were located, different study areas or across seasons.

This is an example using the feature type where the cameras where deployed in Romania.

# Add feature_type to the camera trapping raw data using the deployment data
img <- img %>%
  left_join(deployment %>% select(deployment_id, feature_type), 
            by = "deployment_id") # This will add latitude and longitude columns to img based on matching deployment_id
# Fit an activity model, one for each level of the covariate
red.road_act <- fitact(img$solar[img$common_name == "Red Deer" & img$feature_type == "Road dirt"], 
                   sample = "model", 
                   reps = 100)  
red.trail_act <- fitact(img$solar[img$common_name == "Red Deer" & img$feature_type == "Trail game"], 
                   sample = "model", 
                   reps = 100)  
# plot them together
y_lim <- c(0, 0.25)

# Plot the red deer detections on roads
plot(red.road_act, yunit = "density", data = "none", las = 1, lwd = 2,
     ylim = y_lim,  # Set y-axis limits
     tline = list(lwd = 2), # Thick line 
     cline = list(lty = 0)) # Suppress confidence intervals

# Plot the red deer detections on trails
plot(red.trail_act, yunit = "density", data = "none", add = TRUE, 
     tline = list(col = "red", lwd = 2),
     cline = list(lty = 0))

# Add a legend to the plot for the two habitat features
legend("topleft", c("Red deer detected on dirty roads", "Red deer detected on trails"), 
       col = c("black", "red"), lty = 1, lwd = 2)

That is interesting! there are differences in the activity patters…. worth exploring it!

Analyses using Overlap

The package overlap offers an alternative to activity for the analysis of coefficients of overlap.

img$Area <-1 # the package needs an 'Area' column, so we will add this

Single species activity

Wolf

# plot a kernel density curve
wolf.overlap<-img$solar[img$Area==1 & img$common_name == "Grey Wolf"]

densityPlot(wolf.overlap, 
            rug=T, # rug= argument is for the plot along the bottom showing when detections occurred; rug=F if you don't want it
            adjust=0.8, # adjust argument is for the degree of smoothing
            ylim = c(0,0.11)) 
legend("topleft", c("Degree of smoothing = 0.8"), 
       col = c("black"), lty = 1, lwd = 2)

# values >1 give a flatter curve, values <1 more 'jagged'
densityPlot(wolf.overlap, rug=T, adjust=2, ylim = c(0,0.11))
legend("topleft", c("Degree of smoothing = 2.0"), 
       col = c("black"), lty = 1, lwd = 2)

densityPlot(wolf.overlap, rug=T, adjust=0.5, ylim = c(0,0.11))
legend("topleft", c("Degree of smoothing = 0.5"), 
       col = c("black"), lty = 1, lwd = 2)

# adjust = 0.8 is the better looking in my opinion

Roe deer

# plot a kernel density curve
roe.overlap<-img$solar[img$Area==1 & img$common_name == "European Roe Deer"]

densityPlot(roe.overlap, rug=T, adjust=0.8) 

Comparing species activity

The coefficient of overlap proposed by Meredith and Ridout provides a statistical measure Dhat) used to quantify the overlap between the activity patterns of two species or groups. There are two estimators, called Dhat 4 and Dhat1:

  • Dhat4: This is recommended for larger sample sizes (more than 50-75 observations per group). This estimator is often preferred when data is sufficiently robust, as it provides a reliable measure of overlap in such cases.

  • Dhat1: This is recommended for smaller sample sizes (fewer than 50 observations per group). It is simpler and less sensitive to data scarcity, making it more suitable for limited datasets where Dhat4 may not perform as well.

  • Dhat5: An alternative to Dhat1 and Dhat4, offering stable overlap estimates for intermediate sample sizes or less-smooth data distributions.

# Best estimator depends on the size of the smaller of the two samples
min(length(wolf.overlap), length(roe.overlap))
[1] 808
 # 808 observations is the minimum, so Dhat4 is more appropriate

Given the output of 808 detections, we should use the Dhat4 estimator.

# Calculate the overlap with the three estimators
# wolf - roe deer
wolf_roe_est<-overlapEst(wolf.overlap,roe.overlap)
wolf_roe_est
    Dhat1     Dhat4     Dhat5 
0.4069143 0.3955696 0.3811036 

This output provides a coefficient of overlap of 0.39. However, this method does not provide confidence intervals, which is always recommended. For this we use bootstrapping.

# For example and speed just 1,000 - but usually 10,000
wolf_boot<-resample(wolf.overlap,1000)
roe_boot<-resample(roe.overlap, 1000)

dim(wolf_boot)
[1] 2049 1000
dim(roe_boot)
[1]  808 1000
# To generate estimates of overlap from each pair of samples
# these two matrices are passed to the function bootEst()
wolf_roe_boot<-bootEst(wolf_boot, roe_boot,
                             adjust=c(NA,1,NA)) #1 in second place for Dhat4, would be in the first place for Dhat1

dim(wolf_roe_boot)
[1] 1000    3
# bootstrap mean of coefficient of overlap 
BSmean<-colMeans(wolf_roe_boot)
BSmean
  Dhat1   Dhat4   Dhat5 
     NA 0.43322      NA 
# confidence intervals with corrections for bootstrap bias
tmp<-wolf_roe_boot[,2] # [2] indicates you are using the second estimator(Dhat4)

# So our coefficient of overlap is:
wolf_roe_est
    Dhat1     Dhat4     Dhat5 
0.4069143 0.3955696 0.3811036 
# 0.395

# With 95% CI taken from basic0 bootCI output:
bootCI(wolf_roe_est[2],tmp)
           lower     upper
norm   0.3278342 0.3880044
norm0  0.3654846 0.4256547
basic  0.3279559 0.3903724
basic0 0.3631166 0.4255331
perc   0.4007669 0.4631834
# recommended to use basic0 from bootCI output as confidence interval
# see Meredith & Ridout (2014)
# CI's may be too narrow for small sample sizes or coefficient ~1

#95% CI 0.3652094 -  0.4264557

The bootstrapping shows a coefficient of overlap between wolves and roe deer of 0.395 (95% CI 0.365 - 0.4264).

We can also plot our results:

overlapPlot(wolf.overlap,roe.overlap, 
            rug=T, 
            main=NULL)
legend('topright', c("Grey Wolf (n = 2049)", "Roe Deer (n = 808)"), 
       lty=c(1,2), 
       col=c(1,4), 
       bty='n', 
       cex=0.85)

# add text for overlap estimate and test for significance (done later on)
text(x=6,y=0.12, # coordinates of this text
     cex=0.9,
     label=c(expression(paste(hat(Delta)," = 0.395 (95% CI 0.365 - 0.4264)"))))

You can produce a plot centered on midnight, instead of midday, if you want to highlight the nocturnal period.

# If you have nocturnal species you can also centre the plot on midnight instead of midday
overlapPlot(wolf.overlap,roe.overlap, 
            rug=T, 
            main=NULL
            ,xcenter = 'midnight')
legend('topright', c("Grey Wolf (n = 2049)", "Roe Deer (n = 808)"), 
       lty=c(1,2), 
       col=c(1,4), 
       bty='n', 
       cex=0.85)
# add text for overlap estimate and test for significance (done later on)
text(x=-4,y=0.12,
     cex=0.9,
     label=c(expression(paste(hat(Delta)," = 0.395 (95% CI 0.365 - 0.4264)"))))

References

Frey, Sandra, Jason T. Fisher, A. Cole Burton, and John P. Volpe. 2017. “Investigating Animal Activity Patterns and Temporal Niche Partitioning Using Camera-Trap Data: Challenges and Opportunities.” Edited by Marcus Rowcliffe. Remote Sensing in Ecology and Conservation 3 (3): 123–32. https://doi.org/10.1002/rse2.60.
Peral, Christopher, Marietjie Landman, and Graham I. H. Kerley. 2022. “The Inappropriate Use of Time-to-Independence Biases Estimates of Activity Patterns of Free-Ranging Mammals Derived from Camera Traps.” Ecology and Evolution 12 (10). https://doi.org/10.1002/ece3.9408.
Ridout, M. S., and M. Linkie. 2009. “Estimating Overlap of Daily Activity Patterns from Camera Trap Data.” Journal of Agricultural, Biological, and Environmental Statistics 14 (3): 322–37. https://doi.org/10.1198/jabes.2009.08038.
Rowcliffe, J. Marcus, Roland Kays, Bart Kranstauber, Chris Carbone, and Patrick A. Jansen. 2014. “Quantifying Levels of Animal Activity Using Camera Trap Data.” Edited by Diana Fisher. Methods in Ecology and Evolution 5 (11): 1170–79. https://doi.org/10.1111/2041-210x.12278.
Vazquez, Carmen, J. Marcus Rowcliffe, Kamiel Spoelstra, and Patrick A. Jansen. 2019. “Comparing Diel Activity Patterns of Wildlife Across Latitudes and Seasons: Time Transformations Using Day Length.” Edited by Aaron Ellison. Methods in Ecology and Evolution 10 (12): 2057–66. https://doi.org/10.1111/2041-210x.13290.