Project objectives

For my dissertation, I want to create a machine learning algorithm to remote assess nest initiation and nesting activity from location and activity data. I am using this class project as an opportunity to assess the feasibility of including LiDAR data as additional information for this task.

My goals for this class project is to look at factors such as understory density and water in these areas and assess how these landscape factors might be related to nest fate.

Methods

In RStudio, I used the lidR package, as well as sf, terra, and ggplot2 to perform this analysis. I also used tidyterra to nicely plot the LiDAR data with ggplot2.

Study area

Willdife Management Units across PA

Turkeys reside across the state and are hunted in each one of these management units. Hunting regulations are set per unit.

Currently, turkey are being monitored via telemetry units in four of these WMUs.

Pennsylvania Game Commission (PGC) is currently monitoring turkey nesting activity to gain information on their recruitment.

Here, we can see the nests in each WMU and their final fate, as designated by PGC personnel.

Steps for analyzing LiDAR data

Prior to analysis

We need to first:

  1. Acquire our Lidar data
  2. Normalize elevation to get interpretable outputs
  3. Perform understory functions to get forest characteristics within a particular threshold

See this graphic from Campbell et al. 2018

Fig 1. from Campbell et al. 2018: Three-dimensional lidar point cloud example of a multi-aged lodgepole pine (Pinus contorta) forest stand containing both a dense overstory and understory. The yellow circles represent simulated lidar point returns. The dotted lines distinguish between vertical strata representing ground returns (<0.25m), understory returns (0.25–2m), and overstory returns (>2m).

Fig 1. from Campbell et al. 2018: Three-dimensional lidar point cloud example of a multi-aged lodgepole pine (Pinus contorta) forest stand containing both a dense overstory and understory. The yellow circles represent simulated lidar point returns. The dotted lines distinguish between vertical strata representing ground returns (<0.25m), understory returns (0.25–2m), and overstory returns (>2m).

Step 1: Aquiring our data

There are several sites that can be sued to acquire LiDAR data.

There is the USGS and Rockyweb, another USGS site .

There is also a PA specific site Pennsylvania Spatial Data Access PASDA. This is where I acquired the .las files for this project.

Step 2: Normalizing Elevation

dir.lidR <- "D:/unzipped/"
dir.lidR.out <- "D:/unzipped/norm"
#dir.create(dir.lidR.out) to create a directory if it doesn't exist

## read in files in grid
LAS.read <- readLAScatalog(dir.lidR)
opt_output_files(LAS.read) <- paste0(dir.lidR.out, '/{*}_norm')

## parallelism for optimization
plan(multisession,workers = 6L)

## normalizing height
ctgNorm  <- normalize_height(LAS.read,
                knnidw(),
                na.rm = T)

Step 3: Understory Functions

These functions were courtesy of Dr. Tong Qui.

# Functions ----
NRD  <- function(z,lowZ,highZ){
  # normalized relative point density
  #https://doi.org/10.1016/j.rse.2018.06.023
  length(z[(z >= lowZ) & (z <= highZ)])/length(z[z <= highZ])
}
ORD  <- function(z,lowZ,highZ){
  # overall relative point density
  # https://doi.org/10.1016/j.rse.2018.06.023
  length(z[(z >= lowZ) & (z <= highZ)])/length(z)
}
underStoryFun  <- function(Z){
  metrics <- list()
  metNames <- c()
  for (lowZ in c(0.15, 0.25)){ ## where understory (us) start to define height of understory
    for (highZ in c(2,3,4,5)){ ## where us ends
      ## overall relative point density
      metrics <- append(metrics, list(ORD(Z, lowZ, highZ)))
      metNames<- c(metNames, paste0('ord_', lowZ*100, 'cm_', highZ, 'm'))

      ## normalized relative point density
      metrics <- append(metrics, list(NRD(Z, lowZ, highZ)))
      metNames<- c(metNames, paste0('nrd_', lowZ*100, 'cm_', highZ, 'm'))
    }
  }
  names(metrics) <- metNames
  metrics
}

## from https://github.com/r-lidar/lidR/issues/215
.underStoryFun <- ~underStoryFun(Z)

## read in las files
lidar_files <- list.files(dir.base, full.names = F, pattern = ".las")

# loop over files to run functions
for(i in 1:length(lidar_files)){

   ## create temp directory
  dir.tmp <- paste0(
    dir.base,
    "/",
    lidar_files[i]
  )

  ## read in LAS from that dir
  LAS.read <- readLAScatalog(dir.tmp)

  ## using the .understoryfunction, apply at a 30m resolution
  brick <- grid_metrics(
    LAS.read,
    ~underStoryFun(Z),
    res = 100
  )

  ## save brick
  outfile <- writeRaster(
    brick,
    filename = paste0(dir.out,"/",i,'.tif'),
    format = "GTiff",
    overwrite = TRUE,
    options = c("INTERLEAVE=BAND","COMPRESS=LZW")
  )
}

Data organization

After collecting and formatting my LiDAR data, I have 16 layers that correspond to different landscape metrics:

plot(lid)

This is one section of one of my WMUs, and it is over 300 GB alone. Due to processing time and data storage constraints, I opted to analyze this one area.

## SpatRaster resampled to ncells = 500703

In this section of WMU 4D, I have 6 nests: 1 predated, 2 abandoned, and 3 hatched.

## SpatRaster resampled to ncells = 500703

Extracting covarites

## Now, lets clip the covariate information ----
# get only x and y
gps.pts <- gps.crop %>%
  select(lon,
         lat)

# Extract covariate information
nest <- extract(lid, gps.pts)

# Adding in column of nest fate
nest$nestfate <- gps.df$nestfate
nest$nestfate.fac <- as.factor(test$nestfate)

Results

I first wanted to investigate the different vegetation heights (low, medium, or high) these nests were located in per nest fate.

In this graph, we can see our abandoned nest fell below a vegetation height threshold of 0.2.

# Plot
nest %>% 
  ggplot() +
  geom_point(aes(y = lowveg, x = ID, col = "blue"), size = 3) +
  geom_point(aes(y = medveg, x = ID, col = "yellow"), size = 3) +
  geom_point(aes(y = highveg, x = ID, col = "red"), size = 3) +
  facet_wrap(~ nestfate) +
  theme_bw() +
  xlab("Bird ID") +
  ylab("Understory Vegetation Height") +
  labs(legend = "Vegetation Classifications") +
  scale_color_manual(values = branded_colors, 
                     labels = c("Low", "High", "Medium")) +
  labs(color = "Vegetation Class")

When looking at our nest fate compared to our water layer, there is a trend of hatched nests all being above a o.2 threshold and the predated and abandoned nests falling below that threshold.

cols <- c("A" = "darkcyan", "B" = "coral1", "C" = "palegreen3")
nest %>% 
  ggplot() +
  geom_point(aes(y = water, x = ID, col = as.factor(nestfate)), size = 3) +
  theme_bw() +
  xlab("Bird ID") +
  ylab("Water") +
  labs(color = "Nest Fate")

Final thoughts

This preliminary analysis not only allowed me to understand a potential relationship between landscape characteristics and nest fate, but also allowed me to assess the feasibility of including LiDAR data into a machine learning algorithm to remotely assess potential nesting activity and nest initiation from location and activity data.