For my dissertation, I want to create a machine learning algorithm to remote assess nest initiation and nesting activity from location and activity data. I am using this class project as an opportunity to assess the feasibility of including LiDAR data as additional information for this task.
My goals for this class project is to look at factors such as understory density and water in these areas and assess how these landscape factors might be related to nest fate.
In RStudio, I used the lidR package, as well as
sf, terra, and ggplot2 to perform
this analysis. I also used tidyterra to nicely plot the
LiDAR data with ggplot2.
Turkeys reside across the state and are hunted in each one of these management units. Hunting regulations are set per unit.
Currently, turkey are being monitored via telemetry units in four of these WMUs.
Pennsylvania Game Commission (PGC) is currently monitoring turkey nesting activity to gain information on their recruitment.
Here, we can see the nests in each WMU and their final fate, as designated by PGC personnel.
We need to first:
See this graphic from Campbell et al. 2018
Fig 1. from Campbell et al. 2018: Three-dimensional lidar point cloud example of a multi-aged lodgepole pine (Pinus contorta) forest stand containing both a dense overstory and understory. The yellow circles represent simulated lidar point returns. The dotted lines distinguish between vertical strata representing ground returns (<0.25m), understory returns (0.25–2m), and overstory returns (>2m).
There are several sites that can be sued to acquire LiDAR data.
There is the USGS and Rockyweb, another USGS site .
There is also a PA specific site Pennsylvania Spatial Data Access PASDA. This is where I acquired the .las files for this project.
dir.lidR <- "D:/unzipped/"
dir.lidR.out <- "D:/unzipped/norm"
#dir.create(dir.lidR.out) to create a directory if it doesn't exist
## read in files in grid
LAS.read <- readLAScatalog(dir.lidR)
opt_output_files(LAS.read) <- paste0(dir.lidR.out, '/{*}_norm')
## parallelism for optimization
plan(multisession,workers = 6L)
## normalizing height
ctgNorm <- normalize_height(LAS.read,
knnidw(),
na.rm = T)
These functions were courtesy of Dr. Tong Qui.
# Functions ----
NRD <- function(z,lowZ,highZ){
# normalized relative point density
#https://doi.org/10.1016/j.rse.2018.06.023
length(z[(z >= lowZ) & (z <= highZ)])/length(z[z <= highZ])
}
ORD <- function(z,lowZ,highZ){
# overall relative point density
# https://doi.org/10.1016/j.rse.2018.06.023
length(z[(z >= lowZ) & (z <= highZ)])/length(z)
}
underStoryFun <- function(Z){
metrics <- list()
metNames <- c()
for (lowZ in c(0.15, 0.25)){ ## where understory (us) start to define height of understory
for (highZ in c(2,3,4,5)){ ## where us ends
## overall relative point density
metrics <- append(metrics, list(ORD(Z, lowZ, highZ)))
metNames<- c(metNames, paste0('ord_', lowZ*100, 'cm_', highZ, 'm'))
## normalized relative point density
metrics <- append(metrics, list(NRD(Z, lowZ, highZ)))
metNames<- c(metNames, paste0('nrd_', lowZ*100, 'cm_', highZ, 'm'))
}
}
names(metrics) <- metNames
metrics
}
## from https://github.com/r-lidar/lidR/issues/215
.underStoryFun <- ~underStoryFun(Z)
## read in las files
lidar_files <- list.files(dir.base, full.names = F, pattern = ".las")
# loop over files to run functions
for(i in 1:length(lidar_files)){
## create temp directory
dir.tmp <- paste0(
dir.base,
"/",
lidar_files[i]
)
## read in LAS from that dir
LAS.read <- readLAScatalog(dir.tmp)
## using the .understoryfunction, apply at a 30m resolution
brick <- grid_metrics(
LAS.read,
~underStoryFun(Z),
res = 100
)
## save brick
outfile <- writeRaster(
brick,
filename = paste0(dir.out,"/",i,'.tif'),
format = "GTiff",
overwrite = TRUE,
options = c("INTERLEAVE=BAND","COMPRESS=LZW")
)
}
After collecting and formatting my LiDAR data, I have 16 layers that correspond to different landscape metrics:
plot(lid)
This is one section of one of my WMUs, and it is over 300 GB alone. Due to processing time and data storage constraints, I opted to analyze this one area.
## SpatRaster resampled to ncells = 500703
In this section of WMU 4D, I have 6 nests: 1 predated, 2 abandoned, and 3 hatched.
## SpatRaster resampled to ncells = 500703
## Now, lets clip the covariate information ----
# get only x and y
gps.pts <- gps.crop %>%
select(lon,
lat)
# Extract covariate information
nest <- extract(lid, gps.pts)
# Adding in column of nest fate
nest$nestfate <- gps.df$nestfate
nest$nestfate.fac <- as.factor(test$nestfate)
I first wanted to investigate the different vegetation heights (low, medium, or high) these nests were located in per nest fate.
In this graph, we can see our abandoned nest fell below a vegetation height threshold of 0.2.
# Plot
nest %>%
ggplot() +
geom_point(aes(y = lowveg, x = ID, col = "blue"), size = 3) +
geom_point(aes(y = medveg, x = ID, col = "yellow"), size = 3) +
geom_point(aes(y = highveg, x = ID, col = "red"), size = 3) +
facet_wrap(~ nestfate) +
theme_bw() +
xlab("Bird ID") +
ylab("Understory Vegetation Height") +
labs(legend = "Vegetation Classifications") +
scale_color_manual(values = branded_colors,
labels = c("Low", "High", "Medium")) +
labs(color = "Vegetation Class")
When looking at our nest fate compared to our water layer, there is a trend of hatched nests all being above a o.2 threshold and the predated and abandoned nests falling below that threshold.
cols <- c("A" = "darkcyan", "B" = "coral1", "C" = "palegreen3")
nest %>%
ggplot() +
geom_point(aes(y = water, x = ID, col = as.factor(nestfate)), size = 3) +
theme_bw() +
xlab("Bird ID") +
ylab("Water") +
labs(color = "Nest Fate")
This preliminary analysis not only allowed me to understand a potential relationship between landscape characteristics and nest fate, but also allowed me to assess the feasibility of including LiDAR data into a machine learning algorithm to remotely assess potential nesting activity and nest initiation from location and activity data.