The overarching goal of this project is to develop and test a new approach for mapping fuel consumption using only post-fire lidar data. The motivation is that lidar data can be hard to come by, and multitemporal lidar data are even rarer. So, although multitemporal lidar data are well suited to the task of mapping fuel consumption, their rarity makes it hard to quantify fuel consumption at broad scales. However, perhaps if lidar data were collected after a fire (even within a few years of the fire), then we could model what pre-fire lidar/vegetation structure might have looked like (using spectral, topographic, climatic, soils, and other data along with undisturbed post-fire lidar data), effectively creating pseudo-pre-fire lidar data that could be used to map fuel consumption.
Monroe Mountain represents an ideal area to test this workflow out, because it has pre-fire and post-fire lidar data and fuel measurement data. However, before jumping right into the deep end by modeling pre-fire lidar structure, I thought it would be wise to see how well we can map fuel consumption with all real lidar data. If we can’t do it accurately with real data, then the odds of success with modeled data are low.
Accordingly, the primary objective of this document is to demonstrate methods and results for mapping fuel consumption using real lidar data. I will use FERA fuel plot data, acquired from Ryan McCarley, to build several different model types from 2016 (pre-fire) and 2023 (post-fire) airborne lidar data.
Field Data
Overview
First things first, let’s check out the field data. These data were provided to us by Ryan McCarley (via Andy Hudak) on 3/5/2024. They contain point locations of FERA plots on Monroe Mountain collected over the course of several years with pre- and post-fire fuel loading measurements. Here’s a quick glimpse:
```{r}
library(terra)
library(leaflet)
library(rmarkdown)
library(sf)
library(lidR)
library(lidRmetrics)
library(VSURF)
library(tuneRanger)
library(ranger)
library(mlr)
library(dplyr)
library(colorspace)

# read in the data
fuel.plots <- vect("S:/ursa/campbell/fasmee/data/from_ryan/plots_wFuels.shp")

# plot them out
fuel.plots.ll <- project(fuel.plots, "epsg:4326")
m <- leaflet(data = fuel.plots.ll) |>
  addProviderTiles("Esri.WorldImagery") |>
  addMarkers()
m
```
```{r}
# take a look at the table
fuel.plots |>
  as.data.frame() |>
  paged_table()
```
There are 262 plots in this database. But, that number is a little deceptive, for two main reasons:
Each plot was measured twice, and each measurement is recorded as its own point, effectively doubling the count.
There are several points at the end of the table that have no data associated with them.
So, let’s do some data cleaning to reformat the data in a way that will make it easier to use for modeling.
Field Data Cleaning
First, I’ll remove the plots without any data. Incidentally, these are plots 601 - 620.
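```{r}
# remove empty plots
fuel.plots <- fuel.plots[!is.na(fuel.plots$FID),]
```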
Next, I’ll separate out pre-fire and post-fire plots.
```{r}
# separate pre-fire and post-fire plots
fuel.plots.pre <- fuel.plots[grep("pre", fuel.plots$FID),]
fuel.plots.pst <- fuel.plots[grep("post", fuel.plots$FID),]

# remove unnecessary columns
del.cols <- c("FID", "X", "Y")
keep.cols <- names(fuel.plots)[!names(fuel.plots) %in% del.cols]
fuel.plots.pre <- fuel.plots.pre[,keep.cols]
fuel.plots.pst <- fuel.plots.pst[,keep.cols]

# move plot ID column from last to first
n <- ncol(fuel.plots.pre)
fuel.plots.pre <- fuel.plots.pre[,c(n,1:(n-1))]
fuel.plots.pst <- fuel.plots.pst[,c(n,1:(n-1))]

# add "pre" and "pst" to column names for all attributes
dif.names <- names(fuel.plots.pre)[2:ncol(fuel.plots.pre)]
names(fuel.plots.pre)[2:ncol(fuel.plots.pre)] <- paste0(names(fuel.plots.pre)[2:ncol(fuel.plots.pre)], "_pre")
names(fuel.plots.pst)[2:ncol(fuel.plots.pst)] <- paste0(names(fuel.plots.pst)[2:ncol(fuel.plots.pst)], "_pst")

# merge them
fuel.plots <- merge(fuel.plots.pre, as.data.frame(fuel.plots.pst))

# get differences between pre- and post-fire fuel loads (i.e., calc consumption)
for (dif.name in dif.names){
  pre.fld <- paste0(dif.name, "_pre")
  pst.fld <- paste0(dif.name, "_pst")
  dif.fld <- paste0(dif.name, "_dif")
  fuel.plots[,dif.fld] <- as.data.frame(fuel.plots[,pre.fld]) - as.data.frame(fuel.plots[,pst.fld])
}

# plot them out
fuel.plots.ll <- project(fuel.plots, "epsg:4326")
m <- leaflet(data = fuel.plots.ll) |>
  addProviderTiles("Esri.WorldImagery") |>
  addMarkers()
m
```
```{r}
# take a look at the table
fuel.plots |>
  as.data.frame() |>
  paged_table()
```
After cleaning, we have 121 plots.
Definition of Field Names
It is worth providing a key to what each abbreviation in the field names means:
yr: year of measurement
hr1: 1 hour biomass
hr10: 10 hour biomass
hr100: 100 hour biomass
hr1000: 1000 hour biomass
nonwoody: herb biomass
shrubs: shrub biomass
trees: seedling biomass
llm: litter, lichen, and moss biomass
ground: duff biomass
shrb_sd: combined biomass of shrubs and seedlings
woody: combined biomass of 1hr, 10hr, 100hr, and 1000hr
uf: combined biomass of nonwoody, llm, ground, shrubs, trees, and woody
cf: tree biomass, including snags as well as live boles, branches, and foliage
cbd: canopy bulk density
cbh: canopy base height
acf: available canopy fuel (foliage and fine branches)
tf: total fuel (all biomass combined)
af: available fuel (combined biomass of uf and acf)
And here are the self-explanatory suffixes I’ve added:
pre: pre-fire measurement
pst: post-fire measurement
dif: difference between pre- and post-fire measurements
As I understand it, all of the fields are measured in Mg/ha, with the exception of cbd and cbh, which are probably measured in kg/m³ and m, respectively.
Modeling Fuel Consumption
I will test two different approaches for mapping fuel consumption. The first will be referred to as the modeled fuel load subtraction approach. In this approach, pre-fire and post-fire fuel loads for each component will be modeled individually, and the differences between the modeled predictions will serve as the fuel consumption predictions. This is comparable to, for example, McCarley et al. (2020). The second approach will be referred to as the profile area change approach. In this approach, described in Hu et al. (2019), a cumulative height profile is created from lidar height percentiles for both the pre- and post-fire point cloud data. The area under the curve (AUC) of each resulting profile represents an approximation of the fuel volume. Theoretically, if fire consumes fuel, then the post-fire AUC should be lower than the pre-fire AUC. The difference between these AUCs is the profile area change, or PAC.
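To make the PAC calculation concrete, here is a minimal sketch using made-up height profiles (the hgt.pre and hgt.pst vectors below are hypothetical, not the Monroe Mountain data); the real profiles are derived from plot-clipped point clouds later in this document:

```{r}
# toy PAC example: heights at percentiles 0-100 for a hypothetical plot
p <- seq(0, 100)
hgt.pre <- 12 * (p / 100)^2  # hypothetical pre-fire cumulative height profile
hgt.pst <- 6 * (p / 100)^3   # hypothetical post-fire cumulative height profile

# area under each profile approximates fuel volume
auc.pre <- integrate(approxfun(p, hgt.pre), 0, 100)$value
auc.pst <- integrate(approxfun(p, hgt.pst), 0, 100)$value

# profile area change: positive values imply consumption
pac <- auc.pre - auc.pst
pac
```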
We have two airborne lidar datasets, one from 2016 and one from 2023. The former approximates pre-fire conditions and the latter post-fire conditions, though the timing differences between when the fuels were measured and when the lidar data were flown are well worth noting, and may be especially problematic for post-fire structural modeling, given the regeneration that has likely occurred in the several years since these fires burned.
Approach #1: Modeled Fuel Load Subtraction
Following McCarley et al. (2020), I will try two different approaches here, described in the paper as the PrePost and Pooled models:
PrePost: Model pre- and post-fire fuel loads separately. Use pre-fire lidar and pre-fire fuel loads to train and validate pre-fire model. Use post-fire lidar and post-fire fuel loads to train and validate post-fire model.
Pooled: Combine pre- and post-fire data to train and validate singular pre-/post-fire model.
There are several different techniques one could employ to link plot data to lidar data. These include:
Generating a set of rasterized lidar metrics and extracting pixel values at each plot.
May not capture true within-plot conditions as well, since it requires interpolating gridded values to a plot’s geometry that doesn’t align with that grid
But, easier to translate to prediction, since the pixels are being used to both train and predict with the model
Clipping raw point cloud data to the plots, and calculating lidar metrics from the clipped clouds.
Faster, since you’re processing on small, clipped point clouds
More accurately captures the true dimensions of the plot
May not translate as directly to pixel-based prediction
Taking those factors into consideration, for the sake of this test, I opted to go with the second technique. Doing so requires specifying plot dimensions. FERA plots had different variables collected within different plot subgeometries, but the largest plot extent, within which tree data were collected, was an 8m-radius plot. So I will extract pre- and post-fire point cloud data within an 8m-radius circle around each plot center point, and use those clouds to derive the lidar structural metrics. I will be using Tompalski et al.'s lidRmetrics library to generate a suite of plot-level predictor variables. Specifically, I use their metrics_set3() function, which encompasses the most comprehensive set of structural predictor variables.
1a. PrePost model
Here, I’ll read in the lidar data, calculate lidar metrics within plot-clipped point clouds, and remove lidar predictor variables that, for a variety of reasons, will not be useful for modeling purposes.
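```{r}
# suppress progress bar
options(lidR.progress = F)

# directory structure
als.dir.pre <- "S:/ursa/campbell/fasmee/data/lidar_data/monroe_mtn_lidar/a04_height"
als.dir.pst <- "S:/ursa2/campbell/ems4d/data/als/als_2023/LAZ/Eastburn_processed"

# read in lascatalogs
ctg.pre <- readLAScatalog(als.dir.pre, filter = "-drop_withheld -drop_class 7 18", progress = F)
ctg.pst <- readLAScatalog(als.dir.pst, filter = "-drop_withheld -drop_class 7 18", progress = F)

# reproject plots to match crs of pre and pst ctgs
fuel.plots.pre <- fuel.plots |> st_as_sf() |> st_transform(st_crs(ctg.pre))
fuel.plots.pst <- fuel.plots |> st_as_sf() |> st_transform(st_crs(ctg.pst))

# run plot metrics
mets.pre <- plot_metrics(
  las = ctg.pre,
  func = ~lidRmetrics::metrics_set3(
    x = X, y = Y, z = Z, i = Intensity,
    ReturnNumber = ReturnNumber,
    NumberOfReturns = NumberOfReturns),
  geometry = fuel.plots.pre,
  radius = 8) |> as.data.frame()
mets.pst <- plot_metrics(
  las = ctg.pst,
  func = ~lidRmetrics::metrics_set3(
    x = X, y = Y, z = Z, i = Intensity,
    ReturnNumber = ReturnNumber,
    NumberOfReturns = NumberOfReturns),
  geometry = fuel.plots.pst,
  radius = 8) |> as.data.frame()

# separate out predictor and response data frames
resp.pre <- fuel.plots.pre |> st_drop_geometry()
resp.pst <- fuel.plots.pst |> st_drop_geometry()
mets.pre <- mets.pre[,!colnames(mets.pre) %in% colnames(fuel.plots.pre)]
mets.pst <- mets.pst[,!colnames(mets.pst) %in% colnames(fuel.plots.pst)]

# remove unscalable variables
del.cols <- c("n", "n_first", "n_intermediate", "n_last", "n_single",
              "n_multiple", "n_return_1", "n_return_2", "n_return_3",
              "n_return_4", "vn")

# remove predictors with no variability
sds.pre <- apply(mets.pre, 2, sd)
sds.pst <- apply(mets.pst, 2, sd)
del.cols.pre <- c(del.cols, names(sds.pre)[sds.pre == 0 & !is.na(sds.pre)])
del.cols.pst <- c(del.cols, names(sds.pst)[sds.pst == 0 & !is.na(sds.pst)])

# remove predictors with na values
nas.pre <- apply(mets.pre, 2, function(x) sum(is.na(x)))
nas.pst <- apply(mets.pst, 2, function(x) sum(is.na(x)))
del.cols.pre <- c(del.cols.pre, names(nas.pre)[nas.pre > 0])
del.cols.pst <- c(del.cols.pst, names(nas.pst)[nas.pst > 0])

# keep columns
keep.cols.pre <- colnames(mets.pre)[!colnames(mets.pre) %in% del.cols.pre]
keep.cols.pst <- colnames(mets.pst)[!colnames(mets.pst) %in% del.cols.pst]
mets.pre <- mets.pre[,keep.cols.pre]
mets.pst <- mets.pst[,keep.cols.pst]
```

Next, I'll define a couple quick functions to avoid redundant code, one for modeling and one for plotting.

```{r}
# define modeling function
tuned.model <- function(mod.df){
  mod.df <- na.omit(mod.df)
  plot.ids <- mod.df[,1]
  mod.df$Plot <- NULL
  x.cols <- 2:ncol(mod.df)
  y.col <- 1
  v <- VSURF(
    x = mod.df[,x.cols],
    y = mod.df[,y.col],
    RFimplem = "ranger",
    parallel = T,
    ncores = 18,
    verbose = F)
  x.cols <- x.cols[v$varselect.pred]
  mod.df <- mod.df[,c(y.col, x.cols)]
  # task <- makeRegrTask(data = mod.df, target = colnames(mod.df)[1])
  # tune <- tuneRanger(task, show.info = F)
  form <- as.formula(paste0(colnames(mod.df)[1], " ~ ."))
  mod <- ranger(
    formula = form,
    data = mod.df,
    # mtry = tune$recommended.pars$mtry,
    # sample.fraction = tune$recommended.pars$sample.fraction,
    # min.node.size = tune$recommended.pars$min.node.size,
    importance = "permutation")
  df.pred.obs <- data.frame(
    Plot = plot.ids,
    obs = mod.df[,y.col],
    prd = mod$predictions)
  return(df.pred.obs)
}

# define predicted vs. observed plotting function
prd.obs.plot <- function(df.pred.obs, name){
  df.pred.obs <- na.omit(df.pred.obs)
  x <- df.pred.obs$obs
  y <- df.pred.obs$prd
  rg <- range(c(x,y), na.rm = T)
  par(mar = c(5,5,2,1), las = 1)
  plot(x, y, type = "n", xlab = "Observed", ylab = "Predicted", main = name,
       ylim = rg, xlim = rg)
  grid()
  lines(x = c(-1e10, 1e10), y = c(-1e10, 1e10))
  points(x, y, pch = 16, col = rgb(0,0,0,0.25), cex = 1.25)
  mod <- lm(y ~ x)
  new.x <- range(x)
  new.y <- predict(mod, newdata = list(x = new.x))
  lines(new.x, new.y, lwd = 2, col = 2)
  r2 <- summary(mod)$adj.r.squared
  rrmse <- (100 * sqrt(mean((x-y)^2)) / mean(x))
  r2.lab <- bquote(R^2 == .(round(r2,2)))
  rrmse.lab <- bquote(symbol("%")*RMSE == .(round(rrmse)))
  legend("topleft",
         legend = c(r2.lab, rrmse.lab),
         text.col = 2,
         bty = "n",
         x.intersp = 0)
  return(data.frame(var = name, r2 = r2, rrmse = rrmse))
}
```

Next, I'll run the models! First, for pre-fire...

```{r}
# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars.pre <- paste0(vars, "_pre")

# loop through variables and model, plot, and compile results of each
par(mfrow = c(5,4))
successes <- 0
for (var.pre in vars.pre){
  resp.var <- fuel.plots.pre[,var.pre] |> st_drop_geometry()
  plot.id <- fuel.plots.pre[,"Plot"] |> st_drop_geometry()
  if (length(unique(resp.var[,1])) < 5){
    next
  } else {
    successes <- successes + 1
  }
  mod.df <- cbind(plot.id, resp.var, mets.pre)
  df.pred.obs <- tuned.model(mod.df)
  prd.obs.plot(df.pred.obs, var.pre)
  df.pred.pre <- data.frame(
    Plot = df.pred.obs$Plot,
    x = df.pred.obs$prd)
  colnames(df.pred.pre)[2] <- paste0(var.pre, "_prd")
  if (successes == 1){
    df.pred.pre.merge <- df.pred.pre
  } else {
    df.pred.pre.merge <- merge(df.pred.pre.merge, df.pred.pre, all = T)
  }
}
```

And now, for post-fire...

```{r}
# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars.pst <- paste0(vars, "_pst")

# loop through variables and model, plot, and compile results of each
par(mfrow = c(4,4))
successes <- 0
for (var.pst in vars.pst){
  resp.var <- fuel.plots.pst[,var.pst] |> st_drop_geometry()
  plot.id <- fuel.plots.pst[,"Plot"] |> st_drop_geometry()
  if (length(unique(resp.var[,1])) < 5){
    next
  } else {
    successes <- successes + 1
  }
  mod.df <- cbind(plot.id, resp.var, mets.pst)
  df.pred.obs <- tuned.model(mod.df)
  prd.obs.plot(df.pred.obs, var.pst)
  df.pred.pst <- data.frame(
    Plot = df.pred.obs$Plot,
    x = df.pred.obs$prd)
  colnames(df.pred.pst)[2] <- paste0(var.pst, "_prd")
  if (successes == 1){
    df.pred.pst.merge <- df.pred.pst
  } else {
    df.pred.pst.merge <- merge(df.pred.pst.merge, df.pred.pst, all = T)
  }
}
```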
Note that we lost shrubs_pst because it had too few unique values to be modeled using RF.
Obviously a pretty mixed bag in terms of model performance by fuel metric. But, there are definitely some promising results peppered throughout. Mapping fuel structure, however, may be easier than mapping changes in fuel structure (i.e., fuel consumption). So, that will be the true test of model performance. To test this, I will take all of the predicted values and subtract post-fire predictions from pre-fire predictions, comparing the results to differences between pre- and post-fire fuel metrics from the field data.
```{r}
# merge pre and post fire predictions
df.pred.merge <- merge(df.pred.pre.merge, df.pred.pst.merge, all = T)

# get fuel metric names
vars <- colnames(df.pred.merge) |>
  sub("_pre_prd", "", x = _) |>
  sub("_pst_prd", "", x = _) |>
  unique()
vars <- vars[vars != "Plot"]

# loop through them and calculate differences
for (var in vars){
  var.pre <- paste0(var, "_pre_prd")
  var.pst <- paste0(var, "_pst_prd")
  var.dif <- paste0(var, "_dif_prd")
  if (!all(c(var.pre, var.pst) %in% colnames(df.pred.merge))) next
  df.pred.merge[,var.dif] <- df.pred.merge[,var.pre] - df.pred.merge[,var.pst]
}

# join back to fuel plot data
df.pre.post <- fuel.plots |>
  as.data.frame() |>
  merge(df.pred.merge, all = T)

# loop through the variables and compare consumption predictions to observations
par(mfrow = c(4,4))
for (var in vars){
  var.dif.prd <- paste0(var, "_dif_prd")
  var.dif.obs <- paste0(var, "_dif")
  if (!all(c(var.dif.prd, var.dif.obs) %in% colnames(df.pre.post))) next
  df.pred.obs <- data.frame(
    obs = df.pre.post[,var.dif.obs],
    prd = df.pre.post[,var.dif.prd])
  prd.obs.plot(df.pred.obs, var)
}
```
Some pretty impressive results in there! There’s clearly an outlier plot that shows up in the ground fuels – I’m guessing this is a case of a new snag that fell in the middle of a plot, resulting in a massive increase in ground fuels. I think we can safely remove that plot, because it’s not an accurate reflection of fuel consumption. Here are the results with that plot omitted:
```{r}
# remove the plot that had an extreme increase in ground fuels
df.pre.post <- df.pre.post[df.pre.post$hr1000_dif > -300,]

# loop through the variables and compare consumption predictions to observations
par(mfrow = c(4,4))
for (var in vars){
  var.dif.prd <- paste0(var, "_dif_prd")
  var.dif.obs <- paste0(var, "_dif")
  if (!all(c(var.dif.prd, var.dif.obs) %in% colnames(df.pre.post))) next
  df.pred.obs <- data.frame(
    obs = df.pre.post[,var.dif.obs],
    prd = df.pre.post[,var.dif.prd])
  perf.mets <- prd.obs.plot(df.pred.obs, var)
  if (var == vars[1]){
    perf.mets.prepost <- perf.mets
  } else {
    perf.mets.prepost <- rbind(perf.mets.prepost, perf.mets)
  }
}
perf.mets.prepost$method <- "prepost"
```
Those results look better.
1b. Pooled model
Again, the idea here is to create a singular structural predictive model that incorporates both pre- and post-fire fuel plot and lidar metrics data. This acts to increase the sample size, and potentially capture more structural nuance within a singular model. It also has potential issues with autocorrelation, but we can address that later if needed through a more strategic evaluation procedure.
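```{r}
# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars.pre <- paste0(vars, "_pre")
vars.pst <- paste0(vars, "_pst")

# subset the fuels data
fuel.plots.pooled.pre <- fuel.plots.pre[,c("Plot", vars.pre)] |> st_drop_geometry()
fuel.plots.pooled.pst <- fuel.plots.pst[,c("Plot", vars.pst)] |> st_drop_geometry()

# remove "_pre" and "_pst" from column names to enable row merging
names(fuel.plots.pooled.pre) <- sub("_pre", "", names(fuel.plots.pooled.pre))
names(fuel.plots.pooled.pst) <- sub("_pst", "", names(fuel.plots.pooled.pst))

# add "_pre" and "_pst" to plot ids
fuel.plots.pooled.pre$Plot <- paste0(fuel.plots.pooled.pre$Plot, "_pre")
fuel.plots.pooled.pst$Plot <- paste0(fuel.plots.pooled.pst$Plot, "_pst")

# merge rowwise
fuel.plots.pooled <- bind_rows(fuel.plots.pooled.pre, fuel.plots.pooled.pst)

# merge mets rowwise too
mets.pooled <- bind_rows(mets.pre, mets.pst)

# remove columns with nas
mets.pooled <- mets.pooled[,apply(mets.pooled, 2, function(x) sum(is.na(x)) == 0)]

# loop through variables and model, plot, and compile results of each
par(mfrow = c(5,4))
successes <- 0
for (var in vars){
  resp.var <- fuel.plots.pooled[var]
  plot.id <- fuel.plots.pooled["Plot"]
  if (length(unique(resp.var[,1])) < 5){
    next
  } else {
    successes <- successes + 1
  }
  mod.df <- cbind(plot.id, resp.var, mets.pooled)
  df.pred.obs <- tuned.model(mod.df)
  prd.obs.plot(df.pred.obs, var)
  df.pred.pooled <- data.frame(
    Plot = df.pred.obs$Plot,
    x = df.pred.obs$prd)
  colnames(df.pred.pooled)[2] <- paste0(var, "_prd")
  if (successes == 1){
    df.pred.pooled.merge <- df.pred.pooled
  } else {
    df.pred.pooled.merge <- merge(df.pred.pooled.merge, df.pred.pooled, all = T)
  }
}
```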
Pretty impressive! Let’s see how it performs in the context of fuel consumption.
```{r}
# separate back out the pre- and post-fire plots
df.pred.pooled.pre <- df.pred.pooled.merge[grep("_pre", df.pred.pooled.merge$Plot),]
df.pred.pooled.pst <- df.pred.pooled.merge[grep("_pst", df.pred.pooled.merge$Plot),]

# add "_pre" and "_pst" to prediction column names
colnames(df.pred.pooled.pre) <- sub("_prd", "_pre_prd", colnames(df.pred.pooled.pre))
colnames(df.pred.pooled.pst) <- sub("_prd", "_pst_prd", colnames(df.pred.pooled.pst))

# remove "_pre" and "_pst" from plot names
df.pred.pooled.pre$Plot <- sub("_pre", "", df.pred.pooled.pre$Plot)
df.pred.pooled.pst$Plot <- sub("_pst", "", df.pred.pooled.pst$Plot)

# merge pre and post fire predictions
df.pred.pooled.merge <- merge(df.pred.pooled.pre, df.pred.pooled.pst, all = T)

# get fuel metric names
vars <- colnames(df.pred.pooled.merge) |>
  sub("_pre_prd", "", x = _) |>
  sub("_pst_prd", "", x = _) |>
  unique()
vars <- vars[vars != "Plot"]

# loop through them and calculate differences
for (var in vars){
  var.pre <- paste0(var, "_pre_prd")
  var.pst <- paste0(var, "_pst_prd")
  var.dif <- paste0(var, "_dif_prd")
  if (!all(c(var.pre, var.pst) %in% colnames(df.pred.pooled.merge))) next
  df.pred.pooled.merge[,var.dif] <- df.pred.pooled.merge[,var.pre] - df.pred.pooled.merge[,var.pst]
}

# join back to fuel plot data
df.pooled <- fuel.plots |>
  as.data.frame() |>
  merge(df.pred.pooled.merge, all = T)

# loop through the variables and compare consumption predictions to observations
par(mfrow = c(5,4))
for (var in vars){
  var.dif.prd <- paste0(var, "_dif_prd")
  var.dif.obs <- paste0(var, "_dif")
  if (!all(c(var.dif.prd, var.dif.obs) %in% colnames(df.pooled))) next
  df.pred.obs <- data.frame(
    obs = df.pooled[,var.dif.obs],
    prd = df.pooled[,var.dif.prd])
  perf.mets <- prd.obs.plot(df.pred.obs, var)
  if (var == vars[1]){
    perf.mets.pooled <- perf.mets
  } else {
    perf.mets.pooled <- rbind(perf.mets.pooled, perf.mets)
  }
}
perf.mets.pooled$method <- "pooled"
```
Interesting. I would have expected it to be better than the pre-post modeling approach, but it actually seemed to do somewhat worse. I guess this would suggest that the different lidar pulse densities (and perhaps other considerations like time of year, etc.) in the pre- and post-fire lidar data may have yielded enough deviation in the structural metrics to negatively impact pre-/post-fire comparisons when modeled together.
Approach #2: Profile Area Change (PAC)
One could argue that PAC is conceptually simpler than the Modeled Fuel Load Subtraction approach. It basically assumes that changes in the lidar-derived vegetation profile before and after a fire should correspond to changes in pre- and post-fire fuel loads. Very logically consistent. Indeed, its sole reliance on vegetation height quantiles makes it very appealing considering the broader objective of this study – to use post-fire-only lidar to predict pre-fire lidar structure for fuel consumption mapping purposes. Height quantiles are likely easier to predict than some of the more structurally complex lidar metrics used in the Modeled Fuel Load Subtraction portion of this analysis.
Below, I’ll calculate height percentiles on all of the plots and then plot out one example to show how PAC is calculated.
```{r}
# define percentile metrics function
perc.fun <- function(z){
  q <- quantile(z, seq(0,1,0.01)) |> as.list()
  names(q) <- paste0("p_", stringr::str_pad(seq(0,100,1), 3, "left", "0"))
  return(q)
}

# reproject plots to match crs of pre and pst ctgs
fuel.plots.pre <- fuel.plots |> st_as_sf() |> st_transform(st_crs(ctg.pre))
fuel.plots.pst <- fuel.plots |> st_as_sf() |> st_transform(st_crs(ctg.pst))

# run plot metrics
percs.pre <- plot_metrics(
  las = ctg.pre,
  func = ~perc.fun(Z),
  geometry = fuel.plots.pre,
  radius = 8) |> st_drop_geometry()
percs.pst <- plot_metrics(
  las = ctg.pst,
  func = ~perc.fun(Z),
  geometry = fuel.plots.pst,
  radius = 8) |> st_drop_geometry()

# separate out predictor and response data frames
resp.pre <- fuel.plots.pre |> st_drop_geometry()
resp.pst <- fuel.plots.pst |> st_drop_geometry()
percs.pre <- percs.pre[,!colnames(percs.pre) %in% colnames(fuel.plots.pre)]
percs.pst <- percs.pst[,!colnames(percs.pst) %in% colnames(fuel.plots.pst)]

# set negative values to zero
percs.pre[percs.pre < 0] <- 0
percs.pst[percs.pst < 0] <- 0

# plot an example out
ex.row <- 1
percs.pre.ex <- percs.pre[ex.row,]
percs.pst.ex <- percs.pst[ex.row,]
par(mar = c(5,5,1,1), las = 1)
plot(x = c(0,100), y = range(c(percs.pre.ex, percs.pst.ex)),
     type = "n", xlab = "Percentile", ylab = "Height (m)")
grid()
polygon(x = c(0, seq(0,100), 100),
        y = c(0, percs.pre.ex, 0),
        col = adjust_transparency(2, 0.5),
        border = NA)
lines(x = seq(0,100), y = percs.pre.ex, col = 2, lwd = 2)
polygon(x = c(0, seq(0,100), 100),
        y = c(0, percs.pst.ex, 0),
        col = adjust_transparency(4, 0.5),
        border = NA)
lines(x = seq(0,100), y = percs.pst.ex, col = 4, lwd = 2)
legend("topleft", legend = c("Pre-fire", "Post-fire"), col = c(2, 4),
       fill = c(adjust_transparency(2, 0.5), adjust_transparency(4, 0.5)))
```
In the figure above, PAC would be the difference in area under the curve between pre- and post-fire height distributions. Clearly there was much more fuel volume in this one example plot before the fire than after.
As with anything, there are several different ways PAC could be calculated. The simplest would be to calculate the AUC of both of these curves, just as you see them, and subtract the two. In the Hu et al. (2019) paper, they first normalize the heights to the maximum of the pre- and post-fire heights for each pixel/plot, effectively coercing the y-axis to a range of 0-1. This levels the playing field across different fuel types, which could be advantageous, but it could also inflate PAC-based fuel consumption estimates in shorter-stature fuel types. In either case (raw height-based or normalized height-based PAC), you could calculate PAC across the full height range (as they did in the paper), but you could also calculate PAC within vertical slices of the profile. This may be advantageous for separating out different portions of the profile for different fuel components.
So, in summary, I’ll try four different things:
2a. Raw height PAC for full profile
2b. Normalized height PAC for full profile
2c. Raw height PAC for profile slices
2d. Normalized height PAC for profile slices
2a. Raw height PAC for full profile
This approach will yield a singular PAC value for each plot. So, using a simple OLS regression, this approach will seek to understand the degree to which a singular measure of PAC can explain several different fuel metrics. The assumption is that it won’t do that well, but let’s see…
```{r}
# create auc function
auc.fun <- function(y){
  x <- seq(0,100)
  auc <- integrate(approxfun(x, y), 0, 100)$value
  return(auc)
}

# calculate raw auc on full profiles for pre- and post-fire data
auc.raw.pre <- apply(percs.pre, 1, auc.fun)
auc.raw.pst <- apply(percs.pst, 1, auc.fun)

# calculate pac
pac.raw <- auc.raw.pre - auc.raw.pst

# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars <- paste0(vars, "_dif")

# loop through them
par(mfrow = c(5,4), las = 1, mar = c(5,5,2,1))
for (var in vars){
  x <- pac.raw
  y <- fuel.plots[,var] |> as.data.frame()
  y <- y[,1]
  x <- x[!is.na(y)]
  y <- y[!is.na(y)]
  mod <- lm(y ~ x)
  prd <- predict(mod)
  df.pred.obs <- data.frame(prd = prd, obs = y)
  prd.obs.plot(df.pred.obs, var)
}
```
OK, so not great. Definitely some positive trends, suggesting it's not totally useless, but it's too simple a measure to capture the nuance of fuel consumption – especially across different fuel components.
2b. Normalized height PAC for full profile
This next approach mirrors what Hu et al. (2019) did, apparently with great success! Instead of using raw vegetation heights, the pre-/post-fire height values are first normalized for each plot to the maximum height between the two measurements. This may be particularly useful in this case, given the much higher point density of the post-fire lidar (and the associated greater likelihood of capturing tree tops).
```{r}
# create auc function
auc.fun <- function(y){
  x <- seq(0,100)
  auc <- integrate(approxfun(x, y), 0, 100)$value
  return(auc)
}

# get plot-level maximum heights
max.hgt.pre <- apply(percs.pre, 1, max)
max.hgt.pst <- apply(percs.pst, 1, max)
max.hgt <- apply(cbind(max.hgt.pre, max.hgt.pst), 1, max)

# normalize heights to max
percs.pre.norm <- percs.pre
for (i in seq(1, nrow(percs.pre.norm))){
  percs.pre.norm[i,] <- percs.pre.norm[i,] / max.hgt[i]
}
percs.pst.norm <- percs.pst
for (i in seq(1, nrow(percs.pst.norm))){
  percs.pst.norm[i,] <- percs.pst.norm[i,] / max.hgt[i]
}

# calculate normalized auc on full profiles for pre- and post-fire data
auc.norm.pre <- apply(percs.pre.norm, 1, auc.fun)
auc.norm.pst <- apply(percs.pst.norm, 1, auc.fun)

# calculate pac
pac.norm <- auc.norm.pre - auc.norm.pst

# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars <- paste0(vars, "_dif")

# loop through them
par(mfrow = c(5,4), las = 1, mar = c(5,5,2,1))
for (var in vars){
  x <- pac.norm
  y <- fuel.plots[,var] |> as.data.frame()
  y <- y[,1]
  x <- x[!is.na(y)]
  y <- y[!is.na(y)]
  mod <- lm(y ~ x)
  prd <- predict(mod)
  df.pred.obs <- data.frame(prd = prd, obs = y)
  prd.obs.plot(df.pred.obs, var)
}
```
…Essentially the same. A few better results, a few worse. Moral of the story: a singular measure of PAC is not good enough to quantify fuel consumption.
2c. Raw height PAC for profile slices
If a singular measure is insufficient, perhaps evaluating profile changes within vertical subsets will be more closely linked to field-measured fuel consumption. Let's first try it out on the raw heights. I will create a series of slices bounded by percentiles in increments of 10 (e.g., 0-10th, 0-20th, 10-20th, etc.). Since I will wind up with a large list of potential predictors, I will fall back on the variable selection-aided random forest modeling approach used in the Modeled Fuel Load Subtraction analysis.
```{r}
# create auc function
auc.slice.fun <- function(y, slice.min, slice.max){
  x <- seq(0,100)
  auc <- integrate(splinefun(x, y), slice.min, slice.max)$value
  return(auc)
}

# begin slice.min loop
counter <- 0
for (i in seq(0,100,10)){
  for (j in seq(0,100,10)){
    if (j > i){
      counter <- counter + 1
      auc.raw.slice.pre <- apply(percs.pre, 1, function(x) auc.slice.fun(x, i, j))
      auc.raw.slice.pst <- apply(percs.pst, 1, function(x) auc.slice.fun(x, i, j))
      pac.raw.slice <- auc.raw.slice.pre - auc.raw.slice.pst
      if (counter == 1){
        pac.raw.slice.df <- data.frame(x = pac.raw.slice)
        colnames(pac.raw.slice.df) <- paste0("pac_", i, "_", j)
      } else {
        pac.raw.slice.df <- cbind(pac.raw.slice.df, data.frame(x = pac.raw.slice))
        colnames(pac.raw.slice.df)[ncol(pac.raw.slice.df)] <- paste0("pac_", i, "_", j)
      }
    }
  }
}

# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars <- paste0(vars, "_dif")

# get plot ids
plot.ids <- fuel.plots[,"Plot"] |> as.data.frame()

# loop through them
par(mfrow = c(5,4), las = 1, mar = c(5,5,2,1))
for (var in vars){
  y <- fuel.plots[,var] |> as.data.frame()
  x <- pac.raw.slice.df
  mod.df <- cbind(plot.ids, y, x)
  df.pred.obs <- tuned.model(mod.df)
  perf.mets <- prd.obs.plot(df.pred.obs, var)
  if (var == vars[1]){
    perf.mets.rawpacslice <- perf.mets
  } else {
    perf.mets.rawpacslice <- rbind(perf.mets.rawpacslice, perf.mets)
  }
}
perf.mets.rawpacslice$method <- "rawpacslice"
```
Much better than the singular PAC measure! And in some cases, nearly as good as the Modeled Fuel Load Subtraction approach. Interesting…
2d. Normalized height PAC for profile slices
The last thing to try is the same multi-slice approach, but using normalized instead of raw heights.
```{r}
# create auc function
auc.slice.fun <- function(y, slice.min, slice.max){
  x <- seq(0,100)
  auc <- integrate(splinefun(x, y), slice.min, slice.max)$value
  return(auc)
}

# begin slice.min loop
counter <- 0
for (i in seq(0,100,10)){
  for (j in seq(0,100,10)){
    if (j > i){
      counter <- counter + 1
      auc.norm.slice.pre <- apply(percs.pre.norm, 1, function(x) auc.slice.fun(x, i, j))
      auc.norm.slice.pst <- apply(percs.pst.norm, 1, function(x) auc.slice.fun(x, i, j))
      pac.norm.slice <- auc.norm.slice.pre - auc.norm.slice.pst
      if (counter == 1){
        pac.norm.slice.df <- data.frame(x = pac.norm.slice)
        colnames(pac.norm.slice.df) <- paste0("pac_", i, "_", j)
      } else {
        pac.norm.slice.df <- cbind(pac.norm.slice.df, data.frame(x = pac.norm.slice))
        colnames(pac.norm.slice.df)[ncol(pac.norm.slice.df)] <- paste0("pac_", i, "_", j)
      }
    }
  }
}

# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars <- paste0(vars, "_dif")

# get plot ids
plot.ids <- fuel.plots[,"Plot"] |> as.data.frame()

# loop through them
par(mfrow = c(5,4), las = 1, mar = c(5,5,2,1))
for (var in vars){
  y <- fuel.plots[,var] |> as.data.frame()
  x <- pac.norm.slice.df
  mod.df <- cbind(plot.ids, y, x)
  df.pred.obs <- tuned.model(mod.df)
  perf.mets <- prd.obs.plot(df.pred.obs, var)
  if (var == vars[1]){
    perf.mets.normpacslice <- perf.mets
  } else {
    perf.mets.normpacslice <- rbind(perf.mets.normpacslice, perf.mets)
  }
}
perf.mets.normpacslice$method <- "normpacslice"
```
Results Comparison
Lastly, let’s take a look at all of the results together to compare methods. Given the poor performance of the singular PAC measures, I’ll compare four methods for fuel consumption mapping:
Modeled Fuel Load Subtraction
PrePost model
Pooled model
Profile Area Change
Raw height PAC for profile slices
Normalized height PAC for profile slices
```{r}
# define the variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "trees", "llm",
          "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf", "tf", "af")

# remove "_dif" from variable names in performance metrics tables
perf.mets.rawpacslice$var <- sub("_dif", "", perf.mets.rawpacslice$var)
perf.mets.normpacslice$var <- sub("_dif", "", perf.mets.normpacslice$var)

# set up the plot
par(mfrow = c(1,2), las = 1)

# get ranges
r2.range <- range(c(
  perf.mets.prepost$r2,
  perf.mets.pooled$r2,
  perf.mets.rawpacslice$r2,
  perf.mets.normpacslice$r2))
rrmse.range <- range(c(
  perf.mets.prepost$rrmse,
  perf.mets.pooled$rrmse,
  perf.mets.rawpacslice$rrmse,
  perf.mets.normpacslice$rrmse))

# create empty plot
x <- rep(100, length(vars))
names(x) <- vars
dotchart(x, pch = 16, xlab = expression(R^2), xlim = c(0,1))

# loop through variables
for (i in seq(1, length(vars))){
  var <- vars[i]
  points(x = perf.mets.prepost$r2[perf.mets.prepost$var == var],
         y = i, pch = 16, col = 1)
  points(x = perf.mets.pooled$r2[perf.mets.pooled$var == var],
         y = i, pch = 16, col = 2)
  points(x = perf.mets.rawpacslice$r2[perf.mets.rawpacslice$var == var],
         y = i, pch = 16, col = 3)
  points(x = perf.mets.normpacslice$r2[perf.mets.normpacslice$var == var],
         y = i, pch = 16, col = 4)
}

# create empty plot
y <- rep(10000, length(vars))
names(y) <- vars
dotchart(y, pch = 16, xlab = expression(symbol("%")*RMSE), xlim = c(0,200))

# loop through variables
for (i in seq(1, length(vars))){
  var <- vars[i]
  points(x = perf.mets.prepost$rrmse[perf.mets.prepost$var == var],
         y = i, pch = 16, col = 1)
  points(x = perf.mets.pooled$rrmse[perf.mets.pooled$var == var],
         y = i, pch = 16, col = 2)
  points(x = perf.mets.rawpacslice$rrmse[perf.mets.rawpacslice$var == var],
         y = i, pch = 16, col = 3)
  points(x = perf.mets.normpacslice$rrmse[perf.mets.normpacslice$var == var],
         y = i, pch = 16, col = 4)
}

# add legend
legend("topright", legend = c("PrePost", "Pooled", "Raw PAC Slice", "Norm PAC Slice"),
       pch = 16, col = c(1,2,3,4))
```
So, clearly the PrePost approach does the best nearly across the board. This raises the question: can the lidar predictor variables that were used in the PrePost models be estimated from post-fire-only data? It also raises the question: which among these fuel variables are modeled well enough to even move forward with? Obviously some of them (e.g., trees, nonwoody, shrb_sd, 1-100 hr fuels) are pretty garbage-y. But several are quite good. So, maybe the focus should be on modeling the variables that we can more accurately predict.
I propose the following… We focus on the integrated fuel metrics: af, tf, acf, cf, and uf. We take a look at what lidar predictor variables were selected for the construction of those models in the PrePost modeling framework. If the list of variables looks reasonable, then we explore the process for modeling them.
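As a starting point, a small hypothetical helper could report which lidar metrics VSURF retained for a given response and their ranger permutation importances. The sketch below (inspect.vars is illustrative, not part of the analysis above) refits a model the same way tuned.model does, but returns the selected predictors rather than the predictions:

```{r}
# hypothetical helper: refit one response the same way tuned.model does,
# but return the VSURF-selected predictors and their permutation importances
inspect.vars <- function(mod.df){
  mod.df <- na.omit(mod.df)
  mod.df$Plot <- NULL
  v <- VSURF(x = mod.df[,-1], y = mod.df[,1],
             RFimplem = "ranger", parallel = T, ncores = 18, verbose = F)
  sel <- colnames(mod.df)[-1][v$varselect.pred]
  mod <- ranger(as.formula(paste0(colnames(mod.df)[1], " ~ .")),
                data = mod.df[,c(colnames(mod.df)[1], sel)],
                importance = "permutation")
  sort(mod$variable.importance, decreasing = TRUE)
}

# e.g., for pre-fire available fuel:
# inspect.vars(cbind(plot.id, resp.var, mets.pre))
```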
I think a bigger question is: is any of what is already done here sufficiently novel to consider publishing? I don't want to step on anyone's toes, and I don't know if you, Ben Bright, Ryan McCarley, or anyone else is already working toward Monroe Mountain fuel consumption mapping. If so, I'll happily yield the floor. But if no one is, then maybe these results in and of themselves warrant consideration for publication?
Source Code
---title: "Monroe Mountain Fuel Consumption Modeling"author: "Mickey Campbell"format: html: toc: true toc-expand: true toc-depth: 4 toc-title: Contents toc_location: left code-fold: true code-summary: "Show Code" code-tools: true---## IntroductionThe overarching goal of this project is to develop and test a new approach for mapping fuel consumption *using only post-fire lidar data*. The motivation is that lidar data can be hard to come by, and multitemporal lidar are even more rare. So, although multitemporal lidar data are well-suited to the task of mapping fuel consumption, their rarity makes it hard to quantify fuel consumption on broad scales. However, perhaps if lidar data were collected after a fire (even within a few years of a fire), then we could model what pre-fire lidar/vegetation structure might have looked like (using spectral, topographic, climatic, soils, etc. data along with undisturbed post-fire lidar data), effectively creating pseudo-pre-fire lidar data, that could be used to map fuel consumption.Monroe Mountain represents an ideal area to test this workflow out, because it has pre-fire and post-fire lidar data *and* fuel measurement data. However, before jumping right into the deep end by modeling pre-fire lidar structure, I thought it would be wise to see how well we can map fuel consumption *with all real lidar data*. If we can't do it accurately with real data, then the odds of success with modeled data are low.Accordingly, the primary objective of this document is to demonstrate methods and results for mapping fuel consumption using real lidar data. I will use FERA fuel plot data, acquired from Ryan McCarley to build several different model types using 2016 (pre-fire) and 2023 (post-fire) airborne lidar data.## Field Data### OverviewFirst things first, let's check out the field data. These data were provided to us by Ryan McCarley (via Andy Hudak) on 3/5/2024. They contain point locations of FERA plots on Monroe Mountain collected over the course of several years with pre- and post-fire fuel loading measurements. Here's a quick glimpse:```{r}#| warning: false#| message: falselibrary(terra)library(leaflet)library(rmarkdown)library(sf)library(lidR)library(lidRmetrics)library(VSURF)library(tuneRanger)library(ranger)library(mlr)library(dplyr)library(colorspace)# read in the datafuel.plots <-vect("S:/ursa/campbell/fasmee/data/from_ryan/plots_wFuels.shp")# plot them outfuel.plots.ll <-project(fuel.plots, "epsg:4326")m <-leaflet(data = fuel.plots.ll) |>addProviderTiles("Esri.WorldImagery") |>addMarkers()m# take a look at the tablefuel.plots |>as.data.frame() |>paged_table()```There are `r nrow(fuel.plots)` plots in this database. But, that number is a little deceptive, for two main reasons:1. Each plot was measured twice, and each measurement is recorded as its own point, effectively doubling the count.2. There are several points at the end of the table that have no data associated with them.So, let's do some data cleaning to reformat the data in a way that will make it easier to use for modeling.### Field Data CleaningFirst, I'll remove the plots without any data. 
Incidentally, these are plots 601 - 620.```{r}#| warning: false#| message: false# remove empty plotsfuel.plots <- fuel.plots[!is.na(fuel.plots$FID),]```Next, I'll separate out pre-fire and post-fire plots.```{r}#| warning: false#| message: false# separate pre-fire and post-fire plotsfuel.plots.pre <- fuel.plots[grep("pre", fuel.plots$FID),]fuel.plots.pst <- fuel.plots[grep("post", fuel.plots$FID),]# remove unnecessary columnsdel.cols <-c("FID", "X", "Y")keep.cols <-names(fuel.plots)[!names(fuel.plots) %in% del.cols]fuel.plots.pre <- fuel.plots.pre[,keep.cols]fuel.plots.pst <- fuel.plots.pst[,keep.cols]# move plot ID column from last to firstn <-ncol(fuel.plots.pre)fuel.plots.pre <- fuel.plots.pre[,c(n,1:(n-1))]fuel.plots.pst <- fuel.plots.pst[,c(n,1:(n-1))]# add "pre" and "pst" to column names for all attributesdif.names <-names(fuel.plots.pre)[2:ncol(fuel.plots.pre)]names(fuel.plots.pre)[2:ncol(fuel.plots.pre)] <-paste0(names(fuel.plots.pre)[2:ncol(fuel.plots.pre)], "_pre")names(fuel.plots.pst)[2:ncol(fuel.plots.pst)] <-paste0(names(fuel.plots.pst)[2:ncol(fuel.plots.pst)], "_pst")# merge themfuel.plots <-merge(fuel.plots.pre, as.data.frame(fuel.plots.pst))# get differences between pre- and post-fire fuel loads (i.e., calc consumption)for (dif.name in dif.names){ pre.fld <-paste0(dif.name, "_pre") pst.fld <-paste0(dif.name, "_pst") dif.fld <-paste0(dif.name, "_dif") fuel.plots[,dif.fld] <-as.data.frame(fuel.plots[,pre.fld]) -as.data.frame(fuel.plots[,pst.fld])}# plot them outfuel.plots.ll <-project(fuel.plots, "epsg:4326")m <-leaflet(data = fuel.plots.ll) |>addProviderTiles("Esri.WorldImagery") |>addMarkers()m# take a look at the tablefuel.plots |>as.data.frame() |>paged_table()```After cleaning, we have `r nrow(fuel.plots)` plots.### Definition of Field NamesIt is worth now providing a key that represents what each abbreviation in the field names means:- **yr**: year of measurment- **hr1**: 1 hour biomass- **hr10**: 10 hours biomass- **hr100**: 100 hour biomass- **hr1000**: 1000 hour biomass- **nonwoody**: herb biomass- **shrubs**: shrub biomass- **trees**: seedling biomass- **llm**: litter, lichen, and moss biomass- **ground**: duff biomass- **shrb_sd**: combined biomass of shrubs and seedlings- **woody**: combined biomass of 1hr, 10hr, 100hr, and 1000hr- **uf**: combined biomass of nonwoody, llm, ground, shrubs, trees, and woody- **cf**: tree biomass, including snags as well as live boles, branches, and foliage- **cbd**: canopy bulk density- **cbh**: canopy base height- **acf**: available canopy fuel (foliage and fine branches)- **tf**: total fuel (all biomass combined)- **af**: available fuel (combined biomass of uf and acf)And here are the self-explanatory suffixes I've added:- **pre**: pre-fire measurement- **pst**: post-fire measurement- **dif**: difference between pre- and post-fire measurementsAs I understand it, all of the fields are measured in Mg/ha, with the exception of **cbd** and **cbh**, which are probably measured in kg/m^3^ and m, respectively.## Modeling Fuel ConsumptionI will test two different approaches for mapping fuel consumption. The first will be referred to as the **modeled fuel load subtraction** approaches. In this approach, pre-fire and post-fire fuel loads for each component will be modeled individually, and the differences between the modeled predictions will serve as the fuel consumption predictions. This is comparable to, for example, [McCarley et al. (2020)](https://www.sciencedirect.com/science/article/pii/S0034425720304879#s0010]). 
The second approach will be referred to as the **profile area change** approach. In this approach, described in [Hu et al. (2019)](https://www.sciencedirect.com/science/article/pii/S0303243418308997), a cumulative probability density functions are created from lidar height percentiles for both pre- and post-fire point cloud data. The area under the curve for each resulting profile represents an approximation of the fuel volume. Theoretically, if fire consumes fuel, then the post-fire AUC should be lower than the pre-fire AUC. The difference between these AUCs is the profile area change, or **PAC**.We have two airborne lidar datasets, one from 2016 and one from 2023. The former approximates pre-fire conditions, and the latter post-fire conditions, though the timing differences between when the fuels were measured and when the lidar data were flown is well worth noting, and may be especially problematic for post-fire structural modeling, given the regeneration that likely has occurred in the several years since these fires burned.### Approach #1: Modeled Fuel Load SubtractionFollowing [McCarley et al. (2020)](https://www.sciencedirect.com/science/article/pii/S0034425720304879#s0010]), I will try two different approaches here, described in the paper as the PrePost and Pooled models:- **PrePost**: Model pre- and post-fire fuel loads separately. Use pre-fire lidar and pre-fire fuel loads to train and validate pre-fire model. Use post-fire lidar and post-fire fuel loads to train and validate post-fire model. - **Pooled**: Combine pre- and post-fire data to train and validate singular pre-/post-fire model.There are several different techniques one could employ to link plot data to lidar data. These include:1. Generating a set of rasterized lidar metrics and extracting pixel values at each plot. - May not capture true within-plot conditions as well, since it requires interpolating gridded values to a plot's geometry that doesn't align with that grid - But, easier to translate to prediction, since the pixels are being used to both train and predict with the model2. Clipping raw point cloud data to the plots, and calculating lidar metrics from the clipped clouds. - Faster, since you're processing on small, clipped point clouds - More accurately captures the true dimensions of the plot - May not translate as directly well to pixel-based predictionTaking those factors into consideration, for the sake of this test, I opted to go with the second technique. Doing so requires the supply of plot dimensions. FERA plots had different variables collected within different plot subgeometries, but the largest plot extent, within which tree data were collected, was an 8m-radius plot. So I will extract pre- and post-fire point cloud data within an 8m radius cirle around each plot center point, and use those clouds to derive the lidar structural metrics. I will be using [Tompalski et al.'s lidRmetrics library](https://github.com/ptompalski/lidRmetrics) to generate a suite of plot-level predictor variables. Specifically, I use their `metrics_set3()` function, which encompasses the most comprehensive set of structural predictor variables.#### 1a. 
PrePost modelHere, I'll read in the lidar data, calculate lidar metrics within plot-clipped point clouds, and remove lidar predictor variables that, for a variety of reasons, will not be useful for modeling purposes.```{r}#| warning: false#| message: false#| results: hide # suppress progress baroptions(lidR.progress = F)# directory structureals.dir.pre <-"S:/ursa/campbell/fasmee/data/lidar_data/monroe_mtn_lidar/a04_height"als.dir.pst <-"S:/ursa2/campbell/ems4d/data/als/als_2023/LAZ/Eastburn_processed"# read in lascatalogsctg.pre <-readLAScatalog(als.dir.pre, filter ="-drop_withheld -drop_class 7 18", progress = F)ctg.pst <-readLAScatalog(als.dir.pst, filter ="-drop_withheld -drop_class 7 18", progress = F)# reproject plots to match crs of pre and pst ctgsfuel.plots.pre <- fuel.plots |>st_as_sf() |>st_transform(st_crs(ctg.pre))fuel.plots.pst <- fuel.plots |>st_as_sf() |>st_transform(st_crs(ctg.pst))# run plot metricsmets.pre <-plot_metrics(las = ctg.pre,func =~lidRmetrics::metrics_set3(x = X,y = Y,z = Z,i = Intensity,ReturnNumber = ReturnNumber,NumberOfReturns = NumberOfReturns ),geometry = fuel.plots.pre,radius =8) |>as.data.frame()mets.pst <-plot_metrics(las = ctg.pst,func =~lidRmetrics::metrics_set3(x = X,y = Y,z = Z,i = Intensity,ReturnNumber = ReturnNumber,NumberOfReturns = NumberOfReturns ),geometry = fuel.plots.pst,radius =8) |>as.data.frame()# separate out predictor and response data framesresp.pre <- fuel.plots.pre |>st_drop_geometry()resp.pst <- fuel.plots.pst |>st_drop_geometry()mets.pre <- mets.pre[,!colnames(mets.pre) %in%colnames(fuel.plots.pre)]mets.pst <- mets.pst[,!colnames(mets.pst) %in%colnames(fuel.plots.pst)]# remove unscalable variablesdel.cols <-c("n", "n_first", "n_intermediate", "n_last", "n_single","n_multiple", "n_return_1", "n_return_2", "n_return_3","n_return_4", "vn")# remove predictors with no variabilitysds.pre <-apply(mets.pre, 2, sd)sds.pst <-apply(mets.pst, 2, sd)del.cols.pre <-c(del.cols, names(sds.pre)[sds.pre ==0&!is.na(sds.pre)])del.cols.pst <-c(del.cols, names(sds.pst)[sds.pst ==0&!is.na(sds.pst)])# remove predictors with na valuesnas.pre <-apply(mets.pre, 2, function(x) sum(is.na(x)))nas.pst <-apply(mets.pst, 2, function(x) sum(is.na(x)))del.cols.pre <-c(del.cols.pre, names(nas.pre)[nas.pre >0])del.cols.pst <-c(del.cols.pst, names(nas.pst)[nas.pst >0])# keep columnskeep.cols.pre <-colnames(mets.pre)[!colnames(mets.pre) %in% del.cols.pre]keep.cols.pst <-colnames(mets.pst)[!colnames(mets.pst) %in% del.cols.pst]mets.pre <- mets.pre[,keep.cols.pre]mets.pst <- mets.pst[,keep.cols.pst]```Next, I'll define a couple quick functions to avoid redundant code, one for modeling and one for plotting.```{r}#| warning: false#| message: false# define modeling functiontuned.model <-function(mod.df){ mod.df <-na.omit(mod.df) plot.ids <- mod.df[,1] mod.df$Plot <-NULL x.cols <-2:ncol(mod.df) y.col <-1 v <-VSURF(x = mod.df[,x.cols],y = mod.df[,y.col],RFimplem ="ranger",parallel = T,ncores =18,verbose = F ) x.cols <- x.cols[v$varselect.pred] mod.df <- mod.df[,c(y.col, x.cols)]# task <- makeRegrTask(data = mod.df, target = colnames(mod.df)[1])# tune <- tuneRanger(task, show.info = F) form <-as.formula(paste0(colnames(mod.df)[1], " ~ .")) mod <-ranger(formula = form,data = mod.df,# mtry = tune$recommended.pars$mtry,# sample.fraction = tune$recommended.pars$sample.fraction,# min.node.size = tune$recommended.pars$min.node.size,importance ="permutation" ) df.pred.obs <-data.frame(Plot = plot.ids,obs = mod.df[,y.col],prd = mod$predictions)return(df.pred.obs)}# define predicted 
vs. observed plotting functionprd.obs.plot <-function(df.pred.obs, name){ df.pred.obs <-na.omit(df.pred.obs) x <- df.pred.obs$obs y <- df.pred.obs$prd rg <-range(c(x,y), na.rm = T)par(mar =c(5,5,2,1), las =1)plot(x, y, type ="n", xlab ="Observed", ylab ="Predicted", main = name,ylim = rg, xlim = rg)grid()lines(x =c(-1e10, 1e10), y =c(-1e10, 1e10))points(x, y, pch =16, col =rgb(0,0,0,0.25), cex =1.25) mod <-lm(y ~ x) new.x <-range(x) new.y <-predict(mod, newdata =list(x = new.x))lines(new.x, new.y, lwd =2, col =2) r2 <-summary(mod)$adj.r.squared rrmse <- (100*sqrt(mean((x-y)^2))/mean(x)) r2.lab <-bquote(R^2== .(round(r2,2))) rrmse.lab <-bquote(symbol("%")*RMSE == .(round(rrmse)))legend("topleft",legend =c(r2.lab, rrmse.lab),text.col =2,bty ="n",x.intersp =0 )return(data.frame(var = name, r2 = r2, rrmse = rrmse))}```Next, I'll run the models! First, for pre-fire...```{r}#| warning: false#| message: false#| fig-height: 10#| fig-width: 8# define response variables of interestvars <-c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees", "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf", "tf", "af")vars.pre <-paste0(vars, "_pre")# loop through variables and model, plot, and compile results of eachpar(mfrow =c(5,4))successes <-0for (var.pre in vars.pre){ resp.var <- fuel.plots.pre[,var.pre] |>st_drop_geometry() plot.id <- fuel.plots.pre[,"Plot"] |>st_drop_geometry()if (length(unique(resp.var[,1])) <5){next } else { successes <- successes +1 } mod.df <-cbind(plot.id, resp.var, mets.pre) df.pred.obs <-tuned.model(mod.df)prd.obs.plot(df.pred.obs, var.pre) df.pred.pre <-data.frame(Plot = df.pred.obs$Plot,x = df.pred.obs$prd)colnames(df.pred.pre)[2] <-paste0(var.pre, "_prd")if (successes ==1){ df.pred.pre.merge <- df.pred.pre } else { df.pred.pre.merge <-merge(df.pred.pre.merge, df.pred.pre, all = T) }}```And now, for post-fire...```{r}#| warning: false#| message: false#| fig-height: 8#| fig-width: 8# define response variables of interestvars <-c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees", "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf", "tf", "af")vars.pst <-paste0(vars, "_pst")# loop through variables and model, plot, and compile results of eachpar(mfrow =c(4,4))successes <-0for (var.pst in vars.pst){ resp.var <- fuel.plots.pst[,var.pst] |>st_drop_geometry() plot.id <- fuel.plots.pst[,"Plot"] |>st_drop_geometry()if (length(unique(resp.var[,1])) <5){next } else { successes <- successes +1 } mod.df <-cbind(plot.id, resp.var, mets.pst) df.pred.obs <-tuned.model(mod.df)prd.obs.plot(df.pred.obs, var.pst) df.pred.pst <-data.frame(Plot = df.pred.obs$Plot,x = df.pred.obs$prd)colnames(df.pred.pst)[2] <-paste0(var.pst, "_prd")if (successes ==1){ df.pred.pst.merge <- df.pred.pst } else { df.pred.pst.merge <-merge(df.pred.pst.merge, df.pred.pst, all = T) }}```**Note that we lost `shrubs_pst` because it had too few unique values to be modeled using RF**.Obviously a pretty mixed bag in terms of model performance by fuel metric. But, there are definitely some promising results peppered throughout. Mapping fuel structure, however, may be easier than mapping changes in fuel structure (i.e., fuel consumption). So, that will be the true test of model performance. 
To test this, I will take all of the predicted values and subtract post-fire predictions from pre-fire predictions, comparing the results to differences between pre- and post-fire fuel metrics from the field data.```{r}#| warning: false#| message: false#| fig-height: 8#| fig-width: 8# merge pre and post fire predictionsdf.pred.merge <-merge(df.pred.pre.merge, df.pred.pst.merge, all = T)# get fuel metric namesvars <-colnames(df.pred.merge) |>sub("_pre_prd", "", x = _) |>sub("_pst_prd", "", x = _) |>unique()vars <- vars[vars !="Plot"]# loop through them and calculate differencesfor (var in vars){ var.pre <-paste0(var, "_pre_prd") var.pst <-paste0(var, "_pst_prd") var.dif <-paste0(var, "_dif_prd")if (!all(c(var.pre, var.pst) %in%colnames(df.pred.merge))) next df.pred.merge[,var.dif] <- df.pred.merge[,var.pre] - df.pred.merge[,var.pst]}# join back to fuel plot datadf.pre.post <- fuel.plots |>as.data.frame() |>merge(df.pred.merge, all = T)# loop through the variables and compare consumption predictions to observationspar(mfrow =c(4,4))for (var in vars){ var.dif.prd <-paste0(var, "_dif_prd") var.dif.obs <-paste0(var, "_dif")if (!all(c(var.dif.prd, var.dif.obs) %in%colnames(df.pre.post))) next df.pred.obs <-data.frame(obs = df.pre.post[,var.dif.obs],prd = df.pre.post[,var.dif.prd])prd.obs.plot(df.pred.obs, var)}```Some pretty impressive results in there! There's clearly an outlier plot that shows up in the ground fuels -- I'm guessing this is a case of a new snag that fell in the middle of a plot, resulting in a massive increase in ground fuels. I think we can safely remove that plot, because it's not an accurate reflection of fuel *consumption*. Here are the results with that plot omitted:```{r}#| warning: false#| message: false#| fig-height: 8#| fig-width: 8# remove the plot that had an extreme increase in ground fuelsdf.pre.post <- df.pre.post[df.pre.post$hr1000_dif >-300,]# loop through the variables and compare consumption predictions to observationspar(mfrow =c(4,4))for (var in vars){ var.dif.prd <-paste0(var, "_dif_prd") var.dif.obs <-paste0(var, "_dif")if (!all(c(var.dif.prd, var.dif.obs) %in%colnames(df.pre.post))) next df.pred.obs <-data.frame(obs = df.pre.post[,var.dif.obs],prd = df.pre.post[,var.dif.prd]) perf.mets <-prd.obs.plot(df.pred.obs, var)if (var == vars[1]){ perf.mets.prepost <- perf.mets } else { perf.mets.prepost <-rbind(perf.mets.prepost, perf.mets) }}perf.mets.prepost$method <-"prepost"```Those results look better.#### 1b. Pooled modelAgain, the idea here is to create a singular structural predictive model that incorporates both pre- and post-fire fuel plot and lidar metrics data. This acts to increase the sample size, and potentially capture more structural nuance within a singular model. 
It also has potential issues with autocorrelation, but we can address that later if needed through a more strategic evaluation procedure.```{r}#| warning: false#| message: false#| fig-height: 10#| fig-width: 8# define response variables of interestvars <-c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees", "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf", "tf", "af")vars.pre <-paste0(vars, "_pre")vars.pst <-paste0(vars, "_pst")# subset the fuels datafuel.plots.pooled.pre <- fuel.plots.pre[,c("Plot", vars.pre)] |>st_drop_geometry()fuel.plots.pooled.pst <- fuel.plots.pst[,c("Plot", vars.pst)] |>st_drop_geometry()# remove "_pre" and "_pst" from column names to enable row mergingnames(fuel.plots.pooled.pre) <-sub("_pre", "", names(fuel.plots.pooled.pre))names(fuel.plots.pooled.pst) <-sub("_pst", "", names(fuel.plots.pooled.pst))# add "_pre" and "_pst" to plot idsfuel.plots.pooled.pre$Plot <-paste0(fuel.plots.pooled.pre$Plot, "_pre")fuel.plots.pooled.pst$Plot <-paste0(fuel.plots.pooled.pst$Plot, "_pst")# merge rowwisefuel.plots.pooled <-bind_rows(fuel.plots.pooled.pre, fuel.plots.pooled.pst)# merge mets rowwise toomets.pooled <-bind_rows(mets.pre, mets.pst)# remove columns with nasmets.pooled <- mets.pooled[,apply(mets.pooled, 2, function(x) sum(is.na(x)) ==0)]# loop through variables and model, plot, and compile results of eachpar(mfrow =c(5,4))successes <-0for (var in vars){ resp.var <- fuel.plots.pooled[var] plot.id <- fuel.plots.pooled["Plot"]if (length(unique(resp.var[,1])) <5){next } else { successes <- successes +1 } mod.df <-cbind(plot.id, resp.var, mets.pooled) df.pred.obs <-tuned.model(mod.df)prd.obs.plot(df.pred.obs, var) df.pred.pooled <-data.frame(Plot = df.pred.obs$Plot,x = df.pred.obs$prd)colnames(df.pred.pooled)[2] <-paste0(var, "_prd")if (successes ==1){ df.pred.pooled.merge <- df.pred.pooled } else { df.pred.pooled.merge <-merge(df.pred.pooled.merge, df.pred.pooled, all = T) }}```Pretty impressive! 
Let's see how it performs in the context of fuel consumption.```{r}#| warning: false#| message: false#| fig-height: 10#| fig-width: 8# separate back out the pre- and post-fire plotsdf.pred.pooled.pre <- df.pred.pooled.merge[grep("_pre", df.pred.pooled.merge$Plot),]df.pred.pooled.pst <- df.pred.pooled.merge[grep("_pst", df.pred.pooled.merge$Plot),]# add "_pre" and "_pst" to prediction column namescolnames(df.pred.pooled.pre) <-sub("_prd", "_pre_prd", colnames(df.pred.pooled.pre))colnames(df.pred.pooled.pst) <-sub("_prd", "_pst_prd", colnames(df.pred.pooled.pst))# remove "_pre" and "_pst" from plot namesdf.pred.pooled.pre$Plot <-sub("_pre", "", df.pred.pooled.pre$Plot)df.pred.pooled.pst$Plot <-sub("_pst", "", df.pred.pooled.pst$Plot)# merge pre and post fire predictionsdf.pred.pooled.merge <-merge(df.pred.pooled.pre, df.pred.pooled.pst, all = T)# get fuel metric namesvars <-colnames(df.pred.pooled.merge) |>sub("_pre_prd", "", x = _) |>sub("_pst_prd", "", x = _) |>unique()vars <- vars[vars !="Plot"]# loop through them and calculate differencesfor (var in vars){ var.pre <-paste0(var, "_pre_prd") var.pst <-paste0(var, "_pst_prd") var.dif <-paste0(var, "_dif_prd")if (!all(c(var.pre, var.pst) %in%colnames(df.pred.pooled.merge))) next df.pred.pooled.merge[,var.dif] <- df.pred.pooled.merge[,var.pre] - df.pred.pooled.merge[,var.pst]}# join back to fuel plot datadf.pooled <- fuel.plots |>as.data.frame() |>merge(df.pred.pooled.merge, all = T)# loop through the variables and compare consumption predictions to observationspar(mfrow =c(5,4))for (var in vars){ var.dif.prd <-paste0(var, "_dif_prd") var.dif.obs <-paste0(var, "_dif")if (!all(c(var.dif.prd, var.dif.obs) %in%colnames(df.pooled))) next df.pred.obs <-data.frame(obs = df.pooled[,var.dif.obs],prd = df.pooled[,var.dif.prd]) perf.mets <-prd.obs.plot(df.pred.obs, var)if (var == vars[1]){ perf.mets.pooled <- perf.mets } else { perf.mets.pooled <-rbind(perf.mets.pooled, perf.mets) }}perf.mets.pooled$method <-"pooled"```Interesting. I would have expected it to be better than the pre-post modeling approach, but it actually seemed to do somewhat worse. I guess this would suggest that the different lidar pulse densities (and perhaps other considerations like time of year, etc.) in the pre- and post-fire lidar data may have yielded enough deviation in the structural metrics to negatively impact pre-/post-fire comparisons when modeled together.### Approach #2: Profile Area Change (PAC)One could argue that PAC is conceptually simpler than the Modeled Fuel Load Subtraction approach. It basically assumes that changes in the lidar-derived vegetation profile before and after a fire should correspond to changes in pre- and post-fire fuel loads. Very logically consistent. Indeed, its sole reliance on vegetation height quantiles makes it very appealing considering the broader objective of this study -- to use post-fire-only lidar to predict pre-fire lidar structure for fuel consumption mapping purposes. 
Height quantiles are likely easier to predict than some of the more structurally complex lidar metrics used in the Modeled Fuel Load Subtraction portion of this analysis.Below, I'll calculate height percentiles on all of the plots and then plot out one example to show how PAC is calculated.```{r}#| warning: false#| message: false#| fig-width: 5#| fig-height: 5#| results: hide # define percentile metrics functionperc.fun <-function(z){ q <-quantile(z, seq(0,1,0.01)) |>as.list()names(q) <-paste0("p_", stringr::str_pad(seq(0,100,1), 3, "left", "0"))return(q)}# reproject plots to match crs of pre and pst ctgsfuel.plots.pre <- fuel.plots |>st_as_sf() |>st_transform(st_crs(ctg.pre))fuel.plots.pst <- fuel.plots |>st_as_sf() |>st_transform(st_crs(ctg.pst))# run plot metricspercs.pre <-plot_metrics(las = ctg.pre,func =~perc.fun(Z),geometry = fuel.plots.pre,radius =8) |>st_drop_geometry()percs.pst <-plot_metrics(las = ctg.pst,func =~perc.fun(Z),geometry = fuel.plots.pst,radius =8) |>st_drop_geometry()# separate out predictor and response data framesresp.pre <- fuel.plots.pre |>st_drop_geometry()resp.pst <- fuel.plots.pst |>st_drop_geometry()percs.pre <- percs.pre[,!colnames(percs.pre) %in%colnames(fuel.plots.pre)]percs.pst <- percs.pst[,!colnames(percs.pst) %in%colnames(fuel.plots.pst)]# set negative values to zeropercs.pre[percs.pre <0] <-0percs.pst[percs.pst <0] <-0# plot an example outex.row <-1percs.pre.ex <- percs.pre[ex.row,]percs.pst.ex <- percs.pst[ex.row,]par(mar =c(5,5,1,1), las =1)plot(x =c(0,100), y =range(c(percs.pre.ex, percs.pst.ex)),type ="n", xlab ="Percentile", ylab ="Height (m)")grid()polygon(x =c(0, seq(0,100), 100),y =c(0, percs.pre.ex, 0),col =adjust_transparency(2, 0.5),border =NA)lines(x =seq(0,100), y = percs.pre.ex, col =2, lwd =2)polygon(x =c(0, seq(0,100), 100),y =c(0, percs.pst.ex, 0),col =adjust_transparency(4, 0.5),border =NA)lines(x =seq(0,100), y = percs.pst.ex, col =4, lwd =2)legend("topleft", legend =c("Pre-fire", "Post-fire"), col =c(2, 4), fill =c(adjust_transparency(2, 0.5), adjust_transparency(4, 0.5)))```In the figure above, PAC would be the difference in area under the curve between pre- and post-fire height distributions. Clearly there was much more fuel volume in this one example plot before the fire than after.As with anything, there are several different ways PAC could be calculated. The simplest way would be just calculating the AUC of both of these curves, as you see them, and subtracting the two. In the [Hu et al. (2019)](https://www.sciencedirect.com/science/article/pii/S0303243418308997) paper, they normalize the heights first to the maximum of pre- and post-fire height for each pixel/plot, effectively coercing the y-axis to a range of 0-1. This effectively levels the playing field for different fuel types, which *could be* advantageous, but it could also inflate the PAC-based fuel consumption in shorter-stature fuel types. In either case (the raw height-based or the normalize height-based PAC), you could calculate PAC across the full height range (as they did in the paper), but you could also calculate PAC within vertical slices of the profile. This may be advantageous for separating out different portions of the profile for different fuel components.So, in summary, I'll try four different things: - 2a. Raw height PAC for full profile - 2b. Normalized height PAC for full profile - 2c. Raw height PAC for profile slices - 2d. Normalized height PAC for profile slices#### 2a. 
#### 2a. Raw height PAC for full profile

This approach yields a single PAC value per plot. So, using simple OLS regression, it tests the degree to which a single measure of PAC can explain several different fuel consumption metrics. My assumption is that it won't do that well, but let's see...

```{r}
#| warning: false
#| message: false
#| fig-height: 10
#| fig-width: 8

# create auc function (linear interpolation across the percentile curve)
auc.fun <- function(y){
  x <- seq(0, 100)
  auc <- integrate(approxfun(x, y), 0, 100)$value
  return(auc)
}

# calculate raw auc on full profiles for pre- and post-fire data
auc.raw.pre <- apply(percs.pre, 1, auc.fun)
auc.raw.pst <- apply(percs.pst, 1, auc.fun)

# calculate pac
pac.raw <- auc.raw.pre - auc.raw.pst

# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars <- paste0(vars, "_dif")

# loop through them, fit OLS models, and plot predictions vs. observations
par(mfrow = c(5,4), las = 1, mar = c(5,5,2,1))
for (var in vars){
  x <- pac.raw
  y <- fuel.plots[,var] |> as.data.frame()
  y <- y[,1]
  x <- x[!is.na(y)]
  y <- y[!is.na(y)]
  mod <- lm(y ~ x)
  prd <- predict(mod)
  df.pred.obs <- data.frame(prd = prd, obs = y)
  prd.obs.plot(df.pred.obs, var)
}
```

OK, so not great. There are definitely *some* positive trends, suggesting it's not totally useless, but a single PAC value is too simple a measure to capture the nuance of fuel consumption -- especially across different fuel components.

#### 2b. Normalized height PAC for full profile

This next approach mirrors what Hu et al. (2019) did, apparently with great success! Instead of using raw vegetation heights, the pre-/post-fire height values are first normalized for each plot to the maximum height across the two measurements. This may be particularly useful here, given the much higher point density of the post-fire lidar (and the associated greater likelihood of capturing tree tops).

```{r}
#| warning: false
#| message: false
#| fig-height: 10
#| fig-width: 8

# get plot-level maximum heights across both acquisitions
max.hgt.pre <- apply(percs.pre, 1, max)
max.hgt.pst <- apply(percs.pst, 1, max)
max.hgt <- apply(cbind(max.hgt.pre, max.hgt.pst), 1, max)

# normalize each plot's heights to its joint maximum
percs.pre.norm <- percs.pre
for (i in seq(1, nrow(percs.pre.norm))){
  percs.pre.norm[i,] <- percs.pre.norm[i,] / max.hgt[i]
}
percs.pst.norm <- percs.pst
for (i in seq(1, nrow(percs.pst.norm))){
  percs.pst.norm[i,] <- percs.pst.norm[i,] / max.hgt[i]
}

# calculate normalized auc on full profiles for pre- and post-fire data
# (reusing auc.fun from the chunk above)
auc.norm.pre <- apply(percs.pre.norm, 1, auc.fun)
auc.norm.pst <- apply(percs.pst.norm, 1, auc.fun)

# calculate pac
pac.norm <- auc.norm.pre - auc.norm.pst

# loop through the same response variables as above
par(mfrow = c(5,4), las = 1, mar = c(5,5,2,1))
for (var in vars){
  x <- pac.norm
  y <- fuel.plots[,var] |> as.data.frame()
  y <- y[,1]
  x <- x[!is.na(y)]
  y <- y[!is.na(y)]
  mod <- lm(y ~ x)
  prd <- predict(mod)
  df.pred.obs <- data.frame(prd = prd, obs = y)
  prd.obs.plot(df.pred.obs, var)
}
```

...Essentially the same. A few results are better, a few worse. Moral of the story: **a singular measure of PAC is not good enough to quantify fuel consumption**.
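(A side note on the normalization step above: the two per-row division loops in 2b could each be collapsed into a single `sweep()` call. A minimal, equivalent sketch, assuming the `percs.pre`, `percs.pst`, and `max.hgt` objects from that chunk:)

```{r}
#| eval: false
# equivalent to the row-wise loops in 2b: divide each plot's (row's)
# percentile heights by that plot's joint maximum height
percs.pre.norm <- sweep(percs.pre, 1, max.hgt, "/")
percs.pst.norm <- sweep(percs.pst, 1, max.hgt, "/")
```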
#### 2c. Raw height PAC for profile slices

If a singular measure is insufficient, perhaps evaluating profile changes within vertical subsets will be more closely linked to field-measured fuel consumption. Let's first try it out on the raw heights. I will create a series of slices based on percentile bounds in increments of 10 percentiles (e.g., 0-10th, 0-20th, 10-20th, etc.). Since this winds up producing a large list of potential predictors, I will fall back on the variable selection-aided random forest modeling approach used in the Modeled Fuel Load Subtraction analysis.

```{r}
#| warning: false
#| message: false
#| fig-height: 10
#| fig-width: 8

# create auc function for a vertical slice of the profile
# (note the spline interpolation here, vs. the linear interpolation in 2a/2b)
auc.slice.fun <- function(y, slice.min, slice.max){
  x <- seq(0, 100)
  auc <- integrate(splinefun(x, y), slice.min, slice.max)$value
  return(auc)
}

# loop through all lower/upper percentile bound combinations
counter <- 0
for (i in seq(0, 100, 10)){
  for (j in seq(0, 100, 10)){
    if (j > i){
      counter <- counter + 1
      auc.raw.slice.pre <- apply(percs.pre, 1, function(x) auc.slice.fun(x, i, j))
      auc.raw.slice.pst <- apply(percs.pst, 1, function(x) auc.slice.fun(x, i, j))
      pac.raw.slice <- auc.raw.slice.pre - auc.raw.slice.pst
      if (counter == 1){
        pac.raw.slice.df <- data.frame(x = pac.raw.slice)
        colnames(pac.raw.slice.df) <- paste0("pac_", i, "_", j)
      } else {
        pac.raw.slice.df <- cbind(pac.raw.slice.df, data.frame(x = pac.raw.slice))
        colnames(pac.raw.slice.df)[ncol(pac.raw.slice.df)] <- paste0("pac_", i, "_", j)
      }
    }
  }
}

# define response variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "shrubs", "trees",
          "llm", "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf",
          "tf", "af")
vars <- paste0(vars, "_dif")

# get plot ids
plot.ids <- fuel.plots[,"Plot"] |> as.data.frame()

# loop through the response variables, tune/fit models, and plot pred vs. obs
par(mfrow = c(5,4), las = 1, mar = c(5,5,2,1))
for (var in vars){
  y <- fuel.plots[,var] |> as.data.frame()
  x <- pac.raw.slice.df
  mod.df <- cbind(plot.ids, y, x)
  df.pred.obs <- tuned.model(mod.df)
  perf.mets <- prd.obs.plot(df.pred.obs, var)
  if (var == vars[1]){
    perf.mets.rawpacslice <- perf.mets
  } else {
    perf.mets.rawpacslice <- rbind(perf.mets.rawpacslice, perf.mets)
  }
}
perf.mets.rawpacslice$method <- "rawpacslice"
```

Much better than the singular PAC measure! And in some cases, nearly as good as the Modeled Fuel Load Subtraction approach. Interesting...
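For reference, the nested loop above builds one PAC predictor for every lower/upper percentile pair in steps of 10, which works out to 55 candidate predictors per plot (11 choose 2). A quick sketch to confirm the enumeration:

```{r}
#| eval: false
# enumerate every (lower, upper) percentile-bound pair with upper > lower
bounds <- seq(0, 100, 10)
slice.pairs <- t(combn(bounds, 2))
colnames(slice.pairs) <- c("slice.min", "slice.max")
nrow(slice.pairs)   # 55 slice-based PAC predictors
```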
#### 2d. Normalized height PAC for profile slices

The last thing to try is the same multiple-slice approach, but using normalized instead of raw heights.

```{r}
#| warning: false
#| message: false
#| fig-height: 10
#| fig-width: 8

# loop through all lower/upper percentile bound combinations, this time on the
# normalized percentile profiles (auc.slice.fun is reused from the 2c chunk)
counter <- 0
for (i in seq(0, 100, 10)){
  for (j in seq(0, 100, 10)){
    if (j > i){
      counter <- counter + 1
      auc.norm.slice.pre <- apply(percs.pre.norm, 1, function(x) auc.slice.fun(x, i, j))
      auc.norm.slice.pst <- apply(percs.pst.norm, 1, function(x) auc.slice.fun(x, i, j))
      pac.norm.slice <- auc.norm.slice.pre - auc.norm.slice.pst
      if (counter == 1){
        pac.norm.slice.df <- data.frame(x = pac.norm.slice)
        colnames(pac.norm.slice.df) <- paste0("pac_", i, "_", j)
      } else {
        pac.norm.slice.df <- cbind(pac.norm.slice.df, data.frame(x = pac.norm.slice))
        colnames(pac.norm.slice.df)[ncol(pac.norm.slice.df)] <- paste0("pac_", i, "_", j)
      }
    }
  }
}

# loop through the same response variables (and plot ids) as in 2c,
# tune/fit models, and plot predictions vs. observations
par(mfrow = c(5,4), las = 1, mar = c(5,5,2,1))
for (var in vars){
  y <- fuel.plots[,var] |> as.data.frame()
  x <- pac.norm.slice.df
  mod.df <- cbind(plot.ids, y, x)
  df.pred.obs <- tuned.model(mod.df)
  perf.mets <- prd.obs.plot(df.pred.obs, var)
  if (var == vars[1]){
    perf.mets.normpacslice <- perf.mets
  } else {
    perf.mets.normpacslice <- rbind(perf.mets.normpacslice, perf.mets)
  }
}
perf.mets.normpacslice$method <- "normpacslice"
```

## Results Comparison

Lastly, let's look at all of the results together to compare methods. Given the poor performance of the singular PAC measures, I'll compare four methods for fuel consumption mapping (a quick table-based ranking sketch follows this list):

1. Modeled Fuel Load Subtraction
    a. PrePost model
    b. Pooled model
2. Profile Area Change
    c. Raw height PAC for profile slices
    d. Normalized height PAC for profile slices
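Before plotting, one could also stack the four performance tables for a coarse, table-based ranking. A sketch, assuming `prd.obs.plot()` returns identically structured tables in each chunk above and that `perf.mets.prepost` carries a `method` column like the others:

```{r}
#| eval: false
# stack the per-method performance tables and rank methods by median R^2
perf.all <- rbind(perf.mets.prepost, perf.mets.pooled,
                  perf.mets.rawpacslice, perf.mets.normpacslice)
aggregate(r2 ~ method, data = perf.all, FUN = median)
```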
```{r}
#| warning: false
#| message: false
#| fig-height: 10
#| fig-width: 10

# define the variables of interest
vars <- c("hr1", "hr10", "hr100", "hr1000", "nonwody", "trees", "llm",
          "ground", "shrb_sd", "woody", "uf", "cf", "cbd", "acf", "tf", "af")

# remove "_dif" from variable names in performance metrics tables
perf.mets.rawpacslice$var <- sub("_dif", "", perf.mets.rawpacslice$var)
perf.mets.normpacslice$var <- sub("_dif", "", perf.mets.normpacslice$var)

# set up the plot
par(mfrow = c(1,2), las = 1)

# get ranges
r2.range <- range(c(perf.mets.prepost$r2,
                    perf.mets.pooled$r2,
                    perf.mets.rawpacslice$r2,
                    perf.mets.normpacslice$r2))
rrmse.range <- range(c(perf.mets.prepost$rrmse,
                       perf.mets.pooled$rrmse,
                       perf.mets.rawpacslice$rrmse,
                       perf.mets.normpacslice$rrmse))

# create empty r2 plot (points at x = 100 fall outside xlim, leaving labels only)
x <- rep(100, length(vars))
names(x) <- vars
dotchart(x, pch = 16, xlab = expression(R^2), xlim = c(0,1))

# loop through variables
for (i in seq(1, length(vars))){
  var <- vars[i]
  points(x = perf.mets.prepost$r2[perf.mets.prepost$var == var],
         y = i, pch = 16, col = 1)
  points(x = perf.mets.pooled$r2[perf.mets.pooled$var == var],
         y = i, pch = 16, col = 2)
  points(x = perf.mets.rawpacslice$r2[perf.mets.rawpacslice$var == var],
         y = i, pch = 16, col = 3)
  points(x = perf.mets.normpacslice$r2[perf.mets.normpacslice$var == var],
         y = i, pch = 16, col = 4)
}

# create empty rrmse plot (points at x = 10000 fall outside xlim)
y <- rep(10000, length(vars))
names(y) <- vars
dotchart(y, pch = 16, xlab = expression(symbol("%")*RMSE), xlim = c(0,200))

# loop through variables
for (i in seq(1, length(vars))){
  var <- vars[i]
  points(x = perf.mets.prepost$rrmse[perf.mets.prepost$var == var],
         y = i, pch = 16, col = 1)
  points(x = perf.mets.pooled$rrmse[perf.mets.pooled$var == var],
         y = i, pch = 16, col = 2)
  points(x = perf.mets.rawpacslice$rrmse[perf.mets.rawpacslice$var == var],
         y = i, pch = 16, col = 3)
  points(x = perf.mets.normpacslice$rrmse[perf.mets.normpacslice$var == var],
         y = i, pch = 16, col = 4)
}

# add legend
legend("topright",
       legend = c("PrePost", "Pooled", "Raw PAC Slice", "Norm PAC Slice"),
       pch = 16, col = c(1,2,3,4))
```

So, the PrePost approach clearly does the best nearly across the board. This raises the question: **can the lidar predictor variables that were used in the PrePost models be estimated from post-fire-only data?** It also raises the question: **which of these fuel variables are modeled well enough to even move forward with?** Obviously some of them (e.g., trees, nonwody, shrb_sd, 1-100 hr fuels) are pretty garbage-y. But several are quite good. So, maybe the focus should be on modeling the variables that we can more accurately predict.

I propose the following... We focus on the integrated fuel metrics: af, tf, acf, cf, and uf. We take a look at which lidar predictor variables were selected for the construction of those models in the PrePost modeling framework (a sketch of how to pull those selections appears at the end of this document). If the list of variables looks reasonable, then we explore the process for modeling them.

I think a bigger question is: **is any of what is already done here sufficiently novel to consider publishing?** I don't want to step on anyone's toes, and I don't know if you, Ben Bright, Ryan McCarley, or anyone else is already working toward Monroe Mountain fuel consumption mapping. If so, I'll happily yield the floor. But if no one is, then maybe these results in and of themselves warrant consideration for publication?
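Circling back to the proposal above, here is a minimal sketch of how the PrePost variable selections could be inspected. It assumes the VSURF fits from the PrePost modeling were kept in a named list; `vsurf.fits.prepost` and `preds.prepost` are hypothetical placeholder names, not objects created in this document.

```{r}
#| eval: false
# hypothetical sketch: list the lidar metrics VSURF retained for prediction
# for each integrated fuel metric (object names are illustrative placeholders)
for (var in c("af", "tf", "acf", "cf", "uf")){
  sel <- vsurf.fits.prepost[[var]]$varselect.pred   # indices of selected predictors
  print(colnames(preds.prepost)[sel])
}
```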