As we all know, I am in the midst of modeling PJ biomass for the CNH project, and have been for a number of months at this point. As we also know, about a month ago, we met and decided that, to improve our modeling results, it would be advantageous to collect a few more high-biomass sites, given our limited sample on the high end of the biomass spectrum. To that end, Kelly went and collected 10 more plots and delivered the data to me recently. I was able to incorporate the plots into our modeling procedure, expecting that model results would drastically improve and, well… Things not only didn’t improve but they got significantly worse. Something weird is going on.
On top of that, I’ve been doing a lot more reviewing of the biomass literature lately, and I’ve realize that the most commonly-reported unit for aboveground biomass is megagrams per hectare (Mg ha-1). In all of the past results I’ve shared with this group, the units I’ve been using are kilograms per square meter (kg m-2). I did this because the allometric equations I’ve been using produce per-tree biomass in kg, and the plot radii used to collect the data were measured in meters (15 m). In keeping with the industry standard, I decided to convert kg m-2 to Mg ha-1 (multiply by 10). As a result, the biomass estimates that emerged appear to be WAY too high in comparison to what others who have studied PJ biomass have produced.
So, I think it’s worth meeting to discuss how and why these problems are arising. And I thought it would be helpful to put together an R markdown document to try to explain what might be happening.
After reviewing the PJ biomass literature, it appears that PJ biomass rarely exceeds 70 Mg ha-1. Here is a selection of maximum reported PJ biomass totals:
For comparison purposes, let’s take a look at our data. To do this, I’ll load a summary data frame that is used for modeling that contains plot-level biomass estimates (along with a suite of lidar metrics tha we won’t worry about for now).
library(knitr)
library(stringr)
# set working directory and read in data
df.1 <- read.csv("Modeling/biomass_plots_with_lidar_metrics.csv")
# replace underscores with periods in column names
colnames(df.1) <- str_replace_all(colnames(df.1), "_", ".")
# convert biomass from kg/m2 to Mg/ha
df.1$biomass <- df.1$biomass * 10
# print the first few rows of the data frame
kable(df.1[1:8])
| plot | lat | long | n.trees | mean.drc | date | biomass | mortality |
|---|---|---|---|---|---|---|---|
| 1 | 37.35223 | -109.9600 | 31 | 17.500605 | 2018-04-07 | 43.9803661 | 0.3777196 |
| 10 | 37.43988 | -109.9196 | 71 | 23.752216 | 2018-04-20 | 131.4698865 | 0.1945273 |
| 100 | 37.58169 | -109.8869 | 77 | 10.923300 | 2018-05-04 | 40.7244259 | 0.0164353 |
| 101 | 37.60295 | -109.9033 | 109 | 20.284246 | 2018-05-05 | 212.7102934 | 0.3546020 |
| 102 | 37.61189 | -109.8735 | 211 | 9.027764 | 2018-05-05 | 88.6524352 | 0.0781815 |
| 103 | 37.58645 | -109.9213 | 35 | 5.789401 | 2018-05-05 | 4.1070113 | 0.0464187 |
| 104A | 37.59344 | -109.9224 | 44 | 18.720322 | 2018-05-06 | 103.9831421 | 0.3537040 |
| 104B | 37.59464 | -109.9223 | 139 | 15.284312 | 2018-05-06 | 141.9481435 | 0.4513033 |
| 105 | 37.56707 | -109.8962 | 91 | 16.721960 | 2018-05-06 | 101.8238454 | 0.4423604 |
| 106 | 37.54565 | -109.9401 | 40 | 22.783494 | 2018-11-03 | 46.8608143 | 0.2404879 |
| 107 | 37.60271 | -109.8955 | 112 | 24.104814 | 2018-11-03 | 217.1068131 | 0.4502042 |
| 108 | 37.55420 | -109.9386 | 48 | 25.769101 | 2019-04-22 | 89.4048964 | 0.2718539 |
| 109A | 37.54067 | -109.9425 | 26 | 19.532921 | 2019-04-22 | 36.2508830 | 0.0579860 |
| 109B | 37.53984 | -109.9425 | 24 | 15.075848 | 2019-04-22 | 18.0505481 | 0.1212702 |
| 11 | 37.37145 | -109.9585 | 64 | 21.366451 | 2018-04-20 | 98.1975928 | 0.2348037 |
| 110 | 37.56520 | -109.9258 | 47 | 25.426520 | 2019-04-22 | 79.3616768 | 0.2800627 |
| 111 | 37.58792 | -109.9104 | 49 | 22.966573 | 2019-04-22 | 80.4883206 | 0.4563638 |
| 1114 | 37.61010 | -109.8823 | 62 | 7.483871 | 2020-06-15 | 10.2010074 | 0.2711873 |
| 1115 | 37.58688 | -109.8836 | 103 | 5.635534 | 2020-06-15 | 6.8789137 | 0.1224911 |
| 1116 | 37.61420 | -109.8761 | 55 | 9.265455 | 2020-06-15 | 23.7499889 | 0.1804297 |
| 1117 | 37.60565 | -109.8685 | 88 | 6.802273 | 2020-06-15 | 11.5153462 | 0.0770625 |
| 1118 | 37.60851 | -109.8685 | 107 | 5.208411 | 2020-06-15 | 7.6498479 | 0.0688484 |
| 1119 | 37.58373 | -109.9039 | 51 | 9.423529 | 2020-06-16 | 11.7222654 | 0.2471085 |
| 112 | 37.60026 | -109.9120 | 137 | 23.866901 | 2019-04-23 | 306.4341535 | 0.2130539 |
| 1120 | 37.61559 | -109.8733 | 77 | 6.051948 | 2020-06-16 | 9.0262551 | 0.1083247 |
| 1121 | 37.60533 | -109.9107 | 72 | 10.326389 | 2020-06-16 | 23.3475251 | 0.4814150 |
| 113A | 37.60144 | -109.8968 | 79 | 25.876154 | 2019-04-24 | 172.3583399 | 0.2917850 |
| 113B | 37.60024 | -109.9001 | 60 | 23.029991 | 2019-04-24 | 135.8345509 | 0.2146876 |
| 114 | 37.58327 | -109.9011 | 67 | 19.943040 | 2019-04-24 | 84.1151732 | 0.1959126 |
| 11B | 37.37005 | -109.9566 | 66 | 23.029218 | 2019-04-20 | 103.5950332 | 0.2311981 |
| 12 | 37.45719 | -109.9090 | 89 | 17.413216 | 2018-04-21 | 102.2702630 | 0.3428152 |
| 13 | 37.38676 | -109.9421 | 57 | 28.486975 | 2019-04-20 | 119.7649108 | 0.2752291 |
| 14 | 37.43282 | -109.9286 | 32 | 28.724216 | 2019-04-21 | 82.2964753 | 0.3451241 |
| 1403 | 37.76833 | -110.0592 | 36 | 11.575000 | 2020-06-17 | 12.1600277 | 0.3150780 |
| 1404 | 37.76903 | -110.0645 | 102 | 9.952941 | 2020-06-17 | 20.2298588 | 0.3044603 |
| 15A | 37.48187 | -109.9038 | 57 | 24.301565 | 2019-04-21 | 93.8015586 | 0.2235890 |
| 15B | 37.48089 | -109.9061 | 54 | 29.854789 | 2019-04-21 | 120.4782041 | 0.2635713 |
| 16 | 37.50892 | -109.8772 | 134 | 17.765550 | 2018-05-07 | 152.6605367 | 0.4275046 |
| 17A | 37.50439 | -109.8858 | 135 | 19.348022 | 2018-05-07 | 155.3071458 | 0.3364023 |
| 17B | 37.50511 | -109.8854 | 122 | 18.136020 | 2018-05-07 | 142.9515288 | 0.3279061 |
| 19 | 37.51656 | -109.8767 | 24 | 5.260922 | 2018-04-08 | 5.8394272 | 0.0000000 |
| 2 | 37.43807 | -109.9100 | 70 | 15.511645 | 2018-11-03 | 45.8093373 | 0.2735036 |
| 200 | 37.52385 | -110.0696 | 13 | 18.492380 | 2018-11-03 | 10.1026665 | 0.3360591 |
| 201 | 37.54160 | -110.0231 | 54 | 21.500645 | 2018-11-03 | 58.5323700 | 0.2794712 |
| 202 | 37.53050 | -110.0617 | 20 | 18.833064 | 2019-04-23 | 26.2246752 | 0.4019056 |
| 203 | 37.55587 | -110.0195 | 34 | 19.364103 | 2018-04-28 | 52.9514127 | 0.0927232 |
| 204A | 37.55809 | -110.0014 | 27 | 35.054525 | 2019-04-23 | 89.1930037 | 0.3979245 |
| 204B | 37.55888 | -110.0010 | 30 | 35.546730 | 2019-04-23 | 158.3045793 | 0.4451332 |
| 205 | 37.51403 | -110.0917 | 13 | 47.516482 | 2019-04-25 | 105.6664943 | 0.2825175 |
| 208A | 37.53958 | -110.0478 | 29 | 31.740388 | 2018-05-07 | 136.1453661 | 0.1906853 |
| 208B | 37.53860 | -110.0477 | 30 | 19.316090 | 2018-11-03 | 29.3816658 | 0.4060838 |
| 3 | 37.36986 | -109.9440 | 102 | 21.545867 | 2018-04-07 | 139.5917758 | 0.2129052 |
| 300 | 37.75340 | -110.0898 | 33 | 31.177316 | 2019-04-26 | 133.0137676 | 0.3336208 |
| 301 | 37.74060 | -110.0250 | 5 | 6.681661 | 2018-04-27 | 0.4209419 | 0.0000000 |
| 302 | 37.76788 | -110.0842 | 37 | 23.317676 | 2019-04-26 | 63.4409242 | 0.4974081 |
| 303 | 37.74370 | -110.0103 | 31 | 12.922065 | 2019-04-26 | 21.5616588 | 0.0000000 |
| 4 | 37.51177 | -109.8815 | 52 | 26.332381 | 2019-04-18 | 105.5235005 | 0.2445582 |
| 400 | 37.80253 | -110.1699 | 21 | 38.460677 | 2019-04-25 | 80.7302024 | 0.3091873 |
| 401 | 37.79959 | -110.1637 | 18 | 34.807375 | 2019-04-26 | 73.8836559 | 0.2443716 |
| 402 | 37.79940 | -110.1568 | 23 | 16.633322 | 2018-04-27 | 27.3407906 | 0.3527163 |
| 5A | 37.49047 | -109.9040 | 49 | 26.439261 | 2019-04-19 | 112.0458976 | 0.1880696 |
| 5B | 37.49011 | -109.9038 | 39 | 31.059682 | 2019-04-19 | 96.4524915 | 0.2982917 |
| 7 | 37.44877 | -109.9257 | 46 | 27.939301 | 2019-04-19 | 89.3782193 | 0.1477916 |
| 8 | 37.42201 | -109.9434 | 40 | 20.279588 | 2019-04-19 | 42.2887748 | 0.2248098 |
| 9 | 37.35397 | -109.9692 | 36 | 29.132126 | 2019-04-20 | 85.3381482 | 0.2425134 |
| CM4 | 37.58076 | -109.8852 | 70 | 15.425057 | 2019-05-03 | 64.6955800 | 0.0138164 |
| MD1 | 37.49013 | -109.8975 | 97 | 22.574577 | 2019-05-04 | 142.6904922 | 0.0696933 |
# plot histogram of biomass
par(mar = c(5,5,1,1))
hist(df.1$biomass,
main = NA,
xlab = "Biomass (Mg/ha)",
breaks = seq(0,325,25),
las = 1)
The median biomass among our plots is 82.3 Mg ha-1. The maximum biomass among our plots is 306.43 Mh ha-1. So… Clearly something’s not right.
For this study I’ve been relying on the allometric equations provided by Chonacky et al. (2014). It’s widely-cited, widely-used, and has allometries for all species found in our study area.
The allometric equation for pinyon pines is:
\[ Biomass_{kg} = e^{-3.2007 + 2.5339 \cdot ln(DRC_{cm})} \]
The allometric equation for juniper is:
\[ Biomass_{kg} = e^{-2.7096 + 2.1942 \cdot ln(DRC_{cm})} \]
This results in the following DRC-biomass relationships:
# create vector of drcs and calculate allometric biomass for p and j
drc <- seq(0,150)
bio.pied <- exp(-3.2007 + 2.5339 * log(drc))
bio.juos <- exp(-2.7096 + 2.1942 * log(drc))
# plot the data
par(mar = c(5,5,1,1))
plot(bio.pied ~ drc,
type = "l",
lty = 1,
lwd = 2,
las = 1,
col = "blue",
xlab = "DRC (cm)",
ylab = "Biomass (kg)",
ylim = c(0,max(bio.pied, bio.juos)))
lines(bio.juos ~ drc,
lwd = 2,
col = "red")
grid()
# add legend
legend("topleft",
legend = c("PIED", "JUOS"),
lwd = 2,
col = c("blue", "red"))
As you can see, DRC being equal, a pinyon will have a higher biomass than a juniper.
Let’s take a closer look at what may be driving our anomalously-high biomass plot estimates. To do this, I’ll load a more detailed dataset that contains all tree-level information in all plots collected for the study. Note that biomass is shown in kg.
# read in the data
df.2 <- read.csv("Field_Data/all_plot_data_20200626.csv")
# print the first few rows
cols <- c("date", "plot", "lat", "long", "species", "drc.cm", "biomass")
kable(head(df.2[,cols]))
| date | plot | lat | long | species | drc.cm | biomass |
|---|---|---|---|---|---|---|
| 2018-04-07 | 3 | 37.36986 | -109.944 | JUOS | 28.011270 | 99.7638729 |
| 2018-04-07 | 3 | 37.36986 | -109.944 | PIED | 6.167465 | 4.0926530 |
| 2018-04-07 | 3 | 37.36986 | -109.944 | JUOS | 32.685385 | 139.9685115 |
| 2018-04-07 | 3 | 37.36986 | -109.944 | JUOS | 15.392776 | 26.8192440 |
| 2018-04-07 | 3 | 37.36986 | -109.944 | JUOS | 44.767928 | 279.1177435 |
| 2018-04-07 | 3 | 37.36986 | -109.944 | PIED | 2.668771 | 0.4899863 |
Note that this dataset contains all of Kelly’s and Steve’s data. However, one of Kelly’s plots (plot #6) was incomplete, and many of Steve’s plots were outside of the lidar boxes. So, let’s narrow it down to just the plots that will ultimately be used in the study:
# create a list of plots from the plot-level summary data frame
plots <- df.1$plot
# subset the detailed, tree-level data frame to just data collected within those plots
df.2 <- df.2[df.2$plot %in% plots,]
As a result, there are 67 plots, containing a total of 4103 trees, 2140 of which are pinyons and 1928 of which are junipers – a fairly even split! Let’s take a look at how the DRCs and resultant biomass estimates compare between these two species in our tree data:
# create normalized kernel density estimates of drc for each species
d.drc.juos <- density(df.2$drc.cm[df.2$species == "JUOS"])
d.drc.juos.x <- d.drc.juos$x
d.drc.juos.y <- d.drc.juos$y / max(d.drc.juos$y)
d.drc.pied <- density(df.2$drc.cm[df.2$species == "PIED"])
d.drc.pied.x <- d.drc.pied$x
d.drc.pied.y <- d.drc.pied$y / max(d.drc.pied$y)
# plot the drc data
par(mar = c(5,5,1,1))
plot(d.drc.juos.y ~ d.drc.juos.x,
type = "l",
lwd = 2,
col = "red",
xlab = "DRC (cm)",
ylab = "Relative Abundance",
las = 1)
lines(d.drc.pied.y ~ d.drc.pied.x,
lwd = 2,
col = "blue")
grid()
legend("topright",
legend = c("PIED", "JUOS"),
lwd = 2,
col = c("blue", "red"))
# create normalized kernel density estimates of biomass for each species
d.bio.juos <- density(df.2$biomass[df.2$species == "JUOS"])
d.bio.juos.x <- d.bio.juos$x
d.bio.juos.y <- d.bio.juos$y / max(d.bio.juos$y)
d.bio.pied <- density(df.2$biomass[df.2$species == "PIED"])
d.bio.pied.x <- d.bio.pied$x
d.bio.pied.y <- d.bio.pied$y / max(d.bio.pied$y)
# plot the biomass data
par(mar = c(5,5,1,1))
plot(d.bio.juos.y ~ d.bio.juos.x,
type = "l",
lwd = 2,
col = "red",
xlab = "Biomass (kg)",
ylab = "Relative Abundance",
las = 1)
lines(d.bio.pied.y ~ d.bio.pied.x,
lwd = 2,
col = "blue")
grid()
legend("topright",
legend = c("PIED", "JUOS"),
lwd = 2,
col = c("blue", "red"))
Notice how much larger the JUOS DRCs are and, as a result, how much higher the tree-level biomass estimates are for JUOS as compared to PIED. This is where our anomalously-high plot-level biomass estimates are likely stemming from (pardon the pun). Let’s take a look at plot-level biomass estimates vs. relative abundance of pinyon vs. juniper trees:
# loop through plots
plots <- unique(df.2$plot)
for (plot in plots){
# create a subset data frame for just that plot
df.sub <- df.2[df.2$plot == plot,]
# calculate summary statistics
n.juos <- nrow(df.sub[df.sub$species == "JUOS",])
n.pied <- nrow(df.sub[df.sub$species == "PIED",])
biomass.juos <- sum(df.sub$biomass[df.sub$species == "JUOS"])
biomass.pied <- sum(df.sub$biomass[df.sub$species == "PIED"])
biomass <- biomass.juos + biomass.pied
# create a new summary data frame
if (plot == plots[1]){
df.3 <- data.frame(plot = plot,
n.juos = n.juos,
n.pied = n.pied,
biomass.juos = biomass.juos,
biomass.pied = biomass.pied,
biomass = biomass)
} else {
df.3.temp <- data.frame(plot = plot,
n.juos = n.juos,
n.pied = n.pied,
biomass.juos = biomass.juos,
biomass.pied = biomass.pied,
biomass = biomass)
df.3 <- rbind(df.3, df.3.temp)
}
}
# calculate per-plot relative percent of juniper trees by count
df.3$rel.perc.juos <- df.3$n.juos / (df.3$n.juos + df.3$n.pied)
# convert biomass to Mg/ha (from kg), given a plot radius of 15 m
df.3$biomass.kg.m2 <- df.3$biomass / (pi * 15 ^ 2)
df.3$biomass.mg.ha <- df.3$biomass.kg.m2 * 10
# plot total plot biomass vs relatively percent juniper trees
par(mar = c(5,5,1,1))
plot(biomass.mg.ha ~ rel.perc.juos,
data = df.3,
pch = 16,
xlab = "Relative Percent Juniper Trees",
ylab = "Biomass (Mg/ha)",
las = 1)
# add regression line
abline(lm(biomass.mg.ha ~ rel.perc.juos, data = df.3),
lwd = 2,
col = "red")
grid()
So, the greater the relative abundance of junipers there are, the higher the biomass within a plot. While that’s not an inherently problematic notion, let’s take a closer look at some of the lidar data. In PJ, it has been demonstrated that aboveground biomas is closely related to canopy cover. However, let’s see how canopy cover and AGB compare in our data:
# join the lidar metrics data frame (df.1) to the new summary data frame (df.3)
df.1.temp <- df.1[,c("plot", "cc")]
df.3 <- merge(df.3, df.1.temp)
# plot cc vs. agb
par(mar = c(5,5,1,1))
plot(biomass.mg.ha ~ cc,
data = df.3,
pch = 16,
xlab = "Canopy Cover (%)",
ylab = "Biomass (Mg/ha)",
las = 1)
grid()
Notice how there’s definitely a relationship, but it’s not a clean one. Now look at that same plot, except with points colored by the relative percent of juniper (more red = more juniper; more blue = more pinyon):
# add color information to points
pal <- colorRampPalette(c("blue","red"))
df.3$col <- pal(10)[as.numeric(cut(df.3$rel.perc.juos, breaks = 10))]
# plot cc vs. agb
par(mar = c(5,5,1,1))
plot(biomass.mg.ha ~ cc,
data = df.3,
pch = 16,
col = df.3$col,
xlab = "Canopy Cover (%)",
ylab = "Biomass (Mg/ha)",
las = 1)
grid()
Most of those points in the bottom-right portion of the graph are the points that Kelly just went out and collected recently when she was targeting high-biomass areas. Most of these high-biomass areas were in the higher elevations, where pinyon was the dominant species and canopy covers were high. However, due to their lack of junipers, they are demonstrating relatively low biomass. Incidentally, these points tend to have biomass levels that one might actually expect in PJ woodlands, indicating that the pinyon-based biomass totals are probably pretty accurate. Sites dominated by juniper, however, are likely vastly over-estimating biomass. So, why is that the case? It may have to do with how DRCs were measured for juniper trees.
Taking a look back at the tree-level data, Kelly noted when trees were multi-stemmed (indicated by “ms” in the notes field). Let’s see the relative proportions of multi-stemmed pinyons and junipers:
# derive count and percent of multi-stemmed pinyons and junipers
n.juos <- nrow(df.2[df.2$species == "JUOS",])
n.juos.ms <- nrow(df.2[df.2$species == "JUOS" & grepl("ms", df.2$notes),])
perc.juos.ms <- n.juos.ms / n.juos
n.pied <- nrow(df.2[df.2$species == "PIED",])
n.pied.ms <- nrow(df.2[df.2$species == "PIED" & grepl("ms", df.2$notes),])
perc.pied.ms <- n.pied.ms / n.pied
There are 1928 juniper trees, 392 (20.33%) of which are multi-stemmed. There are 2140 pinyon trees, 132 (6.17%) of which are multi-stemmed. Here is a comparison between the DRCs of multi-stemmed vs single-stemmed juniper trees:
# add a ms factor column to the data frame
df.2$ms <- "Single-Stemmed"
df.2$ms[grepl("ms", df.2$notes)] <- "Multi-Stemmed"
df.2$ms <- as.factor(df.2$ms)
# create a boxplot
par(mar = c(2,5,1,1))
boxplot(drc.cm ~ ms,
data = df.2,
ylab = "DRC (cm)",
las = 1,
xlab = NA)
# compare means
t.test(drc.cm ~ ms, data = df.2)
##
## Welch Two Sample t-test
##
## data: drc.cm by ms
## t = 11.03, df = 618.73, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 7.937756 11.376385
## sample estimates:
## mean in group Multi-Stemmed mean in group Single-Stemmed
## 26.78908 17.13201
While there are still many large single-stemmed junipers, on average, multi-stemmed DRCs are significantly larger than single-stemmed DRCs. The question then becomes, how were DRCs computed on multi-stemmed trees. In conversations with Kelly from several months ago, it seems that the approach she took to characterizing a single DRC measure for a multi-stemmed tree evolved over time. In some cases, she took a single stem (e.g. the largest of the multiple stems) as the representative for the tree, in other cases, she summed the DRCs of the stems, and later on she began taking a single DRC beneath the point of stem splitting (as I understand it). According to this paper, the best approach is to take a root sum of squared diameters:
\[ diameter = \sqrt{\sum_{i=1}^{n} d_i^2} \]
This would result in a smaller diameter than the sum of stem diameters. In some cases, she provided several stem measurements, allowing me to apply this equation (which I have not yet done), but in most, I have no way of determining which approach she took to characterizing multi-stem DRC. That said, clearly there are still a lot of large single-stemmed trees, so even if we were to be able to go back and perfectly characterize multi-stem DRC, I suspect our allometric biomass estimates would still be anomalously high.