The aim of this exercise is to estimate the above ground biomass (AGB) of the woodlands in Silwood Park using the survey data on stems that you collected. We want a final value that is in standard units of unit mass per unit area: tonnes per hectare is widely used.

In order to estimate the biomass, we need to find or calculate the following for our woodland samples:

  1. The timber density of the tree species as kg per m3 of dry wood.
  2. The volume of sampled stems in metres cubed, and hence
  3. the biomass of sampled trees in tonnes.
  4. The spatial density of stems as number per hectare.
  5. The total biomass per hectare for each woodland.

Again, I’m using R because it is my statistical tool of choice and because the code given here provides a complete record of what I’m doing to the data. You don’t need to worry about the code details (unless you want to).

Data

Loading the data

The code below loads in the data collected from Epicollect:

# Import the three sets of form data from the Epicollect Download
transect_points <- read.csv('form-1__transect-point-data.csv', stringsAsFactors=FALSE)
tree_quarters <- read.csv('form-2__tree-quarters.csv', stringsAsFactors=FALSE)
stem_data <- read.csv('form-3__stem-data.csv', stringsAsFactors=FALSE)

Corrections and manipulation

We will simplify the labels for the tree species to remove the common names, making them easier to display. This code uses regular expressions to search and remove bracketed text. These are an advanced topic: they are a perfect quick tool here but do not worry about them, except to note that they may be a really useful thing to learn in the future!

# This line removes text in brackets
tree_quarters$Tree_Sp <- sub('\\([A-Za-z ]+\\)', '', tree_quarters$Tree_Sp)
# This line removes trailing and leading spaces
tree_quarters$Tree_Sp <- gsub('^\\s+|\\s+$', '', tree_quarters$Tree_Sp)

Now we can merge the quarter point tree level data with the transect information on woodlands and groups:

# Join transect point data to the rows in the tree quarter data
tree_level_data <- merge(tree_quarters,  transect_points, 
                         by.x='ec5_parent_uuid', by.y='ec5_uuid')

The group names used in the form were not consistent, so we’ll again use text regular expressions to standardise them all, changing patterns like G7 and Group 7 to simply 7.

tree_level_data$Group <- as.numeric(sub('G|Group ', '', tree_level_data$Group))

We also need to correct some distances that have been recorded as centimetres.

tree_level_data$Distance_SamplePoint <- with(tree_level_data, 
                                             ifelse(Group == 7, Distance_SamplePoint/100,
                                             Distance_SamplePoint))

Lastly, one of the groups had instrumentation error in measuring canopy angles, so unfortunately we have to discard their stem data. We need to find the Epicollect IDs for the affected trees and remove them from the stem data

# Find the affected transect points and trees
group_8_transect_points <- transect_points$ec5_uuid[transect_points$Group == 8]
group_8_trees <- tree_quarters$ec5_uuid[tree_quarters$ec5_parent_uuid %in% group_8_transect_points]
# Remove those rows. The syntax here is ugly: ! mean 'not', so this selects rows
# that have a parent ID that is not in the affected list.
stem_data <- subset(stem_data, subset=! ec5_parent_uuid %in% group_8_trees)

Calculations

Step 1: Timber density

Although the woodlands are dominated by two or three species, there are quite a lot of different tree species that were sampled during the fieldwork:

# set the plot window up to have a big margin for species names
par(mar=c(3, 16, 1, 1))
# count the stems in the different tree species
species_counts <- table(tree_level_data$Tree_Sp)
species_counts <- sort(species_counts, decreasing=TRUE)
# create a barplot to show frequencies
barplot(species_counts, horiz=TRUE, las=1)

We can use values from the literature to see if the different species differ greatly in timber density and whether we should worry about species identify in calculating biomass.The table below gives average dry timber densities taken from https://www.wood-database.com for the more common species found at Silwood. The value for Corylus avellana is taken from 10.5552/drind.2016.1529.

Species Density (dry kg/m3)
Acer pseudoplatanus 615
Alnus glutinosa 495
Betula pendula 640
Corylus avellana 670
Fagus sylvatica 710
Quercus robur 675
Quercus petraea 710
Sorbus aucuparia 770

There are big differences between species, so we probably should handle species separately, rather than just using a single representative density. However, it probably isn’t worth trying to track down densities for every last species, so we’ll lump the rare species and retain the most common ones.

The code below sets out the common species we are going to use, along with their timber densities and then creates a new variable to record the identity of common species and lump the rarer species as ‘Other’.

# Create a list of wood densities that we are going to use
timber_density <- c('Acer pseudoplatanus' = 615,
                    'Alnus sp' = 495,
                    'Betula pendula' = 640,
                    'Corylus avellana' = 670,
                    'Fagus sylvatica' = 710,
                    'Quercus robur' = 675,
                    'Quercus petraea' = 710,
                    'Sorbus aucuparia' = 770)

# Add a mean value to use for Other species
timber_density['Other'] <- mean(timber_density)

# Lump the rare species together, retaining the species identify of common trees.
tree_level_data$AGB_Species <- with(tree_level_data,
                                    ifelse(Tree_Sp %in% names(timber_density), Tree_Sp, 'Other'))

Step 2: Stem volumes

From the handout, we have the stem angles and distances to calculate stem height (\(h\)) and then the stem circumference (\(c\)) and hence basal area. We then need a model of how to convert that into stem volume. The forest modelling literature includes a huge range of different geometry models for tree stems, but here we will just stick with a straightforward cylindrical model.

Height calculation

There was some confusion in the recording for fallen stems, with some having angles and only a few having measured lengths. Here, I’ll assume that fallen stem angles are horizontal angles measured at right angles to the fallen stem, so treating them just as if they were vertical. If a fallen stem length was directly measured, we’ll use that instead.

# First, calculate height for all stems
stem_data$height <- with(stem_data, tan(Angle_to_base / 180 * pi) * Stem_distance + 
                                    tan(Angle_to_canopy / 180 * pi) * Stem_distance)

# Substitute directly measured heights of fallen stems
stem_data$height <- with(stem_data, ifelse(is.na(Fallen_stem_length), height, Fallen_stem_length))

# Have a quick look at the distribution of heights
par(mfrow=c(1,2), mar=c(3, 3, 1, 1))
hist(stem_data$height)
hist(log10(stem_data$height))

We’ve got some issues with tree height - anything over about 30 metres is clearly wrong. However, we are going to use the quarter point trees as a sample of each tree species in each woodland. If we use the median height, then the outliers might not matter too much.

We can check whether the median heights of stems by woodland and species are reasonable.

stem_data <- merge(stem_data, tree_level_data, by.x='ec5_parent_uuid', by.y='ec5_uuid')

median_height <- aggregate(height ~ Woodland + AGB_Species, data=stem_data, FUN=median)
xtabs(height ~ Woodland + AGB_Species, data=median_height)
##                   AGB_Species
## Woodland           Acer pseudoplatanus  Alnus sp Betula pendula
##   Elms Slope                 10.334799  0.000000      14.321491
##   Gunnes's Thicket           10.383629 22.389361      23.928818
##   Mann's Copse                9.699709  0.000000      11.026335
##   Merten's Acres             13.913070  0.000000      19.803747
##   Nash's Copse               13.243861 22.733468      18.195395
##   Rookery Slope              11.051042  0.000000      17.219681
##                   AGB_Species
## Woodland           Corylus avellana Fagus sylvatica     Other
##   Elms Slope               0.000000       56.348085 13.186048
##   Gunnes's Thicket         0.000000        0.000000  7.074134
##   Mann's Copse             8.763386        0.000000  7.240859
##   Merten's Acres           0.000000       17.669716 17.239327
##   Nash's Copse             9.681685       37.062412 18.365703
##   Rookery Slope            6.681098        0.000000 22.938439
##                   AGB_Species
## Woodland           Quercus petraea Quercus robur Sorbus aucuparia
##   Elms Slope             15.757572     19.188634         0.000000
##   Gunnes's Thicket        7.767325     23.743413         0.000000
##   Mann's Copse            0.000000      8.619642        11.939148
##   Merten's Acres          0.000000     19.363413         0.000000
##   Nash's Copse            0.000000     19.460950        10.879381
##   Rookery Slope           0.000000     21.162948         0.000000

Not really - some of those are far too high. The data for Alnus sp. and Fagus sylvatica look particularly problematic, for some reason.

Volume calculation

We can now calculate the stem volume, remembering to convert the circumference measurements from cm to m.

stem_data$basal_area <- (stem_data$Stem_girth / 100) ^ 2 / (4 * pi)
stem_data$stem_volume <- with(stem_data, basal_area * height)

Combining stems

We now to combine the volumes of multi-stem trees to give a single volume for each tree. Most trees only have a single stem but you did measure some multi-stem trees:

par(mar=c(3, 3, 1, 1))
# Count the number of occurences of each tree id
stem_counts <- table(stem_data$ec5_parent_uuid)
hist(stem_counts, breaks=0:8)

We will simply add the volumes of each stem together, as if the stems are a bunch of straws, and then add this data on to the tree level data.

# Aggregate the volume by adding together values grouped by tree id
tree_volume <- aggregate(stem_volume ~ ec5_parent_uuid, data=stem_data, FUN=sum)
# Add the combined stem volumes into the tree level data
tree_level_data <- merge(tree_level_data, tree_volume, by.x='ec5_uuid', by.y='ec5_parent_uuid')

Step 3: Tree biomass

We now have the volume of each tree and the density of the tree species, so we can combine those to find the biomass of each tree. We first need to get timber density for each sampled species and can then simply multiply volume and density to get total biomass

# Use the species name to select the correct timber densities from our list of values
tree_level_data$timber_density <- timber_density[tree_level_data$AGB_Species]
# Find the biomass in tonnes
tree_level_data$biomass <- with(tree_level_data, timber_density * stem_volume) / 1000

The lattice package in R is useful for quickly visualizing groups of data, so we can check the calculated biomass of the different species:

library(lattice)
histogram(~ log10(biomass) |AGB_Species, data=tree_level_data)

There are clearly some problems in those biomass values: most trees are around 1 tonne (\(log_{10}(1) = 0\)) but a few of the trees seem to be over 100 tonnes (\(log_{10}(100) = 2\)). We’ll again look at the median biomass by species and woodland to see if the median values look reasonable:

# Calculate the biomass per species per woodland and print out as a table.
median_biomass <- aggregate(biomass ~ Woodland + AGB_Species, 
                            data=tree_level_data, FUN=median)
xtabs(biomass ~ Woodland + AGB_Species, data=median_biomass)
##                   AGB_Species
## Woodland           Acer pseudoplatanus    Alnus sp Betula pendula
##   Elms Slope                0.08341586  0.00000000     0.28080322
##   Gunnes's Thicket          0.16081892  0.55357884     0.76850871
##   Mann's Copse              0.15387934  0.00000000     0.34313115
##   Merten's Acres            1.17622740  0.00000000     0.58454203
##   Nash's Copse              0.21170676  0.34914546     0.60475871
##   Rookery Slope             0.11778111  0.00000000     0.12727814
##                   AGB_Species
## Woodland           Corylus avellana Fagus sylvatica       Other
##   Elms Slope             0.00000000     36.80319112  0.07677677
##   Gunnes's Thicket       0.00000000      0.00000000  0.20107656
##   Mann's Copse           0.21227777      0.00000000  0.16017595
##   Merten's Acres         0.00000000      0.83622757  0.43690997
##   Nash's Copse           0.12101871      8.96329573  0.94628745
##   Rookery Slope          0.61553736      0.00000000  0.95826813
##                   AGB_Species
## Woodland           Quercus petraea Quercus robur Sorbus aucuparia
##   Elms Slope            0.96596290    1.65755008       0.00000000
##   Gunnes's Thicket      0.19116476    2.64526620       0.00000000
##   Mann's Copse          0.00000000    0.25354004       0.29423319
##   Merten's Acres        0.00000000    1.54252879       0.00000000
##   Nash's Copse          0.00000000    1.31183492       0.13499263
##   Rookery Slope         0.00000000    3.33139219       0.00000000

As a useful measuring point, a 222 year old woodland oak had a dry mass of 7.86 tonnes. So, sadly, although most values look reasonable, those extra tall Fagus sylvatica are causing real issues.

Component biomass

We have only estimated the biomass of the stem. We’ve included bark in that volume, although it has a lower density, but we’ve ignored the branches and foliage and of course the below ground biomass in the roots. Calculating total biomass is difficult - there are lots of approaches with species specific estimation. The figure below, taken from here, suggests that for hardwood trees of the size we are studying, the stem may only be about 60% - 80% of the biomass.

Given that we are not using a particularly high precision method, we won’t try and do anything more complicated here and will stick with just using stem biomass.

Step 4: Spatial density

We now calculate the spatial density of stems in each of the woodlands. From the handout, the density in stems per hectare is:

\[ \lambda = \frac{1}{\bar{r}_s^2} \times 10000,\]

where \(\bar{r}\) is the mean distance from the sample point . We’ll first look at the distribution of distances to check they look sensible:

# look at some overall measures of central tendency
overall_median_distance <- median(tree_level_data$Distance_SamplePoint)
overall_mean_distance <- mean(tree_level_data$Distance_SamplePoint)
print(overall_median_distance)
## [1] 4.86
print(overall_mean_distance)
## [1] 5.287814
par(mfrow=c(1,2), mar=c(3, 3, 1, 1))
hist(tree_level_data$Distance_SamplePoint, main='Distance to stem (m)')
abline(v=c(overall_median_distance, overall_mean_distance), col=c('red', 'blue'))
hist(log10(tree_level_data$Distance_SamplePoint), main=expression(log[10]~Distance~to~stem (m)))

We can calculate the average stem density across all woodlands in stems per hectare:

(1 / (overall_mean_distance^2)) * 10000
## [1] 357.6413

That seems too large - we’d expect somewhere between about 100 and 200 stems per hectare, but there isn’t an obvious source of error. We can also calculate the stem densities per woodland from the average distances per woodland.

# Get the mean distance by woodland 
spatial_density <- aggregate(Distance_SamplePoint ~ Woodland, data=tree_level_data, FUN=mean)
# and hence the stem densities per hectare by woodland
spatial_density$lambda <- (1 / spatial_density$Distance_SamplePoint  ^ 2) * 10000

print(spatial_density)
##           Woodland Distance_SamplePoint   lambda
## 1       Elms Slope             4.980000 403.2193
## 2 Gunnes's Thicket             7.010417 203.4756
## 3     Mann's Copse             5.587500 320.3059
## 4   Merten's Acres             4.913793 414.1582
## 5     Nash's Copse             4.833333 428.0618
## 6    Rookery Slope             3.409756 860.1088

We can then get the proportions of each species in each woodland:

composition <- prop.table(xtabs( ~ AGB_Species+ Woodland , data=tree_level_data), margin=2)
print(composition, digits=3)
##                      Woodland
## AGB_Species           Elms Slope Gunnes's Thicket Mann's Copse
##   Acer pseudoplatanus     0.2200           0.0729       0.2375
##   Alnus sp                0.0000           0.0625       0.0125
##   Betula pendula          0.2900           0.1771       0.2625
##   Corylus avellana        0.0000           0.0000       0.1000
##   Fagus sylvatica         0.0100           0.0000       0.0000
##   Other                   0.1000           0.0417       0.1125
##   Quercus petraea         0.1300           0.0104       0.0000
##   Quercus robur           0.2500           0.6354       0.1625
##   Sorbus aucuparia        0.0000           0.0000       0.1125
##                      Woodland
## AGB_Species           Merten's Acres Nash's Copse Rookery Slope
##   Acer pseudoplatanus         0.0948       0.1190        0.2927
##   Alnus sp                    0.0000       0.0595        0.0000
##   Betula pendula              0.2328       0.5119        0.0488
##   Corylus avellana            0.0000       0.1071        0.0488
##   Fagus sylvatica             0.0172       0.1071        0.0000
##   Other                       0.1121       0.0357        0.1951
##   Quercus petraea             0.0000       0.0000        0.0000
##   Quercus robur               0.5431       0.0238        0.4146
##   Sorbus aucuparia            0.0000       0.0357        0.0000
# Get some colours
library(RColorBrewer)
tree_cols <- brewer.pal(9, 'Set3')

par(mar=c(3,3,1,12), cex=0.8, mgp=c(2,1,0))
barplot(composition, legend.text=rownames(composition), 
         ylab='Proportion of community', col=tree_cols,
         args.legend=list(x=7.5, y =1, xjust=0, bty='n'))

Now we can combine the stem density in each woodland with the species proportions to get the stem density per species:

composition <- as.data.frame(composition)
spatial_density <- merge(spatial_density, composition)
spatial_density$tree_stem_density <- with(spatial_density, Freq * lambda)

Step 5: Biomass estimation per hectare

We now have one data set that has the spatial density of each species in each woodland and another that has the median biomass by woodland for the same species. We can simply merge them together and multiply the number of stems per hectare by median stem biomass

# merge spatial_density with median_biomass
spatial_density <- merge(spatial_density, median_biomass, by=c('Woodland', 'AGB_Species'))
# calculate total biomass per species
spatial_density$total_biomass <- with(spatial_density,  tree_stem_density * biomass)

We can now inspect a table of the species biomass by woodland and look at a stacked barplot to compare woodlands.

biomass <- xtabs(total_biomass ~ AGB_Species + Woodland , data=spatial_density)
print(biomass)
##                      Woodland
## AGB_Species             Elms Slope Gunnes's Thicket Mann's Copse
##   Acer pseudoplatanus    7.3996747        2.3860322   11.7060093
##   Alnus sp               0.0000000        7.0399867    0.0000000
##   Betula pendula        32.8353310       27.6910114   28.8505689
##   Corylus avellana       0.0000000        0.0000000    6.7993821
##   Fagus sylvatica      148.3975707        0.0000000    0.0000000
##   Other                  3.0957874        1.7047572    5.7718462
##   Quercus petraea       50.6343352        0.4051809    0.0000000
##   Quercus robur        167.0890473      342.0111949   13.1966849
##   Sorbus aucuparia       0.0000000        0.0000000   10.6025204
##                      Woodland
## AGB_Species           Merten's Acres Nash's Copse Rookery Slope
##   Acer pseudoplatanus     46.1947110   10.7885217    29.6501163
##   Alnus sp                 0.0000000    8.8961812     0.0000000
##   Betula pendula          56.3492038  132.5188943     5.3401484
##   Corylus avellana         0.0000000    5.5503739    25.8258083
##   Fagus sylvatica          5.9712156  411.0905125     0.0000000
##   Other                   20.2788624   14.4667692   160.8224023
##   Quercus petraea          0.0000000    0.0000000     0.0000000
##   Quercus robur          346.9621541   13.3701538  1188.0759332
##   Sorbus aucuparia         0.0000000    2.0637569     0.0000000
total_biomass <- colSums(biomass)

par(mar=c(3,3,1,12), cex=0.8, mgp=c(2,1,0))
centres <- barplot(biomass, legend.text=rownames(biomass),
                   ylab='Total biomass (t/ha)', ylim=c(0, 1500), col=tree_cols,
                   args.legend=list(x=7.5, y =1400, xjust=0, bty='n'))
text(centres, total_biomass, labels=round(total_biomass, 1), adj=c(0.5, -1.2))

Those values are mostly far too high. We would expect around 100 tonnes per hectare of above ground biomass: roughly 100 mature stems per hectare with a median stem biomass of about 1 tonne. We clearly have problems both with the estimation of stem biomass - and that is particularly problematic for rarer species where we have small samples to estimate a representative stem weight - and with the estimation of spatial stem densitities.