Forests support biodiversity through the supply of food (fruits, seeds, nuts, leaves, cambium, wood, detritus) and a 3-D habitat structure (ground cover, microhabitats, protection from predators on the ground and in the air). This diversity of resources (food and habitat) is probably critical to the diversity of all other organisms.
In addition to biodiversity, forests sequester CO2 from the atmosphere in the form of living wood and buried soil organic matter. They regulate temperature by intercepting incoming solar radiation and through through evaporative cooling during photosynthesis.
The big questions about biodiversity include the following:
At Duke Forest we’ll gather data that address these questions. We’ll learn about distribution and abundance of tree species, the food resources they produce, and the vertebrate consumers.
To bring: - Clip board (or Excel sheets on ipad) - Cruz-all - Diameter tapes - Binoculars - Smart phone with iNaturalist
Activities will include the following:
Camera traps: collect those already there, set new traps
Forest inventory: variable radius and permanent plots
Variables:
In the vignette that follows we’ll explore existing data from the site we’ll visit in the field.
Here I load libraries, source code, and specify data:
source( 'clarkFunctions2024.R' )
library( dplyr )
library( mastif )
library( stringi )
library( maps )
library( car )
dataFile <- 'mastifDataNC.rdata'
plot <- 'DUKEBW'
Next, I load a file that contains forest data for North Carolina–I’ve
clipped from our global data. In this block of code I load
the file plot the locations from the data.frame plotData. I
use the map function to plot the locations:
load( dataFile, verbose = T )
xlim <- range( plotData$lon ) + .5*c(-1, 1 )
ylim <- range( plotData$lat ) + .5*c(-1, 1 )
map( 'state', xlim = xlim, ylim = ylim ) # draw map
points( plotData$lon, plotData$lat, cex = .2, col = 2 ) # add data points
## Loading objects:
## plotData
## taxa
## inputs
Here are the plots with long-term data (data includes
ST for seed traps):
plotData$plot[ grep( 'ST', plotData$data ) ]
## [1] "CALLHDWD" "CALLPIPA" "CWT118" "CWT218" "CWT318" "CWT427"
## [7] "CWT527" "CWTLG" "CWTUG" "DUKEBW" "DUKEEE" "DUKEEW"
## [13] "DUKEFACE1" "DUKEFACE2" "DUKEFACE3" "DUKEFACE4" "DUKEFACE5" "DUKEFACE6"
## [19] "DUKEHW" "GRSMCD" "GRSMND" "GRSMPK" "MARSF" "MARSP"
The field-trip plot is in this list, DUKEBW.
The problem with this map is that all of the observations are piled
on top of each other. Some of these points correspond to 1 tree-year.
Others correspond to 10,000. The map does not show how sample effort
varies. In this case, sample effort means the number of
tree-years observed within each 0.1 x 0.1 degree of lon/lat. Here I use
the function obsGrid, which is sourced from
clarkFunctions2024.R, to shade locations by the number of
tree-years they represent. I use the function symbols to
draw squares of the size and color needed to show effort:
effort <- obsGrid( inputs$treeData, plotData ) # aggregate observations to 0.1 degree
map( 'state', xlim = xlim, ylim = ylim )
symbols( effort$lon, effort$lat,
squares = rep(.11, nrow(effort) ),
fg = effort$cols, bg = effort$cols, inches = F, add = T )
Clearly, the effort is uneven. This is typical of environmental data. This distribution affects what we can estimate.
The list effort is a data.frame. A few
lines show the structure:
| lon | lat | nobs | cols |
|---|---|---|---|
| -80.5 | 33.5 | 32 | #FC9F6A |
| -80.4 | 33.5 | 1 | #FDBB84 |
| -80.8 | 33.8 | 63 | #FC9661 |
| -78.4 | 34.0 | 3 | #FCB17B |
| -78.3 | 34.0 | 1 | #FDBB84 |
The column nobs is the number of observations in the
grid cell. The column cols is the color assigned to that
grid cell. The hex format has # followed by 2 values each
for red, green, and blue.
The list inputs holds several objects:
summary( inputs )
## Length Class Mode
## priorTable 9 data.frame list
## randomEffect 2 -none- list
## seedData 247 data.frame list
## seedNames 2 -none- character
## seedTraits 334 -none- numeric
## specNames 111 -none- character
## treeData 28 data.frame list
## xytrap 4 data.frame list
## xytree 4 data.frame list
## yearEffect 2 -none- list
Several of these are data.frames. The
treeData starts like this, with one row per tree-year
| plot | tree | species | year | diam | shade | cropCount | cropFraction |
|---|---|---|---|---|---|---|---|
| GRSMCD | 1.01 | abiesFraseri | 2011 | 15.9 | 3 | NA | NA |
| GRSMCD | 1.01 | abiesFraseri | 2012 | 16.4 | 3 | NA | NA |
| GRSMCD | 1.01 | abiesFraseri | 2013 | 16.8 | 3 | NA | NA |
| GRSMCD | 1.01 | abiesFraseri | 2014 | 17.3 | 3 | NA | NA |
| GRSMCD | 10.01 | abiesFraseri | 2011 | 3.4 | 3 | NA | NA |
This is a large data.frame with
nrow(inputs$treeData) = 796728 tree-years. The
plot column is used to align this tree-year data with plot
data in plotData:
| plot | lon | lat | UTMx | UTMy | UTMzone | data | ecoRegWWF |
|---|---|---|---|---|---|---|---|
| BCEF30 | -82.64043 | 35.46287 | 351149.3 | 3925612 | 17 | CC | Appalachian-Blue Ridge forests |
| BCEF31 | -82.62934 | 35.50521 | 352232.9 | 3930292 | 17 | CC | Appalachian-Blue Ridge forests |
| BCEFBB | -82.64627 | 35.48923 | 350650.3 | 3928333 | 17 | CC | Appalachian-Blue Ridge forests |
| BCEFBP | -82.59549 | 35.50025 | 355277.0 | 3929481 | 17 | CC | Appalachian-Blue Ridge forests |
| BCEFBT | -82.64707 | 35.47600 | 350553.2 | 3926867 | 17 | CC | Appalachian-Blue Ridge forests |
The UTM columns translate lon/lat to meters. The
data column indicates if the site has long-term inventory
data with seed traps (ST), crop counts (CC),
or both (ST_CC). ecoRegWWF is the WWF
ecoRegion.
For example, if I want to extract plot information for the first five
tree-years in inputs$treeData I can use the
match function:
mm <- match( inputs$treeData$plot, plotData$plot )
plotData[ mm[1:5], ]
These lines are identical, because they come from the same plot in
inputs$treeData.
Here are some columns of inputs$treeData:
| object | mode |
description |
|---|---|---|
| plot | character |
plot name, e.g., DUKEBW |
| tree | character |
unique within a plot |
| species | code8 format | camelCase, 8 genus and 8 species characters |
| year | numeric |
observation year |
| diam | numeric |
cm |
| shade | numeric |
1 (full sun) to 5 (full shade) |
| cropCount | numeric |
no. fruits counted |
| cropFraction | numeric |
fraction of crop represented by cropCount |
Many of the tree-years do not include cropCount (or
other variables). Missing observations is a common feature of long-term
data sets. They are represented by NA.
The species that occur in this data set are here:
inputs$specNames
## [1] "abiesFraseri" "acerBarbatum" "acerFloridan"
## [4] "acerNegundo" "acerPensylva" "acerRubrum"
## [7] "acerSaccharu" "acerSpicatum" "aesculusFlava"
## [10] "aesculusPavia" "aesculusSylvatic" "ailanthuAltissim"
## [13] "amelanchArborea" "amelanchLaevis" "asiminaTriloba"
## [16] "betulaAlleghan" "betulaLenta" "carpinusCarolini"
## [19] "caryaAlba" "caryaIllinoin" "caryaOvalis-g"
## [22] "caryaOvata" "caryaPallida" "castaneaDentata-"
## [25] "castaneaPumila" "celtisLaevigat" "cercisCanadens"
## [28] "chamaecyThyoides" "chionantVirginic" "cornusFlorida"
## [31] "corylusAmerican" "diospyroVirginia" "fagusGrandifo"
## [34] "frangulaCarolini" "fraxinusAmerican" "fraxinusPennsylv"
## [37] "gleditsiTriacant" "gordoniaLasianth" "halesiaCarolina"
## [40] "ilexDecidua" "ilexMontana" "ilexOpaca"
## [43] "ilexVerticil" "ilexVomitori" "juglansNigra"
## [46] "juniperuVirginia" "linderaBenzoin" "liquidamStyracif"
## [49] "liriodenTulipife" "magnoliaAcuminat" "magnoliaFraseri"
## [52] "magnoliaGrandifl" "magnoliaMacrophy" "magnoliaTripetal"
## [55] "magnoliaVirginia" "morusRubra" "nyssaBiflora"
## [58] "nyssaSylvatic" "ostryaVirginia" "oxydendrArboreum"
## [61] "paulowniTomentos" "perseaBorbonia" "perseaPalustri"
## [64] "piceaRubens" "pinusEchinata" "pinusElliotti"
## [67] "pinusPalustri" "pinusPungens" "pinusRigida"
## [70] "pinusSerotina" "pinusStrobus" "pinusTaeda"
## [73] "pinusVirginia" "platanusOccident" "prunusCarolini"
## [76] "prunusPensylva" "prunusSerotina" "quercusAlba"
## [79] "quercusBicolor" "quercusCoccinea" "quercusFalcata"
## [82] "quercusHemispha" "quercusIncana" "quercusLaevis"
## [85] "quercusLaurifol" "quercusLyrata" "quercusMacrocar"
## [88] "quercusMargaret" "quercusMariland" "quercusMichauxi"
## [91] "quercusMontana" "quercusNigra" "quercusPhellos"
## [94] "quercusRubra" "quercusStellata" "quercusVelutina"
## [97] "quercusVirginia" "robiniaPseudoac" "sassafraAlbidum"
## [100] "sorbusAmerican" "taxodiumAscenden" "taxodiumDistichu"
## [103] "tiliaAmerican" "tiliaHeteroph" "tsugaCanadens"
## [106] "ulmusAlata" "ulmusAmerican" "ulmusRubra"
## [109] "viburnumPrunifol" "viburnumRafinesq" "viburnumRufidulu"
These names could be unfamilar. Some addition insight is gained from
the data.frame taxa. Here are the top rows from
taxa:
| genus | family | leaves | fruit | pollination | dispersal | gmPerSeed | seedsPerFruit | |
|---|---|---|---|---|---|---|---|---|
| abiesFraseri | Abies | Pinaceae | needle_evergreen | samara | wind | anemochory | 0.0100 | 180.5 |
| acerBarbatum | Acer | Sapindaceae | broad_deciduous | samara | animal | anemochory | 0.0300 | 1.0 |
| acerFloridan | Acer | Sapindaceae | broad_deciduous | samara | animal | anemochory | 0.0683 | 1.0 |
| acerNegundo | Acer | Sapindaceae | broad_deciduous | samara | animal | anemochory | 0.0450 | 1.0 |
| acerPensylva | Acer | Sapindaceae | broad_deciduous | samara | animal | anemochory | 0.0385 | 1.0 |
| acerRubrum | Acer | Sapindaceae | broad_deciduous | samara | animal | anemochory | 0.0210 | 1.0 |
| acerSaccharu | Acer | Sapindaceae | broad_deciduous | samara | animal | anemochory | 0.0736 | 1.0 |
| acerSpicatum | Acer | Sapindaceae | broad_deciduous | samara | animal | anemochory | 0.0503 | 1.0 |
| aesculusFlava | Aesculus | Sapindaceae | broad_deciduous | capsule | animal | barochory | 4.5000 | 1.0 |
| aesculusPavia | Aesculus | Sapindaceae | broad_deciduous | capsule | animal | barochory | 6.5000 | 1.0 |
Many of these columns are self-explanatory. However, don’t worry about your knowledge of tree species.
Again, we will visit the long-term plot named DUKEBW
(Duke Forest, Blackwood). I use the function extractPlot to
examine long-term data from this site:
load( dataFile, verbose = T )
## Loading objects:
## plotData
## taxa
## inputs
tmp <- extractPlot( inputs, plot, taxa, plotData ) # plot specified at top of this doc
dataList <- tmp$inputs
plotData <- tmp$plotData
taxa <- tmp$taxa
The treeData starts like this, with one row per
tree-year
| plot | tree | species | year | diam | shade | cropCount | cropFraction | |
|---|---|---|---|---|---|---|---|---|
| 59157 | DUKEBW | 10000 | acerRubrum | 2000 | 11.1 | 5 | NA | NA |
| 60100 | DUKEBW | 10000 | acerRubrum | 2001 | 11.1 | 5 | NA | NA |
| 61122 | DUKEBW | 10000 | acerRubrum | 2002 | 11.1 | 5 | NA | NA |
| 62107 | DUKEBW | 10000 | acerRubrum | 2003 | 11.1 | 5 | NA | NA |
| 6396 | DUKEBW | 10000 | acerRubrum | 2004 | 11.1 | 5 | NA | NA |
The species on this plot and their traits are in the new version of
taxa. Here is a table of tree-years by year:
table( dataList$treeData$year )
##
## 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
## 9841 9841 9928 9940 10122 10117 10256 10253 9690 10119 11494 11425 10661
## 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023
## 10660 11578 11558 10085 10041 10470 10470 9227 9930 9905 18
The plots were not sampled every year. I’ve filled in the missing years by interpolating diameters.
To see maps from two different years I can use the
function mastMap:
mapList <- dataList
mapList$mapPlot <- plot
mapList$mapYears <- c(2000, 2020)
mapList$treeScale <- .3
mastMap( mapList )
## Loading objects:
## dem
## Loading objects:
## dem
In these maps trees of each species is given a different color. I have not included a color legend because it would be too long. The colors still give a sense of the diversity. There is a road separating the SW and NE sections of the plot. There is power line cutting through the southern end of the plot. If you look closely you can see that some of the large trees present in 2000 are no longer there.
To get a better idea of the diversity, here’s one genus:
genus <- 'quercus'
mapList$specNames <- dataList$specNames[ startsWith( dataList$specNames , genus ) ]
mapList$seedNames <- dataList$seedNames[ startsWith( dataList$seedNames , genus ) ]
mapList$mapYears <- 2020
mapList$treeScale <- .4
mapList$LEGEND <- TRUE
mastMap( mapList )
## Loading objects:
## dem
The squares are seed traps, scaled to the number of seeds collected at that location.
Here is another genus:
genus <- 'pinus'
mapList$specNames <- dataList$specNames[ startsWith( dataList$specNames , genus ) ]
mapList$seedNames <- dataList$seedNames[ startsWith( dataList$seedNames , genus ) ]
mastMap( mapList )
Another example:
genus <- 'carya'
mapList$specNames <- dataList$specNames[ startsWith( dataList$specNames , genus ) ]
mapList$seedNames <- dataList$seedNames[ startsWith( dataList$seedNames , genus ) ]
mastMap( mapList )
Here are some species from different genera:
specs <- c('juglansNigra', 'nyssaSylvatic' )
mapList$specNames <- specs
mapList$seedNames <- specs
mastMap( mapList )
There are some contrasts in distribution on this map. They also appear to differ in size distribution. Here is a histogram of diameters for the pines:
treeData <- dataList$treeData # less typing!
pines <- rownames( taxa )[ taxa$genus == 'Pinus' ]
xlim <- c(0, 80)
wp <- which( treeData$species %in% pines )
hist( treeData$diam[wp], xlim = xlim, nclass = 20, main = 'Pines' )
Here are the oaks:
oaks <- rownames( taxa )[ taxa$genus == 'Quercus' ]
wq <- which( treeData$species %in% oaks )
hist( treeData$diam[wq], xlim = xlim, nclass = 20, main = 'Oaks' )
Here is sweet gum:
par( mfrow = c(1,3), bty = 'n', omi = c(.5,.5,.1,.1) )
hist( treeData$diam[wp], xlim = xlim, nclass = 20, main = 'Pines', xlab = '', ylab = '' )
hist( treeData$diam[wq], xlim = xlim, nclass = 20, main = 'Oaks', xlab = '', ylab = '' )
ws <- which( treeData$species == 'liquidamStyracif' )
hist( treeData$diam[ws], xlim = xlim, nclass = 20, main = 'Sweet gum', xlab = '', ylab = '' )
mtext( 'Diameter (cm)', 1, outer = T )
mtext( 'Tree years', 2, outer = T )
Which species dominant the stand? The
data.frame treeData has a row for each tree-year, but I
want one row per tree. So I find the maximum diameter for each tree,
which should be the most recent:
maxDiam <- tapply( treeData$diam, treeData$tree, max, na.rm = T )
mm <- match( names(maxDiam), treeData$tree )
trees <- data.frame( maxDiam, treeData[mm,c('species','shade')] )
The rownames of trees is the tree
number.
Here is species dominance by numbers:
stems <- table( trees$species )
stems <- stems[ order(stems, decreasing = T ) ]
plot( stems, type = 's', log = 'xy', ylim = c(1, max(stems) ), xlab = 'Rank', ylab = 'Number' )
#label 12 most abundant
i <- 1:12
text( i, .9*stems[i], names(stems)[i], srt = 90, pos = 2 )
Here is species dominance by basal area:
trees$BA <- pi*(trees$maxDiam/2)^2
tba <- tapply(trees$BA, trees$species, sum, na.rm = T )
tba <- tba[ order(tba, decreasing = T ) ]
plot( tba, type = 's', log = 'xy', ylim = c(1, max(tba) ), xlab = 'Rank', ylab = 'Basal Area' )
text( i, .9*tba[i], names(tba)[i], srt = 90, pos = 2 )
Exercise 1: Which could be the most important tree species in this stand? Based on what criteria?
Exercise 2: Repeat this exercise with another nearby
plot, DUKEEW. Does this added information change the
previous answers?
After the field trip we’ll estimate seed production using a Bayesian
hierarchical model called mastif (Clark
et al. 2019, Qiu et al. 2023). We want to know which species have
the largest reproductive effort and how it varies over time. Model
fitting is done one genus at a time. This is because many of the seeds
recovered from seed traps can be identified to genus level but not to
species level. The estimation of seed produced by species is detailed in
Clark
et al. (2019).
We can see some fruits, nuts, and seeds in trees, but not most of them. We can observe them on the ground, but how do we determine which seeds came from which tree?
mastif uses seed traps and crop counts on trees to
estimate seed productivity and dispersion. Attributes of individual
trees and their local environments could explain their differences in
fecundity. Inference requires information on locations of trees and seed
traps and predictors ( covariates and factors ) that could explain
source strength.
The mastif model admits two data types, crop counts (above)
and seed traps (below). The model for maturation and seed production of
each tree-year links to observations through a data model. In the case
of seed traps, this data model describes seed transport.
Crop counts enter through the long-term inventory
data (treeData) or through the mastif
project on iNaturalist. To add observations to mastif you
can download the iNaturalist
app on a smart phone. In addition to a smart phone, you need a diameter
tape. A simple summary is here.
Seed traps accumulate seeds that can be counted used
to estimate seed production through a dispersal kernel. They enter
through inventory plots in seedData with locations in
xytrap.
Many trees are not reproductively mature, and reproductive status is
often unknown. For individuals of unknown maturation
status (reproductively mature or not), the maturation status
must be estimated together with the conditional
fecundity (seed production given the mature state). Currently,
maturation status depends only on tree diameter (the variable
diam).
Predictors of fecundity include individual traits, site characteristics, climate, synchronicity with other trees, and lag effects. For studies that involve multiple plots and years, there can be effects of climate between sites and through time. There may be synchronicity between individuals both within and between plots.
The model incorporates two types of redistribution.
The first type is a redistribution from discrete species space to a
different seed-type space. Not all seed types can be definitely assigned
to species. The contribution of trees of each species to multiple seed
types must be estimated. As an example, a plot may include trees of
these species in the same genus: pinusTaeda,
pinusEchinata. The seeds recovered in traps from the same
genus could include any combination of the folling types:
pinusTaeda, pinusEchinata,
pinusUNKN. The last seed type, identified only to the genus
Pinus, could dominate seed counts. The seed production by
species must estimate how they are distributed across the observed seed
types. This redistribution is estimated as a
specNames-to-seedNames matrix.
The second redistribution comes from source trees to seed traps, or seed dispersal. Within a plot-year there is dependence between all seed traps induced by a redistribution kernel, which describes how seeds produced by trees are dispersed in space. Because ‘seed shadows’ of trees overlap, seeds are assigned to trees only in a probabilistic sense. The capacity to obtain useful estimates depends on the number of trees, the number of seed traps, the dispersal distances of seeds, the taxonomic specificity of seed identification, and the number of species contributing to a given seed type.
In addition to predictors, the process model for fecundity can involve random tree effects, random group effects ( e.g., species-location ), year effects, and autoregressive lags limited only by the duration of data. Dispersal estimates can include species and site differences (Clark et al. 2013).
mastif packageA full description of mastif is here.
The figure below summarizes the main elements. The
function mastSim is used to simulate a data set. This
allows us to check on data prediction and parameter recovery.
Inputs and outputs (grey) for three main functions (yellow) in the
mastif package.
The function mastif fits the model to data held in four
data.frames called treeData (tree-year by
variables), seedData (seedTrap-year by seed counts),
xytree (tree locations), and xytrap (seed-trap
locations). The function mastif generates output as a
list.
The function mastPlot generates plots of
diagnostics.
The model requires a minimum of six inputs directly as arguments to
the function mastif. There are two formulas,
two character vectors, and four
data.frames:
Table 2. Required in the list inputs to
the function mastif
| object | mode |
components |
|---|---|---|
formulaFec |
formula |
fecundity model |
formulaRep |
formula |
maturation model (currently fixed at
~ diam |
specNames |
character vector |
names in column species of
treeData to include in model |
seedNames |
character vector |
names of seed types to include in model: match column
names in seedData. Seeds identified only to genus are
genusUNKN |
treeData |
data.frame |
tree-year by variables, includes columns
plot, tree, year,
diam, repr |
seedData |
data.frame |
trap-year by seed types, includes columns
plot, trap, year, one for each
seedNames |
xytree |
data.frame |
tree by location, includes columns plot,
tree, x, y |
xytrap |
data.frame |
trap by location, includes columns plot,
trap, x, y |
I introduce additional inputs in the sections that follow, but start with this basic model.
Here is code to extract species from one genus
load( dataFile, verbose = F )
genus <- 'fraxinus'
tmp <- extractGenus( inputs, genus = genus, taxa, plotData )
dataList <- tmp$inputs
plotData <- tmp$plotData
taxa <- tmp$taxa
Here is the trait table that has been trimmed to members of this genus on the plot:
| genus | family | leaves | fruit | pollination | dispersal | gmPerSeed | seedsPerFruit | |
|---|---|---|---|---|---|---|---|---|
| fraxinusAmerican | fraxinus | Oleaceae | broad_deciduous | samara | wind | anemochory | 0.0454 | 1 |
| fraxinusPennsylv | fraxinus | Oleaceae | broad_deciduous | samara | wind | anemochory | 0.0317 | 1 |
treeDataI want to estimate the effects of environment on seed production, so
I need to add those predictors. The treeData data.frame
already contains some tree-year variables, including diam
and shade. Here I add plot variables to
treeData from the data.frame plotData using
the match function:
plotVars <- c( 'ecoRegWWF', 'slope', 'aspect', 'cec30', 'n30', 'ph', 'tpi' )
mm <- match( dataList$treeData$plot, plotData$plot )
dataList$treeData <- cbind( dataList$treeData, plotData[mm, plotVars ] )
Here is the top of the data.frame treeData in the
list dataList:
| plot | tree | species | diam | shade | tempAnom | defAnom | lateFreezeAnom | tempSite | defSite | slope | aspect | cec30 | n30 | ph | tpi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | -1.006 | 292 | -3004.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | -0.748 | -447 | -1193.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | -0.548 | 81 | 43530.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | -1.148 | -28 | -169.8 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | -0.981 | -152 | -15720.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | 0.619 | 110 | -1012.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | 0.194 | 238 | -15040.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | -0.431 | 311 | 20970.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | -0.198 | 236 | -8444.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
| CWT427 | 439 | fraxinusAmerican | 44.7 | 5 | 0.069 | 210 | 5462.0 | 11.48 | -130 | 18.88 | 26.76 | 332.15 | 5.75 | 5.1 | -51 |
You’ll see that some predictors are plot (e.g.,
tempSite, defSite, cec),
plot-year (e.g., tempAnom, defAnom), tree
(e.g., species) or tree-year (e.g., diam). If
I were using only one plot for this analysis, I could not include any
plot-level predictors; they would not vary across trees on the same
plot.
There can be year effects to allow for “masting”, the tendency for
individuals of the same species to produce large seed crops
quasi-synchronously. There can also be detectable effects of climate
variation, which enters the model as annual anomalies. In this example,
I include defAnom, which is the annual anomaly in moisture
deficit. Positive values for defAnom are wet years, and
vice versa. In this case the yearEffect represents
variation that is shared across individuals that is not explained by
defAnom. The year effects tend to be “regional”, but not
global. For example, the mast years for a species in New England
generally do not match those in North Carolina. Here I define
yearEffect by ecoRegWWF and
species (quercusAlba can differ from
quercusRubra).
dataList$yearEffect <- list( groups = c( 'species', 'ecoRegWWF' ) )
Individuals can vary widely in seed production, so we use random tree
effects, or treeID. These are specified as a random
intercept:
dataList$randomEffect <- list( randGroups = 'treeID', formulaRand = as.formula( ~ 1 ) )
I try a combination of tree-year and plot-year variables for
conditional fecundity. Here are some formulas and a specification to
generate seed-rain prediction for the plot = DUKEBW for the
mapYears = c(2018:2021):
# assign formulas
dataList$formulaFec <- formulaFec <- as.formula( ~ diam + shade + defAnom + tpi )
dataList$formulaRep <- as.formula( ~ diam )
# plot-years to predict seed rain
mapYears <- c(2000:2021)
dataList$predList <- list( mapMeters = 10, plots = plot, years = mapYears )
Model fitting requires specification of the number of MCMC iterations
ng and the burnin. Here is an analysis using
the dataList with a small number of iterations. Still, it
will take some time due to the large numbers of tree years and seed
traps.
output <- mastif( inputs = dataList, ng = 2000, burnin = 1000 )
mastPlot( output )
Here are some plots from mastPlot:
The fitted model predicts the data from latent state estimates (c) and from predictors (a).
Recall that fecundity is not observed, but rather estimated. Therefore I cannot compare predicted fecundity with observed fecundity. However, I can compare the seed counts (per m2) with seed counts predicted from the estimated fecundity, or fecundity estimate –> seed traps. This is part (c) of the figure above. If the estimates of fecundity and seed dispersal are confident, then this plot would tend to the 1:1 line.
I can also compare seed counts to the prediction from the full fitted model, or predictors –> fecundity –> seed traps (part a in the figure). If the predictors explain much of variation in fecundity, and dispersal is confidently estimated by the model, then this plot will tend to the 1:1 line. The differences between these plots can indicate that a breakdown in fit comes at the stage of estimating fecundity from the predictors–the predictors may not explain much variation in fecundity, even if there are confident estimates of fecundity.
Estimated year effects for two species in two ecoregions.
In the plot above, the year effects are grouped by species and ecoregion. There are two species here, and the coverage differs by year and ecoregion. This is not a strong masting genus, and the year effects are noisy, not much different from zero.
Effect of predictors on fecundity, including shade class, annual deficit anomaly (negative is wet), and the topographic position index (negative is low).
Shade generally has a negative effect on fecundity. We’ve known this to be true for growth, but is effects on fecundity of trees in the wild was not known.
The negative effects of defAnom means that both species
have reduced seed production in years with high moisture deficit. In
other words, moist years are good.
The negative effects of tpi, or topographic position
index, on fraxinusAmerican means that seed production is
higher in the moist bottomlands. The second species,
fraxinusPennsylv has a smaller sample size in the data set;
it does not show a tpi effect.
Predicted maturation probability and fecundity by diameter for two species.
The model predicts maturation and fecundity for each species. It depends on the distribution of data, shown as histograms in the right panels.
Here is the prediction surface for seed supply to the forest floor by all Fraxinus trees on the plot:
output$mapPlot <- plot
output$mapYears <- mapYears
output$trapScale <- .8
output$PREDICT <- T
mastMap( output )
Predicted seed rain across the DUKEBW plot differs by year. Circles show
tree diameter. Boxes are scaled to seed counts in traps. The
+ symbol indicates a seed trap that collected no seed in
that year.
Note how the seed supply to the forest floor changes from one year to the next. If this were a strong masting species, the yearly contrasts would be much larger. Even so, there are areas receiving abundant seed and others not at all.
Seed production of the genus Quercus can vary widely from year to year. Many consumers benefit from high mast years. Because there are many Quercus species, and they may not be synchronized across species, it’s possible that there may be some compensation, in the sense that a poor year for one species may coincide with good years in others. Consumers capable of consuming the fruits of more than one tree species may respond to the sum.
I fitted the model to the genus Quercus and summarize the annual variation by species and ecoregion. Values have a predictive coefficient of variation greater than 1 are omitted, so there are some straight lines connecting years across these poorly estimated years. I represent the variation within a species-ecoregion in two ways.
The estimated year effects (+/- 1 SE) add to the effects of other annual effects in the model, such as climate anomalies.
The estimated year effects in the model amplify the effects of other predictors that contribute inter-annual variation. For this reason, the year effects can be viewed as variation shared between individuals beyond the effects other predictors in the model. Wide standard errors for the most recent years are due to fact that many samples are not yet analyzed.
The predicted inter-annual variation in seed production per cm2 of the tree’s basal area; it attempts to adjust for size differences.
The interannual variation per basal area shows wide variation across trees, because synchronicity in most years is weak. Unlike the year effect estimates, these estimates incorporate all variation, including unexplained variation.
Predicted seed rain of Quercus across the DUKEBW plot. Circles are trees scaled to diameter. Squares are seed traps scale to seed recovery. Prediction surfaces follow the scale at bottom.
The variation is spatio-temporal. For example, at the
DUKEBW plot shows the tendency for most seed to accumulate
in the southwestern part of the plot, which is high elevation and
dominated by oaks. However, the maps can shift between years based on
the interannual variation within and between species.
Clark, JS, C Nunes, and B Tomasek. 2019. Masting as an unreliable resource: spatio-temporal host diversity merged with consumer movement, storage, and diet. Ecological Monographs, 89:e01381.
Journe, V., et al., 2022. Globally, tree fecundity exceeds productivity gradients. Ecology Letters, DOI: 10.1111/ele.14012.
Qiu T., et al., 2022. Limits to reproduction and seed size-number tradeoffs that shape forest dominance and future recovery. Nature Communications, 13:2381.