Forests support biodiversity through the supply of food (fruits, seeds, nuts, leaves, cambium, wood, detritus) and a 3-D habitat structure (ground cover, microhabitats, protection from predators on the ground and in the air). This diversity of resources (food and habitat) is probably critical to the diversity of all other organisms.

In addition to biodiversity, forests sequester CO2 from the atmosphere in the form of living wood and buried soil organic matter. They regulate temperature by intercepting incoming solar radiation and through through evaporative cooling during photosynthesis.

The big questions about biodiversity include the following:

How many species are there?
What is the diversity in terms of relative dominance (a few dominant, or equitable)?
Do they partition the environment in order to coexist? Which variables can we measure that might show this?
How is diversity changing? E.g., size/age structure?
How fast is C accumulating?
How much wildlife food is available? How does it vary between habitats?

At Duke Forest we’ll gather data that address these questions. We’ll learn about distribution and abundance of tree species, the food resources they produce, and the vertebrate consumers.

Some Field trip logistics

To bring: - Clip board (or Excel sheets on ipad) - Cruz-all - Diameter tapes - Binoculars - Smart phone with iNaturalist

Activities will include the following:

Camera traps: collect those already there, set new traps
Forest inventory: variable radius and permanent plots
Variables:
- Species: yes, we can identify them in winter
- DBH: diameter at breast height summarizes size differences and size structure
- Crown class: how much shading does a tree experience from neighbors?
- Crop count: the amount of seeds, fruits, and nuts determines forest reproduction and food for consumers.

In the vignette that follows we’ll explore existing data from the site we’ll visit in the field.

Some inventory data

Here I load libraries, source code, and specify data:

source( 'clarkFunctions2024.R' )
library( dplyr )
library( mastif )
library( stringi )
library( maps )
library( car )
dataFile <- 'mastifDataNC.rdata'
plot     <- 'DUKEBW'

Next, I load a file that contains forest data for North Carolina–I’ve clipped from our global data. In this block of code I load the file plot the locations from the data.frame plotData. I use the map function to plot the locations:

load( dataFile, verbose = T )
xlim   <- range( plotData$lon ) + .5*c(-1, 1 )
ylim   <- range( plotData$lat ) + .5*c(-1, 1 )
map( 'state', xlim = xlim, ylim = ylim )                 # draw map
points( plotData$lon, plotData$lat, cex = .2, col = 2 )  # add data points

## Loading objects:
##   plotData
##   taxa
##   inputs

Here are the plots with long-term data (data includes ST for seed traps):

plotData$plot[ grep( 'ST', plotData$data ) ]

##  [1] "CALLHDWD"  "CALLPIPA"  "CWT118"    "CWT218"    "CWT318"    "CWT427"   
##  [7] "CWT527"    "CWTLG"     "CWTUG"     "DUKEBW"    "DUKEEE"    "DUKEEW"   
## [13] "DUKEFACE1" "DUKEFACE2" "DUKEFACE3" "DUKEFACE4" "DUKEFACE5" "DUKEFACE6"
## [19] "DUKEHW"    "GRSMCD"    "GRSMND"    "GRSMPK"    "MARSF"     "MARSP"

The field-trip plot is in this list, DUKEBW.

Sample effort

The problem with this map is that all of the observations are piled on top of each other. Some of these points correspond to 1 tree-year. Others correspond to 10,000. The map does not show how sample effort varies. In this case, sample effort means the number of tree-years observed within each 0.1 x 0.1 degree of lon/lat. Here I use the function obsGrid, which is sourced from clarkFunctions2024.R, to shade locations by the number of tree-years they represent. I use the function symbols to draw squares of the size and color needed to show effort:

effort <- obsGrid( inputs$treeData, plotData ) # aggregate observations to 0.1 degree
map( 'state', xlim = xlim, ylim = ylim )                 
symbols( effort$lon, effort$lat, 
         squares = rep(.11, nrow(effort) ), 
         fg = effort$cols, bg = effort$cols, inches = F, add = T )

Clearly, the effort is uneven. This is typical of environmental data. This distribution affects what we can estimate.

The list effort is a data.frame. A few lines show the structure:

lon	lat	nobs	cols
-80.5	33.5	32	#FC9F6A
-80.4	33.5	1	#FDBB84
-80.8	33.8	63	#FC9661
-78.4	34.0	3	#FCB17B
-78.3	34.0	1	#FDBB84

The column nobs is the number of observations in the grid cell. The column cols is the color assigned to that grid cell. The hex format has # followed by 2 values each for red, green, and blue.

Taxa

The list inputs holds several objects:

summary( inputs )

##              Length Class      Mode     
## priorTable     9    data.frame list     
## randomEffect   2    -none-     list     
## seedData     247    data.frame list     
## seedNames      2    -none-     character
## seedTraits   334    -none-     numeric  
## specNames    111    -none-     character
## treeData      28    data.frame list     
## xytrap         4    data.frame list     
## xytree         4    data.frame list     
## yearEffect     2    -none-     list

Several of these are data.frames. The treeData starts like this, with one row per tree-year

plot	tree	species	year	diam	shade	cropCount	cropFraction
GRSMCD	1.01	abiesFraseri	2011	15.9	3	NA	NA
GRSMCD	1.01	abiesFraseri	2012	16.4	3	NA	NA
GRSMCD	1.01	abiesFraseri	2013	16.8	3	NA	NA
GRSMCD	1.01	abiesFraseri	2014	17.3	3	NA	NA
GRSMCD	10.01	abiesFraseri	2011	3.4	3	NA	NA

This is a large data.frame with nrow(inputs$treeData) = 796728 tree-years. The plot column is used to align this tree-year data with plot data in plotData:

plot	lon	lat	UTMx	UTMy	UTMzone	data	ecoRegWWF
BCEF30	-82.64043	35.46287	351149.3	3925612	17	CC	Appalachian-Blue Ridge forests
BCEF31	-82.62934	35.50521	352232.9	3930292	17	CC	Appalachian-Blue Ridge forests
BCEFBB	-82.64627	35.48923	350650.3	3928333	17	CC	Appalachian-Blue Ridge forests
BCEFBP	-82.59549	35.50025	355277.0	3929481	17	CC	Appalachian-Blue Ridge forests
BCEFBT	-82.64707	35.47600	350553.2	3926867	17	CC	Appalachian-Blue Ridge forests

The UTM columns translate lon/lat to meters. The data column indicates if the site has long-term inventory data with seed traps (ST), crop counts (CC), or both (ST_CC). ecoRegWWF is the WWF ecoRegion.

For example, if I want to extract plot information for the first five tree-years in inputs$treeData I can use the match function:

mm <- match( inputs$treeData$plot, plotData$plot )
plotData[ mm[1:5], ]

These lines are identical, because they come from the same plot in inputs$treeData.

Here are some columns of inputs$treeData:

object	`mode`	description
plot	`character`	plot name, e.g., `DUKEBW`
tree	`character`	unique within a plot
species	code8 format	camelCase, 8 genus and 8 species characters
year	`numeric`	observation year
diam	`numeric`	cm
shade	`numeric`	1 (full sun) to 5 (full shade)
cropCount	`numeric`	no. fruits counted
cropFraction	`numeric`	fraction of crop represented by cropCount

Many of the tree-years do not include cropCount (or other variables). Missing observations is a common feature of long-term data sets. They are represented by NA.

The species that occur in this data set are here:

inputs$specNames

##   [1] "abiesFraseri"     "acerBarbatum"     "acerFloridan"    
##   [4] "acerNegundo"      "acerPensylva"     "acerRubrum"      
##   [7] "acerSaccharu"     "acerSpicatum"     "aesculusFlava"   
##  [10] "aesculusPavia"    "aesculusSylvatic" "ailanthuAltissim"
##  [13] "amelanchArborea"  "amelanchLaevis"   "asiminaTriloba"  
##  [16] "betulaAlleghan"   "betulaLenta"      "carpinusCarolini"
##  [19] "caryaAlba"        "caryaIllinoin"    "caryaOvalis-g"   
##  [22] "caryaOvata"       "caryaPallida"     "castaneaDentata-"
##  [25] "castaneaPumila"   "celtisLaevigat"   "cercisCanadens"  
##  [28] "chamaecyThyoides" "chionantVirginic" "cornusFlorida"   
##  [31] "corylusAmerican"  "diospyroVirginia" "fagusGrandifo"   
##  [34] "frangulaCarolini" "fraxinusAmerican" "fraxinusPennsylv"
##  [37] "gleditsiTriacant" "gordoniaLasianth" "halesiaCarolina" 
##  [40] "ilexDecidua"      "ilexMontana"      "ilexOpaca"       
##  [43] "ilexVerticil"     "ilexVomitori"     "juglansNigra"    
##  [46] "juniperuVirginia" "linderaBenzoin"   "liquidamStyracif"
##  [49] "liriodenTulipife" "magnoliaAcuminat" "magnoliaFraseri" 
##  [52] "magnoliaGrandifl" "magnoliaMacrophy" "magnoliaTripetal"
##  [55] "magnoliaVirginia" "morusRubra"       "nyssaBiflora"    
##  [58] "nyssaSylvatic"    "ostryaVirginia"   "oxydendrArboreum"
##  [61] "paulowniTomentos" "perseaBorbonia"   "perseaPalustri"  
##  [64] "piceaRubens"      "pinusEchinata"    "pinusElliotti"   
##  [67] "pinusPalustri"    "pinusPungens"     "pinusRigida"     
##  [70] "pinusSerotina"    "pinusStrobus"     "pinusTaeda"      
##  [73] "pinusVirginia"    "platanusOccident" "prunusCarolini"  
##  [76] "prunusPensylva"   "prunusSerotina"   "quercusAlba"     
##  [79] "quercusBicolor"   "quercusCoccinea"  "quercusFalcata"  
##  [82] "quercusHemispha"  "quercusIncana"    "quercusLaevis"   
##  [85] "quercusLaurifol"  "quercusLyrata"    "quercusMacrocar" 
##  [88] "quercusMargaret"  "quercusMariland"  "quercusMichauxi" 
##  [91] "quercusMontana"   "quercusNigra"     "quercusPhellos"  
##  [94] "quercusRubra"     "quercusStellata"  "quercusVelutina" 
##  [97] "quercusVirginia"  "robiniaPseudoac"  "sassafraAlbidum" 
## [100] "sorbusAmerican"   "taxodiumAscenden" "taxodiumDistichu"
## [103] "tiliaAmerican"    "tiliaHeteroph"    "tsugaCanadens"   
## [106] "ulmusAlata"       "ulmusAmerican"    "ulmusRubra"      
## [109] "viburnumPrunifol" "viburnumRafinesq" "viburnumRufidulu"

These names could be unfamilar. Some addition insight is gained from the data.frame taxa. Here are the top rows from taxa:

	genus	family	leaves	fruit	pollination	dispersal	gmPerSeed	seedsPerFruit
abiesFraseri	Abies	Pinaceae	needle_evergreen	samara	wind	anemochory	0.0100	180.5
acerBarbatum	Acer	Sapindaceae	broad_deciduous	samara	animal	anemochory	0.0300	1.0
acerFloridan	Acer	Sapindaceae	broad_deciduous	samara	animal	anemochory	0.0683	1.0
acerNegundo	Acer	Sapindaceae	broad_deciduous	samara	animal	anemochory	0.0450	1.0
acerPensylva	Acer	Sapindaceae	broad_deciduous	samara	animal	anemochory	0.0385	1.0
acerRubrum	Acer	Sapindaceae	broad_deciduous	samara	animal	anemochory	0.0210	1.0
acerSaccharu	Acer	Sapindaceae	broad_deciduous	samara	animal	anemochory	0.0736	1.0
acerSpicatum	Acer	Sapindaceae	broad_deciduous	samara	animal	anemochory	0.0503	1.0
aesculusFlava	Aesculus	Sapindaceae	broad_deciduous	capsule	animal	barochory	4.5000	1.0
aesculusPavia	Aesculus	Sapindaceae	broad_deciduous	capsule	animal	barochory	6.5000	1.0

Many of these columns are self-explanatory. However, don’t worry about your knowledge of tree species.

Examine a long-term plot

Again, we will visit the long-term plot named DUKEBW (Duke Forest, Blackwood). I use the function extractPlot to examine long-term data from this site:

load( dataFile, verbose = T )

## Loading objects:
##   plotData
##   taxa
##   inputs

tmp  <- extractPlot( inputs, plot, taxa, plotData ) # plot specified at top of this doc
dataList <- tmp$inputs
plotData <- tmp$plotData
taxa     <- tmp$taxa

The treeData starts like this, with one row per tree-year

	plot	tree	species	year	diam	shade	cropCount	cropFraction
59157	DUKEBW	10000	acerRubrum	2000	11.1	5	NA	NA
60100	DUKEBW	10000	acerRubrum	2001	11.1	5	NA	NA
61122	DUKEBW	10000	acerRubrum	2002	11.1	5	NA	NA
62107	DUKEBW	10000	acerRubrum	2003	11.1	5	NA	NA
6396	DUKEBW	10000	acerRubrum	2004	11.1	5	NA	NA

The species on this plot and their traits are in the new version of taxa. Here is a table of tree-years by year:

table( dataList$treeData$year )

## 
##  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012 
##  9841  9841  9928  9940 10122 10117 10256 10253  9690 10119 11494 11425 10661 
##  2013  2014  2015  2016  2017  2018  2019  2020  2021  2022  2023 
## 10660 11578 11558 10085 10041 10470 10470  9227  9930  9905    18

The plots were not sampled every year. I’ve filled in the missing years by interpolating diameters.

Spatial distribution

To see maps from two different years I can use the function mastMap:

mapList <- dataList
mapList$mapPlot <- plot
mapList$mapYears <- c(2000, 2020)
mapList$treeScale <- .3
mastMap( mapList )

## Loading objects:
##   dem

## Loading objects:
##   dem

In these maps trees of each species is given a different color. I have not included a color legend because it would be too long. The colors still give a sense of the diversity. There is a road separating the SW and NE sections of the plot. There is power line cutting through the southern end of the plot. If you look closely you can see that some of the large trees present in 2000 are no longer there.

To get a better idea of the diversity, here’s one genus:

genus <- 'quercus'
mapList$specNames <- dataList$specNames[ startsWith( dataList$specNames , genus ) ]
mapList$seedNames <- dataList$seedNames[ startsWith( dataList$seedNames , genus ) ]
mapList$mapYears <- 2020
mapList$treeScale <- .4
mapList$LEGEND <- TRUE
mastMap( mapList )

## Loading objects:
##   dem

The squares are seed traps, scaled to the number of seeds collected at that location.

Here is another genus:

genus <- 'pinus'
mapList$specNames <- dataList$specNames[ startsWith( dataList$specNames , genus ) ]
mapList$seedNames <- dataList$seedNames[ startsWith( dataList$seedNames , genus ) ]
mastMap( mapList )

Another example:

genus <- 'carya'
mapList$specNames <- dataList$specNames[ startsWith( dataList$specNames , genus ) ]
mapList$seedNames <- dataList$seedNames[ startsWith( dataList$seedNames , genus ) ]
mastMap( mapList )

Here are some species from different genera:

specs <- c('juglansNigra', 'nyssaSylvatic' )
mapList$specNames <- specs
mapList$seedNames <- specs
mastMap( mapList )

Size distribution

There are some contrasts in distribution on this map. They also appear to differ in size distribution. Here is a histogram of diameters for the pines:

treeData <- dataList$treeData  # less typing!
pines <- rownames( taxa )[ taxa$genus == 'Pinus' ]
xlim <- c(0, 80)
wp   <- which( treeData$species %in% pines )
hist( treeData$diam[wp], xlim = xlim, nclass = 20, main = 'Pines' )

Here are the oaks:

oaks <- rownames( taxa )[ taxa$genus == 'Quercus' ]
wq   <- which( treeData$species %in% oaks )
hist( treeData$diam[wq], xlim = xlim, nclass = 20, main = 'Oaks' )

Here is sweet gum:

par( mfrow = c(1,3), bty = 'n', omi = c(.5,.5,.1,.1) )
hist( treeData$diam[wp], xlim = xlim, nclass = 20, main = 'Pines', xlab = '', ylab = '' )
hist( treeData$diam[wq], xlim = xlim, nclass = 20, main = 'Oaks', xlab = '', ylab = '' )
ws   <- which( treeData$species == 'liquidamStyracif' )
hist( treeData$diam[ws], xlim = xlim, nclass = 20, main = 'Sweet gum', xlab = '', ylab = '' )
mtext( 'Diameter (cm)', 1, outer = T )
mtext( 'Tree years', 2, outer = T )

Dominance

Which species dominant the stand? The data.frame treeData has a row for each tree-year, but I want one row per tree. So I find the maximum diameter for each tree, which should be the most recent:

maxDiam <- tapply( treeData$diam, treeData$tree, max, na.rm = T )
mm      <- match( names(maxDiam), treeData$tree )
trees   <- data.frame( maxDiam, treeData[mm,c('species','shade')] )

The rownames of trees is the tree number.

Here is species dominance by numbers:

stems <- table( trees$species )
stems <- stems[ order(stems, decreasing = T ) ]
plot( stems, type = 's', log = 'xy', ylim = c(1, max(stems) ), xlab = 'Rank', ylab = 'Number' )

#label 12 most abundant
i <- 1:12
text( i, .9*stems[i], names(stems)[i], srt = 90, pos = 2 )

Here is species dominance by basal area:

trees$BA <- pi*(trees$maxDiam/2)^2
tba      <- tapply(trees$BA, trees$species, sum, na.rm = T )
tba      <- tba[ order(tba, decreasing = T ) ]
plot( tba, type = 's', log = 'xy', ylim = c(1, max(tba) ), xlab = 'Rank', ylab = 'Basal Area' )
text( i, .9*tba[i], names(tba)[i], srt = 90, pos = 2 )

Exercise 1: Which could be the most important tree species in this stand? Based on what criteria?

How do species “partition” the environment in space?
What menvironmental variables might explain spatial variation in composition?

Exercise 2: Repeat this exercise with another nearby plot, DUKEEW. Does this added information change the previous answers?

Fecundity estimates

After the field trip we’ll estimate seed production using a Bayesian hierarchical model called mastif (Clark et al. 2019, Qiu et al. 2023). We want to know which species have the largest reproductive effort and how it varies over time. Model fitting is done one genus at a time. This is because many of the seeds recovered from seed traps can be identified to genus level but not to species level. The estimation of seed produced by species is detailed in Clark et al. (2019).

We can see some fruits, nuts, and seeds in trees, but not most of them. We can observe them on the ground, but how do we determine which seeds came from which tree?

mastif uses seed traps and crop counts on trees to estimate seed productivity and dispersion. Attributes of individual trees and their local environments could explain their differences in fecundity. Inference requires information on locations of trees and seed traps and predictors ( covariates and factors ) that could explain source strength.

The mastif model admits two data types, crop counts (above) and seed traps (below). The model for maturation and seed production of each tree-year links to observations through a data model. In the case of seed traps, this data model describes seed transport.

Crop counts enter through the long-term inventory data (treeData) or through the mastif project on iNaturalist. To add observations to mastif you can download the iNaturalist app on a smart phone. In addition to a smart phone, you need a diameter tape. A simple summary is here.

Seed traps accumulate seeds that can be counted used to estimate seed production through a dispersal kernel. They enter through inventory plots in seedData with locations in xytrap.

Many trees are not reproductively mature, and reproductive status is often unknown. For individuals of unknown maturation status (reproductively mature or not), the maturation status must be estimated together with the conditional fecundity (seed production given the mature state). Currently, maturation status depends only on tree diameter (the variable diam).

Predictors of fecundity include individual traits, site characteristics, climate, synchronicity with other trees, and lag effects. For studies that involve multiple plots and years, there can be effects of climate between sites and through time. There may be synchronicity between individuals both within and between plots.

The model incorporates two types of redistribution. The first type is a redistribution from discrete species space to a different seed-type space. Not all seed types can be definitely assigned to species. The contribution of trees of each species to multiple seed types must be estimated. As an example, a plot may include trees of these species in the same genus: pinusTaeda, pinusEchinata. The seeds recovered in traps from the same genus could include any combination of the folling types: pinusTaeda, pinusEchinata, pinusUNKN. The last seed type, identified only to the genus Pinus, could dominate seed counts. The seed production by species must estimate how they are distributed across the observed seed types. This redistribution is estimated as a specNames-to-seedNames matrix.

The second redistribution comes from source trees to seed traps, or seed dispersal. Within a plot-year there is dependence between all seed traps induced by a redistribution kernel, which describes how seeds produced by trees are dispersed in space. Because ‘seed shadows’ of trees overlap, seeds are assigned to trees only in a probabilistic sense. The capacity to obtain useful estimates depends on the number of trees, the number of seed traps, the dispersal distances of seeds, the taxonomic specificity of seed identification, and the number of species contributing to a given seed type.

In addition to predictors, the process model for fecundity can involve random tree effects, random group effects ( e.g., species-location ), year effects, and autoregressive lags limited only by the duration of data. Dispersal estimates can include species and site differences (Clark et al. 2013).

`mastif` package

A full description of mastif is here. The figure below summarizes the main elements. The function mastSim is used to simulate a data set. This allows us to check on data prediction and parameter recovery.

Inputs and outputs (grey) for three main functions (yellow) in the mastif package.

The function mastif fits the model to data held in four data.frames called treeData (tree-year by variables), seedData (seedTrap-year by seed counts), xytree (tree locations), and xytrap (seed-trap locations). The function mastif generates output as a list.

The function mastPlot generates plots of diagnostics.

The model requires a minimum of six inputs directly as arguments to the function mastif. There are two formulas, two character vectors, and four data.frames:

Table 2. Required in the list inputs to the function mastif

object	`mode`	components
`formulaFec`	`formula`	fecundity model
`formulaRep`	`formula`	maturation model (currently fixed at `~ diam`
`specNames`	`character vector`	names in column `species` of `treeData` to include in model
`seedNames`	`character vector`	names of seed types to include in model: match column names in `seedData`. Seeds identified only to genus are `genusUNKN`
`treeData`	`data.frame`	tree-year by variables, includes columns `plot`, `tree`, `year`, `diam`, `repr`
`seedData`	`data.frame`	trap-year by seed types, includes columns `plot`, `trap`, `year`, one for each `seedNames`
`xytree`	`data.frame`	tree by location, includes columns `plot`, `tree`, `x`, `y`
`xytrap`	`data.frame`	trap by location, includes columns `plot`, `trap`, `x`, `y`

I introduce additional inputs in the sections that follow, but start with this basic model.

Here is code to extract species from one genus

load( dataFile, verbose = F )
genus <- 'fraxinus'
tmp   <- extractGenus( inputs, genus = genus, taxa, plotData )
dataList <- tmp$inputs
plotData <- tmp$plotData
taxa     <- tmp$taxa

Here is the trait table that has been trimmed to members of this genus on the plot:

	genus	family	leaves	fruit	pollination	dispersal	gmPerSeed	seedsPerFruit
fraxinusAmerican	fraxinus	Oleaceae	broad_deciduous	samara	wind	anemochory	0.0454	1
fraxinusPennsylv	fraxinus	Oleaceae	broad_deciduous	samara	wind	anemochory	0.0317	1

Add predictors to `treeData`

I want to estimate the effects of environment on seed production, so I need to add those predictors. The treeData data.frame already contains some tree-year variables, including diam and shade. Here I add plot variables to treeData from the data.frame plotData using the match function:

plotVars <- c( 'ecoRegWWF', 'slope', 'aspect', 'cec30', 'n30', 'ph', 'tpi' )
mm       <- match( dataList$treeData$plot, plotData$plot )
dataList$treeData <- cbind( dataList$treeData, plotData[mm, plotVars ] )

Here is the top of the data.frame treeData in the list dataList:

plot	tree	species	diam	shade	tempAnom	defAnom	lateFreezeAnom	tempSite	defSite	slope	aspect	cec30	n30	ph	tpi
CWT427	439	fraxinusAmerican	44.7	5	-1.006	292	-3004.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	-0.748	-447	-1193.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	-0.548	81	43530.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	-1.148	-28	-169.8	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	-0.981	-152	-15720.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	0.619	110	-1012.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	0.194	238	-15040.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	-0.431	311	20970.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	-0.198	236	-8444.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51
CWT427	439	fraxinusAmerican	44.7	5	0.069	210	5462.0	11.48	-130	18.88	26.76	332.15	5.75	5.1	-51

You’ll see that some predictors are plot (e.g., tempSite, defSite, cec), plot-year (e.g., tempAnom, defAnom), tree (e.g., species) or tree-year (e.g., diam). If I were using only one plot for this analysis, I could not include any plot-level predictors; they would not vary across trees on the same plot.

Year effects

There can be year effects to allow for “masting”, the tendency for individuals of the same species to produce large seed crops quasi-synchronously. There can also be detectable effects of climate variation, which enters the model as annual anomalies. In this example, I include defAnom, which is the annual anomaly in moisture deficit. Positive values for defAnom are wet years, and vice versa. In this case the yearEffect represents variation that is shared across individuals that is not explained by defAnom. The year effects tend to be “regional”, but not global. For example, the mast years for a species in New England generally do not match those in North Carolina. Here I define yearEffect by ecoRegWWF and species (quercusAlba can differ from quercusRubra).

dataList$yearEffect <- list( groups = c( 'species', 'ecoRegWWF' ) )

Random individual effects

Individuals can vary widely in seed production, so we use random tree effects, or treeID. These are specified as a random intercept:

dataList$randomEffect <- list( randGroups = 'treeID', formulaRand = as.formula( ~ 1 ) )

I try a combination of tree-year and plot-year variables for conditional fecundity. Here are some formulas and a specification to generate seed-rain prediction for the plot = DUKEBW for the mapYears = c(2018:2021):

# assign formulas
dataList$formulaFec <- formulaFec <- as.formula( ~ diam + shade + defAnom + tpi )
dataList$formulaRep <- as.formula( ~ diam )

# plot-years to predict seed rain
mapYears <- c(2000:2021)
dataList$predList <- list( mapMeters = 10, plots = plot, years = mapYears )

Model fitting requires specification of the number of MCMC iterations ng and the burnin. Here is an analysis using the dataList with a small number of iterations. Still, it will take some time due to the large numbers of tree years and seed traps.

output   <- mastif( inputs = dataList, ng = 2000, burnin = 1000 )
mastPlot( output )

Here are some plots from mastPlot:

The fitted model predicts the data from latent state estimates (c) and from predictors (a).

Recall that fecundity is not observed, but rather estimated. Therefore I cannot compare predicted fecundity with observed fecundity. However, I can compare the seed counts (per m2) with seed counts predicted from the estimated fecundity, or fecundity estimate –> seed traps. This is part (c) of the figure above. If the estimates of fecundity and seed dispersal are confident, then this plot would tend to the 1:1 line.

I can also compare seed counts to the prediction from the full fitted model, or predictors –> fecundity –> seed traps (part a in the figure). If the predictors explain much of variation in fecundity, and dispersal is confidently estimated by the model, then this plot will tend to the 1:1 line. The differences between these plots can indicate that a breakdown in fit comes at the stage of estimating fecundity from the predictors–the predictors may not explain much variation in fecundity, even if there are confident estimates of fecundity.

Estimated year effects for two species in two ecoregions.

In the plot above, the year effects are grouped by species and ecoregion. There are two species here, and the coverage differs by year and ecoregion. This is not a strong masting genus, and the year effects are noisy, not much different from zero.

Effect of predictors on fecundity, including shade class, annual deficit anomaly (negative is wet), and the topographic position index (negative is low).

Shade generally has a negative effect on fecundity. We’ve known this to be true for growth, but is effects on fecundity of trees in the wild was not known.

The negative effects of defAnom means that both species have reduced seed production in years with high moisture deficit. In other words, moist years are good.

The negative effects of tpi, or topographic position index, on fraxinusAmerican means that seed production is higher in the moist bottomlands. The second species, fraxinusPennsylv has a smaller sample size in the data set; it does not show a tpi effect.

Predicted maturation probability and fecundity by diameter for two species.

The model predicts maturation and fecundity for each species. It depends on the distribution of data, shown as histograms in the right panels.

Here is the prediction surface for seed supply to the forest floor by all Fraxinus trees on the plot:

output$mapPlot   <- plot
output$mapYears  <- mapYears
output$trapScale <- .8 
output$PREDICT <- T
mastMap( output )

Predicted seed rain across the DUKEBW plot differs by year. Circles show tree diameter. Boxes are scaled to seed counts in traps. The + symbol indicates a seed trap that collected no seed in that year.

Note how the seed supply to the forest floor changes from one year to the next. If this were a strong masting species, the yearly contrasts would be much larger. Even so, there are areas receiving abundant seed and others not at all.

A volatile species

Seed production of the genus Quercus can vary widely from year to year. Many consumers benefit from high mast years. Because there are many Quercus species, and they may not be synchronized across species, it’s possible that there may be some compensation, in the sense that a poor year for one species may coincide with good years in others. Consumers capable of consuming the fruits of more than one tree species may respond to the sum.

I fitted the model to the genus Quercus and summarize the annual variation by species and ecoregion. Values have a predictive coefficient of variation greater than 1 are omitted, so there are some straight lines connecting years across these poorly estimated years. I represent the variation within a species-ecoregion in two ways.

The estimated year effects (+/- 1 SE) add to the effects of other annual effects in the model, such as climate anomalies.

The estimated year effects in the model amplify the effects of other predictors that contribute inter-annual variation. For this reason, the year effects can be viewed as variation shared between individuals beyond the effects other predictors in the model. Wide standard errors for the most recent years are due to fact that many samples are not yet analyzed.

The predicted inter-annual variation in seed production per cm2 of the tree’s basal area; it attempts to adjust for size differences.

The interannual variation per basal area shows wide variation across trees, because synchronicity in most years is weak. Unlike the year effect estimates, these estimates incorporate all variation, including unexplained variation.

Predicted seed rain of Quercus across the DUKEBW plot. Circles are trees scaled to diameter. Squares are seed traps scale to seed recovery. Prediction surfaces follow the scale at bottom.

The variation is spatio-temporal. For example, at the DUKEBW plot shows the tendency for most seed to accumulate in the southwestern part of the plot, which is high elevation and dominated by oaks. However, the maps can shift between years based on the interannual variation within and between species.

Citations

Clark, JS, C Nunes, and B Tomasek. 2019. Masting as an unreliable resource: spatio-temporal host diversity merged with consumer movement, storage, and diet. Ecological Monographs, 89:e01381.

Journe, V., et al., 2022. Globally, tree fecundity exceeds productivity gradients. Ecology Letters, DOI: 10.1111/ele.14012.

Qiu T., et al., 2022. Limits to reproduction and seed size-number tradeoffs that shape forest dominance and future recovery. Nature Communications, 13:2381.

Duke Forest Field Trip
Data and logistics
Duke University

Jim Clark

2024-01-22

Some Field trip logistics

Some inventory data

Sample effort

Taxa

Examine a long-term plot

Spatial distribution

Size distribution

Dominance

Fecundity estimates

`mastif` package

Add predictors to `treeData`

Year effects

Random individual effects

A volatile species

Citations

Duke Forest Field Trip Data and logistics Duke University

Jim Clark

2024-01-22

Some Field trip logistics

Some inventory data

Sample effort

Taxa

Examine a long-term plot

Spatial distribution

Size distribution

Dominance

Fecundity estimates

mastif package

Add predictors to treeData

Year effects

Random individual effects

A volatile species

Citations

Duke Forest Field Trip
Data and logistics
Duke University

`mastif` package

Add predictors to `treeData`