Friday, September 25, 2020

Outline

  1. Dendroband data as time series
  2. Naive forecasting methods and issues
  3. Proposed modeling approach
  4. TODO’s

Dendroband data as time series

Dendroband data

Pictured: One particular litu (082422)

Dendroband data

Same data as in Cam’s July 31 preso:

One particular litu

Raw dendroband measurements. Call these \(D_t^{obs}\)

One particular litu

Take diffs to compute growth. What do you observe?
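
A minimal sketch of the diff step, assuming measurements arrive as (date, diameter) pairs. The function name `growth_rates` and the numbers are hypothetical, not the real litu data. Because the measurement intervals are irregular, the sketch also divides each increment by the number of days elapsed, so that increments over different gaps are comparable.

```python
from datetime import date


def growth_rates(dates, diameters):
    """Return (end_date, increment, increment_per_day) for consecutive measurements."""
    rows = []
    for i in range(1, len(dates)):
        days = (dates[i] - dates[i - 1]).days      # length of this measurement interval
        increment = diameters[i] - diameters[i - 1]  # raw diff, can be negative
        rows.append((dates[i], increment, increment / days))
    return rows


# Hypothetical measurements (mm) at irregular intervals, for illustration only
dates = [date(2019, 5, 1), date(2019, 5, 15), date(2019, 6, 19), date(2019, 7, 3)]
diam = [300.0, 300.7, 303.5, 303.3]

for end, inc, rate in growth_rates(dates, diam):
    print(end, round(inc, 2), round(rate, 3))
```

The last interval in this made-up series has a negative diff, which is the "negative growth" issue: the raw diff mixes true growth with measurement error.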

One particular litu

Observations about data:

  • Exhibits both seasonal pattern and trend
  • Very noisy
  • Issue: Negative growth?
  • Issue: Irregular intervals of measurement

One particular litu

Seasonal pattern (blue) + increasing trend (red)

Naive forecasting methods and issues

Naive forecasts

Say we have observed measurements \(y_1, \ldots, y_t\). Here are some naive forecasts of \(y_{t+1}\):

  1. Naive: \(y_{t+1}\) = last value \(y_{t}\)
  2. Mean: \(y_{t+1}\) = \(\overline{y}\)
  3. Seasonal naive: \(y_{t+1}\) = \(y_{\text{one seasonal period earlier}}\)
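
The three naive forecasts above can be sketched in a few lines. This assumes a regularly spaced series stored as a plain list, with `period` the number of observations per season; the function names are illustrative.

```python
def naive_forecast(y):
    """Forecast y_{t+1} as the last observed value y_t."""
    return y[-1]


def mean_forecast(y):
    """Forecast y_{t+1} as the mean of all observed values."""
    return sum(y) / len(y)


def seasonal_naive_forecast(y, period):
    """Forecast y_{t+1} as the value one seasonal period earlier, y_{t+1-period}."""
    if len(y) < period:
        raise ValueError("need at least one full seasonal period of data")
    return y[-period]


# Toy series with a seasonal period of 3 observations (made-up values)
y = [1, 4, 2, 1, 5, 3]
print(naive_forecast(y))              # 3
print(seasonal_naive_forecast(y, 3))  # 1
```

All three assume evenly spaced observations, which is why the irregular measurement intervals matter for the seasonal naive forecast.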

3. Seasonal naive forecast

Assuming a seasonal period of 1 year, we get (1) a point forecast and (2) prediction intervals that quantify its uncertainty

3. Seasonal naive forecast

(Zoomed in to 2018-2020) Recall, however, the irregular measurement intervals

3. Seasonal naive forecast

I cheated by back-filling in missing values:
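
For reference, a sketch of what back-filling does. It assumes the series has been put on a regular grid with `None` marking grid points that have no measurement; the grid and the helper name `backfill` are assumptions for illustration.

```python
def backfill(values):
    """Replace each None with the next observed value; trailing Nones stay None."""
    filled = list(values)
    next_obs = None
    # Walk backwards so each gap picks up the first observation after it
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_obs
        else:
            next_obs = filled[i]
    return filled


print(backfill([1.0, None, None, 2.5, None]))  # [1.0, 2.5, 2.5, 2.5, None]
```

This is "cheating" in the sense that the filled values are copies of later measurements, so they carry no new information and understate the uncertainty in the gaps.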

Proposed modeling approach

State-space models

A general class of models used to estimate latent (i.e. unobservable) variables

  • Spatial data: Markov random fields
  • Time series: Hidden Markov Model
    • Hidden: Latent variables
    • Markov (property): The current latent state depends on the past only through the immediately preceding state

HMM example (in equations)

\[ \begin{eqnarray*} \text{Data model: } D_t^{obs} &\sim& \text{Normal}(D_t, \tau_{obs})\\ \text{Process model: }D_{t+1} &\sim& \text{Normal}(\beta_0 + \beta_1D_{t}, \tau_{pro})\\ &=& \beta_0 + \beta_1D_{t} + \text{Normal}(0, \tau_{pro})\\ \end{eqnarray*} \] where

  1. \(D_t^{obs}\) are observed diameters
  2. \(D_t, D_{t+1}\) are unobservable latent “true” diameters
  3. \(\tau_{obs}\) controls the spread of the measurement error from dendroband + caliper
  4. \(\tau_{pro}\) controls the spread of the process error, i.e. the residual variation not captured by the process model
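
To make the two equations concrete, here is a small simulation sketch. It treats \(\tau_{obs}\) and \(\tau_{pro}\) as standard deviations (`sd_obs`, `sd_pro`), which is an assumption: BUGS/JAGS-style notation often uses precisions instead. The parameter values are made up.

```python
import random


def simulate(n, beta0, beta1, sd_pro, sd_obs, d0, seed=0):
    """Simulate n latent diameters D_t and their noisy observations D_t^obs."""
    rng = random.Random(seed)
    latent = [d0]
    for _ in range(n - 1):
        # Process model: D_{t+1} = beta0 + beta1 * D_t + Normal(0, sd_pro)
        latent.append(beta0 + beta1 * latent[-1] + rng.gauss(0.0, sd_pro))
    # Data model: D_t^obs = D_t + Normal(0, sd_obs)
    observed = [d + rng.gauss(0.0, sd_obs) for d in latent]
    return latent, observed


latent, observed = simulate(n=10, beta0=0.2, beta1=1.0, sd_pro=0.1, sd_obs=0.3, d0=300.0)
for d, d_obs in zip(latent, observed):
    print(round(d, 2), round(d_obs, 2))
```

Note that process noise enters `latent` and is carried into every later step, while observation noise is added afterwards and never feeds back into the next latent value.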

HMM example (in pictures)

Why Hidden Markov Models?

  1. Separates observation errors (which don’t propagate through time) from process errors (which do)
  2. Missing data: gaps are treated as unobserved latent states
  3. Prediction/forecasting: future observations are treated as missing data
  4. Data fusion (discussed at the end)
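
Points 1 to 3 can be illustrated with a scalar Kalman filter, the standard exact filter for a linear-Gaussian model like the HMM example. This is a swapped-in technique for illustration, not necessarily how the model in this deck will be fit. Assumptions: parameters are known, `var_pro` and `var_obs` are variances, and `None` marks a missing (or future) observation.

```python
def kalman_filter(obs, beta0, beta1, var_pro, var_obs, mean0, var0):
    """Filtered mean and variance of each latent D_t; None in obs means missing."""
    means, variances = [], []
    m, v = mean0, var0  # prior on the first latent state
    for t, y in enumerate(obs):
        if t > 0:
            # Predict: push the previous estimate through the process model.
            # Process error accumulates here, i.e. it propagates.
            m = beta0 + beta1 * m
            v = beta1 ** 2 * v + var_pro
        if y is not None:
            # Update: weigh the observation by its reliability.
            gain = v / (v + var_obs)
            m = m + gain * (y - m)
            v = (1.0 - gain) * v
        # If y is missing, the prediction is kept as is (no update).
        means.append(m)
        variances.append(v)
    return means, variances


# Made-up series: a two-step gap in the middle, two future steps at the end
obs = [10.0, 10.4, None, None, 11.3, None, None]
means, variances = kalman_filter(obs, beta0=0.2, beta1=1.0,
                                 var_pro=0.05, var_obs=0.1, mean0=10.0, var0=1.0)
for y, m, v in zip(obs, means, variances):
    print(y, round(m, 3), round(v, 3))
```

Gaps and future time steps go through exactly the same code path: the filter keeps predicting, and the variance grows until the next observation arrives.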

Why Hidden Markov Models?

Missing data: Two gaps of missing data in red were imputed.

Why Hidden Markov Models?

Predictions of future observations in red:

TODO’s

My plan

  1. Minimally viable model
  2. Next iteration

Minimally viable model

Based on Clark (2007)

  1. Lognormal model for growth because of noisy & negative observed values
  2. Hierarchical process model for \(D_t\) and \(\tau_{pro}\):
    1. Random effects (things to account for): tag, site, species
    2. Fixed effects (things we’re interested in): shared effect of year (climate)
  3. More emphasis on explicit forecasts/predictions
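
One possible way to write such a model down, as an illustrative sketch only. It is not the specification from Clark (2007) and not a final model; the indices (\(i\) for tree/tag, \(t\) for time step) and the symbols \(\mu_{i,t}\), \(a_{\cdot}\), \(\tau_{a}\) are assumptions introduced here.

\[ \begin{eqnarray*} \text{Data model: } D_{i,t}^{obs} &\sim& \text{Normal}(D_{i,t}, \tau_{obs})\\ \text{Process model: } \log(D_{i,t+1} - D_{i,t}) &\sim& \text{Normal}(\mu_{i,t}, \tau_{pro})\\ \mu_{i,t} &=& \beta_0 + \beta_{year(t)} + a_{tag(i)} + a_{site(i)} + a_{species(i)}\\ a_{tag}, a_{site}, a_{species} &\sim& \text{Normal}(0, \tau_{a}) \text{ (one } \tau_a \text{ per grouping)}\\ \end{eqnarray*} \]

Under this sketch the latent increments \(D_{i,t+1} - D_{i,t}\) are positive by construction, while the observed diffs can still be negative through the observation error in the data model.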

Next iteration

Next iteration: Bayesian methods allow for data fusion of disparate data sources. Ex: data sources that have

  1. Different observed data types
  2. Different time scales
  3. Different error structures

Data fusion

  1. Diameter censuses
    1. Diameters (same as dendrobands)
    2. Every 5 years
    3. Errors from tape/calipers
  2. Ring width from tree coring
    1. Increments, not diameters
    2. Time scale?
    3. Errors from dendrochronology
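
The simplest instance of data fusion, as a hedged sketch: two independent Gaussian measurements of the same diameter, each with its own error variance, combined by precision weighting (flat prior assumed). The numbers and the function name `fuse` are made up. The full problem is harder than this, since ring widths are increments rather than diameters and the time scales differ.

```python
def fuse(estimates):
    """Combine independent Gaussian measurements [(value, variance), ...] of one quantity.

    Returns the precision-weighted mean and its variance.
    """
    precision = sum(1.0 / var for _, var in estimates)
    mean = sum(val / var for val, var in estimates) / precision
    return mean, 1.0 / precision


# Hypothetical: precise dendroband reading vs. noisier census reading (mm)
dendroband = (300.0, 0.04)
census = (301.0, 1.0)
mean, var = fuse([dendroband, census])
print(round(mean, 3), round(var, 4))
```

The fused estimate sits close to the more precise source, and its variance is smaller than either source's variance alone.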

Data Fusion (Clark 2007)

Yays for Bayes