2016-07-23

Use the try() function (see the notes in Hadley’s book) to catch mle2() errors. When the wrapped call fails, try() returns an object of class try-error instead of stopping, so the loop continues.
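A minimal sketch of the pattern, in base R only. Here `fit_species()` is a hypothetical stand-in for the per-species mle2() call, and the choice of which species fails is arbitrary, purely to trigger an error.

```r
# Hypothetical stand-in for a per-species mle2() fit.
# It fails on purpose for one (arbitrarily chosen) species.
fit_species <- function(sp) {
  if (sp == "Red Oak") stop("optimizer failed for ", sp)
  list(species = sp, ok = TRUE)  # placeholder for a fitted model object
}

species <- c("Black Cherry", "Red Oak", "Red Maple")
fits <- vector("list", length(species))
names(fits) <- species

for (sp in species) {
  # try() returns the result on success, or a "try-error" object on failure,
  # so one bad species does not abort the whole loop
  fits[[sp]] <- try(fit_species(sp), silent = TRUE)
}

# Flag the species whose fit failed
failed <- sapply(fits, inherits, what = "try-error")
names(fits)[failed]
```

The failed fits can then be dropped or refit with different start values, while the successful ones are kept as usual.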

Species counts of small trees:

species n
Black Cherry 4711
Red Maple 3947
Witch Hazel 2240
Service Berry 1266
American Elm 405
Flowering Dogwood 293
Sassafras 239
Pignut Hickory 219
Hophornbeam 171
Autumn Olive 116
White Oak 106
Shagbark Hickory 45
American Beech 44
Black Oak 40
American Basswood 34
Black/Red Oak hybrid 26
Choke Cherry 26
Black/Northern Pin hybrid 19
Red Oak 16
White Ash 9
Bitternut Hickory 8
Big Tooth Aspen 5
Sugar Maple 5
Musclewood 3
Black Walnut 1
Honeysuckle 1
Yellow Birch 1

2016-06-14

Focusing on smaller American Elms (\(\text{dbh}_{08}<10\) cm)

Maximum Likelihood Estimation Routines.

There is a disparity in the MLE results. Following Nash (2014), “On Best Practice Optimization Methods in R”, I switched from calling optim() directly to using mle2() from the bbmle package. It has a much better interface and gives clearer indicators when the algorithm didn’t converge.

We should still be careful to try multiple start values. Also, Nelder-Mead is not derivative-based, so consider sticking to BFGS.
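A sketch of the multiple-start check, using base R optim() on a toy Normal likelihood so that it runs without extra packages. The data and start values are made up for illustration; the same loop would wrap the real fitting call.

```r
set.seed(1)
y <- rnorm(200, mean = 0.3, sd = 0.17)  # simulated data, not the tree data

# Negative log-likelihood; sigma is kept positive by working on the log scale
nll <- function(par) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Hypothetical start values: (mu, log sigma)
starts <- list(c(0, -1), c(0.5, -2), c(1, 0))

fits <- lapply(starts, function(s) optim(s, nll, method = "BFGS"))

results <- data.frame(
  start_mu    = sapply(starts, `[`, 1),
  mu          = sapply(fits, function(f) f$par[1]),
  sigma       = sapply(fits, function(f) exp(f$par[2])),
  nll         = sapply(fits, function(f) f$value),
  convergence = sapply(fits, function(f) f$convergence)  # 0 = optim reports success
)
print(results)
```

If the rows disagree, or any convergence code is non-zero, that is a sign the reported MLE depends on where the optimizer started.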

\(\lambda\) values

When considering only small trees, we set \(\gamma=0\), so that \(\text{dbh}_{\text{focal}}^{\gamma}=1\) in:

\[ \begin{align*} \text{ActualGrowth} &\sim \text{Normal}\left(\mu = \text{ExpGrowth}, \sigma^2\right)\\ \mu = \text{ExpGrowth} &= \text{MaxGrowth}\times\text{CrowdEffect}\\ \text{CrowdEffect} &= \exp\left( -\text{crowd}_1 \left( \frac{\text{NCI}}{\text{NCI}_{\max}} \right)^{\text{crowd}_2}\right)\\ \text{NCI} &= \text{dbh}_{\text{focal}}^{\gamma} \sum_{i=1}^{s}\lambda_{i}\sum_{j=1}^{n_i}\frac{\text{dbh}_{ij}^{\alpha}}{\text{dist}_{ij}^{\beta}} \end{align*} \]

Because we normalize \(\text{NCI}\) by \(\text{NCI}_{\max}\), the \(\lambda_i\)’s are not identifiable: multiplying all of them by the same constant leaves \(\text{NCI}/\text{NCI}_{\max}\) unchanged, so they are only determined up to a multiplicative constant. Compare the two sets of estimates below, obtained from different start values. The two sets of \(\lambda_i\)’s differ by a roughly constant factor of \(\sim 2.2\) (the ratios range from 2.12 to 2.23).

\(\lambda_{\text{dogwood}}\) \(\lambda_{\text{oaks}}\) \(\lambda_{\text{shrub}}\) \(\lambda_{\text{juglandaceae}}\) max_growth crowd1 crowd2 sigma
Start Value 4.000 4.000 4.000 4.000 0.250 1.000 1.000 0.200
MLE 4.979 9.691 4.286 0.664 3.903 4.458 0.097 0.172
\(\lambda_{\text{dogwood}}\) \(\lambda_{\text{oaks}}\) \(\lambda_{\text{shrub}}\) \(\lambda_{\text{juglandaceae}}\) max_growth crowd1 crowd2 sigma
Start Value 2.000 3.000 1.000 1.000 0.250 1.000 1.000 0.200
MLE 2.287 4.345 1.933 0.313 3.069 4.238 0.105 0.172
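A quick check of the scaling argument. The \(\lambda\) values are the MLEs from the two tables above; the competition sums are simulated, and \(\text{NCI}_{\max}\) is assumed here to be the maximum NCI over the focal trees.

```r
# MLEs copied from the two tables above
lambda_a <- c(dogwood = 4.979, oaks = 9.691, shrub = 4.286, juglandaceae = 0.664)
lambda_b <- c(dogwood = 2.287, oaks = 4.345, shrub = 1.933, juglandaceae = 0.313)

# Element-wise ratios between the two fits
round(lambda_a / lambda_b, 2)

# Simulated (hypothetical) per-group sums of dbh^alpha / dist^beta
# for 5 focal trees, one column per species group
set.seed(42)
comp <- matrix(runif(20), nrow = 5, dimnames = list(NULL, names(lambda_a)))

# NCI / NCI_max, assuming NCI_max is the largest NCI among the focal trees
nci_ratio <- function(lambda) {
  nci <- as.vector(comp %*% lambda)
  nci / max(nci)
}

# Rescaling every lambda by the same constant does not change NCI / NCI_max
all.equal(nci_ratio(lambda_b), nci_ratio(2.2 * lambda_b))
```

One way around this would be to fix one \(\lambda_i\) (say at 1) and estimate the others relative to it, though that is a modelling choice not made in these notes.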

More Parameters

The latter model had log-likelihood statistic = 138.97. We go ahead and fit with more parameters (\(\alpha\), \(\beta\) and \(\gamma\)): 11 mins, log-likelihood statistic = 160.81.

\(\lambda_{\text{dogwood}}\) \(\lambda_{\text{oaks}}\) \(\lambda_{\text{shrub}}\) \(\lambda_{\text{juglandaceae}}\) max_growth crowd1 crowd2 sigma alpha beta gamma
Start Value 2.000 3.000 1.000 1.000 0.250 1.000 1.000 0.200 2.000 2.000 0.000
MLE 0.027 31.028 9.656 0.024 5.426 4.302 0.012 0.163 3.412 5.923 -21.625

Action Items

  • Investigate constrained optimization for the MLEs, e.g. \(\lambda>0\)
  • Use sp::point.in.polygon to control for edge effects; a 5 m boundary suffices
  • Add code to repository and share with Dave
  • Start thinking about incorporating all species

2016-05-18

We note differences in the MLE’s depending on the algorithm we used. Some of the species returned errors for BFGS, so I present comparisons on the 6 species that didn’t return errors.


This is a bit hard to digest, so let’s compare by plot. First let’s compare values relating to the quality of the fit and the residual noise \(\sigma\) (the solid line is \(y=x\)). We see that for both algorithms, the values are very similar. My interpretation of this is that there is similar predictive signal for those species.

However, when comparing estimates that relate to the modeling of expected growth, we see large differences, despite both sets of values leading to similar prediction scores above. My guess is that some of the parameters are conflated with each other (kind of like collinearity in regression), thereby making it difficult to tease out the effect of one vs the other.
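One way to probe this guess is to look at the correlation between parameter estimates implied by the Hessian at the optimum. The sketch below uses a toy two-parameter growth curve with simulated data (not the fitted model from these notes) in which max growth and the crowding coefficient can largely compensate for each other.

```r
set.seed(3)
x <- runif(200, 0.5, 1)                             # hypothetical crowding index
y <- rnorm(200, mean = 3 * exp(-1 * x), sd = 0.17)  # simulated growth

# par = (max_growth, crowd1, log sigma)
nll <- function(par) {
  mu <- par[1] * exp(-par[2] * x)
  -sum(dnorm(y, mean = mu, sd = exp(par[3]), log = TRUE))
}

fit <- optim(c(2, 0.5, log(0.5)), nll, method = "BFGS", hessian = TRUE)

# Approximate covariance of the estimates = inverse Hessian of the
# negative log-likelihood; convert it to a correlation matrix
vc <- solve(fit$hessian)
cor_mat <- cov2cor(vc)
par_names <- c("max_growth", "crowd1", "log_sigma")
dimnames(cor_mat) <- list(par_names, par_names)
round(cor_mat, 2)
```

An off-diagonal entry close to 1 or -1 means the data cannot separate the two parameters well, which would be consistent with different algorithms landing on different estimates with similar fit.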

2016-05-17

We define a measure of fit, which is based on \(R^2\) for regression.

\[ \begin{eqnarray*} \text{ActualGrowth}_i &\sim& \text{Normal}\left(\text{ExpectedGrowth}_i, \sigma^2\right)\\ \epsilon_i &=& \text{ActualGrowth}_i - \text{ExpectedGrowth}_i\\ R^2 &=& 1 - \frac{\text{Var}\left(\epsilon\right)}{\text{Var}\left(\text{ActualGrowth}\right)} \end{eqnarray*} \]
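The definition above translates directly into a small helper. The function name `r_squared` and the example numbers are made up for illustration.

```r
# R^2 as defined above: 1 - Var(residuals) / Var(actual growth)
r_squared <- function(actual, expected) {
  eps <- actual - expected
  1 - var(eps) / var(actual)
}

actual <- c(0.10, 0.25, 0.40, 0.55, 0.30)  # made-up growth values

r_squared(actual, actual)                             # perfect prediction
r_squared(actual, rep(mean(actual), length(actual)))  # predicting the mean
r_squared(actual, actual + 5)                         # constant offset
```

Because the measure uses the variance of the residuals rather than their mean square, a constant bias in the expected growth does not lower it, so it is worth checking the mean residual separately.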

Also

Results

Observations

  • We get horrible fits for oaks
  • None of the lambdas move far from the optimizer’s start value of 1, yet we get decent \(R^2\) values
  • The Serviceberry and Sassafras max growth values don’t make much sense.