Use the try() function (per the notes in Hadley's book) to catch mle2() errors: on failure, try() returns an object of class "try-error" and the loop continues.
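A minimal sketch of this error-handling pattern, assuming a hypothetical negative log-likelihood `nll_fun`, start list `start_vals`, and data frame `dat`:

```r
library(bbmle)

fits <- list()
for (sp in unique(dat$species)) {
  sub <- dat[dat$species == sp, ]
  # try() catches any error thrown by mle2() so the loop keeps going
  fit <- try(mle2(nll_fun, start = start_vals, data = sub), silent = TRUE)
  if (inherits(fit, "try-error")) {
    message("mle2() failed for ", sp)
    next
  }
  fits[[sp]] <- fit
}
```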
Species counts of small trees:
| species | n |
|---|---|
| Black Cherry | 4711 |
| Red Maple | 3947 |
| Witch Hazel | 2240 |
| Service Berry | 1266 |
| American Elm | 405 |
| Flowering Dogwood | 293 |
| Sassafras | 239 |
| Pignut Hickory | 219 |
| Hophornbeam | 171 |
| Autumn Olive | 116 |
| White Oak | 106 |
| Shagbark Hickory | 45 |
| American Beech | 44 |
| Black Oak | 40 |
| American Basswood | 34 |
| Black/Red Oak hybrid | 26 |
| Choke Cherry | 26 |
| Black/Northern Pin hybrid | 19 |
| Red Oak | 16 |
| White Ash | 9 |
| Bitternut Hickory | 8 |
| Big Tooth Aspen | 5 |
| Sugar Maple | 5 |
| Musclewood | 3 |
| Black Walnut | 1 |
| Honeysuckle | 1 |
| Yellow Birch | 1 |
We focus on the smaller American Elms (\(\text{dbh}_{08} < 10\) cm).
Disparity in MLE results: following “On Best Practice Optimization Methods in R” (Nash 2014), I switched from plain optim() to mle2() from the bbmle package, which has a much better interface and clearer indicators when the algorithm fails to converge.
We should still be careful to use multiple start values. Also, Nelder-Mead is not derivative-based; consider sticking to BFGS (a sketch of this follows).
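One way to combine multiple start values with a BFGS fit, keeping the best converged result. This is a sketch: `nll_fun` and the start lists are hypothetical, while the `@details$convergence` code and `@min` value are standard slots of the mle2 object:

```r
starts <- list(
  list(lambda = 4, max_growth = 0.25, crowd1 = 1, crowd2 = 1, sigma = 0.2),
  list(lambda = 2, max_growth = 0.50, crowd1 = 2, crowd2 = 1, sigma = 0.2)
)
fits <- lapply(starts, function(s)
  try(mle2(nll_fun, start = s, method = "BFGS"), silent = TRUE))
# keep only fits that ran and converged (optim convergence code 0)
ok <- Filter(function(f) !inherits(f, "try-error") &&
                         f@details$convergence == 0, fits)
# best fit = smallest negative log-likelihood
best <- ok[[which.min(sapply(ok, function(f) f@min))]]
```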
When considering only small trees, we set \(\gamma=0\), so that \(\text{dbh}_{\text{focal}}^{\gamma}=1\) in:
\[
\begin{align*}
\text{ActualGrowth} &\sim \text{Normal}\left(\mu = \text{ExpGrowth}, \sigma^2\right)\\
\mu = \text{ExpGrowth} &= \text{MaxGrowth}\times\text{CrowdEffect}\\
\text{CrowdEffect} &= \exp\left( -\text{crowd}_1 \left( \frac{\text{NCI}}{\text{NCI}_{\max}} \right)^{\text{crowd}_2}\right)\\
\text{NCI} &= \text{dbh}_{\text{focal}}^{\gamma} \sum_{i=1}^{s}\lambda_{i}\sum_{j=1}^{n_i}\frac{\text{dbh}_{ij}^{\alpha}}{\text{dist}_{ij}^{\beta}}
\end{align*}
\]
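For concreteness, a minimal sketch of the negative log-likelihood implied by these equations in the small-tree case (\(\gamma=0\)), with \(\alpha\) and \(\beta\) held fixed so the per-group NCI terms can be precomputed; all names here are hypothetical:

```r
# nci_terms: matrix with one row per focal tree and one column per neighbor
# group g, holding sum_j dbh_ij^alpha / dist_ij^beta (alpha, beta fixed)
nll_growth <- function(lambda, max_growth, crowd1, crowd2, sigma,
                       nci_terms, actual_growth) {
  nci   <- as.vector(nci_terms %*% lambda)        # NCI per focal tree
  crowd <- exp(-crowd1 * (nci / max(nci))^crowd2) # normalized by NCI_max
  mu    <- max_growth * crowd                     # expected growth
  -sum(dnorm(actual_growth, mean = mu, sd = sigma, log = TRUE))
}
```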
Because we normalize the \(\text{NCI}\) by \(\text{NCI}_{\max}\), there are identifiability issues with the \(\lambda_i\)’s: they are only determined up to a multiplicative constant. Compare the two sets of estimates below, obtained from different start values; the \(\lambda_i\)’s agree up to a constant of roughly 2.2 (a quick numeric check follows the second table).
| | \(\lambda_{\text{dogwood}}\) | \(\lambda_{\text{oaks}}\) | \(\lambda_{\text{shrub}}\) | \(\lambda_{\text{juglandaceae}}\) | max_growth | crowd1 | crowd2 | sigma |
|---|---|---|---|---|---|---|---|---|
| Start Value | 4.000 | 4.000 | 4.000 | 4.000 | 0.250 | 1.000 | 1.000 | 0.200 |
| MLE | 4.979 | 9.691 | 4.286 | 0.664 | 3.903 | 4.458 | 0.097 | 0.172 |
| | \(\lambda_{\text{dogwood}}\) | \(\lambda_{\text{oaks}}\) | \(\lambda_{\text{shrub}}\) | \(\lambda_{\text{juglandaceae}}\) | max_growth | crowd1 | crowd2 | sigma |
|---|---|---|---|---|---|---|---|---|
| Start Value | 2.000 | 3.000 | 1.000 | 1.000 | 0.250 | 1.000 | 1.000 | 0.200 |
| MLE | 2.287 | 4.345 | 1.933 | 0.313 | 3.069 | 4.238 | 0.105 | 0.172 |
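A quick check of the multiplicative-constant claim, using the MLE rows from the two tables above:

```r
lam1 <- c(dogwood = 4.979, oaks = 9.691, shrub = 4.286, juglandaceae = 0.664)
lam2 <- c(dogwood = 2.287, oaks = 4.345, shrub = 1.933, juglandaceae = 0.313)
round(lam1 / lam2, 2)
#>      dogwood         oaks        shrub juglandaceae
#>         2.18         2.23         2.22         2.12
```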
The latter model had a log-likelihood of 138.97. We go ahead and fit the fuller model with \(\alpha\), \(\beta\), and \(\gamma\) free: the fit took 11 minutes and gave a log-likelihood of 160.81.
| | \(\lambda_{\text{dogwood}}\) | \(\lambda_{\text{oaks}}\) | \(\lambda_{\text{shrub}}\) | \(\lambda_{\text{juglandaceae}}\) | max_growth | crowd1 | crowd2 | sigma | alpha | beta | gamma |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Start Value | 2.000 | 3.000 | 1.000 | 1.000 | 0.250 | 1.000 | 1.000 | 0.200 | 2.000 | 2.000 | 0.000 |
| MLE | 0.027 | 31.028 | 9.656 | 0.024 | 5.426 | 4.302 | 0.012 | 0.163 | 3.412 | 5.923 | -21.625 |
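Since the log-likelihood improved by about 22 units with three extra free parameters, a likelihood-ratio test is one way to gauge whether the fuller model is worth it. This sketch assumes the reduced fit fixed \(\alpha=\beta=2\) and \(\gamma=0\) (its start values in the table above), so the models are nested with 3 extra parameters:

```r
ll_reduced <- 138.97  # log-likelihood with alpha, beta, gamma fixed
ll_full    <- 160.81  # log-likelihood with alpha, beta, gamma free
lr_stat <- 2 * (ll_full - ll_reduced)       # 43.68
pchisq(lr_stat, df = 3, lower.tail = FALSE) # ~1.7e-09
```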
We use sp::point.in.polygon() to control for edge effects; a 5 m interior buffer suffices (sketch below).
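A sketch of the edge-effect filter. The data frame `trees` and the buffered-boundary coordinates `buffer_x`/`buffer_y` (the plot boundary shrunk inward by 5 m, constructed elsewhere) are hypothetical names:

```r
library(sp)

# point.in.polygon() returns 0 = outside, 1 = inside, 2 = on an edge,
# 3 = on a vertex
inside <- point.in.polygon(trees$x, trees$y, buffer_x, buffer_y)
focal  <- trees[inside == 1, ]  # keep focal trees >= 5 m from the boundary
```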
We note differences in the MLEs depending on the algorithm used; some species returned errors under BFGS, so I present comparisons for the 6 species that did not. This is a bit hard to digest, so let’s compare by plot. First, compare the values relating to the quality of the fit and the residual noise \(\sigma\) (the solid line is \(y=x\)). For both algorithms the values are very similar; my interpretation is that both fits carry similar predictive signal for those species.
However, when comparing the estimates that govern the modeling of expected growth, we see large differences, despite both sets of values yielding similar prediction scores above. My guess is that some of the parameters are conflated with one another (akin to collinearity in regression), making it difficult to tease apart the effect of one versus another.
We define a measure of fit based on the \(R^2\) of regression:
\[
\begin{align*}
\text{ActualGrowth}_i &\sim \text{Normal}\left(\text{ExpectedGrowth}_i, \sigma^2\right)\\
\epsilon_i &= \text{ActualGrowth}_i - \text{ExpectedGrowth}_i\\
R^2 &= 1 - \frac{\text{Var}\left(\epsilon\right)}{\text{Var}\left(\text{ActualGrowth}\right)}
\end{align*}
\]
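A one-liner for this measure, given vectors of observed and fitted growth (hypothetical names):

```r
pseudo_r2 <- function(actual_growth, expected_growth) {
  eps <- actual_growth - expected_growth  # residuals
  1 - var(eps) / var(actual_growth)
}
```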
Also: these fits used method="Nelder-Mead" and not method="BFGS" in the optim() call. Interestingly, the method="BFGS" results are still pending because they are throwing a lot of errors.