FORS 8050 – Forest Stand Dynamics

Homework #3: Site Index Homework As a forester you need to construct some anamorphic site index curves for your area and species of interest. Using data that was collected over a broad range of ages and sites for loblolly pine plantations across the southeast, construct several sets of site index curves using the different models. The data are in plot summary format (height-age pairs) from PMRC research trials. The data are from PMRC “base” or control treatments from experiments, hence they have not had any herb, fert, thinning or other management beyond site prep and planting.

There are a total of 117 plots with age in years and average height of the dominant and codominant trees in feet. Each plot has multiple measurements over time, ranging from ages 6 to 32 years; heights ranging from about 10 to over 100 feet. At each age, the height presented was determined by averaging the height of the dominant and co-dominant trees in the research plot. The minimum number of trees used to calculate a height average was 5 trees and the max was 62 trees, with an average of 38 trees per measurement age. Only 3 plots, all at older ages, had less than 10 trees used to get the average height of the dominant and co-dominant trees.

  1. Descriptive statistics on the dataset.
  1. Include a graph of the height growth trajectories (connected) of all the plots on one graph (a bit messy of a figure, but illustrative of the general growth trends!)
library(tidyverse)
library(tidyverse)
library(readxl)
library(readr)

SI_data = read_xls("C:/Users/du11505/OneDrive - University of Georgia/Desktop/Forest Stand Dynamics/HW3-Site Index/Hwk 3 - SI - Fall 2025.xls")
summary(SI_data)
##       PLOT             AGE              HD         
##  Min.   :  1.00   Min.   : 6.00   Min.   :  9.655  
##  1st Qu.: 36.00   1st Qu.: 9.00   1st Qu.: 28.833  
##  Median : 65.00   Median :15.00   Median : 45.145  
##  Mean   : 63.11   Mean   :15.06   Mean   : 45.718  
##  3rd Qu.: 91.00   3rd Qu.:21.00   3rd Qu.: 61.514  
##  Max.   :117.00   Max.   :32.00   Max.   :104.182
sum(is.na(SI_data))
## [1] 0
library(ggplot2)
library(dplyr)

# Asegurar tipos
df <- SI_data %>%
  mutate(
    PLOT = factor(PLOT)
  )

ggplot(df, aes(x = AGE, y = HD, group = PLOT)) +
  geom_line(alpha = 0.4, linewidth = 0.6, color = "#1f78b4") +
  labs(
    x = "X (AGE)",
    y = "Y (HD)",
    title = "HD V/S AGE CURVES (117 plots)",
    subtitle = "every line is a plot "
  ) +
  theme_bw(base_size = 12)

b. Other descriptive stats as you see fit.

  1. We discussed possible age-site bias in class extensively. To test for this, break the data up into age-classes (you decide) and then calculate the CV% for the heights in each age class. Determine if the CV% is related to age and if it is appropriate to assume proportionality of heights at all ages (and hence use anamorphic SI curves).
# Funciones de CV
cv_std <- function(x) 100 * sd(x) / mean(x)

# Class: [5,10), [10,15), [15,20), [20,25), [25,30), [30,35)
breaks_5y <- seq(5, 35, by = 5)

df_classes <- df %>%
  mutate(
    age_class = cut(AGE, breaks = breaks_5y, right = FALSE),  # incluye el borde izquierdo
    age_mid   = (as.numeric(sub("\\[(.*),(.*)\\)", "\\1", age_class)) +
                 as.numeric(sub("\\[(.*),(.*)\\)", "\\2", age_class)))/2
  ) %>%
  group_by(age_class) %>%
  summarise(
    n      = sum(HD),
    age_mid = mean(AGE),  # centro promedio real de cada clase
    mean_h = mean(HD),
    sd_h   = sd(HD),
    cv_pct = cv_std(HD),
    .groups = "drop"
  )

df_classes
## # A tibble: 6 × 6
##   age_class      n age_mid mean_h  sd_h cv_pct
##   <fct>      <dbl>   <dbl>  <dbl> <dbl>  <dbl>
## 1 [5,10)     5181.     7.5   22.1  6.72  30.3 
## 2 [10,15)    4546.    12     38.9  5.19  13.3 
## 3 [15,20)   10986.    16.4   51.6  7.27  14.1 
## 4 [20,25)    5575.    21     64.8  5.56   8.57
## 5 [25,30)    4932.    25     72.5  6.88   9.49
## 6 [30,35)    3342.    32     87.9  9.45  10.7

We can clarly see that there is differences in t5he CV for the different age clases, ande in the first class (age between 5 and 10 years) the CV is the highest

  1. Basic SI curve

\[ ln(H) = \beta_0 + \beta_1(1/A) + \epsilon \] where H = average total height of the dominant and codominant trees, ft

A = Age, yrs

b0 , b1 , = coefficients to be estimated

e = error term

df <- df %>%
  mutate(
    inv_A = 1/AGE
    )

df <- df %>%
  mutate(
  ln_HD = log(HD)
    )




 model = lm(ln_HD ~ inv_A, data=df)
summary(model)
## 
## Call:
## lm(formula = ln_HD ~ inv_A, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.48663 -0.08908  0.00160  0.10419  0.44822 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   4.69262    0.01236  379.60   <2e-16 ***
## inv_A       -11.84530    0.13175  -89.91   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1532 on 754 degrees of freedom
## Multiple R-squared:  0.9147, Adjusted R-squared:  0.9146 
## F-statistic:  8084 on 1 and 754 DF,  p-value: < 2.2e-16

then the equation for this fit is:

\[ ln(H) = 4.69262 -11.84530 (1/A) \]

  1. Produce the following two graphs from the data:
  1. Graph of Height by Age (not connected by plot), with Height on the Y-axis and Age on the X-axis
ggplot(df, aes(x = AGE, y = HD)) +
  geom_line(alpha = 0.4, linewidth = 0.6, color = "#1f78b4") +
  labs(
    x = "X (Age)",
    y = "Y (HD)",
    title = "HD V/S Age")+
  theme_bw(base_size = 12)

b. Note any trends you see in the data

  1. Graph of ln(H) by (1/A), with ln(H) on the Y-axis and (1/A) on the X-axis, include a linear trendline on this graph
ggplot(df, aes(x = 1/AGE, y = log(HD))) +
  geom_line(alpha = 0.4, linewidth = 0.6, color = "#1f78b4") +
  labs(
    x = "X (1/Age)",
    y = "Y (ln(HD)",
    title = "ln(HD) V/S 1/Age") +
  theme_bw(base_size = 12)

d. Note any trends you see in the data

  1. Produce a guide-curve from the prediction equation for H. Use ages ranging from 3 to 60 years (in 1-yr intervals).

D

  1. Graph the guide curve (connect the predicted values).
  2. Include a table of the Ages and predicted Height values from the equation
model2 <- nls(HD ~ beta0 * (1 - exp(beta1 * AGE))^beta2,
              data = df,
              start = list(beta0 = 100, beta1 = -0.1, beta2 = 2))
summary(model2)
## 
## Formula: HD ~ beta0 * (1 - exp(beta1 * AGE))^beta2
## 
## Parameters:
##         Estimate Std. Error t value Pr(>|t|)    
## beta0 122.244512   5.866397   20.84   <2e-16 ***
## beta1  -0.048547   0.004495  -10.80   <2e-16 ***
## beta2   1.431268   0.062737   22.81   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.407 on 753 degrees of freedom
## 
## Number of iterations to convergence: 4 
## Achieved convergence tolerance: 4.412e-06