Homework #3: Site Index Homework As a forester you need to construct some anamorphic site index curves for your area and species of interest. Using data that was collected over a broad range of ages and sites for loblolly pine plantations across the southeast, construct several sets of site index curves using the different models. The data are in plot summary format (height-age pairs) from PMRC research trials. The data are from PMRC “base” or control treatments from experiments, hence they have not had any herb, fert, thinning or other management beyond site prep and planting.
There are a total of 117 plots with age in years and average height of the dominant and codominant trees in feet. Each plot has multiple measurements over time, ranging from ages 6 to 32 years; heights ranging from about 10 to over 100 feet. At each age, the height presented was determined by averaging the height of the dominant and co-dominant trees in the research plot. The minimum number of trees used to calculate a height average was 5 trees and the max was 62 trees, with an average of 38 trees per measurement age. Only 3 plots, all at older ages, had less than 10 trees used to get the average height of the dominant and co-dominant trees.
library(tidyverse)
library(tidyverse)
library(readxl)
library(readr)
SI_data = read_xls("C:/Users/du11505/OneDrive - University of Georgia/Desktop/Forest Stand Dynamics/HW3-Site Index/Hwk 3 - SI - Fall 2025.xls")
summary(SI_data)
## PLOT AGE HD
## Min. : 1.00 Min. : 6.00 Min. : 9.655
## 1st Qu.: 36.00 1st Qu.: 9.00 1st Qu.: 28.833
## Median : 65.00 Median :15.00 Median : 45.145
## Mean : 63.11 Mean :15.06 Mean : 45.718
## 3rd Qu.: 91.00 3rd Qu.:21.00 3rd Qu.: 61.514
## Max. :117.00 Max. :32.00 Max. :104.182
sum(is.na(SI_data))
## [1] 0
library(ggplot2)
library(dplyr)
# Asegurar tipos
df <- SI_data %>%
mutate(
PLOT = factor(PLOT)
)
ggplot(df, aes(x = AGE, y = HD, group = PLOT)) +
geom_line(alpha = 0.4, linewidth = 0.6, color = "#1f78b4") +
labs(
x = "X (AGE)",
y = "Y (HD)",
title = "HD V/S AGE CURVES (117 plots)",
subtitle = "every line is a plot "
) +
theme_bw(base_size = 12)
b. Other descriptive stats as you see fit.
# Funciones de CV
cv_std <- function(x) 100 * sd(x) / mean(x)
# Class: [5,10), [10,15), [15,20), [20,25), [25,30), [30,35)
breaks_5y <- seq(5, 35, by = 5)
df_classes <- df %>%
mutate(
age_class = cut(AGE, breaks = breaks_5y, right = FALSE), # incluye el borde izquierdo
age_mid = (as.numeric(sub("\\[(.*),(.*)\\)", "\\1", age_class)) +
as.numeric(sub("\\[(.*),(.*)\\)", "\\2", age_class)))/2
) %>%
group_by(age_class) %>%
summarise(
n = sum(HD),
age_mid = mean(AGE), # centro promedio real de cada clase
mean_h = mean(HD),
sd_h = sd(HD),
cv_pct = cv_std(HD),
.groups = "drop"
)
df_classes
## # A tibble: 6 × 6
## age_class n age_mid mean_h sd_h cv_pct
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 [5,10) 5181. 7.5 22.1 6.72 30.3
## 2 [10,15) 4546. 12 38.9 5.19 13.3
## 3 [15,20) 10986. 16.4 51.6 7.27 14.1
## 4 [20,25) 5575. 21 64.8 5.56 8.57
## 5 [25,30) 4932. 25 72.5 6.88 9.49
## 6 [30,35) 3342. 32 87.9 9.45 10.7
We can clarly see that there is differences in t5he CV for the different age clases, ande in the first class (age between 5 and 10 years) the CV is the highest
\[ ln(H) = \beta_0 + \beta_1(1/A) + \epsilon \] where H = average total height of the dominant and codominant trees, ft
A = Age, yrs
b0 , b1 , = coefficients to be estimated
e = error term
df <- df %>%
mutate(
inv_A = 1/AGE
)
df <- df %>%
mutate(
ln_HD = log(HD)
)
model = lm(ln_HD ~ inv_A, data=df)
summary(model)
##
## Call:
## lm(formula = ln_HD ~ inv_A, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.48663 -0.08908 0.00160 0.10419 0.44822
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.69262 0.01236 379.60 <2e-16 ***
## inv_A -11.84530 0.13175 -89.91 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1532 on 754 degrees of freedom
## Multiple R-squared: 0.9147, Adjusted R-squared: 0.9146
## F-statistic: 8084 on 1 and 754 DF, p-value: < 2.2e-16
then the equation for this fit is:
\[ ln(H) = 4.69262 -11.84530 (1/A) \]
ggplot(df, aes(x = AGE, y = HD)) +
geom_line(alpha = 0.4, linewidth = 0.6, color = "#1f78b4") +
labs(
x = "X (Age)",
y = "Y (HD)",
title = "HD V/S Age")+
theme_bw(base_size = 12)
b. Note any trends you see in the data
ggplot(df, aes(x = 1/AGE, y = log(HD))) +
geom_line(alpha = 0.4, linewidth = 0.6, color = "#1f78b4") +
labs(
x = "X (1/Age)",
y = "Y (ln(HD)",
title = "ln(HD) V/S 1/Age") +
theme_bw(base_size = 12)
d. Note any trends you see in the data
D
model2 <- nls(HD ~ beta0 * (1 - exp(beta1 * AGE))^beta2,
data = df,
start = list(beta0 = 100, beta1 = -0.1, beta2 = 2))
summary(model2)
##
## Formula: HD ~ beta0 * (1 - exp(beta1 * AGE))^beta2
##
## Parameters:
## Estimate Std. Error t value Pr(>|t|)
## beta0 122.244512 5.866397 20.84 <2e-16 ***
## beta1 -0.048547 0.004495 -10.80 <2e-16 ***
## beta2 1.431268 0.062737 22.81 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.407 on 753 degrees of freedom
##
## Number of iterations to convergence: 4
## Achieved convergence tolerance: 4.412e-06