This report focuses on red maple and sugar maple trees in the Harvard Forest, which spans 4,000 acres in Petersham, Massachusetts. The data come from a study that ran from 2011 to 2021 and focused mainly on reproduction and sap flow. This report seeks to determine whether the non-masting red maple exhibits muted dynamics compared to the masting sugar maple. Here, muted dynamics means that a tree shows weaker signs of vitality than usual. The report examines several physical characteristics to answer this question. Because red maples were only added to the study in 2015, I examine only the portion of the data that includes red maples.
Credit for the data goes to the researchers who collected it:
Rapp, J., E. Crone, and K. Stinson. 2023. Maple Reproduction and Sap Flow at Harvard Forest since 2011 ver 6. Environmental Data Initiative. https://doi.org/10.6073/pasta/7c2ddd7b75680980d84478011c5fbba9 (Accessed 2024-12-10)
One of the most important characteristics to look at to determine
whether a tree exhibits muted dynamics is the amount of sap it produces,
by weight. The important variables in this data set are
species, which of the two species the tree is;
sap_weight, the amount of sap produced in kilograms; and
sugar, the sugar concentration in degrees Brix. I am going to
compare the amount of sap produced by the two species and
determine whether species is a significant predictor of
sap_weight. The sap_clean data table is used
to generate the visualizations and models.
ggplot(sap_clean, aes(x = species, y = sap_weight)) +
  geom_boxplot() +
  labs(title = "Distribution of Sap Weight by Species",
       x = "Species",
       y = "Sap Weight (kg)")
As the box plot shows, sugar maples appear to produce more sap than red maples: the median sap weight is higher for sugar maples. I would also like to take sugar concentration into consideration to see whether sugar maples actually produce more sugar or just more fluid.
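The comparison of medians read off the box plot can also be checked numerically. The following is a minimal sketch on synthetic data: sap_demo is a made-up stand-in for sap_clean that only borrows its column names, so the printed values say nothing about the Harvard Forest trees.

```r
# Synthetic stand-in for sap_clean (NOT the Harvard Forest data).
# Column names follow the report; the values are invented for illustration.
set.seed(1)
sap_demo <- data.frame(
  species = rep(c("Red Maple", "Sugar Maple"), each = 50),
  sap_weight = c(rexp(50, rate = 1 / 3), rexp(50, rate = 1 / 5))
)

# Median sap weight per species: the center line of each box in the plot.
sap_medians <- tapply(sap_demo$sap_weight, sap_demo$species, median)

# Number of observations per species, useful context for any comparison.
sap_counts <- table(sap_demo$species)

print(sap_medians)
print(sap_counts)
```

Replacing sap_demo with sap_clean would give the medians that the box plot displays.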
ggplot(sap_clean, aes(x = species, y = sugar)) +
  geom_boxplot() +
  labs(title = "Sugar Concentration by Species",
       x = "Species",
       y = "Sugar Concentration (degrees Brix)")
The box plot shows that sugar concentration also has a higher median in sugar maples than in red maples. This is not the full story, of course. I need to look at what the statistics say about the significance of these relationships.
sap_model <- lm(sap_weight ~ species + sugar, data = sap_clean)
summary(sap_model)
##
## Call:
## lm(formula = sap_weight ~ species + sugar, data = sap_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.0556 -2.2487 -0.5885 1.6145 19.1936
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.12360 0.15580 20.049 < 2e-16 ***
## speciesSugar Maple 2.42958 0.12368 19.644 < 2e-16 ***
## sugar -0.44173 0.05947 -7.428 1.23e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.018 on 7572 degrees of freedom
## (816 observations deleted due to missingness)
## Multiple R-squared: 0.04859, Adjusted R-squared: 0.04834
## F-statistic: 193.4 on 2 and 7572 DF, p-value: < 2.2e-16
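The p-value for species, below 2 x 10^-16, is well under the 0.05
threshold for statistical significance. If species had no effect on
sap_weight, a difference this large would almost never arise by
chance. We can confidently say that sugar maples
produce more sap than red maples: at the same sugar concentration, their
predicted sap_weight is about 2.43 kg higher. The p-value of
1.23 x 10^-13 for sugar shows that sugar concentration is also a
significant predictor of sap_weight, and the negative coefficient means
that each additional degree Brix is associated with about 0.44 kg less
sap. This coefficient does not by itself show that sugar maples produce
sweeter sap than red maples; that comparison rests on the box plot
above. These are only two aspects of muted dynamics, however. As we see
from the adjusted R^2 value, the model explains only about 4.8% of the
variation in the data. I need to look at more factors to answer the
question.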
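One way to test directly whether sugar concentration differs between the species is to fit a model with sugar as the response and species as the predictor. This is a minimal sketch on synthetic data, not the analysis used in this report: sap_demo, its group means, and its sample sizes are all assumptions made up for illustration.

```r
# Synthetic stand-in for sap_clean (NOT the Harvard Forest data).
# The group means (2.0 and 2.5 degrees Brix) are arbitrary assumptions.
set.seed(42)
n <- 200
sap_demo <- data.frame(
  species = factor(rep(c("Red Maple", "Sugar Maple"), each = n)),
  sugar = c(rnorm(n, mean = 2.0, sd = 0.5), rnorm(n, mean = 2.5, sd = 0.5))
)

# With sugar as the response, the speciesSugar Maple coefficient is the
# estimated difference in mean sugar concentration (sugar maple minus red maple).
sugar_model <- lm(sugar ~ species, data = sap_demo)
sugar_coefs <- coef(summary(sugar_model))

# 95% confidence intervals for the intercept and the species difference.
sugar_ci <- confint(sugar_model)

print(sugar_coefs)
print(sugar_ci)
```

Fitting the same formula to sap_clean would show whether the difference seen in the box plot is statistically significant.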
Another characteristic of trees with muted dynamics is that they
don’t grow as large. We can determine whether red maples have muted dynamics
relative to sugar maples by looking at whether species is a good
predictor of diameter. The important variables here are
diameter, the diameter of the tree in centimeters at 1.4
meters above the ground, and species, the species of the
tree. The data table for this section is maple_clean.
First, I would like to look at the distribution of diameter for each species.
ggplot(maple_clean, aes(x = species, y = diameter)) +
  geom_boxplot() +
  labs(title = "Diameter by Species",
       x = "Species",
       y = "Diameter (cm)")
Here we can see clearly that the median diameter of sugar maples is larger by almost 20 cm. However, this is not conclusive. I will analyze the statistics to determine the significance of this relationship.
diameter_model <- lm(diameter ~ species, data = maple_clean)
summary(diameter_model)
##
## Call:
## lm(formula = diameter ~ species, data = maple_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.403 -8.648 -2.103 10.697 21.017
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 46.583 2.405 19.373 < 2e-16 ***
## speciesSugar Maple 20.120 2.673 7.528 9.19e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.78 on 124 degrees of freedom
## Multiple R-squared: 0.3137, Adjusted R-squared: 0.3081
## F-statistic: 56.67 on 1 and 124 DF, p-value: 9.191e-12
The p-value of 9.19 x 10^-12 shows that there is a
statistically significant relationship between species and diameter, so
I can reject the null hypothesis that species is not a predictor
of diameter. This means that I was correct in assuming that
diameter can be predicted by species and that sugar maples have a
statistically significantly larger diameter on average. The
R^2 value of 0.3137 may leave a bit to be desired, since species
explains only about 31% of the variation in diameter, but
species is still a useful predictor. Based on the
speciesSugar Maple coefficient, 20.120, sugar maples are predicted to
have a diameter 20.120 cm larger than red maples. The intercept,
46.583, is the predicted diameter of a red maple in centimeters, so the
predicted diameter of a sugar maple is 66.703 cm.
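The predicted diameters follow directly from the coefficient table. The sketch below copies the estimates and the standard error from the summary(diameter_model) output above; the variable names are hypothetical, and the interval is the usual t-based approximation rather than anything reported by the original analysis.

```r
# Values copied from the summary(diameter_model) output above.
intercept    <- 46.583  # predicted diameter of the reference level, red maple (cm)
sugar_offset <- 20.120  # speciesSugar Maple: sugar maple minus red maple (cm)
offset_se    <- 2.673   # standard error of the species coefficient
resid_df     <- 124     # residual degrees of freedom

# Predicted mean diameter for each species.
red_mean   <- intercept
sugar_mean <- intercept + sugar_offset

# Approximate 95% confidence interval for the difference between species.
offset_ci <- sugar_offset + c(-1, 1) * qt(0.975, df = resid_df) * offset_se

print(c(red_maple = red_mean, sugar_maple = sugar_mean))
print(offset_ci)
```

Calling confint(diameter_model) on the fitted model would give the same interval without copying numbers by hand.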
Based on the data available for both red maples and sugar
maples in the Harvard Forest, I conclude that yes, the
non-masting red maple exhibits muted dynamics compared to the masting
sugar maple. This conclusion rests on the p-values for the relationships
between species and both diameter and sap_weight, which show that
sugar maples produce more sap and have larger diameters than red
maples.