This report seeks to answer the following question:
Does the non-masting red maple (ACRU) exhibit muted dynamics compared to the masting sugar maple (ACSA)? (If you are unfamiliar with these terms, masting trees produce a lot of seeds in some years and very few in others, while non-masting trees tend to produce about the same amount of seeds every year. Muted dynamics means the trees don't have big ups and downs from year to year; instead they produce a steady amount.)
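One way to put a number on "ups and downs" is the coefficient of variation (standard deviation divided by the mean) of a tree's yearly output. The sketch below uses made-up seed counts for two hypothetical trees, not values from the data set, and is only meant to illustrate the idea:

```r
# Hypothetical yearly seed counts (made-up numbers, not from the data set)
masting     <- c(900, 20, 15, 1100, 30, 850)
non_masting <- c(310, 290, 305, 280, 320, 300)

# Coefficient of variation: standard deviation relative to the mean
cv <- function(x) sd(x) / mean(x)

cv_masting     <- cv(masting)
cv_non_masting <- cv(non_masting)

cat("CV masting:    ", round(cv_masting, 2), "\n")
cat("CV non-masting:", round(cv_non_masting, 2), "\n")
```

A larger coefficient of variation means bigger swings between years, so a species with muted dynamics would be expected to show the smaller value.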
The data that we will be using comes from the following source:
Rapp, J., E. Crone, and K. Stinson. 2023. Maple Reproduction and Sap Flow at Harvard Forest since 2011 ver 6. Environmental Data Initiative. https://doi.org/10.6073/pasta/7c2ddd7b75680980d84478011c5fbba9 (Accessed 2025-12-09).
It contains 14 different CSV files that address the mechanisms of mast seeding in sugar maples. More recently, seed monitoring of red maple trees was added to explore the hypothesis that the non-masting species would have muted dynamics compared to its masting counterpart (this is the hypothesis we will be testing). The 14 CSV files contain numerous variables; the relevant ones in this report are year (the year the measurements were taken), tree (the individual tree being measured), species (red maple or sugar maple), dbh (diameter at breast height), sugar (the sugar content of the sap), and sap.wt (the weight of sap collected from the tree). We will only be using 2 of the 14 available CSV files for our analysis; you can view both below:
This data set will be used to compare the diameters of the two species, as this can be a key indicator that muted dynamics are present in the red maple.
This data set will be used to compare the sap weights and sugar counts between the two species.
The rest of the csv files appeared to only have data available on the sugar maple species, thus making them useless for comparison purposes.
Throughout, we will need the tidyverse package for data cleaning and visualization, and the DT package for the interactive tables.
library(tidyverse)
library(DT)  # provides datatable()
Before getting into the analysis, I must clean up the two CSV files so that the species can be compared properly:
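maple_tap <- hf285_01_maple_tap %>%
  # split off the last 6 characters of the date, leaving the year
  separate(date, into = c("Year", "Month_Day"), sep = -6) %>%
  # keep only the columns used in this report
  select(Year, tree, species, dbh) %>%
  drop_na(dbh)
datatable(maple_tap)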
maple_sap <- hf285_02_maple_sap %>%
  # remove rows missing either measurement
  drop_na(sap.wt, sugar) %>%
  separate(date, into = c("Year", "Month_Day"), sep = -6) %>%
  select(Year, tree, species, sugar, sap.wt)
datatable(maple_sap)
Now both of my data sets contain just the variables I am going to use, with no NA values left in those columns.
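The cleaning code splits the date by character position, which relies on every date having the same layout. As a small sketch, assuming the dates look like "YYYY-MM-DD" (the example dates below are hypothetical), the year can also be pulled out by parsing the date, and both approaches can be compared:

```r
# Hypothetical dates in the "YYYY-MM-DD" layout assumed for the date column
dates <- c("2012-02-21", "2015-03-04", "2023-04-11")

# Parse as Date, then pull out the four-digit year as an integer
years <- as.integer(format(as.Date(dates), "%Y"))

# The character-position approach: drop the last 6 characters
years_by_position <- substr(dates, 1, nchar(dates) - 6)

print(years)
print(years_by_position)
```

The parsed version gives a numeric year, which is handy if year is later used as a continuous variable; the position-based version gives text, which ggplot treats as a discrete axis.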
Now I can start answering the main research question, which is whether the red maple sees muted dynamics compared to the sugar maple. The first comparison I will make is the diameter of the two species. I expect the red maple's diameter to grow more slowly over time than the sugar maple's, because of the muted dynamics. To test this, I am going to make a box plot of diameter by year for each species.
ggplot(maple_tap, mapping = aes(Year, dbh)) +
  geom_boxplot(aes(color = species)) +
  labs(title = "Diameter by Species",
       x = "year",
       y = "diameter")
It appears that my hypothesis was correct, as the diameter of the sugar maple grew more over time than the diameter of the red maple. We can also see that the diameter of the sugar maple is much larger in general. My guess is that both species started off at a similar size, and that the faster growth rate of the sugar maple is what makes it so much larger than the red maple now.
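A box plot shows this only visually. One way to put a number on the difference in growth is a regression with a year-by-species interaction, which fits a separate slope for each species. The sketch below uses made-up diameters for hypothetical trees (the data frame `toy` and its values are assumptions for illustration, not the Harvard Forest data):

```r
# Made-up diameters for hypothetical trees measured in three years
toy <- data.frame(
  year    = rep(c(2012, 2016, 2020), times = 4),
  species = rep(c("ACRU", "ACSA"), each = 6),
  dbh     = c(40.0, 40.8, 41.6, 45.0, 45.8, 46.6,   # ACRU trees: +0.2 per year
              60.0, 62.0, 64.0, 70.0, 72.0, 74.0)   # ACSA trees: +0.5 per year
)

# year * species fits a separate slope of diameter on year for each species
growth_model <- lm(dbh ~ year * species, data = toy)

# The interaction coefficient is the difference in yearly growth (ACSA minus ACRU)
slope_difference <- coef(growth_model)[["year:speciesACSA"]]
print(slope_difference)
```

With the real data, year would need to be numeric rather than text for this to work, and because the same trees are measured repeatedly, a fuller analysis might also account for tree identity.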
Another thing I can do with the diameter is run a regression of diameter on species. Because species has only two levels, this amounts to comparing the mean diameters of the two species. I can look at the p-value and R^2 value to judge whether there is a relationship between diameter and species.
dbh_model <- lm(dbh ~ species, data = maple_tap)
summary(dbh_model)
##
## Call:
## lm(formula = dbh ~ species, data = maple_tap)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -29.403  -8.648  -2.103  10.697  21.017
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   46.583      2.405  19.373  < 2e-16 ***
## speciesACSA   20.120      2.673   7.528 9.19e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.78 on 124 degrees of freedom
## Multiple R-squared: 0.3137, Adjusted R-squared: 0.3081
## F-statistic: 56.67 on 1 and 124 DF, p-value: 9.191e-12
After running the model, I can see that the p-value is much less than .05, so I can reject the null hypothesis of no difference and say that there is a relationship between species and diameter. The intercept (46.583) is the estimated mean diameter of the red maple, and the speciesACSA coefficient says the sugar maple is about 20.1 units larger on average. The R^2 value of .3137 means that species alone explains about 31% of the variation in diameter, which suggests we may want to look for more explanatory variables. All in all though, there is a clear relationship between the variables, and the sugar maples in this data set have a much larger diameter than the red maples.
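To see why the coefficients of a regression on a two-level variable can be read as group means, here is a small sketch with made-up diameters (the data frame `toy` is hypothetical). The intercept matches the mean of the reference group and the species coefficient matches the difference between the two group means:

```r
# Made-up diameters for two hypothetical groups
toy <- data.frame(
  species = rep(c("ACRU", "ACSA"), each = 4),
  dbh     = c(44, 46, 47, 49, 62, 66, 68, 70)
)

model <- lm(dbh ~ species, data = toy)
group_means <- tapply(toy$dbh, toy$species, mean)

# Intercept = mean of the reference group (ACRU, first alphabetically)
intercept <- coef(model)[["(Intercept)"]]
# speciesACSA = ACSA mean minus ACRU mean
difference <- coef(model)[["speciesACSA"]]

print(group_means)
print(c(intercept = intercept, difference = difference))
```

The same reading applies to any of the species regressions in this report: the intercept describes the red maple and the speciesACSA row describes how far the sugar maple sits above or below it.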
To look at the difference in sap weight and sugar count between the two species, I'll make box plots for both and see if there appears to be a noticeable difference.
ggplot(maple_sap, mapping = aes(species, sugar)) +
  geom_boxplot() +
  labs(title = "Sugar by Species",
       x = "species",
       y = "sugar")
ggplot(maple_sap, mapping = aes(species, sap.wt)) +
  geom_boxplot() +
  labs(title = "Sap Weight by Species",
       x = "species",
       y = "sap.wt")
The sugar and sap weight of the sugar maple appear to be much higher than those of the red maple. This makes sense, because the species with muted dynamics is expected to have a lower sugar count and weight. However, I still need to test whether these differences are statistically significant, which I can do by running another regression. I will run one for the sugar and one for the weight.
sugar_model <- lm(sugar ~ species, data = maple_sap)
summary(sugar_model)
##
## Call:
## lm(formula = sugar ~ species, data = maple_sap)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.7366 -0.4366 -0.0366  0.3634  4.7634
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.84293    0.02140   86.13   <2e-16 ***
## speciesACSA  0.69367    0.02253   30.79   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5832 on 7573 degrees of freedom
## Multiple R-squared: 0.1113, Adjusted R-squared: 0.1111
## F-statistic: 948 on 1 and 7573 DF, p-value: < 2.2e-16
sap_wt_model <- lm(sap.wt ~ species, data = maple_sap)
summary(sap_wt_model)
##
## Call:
## lm(formula = sap.wt ~ species, data = maple_sap)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.4227 -2.2895 -0.5727  1.6073 19.6073
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   2.3095     0.1111   20.78   <2e-16 ***
## speciesACSA   2.1232     0.1170   18.14   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.029 on 7573 degrees of freedom
## Multiple R-squared: 0.04166, Adjusted R-squared: 0.04154
## F-statistic: 329.2 on 1 and 7573 DF, p-value: < 2.2e-16
The p-value is much less than .05 for both, which once again indicates that there is a statistically significant relationship between our response and explanatory variables. However, the R^2 values are low for both (about .11 for sugar and .04 for sap weight), which indicates that species alone explains little of the variation and these may not be the best models. The last thing to mention is that the speciesACSA coefficient is positive in both regressions (0.69 for sugar and 2.12 for sap weight), which indicates the sugar maple has a higher expected value than the red maple for both the sugar and the weight.
In conclusion, we saw statistically significant evidence that the sugar and sap weight of the red maple are lower than those of the sugar maple. The box plot also suggested that the red maple's diameter grows more slowly over time. With all of this being said, the results indicate that the red maple does see muted dynamics compared to the sugar maple.