This report seeks to answer the following question:
Does the non-masting Red Maple species exhibit muted dynamics compared to the masting Sugar Maple species?
We will be using two data sets, Maple_Sap and Maple_Tap, obtained from https://portal.edirepository.org/nis/mapbrowse?scope=knb-lter-hfr&identifier=285&revision=6.
These contain a variety of data about Red Maple and Sugar Maple trees. The Maple_Tap data set has seven variables and 382 entries. Of these, the relevant ones are year (year of collection, extracted from the date), species (Red Maple or Sugar Maple), and dbh (diameter of the tree at 1.4 meters above ground, in centimeters). The Maple_Sap data set has eight variables and 9,022 entries. Of these, the relevant ones are year, species, and sap.wt (weight of sap collected, in kilograms).
Below are the full data sets before any necessary cleaning:
Throughout, we will need the functionality of the tidyverse package.
library(tidyverse)
Firstly, we are going to clean the data as necessary.
# Split the date into year and month-day, recode species, keep relevant columns
Maple_Sap <- Maple_Sap %>%
  separate(date, into = c("year", "month_day"), sep = -6) %>%
  mutate(species = case_when(
    species == "ACSA" ~ "Sugar Maple",
    species == "ACRU" ~ "Red Maple")) %>%
  select(year, tree, species, sugar, sap.wt)
# Same cleaning for the tap data, with year converted to a number for plotting
Maple_Tap <- Maple_Tap %>%
  separate(date, into = c("year", "month_day"), sep = -6) %>%
  mutate(year = as.double(year), species = case_when(
    species == "ACSA" ~ "Sugar Maple",
    species == "ACRU" ~ "Red Maple")) %>%
  select(year, species, tree, dbh)
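The `sep = -6` argument in the cleaning step is easy to misread: a negative value counts character positions from the right-hand end of the string. The following is a minimal sketch on made-up dates, assuming the `date` column reads as text in `YYYY-MM-DD` form; the `toy` and `toy_split` names are hypothetical and not part of the data set.

```r
library(tidyr)
library(tibble)

# Made-up dates for illustration only (not taken from the data set)
toy <- tibble(date = c("2012-03-05", "2015-02-28"))

# sep = -6 splits each string six characters from its right-hand end
toy_split <- separate(toy, date, into = c("year", "month_day"), sep = -6)

toy_split$year       # "2012" "2015"
toy_split$month_day  # "-03-05" "-02-28"
```

Because the year comes out as text, it has to be converted with `as.double()` before it can be treated as a number, as is done for Maple_Tap.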
By comparing the sap weights collected from the Red Maple and Sugar Maple trees, we can visualize the difference between their outputs. This will show whether the data support or reject the hypothesis that non-masting Red Maples exhibit muted dynamics compared to Sugar Maples.
A box plot lets us visualize the weight of sap produced by each species in each year.
ggplot(data = Maple_Sap, mapping = aes(x = year, y = sap.wt)) +
  geom_boxplot(aes(color = species)) +
  labs(title = "Sap Weight Over Time",
       x = "Year",
       y = "Sap Weight",
       color = "Species")
As seen in the box plot, there are no data from Red Maples outside of the years 2015 - 2018. Throughout the years in which data were collected for both species, the Red Maple does not produce as much sap as the Sugar Maple.
These data may seem to support the hypothesis. However, we must create a linear regression model to assess the statistical significance of the difference before we can draw proper conclusions.
Below is the created model that looks at species as a predictor for Sap Weight.
sap_model <- lm(sap.wt ~ species, data = Maple_Sap)
summary(sap_model)
##
## Call:
## lm(formula = sap.wt ~ species, data = Maple_Sap)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1475 -2.2575 -0.5975 1.6225 19.8825
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.89603 0.09755 19.44 <2e-16 ***
## speciesSugar Maple 2.26149 0.10379 21.79 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.054 on 8389 degrees of freedom
## (631 observations deleted due to missingness)
## Multiple R-squared: 0.05356, Adjusted R-squared: 0.05345
## F-statistic: 474.7 on 1 and 8389 DF, p-value: < 2.2e-16
The linear regression model tells us a good deal about these data. The intercept is 1.896 with a p-value below 2e-16, so it is statistically significant. It means that the average sap weight recorded for the Red Maple was 1.896 kilograms. The coefficient for the Sugar Maple shows that, on average, the Sugar Maple produces 2.261 kilograms more sap than the Red Maple, for an average of about 4.158 kilograms. This difference is also statistically significant, as shown by the small p-value.
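The reading of the intercept as the Red Maple average follows from how lm() codes a two-level categorical predictor. The following is a minimal sketch with invented numbers (the `toy` data frame is hypothetical, not the Harvard Forest data) showing that the intercept equals the mean of the reference level and the species coefficient equals the difference between the two group means.

```r
# Toy sap weights (kg), invented for illustration only
toy <- data.frame(
  species = c("Red Maple", "Red Maple", "Red Maple",
              "Sugar Maple", "Sugar Maple", "Sugar Maple"),
  sap.wt  = c(1, 2, 3, 3, 5, 7)
)

toy_model   <- lm(sap.wt ~ species, data = toy)
group_means <- tapply(toy$sap.wt, toy$species, mean)

# "Red Maple" comes first alphabetically, so it is the reference level:
# the intercept is its mean, and the species coefficient is the
# Sugar Maple mean minus the Red Maple mean
coef(toy_model)
group_means
```

The same logic applies to the dbh model later in the report, since it uses the same single predictor.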
The residual standard error of this model is 3.054 on 8389 degrees of freedom, meaning the observed sap weights typically differ from the model's predictions by about 3.05 kilograms. The adjusted R^2 is 0.05345, which means only about 5.3% of the variance in sap weight is explained by species. This is not a strong fit: the remaining roughly 94.7% of the variance is left unexplained by the model and must come from other factors.
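As a check on the reported value, the adjusted R^2 can be recomputed from the multiple R^2 with the standard formula. Here n = 8391 is the number of observations used in the fit (8389 residual degrees of freedom plus 2 estimated coefficients) and p = 1 is the number of predictors:

```latex
\[
R^2_{\text{adj}}
  = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
  = 1 - \left(1 - 0.05356\right)\frac{8390}{8389}
  \approx 0.05345
\]
```

With one predictor and several thousand observations, the adjustment is negligible, which is why the two R^2 values in the output are nearly identical.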
Additionally, the p-value for the overall model is < 2.2e-16, far below the 0.05 cutoff, which indicates that the model as a whole is statistically significant. Both the box plot and the model statistics therefore point to a real difference between the species, yet the low adjusted R^2 means species alone predicts sap weight poorly, so this evidence is not conclusive.
Overall sap production is one way to see whether the non-masting tree has muted dynamics. Another is to look at the growth in trunk size over the years covered by the data, which lets us continue to investigate the hypothesis.
First, we group the Maple_Tap data by species and year. Then we filter to years up to 2018 to make the graph easier to read and to bring out slight differences. Lastly, the data are summarized as the mean trunk diameter for each species and year.
Maple_Tap_Average <- Maple_Tap %>%
  group_by(species, year) %>%
  filter(year <= 2018) %>%
  summarize(mean(dbh))
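One caveat about this kind of grouped summary: mean() returns NA for any group that contains a missing value, and the dbh column does contain missing values (the regression output later in the report drops 256 observations for missingness). The following is a minimal sketch on invented numbers of how `na.rm = TRUE` changes the result. The `toy_tap` and `toy_avg` names and the named summary columns are hypothetical, and whether this affects the averages plotted here depends on where the missing values fall.

```r
library(dplyr)

# Invented measurements for illustration only; NA marks a missing dbh
toy_tap <- tibble(
  species = c("Red Maple", "Red Maple", "Sugar Maple", "Sugar Maple"),
  year    = c(2015, 2015, 2015, 2015),
  dbh     = c(40, NA, 60, 70)
)

toy_avg <- toy_tap %>%
  group_by(species, year) %>%
  summarize(
    mean_dbh_raw = mean(dbh),                # NA if any value in the group is missing
    mean_dbh     = mean(dbh, na.rm = TRUE),  # averages only the non-missing values
    .groups = "drop"
  )

toy_avg
```

Naming the summary column, as in `mean_dbh` above, also avoids having to refer to it with backticks in later plotting code.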
Below is the graph of trunk diameter over the years examined in the research. It is a line graph of the average trunk diameter for each species, with the species differentiated by color and the years filtered to increase readability.
ggplot(data = Maple_Tap_Average, mapping = aes(x = year, y = `mean(dbh)`, color = species)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = seq(min(Maple_Tap_Average$year), 2018, by = 1)) +
  labs(title = "DBH Comparison by Species",
       x = "Year",
       y = "DBH (cm)",
       color = "Species")
As the graph shows, the Red Maple has a slightly steeper increase than the Sugar Maple. The difference is very slight, but it counts against the original hypothesis. However, the average trunk diameter of the Sugar Maple is roughly 20 cm larger than that of the Red Maple, and larger trunks commonly produce more sap, which is consistent with the sap weight data above. We cannot argue that this difference is real, however, until we fit a linear regression model and examine its summary statistics.
We can create a model with species as the predictor and dbh as the response variable. We use the original Maple_Tap data set so that the model considers the individual entries rather than the yearly averages. Below is that model:
model_dbh <- lm(dbh ~ species, data = Maple_Tap)
summary(model_dbh)
##
## Call:
## lm(formula = dbh ~ species, data = Maple_Tap)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.403 -8.648 -2.103 10.697 21.017
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 46.583 2.405 19.373 < 2e-16 ***
## speciesSugar Maple 20.120 2.673 7.528 9.19e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.78 on 124 degrees of freedom
## (256 observations deleted due to missingness)
## Multiple R-squared: 0.3137, Adjusted R-squared: 0.3081
## F-statistic: 56.67 on 1 and 124 DF, p-value: 9.191e-12
The statistics show that the intercept, which is the average trunk diameter for the Red Maple, is 46.583 cm with a p-value below 2e-16. This is below the 0.05 cutoff, so it is statistically significant. The model also shows that, on average, the Sugar Maple trunk diameter is 20.120 cm larger than that of the Red Maple. This difference has a p-value of 9.19e-12 and is also statistically significant.
The RSE for this linear regression model is 11.78 cm on 124 degrees of freedom, which is moderate relative to average diameters of roughly 47 to 67 cm. The adjusted R^2 is dramatically better than that of the sap weight model at 0.3081, meaning that 30.81% of the variance in dbh is explained by species. However, this is still not a high R^2 value.
Lastly, this model as a whole has a p-value of 9.191e-12, which is below the cutoff, yet the modest R^2 complicates the conclusions that can be drawn from it.
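Note that a species-only model compares average diameters, not growth over time. One possible way to compare the slopes seen in the line graph would be a model with a year-by-species interaction. This is a suggestion rather than something fit in this report, and the numbers below are invented purely to show the mechanics (the `toy` and `growth_model` names are hypothetical).

```r
# Invented diameters (cm) for illustration only; not the Harvard Forest data
toy <- data.frame(
  year    = rep(2015:2018, times = 2),
  species = rep(c("Red Maple", "Sugar Maple"), each = 4),
  dbh     = c(46.0, 46.9, 48.1, 49.0,   # Red Maple: roughly +1 cm per year
              66.0, 66.4, 67.1, 67.5)   # Sugar Maple: roughly +0.5 cm per year
)

# year * species expands to year + species + year:species.
# The "year" coefficient is the slope for the reference level (Red Maple);
# the "year:speciesSugar Maple" coefficient is how much the Sugar Maple
# slope differs from it
growth_model <- lm(dbh ~ year * species, data = toy)
coef(growth_model)
```

In a fit like this, a negative interaction coefficient would indicate slower diameter growth in the Sugar Maple than in the Red Maple, and its p-value in summary() would indicate whether that difference in slopes is distinguishable from noise.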
As we prepared to draw conclusions, we also looked at other data in this resource that could help clarify our answer to the hypothesis. One source we investigated was the flowering data. After joining it to the other data by tree, we found that the flowering data only included the Sugar Maple, so it does not help in identifying muted dynamics in the Red Maple.
With all of this in mind, we can answer the question at hand: does the non-masting Red Maple species exhibit muted dynamics compared to the masting Sugar Maple species? Technically yes; however, the models are not very reliable given their low R^2 values. The sap weight model, although statistically significant, explains very little of the variance (adjusted R^2 of 0.05345), and the Tap model was only somewhat better at 0.3081. These low values show how limited the data provided are for answering this question.
If we were able to gather more relevant data, including potential confounding variables, the reliability of the models would increase. In short, we can technically argue that there are muted dynamics for the Red Maple, but we would not make this claim with much confidence.
Rapp, J., E. Crone, and K. Stinson. 2023. Maple Reproduction and Sap Flow at Harvard Forest since 2011 ver 6. Environmental Data Initiative. https://doi.org/10.6073/pasta/7c2ddd7b75680980d84478011c5fbba9 (Accessed 2025-12-02).