Introduction

This report focuses on a group of red maple and sugar maple trees in the Harvard Forest in central Massachusetts. The data come from researchers who have measured maple reproduction and sap flow in these trees since 2011, with the aim of describing seed production and related dynamics. We focus on red maples and ask whether this non-masting species exhibits muted dynamics compared to sugar maples, a masting species (one that produces large seed crops in some years and very few in others). By muted dynamics we mean responding differently than is typical, and in particular less strongly, to environmental changes. In other words, this report asks whether non-masting red maples respond less to environmental changes than masting sugar maples do. We use the “tidyverse” library for data cleaning and plotting, and the “DT” library to display datatables.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.2
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(DT)

We thank the researchers who put together the dataset used in this report. Their names are in the citation below:

Rapp, J., E. Crone, and K. Stinson. 2023. Maple Reproduction and Sap Flow at Harvard Forest since 2011 ver 6. Environmental Data Initiative. https://doi.org/10.6073/pasta/7c2ddd7b75680980d84478011c5fbba9 (Accessed 2025-12-03).

Diameter of Trees over Time

The dataset imported below contains the diameter of each tree (the dbh column) before cleaning. Trunk diameter is one indicator of a tree’s health in a forest, so we use it to check whether diameter differs significantly between the two species, sugar maple and red maple. The datatable below shows the imported data before cleaning.

library(readr)
maple_tap <- read_csv("hf285-01-maple-tap.csv")
## Rows: 382 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (3): tree, tap, species
## dbl  (3): dbh, tap.bearing, tap.height
## date (1): date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
datatable(maple_tap)

As noted above, this data needs some cleaning, which the code chunk below performs:

# Replace the species codes with common names
maple_tap$species[maple_tap$species == "ACRU"] <- "Red Maple"
maple_tap$species[maple_tap$species == "ACSA"] <- "Sugar Maple"
maple_tap_clean <- maple_tap %>%
  # Split the date into the year and the remaining "-MM-DD" part
  separate(date, into = c("year", "month_day"), sep = -6) %>%
  rename(Diameter = dbh) %>%
  # Keep only the columns used in the analysis
  select(year, tree, species, Diameter) %>%
  # Drop rows with a missing value in any remaining column
  drop_na()

This cleaning produces the datatable below, which is the basis for our analysis of tree diameter.

datatable(maple_tap_clean)
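
The year-splitting step in the cleaning code can be easier to follow on a small example. The sketch below uses a made-up two-row table (the `toy` data and its values are assumptions for illustration, not rows from the Harvard Forest dataset) to show what `separate()` with `sep = -6` does to a date, and compares the result with extracting the year through base R's `format()`.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data, not taken from hf285-01-maple-tap.csv
toy <- tibble(
  tree = c("T1", "T2"),
  date = as.Date(c("2012-03-05", "2015-02-20"))
)

# sep = -6 splits six characters from the right end of "YYYY-MM-DD",
# leaving the four-digit year on the left and "-MM-DD" on the right.
toy_split <- separate(toy, date, into = c("year", "month_day"), sep = -6)

# An alternative way to get the same year, as a character string
toy_year <- format(toy$date, "%Y")

toy_split$year
toy_year
```

Both approaches leave the year as text rather than a number, which suits the boxplots in this report because ggplot2 treats a character column as a discrete axis.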

Growth in diameter over time is a key indicator of a tree’s health as it relates to dynamics. We can visualize the change over time with boxplots of diameter by year, colored by species.

ggplot(maple_tap_clean, mapping = aes(year, Diameter)) +
  geom_boxplot(aes(color = species)) +
  labs(title = "Diameter Over Time by Species", x = "Year", y = "Trunk Diameter (cm)")

The boxplot above shows a slight difference in diameter growth over time between red maples and sugar maples, with sugar maples growing more. Overall, sugar maples have a larger median diameter than red maples. We can test whether this difference in diameter is significant with a linear model.

diametermodel <- lm(Diameter ~ species, data = maple_tap_clean)
summary(diametermodel)
## 
## Call:
## lm(formula = Diameter ~ species, data = maple_tap_clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.403  -8.648  -2.103  10.697  21.017 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          46.583      2.405  19.373  < 2e-16 ***
## speciesSugar Maple   20.120      2.673   7.528 9.19e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.78 on 124 degrees of freedom
## Multiple R-squared:  0.3137, Adjusted R-squared:  0.3081 
## F-statistic: 56.67 on 1 and 124 DF,  p-value: 9.191e-12

The intercept, 46.583 cm, is the estimated mean diameter of a red maple. The species coefficient shows that a sugar maple is on average 20.120 cm larger in diameter. The model's p-value of 9.191*10^-12 is significant, so there is a statistically significant relationship between species and diameter: knowing a tree’s species helps predict its diameter, and species explains about 31% of the variance in diameter (R-squared of 0.3137). In terms of diameter, this is consistent with the non-masting red maples exhibiting muted dynamics compared to the sugar maples, which are masting. However, we must also look at sap weight.
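
To make the reading of the coefficients concrete, the sketch below fits the same kind of model to simulated data (the `toy` data frame and its means of 45 and 65 are assumptions chosen for illustration, not the Harvard Forest measurements). With a single two-level predictor, the intercept equals the mean of the reference group and the species coefficient equals the difference between the two group means.

```r
set.seed(1)

# Simulated diameters, for illustration only
toy <- data.frame(
  species = rep(c("Red Maple", "Sugar Maple"), each = 20),
  Diameter = c(rnorm(20, mean = 45, sd = 10), rnorm(20, mean = 65, sd = 10))
)

fit <- lm(Diameter ~ species, data = toy)
group_means <- tapply(toy$Diameter, toy$species, mean)

# Intercept: mean of the reference level ("Red Maple" sorts first)
intercept <- coef(fit)[["(Intercept)"]]
# Slope: how much larger the "Sugar Maple" mean is than the "Red Maple" mean
difference <- coef(fit)[["speciesSugar Maple"]]

intercept
group_means[["Red Maple"]]
difference
group_means[["Sugar Maple"]] - group_means[["Red Maple"]]
```

This is why the model describes how diameter differs by species rather than how to identify a species from its diameter.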

Sap Weight

The dataset imported below contains the sugar levels and sap weight of the trees, each identified by a tree number. The datatable below shows the sap dataset before cleaning.

library(readr)
maple_sap <- read_csv("hf285-02-maple-sap.csv")
## Rows: 9022 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (3): tree, tap, species
## dbl  (2): sugar, sap.wt
## dttm (1): datetime
## date (1): date
## time (1): time
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
datatable(maple_sap)
# Replace the species codes with common names
maple_sap$species[maple_sap$species == "ACRU"] <- "Red Maple"
maple_sap$species[maple_sap$species == "ACSA"] <- "Sugar Maple"
maple_sap2 <- maple_sap %>%
  # Drop rows with no sap weight recorded
  drop_na(sap.wt) %>%
  # Split the date into the year and the remaining "-MM-DD" part
  separate(date, into = c("year", "month_day"), sep = -6) %>%
  # Keep only the columns used in the analysis, in a readable order
  select(year, tree, species, sugar, sap.wt)

After cleaning, this is the data we use for our analysis of red maples versus sugar maples, shown in the datatable below. Because sap weight is the key variable here, rows with a missing sap weight were omitted. We also reordered the columns into a more readable order and replaced the species codes with the common names Red Maple and Sugar Maple.

datatable(maple_sap2)

Sap weight is another indicator of whether one species exhibits muted dynamics compared to another. To see how sap weight changes over time as environmental conditions change, we plot boxplots of sap weight by year, colored by species.

ggplot(maple_sap2, mapping = aes(year, sap.wt)) +
  geom_boxplot(aes(color = species)) +
  labs(title = "Sap Weight In Trees Over Time", x = "Year", y = "Sap Weight Measured")
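
The medians that a boxplot draws can also be tabulated directly, which makes year-by-year comparisons easier to read off. The sketch below does this for a small made-up table (the `toy` values are assumptions for illustration, not measurements from the sap dataset), using `group_by()` with `summarize()` to return one row per year and species.

```r
library(dplyr)

# Hypothetical toy data, not taken from hf285-02-maple-sap.csv
toy <- tibble(
  year    = rep(c("2012", "2013"), each = 4),
  species = rep(c("Red Maple", "Red Maple", "Sugar Maple", "Sugar Maple"), times = 2),
  sap.wt  = c(1, 3, 4, 6, 2, 2, 5, 7)
)

# One row per year and species, with the median sap weight and group size
toy_medians <- toy %>%
  group_by(year, species) %>%
  summarize(median_sap = median(sap.wt), n = n(), .groups = "drop")

toy_medians
```

The same pattern could be applied to the cleaned sap data by swapping in that table, assuming it keeps the year, species, and sap.wt columns.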

In this boxplot, sugar maples have a higher median sap weight than red maples. To see whether that difference is significant, we fit a linear regression model that predicts sap weight from species.

sap_wt_model <- lm(sap.wt ~ species, data = maple_sap2)
summary(sap_wt_model)
## 
## Call:
## lm(formula = sap.wt ~ species, data = maple_sap2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.1475 -2.2575 -0.5975  1.6225 19.8825 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         1.89603    0.09755   19.44   <2e-16 ***
## speciesSugar Maple  2.26149    0.10379   21.79   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.054 on 8389 degrees of freedom
## Multiple R-squared:  0.05356,    Adjusted R-squared:  0.05345 
## F-statistic: 474.7 on 1 and 8389 DF,  p-value: < 2.2e-16

As the summary shows, the relationship between species and sap weight is significant, with a p-value of less than 2.2*10^-16. We can therefore say with confidence that sugar maples produce more sap by weight than red maples, about 2.26 more on average in the units recorded. The adjusted R-squared is 0.05345, though, which means that only about 5.3% of the variance in sap weight is explained by species. Other factors could be examined in the future, but the relationship between species and sap weight is significant.
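
A significant p-value alongside a small R-squared can seem contradictory, so it may help to see where R-squared comes from. The sketch below uses simulated data (the `toy` data frame, its group sizes, and its effect size are assumptions for illustration, not the Harvard Forest sap data) and computes R-squared by hand as one minus the ratio of residual variation to total variation.

```r
set.seed(2)

# Simulated sap weights, for illustration only
toy <- data.frame(
  species = rep(c("Red Maple", "Sugar Maple"), times = c(50, 200))
)
toy$sap.wt <- ifelse(toy$species == "Sugar Maple", 4, 2) + rnorm(250, sd = 3)

fit <- lm(sap.wt ~ species, data = toy)

# Variation left over after accounting for species
ss_res <- sum(residuals(fit)^2)
# Total variation of sap weight around its overall mean
ss_tot <- sum((toy$sap.wt - mean(toy$sap.wt))^2)

r2_manual <- 1 - ss_res / ss_tot
r2_model <- summary(fit)$r.squared

r2_manual
r2_model
```

When trees of the same species vary widely from one measurement to the next, the species means can differ reliably while species still accounts for only a small share of the total variation.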

Conclusion

According to the data collected and our models, we conclude that non-masting red maples do exhibit muted dynamics compared to sugar maples, which are masting. The significant relationships of both diameter and sap weight with species led us to this conclusion.