Background

In 2014, in the last year of my undergraduate Meteorology degree at Plymouth State University, I completed a Senior Research Project titled “Effects of Green-up on Spring and Summer Maximum Temperatures in Northern New Hampshire from 1989 to 2012”; I later presented this research at the 39th Northeast Storm Conference in Rutland, Vermont, USA. The original research project used the same data sets but was done in Microsoft Excel. The goal of this project was to redo that analysis in R, with an additional eight years of data.

Introduction

The Hubbard Brook Experimental Forest (HBEF) is a 7,800-acre valley located within the White Mountain National Forest in New Hampshire, USA. Research there was started by the Northern Research Station office of the US Forest Service in 1956; the Hubbard Brook Ecosystem Study (HBES) began in 1963, and HBEF was designated a National Science Foundation Long-Term Ecological Research Forest in 1988 (Hubbard Brook Ecosystem Study, n.d.; USDA Forest Service Northern Research Station, n.d.).

Image 1: Site map of the Hubbard Brook Experimental Forest in the White Mountain National Forest, New Hampshire (Campbell et al., 2007).

The sizable HBES Data Catalog holds over 60 years of daily temperature records, along with yearly documentation of tree growth in the spring. The point at which the leaves of deciduous trees have fully expanded is called “Green-Up”; in the mountains of northern New England, this occurs in late May through early June. With Green-Up comes a tempering of temperature extremes, as the data below show: there are more days on which the maximum temperature is significantly greater than normal before Green-Up than after (Hubbard Brook Ecosystem Study, n.d.).

Data

Phenology Data

Import

library(tidyverse)   # dplyr, readr, tidyr, ggplot2
library(lubridate)   # month(), day()

phen1 <- read.csv('https://pasta.lternet.edu/package/data/eml/knb-lter-hbr/51/12/9f623c83fb1da7595c6d2d498bde15df')

#phen1 <- read.csv('C:/Users/ahammond1/OneDrive - University of Massachusetts/DataSci/HBEF_Phenology_longform.csv')

Spring Phenology Codes:
[0] No change from winter conditions, unexpanded buds only
[1] Bud swelling noticeable
[2] Small leaves or flowers visible, initial stages of leaf expansion, leaves about 1 cm long
[3] Leaves 1/2 of final length, leaves obscure half of sky as seen through crowns
[3.5] Leaves 3/4 expanded, sky mostly obscured through crown, crowns not yet in summer condition
[4] Canopy appears in summer condition, leaves fully expanded, little sky visible through crowns

(USDA Forest Service Northern Research Station, 2021b)

Cleaning

New Column: YEAR

phen1 <- phen1 %>%                      # YYYY-MM-DD to YYYY
   mutate(YEAR = substr(DATE, 1, 4))    # substr() positions start at 1

phen1$YEAR <- as.integer(phen1$YEAR)    # Define as integer
phen1$DATE <- as.Date(phen1$DATE)       # Define & format as a Date
phen1$DOY <- as.integer(phen1$DAY)      # Redefining "DAY" to DOY (Day Of Year)

phen1$dayCode = paste(phen1$YEAR, phen1$DOY, sep = ",")   # Merge Year and Day into one item, easier for coding.

phen1 <- phen1 %>%
  arrange(YEAR, DOY)

Filter: Spring Records Only

phenSpring <- filter(phen1, `SEASON` == "SPRING") %>%           # Removes FALL Values
  select(`YEAR`, `DOY`, `dayCode`, `Phenology_Stage`, `DATE`)   # remove unnecessary columns

remove(phen1)

Calculate: Mean Phenology Stage

SpringCal <- phenSpring %>%       #Min, Mean, Max Stage by Year and Day, non-weighted, does not account for Site/Species
  group_by(dayCode) %>%
  summarise(minStage = min(Phenology_Stage), 
            averageStage = mean(Phenology_Stage), 
            maxStage = max(Phenology_Stage)) 

SpringCal <- SpringCal %>%    #Split DayCode into Year and DOY
  mutate(YEAR = substr(dayCode, 1, 4), 
         DOY = substr(dayCode, 6, 8))

SpringCal$YEAR <- as.integer(SpringCal$YEAR) #Define as INT
SpringCal$DOY <- as.integer(SpringCal$DOY) #Define as INT

SpringCal <- SpringCal %>%
  arrange(YEAR, DOY)
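The dayCode string is one way to group by year and day at once. As a minimal sketch of an alternative, assuming the same column names as above, grouping on both columns directly gives the same summary without splitting a string afterwards. The toy table (toyPhen) is made up for illustration and is not HBEF data.

```r
library(dplyr)

# Hypothetical toy observations (not HBEF data): two trees on each of two days
toyPhen <- tibble(
  YEAR = c(2020L, 2020L, 2020L, 2020L),
  DOY = c(150L, 150L, 153L, 153L),
  Phenology_Stage = c(3, 3.5, 3.5, 4)
)

# Group on both keys at once; YEAR and DOY keep their integer type
toyCal <- toyPhen %>%
  group_by(YEAR, DOY) %>%
  summarise(minStage = min(Phenology_Stage),
            averageStage = mean(Phenology_Stage),
            maxStage = max(Phenology_Stage),
            .groups = "drop") %>%
  arrange(YEAR, DOY)
```

Because YEAR and DOY are never pasted together, the later as.integer() conversions would not be needed in this variant.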

Define and Filter for Green-Up

Green-Up is defined as the first day of each year on which the mean Phenology_Stage is >= 3.75.

GreenUp <- SpringCal %>%
  filter(`averageStage` >= 3.75) %>%
  group_by(YEAR)

GreenUp = GreenUp[!duplicated(GreenUp$YEAR), ] # Remove duplicated >> keep first instance of Stage >= 3.75

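# as.Date() adds DOY days to the origin, so with this origin DOY 150 maps to May 31;
# origin = "2018-12-31" would give the calendar date of DOY 150 in a non-leap year (May 30)
GreenUp$date <- as.Date(GreenUp$DOY, origin = "2019-01-01")  # Non-leap Year
GreenUp$date <- format(GreenUp$date, format = "%b-%d")       # Month-Day format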

GreenUp = select(GreenUp, `YEAR`, `date`, `DOY`)

remove(phenSpring, SpringCal)
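The Green-Up rule can also be written as a single pipeline. The sketch below is an illustration on made-up values, not the code used for Table 1: it keeps the first day per year with a mean stage of at least 3.75 and converts DOY to a calendar date. It assumes a non-leap reference year and uses December 31 of the previous year as the origin, because as.Date() adds the given number of days to the origin; with an origin of January 1 the resulting dates land one day later. The names toyCal, toyGreenUp and dateFull are hypothetical.

```r
library(dplyr)

# Hypothetical daily mean stages (not HBEF data)
toyCal <- tibble(
  YEAR = c(2019L, 2019L, 2019L, 2020L, 2020L),
  DOY = c(145L, 150L, 155L, 148L, 153L),
  averageStage = c(3.4, 3.8, 4.0, 3.6, 3.9)
)

toyGreenUp <- toyCal %>%
  filter(averageStage >= 3.75) %>%               # Days at or past the threshold
  group_by(YEAR) %>%
  slice_min(DOY, n = 1, with_ties = FALSE) %>%   # Earliest such day in each year
  ungroup() %>%
  # DOY 1 must map to January 1, so the origin is the day before the year starts
  mutate(dateFull = as.Date(DOY, origin = "2018-12-31"),
         date = format(dateFull, "%b-%d"))
```

The reference year only sets the month-day label; in leap years the true calendar date of a given DOY after February is one day earlier than this label.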

Temperature Data

Import

dailyTemp <- read_csv('https://pasta.lternet.edu/package/data/eml/knb-lter-hbr/59/10/9723086870f14b48409869f6c06d6aa8')

#dailyTemp <- read_csv('C:/Users/ahammond1/OneDrive - University of Massachusetts/DataSci/HBEF_air_temp_daily_1957-2021.csv')

(USDA Forest Service Northern Research Station, 2021a)

Cleaning

dailyTemp <- dailyTemp %>% # Use only HQ station for temp
  filter(STA == "HQ")      # HQ starts in 1956, other stations/locations start in later years

dailyTemp$date <- as.Date(dailyTemp$date, "%Y-%m-%d")

Normals

Using Daily Temperature Records from 1959 to 2021

normals <- dailyTemp %>%
  group_by(month = month(date), day = (day(date))) %>%
  summarise(count1 = n(),                         # How many years averaged together
            Tmin = mean(MIN), sdMin = sd(MIN),    # sd = standard deviation
            Tave = mean(AVE), sdAve = sd(AVE),
            Tmax = mean(MAX), sdMax = sd(MAX),
            sigma = sdMax,
            oneSigma = (Tmax + sigma),            # Events >= 1 standard deviation greater than normal
            twoSigma = (Tmax + (2*sigma)))         # Events >= 2 standard deviations greater than normal

normals$month <- as.integer(normals$month)
# Anchor each calendar day to a fixed non-leap year (2019) so that DOY does not
# depend on the year in which the code is run; February 29 becomes NA
normals$dateX <- as.Date(paste(2019, normals$month, normals$day, sep = "-"), format = "%Y-%m-%d")
normals$DOY <- as.integer(format(normals$dateX, format = "%j"))
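To make the thresholds concrete, here is a minimal sketch on made-up numbers (not HBEF data) for a single calendar day observed in five years. It computes the normal maximum, its standard deviation, and the one- and two-sigma thresholds the same way as above, then derives DOY against the fixed non-leap year 2019, which is an assumption of this sketch. The names toyTemp and toyNormals are hypothetical.

```r
library(dplyr)

# Hypothetical maximum temperatures for June 1 in five different years
toyTemp <- tibble(
  date = as.Date(c("2016-06-01", "2017-06-01", "2018-06-01",
                   "2019-06-01", "2020-06-01")),
  MAX = c(20, 22, 24, 26, 28)
)

toyNormals <- toyTemp %>%
  group_by(month = as.integer(format(date, "%m")),
           day = as.integer(format(date, "%d"))) %>%
  summarise(count1 = n(),                  # Number of years averaged
            Tmax = mean(MAX),              # Normal maximum: 24
            sigma = sd(MAX),               # Sample standard deviation: sqrt(10)
            oneSigma = Tmax + sigma,
            twoSigma = Tmax + 2 * sigma,
            .groups = "drop") %>%
  # Day of year of this month-day in the fixed non-leap year 2019
  mutate(DOY = as.integer(format(
    as.Date(paste(2019, month, day, sep = "-"), format = "%Y-%m-%d"), "%j")))
```

With these numbers a day counts as a one-sigma event at roughly 27.2 and as a two-sigma event at roughly 30.3.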

Filter: Spring Normals

Spring defined as January 1 through July 31

normalSPRING <- normals %>%
  filter(month >= 1 & month <= 7) %>%
  select(month, day, dateX, DOY, 
         Tmax, oneSigma, twoSigma)

Filter: Maximum Daily Temperatures

tempSpring <- dailyTemp %>%
  mutate(YEAR = as.integer(year(date)),     # Calendar year
         MONTH = as.integer(month(date)),   # Calendar month
         DOY = as.integer(yday(date))) %>%  # Day of year from the actual date
  filter(STA == "HQ" & 
         YEAR >= 1989 & YEAR <= 2020 &
         MONTH >= 1 & MONTH <= 7)

#remove(dailyTemp)
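DOY here comes from each record's actual date, while the normals are indexed to a single reference year. After February, the same calendar day therefore has a DOY that is one higher in leap years. A small base-R check of that shift follows; how much it affects the event counts is not evaluated here.

```r
# Day of year for the same calendar date in a non-leap year and a leap year
doyNonLeap <- as.integer(format(as.Date("2019-06-01"), "%j"))
doyLeap    <- as.integer(format(as.Date("2020-06-01"), "%j"))

# In leap years, a join on DOY pairs June 1 with the normal for June 2
shift <- doyLeap - doyNonLeap
```

Joining on month and day instead of DOY would be one way to avoid the shift.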

Megatable Formatted for tidyr

Merge Green-Up and Daily Normals

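# Pass each table once: piping a table in while also naming x and y hands inner_join() the table twice
tempSpringNORMALS <- inner_join(tempSpring, normalSPRING, by = "DOY")

# Columns present in both tables get suffixes: DOY.x is the daily record, DOY.y is Green-Up
tempSpNorGreen <- inner_join(tempSpringNORMALS, GreenUp, by = "YEAR")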

remove(tempSpringNORMALS)
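The DOY.x and DOY.y columns used in the next section come from this join: when both tables have a non-key column with the same name, dplyr appends the suffixes .x (left table) and .y (right table). A minimal sketch with made-up rows (toyDaily and toyGreen are hypothetical, not HBEF data):

```r
library(dplyr)

# Hypothetical rows: two daily records and one Green-Up row for the same year
toyDaily <- tibble(YEAR = c(2020L, 2020L), DOY = c(140L, 160L), MAX = c(25, 21))
toyGreen <- tibble(YEAR = 2020L, DOY = 153L)

toyJoined <- inner_join(toyDaily, toyGreen, by = "YEAR")

# DOY.x is the day of the temperature record, DOY.y the Green-Up day
toyJoined <- toyJoined %>%
  mutate(beforeGreenUp = DOY.x < DOY.y)
```

Here the first record (day 140) falls before Green-Up and the second (day 160) after it.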

Determine Maximum Temperature Events

An event is a day on which the recorded daily maximum temperature is greater than or equal to the normal maximum temperature for that day of year.

Events are counted separately before and after Green-Up, and again at two higher thresholds: at least one standard deviation above normal, and at least two standard deviations above normal.

# sum() of a logical counts the TRUE values, so a year with no events gets 0 rather than NA
events <- tempSpNorGreen %>%
  group_by(YEAR) %>%
  summarise(bGreenCt = sum(DOY.x < DOY.y & MAX >= Tmax, na.rm = TRUE),       # Before Green-Up, at or above normal
            bGreen1 = sum(DOY.x < DOY.y & MAX >= oneSigma, na.rm = TRUE),    # Before, >= 1 standard deviation above
            bGreen2 = sum(DOY.x < DOY.y & MAX >= twoSigma, na.rm = TRUE),    # Before, >= 2 standard deviations above
            aGreenCt = sum(DOY.x >= DOY.y & MAX >= Tmax, na.rm = TRUE),      # After Green-Up, at or above normal
            aGreen1 = sum(DOY.x >= DOY.y & MAX >= oneSigma, na.rm = TRUE),   # After, >= 1 standard deviation above
            aGreen2 = sum(DOY.x >= DOY.y & MAX >= twoSigma, na.rm = TRUE))   # After, >= 2 standard deviations above

events <- events %>%        # Rename Columns
  rename(
    "Before Green-Up" = bGreenCt,
    "Before + 1σ" = bGreen1,
    "Before + 2σ" = bGreen2,
    "After Green-Up" = aGreenCt,
    "After + 1σ" = aGreen1,
    "After + 2σ" = aGreen2)

Recode and Convert to Factors
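The code for this step is not shown here. As a minimal sketch of what a recode-and-factor step could look like, assuming the wide events table from the previous section is reshaped with tidyr's pivot_longer(), the counts can be stacked into one column and the event type stored as an ordered set of factor levels. The toy counts and the names toyEvents, eventsLong, eventType and count are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical yearly counts (not HBEF results), in the same wide layout
toyEvents <- tibble(
  YEAR = c(2019L, 2020L),
  `Before Green-Up` = c(60L, 55L),
  `After Green-Up` = c(30L, 28L)
)

eventsLong <- toyEvents %>%
  # One row per year and event type
  pivot_longer(cols = -YEAR, names_to = "eventType", values_to = "count") %>%
  # Explicit levels keep "Before" ahead of "After" in tables and plot legends
  mutate(eventType = factor(eventType,
                            levels = c("Before Green-Up", "After Green-Up")))
```

A long table with a factor column like this is the shape that grouped ggplot2 bar charts usually expect.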

Visualizations

Green-Up Date by Year

Normal Temperatures January 1st through July 31st

Normal Maximum Temperature plus 1 and 2 Standard Deviations

Number of Events Before vs After Green-Up

Reflections and Conclusions

The data sets used above do show a marked difference in extreme temperature events before versus after Green-Up; overall, there were almost twice as many events more than 2 standard deviations above normal before Green-Up as after. There are some early fluctuations, but a 32-year dataset is long enough to show some medium-term trends.

I am glad I picked a project and dataset where I already knew what to expect as an answer; it gave me the flexibility to work with R without worrying about how the evaluations would turn out. I relied heavily on the books available online for R Markdown and ggplot2, and I was lucky to work with others who knew R and could help me talk through some of the reasoning.

If I were to continue with this project, I would want to expand into evaluating minimum temperatures, as well as the Fall season when the leaves fall off the trees. It would also be interesting to look at the spatial aspects of the data. Phenology Stage was recorded in different parts of the valley, which are indicated in Image 1; for this project I simply averaged them together, but it would be interesting to see how Green-Up differs on the north side of the basin versus the south.

Table 1: Green-Up Date for 1989 to 2020
Year	Date	Day of Year
1989	May-31	150
1990	Jun-05	155
1991	May-29	148
1992	Jun-10	160
1993	Jun-08	158
1994	Jun-07	157
1995	Jun-06	156
1996	Jun-05	155
1997	Jun-17	167
1998	May-28	147
1999	Jun-02	152
2000	Jun-07	157
2001	May-30	149
2002	Jun-11	161
2003	Jun-17	167
2004	Jun-03	153
2005	Jun-07	157
2006	Jun-06	156
2007	May-30	149
2008	Jun-04	154
2009	Jun-09	159
2010	May-25	144
2011	Jun-01	151
2012	May-31	150
2013	May-29	148
2014	Jun-03	153
2015	May-27	146
2016	Jun-03	153
2017	May-31	150
2018	May-30	149
2019	Jun-11	161
2020	Jun-03	153

References

Campbell, John L., Charles T. Driscoll, Christopher Eagar, Gene E. Likens, Thomas G. Siccama, Chris E. Johnson, Timothy J. Fahey, et al. 2007. “Long-Term Trends from Ecosystem Research at the Hubbard Brook Experimental Forest.” Gen. Tech. Rep. NRS-17. Newtown Square, PA: U.S. Department of Agriculture, Forest Service, Northern Research Station. 41 p. https://doi.org/10.2737/NRS-GTR-17.

Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. https://www.jstatsoft.org/v40/i03/.

“Hubbard Brook Ecosystem Study.” n.d. Hubbard Brook Ecosystem Study. Accessed July 15, 2022. https://hubbardbrook.org/.

Mollie. 2013. “Date Formats in R R-Bloggers.” https://www.r-bloggers.com/2013/08/date-formats-in-r/.

R Core Team. 2022. “R: A Language and Environment for Statistical Computing.” R Foundation for Statistical Computing. https://www.R-project.org/.

RStudio Team. 2022. “RStudio: Integrated Development Environment for R.” RStudio, PBC. http://www.rstudio.com/.

USDA Forest Service Northern Research Station. 2021a. “Hubbard Brook Experimental Forest: Daily Temperature Record, 1955 - Present.” Environmental Data Initiative. https://doi.org/10.6073/PASTA/3AFAB60D54D5F2FCB1112E71F4BE2106.

———. 2021b. “Hubbard Brook Experimental Forest: Routine Seasonal Phenology Measurements, 1989 - Present.” Environmental Data Initiative. https://doi.org/10.6073/PASTA/F2C18A955C24EADAEC1FA0D915A7B527.

———. n.d. “Hubbard Brook Experimental Forest.” Hubbard Brook Experimental Forest. Accessed August 1, 2022. https://www.nrs.fs.fed.us/ef/locations/nh/hubbard-brook/.

Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022. Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.

Wickham, Hadley, and Maximilian Girlich. 2022. Tidyr: Tidy Messy Data. https://CRAN.R-project.org/package=tidyr.

Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. First edition. Sebastopol, CA: O’Reilly.

Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. 1st ed. Chapman; Hall/CRC. https://doi.org/10.1201/9781003097471.

Effects of Green-Up on Spring Maximum Temperatures
