Visualizing raw psy timeseries

Background

Fourteen stem psychrometers were installed on seven individual Juniperus osteosperma at an AmeriFlux eddy covariance site in southeastern Utah (US-CdM) between 2021-05-21 and 2021-11-06. Here, we combine and plot stem water potential alongside soil water content and vapor pressure deficit (VPD) to identify values that should be flagged.

Stem psychrometer data inputs

Individual .csv files were combined by S. Kannenberg and labeled with the associated tree and logger ID. These raw data were read in and the timestamps were rounded to the nearest half hour.

library(dplyr)      # %>%, rename(), mutate(), group_by(), summarize()
library(lubridate)  # round_date()

psy <- read.csv("../data_raw/Merged-psychrometer-data.csv",  encoding="UTF-8") %>%
  # Rename columns by position to short, syntactic names
  rename(chamber_temp = 5,
         dT = 6,
         wet_bulb = 7,
         psy = 8,
         correction_dT = 12,
         correction = 13,
         batt_volt = 14,
         int_batt_temp = 15,
         ext_power = 16,
         ext_power_volt = 17,
         ext_power_current = 18,
         comment = 19) %>%
  mutate(Date = as.POSIXct(Date, format = "%m/%d/%Y", tz = "America/Denver"),
         datetime = as.POSIXct(paste0(Date, Time), tz = "America/Denver"),
         # Round each timestamp to the nearest half hour
         dt = round_date(datetime, unit = "30 minutes"))
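
As a quick check of what rounding to the nearest half hour does, the sketch below rounds a few made-up timestamps using base R arithmetic on seconds since the epoch. The helper name `round_half_hour` and the example times are hypothetical, and UTC is used only to keep the example simple; the pipeline itself may rely on a different rounding function.

```r
# Hypothetical helper: round POSIXct times to the nearest 30 minutes
round_half_hour <- function(x, tz = "UTC") {
  secs <- as.numeric(x)                 # seconds since 1970-01-01 UTC
  rounded <- round(secs / 1800) * 1800  # 1800 s = 30 min
  as.POSIXct(rounded, origin = "1970-01-01", tz = tz)
}

# Made-up logger timestamps that drift off the half hour
x <- as.POSIXct(c("2021-05-21 12:01:10",
                  "2021-05-21 12:29:50",
                  "2021-05-21 12:44:59"), tz = "UTC")

format(round_half_hour(x), "%H:%M")
```

Two of the three example timestamps land on the same half hour, which is exactly the situation that produces duplicated values of dt.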

Occasionally, the psychrometer dataloggers record two measurements that round to the same half hour. We first count these duplicates for each tree and logger.

psy %>%
  group_by(Tree, Logger) %>%
  summarize(n = n(),
            n.dt = length(unique(dt)),
            ndiff = n - n.dt)

## # A tibble: 14 x 5
## # Groups:   Tree [7]
##     Tree Logger     n  n.dt ndiff
##    <int>  <int> <int> <int> <int>
##  1     1      1  8110  8108     2
##  2     1      2  6931  6931     0
##  3     2      1  8115  8114     1
##  4     2      2  8113  8113     0
##  5     3      1  8108  8105     3
##  6     3      2  8107  8106     1
##  7     4      1  5864  5861     3
##  8     4      2  8050  8049     1
##  9     5      1  3101  3101     0
## 10     5      2  8077  8077     0
## 11     6      1  3979  3978     1
## 12     6      2  5865  5865     0
## 13     7      1  8072  8072     0
## 14     7      2  8030  8028     2

Eight of the 14 loggers had at least one duplicated half hour, but none had more than three. Next, we average the duplicated measurements within each half hour.

psy2 <- psy %>%
  group_by(Tree, Logger, dt) %>%
  summarize(chamber_temp = mean(chamber_temp),
            dT = mean(dT),
            psy = mean(psy),
            Intercept = mean(Intercept),
            Slope = mean(Slope),
            EDBO = mean(EDBO),
            correction_dT = mean(correction_dT),
            correction = mean(correction))
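
To make the effect of this step concrete, here is a minimal sketch on made-up data, written in base R so it runs without extra packages. The data frame `toy` and its values are hypothetical; only the grouping logic (one mean per tree, logger, and half hour) mirrors the step above.

```r
# Made-up readings: two share the same half hour, one stands alone
toy <- data.frame(
  Tree   = c(1, 1, 1),
  Logger = c(1, 1, 1),
  dt     = as.POSIXct(c("2021-06-01 00:00",
                        "2021-06-01 00:00",
                        "2021-06-01 00:30"), tz = "UTC"),
  psy    = c(-1.0, -1.2, -1.5)
)

# One row per Tree x Logger x dt, with psy averaged across duplicates
toy_mean <- aggregate(psy ~ Tree + Logger + dt, data = toy, FUN = mean)
toy_mean
```

The two readings at 00:00 collapse to a single row, while the unduplicated reading passes through unchanged.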

Finally, let’s check the range of psy measurements for each tree and logger.

psy2 %>%
  group_by(Tree, Logger) %>%
  summarize(n = sum(!is.na(psy)),
            n_na = sum(is.na(psy)),
            n_12.5 = length(which(psy < -12.5)),
            n_0 = length(which(psy >= 0)),
            min_psy = min(psy, na.rm = TRUE),
            max_psy = max(psy, na.rm = TRUE))

## # A tibble: 14 x 8
## # Groups:   Tree [7]
##     Tree Logger     n  n_na n_12.5   n_0 min_psy max_psy
##    <int>  <int> <int> <int>  <int> <int>   <dbl>   <dbl>
##  1     1      1  8108     0      0  1751  -10.8     0   
##  2     1      2  6924     7      0  1335   -6.8     0.49
##  3     2      1  8114     0      1   565  -25.2     2.72
##  4     2      2  8113     0      0  1729   -4.44    0   
##  5     3      1  8105     0      0   132  -12.2     0   
##  6     3      2  8106     0      0     1   -8.19    0   
##  7     4      1  5767    94      0     6   -9       0   
##  8     4      2  8049     0      1  2474  -71.6     0   
##  9     5      1  3087    14      0    83   -7.36    0   
## 10     5      2  8060    17      1   146 -102.      0   
## 11     6      1  3494   484      0    57   -5.39    0   
## 12     6      2  5728   137      1   251  -49.9     0   
## 13     7      1  8056    16      0  1729   -8.07    0   
## 14     7      2  7875   153      0   762   -8.83    0

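The number of missing values per logger ranges from 0 to 484. We also remove points that are less than or equal to -12.5 or greater than or equal to 0; because filter() keeps only rows where the condition is TRUE, missing values are dropped as well.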

psy3 <- psy2 %>%
  filter(psy > -12.5 & psy < 0)
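
The boundary behavior of this range filter can be checked on a handful of made-up values. The sketch below uses base R `subset()`, which, like `filter()`, drops rows where the condition evaluates to NA; the data frame `toy_psy` is hypothetical.

```r
# Made-up values covering both bounds, the interior, and a missing value
toy_psy <- data.frame(psy = c(-25.2, -12.5, -3.1, 0, 0.49, NA))

# Keep only values strictly between -12.5 and 0
kept <- subset(toy_psy, psy > -12.5 & psy < 0)
kept$psy
```

Only the interior value survives: both endpoints are excluded because the comparisons are strict.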

Met data inputs

Tower and met data were collected starting in 2019. We select the relevant soil and met values for the time period of interest.

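library(readr)         # read_csv()
library(plantecophys)  # RHtoVPD(); package assumed, not stated in the source

# Column names sit on the second line of the file
col_names <- names(read_csv("../data_raw/Other-tower-data.csv",
                skip = 1, n_max = 0))

# Skip the four header lines and treat -9999 as missing
met <- read_csv("../data_raw/Other-tower-data.csv",
                skip = 4, 
                col_names = col_names, 
                na = c("-9999")) %>%
  mutate(dt = as.POSIXct(TIMESTAMP, format = "%m/%d/%Y %H:%M", 
                         tz = "America/Denver"),
         VPD_Avg = RHtoVPD(RH_Avg, AirTemp_Avg)) %>%
  select(dt, AirTemp_Avg, RH_Avg, VPD_Avg, 
         Precip_Tot, contains("VWC")) %>%
  # Restrict to the psychrometer measurement period
  filter(dt >= min(psy2$dt) & dt <= max(psy2$dt))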
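
VPD_Avg is derived from relative humidity and air temperature. As a rough illustration of that conversion, the sketch below uses the Tetens approximation for saturation vapor pressure. This is an assumption chosen for simplicity and is not necessarily the formula that RHtoVPD() uses internally, so small numerical differences are to be expected. The function name `vpd_tetens` is hypothetical.

```r
# Hypothetical helper: VPD (kPa) from relative humidity (%) and air temperature (deg C)
vpd_tetens <- function(rh, temp_c) {
  # Tetens approximation of saturation vapor pressure over water, in kPa
  es <- 0.6108 * exp(17.27 * temp_c / (temp_c + 237.3))
  # Deficit is the unsaturated fraction of es
  es * (1 - rh / 100)
}

# Saturated air has no deficit; at 25 deg C and 50% RH the deficit is about 1.6 kPa
vpd_tetens(rh = c(100, 50), temp_c = 25)
```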

Combine datasets for plotting

Next, we pivot the filtered psy values to wide format, with one column per tree and logger combination, and join the met data by timestamp.

psy_wide <- psy3 %>%
  ungroup() %>%  # drop grouping left over from summarize()
  select(Tree, Logger, dt, psy) %>%
  tidyr::pivot_wider(names_from = c(Tree, Logger),
                     values_from = psy) %>%
  left_join(met, by = "dt")  # explicit join key
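
The shape change is easier to see on a toy example. The sketch below is a base R analogue of the step above, using `reshape()` and `merge()` so it runs without packages; the data frames `long` and `met_toy` are made up. Note that `reshape()` prefixes the new columns with the value name (`psy.1_1`), whereas `tidyr::pivot_wider()` with two `names_from` columns joins them with an underscore (`1_1`).

```r
# Made-up long-format data: two trees, one logger each, two half hours
long <- data.frame(
  Tree   = c(1, 1, 2, 2),
  Logger = c(1, 1, 1, 1),
  dt     = rep(as.POSIXct(c("2021-06-01 00:00", "2021-06-01 00:30"), tz = "UTC"),
               times = 2),
  psy    = c(-1.2, -1.4, -2.0, -2.1)
)

# One identifier per tree-logger pair, then one column per identifier
long$id <- paste(long$Tree, long$Logger, sep = "_")
wide <- reshape(long[, c("dt", "id", "psy")],
                idvar = "dt", timevar = "id", direction = "wide")

# Made-up met record for only one of the two timestamps
met_toy <- data.frame(dt = as.POSIXct("2021-06-01 00:00", tz = "UTC"),
                      VPD_Avg = 0.8)

# Left join: keep every psychrometer timestamp, fill missing met with NA
joined <- merge(wide, met_toy, by = "dt", all.x = TRUE)
joined
```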

Maintenance days

Monthly site visits were made, wherein the psychrometer sensors were removed, cleaned, and reattached to a fresh branch on the same tree. We expect some odd values during these periods and want to highlight them on the visualization.

maint <- read_csv("../data_raw/Maintenance-times.csv") %>%
  mutate(Start = as.POSIXct(Start, tz = "America/Denver"),
         End = as.POSIXct(End, tz = "America/Denver"))

Dates to skip

Psychrometers can fail to record a realistic \(\Psi\) for a number of reasons, most commonly the loss of contact or of a good seal between the living xylem and the chamber. We have identified three kinds of problem periods: contact deteriorates over time until the chamber is re-installed (‘Loss of contact’); contact is simply poor and produces spiky readings for a period before resolving on its own (‘Poor contact, very spiky’); or particular spikes are noted, often across loggers and trees and potentially associated with weather phenomena (‘Unknown spike’). We read in these dates, which can be manually updated during data cleaning.

skip <- read_csv("../data_raw/Psy-skip-dates.csv") %>%
  mutate(Start = as.POSIXct(st, format = "%m/%d/%Y %H:%M:%S", tz = "America/Denver"),
         End = as.POSIXct(en, format = "%m/%d/%Y %H:%M:%S", tz = "America/Denver"))

Creating interactive plots

For each tree, we would like to interactively plot the time series from both loggers and match with a second panel of VPD and soil VWC, possibly at two depths. The dygraph input needs to be an xts object.
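
A minimal sketch of one such plot is given below, assuming the xts and dygraphs packages are installed. The data are made up, the object names (`toy`, `tree1_xts`, `p_psy`, `p_vpd`) are hypothetical, and the shaded window stands in for a maintenance period; the plots actually used in this analysis may be configured differently. The code is wrapped in a package check so that it is skipped cleanly where the packages are unavailable.

```r
if (requireNamespace("xts", quietly = TRUE) &&
    requireNamespace("dygraphs", quietly = TRUE)) {

  # Made-up wide data: two loggers on one tree plus VPD, six half hours
  toy <- data.frame(
    dt      = seq(as.POSIXct("2021-06-01 00:00", tz = "UTC"),
                  by = "30 min", length.out = 6),
    `1_1`   = c(-1.2, -1.3, -1.5, -1.8, -1.6, -1.4),
    `1_2`   = c(-1.1, -1.2, -1.6, -1.9, -1.7, -1.3),
    VPD_Avg = c(0.4, 0.5, 0.9, 1.4, 1.2, 0.8),
    check.names = FALSE
  )

  # dygraph() expects a time-indexed object, so convert to xts first
  tree1_xts <- xts::xts(as.matrix(toy[, c("1_1", "1_2")]), order.by = toy$dt)
  vpd_xts   <- xts::xts(as.matrix(toy[, "VPD_Avg", drop = FALSE]), order.by = toy$dt)

  # Panel 1: both loggers, with a shaded example window and a range selector
  p_psy <- dygraphs::dygraph(tree1_xts, main = "Tree 1", group = "tree1")
  p_psy <- dygraphs::dyShading(p_psy, from = toy$dt[3], to = toy$dt[4])
  p_psy <- dygraphs::dyRangeSelector(p_psy)

  # Panel 2: VPD; sharing the same group keeps zooming synchronized
  p_vpd <- dygraphs::dygraph(vpd_xts, main = "VPD", group = "tree1")
}
```

Printing `p_psy` and `p_vpd` in an R Markdown chunk renders the two panels; sharing the `group` value is what links their zoom and pan.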

Tree 1

Tree 2

Tree 3

Tree 4

Tree 5

Tree 6

Tree 7

Edit timeseries

Remove the maintenance periods (applied to all loggers) and the skip periods (tree- and logger-specific) from the half-hourly timeseries, then write the result to a .csv file.

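# Function to flag timestamps that fall within any maintenance period
# Returns a logical vector the same length as vals
maint.overlap <- function(vals) {
  rowSums(mapply(function(a,b) between(vals, a, b), 
                 maint$Start, maint$End)) > 0
}

# Drop all rows recorded during maintenance
psy4 <- psy3 %>%
  filter(!maint.overlap(dt))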
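
The interval test can be sanity-checked on made-up times. The sketch below is a base R version of the same idea; the function name `in_any_window` and all timestamps are hypothetical. Both endpoints are treated as inclusive, matching the behavior of `between()`.

```r
# Hypothetical helper: TRUE where a timestamp falls inside any [start, end] window
in_any_window <- function(vals, starts, ends) {
  # One logical column per window, one row per timestamp
  hits <- vapply(seq_along(starts),
                 function(i) vals >= starts[i] & vals <= ends[i],
                 logical(length(vals)))
  rowSums(matrix(hits, nrow = length(vals))) > 0
}

# Made-up measurement times and two made-up maintenance windows
vals   <- as.POSIXct(c("2021-06-10 09:00", "2021-06-10 12:00",
                       "2021-07-08 10:30", "2021-07-09 10:30"), tz = "UTC")
starts <- as.POSIXct(c("2021-06-10 11:00", "2021-07-08 10:00"), tz = "UTC")
ends   <- as.POSIXct(c("2021-06-10 14:00", "2021-07-08 13:00"), tz = "UTC")

in_any_window(vals, starts, ends)
```

The second and third timestamps fall inside a window and would be removed; the first and last would be kept.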

# Expand each skip period into a sequence of half-hourly timestamps
skip_hourly <- data.frame()
for(i in seq_len(nrow(skip))) {
  temp <- data.frame(dt = seq(skip$Start[i], skip$End[i], by = 30*60)) %>%
    mutate(Tree = skip$Tree[i],
           Logger = skip$Logger[i])
  skip_hourly <- rbind.data.frame(skip_hourly, temp)
}

# Remove rows that match a skipped tree, logger, and half hour
psy5 <- anti_join(psy4, skip_hourly, by = c("Tree", "Logger", "dt"))
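
The expand-then-exclude logic can be illustrated on a toy example. The sketch below uses base R only, with made-up data frames (`skip_toy`, `psy_toy`) and a hypothetical `key()` helper that stands in for the join columns. Because rows are matched on exact timestamps, this approach only removes rows when the skip start times fall on the half hour.

```r
# Made-up skip period: tree 1, logger 2, from 00:00 to 01:00
skip_toy <- data.frame(
  Tree = 1, Logger = 2,
  Start = as.POSIXct("2021-06-01 00:00", tz = "UTC"),
  End   = as.POSIXct("2021-06-01 01:00", tz = "UTC")
)

# Expand each period into half-hourly steps (30 * 60 seconds)
skip_steps <- do.call(rbind, lapply(seq_len(nrow(skip_toy)), function(i) {
  data.frame(Tree = skip_toy$Tree[i],
             Logger = skip_toy$Logger[i],
             dt = seq(skip_toy$Start[i], skip_toy$End[i], by = 30 * 60))
}))

# Made-up measurements: only the second row matches a skipped step
psy_toy <- data.frame(
  Tree = 1, Logger = c(1, 2, 2),
  dt = as.POSIXct(c("2021-06-01 00:30", "2021-06-01 00:30",
                    "2021-06-01 01:30"), tz = "UTC"),
  psy = c(-1.1, -9.9, -1.3)
)

# Hypothetical key combining tree, logger, and timestamp
key <- function(d) paste(d$Tree, d$Logger, format(d$dt, "%Y-%m-%d %H:%M", tz = "UTC"))

# Keep rows whose key does not appear among the skipped steps
kept <- psy_toy[!(key(psy_toy) %in% key(skip_steps)), ]
kept
```

The reading from logger 1 at 00:30 is retained even though logger 2 is skipped at that time, which is the tree- and logger-specific behavior described above.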

write_csv(psy5, file = "../data_cleaned/psy_hourly.csv")