ola-vwc-check

Setup

Load 2023 and 2024 TMP Fresh plot VWC data:

library(readr)
library(dplyr)
library(lubridate)

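# L1_DATA (the root of the L1 data folder) must be defined before this point.
# The dot before "csv" is escaped so it matches a literal "." only.
files2023 <- list.files(file.path(L1_DATA, "TMP_2023/"),
                        pattern = "soil-vwc.*2-0\\.csv$",
                        full.names = TRUE)
files2024 <- list.files(file.path(L1_DATA, "TMP_2024/"),
                        pattern = "soil-vwc.*2-0\\.csv$",
                        full.names = TRUE)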

dat_list <- lapply(c(files2023, files2024), function(f) {
    # read files and keep only June Fresh data
    read_csv(f, col_types = "ccTccccdcclll") %>%
        filter(month(TIMESTAMP) == 6, Plot == "F")
})
dat <- bind_rows(dat_list)

# Add a few columns to make the data easier to work with 
dat$name <- paste(dat$Sensor_ID, dat$Location)
dat$yday <- yday(dat$TIMESTAMP)
dat$year <- as.factor(year(dat$TIMESTAMP))
dat$hour <- hour(dat$TIMESTAMP)
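
Because list.files() returns an empty vector without complaint when nothing matches, a wrong path or pattern would silently produce an empty data frame. A minimal sketch of a guard follows. It runs against a throwaway directory, and the file names in it are made up for illustration; they are not the real L1 file names.

```r
# Throwaway directory with hypothetical file names (illustration only)
demo_dir <- file.path(tempdir(), "TMP_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("TMP_soil-vwc_L1_v2-0.csv",
                                  "TMP_soil-vwc_L1_v2-0.csv.bak",
                                  "TMP_soil-temp_L1_v2-0.csv")))

# Same kind of pattern as above, with the dot escaped
found <- list.files(demo_dir, pattern = "soil-vwc.*2-0\\.csv$", full.names = TRUE)

# Stop early rather than silently binding zero rows later on
if (length(found) == 0) stop("No VWC files found in ", demo_dir)
print(basename(found))
```

The same check could be applied to files2023 and files2024 before the lapply() call.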

Plot the raw data for June 2023 and 2024

library(ggplot2)
p <- ggplot(dat, aes(yday + hour/24, Value, color = year)) +
    facet_wrap(~name) +
    coord_cartesian(ylim = c(0, 1)) +
    geom_line(na.rm = TRUE)
print(p)

Compute average hourly data

The sampling interval might differ between years, so average the data to hourly values to make the two years directly comparable.

# Average all readings that fall within the same sensor-hour
dat_hourly <- dat %>%
    group_by(Plot, Location, research_name, year, yday, hour, name) %>%
    summarise(Value = mean(Value, na.rm = TRUE), .groups = "drop")

# Redraw the raw-data plot with the hourly averages swapped in
p %+% dat_hourly
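
Whether the sampling interval really differs between years can be checked directly from the timestamps. The sketch below tabulates the gap between consecutive readings per year. It uses toy data, and the 15 minute and 60 minute intervals are invented for illustration; they are not taken from the TMP files.

```r
library(dplyr)

# Toy stand-in for `dat`: made-up logging intervals, illustration only
toy <- bind_rows(
    data.frame(name = "T171 C3", year = "2023",
               TIMESTAMP = seq(as.POSIXct("2023-06-01 00:00", tz = "UTC"),
                               by = "15 min", length.out = 8)),
    data.frame(name = "T171 C3", year = "2024",
               TIMESTAMP = seq(as.POSIXct("2024-06-01 00:00", tz = "UTC"),
                               by = "60 min", length.out = 8))
)

intervals <- toy %>%
    arrange(name, TIMESTAMP) %>%
    group_by(name, year) %>%
    # Minutes between consecutive readings of the same sensor
    mutate(interval_min = as.numeric(difftime(TIMESTAMP,
                                              dplyr::lag(TIMESTAMP),
                                              units = "mins"))) %>%
    ungroup() %>%
    filter(!is.na(interval_min)) %>%   # drop the first reading of each group
    count(year, interval_min)
print(intervals)
```

Running the same pipeline on dat would show one row per distinct interval and year, so any irregular gaps would also appear.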

Focus on “TEROS C3 (FW)”

Olawale’s document spotlights “TEROS C3 (FW) at 5cm,” among others, so take a closer look at that sensor:

# Keep only the 5 cm VWC sensor at Fresh plot location C3
dat_hourly_FreshC3_5cm <- dat_hourly %>%
    filter(Location == "C3", Plot == "F", research_name == "soil-vwc-5cm")
print(unique(dat_hourly_FreshC3_5cm$name))

[1] "T171 C3"

ggplot(dat_hourly_FreshC3_5cm, aes(yday + hour/24, Value, color = year)) +
    geom_point() + 
    facet_wrap(~name)

Note that there are three different sensors in Fresh C3 (T041, T168, and T171; see the table below), and only T171 matches the 5 cm filter above, so “TEROS C3” alone is not a sufficient identifier.
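
To find every location where the grid cell alone is ambiguous, count the distinct sensors per location. This is only a sketch: the sensor and location pairs below are a small subset copied from the names in the summary table further down, whereas the real check would start from dat.

```r
library(dplyr)

# A few Sensor_ID / Location pairs taken from the names in this document
sensors <- data.frame(
    Sensor_ID = c("T041", "T168", "T171", "T170", "T172"),
    Location  = c("C3",   "C3",   "C3",   "B4",   "C4")
)

shared <- sensors %>%
    distinct(Location, Sensor_ID) %>%
    group_by(Location) %>%
    summarise(n_sensors = n(),
              sensors = paste(Sensor_ID, collapse = ", "),
              .groups = "drop") %>%
    filter(n_sensors > 1)   # keep locations with more than one sensor
print(shared)
```

On the full data set, replacing sensors with dat would list every location that needs a sensor ID or depth to be identified unambiguously.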

Numerical summary of observations by sensor

Compute the number of June hourly data points by sensor:

library(tidyr)
dat_hourly %>%
    group_by(Plot, year, name) %>%
    summarise(n = n(), .groups = "drop") %>%
    pivot_wider(names_from = "year", values_from = "n") %>%
    mutate(change_pct = (`2024` - `2023`) / `2023` * 100) %>%
    arrange(desc(change_pct)) %>% 
    knitr::kable()

Plot	name	2023	2024
F	T001 F3	720	720
F	T002 G3	720	720
F	T003 I3	720	720
F	T004 G4	720	720
F	T005 H4	720	720
F	T006 F5	720	720
F	T007 G5	720	720
F	T008 H5	720	720
F	T009 I5	720	720
F	T011 D6	720	720
F	T012 F2	720	720
F	T013 H2	720	720
F	T014 G6	720	720
F	T015 E7	720	720
F	T016 C7	720	720
F	T017 G7	720	720
F	T018 I7	720	720
F	T025 E6	720	720
F	T026 F6	720	720
F	T041 C3	720	720
F	T043 H6	720	720
F	T044 H6	720	720
F	T053 E3	720	720
F	T076 B2	720	720
F	T077 D2	720	720
F	T078 D3	720	720
F	T082 D4	720	720
F	T083 E4	720	720
F	T084 C5	720	720
F	T086 E5	720	720
F	T118 C6	720	720
F	T119 C6	720	720
F	T120 C6	720	720
F	T121 H3	720	720
F	T122 H3	720	720
F	T123 H3	720	720
F	T124 F4	720	720
F	T125 F4	720	720
F	T126 F4	720	720
F	T163 H6	720	720
F	T168 C3	720	720
F	T169 D5	720	720
F	T170 B4	720	720
F	T171 C3	720	720
F	T172 C4	720	720
F	T183 B6	720	720

(720 is 30 days in June times 24 hours per day. Because the counts are identical in both years, change_pct is 0 for every sensor.)

In summary, every Fresh plot sensor has the full 720 hourly values for June in both 2023 and 2024, so I don’t see any data gap.
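
One possible follow-up check: n() counts rows, and an hour whose readings are all NA still produces a row, because mean(..., na.rm = TRUE) returns NaN for it. Counting non-missing values alongside the rows would confirm that the 720 rows all hold real measurements. The sketch below uses invented sensor names and values purely to show the difference.

```r
library(dplyr)

# mean() of an all-NA hour is NaN, but the summarised row still exists
stopifnot(is.nan(mean(c(NA_real_, NA_real_), na.rm = TRUE)))

# Toy hourly data (hypothetical names and values): sensor_B has one empty hour
toy_hourly <- data.frame(
    name  = rep(c("sensor_A", "sensor_B"), each = 3),
    hour  = rep(0:2, times = 2),
    Value = c(0.31, 0.30, 0.29, 0.42, NaN, 0.40)
)

coverage <- toy_hourly %>%
    group_by(name) %>%
    summarise(n_rows  = n(),                 # what the table above counts
              n_valid = sum(!is.na(Value)),  # hours with a usable value
              .groups = "drop")
print(coverage)
```

If n_valid equals n_rows for every sensor in dat_hourly, the no-gap conclusion holds for the values as well as for the timestamps.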