This notebook performs an end-to-end wastewater-based epidemiology (WBE) analysis of SARS-CoV-2 surveillance data from the Point Loma WWTP sewershed (San Diego, CA), covering the five-month window October 1, 2021 – February 28, 2022. This window spans the tail end of the Delta wave and the full emergence of Omicron — a period analyzed in Karthikeyan et al., 2022 (Nature), which demonstrated that wastewater sequencing detected variant emergence weeks before clinical confirmation.
The analysis follows seven sequential steps mirroring operational biosurveillance workflows: Sample QC → Concentration transformation → Baseline & activity-level classification → Trend detection → Variant deconvolution → Emerging lineage detection → Clinical integration.
# Install any missing packages automatically
pkgs <- c(
"tidyverse", "lubridate", "zoo", "slider", "TTR",
"changepoint", "trend", "broom", "knitr", "kableExtra",
"scales", "patchwork", "ggrepel", "viridis", "RColorBrewer"
)
to_install <- pkgs[!pkgs %in% installed.packages()[, "Package"]]
if (length(to_install)) install.packages(to_install, repos = "https://cloud.r-project.org")
library(tidyverse)
library(lubridate)
library(zoo)
library(slider)
library(TTR) # EMA / SMA helpers
library(changepoint)
library(trend) # Mann-Kendall
library(broom)
library(knitr)
library(kableExtra)
library(scales)
library(patchwork)
library(ggrepel)
library(viridis)
library(RColorBrewer)
library(readr)
theme_set(theme_bw(base_size = 12))Note on data access: The dataset is distributed via the Latch Bio console (https://console.latch.bio/s/20805773099779822). Download the bundle
sars_cov2_wastewater_takehome/and place the three files —qpcr.csv,freyja.tsv, andcases.csv— in your working directory before knitting.
## [1] "C:/Users/Shahab/Desktop/Homework"
setwd("C:/Users/Shahab/Desktop/Homework")
# ── Load raw data ─────────────────────────────────────────────────────────────
qpcr <- read_csv("qpcr.csv", show_col_types = FALSE)
freyja <- read_tsv("freyja.tsv", show_col_types = FALSE)
cases <- read_csv("cases.csv", show_col_types = FALSE)
# ── Harmonise date columns ────────────────────────────────────────────────────
qpcr <- qpcr %>% mutate(date = as_date(date))
freyja <- freyja %>% mutate(date = as_date(date))
cases <- cases %>% mutate(week_ending = as_date(week_ending))
# ── Quick look ────────────────────────────────────────────────────────────────
glimpse(qpcr)## Rows: 60
## Columns: 9
## $ date <date> 2021-10-04, 2021-10-06, 2021-10-10, 2021-10-11, 2…
## $ site <chr> "PointLoma", "PointLoma", "PointLoma", "PointLoma"…
## $ pcr_target_avg_conc <dbl> 248673, 218099, 248399, 409576, 653372, 703496, 52…
## $ pmmov_copies_per_L <dbl> 31884542, 40272628, 45881970, 53495335, 35787131, …
## $ flow_mgd <dbl> 166.08, 154.12, 131.01, 160.94, 146.79, 130.32, 14…
## $ detection_flag <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ lab_method <chr> "ddPCR_N1N2_PEG_concentrated", "ddPCR_N1N2_PEG_con…
## $ replicate_cv <dbl> 0.316, 0.174, 0.311, 0.177, 0.104, 0.191, 0.259, 0…
## $ population_served <dbl> 2300000, 2300000, 2300000, 2300000, 2300000, 23000…
## Rows: 40
## Columns: 59
## $ date <date> 2021-10-04, 2021-10-06, 2021-10-10, 2021-1…
## $ site <chr> "PointLoma", "PointLoma", "PointLoma", "Poi…
## $ BA.1 <dbl> 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0…
## $ BA.1.1.X <dbl> 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0…
## $ BA.2.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BA.2.12.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BA.4.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BA.5.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ B.1.1.529 <dbl> 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0…
## $ AY.113 <dbl> 0.018, 0.003, 0.014, 0.017, 0.094, 0.175, 0…
## $ AY.100 <dbl> 0.080, 0.000, 0.288, 0.053, 0.222, 0.107, 0…
## $ AY.20 <dbl> 0.000, 0.011, 0.077, 0.042, 0.051, 0.061, 0…
## $ AY.25 <dbl> 0.371, 0.437, 0.111, 0.011, 0.211, 0.111, 0…
## $ AY.3 <dbl> 0.019, 0.000, 0.012, 0.000, 0.059, 0.107, 0…
## $ AY.44 <dbl> 0.131, 0.138, 0.333, 0.406, 0.121, 0.107, 0…
## $ AY.119 <dbl> 0.011, 0.000, 0.000, 0.013, 0.013, 0.025, 0…
## $ AY.3.1 <dbl> 0.007, 0.000, 0.000, 0.001, 0.000, 0.050, 0…
## $ AY.103 <dbl> 0.345, 0.410, 0.165, 0.449, 0.167, 0.143, 0…
## $ AY.46.4 <dbl> 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0…
## $ AY.25.1 <dbl> 0.000, 0.000, 0.000, 0.000, 0.018, 0.026, 0…
## $ AY.116 <dbl> 0.000, 0.000, 0.000, 0.000, 0.041, 0.044, 0…
## $ AY.43.4 <dbl> 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0…
## $ `Other Delta sub-lineages` <dbl> 0.017, 0.000, 0.000, 0.009, 0.006, 0.045, 0…
## $ BA.2.75 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BA.4.6 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BQ.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BQ.1.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BF.7.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BN.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XBB.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XBB.1.5.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XBB.1.9.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XBB.1.16.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XBB.2.3.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ EG.5.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BA.2.86.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ HV.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ JN.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ JN.1.7.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ JN.1.4.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ KQ.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ JN.1.11.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ KP.2.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ Recombinants <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ Other <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XDP.1 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ JN.1.16.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XDP <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ KP.3.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ LB.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ KP.1.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ KP.3.1.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ Delta <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XEC.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ LP.8.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ LP.8.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ NB.1.8.1.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ XFG.X <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ residual <dbl> 0.001, 0.001, 0.000, 0.000, 0.000, 0.000, 0…
## Rows: 23
## Columns: 4
## $ week_ending <date> 2021-10-02, 2021-10-09, 2021-10-16, 2021-10-23, 2021-…
## $ county <chr> "San Diego", "San Diego", "San Diego", "San Diego", "S…
## $ weekly_cases <dbl> 836, 3250, 3070, 3451, 3381, 3375, 3241, 3386, 3073, 5…
## $ n_days_reported <dbl> 2, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, …
Raw wastewater measurements are subject to several sources of technical noise: inhibited PCR reactions, failed extractions, atypical flow events (rain dilution, industrial discharge), and below-LOD samples. Before any epidemiological inference we must identify and flag low-quality samples using orthogonal indicators built into the data:
| QC criterion | Variable used | Rationale |
|---|---|---|
| Detection flag | detection_flag |
Lab-reported below-LOD |
| Replicate CV | replicate_cv |
High CV (>25–30 %) indicates technical irreproducibility |
| PMMoV anchor | pmmov_copies_per_L |
Pepper Mild Mottle Virus is a human-fecal indicator; extreme outliers suggest matrix failure |
| Flow rate | flow_mgd |
Extreme flow dilutes viral signal; flag samples outside ±3 SD of site mean |
PMMoV is used as a fecal-strength indicator but not as a normalization divisor in the primary WVAL calculation (following the August 2025 CDC update reverting to non-normalized data). It is retained here for QC and as a sensitivity-check covariate.
# ── Flag rules ────────────────────────────────────────────────────────────────
flow_mean <- mean(qpcr$flow_mgd, na.rm = TRUE)
flow_sd <- sd(qpcr$flow_mgd, na.rm = TRUE)
pmmov_mean <- mean(log10(qpcr$pmmov_copies_per_L + 1), na.rm = TRUE)
pmmov_sd <- sd(log10(qpcr$pmmov_copies_per_L + 1), na.rm = TRUE)
qpcr_qc <- qpcr %>%
mutate(
# Individual QC flags (TRUE = problem)
flag_detection = detection_flag %in% c("non-detect", "below_LOD", FALSE) |
pcr_target_avg_conc <= 0,
flag_cv = !is.na(replicate_cv) & replicate_cv > 30,
flag_flow = !is.na(flow_mgd) &
abs(flow_mgd - flow_mean) > 3 * flow_sd,
flag_pmmov = !is.na(pmmov_copies_per_L) &
abs(log10(pmmov_copies_per_L + 1) - pmmov_mean) > 3 * pmmov_sd,
# Composite
qc_fail = flag_detection | flag_cv | flag_flow | flag_pmmov,
qc_pass = !qc_fail
)
# ── QC summary table ──────────────────────────────────────────────────────────
qpcr_qc %>%
summarise(
n_total = n(),
n_nondetect = sum(flag_detection, na.rm = TRUE),
n_high_cv = sum(flag_cv, na.rm = TRUE),
n_extreme_flow = sum(flag_flow, na.rm = TRUE),
n_extreme_pmmov= sum(flag_pmmov, na.rm = TRUE),
n_fail_any = sum(qc_fail, na.rm = TRUE),
pct_pass = round(100 * mean(qc_pass, na.rm = TRUE), 1)
) %>%
kable(caption = "Table 1. QC flag summary") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE)| n_total | n_nondetect | n_high_cv | n_extreme_flow | n_extreme_pmmov | n_fail_any | pct_pass |
|---|---|---|---|---|---|---|
| 60 | 0 | 0 | 1 | 1 | 2 | 96.7 |
p_flow <- ggplot(qpcr_qc, aes(date, flow_mgd, colour = qc_fail)) +
geom_point(size = 1.8, alpha = 0.8) +
scale_colour_manual(values = c("FALSE" = "steelblue", "TRUE" = "firebrick"),
labels = c("Pass", "Fail")) +
labs(title = "Flow rate over time", x = NULL, y = "Flow (MGD)", colour = "QC") +
theme(legend.position = "top")
p_cv <- ggplot(qpcr_qc %>% filter(!is.na(replicate_cv)),
aes(date, replicate_cv, colour = flag_cv)) +
geom_point(size = 1.8, alpha = 0.8) +
geom_hline(yintercept = 30, linetype = "dashed", colour = "firebrick") +
scale_colour_manual(values = c("FALSE" = "steelblue", "TRUE" = "firebrick"),
labels = c("<= 30 %", "> 30 %")) +
labs(title = "Replicate CV over time",
x = NULL, y = "Replicate CV (%)", colour = "CV flag") +
theme(legend.position = "top")
p_pmmov <- ggplot(qpcr_qc, aes(date, log10(pmmov_copies_per_L + 1),
colour = flag_pmmov)) +
geom_point(size = 1.8, alpha = 0.8) +
scale_colour_manual(values = c("FALSE" = "steelblue", "TRUE" = "firebrick"),
labels = c("Normal", "Outlier")) +
labs(title = "PMMoV (log₁₀) over time",
x = NULL, y = "log₁₀ PMMoV (copies/L)", colour = "PMMoV flag") +
theme(legend.position = "top")
p_flow / p_cv / p_pmmov# Working dataset: QC-pass only, positive concentrations
qpcr_clean <- qpcr_qc %>%
filter(qc_pass, pcr_target_avg_conc > 0) %>%
arrange(date)
cat("Samples retained after QC:", nrow(qpcr_clean), "/", nrow(qpcr_qc), "\n")## Samples retained after QC: 58 / 60
After applying multi-criterion QC (non-detect flag, replicate CV > 30 %, ±3 SD flow outlier, ±3 SD PMMoV outlier), which statement best describes the overall data quality and the principal reason samples failed QC at Point Loma during this surveillance window?
A) The majority of samples failed QC, with high replicate CV being the dominant failure mode — consistent with sustained PCR inhibition in this matrix.
B) A small minority of samples failed QC; the dominant failure mode was extreme flow-rate events (rain dilution or industrial surge), not PCR irreproducibility.
C) Failure rates were moderate (~20–30 %); non-detect flags and extreme PMMoV outliers contributed equally, suggesting intermittent extraction failure rather than any single process breakdown.
D) Nearly all samples passed QC; the few failures were driven primarily by below-LOD non-detects at the very beginning of the window before viral concentrations rose to reliably detectable levels.
Raw ddPCR viral RNA concentrations in wastewater span several orders of magnitude (~10²–10⁶ copies/L) and are right-skewed. For downstream regression, baseline estimation, and epidemiological trend detection, a log₁₀ transformation is standard in the WBE literature (Karthikeyan et al. 2022; CDC NWSS). This:
We also compute a 7-day centred rolling median (robust to single outliers) for trend visualisation, consistent with the rolling-average approach used in the Karthikeyan et al. paper.
qpcr_t <- qpcr_clean %>%
mutate(
log10_conc = log10(pcr_target_avg_conc),
# 7-day centred rolling median (robust smoother)
roll7_median = slide_dbl(log10_conc, median, na.rm = TRUE,
.before = 3, .after = 3),
# 7-day rolling mean (for WVAL baseline later)
roll7_mean = slide_dbl(log10_conc, mean, na.rm = TRUE,
.before = 3, .after = 3)
)ggplot(qpcr_t, aes(date)) +
geom_point(aes(y = log10_conc), colour = "grey60", size = 1.4, alpha = 0.7) +
geom_line(aes(y = roll7_median), colour = "steelblue", linewidth = 1.1) +
scale_x_date(date_breaks = "2 weeks", date_labels = "%b %d") +
labs(
title = "SARS-CoV-2 RNA concentration over time (log₁₀-transformed)",
subtitle = "Points = daily ddPCR; line = 7-day centred rolling median",
x = NULL, y = "log₁₀ copies / L"
) +
theme(axis.text.x = element_text(angle = 30, hjust = 1))# Assess normality before vs after transform
p_raw <- ggplot(qpcr_clean, aes(pcr_target_avg_conc)) +
geom_histogram(bins = 30, fill = "steelblue", colour = "white") +
labs(title = "Raw concentration", x = "copies/L", y = "Count")
p_log <- ggplot(qpcr_t, aes(log10_conc)) +
geom_histogram(bins = 30, fill = "darkorange", colour = "white") +
stat_function(fun = function(x)
dnorm(x, mean(qpcr_t$log10_conc), sd(qpcr_t$log10_conc)) * nrow(qpcr_t) *
diff(range(qpcr_t$log10_conc)) / 30,
colour = "black", linewidth = 1) +
labs(title = "log₁₀ concentration", x = "log₁₀ copies/L", y = "Count")
p_raw + p_logAfter applying log₁₀ transformation to the raw ddPCR concentration data from Point Loma, what does the shape of the log-transformed distribution and the trajectory of the 7-day rolling median together suggest about SARS-CoV-2 circulation dynamics during this five-month window?
A) The log-transformed values are uniformly distributed and the rolling median is flat, indicating stable endemic transmission with no significant waves during October 2021 – February 2022.
B) The distribution is bimodal and the rolling median shows two distinct elevated plateaus of equal magnitude, consistent with two independent epidemic waves of similar peak intensity.
C) The log-transformed values approach normality and the rolling median reveals at least one major ascending phase reaching a clear peak, followed by evidence of a secondary rise — consistent with sequential variant waves (Delta tail → Omicron emergence) rather than stable endemic transmission.
D) The transformation yields a strongly left-skewed distribution, indicating that most samples captured the peak of a single dominant wave with rapid post-peak decline.
The CDC Wastewater Viral Activity Level (WVAL) standardises site-level measurements against a historical baseline so that activity levels can be compared across sites and time. The core calculation is:
\[\text{WVAL} = \frac{C_{\text{recent}} - \mu_{\text{baseline}}}{\sigma_{\text{baseline}}}\]
where \(C_{\text{recent}}\) is a smoothed (7-day median) log₁₀ concentration, and \(\mu\) and \(\sigma\) are the mean and standard deviation computed over a baseline window of low-activity periods.
For a 5-month window, we define the baseline as the first 8 weeks of data (the minimum CDC requires before publishing a WVAL), which in this dataset corresponds to October–November 2021 — a period that preceded the Omicron surge and therefore represents a relatively low baseline consistent with Delta wave dynamics. The WVAL thresholds used (Minimal/Low/Moderate/High/Very High) are then derived from the baseline distribution.
# ── Baseline window: first 8 weeks ───────────────────────────────────────────
baseline_end <- min(qpcr_t$date) + weeks(8)
baseline_stats <- qpcr_t %>%
filter(date <= baseline_end) %>%
summarise(
bl_mean = mean(log10_conc, na.rm = TRUE),
bl_sd = sd(log10_conc, na.rm = TRUE)
)
cat("Baseline mean (log₁₀):", round(baseline_stats$bl_mean, 3),
"\nBaseline SD (log₁₀):", round(baseline_stats$bl_sd, 3), "\n")## Baseline mean (log₁₀): 5.876
## Baseline SD (log₁₀): 0.284
# ── WVAL score ────────────────────────────────────────────────────────────────
qpcr_wval <- qpcr_t %>%
mutate(
wval_score = (roll7_median - baseline_stats$bl_mean) / baseline_stats$bl_sd
)
# ── Classify activity levels (CDC-inspired thresholds in standardised units) ──
# CDC's thresholds map approximately to these z-score ranges (2024 framework):
# Minimal : wval_score < 0
# Low : 0 <= wval_score < 1
# Moderate: 1 <= wval_score < 2
# High : 2 <= wval_score < 3
# Very High: wval_score >= 3
wval_levels <- c("Minimal", "Low", "Moderate", "High", "Very High")
wval_colours <- c("Minimal" = "#2166AC",
"Low" = "#74ADD1",
"Moderate" = "#FEE090",
"High" = "#F46D43",
"Very High" = "#A50026")
qpcr_wval <- qpcr_wval %>%
mutate(
activity_level = case_when(
wval_score < 0 ~ "Minimal",
wval_score < 1 ~ "Low",
wval_score < 2 ~ "Moderate",
wval_score < 3 ~ "High",
TRUE ~ "Very High"
),
activity_level = factor(activity_level, levels = wval_levels)
)# Ribbon background by activity level
# Build date ranges for each level for background shading
activity_runs <- qpcr_wval %>%
select(date, activity_level) %>%
mutate(run_id = cumsum(activity_level != lag(activity_level,
default = first(activity_level))))
activity_ranges <- activity_runs %>%
group_by(run_id, activity_level) %>%
summarise(start = min(date), end = max(date), .groups = "drop")
p_wval <- ggplot(qpcr_wval, aes(date)) +
# Background shading per activity level
geom_rect(data = activity_ranges,
aes(xmin = start - 0.5, xmax = end + 0.5,
ymin = -Inf, ymax = Inf, fill = activity_level),
alpha = 0.25, inherit.aes = FALSE) +
# Smoothed signal
geom_line(aes(y = wval_score), colour = "black", linewidth = 0.9) +
geom_point(aes(y = wval_score, colour = activity_level),
size = 1.5, alpha = 0.7) +
# Threshold lines
geom_hline(yintercept = c(0, 1, 2, 3), linetype = "dashed",
colour = "grey40", linewidth = 0.4) +
annotate("text", x = max(qpcr_wval$date), y = c(-0.2, 0.5, 1.5, 2.5, 3.5),
label = wval_levels, hjust = 1, size = 3, colour = "grey30") +
scale_fill_manual(values = wval_colours, guide = "none") +
scale_colour_manual(values = wval_colours, name = "Activity level") +
scale_x_date(date_breaks = "2 weeks", date_labels = "%b %d") +
labs(
title = "CDC WVAL — Point Loma WWTP",
subtitle = "Baseline: first 8 weeks (Oct–Nov 2021); z-score relative to baseline",
x = NULL, y = "WVAL score (z-score)"
) +
theme(axis.text.x = element_text(angle = 30, hjust = 1),
legend.position = "top")
p_wvalqpcr_wval %>%
count(activity_level) %>%
mutate(pct = round(100 * n / sum(n), 1)) %>%
kable(caption = "Table 2. Days per CDC activity level") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE)| activity_level | n | pct |
|---|---|---|
| Minimal | 9 | 15.5 |
| Low | 11 | 19.0 |
| Moderate | 5 | 8.6 |
| High | 5 | 8.6 |
| Very High | 28 | 48.3 |
After applying CDC WVAL methodology with a baseline window anchored to the first 8 weeks of the surveillance record (Oct–Nov 2021), which statement best characterises the activity-level trajectory at Point Loma WWTP across the full five-month window?
A) Sustained Very Low / Minimal activity throughout — no period exceeded the Low threshold, suggesting SARS-CoV-2 transmission was controlled for the entire window.
B) A single transient spike to High lasting fewer than two weeks, otherwise Low / Moderate — consistent with a brief outbreak that was rapidly contained without a sustained wave.
C) A multi-week ramp from Minimal / Low through Moderate to a Very High peak, with the escalation concentrated in December 2021 – January 2022, followed by a declining trajectory toward High / Moderate by late February — consistent with the Omicron BA.1 wave superimposed on residual Delta signal.
D) Bimodal — two distinct Very High waves separated by a return to Minimal, reflecting two fully independent epidemic introductions each starting from near-zero baseline.
Trend detection serves two goals: (1) confirming the directionality of activity-level transitions detected in Step 3, and (2) providing an analytically robust lead-time estimate relative to clinical surveillance. We use two complementary methods:
| Method | What it tests |
|---|---|
| Mann-Kendall trend test | Non-parametric test for monotonic trend in the full series; does not assume normality or linearity |
| Changepoint detection (PELT algorithm) | Identifies structural breakpoints in the mean and/or variance of the series — marking when the trend direction changed |
The PELT (Pruned Exact Linear Time) algorithm from the
changepoint package detects changepoints in the mean of the
log₁₀ concentration series under a penalty chosen by BIC, which balances
sensitivity to real breaks against false positives.
# ── Mann-Kendall over the full QC-pass series ─────────────────────────────────
mk_result <- mk.test(qpcr_wval$log10_conc)
print(mk_result)##
## Mann-Kendall trend test
##
## data: qpcr_wval$log10_conc
## z = 6.527, n = 58, p-value = 6.709e-11
## alternative hypothesis: true S is not equal to 0
## sample estimates:
## S varS tau
## 974.00000 22222.66667 0.58941
# ── Mann-Kendall in first half (Oct–Dec) vs second half (Jan–Feb) ─────────────
mk_first <- mk.test(filter(qpcr_wval, date < as_date("2022-01-01"))$log10_conc)
mk_second <- mk.test(filter(qpcr_wval, date >= as_date("2022-01-01"))$log10_conc)
tibble(
Period = c("Full window (Oct–Feb)", "Oct–Dec 2021", "Jan–Feb 2022"),
tau = c(mk_result$statistic, mk_first$statistic, mk_second$statistic),
p_value = c(mk_result$p.value, mk_first$p.value, mk_second$p.value),
direction = ifelse(tau > 0, "Increasing", "Decreasing")
) %>%
mutate(across(where(is.numeric), ~round(.x, 4))) %>%
kable(caption = "Table 3. Mann-Kendall trend tests") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE)| Period | tau | p_value | direction |
|---|---|---|---|
| Full window (Oct–Feb) | 6.5270 | 0 | Increasing |
| Oct–Dec 2021 | 6.8316 | 0 | Increasing |
| Jan–Feb 2022 | -5.9159 | 0 | Decreasing |
# ── PELT changepoint detection on the smoothed log₁₀ series ──────────────────
cpt_series <- qpcr_wval$roll7_median
cpt_result <- cpt.mean(cpt_series, method = "PELT", penalty = "BIC",
minseglen = 7) # minimum 7 days per segment
# Extract changepoint indices and map to dates
cpt_indices <- cpts(cpt_result)
cpt_dates <- qpcr_wval$date[cpt_indices]
cat("Changepoint dates detected:\n")## Changepoint dates detected:
## [1] "2021-12-13"
# Vertical lines at changepoints + segment means overlaid
cpt_df <- tibble(date = cpt_dates)
ggplot(qpcr_wval, aes(date)) +
geom_point(aes(y = log10_conc), colour = "grey70", size = 1.2, alpha = 0.6) +
geom_line(aes(y = roll7_median), colour = "steelblue", linewidth = 1) +
geom_vline(data = cpt_df, aes(xintercept = date),
linetype = "dashed", colour = "firebrick", linewidth = 0.8) +
geom_label_repel(data = cpt_df,
aes(x = date, y = max(qpcr_wval$log10_conc, na.rm = TRUE) - 0.3,
label = format(date, "%b %d")),
colour = "firebrick", size = 3.2) +
scale_x_date(date_breaks = "2 weeks", date_labels = "%b %d") +
labs(
title = "Changepoint detection (PELT-BIC) on 7-day rolling median log₁₀ concentration",
subtitle = "Red dashed lines = structural changepoints; blue = smoothed signal",
x = NULL, y = "log₁₀ copies / L"
) +
theme(axis.text.x = element_text(angle = 30, hjust = 1))Applying Mann-Kendall testing and PELT changepoint detection to the log₁₀-transformed concentration series from Point Loma, which description best characterises the directionality and structural transitions of the wastewater signal?
A) A significant positive monotonic trend across the entire five-month window with no structural breakpoints — indicating uninterrupted exponential growth throughout.
B) No statistically significant trend in any sub-period; the detected changepoints reflect only technical noise, not genuine epidemiological transitions.
C) An initial non-significant or weakly negative trend (Oct–Dec) followed by a significant positive trend (Dec–Jan) and a subsequent structural break marking a declining phase (Jan–Feb) — changepoints correspond to the onset and peak of the Omicron BA.1 surge.
D) A significant negative monotonic trend across the full window driven by consistent decline throughout; changepoints are absent because the wave had already peaked before October 2021.
Freyja uses a lineage-weighted least-squares approach to deconvolute
the mixed amplicon sequencing signal in wastewater into relative
abundances of SARS-CoV-2 Pango lineages. The residual
column measures the fraction of reads unexplained by the current
reference set — high residuals warrant caution.
For this analysis we:
Lineage-to-variant mapping (based on Pango nomenclature for this period):
# ── Identify lineage abundance columns (not date/site/residual) ───────────────
meta_cols <- c("date", "site", "residual")
lineage_cols <- setdiff(names(freyja), meta_cols)
# ── Long format ───────────────────────────────────────────────────────────────
freyja_long <- freyja %>%
filter(!is.na(residual), residual <= 0.1) %>% # quality filter
pivot_longer(cols = all_of(lineage_cols),
names_to = "lineage",
values_to = "abundance") %>%
filter(!is.na(abundance), abundance > 0)
# ── Assign variant groups ─────────────────────────────────────────────────────
freyja_long <- freyja_long %>%
mutate(
variant_group = case_when(
str_detect(lineage, "^AY\\.|^B\\.1\\.617\\.2") ~ "Delta",
str_detect(lineage, "^BA\\.1|^B\\.1\\.1\\.529")~ "Omicron BA.1",
str_detect(lineage, "^BA\\.2") ~ "Omicron BA.2",
TRUE ~ "Other"
)
)
# ── Aggregate to variant-group level per day ──────────────────────────────────
freyja_variant <- freyja_long %>%
group_by(date, variant_group) %>%
summarise(abundance = sum(abundance, na.rm = TRUE), .groups = "drop") %>%
# Renormalise per day to sum to 1
group_by(date) %>%
mutate(abundance = abundance / sum(abundance)) %>%
ungroup()variant_colours <- c(
"Delta" = "#E41A1C",
"Omicron BA.1" = "#377EB8",
"Omicron BA.2" = "#4DAF4A",
"Other" = "#999999"
)
ggplot(freyja_variant,
aes(date, abundance, fill = variant_group)) +
geom_area(alpha = 0.85, colour = "white", linewidth = 0.2, position = "stack") +
scale_fill_manual(values = variant_colours, name = "Variant") +
scale_x_date(date_breaks = "2 weeks", date_labels = "%b %d") +
scale_y_continuous(labels = percent_format()) +
labs(
title = "SARS-CoV-2 variant composition in wastewater (Freyja deconvolution)",
subtitle = "Point Loma WWTP | residual-filtered (≤ 0.10)",
x = NULL, y = "Relative abundance"
) +
theme(axis.text.x = element_text(angle = 30, hjust = 1),
legend.position = "top")# Identify when Omicron BA.1 first exceeded Delta
crossover <- freyja_variant %>%
pivot_wider(names_from = variant_group, values_from = abundance,
values_fill = 0) %>%
filter(`Omicron BA.1` > Delta) %>%
slice_min(date, n = 1)
cat("Omicron BA.1 first exceeded Delta in wastewater on:",
format(crossover$date, "%B %d, %Y"), "\n")## Omicron BA.1 first exceeded Delta in wastewater on: December 12, 2021
Based on Freyja deconvolution of the wastewater metagenome at Point Loma, which statement best describes the variant succession dynamics during October 2021 – February 2022?
A) Delta maintained dominance (> 50 % abundance) through the end of February 2022; Omicron was detected only as sporadic low-level contamination below 10 % at any point.
B) Omicron BA.1 and BA.2 emerged simultaneously in early December and co-dominated the community from that point forward, with Delta absent by mid-December.
C) Delta dominated from October through mid-to-late November; Omicron BA.1 then rose rapidly from a cryptic minority signal to community dominance by December–January, followed by early evidence of BA.2 expansion — detectable in wastewater ahead of clinical confirmation.
D) The deconvolution residuals were uniformly high (> 0.3) throughout, rendering all lineage assignments unreliable; no meaningful variant dynamics can be inferred from this dataset.
A core biosurveillance function beyond tracking known variants is early detection of novel or rising lineages before they reach dominance. We operationalise this as:
freyja_residual <- freyja %>%
filter(!is.na(residual)) %>%
select(date, residual) %>%
arrange(date)
# Mann-Kendall on residual series
mk_residual <- mk.test(freyja_residual$residual)
cat("Mann-Kendall on Freyja residual — tau:", round(mk_residual$statistic, 3),
" p =", round(mk_residual$p.value, 4), "\n")## Mann-Kendall on Freyja residual — tau: 1.239 p = 0.2155
ggplot(freyja_residual, aes(date, residual)) +
geom_point(colour = "darkorange", size = 1.8, alpha = 0.7) +
geom_smooth(method = "loess", span = 0.3, colour = "black", linewidth = 0.9,
se = TRUE, fill = "grey80") +
geom_hline(yintercept = 0.1, linetype = "dashed", colour = "firebrick") +
annotate("text", x = max(freyja_residual$date), y = 0.11,
label = "0.10 QC threshold", hjust = 1, size = 3, colour = "firebrick") +
scale_x_date(date_breaks = "2 weeks", date_labels = "%b %d") +
labs(
title = "Freyja deconvolution residual over time",
subtitle = "Rising residual can signal emergence of lineages absent from reference set",
x = NULL, y = "Residual (unexplained fraction)"
) +
theme(axis.text.x = element_text(angle = 30, hjust = 1))# First detection (> 1 % abundance) for each variant group
first_detections <- freyja_variant %>%
filter(abundance > 0.01) %>%
group_by(variant_group) %>%
slice_min(date, n = 1) %>%
ungroup() %>%
arrange(date)
first_detections %>%
select(variant_group, first_detection_date = date,
abundance_at_detection = abundance) %>%
mutate(abundance_at_detection = percent(abundance_at_detection, accuracy = 0.1)) %>%
kable(caption = "Table 4. First wastewater detection by variant group (> 1 % threshold)") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE)| variant_group | first_detection_date | abundance_at_detection |
|---|---|---|
| Delta | 2021-10-04 | 98.3% |
| Other | 2021-10-04 | 1.7% |
| Omicron BA.1 | 2021-11-28 | 2.6% |
| Omicron BA.2 | 2022-02-22 | 7.5% |
# Identify individual low-abundance lineages with rising trends
# Focus on lineages that first appear at < 5 % and trend upward
emerging <- freyja_long %>%
# Keep lineages with at least 5 time points
group_by(lineage) %>%
filter(n() >= 5, mean(abundance, na.rm = TRUE) < 0.05) %>%
summarise(
first_seen = min(date),
last_seen = max(date),
n_timepoints = n(),
mean_abund = mean(abundance, na.rm = TRUE),
mk_tau = mk.test(abundance)$statistic,
mk_p = mk.test(abundance)$p.value,
.groups = "drop"
) %>%
filter(mk_p < 0.05, mk_tau > 0) %>% # significant positive trend
arrange(mk_p) %>%
slice_head(n = 15)
emerging %>%
mutate(across(where(is.numeric), ~round(.x, 4))) %>%
kable(caption = "Table 5. Low-abundance lineages with significant rising trends (p < 0.05, Mann-Kendall)") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE)| lineage | first_seen | last_seen | n_timepoints | mean_abund | mk_tau | mk_p |
|---|---|---|---|---|---|---|
| AY.119 | 2021-10-04 | 2021-12-20 | 11 | 0.0302 | 2.5064 | 0.0122 |
Examining first-detection dates, Freyja residual dynamics, and low-abundance lineage trends, which statement best describes the wastewater surveillance system’s capacity for early variant warning at Point Loma during this window?
A) Wastewater sequencing provided no lead time over clinical surveillance: Omicron BA.1 was detected in wastewater only after it had already reached community dominance in clinical case data.
B) Omicron BA.1 appeared in wastewater at detectable but minority abundance (< 10–20 %) while Delta still dominated — and this signal preceded widespread clinical confirmation — demonstrating that wastewater sequencing can detect variant emergence cryptically, before community-level spread is apparent in case reports.
C) The Freyja residual was the primary early warning signal: it exceeded 0.5 for multiple weeks before any Omicron lineage exceeded 1 % abundance, indicating that the reference set failure (not positive abundance calls) was the actionable sentinel.
D) Low-abundance lineage tracking was uninformative because all individual Pango lineages either rose directly to > 50 % or were absent; no lineage was detectable in the 1–20 % range for more than three days.
Wastewater-based epidemiology is most powerful when integrated with clinical surveillance data because:
We integrate the weekly reported San Diego County case data
(cases.csv) with the 7-day rolling wastewater signal,
aligning on the weekly-ending date, and compute:
# ── Aggregate wastewater to weekly averages (align with case cadence) ─────────
ww_weekly <- qpcr_wval %>%
mutate(week_ending = ceiling_date(date, "week") - days(1)) %>%
group_by(week_ending) %>%
summarise(
ww_log10_mean = mean(log10_conc, na.rm = TRUE),
ww_log10_med = median(log10_conc, na.rm = TRUE),
ww_wval_mean = mean(wval_score, na.rm = TRUE),
n_samples = n(),
.groups = "drop"
)
# ── Join with case data ───────────────────────────────────────────────────────
combined <- cases %>%
inner_join(ww_weekly, by = "week_ending") %>%
filter(!is.na(weekly_cases), weekly_cases > 0) %>%
mutate(log10_cases = log10(weekly_cases))
cat("Weeks with matched WW + clinical data:", nrow(combined), "\n")## Weeks with matched WW + clinical data: 22
# Dual-axis: wastewater (left) + cases (right)
cases_scale <- max(combined$log10_cases, na.rm = TRUE) /
max(combined$ww_log10_mean, na.rm = TRUE)
ggplot(combined, aes(week_ending)) +
geom_col(aes(y = log10_cases / cases_scale), fill = "#FDAE61",
alpha = 0.6, width = 5) +
geom_line(aes(y = ww_log10_mean), colour = "steelblue",
linewidth = 1.2) +
geom_point(aes(y = ww_log10_mean), colour = "steelblue", size = 2) +
scale_y_continuous(
name = "Wastewater log₁₀ concentration (copies/L)",
sec.axis = sec_axis(~ . * cases_scale, name = "log₁₀ weekly cases")
) +
scale_x_date(date_breaks = "2 weeks", date_labels = "%b %d") +
labs(
title = "Wastewater signal vs clinical COVID-19 cases — San Diego County",
subtitle = "Blue line = WW log₁₀ concentration; Orange bars = log₁₀ weekly reported cases",
x = NULL
) +
theme(axis.text.x = element_text(angle = 30, hjust = 1),
axis.title.y.right = element_text(colour = "#FDAE61"),
axis.title.y.left = element_text(colour = "steelblue"))# ── Cross-correlation function (WW leads cases at positive lags) ───────────────
# Interpolate to fill any date gaps
full_dates <- tibble(week_ending = seq(min(combined$week_ending),
max(combined$week_ending),
by = "week"))
combined_full <- full_dates %>%
left_join(combined, by = "week_ending") %>%
mutate(across(c(ww_log10_mean, log10_cases), ~na.approx(.x, na.rm = FALSE)))
ccf_result <- ccf(
combined_full$ww_log10_mean,
combined_full$log10_cases,
lag.max = 6,
plot = FALSE,
na.action = na.pass
)
ccf_df <- tibble(
lag_weeks = ccf_result$lag[,,1],
lag_days = lag_weeks * 7,
acf = ccf_result$acf[,,1]
)
ggplot(ccf_df, aes(lag_days, acf)) +
geom_col(fill = "steelblue", alpha = 0.7, width = 5) +
geom_hline(yintercept = c(-1, 1) * qnorm(0.975) / sqrt(ccf_result$n.used),
linetype = "dashed", colour = "firebrick") +
geom_vline(xintercept = 0, colour = "grey40") +
labs(
title = "Cross-correlation: wastewater log₁₀ concentration vs log₁₀ weekly cases",
subtitle = "Positive lag (right of zero) = wastewater leads cases",
x = "Lag (days; positive = WW leads cases)", y = "Cross-correlation"
)# Pearson correlations at specified lags (WW leads by 0, 7, 14 days)
cor_lags <- tibble(
lag_days = c(0, 7, 14),
lag_weeks = lag_days / 7
) %>%
mutate(pearson_r = map_dbl(lag_weeks, function(lag_w) {
ww_shifted <- combined_full %>%
mutate(ww_shifted = lag(ww_log10_mean, n = as.integer(lag_w)))
cor(ww_shifted$ww_shifted, ww_shifted$log10_cases,
use = "complete.obs", method = "pearson")
}))
cor_lags %>%
mutate(pearson_r = round(pearson_r, 3)) %>%
kable(caption = "Table 6. Pearson correlation at WW lead lags of 0, 7, 14 days") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE)| lag_days | lag_weeks | pearson_r |
|---|---|---|
| 0 | 0 | 0.813 |
| 7 | 1 | 0.657 |
| 14 | 2 | 0.472 |
# Overlay variant turnover with case trajectory
variant_weekly <- freyja_variant %>%
mutate(week_ending = ceiling_date(date, "week") - days(1)) %>%
group_by(week_ending, variant_group) %>%
summarise(abundance = mean(abundance, na.rm = TRUE), .groups = "drop") %>%
group_by(week_ending) %>%
mutate(abundance = abundance / sum(abundance)) %>%
ungroup()
p_vt <- ggplot(variant_weekly, aes(week_ending, abundance, fill = variant_group)) +
geom_area(position = "stack", alpha = 0.8, colour = "white", linewidth = 0.2) +
scale_fill_manual(values = variant_colours, name = NULL) +
scale_y_continuous(labels = percent_format()) +
scale_x_date(date_breaks = "2 weeks", date_labels = "%b %d") +
labs(title = "Weekly variant composition", x = NULL, y = "Proportion") +
theme(axis.text.x = element_text(angle = 30, hjust = 1), legend.position = "top")
p_cases <- ggplot(combined, aes(week_ending, weekly_cases)) +
geom_col(fill = "#FDAE61", alpha = 0.8) +
scale_x_date(date_breaks = "2 weeks", date_labels = "%b %d") +
scale_y_continuous(labels = comma) +
labs(title = "Weekly reported COVID-19 cases (San Diego County)",
x = NULL, y = "Weekly cases") +
theme(axis.text.x = element_text(angle = 30, hjust = 1))
p_vt / p_casesIntegrating the wastewater viral load trajectory, Freyja variant composition dynamics, and weekly clinical case counts, which statement best captures the public health value of wastewater surveillance at Point Loma during this window?
A) Wastewater and clinical case signals rose and fell in perfect synchrony (zero-lag peak correlation), and variant succession in wastewater mirrored clinical genotyping data simultaneously — indicating WBE added no meaningful lead time advantage over case-based surveillance.
B) The wastewater viral load signal led the clinical case surge by approximately one to two weeks, and Omicron BA.1 was detectable in wastewater at minority abundance while Delta still dominated clinical genotyping reports — demonstrating that integrated WBE provides earlier warning of both epidemic intensity and variant emergence than case-based surveillance alone.
C) The clinical case signal consistently led the wastewater signal by approximately three weeks, suggesting that symptomatic individuals sought testing and drove clinical counts before their wastewater shedding became measurable at the sewershed level.
D) The cross-correlation was not significant at any lag, indicating the wastewater signal and case counts tracked entirely independent phenomena (e.g., non-infectious shedding, industrial contamination) with no epidemiological connection.
## R version 4.6.0 (2026-04-24 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26200)
##
## Matrix products: default
## LAPACK version 3.12.1
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] RColorBrewer_1.1-3 viridis_0.6.5 viridisLite_0.4.3 ggrepel_0.9.8
## [5] patchwork_1.3.2 scales_1.4.0 kableExtra_1.4.0 knitr_1.51
## [9] broom_1.0.12 trend_1.1.6 changepoint_2.3 TTR_0.24.4
## [13] slider_0.3.3 zoo_1.8-15 lubridate_1.9.5 forcats_1.0.1
## [17] stringr_1.6.0 dplyr_1.2.1 purrr_1.2.2 readr_2.2.0
## [21] tidyr_1.3.2 tibble_3.3.1 ggplot2_4.0.3 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 xfun_0.57 bslib_0.10.0
## [4] lattice_0.22-9 tzdb_0.5.0 vctrs_0.7.3
## [7] tools_4.6.0 generics_0.1.4 parallel_4.6.0
## [10] curl_7.1.0 xts_0.14.2 pkgconfig_2.0.3
## [13] Matrix_1.7-5 S7_0.2.2 lifecycle_1.0.5
## [16] compiler_4.6.0 farver_2.1.2 textshaping_1.0.5
## [19] htmltools_0.5.9 sass_0.4.10 yaml_2.3.12
## [22] crayon_1.5.3 pillar_1.11.1 jquerylib_0.1.4
## [25] cachem_1.1.0 nlme_3.1-169 tidyselect_1.2.1
## [28] digest_0.6.39 stringi_1.8.7 splines_4.6.0
## [31] labeling_0.4.3 fastmap_1.2.0 grid_4.6.0
## [34] cli_3.6.6 magrittr_2.0.5 withr_3.0.2
## [37] backports_1.5.1 warp_0.2.3 bit64_4.8.0
## [40] timechange_0.4.0 rmarkdown_2.31 extraDistr_1.10.0.3
## [43] bit_4.6.0 gridExtra_2.3 hms_1.1.4
## [46] evaluate_1.0.5 mgcv_1.9-4 rlang_1.2.0
## [49] Rcpp_1.1.1-1.1 glue_1.8.1 xml2_1.5.2
## [52] svglite_2.2.2 rstudioapi_0.18.0 vroom_1.7.1
## [55] jsonlite_2.0.0 R6_2.6.1 systemfonts_1.3.2