This analysis evaluates how precisely the 2025 monitoring system estimates oyster height and length at each reef and assesses the smallest changes that can be statistically detected given the observed variability and sample sizes. By combining exploratory data analysis, confidence intervals, and minimum detectable effect (MDE) estimates, it provides insight into the system’s ability to detect meaningful biological changes across reefs under the current protocol. It also compares these results with 2024 monitoring performance, indicating that the 2025 design generally improved precision and the ability to detect smaller changes across most reefs.


Exploratory Analysis

A total of 3,916 oysters were measured across 11 reefs and 5 sampling months in 2025.

# Count number of samples per reef and month
sampling_summary <- oyster_biometry %>%
  group_by(oyster_reef_name, sampling_month) %>%
  summarise(samples = n(), .groups = "drop") %>%
  arrange(oyster_reef_name, sampling_month)

ggplot(sampling_summary, aes(x = sampling_month, y = samples)) +
  geom_col(fill = "#e6550d", alpha = 0.7) +
  facet_wrap(~ oyster_reef_name, scales = "fixed") +
  labs(
    title = "Sampling Effort per Reef by Month (2025)",
    x = "Sampling Month",
    y = "Number of Samples"
  ) +
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

Sampling Effort by Month: Sampling effort in 2025 is uneven across reefs and months. Most observations were collected in May (2,357 measurements, about 60% of the total), with additional sampling rounds in June, August, September, and October. Jacarequara (801), Terra Amarela (789), and Marauá (608) accumulated the largest samples overall, while Áries (60), Goiabal (55), and Água Boa (37) had much more limited coverage. These differences in sampling intensity influence the precision and comparability of reef-level estimates.

# Split reefs into two groups for readability
reef_group_1 <- c("Água Boa", "Aquavila", "Áries", "Goiabal", "Jacarequara", "Lauro Sodré")
reef_group_2 <- c("Marauá", "Pinheiro", "Romana", "Terra Amarela", "Tio Oscar")

# Height distributions - first group
oyster_biometry %>%
  filter(oyster_reef_name %in% reef_group_1) %>%
  ggplot(aes(x = height_mm)) +
  geom_histogram(binwidth = 5, fill = "steelblue", color = "white", alpha = 0.7) +
  facet_grid(oyster_reef_name ~ sampling_month) +
  labs(
    title = "Distribution of Oyster Height by Reef and Month (2025)",
    x = "Oyster Height (mm)",
    y = "Count"
  ) +
  theme_minimal() +
  theme(strip.text.y = element_text(size = 7))


# Height distributions - second group
oyster_biometry %>%
  filter(oyster_reef_name %in% reef_group_2) %>%
  ggplot(aes(x = height_mm)) +
  geom_histogram(binwidth = 5, fill = "steelblue", color = "white", alpha = 0.7) +
  facet_grid(oyster_reef_name ~ sampling_month) +
  labs(
    title = "Distribution of Oyster Height by Reef and Month (2025)",
    x = "Oyster Height (mm)",
    y = "Count"
  ) +
  theme_minimal() +
  theme(strip.text.y = element_text(size = 7))

# Length distributions - first group
oyster_biometry %>%
  filter(oyster_reef_name %in% reef_group_1) %>%
  ggplot(aes(x = length_mm)) +
  geom_histogram(binwidth = 5, fill = "darkgreen", color = "white", alpha = 0.7) +
  facet_grid(oyster_reef_name ~ sampling_month) +
  labs(
    title = "Distribution of Oyster Length by Reef and Month (2025)",
    x = "Oyster Length (mm)",
    y = "Count"
  ) +
  theme_minimal() +
  theme(strip.text.y = element_text(size = 7))


# Length distributions - second group
oyster_biometry %>%
  filter(oyster_reef_name %in% reef_group_2) %>%
  ggplot(aes(x = length_mm)) +
  geom_histogram(binwidth = 5, fill = "darkgreen", color = "white", alpha = 0.7) +
  facet_grid(oyster_reef_name ~ sampling_month) +
  labs(
    title = "Distribution of Oyster Length by Reef and Month (2025)",
    x = "Oyster Length (mm)",
    y = "Count"
  ) +
  theme_minimal() +
  theme(strip.text.y = element_text(size = 7))

Oyster Height and Length Distributions (by Reef and Month): Distributions are generally unimodal and approximately normal across most reef-month combinations, supporting the use of parametric methods for mean-based analyses. Some smaller-sample combinations (especially at Goiabal, Água Boa, and Áries) show more irregular shapes or visible skewness, suggesting greater uncertainty in those estimates.
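
As a rough numeric complement to the visual check, sample skewness can be computed per group. The sketch below is illustrative only: `sample_skewness()` is a hypothetical helper, and `demo` is simulated data standing in for `oyster_biometry`; neither is part of the original analysis.

```r
# Hypothetical helper (not part of the original analysis):
# moment-based sample skewness; values near 0 suggest a symmetric shape.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^1.5
}

# Simulated stand-in for oyster_biometry (illustrative values only)
set.seed(1)
demo <- data.frame(
  oyster_reef_name = rep(c("Reef A", "Reef B"), times = c(200, 15)),
  height_mm = c(rnorm(200, mean = 35, sd = 7), rexp(15, rate = 1 / 30))
)

# Sample size and skewness per reef
skew_by_reef <- aggregate(
  height_mm ~ oyster_reef_name,
  data = demo,
  FUN = function(x) c(n = length(x), skewness = sample_skewness(x))
)
print(skew_by_reef)
```

Groups with few observations will give noisy skewness values, so a large value in a small reef-month cell is a prompt for a closer look rather than evidence against normality.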

# Mean height by reef and month
ggplot(oyster_biometry, aes(x = sampling_month, y = height_mm, group = 1)) +
  stat_summary(fun = mean, geom = "line", color = "steelblue") +
  stat_summary(fun = mean, geom = "point", color = "steelblue") +
  facet_wrap(~ oyster_reef_name) +
  labs(
    title = "Mean Oyster Height by Reef and Month (2025)",
    y = "Mean Height (mm)",
    x = "Month"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Mean length by reef and month
ggplot(oyster_biometry, aes(x = sampling_month, y = length_mm, group = 1)) +
  stat_summary(fun = mean, geom = "line", color = "darkgreen") +
  stat_summary(fun = mean, geom = "point", color = "darkgreen") +
  facet_wrap(~ oyster_reef_name) +
  labs(
    title = "Mean Oyster Length by Reef and Month (2025)",
    y = "Mean Length (mm)",
    x = "Month"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Mean Plots (Height and Length): These plots give a simple visual summary of average oyster height and length at each reef across the 2025 sampling rounds. Mean size differs among reefs and, at some reefs, between sampling months. For example, mean height at Jacarequara rose from 25.78 mm in May to 36.73 mm in October, while at Pinheiro it fell from 39.32 mm in May to 31.21 mm in September. The statistical precision of these mean estimates is examined in the following section using confidence intervals.

summary_by_reef_month %>%
  select(
    Reef = oyster_reef_name,
    Month = sampling_month,
    `Mean Height (mm)` = mean_height,
    `SD Height (mm)` = sd_height,
    `Mean Length (mm)` = mean_length,
    `SD Length (mm)` = sd_length,
    `Sample Size (n)` = n
  ) %>%
  mutate(across(where(is.numeric), ~ round(.x, 2))) %>%
  kbl(
    caption = "Table: Summary statistics of oyster height and length by reef and month (2025).",
    align = "llccccc",
    booktabs = TRUE
  ) %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  )

Table: Summary statistics of oyster height and length by reef and month (2025).

| Reef | Month | Mean Height (mm) | SD Height (mm) | Mean Length (mm) | SD Length (mm) | Sample Size (n) |
|---|---|---|---|---|---|---|
| Aquavila | May | 33.23 | 9.52 | 49.70 | 13.45 | 159 |
| Aquavila | August | 36.54 | 8.87 | 58.10 | 15.92 | 108 |
| Goiabal | May | 35.83 | 7.61 | 42.09 | 7.95 | 46 |
| Goiabal | September | 28.78 | 5.29 | 37.33 | 6.44 | 9 |
| Jacarequara | May | 25.78 | 6.40 | 33.43 | 8.04 | 556 |
| Jacarequara | October | 36.73 | 6.21 | 44.79 | 8.39 | 245 |
| Lauro Sodré | June | 27.75 | 7.78 | 37.48 | 9.86 | 161 |
| Lauro Sodré | September | 26.10 | 6.39 | 40.68 | 7.69 | 108 |
| Marauá | May | 38.12 | 6.74 | 48.02 | 8.19 | 382 |
| Marauá | September | 36.69 | 5.52 | 46.29 | 6.48 | 226 |
| Pinheiro | May | 39.32 | 9.67 | 47.16 | 10.45 | 231 |
| Pinheiro | September | 31.21 | 8.01 | 45.63 | 8.49 | 103 |
| Romana | May | 29.16 | 9.87 | 39.73 | 12.12 | 328 |
| Romana | August | 31.30 | 8.92 | 44.83 | 12.65 | 163 |
| Terra Amarela | May | 24.59 | 6.88 | 34.04 | 7.91 | 610 |
| Terra Amarela | October | 36.40 | 6.37 | 43.83 | 6.98 | 179 |
| Tio Oscar | June | 37.40 | 7.48 | 44.78 | 8.96 | 129 |
| Tio Oscar | September | 33.58 | 8.23 | 44.20 | 10.27 | 76 |
| Água Boa | June | 26.08 | 6.63 | 34.27 | 6.14 | 26 |
| Água Boa | August | 33.82 | 5.17 | 40.09 | 3.96 | 11 |
| Áries | May | 24.42 | 8.35 | 28.98 | 9.94 | 45 |
| Áries | August | 29.33 | 7.67 | 38.40 | 9.75 | 15 |

Summary Table by Reef and Month: The table provides a numeric overview of sample size, mean, and standard deviation for each reef-month combination. It highlights both the central tendencies and variability in oyster dimensions and allows quick identification of where sampling effort was strongest and where additional effort may be needed to improve precision.


Confidence Intervals

Question addressed: How precisely are we estimating the current average height and length of oysters at each reef under the 2025 protocol?

Note:

  • A narrow CI around the mean means high precision (our estimate is likely close to the true population value).

  • A wide CI means lower precision (our current sample may not be enough to estimate the true mean as reliably).
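
For reference, a 95% confidence interval for a mean can be sketched as the mean plus or minus 1.96 standard errors. The comparison code later in this report uses that same `1.96 * se` half-width; whether `summary_2025` was built identically is an assumption here. `mean_ci()` is a hypothetical helper and the measurements are made up for illustration.

```r
# Hypothetical helper: mean and normal-approximation 95% CI for a numeric vector.
mean_ci <- function(x, z = 1.96) {
  x <- x[!is.na(x)]            # drop missing measurements before counting
  n <- length(x)
  se <- sd(x) / sqrt(n)        # standard error of the mean
  c(n = n, mean = mean(x),
    ci_lower = mean(x) - z * se,
    ci_upper = mean(x) + z * se)
}

# Illustrative measurements in mm (not taken from the monitoring data)
heights <- c(31.2, 35.8, 29.4, 40.1, 33.7, NA, 36.5, 28.9)
mean_ci(heights)
```

Because the standard error shrinks with the square root of n, halving the interval width requires roughly four times as many oysters at the same variability.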

# Plot reef-level mean height and 95% CI for 2025
ggplot(summary_2025,
       aes(x = reorder(oyster_reef_name, mean_height), y = mean_height)) +
  geom_point(color = "steelblue") +
  geom_errorbar(aes(ymin = ci_lower_height, ymax = ci_upper_height), width = 0.2, color = "steelblue") +
  coord_flip() +
  labs(
    title = "Mean Oyster Height by Reef in 2025 with 95% Confidence Intervals",
    y = "Height (mm)",
    x = "Reef"
  ) +
  theme_minimal()


# Table
summary_2025 %>%
  select(
    Reef = oyster_reef_name,
    `Sample Size (n)` = n,
    `Mean Height (mm)` = mean_height,
    `Lower 95% CI` = ci_lower_height,
    `Upper 95% CI` = ci_upper_height
  ) %>%
  mutate(across(where(is.numeric), ~ round(.x, 2))) %>%
  kbl(
    caption = "Table: Mean Oyster Height and 95% Confidence Intervals by Reef (2025).",
    align = "lcccc",
    booktabs = TRUE
  ) %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  )

Table: Mean Oyster Height and 95% Confidence Intervals by Reef (2025).

| Reef | Sample Size (n) | Mean Height (mm) | Lower 95% CI | Upper 95% CI |
|---|---|---|---|---|
| Aquavila | 267 | 34.57 | 33.44 | 35.69 |
| Goiabal | 55 | 34.67 | 32.64 | 36.71 |
| Jacarequara | 801 | 29.13 | 28.57 | 29.69 |
| Lauro Sodré | 269 | 27.09 | 26.21 | 27.96 |
| Marauá | 608 | 37.59 | 37.08 | 38.09 |
| Pinheiro | 334 | 36.82 | 35.75 | 37.88 |
| Romana | 491 | 29.87 | 29.02 | 30.72 |
| Terra Amarela | 789 | 27.27 | 26.68 | 27.85 |
| Tio Oscar | 205 | 35.99 | 34.89 | 37.08 |
| Água Boa | 37 | 28.38 | 26.08 | 30.67 |
| Áries | 60 | 25.65 | 23.52 | 27.78 |

# Plot reef-level mean length and 95% CI for 2025
ggplot(summary_2025,
       aes(x = reorder(oyster_reef_name, mean_length), y = mean_length)) +
  geom_point(color = "darkgreen") +
  geom_errorbar(aes(ymin = ci_lower_length, ymax = ci_upper_length), width = 0.2, color = "darkgreen") +
  coord_flip() +
  labs(
    title = "Mean Oyster Length by Reef in 2025 with 95% Confidence Intervals",
    y = "Length (mm)",
    x = "Reef"
  ) +
  theme_minimal()


# Table
summary_2025 %>%
  select(
    Reef = oyster_reef_name,
    `Sample Size (n)` = n,
    `Mean Length (mm)` = mean_length,
    `Lower 95% CI` = ci_lower_length,
    `Upper 95% CI` = ci_upper_length
  ) %>%
  mutate(across(where(is.numeric), ~ round(.x, 2))) %>%
  kbl(
    caption = "Table: Mean Oyster Length and 95% Confidence Intervals by Reef (2025).",
    align = "lcccc",
    booktabs = TRUE
  ) %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  )

Table: Mean Oyster Length and 95% Confidence Intervals by Reef (2025).

| Reef | Sample Size (n) | Mean Length (mm) | Lower 95% CI | Upper 95% CI |
|---|---|---|---|---|
| Aquavila | 267 | 53.10 | 51.30 | 54.91 |
| Goiabal | 55 | 41.31 | 39.23 | 43.39 |
| Jacarequara | 801 | 36.90 | 36.23 | 37.57 |
| Lauro Sodré | 269 | 38.76 | 37.67 | 39.86 |
| Marauá | 608 | 47.38 | 46.77 | 47.98 |
| Pinheiro | 334 | 46.69 | 45.63 | 47.75 |
| Romana | 491 | 41.43 | 40.32 | 42.53 |
| Terra Amarela | 789 | 36.26 | 35.65 | 36.87 |
| Tio Oscar | 205 | 44.56 | 43.27 | 45.85 |
| Água Boa | 37 | 36.00 | 34.02 | 37.98 |
| Áries | 60 | 31.33 | 28.64 | 34.02 |

Example Interpretation of Confidence Intervals:

In Marauá reef, the mean oyster length in 2025 was estimated at 47.38 mm, with a 95% confidence interval of 46.77 to 47.98 mm. This means we can be 95% confident that the true average oyster length at this reef lies within that interval. The relatively narrow interval indicates that the estimate is statistically precise under the current monitoring design.

By contrast, in Áries reef, the mean oyster length in 2025 was 31.33 mm, with a wider 95% confidence interval of 28.64 to 34.02 mm. This wider interval suggests greater uncertainty, likely due to lower sample size and/or higher variability.

How do we know if a CI is good enough?

One practical benchmark is to check whether the confidence interval falls within a certain percentage of the mean. For example:

  • If we aim for precision within ±10% of the mean, then for a reef with a mean of 40 mm, the 95% CI should fall within roughly 36 to 44 mm.

  • In Marauá, the 95% CI half-width for length is approximately ±1.3% of the mean, which indicates excellent precision.

  • In Áries, the 95% CI half-width for length is approximately ±8.6% of the mean, which is still within the ±10% benchmark but notably less precise than in the best-sampled reefs.
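
The percentages quoted for Marauá and Áries can be checked directly from the reported length intervals. This is a minimal sketch; `relative_halfwidth_pct()` is a hypothetical helper name, and the inputs are the rounded values from the 2025 length table.

```r
# Hypothetical helper: 95% CI half-width expressed as a percentage of the mean.
relative_halfwidth_pct <- function(mean_value, ci_lower, ci_upper) {
  100 * ((ci_upper - ci_lower) / 2) / mean_value
}

# Length values copied from the 2025 reef-level table
maraua <- relative_halfwidth_pct(47.38, 46.77, 47.98)
aries  <- relative_halfwidth_pct(31.33, 28.64, 34.02)

# Relative precision, rounded to one decimal place
round(c(Maraua = maraua, Aries = aries), 1)

# Does each reef meet a +/-10% precision target?
c(Maraua = maraua, Aries = aries) <= 10
```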


Minimum Detectable Effect

Question addressed: What is the smallest change in oyster height or length that the 2025 monitoring system can reliably detect at each reef, with 80% power and 5% significance?

Note:

  • An 80% power means there is an 80% chance of detecting a real change of that size if it actually exists (that is, a 20% risk of a false negative).

  • A 5% significance level means we accept a 5% risk of falsely detecting a change when there is none (that is, a false positive).
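
The comparison code later in this report computes the MDE as `(z_alpha + z_beta) * se`. Assuming a two-sided 5% significance level and 80% power (an assumption consistent with the reported values, not something the code states explicitly), this can be written as:

```latex
\mathrm{SE} = \frac{s}{\sqrt{n}}, \qquad
\mathrm{MDE} = \left(z_{1-\alpha/2} + z_{1-\beta}\right)\mathrm{SE}
\approx (1.96 + 0.84)\,\mathrm{SE} = 2.80\,\mathrm{SE}, \qquad
\mathrm{MDE}_{\%} = 100 \cdot \frac{\mathrm{MDE}}{\bar{x}}
```

Here $s$ is the sample standard deviation, $n$ the number of oysters measured, and $\bar{x}$ the reef mean. Under these assumptions the MDE is about 1.43 times the 95% CI half-width ($2.80 / 1.96$), so the two measures rise and fall together.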

mde_2025 %>%
  select(
    Reef = oyster_reef_name,
    `Sample Size (n)` = n,
    `Mean Height (mm)` = mean_height,
    `MDE (mm)` = mde_height_mm,
    `MDE (% of mean)` = mde_height_pct
  ) %>%
  mutate(across(where(is.numeric), ~ round(.x, 2))) %>%
  kbl(
    caption = "Table: Minimum detectable effect in oyster height by reef (2025).",
    align = "lcccc",
    booktabs = TRUE
  ) %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  )

Table: Minimum detectable effect in oyster height by reef (2025).

| Reef | Sample Size (n) | Mean Height (mm) | MDE (mm) | MDE (% of mean) |
|---|---|---|---|---|
| Aquavila | 267 | 34.57 | 1.61 | 4.66 |
| Goiabal | 55 | 34.67 | 2.91 | 8.40 |
| Jacarequara | 801 | 29.13 | 0.80 | 2.75 |
| Lauro Sodré | 269 | 27.09 | 1.24 | 4.59 |
| Marauá | 608 | 37.59 | 0.72 | 1.92 |
| Pinheiro | 334 | 36.82 | 1.52 | 4.13 |
| Romana | 491 | 29.87 | 1.21 | 4.07 |
| Terra Amarela | 789 | 27.27 | 0.84 | 3.07 |
| Tio Oscar | 205 | 35.99 | 1.56 | 4.33 |
| Água Boa | 37 | 28.38 | 3.28 | 11.57 |
| Áries | 60 | 25.65 | 3.04 | 11.84 |

mde_2025 %>%
  select(
    Reef = oyster_reef_name,
    `Sample Size (n)` = n,
    `Mean Length (mm)` = mean_length,
    `MDE (mm)` = mde_length_mm,
    `MDE (% of mean)` = mde_length_pct
  ) %>%
  mutate(across(where(is.numeric), ~ round(.x, 2))) %>%
  kbl(
    caption = "Table: Minimum detectable effect in oyster length by reef (2025).",
    align = "lcccc",
    booktabs = TRUE
  ) %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  )

Table: Minimum detectable effect in oyster length by reef (2025).

| Reef | Sample Size (n) | Mean Length (mm) | MDE (mm) | MDE (% of mean) |
|---|---|---|---|---|
| Aquavila | 267 | 53.10 | 2.58 | 4.86 |
| Goiabal | 55 | 41.31 | 2.97 | 7.20 |
| Jacarequara | 801 | 36.90 | 0.96 | 2.60 |
| Lauro Sodré | 269 | 38.76 | 1.57 | 4.04 |
| Marauá | 608 | 47.38 | 0.87 | 1.83 |
| Pinheiro | 334 | 46.69 | 1.52 | 3.25 |
| Romana | 491 | 41.43 | 1.58 | 3.82 |
| Terra Amarela | 789 | 36.26 | 0.87 | 2.40 |
| Tio Oscar | 205 | 44.56 | 1.85 | 4.15 |
| Água Boa | 37 | 36.00 | 2.83 | 7.87 |
| Áries | 60 | 31.33 | 3.85 | 12.28 |

Example Interpretation of Minimum Detectable Effect:

In Marauá reef, the mean oyster length in 2025 was estimated at 47.38 mm, and the current monitoring design can detect a minimum change of 0.87 mm, or approximately 1.8% of the mean. Relatively small shifts in oyster length can therefore be statistically detected at this site, indicating strong sensitivity of the monitoring system.

By contrast, in Áries reef, the mean length was 31.33 mm, but the minimum detectable effect was 3.85 mm, or about 12.3% of the mean. This larger threshold implies that only relatively large changes in oyster length would be statistically detectable there. The lower sensitivity is likely due to limited sample size and/or higher variability.
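
The two MDE values above can be approximately recovered from the reported confidence intervals. This is a sketch under stated assumptions: a two-sided 5% test with 80% power, and a hypothetical helper `mde_from_ci()` that is not part of the original analysis. Small differences from the tabulated MDEs are expected because the CI bounds are rounded to two decimals.

```r
# Assumed critical values: two-sided 5% significance and 80% power
z_alpha <- qnorm(0.975)  # about 1.96
z_beta  <- qnorm(0.80)   # about 0.84

# Hypothetical helper: approximate MDE (mm) recovered from a reported 95% CI
mde_from_ci <- function(ci_lower, ci_upper) {
  se <- ((ci_upper - ci_lower) / 2) / z_alpha  # back out the standard error
  (z_alpha + z_beta) * se
}

# Length CIs copied from the 2025 reef-level table
mde_maraua <- mde_from_ci(46.77, 47.98)
mde_aries  <- mde_from_ci(28.64, 34.02)

# MDE in mm and as a percentage of the reef mean
round(c(Maraua = mde_maraua, Aries = mde_aries), 2)
round(100 * c(Maraua = mde_maraua / 47.38, Aries = mde_aries / 31.33), 1)
```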


Conclusion

  • The 2025 monitoring dataset includes 3,916 observations across 11 reefs, providing a strong basis for evaluating current monitoring precision under the updated protocol.

  • Sampling effort is not evenly distributed across reefs. Sites such as Jacarequara, Marauá, and Terra Amarela have very strong data coverage, whereas Goiabal, Água Boa, and Áries have more limited sample sizes and therefore greater uncertainty.

  • Confidence interval analysis shows that reef-level mean estimates of oyster height and length are generally precise. In eight of the eleven reefs, the 95% confidence interval half-width is below ±5% of the mean for both height and length, indicating strong statistical precision.

  • MDE analysis suggests that the 2025 design is sensitive enough to detect relatively small changes in oyster size in most reefs, especially where sampling effort was high. In the most precise reefs, detectable changes in length are around 2–4% of the mean.

  • However, lower-sample reefs (particularly Goiabal, Água Boa, and Áries) show reduced precision and lower sensitivity. In these reefs, MDE values range from roughly 7% to 12% of the mean, meaning that changes in oyster size must be larger to be statistically detectable.

  • Overall, the 2025 data suggest that the updated protocol is producing analyzable reef-level oyster size data with good precision.
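
For the lower-sample reefs, the same MDE relationship can be inverted to give a rough planning figure for how many oysters would be needed to reach a target sensitivity. The sketch below is an approximation, not part of the original analysis: `required_n()` is a hypothetical helper, it assumes the reef's mean and standard deviation stay at their 2025 values, and the standard deviation is backed out from the rounded confidence interval.

```r
# Hypothetical helper: observations needed for a target MDE (% of the mean),
# assuming the reef's mean and SD stay at their current values.
required_n <- function(mean_value, sd_value, target_mde_pct,
                       alpha = 0.05, power = 0.80) {
  z_sum <- qnorm(1 - alpha / 2) + qnorm(power)
  target_mde <- mean_value * target_mde_pct / 100
  ceiling((z_sum * sd_value / target_mde)^2)
}

# Aries length in 2025: mean 31.33 mm, n = 60, 95% CI 28.64 to 34.02 mm.
# The SD is backed out from the CI half-width (an approximation).
sd_aries <- ((34.02 - 28.64) / 2) / qnorm(0.975) * sqrt(60)

# Oysters needed at Aries for an MDE of 5% of the mean length
required_n(mean_value = 31.33, sd_value = sd_aries, target_mde_pct = 5)
```

Because the required sample size grows with the inverse square of the target MDE, halving the detectable change needs about four times the sampling effort.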


Comparison with 2024 monitoring performance

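Relative to 2024, the 2025 monitoring design generally shows improved statistical performance for both oyster height and length across reefs. At every reef, the 95% confidence interval half-width is smaller in 2025, indicating more precise mean estimates. The MDE as a percentage of the mean is also lower at every reef for height and at ten of the eleven reefs for length, meaning the system can detect smaller changes in oyster size than before. The exception is length at Áries, where the MDE is essentially unchanged (12.24% in 2024 versus 12.28% in 2025).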

# Load the 2021-2024 historical database and keep the columns shared with 2025
historical_oyster_biometry <- read_excel(
  here("data", "raw", "BRA_oyster_monitoring_database_2021_2024.xlsx"),
  sheet = "oyster_biometry"
) %>%
  clean_names() %>%
  select(
    country, ma_name, community, waterbody_name, oyster_reef_name,
    sampling_day, sampling_month, sampling_year, seasonal_period,
    sampling_time, stratum_no, quadrat_no, height_mm, length_mm
  )

# Stack historical and 2025 records using the same column set
all_oyster_biometry <- bind_rows(
  historical_oyster_biometry,
  oyster_biometry %>% select(all_of(names(historical_oyster_biometry)))
)

# z_alpha and z_beta are assumed to be defined earlier in the analysis,
# e.g. z_alpha <- qnorm(0.975) and z_beta <- qnorm(0.80)
comparison_summary <- all_oyster_biometry %>%
  filter(sampling_year %in% c(2024, 2025)) %>%
  group_by(oyster_reef_name, sampling_year) %>%
  summarise(
    # n() counts all rows, so it assumes no missing height or length values
    n = n(),

    # Height: mean, SD, standard error, 95% CI half-width, and MDE
    mean_height = mean(height_mm, na.rm = TRUE),
    sd_height = sd(height_mm, na.rm = TRUE),
    se_height = sd_height / sqrt(n),
    ci_halfwidth_height = 1.96 * se_height,
    mde_height_mm = (z_alpha + z_beta) * se_height,
    mde_height_pct = 100 * mde_height_mm / mean_height,

    # Length: same quantities
    mean_length = mean(length_mm, na.rm = TRUE),
    sd_length = sd(length_mm, na.rm = TRUE),
    se_length = sd_length / sqrt(n),
    ci_halfwidth_length = 1.96 * se_length,
    mde_length_mm = (z_alpha + z_beta) * se_length,
    mde_length_pct = 100 * mde_length_mm / mean_length,

    .groups = "drop"
  )

comparison_changes_height <- comparison_summary %>%
  select(
    oyster_reef_name,
    sampling_year,
    n,
    ci_halfwidth_height,
    mde_height_pct
  ) %>%
  pivot_wider(
    names_from = sampling_year,
    values_from = c(n, ci_halfwidth_height, mde_height_pct),
    names_sep = "_"
  ) %>%
  mutate(
    change_n = n_2025 - n_2024,
    change_ci_halfwidth = ci_halfwidth_height_2025 - ci_halfwidth_height_2024,
    change_mde_pct = mde_height_pct_2025 - mde_height_pct_2024
  ) %>%
  arrange(change_mde_pct)

comparison_changes_height %>%
  transmute(
    Reef = oyster_reef_name,
    `n (2024)` = n_2024,
    `n (2025)` = n_2025,
    `CI half-width (2024)` = round(ci_halfwidth_height_2024, 2),
    `CI half-width (2025)` = round(ci_halfwidth_height_2025, 2),
    `MDE % (2024)` = round(mde_height_pct_2024, 2),
    `MDE % (2025)` = round(mde_height_pct_2025, 2)
  ) %>%
  kbl(
    caption = "Table: Comparison of oyster height precision and detectability between 2024 and 2025 by reef.",
    align = "lcccccc",
    booktabs = TRUE
  ) %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  )

Table: Comparison of oyster height precision and detectability between 2024 and 2025 by reef.

| Reef | n (2024) | n (2025) | CI half-width (2024) | CI half-width (2025) | MDE % (2024) | MDE % (2025) |
|---|---|---|---|---|---|---|
| Água Boa | 43 | 37 | 3.85 | 2.30 | 18.79 | 11.57 |
| Goiabal | 42 | 55 | 3.01 | 2.04 | 14.29 | 8.40 |
| Jacarequara | 180 | 801 | 1.71 | 0.56 | 8.06 | 2.75 |
| Terra Amarela | 239 | 789 | 1.61 | 0.58 | 7.68 | 3.07 |
| Marauá | 326 | 608 | 1.22 | 0.50 | 6.22 | 1.92 |
| Romana | 339 | 491 | 1.40 | 0.85 | 6.87 | 4.07 |
| Pinheiro | 295 | 334 | 1.54 | 1.06 | 6.88 | 4.13 |
| Tio Oscar | 132 | 205 | 1.50 | 1.09 | 6.95 | 4.33 |
| Aquavila | 224 | 267 | 1.86 | 1.13 | 6.31 | 4.66 |
| Lauro Sodré | 206 | 269 | 1.32 | 0.87 | 6.06 | 4.59 |
| Áries | 85 | 60 | 2.33 | 2.13 | 12.91 | 11.84 |

comparison_changes <- comparison_summary %>%
  select(
    oyster_reef_name,
    sampling_year,
    n,
    ci_halfwidth_length,
    mde_length_pct
  ) %>%
  pivot_wider(
    names_from = sampling_year,
    values_from = c(n, ci_halfwidth_length, mde_length_pct),
    names_sep = "_"
  ) %>%
  mutate(
    change_n = n_2025 - n_2024,
    change_ci_halfwidth = ci_halfwidth_length_2025 - ci_halfwidth_length_2024,
    change_mde_pct = mde_length_pct_2025 - mde_length_pct_2024
  ) %>%
  arrange(change_mde_pct)

comparison_changes %>%
  transmute(
    Reef = oyster_reef_name,
    `n (2024)` = n_2024,
    `n (2025)` = n_2025,
    `CI half-width (2024)` = round(ci_halfwidth_length_2024, 2),
    `CI half-width (2025)` = round(ci_halfwidth_length_2025, 2),
    `MDE % (2024)` = round(mde_length_pct_2024, 2),
    `MDE % (2025)` = round(mde_length_pct_2025, 2)
  ) %>%
  kbl(
    caption = "Table: Comparison of oyster length precision and detectability between 2024 and 2025 by reef.",
    align = "lcccccc",
    booktabs = TRUE
  ) %>%
  kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed", "responsive")
  )

Table: Comparison of oyster length precision and detectability between 2024 and 2025 by reef.

| Reef | n (2024) | n (2025) | CI half-width (2024) | CI half-width (2025) | MDE % (2024) | MDE % (2025) |
|---|---|---|---|---|---|---|
| Água Boa | 43 | 37 | 4.61 | 1.98 | 18.05 | 7.87 |
| Goiabal | 42 | 55 | 3.95 | 2.08 | 14.57 | 7.20 |
| Jacarequara | 180 | 801 | 2.31 | 0.67 | 8.31 | 2.60 |
| Terra Amarela | 239 | 789 | 1.85 | 0.61 | 7.35 | 2.40 |
| Marauá | 326 | 608 | 1.57 | 0.61 | 6.07 | 1.83 |
| Pinheiro | 295 | 334 | 1.82 | 1.06 | 6.77 | 3.25 |
| Romana | 339 | 491 | 1.79 | 1.11 | 6.99 | 3.82 |
| Tio Oscar | 132 | 205 | 1.75 | 1.29 | 6.81 | 4.15 |
| Lauro Sodré | 206 | 269 | 1.61 | 1.10 | 6.04 | 4.04 |
| Aquavila | 224 | 267 | 2.44 | 1.80 | 5.85 | 4.86 |
| Áries | 85 | 60 | 2.80 | 2.69 | 12.24 | 12.28 |
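
A narrower interval can come from more observations, lower variability, or both. As a rough check, the sketch below backs out the standard deviation implied by each reported half-width and compares the observed change in half-width with what the change in n alone would predict. It is illustrative only: `implied_sd()` is a hypothetical helper, the inputs are rounded height values from the comparison table, and the three reefs were picked as examples. For these three reefs the implied standard deviation is lower in 2025, which suggests the gains are not explained by sample size alone.

```r
# Hypothetical helper: back out the SD implied by a 95% CI half-width and n
implied_sd <- function(ci_halfwidth, n, z = 1.96) {
  ci_halfwidth * sqrt(n) / z
}

# Height values copied from the 2024 vs 2025 comparison table
reefs <- data.frame(
  reef    = c("Água Boa", "Jacarequara", "Áries"),
  n_2024  = c(43, 180, 85),
  n_2025  = c(37, 801, 60),
  hw_2024 = c(3.85, 1.71, 2.33),
  hw_2025 = c(2.30, 0.56, 2.13)
)

reefs$sd_2024 <- implied_sd(reefs$hw_2024, reefs$n_2024)
reefs$sd_2025 <- implied_sd(reefs$hw_2025, reefs$n_2025)

# Half-width ratio expected from the change in n alone (SD held constant)
reefs$ratio_from_n <- sqrt(reefs$n_2024 / reefs$n_2025)
# Half-width ratio actually observed (2025 relative to 2024)
reefs$ratio_observed <- reefs$hw_2025 / reefs$hw_2024

print(
  reefs[, c("reef", "sd_2024", "sd_2025", "ratio_from_n", "ratio_observed")],
  digits = 3
)
```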