Introduction

This document presents a power analysis for estimating the Minimum Detectable Effect (MDE) for biomass at the Site Management Unit (SMU) level. The goal is to determine the smallest change in biomass that can be statistically detected, at a 5% significance level and with 80% power, given the number of sampling sites available per SMU.

Three different scenarios are analyzed:

  1. All sites combined per SMU: Estimates MDE for each SMU by treating all sampling sites as a single group.
  2. Control vs. Intervention sites: Splits each SMU’s sites into Control vs. Intervention groups (50/50 split).
  3. Control, Inside Reserve, and Outside Reserve sites: Further splits intervention sites into Inside Reserve vs. Outside Reserve, resulting in three comparison groups.

The estimated MDE values will help assess whether the planned sampling strategy is sufficient to detect meaningful changes in biomass and inform potential adjustments.

Minimum Detectable Effect for Biomass Using All Sites as a Single Group (Per SMU)

This section estimates the MDE for each SMU individually, treating all sampling sites within an SMU as a single group.
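
For reference, the single-group calculation in this section corresponds to the usual normal-approximation formula. The symbols σ (pooled biomass standard deviation) and n (number of sites in the SMU) are notation introduced here for illustration:

```latex
\mathrm{MDE} = \left( z_{1-\alpha/2} + z_{1-\beta} \right) \frac{\sigma}{\sqrt{n}},
\qquad \alpha = 0.05,\quad 1 - \beta = 0.80
```

With these settings the two quantiles are approximately 1.96 and 0.84, so the MDE is about 2.8 standard errors of the mean.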

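# Load packages: dplyr and tibble for data handling, kableExtra for tables
library(dplyr)
library(tibble)
library(kableExtra)

# Summarize biomass statistics across both input datasets
# (bio_ma and bio_reserve are assumed to be loaded already,
#  each with a biomass_kg_ha column)
biomass_stats <- bind_rows(bio_ma, bio_reserve) %>%
  summarize(
    avg_biomass = mean(biomass_kg_ha, na.rm = TRUE),
    sd_biomass = sd(biomass_kg_ha, na.rm = TRUE)
  )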

# Define parameters
alpha <- 0.05
z_alpha <- qnorm(1 - alpha / 2)  # Z value for 95% confidence
power <- 0.8
z_beta <- qnorm(power)           # Z value for 80% power

# Number of sampling sites per SMU
smu_samples <- tibble(
  `Site Management Unit (SMU)` = c("Mideast Negros", "Northeast Negros", "Northwest Cebu", "Southeast Negros"),
  `Total Sampling Sites` = c(39, 53, 203, 75)  # Explicitly defining total sites
)

# Function to calculate MDE
calculate_mde <- function(sd_biomass, n, z_alpha, z_beta) {
  (z_alpha + z_beta) * (sd_biomass / sqrt(n))
}

# Compute MDE for each SMU
mde_results <- smu_samples %>%
  mutate(
    `Minimum Detectable Effect (kg/ha)` = sapply(`Total Sampling Sites`, function(n) calculate_mde(
      sd_biomass = biomass_stats$sd_biomass,
      n = n,
      z_alpha = z_alpha,
      z_beta = z_beta
    )),
    `Percentage Change (%)` = (`Minimum Detectable Effect (kg/ha)` / biomass_stats$avg_biomass) * 100
  )

# Display results with improved column names
mde_results %>%
  kbl(
    caption = "Table 1: Estimated Minimum Detectable Effect (MDE) for biomass at the SMU level using all sites in each SMU.",
    digits = 2
  ) %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Table 1: Estimated Minimum Detectable Effect (MDE) for biomass at the SMU level using all sites in each SMU.

Site Management Unit (SMU)   Total Sampling Sites   Minimum Detectable Effect (kg/ha)   Percentage Change (%)
Mideast Negros               39                     1059.69                             37.81
Northeast Negros             53                     909.02                              32.43
Northwest Cebu               203                    464.48                              16.57
Southeast Negros             75                     764.15                              27.26

Table 1 presents the MDE for biomass when all sampling sites within each SMU are treated as a single group. The MDE (kg/ha) is the smallest biomass change that can be statistically detected given the number of sampling sites per SMU, and the percentage change column expresses it as a percentage of average biomass. Because the MDE shrinks in proportion to one over the square root of the number of sites, SMUs with more sampling sites can detect smaller changes: Northwest Cebu (203 sites) can detect a change of about 17%, whereas Mideast Negros (39 sites) can only detect a change of about 38%.
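
One way to use these results when considering adjustments to the design is to invert the formula and ask how many sites a single group would need to reach a target MDE. The sketch below assumes the same normal approximation and the pooled mean and standard deviation reported in this document. The function name `required_sites` and the 10%, 20%, and 30% targets are illustrative choices, not part of the original analysis.

```r
# Invert the single-group MDE formula: n = ((z_alpha + z_beta) * sd / MDE)^2,
# rounded up to the next whole site.
required_sites <- function(target_mde, sd_biomass, alpha = 0.05, power = 0.8) {
  z_sum <- qnorm(1 - alpha / 2) + qnorm(power)
  ceiling((z_sum * sd_biomass / target_mde)^2)
}

sd_biomass  <- 2362.15  # pooled SD reported in this document (kg/ha)
avg_biomass <- 2802.79  # pooled mean reported in this document (kg/ha)

# Hypothetical targets: detect a 10%, 20%, or 30% change in mean biomass
target_pct   <- c(10, 20, 30)
sites_needed <- required_sites(target_pct / 100 * avg_biomass, sd_biomass)

data.frame(target_pct, sites_needed)
```

Because the required number of sites grows with the inverse square of the target, halving the target MDE roughly quadruples the number of sites needed.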

Minimum Detectable Effect for Biomass with Control vs. Intervention (Per SMU)

This section estimates MDE for Control vs. Intervention comparisons by splitting each SMU’s sites into equal-sized Control and Intervention groups.

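# Compute sample size per group (50% control, 50% intervention).
# Note: round() rounds halves to the nearest even number, so for odd totals the
# two equal group sizes may not sum to the total (e.g., 39 sites -> 20 and 20).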
smu_samples <- smu_samples %>%
  mutate(
    `Control Sites` = round(`Total Sampling Sites` / 2),
    `Intervention Sites` = round(`Total Sampling Sites` / 2)
  )

# Function to calculate MDE for two-group comparisons
calculate_mde_two_group <- function(sd_biomass, n, z_alpha, z_beta) {
  (z_alpha + z_beta) * sqrt((2 * (sd_biomass^2)) / n)
}

# Compute MDE for Control vs. Intervention per SMU
mde_results_control_vs_intervention <- smu_samples %>%
  mutate(
    `Minimum Detectable Effect - Control vs. Intervention (kg/ha)` = sapply(`Control Sites`, function(n) calculate_mde_two_group(
      sd_biomass = biomass_stats$sd_biomass,
      n = n,
      z_alpha = z_alpha,
      z_beta = z_beta
    )),
    `Percentage Change - Control vs. Intervention (%)` = (`Minimum Detectable Effect - Control vs. Intervention (kg/ha)` / biomass_stats$avg_biomass) * 100
  )

# Display results with improved column names
mde_results_control_vs_intervention %>%
  select(`Site Management Unit (SMU)`, 
         `Minimum Detectable Effect - Control vs. Intervention (kg/ha)`, 
         `Percentage Change - Control vs. Intervention (%)`) %>%
  kbl(
    caption = "Table 2: Estimated Minimum Detectable Effect (MDE) for biomass at the SMU level, comparing Control vs. Intervention groups.",
    digits = 2
  ) %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Table 2: Estimated Minimum Detectable Effect (MDE) for biomass at the SMU level, comparing Control vs. Intervention groups.

Site Management Unit (SMU)   Minimum Detectable Effect - Control vs. Intervention (kg/ha)   Percentage Change - Control vs. Intervention (%)
Mideast Negros               2092.72                                                        74.67
Northeast Negros             1835.44                                                        65.49
Northwest Cebu               926.67                                                         33.06
Southeast Negros             1518.22                                                        54.17

This table extends the analysis by splitting each SMU’s sampling sites into equal-sized Control and Intervention groups (50/50 split). Each group now contains about half of the SMU's sites, and the comparison is between two estimated means rather than one. Each of these changes raises the MDE by a factor of about √2, so the values are roughly double those in Table 1; for Northwest Cebu, for example, the MDE rises from 464.48 to 926.67 kg/ha. Matching the sensitivity of the single-group analysis with separate Control and Intervention groups would therefore require roughly four times as many sites.
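
The z-based formula can be cross-checked against a t-based calculation, which accounts for the standard deviation being estimated from a finite sample. The sketch below uses `power.t.test()` from base R's stats package and solves for the detectable difference (`delta`). It assumes the pooled standard deviation reported in this document and the Control group sizes produced by `round(total / 2)`; the variable names are illustrative.

```r
# Cross-check the z-based two-group MDE against a t-based calculation.
sd_biomass  <- 2362.15  # pooled SD reported in this document (kg/ha)
smu_names   <- c("Mideast Negros", "Northeast Negros", "Northwest Cebu", "Southeast Negros")
n_per_group <- c(20, 26, 102, 38)  # sites per group from round(total / 2)

# z-based MDE, same formula as calculate_mde_two_group()
z_mde <- (qnorm(0.975) + qnorm(0.8)) * sqrt(2 * sd_biomass^2 / n_per_group)

# t-based MDE: leave delta unspecified so power.t.test() solves for it
t_mde <- sapply(n_per_group, function(n) {
  power.t.test(n = n, sd = sd_biomass, sig.level = 0.05, power = 0.8,
               type = "two.sample", alternative = "two.sided")$delta
})

data.frame(smu = smu_names, n_per_group,
           z_mde = round(z_mde, 2), t_mde = round(t_mde, 2))
```

The t-based values come out slightly larger than the z-based ones, with the gap widening as the group size shrinks, so the z-based MDEs are best read as mildly optimistic for the smaller SMUs.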

Minimum Detectable Effect for Biomass with Control, Inside Reserve, and Outside Reserve (Per SMU)

This section extends the previous analysis by further splitting intervention sites into Inside Reserve vs. Outside Reserve, leading to a three-group comparison.


# Summarize biomass statistics
biomass_stats <- bind_rows(bio_ma, bio_reserve) %>%
  summarize(
    avg_biomass = mean(biomass_kg_ha, na.rm = TRUE),
    sd_biomass = sd(biomass_kg_ha, na.rm = TRUE)
  )

# Define parameters
alpha <- 0.05
z_alpha <- qnorm(1 - alpha / 2)  # Z value for 95% confidence
power <- 0.8
z_beta <- qnorm(power)           # Z value for 80% power

# Define SMU-specific sample sizes
smu_samples <- tibble(
  `Site Management Unit (SMU)` = c("Mideast Negros", "Northeast Negros", "Northwest Cebu", "Southeast Negros"),
  `Total Sampling Sites` = c(39, 53, 203, 75)
) %>%
  mutate(
    `Control Sites` = round(`Total Sampling Sites` / 2),
    `Intervention Sites` = round(`Total Sampling Sites` / 2),
    `Inside Reserve Sites` = round(`Intervention Sites` / 2),
    `Outside Reserve Sites` = round(`Intervention Sites` / 2)
  )

# Function to calculate MDE for two-group comparisons
calculate_mde_two_group <- function(sd_biomass, n, z_alpha, z_beta) {
  (z_alpha + z_beta) * sqrt((2 * (sd_biomass^2)) / n)
}

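# Function to calculate MDE for three-group comparisons.
# Note: this replaces the factor 2 in the two-group variance term with 3 as a
# rough approximation; it is not a formal ANOVA power calculation.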
calculate_mde_three_group <- function(sd_biomass, n, z_alpha, z_beta) {
  (z_alpha + z_beta) * sqrt((3 * (sd_biomass^2)) / n)
}

# Compute MDE for each SMU
mde_results_three_groups <- smu_samples %>%
  mutate(
    `Minimum Detectable Effect - Control vs. Intervention (kg/ha)` = mapply(calculate_mde_two_group, 
                                         biomass_stats$sd_biomass, `Control Sites`, 
                                         MoreArgs = list(z_alpha = z_alpha, z_beta = z_beta)),
    
    `Minimum Detectable Effect - Inside vs. Outside Reserve (kg/ha)` = mapply(calculate_mde_two_group, 
                                   biomass_stats$sd_biomass, `Inside Reserve Sites`, 
                                   MoreArgs = list(z_alpha = z_alpha, z_beta = z_beta)),

    `Minimum Detectable Effect - Across All Three Groups (kg/ha)` = mapply(calculate_mde_three_group, 
                              biomass_stats$sd_biomass, `Inside Reserve Sites`, 
                              MoreArgs = list(z_alpha = z_alpha, z_beta = z_beta)),

    `Percentage Change - Control vs. Intervention (%)` = (`Minimum Detectable Effect - Control vs. Intervention (kg/ha)` / biomass_stats$avg_biomass) * 100,
    `Percentage Change - Inside vs. Outside Reserve (%)` = (`Minimum Detectable Effect - Inside vs. Outside Reserve (kg/ha)` / biomass_stats$avg_biomass) * 100,
    `Percentage Change - Across All Three Groups (%)` = (`Minimum Detectable Effect - Across All Three Groups (kg/ha)` / biomass_stats$avg_biomass) * 100
  )

# Display results with improved column names
mde_results_three_groups %>%
  select(`Site Management Unit (SMU)`, 
         `Minimum Detectable Effect - Across All Three Groups (kg/ha)`, `Percentage Change - Across All Three Groups (%)`) %>%
  kbl(
    caption = "Table 3: Estimated Minimum Detectable Effect (MDE) for biomass at the SMU level, with Control, Inside Reserve, and Outside Reserve groups.",
    digits = 2
  ) %>%
  kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Table 3: Estimated Minimum Detectable Effect (MDE) for biomass at the SMU level, with Control, Inside Reserve, and Outside Reserve groups.

Site Management Unit (SMU)   Minimum Detectable Effect - Across All Three Groups (kg/ha)   Percentage Change - Across All Three Groups (%)
Mideast Negros               3624.69                                                       129.32
Northeast Negros             3179.07                                                       113.43
Northwest Cebu               1605.04                                                       57.27
Southeast Negros             2629.63                                                       93.82

This table further divides the intervention sites into Inside Reserve and Outside Reserve, creating a three-group comparison (Control, Inside Reserve, and Outside Reserve). The MDE is computed from the size of the smallest groups (Inside and Outside Reserve, each about a quarter of an SMU's sites), so the values are higher than in the two-group scenario in Table 2. For three of the four SMUs the MDE is close to or above 100% of average biomass, meaning biomass would need to roughly double before a change could be detected. Detecting smaller differences between Inside and Outside Reserve sites would require substantially more sites per group.
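
Because the three groups differ in size (Control holds about half of an SMU's sites, each reserve group about a quarter), pairwise comparisons between them involve unequal sample sizes. A minimal sketch of a two-group MDE that allows unequal group sizes is given below, using the standard variance of a difference in means under the same normal approximation. The function `calculate_mde_unequal` and the column names are hypothetical additions, and the group sizes are those produced by the rounding in the code above.

```r
# Two-group MDE allowing unequal group sizes (normal approximation).
# The variance of a difference in means is sd^2 * (1 / n1 + 1 / n2).
calculate_mde_unequal <- function(sd_biomass, n1, n2, alpha = 0.05, power = 0.8) {
  (qnorm(1 - alpha / 2) + qnorm(power)) * sd_biomass * sqrt(1 / n1 + 1 / n2)
}

sd_biomass <- 2362.15  # pooled SD reported in this document (kg/ha)

# Group sizes implied by round(total / 2) and round(intervention / 2)
groups <- data.frame(
  smu            = c("Mideast Negros", "Northeast Negros", "Northwest Cebu", "Southeast Negros"),
  control        = c(20, 26, 102, 38),
  inside_reserve = c(10, 13, 51, 19)
)

# MDE for a Control vs. Inside Reserve comparison in each SMU
groups$mde_control_vs_inside <- calculate_mde_unequal(
  sd_biomass, groups$control, groups$inside_reserve
)

groups
```

When both groups have the same size, this expression reduces to the two-group formula used for Table 2.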

Considerations for Variability in Biomass Estimates

The biomass measures used to estimate MDE exhibit high variability (mean = 2802.79 kg/ha; sd = 2362.15 kg/ha), which corresponds to a coefficient of variation of about 84%. This is consistent with global trends in coral reef fish biomass, which fluctuate due to natural variability, habitat complexity, and fishing pressure.

However, if this variability is not representative of the surveyed reefs, the following refinements should be considered:

  • Using a dataset that better represents the surveyed sites.

  • Recalculating MDE based on actual baseline biomass measurements.

These refinements can lead to more accurate estimates and help optimize the sampling design.
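
As a sketch of the second refinement, the single-group MDE table could be recomputed from a vector of baseline biomass measurements once those are available. The function name `recompute_mde` is hypothetical, and the biomass values below are placeholders for illustration only, not survey data.

```r
# Recompute the single-group MDE table from baseline biomass measurements.
recompute_mde <- function(baseline_biomass, n_sites, alpha = 0.05, power = 0.8) {
  avg <- mean(baseline_biomass, na.rm = TRUE)
  s   <- sd(baseline_biomass, na.rm = TRUE)
  mde <- (qnorm(1 - alpha / 2) + qnorm(power)) * s / sqrt(n_sites)
  data.frame(n_sites = n_sites, mde_kg_ha = mde, pct_change = 100 * mde / avg)
}

# Placeholder values (kg/ha); replace with actual baseline measurements
baseline_biomass <- c(850, 1200, 2300, 640, 3100, 1750, NA, 980)

# Site counts per SMU as used elsewhere in this document
recompute_mde(baseline_biomass, n_sites = c(39, 53, 203, 75))
```

Since the MDE is proportional to the standard deviation, any reduction in the estimated variability translates directly into a proportionally smaller MDE for the same number of sites.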