Data Source and Selection Criteria

For our boosted regression tree (BRT) modeling framework, we used 2025 coral cover data from shallow reefs (<30 m) collected by the National Coral Reef Monitoring Program (NCRMP) for St. Thomas and St. John (STTSTJ), U.S. Virgin Islands, and 2022 coral cover data from deep reefs (>30 m and <60 m) from the DCRMP from XXX.

## Rows: 13,744
## Columns: 16
## $ region              <chr> "STTSTJ", "STTSTJ", "STTSTJ", "STTSTJ", "STTSTJ", …
## $ year                <dbl> 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 20…
## $ sub_region_name     <chr> "MSR", "MSR", "MSR", "MSR", "MSR", "MSR", "MSR", "…
## $ admin               <chr> "OPEN", "OPEN", "OPEN", "OPEN", "OPEN", "OPEN", "O…
## $ primary_sample_unit <dbl> 8947, 8947, 8947, 8947, 8947, 8947, 8947, 8947, 89…
## $ lat_degrees         <dbl> 18.2758, 18.2758, 18.2758, 18.2758, 18.2758, 18.27…
## $ lon_degrees         <dbl> -64.9678, -64.9678, -64.9678, -64.9678, -64.9678, …
## $ min_depth           <dbl> 27.4320, 27.4320, 27.4320, 27.4320, 27.4320, 27.43…
## $ max_depth           <dbl> 29.2608, 29.2608, 29.2608, 29.2608, 29.2608, 29.26…
## $ analysis_stratum    <chr> "AGRF_DEEP", "AGRF_DEEP", "AGRF_DEEP", "AGRF_DEEP"…
## $ strat               <chr> "AGRF_DEEP", "AGRF_DEEP", "AGRF_DEEP", "AGRF_DEEP"…
## $ habitat_cd          <chr> "AGRF", "AGRF", "AGRF", "AGRF", "AGRF", "AGRF", "A…
## $ prot                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ cover_group         <chr> "CCA", "HARD CORALS", "MACROALGAE", "OTHER", "SOFT…
## $ percent_cvr         <dbl> 2.000000, 9.000000, 41.000000, 13.000000, 0.000000…
## $ n                   <dbl> 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1,…
## Rows: 2,094
## Columns: 10
## $ lat             <dbl> 18.23582, 18.23582, 18.23582, 18.23582, 18.23582, 18.2…
## $ lon             <dbl> -64.66067, -64.66067, -64.66067, -64.66067, -64.66067,…
## $ site            <chr> "7147U_2020", "7147U_2020", "7147U_2020", "7147U_2020"…
## $ year            <dbl> 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, …
## $ depth_m         <dbl> 50.9016, 50.9016, 50.9016, 50.9016, 50.9016, 50.9016, …
## $ max_hard_relief <dbl> 0.38, 0.38, 0.38, 0.38, 0.38, 0.38, 0.65, 0.65, 0.65, …
## $ max_soft_relief <dbl> 0.725, 0.725, 0.725, 0.725, 0.725, 0.725, 0.300, 0.300…
## $ avg_hard_relief <dbl> 0.16625, 0.16625, 0.16625, 0.16625, 0.16625, 0.16625, …
## $ cover_group     <chr> "pct_sand", "pct_hard_bottom", "pct_rubble", "pct_cora…
## $ pct_cover       <dbl> 17.5, 72.5, 10.0, 3.5, 1.0, 7.0, 7.5, 90.0, 2.5, 7.5, …

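We selected hard coral cover as a key environmental predictor variable based on its established ecological importance as habitat structure for reef fish assemblages. We filtered the data to include only hard coral records (cover_group = "HARD CORALS") from the most recent survey in each program (NCRMP 2025, DCRMP 2022).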

This selection resulted in 276 sampling points for the NCRMP and 89 from the DCRMP with complete geographic coordinates, ready for spatial integration with the 50m × 50m modeling grid.

Each observation represents percent cover at a primary sampling unit (PSU), with associated metadata including:

  • Spatial information: latitude and longitude (decimal degrees), depth range
  • Stratification variables: analysis stratum, habitat code, protection status
  • Administrative data: sub-region, management designation

Rename and Align Column Names, then Stack Data with bind_rows()

Combine the dataframes vertically (stack). The result keeps all rows and columns, filling NAs where needed.

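library(dplyr)

# NCRMP (shallow): align column names with the DCRMP table and tag the depth class
NCRMP_raw <- NCRMP_raw %>%
  rename(
    psu = primary_sample_unit,
    lat = lat_degrees,
    lon = lon_degrees,
    pct_cover = percent_cvr,
    depth = max_depth
  ) %>%
  mutate(habitat_depth = "shallow")

# DCRMP (deep): use the site label as the sampling-unit identifier
DCRMP_coralcover <- DCRMP_coralcover %>%
  rename(
    psu = site,
    depth = depth_m
  ) %>%
  mutate(habitat_depth = "deep")

# Coerce psu to character in both data frames so bind_rows() can combine them
# (it is numeric in NCRMP and character in DCRMP)
DCRMP_coralcover <- DCRMP_coralcover %>%
  mutate(psu = as.character(psu))

NCRMP_raw <- NCRMP_raw %>%
  mutate(psu = as.character(psu))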

combined_coral <- bind_rows(DCRMP_coralcover, NCRMP_raw) %>%
  select(lat, lon, psu, year, depth, cover_group, pct_cover, habitat_depth) %>%
  mutate(cover_group = case_when(
    cover_group == "pct_coral" ~ "HARD CORALS",
    # ... (rest as above)
    TRUE ~ cover_group
  )) %>%
  filter(cover_group == "HARD CORALS")

glimpse(combined_coral)
## Rows: 2,067
## Columns: 8
## $ lat           <dbl> 18.23582, 18.23883, 18.19974, 18.20675, 18.22998, 18.238…
## $ lon           <dbl> -64.66067, -64.69121, -64.73132, -64.79445, -64.67952, -…
## $ psu           <chr> "7147U_2020", "7095U_2020", "7099U_2020", "7522_2021", "…
## $ year          <dbl> 2020, 2020, 2020, 2021, 2021, 2018, 2018, 2018, 2022, 20…
## $ depth         <dbl> 50.9016, 50.5968, 50.5968, 50.5968, 50.5968, 50.1396, 49…
## $ cover_group   <chr> "HARD CORALS", "HARD CORALS", "HARD CORALS", "HARD CORAL…
## $ pct_cover     <dbl> 3.5, 7.5, 5.0, 2.5, 2.0, 5.0, 0.0, 4.5, 1.0, 1.5, 1.5, 2…
## $ habitat_depth <chr> "deep", "deep", "deep", "deep", "deep", "deep", "deep", …
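
The stacking behaviour described above can be illustrated on a toy example. This is only a sketch: the two small tibbles and their values are hypothetical stand-ins for the NCRMP and DCRMP tables, not the survey data. It shows why `psu` is coerced to character before stacking, and how `bind_rows()` fills columns that exist in only one table with `NA`.

```r
library(dplyr)

# Toy stand-ins for the two monitoring tables (values are illustrative only)
shallow <- tibble(psu = 8947, lat = 18.2758, pct_cover = 9, admin = "OPEN")
deep    <- tibble(psu = "7147U_2020", lat = 18.23582, pct_cover = 3.5)

# Stacking fails while psu is numeric in one table and character in the other
type_clash <- tryCatch(
  {
    bind_rows(deep, shallow)
    FALSE
  },
  error = function(e) TRUE
)

# After coercing psu to character, the stack succeeds
stacked <- bind_rows(deep, mutate(shallow, psu = as.character(psu)))

# Columns present in only one table (here, admin) are filled with NA
stacked
```

In this sketch the `admin` value for the deep row is `NA`, which is the same mechanism that leaves gaps in the combined table before `select()` narrows it to the shared columns.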

Visualization

Filter data to use in the BRT (2022 and 2025)

Select the most recent data for each program:

  • NCRMP: 2025
  • DCRMP: 2022
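
A minimal sketch of this filtering step follows. The report does not show the code that creates `coral_model`, so the logic here is an assumption: it keeps the latest survey year per depth class and drops records without coordinates. The table `combined_toy`, the lookup `latest_year`, and all values are hypothetical.

```r
library(dplyr)

# Assumption: the most recent survey year per program, keyed by depth class
# (NCRMP = shallow, DCRMP = deep)
latest_year <- c(shallow = 2025, deep = 2022)

# Toy stand-in for combined_coral (illustrative values only)
combined_toy <- tibble(
  psu           = c("a", "b", "c", "d", "e"),
  year          = c(2025, 2023, 2022, 2020, 2025),
  habitat_depth = c("shallow", "shallow", "deep", "deep", "shallow"),
  lat           = c(18.30, 18.31, 18.24, 18.23, NA),
  lon           = c(-64.90, -64.91, -64.66, -64.69, -64.80),
  pct_cover     = c(3, 5, 3.5, 7.5, 2)
)

coral_model_toy <- combined_toy %>%
  # Keep only rows from the latest survey year of their program
  filter(year == unname(latest_year[habitat_depth])) %>%
  # Keep only rows with complete geographic coordinates
  filter(!is.na(lat), !is.na(lon))

coral_model_toy
```

Keeping the years in a named lookup vector means a newer survey can be adopted by changing one value rather than editing the filter expression.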

Tables

# Summary statistics by depth class
by_depth <- coral_model %>%
  group_by(habitat_depth) %>%
  summarise(
    `Sites (n)` = n(),
    `Mean (%)` = round(mean(pct_cover, na.rm = TRUE), 2),
    `Std. Dev.` = round(sd(pct_cover, na.rm = TRUE), 2),
    `Median (%)` = round(median(pct_cover, na.rm = TRUE), 2),
    `Min (%)` = round(min(pct_cover, na.rm = TRUE), 2),
    `Max (%)` = round(max(pct_cover, na.rm = TRUE), 2),
    `Q25 (%)` = round(quantile(pct_cover, 0.25, na.rm = TRUE), 2),
    `Q75 (%)` = round(quantile(pct_cover, 0.75, na.rm = TRUE), 2)
  )

# Row with the overall total
total_row <- coral_model %>%
  summarise(
    habitat_depth = "Total",
    `Sites (n)` = n(),
    `Mean (%)` = round(mean(pct_cover, na.rm = TRUE), 2),
    `Std. Dev.` = round(sd(pct_cover, na.rm = TRUE), 2),
    `Median (%)` = round(median(pct_cover, na.rm = TRUE), 2),
    `Min (%)` = round(min(pct_cover, na.rm = TRUE), 2),
    `Max (%)` = round(max(pct_cover, na.rm = TRUE), 2),
    `Q25 (%)` = round(quantile(pct_cover, 0.25, na.rm = TRUE), 2),
    `Q75 (%)` = round(quantile(pct_cover, 0.75, na.rm = TRUE), 2)
  )

# Combine into a final table
combined_stats <- bind_rows(total_row, by_depth)

kable(combined_stats, booktabs = TRUE, 
      caption = "Descriptive statistics of hard coral percent cover by reef type") %>%
  kable_styling(latex_options = c("striped"), full_width = FALSE)

Descriptive statistics of hard coral percent cover by reef type

habitat_depth  Sites (n)  Mean (%)  Std. Dev.  Median (%)  Min (%)  Max (%)  Q25 (%)  Q75 (%)
Total                365      4.48       5.31         3.0        0     33.5      1.0        6
deep                  89      6.80       7.48         3.5        0     33.5      1.5       11
shallow              276      3.74       4.13         3.0        0     29.0      0.0        5

Export note: The data were exported to the local repository in the project path. The generated file (hard_coral_cover_points.csv) contains `r total_points` records and is formatted for import via the ArcGIS Value to Point tool, using the reference spatial join variables (lat, lon, pct_cover, and region).
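
The export step itself is not shown in the report. A minimal sketch of how it might look in base R follows. The data frame `coral_points`, its values, and the use of a temporary directory in place of the project path are assumptions for illustration; only the file name and the four column names come from the note above.

```r
# Toy stand-in for the filtered hard coral points (illustrative values only)
coral_points <- data.frame(
  lat       = c(18.23582, 18.2758),
  lon       = c(-64.66067, -64.9678),
  pct_cover = c(3.5, 9),
  region    = "STTSTJ"
)

# Assumption: a temporary directory stands in for the project path
out_file <- file.path(tempdir(), "hard_coral_cover_points.csv")

# row.names = FALSE avoids an unnamed index column in the exported table
write.csv(coral_points, out_file, row.names = FALSE)

# Record count of the kind reported inline in the export note
total_points <- nrow(read.csv(out_file))
```

Reading the file back to obtain `total_points` reports the number of records actually written, rather than the number that was intended.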

Hard Coral Cover Distribution

Overall Statistics

Hard coral cover across the 365 sampling sites exhibited low mean cover with high spatial heterogeneity:

  • Mean cover: 4.48% (SD = 5.31%)
  • Median cover: 3.00%
  • Range: 0–33.5%
  • Interquartile range: 1–6% (Q25 = 1%, Q75 = 6%)

The distribution was right-skewed (mean 4.48% versus median 3.0%), with 50% of sites having <=3% hard coral cover and 75% of sites having <=6% cover. This indicates a reef system dominated by low coral cover, consistent with regional patterns of coral decline in the Caribbean. The maximum observed cover of 33.5% was substantially higher than the mean, suggesting that a few sites with relatively healthy coral communities persist within the study area.
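
Statements of this kind can be checked directly from the cover values. The sketch below uses a short illustrative vector, not the survey data, and the helper `share_at_or_below()` is a hypothetical name introduced here.

```r
# Illustrative values only, not the survey data
pct_cover <- c(0, 0, 1, 1.5, 2, 3, 3, 4.5, 6, 11, 33.5)

# Proportion of sites at or below a given cover threshold
share_at_or_below <- function(x, threshold) mean(x <= threshold, na.rm = TRUE)

med      <- median(pct_cover, na.rm = TRUE)
quarts   <- quantile(pct_cover, c(0.25, 0.75), na.rm = TRUE)
share_50 <- share_at_or_below(pct_cover, med)

# A mean above the median is a simple indicator of right skew
right_skewed <- mean(pct_cover, na.rm = TRUE) > med
```

By construction, at least half of the values fall at or below the median, so a threshold quoted as holding "50% of sites" should match the median in the summary table.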

Hard Coral Cover Visualization

Spatial Extent

The sampling design covered the insular shelf surrounding St. Thomas and St. John:

  • Latitudinal range: 18.1783°N to 18.4139°N (spanning ~26.1 km)
  • Longitudinal range: 65.1451°W to 64.6379°W (spanning ~53.6 km)

This spatial coverage encompasses the major reef habitats surrounding both islands, including fringing reefs, patch reefs, and shelf-edge habitats at varying depths.
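
The approximate distances spanned by these coordinate ranges can be reproduced with a short calculation. This is a rough sketch that assumes a spherical Earth with a radius of 6371 km, so its results can differ by roughly 0.1 km from an ellipsoidal (WGS 84) calculation. The variable names are introduced here for illustration.

```r
# Assumption: spherical Earth, mean radius in km
earth_radius_km <- 6371

# Coordinate ranges of the sampling design (decimal degrees; west is negative)
lat_range <- c(18.1783, 18.4139)
lon_range <- c(-65.1451, -64.6379)

deg2rad <- function(d) d * pi / 180

# North-south span: arc length along a meridian
ns_km <- earth_radius_km * deg2rad(diff(lat_range))

# East-west span: arc length along a parallel, which shrinks with cos(latitude);
# evaluated at the mid-latitude of the study area
ew_km <- earth_radius_km * deg2rad(diff(lon_range)) * cos(deg2rad(mean(lat_range)))

round(c(north_south = ns_km, east_west = ew_km), 1)
```

The cosine term matters at this latitude: one degree of longitude covers about 5% less ground than one degree of latitude.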

## Reading layer `StThomas_merge' from data source 
##   `G:\Shared drives\NSF CoPE internal\GIS_CoPE\GIS_USVI\0_source_data_usvi\coastlines\intermediate\StThomas_merge.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 898 features and 10 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 279652.3 ymin: 1955154 xmax: 334084.7 ymax: 2037228
## Projected CRS: WGS 84 / UTM zone 20N

## # A tibble: 2 × 2
##   habitat_depth n_points
##   <chr>            <int>
## 1 deep                89
## 2 shallow            276