Introduction/Background

This project uses data collected by NOAA and made available through the Reef Visual Census Statistical Package (RVC; https://grunt.sefsc.noaa.gov/rvc_analysis20/). The data come from the South Florida National Coral Reef Monitoring Program survey, which dates back to 1999, and include diver-sampled data, stratum data, taxonomic data, and benthic data. Here I use the diver-sampled data and the taxonomic data to create graphics that show where surveys took place and how species abundance varied across years.

I designed this project because of my experience studying coral bleaching, in which I try to establish geographical trends in bleaching. Visualizing the locations of surveys is therefore very important, as is understanding how abundance and distribution vary over time. I hypothesized that visuals addressing the spatial nature of the RVC dataset would highlight patterns of geographical and temporal change in species abundance and distribution.

Materials

The shapefile of the Florida Keys coastline was downloaded from Natural Earth (https://www.naturalearthdata.com/downloads/50m-physical-vectors/). The RVC package was downloaded from GitHub (https://github.com/jeremiaheb/rvc).

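library(rvc)      # getRvcData(), getDomainAbundance()
library(sf)       # read_sf(), st_as_sf(), st_bbox()
library(dplyr)    # %>%, select(), filter(), left_join()
library(ggplot2)  # plotting

setwd("/Users/kyra//ne_50m_ocean")
ocean_map <- read_sf("ne_50m_ocean.shp")

fl_keys <- getRvcData(years=2010:2014, regions = c("FLA KEYS"))
dry_tort <- getRvcData(years=2010:2014, regions = c("DRY TORT"))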

Methods

The data I used span 2010-2014 and are split by region (Florida Keys and Dry Tortugas). Data from the RVC package come as a list of three data frames. To work with the data more easily, I first extracted the diver-sampled data and the taxonomic data from the list. I then reduced the taxonomic data to the species code and scientific name, and finally used left_join to combine the two into a single data frame keyed on SPECIES_CD.

Florida Keys:

subset_sample_data <- fl_keys[['sample_data']]
subset_taxonomic_data <- fl_keys[['taxonomic_data']]

subset_taxonomic_data <- subset_taxonomic_data %>%
  dplyr::select(SPECIES_CD, SCINAME)

combined_data <- left_join(subset_sample_data, subset_taxonomic_data, by= "SPECIES_CD")

Dry Tortugas:

subset_sample_data_tort <- dry_tort[['sample_data']]
subset_taxonomic_data_tort <- dry_tort[['taxonomic_data']]

subset_taxonomic_data_tort <- subset_taxonomic_data_tort %>%
  dplyr::select(SPECIES_CD, SCINAME)

combined_data_tort <- left_join(subset_sample_data_tort, subset_taxonomic_data_tort, by= "SPECIES_CD")
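Because the same three steps are repeated for each region, they could be wrapped in one function. The following is only a sketch: join_species_names is a hypothetical helper (it is not part of the rvc package), and the toy list stands in for the output of getRvcData() with invented values.

```r
library(dplyr)

# Hypothetical helper (not part of rvc): attach scientific names to the
# diver-sampled records of one region.
join_species_names <- function(rvc_list) {
  taxa <- rvc_list[["taxonomic_data"]] %>%
    dplyr::select(SPECIES_CD, SCINAME)
  left_join(rvc_list[["sample_data"]], taxa, by = "SPECIES_CD")
}

# Toy stand-in for the list returned by getRvcData(); values are invented.
toy_region <- list(
  sample_data = data.frame(
    SPECIES_CD = c("THA BIFA", "SCA ISER", "THA BIFA"),
    YEAR = c(2010, 2010, 2012),
    stringsAsFactors = FALSE
  ),
  taxonomic_data = data.frame(
    SPECIES_CD = c("THA BIFA", "SCA ISER"),
    SCINAME = c("Thalassoma bifasciatum", "Scarus iseri"),
    COMNAME = c("bluehead", "striped parrotfish"),
    stringsAsFactors = FALSE
  )
)

# Every sample row is kept; SCINAME is added and COMNAME is dropped.
toy_combined <- join_species_names(toy_region)
```

With the real data, the calls would be join_species_names(fl_keys) and join_species_names(dry_tort).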

Because the dataset is so large, I chose to focus on the three most frequently surveyed fish species in each region. I filtered the joined data to those species and kept only the columns needed for the visuals: longitude, latitude, scientific name, and year.

Florida Keys:

most_frequent_species <- sort(table(combined_data$SPECIES_CD), decreasing=TRUE)[1:3]

most_frequent_species
## 
## THA BIFA SPA AURO SCA ISER 
##    19691    18878    17686

# keep only the three most frequent species and the columns used for plotting
most_frequent_species_data <- combined_data %>%
  filter(SPECIES_CD %in% c("THA BIFA", "SPA AURO", "SCA ISER")) %>%
  dplyr::select("LON_DEGREES", "LAT_DEGREES", "SCINAME", "YEAR")

Dry Tortugas:

most_frequent_species_tort <- sort(table(combined_data_tort$SPECIES_CD), decreasing=TRUE)[1:3]

most_frequent_species_tort
## 
## SCA ISER THA BIFA OCY CHRY 
##    15067    13471    13210

# keep only the three most frequent species and the columns used for plotting
most_frequent_species_data_tort <- combined_data_tort %>%
  filter(SPECIES_CD %in% c("SCA ISER", "THA BIFA", "OCY CHRY")) %>%
  dplyr::select("LON_DEGREES", "LAT_DEGREES", "SCINAME", "YEAR")
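The species codes above were typed in by hand after reading the table() output. As a sketch of an alternative, the top codes can be taken from the counts directly, so the filter follows the data. The toy data frame and its single-letter codes are invented for illustration; like table(), count() tallies rows per code.

```r
library(dplyr)

# Invented survey records: D appears 4 times, A 3 times, B twice, C once.
toy_records <- data.frame(
  SPECIES_CD = c("A", "A", "A", "B", "B", "C", "D", "D", "D", "D"),
  YEAR = 2010,
  stringsAsFactors = FALSE
)

# Count rows per species code, sort descending, and keep the first three codes.
top_codes <- toy_records %>%
  count(SPECIES_CD, sort = TRUE) %>%
  slice_head(n = 3) %>%
  pull(SPECIES_CD)

# Keep only the records belonging to those codes.
top_records <- toy_records %>%
  filter(SPECIES_CD %in% top_codes)
```

Here top_codes is "D", "A", "B", and the single record for species C is dropped.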

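Finally, I converted both data frames to sf objects.

Florida Keys:

# build point geometries from the coordinate columns; remove = FALSE keeps
# LON_DEGREES and LAT_DEGREES as ordinary columns as well
combined_data_sf <- st_as_sf(x = most_frequent_species_data,
                             coords = c('LON_DEGREES', 'LAT_DEGREES'),
                             crs = 4269, remove = FALSE)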

Dry Tortugas:

# same conversion for the Dry Tortugas records
combined_data_sf_tort <- st_as_sf(x = most_frequent_species_data_tort,
                                  coords = c('LON_DEGREES', 'LAT_DEGREES'),
                                  crs = 4269, remove = FALSE)
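The conversion assigns EPSG:4269 (NAD83) to the survey points. If the basemap turns out to use a different geographic CRS (Natural Earth layers are, as far as I know, distributed in WGS84, EPSG:4326, which st_crs(ocean_map) would confirm), the points can be reprojected with st_transform. A minimal sketch on invented coordinates:

```r
library(sf)

# Invented points roughly in the Florida Keys / Dry Tortugas area.
toy_points <- data.frame(
  LON_DEGREES = c(-80.5, -81.2, -82.9),
  LAT_DEGREES = c(25.0, 24.6, 24.6),
  SCINAME = c("Scarus iseri", "Thalassoma bifasciatum", "Ocyurus chrysurus"),
  stringsAsFactors = FALSE
)

# Same call pattern as in the report: label the coordinates as NAD83 and keep
# the original longitude/latitude columns.
toy_sf <- st_as_sf(toy_points,
                   coords = c("LON_DEGREES", "LAT_DEGREES"),
                   crs = 4269, remove = FALSE)

# Reproject to WGS84 so the points share a CRS with a WGS84 basemap.
toy_sf_wgs84 <- st_transform(toy_sf, 4326)
```

At the scale of these maps, the difference between NAD83 and WGS84 is far smaller than a single bin, so this is about consistency rather than a visible shift.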

I then plotted the data to get a sense of the range and distribution of fish species, which revealed that overplotting would be something I needed to address for future plots.

Florida Keys:

test_box1 <- st_bbox(c(xmin = -83, xmax = -78, ymax = 27, ymin = 22.5), crs = st_crs(4269))

ggplot()+
  geom_sf(data=ocean_map)+
  # one point per survey record, coloured by species
  geom_point(data=most_frequent_species_data,
             aes(x=LON_DEGREES, y=LAT_DEGREES, colour=SCINAME))+
  coord_sf(ylim=test_box1[c(2,4)],xlim=test_box1[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 along the Florida Keys",
  x = "Longitude",
  y = "Latitude",
  colour = "Species")

Dry Tortugas:

test_box1 <- st_bbox(c(xmin = -83, xmax = -78, ymax = 27, ymin = 22.5), crs = st_crs(4269))

ggplot()+
  geom_sf(data=ocean_map)+
  # one point per survey record, coloured by species
  geom_point(data=most_frequent_species_data_tort,
             aes(x=LON_DEGREES, y=LAT_DEGREES, colour=SCINAME))+
  coord_sf(ylim=test_box1[c(2,4)],xlim=test_box1[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 along the Dry Tortugas",
  x = "Longitude",
  y = "Latitude",
  colour = "Species")
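One common way to reduce overplotting in a scatter of this kind is to jitter the points slightly and make them semi-transparent, so stacked records show up as darker patches. The sketch below uses invented coordinates, and the jitter width and alpha are arbitrary choices, not values from this project. The binned maps in the Results section take a different route to the same problem.

```r
library(ggplot2)

set.seed(1)  # jitter is random; fix the seed so the figure is reproducible

# Invented data: 150 records stacked on only three distinct locations.
toy_surveys <- data.frame(
  LON_DEGREES = rep(c(-80.4, -80.9, -81.5), each = 50),
  LAT_DEGREES = rep(c(25.1, 24.8, 24.6), each = 50),
  SCINAME = rep(c("Scarus iseri", "Thalassoma bifasciatum"), times = 75),
  stringsAsFactors = FALSE
)

# Jitter by up to 0.05 degrees and draw points at 30% opacity.
p <- ggplot(toy_surveys, aes(x = LON_DEGREES, y = LAT_DEGREES, colour = SCINAME)) +
  geom_jitter(width = 0.05, height = 0.05, alpha = 0.3, size = 1) +
  labs(x = "Longitude", y = "Latitude", colour = "Species")

p
```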

Results

The following tables show abundances for the three most surveyed species in the Florida Keys and the Dry Tortugas from 2010 to 2014. No surveys were performed in the Florida Keys in 2013, and none were performed in the Dry Tortugas in 2011 or 2013.

Florida Keys:

THA BIFA=Thalassoma bifasciatum

SPA AURO=Sparisoma aurofrenatum

SCA ISER=Scarus iseri

fishes <- c("THA BIFA","SPA AURO","SCA ISER")
species_abundance <- getDomainAbundance(fl_keys,fishes)

species_abundanc_table <- knitr::kable(species_abundance[,1:4])
species_abundanc_table
| YEAR | REGION   | SPECIES_CD | abundance |
|-----:|:---------|:-----------|----------:|
| 2010 | FLA KEYS | SCA ISER   |  34488212 |
| 2010 | FLA KEYS | SPA AURO   |  16674636 |
| 2010 | FLA KEYS | THA BIFA   |  71999012 |
| 2011 | FLA KEYS | SCA ISER   |  41864812 |
| 2011 | FLA KEYS | SPA AURO   |  23110282 |
| 2011 | FLA KEYS | THA BIFA   |  56609678 |
| 2012 | FLA KEYS | SCA ISER   |  34969112 |
| 2012 | FLA KEYS | SPA AURO   |  21964844 |
| 2012 | FLA KEYS | THA BIFA   |  65344040 |
| 2014 | FLA KEYS | SCA ISER   |  39861620 |
| 2014 | FLA KEYS | SPA AURO   |  29903772 |
| 2014 | FLA KEYS | THA BIFA   |  69394638 |
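To read year-to-year trends off this table without scanning it by eye, the lowest-abundance year for each species can be computed directly. The sketch below re-enters the Florida Keys values from the table above by hand; in the actual workflow the same pipeline could presumably be applied to species_abundance itself.

```r
library(dplyr)

# Florida Keys abundances transcribed from the table above.
keys_abundance <- data.frame(
  YEAR = rep(c(2010, 2011, 2012, 2014), each = 3),
  SPECIES_CD = rep(c("SCA ISER", "SPA AURO", "THA BIFA"), times = 4),
  abundance = c(34488212, 16674636, 71999012,
                41864812, 23110282, 56609678,
                34969112, 21964844, 65344040,
                39861620, 29903772, 69394638),
  stringsAsFactors = FALSE
)

# For each species, keep the row with the smallest abundance.
lowest_year <- keys_abundance %>%
  group_by(SPECIES_CD) %>%
  slice_min(abundance, n = 1) %>%
  ungroup() %>%
  dplyr::select(SPECIES_CD, YEAR, abundance)

lowest_year
# SCA ISER -> 2010, SPA AURO -> 2010, THA BIFA -> 2011
```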

Dry Tortugas:

THA BIFA=Thalassoma bifasciatum

SCA ISER=Scarus iseri

OCY CHRY=Ocyurus chrysurus

fishes_tort <- c("THA BIFA","SCA ISER","OCY CHRY")
species_abundance_tort <- getDomainAbundance(dry_tort,fishes_tort)


species_abundanc_table_tort <- knitr::kable(species_abundance_tort[,1:4])
species_abundanc_table_tort
| YEAR | REGION   | SPECIES_CD | abundance |
|-----:|:---------|:-----------|----------:|
| 2010 | DRY TORT | OCY CHRY   |  17567263 |
| 2010 | DRY TORT | SCA ISER   |  18683099 |
| 2010 | DRY TORT | THA BIFA   |  25986485 |
| 2012 | DRY TORT | OCY CHRY   |  16885618 |
| 2012 | DRY TORT | SCA ISER   |  26360638 |
| 2012 | DRY TORT | THA BIFA   |  24647441 |
| 2014 | DRY TORT | OCY CHRY   |  17539848 |
| 2014 | DRY TORT | SCA ISER   |  22619720 |
| 2014 | DRY TORT | THA BIFA   |  24736443 |

The following graphs display fish species survey distribution according to species and year.

Florida Keys:

test_box1 <- st_bbox(c(xmin = -83, xmax = -78, ymax = 27, ymin = 22.5), crs = st_crs(4269))

ggplot()+
  geom_sf(data=ocean_map)+
  # count survey records in a 20 x 20 grid of bins, using the coordinate columns kept by st_as_sf
  stat_bin2d(data=combined_data_sf, aes(x=LON_DEGREES, y=LAT_DEGREES), bins=20)+
  # one panel per year (rows) and species (columns)
  facet_grid(YEAR ~ SCINAME)+
  # log-scaled fill so sparse and dense bins are both readable
  scale_fill_distiller(palette="YlOrRd", trans="log", direction=-1, breaks= c(1,10,100,1000))+
  coord_sf(ylim=test_box1[c(2,4)],xlim=test_box1[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 along the Florida Keys",
  x = "Longitude",
  y = "Latitude")+
  theme(axis.text.x = element_text(size=5))

Here are survey distribution visuals broken out by species for better visualization:

Because the Dry Tortugas reefs cover a small area, the first graph below gives geographical context for the survey sites, and the second is cropped to show abundance and distribution more clearly.

Dry Tortugas:

test_box3 <- st_bbox(c(xmin = -83, xmax = -78, ymax = 27, ymin = 22.5), crs = st_crs(4269))

ggplot()+
  geom_sf(data=ocean_map)+
  # count survey records in a 20 x 20 grid of bins (all years and species pooled)
  stat_bin2d(data=combined_data_sf_tort, aes(x=LON_DEGREES, y=LAT_DEGREES), bins=20)+
  scale_fill_distiller(palette="RdPu", trans="log", direction=-1)+
  coord_sf(ylim=test_box3[c(2,4)],xlim=test_box3[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 along the Dry Tortugas",
  x = "Longitude",
  y = "Latitude")

test_box2 <- st_bbox(c(xmin = -83.5, xmax = -82, ymax = 25, ymin = 24.2), crs = st_crs(4269))

ggplot()+
  geom_sf(data=ocean_map)+
  # count survey records in a 20 x 20 grid of bins within each panel
  stat_bin2d(data=combined_data_sf_tort, aes(x=LON_DEGREES, y=LAT_DEGREES), bins=20)+
  # one panel per year (rows) and species (columns)
  facet_grid(YEAR ~ SCINAME)+
  scale_fill_distiller(palette="RdPu", trans="log", direction=-1)+
  # crop to the Dry Tortugas
  coord_sf(ylim=test_box2[c(2,4)],xlim=test_box2[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 along the Dry Tortugas",
  x = "Longitude",
  y = "Latitude")+
  theme(axis.text.x = element_text(size=4))
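The crop limits in test_box2 were chosen by hand. As a sketch of an alternative, the limits can be derived from the extent of the survey points plus a small margin. The coordinates and the 0.1 degree padding below are invented for illustration.

```r
library(sf)

# Invented survey points around the Dry Tortugas.
toy_tort <- st_as_sf(
  data.frame(LON_DEGREES = c(-83.1, -82.8, -82.6),
             LAT_DEGREES = c(24.55, 24.70, 24.60)),
  coords = c("LON_DEGREES", "LAT_DEGREES"),
  crs = 4269
)

pad <- 0.1  # margin in degrees; an arbitrary choice

# st_bbox() returns the named extent xmin, ymin, xmax, ymax of the points.
data_box <- st_bbox(toy_tort)

# Expand the extent by the margin on every side.
crop_xlim <- c(data_box[["xmin"]] - pad, data_box[["xmax"]] + pad)
crop_ylim <- c(data_box[["ymin"]] - pad, data_box[["ymax"]] + pad)
```

These vectors can then be passed as coord_sf(xlim = crop_xlim, ylim = crop_ylim) in place of the test_box2 indexing.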

Conclusion

The RVC dataset that I chose to work with is very large, even when subsetted. Consequently, visualizing the spatial nature of the survey data was more complicated than I had anticipated. Ultimately, the graphics did highlight how species were distributed across reefs, which, combined with research on variation in species abundance, might provide useful insight. For example, in 2010 Ocyurus chrysurus had very low abundance toward the southwest of the Dry Tortugas. This could indicate unfavorable conditions in that region of the reef, perhaps owing to tourism and recreational activities concentrated there. By 2014, however, the spatial pattern for Ocyurus chrysurus in the graphics had changed noticeably, although its region-wide abundance in the table was nearly identical (17,567,263 in 2010 versus 17,539,848 in 2014). How did conditions change in that four-year span to influence distribution? Similar spatial shifts are visible for other species as well, raising questions about species abundance and distribution for future research.

Another notable trend is that Sparisoma aurofrenatum and Scarus iseri both had their lowest Florida Keys abundance in 2010, which can be seen in the table of abundances as well as in the graphics. Thalassoma bifasciatum did not follow this pattern: its abundance was highest in 2010 and lowest in 2011. All three species are highly abundant in the Florida Keys, so the low 2010 values for the first two may indicate a temporary change in local conditions. Additionally, these three species seem to concentrate on the easternmost side of the reef, around the Upper and Middle Keys. This could indicate more favorable conditions there, perhaps because the eastern side is farther from coastal activities and waste.

It was also interesting to compare Ocyurus chrysurus and Sparisoma aurofrenatum, which ranked among the three most frequently surveyed species only in the Dry Tortugas and only in the Florida Keys, respectively, with Thalassoma bifasciatum and Scarus iseri, which ranked in the top three in both regions. Within each region, abundances of Sparisoma aurofrenatum and Ocyurus chrysurus were consistently lower than those of Thalassoma bifasciatum and Scarus iseri. This may indicate that Thalassoma bifasciatum and Scarus iseri are more "generalist" species, capable of thriving in a wider range of conditions.

In the future, it might be interesting to perform this kind of spatial analysis on reef fish species that have lower abundances or are less frequently surveyed, which might reveal stronger patterns of geographical preference.

References

https://grunt.sefsc.noaa.gov/rvc_analysis20/

https://www.naturalearthdata.com/downloads/50m-physical-vectors/

https://github.com/jeremiaheb/rvc