This project uses data collected by NOAA and distributed through the “Reef Visual Census Statistical Package” (RVC): https://grunt.sefsc.noaa.gov/rvc_analysis20/. The data come from the South Florida National Coral Reef Monitoring Program survey, which dates back to 1999 and includes diver-sampled data, stratum data, taxonomic data, and benthic data. Here I use the diver-sampled and taxonomic data to create graphics that show where surveys took place and how species abundance varied across years.
I was inspired to design this project by my experience studying coral bleaching, in which I attempt to establish geographical trends in bleaching. Visualizing survey locations is therefore very important, as is understanding how abundance and distribution vary over time. I hypothesized that visuals addressing the spatial nature of the RVC dataset would highlight patterns of geographical and temporal change in species abundance and distribution.
The shapefile of the Florida Keys coastline was downloaded from Natural Earth: https://www.naturalearthdata.com/downloads/50m-physical-vectors/. The RVC package was installed from GitHub: https://github.com/jeremiaheb/rvc.
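library(rvc)      # getRvcData(), getDomainAbundance()
library(sf)       # read_sf(), st_as_sf(), st_bbox()
library(dplyr)    # %>%, select(), filter(), left_join()
library(ggplot2)  # plotting
setwd("/Users/kyra//ne_50m_ocean")
ocean_map <- read_sf("ne_50m_ocean.shp")
fl_keys <- getRvcData(years = 2010:2014, regions = c("FLA KEYS"))
dry_tort <- getRvcData(years = 2010:2014, regions = c("DRY TORT"))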
The data I used span 2010-2014 and are split by region (Florida Keys and Dry Tortugas). Data from the RVC package come as a list of three data frames. To make the data easier to work with, I first extracted the diver-sampled data and the taxonomic data from the list. I then reduced the taxonomic data to the species code and scientific name, and finally used left_join to combine everything into a single data frame.
Florida Keys:
subset_sample_data <- fl_keys[['sample_data']]
subset_taxonomic_data <- fl_keys[['taxonomic_data']]
subset_taxonomic_data <- subset_taxonomic_data %>%
  dplyr::select(SPECIES_CD, SCINAME)
combined_data <- left_join(subset_sample_data, subset_taxonomic_data, by = "SPECIES_CD")
Dry Tortugas:
subset_sample_data_tort <- dry_tort[['sample_data']]
subset_taxonomic_data_tort <- dry_tort[['taxonomic_data']]
subset_taxonomic_data_tort <- subset_taxonomic_data_tort %>%
  dplyr::select(SPECIES_CD, SCINAME)
combined_data_tort <- left_join(subset_sample_data_tort, subset_taxonomic_data_tort, by = "SPECIES_CD")
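Because the same extract-and-join steps are repeated for both regions, they could be wrapped in a small helper. The following is only a minimal sketch: `combine_region`, `toy_region`, and the `OTHER_COL` column are hypothetical, the values are invented, and base R's `merge(..., all.x = TRUE)` stands in for `left_join` so that the example runs without extra packages.

```r
# Toy stand-in for the list returned by getRvcData(); all values are invented.
toy_region <- list(
  sample_data = data.frame(
    SPECIES_CD  = c("THA BIFA", "SCA ISER", "THA BIFA"),
    YEAR        = c(2010, 2010, 2012),
    LAT_DEGREES = c(24.60, 24.61, 24.62),
    LON_DEGREES = c(-82.90, -82.91, -82.92)
  ),
  taxonomic_data = data.frame(
    SPECIES_CD = c("THA BIFA", "SCA ISER"),
    SCINAME    = c("Thalassoma bifasciatum", "Scarus iseri"),
    OTHER_COL  = c("a", "b")  # hypothetical extra column, dropped below
  )
)

# Hypothetical helper: attach the scientific name to every sample record.
combine_region <- function(rvc_list) {
  # Keep only the join key and the scientific name
  taxa <- rvc_list[["taxonomic_data"]][, c("SPECIES_CD", "SCINAME")]
  # all.x = TRUE keeps every sample row, which mirrors a left join
  merge(rvc_list[["sample_data"]], taxa, by = "SPECIES_CD", all.x = TRUE)
}

combined_toy <- combine_region(toy_region)
print(combined_toy)
```

With the real data, the same helper could be called once per region list, for example `combine_region(fl_keys)`, assuming the list elements are named as in the code above.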
Because the dataset is so large, I chose to focus on the three most frequently recorded fish species in each region. I filtered the combined data to those species and kept only the columns needed for the visuals.
Florida Keys:
most_frequent_species <- sort(table(combined_data$SPECIES_CD), decreasing=TRUE)[1:3]
most_frequent_species
##
## THA BIFA SPA AURO SCA ISER
## 19691 18878 17686
most_frequent_species_data <- combined_data %>%
  filter(SPECIES_CD %in% c("THA BIFA", "SPA AURO", "SCA ISER")) %>%
  dplyr::select("LON_DEGREES", "LAT_DEGREES", "SCINAME", "YEAR")
Dry Tortugas:
most_frequent_species_tort <- sort(table(combined_data_tort$SPECIES_CD), decreasing=TRUE)[1:3]
most_frequent_species_tort
##
## SCA ISER THA BIFA OCY CHRY
## 15067 13471 13210
most_frequent_species_data_tort <- combined_data_tort %>%
  filter(SPECIES_CD %in% c("SCA ISER", "THA BIFA", "OCY CHRY")) %>%
  dplyr::select("LON_DEGREES", "LAT_DEGREES", "SCINAME", "YEAR")
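The species codes above are typed in by hand after inspecting the frequency table. As a rough sketch of how the selection could be derived directly from the counts, the example below uses invented records and a hypothetical helper, `top_n_codes`; it is an illustration rather than the method used for the results in this report.

```r
# Invented records; only the species codes matter for this illustration.
toy_records <- data.frame(
  SPECIES_CD = c(rep("AAA", 5), rep("BBB", 4), rep("CCC", 3), rep("DDD", 1)),
  YEAR       = 2010
)

# Hypothetical helper: names of the n most frequent codes.
top_n_codes <- function(codes, n = 3) {
  counts <- sort(table(codes), decreasing = TRUE)
  names(counts)[seq_len(n)]
}

top_codes <- top_n_codes(toy_records$SPECIES_CD)

# Keep only the records belonging to those codes
top_records <- toy_records[toy_records$SPECIES_CD %in% top_codes, ]
print(top_codes)
```

Deriving the codes this way keeps the filter in step with the data if the years or regions change, at the cost of hiding which species were chosen unless the codes are printed.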
Finally, I converted the data to sf objects.
Florida Keys:
# EPSG:4269 is NAD83; remove = FALSE keeps the coordinate columns
combined_data_sf <- st_as_sf(x = most_frequent_species_data,
                             coords = c(x = "LON_DEGREES", y = "LAT_DEGREES"),
                             crs = 4269, remove = FALSE)
Dry Tortugas:
combined_data_sf_tort <- st_as_sf(x = most_frequent_species_data_tort,
                                  coords = c(x = "LON_DEGREES", y = "LAT_DEGREES"),
                                  crs = 4269, remove = FALSE)
I then plotted the data to get a sense of the range and distribution of fish species, which revealed that overplotting would be something I needed to address for future plots.
Florida Keys:
test_box1 <- st_bbox(c(xmin = -83, xmax = -78, ymax = 27, ymin = 22.5), crs = st_crs(4269))
ggplot()+
  geom_sf(data = ocean_map)+
  # Pass the data frame once and map columns by name instead of using $ in aes()
  geom_point(data = most_frequent_species_data,
             aes(x = LON_DEGREES, y = LAT_DEGREES, colour = SCINAME))+
  coord_sf(ylim = test_box1[c(2,4)], xlim = test_box1[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 along the Florida Keys",
       x = "Longitude",
       y = "Latitude",
       colour = "Species")
Dry Tortugas:
test_box1 <- st_bbox(c(xmin = -83, xmax = -78, ymax = 27, ymin = 22.5), crs = st_crs(4269))
ggplot()+
  geom_sf(data = ocean_map)+
  # Pass the data frame once and map columns by name instead of using $ in aes()
  geom_point(data = most_frequent_species_data_tort,
             aes(x = LON_DEGREES, y = LAT_DEGREES, colour = SCINAME))+
  coord_sf(ylim = test_box1[c(2,4)], xlim = test_box1[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 in the Dry Tortugas",
       x = "Longitude",
       y = "Latitude",
       colour = "Species")
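One way to gauge how severe overplotting is, assuming it arises at least partly because many records share exactly the same coordinates, is to count records per unique location. The sketch below uses invented coordinates and hypothetical names (`toy_points`, `site_counts`, `hidden_share`); it does not reproduce any numbers from the RVC data.

```r
# Invented coordinates: three records at one site, two at another, one alone.
toy_points <- data.frame(
  LON_DEGREES = c(-80.10, -80.10, -80.10, -80.25, -80.25, -81.00),
  LAT_DEGREES = c( 25.20,  25.20,  25.20,  25.05,  25.05,  24.60)
)

# Count records per unique coordinate pair
toy_points$n <- 1
site_counts <- aggregate(n ~ LON_DEGREES + LAT_DEGREES,
                         data = toy_points, FUN = sum)

# Share of records drawn on top of another record at the same location
hidden_share <- 1 - nrow(site_counts) / nrow(toy_points)

print(site_counts)
print(hidden_share)
```

A high share suggests that individual points will hide one another, which is one motivation for switching to binned counts.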
The following tables show abundances for the three most frequently recorded species in the Florida Keys and Dry Tortugas from 2010-2014. No surveys were performed in the Florida Keys in 2013, or in the Dry Tortugas in 2011 and 2013.
Florida Keys:
THA BIFA = Thalassoma bifasciatum
SPA AURO = Sparisoma aurofrenatum
SCA ISER = Scarus iseri
fishes <- c("THA BIFA","SPA AURO","SCA ISER")
species_abundance <- getDomainAbundance(fl_keys,fishes)
species_abundanc_table <- knitr::kable(species_abundance[,1:4])
species_abundanc_table
| YEAR | REGION | SPECIES_CD | abundance |
|---|---|---|---|
| 2010 | FLA KEYS | SCA ISER | 34488212 |
| 2010 | FLA KEYS | SPA AURO | 16674636 |
| 2010 | FLA KEYS | THA BIFA | 71999012 |
| 2011 | FLA KEYS | SCA ISER | 41864812 |
| 2011 | FLA KEYS | SPA AURO | 23110282 |
| 2011 | FLA KEYS | THA BIFA | 56609678 |
| 2012 | FLA KEYS | SCA ISER | 34969112 |
| 2012 | FLA KEYS | SPA AURO | 21964844 |
| 2012 | FLA KEYS | THA BIFA | 65344040 |
| 2014 | FLA KEYS | SCA ISER | 39861620 |
| 2014 | FLA KEYS | SPA AURO | 29903772 |
| 2014 | FLA KEYS | THA BIFA | 69394638 |
Dry Tortugas:
THA BIFA = Thalassoma bifasciatum
SCA ISER = Scarus iseri
OCY CHRY = Ocyurus chrysurus
fishes_tort <- c("THA BIFA","SCA ISER","OCY CHRY")
species_abundance_tort <- getDomainAbundance(dry_tort,fishes_tort)
species_abundanc_table_tort <- knitr::kable(species_abundance_tort[,1:4])
species_abundanc_table_tort
| YEAR | REGION | SPECIES_CD | abundance |
|---|---|---|---|
| 2010 | DRY TORT | OCY CHRY | 17567263 |
| 2010 | DRY TORT | SCA ISER | 18683099 |
| 2010 | DRY TORT | THA BIFA | 25986485 |
| 2012 | DRY TORT | OCY CHRY | 16885618 |
| 2012 | DRY TORT | SCA ISER | 26360638 |
| 2012 | DRY TORT | THA BIFA | 24647441 |
| 2014 | DRY TORT | OCY CHRY | 17539848 |
| 2014 | DRY TORT | SCA ISER | 22619720 |
| 2014 | DRY TORT | THA BIFA | 24736443 |
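The tabulated values can be summarized per species to see in which year each estimate was lowest and how much it changed between the first and last survey year. The sketch below re-enters the Florida Keys values from the table above by hand; the object names (`abundance`, `min_year`, `pct_change`) are illustrative only, and the same approach would apply to the Dry Tortugas table.

```r
# Florida Keys abundances, copied from the table above.
abundance <- data.frame(
  YEAR       = rep(c(2010, 2011, 2012, 2014), each = 3),
  SPECIES_CD = rep(c("SCA ISER", "SPA AURO", "THA BIFA"), times = 4),
  abundance  = c(34488212, 16674636, 71999012,
                 41864812, 23110282, 56609678,
                 34969112, 21964844, 65344040,
                 39861620, 29903772, 69394638)
)

by_species <- split(abundance, abundance$SPECIES_CD)

# Year with the lowest abundance for each species
min_year <- sapply(by_species, function(d) d$YEAR[which.min(d$abundance)])

# Percent change from the first to the last survey year
pct_change <- sapply(by_species, function(d) {
  d <- d[order(d$YEAR), ]
  100 * (d$abundance[nrow(d)] - d$abundance[1]) / d$abundance[1]
})

print(min_year)
print(round(pct_change, 1))
```

These summaries describe the point estimates only; they say nothing about the uncertainty around each estimate.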
The following graphs display fish species survey distribution according to species and year.
Florida Keys:
test_box1 <- st_bbox(c(xmin = -83, xmax = -78, ymax = 27, ymin = 22.5), crs = st_crs(4269))
ggplot()+
  geom_sf(data = ocean_map)+
  # Bin records into a 20 x 20 grid; LON/LAT columns were kept by remove = FALSE
  stat_bin2d(data = combined_data_sf, aes(x = LON_DEGREES, y = LAT_DEGREES), bins = 20)+
  facet_grid(YEAR ~ SCINAME)+
  scale_fill_distiller(palette = "YlOrRd", trans = "log", direction = -1, breaks = c(1,10,100,1000))+
  coord_sf(ylim = test_box1[c(2,4)], xlim = test_box1[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 along the Florida Keys",
       x = "Longitude",
       y = "Latitude")+
  theme(axis.text.x = element_text(size = 5))
Here are survey distribution visuals broken out by species for better visualization:
Because the Dry Tortugas reefs cover a small area, the first graph gives geographical context for the survey sites, and the second is cropped to show abundance and distribution more clearly.
Dry Tortugas:
test_box3 <- st_bbox(c(xmin = -83, xmax = -78, ymax = 27, ymin = 22.5), crs = st_crs(4269))
ggplot()+
  geom_sf(data = ocean_map)+
  # Bin records into a 20 x 20 grid; LON/LAT columns were kept by remove = FALSE
  stat_bin2d(data = combined_data_sf_tort, aes(x = LON_DEGREES, y = LAT_DEGREES), bins = 20)+
  scale_fill_distiller(palette = "RdPu", trans = "log", direction = -1)+
  coord_sf(ylim = test_box3[c(2,4)], xlim = test_box3[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 in the Dry Tortugas",
       x = "Longitude",
       y = "Latitude")
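The plots are cropped by indexing a bounding box: `st_bbox` stores its values in the order xmin, ymin, xmax, ymax, so positions 1 and 3 give the longitude limits and positions 2 and 4 the latitude limits. The sketch below imitates that ordering with a plain named vector so it runs without sf, using the cropped Dry Tortugas extent; `crop_box` and `toy_sites` are hypothetical names and the site coordinates are invented.

```r
# Same ordering that sf::st_bbox() uses: xmin, ymin, xmax, ymax
crop_box <- c(xmin = -83.5, ymin = 24.2, xmax = -82, ymax = 25)

x_limits <- crop_box[c(1, 3)]  # xmin, xmax
y_limits <- crop_box[c(2, 4)]  # ymin, ymax

# Invented sites: only the first lies inside the box
toy_sites <- data.frame(
  LON_DEGREES = c(-82.9, -81.5, -83.0),
  LAT_DEGREES = c( 24.6,  24.7,  25.5)
)

inside <- toy_sites$LON_DEGREES >= x_limits[["xmin"]] &
          toy_sites$LON_DEGREES <= x_limits[["xmax"]] &
          toy_sites$LAT_DEGREES >= y_limits[["ymin"]] &
          toy_sites$LAT_DEGREES <= y_limits[["ymax"]]

print(toy_sites[inside, ])
```

Note that `coord_sf` limits only change what is displayed, whereas filtering rows as in this sketch removes them from the data before any binning.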
test_box2 <- st_bbox(c(xmin = -83.5, xmax = -82, ymax = 25, ymin = 24.2), crs = st_crs(4269))
ggplot()+
  geom_sf(data = ocean_map)+
  # Bin records into a 20 x 20 grid; LON/LAT columns were kept by remove = FALSE
  stat_bin2d(data = combined_data_sf_tort, aes(x = LON_DEGREES, y = LAT_DEGREES), bins = 20)+
  facet_grid(YEAR ~ SCINAME)+
  scale_fill_distiller(palette = "RdPu", trans = "log", direction = -1)+
  coord_sf(ylim = test_box2[c(2,4)], xlim = test_box2[c(1,3)])+
  labs(title = "Fish Surveys from 2010-2014 in the Dry Tortugas",
       x = "Longitude",
       y = "Latitude")+
  theme(axis.text.x = element_text(size = 4))
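To make the binning step less of a black box, here is a rough base R analogue of what a 20 x 20 two-dimensional bin count with a log fill scale does. It is only a sketch on random, invented coordinates: the grid here spans fixed limits, whereas `stat_bin2d` derives its bins from the data, so cell edges will not match the plots exactly.

```r
set.seed(1)

# Invented coordinates scattered over the cropped Dry Tortugas extent
n   <- 500
lon <- runif(n, -83.5, -82)
lat <- runif(n, 24.2, 25)

bins <- 20

# Assign each record to one of 20 longitude and 20 latitude intervals
lon_bin <- cut(lon, breaks = seq(-83.5, -82, length.out = bins + 1),
               include.lowest = TRUE)
lat_bin <- cut(lat, breaks = seq(24.2, 25, length.out = bins + 1),
               include.lowest = TRUE)

# Records per grid cell (rows = latitude bins, columns = longitude bins)
cell_counts <- table(lat_bin, lon_bin)

# Only occupied cells are coloured; the log keeps a few dense cells
# from dominating the colour scale
occupied   <- cell_counts[cell_counts > 0]
log_counts <- log(occupied)

print(dim(cell_counts))
print(summary(as.vector(occupied)))
```

Because the fill represents the number of records in each cell, it reflects sampling effort as well as fish presence, which is worth keeping in mind when reading the maps.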
The RVC dataset that I chose to work with is very large, even when subsetted. Consequently, visualizing the spatial nature of the survey data was more complicated than I had anticipated. Ultimately, the graphics did highlight how species were distributed across reefs, which, combined with research on variation in species abundance, might provide useful insight. For example, in 2010 Ocyurus chrysurus had very low abundance towards the southwest of the Dry Tortugas. This could indicate unfavorable conditions in that region of the reef, perhaps owing to tourism and recreational activities concentrated there. By 2014, however, the distribution and abundance of Ocyurus chrysurus had changed markedly. How did conditions change over that four-year span to influence distribution? Similar spatial shifts are visible for other species as well, raising questions about species abundance and distribution for future research.
Another notable pattern in the Florida Keys data is that Sparisoma aurofrenatum and Scarus iseri both had their lowest abundance in 2010, which can be seen in the abundance table. Thalassoma bifasciatum did not follow this pattern: its tabulated abundance was highest in 2010 and lowest in 2011. These species are all highly abundant in the Florida Keys, so the low values may indicate a temporary change in local conditions. Additionally, all three species appear to concentrate on the easternmost side of the reef, around the Upper and Middle Keys. This could indicate more favorable conditions, perhaps because the eastern side is farther from coastal activities and waste.
It was also interesting to compare the abundance and distribution of Ocyurus chrysurus and Sparisoma aurofrenatum, which rank among the three most frequently recorded species only in the Dry Tortugas and the Florida Keys, respectively, with those of Thalassoma bifasciatum and Scarus iseri, which rank among the top three in both regions. Abundances of Sparisoma aurofrenatum and Ocyurus chrysurus were consistently lower than those of Thalassoma bifasciatum and Scarus iseri. This may indicate that Thalassoma bifasciatum and Scarus iseri are more “generalist” species, capable of thriving in a wider range of conditions.
In the future, it might be interesting to perform the same spatial analysis on reef fish species that have lower abundances or are less frequently surveyed, which might reveal stronger patterns of geographical preference.