BOLD (Barcode of Life Data System) is a database of DNA barcode sequences from georeferenced specimens; these barcodes closely approximate species.
We use the R package bold to download a set of georeferenced sequences for the taxon Pomacanthidae.
We filter and mutate the georeferenced sequence dataset from boldsystems.org to produce a curated dataframe in which each row is an individual specimen and each column holds specimen information. We add a new column, sequence, that stores each DNA sequence as a string.
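The curation step can be pictured with a minimal base R sketch. It assumes that filtering means dropping specimens that lack a marker code, a species name, or coordinates. The helper `curate_specimens` and the toy values are hypothetical; this is not the package's own code.

```r
# Toy specimen table: one row per specimen, one column per piece of
# specimen information. All values are invented for illustration.
specimens <- data.frame(
  marker_code  = c("COI-5P", "COI-5P", NA, "COI-5P"),
  species_name = c("Pomacanthus imperator", "", "Centropyge bispinosa",
                   "Centropyge bispinosa"),
  lat          = c(-17.5, -17.6, 12.1, NA),
  lon          = c(-149.8, -149.9, 43.2, 55.0),
  sequence     = c("ACGTACGT", "ACGTACGA", "ACGTTCGT", "ACGTTCGA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption, not prepare_bold_res itself):
# keep only rows with a marker code, a species name and both coordinates.
curate_specimens <- function(df) {
  has_marker  <- !is.na(df$marker_code) & nzchar(df$marker_code)
  has_species <- !is.na(df$species_name) & nzchar(df$species_name)
  has_coords  <- !is.na(df$lat) & !is.na(df$lon)
  df[has_marker & has_species & has_coords, , drop = FALSE]
}

curated <- curate_specimens(specimens)
```

In this toy table only the first specimen has all three pieces of information, so `curated` keeps a single row.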
The function prepare_bold_res applies 5 filters, including filters on:
- marker_code
- species_name information
- lat or lon coordinates information

The grid is composed of nested squares of siteSize meters, which we call sites. By default, the grid is built on a map in Behrmann projection. In this example we set a grid with sites 260 kilometers across.
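To make the idea of a site concrete, here is a rough base R sketch that projects longitude and latitude to Behrmann coordinates and bins them into square cells. It assumes a spherical Earth of radius 6,371,000 m and a grid origin at longitude 0, latitude 0. The functions `behrmann_xy` and `site_id` are hypothetical names, and the package's own grid construction may differ.

```r
# Behrmann projection: cylindrical equal-area with standard parallels at 30 degrees.
# Spherical approximation (assumed radius in meters), not an ellipsoidal one.
behrmann_xy <- function(lon, lat, radius = 6371000) {
  phi0 <- 30 * pi / 180
  list(
    x = radius * (lon * pi / 180) * cos(phi0),
    y = radius * sin(lat * pi / 180) / cos(phi0)
  )
}

# Hypothetical helper: label each point with the square cell it falls in.
# siteSize is the cell width in meters (260 km in the example of the text).
site_id <- function(lon, lat, siteSize = 260000) {
  xy <- behrmann_xy(lon, lat)
  col <- floor(xy$x / siteSize)
  row <- floor(xy$y / siteSize)
  paste(col, row, sep = "_")
}

# Two nearby points and one distant point (invented coordinates)
ids <- site_id(lon = c(-149.8, -149.9, 43.2), lat = c(-17.5, -17.6, 12.1))
```

Points that share an identifier fall in the same site. Here the first two points land in the same cell and the third lands in a different one.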
We group specimens of the same species that fall within the same site of the grid. Sequences are then aligned, and nucleotide diversity is calculated for each species within each site.
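As an illustration of the calculation, nucleotide diversity can be computed as the mean proportion of differing positions over all pairs of sequences. The base R sketch below assumes the sequences are already aligned to equal length and treats every character, gaps included, as an ordinary state. The function `nucleotide_diversity` is a hypothetical helper, not the package's implementation.

```r
# Mean pairwise proportion of differing positions among aligned sequences.
# Returns NA when fewer than two sequences are available.
nucleotide_diversity <- function(seqs) {
  n <- length(seqs)
  if (n < 2) return(NA_real_)
  # Assumption: the sequences are pre-aligned, so all have the same length
  stopifnot(length(unique(nchar(seqs))) == 1)
  # One row per sequence, one column per alignment position
  bases <- do.call(rbind, strsplit(toupper(seqs), ""))
  pairs <- combn(n, 2)
  per_pair <- apply(pairs, 2, function(p) {
    mean(bases[p[1], ] != bases[p[2], ])
  })
  mean(per_pair)
}

# Three aligned toy sequences of one species within one site
pi_hat <- nucleotide_diversity(c("ACGTACGT", "ACGTACGA", "ACGTTCGT"))
```

For these three sequences the pairwise differences are 1, 1 and 2 positions out of 8, so `pi_hat` equals (1/8 + 1/8 + 2/8) / 3 = 1/6.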
We assign to each site in the world map grid the mean nucleotide diversity of the species found in that site.
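The averaging step can be sketched in base R with `aggregate`. The table below is invented, its column names are assumptions, and species without a diversity value (for example those with a single sequence) are assumed to be skipped.

```r
# One row per species within a site, with its nucleotide diversity.
# Values are invented; NA marks a species with no computable diversity.
per_species <- data.frame(
  site      = c("A", "A", "B", "B", "B"),
  species   = c("sp1", "sp2", "sp1", "sp3", "sp4"),
  diversity = c(0.010, 0.030, 0.000, 0.020, NA),
  stringsAsFactors = FALSE
)

# Mean diversity per site; the formula interface drops rows with NA
site_mean <- aggregate(diversity ~ site, data = per_species, FUN = mean)
```

Site A averages 0.010 and 0.030 to 0.020, and site B averages 0.000 and 0.020 to 0.010.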
Then we can plot the world map grid of nucleotide diversity.
## OGR data source with driver: ESRI Shapefile
## Source: "/tmp/RtmpyE1UC7", layer: "ne_50m_coastline"
## with 1428 features
## It has 3 fields
## Integer64 fields read as strings: scalerank
## OGR data source with driver: ESRI Shapefile
## Source: "/tmp/RtmpyE1UC7", layer: "ne_50m_rivers_lake_centerlines"
## with 462 features
## It has 32 fields
## Integer64 fields read as strings: ne_id
## OGR data source with driver: ESRI Shapefile
## Source: "/tmp/RtmpyE1UC7", layer: "ne_50m_lakes"
## with 275 features
## It has 35 fields
## Integer64 fields read as strings: scalerank ne_id