The NBN API (application programming interface) is a means of programmatically querying the NBN Atlas database over the web. Queries take the form of a web link (URL) which can be modified to download relevant data directly from the web into R for further analysis.
The NBN Atlas API is documented here - this gives details of how to build up your query depending on the information you require. This query is known as an API call.
The basic workflow for getting relevant data is:
Identify the relevant URI (the base URL for the API call)
Identify the relevant parts of the API call which need to be populated, i.e. your search terms, number of records, download format (if the API allows different formats), etc. (see the sketch after this list)
You may need to convert your API call results from JSON format
Some APIs may require you to obtain an API key or add your email address - doing so may speed up access or allow bigger downloads
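As a minimal sketch of the first two steps, an API call can be assembled as a query string in R by pasting the base URI and search terms together (the parameter names here anticipate the NBN Atlas query built up below):
base_url <- "https://records-ws.nbnatlas.org/occurrences/search?q=*:*&" ## the base URI
api_call <- paste0(base_url, "lat=", 51.507, "&lon=", 0.1278,
                   "&radius=", 1, "&pageSize=", 1000) ## search terms as key=value pairs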
The NBN Atlas API is well documented, with examples of each kind of query you can construct.
For example, let's say we want to extract occurrence records in a 1 km buffer around the point 51.507, 0.1278. The base URI would be
https://records-ws.nbnatlas.org/occurrences/search?q=*:*& - this means search for any species.
We can build this up to specify the lat, lon and the number of records to retrieve, so your string would look like:
https://records-ws.nbnatlas.org/occurrences/search?q=*:*&lat=51.507&lon=0.1278&radius=1&pageSize=1000
which means retrieve up to 1000 records from a 1 km buffer around 51.507, 0.1278.
Pasting the link into a web browser returns something like the figure below:
This is JSON data, which contains the data for each occurrence. We can parse this in R with various packages - jsonlite is useful, as follows…
search <- "https://records-ws.nbnatlas.org/occurrences/search?q=*:*&lat=51.507&lon=0.1278&radius=1&pageSize=1000"
df <- jsonlite::fromJSON(search, simplifyDataFrame = TRUE) ## the occurrence data is in a list element called `occurrences`
df <- df$occurrences
search <- "https://records-ws.nbnatlas.org/occurrences/search?q=*:*&lat=51.507&lon=0.1278&radius=1&pageSize=1000"
df <- jsonlite::fromJSON(search, simplifyDataFrame = TRUE) ## the data is pulled into an object called `occurrences`
df <- df$occurrences
This retrieves all the fields for each of the 1000 records. The dataset can be narrowed down in the usual way.
library(dplyr)

df_sp_yr_mon <- df |>
  filter(between(year, 2015, 2022),
         classs == "Aves") |> ## note: the NBN field really is spelt "classs"
  dplyr::select(contains("decimal"), species, vernacularName, year, month)
It is now relatively simple to undertake further analyses e.g. richness or diversity, and visualise or map the data.
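For instance, a quick species-richness summary might look like the following minimal sketch (assuming the df_sp_yr_mon data frame created above):
library(dplyr)

df_sp_yr_mon |>
  group_by(year) |>
  summarise(richness = n_distinct(species), ## number of distinct species
            records = n())                  ## number of occurrence records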
Mapping the data is very straightforward with the sf (simple features) package. First we convert our data to an sf object with the st_as_sf function, which creates a geometry list-column and requires a pre-specified CRS (coordinate reference system).
library(sf)

df_sp_yr_mon_sf <- st_as_sf(df_sp_yr_mon, coords = c("decimalLongitude", "decimalLatitude"), crs = 4326) ## EPSG:4326 = WGS84 lon/lat
We can now plot the distribution of (say) Blackbirds with the mapview package.
library(mapview)

df_sp_yr_mon_sf |>
  mutate(year = factor(year)) |>
  filter(vernacularName == "Blackbird") |>
  count(geometry) |>
  rename(blackbirds = n) |>
  mapview(zcol = "blackbirds", label = "blackbirds",
          legend = FALSE)
or create a side-by-side map…
winter <- df_sp_yr_mon_sf |>
  filter(month %in% c(10, 11, 12, 1, 2, 3)) |>  ## Oct-Mar
  count(geometry)

summer <- df_sp_yr_mon_sf |>
  filter(!month %in% c(10, 11, 12, 1, 2, 3)) |> ## Apr-Sep
  count(geometry)

mapview(winter, zcol = "n") | mapview(summer, zcol = "n")
There are other mapping approaches like tmap, ggplot with ggspatial, and ggmap.
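For example, a minimal ggplot2 sketch using geom_sf (assuming the df_sp_yr_mon_sf object created above; facetting by year is just an illustration):
library(ggplot2)
library(dplyr)

df_sp_yr_mon_sf |>
  filter(vernacularName == "Blackbird") |>
  ggplot() +
  geom_sf(alpha = 0.5) + ## geom_sf plots the sf geometry column directly
  facet_wrap(~year) +
  theme_minimal()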
Using purrr::map to extract multiple datasets
If you want to retrieve data for multiple points, it is pretty straightforward to build a function which takes lat and lon as parameters and which can be iterated over each location.
library(dplyr)
library(jsonlite)
library(tictoc) ## a timer package

get_nbn_buffer <- function(lon, lat, radius = 1, n = 1000){
  tic() ## start a timer
  base_url <- "https://records-ws.nbnatlas.org/occurrences/search?q=*:*&"
  search <- paste0(base_url, "lat=", lat, "&lon=", lon, "&radius=", radius, "&pageSize=", n)
  df <- fromJSON(search, simplifyDataFrame = TRUE)
  toc() ## stop the timer and print the elapsed time
  df$occurrences |>
    dplyr::select(kingdom:genus, contains("decimal"), year, month, dataProviderName, speciesGroups, vernacularName, species)
}
library(dplyr)
test <- get_nbn_buffer(lon = 0.1278, lat = 51.507, radius = 1, n = 1000) |>
head()
## 3.114 sec elapsed
test
## kingdom phylum classs order family genus
## 1 Animalia Chordata Aves Charadriiformes Laridae Chroicocephalus
## 2 Animalia Chordata Aves Gruiformes Rallidae Gallinula
## 3 Animalia Chordata Aves Gruiformes Rallidae Fulica
## 4 Animalia Chordata Aves Charadriiformes Laridae Sterna
## 5 Animalia Chordata Aves Charadriiformes Laridae Larus
## 6 Animalia Chordata Aves Anseriformes Anatidae Branta
## decimalLatitude decimalLongitude year month dataProviderName
## 1 51.50393 0.138217 2010 04 British Trust for Ornithology
## 2 51.50393 0.138217 2016 01 British Trust for Ornithology
## 3 51.51318 0.124237 2018 07 British Trust for Ornithology
## 4 51.50393 0.138217 2010 07 British Trust for Ornithology
## 5 51.50393 0.138217 2010 02 British Trust for Ornithology
## 6 51.50393 0.138217 2009 10 British Trust for Ornithology
## speciesGroups vernacularName species
## 1 Animals, Birds Black-headed Gull Chroicocephalus ridibundus
## 2 Animals, Birds Moorhen Gallinula chloropus
## 3 Animals, Birds Coot Fulica atra
## 4 Animals, Birds Common Tern Sterna hirundo
## 5 Animals, Birds Lesser Black-backed Gull Larus fuscus
## 6 Animals, Birds Canada Goose Branta canadensis
This extracted 1,000 records in about 3 seconds.
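The pageSize parameter caps each call at n records; the API documentation also describes a startIndex parameter for paged results. A hedged sketch of retrieving more than one page (assuming startIndex behaves as documented):
get_nbn_page <- function(start, page_size = 1000){
  url <- paste0("https://records-ws.nbnatlas.org/occurrences/search?q=*:*",
                "&lat=51.507&lon=0.1278&radius=1",
                "&pageSize=", page_size, "&startIndex=", start)
  jsonlite::fromJSON(url, simplifyDataFrame = TRUE)$occurrences
}

## first three pages of 1000 records each
pages <- purrr::map_dfr(seq(0, 2000, by = 1000), get_nbn_page)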
Let's say we have a list of lat-long pairs.
library(readr)

tf_lat_long <- read_csv("~/Library/CloudStorage/GoogleDrive-julian.flowers12@gmail.com/My Drive/tiny-forest-project/data/tf_lat_long.csv")
## iterate over first 3 records
x <- 1:3
## run `get_nbn_buffer` over the first 3 records and compile the results into a data frame
sp_area <- purrr::map_dfr(x, ~(get_nbn_buffer(lon = tf_lat_long$long[.x], lat = tf_lat_long$lat[.x])))
## 2.913 sec elapsed
## 3.24 sec elapsed
## 2.865 sec elapsed
sp_area |>
DT::datatable()
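One thing this loses is which site each record came from; a sketch using map_dfr's .id argument to add the row index of tf_lat_long as a site column (the column name "site" is just an illustration):
sp_area_sites <- purrr::map_dfr(
  x,
  ~ get_nbn_buffer(lon = tf_lat_long$long[.x], lat = tf_lat_long$lat[.x]),
  .id = "site" ## records the index (1-3) of each iteration
)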