The NBN API (application programming interface) is a means of programmatically querying the NBN Atlas database over the web. Queries take the form of a web link (URL) which can be modified to download relevant data directly from the web into R for further analysis.
The NBN Atlas API is documented here - this gives details of how to build up your query depending on the information you require. This query is known as an API call.
The basic workflow for getting relevant data is:
Identify the relevant URI (the base URL for the API call)
Identify the relevant parts of the API call which need to be populated, i.e. your search terms, number of records, download format (if the API allows different formats), etc. (see the sketch after this list)
You may need to convert your API call results from JSON format
Some APIs may require you to obtain an API key or add your email address - doing so may speed up access or allow bigger downloads
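As a minimal sketch of the first two steps, an API call can be assembled as a query string in R by pasting the base URI and search terms together (the parameter names here anticipate the NBN Atlas query built up below):
base_url <- "https://records-ws.nbnatlas.org/occurrences/search?q=*:*&" ## the base URI
api_call <- paste0(base_url, "lat=", 51.507, "&lon=", 0.1278,
                   "&radius=", 1, "&pageSize=", 1000) ## search terms as key=value pairs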
The NBN Atlas API is well documented, with examples of each kind of query you can construct.
For example, let's say we want to extract occurrence records in a 1 km buffer around the point 51.507, 0.1278. The base URI would be
https://records-ws.nbnatlas.org/occurrences/search?q=*:*& - this means search for any species.
We can build this up to specify the lat, lon and the number of records to retrieve, so your string would look like:
https://records-ws.nbnatlas.org/occurrences/search?q=*:*&lat=51.507&lon=0.1278&radius=1&pageSize=1000
which means retrieve up to 1000 records from a 1 km buffer around 51.507, 0.1278.
Pasting the link into a web browser returns something like the figure below:
This is JSON data, which contains the data for each occurrence. We can parse this in R with various packages - jsonlite is useful, as follows…
search <- "https://records-ws.nbnatlas.org/occurrences/search?q=*:*&lat=51.507&lon=0.1278&radius=1&pageSize=1000"
df <- jsonlite::fromJSON(search, simplifyDataFrame = TRUE) ## the occurrence data is in a list element called `occurrences`
df <- df$occurrences
search <- "https://records-ws.nbnatlas.org/occurrences/search?q=*:*&lat=51.507&lon=0.1278&radius=1&pageSize=1000"
df <- jsonlite::fromJSON(search, simplifyDataFrame = TRUE) ## the data is pulled into an object called `occurrences`
df <- df$occurrences
This retrieves all the fields for each of the 1000 records. The dataset can be narrowed down in the usual way.
library(dplyr)

df_sp_yr_mon <- df |>
  filter(between(year, 2015, 2022),
         classs == "Aves") |> ## note: the NBN field really is spelt "classs"
  dplyr::select(contains("decimal"), species, vernacularName, year, month)
It is now relatively simple to undertake further analyses e.g. richness or diversity, and visualise or map the data.
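For instance, a quick species-richness summary might look like the following minimal sketch (assuming the df_sp_yr_mon data frame created above):
library(dplyr)

df_sp_yr_mon |>
  group_by(year) |>
  summarise(richness = n_distinct(species), ## number of distinct species
            records = n())                  ## number of occurrence records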
Mapping the data is very straightforward with the sf (simple features) package. First we convert our data to an sf object with the st_as_sf function, which creates a geometry list-column and requires a pre-specified CRS (coordinate reference system).
library(sf)

df_sp_yr_mon_sf <- st_as_sf(df_sp_yr_mon, coords = c("decimalLongitude", "decimalLatitude"), crs = 4326) ## EPSG:4326 = WGS84 lon/lat
We can now plot the distribution of (say) Blackbirds with the mapview package.
library(mapview)

df_sp_yr_mon_sf |>
  mutate(year = factor(year)) |>
  filter(vernacularName == "Blackbird") |>
  count(geometry) |>
  rename(blackbirds = n) |>
  mapview(zcol = "blackbirds", label = "blackbirds",
          legend = FALSE)
or create a side-by-side map…
winter <- df_sp_yr_mon_sf |>
  filter(month %in% c(10, 11, 12, 1, 2, 3)) |>  ## Oct-Mar
  count(geometry)

summer <- df_sp_yr_mon_sf |>
  filter(!month %in% c(10, 11, 12, 1, 2, 3)) |> ## Apr-Sep
  count(geometry)

mapview(winter, zcol = "n") | mapview(summer, zcol = "n")
There are other mapping approaches like tmap, ggplot with ggspatial, and ggmap.
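For example, a minimal ggplot2 sketch using geom_sf (assuming the df_sp_yr_mon_sf object created above; facetting by year is just an illustration):
library(ggplot2)
library(dplyr)

df_sp_yr_mon_sf |>
  filter(vernacularName == "Blackbird") |>
  ggplot() +
  geom_sf(alpha = 0.5) + ## geom_sf plots the sf geometry column directly
  facet_wrap(~year) +
  theme_minimal()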
Using purrr::map to extract multiple datasets
If you want to retrieve data for multiple points, it is pretty straightforward to build a function which takes lat and lon as parameters and which can be iterated over each location.
library(dplyr)
library(jsonlite)
library(tictoc) ## a timer package

get_nbn_buffer <- function(lon, lat, radius = 1, n = 1000){
  tic() ## start a timer
  base_url <- "https://records-ws.nbnatlas.org/occurrences/search?q=*:*&"
  search <- paste0(base_url, "lat=", lat, "&lon=", lon, "&radius=", radius, "&pageSize=", n)
  df <- fromJSON(search, simplifyDataFrame = TRUE)
  toc() ## stop the timer and print the elapsed time
  df$occurrences |>
    dplyr::select(kingdom:genus, contains("decimal"), year, month, dataProviderName, speciesGroups, vernacularName, species)
}
library(dplyr)
test <- get_nbn_buffer(lon = 0.1278, lat = 51.507, radius = 1, n = 1000) |>
head()
## 3.114 sec elapsed
test
## kingdom phylum classs order family genus
## 1 Animalia Chordata Aves Charadriiformes Laridae Chroicocephalus
## 2 Animalia Chordata Aves Gruiformes Rallidae Gallinula
## 3 Animalia Chordata Aves Gruiformes Rallidae Fulica
## 4 Animalia Chordata Aves Charadriiformes Laridae Sterna
## 5 Animalia Chordata Aves Charadriiformes Laridae Larus
## 6 Animalia Chordata Aves Anseriformes Anatidae Branta
## decimalLatitude decimalLongitude year month dataProviderName
## 1 51.50393 0.138217 2010 04 British Trust for Ornithology
## 2 51.50393 0.138217 2016 01 British Trust for Ornithology
## 3 51.51318 0.124237 2018 07 British Trust for Ornithology
## 4 51.50393 0.138217 2010 07 British Trust for Ornithology
## 5 51.50393 0.138217 2010 02 British Trust for Ornithology
## 6 51.50393 0.138217 2009 10 British Trust for Ornithology
## speciesGroups vernacularName species
## 1 Animals, Birds Black-headed Gull Chroicocephalus ridibundus
## 2 Animals, Birds Moorhen Gallinula chloropus
## 3 Animals, Birds Coot Fulica atra
## 4 Animals, Birds Common Tern Sterna hirundo
## 5 Animals, Birds Lesser Black-backed Gull Larus fuscus
## 6 Animals, Birds Canada Goose Branta canadensis
This extracted 1,000 records in about 3 seconds.
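The pageSize parameter caps each call at n records; the API documentation also describes a startIndex parameter for paged results. A hedged sketch of retrieving more than one page (assuming startIndex behaves as documented):
get_nbn_page <- function(start, page_size = 1000){
  url <- paste0("https://records-ws.nbnatlas.org/occurrences/search?q=*:*",
                "&lat=51.507&lon=0.1278&radius=1",
                "&pageSize=", page_size, "&startIndex=", start)
  jsonlite::fromJSON(url, simplifyDataFrame = TRUE)$occurrences
}

## first three pages of 1000 records each
pages <- purrr::map_dfr(seq(0, 2000, by = 1000), get_nbn_page)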
Let's say we have a list of lat-long pairs.
library(readr)

tf_lat_long <- read_csv("~/Library/CloudStorage/GoogleDrive-julian.flowers12@gmail.com/My Drive/tiny-forest-project/data/tf_lat_long.csv")
## iterate over first 3 records
x <- 1:3
## run `get_nbn_buffer` over the first 3 records and compile the results into a data frame
sp_area <- purrr::map_dfr(x, ~(get_nbn_buffer(lon = tf_lat_long$long[.x], lat = tf_lat_long$lat[.x])))
## 2.913 sec elapsed
## 3.24 sec elapsed
## 2.865 sec elapsed
sp_area |>
DT::datatable()
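One thing this loses is which site each record came from; a sketch using map_dfr's .id argument to add the row index of tf_lat_long as a site column (the column name "site" is just an illustration):
sp_area_sites <- purrr::map_dfr(
  x,
  ~ get_nbn_buffer(lon = tf_lat_long$long[.x], lat = tf_lat_long$lat[.x]),
  .id = "site" ## records the index (1-3) of each iteration
)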