Using the geoconnex Reference Feature Server with R

Introduction

Below we demonstrate how to use the Geoconnex Reference Feature server API with R’s spatial packages, particularly sf. We’ll explore various use cases for working with hydrological and related spatial data. These use cases can be grouped into three categories:

  1. Finding identifiers and spatial geometries for real-world features
  2. Finding features that are hydrologically related to a feature of interest (if that information is available)
  3. Finding datasets related to a feature of interest

This API conforms to the OGC API - Features specification, and its full API documentation is available here for those who wish to explore its capabilities beyond what is demonstrated below.

Setup

First, let’s load the necessary libraries and set up our API base URL.

Code
library(sf)
library(dplyr)
library(mapview)
library(jsonlite)
library(knitr)
library(DT)

base_url <- "https://reference.geoconnex.us"
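
Since the API follows the OGC API - Features specification, we can optionally fetch its landing page to confirm the server is reachable and inspect the top-level links it advertises. This is just a quick sanity check; the exact fields returned may vary.

Code
# Optional sanity check: fetch the OGC API Features landing page
# to confirm the server responds and list its top-level links
landing <- jsonlite::fromJSON(paste0(base_url, "?f=json"))

landing$title
datatable(landing$links)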

Use Case 1: Identifying features and their spatial geometry

The Geoconnex Reference Feature server is a source for identifiers and geometries for real-world features that many organizations may collect and publish data about. The attributes of these features vary, but all include a “uri” which serves as a persistent identifier. First, let’s discover what kinds of features are available:

Code
url <- paste0(base_url, "/collections?f=json")
collections <- jsonlite::fromJSON(url)

datatable(collections$collections)

We see a number of options available, including watershed boundaries like Hydrologic Unit Codes; administrative boundaries like counties, states, and public water systems; hydrologic features like mainstems (rivers) and aquifers; and hydrometric features such as dams and gages. The reference feature server lets us find features according to attribute and spatial filters. In general, this is accomplished by passing queries of the form https://reference.geoconnex.us/collections/<collectionId>/items?filter=<filter expression>. Let’s see how to use these collections!
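
As a convenience, the query pattern above can be wrapped in a small helper function (a hypothetical helper, not part of any package) that URL-encodes a filter expression and reads the matching features into an sf object. The examples below spell out the same steps explicitly.

Code
# Hypothetical convenience helper: build an items query for a collection,
# URL-encode the filter expression, and read the GeoJSON result into sf
get_reference_features <- function(collection, filter) {
  url <- paste0(base_url, "/collections/", collection,
                "/items?f=json&filter=", URLencode(filter))
  sf::st_read(url, quiet = TRUE)
}

# Example usage (equivalent to the fuzzy text match query below):
# get_reference_features("mainstems", "name_at_outlet ILIKE '%animas%'")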

Attribute Query

Fuzzy text match

Let’s say we’re interested in a specific river, one we think is called the “Animas river”. We can pass an attribute filter query to the mainstems collection, then use the R sf package to read the results into a dataframe and the mapview package to visualize them.

Code
# construct a query for a river name that includes the string "animas"
query <- URLencode("name_at_outlet ILIKE '%animas%'") 
url <- paste0(base_url, "/collections/mainstems/items?f=json&filter=", query)

# Read the data into an sf object
animas_rivers <- st_read(url, quiet = TRUE) 

# Display the results
animas_rivers |> 
  select(uri, name_at_outlet, outlet_drainagearea_sqkm) |>
  datatable()
Code
# Map the results
mapview(animas_rivers |> 
  select(uri, name_at_outlet), zcol = "name_at_outlet")

There are evidently 3 rivers whose names include the word “Animas”. Let’s say we were interested in the “Animas River”, shown on the map in green. We find that its Geoconnex URI is https://geoconnex.us/ref/mainstems/35394.
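
Because the server follows the OGC API - Features specification, an individual feature can also be retrieved directly by its identifier at /collections/<collectionId>/items/<featureId>. Assuming the feature ID is the final path segment of the Geoconnex URI above (35394), we could fetch just that river:

Code
# Retrieve a single feature by its identifier
# (assuming the item ID is the last path segment of the Geoconnex URI)
animas_url <- paste0(base_url, "/collections/mainstems/items/35394?f=json")
animas_river <- st_read(animas_url, quiet = TRUE)

mapview(animas_river, layer.name = "Animas River")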

Logical and quantitative filters

We can also filter on attributes using logical and quantitative expressions. Let’s say we wanted to find all rivers with drainage areas (in this reference dataset, the attribute is outlet_drainagearea_sqkm) greater than 500,000 square kilometers:

Code
# construct a query for a river with outlet_drainagearea_sqkm > 500,000
query <- URLencode("outlet_drainagearea_sqkm > 500000")
url <- paste0(base_url, "/collections/mainstems/items?f=json&filter=", query)

# Read the data into an sf object
large_mainstems <- st_read(url, quiet = TRUE)

# Display the results
large_mainstems |> 
  select(uri, name_at_outlet, outlet_drainagearea_sqkm) |>
  datatable()
Code
# Map the results
mapview(large_mainstems, zcol = "name_at_outlet")

Queries over multiple attributes can also be made by combining filters with ‘AND’ or ‘OR’. For example, let’s find all dams whose names include “Hoover”, and then filter to only the ones with a drainage area of more than 100,000 square kilometers:

Code
# Step 1: Find all dams named "Hoover"
query_hoover <- URLencode("name LIKE '%Hoover%'")
url_hoover <- paste0(base_url, "/collections/dams/items?f=json&filter=", query_hoover)
hoover_dams <- st_read(url_hoover, quiet = TRUE)

cat("Number of dams named 'Hoover':", nrow(hoover_dams), "\n")
Number of dams named 'Hoover': 39 
Code
# Create an interactive table of all Hoover dams
datatable(
  hoover_dams %>%
    st_drop_geometry() %>%
    select(uri, name, drainage_area_sqkm) %>%
    arrange(desc(drainage_area_sqkm)),
  options = list(pageLength = 10, scrollX = TRUE),
  caption = "All Dams Named 'Hoover'",
  rownames = FALSE
)
Code
# Step 2: Query for large Hoover dams using a combined filter
query_large_hoover <- URLencode("name LIKE '%Hoover%' AND drainage_area_sqkm > 100000")
url_large_hoover <- paste0(base_url, "/collections/dams/items?f=json&filter=", query_large_hoover)
large_hoover_dams <- st_read(url_large_hoover, quiet = TRUE)

cat("\nNumber of large Hoover dams (Drainage Area > 100,000 sq km):", nrow(large_hoover_dams), "\n")

Number of large Hoover dams (Drainage Area > 100,000 sq km): 1 
Code
# Create an interactive table of large Hoover dams
datatable(
  large_hoover_dams %>%
    st_drop_geometry() %>%
    select(uri, name, drainage_area_sqkm) %>%
    arrange(desc(drainage_area_sqkm)),
  options = list(pageLength = 10, scrollX = TRUE),
  caption = "Large Dams Named 'Hoover' (Drainage Area > 100,000 sq km)",
  rownames = FALSE
)
Code
# Create a map view of all Hoover dams, highlighting the large ones
m <- mapview(hoover_dams %>%
    select(uri, name, drainage_area_sqkm), layer.name = "All Hoover Dams", label="name")
m + mapview(large_hoover_dams %>%
    select(uri, name, drainage_area_sqkm), color = "red", col.regions="red", layer.name = "Large Hoover Dams", lwd=2, cex=15, label="Hoover")

We found 39 dams in the US named “Hoover”, but only 1 with a large drainage area: the famous Hoover Dam near Las Vegas, NV.
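
An ‘OR’ works the same way. For example, an illustrative filter could return dams that are either named “Hoover” or have very large drainage areas, regardless of name:

Code
# Illustrative 'OR' filter: dams named "Hoover" OR dams with a
# drainage area greater than 500,000 square kilometers
query_or <- URLencode("name LIKE '%Hoover%' OR drainage_area_sqkm > 500000")
url_or <- paste0(base_url, "/collections/dams/items?f=json&filter=", query_or)
hoover_or_large <- st_read(url_or, quiet = TRUE)

cat("Number of matching dams:", nrow(hoover_or_large), "\n")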

Spatial Queries

Bounding box

We can also do spatial queries, either with a bounding box (min lon, min lat, max lon, max lat) or by passing WKT-encoded geometry. Let’s say we want to find all Public Water Systems within a bounding box around the Four Corners region.

Code
# Define the bounding box for the Four Corners area
# Format: (min Longitude, min Latitude, max Longitude, max Latitude)
bbox <- c(-109.5, 36.5, -107.5, 37.5)

# Construct the URL with the bbox parameter
url <- paste0(base_url, "/collections/pws/items?f=json&bbox=", paste(bbox, collapse = ","))

# Read the data into an sf object
four_corners_pws <- st_read(url, quiet = TRUE)

# Display summary of the results
cat("Number of Public Water Systems found:", nrow(four_corners_pws), "\n")
Number of Public Water Systems found: 75 
Code
# Create an interactive table of the results
four_corners_pws %>%
  st_drop_geometry() %>%
  select(uri, pws_name, population_served_count) %>%
  arrange(desc(population_served_count)) %>%
  datatable(
    options = list(pageLength = 5, scrollX = TRUE),
    caption = "Public Water Systems in the Four Corners Area",
    rownames = FALSE
  )
Code
# Create a map view of the results
m <- mapview(four_corners_pws, zcol = "population_served_count", layer.name = "Population Served", label= "pws_name")

# Add the bounding box to the map
bbox_poly <- st_as_sf(st_as_sfc(st_bbox(c(xmin = bbox[1], ymin = bbox[2], xmax = bbox[3], ymax = bbox[4]), crs = 4326)))
m + mapview(bbox_poly, col.region = "red", alpha.regions=0, color="red", lwd=2, layer.name = "Query Bounding Box")

Intersection