Cast a line to subsegments in R

This post introduces a used-defined function used for casting sf objects of class LINESTRING or POLYGON into sub-strings.

Required R packages

library(sf)
library(rnaturalearth)
library(dplyr)

The problem

The sfpackage includes st_cast, a very powerful function that transforms geometries into other different types of geometries (i.e. LINESTRINGto POLYGON, etc.).

italy <- ne_countries(country = "italy", returnclass = "sf")
italy_pol <- italy %>% st_cast("POLYGON")
italy_lin <- italy_pol %>% st_cast("LINESTRING")
italy_pt <- italy_lin %>% st_cast("POINT")
par(mfrow = c(2, 2), mar = c(1, 1, 1, 1))
plot(st_geometry(italy), col = c("red", "yellow", "blue"), main = "MULTIPOLYGON")
plot(st_geometry(italy_pol), col = c("red", "yellow", "blue"), main = "POLYGON")
plot(st_geometry(italy_lin), col = c("red", "yellow", "blue"), main = "LINE")
plot(st_geometry(italy_pt), col = c("red", "yellow", "blue"), main = "POINT")

plot of chunk 20190505_italycast

What I missed when using st_castis the possibility to "break" the LINESTRINGobjects into sub-segments:

plot of chunk 20190505_italycastsub

An approach

So one possible solution could be to create LINESTRING objects for each consecutive pair of POINTobjects across the original geometry. Let's check it:

par(mfrow = c(1, 2), mar = c(1, 1, 1, 1))
test <- ne_countries(country = "spain", returnclass = "sf") %>%
  st_cast("POLYGON") %>%
  st_cast("LINESTRING")
plot(st_geometry(test), col = c("red", "yellow", "blue"), main = "LINESTRING")

geom <- lapply(
  1:(length(st_coordinates(test)[, 1]) - 1),
  function(i)
    rbind(
      as.numeric(st_coordinates(test)[i, 1:2]),
      as.numeric(st_coordinates(test)[i + 1, 1:2])
    )
) %>%
  st_multilinestring() %>%
  st_sfc(crs = st_crs(test)) %>%
  st_cast("LINESTRING")
plot(st_geometry(geom), col = c("red", "yellow", "blue"), main = "AFTER FUNCTION")

plot of chunk 20190505_testspain

The function stdh_cast_substring

Finally, I wrapped the solution into a function and extended it a little bit:

stdh_cast_substring <- function(x, to = "MULTILINESTRING") {
  ggg <- st_geometry(x)

  if (!unique(st_geometry_type(ggg)) %in% c("POLYGON", "LINESTRING")) {
    stop("Input should be  LINESTRING or POLYGON")
  }
  for (k in 1:length(st_geometry(ggg))) {
    sub <- ggg[k]
    geom <- lapply(
      1:(length(st_coordinates(sub)[, 1]) - 1),
      function(i)
        rbind(
          as.numeric(st_coordinates(sub)[i, 1:2]),
          as.numeric(st_coordinates(sub)[i + 1, 1:2])
        )
    ) %>%
      st_multilinestring() %>%
      st_sfc()

    if (k == 1) {
      endgeom <- geom
    }
    else {
      endgeom <- rbind(endgeom, geom)
    }
  }
  endgeom <- endgeom %>% st_sfc(crs = st_crs(x))
  if (class(x)[1] == "sf") {
    endgeom <- st_set_geometry(x, endgeom)
  }

  if (to == "LINESTRING") {
    endgeom <- endgeom %>% st_cast("LINESTRING")
  }
  return(endgeom)
}

The function could be improved in terms of performance. Given that it works at a coordinate level, for high-resolution objects it has some degree of delay

test100 <- ne_countries(
  continent = "south america",
  returnclass = "sf"
) %>%
  st_cast("POLYGON")

test50 <- ne_countries(50,
  continent = "south america",
  returnclass = "sf"
) %>%
  st_cast("POLYGON")



init <- Sys.time()
t1 <- stdh_cast_substring(test100, "LINESTRING")
end <- Sys.time()
kable(end - init, format = "markdown")
x
0.2212591 secs
init <- Sys.time()
t2 <- stdh_cast_substring(test50, "LINESTRING")
end <- Sys.time()
kable(end - init, format = "markdown")
x
2.996987 secs
par(mfrow = c(1, 1), mar = c(0, 0, 0, 0))
plot(st_geometry(test50), col = NA, bg = "#C6ECFF")
plot(st_geometry(ne_countries(50, returnclass = "sf")), col = "#F6E1B9", border = "#646464", add = T)
plot(st_geometry(test50), col = "#FEFEE9", border = "#646464", add = T)
plot(st_geometry(t2), col = c("red", "yellow", "blue"), add = T, lwd = 0.5)

plot of chunk 20190505_benchmarkfunction

It can be seen a difference in terms of performance, noting that test100 has 15 polygons decomposed in 914 sub-strings while test50 has 80 polygons to 8,414 sub-strings. In that sense, the original st_castis much faster, although this solution may work well in most cases.

A collection of my user-defined functions can be checked here