Geometry operations

Disclaimer: The contents of this document come from Chapter 5. Geometry operations of Geocomputation with `R`(Lovelace, Nowosad, & Muenchow, 2019). This document is prepared for CP6521 Advanced GIS, a graduate-level city planning elective course at Georgia Tech in Spring 2019. For any question, contact the instructor, Yongsung Lee, Ph.D. via yongsung.lee(at)gatech.edu.

This document is also published on RPubs.

install.packages("tidyverse", dependencies = TRUE)
install.packages("sf", dependencies = TRUE)
install.packages("spData", dependencies = TRUE)
install.packages("tmap", dependencies = TRUE)
install.packages("rmapshaper", dependencies = TRUE)

library(tidyverse)
library(sf)          # classes and functions for vector data
library(spData)      # load geographic data
library(tmap)
library(rmapshaper)

Introduction

What we do:

Learn how to deal with the sfc list-column in sf objects

Why we do:

Geographic analysis often involves in generating new geometries from input data and their relationships.

5.1 Introduction

Two basic types of geometry operations
- Unary operations work on a single geometry in isolation. This includes simplification (of lines and polygons), the creation of buffers and centroids, and shifting/scaling/rotating single geometries using ‘affine transformations’.
- Binary transformations modify one geometry based on the shape of another. This includes clipping and geometry unions.

5.2 Geometric operations on vector data

Functions in this section work on objects of class sfc in addition to objects of class sf.

5.2.1 Simplification

seine 
class(seine) # unit? 
typeof(seine)
str(seine)
# For crs, https://epsg.io/2154

tmap_mode("view") # from now on, all tmaps are interactive 

tm_basemap("OpenStreetMap.Mapnik") + 
  tm_shape(seine) + 
  tm_lines()

seine_simp <- st_simplify(seine, dTolerance = 2000) # 2000 meter 

tm_basemap("OpenStreetMap.Mapnik", alpha = 0.1) + 
  tm_shape(seine) + 
  tm_lines() +
  tm_shape(seine_simp) + 
  tm_lines(col = "red")

object.size(seine)
object.size(seine_simp)

To simplify sf objects, make sure they are projected.

us_states 
class(us_states)
typeof(us_states)
str(us_states)
# https://epsg.io/4269 unprojected 

us_states2163 <- st_transform(us_states, 2163)

us_states_simp1 <- st_simplify(us_states2163, dTolerance = 100000) # 100 km 

tmap_mode("plot")

tm_shape(us_states_simp1) +
  tm_polygons() # do not preserve topology 


# keep_shapes = TRUE
# Prevent small polygon features from disappearing at high simplification
# https://www.rdocumentation.org/packages/rmapshaper/versions/0.4.1/topics/ms_simplify

# ms_simplify can't deal with the unit class 
# map(us_states2163, class)
us_states_simp2 <- rmapshaper::ms_simplify(us_states2163, keep = 0.01,
                                           keep_shapes = TRUE) 
tm_shape(us_states_simp2) +
  tm_polygons()

us_states2163_sp <- sf:::as_Spatial(us_states2163) # conver sf to sp 
us_states_simp3 <- rmapshaper::ms_simplify(us_states2163_sp, keep = 0.01,
                                           keep_shapes = TRUE) 
tm_shape(us_states_simp3) +
  tm_polygons()

us_states2163_no_unit <- us_states2163
us_states2163_no_unit$AREA <- as.numeric(us_states2163_no_unit$AREA)
us_states_simp4 <- rmapshaper::ms_simplify(us_states2163_no_unit, keep = 0.01,
                                           keep_shapes = TRUE) 
tm_shape(us_states_simp4) +
  tm_polygons()

5.2.2 Centroids

st_centroid identifies the center of geographic objects.

st_crs(nz)
nz_centroid <- st_centroid(nz)

tm_shape(nz) +
  tm_polygons() + 
  tm_shape(nz_centroid) + 
  tm_dots(shape = 1, size = 1)

st_point_on_surface returns points inside original geometries.

nz_pos <- st_point_on_surface(nz)

tm_shape(nz) +
  tm_polygons() + 
  tm_shape(nz_centroid) + 
  tm_dots(shape = 1, size = 1) + 
  tm_shape(nz_pos) + 
  tm_dots(shape = 1, col = "red", size = 1)

seine_centroid <- st_centroid(seine)
seine_pos <- st_point_on_surface(seine)

tm_shape(seine) +
  tm_lines() + 
  tm_shape(seine_centroid) + 
  tm_dots(shape = 1, size = 1) + 
  tm_shape(seine_pos) + 
  tm_dots(shape = 1, col = "red", size = 1)

5.2.3 Buffers

Buffers: Polygons representing the area within a given distance of a geometric feature.
- Regardless of the type of inputs, the output is polygon.

seine_buffer_5km <- st_buffer(seine, dist = 5000)
seine_buffer_50km <- st_buffer(seine, dist = 50000)

tm_shape(seine_buffer_5km) +
  tm_polygons(col = "name") +
  tm_shape(seine) +
  tm_lines()

tm_shape(seine_buffer_50km) +
  tm_polygons(col = "name") +
  tm_shape(seine) +
  tm_lines()

5.2.4 Affine transformations

Skip

5.2.5 Clipping

Spatial clipping is a form of spatial subsetting that involves changes to the geometry columns of at least some of the affected features.

Start with sample data sets.

b <- st_sfc(st_point(c(0, 1)), st_point(c(1, 1))) # create 2 points 
b <- st_buffer(b, dist = 1) # convert points to circles 
plot(b) 
text(x = c(-0.5, 1.5), y =1, labels = c("x", "y")) # add text 

# examine b 
b
class(b)
typeof(b)
str(b)
nrow(b)
length(b) # b behaves as if it's a vector

x = b[1] # The single bracket on a list returns a sublist (the same class)
y = b[2] 
x_and_y = st_intersection(x, y)

plot(b)
plot(x_and_y, col = "lightgrey", add = TRUE) # color intersecting area  

# below, create the same plot with the tmap package 
tm_shape(b) + # no projection, but it's fine 
  tm_polygons(col = "white") + 
  tm_shape(x_and_y) +
  tm_polygons(col = "lightgrey") # color intersecting area

Explorea a list of possible clipping cases.

# first define a mapping function with one input, x_clip_y 
my_tmap <- function(x_clip_y) {
  tm_shape(b) +    
    tm_polygons(col = "white") + 
    tm_shape(x_clip_y) +
    tm_polygons(col = "lightgrey") + 
    tm_shape(b) + 
    tm_borders()
}

st_difference(y, x) %>% my_tmap()
x %>% my_tmap()
st_union(x, y) %>% my_tmap() 
st_intersection(x, y) %>% my_tmap()
st_sym_difference(x, y) %>% my_tmap()
st_difference(x, y) %>% my_tmap()
y %>% my_tmap()

Now, let’s combine spatial subsetting and clipping!

# make sample data ready 
bb <- st_bbox(st_union(x, y)) # create a bounding box 
box <- st_as_sfc(bb) # creat an sfc object out of the bounding box (sfc = sort of an sf object only with the geometry column)  
set.seed(2017) 
p <- st_sample(x = box, size = 10) # generate 10 random points inside the bounding box 

# plot the sample data 
plot(box)
plot(x, add = TRUE)
plot(y, add = TRUE)
plot(p, add = TRUE)
text(x = c(-0.5, 1.5), y = 1, labels = c("x", "y"))

# spatial subsetting: no changes in geometries   
sel_p_xy <- st_intersects(p, x, sparse = FALSE)[, 1] &
  st_intersects(p, y, sparse = FALSE)[, 1]
# spatial clipping: changes in geometries  
x_and_y = st_intersection(x, y)

# spatial subsetting: point[polygon]
p_xy1 <- p[sel_p_xy] # point[logical vector]
p_xy2 <- p[x_and_y]  # point[polygon]
identical(p_xy1, p_xy2)
# above, you may be wondering why there is no comma inside the single brackets.  
# This is because we deal with an sfc object, which behaves like a vector. 
# That is, the sfc object, p, does not have rows or columns. only a sequence of elements. 
# to see how the above syntax would differ for an sf object, see below. 

# first, convert the sfc object to an sf object: still, only one column, but in a tabular format now.  
class(p)
p_sf <- st_sf(p) 
class(p_sf)

p_sf_xy1 <- p_sf[sel_p_xy, ] # put the comma inside the single bracket, otherwise you will get errors.  
p_sf_xy2 <- p_sf[x_and_y, ]
identical(p_sf_xy1, p_sf_xy2)

5.2.6 Geometry unions

Spatial aggregation silently dissolve the geometries of touching polygons in the same group.

# in a base R way 
regions = aggregate(x = us_states[, "total_pop_15"], by = list(us_states$REGION), 
                    FUN = sum, na.rm = TRUE)

plot(regions[, "total_pop_15"])

tm_shape(regions) + 
  tm_polygons(col = "total_pop_15")

# in a tidyverse way 
regions2 <- us_states %>%
  group_by(REGION) %>%
  summarize(pop = sum(total_pop_15, na.rm = TRUE))

plot(regions2[, "pop"])

tm_shape(regions2) + 
  tm_polygons(col = "pop")

Behind the scenes, both aggregate() and summarize() combine the geometries and dissolve the boudaries between them using st_union().

Now, let’s use st_union() instead

# attribute subsetting in a base R way 
us_west <- us_states[us_states$REGION == "West", ] 

# attribute subsetting in a tidyverse way 
us_west <- us_states %>%
  filter(REGION == "West")

# then, geometry operation, st_union
us_west_union = st_union(us_west) 

tm_shape(us_states) + 
  tm_polygons(col = "white") + 
  tm_shape(us_west_union) + 
  tm_polygons(col = "grey", alpha = 0.5, lwd = 2)

st_union() works with multiple inputs.

texas <- us_states %>%
  filter(NAME == "Texas")
texas_union <- st_union(us_west_union, texas)

tm_shape(us_states) + 
  tm_polygons(col = "white") + 
  tm_shape(texas_union) + 
  tm_polygons(col = "grey", alpha = 0.5, lwd = 2)

5.2.7 Type transformations

Skip

5.3. Geometric operations on raster data

To be continued in the following session

5.4. Raster-vector interactions

To be continued in the following session

5.5 Exercises

Some of the exercises use a vector (random_points) and raster dataset (ndvi) from the RQGIS package. They also use a polygonal ‘convex hull’ derived from the vector dataset (ch) to represent the area of interest:

install.packages("RQGIS", dep = TRUE) 
library(RQGIS)
data(random_points)
# data(ndvi)

ch = st_combine(random_points) %>% # convert point to multipoint
  st_convex_hull()

plot(st_geometry(random_points))
plot(st_geometry(ch), add = TRUE)

No 1. Generate and plot simplified versions of the nz dataset. Experiment with different values of keep (ranging from 0.5 to 0.00005) for ms_simplify() and dTolerance (from 100 to 100,000) for st_simplify().
- At what value does the form of the result start to break down for each method, making New Zealand unrecognizable?

No 2. In the first exercise in Chapter 4 it was established that Canterbury region had 70 of the 101 highest points in New Zealand. Using st_buffer(), how many points in nz_height are within 100 km of Canterbury?

No 3. Find the geographic centroid of New Zealand. How far is it from the geographic centroid of Canterbury?

No 6. Calculate the length of the boundary lines of US states in meters. Which state has the longest border and which has the shortest? Hint: The st_length function computes the length of a LINESTRING or MULTILINESTRING geometry.