Edzer UseR2017
Geocomputation with R by Robin Lovelace, Jakub Nowosad, and Jannes Muenchow
r-spatial
tidyverse cookbook
Adam Wilson site
“Rencontres R 2018” à Rennes
Jesse Sadler site CRAN task on Spatial data
# List of methods:
methods(class = "sf")
## [1] $<- [ [[<-
## [4] aggregate anti_join arrange
## [7] as.data.frame cbind coerce
## [10] dbDataType dbWriteTable distinct
## [13] extent extract filter
## [16] full_join gather group_by
## [19] group_map group_split identify
## [22] idw initialize inner_join
## [25] krige krige.cv left_join
## [28] mask merge mutate
## [31] nest plot print
## [34] raster rasterize rbind
## [37] rename right_join sample_frac
## [40] sample_n select semi_join
## [43] separate show slice
## [46] slotsFromS3 spread st_agr
## [49] st_agr<- st_area st_as_sf
## [52] st_bbox st_boundary st_buffer
## [55] st_cast st_centroid st_collection_extract
## [58] st_convex_hull st_coordinates st_crop
## [61] st_crs st_crs<- st_difference
## [64] st_geometry st_geometry<- st_interpolate_aw
## [67] st_intersection st_is st_line_merge
## [70] st_nearest_points st_node st_normalize
## [73] st_point_on_surface st_polygonize st_precision
## [76] st_segmentize st_set_precision st_simplify
## [79] st_snap st_sym_difference st_transform
## [82] st_triangulate st_union st_voronoi
## [85] st_wrap_dateline st_write st_zm
## [88] summarise transmute ungroup
## [91] unite unnest
## see '?methods' for accessing help and source code
##################################################################
##################################################################
### Make sf objects
##################################################################
##################################################################
library(sf); library(sp)
# sf object can be simply considered as a dataframe with an associated geometry, i.e. MULTIPOINTS, POINTS, MULTIPOLYONG, etc.)
# 1. To convert from sp objects to sf (and back-convertion)
# https://cran.r-project.org/web/packages/sf/vignettes/sf2.html
# In sp we use to do to create a spatial object:
# load a data frame with coordinates:
data(meuse, package = "sp");
head(meuse)
## x y cadmium copper lead zinc elev dist om ffreq soil
## 1 181072 333611 11.7 85 299 1022 7.909 0.00135803 13.6 1 1
## 2 181025 333558 8.6 81 277 1141 6.983 0.01222430 14.0 1 1
## 3 181165 333537 6.5 68 199 640 7.800 0.10302900 13.0 1 1
## 4 181298 333484 2.6 81 116 257 7.655 0.19009400 8.0 1 2
## 5 181307 333330 2.8 48 117 269 7.480 0.27709000 8.7 1 2
## 6 181390 333260 3.0 61 137 281 7.791 0.36406700 7.8 1 2
## lime landuse dist.m
## 1 1 Ah 50
## 2 1 Ah 30
## 3 1 Ah 150
## 4 0 Ga 270
## 5 0 Ah 380
## 6 0 Ga 470
# To make an sf object from a dataframe:
st_as_sf(meuse, coords=c("x","y"))
## Simple feature collection with 155 features and 12 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 178605 ymin: 329714 xmax: 181390 ymax: 333611
## epsg (SRID): NA
## proj4string: NA
## First 10 features:
## cadmium copper lead zinc elev dist om ffreq soil lime landuse
## 1 11.7 85 299 1022 7.909 0.00135803 13.6 1 1 1 Ah
## 2 8.6 81 277 1141 6.983 0.01222430 14.0 1 1 1 Ah
## 3 6.5 68 199 640 7.800 0.10302900 13.0 1 1 1 Ah
## 4 2.6 81 116 257 7.655 0.19009400 8.0 1 2 0 Ga
## 5 2.8 48 117 269 7.480 0.27709000 8.7 1 2 0 Ah
## 6 3.0 61 137 281 7.791 0.36406700 7.8 1 2 0 Ga
## 7 3.2 31 132 346 8.217 0.19009400 9.2 1 2 0 Ah
## 8 2.8 29 150 406 8.490 0.09215160 9.5 1 1 0 Ab
## 9 2.4 37 133 347 8.668 0.18461400 10.6 1 1 0 Ab
## 10 1.6 24 80 183 9.049 0.30970200 6.3 1 2 0 W
## dist.m geometry
## 1 50 POINT (181072 333611)
## 2 30 POINT (181025 333558)
## 3 150 POINT (181165 333537)
## 4 270 POINT (181298 333484)
## 5 380 POINT (181307 333330)
## 6 470 POINT (181390 333260)
## 7 240 POINT (181165 333370)
## 8 120 POINT (181027 333363)
## 9 240 POINT (181060 333231)
## 10 420 POINT (181232 333168)
# Or, if it is an sp spatial object, it works too:
coordinates(meuse) = ~x+y
# or coordinates(meuse) <- c("x","y")
plot(meuse)
meuse_sf <- st_as_sf(meuse);
# either way, the default of sf plot is:
plot(meuse_sf)
class(meuse_sf)
## [1] "sf" "data.frame"
st_geometry(meuse_sf)
## Geometry set for 155 features
## geometry type: POINT
## dimension: XY
## bbox: xmin: 178605 ymin: 329714 xmax: 181390 ymax: 333611
## epsg (SRID): NA
## proj4string: NA
## First 5 geometries:
# If we want to get back to the dataframe (no coordinates however)
head(st_drop_geometry(meuse_sf))
## cadmium copper lead zinc elev dist om ffreq soil lime landuse
## 1 11.7 85 299 1022 7.909 0.00135803 13.6 1 1 1 Ah
## 2 8.6 81 277 1141 6.983 0.01222430 14.0 1 1 1 Ah
## 3 6.5 68 199 640 7.800 0.10302900 13.0 1 1 1 Ah
## 4 2.6 81 116 257 7.655 0.19009400 8.0 1 2 0 Ga
## 5 2.8 48 117 269 7.480 0.27709000 8.7 1 2 0 Ah
## 6 3.0 61 137 281 7.791 0.36406700 7.8 1 2 0 Ga
## dist.m
## 1 50
## 2 30
## 3 150
## 4 270
## 5 380
## 6 470
# Which is the same as:
dplyr::select(as.data.frame(meuse_sf), -geometry) %>% head()
## cadmium copper lead zinc elev dist om ffreq soil lime landuse
## 1 11.7 85 299 1022 7.909 0.00135803 13.6 1 1 1 Ah
## 2 8.6 81 277 1141 6.983 0.01222430 14.0 1 1 1 Ah
## 3 6.5 68 199 640 7.800 0.10302900 13.0 1 1 1 Ah
## 4 2.6 81 116 257 7.655 0.19009400 8.0 1 2 0 Ga
## 5 2.8 48 117 269 7.480 0.27709000 8.7 1 2 0 Ah
## 6 3.0 61 137 281 7.791 0.36406700 7.8 1 2 0 Ga
## dist.m
## 1 50
## 2 30
## 3 150
## 4 270
## 5 380
## 6 470
meuse_sf %>% st_set_geometry(NULL) %>% head()
## cadmium copper lead zinc elev dist om ffreq soil lime landuse
## 1 11.7 85 299 1022 7.909 0.00135803 13.6 1 1 1 Ah
## 2 8.6 81 277 1141 6.983 0.01222430 14.0 1 1 1 Ah
## 3 6.5 68 199 640 7.800 0.10302900 13.0 1 1 1 Ah
## 4 2.6 81 116 257 7.655 0.19009400 8.0 1 2 0 Ga
## 5 2.8 48 117 269 7.480 0.27709000 8.7 1 2 0 Ah
## 6 3.0 61 137 281 7.791 0.36406700 7.8 1 2 0 Ga
## dist.m
## 1 50
## 2 30
## 3 150
## 4 270
## 5 380
## 6 470
# This way we get also the coordinates (but not in an elegant way):
head(data.frame(as(meuse_sf, "Spatial")))
## cadmium copper lead zinc elev dist om ffreq soil lime landuse
## 1 11.7 85 299 1022 7.909 0.00135803 13.6 1 1 1 Ah
## 2 8.6 81 277 1141 6.983 0.01222430 14.0 1 1 1 Ah
## 3 6.5 68 199 640 7.800 0.10302900 13.0 1 1 1 Ah
## 4 2.6 81 116 257 7.655 0.19009400 8.0 1 2 0 Ga
## 5 2.8 48 117 269 7.480 0.27709000 8.7 1 2 0 Ah
## 6 3.0 61 137 281 7.791 0.36406700 7.8 1 2 0 Ga
## dist.m coords.x1 coords.x2 optional
## 1 50 181072 333611 TRUE
## 2 30 181025 333558 TRUE
## 3 150 181165 333537 TRUE
## 4 270 181298 333484 TRUE
## 5 380 181307 333330 TRUE
## 6 470 181390 333260 TRUE
# Create a MULTIPOINTS object:
# For example, matrix with 5 points:
plot(st_multipoint(matrix(1:10, nc=2)), col=2, axes=TRUE)
# Again, using meuse data:
meuse=data.frame(meuse)
meuse_sf <- st_as_sf(meuse, coords = c("x", "y"), crs = 28992)
# plot it:
plot(meuse_sf, axes=TRUE) # plot first 10 variables
plot(meuse_sf[1], axes=TRUE)# plot only first layer
# the same as before: plot(meuse_sf["cadmium"], axes=TRUE)
# or plot it using ggplot:
ggplot()+
geom_sf(data=meuse_sf, aes(col=cadmium))
# for more information: ?plot.sf
# Sf can be used in combination with dplyr/tidyr
# For example, to estimate mean cadmium by landuse:
meuse_sf1 <- st_as_sf(meuse, coords = c("x","y"))%>%
dplyr::group_by(landuse) %>%
dplyr::summarise(mean.cad=mean(cadmium)) %>%
dplyr::filter(mean.cad > 1) #filter data where mean cadmium is greater than 1
head(meuse_sf1)
## Simple feature collection with 6 features and 2 fields
## geometry type: MULTIPOINT
## dimension: XY
## bbox: xmin: 178605 ymin: 329807 xmax: 181352 ymax: 333611
## epsg (SRID): NA
## proj4string: NA
## # A tibble: 6 x 3
## landuse mean.cad geometry
## <fct> <dbl> <MULTIPOINT>
## 1 Ab 3.05 (179128 330867, 179140 330955, 180135 331552, 180632 33~
## 2 Ag 1.76 (180704 332664, 180878 332489, 180973 332687, 181032 33~
## 3 Ah 2.66 (178786 329822, 178875 330311, 178875 330516, 179030 33~
## 4 Am 1.90 (178605 330406, 178701 330557, 178803 330349, 178810 33~
## 5 B 1.37 (180954 332399, 180956 332318, 181133 332570)
## 6 Bw 2.58 (179337 329870, 179445 329807, 179852 330801, 180199 33~
plot(meuse_sf1, axes=TRUE, pch=16)
# back to dataframe
data.frame(meuse_sf1)
## landuse mean.cad geometry
## 1 Ab 3.050000 MULTIPOINT (179128 330867, ...
## 2 Ag 1.760000 MULTIPOINT (180704 332664, ...
## 3 Ah 2.664103 MULTIPOINT (178786 329822, ...
## 4 Am 1.895455 MULTIPOINT (178605 330406, ...
## 5 B 1.366667 MULTIPOINT (180954 332399, ...
## 6 Bw 2.583333 MULTIPOINT (179337 329870, ...
## 7 Fh 1.400000 POINT (181191 333115)
## 8 Fw 2.860000 MULTIPOINT (179456 330072, ...
## 9 Ga 2.266667 MULTIPOINT (180561 332193, ...
## 10 W 5.064000 MULTIPOINT (178912 330779, ...
## 11 <NA> 12.900000 POINT (180555 332707)
# ploting with ggplot (again)
ggplot() +
geom_sf(data = meuse_sf1, aes(col = mean.cad))+theme_bw()
# plotting with ggplot and using dplyr:
st_as_sf(meuse, coords = c("x","y"))%>%
ggplot(aes(col=cadmium)) +
geom_sf() +
scale_colour_distiller(palette="Spectral") +theme_bw()
# ploting with ggplot and facets:
st_as_sf(meuse, coords = c("x","y"))%>%
ggplot(aes(col=cadmium)) +
geom_sf()+
facet_wrap(~landuse)+
scale_colour_distiller(palette="Spectral") +theme_bw()
# LINESTRIG
plot(ls1 <- st_linestring(matrix(1:10, nc=2)), col=2, axes=TRUE)
# POLYGON:
print(outer <- matrix(c(0,0,10,0,10,10,0,10,0,0),ncol=2, byrow=TRUE))
## [,1] [,2]
## [1,] 0 0
## [2,] 10 0
## [3,] 10 10
## [4,] 0 10
## [5,] 0 0
plot(pl1 <- st_polygon(list(outer)), col=2, axes=TRUE)
Consider the case where we have 2 sf objects, each with 1 line. We use ‘rbind’ if the field name is the same.
#########################################################################
# Bind together geometries:
# From https://r-spatial.github.io/sf/reference/bind.html
# Bind a point + line + multilinestring with the same
crs = st_crs(3857)
a = st_sf(a=1, geom = st_sfc(st_point(0:1)), crs = crs)
b = st_sf(a=1, geom = st_sfc(st_linestring(matrix(1:4,2))), crs = crs)
c = st_sf(a=4, geom = st_sfc(st_multilinestring(list(matrix(1:4,2)))), crs = crs)
plot(rbind(a,b,c))
# bind wa point and a line, each with one feature but with different field names
a = st_sf(a=1, geom = st_sfc(st_point(0:1)), crs = crs)
b = st_sf(b=1, geom = st_sfc(st_linestring(matrix(1:4,2))), crs = crs)
# rbind(a,b) # does not work
cbind(a,b) # works but makes something different'?'
## Simple feature collection with 1 feature and 2 fields
## Active geometry column: geom
## geometry type: POINT
## dimension: XY
## bbox: xmin: 0 ymin: 1 xmax: 0 ymax: 1
## epsg (SRID): 3857
## proj4string: +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
## a b geom geom.1
## 1 1 1 POINT (0 1) LINESTRING (1 3, 2 4)
plot(st_union(a,b)) # makes a geometry collection
# st_combine(a,b) # does not work
st_join(a,b) # makes a
## Simple feature collection with 1 feature and 2 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 0 ymin: 1 xmax: 0 ymax: 1
## epsg (SRID): 3857
## proj4string: +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
## a b geom
## 1 1 NA POINT (0 1)
# third example
(df = data.frame(x=3))
## x
## 1 3
c
## Simple feature collection with 1 feature and 1 field
## geometry type: MULTILINESTRING
## dimension: XY
## bbox: xmin: 1 ymin: 3 xmax: 2 ymax: 4
## epsg (SRID): 3857
## proj4string: +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
## a geom
## 1 4 MULTILINESTRING ((1 3, 2 4))
(st_sf(data.frame(c, df)))
## Simple feature collection with 1 feature and 2 fields
## geometry type: MULTILINESTRING
## dimension: XY
## bbox: xmin: 1 ymin: 3 xmax: 2 ymax: 4
## epsg (SRID): 3857
## proj4string: +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs
## a x geom
## 1 4 3 MULTILINESTRING ((1 3, 2 4))
plot(dplyr::bind_cols(c, df))
Clarifying image from Edzer Pebesma: “Geometry is the the type of a simple feature column when the feature geometries in the column are a mix of types. GeometryCollection is the geometry type of a single feature geometry where this geometry can be anything, including a mix of geometry types.” (Pebesma).
##################################################################
### Raster object across spatial formats
##################################################################
# Create a raster using the raster package
# (from Barry Rowlingson)
par(mfrow=c(1,1))
r.raster <- raster(ncol=36, nrow=18)
r.raster[] <- 1:ncell(r.raster)
plot(r.raster)
r.raster
## class : RasterLayer
## dimensions : 18, 36, 648 (nrow, ncol, ncell)
## resolution : 10, 10 (x, y)
## extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## data source : in memory
## names : layer
## values : 1, 648 (min, max)
# Change the raster: from raster to sp format:
r.sp <- as(r.raster, "SpatialPixelsDataFrame")
gridded(r.sp)
## [1] TRUE
spplot(r.sp, main="raster to sp - SpatialPixelsDataFrame")
plot(r.sp, axes=TRUE) # Works too...
# Change the raster: sp to sf format:
plot(r.sf <- st_as_sf(r.sp), axes=TRUE)
# Change the raster: from raster to sf format:
#plot(st_as_sf(r.raster), axes=TRUE) # does not work.
plot(st_as_sf(data.frame(rasterToPoints(r.raster)), coords=c("x","y"))) #this non-elegant way it works
##################################################################
### Points object across spatial formats
##################################################################
# Random sample the sp raster (i.e. SpatialPixelsDataFrame) to get points
pts.sp <- spsample(r.sp, n=15,type="random")
plot(r.sp)
plot(pts.sp, col=3, pch=2, add=TRUE)
# Change the points: sp to sf format:
plot(r.sf, axes=TRUE)
pts.sf <- st_as_sf(pts.sp) # ok, but then...
plot(pts.sf, col=3, pch=2, add=TRUE) # Does not work properly.
# But this way it works:
ggplot()+
geom_sf(data=r.sf, aes(col=layer))+
geom_sf(data=pts.sf, col=2)+
theme_bw()
# Change the points: sf to sp format:
as(pts.sf, "Spatial")
## class : SpatialPointsDataFrame
## features : 15
## extent : -123.1067, 176.9643, -89.13979, 62.94669 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## variables : 0
# Change the points: sp/sf format to dataframe:
head(data.frame(pts.sp))
## x y
## 1 48.33800 -5.451327
## 2 -21.86619 -63.574219
## 3 57.26661 -62.500878
## 4 -44.30464 -89.139794
## 5 -29.65806 -30.184878
## 6 92.59290 62.946695
head(as.data.frame(pts.sf)) #does not include the coordinates.
## geometry
## 1 POINT (48.338 -5.451327)
## 2 POINT (-21.86619 -63.57422)
## 3 POINT (57.26661 -62.50088)
## 4 POINT (-44.30464 -89.13979)
## 5 POINT (-29.65806 -30.18488)
## 6 POINT (92.5929 62.94669)
##################################################################
### Lines objects across spatial formats
##################################################################
# MULTILINESTRING & LINESTRING
# From Robin Lovelace book (https://geocompr.robinlovelace.net/geometric-operations.html#geo-vec)
#1)
multilinestring.list <- list(matrix(c(1, 4, 5, 3), ncol = 2),
matrix(c(4, 4, 4, 1), ncol = 2),
matrix(c(2, 4, 2, 2), ncol = 2))
# 2)
multilinestring <- st_multilinestring(multilinestring.list)
# 3)
multilinestring <- st_sfc(multilinestring)
multilinestring_sf = st_sf(geom = multilinestring) # or plot(multilinestring_sf)
linestring_sf2 = st_cast(multilinestring_sf, "LINESTRING")
linestring_sf2$name = c("Riddle Rd", "Marshall Ave", "Foulke St")
linestring_sf2$length = st_length(linestring_sf2)
plot(linestring_sf2)
# Another example (from sf vignette)
# a line is a linestring:
plot(ls <- st_linestring(rbind(c(0,0),c(1,1),c(2,1))))
# two lines are a multilinestring:
plot(mls <- st_multilinestring(list(rbind(c(2,2),c(1,3)), rbind(c(0,0),c(1,1),c(2,1)))))
# The Spatial part: it binds the two previous objects together and makes a simple feature with 2 features. Note that both become 'MULTILINESTRING' geometry
(sfc <- st_sfc(ls,mls))
## Geometry set for 2 features
## geometry type: GEOMETRY
## dimension: XY
## bbox: xmin: 0 ymin: 0 xmax: 2 ymax: 3
## epsg (SRID): NA
## proj4string: NA
st_cast(sfc, "MULTILINESTRING") #this would be just the spatial part
## Geometry set for 2 features
## geometry type: MULTILINESTRING
## dimension: XY
## bbox: xmin: 0 ymin: 0 xmax: 2 ymax: 3
## epsg (SRID): NA
## proj4string: NA
sf <- st_sf(a = 5:4, b=c("orange","apple"),geom = sfc) # in here we add an attribute
st_cast(sf, "MULTILINESTRING")
## Simple feature collection with 2 features and 2 fields
## geometry type: MULTILINESTRING
## dimension: XY
## bbox: xmin: 0 ymin: 0 xmax: 2 ymax: 3
## epsg (SRID): NA
## proj4string: NA
## a b geom
## 1 5 orange MULTILINESTRING ((0 0, 1 1,...
## 2 4 apple MULTILINESTRING ((2 2, 1 3)...
plot(sf)
##################################################################
### Polygons objects across spatial formats
##################################################################
par(mfrow=c(1,1), mar=c(4,4,1,1))
# to see code examples: https://github.com/r-spatial/sf/commit/bc9824620805e888e19d0e22e007dc1dcd4ecd59
multilinestring.list <- list(matrix(c(1, 4, 5, 3), ncol = 2),
matrix(c(4, 4, 4, 1), ncol = 2),
matrix(c(2, 4, 2, 2), ncol = 2))
multilinestring_sf = st_sf(geom = st_sfc(st_multilinestring(multilinestring.list)))
# Create a polygon from lines:
# From an "sfc_LINESTRING" or "sfc_MULTILINESTRING" to a polygon we can use:
multilinestring_sf
## Simple feature collection with 1 feature and 0 fields
## geometry type: MULTILINESTRING
## dimension: XY
## bbox: xmin: 1 ymin: 1 xmax: 4 ymax: 5
## epsg (SRID): NA
## proj4string: NA
## geom
## 1 MULTILINESTRING ((1 5, 4 3)...
plot(multilinestring_sf)
# Makes a hull around the multilinestring and turns into a polygon
plot(st_convex_hull(multilinestring_sf), border=2, main="st_convex_hull")
# Extracts the bordering points a boundary around the multilinestring and turns into a polygon (blue triangles)
plot(st_boundary(multilinestring_sf), col=4, pch=6, main="st_boundary")
####################################
# So either way we can get a Polygon:
plot(st_cast(st_boundary(multilinestring_sf), "POLYGON"), main="polygon vs. 1: st_cast(st_boundary(x))")
plot(st_convex_hull(multilinestring_sf), border=2, lty=3, main="polygon vs 2: st_convex_hull(x)")
# Triangulates (light blue)
plot(st_triangulate(multilinestring_sf), border=5, col=NA, pch=6, main="st_triangulate")
# Voronoi of the bouding box (pink):
plot(st_voronoi(multilinestring_sf), border=6, col=NA, main="st_voronoi")
# Centroid
plot(st_centroid(multilinestring_sf), col=4, pch=2, main="st_centroid")
# Buffer around the lines
plot(st_buffer(multilinestring_sf, dist=.1), border=6, pch=2, main="st_buffer")
# For geometric operations see: example(geos_unary)
# plot(st_polygonize(multilinestring_sf), col=4, pch=6)
# plot(st_line_merge(multilinestring_sf), col=4, pch=6, add=TRUE)
# plot(st_point_on_surface(multilinestring_sf), col=4, pch=6, add=TRUE)
# plot(st_node(multilinestring_sf), col=4, pch=6, add=TRUE)
# plot(st_segmentize(multilinestring_sf, dfMaxLength=2), col=4, pch=6, add=TRUE)
# Or to create a polygon from a matrix we can:
plot(st_polygon(list(rbind(c(-1,-1), c(1,-2), c(4,1), c(-1,1), c(-1,-1)))), axes=TRUE, main="Polygon from points matrix")
## Now, lets create polygons from points
# Or using dplyr:
plot(pts_sf <- data.frame(
x = rnorm(11),
y = seq(147, 148, by=0.1),
attr_data = rnorm(11,42,42),
id = c(rep("fred",6), rep("wilma",5))) %>%
sf::st_as_sf(coords = c("x","y")) %>%
sf::st_set_crs(4326))
# to get one feature only (1 multipoint with several points):
st_union(pts_sf)
## Geometry set for 1 feature
## geometry type: MULTIPOINT
## dimension: XY
## bbox: xmin: -1.692644 ymin: 147 xmax: 1.46986 ymax: 148
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
plot(st_cast(st_boundary(multilinestring_sf), "POLYGON"))
So in summary, to create an sf object the steps are [st_sf(geom = st_sfc(st_multilinestring((]: 1) matrix with coordinates OR dataframe %>% coords=c() 2) st_multilinestring (or other, st_point, st_multipoint, st_polygon…) 3) geom = st_sfc (to Create simple feature geometry list column) 4) st_sf create the sf object - and add the non-spatial part of the data)
##############################################
##############################################
## Operations
##############################################
##############################################
# # Examples from Lovelace book:
# https://geocompr.robinlovelace.net/geometric-operations.html#geo-vec
b = st_sfc(st_point(c(0, 1)), st_point(c(1, 1)))
# create 2 points
b = st_buffer(b, dist = 1) # convert points to circles
plot(b, main="two points with a buffer around")
text(x = c(-0.5, 1.5), y = 1, labels = c("x", "y")) # add text
b
## Geometry set for 2 features
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -1 ymin: 0 xmax: 2 ymax: 2
## epsg (SRID): NA
## proj4string: NA
x = b[1]
y = b[2]
plot(st_union(x, y), main="st_union")
# plot(st_join(x, y), main="st_union") # does not work
plot(x_and_y <- st_intersection(x, y), col = "lightgrey", main="st_intersection") # color intersecting area
bb = st_bbox(st_union(x, y))
box = st_as_sfc(bb)
set.seed(2017)
p = st_sample(x = box, size = 10)
plot(b)
plot(box, add=TRUE)
plot(p, add = TRUE, col=2)
text(x = c(-0.5, 1.5), y = 1, labels = c("x", "y"))
# points of the intersection of the two polygons
sel_p_xy = st_intersects(p, x, sparse = FALSE)[, 1] &
st_intersects(p, y, sparse = FALSE)[, 1]
plot(p_xy1 <- p[sel_p_xy], cex=3, col=6, add=TRUE) #result of intersection
# points that are within polygon of the intersections
p_xy2 = p[x_and_y]
identical(p_xy1, p_xy2)
## [1] TRUE
# These are the points p that are inside polygon x:
plot(p[st_intersects(x, p, sparse = FALSE)], cex=2, pch=4, add=TRUE)
How do we get a vector to add to pts.sf with the cells that correspond to the polygons, with the respective value (like in over/extract)?
Use the command: st_join(pts.sf, polys.sf)
## I use to do things this way, using raster and sp. Now, I will try to reproduce those things in sf.
set.seed(12345)
# Lets create a raster to extract the data
r.raster <- raster(ncol=36, nrow=18)
r.raster[] <- 1:ncell(r.raster)
plot(r.raster)
r.raster
## class : RasterLayer
## dimensions : 18, 36, 648 (nrow, ncol, ncell)
## resolution : 10, 10 (x, y)
## extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## data source : in memory
## names : layer
## values : 1, 648 (min, max)
cds1 <- rbind(c(-180,0), c(-160,5), c(-10, 0), c(-160,-60), c(-180,0))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
plot(polys.sp <- spPolygons(cds1, cds2, attr=data.frame(id=1:2, nam=c("a","b"))), add=TRUE)
## Extract the raster values in the polygons:
raster::extract(r.raster, polys.sp)
## [[1]]
## [1] 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 361
## [18] 362 363 364 365 366 367 368 369 370 371 372 373 398 399 400 401 402
## [35] 403 404 405 406 407 434 435 436 437 438 439 440 470 471 472 473 474
## [52] 507
##
## [[2]]
## [1] 172 173 208 209 244 245 279 280 281 282 315 316 317 318 351 352 353
## [18] 354 388 389 390 425 426 462 498
# Or, if we want to get the raster chunks:
plot(mask(r.raster, polys.sp))
# Now the same with sf:
(polys.sf <- st_as_sf(polys.sp)) #first convert to sf
## Simple feature collection with 2 features and 2 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -180 ymin: -60 xmax: 120 ymax: 60
## epsg (SRID): NA
## proj4string: NA
## id nam geometry
## 1 1 a POLYGON ((-180 0, -160 5, -...
## 2 2 b POLYGON ((80 0, 100 60, 120...
raster::extract(r.raster, polys.sf) # it works.
## [[1]]
## [1] 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 361
## [18] 362 363 364 365 366 367 368 369 370 371 372 373 398 399 400 401 402
## [35] 403 404 405 406 407 434 435 436 437 438 439 440 470 471 472 473 474
## [52] 507
##
## [[2]]
## [1] 172 173 208 209 244 245 279 280 281 282 315 316 317 318 351 352 353
## [18] 354 388 389 390 425 426 462 498
plot(mask(r.raster, polys.sf)) # and this too!
##########################################
## Extract the raster values in the points:
# Lets imagine I have a dataframe with points (I did it in two parts to assure we will get points inside and outside the polygons):
head(pts1 <- data.frame(rasterToPoints(sampleRandom(r.raster, 5, asRaster=TRUE)), oh.my=c(1:5)))
## x y layer oh.my
## 1 -105 5 296 1
## 2 175 -35 468 2
## 3 75 -45 494 3
## 4 95 -65 568 4
## 5 165 -65 575 5
head(pts2 <- data.frame(rasterToPoints(sampleRandom(mask(r.raster, polys.sp), 10, asRaster=TRUE)), oh.my=c(1:10)))
## x y layer oh.my
## 1 105 45 173 1
## 2 105 35 209 2
## 3 115 5 318 3
## 4 -175 -5 325 4
## 5 -65 -5 336 5
## 6 85 -5 351 6
# make it spatial
dim(pts <- rbind(pts1,pts2))
## [1] 15 4
pts.sp <- pts
coordinates(pts.sp) = ~x+y
plot(pts.sp, add=TRUE)
# Extract the value of the raster in the spatial points file locations, and add this data to the spatial points dataframe.
# with sp, I use to do this:
pts$my.raster <- raster::extract(r.raster, pts.sp)
# using sp, get the values of the polygons in each point location:
(pts$my.polygon <- sp::over(pts.sp, polys.sp))
## id nam
## 1 NA <NA>
## 2 NA <NA>
## 3 NA <NA>
## 4 NA <NA>
## 5 NA <NA>
## 6 2 b
## 7 2 b
## 8 2 b
## 9 1 a
## 10 1 a
## 11 2 b
## 12 1 a
## 13 1 a
## 14 1 a
## 15 1 a
# Now, if using sf, start by making the MULTIPOINTS sf (from the sp):
plot(pts.sf <- st_as_sf(pts.sp), pch=16, axes=TRUE)
# Works fine:
raster::extract(r.raster, pts.sf)
## [1] 296 468 494 568 575 173 209 318 325 336 351 363 367 404 507
# Does not work:
# sp::over(pts.sf, polys.sp)
par(mfrow=c(1,1))
plot(polys.sp)
points(pts.sp, col=2)
# fully spatial (works):
sp::over(as(pts.sf, "Spatial"), polys.sp)
## id nam
## 1 NA <NA>
## 2 NA <NA>
## 3 NA <NA>
## 4 NA <NA>
## 5 NA <NA>
## 6 2 b
## 7 2 b
## 8 2 b
## 9 1 a
## 10 1 a
## 11 2 b
## 12 1 a
## 13 1 a
## 14 1 a
## 15 1 a
sp::over(polys.sp,as(pts.sf, "Spatial")) #gives the first match
## layer oh.my
## 1 325 4
## 2 173 1
# Now, do the extract/over using sf:
# plot (using ggplot I don't have the distorsion when changing window size)
(pp <- ggplot()+
geom_sf(data=polys.sf, aes(fill=nam), col=NA, alpha=.3)+ # the polygons
geom_sf(data=pts.sf, aes(col=oh.my), size=2)+# the points
scale_color_viridis()+#nicer col palette
theme_bw())#nicer ggplot theme
# Get the points that fall inside the polygons:
(st_intersects(pts.sf, polys.sf, sparse = FALSE))
## [,1] [,2]
## [1,] FALSE FALSE
## [2,] FALSE FALSE
## [3,] FALSE FALSE
## [4,] FALSE FALSE
## [5,] FALSE FALSE
## [6,] FALSE TRUE
## [7,] FALSE TRUE
## [8,] FALSE TRUE
## [9,] TRUE FALSE
## [10,] TRUE FALSE
## [11,] FALSE TRUE
## [12,] TRUE FALSE
## [13,] TRUE FALSE
## [14,] TRUE FALSE
## [15,] TRUE FALSE
# Get the polygons matching for each spatial sf point:
(pts.sf <- st_join(pts.sf, polys.sf))
## Simple feature collection with 15 features and 4 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -175 ymin: -65 xmax: 175 ymax: 45
## epsg (SRID): NA
## proj4string: NA
## First 10 features:
## layer oh.my id nam geometry
## 1 296 1 NA <NA> POINT (-105 5)
## 2 468 2 NA <NA> POINT (175 -35)
## 3 494 3 NA <NA> POINT (75 -45)
## 4 568 4 NA <NA> POINT (95 -65)
## 5 575 5 NA <NA> POINT (165 -65)
## 6 173 1 2 b POINT (105 45)
## 7 209 2 2 b POINT (105 35)
## 8 318 3 2 b POINT (115 5)
## 9 325 4 1 a POINT (-175 -5)
## 10 336 5 1 a POINT (-65 -5)
(pp <- ggplot()+
geom_sf(data=polys.sf, aes(fill=nam), col=NA, alpha=.3)+ # the polygons
geom_sf(data=pts.sf, aes(col=nam), size=3)+# the points
#scale_color_viridis()+#nicer col palette
theme_bw())#nicer ggplot theme
# The matching polygons-points can also be found using:
diag(diag(st_intersects(pts.sf, polys.sf, sparse = TRUE)))
## [1] NA NA NA NA NA 2 2 2 1 1 2 1 1 1 1
Union vs. combine vs. intersect…
atriplex
“It turns out st_union() just merges all the polygons together without resolving those overlaps into discrete polygons.
The st_intersection() tool was closer to what I needed – but it throws out all the polygons that don’t overlap at all, rather than retaining them. I needed some kind of “full join” that kept those isolated, non-overlapping polygons.”
# Example to combine vs. union:
# example adapted from the help file of 'geos_binary_ops':
set.seed(131)
library(sf)
m = rbind(c(0,0), c(1,0), c(1,1), c(0,1), c(0,0))
p = st_polygon(list(m))
n = 20
l = vector("list", n)
for (i in 1:n)
l[[i]] = p + 10 * runif(2)
# Simple features object:
sf = st_sf(st_sfc(l))
plot(sf)
(pp <- ggplot()+theme_bw()+
geom_sf(data=sf, col=2))
# all intersections: produces 2 variables (fields), the first called n.overlaps has the number of overlaps in each intersected polygon and the second field, named origin, has the polygons that were intersected in that corresponding intersected polygon.
# Note that overlap==1 means that there is no intersection, i.e. only one polygonin that area.
(i = st_intersection(sf))
## Simple feature collection with 31 features and 2 fields
## geometry type: GEOMETRY
## dimension: XY
## bbox: xmin: 0.3840939 ymin: 1.132567 xmax: 10.19807 ymax: 10.14945
## epsg (SRID): NA
## proj4string: NA
## First 10 features:
## geometry n.overlaps origins
## 1 POLYGON ((2.06437 1.249422,... 1 1
## 2 POLYGON ((3.385234 4.757797... 1 2
## 3 POLYGON ((8.463468 5.721718... 1 3
## 4 POLYGON ((6.186254 2.916041... 1 4
## 5 POLYGON ((3.263181 9.149452... 1 5
## 6 POLYGON ((4.162483 3.446414... 1 6
## 7 POLYGON ((9.651544 3.401347... 1 7
## 8 POLYGON ((5.231186 3.378545... 2 4, 8
## 9 MULTIPOLYGON (((5.231186 3.... 1 8
## 10 POLYGON ((0.3840939 6.87075... 1 9
plot(i["n.overlaps"])
# ggplot version (maybe clearer):
pp +
geom_sf(data=i, aes(fill=factor(n.overlaps)), alpha=.3)+
ggtitle("st_intersection")
summary(i$n.overlaps - lengths(i$origins))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
# A helper function that erases all of y from x:
# create a second set of polygons:
for (i in 1:n)
l[[i]] = p + 10 * runif(2)
sf2 = st_sf(st_sfc(l))
ggplot() +theme_bw()+
geom_sf(data=sf, col=1, alpha=.5)+
geom_sf(data=sf2, col=2, alpha=.5)+
ggtitle("Base plot of sf and sf2")
# function from the help file:
st_erase = function(x, y) st_difference(x, st_union(st_combine(y)))
ggplot() +theme_bw()+
geom_sf(data=sf, col=1, alpha=.5)+
geom_sf(data=sf2, col=2, alpha=.5)+
geom_sf(data=st_union(st_combine(sf2)), fill=3, alpha=.2)+
ggtitle("st_union / st_combine")
ggplot() +theme_bw()+
geom_sf(data=sf, col=1, alpha=.5)+
geom_sf(data=sf2, col=2, alpha=.5)+
geom_sf(data=st_difference(sf,sf2), fill=3, alpha=.2)+
ggtitle("st_difference")
ggplot() +theme_bw()+
geom_sf(data=sf, col=1, alpha=.5)+
geom_sf(data=sf2, col=2, alpha=.5)+
geom_sf(data=st_difference(sf,st_union(st_combine(sf2))), fill=3, alpha=.2)+
ggtitle("st_difference+st_union / st_combine")
# Could we use st_crop instead? Not really... st_crop would be for rectangles/bound box (st_bbox).
# st_crop(sf,sf2)
# explore st_snap (snap one shape to the other and get a 'middle' shape, something in between)
ggplot() +theme_bw()+
geom_sf(data=sf, col=1, alpha=.5)+
geom_sf(data=sf2, col=2, alpha=.5)+
geom_sf(data=st_snap(sf,sf2, tolerance=.5), fill=3, alpha=.2)+
ggtitle("st_snap")
# explore st_intersection
ggplot() +theme_bw()+
geom_sf(data=sf, col=1, alpha=.5)+
geom_sf(data=sf2, col=2, alpha=.5)+
geom_sf(data=st_intersection(sf,sf2), fill=3, alpha=.2)+
ggtitle("st_intersection")
# explore st_difference
ggplot() +theme_bw()+
geom_sf(data=sf, col=1, alpha=.5)+
geom_sf(data=sf2, col=2, alpha=.5)+
geom_sf(data=st_difference(sf,sf2), fill=3, alpha=.2)+
ggtitle("st_difference")
# explore st_sym_difference
ggplot() +theme_bw()+
geom_sf(data=sf, col=1, alpha=.5)+
geom_sf(data=sf2, col=2, alpha=.5)+
geom_sf(data=st_sym_difference(sf,sf2), fill=3, alpha=.2)+
ggtitle("st_sym_difference")
All from Edzer UseR2017:
-feature: abstraction of real world phenomena (type, or instance)
-simple feature: feature with all geometric attributes described piecewise by straight line or planar interpolation between sets of points
- 7 + 10 types, 68 classes, of which 7 used in like 99% of the use cases
- support for mixed type (GEOMETRYCOLLECTION), and type mix (GEOMETRY)
-text and binary serialisations (WKT, WKB)
-support for empty geometry (empty set: somewhat like NA)
st_as_sfc: convert geometries to sfc (e.g., from WKT, WKB)
as(x, “Spatial”): convert to Spatial
st_as_sf: convert to sf (e.g., convert from Spatial)
These functions generate vectors/matrices of TRUE/FALSE.
st_intersects: touch or overlap
st_disjoint: !intersects
st_touches: touch
st_crosses: cross (don’t touch)
st_within: within
st_contains: contains
st_overlaps: overlaps
st_covers: cover
st_covered_by: covered by
st_equals: equals
st_equals_exact: equals, with some fuzz
These functions generate new sf objects with the resulting operation.
st_union: union of several geometries
st_intersection: intersection of pairs of geometries
st_difference: difference between pairs of geometries
st_sym_difference: symmetric difference (xor)
st_cast
#### Higher-level operations:
summarise, interpolate, aggregate, st_join aggregate and summarise use st_union (by default) to group feature geometries
st_interpolate_aw: area-weighted interpolation, uses st_intersection to interpolate or redistribute attribute values, based on area of overlap: st_join uses one of the logical binary geometry predicates (default: st_intersects) to join records in table pairs
These functions generate new sf objects with the resulting operation.
st_line_merge: merges lines
st_segmentize: adds points to straight lines
st_voronoi: creates voronoi tesselation
st_centroid: gives centroid of geometry
st_convex_hull: creates convex hull of set of points
st_triangulate: triangulates set of points (not constrained)
st_polygonize: creates polygon from lines that form a closed ring
st_simplify: simplifies lines by removing vertices
st_split: split a polygon given line geometry
st_buffer: compute a buffer around this geometry/each geometry
st_make_valid: tries to make an invalid geometry valid (requires lwgeom)
st_boundary: return the boundary of a geometry
st_zm: sets or removes z and/or m geometry (changes dim)
st_coordinates: retrieve coordinates in a matrix or data.frame
st_geometry: set, or retrieve sfc from an sf object
st_is: check whether geometry is of a particular type
(x = st_sf(a = 1:2, geom = st_sfc(st_point(c(0,0)), st_point(c(1,1))))) (y=data.frame(b=3:4)) merge(x,y)
(y=data.frame(a=1:4)) merge(x,y)