In this class we will work with maps and analize the case when some data intersects our maps.
Before to start our project, lets clean our working space.
rm(list = ls())
Now the packages we will use:
library(sf)
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(raster)
## Loading required package: sp
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:raster':
##
## intersect, select, union
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(spData)
library(ggspatial)
## Warning: package 'ggspatial' was built under R version 4.0.3
library(ggplot2)
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.0.3
## -- Attaching packages -------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v tibble 3.0.3 v purrr 0.3.4
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ----------------------------------------------------------------------------------- tidyverse_conflicts() --
## x tidyr::extract() masks raster::extract()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
## x dplyr::select() masks raster::select()
In this example we use projected data for the 16 main regions and 101 points in New Zealand. The following code generates the data base for those objects:
view(nz_height)
data(nz)
nz
## Simple feature collection with 16 features and 6 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 1090144 ymin: 4748537 xmax: 2089533 ymax: 6191874
## projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
## First 10 features:
## Name Island Land_area Population Median_income Sex_ratio
## 1 Northland North 12500.561 175500 23400 0.9424532
## 2 Auckland North 4941.573 1657200 29600 0.9442858
## 3 Waikato North 23900.036 460100 27900 0.9520500
## 4 Bay of Plenty North 12071.145 299900 26200 0.9280391
## 5 Gisborne North 8385.827 48500 24400 0.9349734
## 6 Hawke's Bay North 14137.524 164000 26100 0.9238375
## 7 Taranaki North 7254.480 118000 29100 0.9569363
## 8 Manawatu-Wanganui North 22220.608 234500 25000 0.9387734
## 9 Wellington North 8048.553 513900 32700 0.9335524
## 10 West Coast South 23245.456 32400 26900 1.0139072
## geom
## 1 MULTIPOLYGON (((1745493 600...
## 2 MULTIPOLYGON (((1803822 590...
## 3 MULTIPOLYGON (((1860345 585...
## 4 MULTIPOLYGON (((2049387 583...
## 5 MULTIPOLYGON (((2024489 567...
## 6 MULTIPOLYGON (((2024489 567...
## 7 MULTIPOLYGON (((1740438 571...
## 8 MULTIPOLYGON (((1866732 566...
## 9 MULTIPOLYGON (((1881590 548...
## 10 MULTIPOLYGON (((1557042 531...
Lets separate only one of those regions, Canterbury, and save its information:
canterbury <- nz %>%
filter(Name == "Canterbury")
canterbury_height <- nz_height[canterbury, ]
view(canterbury_height)
Lets observe graphically what information we just created:
ggplot() + geom_sf(data = nz)
ggplot() + geom_sf(data = canterbury_height)
We can make the map more beautiful by using the following code:
ggplot() + geom_sf(data = nz, fill="white") + geom_sf(fill="antiquewhite") +
geom_sf(data = canterbury, fill="yellow") + geom_sf(data=nz_height, size=3, shape=23, fill="red") +
xlab("Longitude") + ylab("Latitude") + ggtitle("High points in New Zealand") +
annotation_north_arrow(location = "br", which_north = "true", pad_x = unit(0.25, "in"), pad_y = unit(0.15, "in"),
style = north_arrow_fancy_orienteering) +
theme(panel.background = element_rect(fill = "lightblue"))
If we are interested in find the points that are outside a certain space or polygon, we can use the function st_joint:
nz_height[canterbury, , op = st_disjoint]
## Simple feature collection with 31 features and 2 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: 1204143 ymin: 5048309 xmax: 1822492 ymax: 5650492
## projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
## First 10 features:
## t50_fid elevation geometry
## 1 2353944 2723 POINT (1204143 5049971)
## 2 2354404 2820 POINT (1234725 5048309)
## 3 2354405 2830 POINT (1235915 5048745)
## 4 2369113 3033 POINT (1259702 5076570)
## 12 2363996 2759 POINT (1373264 5175442)
## 25 2364028 2756 POINT (1374183 5177165)
## 26 2364029 2800 POINT (1374469 5176966)
## 27 2364031 2788 POINT (1375422 5177253)
## 46 2364166 2782 POINT (1383006 5181085)
## 47 2364167 2905 POINT (1383486 5181270)
nz_height[canterbury, 2, op = st_disjoint]
## Simple feature collection with 31 features and 1 field
## geometry type: POINT
## dimension: XY
## bbox: xmin: 1204143 ymin: 5048309 xmax: 1822492 ymax: 5650492
## projected CRS: NZGD2000 / New Zealand Transverse Mercator 2000
## First 10 features:
## elevation geometry
## 1 2723 POINT (1204143 5049971)
## 2 2820 POINT (1234725 5048309)
## 3 2830 POINT (1235915 5048745)
## 4 3033 POINT (1259702 5076570)
## 12 2759 POINT (1373264 5175442)
## 25 2756 POINT (1374183 5177165)
## 26 2800 POINT (1374469 5176966)
## 27 2788 POINT (1375422 5177253)
## 46 2782 POINT (1383006 5181085)
## 47 2905 POINT (1383486 5181270)
Now lets identify and safe the information for those points that intersects Canterbury, we can use the following code:
sel_sgbp <- st_intersects(x = nz_height, y = canterbury)
In the previous code, an object of type sbgp is created, i.e. a sparse geometry binary predicate. Also, we create a list of length x converted into a logical vector sel_logical. The function lengths() identifies which features in nz_height intersect with any objects in y.
class(sel_sgbp)
## [1] "sgbp" "list"
sel_logical <- lengths(sel_sgbp) > 0
canterbury_height2 <- nz_height[sel_logical, ]
Of course, we can get the same results by using pipes and the function filter():
canterbury_height3 <- nz_height %>%
filter(st_intersects(x = ., y = canterbury, sparse = FALSE))
Graphically what we obtain is:
ggplot() + geom_sf(data = nz, fill="white") + geom_sf(fill="antiquewhite") +
geom_sf(data = canterbury, fill="yellow") + geom_sf(data=canterbury_height, size=3, shape=23, fill="red") +
xlab("Longitude") + ylab("Latitude") + ggtitle("High points in Canterburry / New Zealand") +
annotation_north_arrow(location = "br", which_north = "true", pad_x = unit(0.25, "in"), pad_y = unit(0.15, "in"),
style = north_arrow_fancy_orienteering) +
theme(panel.background = element_rect(fill = "lightblue"))
It describes the spatial relationships between objects. In order to understand in detail what we are doing, lets create some spatial objects, a polygon (a), a line (l), and some points (p).
# Polygon
p_poly <- st_polygon(list(rbind(c(-1, -1), c(1, -1), c(1, 1), c(-1, -1))))
p <- st_sfc(p_poly)
# Line
l_line <- st_linestring(x = matrix(c(-1, -1, -0.5, 1), ncol = 2))
l <- st_sfc(l_line)
# Points
a_matrix <- matrix(c(0.5, 1, -1, 0, 0, 1, 0.5, 1), ncol = 2)
a_multi <- st_multipoint(x = a_matrix)
a <- st_cast(st_sfc(a_multi), "POINT")
Lets observe graphically what we created
plot(p, axes = TRUE, lwd=2, bg="antiquewhite")
plot(l, add=TRUE, lwd=2)
plot(a, add=TRUE, cex=2, fill = "red", pch=21, bg="red")
Lets identify some relationships between two polygons:
st_intersects(a, p)
## Sparse geometry binary predicate list of length 4, where the predicate was `intersects'
## 1: 1
## 2: 1
## 3: (empty)
## 4: (empty)
st_intersects(a, p, sparse = FALSE)
## [,1]
## [1,] TRUE
## [2,] TRUE
## [3,] FALSE
## [4,] FALSE
st_disjoint(a, p, sparse = FALSE)[,1] #[,1] it changes the output into a vector
## [1] FALSE FALSE TRUE TRUE
st_within(a, p, sparse = FALSE)[, 1]
## [1] TRUE FALSE FALSE FALSE
st_touches(a, p, sparse = FALSE)[, 1]
## [1] FALSE TRUE FALSE FALSE
Lets imagine that we want to find those points that do not touch the polygon we are interesed in but almost touch it. For that purpose, we use the command st_is_within_distance(), which includes the dist option, which give as a relative reference of how close can be a point.
# Lets check if the points "almost" touch the polygon rather than just touch, we set the distance.
sel <- st_is_within_distance(a, p, dist = 0.9) # can only return a sparse matrix
lengths(sel) > 0
## [1] TRUE TRUE FALSE TRUE
For this purpose, imagine you have points randomly distributed across the Earth’s surface. Lets compare with World map information:
set.seed(2020) # set seed for reproducibility
(bb_world <- st_bbox(world)) # the world's bounds
## xmin ymin xmax ymax
## -180.00000 -90.00000 180.00000 83.64513
random_df <- tibble(
x = runif(n = 100, min = bb_world[1], max = bb_world[3]),
y = runif(n = 100, min = bb_world[2], max = bb_world[4])
)
random_points <- random_df %>%
st_as_sf(coords = c("x", "y")) %>% # set coordinates
st_set_crs(4326) # set geographic CRS
Now we identify the joined set:
world_random <- world[random_points, ]
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
nrow(world_random)
## [1] 17
random_joined <- st_join(random_points, world["name_long"])
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
random_joined2 <- st_join(random_points, world["name_long"], left = FALSE)
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
Graphycally we have:
# graphic
ggplot() + geom_sf(data=world, fill="white")
ggplot() + geom_sf(data=world, fill="white") + geom_sf(data=random_joined, col="red") +
theme(panel.background = element_rect(fill = "lightblue")) + ggtitle("World Map")
ggplot() + geom_sf(data=world, fill="white") + geom_sf(data=random_joined2, col="red") +
theme(panel.background = element_rect(fill = "lightblue"))
Aggregated data show some statistics about a variable related to some kind of grouping variable. By using the New Zealand example, imagine we want to find out the average height of high points in each region:
view(nz)
nz_avheight <- aggregate(x = nz_height, by = nz, FUN = mean)
Graphically:
ggplot() + geom_sf(data = nz_avheight) +
theme(panel.background = element_rect(fill = "lightblue"))
Same results we can get by using pipes:
nz_avheight2 <- nz %>%
st_join(nz_height) %>%
group_by(Name) %>%
summarize(elevation = mean(elevation, na.rm = TRUE))
## `summarise()` ungrouping output (override with `.groups` argument)
ggplot(nz_avheight2) + geom_sf(aes(fill = elevation)) + theme(panel.background = element_rect(fill = "lightblue")) +
scale_fill_stepsn(colors = c('#efedf5','#bcbddc','#756bb1'))
table(nz_avheight2$elevation)
##
## 2720 2723 2734.33333333333 2777
## 1 1 1 1
## 2825 2889.45454545455 2994.6
## 1 1 1