Spatial Data Operations: In-class Exercises

install.packages("tidyverse", dependencies = TRUE)
install.packages("sf", dependencies = TRUE)
install.packages("spData", dependencies = TRUE)
install.packages("tmap", dependencies = TRUE)

library(tidyverse)
library(sf)          # classes and functions for vector data
library(spData)      # load geographic data
library(tmap)

nz
nz_height

tmap_mode("view") # from now on, all tmaps are interactive, but longer processing time 
tm_basemap("OpenStreetMap.Mapnik") + 
  tm_shape(nz) +
  tm_borders() +
  tm_shape(nz_height) +
  tm_symbols(col = "red", shape=16, size = 0.5, alpha = 0.5)  
# tmap_mode("plot") # from now on, all tmaps are static, but shorter processing time

A0. Introduction to spatial subsetting: points[polygons, ]

canterbury <- nz %>% filter(Name == "Canterbury") # attribute subsetting
canterbury_height <- nz_height[canterbury, ]      # spatial subsetting 

# examine data 
canterbury_height
class(canterbury_height)
typeof(canterbury_height)
str(canterbury_height)
View(canterbury_height)
nrow(canterbury_height)

tm_basemap("OpenStreetMap.Mapnik") +
  tm_shape(canterbury)+
  tm_borders() +
  tm_shape(canterbury_height) +
  tm_symbols(col = "red", shape=16, size = 0.5, alpha = 0.5)

A1. Report the number of highest points in each of the following spatial operation cases: 1 through 6.

# define a mapping function to visualize results easily 
mymap <- function(x){
  tm_basemap("OpenStreetMap.Mapnik") +
    tm_shape(canterbury)+
    tm_borders() +
    tm_shape(x) +
    tm_symbols(col = "red", shape=16, size = 0.5, alpha = 0.5)    
}

No 1 nz_height[canterbury, , op = st_intersects] %>% mymap() # by default
No 2 nz_height[canterbury, , op = st_touches] %>% mymap()
No 3 nz_height[canterbury, , op = st_crosses] %>% mymap()
No 4 nz_height[canterbury, , op = st_within] %>% mymap()
No 5 nz_height[canterbury, , op = st_disjoint] %>% mymap()
No 6 nz_height[canterbury, , op = st_is_within_distance, dist = 5000] %>% mymap()

A1. Answers:

70
0
0
70
31
92

Report your answers to this section on Google Forms Section A

B0. Introduction to spatial joinning: st_join(points, polygons)
- While you are working on this, you will see another type of spatial subsetting: polygons[points, ]

tmap_mode("plot") # from now on, all tmaps are static, but shorter processing time 

set.seed(2020)               # set seed for reproducibility
bb_world = st_bbox(world[world$name_long == "United States", ])  # the world's bounds: c(xmin, ymin, xmax, ymax)

random_df = tibble(
  x = runif(n = 30, min = bb_world[1], max = bb_world[3]),
  y = runif(n = 30, min = bb_world[2], max = bb_world[4])
)

# Note that random_points has no other attributes except xy coordinates 
random_points = random_df %>% 
  st_as_sf(coords = c("x", "y")) %>% # set coordinates
  st_set_crs(4326)                   # set geographic CRS 

world_random = world[random_points, ] # spatially subset polygons if they have any point falling inside them.   
nrow(world_random)                    # number of polygons with successful spatial joining 

random_joined = st_join(random_points, world["name_long"]) # all points are returned regardelss of being matched to any country 
View(random_joined)

tm_shape(world[world$name_long == "United States", ]) +
  tm_borders() + 
  tm_shape(world_random) + 
  tm_fill(col = "name_long") + 
  tm_text("name_long", col = "blue") +
  tm_shape(random_joined) + 
  tm_symbols(shape = 4)

B1. Report your results in the following cases.

No 7. Now, change the seed number to 2019 and the number of random points to 20. Report How many countries those random points fall into.

No 8. Now, change the seed number to 2020 and the number of random points to 30. Report How many countries those random points fall into.

No 9. Now, change the bounding box to that for the United States (Hint: Use attribute subsetting with name_long == "United States" on the world sf object). With the seed number 2019 and 20 random points, count how many states in the US have any point in them.

No 10. Now, change the bounding box to that for the United States (Hint: Use attribute subsetting with name_long == "United States" on the world sf object). With the seed number 2020 and 30 random points, count how many states in the US have any point in them.

B1. Answers:

Five: Antarctica, Canada, India, Myanmar, & Tanzania
Eight: Antarctica, Austri, Brazil, Chad, D. R. C., Greenland, Iran, & Russia
Four
Five

Report your answers to this section on Google Forms Section B

C. 4.2.4 Non-overlapping joins deals with st_join(points, points). We will skip this during the class today, but review and practice the scripts in the materials in your study time.

D0. Introduction to spatial data aggregating: st_join(polygons, points).
- Note that spatial join generates each and every possible combination between two input geometry types.

nz_height  
nz         

nz_join_height <- nz %>% # all polygons are returned regardelss of having points inside them 
  st_join(nz_height) 

nz_join_height2 <- nz %>%
  st_join(nz_height) %>%
  group_by(Name) %>%
  summarize(n = n()) # number of rows 
            
nz_join_height3 <- nz %>%
  st_join(nz_height) %>%
  group_by(Name) %>%
  summarize(n = sum(is.na(elevation) == FALSE)) %>% # count valid point id's per region
  filter(n == 0) # nine out of 16 regions have no highest points 

nz_avheight = nz %>%
  st_join(nz_height) %>%
  group_by(Name) %>%
  summarize(elevation = mean(elevation)) %>% 
  arrange(desc(elevation))

D1. report the number of highest points in the following cases.

No 11. How many more rows does nz_join_height have compared to nz_height?

No 12. How many rows in nz_join_height have missing values for elevation?

No 13. How many rows does nz_join_height2 have?

No 14. How many rows does nz_join_height3 have?

No 15. How many rows in nz_avheight have missing values for elevation?

No 16. What is the average elevation of highest points for Canterbury?

No 17. What is the maximum elevation of highest points for Canterbury?

No 18. What is the maximum elevation of highest points for Otago?

Report your answers to this section on Google Forms Section D

D1. Answers:

9
9
16
9
9
2994.6
3724
2830

Spatial Data Operations: In-class Exercises

February 27, 2019 (Wendesday)

A1. Answers:

B1. Answers:

D1. Answers: