R (Lovelace, Nowosad, & Muenchow, 2019). This document is prepared for CP6521 Advanced GIS, a graduate-level city planning elective course at Georgia Tech in Spring 2019. For any questions, contact the instructor, Yongsung Lee, Ph.D., via yongsung.lee(at)gatech.edu.
install.packages("tidyverse", dependencies = TRUE)
install.packages("stringr", dependencies = TRUE)
install.packages("sf", dependencies = TRUE)
install.packages("raster", dependencies = TRUE)
install.packages("spData", dependencies = TRUE)
install.packages("devtools") # for this, you need Rtools installed on your machine
devtools::install_github("Nowosad/spDataLarge")
# install.packages("sp", dependencies = TRUE)
# install.packages("mapview", dependencies = TRUE)
# install.packages("tmap", dependencies = TRUE)
library(tidyverse)
library(stringr) # for working with strings (pattern matching)
library(sf) # classes and functions for vector data
library(raster) # classes and functions for raster data
library(spData) # load geographic data
library(spDataLarge) # load larger geographic data
# library(sp)
# library(mapview)
# library(tmap)
What we do:
1. Learn how to interact with raster data
- Extract/modify cell values
- Subset a raster object
- Summarize/visualize cell values
2. Learn how to spatially manipulate raster data
- Spatially subset a raster object
- Map algebra: local, focal, zonal, & global operations
- Merge raster objects
In contrast to the vector data model underlying simple features (which represents points, lines and polygons as discrete entities in space), raster data represent continuous surfaces.
elev = raster(nrows = 6, ncols = 6, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = 1:36)
Raster objects can also contain categorical values, stored as logical or factor variables in R.
grain_order = c("clay", "silt", "sand")
grain_char = sample(grain_order, 36, replace = TRUE) # randomly generate 36 values out of grain_order
grain_fact = factor(grain_char, levels = grain_order) # convert data type from character to factor
typeof(grain_order)
typeof(grain_char)
typeof(grain_fact)
class(grain_fact)
grain = raster(nrows = 6, ncols = 6, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = grain_fact)
The raster object stores the corresponding look-up table or “Raster Attribute Table” (RAT) as a data frame in a new slot named attributes.
str(grain)
grain@data@attributes # extract slot elements
levels(grain) # list
levels(grain)[[1]] # data.frame, before change
levels(grain)[[1]] <-
cbind(levels(grain)[[1]], wetness = c("wet", "moist", "dry")) # add a new column to Raster Attribute Table
levels(grain)[[1]] # data.frame, after change
Change factor values to their actual meanings
# extract(i.e., subset) cell values by a numeric vector (cell id)
grain[c(1, 11, 35)]
# change above factor values to their meanings
factorValues(grain, grain[c(1, 11, 35)])
factorValues(grain, c(1, 11, 35)) # note: the second argument must be a vector of factor codes (cell values), not cell IDs; codes without a match in the RAT return NA
Raster subsetting is done with the base R operator [, which accepts a variety of inputs:
- Row-column indexing
- Cell IDs
- Coordinates (in Chapter 4)
- Another raster object (in Chapter 4)
elev[1, 1]
elev[1]
r_stack = stack(elev, grain)
names(r_stack) = c("elev", "grain")
str(grain)
str(r_stack)
# three ways to extract a layer of a stack
raster::subset(r_stack, "elev")
r_stack[["elev"]] # RasterStack becomes RasterLayer
r_stack$elev
# update individual cell values
elev[] # all values, before change
elev[1, 1] <- 0 # assign a new value
elev[] # all values, after change
# summarize
cellStats(elev, sd)
?cellStats # check the full list of available summary statistics
summary(elev[])
elev %>% values() %>% summary()
elev %>% getValues() %>% summary()
# histogram
hist(elev)
elev.df <- as.data.frame(elev[]) #ggplot takes only data.frames, not vectors
names(elev.df) <- "elev"
ggplot(data = elev.df) +
geom_histogram(aes(x = elev), color = "white", binwidth = 5)
Create a raster from scratch with nine rows and columns and a resolution of 0.5 decimal degrees (WGS84). Fill it with random numbers. Extract the values of the four corner cells.
14a. Change the seed number to 2020 and extract those values in the four corners.
14b. Change the seed number to 2030 and extract those values in the four corners.
14c. Instead of vals = sample(81), use rnorm(n = 81), and compute the max, min, and median values (use set.seed(2019)).
14d. Create a 20-by-20 raster instead of 9-by-9. With runif(n = 400), compute the max, min, and median values (use set.seed(2019)). You need to change xmn, ymn, xmx, and ymx accordingly (use the same res value).
Use Google Forms here: https://goo.gl/forms/643wQk2qjGsYzK4K3
set.seed(2019)
raster1 <- raster(
nrows = 9, ncols = 9, res = 0.5,
xmn= -2.25, ymn = -2.25, xmx = 2.25, ymx = 2.25,
vals = sample(81))
plot(raster1)
text(raster1)
x <- ncol(raster1)
y <- nrow(raster1)
raster1[1,1] # top left
raster1[1,x] # top right
raster1[y,1] # bottom left
raster1[y,x] # bottom right
What is the most common class of our example raster grain (hint: modal())?
plot(grain)
text(grain)
levels(grain)
grain %>% values() %>% table()
grain %>% getValues() %>% table()
Plot the histogram and the boxplot of the data(dem, package = "RQGIS") raster.
# read in the data
install.packages("RQGIS", dependencies = TRUE)
library(RQGIS)
data(dem, package = "RQGIS")
# base R functions: execute the following three lines together
# https://www.rdocumentation.org/packages/graphics/versions/3.5.2/topics/par
par(mfrow = c(1, 2))
hist(dem)
boxplot(dem)
# ggplot functions
dem.df <- as.data.frame(values(dem))
names(dem.df) <- "elev"
ggplot(data = dem.df) +
geom_histogram(aes(x = elev), color = "white", binwidth = 50)
ggplot(data = dem.df) +
geom_boxplot(aes(y = elev)) +
theme_classic()
Attach data(ndvi, package = "RQGIS"). Create a raster stack using dem and ndvi, and make a pairs() plot.
# read in the data
install.packages("RQGIS", dependencies = TRUE)
library(RQGIS)
data(ndvi, package = "RQGIS")
?ndvi
plot(ndvi)
r_stack_mongon <- stack(dem, ndvi)
names(r_stack_mongon)
pairs(r_stack_mongon)
Subset raster by location (coordinates)
id = cellFromXY(elev, xy = c(0.1, 0.1))
?cellFromXY
elev[id]
# the same as
raster::extract(elev, data.frame(x = 0.1, y = 0.1))
?raster::extract
Subset a raster by another raster: target raster[subsetting raster]
clip = raster(nrows = 3, ncols = 3, res = 0.3, xmn = 0.9, xmx = 1.8,
ymn = -0.45, ymx = 0.45, vals = rep(1, 9)) # repeat 1 for 9 times
elev[clip] # target raster[subsetting raster]
plot(elev)
plot(clip, add = TRUE, alpha = 0.75)
If we want a new raster object instead of raster values, use drop = FALSE
plot(elev[1:2, drop = FALSE]) # spatial subsetting with cell IDs
elev[1:2] # returns cell values
plot(elev[1:2]) # plots two points without values
plot(elev[1, 1:2, drop = FALSE]) # spatial subsetting by row, column indices
elev[1, 1:2] # returns cell values
plot(elev[1, 1:2]) # plots two points without values
Create a masking raster on the fly and use it on the target raster: also use mask() and overlay()
# check the raw input
plot(elev)
# create subsetting raster
rmask <- elev
values(rmask) <- sample(c(NA, TRUE), 36, replace = TRUE)
plot(rmask) # only 1 or missing
# the following three scripts return the same output
elev[rmask, drop = FALSE] %>% plot() # with [ operator
mask(elev, rmask) %>% plot() # with mask()
overlay(elev, rmask, fun = "max") %>% plot() # with overlay()
If two or more raster datasets share the same extent, projection, and resolution (i.e., one-to-one locational correspondence), we can treat them as matrices for raster processing. In this case, raster processing is very fast because we do not need to check geographic coordinates in the process. There are four types of raster operations, divided by the number of cells involved in the calculation:
Local operations are cell-by-cell operations in one or several layers.
# construct a reclassification matrix
rcl <- matrix(c(0, 12, 1, 12, 24, 2, 24, 36, 3), ncol = 3, byrow = TRUE)
recl <- reclassify(elev, rcl = rcl)
# in each row, the lower bound is not included, but the upper bound is included
# check to which "new" value the "old" value 12 is reclassified
plot(elev) # before reclassification
plot(recl) # after reclassification; note the upper-left corner, whose elev value of zero falls below the first lower bound and becomes NA
Basic arithmetic operations
elev + elev
elev^2
log(elev)
elev > 5 # the outcome is a Boolean raster (TRUE or FALSE)
plot(elev > 5)
Other examples of local operations include the normalized difference vegetation index (NDVI) and predictive mapping (e.g., using lm and glm); a rough sketch of NDVI as a local operation follows.
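As a minimal sketch of NDVI as a local operation, assuming two hypothetical single-band rasters nir and red (random values standing in for reflectance; these layers are illustrative, not from the data used elsewhere in this document):
# reuse the geometry of elev created above and fill it with random "reflectance" values
nir <- setValues(elev, runif(36, min = 0.3, max = 0.8)) # hypothetical near-infrared band
red <- setValues(elev, runif(36, min = 0.05, max = 0.3)) # hypothetical red band
# cell-by-cell (local) arithmetic across the two layers
ndvi_toy <- (nir - red) / (nir + red)
plot(ndvi_toy)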
Focal operations take into account a central cell and its neighbors.
- The neighborhood (e.g., kernel, filter, or moving window) under consideration is typically of size 3-by-3 cells.
- The neighborhood can be in non-rectangular shapes.
- A focal operation applies an aggregation function to all cells within the specified neighborhood.
- Also called spatial filtering and convolution
r_focal <- focal(elev, w = matrix(1, nrow = 3, ncol = 3), fun = min)
plot(elev) # input
plot(r_focal) # output
?focal
focal() provides options for low-pass/smoothing filters (i.e., removing extreme values) and high-pass filters (e.g., accentuating features and detecting lines); see the examples in ?focal.
terrain() calculates the slope, aspect, and flow direction of individual cells (a minimal sketch follows), but not all common raster operations are available in R; R does allow you to access desktop GIS functionality from within R (see Chapter 9, which we may not cover in this class).
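A minimal sketch of terrain(), assuming the dem raster from the RQGIS package (the same dataset loaded in the exercise above):
data(dem, package = "RQGIS") # the same dem used earlier
dem_terrain <- terrain(dem, opt = c("slope", "aspect"), unit = "degrees") # slope and aspect per cell
plot(dem_terrain) # one panel per computed layer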
Zonal operations are similar to focal operations, but the zonal filters (zones) can take on any shape instead of just a predefined window.
# check the inputs
plot(elev)
plot(grain)
# calculate statistics for individual zones
z <- zonal(elev, grain, fun = "mean") %>% # zonal returns a matrix
as.data.frame()
z
Global operations are a special case of zonal operations, with the entire raster dataset representing a single zone (see the sketch after this list). Examples are:
- Descriptive statistics for the entire raster dataset
- The distance from each cell to a specific target cell: e.g., distance to the nearest coast
- Visibility and viewshed (a viewshed raster in Chapter 9)
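As a minimal sketch of a global operation, reusing the elev raster from above; the choice of target cell (cell 18) is an arbitrary assumption for illustration:
# distance from every cell to a single target cell
target <- elev
values(target) <- NA # set all cells to NA ...
target[18] <- 1 # ... except one arbitrary target cell
dist_to_target <- raster::distance(target) # NA cells get the distance to the nearest non-NA cell
plot(dist_to_target)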
Merge two rasters that cover neighboring areas. If they overlap, merge() returns the values from the first raster.
Useful links:
- getData: https://www.rdocumentation.org/packages/raster/versions/2.8-19/topics/getData
- About SRTM: http://opentopo.sdsc.edu/raster?opentopoID=OTSRTM.082015.4326.1
- For more details: Remote Sensing and GIS for Ecologists: Using Open Source Software
raster::ccodes()
aut <- getData("alt", country = "AUT", mask = TRUE)
ch <- getData("alt", country = "CHE", mask = TRUE)
aut_ch <- merge(aut, ch)
plot(aut)
plot(ch)
plot(aut_ch)
Use data(dem, package = "RQGIS"), and reclassify the elevation into three classes: low, medium, and high. Second, attach the NDVI raster (data(ndvi, package = "RQGIS")) and compute the mean NDVI and the mean elevation for each altitudinal class.
4a. Change the number of classes to five, and compute mean values for each class. Report the mean for dem from the middle class (i.e., 3rd class).
4b. Still with five classes, report the mean for ndvi from the same class (i.e., 3rd class).
4c. Now, change the number of classes to seven, and compute mean values for each class. Report the median for dem from the 1st class (i.e., one with the lowest values in dem).
4d. Still with seven classes, report the median for ndvi from the same class (i.e., one with the lowest values in dem).
Use Google Forms here: https://goo.gl/forms/643wQk2qjGsYzK4K3 (the same link as the above)
library(RQGIS)
data(dem, package = "RQGIS")
plot(dem)
summary(dem)
library(classInt) # extract cutoff values
?classIntervals
classIntervals(values(dem), n = 3) %>% str() # check its structure
brk <- classIntervals(values(dem), n = 3)$brk # extract the breaks ($brks, the second list element; $brk partial-matches it)
rcl <- matrix(c(brk[1]-1, brk[2], 1, brk[2], brk[3], 2, brk[3], brk[4], 3), ncol = 3, byrow = TRUE)
#in the above line, for each row, the lower-bound value is not included
recl <- reclassify(dem, rcl = rcl)
plot(recl)
summary(recl)
data(ndvi, package = "RQGIS")
plot(ndvi)
summary(ndvi)
# for two rasters, stack them first, and compute zonal statistics together
zonal(stack(dem, ndvi), recl, fun = "mean") # base R way
stack(dem, ndvi) %>%
zonal(recl, fun = "mean") # with the pipe
raster(system.file("external/rlogo.grd", package = "raster"))). Plot the result. Hint: Read ?raster::focal.Useful links:
- An example of the Sobel filter: https://mgimond.github.io/Spatial/raster-operations-in-r.html#focal-operations-and-functions-1
?raster::focal
# before applying a filter
raster(system.file("external/rlogo.grd", package = "raster")) %>% plot()
# create a filter for line detection
sobel <- matrix(c(-1,0,1,-2,0,2,-1,0,1) / 4, nrow=3)
# after applying the filter
# inside the plot function, all values are divided into two groups, each of which gets either white or black
raster(system.file("external/rlogo.grd", package = "raster")) %>% focal(w=sobel, fun=sum) %>% plot(col = c("white", "black"))
Calculate the NDVI of the Landsat image provided by the spDataLarge package (system.file("raster/landsat.tif", package="spDataLarge")).
Useful links:
- Which Landsat bands correspond to R, G, and B? https://landsat.usgs.gov/what-are-band-designations-landsat-satellites
- How to rescale each band to 0 to 255? https://www.neonscience.org/dc-multiband-rasters-r
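As a rough sketch of a linear min-max rescale of one band to 0-255 (one possible approach; the plotRGB() call below instead relies on its stretch argument):
b <- raster(system.file("raster/landsat.tif", package = "spDataLarge"), band = 3) # read a single band
b_min <- cellStats(b, "min")
b_max <- cellStats(b, "max")
b_255 <- (b - b_min) / (b_max - b_min) * 255 # linear rescale to the 0-255 range
plot(b_255)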
file <- system.file("raster/landsat.tif", package="spDataLarge")
multiband.raster <- stack(file)
multiband.raster
plotRGB(multiband.raster, r = 3, g = 2, b = 1, stretch = "hist")
# "landsat.4" - Near Infrared (NIR)
# "landsat.3" - Red
ndvi <- (multiband.raster[["landsat.4"]] - multiband.raster[["landsat.3"]]) /
(multiband.raster[["landsat.4"]] + multiband.raster[["landsat.3"]])
plot(ndvi)
Compute distances to the coast with raster::distance(). Retrieve a digital elevation model of Spain, and compute a raster which represents distances to the coast across the country (hint: use getData()). Second, use a simple approach to weight the distance raster with elevation (other weighting approaches are possible, including flow direction and steepness): every 100 altitudinal meters should increase the distance to the coast by 10 km. Finally, compute the difference between the raster using the Euclidean distance and the raster weighted by elevation. Note: it may be wise to increase the cell size of the input raster to reduce compute time during this operation.
Useful links:
- getData: https://www.rdocumentation.org/packages/raster/versions/2.8-19/topics/getData
# read in DEM for Spain
raster::ccodes() %>%
filter(NAME %in% "Spain")
dem <- getData("alt", country = "ESP", mask = TRUE)
plot(dem)
# reduce resolution for faster processing
agg = aggregate(dem, fact = 10)
plot(agg)
# put a boundary sp object
# "GADM" global administrative boundaries
poly = getData("GADM", country = "ESP", level = 1)
# poly
# poly@data
plot(agg)
plot(poly, add = TRUE)
# create a Boolean raster
is.na(agg)
plot(is.na(agg))
dist <- is.na(agg)
cellStats(dist, summary) # quick summary stat for rasters
# compute distance from land to the nearest shore
# raster::distance computes, for each NA cell, the distance to the nearest non-NA cell
# convert land cells into NAs and sea cells into 1s
dist[dist == FALSE] <- NA
dist[dist == TRUE] <- 1
plot(dist)
# compute distance to nearest non-NA cell
dist = raster::distance(dist)
plot(dist)
plot(poly, add = TRUE)
# just keep Spain
dist = mask(dist, poly)
# convert distance into km
dist = dist / 1000
plot(dist)
plot(poly, add = TRUE)
# now weight the distance: every 100 altitudinal meters adds an additional 10 km
agg <- mask(agg, poly)
agg[agg < 0] = 0 # elevation is in meters; set negative values (below sea level) to zero
weight = dist + agg / 100 * 10
# plot the difference between two approaches
plot(weight - dist)
plot(poly, add = TRUE)
To be continued next week.