2026-02-26

Class Structure

  • Warm-up Activity (5 min)
  • Environmental Risks & Hazards (5 min)
  • Intro to Simple Features in R (10 min)
  • Example 1: Air Quality (10 min)
  • Example 2: Flooding (10 min)
  • Wrap up: Discussion of Extensions (5 min)

Warm-up Activity

Using the data handouts:

  • How would you assess differences in exposure to air quality across neighborhoods?

Warm-up Recap

  • How would you assess differences in exposure to air quality?
  • What challenges came up (or might come up)?

Data Visualization

Data Visualization

Learning Goals

  • Understanding of risks vs. hazards Three steps:
  • Creating sf objects with st_as_sf
  • Plotting sf objects with ggplot
  • Joining sf objects with st_join

Hazards vs. Risks

Hazards vs. Risks

A hazard is a geophysical event.

A risk is a probability of some (usually negative) consequence for humans/assets.

Hazards vs. Risks

A hazard is a geophysical event itself.

A risk is a probability of some (usually negative) consequence for humans/assets.

Source: USGS

Source: USGS

Hazards vs. Risks

If a tree falls in a forest, and no one is around, is it:

  • A hazard
  • A risk
  • Both
  • Neither

Producing Inequality?

The language of risks and hazards has also been applied to social issues. For example, historical redlining used “hazardous” neighborhood labels to entrench ethnoracial spatial divisions.

Los Angeles, 1939

Los Angeles, 1939

Risk Management

Governments shape risks (real and perceived) that people live with.

Through risk management, states and institutions set “acceptable levels” of risks, with implications for society.

Risk Management

  • What is an example of an “acceptable level” of risk, set by a government, institution, or group?

Risks and Spatial Data

  • To assess environmental risks, we often need to work with spatial data
  • We will use simple features to do this

Introduction to Simple Features in R

Some R Basics

  • Free, open-source, statistical analysis tool
  • Normally: RStudio
  • Here: bit.ly/4c3ZMTD
  • Boxes with R Code: you can modify!
  • To run code: click run code!

What Are Simple Features?

  • What is a simple feature?
  • Two elements:
  • Geometry noting its location in space
  • Attributes noting all other properties
  • Advantages: flexible, reproducible, join and analyze data all in one place

What are Attributes?

  • A dataframe with any kind of information
  • In R, # are comments, or notes to ourselves
  • c() is how we store information
  • <- is how we put information into an object
# create an air quality dataframe
aqi <- data.frame("monitor" = c("1", "2", "3"),
                  "pm" = c(4, 5, 10))

# look at the dataframe
aqi
##   monitor pm
## 1       1  4
## 2       2  5
## 3       3 10
  • On your own, try changing one of these numbers!

What is Geometry?

  • There were 3 types of geometries in the warm-up. What were they?

What is Geometry?

  • Points!
  • Lines!
  • Polygons!

What is Geometry?

  • Let’s start small
# try making a point
point <- st_point(x = c(0, 1))

point
## POINT (0 1)

-Try creating your own!

What About a Line?

# try a line
line <- st_linestring(x = matrix(data = c(0, 1, 
                                          0, 0), 
                                 ncol = 2, 
                                 byrow = TRUE))

line
## LINESTRING (0 1, 0 0)

Polygons

# try a polygon
polygon <- st_polygon(x = list(matrix(data = c(0, 1, 
                                               0, 0, 
                                               1, 2, 
                                               0 ,1), 
                                      ncol = 2,
                                      byrow = TRUE)))
polygon
## POLYGON ((0 1, 0 0, 1 2, 0 1))

Joining Points and Polygons

What About Multiple Geometries?

  • Let’s create air quality monitors locations by hand:
# let's create three points
point1 <- st_point(x = c(0, 1))

point2 <- st_point(x = c(2, 3))

point3 <- st_point(x = c(0, 4))
  • Can you create the points from the warm-up?

Create an sf Object

  • %<>% is one way to permanently change an object in R
# create an sf object
aqi %<>%
  bind_cols(bind_rows(data.frame(st_sfc(point1)), 
                      data.frame(st_sfc(point2)),
                      data.frame(st_sfc(point3))))

-Try plotting

Is it sf?

  • Examine this object
  • Does it contain attributes and geometries?
  • Is it an sf object?

Now Make it sf

# look at the class
class(aqi)
## [1] "data.frame"
  • Still a dataframe …

Now Make it sf

  • st_as_sf() makes the data an sf object
# make it sf
aqi %<>%
  st_as_sf()
  • What is the class now?

Plot the sf Object Using ggplot2

  • In ggplot2, ggplot() creates a blank canvas to plot data
# create canvas
ggplot()

Plot the sf Object Using ggplot2

  • geom_sf adds a “geometry” layer to the plot
# create canvas and plot
ggplot()+
  geom_sf(data = aqi)

Plot the sf Object Using ggplot2

# plot an sf object
ggplot()+
  geom_sf(data = aqi, aes(col = pm))

Now Let’s Make a Polygon sf Object

  • We can also add polygons
# first build dataframe
zones <- data.frame("area" = c("claremont", "pomona"),
                       "students" = c(234, 120))


# then build polygons
polygon1 <- st_polygon(list(matrix(data = c(0, 1, 0, 0, 1, 2, 0 ,1), 
                                      ncol = 2,
                                      byrow = TRUE)))

polygon2 <- st_polygon(list(matrix(data = c(1, 0, 0, 0, 1, 2, 1 ,0), 
                                      ncol = 2,
                                      byrow = TRUE)))

# create an sf object
zones %<>%
  bind_cols(bind_rows(data.frame(st_sfc(polygon1)), 
                      data.frame(st_sfc(polygon2))))
  • What class is zones?

Make it sf!

# make it sf
zones %<>%
  st_as_sf()

Plot the sf Object

# multiple points
ggplot()+
  geom_sf(data = zones)

Plot them together

ggplot()+
  geom_sf(data = zones)+
  geom_sf(data = aqi, aes(col = pm))

Now Spatial Join

st_join(aqi, zones)
## Simple feature collection with 3 features and 4 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 0 ymin: 1 xmax: 2 ymax: 4
## CRS:           NA
##   monitor pm      area students    geometry
## 1       1  4 claremont      234 POINT (0 1)
## 2       2  5      <NA>       NA POINT (2 3)
## 3       3 10      <NA>       NA POINT (0 4)
  • Does it matter which order you put them in?

Example 1: Air Quality

Joining Points and Polygons

  • Air quality is measured at specific monitors
  • Let’s read in air quality data from 2022
# read in data
aqi22 <- read_csv("aqi22_clean_ca.csv")

# make air quality data spatial
aqi22 %<>%
  st_as_sf(coords = c("X", "Y"))

# set crs
aqi22 %<>%
  st_set_crs(st_crs("NAD83"))

The Tigris Package

Using Tigris to Gather Shapefiles

  • Draws form Census TIGER files (Topologically Integrated Geographic Encoding and Referencing system)
  • Using tigris, we can gather lots of different shapefiles from the US Census and the National Historical Geographic Information System
  • For example:
library(tigris)

# gather county shapefiles
counties(state = "CA")

Using Tigris to Gather Shapefiles

  • Using tigris, we can gather lots of different shapefiles from the US Census and the National Historical Geographic Information System.

Using Tigris to Gather Shapefiles

  • For example, counties:
# gather county shapefiles
counties_ca <- counties(state = "CA")
  • How would we plot the data?

Plot the Counties

# plot counties 
ggplot(counties_ca)+
  geom_sf()

Joining Points to Polygons

  • We can plot these air quality monitors
  • aes(col = ) adds color to a layer
# try plotting again
ggplot()+
  geom_sf(data = counties_ca)+
  geom_sf(data = aqi22, aes(col = `Arithmetic Mean`))

Joining Points and Polygons

  • We may want to join the data with air quality measures
# join school district boundaries with AQI
counties_ca %<>%
  st_join(aqi22)

Now map it again

# try plotting again
ggplot(counties_ca)+
  geom_sf(aes(fill = `Arithmetic Mean`))+
  theme_minimal()+
  labs(col = "PM2.5 Average")

Example 2: Floodplains

The 100 Year Floodplain

We might be interested in data on risks. For example, the National Flood Insurance Program designates the 100 year floodplain (1% annual flooding likelihood) as the primary geography of flood risk.

Floodplain Example. Source: FEMA

Floodplain Example. Source: FEMA

Flood Risks in Los Angeles

Let’s Look at an Example

## Reading layer `pomona_small' from data source 
##   `/Users/tylermcdaniel/Library/CloudStorage/GoogleDrive-tylermc@stanford.edu/My Drive/Applications/Scripps/doi_10_7280_D1RH7Z__v20220829/pomona_small.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 3 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -117.773 ymin: 34.0383 xmax: -117.773 ymax: 34.0383
## Geodetic CRS:  NAD83
## Reading layer `flood_hazard_small' from data source 
##   `/Users/tylermcdaniel/Library/CloudStorage/GoogleDrive-tylermc@stanford.edu/My Drive/Applications/Scripps/doi_10_7280_D1RH7Z__v20220829/flood_hazard_small.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 2 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -117.807 ymin: 34.02632 xmax: -117.757 ymax: 34.05826
## Geodetic CRS:  NAD83
  • How would you plot the flood risks?

Examine Flood Risks

# plot flood risks
ggplot()+
  geom_sf(data = flood_hazard_small)

Examine Flood Risks

ggplot()+
  geom_sf(data = flood_hazard_small, aes(fill = FLD_ZONE))

Examine Flood Risks

ggplot()+
  geom_sf(data = flood_hazard_small, aes(fill = FLD_ZONE))+
  geom_sf(data = pomona_small)

Join Flood Risks

# join flood risks
flood_hazard_small %<>%
  st_join(pomona_small)


flood_hazard_small
## Simple feature collection with 2 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -117.807 ymin: 34.02632 xmax: -117.757 ymax: 34.05826
## Geodetic CRS:  NAD83
##     FLD_AR_ID FLD_ZONE    ID   OwnerName C100yrMEAN
## 1  06037C_664        D 48083 POMONA CITY  0.3157211
## 2 06037C_5417        X    NA        <NA>         NA
##                         geometry
## 1 MULTIPOLYGON (((-117.7878 3...
## 2 MULTIPOLYGON (((-117.7572 3...

Recap

Hazards vs. Risks

  • What is a hazard?
  • What is a risk?

Spatial Data

  • What are the two types of data that simple features combines?
  • What are the three types of spatial “features”?

Joining Points and Polygons

  • How do we join points and polygons?
  • What happens to points/polygons outside the intersection?

Applications and Extensions

  • How are spatial joins useful for air quality measurement?
  • How are spatial joins useful for flood measurement?

Flood risks in Los Angeles

la_flood_risks <- read_csv("lafloodrisk_2020/lafloodrisk_2020_p_d_s_50.csv",
                           show_col_types = FALSE)


# let's look at our data
names(la_flood_risks)
##  [1] "ID"                 "APN"                "TRACT_ID"          
##  [4] "BLKGRP_ID"          "TaxRateCity"        "UseType"           
##  [7] "LotSizeSquareFeet"  "TotalAssessedValue" "C100yrMEAN"        
## [10] "P100yrMEAN"         "Q100yrMEAN"         "T100yrMEAN"        
## [13] "FEMA_ZONE"          "FEMA_SUBTITLE"      "FEMA_SFHA"         
## [16] "POP2020"            "POP20_SQMI"         "WHITE"             
## [19] "BLACK"              "AMERI_ES"           "ASIAN"             
## [22] "HAWN_PI"            "HISPANIC"           "OTHER"             
## [25] "MULT_RACE"          "MEDINC"             "SVI"               
## [28] "SOVI"               "NDI"

Let’s look at and example

Try Mapping Buildings

ggplot()+
  geom_sf(data = pomona)

pomona_small <- pomona %>%
  head(1)

st_write(pomona_small, "pomona_small.shp")

#la_verne <- st_as_sf(st_set_crs(la_verne, st_crs(flood_hazard)))

Flood Hazard

flood_hazard <- st_read("06037C_20260127/S_FLD_HAZ_AR.shp")

flood_hazard_small <- 
  st_intersection(flood_hazard, st_as_sfc((st_bbox(pomona)))) %>%
  select(FLD_AR_ID, FLD_ZONE, geometry)



st_write(flood_hazard_small, "flood_hazard_small.shp")

flood_hazard_small <- st_read("flood_hazard_small.shp")


ggplot(flood_hazard_small)+
  geom_sf(aes(fill = FLD_ZONE))+
#  lims(x = c(117.81, 117.76), y = c(34.025, 34.060))+
  geom_sf(data = pomona)

Mapping Populations in Los Angeles

Multi-Geometries

  • Points!
  • Lines!
  • Polygons!

Let’s Look at Intersection

point1 <- st_point(x = c(0, 1))

point2 <- st_point(x = c(2, 3))

point3 <- st_point(x = c(0, 4))


# multiple points
multipoint <- st_multipoint(x = matrix(data = c(1, 1, 0, 0, 2, 0), 
                                 ncol = 2, 
                                 byrow = TRUE))


# multiple polygons
multipolygon <- st_multipolygon(x = list(list(matrix(data = c(0, 1, 0, 0, 1, 2, 0 ,1), 
                                      ncol = 2,
                                      byrow = TRUE)),
                                    list(matrix(data = c(1, 0, 0, 0, 1, 2, 1 ,0), 
                                      ncol = 2,
                                      byrow = TRUE))))

What Are Simple Features

library(sf)

# load a simple features object
nc <- st_read(system.file("shape/nc.shp", package="sf"))

What Are Simple Features?

  • You might notice “CRS” in sf objects
  • What’s this? Coordinate Reference System
  • Projects points on global (or North American) maps
# look at nc
head(nc)

Let’s Examine them

  • What is our object?
class(nc)
  • Can we plot it?
# plot entire dataset
plot(nc)

# plot one column
plot(nc$AREA)
  • What if we wanted to plot one county?

What Are Simple Features?

  • Mapping with ggplot2
  • We can use geom_sf() to plot simple features
library(ggplot2)

ggplot(nc)+
  geom_sf()

What Are Simple Features?

  • In groups: Can you plot nc using ggplot() and geom_sf()?
  • Can you plot just the counties with multiple polygons?
  • Try changing color (fill), line width (lwd), transparency (alpha)

Using Tigris to Gather Shapefiles

  • How would we overlay two types of shapefiles?

Using Tigris to Gather Shapefiles

  • For example, counties and roads in Los Angeles County:
# gather county shapefiles
#water_la <- linear_water(state = "CA", county = "Los Angeles")

Using Tigris to Gather Shapefiles

# plot counties 
#ggplot(counties)+
#  geom_sf()+
#  geom_sf(data = water_la)

Using Tigris to Gather Shapefiles

# it may help to limit data to los angeles county
#county_la <- counties %>%
#  filter(COUNTYFP == "037")


#tracts_la <- tracts(county = "Los Angeles", state = "California")

# plot la county first 
#ggplot(tracts_la)+
#  geom_sf()+
#  geom_sf(data = water_la)

Using Tigris to Gather Shapefiles

# get state of california shape
#ca <- states() %>%
#  filter(NAME == "California")

Air Quality Across Space

  • Air pollution reduces lifespans by approximately 2 years, on average, globally
  • 97% of world population lives in areas with “unsafe” levels of particulate matter (e.g. PM2.5)
  • In the U.S., particulates are estimated to account for 100,000 deaths annually (from pulmonary and cardiac disease)

Air Quality Across Space

  • Gains in recent decades in the U.S. are being undone by wildfires

Social & Spatial Data

In Los Angeles, researchers have estimate a different zone of potential flooding.

To compare official risks to experienced hazards, we’ll need to combine data from different sources. By combining data on flooding risks and spatial demographics, we can investigate what this risk management means for communities.

Los Angeles

First, we gather data on Los Angeles:

Los Angeles

Preparation

This class will use R. We will use the following packages:

install.packages("dplyr")
install.packages("readr")
instll.packages("tigris")
install.packages("ggplot2")
install.packages("sf")
install.packages("magrittr")

library(dplyr)
library(readr)
library(tigris)
library(ggplot2)
library(sf)
library(magrittr)

Alternative Flood Risk Definition

Assignment

Choose a neighborhood or region in Los Angeles, and combine building and flood risk data.

  • Combine your map with the Los Angeles flooding data. Look at who experiences flooding in the following categories:
  • inside official and unofficial floodplains
  • inside official floodplains, outside unofficial floodplains
  • outside official floodplains, inside unofficial floodplains
  • outside official and unofficial floodplains
  • Who experiences which risks? What can you conclude about what will happen in the event of a flood?
  • First, draw how you might communicate these findings. Then try to visually display them using R. What changes?