2026-02-26

Class Structure

  • Data Visualization
  • Producing Environmental Hazards
  • Activity
  • Working with spatial data in R
  • Application: Air Quality
  • Wrap-up Discussion
  • Additional Application: Flooding

Air Quality Across Time - LA

Air Quality Across Time - SF

Air Quality Across Time and Space

Producing Environmental Hazards

Producing Environmental Hazards

The language of hazards has also meaning for social issues. For example, historical redlining used “hazardous” neighborhood labels to entrench ethnoracial spatial divisions.

Los Angeles, 1939

Los Angeles, 1939

Producing Environmental Hazards

These designations matter today!

Lane et al., 2022

Lane et al., 2022

Activity

Using the data handouts:

  • What can you tell me about differences in exposure to air quality across neighborhoods?

Activity Recap

  • What can you tell me about differences in exposure to air quality across neighborhoods?
  • What challenges came up (or might come up)?

Hazards and Spatial Data

  • To assess the social implications of environmental hazards, we often need to work with spatial data
  • We will use spatial analysis in R to do this

Introduction to Spatial Analysis in R

Some R Basics

  • Free, open-source, statistical analysis tool
  • Normally: RStudio

Some R Basics

  • Free, open-source, statistical analysis tool
  • Normally: RStudio

Some R Basics

  • The big picture: we are creating, plotting, and joining spatial data to analyze an environmental hazard

Discussion and Recap

Wrap-up Questions

How did we join points and polygons?

Wrap-up Questions

How did we join points and polygons?

  • We created sf objects using st_as_sf
  • We plotted the sf objects together using ggplot
  • We joined sf objects using st_join

Wrap-up Questions

  • How might this be useful for studying inequalities in pollution?
  • What are some challenges that might come up?
  • Could we use a similar method to analyze another hazard (for example, flooding)?

Thank you!

Evaluating Flood Risks

  • Large flood risks exist outside of FEMA floodzones
Lane et al., 2022

Lane et al., 2022

Hazards vs. Risks

A hazard is a geophysical that could do harm to humans/assets.

A risk is a probability of actual harm for humans/assets.

Hazards vs. Risks

A hazard is a geophysical that could do harm to humans/assets.

A risk is a probability of actual harm for humans/assets.

Source: USGS

Source: USGS

Hazards vs. Risks

If a tree falls in a forest, and no one is around, is it:

  • A hazard
  • A risk
  • Both
  • Neither

What Are Simple Features?

  • What is a simple feature?
  • Two elements:
  • Geometry noting its location in space
  • Attributes noting all other properties
  • Advantages: flexible, reproducible, join and analyze data all in one place

What are Attributes?

  • A dataframe with any kind of information
  • In R, # are comments, or notes to ourselves
  • c() is how we store information
  • <- is how we put information into an object
# create an air quality dataframe
aqi <- data.frame("monitor" = c("1", "2", "3"),
                  "pm" = c(4, 5, 10))

# look at the dataframe
aqi
##   monitor pm
## 1       1  4
## 2       2  5
## 3       3 10
  • On your own, try changing one of these numbers!

What is Geometry?

  • There were 3 types of geometries in the activity. What were they?

What is Geometry?

  • Points!
  • Lines!
  • Polygons!

What is Geometry?

  • Let’s start small
# try making a point
point <- st_point(x = c(0, 1))

point
## POINT (0 1)

-Try creating your own!

What About a Line?

# try a line
line <- st_linestring(x = matrix(data = c(0, 1, 
                                          0, 0), 
                                 ncol = 2, 
                                 byrow = TRUE))

line
## LINESTRING (0 1, 0 0)

Polygons

# try a polygon
polygon <- st_polygon(x = list(matrix(data = c(0, 1, 
                                               0, 0, 
                                               1, 2, 
                                               0 ,1), 
                                      ncol = 2,
                                      byrow = TRUE)))
polygon
## POLYGON ((0 1, 0 0, 1 2, 0 1))

Joining Points and Polygons

What About Multiple Geometries?

  • Let’s create air quality monitors locations by hand:
# let's create three points
point1 <- st_point(x = c(0, 1))

point2 <- st_point(x = c(2, 3))

point3 <- st_point(x = c(0, 4))
  • Can you create the points from the activity?

Create an sf Object

  • %<>% is one way to permanently change an object in R
# create an sf object
aqi %<>%
  bind_cols(bind_rows(data.frame(st_sfc(point1)), 
                      data.frame(st_sfc(point2)),
                      data.frame(st_sfc(point3))))


write_csv(aqi, "aqi.csv")

-Try plotting

Is it sf?

  • Examine this object
  • Does it contain attributes and geometries?
  • Is it an sf object?

Now Make it sf

# look at the class
class(aqi)
## [1] "data.frame"
  • Still a dataframe …

Now Make it sf

  • st_as_sf() makes the data an sf object
# make it sf
aqi %<>%
  st_as_sf()
  • What is the class now?

Plot the sf Object Using ggplot2

  • In ggplot2, ggplot() creates a blank canvas to plot data
# create canvas
ggplot()

Plot the sf Object Using ggplot2

  • geom_sf adds a “geometry” layer to the plot
# create canvas and plot
ggplot()+
  geom_sf(data = aqi)

Plot the sf Object Using ggplot2

# plot an sf object
ggplot()+
  geom_sf(data = aqi, aes(col = pm))

Now Let’s Make a Polygon sf Object

  • We can also add polygons
# first build dataframe
zones <- data.frame("area" = c("claremont", "pomona"),
                       "students" = c(234, 120))


# then build polygons
polygon1 <- st_polygon(list(matrix(data = c(0, 1, 0, 0, 1, 2, 0 ,1), 
                                      ncol = 2,
                                      byrow = TRUE)))

polygon2 <- st_polygon(list(matrix(data = c(1, 0, 0, 0, 1, 2, 1 ,0), 
                                      ncol = 2,
                                      byrow = TRUE)))

# create an sf object
zones %<>%
  bind_cols(bind_rows(data.frame(st_sfc(polygon1)), 
                      data.frame(st_sfc(polygon2))))
  • What class is zones?

Make it sf!

# make it sf
zones %<>%
  st_as_sf()

Plot the sf Object

# multiple points
ggplot()+
  geom_sf(data = zones)

Plot them together

ggplot()+
  geom_sf(data = zones)+
  geom_sf(data = aqi, aes(col = pm))

Now Spatial Join

st_join(aqi, zones)
## Simple feature collection with 3 features and 4 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 0 ymin: 1 xmax: 2 ymax: 4
## CRS:           NA
##   monitor pm      area students    geometry
## 1       1  4 claremont      234 POINT (0 1)
## 2       2  5      <NA>       NA POINT (2 3)
## 3       3 10      <NA>       NA POINT (0 4)
  • Does it matter which order you put them in?

Example 1: Air Quality

Joining Points and Polygons

  • Air quality is measured at specific monitors
  • Let’s read in air quality data from 2022
# read in data
id <- "1R3N_iToKDEIRapTKgAkhbzBZW6dRtm9e"
aqi22 <- read.csv(sprintf("https://docs.google.com/uc?id=%s&export=download", id))

# make air quality data spatial
aqi22 %<>%
  st_as_sf(coords = c("X", "Y"))

# set crs
aqi22 %<>%
  st_set_crs(st_crs("NAD83"))

The Tigris Package

Using Tigris to Gather Shapefiles

  • Draws form Census TIGER files (Topologically Integrated Geographic Encoding and Referencing system)
  • Using tigris, we can gather lots of different shapefiles from the US Census and the National Historical Geographic Information System
  • For example:
library(tigris)

# gather county shapefiles
counties(state = "CA")

Using Tigris to Gather Shapefiles

  • Using tigris, we can gather lots of different shapefiles from the US Census and the National Historical Geographic Information System.

Using Tigris to Gather Shapefiles

  • For example, counties:
# gather county shapefiles
counties_ca <- counties(state = "CA")
  • How would we plot the data?

Plot the Counties

# plot counties 
ggplot(counties_ca)+
  geom_sf()

Joining Points to Polygons

  • We can plot these air quality monitors
  • aes(col = ) adds color to a layer
# try plotting again
ggplot()+
  geom_sf(data = counties_ca)+
  geom_sf(data = aqi22, aes(col = `Arithmetic.Mean`))

Joining Points and Polygons

  • We may want to join the data with air quality measures
# join school district boundaries with AQI
counties_ca %<>%
  st_join(aqi22)

Now map it again

# try plotting again
ggplot(counties_ca)+
  geom_sf(aes(fill = `Arithmetic.Mean`))+
  theme_minimal()+
  labs(col = "PM2.5 Average")

Example 2: Floodplains

The 100 Year Floodplain

We might be interested in data on risks. For example, the National Flood Insurance Program designates the 100 year floodplain (1% annual flooding likelihood) as the primary geography of flood risk.

Floodplain Example. Source: FEMA

Floodplain Example. Source: FEMA

Flood Risks in Los Angeles

Let’s Look at an Example

## Reading layer `pomona_small' from data source 
##   `/Users/tylermcdaniel/Library/CloudStorage/GoogleDrive-tylermc@stanford.edu/My Drive/Applications/Scripps/doi_10_7280_D1RH7Z__v20220829/pomona_small.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 3 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -117.773 ymin: 34.0383 xmax: -117.773 ymax: 34.0383
## Geodetic CRS:  NAD83
## Reading layer `flood_hazard_small' from data source 
##   `/Users/tylermcdaniel/Library/CloudStorage/GoogleDrive-tylermc@stanford.edu/My Drive/Applications/Scripps/doi_10_7280_D1RH7Z__v20220829/flood_hazard_small.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 2 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -117.807 ymin: 34.02632 xmax: -117.757 ymax: 34.05826
## Geodetic CRS:  NAD83
  • How would you plot the flood risks?

Examine Flood Risks

# plot flood risks
ggplot()+
  geom_sf(data = flood_hazard_small)

Examine Flood Risks

ggplot()+
  geom_sf(data = flood_hazard_small, aes(fill = FLD_ZONE))

Examine Flood Risks

ggplot()+
  geom_sf(data = flood_hazard_small, aes(fill = FLD_ZONE))+
  geom_sf(data = pomona_small)

Join Flood Risks

# join flood risks
flood_hazard_small %<>%
  st_join(pomona_small)


flood_hazard_small
## Simple feature collection with 2 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -117.807 ymin: 34.02632 xmax: -117.757 ymax: 34.05826
## Geodetic CRS:  NAD83
##     FLD_AR_ID FLD_ZONE    ID   OwnerName C100yrMEAN
## 1  06037C_664        D 48083 POMONA CITY  0.3157211
## 2 06037C_5417        X    NA        <NA>         NA
##                         geometry
## 1 MULTIPOLYGON (((-117.7878 3...
## 2 MULTIPOLYGON (((-117.7572 3...

Recap

Hazards vs. Risks

  • What is a hazard?
  • What is a risk?

Spatial Data

  • What are the two types of data that simple features combines?
  • What are the three types of spatial “features”?

Joining Points and Polygons

  • How do we join points and polygons?
  • What happens to points/polygons outside the intersection?

Applications and Extensions

  • How are spatial joins useful for air quality measurement?
  • How are spatial joins useful for flood measurement?

Producing Inequality?

The language of risks and hazards has also been applied to social issues. For example, historical redlining used “hazardous” neighborhood labels to entrench ethnoracial spatial divisions.

Los Angeles, 1939

Los Angeles, 1939

Risk Management

Governments shape risks (real and perceived) that people live with.

Through risk management, states and institutions set “acceptable levels” of risks, with implications for society.

Risk Management

  • What is an example of an “acceptable level” of risk, set by a government, institution, or group?

Flood risks in Los Angeles

la_flood_risks <- read_csv("lafloodrisk_2020/lafloodrisk_2020_p_d_s_50.csv",
                           show_col_types = FALSE)


# let's look at our data
names(la_flood_risks)
##  [1] "ID"                 "APN"                "TRACT_ID"          
##  [4] "BLKGRP_ID"          "TaxRateCity"        "UseType"           
##  [7] "LotSizeSquareFeet"  "TotalAssessedValue" "C100yrMEAN"        
## [10] "P100yrMEAN"         "Q100yrMEAN"         "T100yrMEAN"        
## [13] "FEMA_ZONE"          "FEMA_SUBTITLE"      "FEMA_SFHA"         
## [16] "POP2020"            "POP20_SQMI"         "WHITE"             
## [19] "BLACK"              "AMERI_ES"           "ASIAN"             
## [22] "HAWN_PI"            "HISPANIC"           "OTHER"             
## [25] "MULT_RACE"          "MEDINC"             "SVI"               
## [28] "SOVI"               "NDI"

Let’s look at and example

Try Mapping Buildings

ggplot()+
  geom_sf(data = pomona)

pomona_small <- pomona %>%
  head(1)

st_write(pomona_small, "pomona_small.shp")

#la_verne <- st_as_sf(st_set_crs(la_verne, st_crs(flood_hazard)))

Flood Hazard

flood_hazard <- st_read("06037C_20260127/S_FLD_HAZ_AR.shp")

flood_hazard_small <- 
  st_intersection(flood_hazard, st_as_sfc((st_bbox(pomona)))) %>%
  select(FLD_AR_ID, FLD_ZONE, geometry)



st_write(flood_hazard_small, "flood_hazard_small.shp")

flood_hazard_small <- st_read("flood_hazard_small.shp")


ggplot(flood_hazard_small)+
  geom_sf(aes(fill = FLD_ZONE))+
#  lims(x = c(117.81, 117.76), y = c(34.025, 34.060))+
  geom_sf(data = pomona)

Mapping Populations in Los Angeles

Multi-Geometries

  • Points!
  • Lines!
  • Polygons!

Let’s Look at Intersection

point1 <- st_point(x = c(0, 1))

point2 <- st_point(x = c(2, 3))

point3 <- st_point(x = c(0, 4))


# multiple points
multipoint <- st_multipoint(x = matrix(data = c(1, 1, 0, 0, 2, 0), 
                                 ncol = 2, 
                                 byrow = TRUE))


# multiple polygons
multipolygon <- st_multipolygon(x = list(list(matrix(data = c(0, 1, 0, 0, 1, 2, 0 ,1), 
                                      ncol = 2,
                                      byrow = TRUE)),
                                    list(matrix(data = c(1, 0, 0, 0, 1, 2, 1 ,0), 
                                      ncol = 2,
                                      byrow = TRUE))))

What Are Simple Features

library(sf)

# load a simple features object
nc <- st_read(system.file("shape/nc.shp", package="sf"))

What Are Simple Features?

  • You might notice “CRS” in sf objects
  • What’s this? Coordinate Reference System
  • Projects points on global (or North American) maps
# look at nc
head(nc)

Let’s Examine them

  • What is our object?
class(nc)
  • Can we plot it?
# plot entire dataset
plot(nc)

# plot one column
plot(nc$AREA)
  • What if we wanted to plot one county?

What Are Simple Features?

  • Mapping with ggplot2
  • We can use geom_sf() to plot simple features
library(ggplot2)

ggplot(nc)+
  geom_sf()

What Are Simple Features?

  • In groups: Can you plot nc using ggplot() and geom_sf()?
  • Can you plot just the counties with multiple polygons?
  • Try changing color (fill), line width (lwd), transparency (alpha)

Using Tigris to Gather Shapefiles

  • How would we overlay two types of shapefiles?

Using Tigris to Gather Shapefiles

  • For example, counties and roads in Los Angeles County:
# gather county shapefiles
#water_la <- linear_water(state = "CA", county = "Los Angeles")

Using Tigris to Gather Shapefiles

# plot counties 
#ggplot(counties)+
#  geom_sf()+
#  geom_sf(data = water_la)

Using Tigris to Gather Shapefiles

# it may help to limit data to los angeles county
#county_la <- counties %>%
#  filter(COUNTYFP == "037")


#tracts_la <- tracts(county = "Los Angeles", state = "California")

# plot la county first 
#ggplot(tracts_la)+
#  geom_sf()+
#  geom_sf(data = water_la)

Using Tigris to Gather Shapefiles

# get state of california shape
#ca <- states() %>%
#  filter(NAME == "California")

Air Quality Across Space

  • Air pollution reduces lifespans by approximately 2 years, on average, globally
  • 97% of world population lives in areas with “unsafe” levels of particulate matter (e.g. PM2.5)
  • In the U.S., particulates are estimated to account for 100,000 deaths annually (from pulmonary and cardiac disease)

Air Quality Across Space

  • Gains in recent decades in the U.S. are being undone by wildfires

Social & Spatial Data

In Los Angeles, researchers have estimate a different zone of potential flooding.

To compare official risks to experienced hazards, we’ll need to combine data from different sources. By combining data on flooding risks and spatial demographics, we can investigate what this risk management means for communities.

Los Angeles

First, we gather data on Los Angeles:

Los Angeles

Preparation

This class will use R. We will use the following packages:

install.packages("dplyr")
install.packages("readr")
instll.packages("tigris")
install.packages("ggplot2")
install.packages("sf")
install.packages("magrittr")

library(dplyr)
library(readr)
library(tigris)
library(ggplot2)
library(sf)
library(magrittr)

Alternative Flood Risk Definition

Assignment

Choose a neighborhood or region in Los Angeles, and combine building and flood risk data.

  • Combine your map with the Los Angeles flooding data. Look at who experiences flooding in the following categories:
  • inside official and unofficial floodplains
  • inside official floodplains, outside unofficial floodplains
  • outside official floodplains, inside unofficial floodplains
  • outside official and unofficial floodplains
  • Who experiences which risks? What can you conclude about what will happen in the event of a flood?
  • First, draw how you might communicate these findings. Then try to visually display them using R. What changes?