Lab #4 - Part A

Introduction

The first part of this weeks lab asked for three products to be developed utilizing tornado point and path data for the state of Oklahoma. The three products are as follows:

Maps of tornado paths and touchdown points each year from 2016-2021
Maps of density of tornado touchdown points by county and year
Comparative maps of tornado density based on quantile breaks of classes ranging from 3 to 6

Package and Data Prep

First the packages are called using library().

# load packages ----

library(here)
library(sf)
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
library(RColorBrewer)
library(units)
library(cowplot)

Next, three datasets are imported in: Oklahoma County Boundaries, Tornado Touchdown Points, and Tornado Paths.

# import data files ----

# Oklahoma county boundaries

okcounty <- st_read(here("data", "ok_counties.shp"), 
                    quiet = TRUE)

# Tornado touchdown points

tpoint <- st_read(here("data", "ok_tornado_point.shp"), 
                  quiet = TRUE)

# Tornado path lines

tpath <- st_read(here("data", "ok_tornado_path.shp"),
                 quiet = TRUE)

Finally, to limit the scope of this project, we filter down the tornado paths and points to data from 2016-2021.

# filter tornado data to 2016-2021 ----

tpoint_16_21 <- tpoint %>%
  filter( yr >= 2016 & yr <= 2021) %>%
  select(om, yr, date)

tpath_16_21 <- tpath %>%
  filter(yr >= 2016 & yr <= 2021) %>%
  select(om, yr, date)

1) Mapping Tornado Paths and Points

Two maps, one of tornado paths and the other touchdown points, are generated utilizing color as the attribute to distinguish between each year.

Map of Tornado Paths

The first map is of tornado points. In addition to identifying the year by color, the line width for each path is set to a value of 2 for visual impact. The resulting map is saved to the value tpath_year.

# generate map of tornado paths w/ each year a different color ----

tpath_year <- ggplot() +
  geom_sf(data = tpath_16_21,
          aes(color = as.factor(yr)),
          linewidth = 2) +
  geom_sf(data = okcounty, fill = NA) +
  scale_color_discrete(name = "Year") +
  coord_sf(datum = NA) +
  theme_void()

Map of Tornado Touchdown Points

The second map is of tornado touchdown points. This map is saved to the value tpoint_year.

# generate map of tornado points w/ each year a different color ----

tpoint_year <- ggplot() +
  geom_sf(data = tpoint_16_21,
          aes(color = as.factor(yr))) +
  geom_sf(data = okcounty, fill = NA) +
  scale_color_discrete(name = "Year") +
  coord_sf(datum = NA) +
  theme_void()

Combining into Single Product

The resulting maps of the Tornado Paths and Touchdown Points are combined using plot_grid() to produce a single combined product (Figure 1). From the resulting maps, one can identify an increase in tornado length towards the northeast corner of Oklahoma, as well as, several bands of tornado touchdown points stretching from the southwest corner to the northeast corner. Bands of the same year in a line may be indicative of a weather system with multiple tornadoes.

# generate composite figure of tornado paths and points by year

plot_grid(tpath_year, tpoint_year,
          labels = c("A) Tornado Paths by Year",
                     "B) Tornado Touchdown Points by Year",
                     label_size = 10),
          ncol = 1,
          hjust = 0,
          label_x = 0,
          align = "hv")

Figure 1 - A) Tornado paths by year from 2016-2021. Tornado paths are seen to increase in length towards the northeast corner of Oklahoma. B) Tornado Touchdown Points by year from 2016 to 2021. Several bands of tornado touchdown points can be seen stretching from the southwest corner to the northeast corner of Oklahoma. Bands of the same year in a line may be indicative of a weather system with multiple tornadoes.

2) Tornado Touchdown Density

A faceted plot for each year from 2016-2021 is produced for tornado touchdown point density plot in each county.

Prep Data

A data table countypnt is produced from the spatial joining of the tornado point data with the county boundaries. The geometries are then dropped in preparation for density calculations.

# spatially join tornadoes to counties ----

countypnt <- st_join(tpoint_16_21, okcounty)

# group tornadoes by county and year and summarize; create density columns for each year ----

countypnt <- st_drop_geometry(countypnt)

Calculating Density

To calculate density for each county by year, summary tables are produced using the group_by() function to isolate the county GEOIDs and filter by year. The tornado counts are calculated for each county. This summary table is then joined back to the county boundaries table by associating the GEOIDs. Utilizing mutate(), another column is produced containing a density calculation for each county utilizing area in 1000km^2.

# 2016

county_16_sum <- countypnt %>%
  filter(yr == 2016) %>%
  group_by(GEOID, yr = 2016) %>%
  summarize(tcnt = n())

countymap_16 <- okcounty %>%
  left_join(county_16_sum, by = "GEOID") %>%
  replace_na(list(yr = 2016, tcnt = 0)) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt/area) %>%
  drop_units()

# 2017

county_17_sum <- countypnt %>%
  filter(yr == 2017) %>%
  group_by(GEOID, yr = 2017) %>%
  summarize(tcnt = n())

countymap_17 <- okcounty %>%
  left_join(county_17_sum, by = "GEOID") %>%
  replace_na(list(yr = 2017, tcnt = 0)) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt/area) %>%
  drop_units()

# 2018

county_18_sum <- countypnt %>%
  filter(yr == 2018) %>%
  group_by(GEOID, yr = 2018) %>%
  summarize(tcnt = n())

countymap_18 <- okcounty %>%
  left_join(county_18_sum, by = "GEOID") %>%
  replace_na(list(yr = 2018, tcnt = 0)) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt/area) %>%
  drop_units()

# 2019

county_19_sum <- countypnt %>%
  filter(yr == 2019) %>%
  group_by(GEOID, yr = 2019) %>%
  summarize(tcnt = n())

countymap_19 <- okcounty %>%
  left_join(county_19_sum, by = "GEOID") %>%
  replace_na(list(yr = 2019, tcnt = 0)) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt/area) %>%
  drop_units()

# 2020

county_20_sum <- countypnt %>%
  filter(yr == 2020) %>%
  group_by(GEOID, yr = 2020) %>%
  summarize(tcnt = n())

countymap_20 <- okcounty %>%
  left_join(county_20_sum, by = "GEOID") %>%
  replace_na(list(yr = 2020, tcnt = 0)) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt/area) %>%
  drop_units()

# 2021

county_21_sum <- countypnt %>%
  filter(yr == 2021) %>%
  group_by(GEOID, yr = 2021) %>%
  summarize(tcnt = n())

countymap_21 <- okcounty %>%
  left_join(county_21_sum, by = "GEOID") %>%
  replace_na(list(yr = 2021, tcnt = 0)) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt/area) %>%
  drop_units()

The resulting yearly density tables are then joined utilizing bind_rows() to produce a composite table to allow for faceted plotting.

# make density tables manageable by condensing into one table ----

countymap_comp <- bind_rows(countymap_16, countymap_17, countymap_18,
          countymap_19, countymap_20, countymap_21)

Faceted Map

Utilizing ggplot(), a faceted plot is produced for each year. Figure 2 is a faceted map produced from the tornado densities by county in each year. Of note, 2019 shows a comparatively higher density of tornadoes concentrated towards the northeast with 2020 showing the lowest densities.

# create choropleth density map faceted by year for each county ----

ggplot(data = countymap_comp) +
  geom_sf(aes(fill = tdens)) +
  scale_fill_distiller(name = expression("Tornadoes/1000 km"^2),
                       palette = "YlOrRd",
                       breaks = pretty_breaks(),
                       direction = 1) +
  theme_void() +
  theme(legend.position = "bottom") +
  labs(title = "Tornado Density by County from 2016-2021") +
  facet_wrap(~yr)

Figure 2 - A faceted map produced from the tornado densities by county in each year. 2019 shows a comparitively higher density of tornadoes concentrated towards the northeast with 2020 showing the lowest densities

3) Comparing Quantile Class Breaks on Tornado Density Maps

The final product for Part A focuses on the comparison of Quantile class breaks, ranging from 3 to 6, on tornado densities displayed on maps.

Calculate Density

Utilizing a similar technique to that above, densities are calculated for the full range of years. A joined table is produced of the county boundaries associated with the tornado count and density values.

# group tornadoes by county and summarize ----

countysum <- countypnt %>%
  group_by(GEOID) %>%
  summarize(tcnt = n())

# associate counts to counties; convert NA to zeros; calculate area in 1000 km^2 ----

countymap <- okcounty %>%
  left_join(countysum, by = "GEOID") %>%
  replace(is.na(.), 0) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt/area) %>%
  drop_units()

Create Plots for Individual Classes

Next, plots are produced for each number of classes ranging from 3 to 6. A list of quantile values is produced for each number of classes. mutate() is then used to add a variable to separate teh data by classes. The resulting plot is then named and produced for each number of classes to allow for cowplotting in the following step.

# convert continuous values into discrete classes from 3 to 6 and create choropleth----

numclas <- 3

class_3 <- seq(0, 1, length.out = numclas + 1)

countymap <- countymap %>%
  mutate(tdens_c1 = cut(tdens,
                        breaks = quantile(tdens, probs = class_3),
                        include.lowest = T))

choro_clas_3 <- ggplot(data = countymap) +
  geom_sf(aes(fill = tdens_c1)) +
  scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
                    palette = "YlOrRd") +
  theme_void() +
  theme(legend.position = "right")

numclas <- 4

class_4 <- seq(0, 1, length.out = numclas + 1)

countymap_4_clas <- countymap %>%
  mutate(tdens_c1 = cut(tdens,
                        breaks = quantile(tdens, probs = class_4),
                        include.lowest = T))

choro_clas_4 <- ggplot(data = countymap_4_clas) +
  geom_sf(aes(fill = tdens_c1)) +
  scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
                    palette = "YlOrRd") +
  theme_void() +
  theme(legend.position = "right")

numclas <- 5

class_5 <- seq(0, 1, length.out = numclas + 1)

countymap_5_clas <- countymap %>%
  mutate(tdens_c1 = cut(tdens,
                        breaks = quantile(tdens, probs = class_5),
                        include.lowest = T))

choro_clas_5 <- ggplot(data = countymap_5_clas) +
  geom_sf(aes(fill = tdens_c1)) +
  scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
                    palette = "YlOrRd") +
  theme_void() +
  theme(legend.position = "right")

numclas <- 6

class_6 <- seq(0, 1, length.out = numclas + 1)

countymap_6_clas <- countymap %>%
  mutate(tdens_c1 = cut(tdens,
                        breaks = quantile(tdens, probs = class_6),
                        include.lowest = T))

choro_clas_6 <- ggplot(data = countymap_6_clas) +
  geom_sf(aes(fill = tdens_c1)) +
  scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
                    palette = "YlOrRd") +
  theme_void() +
  theme(legend.position = "right")

Cowplot of Density

Utilizing plot_grid(), the four different class plots are combined for comparison (Figure 3). Of note, as the number of classes increases, the greater distinction is found in the higher densities but at a loss of lower density distinctiveness. 5 classes appears to be the ideal amount to balance.

# use cowplot to make a composite figure of each class----

plot_grid(choro_clas_3, choro_clas_4,
          choro_clas_5, choro_clas_6,
          labels = c("A) Tornado Density - 3 Classes",
                     "B) Tornado Density - 4 Classes",
                     "C) Tornado Density - 5 Classes",
                     "D) Tornado Density - 6 Classes",
                     label_size = 12),
          ncol = 2,
          hjust = 0,
          label_x = 0,
          align = "hv")

Figure 3 - A cowplot of four choropleth plots comparing tornado density displayed using quantile classes ranging from 3 to 6. As the number of classes increases, so does the distinction of higher density counties but with a reduction in distinction at the lower densities. 5 classes appears to be the right amount to balance