The first part of this weeks lab asked for three products to be developed utilizing tornado point and path data for the state of Oklahoma. The three products are as follows:
Maps of tornado paths and touchdown points each year from 2016-2021
Maps of density of tornado touchdown points by county and year
Comparative maps of tornado density based on quantile breaks of classes ranging from 3 to 6
First the packages are called using library().
# load packages ----
library(here)
library(sf)
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
library(RColorBrewer)
library(units)
library(cowplot)
Next, three datasets are imported in: Oklahoma County Boundaries, Tornado Touchdown Points, and Tornado Paths.
# import data files ----
# Oklahoma county boundaries
okcounty <- st_read(here("data", "ok_counties.shp"),
quiet = TRUE)
# Tornado touchdown points
tpoint <- st_read(here("data", "ok_tornado_point.shp"),
quiet = TRUE)
# Tornado path lines
tpath <- st_read(here("data", "ok_tornado_path.shp"),
quiet = TRUE)
Finally, to limit the scope of this project, we filter down the tornado paths and points to data from 2016-2021.
# filter tornado data to 2016-2021 ----
tpoint_16_21 <- tpoint %>%
filter( yr >= 2016 & yr <= 2021) %>%
select(om, yr, date)
tpath_16_21 <- tpath %>%
filter(yr >= 2016 & yr <= 2021) %>%
select(om, yr, date)
Two maps, one of tornado paths and the other touchdown points, are generated utilizing color as the attribute to distinguish between each year.
The first map is of tornado points. In addition to identifying the
year by color, the line width for each path is set to a value of 2 for
visual impact. The resulting map is saved to the value
tpath_year.
# generate map of tornado paths w/ each year a different color ----
tpath_year <- ggplot() +
geom_sf(data = tpath_16_21,
aes(color = as.factor(yr)),
linewidth = 2) +
geom_sf(data = okcounty, fill = NA) +
scale_color_discrete(name = "Year") +
coord_sf(datum = NA) +
theme_void()
The second map is of tornado touchdown points. This map is saved to
the value tpoint_year.
# generate map of tornado points w/ each year a different color ----
tpoint_year <- ggplot() +
geom_sf(data = tpoint_16_21,
aes(color = as.factor(yr))) +
geom_sf(data = okcounty, fill = NA) +
scale_color_discrete(name = "Year") +
coord_sf(datum = NA) +
theme_void()
The resulting maps of the Tornado Paths and Touchdown Points are
combined using plot_grid() to produce a single combined
product (Figure 1). From the resulting maps, one can identify an
increase in tornado length towards the northeast corner of Oklahoma, as
well as, several bands of tornado touchdown points stretching from the
southwest corner to the northeast corner. Bands of the same year in a
line may be indicative of a weather system with multiple tornadoes.
# generate composite figure of tornado paths and points by year
plot_grid(tpath_year, tpoint_year,
labels = c("A) Tornado Paths by Year",
"B) Tornado Touchdown Points by Year",
label_size = 10),
ncol = 1,
hjust = 0,
label_x = 0,
align = "hv")
Figure 1 - A) Tornado paths by year from 2016-2021. Tornado paths are seen to increase in length towards the northeast corner of Oklahoma. B) Tornado Touchdown Points by year from 2016 to 2021. Several bands of tornado touchdown points can be seen stretching from the southwest corner to the northeast corner of Oklahoma. Bands of the same year in a line may be indicative of a weather system with multiple tornadoes.
A faceted plot for each year from 2016-2021 is produced for tornado touchdown point density plot in each county.
A data table countypnt is produced from the spatial
joining of the tornado point data with the county boundaries. The
geometries are then dropped in preparation for density calculations.
# spatially join tornadoes to counties ----
countypnt <- st_join(tpoint_16_21, okcounty)
# group tornadoes by county and year and summarize; create density columns for each year ----
countypnt <- st_drop_geometry(countypnt)
To calculate density for each county by year, summary tables are
produced using the group_by() function to isolate the
county GEOIDs and filter by year. The tornado counts are calculated for
each county. This summary table is then joined back to the county
boundaries table by associating the GEOIDs. Utilizing
mutate(), another column is produced containing a density
calculation for each county utilizing area in 1000km^2.
# 2016
county_16_sum <- countypnt %>%
filter(yr == 2016) %>%
group_by(GEOID, yr = 2016) %>%
summarize(tcnt = n())
countymap_16 <- okcounty %>%
left_join(county_16_sum, by = "GEOID") %>%
replace_na(list(yr = 2016, tcnt = 0)) %>%
mutate(area = st_area(okcounty),
tdens = 10^6 * 10^3 * tcnt/area) %>%
drop_units()
# 2017
county_17_sum <- countypnt %>%
filter(yr == 2017) %>%
group_by(GEOID, yr = 2017) %>%
summarize(tcnt = n())
countymap_17 <- okcounty %>%
left_join(county_17_sum, by = "GEOID") %>%
replace_na(list(yr = 2017, tcnt = 0)) %>%
mutate(area = st_area(okcounty),
tdens = 10^6 * 10^3 * tcnt/area) %>%
drop_units()
# 2018
county_18_sum <- countypnt %>%
filter(yr == 2018) %>%
group_by(GEOID, yr = 2018) %>%
summarize(tcnt = n())
countymap_18 <- okcounty %>%
left_join(county_18_sum, by = "GEOID") %>%
replace_na(list(yr = 2018, tcnt = 0)) %>%
mutate(area = st_area(okcounty),
tdens = 10^6 * 10^3 * tcnt/area) %>%
drop_units()
# 2019
county_19_sum <- countypnt %>%
filter(yr == 2019) %>%
group_by(GEOID, yr = 2019) %>%
summarize(tcnt = n())
countymap_19 <- okcounty %>%
left_join(county_19_sum, by = "GEOID") %>%
replace_na(list(yr = 2019, tcnt = 0)) %>%
mutate(area = st_area(okcounty),
tdens = 10^6 * 10^3 * tcnt/area) %>%
drop_units()
# 2020
county_20_sum <- countypnt %>%
filter(yr == 2020) %>%
group_by(GEOID, yr = 2020) %>%
summarize(tcnt = n())
countymap_20 <- okcounty %>%
left_join(county_20_sum, by = "GEOID") %>%
replace_na(list(yr = 2020, tcnt = 0)) %>%
mutate(area = st_area(okcounty),
tdens = 10^6 * 10^3 * tcnt/area) %>%
drop_units()
# 2021
county_21_sum <- countypnt %>%
filter(yr == 2021) %>%
group_by(GEOID, yr = 2021) %>%
summarize(tcnt = n())
countymap_21 <- okcounty %>%
left_join(county_21_sum, by = "GEOID") %>%
replace_na(list(yr = 2021, tcnt = 0)) %>%
mutate(area = st_area(okcounty),
tdens = 10^6 * 10^3 * tcnt/area) %>%
drop_units()
The resulting yearly density tables are then joined utilizing
bind_rows() to produce a composite table to allow for
faceted plotting.
# make density tables manageable by condensing into one table ----
countymap_comp <- bind_rows(countymap_16, countymap_17, countymap_18,
countymap_19, countymap_20, countymap_21)
Utilizing ggplot(), a faceted plot is produced for each
year. Figure 2 is a faceted map produced from the tornado densities by
county in each year. Of note, 2019 shows a comparatively higher density
of tornadoes concentrated towards the northeast with 2020 showing the
lowest densities.
# create choropleth density map faceted by year for each county ----
ggplot(data = countymap_comp) +
geom_sf(aes(fill = tdens)) +
scale_fill_distiller(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd",
breaks = pretty_breaks(),
direction = 1) +
theme_void() +
theme(legend.position = "bottom") +
labs(title = "Tornado Density by County from 2016-2021") +
facet_wrap(~yr)
Figure 2 - A faceted map produced from the tornado densities by county in each year. 2019 shows a comparitively higher density of tornadoes concentrated towards the northeast with 2020 showing the lowest densities
The final product for Part A focuses on the comparison of Quantile class breaks, ranging from 3 to 6, on tornado densities displayed on maps.
Utilizing a similar technique to that above, densities are calculated for the full range of years. A joined table is produced of the county boundaries associated with the tornado count and density values.
# group tornadoes by county and summarize ----
countysum <- countypnt %>%
group_by(GEOID) %>%
summarize(tcnt = n())
# associate counts to counties; convert NA to zeros; calculate area in 1000 km^2 ----
countymap <- okcounty %>%
left_join(countysum, by = "GEOID") %>%
replace(is.na(.), 0) %>%
mutate(area = st_area(okcounty),
tdens = 10^6 * 10^3 * tcnt/area) %>%
drop_units()
Next, plots are produced for each number of classes ranging from 3 to
6. A list of quantile values is produced for each number of classes.
mutate() is then used to add a variable to separate teh
data by classes. The resulting plot is then named and produced for each
number of classes to allow for cowplotting in the following step.
# convert continuous values into discrete classes from 3 to 6 and create choropleth----
numclas <- 3
class_3 <- seq(0, 1, length.out = numclas + 1)
countymap <- countymap %>%
mutate(tdens_c1 = cut(tdens,
breaks = quantile(tdens, probs = class_3),
include.lowest = T))
choro_clas_3 <- ggplot(data = countymap) +
geom_sf(aes(fill = tdens_c1)) +
scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd") +
theme_void() +
theme(legend.position = "right")
numclas <- 4
class_4 <- seq(0, 1, length.out = numclas + 1)
countymap_4_clas <- countymap %>%
mutate(tdens_c1 = cut(tdens,
breaks = quantile(tdens, probs = class_4),
include.lowest = T))
choro_clas_4 <- ggplot(data = countymap_4_clas) +
geom_sf(aes(fill = tdens_c1)) +
scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd") +
theme_void() +
theme(legend.position = "right")
numclas <- 5
class_5 <- seq(0, 1, length.out = numclas + 1)
countymap_5_clas <- countymap %>%
mutate(tdens_c1 = cut(tdens,
breaks = quantile(tdens, probs = class_5),
include.lowest = T))
choro_clas_5 <- ggplot(data = countymap_5_clas) +
geom_sf(aes(fill = tdens_c1)) +
scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd") +
theme_void() +
theme(legend.position = "right")
numclas <- 6
class_6 <- seq(0, 1, length.out = numclas + 1)
countymap_6_clas <- countymap %>%
mutate(tdens_c1 = cut(tdens,
breaks = quantile(tdens, probs = class_6),
include.lowest = T))
choro_clas_6 <- ggplot(data = countymap_6_clas) +
geom_sf(aes(fill = tdens_c1)) +
scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd") +
theme_void() +
theme(legend.position = "right")
Utilizing plot_grid(), the four different class plots
are combined for comparison (Figure 3). Of note, as the number of
classes increases, the greater distinction is found in the higher
densities but at a loss of lower density distinctiveness. 5 classes
appears to be the ideal amount to balance.
# use cowplot to make a composite figure of each class----
plot_grid(choro_clas_3, choro_clas_4,
choro_clas_5, choro_clas_6,
labels = c("A) Tornado Density - 3 Classes",
"B) Tornado Density - 4 Classes",
"C) Tornado Density - 5 Classes",
"D) Tornado Density - 6 Classes",
label_size = 12),
ncol = 2,
hjust = 0,
label_x = 0,
align = "hv")
Figure 3 - A cowplot of four choropleth plots comparing tornado density displayed using quantile classes ranging from 3 to 6. As the number of classes increases, so does the distinction of higher density counties but with a reduction in distinction at the lower densities. 5 classes appears to be the right amount to balance