Load Library Packages

The first step for this analysis is to load the library packages that will be used throughout the process. The packages are:

tidyverse = Includes the packages that are use in everyday data analyses, such as ggplot2, dplyr, and readr to name a few. Additional details can be found at This Site

cowplot = The cowplot package provides various features that help with creating publication-quality figures, such as a set of themes, functions to align plots and arrange them into complex compound figures, and functions that make it easy to annotate plots and or mix plots with images. Additional details can be found at This Site

RColorBrewer = Provides color schemes for maps (and other graphics) designed by Cynthia Brewer. Additionalinformation on this package can be found at This Site

sf =Support for simple features, a standardized way to encode spatial vector data. Additional information can be located at This Site

# Load necessary R packages

library(tidyverse)
library(cowplot)
library(RColorBrewer)
library(sf)

Data Overview

The data that we will be using for this project consist of three shapefile

Oklahoma Counties = This shapefile contains the geometry (borders) of counties in Oklahoma.

Oklahoma Tornado Paths = This shapefile outlines the paths taken by tornadoes, including start and end points, and the path in between.

Oklahoma Tornado Points = This shapefile represents the reported locations of tornado occurrences, such as points of touchdown or points of significant events along the tornado’s path.

Data Loading

For the analysis in section 1, we will load the data that will be used for all three analysis in section 1 to simplify the process.

# Read the datasets

ok_counties <- st_read("ok_counties.shp", quiet = TRUE)
tornado_points <- st_read("ok_tornado_point.shp", quiet = TRUE)
tornado_paths <- st_read("ok_tornado_path.shp", quiet = TRUE)

Analysis

1. Generate a Composit Map of Tornado Paths and Points for All Years Reported

First, we ill generate the map for the reported tornado paths broken down by year.

# Generate the map for tornado paths colored by year

tornado_paths_map <- ggplot() +
  geom_sf(data = ok_counties, fill = NA, color = "grey") +
  geom_sf(data = tornado_paths, aes(color = as.factor(yr)), size = 1) +
  scale_color_viridis_d(name = "Year") +
  theme_minimal() +
  theme(legend.position = "right")

# Display the Tornado Paths plot

tornado_paths_map

And next, generate the map for the reported tornado paths broken down by year.

# Generate the map for tornado points

tornado_points_map <- ggplot() +
  geom_sf(data = ok_counties, fill = NA, color = "grey") +
  geom_sf(data = tornado_points, aes(color = as.factor(yr))) +
  scale_color_viridis_d(name = "Year") +
  theme_minimal() +
  theme(legend.position = "right")

# Display the Tornado Points  plot

tornado_points_map

And finally, a composite of the two products.

# Create a composite figure

composite_figure <- plot_grid(tornado_paths_map, tornado_points_map, 
                              labels = c("Reported Tornado Paths", "Reported Tornado Points"), 
                              ncol = 1, 
                              align = "v")

# Display the composite figure plot

composite_figure

2. Summarize The Density of Tornado Points by Both County and Year

Next, we are going to take Summarize the density of tornado points by both county and year. Then, generate a faceted plot that displays maps of county-level tornado density from 2016-2021.

As seen in the previous analysis, there is 67 years of reported data. When displaying all of the data, the product is cluttered and hard to read. For this analysis, let’s apply a filter of the data to only encompass the years 2016-2021.

# Filter tornado data for years 2016-2021

tpoint_filtered <- tornado_points  %>%
  filter(yr >= 2016 & yr <= 2021)

Next, a spatial join needs to be performed in order of the tornado points and the Oklahoma County datasets.

# Perform a spatial join of tornado points to counties

tornado_county_join <- st_join(tpoint_filtered, ok_counties, join = st_within)

# Drop geometry for non-spatial operations

tornado_county_no_geom <- st_drop_geometry(tornado_county_join)

# Summarize tornado counts by COUNTYFP and year

tornado_summary <- tornado_county_no_geom %>%
  group_by(COUNTYFP, yr) %>%
  summarize(tornado_count = n(), .groups = 'drop')

# Join summary back to counties

ok_counties_unique <- ok_counties %>%
  select(COUNTYFP, geometry) %>%
  distinct()

# Perform a spatial join 

county_tornado_summary <- merge(ok_counties_unique, tornado_summary, by = "COUNTYFP")

After the spatial join succeeded, next we need to calculate the square kilometers of each county.

# Calculate the area of each county in square kilometers

county_tornado_summary$area_km2 <- st_area(county_tornado_summary) / 10^6

Next, we are going to summarize the count of tornadoes within each county to get the density of tornados in each county.

# Ensuring there is a tornado count for each county-year combination

county_tornado_summary <- county_tornado_summary %>%
  group_by(COUNTYFP) %>%
  mutate(tornado_density_km2 = ifelse(is.na(tornado_count), 0, tornado_count / area_km2 * 1000)) %>%
  ungroup()

And finally, it is time to create the composite plot of County-Level Tornado Density in Oklahoma (2016-2021).

# Creating the faceted plot

ggplot(data = county_tornado_summary) +
  geom_sf(aes(fill = tornado_density_km2)) +
  facet_wrap(~ yr, ncol = 3) +
  scale_fill_viridis_c(option = "viridis", direction = 1, name = "Tornado Density\n(per 1000 km²)") +
  labs(title = "County-Level Tornado Density in Oklahoma (2016-2021)", 
       subtitle = "Faceted by Year") +
  theme_minimal() +
  theme(legend.position = "bottom", 
        strip.background = element_blank(),
        strip.text.x = element_text(size = 10, angle = 0, hjust = 0.5))

3. Generate Four Choropleth Maps of Tornado Density Based on Quantile Breaks

This analysis combines spatial data manipulation with visualization techniques to provide insights into how tornado occurrences are distributed by density across counties, emphasizing the impact of different classification schemes on the perception of data.

First, we need to make sure the Oklahoma Counties data we are using has valid geometries for the spatial join that needs to occur.

# Making sure 'ok_counties_sf' has valid geometries

ok_counties_sf <- st_make_valid(ok_counties)

After verification of the geometries, it is time to perfom the spatial join between the tornado point data and the Oklahoma Counties data.

# Performing a spatial join to attribute tornado points to counties

tornado_in_county <- st_join(tornado_points, ok_counties_sf, join = st_within)

Next, we need to aggregate the tornado data in each county. This aggregation not only quantifies the tornado activity within each county but also prepares us for merging these insights back into our county dataset.

# Aggregating tornado counts by county

tornado_count <- tornado_in_county %>%
  group_by(COUNTYFP) %>%
  summarise(tornado_count = n(), .groups = 'drop')

After the aggregation of the data, we prepare our county dataset for merging by ensuring compatible data types. Following a successful merge with the tornado counts, we re-infuse the spatial geometry into our enriched dataset.

# Preparing 'ok_counties_sf' for merging

ok_counties_sf$COUNTYFP <- as.character(ok_counties_sf$COUNTYFP)

# Merge tornado counts back into county data then dropping geometry for the merge and rejoining later ensures no logical vector issue

ok_counties_sf_no_geom <- st_drop_geometry(ok_counties_sf)
ok_counties_merged <- left_join(ok_counties_sf_no_geom, tornado_count, by = "COUNTYFP")

# Adding geometry back to the merged data frame

ok_counties_merged <- st_as_sf(ok_counties_merged, geometry = st_geometry(ok_counties_sf), crs = st_crs(ok_counties_sf))

Now the merge of the data is complete, it is time to start examining tornado density. By calculating the area of each county and the corresponding tornado density, we transform raw counts into meaningful metrics that reflect the spatial extent of tornado occurrences per unit area.

# Calculating area in square kilometers

ok_counties_merged$area_km2 <- as.numeric(st_area(ok_counties_merged)) / 10^6

# Calculating tornado density per 1000 km square and handling NA values from counties without tornadoes

ok_counties_merged$tornado_density <- with(ok_counties_merged, ifelse(is.na(tornado_count), 0, tornado_count) / area_km2 * 1000)

Now that the calculations are complete, it is time to display the data into a series of choropleth maps, each visualizing tornado density with different classification schemes. These maps, ranging from three to six classes, are not mere representations of data; they are a visual exploration of how classification impacts our perception of tornado risk across Oklahoma.

# Generating maps with corrected area and tornado density calculations

plots <- list()
for (num_classes in 3:6) {
  ok_counties_merged$tdens_class <- cut_number(ok_counties_merged$tornado_density, n = num_classes, labels = FALSE)
  
  p <- ggplot(data = ok_counties_merged) +
    geom_sf(aes(fill = factor(tdens_class)), color = "white") +
    scale_fill_brewer(palette = "YlOrRd", name = "Tornado Density\n(Per 1000 km²)") +
    labs(title = paste("Tornado Density with", num_classes, "Classes")) +
    theme_minimal() +
    theme(legend.position = "bottom")
  
  plots[[num_classes-2]] <- p # Adjust index for list
}

# Creating composite figure to display

composite_map <- plot_grid(plotlist = plots, ncol = 2)

# Displaying map

print(composite_map)

This visualization brings together the separate threads of our analysis, presenting a cohesive and comparative view of tornado density across Oklahoma counties.

Conclusion

We have undertaken a comprehensive analysis of tornado activity within Oklahoma, leveraging an extensive dataset that spans 67 years. This investigation commenced with the generation of distinct maps delineating the paths and points of reported tornadoes, each differentiated by color to signify the year of occurrence. Aiming to shed light on the patterns and densities of these natural phenomena, we meticulously compiled a composite figure that merges the visual representations of both tornado paths and points. Delving deeper, we applied a focused lens on the years 2016 to 2021, summarizing tornado point density across counties to produce a faceted plot that vividly illustrates the distribution of tornadoes during this period. Our analytical journey further led us to create four choropleth maps, each categorized by different quantile breaks ranging from 3 to 6 classes. This approach not only highlights the spatial variability of tornado density across Oklahoma but also invites contemplation on how varying classification schemes influence the interpretation of such data. Through these methodical steps, this section not only aims to enhance our understanding of tornado activity’s spatial dynamics but also sets a precedent for employing sophisticated mapping and data visualization techniques in environmental analysis.

Mapping of Tornados in Oklahoma

Erik Reid

2024-03-19

Introduction