During this analysis, we will examine tornado activity in Oklahoma with a various geospatial data sets which cover 67 years of reporting. First, we will generate a map of tornado paths where the paths from each year are displayed as a different color, and create a composite figure containing the map of tornado paths and the map of tornado points. Next, we will summarize the density of tornado points by both county and year and generate a faceted plot that displays maps of county-level tornado density from 2016-2021.And finally, we will generate four choropleth maps of tornado density based on quantile breaks with numbers of classes ranging from 3 to 6, and create a composite figure containing the four maps to examine how the number of classes changes the interpretation of the map.
The first step for this analysis is to load the library packages that will be used throughout the process. The packages are:
tidyverse = Includes the packages that are use in everyday data analyses, such as ggplot2, dplyr, and readr to name a few. Additional details can be found at This Site
cowplot = The cowplot package provides various features that help with creating publication-quality figures, such as a set of themes, functions to align plots and arrange them into complex compound figures, and functions that make it easy to annotate plots and or mix plots with images. Additional details can be found at This Site
RColorBrewer = Provides color schemes for maps (and other graphics) designed by Cynthia Brewer. Additionalinformation on this package can be found at This Site
sf =Support for simple features, a standardized way to encode spatial vector data. Additional information can be located at This Site
# Load necessary R packages
library(tidyverse)
library(cowplot)
library(RColorBrewer)
library(sf)
The data that we will be using for this project consist of three shapefile
Oklahoma Counties = This shapefile contains the geometry (borders) of counties in Oklahoma.
Oklahoma Tornado Paths = This shapefile outlines the paths taken by tornadoes, including start and end points, and the path in between.
Oklahoma Tornado Points = This shapefile represents the reported locations of tornado occurrences, such as points of touchdown or points of significant events along the tornado’s path.
For the analysis in section 1, we will load the data that will be used for all three analysis in section 1 to simplify the process.
# Read the datasets
ok_counties <- st_read("ok_counties.shp", quiet = TRUE)
tornado_points <- st_read("ok_tornado_point.shp", quiet = TRUE)
tornado_paths <- st_read("ok_tornado_path.shp", quiet = TRUE)
First, we ill generate the map for the reported tornado paths broken down by year.
# Generate the map for tornado paths colored by year
tornado_paths_map <- ggplot() +
geom_sf(data = ok_counties, fill = NA, color = "grey") +
geom_sf(data = tornado_paths, aes(color = as.factor(yr)), size = 1) +
scale_color_viridis_d(name = "Year") +
theme_minimal() +
theme(legend.position = "right")
# Display the Tornado Paths plot
tornado_paths_map
And next, generate the map for the reported tornado paths broken down by year.
# Generate the map for tornado points
tornado_points_map <- ggplot() +
geom_sf(data = ok_counties, fill = NA, color = "grey") +
geom_sf(data = tornado_points, aes(color = as.factor(yr))) +
scale_color_viridis_d(name = "Year") +
theme_minimal() +
theme(legend.position = "right")
# Display the Tornado Points plot
tornado_points_map
And finally, a composite of the two products.
# Create a composite figure
composite_figure <- plot_grid(tornado_paths_map, tornado_points_map,
labels = c("Reported Tornado Paths", "Reported Tornado Points"),
ncol = 1,
align = "v")
# Display the composite figure plot
composite_figure
Next, we are going to take Summarize the density of tornado points by both county and year. Then, generate a faceted plot that displays maps of county-level tornado density from 2016-2021.
As seen in the previous analysis, there is 67 years of reported data. When displaying all of the data, the product is cluttered and hard to read. For this analysis, let’s apply a filter of the data to only encompass the years 2016-2021.
# Filter tornado data for years 2016-2021
tpoint_filtered <- tornado_points %>%
filter(yr >= 2016 & yr <= 2021)
Next, a spatial join needs to be performed in order of the tornado points and the Oklahoma County datasets.
# Perform a spatial join of tornado points to counties
tornado_county_join <- st_join(tpoint_filtered, ok_counties, join = st_within)
# Drop geometry for non-spatial operations
tornado_county_no_geom <- st_drop_geometry(tornado_county_join)
# Summarize tornado counts by COUNTYFP and year
tornado_summary <- tornado_county_no_geom %>%
group_by(COUNTYFP, yr) %>%
summarize(tornado_count = n(), .groups = 'drop')
# Join summary back to counties
ok_counties_unique <- ok_counties %>%
select(COUNTYFP, geometry) %>%
distinct()
# Perform a spatial join
county_tornado_summary <- merge(ok_counties_unique, tornado_summary, by = "COUNTYFP")
After the spatial join succeeded, next we need to calculate the square kilometers of each county.
# Calculate the area of each county in square kilometers
county_tornado_summary$area_km2 <- st_area(county_tornado_summary) / 10^6
Next, we are going to summarize the count of tornadoes within each county to get the density of tornados in each county.
# Ensuring there is a tornado count for each county-year combination
county_tornado_summary <- county_tornado_summary %>%
group_by(COUNTYFP) %>%
mutate(tornado_density_km2 = ifelse(is.na(tornado_count), 0, tornado_count / area_km2 * 1000)) %>%
ungroup()
And finally, it is time to create the composite plot of County-Level Tornado Density in Oklahoma (2016-2021).
# Creating the faceted plot
ggplot(data = county_tornado_summary) +
geom_sf(aes(fill = tornado_density_km2)) +
facet_wrap(~ yr, ncol = 3) +
scale_fill_viridis_c(option = "viridis", direction = 1, name = "Tornado Density\n(per 1000 km²)") +
labs(title = "County-Level Tornado Density in Oklahoma (2016-2021)",
subtitle = "Faceted by Year") +
theme_minimal() +
theme(legend.position = "bottom",
strip.background = element_blank(),
strip.text.x = element_text(size = 10, angle = 0, hjust = 0.5))
This analysis combines spatial data manipulation with visualization techniques to provide insights into how tornado occurrences are distributed by density across counties, emphasizing the impact of different classification schemes on the perception of data.
First, we need to make sure the Oklahoma Counties data we are using has valid geometries for the spatial join that needs to occur.
# Making sure 'ok_counties_sf' has valid geometries
ok_counties_sf <- st_make_valid(ok_counties)
After verification of the geometries, it is time to perfom the spatial join between the tornado point data and the Oklahoma Counties data.
# Performing a spatial join to attribute tornado points to counties
tornado_in_county <- st_join(tornado_points, ok_counties_sf, join = st_within)
Next, we need to aggregate the tornado data in each county. This aggregation not only quantifies the tornado activity within each county but also prepares us for merging these insights back into our county dataset.
# Aggregating tornado counts by county
tornado_count <- tornado_in_county %>%
group_by(COUNTYFP) %>%
summarise(tornado_count = n(), .groups = 'drop')
After the aggregation of the data, we prepare our county dataset for merging by ensuring compatible data types. Following a successful merge with the tornado counts, we re-infuse the spatial geometry into our enriched dataset.
# Preparing 'ok_counties_sf' for merging
ok_counties_sf$COUNTYFP <- as.character(ok_counties_sf$COUNTYFP)
# Merge tornado counts back into county data then dropping geometry for the merge and rejoining later ensures no logical vector issue
ok_counties_sf_no_geom <- st_drop_geometry(ok_counties_sf)
ok_counties_merged <- left_join(ok_counties_sf_no_geom, tornado_count, by = "COUNTYFP")
# Adding geometry back to the merged data frame
ok_counties_merged <- st_as_sf(ok_counties_merged, geometry = st_geometry(ok_counties_sf), crs = st_crs(ok_counties_sf))
Now the merge of the data is complete, it is time to start examining tornado density. By calculating the area of each county and the corresponding tornado density, we transform raw counts into meaningful metrics that reflect the spatial extent of tornado occurrences per unit area.
# Calculating area in square kilometers
ok_counties_merged$area_km2 <- as.numeric(st_area(ok_counties_merged)) / 10^6
# Calculating tornado density per 1000 km square and handling NA values from counties without tornadoes
ok_counties_merged$tornado_density <- with(ok_counties_merged, ifelse(is.na(tornado_count), 0, tornado_count) / area_km2 * 1000)
Now that the calculations are complete, it is time to display the data into a series of choropleth maps, each visualizing tornado density with different classification schemes. These maps, ranging from three to six classes, are not mere representations of data; they are a visual exploration of how classification impacts our perception of tornado risk across Oklahoma.
# Generating maps with corrected area and tornado density calculations
plots <- list()
for (num_classes in 3:6) {
ok_counties_merged$tdens_class <- cut_number(ok_counties_merged$tornado_density, n = num_classes, labels = FALSE)
p <- ggplot(data = ok_counties_merged) +
geom_sf(aes(fill = factor(tdens_class)), color = "white") +
scale_fill_brewer(palette = "YlOrRd", name = "Tornado Density\n(Per 1000 km²)") +
labs(title = paste("Tornado Density with", num_classes, "Classes")) +
theme_minimal() +
theme(legend.position = "bottom")
plots[[num_classes-2]] <- p # Adjust index for list
}
# Creating composite figure to display
composite_map <- plot_grid(plotlist = plots, ncol = 2)
# Displaying map
print(composite_map)
This visualization brings together the separate threads of our analysis, presenting a cohesive and comparative view of tornado density across Oklahoma counties.
We have undertaken a comprehensive analysis of tornado activity within Oklahoma, leveraging an extensive dataset that spans 67 years. This investigation commenced with the generation of distinct maps delineating the paths and points of reported tornadoes, each differentiated by color to signify the year of occurrence. Aiming to shed light on the patterns and densities of these natural phenomena, we meticulously compiled a composite figure that merges the visual representations of both tornado paths and points. Delving deeper, we applied a focused lens on the years 2016 to 2021, summarizing tornado point density across counties to produce a faceted plot that vividly illustrates the distribution of tornadoes during this period. Our analytical journey further led us to create four choropleth maps, each categorized by different quantile breaks ranging from 3 to 6 classes. This approach not only highlights the spatial variability of tornado density across Oklahoma but also invites contemplation on how varying classification schemes influence the interpretation of such data. Through these methodical steps, this section not only aims to enhance our understanding of tornado activity’s spatial dynamics but also sets a precedent for employing sophisticated mapping and data visualization techniques in environmental analysis.