Introduction

Tornadoes are considered one of the most dangerous and unpredictable weather events causing damage or worse death. Oklahoma state has had a complex history of tornadoes with the most deadly on record in 1947, the Woodward tornado with 116 fatalities and 782 injuries with severe damage to property Woodward Tornado, Oklahoma 1947, F5 category. Tornadoes can produce multiple storms extending for miles across the state in well-defined damage paths that may or may not meet up to form stronger super tornadoe paths. The Woodward tornado ranked a F5 category (violent): 261-318 mph, (rare) incredible damage [Woodward F5 Category] (https://www.weather.gov/oun/tornadodata-county-ok)

The data set used in the analysis is a historic data set of tornadoes in the state of Oklahoma spanning the years 1950 to 2021. The objective of this analysis is to map tornadoes for years 2016 to 2021. The tornado data includes a number of variable columns with information including tornado touchdown point locations defining tornado paths as the storm migrates.

Study objectives

Task 1 - Generate a map for tornado paths where the paths from each year are displayed as a different color (see Figure 5.4). Create a map with both tornado paths and tornado points with plot_grid()

Task 2 - Summarize the density of tornado points by both country and year and generate a faceted plot maps of country -level tornado density from 2016-2021

Task 3 - Generate four chloropleth maps of tornado density based on quantile breaks with numbers of classes ranging from 3-6. Create a composite figure containing the four maps using plot_grid() and examine how the number fo classes changes the interpretation.

Data Preparation

# *Load packages required for the study*

library(sf)
library(rgdal)
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
library(RColorBrewer)
library(units)
library(cowplot)
library(tidyverse)
library(here)
# *Import multiple file format data using st_read to import shapefiles to sf objects and assign objects names*

okcounty <- st_read("ok_counties.shp", quiet = TRUE)  #*Import county boundaries for OK state*
tpoint <- st_read("ok_tornado_point.shp", quiet = TRUE) #*Import tornado touchdown data*
tpath <- st_read("ok_tornado_path.shp", quiet = TRUE) #*Import tornado paths after touchdown*
class(okcounty)
## [1] "sf"         "data.frame"
glimpse(okcounty) # *Take a look at variable types in data set*
## Rows: 77
## Columns: 8
## $ STATEFP  <chr> "40", "40", "40", "40", "40", "40", "40", "40", "40", "40", "…
## $ COUNTYFP <chr> "077", "025", "011", "107", "105", "153", "001", "053", "059"…
## $ COUNTYNS <chr> "01101826", "01101800", "01101793", "01101841", "01101840", "…
## $ AFFGEOID <chr> "0500000US40077", "0500000US40025", "0500000US40011", "050000…
## $ GEOID    <chr> "40077", "40025", "40011", "40107", "40105", "40153", "40001"…
## $ NAME     <chr> "Latimer", "Cimarron", "Blaine", "Okfuskee", "Nowata", "Woodw…
## $ LSAD     <chr> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "…
## $ geometry <POLYGON [°]> POLYGON ((-95.50766 35.0292..., POLYGON ((-103.0025 3…

Data Analysis

okcounty_sp <- as(okcounty, 'Spatial') 
class(okcounty_sp) # *Convert geospatial tornado data from an sf object to an sp object for increased functionality*
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
okcounty_sf <- st_as_sf(okcounty_sp)
class(okcounty_sf) # *Convert geospatial data from an sp object back to an sf object*
## [1] "sf"         "data.frame"
names(tpoint) # *Examine the column names for the tornado year (yr) variable*
##  [1] "om"       "yr"       "mo"       "dy"       "date"     "time"    
##  [7] "tz"       "st"       "stf"      "stn"      "mag"      "inj"     
## [13] "fat"      "loss"     "closs"    "slat"     "slon"     "elat"    
## [19] "elon"     "len"      "wid"      "fc"       "geometry"
tpoint_16_21 <- tpoint %>%
  filter(yr >= 2016 & yr <= 2021) %>%
  select(om, yr, date)
tpath_16_21 <- tpath %>%
  filter(yr >= 2016 & yr <= 2021) %>%
  select(om, yr, date) #*Filter the yr variable for the years of interest*

Task 1 - Generate a map for tornado paths where the paths from each year are displayed as a different color with plot_grid()

library(ggplot2)
library(cowplot)
library(RColorBrewer)

p1 <- ggplot() +
      geom_sf(data = okcounty, fill = NA) +
      geom_sf(data = tpath_16_21, aes(color = as.factor(yr), lwd = 0.25))

p2 <- ggplot() +
    geom_sf(data = okcounty, fill = NA) +
    geom_sf(data = tpoint_16_21, 
          aes(color = as.factor(yr)))

# arrange two plots into one column
plot_grid(
  p1, p2,
  labels = "AUTO", ncol = 1
)

# add a single title for the composite plot

title <- ggdraw() + 
  draw_label(
    "Oklahoma Tornadoes between 2016 and 2021",
    fontface = 'bold',
    x = 0,
    hjust = 0)

Figure 1. Composite map of Oklahoma state between 2016 and 2021 showing tornado paths colored by year (A) and tornado points colored by year (B) using plot_grid(). General trends are visible in both maps but in general the images are difficult to interpret as far as risk to counties.

Task 2 - Summarize the density of tornado points by both country and year and generate a faceted plot maps of country -level tornado density from 2016-2021

Explore the distribution of tornado touchdowns by year

ggplot() +
  geom_sf(data = okcounty, 
          fill = NA, 
          color = "gray") +
  geom_sf(data = tpoint_16_21, size = 0.75) +
  facet_wrap(vars(yr), ncol = 2) + # *facet wrap to display tornadoes for each year*
  coord_sf(datum = NA) +
  theme_void()

Figure 2. Tornado initiation points of tornadoes in Oklahoma from 2016 and 2021 with facets mapped by year. Tornado touchdown point patterns vary over the years. There was increased activity in 2019 shown as a clustering of tornado touchdowns in central and northeastern part of the state, In 2021 the points form a well-defined east-northeast trend.

countypnt <- st_join(tpoint_16_21, okcounty) #* Create a joined data frame to leverage*
glimpse(countypnt) #* Take a look at the new data frame variables*
## Rows: 434
## Columns: 11
## $ om       <dbl> 613662, 613675, 613676, 613677, 613678, 613727, 613728, 61372…
## $ yr       <dbl> 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2…
## $ date     <chr> "2016-03-23", "2016-03-30", "2016-03-30", "2016-03-30", "2016…
## $ STATEFP  <chr> "40", "40", "40", "40", "40", "40", "40", "40", "40", "40", "…
## $ COUNTYFP <chr> "001", "113", "105", "131", "035", "139", "139", "139", "139"…
## $ COUNTYNS <chr> "01101788", "01101844", "01101840", "01101853", "01101805", "…
## $ AFFGEOID <chr> "0500000US40001", "0500000US40113", "0500000US40105", "050000…
## $ GEOID    <chr> "40001", "40113", "40105", "40131", "40035", "40139", "40139"…
## $ NAME     <chr> "Adair", "Osage", "Nowata", "Rogers", "Craig", "Texas", "Texa…
## $ LSAD     <chr> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "…
## $ geometry <POINT [°]> POINT (-94.5042 35.6916), POINT (-96.0151 36.2151), POI…

The numbers of tornadoes in each county can be calculated using summarize() with the n() function to count the records in each group. Before grouping and summarizing, countypnt must be converted from an sf object to a normal data frame using st_drop_geometry().

Group the data by GEOID and by Year and then summarize tornado touchdowns at county-level

countypnt <- st_drop_geometry(countypnt)
countysum <- countypnt %>%
  group_by(GEOID) %>% #* To compute total tornadoes per county - group by GEOID county code*
  summarize(tcnt = n()) #* Use summarize() with the n() function to count tornadoes in each group*
glimpse(countysum)
## Rows: 75
## Columns: 2
## $ GEOID <chr> "40001", "40005", "40007", "40009", "40011", "40013", "40015", "…
## $ tcnt  <int> 6, 3, 4, 8, 1, 4, 10, 5, 7, 5, 3, 12, 10, 5, 5, 1, 7, 9, 7, 8, 2…
countymap <- okcounty %>%
  left_join(countysum, by = "GEOID") %>% #*Join okcounty to countysum to match county polygon with correct summary; left_join ensures all county polygons are retained in output*
  replace(is.na(.), 0) %>% #* to deal with missing tornado records in some counties*
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt / area) %>% #*Computes the area of each county use st_area*
  drop_units() #* Units are dropped for simplicity*
glimpse(countymap)
## Rows: 77
## Columns: 11
## $ STATEFP  <chr> "40", "40", "40", "40", "40", "40", "40", "40", "40", "40", "…
## $ COUNTYFP <chr> "077", "025", "011", "107", "105", "153", "001", "053", "059"…
## $ COUNTYNS <chr> "01101826", "01101800", "01101793", "01101841", "01101840", "…
## $ AFFGEOID <chr> "0500000US40077", "0500000US40025", "0500000US40011", "050000…
## $ GEOID    <chr> "40077", "40025", "40011", "40107", "40105", "40153", "40001"…
## $ NAME     <chr> "Latimer", "Cimarron", "Blaine", "Okfuskee", "Nowata", "Woodw…
## $ LSAD     <chr> "06", "06", "06", "06", "06", "06", "06", "06", "06", "06", "…
## $ tcnt     <dbl> 1, 12, 1, 10, 6, 2, 6, 0, 4, 9, 3, 10, 12, 1, 2, 8, 13, 4, 5,…
## $ geometry <POLYGON [°]> POLYGON ((-95.50766 35.0292..., POLYGON ((-103.0025 3…
## $ area     <dbl> 1890663261, 4766283042, 2427121029, 1657249513, 1503893122, 3…
## $ tdens    <dbl> 0.5289149, 2.5176851, 0.4120108, 6.0340944, 3.9896452, 0.6197…
st_write(countymap, 
         dsn = "oktornadosum.shp",  #* Save newly joined file as a shapefile*
         append = FALSE) #* overwrites the shapefile if it exists*
## Deleting layer `oktornadosum' using driver `ESRI Shapefile'
## Writing layer `oktornadosum' to data source 
##   `oktornadosum.shp' using driver `ESRI Shapefile'
## Writing 77 features with 10 fields and geometry type Polygon.
ggplot(data = countymap) + 
  geom_sf(aes(fill = tdens)) + #*County-level tornado density*
  theme_void() #*Theme function fo eliminate the axes and graticules to focus on map data*

Figure 3. Oklahoma state tornadoes between 2016 and 2021 choropleth map where the tornado density by county can be easily visualized through the dark-to-light blue color ramp. By default, the fill colors are based on a dark-to-light blue color ramp. The counties highest hit by tornado touchdowns follow an east-northeast orientation.

Task 3 - Generate four chloropleth maps of tornado density based on quantile breaks with numbers of classes ranging from 3-6. Create a composite figure containing the four maps using plot_grid() and examine how the number fo classes changes the interpretation.

Step 1 Join the two datasets to leverage the variable contents and aggregate the count data

countypnt <- st_join(tpoint_16_21, okcounty) #* Join the tornado point data with the okcounty polygon*
countypnt_yr <- st_drop_geometry(countypnt)
countysum_yr <- countypnt_yr %>% #* Group by GEOID and yr*
  group_by(GEOID, yr) %>%
  summarise(tcnt = n()) %>%
  aggregate(tcnt~yr + GEOID, FUN = sum, na.rm = TRUE) #*Aggregate the tornado count data*
## `summarise()` has grouped output by 'GEOID'. You can override using the
## `.groups` argument.
glimpse(countysum_yr) #* Take a peak at the data*
## Rows: 217
## Columns: 3
## $ yr    <dbl> 2016, 2017, 2019, 2020, 2021, 2016, 2019, 2017, 2018, 2019, 2017…
## $ GEOID <chr> "40001", "40001", "40001", "40001", "40001", "40005", "40005", "…
## $ tcnt  <int> 1, 1, 1, 1, 2, 2, 1, 1, 1, 2, 3, 1, 3, 1, 1, 1, 1, 2, 1, 5, 4, 1…

Step 2 Rejoin the aggregated tornado counts by county with the county map polygons

  #* Join the summarized tornado count date with the county map polygons*
countymapjoin <- okcounty %>%
  left_join(countysum_yr, by = "GEOID") %>% #*Summarized data is rejoined to county shapefile to access geometry variable*
  replace(is.na(.), 0) %>%
  mutate(area = st_area(.),
         tdens2 = 10^6 * 10^3 * tcnt/area) %>% #* Calculate density at county-level*
         drop_units() #* Units are dropped to simplify

Step 3 Create the final composite choropleth plot

ggplot() +
  geom_sf(data = okcounty, 
          fill = NA, 
          color = "grey") +
  geom_sf(data = countymapjoin, aes(fill = tdens2)) +
  facet_wrap(vars(yr), ncol = 2) +
  coord_sf(datum = NA) +
  scale_fill_distiller(name = expression ("Tornadoes per 1000 km squared
                        using quantile breaks"),
                       palette = "YlOrRd",
                        breaks = pretty_breaks(),
                        direction = 1) +
  theme_void()

Figure 4. Oklahoma tornado density at county-level based on quantile breaks (equal count classification) between 2016 and 2021. The tornado density can be easily visualized through the yellow-orange-red color ramp.

Conclusion

The composite map showing both of tornado paths color coded by year and tornado touchdown points color coded by year is difficult to interpret the meaning. Trends and clusters are easily identified. Tow tornado path trends were identified. One along a ENE-trend and the second along a ENE-trend. Clusters of tornado touchdowns are more obvious in the northeastern part of the state (Figure 1). Overlaying the two maps is a useful to show clustering of tornado touchdowns in each path that appear wider in areas of larger clusters.

The tornado touchdown density map faceted by year and colored by a black color was more useful to identify obvious trends and clusters of tornado touchdowns to be able to quickly assess which county might have the worst damage or worse deaths (Figure 2).

The quantile choropleth map presenting tornado density at the county-level (Figure 4) needs to be carefully considered during interpretation. A quantile classification takes the number of features (in this case tornado touchdowns and divides the total by the number of classes). From the composite image in Figure 4 it looks like the greatest activity for tornado touchdowns was in central and northeastern Oklahoma in 2019. When comparing this result to the tornado density map (Figure 2) there is clustering of tornado touchdowns in central and northeastern part of the state in 2019. The choropleth becomes more useful using 6 classes with those counties clearly visible in red and orange with th most activity in 2019. The maps showing 1 to 3 classes (in yellow) appear to blend together.