Part A

This portion of the lesson continues exploring tornado occurrences in Oklahoma.The data includes tornado touch down points and tornado paths from 2016 to 2017.As provided by the text from this week’s lesson, Wimberly (2023) Chapter 5 on Vector Geospatial Data, this data was derived from larger, national-level datasets generated by the National Oceanographic and Atmospheric Administration (NOAA) National Weather Service Storm Prediction Center.

Question 1

Generate a map of tornado paths where the paths from each year are displayed as a different color, similar to the map of tornado points in Figure 5.4. Create a composite figure containing the map of tornado paths and the map of tornado points using plot_grid().

Intial data wrangling

To begin I extract the values I want to map. The yr column indicates the year of each tornado and can be used to filter the data to a smaller year range from 2016-2021. This column is retained in the output along with om, a unique ID code, and date, the date of each tornado.

#read in datasets from chapter 5 of tutorial
okcounty <- st_read(here("Chapter5/ok_counties.shp"), quiet = TRUE)
tpoint <- st_read(here("Chapter5/ok_tornado_point.shp"), quiet = TRUE)
tpath <- st_read(here("Chapter5/ok_tornado_path.shp"), quiet = TRUE)

#filter years, Initiation points and paths of tornadoes in Oklahoma from 2016-2021.
tpoint_16_21 <- tpoint %>%
  filter(yr >= 2016 & yr <= 2021) %>%
  select(om, yr, date)
tpath_16_21 <- tpath %>%
  filter(yr >= 2016 & yr <= 2021) %>%
  select(om, yr, date)

Now the tornado paths and points are ready to map. We are including all years here mapped with the context of Oklahoma’s counties.

# Initiation points of tornadoes in Oklahoma from 2016-2021 with years represented by the color aesthetic.
pathsbyyear <- ggplot() +
  geom_sf(data = tpath_16_21, 
         aes(color = as.factor(yr)), lwd = 1.5) +
  geom_sf(data = okcounty, fill = NA) +
  scale_color_discrete(name = "Year") +
  coord_sf(datum = NA) +
  #geom_sf(linewidth = 5) +
  theme_void()

pointsbyyear <- ggplot() +
  geom_sf(data = tpoint_16_21, 
          aes(color = as.factor(yr))) +
  geom_sf(data = okcounty, fill = NA) +
  scale_color_discrete(name = "Year") +
  coord_sf(datum = NA) +
  theme_void()

#composite
plot_grid(pointsbyyear, pathsbyyear, 
          labels = c("A) Initiation Points of tornadoes by Year", 
                     "B) Tornado Paths by Year",
                     label_size = 12),
          ncol = 1, 
          hjust = 0, 
          label_x = 0, 
          align = "hv")

Figure 1. Tornado Initiation Points and Paths in Oklahoma, from 2016 to 2019. Points and paths are colored by year. Distribution of tornadoes appears to diagonally across Oklahoma’s center. Ranging from SW to NE.

Question 2

Summarize the density of tornado points by both county and year and generate a faceted plot that displays maps of county-level tornado density from 2016-2021.

Intial data wrangling

For this map, we need to calculdate the tornado density, using the intiation point dataset. To start I calculate the area of each county in Oklahoma. Next, I join the point dataset to the county so I have the county information appended to the point data. Now, I calculate the number of tornado points per county per year. This sum is stored in the tcnt field.

#get area of each county
countyarea <- okcounty %>%
  replace(is.na(.), 0) %>%
  mutate(area = st_area(okcounty)) %>%
  drop_units()

#join county to tornado points
countypnt <- st_join(tpoint_16_21, okcounty)

#get number of tornadoes per county per year
countypnt <- st_drop_geometry(countypnt)
countysumyr <- countypnt %>%
  group_by(GEOID, yr) %>%
  summarize(tcnt = n())

head(countysumyr)

## # A tibble: 6 × 3
## # Groups:   GEOID [2]
##   GEOID    yr  tcnt
##   <chr> <dbl> <int>
## 1 40001  2016     1
## 2 40001  2017     1
## 3 40001  2019     1
## 4 40001  2020     1
## 5 40001  2021     2
## 6 40005  2016     2

Next, I need to join the tornadoes per county per year dataset, countysumyr, to the county area dataset, countyarea. From here, I can calculate the number of tornadoes (per county per year) divided by the area of each county, thus getting the density. We just calculated the tornado Density per 1000 square kilometers.

#join number of tornadoes per county per year to the county area variable calculated above. calculate tornado
#density per county area (100km2) per year. inner join to only get matching geoids for density calculation.

countydensitybyyear <- countyarea %>%
  inner_join(countysumyr, by = "GEOID") %>%
  #mutate(tcnt = replace_na(tcnt, 0)) %>%
  #replace(is.na(.), 0) %>%
  mutate(tdensyr = 10^6 * 10^3 * tcnt / area) %>%
  drop_units()

Now we are ready to map tornado density.

#tornado density map. continuous
ggplot() +
  geom_sf(data = countydensitybyyear, aes(fill = tdensyr)) +
  geom_sf(data = okcounty, color = "black", fill = NA) +
  scale_fill_distiller(name = expression("Tornadoes/1000 km"^2), 
                       palette = "Oranges", 
                       breaks = pretty_breaks(),
                       direction = 1) +
  facet_wrap(~yr) +
  theme_void() +
  theme(legend.position = "bottom")

Figure 2. Tornado density, number of tornadoes per 1000 square kilometers in Oklahoma each year from 2016 through 2021. The year of the most tornadoes per area is 2019.

Question 3

Generate four choropleth maps of tornado density based on quantile breaks with numbers of classes ranging from 3 to 6. Create a composite figure containing the four maps using plot_grid() and examine how the number of classes changes the interpretation of the map.

Wrangling and prep

Very similar calculated are needed for the map in figure 2, just removing year from the tornado point sum aggregation (tcnt). Now we are just calculating the number of tornadoes per county. Next we do the same area and density calculation.

#calculate density, starting with number of tornadoes per county
countysum <- countypnt %>%
  group_by(GEOID) %>%
  summarize(tcnt = n())

#join number of tornadoes per county to county shapefile. calculate county area. calculate tornado number 
#(density) per area (100km2). nulls are zeros.
countymap <- okcounty %>%
  left_join(countysum, by = "GEOID") %>%
  replace(is.na(.), 0) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt / area) %>%
  drop_units()

head(countysum)

## # A tibble: 6 × 2
##   GEOID  tcnt
##   <chr> <int>
## 1 40001     6
## 2 40005     3
## 3 40007     4
## 4 40009     8
## 5 40011     1
## 6 40013     4

For this map we are classifying our tornado density with four different class options, 3-6. At first I wasnt having luck with the quantile for breaks other than 4 so I chose to take a deeper look at the density distribution. I saw that equal (pretty) breaks would not work best here, as the distribution of tornadoes was mainly in the lower (left) totals, with a bit of an outlier at 10 tornadoes on the far right. Because of this, I chose Jenks natural breaks at 3, 4, 5 and 6 classes based on the requirements.

#going with Jenks Natural Breaks
countymap <- countymap %>%
  mutate(tdens_6class = cut(tdens, 
                            breaks = classIntervals(tdens, n = 6, style = "jenks")$brks, 
                            include.lowest = TRUE),
         tdens_5class = cut(tdens, 
                            breaks = classIntervals(tdens, n = 5, style = "jenks")$brks, 
                            include.lowest = TRUE),
         tdens_4class = cut(tdens, 
                            breaks = classIntervals(tdens, n = 4, style = "jenks")$brks, 
                            include.lowest = TRUE),
         tdens_3class = cut(tdens, 
                            breaks = classIntervals(tdens, n = 3, style = "jenks")$brks, 
                            include.lowest = TRUE))

# see disbursement of values in tornado density
tdenshist <- hist(countymap$tdens, main = "Histogram of Tornado Density", xlab = "Tornado Density")

Figure 3. Histogram. Distribution of tornado density (number of tornadoes per 100km^2) across Oklahoma counties, 2016-2021.

Now its time to map these 4 different class breaks!

# plot
class3 <- ggplot(data = countymap) +
  geom_sf(aes(fill = tdens_3class)) +
  scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),   
                    palette = "Oranges") +
  theme_void() +
  theme(legend.position = "bottom") 

class4 <- ggplot(data = countymap) +
  geom_sf(aes(fill = tdens_4class)) +
  scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),   
                    palette = "Oranges") +
  theme_void() +
  theme(legend.position = "bottom") 

class5 <- ggplot(data = countymap) +
  geom_sf(aes(fill = tdens_5class)) +
  scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),   
                    palette = "Oranges") +
  theme_void() +
  theme(legend.position = "bottom") 

class6 <- ggplot(data = countymap) +
  geom_sf(aes(fill = tdens_6class)) +
  scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),   
                    palette = "Oranges") +
  theme_void() +
  theme(legend.position = "bottom") 

# composite 
plot_grid(class3, class4, class5, class6, 
          labels = c("A) Three Classes", 
                     "B) Four Classes",
                     "C) Five Classes",
                     "D) Six Classes",
                     label_size = 12),
          ncol = 1, 
          label_x = 0, 
          align = "hv")

Figure 3. Mapping Tornado Densities in Oklahoma Counties, 2016 - 2021. Density is per 100 square kilometers. We can see that it isnt until the maps of 5 and 6 classes that we can begin to see how much higher density some NE counties in Oklahoma have relative to the other counties (these put the outlier values of 10 in their own class). Three classes hides the fact that most density values are at 4 tornadoes, and all other high values are included in the same third class.

Lesson 4 Part A

D.Mesler

2024-03-20

Part A

Question 1

Intial data wrangling

Question 2

Intial data wrangling

Question 3

Wrangling and prep