Part A Introduction

The first part of this assignment involved using ggplot to create static maps of tornado locations and paths from 2016 to 2021. Three different figures were created in this section. First, tornado points and paths by year will be mapped. Then, the density of tornados in each county by year is calculated and mapped. Finally, density of tornados by county are mapped using different numbers of class breaks.

Load Libraries

Before the analysis can be done it is necessary to load all relevant packages for this analysis.

library(sf)           
library(ggplot2)      
library(dplyr)        
library(tidyr)        
library(scales)       
library(RColorBrewer) 
library(units)
library(cowplot)

Tornado Paths and Points by Year

This part of the assignment involves creating a map of tornado paths where the paths from each year are displayed as a different color and creating a composite figure containing the map of tornado paths and the map of tornado points by year.

To accomplish this, path and point tornado data first needs to be loaded and filtered to the appropriate years. County polygons will also be loaded for background reference.

#Load Data
okcounty <- st_read("ok_counties.shp", quiet = TRUE)
tpoint <- st_read("ok_tornado_point.shp", quiet = TRUE)
tpath <- st_read("ok_tornado_path.shp", quiet = TRUE)




#Filter to relevant dates for points and paths
tpoint_16_21 <- tpoint %>%
  filter(yr >= 2016 & yr <= 2021) %>%
  select(om, yr, date)
tpath_16_21 <- tpath %>%
  filter(yr >= 2016 & yr <= 2021) %>%
  select(om, yr, date)

Next, maps of tornado path and points need to be created using ggplot and the sf package. These maps will be saved for use in the composite figure later.

#Create map of tornado points by year
ptgraph <- ggplot() +
  geom_sf(data = tpoint_16_21, 
          aes(color = as.factor(yr))) +
  geom_sf(data = okcounty, fill = NA) +
  scale_color_discrete(name = "Year") +
  coord_sf(datum = NA) +
  theme_void()

#Create map of tornado paths by year
linegraph <- ggplot() +
  geom_sf(data = tpath_16_21, size = 1, 
          aes(color = as.factor(yr))) +
  geom_sf(data = okcounty, fill = NA) +
  scale_color_discrete(name = "Year") +
  coord_sf(datum = NA) +
  theme_void()

Finally, a composite figure containing both of the created maps is generated using plot_grid().

#Create a composite figure containing the map of tornado paths and the map of tornado points using plot_grid().
plots1 <- plot_grid(ptgraph, linegraph, 
          labels = c("A) Tornado Points by Year", 
                     "B) Tornado Paths by Year",
                     label_size = 12),
          ncol = 1, 
          hjust = 0, 
          label_x = 0, 
          align = "hv")

title <- ggdraw() + 
  draw_label(
    "Tornados in Oklahoma",
    fontface = 'bold',
    x = 0,
    hjust = 0
  ) +
  theme(
    # add margin on the left of the drawing canvas,
    # so title is aligned with left edge of first plot
    plot.margin = margin(0, 0, 0, 7)
  )

caption <- ggdraw() + 
  draw_label(
    "Figure 1: Tornado locations and Paths in OK from 2016-2021",
    x = 0,
    hjust = 0,
    size=12
  ) +
  theme(

    
  )

plot_grid(
  title, plots1, caption,
  ncol = 1,
  # rel_heights values control vertical title margins
  rel_heights = c(0.1, 1,0.1)
)

Tornado Path Densities by Year

The second part of part A involves summarizing the density of tornado points by both county and year and using these densities to generate a faceted plot that displays maps of county-level tornado density from 2016-2021. We can use the previously created and filtered tornado points dataset and the previously loaded in county dataset in order to create a choropleth map of tornado density for each year.

First, the area for each county needs to be calculated. Then the tornado point data needs to be joined to the county data to calculate the county which each point is in. This joined dateset will be used to calculate a summary of tornados by year and county.

#Add Area to each county
okcounty <- okcounty %>%
  mutate(area = st_area(okcounty))%>%
  drop_units()

#Generate counts of tornado's by county and year
countypnt <- st_join(tpoint_16_21, okcounty)#join counties to tornados

countypnt <- st_drop_geometry(countypnt)#drop geometry so summary data can be created

countysum <- countypnt %>% #create a summary of tornados by county and year
  group_by(GEOID,yr) %>%
  summarize(tcnt = n())

Next, the summary table needs to be joined to the county geometry and then the density for each county and year can be calculated using the counts of tornados by year and county and the area for each county.

#Join county counts to geometry and calculate density
countymap <- okcounty %>%
  inner_join(countysum, by = "GEOID") %>%
  replace(is.na(.), 0) %>%
  ungroup()%>%
  mutate(tdens = 10^6 * 10^3 * tcnt / area) %>%
  drop_units()

Finally, a plot can be created showing the density of tornados by county for each year.

#Create map
ggplot(data = countymap) +
  geom_sf(data = okcounty, fill = NA) +#add background county data
  geom_sf(aes(fill = tdens)) +
  facet_wrap(vars(yr), ncol = 2)+#separate maps by year
  scale_fill_distiller(name = expression("Tornadoes/1000 km"^2), 
                       palette = "YlOrRd", 
                       breaks = pretty_breaks(),
                       direction = 1) +
  theme_void()+
  labs(title = "Density of tornado points by both county and year 2016 to 2021",caption = "Figure 2: Tornado densities by year and county. Some of the highest densities occured in 2019.")#add titles

Tornado density using different class breaks.

This part of the analysis involves generating different choropleth maps using different class breaks and adding them to the same plot for comparison of how class breaks change how a map can be interpreted.

First, total county densities needed to be generated in a similar way as in the previous section except only by county and not year. The previously calculated county areas were used for this calculation as well.

#Generate Density Numbers
countypnttot <- st_join(tpoint_16_21, okcounty)

countypnttot <- st_drop_geometry(countypnt)
countysumtot <- countypnttot %>%
  group_by(GEOID) %>%
  summarize(tcnt = n())

countybreaks <- okcounty %>%
  left_join(countysumtot, by = "GEOID") %>%
  replace(is.na(.), 0) %>%
  mutate(area = st_area(okcounty),
         tdens = 10^6 * 10^3 * tcnt / area) %>%
  drop_units()

Next class breaks can be created for numbers from 3-6. A sequence for the class breaks can be created and then used to cut the density numbers at specific quantile intervals and store these classes in a new column of the data. This new column can be used to create a map plot with class breaks. This plot is saved for later use in a composite plot.

# Create Three Breaks
numclas3 <- 3
qbrks3 <- seq(0, 1, length.out = numclas3 + 1)

#Add class breaks field to dataset
countybreaks <- countybreaks %>%
  mutate(tdens_c3 = cut(tdens,
                        breaks = quantile(tdens, probs = qbrks3),
                        include.lowest = T))

#Create map using 3 class breaks
breaks3 <- ggplot(data = countybreaks) +
  geom_sf(aes(fill = tdens_c3)) +
  scale_fill_brewer(name="Density",   
                    palette = "YlOrRd") +
  theme_void()

#Four Breaks
numclas4 <- 4
qbrks4 <- seq(0, 1, length.out = numclas4 + 1)


countybreaks <- countybreaks %>%
  mutate(tdens_c4 = cut(tdens,
                        breaks = quantile(tdens, probs = qbrks4),
                        include.lowest = T))

breaks4 <- ggplot(data = countybreaks) +
  geom_sf(aes(fill = tdens_c4)) +
  scale_fill_brewer(name="Density",   
                    palette = "YlOrRd") +
  theme_void()

#Five Breaks
numclas5 <- 5
qbrks5 <- seq(0, 1, length.out = numclas5 + 1)


countybreaks <- countybreaks %>%
  mutate(tdens_c5 = cut(tdens,
                        breaks = quantile(tdens, probs=qbrks5),
                        include.lowest = T))

breaks5 <- ggplot(data = countybreaks) +
  geom_sf(aes(fill = tdens_c5)) +
  scale_fill_brewer(name="Density",   
                    palette = "YlOrRd") +
  theme_void()

#Six Breaks
numclas6 <- 6
qbrks6 <- seq(0, 1, length.out = numclas6 + 1)


countybreaks <- countybreaks %>%
  mutate(tdens_c6 = cut(tdens,
                        breaks = quantile(tdens, probs = qbrks6),
                        include.lowest = T))

breaks6 <- ggplot(data = countybreaks) +
  geom_sf(aes(fill = tdens_c6)) +
  scale_fill_brewer(name="Density",   
                    palette = "YlOrRd") +
  theme_void()

Finally a composite plot showing density maps with 3, 4, 5, and 6 class breaks is created as you can see the distribution of the data can look different based of where class breaks are made. More class breaks allows for more distinguishment between areas.

plots1 <- plot_grid(breaks3, breaks4, breaks5, breaks6, 
          labels = c("A) Three Breaks", 
                     "B) Four Breaks",
                     "C) Five Breaks",
                     "D) Six Breaks",
                     label_size = 12),
          ncol = 2, 
          hjust = 0, 
          label_x = 0, 
          align = "hv")

title <- ggdraw() + 
  draw_label(
    "Tornado Density By County",
    fontface = 'bold',
    x = 0,
    hjust = 0
  ) +
  theme(
    # add margin on the left of the drawing canvas,
    # so title is aligned with left edge of first plot
    plot.margin = margin(0, 0, 0, 7)
  )

caption <- ggdraw() + 
  draw_label(
    "Figure 3: Tornado Densities in OK from 2016-2021 by county using different class breaks",
    x = 0,
    hjust = 0,
    size=12
  ) +
  theme(

    
  )

plot_grid(
  title, plots1, caption,
  ncol = 1,
  # rel_heights values control vertical title margins
  rel_heights = c(0.1, 1,0.1)
)

Discussion

The created figures each provide information about tornados in the state of Oklahoma. Figure one shows the individual points of each tornado and its path by year. Although it seems that there is a somewhat higher amount in the northeastern part of the state than anywhere else, there is still a lot of data that can be hard to parse. Figure 2 is easier to parse showing the density of tornados by county and year. This makes it clear that there was high density of tornados in northeastern counties in the year of 2019. Other years do not show as high of densities. Figure 3 shows overall density by county using class breaks. Although each map shows some differences across the state, the most detailed pattern is using six breaks. However, the higher amount of breaks makes distinguishing the colors somewhat harder.