Lesson_5_Rodriguez

Introduction

In this report, I will be generate a series of visualizations of tornadoes that struck the state of Oklahoma in the years 2016 through 2021. I will be guiding you through the process to generate the visualizations.

Creating a map of tornado tornado paths and a composite map of tornado paths and points

In the first part of this report we will be working on creating a visualization of tornadoes that struck that state of Oklahoma in the years 2016 through 2021.

We begin by loading the the necessary packages using the library function

  library(sf)           
  library(ggplot2)      
  library(dplyr)        
  library(tidyr)        
  library(scales)       
  library(RColorBrewer) 
  library(units)
  library(cowplot)
  library(here)

The next step is to load the necessary shapefiles using st_read

  okcounty <- st_read(here("Chapter5", "ok_counties.shp"), quiet = TRUE)
# Loading the counties shapefile

  tpoint <- st_read(here("Chapter5", "ok_tornado_point.shp"), quiet = TRUE)
# Loading the tornado points shapefile

  tpath <- st_read(here("Chapter5", "ok_tornado_path.shp"), quiet = TRUE)
# Loading the tornado paths shapefile

  class(okcounty)

## [1] "sf"         "data.frame"

# This returns the class of the okcounty object

Now that we have the necessary shapefiles loaded, we need to limit the data to the years we are interested in. For this report we will only be looking at tornado data for the years from 2016 to 2021.

  tpoint_16_21 <- tpoint %>%
# We begin by piping the tornado point data into an object called tpoint_16_21
    filter(yr >= 2016 & yr <= 2021) %>%
# We filter for the years equal or greater than 2016 and equal or less than 2021
    select(om, yr, date)
# from the data set we select the om, yr, and date columns. The om column is a unique identifier
  
  tpath_16_21 <- tpath %>%
# we pipe the tornado path data into an object called tpath_16_21
    filter(yr >= 2016 & yr <= 2021) %>%
# We filter for the years equal or greater than 2016 and equal or less than 2021
    select(om, yr, date)
# from the data set we select the om, yr, and date columns. The om column is a unique identifier

Now we have two objects, tpoint_16_21 and tpath_16_21 that contain the tornado point and path data for the years 2016 through 2021. We can now begin creating our maps.

  paths_map <-ggplot() +
# we create a new object called paths_map and call the ggplot function
    geom_sf(data = tpath_16_21, 
            aes(color = as.factor(yr))) +
#  the geom_sf function is used to create a map based on the tornado path data. We give each year a separate color using as.factor(yr) passed through the color argument. 
  
    geom_sf(data = okcounty, fill = NA) +
# we add another layer showing the counties in Oklahoma  
  
    scale_color_discrete(name = "Year") +
# we set a legend name for the colors called "Year"  

    coord_sf(datum = NA) +
# we remove the grid lines  
  
    theme_void()
# we remove the plot axis and labels

We have our map of tornado paths saved as an object called paths_map. We can now begin working on our tornado points map using a similar process.

  points_map <- ggplot() +
# we create a new object called paths_map and call the ggplot function
    geom_sf(data = tpoint_16_21, 
#  the geom_sf function is used to create a map based on the tornado point data.
            
    aes(color = as.factor(yr))) +
# We give each year a separate color using as.factor(yr) passed through the color argument.

    geom_sf(data = okcounty, fill = NA) +
# we add another laying showing the counties in Oklahoma
    
    scale_color_discrete(name = "Year") +
# we set a legend name for the colors called "Year"
  
    coord_sf(datum = NA) +
# we remove the grid lines
  
    theme_void()
# we remove the the plot axis and labels

We now have our map of tornado point data saved in an object called points_map.

In the next section we use plot_grid to produce a composite figure with both of our maps.

  plot_grid(paths_map, points_map,
# use the plot_grid function to create a composite figure with both of our maps (subplots)

            labels = c("A) Tornado Paths", 
                       "B) Tornado Points",
                       label_size = 12),
# creating labels for each our subplots and setting the size

            ncol = 1, 
# arranging the subplots in one column

            hjust = 0,
# making the horizontal justification left-aligned

            label_x = 0,
# sets the horizontal position of the label, in this case on the far left

            align = "hv")

# align the plots horizontally and vertically

We now have both of our maps in a single composite figure.

Creating a Facet-Plot Displaying Maps of County-Level Tornado Density

First we will join the tornado point data to the county data

  countypnt_yr <- st_join(tpoint_16_21, okcounty)
# using st_join to join the tornado point data to the okcounty data. We use st_join because it joins the rows based on spatial locations

Next we work on on getting our data ready for grouping and summarizing

  countypnt_yr <- st_drop_geometry(countypnt_yr)
# we drop the geometry from the countypnt_yr object we created to convert it to a normal data frame so we can group and summarize the data

Now that our data is a normal data frame we can proceed with grouping and summarizing

  countysum_yr <- countypnt_yr %>%
# creating a pipeline for countypnt_yr that will be saved into countysum_yr
    group_by(GEOID, yr) %>%
# grouping the GEOID county code and year
  
    summarize(tcnt = n(), .groups = "drop")
# creating a new column called tcnt that has the number of rows within each GEOID and yr group. Then ungrouping the data with "drop"

Our next step is to join our countymap_yr and countysum_yr data and generate our tornado density calculations

  countymap_yr <- okcounty %>%
# creating a pipeline for our okcounty spatial dataframe that will be saved into countymap_yr
  
    left_join(countysum_yr, by = "GEOID") %>%
# using a left join on the countysum_yr to join it to okcounty by "GEOID". The left join retains all the counties from okcounty even if they have no corresponding row in countysum_yr
  
    mutate(area = st_area(geometry),
           tdens = 10^6 * 10^3 * tcnt / area) %>%
# using mutate to generate the area of each county using st_area. Then the tornado density is calculated by taking the number of tornadoes, divide by area and converting it to tornadoes per 1000 km2 by multiplying it by 10^6 and 10^3
  
    drop_units()
# dropping the unit attribute created by the st_area function

The final piece is using ggplot to create a facet plot based on each year that shows the tornado density for each county

  ggplot(data = countymap_yr) +
# calling the ggplot function to create a plot based on the countymap_yr data
  
    geom_sf(aes(fill = tdens)) +
# filing the polygons based on the tdens column
  
    scale_fill_distiller(name = expression("Tornadoes/1000 km"^2),
                         palette = "YlOrRd",
                         breaks = pretty_breaks(),
                         direction = 1) +
# using scale_fill_distiller to call a color ramp from the RColorBrewer package. The name of the scale is set to "Tornadoes/1000 km^2, the palette chosen is "YlOrRd", pretty_breaks creates a set of breaks for the legend, and the direction is set to 1 so that the scale goes from light to dark
  
    facet_wrap(vars(yr), ncol = 2) +
# creates a separate map for each year with two columns
  
    theme_void() +
# removing grid lines and axis labels
  
    theme(legend.position = "bottom",
# setting the legend to the bottom of the plot
          
          strip.text = element_text(size = 10, face = "bold"))

# setting the year labels above each map to a font size of 10 and bold

Creating four choropleth maps based on tornado density with quartile breaks broken into 3, 4, 5, and 6 classes

Next we will be creating a series of four choropleth maps based on tornado density with varying classes

  countymap <- countymap_yr %>%
# creating a pipline for countymap_yr that will be saved as countymap
  
    mutate(
# calling the mutate function to generate new variables for the quantile breaks
      tdens_c3 = cut(tdens,
                     breaks = quantile(tdens, probs = seq(0, 1, length.out = 4), na.rm = TRUE),
                     include.lowest = TRUE),
# creating a choropleth based on tornado density (tdens), creating quartile breaks starting with 3, and removing NAs

      tdens_c4 = cut(tdens,
                     breaks = quantile(tdens, probs = seq(0, 1, length.out = 5), na.rm = TRUE),
                     include.lowest = TRUE),
# creating a choropleth based on tornado density (tdens), creating quartile breaks starting with 4, and removing NAs

      tdens_c5 = cut(tdens,
                     breaks = quantile(tdens, probs = seq(0, 1, length.out = 6), na.rm = TRUE),
                     include.lowest = TRUE),
# creating a choropleth based on tornado density (tdens), creating quartile breaks starting with 5, and removing NAs

      tdens_c6 = cut(tdens,
                     breaks = quantile(tdens, probs = seq(0, 1, length.out = 7), na.rm = TRUE),
                     include.lowest = TRUE)
# creating a choropleth based on tornado density (tdens), creating quartile breaks starting with 6, and removing NAs
    )
  

  map_c3 <- ggplot(data = countymap) +
# creating our first map by calling the ggpplot function and basing it off the countymap
    
    geom_sf(aes(fill = tdens_c3)) +
# filling the polygons based off the quantile breaks we created earlier
    
    scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
                      palette = "YlOrRd") +
# using scale_fill_distiller to call a color ramp from the RColorBrewer package. The name of the scale is set to "Tornadoes/1000 km^2,  and the palette chosen is "YlOrRd"
    
    theme_void() +
# removing axis labels and gridlines
    
    theme(legend.position = "bottom",
          legend.text = element_text(size = 6),
          legend.title = element_text(size = 7))
# placing the legend on the bottom  with selected text and title font size
  

# repeating the previous steps for the next three choropleths
  map_c4 <- ggplot(data = countymap) +
    geom_sf(aes(fill = tdens_c4)) +
    scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
                      palette = "YlOrRd") +
    
    theme_void() +
    theme(legend.position = "bottom",
          legend.text = element_text(size = 6),
          legend.title = element_text(size = 7))
  
  map_c5 <- ggplot(data = countymap) +
    geom_sf(aes(fill = tdens_c5)) +
    scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
                      palette = "YlOrRd") +
    
    theme_void() +
    theme(legend.position = "bottom",
          legend.text = element_text(size = 6),
          legend.title = element_text(size = 7))
  
  map_c6 <- ggplot(data = countymap) +
    geom_sf(aes(fill = tdens_c6)) +
    scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
                      palette = "YlOrRd") +

    theme_void() +
    theme(legend.position = "bottom",
          legend.text = element_text(size = 6),
          legend.title = element_text(size = 7))

Now that we have our four choropleths created we can can create a composite displaying them.

  plot_grid(map_c3, map_c4, map_c5, map_c6,
# creating a composite using plot_grid and adding the four previously completed choropleths

            labels = c("A) - 3 Classes", "B) - 4 Classes", "C) - 5 Classes", "D) - 6 Classes"),
# creating labels for each choropleth

            ncol = 2,
# creating a grid of two colums

            hjust = 0,
# making the horizontal justication left-aligned

            label_x = 0,
# set the horizontal position of the label, in this case the far left

            align = "hv")

# aligning the four plots horizontally and vertically

Conclusion

We now have created our three sets of plots.

Congrats!