In this report, I will be generate a series of visualizations of tornadoes that struck the state of Oklahoma in the years 2016 through 2021. I will be guiding you through the process to generate the visualizations.
In the first part of this report we will be working on creating a visualization of tornadoes that struck that state of Oklahoma in the years 2016 through 2021.
We begin by loading the the necessary packages using the library function
library(sf)
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
library(RColorBrewer)
library(units)
library(cowplot)
library(here)
The next step is to load the necessary shapefiles using st_read
okcounty <- st_read(here("Chapter5", "ok_counties.shp"), quiet = TRUE)
# Loading the counties shapefile
tpoint <- st_read(here("Chapter5", "ok_tornado_point.shp"), quiet = TRUE)
# Loading the tornado points shapefile
tpath <- st_read(here("Chapter5", "ok_tornado_path.shp"), quiet = TRUE)
# Loading the tornado paths shapefile
class(okcounty)
## [1] "sf" "data.frame"
# This returns the class of the okcounty object
Now that we have the necessary shapefiles loaded, we need to limit the data to the years we are interested in. For this report we will only be looking at tornado data for the years from 2016 to 2021.
tpoint_16_21 <- tpoint %>%
# We begin by piping the tornado point data into an object called tpoint_16_21
filter(yr >= 2016 & yr <= 2021) %>%
# We filter for the years equal or greater than 2016 and equal or less than 2021
select(om, yr, date)
# from the data set we select the om, yr, and date columns. The om column is a unique identifier
tpath_16_21 <- tpath %>%
# we pipe the tornado path data into an object called tpath_16_21
filter(yr >= 2016 & yr <= 2021) %>%
# We filter for the years equal or greater than 2016 and equal or less than 2021
select(om, yr, date)
# from the data set we select the om, yr, and date columns. The om column is a unique identifier
Now we have two objects, tpoint_16_21 and tpath_16_21 that contain the tornado point and path data for the years 2016 through 2021. We can now begin creating our maps.
paths_map <-ggplot() +
# we create a new object called paths_map and call the ggplot function
geom_sf(data = tpath_16_21,
aes(color = as.factor(yr))) +
# the geom_sf function is used to create a map based on the tornado path data. We give each year a separate color using as.factor(yr) passed through the color argument.
geom_sf(data = okcounty, fill = NA) +
# we add another layer showing the counties in Oklahoma
scale_color_discrete(name = "Year") +
# we set a legend name for the colors called "Year"
coord_sf(datum = NA) +
# we remove the grid lines
theme_void()
# we remove the plot axis and labels
We have our map of tornado paths saved as an object called paths_map. We can now begin working on our tornado points map using a similar process.
points_map <- ggplot() +
# we create a new object called paths_map and call the ggplot function
geom_sf(data = tpoint_16_21,
# the geom_sf function is used to create a map based on the tornado point data.
aes(color = as.factor(yr))) +
# We give each year a separate color using as.factor(yr) passed through the color argument.
geom_sf(data = okcounty, fill = NA) +
# we add another laying showing the counties in Oklahoma
scale_color_discrete(name = "Year") +
# we set a legend name for the colors called "Year"
coord_sf(datum = NA) +
# we remove the grid lines
theme_void()
# we remove the the plot axis and labels
We now have our map of tornado point data saved in an object called points_map.
In the next section we use plot_grid to produce a composite figure with both of our maps.
plot_grid(paths_map, points_map,
# use the plot_grid function to create a composite figure with both of our maps (subplots)
labels = c("A) Tornado Paths",
"B) Tornado Points",
label_size = 12),
# creating labels for each our subplots and setting the size
ncol = 1,
# arranging the subplots in one column
hjust = 0,
# making the horizontal justification left-aligned
label_x = 0,
# sets the horizontal position of the label, in this case on the far left
align = "hv")
# align the plots horizontally and vertically
We now have both of our maps in a single composite figure.
First we will join the tornado point data to the county data
countypnt_yr <- st_join(tpoint_16_21, okcounty)
# using st_join to join the tornado point data to the okcounty data. We use st_join because it joins the rows based on spatial locations
Next we work on on getting our data ready for grouping and summarizing
countypnt_yr <- st_drop_geometry(countypnt_yr)
# we drop the geometry from the countypnt_yr object we created to convert it to a normal data frame so we can group and summarize the data
Now that our data is a normal data frame we can proceed with grouping and summarizing
countysum_yr <- countypnt_yr %>%
# creating a pipeline for countypnt_yr that will be saved into countysum_yr
group_by(GEOID, yr) %>%
# grouping the GEOID county code and year
summarize(tcnt = n(), .groups = "drop")
# creating a new column called tcnt that has the number of rows within each GEOID and yr group. Then ungrouping the data with "drop"
Our next step is to join our countymap_yr and countysum_yr data and generate our tornado density calculations
countymap_yr <- okcounty %>%
# creating a pipeline for our okcounty spatial dataframe that will be saved into countymap_yr
left_join(countysum_yr, by = "GEOID") %>%
# using a left join on the countysum_yr to join it to okcounty by "GEOID". The left join retains all the counties from okcounty even if they have no corresponding row in countysum_yr
mutate(area = st_area(geometry),
tdens = 10^6 * 10^3 * tcnt / area) %>%
# using mutate to generate the area of each county using st_area. Then the tornado density is calculated by taking the number of tornadoes, divide by area and converting it to tornadoes per 1000 km2 by multiplying it by 10^6 and 10^3
drop_units()
# dropping the unit attribute created by the st_area function
The final piece is using ggplot to create a facet plot based on each year that shows the tornado density for each county
ggplot(data = countymap_yr) +
# calling the ggplot function to create a plot based on the countymap_yr data
geom_sf(aes(fill = tdens)) +
# filing the polygons based on the tdens column
scale_fill_distiller(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd",
breaks = pretty_breaks(),
direction = 1) +
# using scale_fill_distiller to call a color ramp from the RColorBrewer package. The name of the scale is set to "Tornadoes/1000 km^2, the palette chosen is "YlOrRd", pretty_breaks creates a set of breaks for the legend, and the direction is set to 1 so that the scale goes from light to dark
facet_wrap(vars(yr), ncol = 2) +
# creates a separate map for each year with two columns
theme_void() +
# removing grid lines and axis labels
theme(legend.position = "bottom",
# setting the legend to the bottom of the plot
strip.text = element_text(size = 10, face = "bold"))
# setting the year labels above each map to a font size of 10 and bold
Next we will be creating a series of four choropleth maps based on tornado density with varying classes
countymap <- countymap_yr %>%
# creating a pipline for countymap_yr that will be saved as countymap
mutate(
# calling the mutate function to generate new variables for the quantile breaks
tdens_c3 = cut(tdens,
breaks = quantile(tdens, probs = seq(0, 1, length.out = 4), na.rm = TRUE),
include.lowest = TRUE),
# creating a choropleth based on tornado density (tdens), creating quartile breaks starting with 3, and removing NAs
tdens_c4 = cut(tdens,
breaks = quantile(tdens, probs = seq(0, 1, length.out = 5), na.rm = TRUE),
include.lowest = TRUE),
# creating a choropleth based on tornado density (tdens), creating quartile breaks starting with 4, and removing NAs
tdens_c5 = cut(tdens,
breaks = quantile(tdens, probs = seq(0, 1, length.out = 6), na.rm = TRUE),
include.lowest = TRUE),
# creating a choropleth based on tornado density (tdens), creating quartile breaks starting with 5, and removing NAs
tdens_c6 = cut(tdens,
breaks = quantile(tdens, probs = seq(0, 1, length.out = 7), na.rm = TRUE),
include.lowest = TRUE)
# creating a choropleth based on tornado density (tdens), creating quartile breaks starting with 6, and removing NAs
)
map_c3 <- ggplot(data = countymap) +
# creating our first map by calling the ggpplot function and basing it off the countymap
geom_sf(aes(fill = tdens_c3)) +
# filling the polygons based off the quantile breaks we created earlier
scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd") +
# using scale_fill_distiller to call a color ramp from the RColorBrewer package. The name of the scale is set to "Tornadoes/1000 km^2, and the palette chosen is "YlOrRd"
theme_void() +
# removing axis labels and gridlines
theme(legend.position = "bottom",
legend.text = element_text(size = 6),
legend.title = element_text(size = 7))
# placing the legend on the bottom with selected text and title font size
# repeating the previous steps for the next three choropleths
map_c4 <- ggplot(data = countymap) +
geom_sf(aes(fill = tdens_c4)) +
scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd") +
theme_void() +
theme(legend.position = "bottom",
legend.text = element_text(size = 6),
legend.title = element_text(size = 7))
map_c5 <- ggplot(data = countymap) +
geom_sf(aes(fill = tdens_c5)) +
scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd") +
theme_void() +
theme(legend.position = "bottom",
legend.text = element_text(size = 6),
legend.title = element_text(size = 7))
map_c6 <- ggplot(data = countymap) +
geom_sf(aes(fill = tdens_c6)) +
scale_fill_brewer(name = expression("Tornadoes/1000 km"^2),
palette = "YlOrRd") +
theme_void() +
theme(legend.position = "bottom",
legend.text = element_text(size = 6),
legend.title = element_text(size = 7))
Now that we have our four choropleths created we can can create a composite displaying them.
plot_grid(map_c3, map_c4, map_c5, map_c6,
# creating a composite using plot_grid and adding the four previously completed choropleths
labels = c("A) - 3 Classes", "B) - 4 Classes", "C) - 5 Classes", "D) - 6 Classes"),
# creating labels for each choropleth
ncol = 2,
# creating a grid of two colums
hjust = 0,
# making the horizontal justication left-aligned
label_x = 0,
# set the horizontal position of the label, in this case the far left
align = "hv")
# aligning the four plots horizontally and vertically
We now have created our three sets of plots.
Congrats!