Background

NOAA maintains a database of historical storm data, including storms that have hit the United States.

Knowing how many storms have hit different parts of the United States over time shows how exposed an area is to these natural disasters, which is useful when planning construction or a move.

Objective

Make a summary plot of all storms that have hit the United States since 1950 and determine which five states have experienced the most storms in that time.

Method

Libraries

The following packages and datasets must be loaded:

library(sf)
library(tidyverse)
library(ggmap)
library(rnoaa)
library(spData)

knitr::opts_chunk$set(cache=TRUE) # cache the results for quick compiling

Data

Retrieve ‘world’ and ‘us_states’ data for country and state polygons from the spData package

data(world)
data(us_states)

Read the previously downloaded NOAA IBTrACS storm data from its shapefile (adjust the path to your local copy)

storms <- read_sf("C:\\Users\\Catherine\\Desktop\\Winter 2021\\R Data Science\\Assignments\\08a\\08b_data\\08b_data\\IBTrACS.NA.list.v04r00.points.shp",
                  quiet = TRUE, stringsAsFactors = FALSE)

Wrangle the data

Filter to storms 1950-present with filter()

storms <- filter(storms,SEASON>=1950)

Use mutate_if() to convert -9999.0 to NA in all numeric columns

storms <- mutate_if(storms, is.numeric, function(x) ifelse(x==-9999.0,NA,x))
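
As a minimal sketch of this sentinel replacement, the same idea can be tried on a small table. The data frame `toy` and its columns are hypothetical and not part of IBTrACS. The sketch uses across(), which newer dplyr versions offer in place of mutate_if(); the chunk above is unchanged by this.

```r
library(dplyr)

# Hypothetical toy data: -9999 marks a missing measurement
toy <- tibble(
  name = c("a", "b", "c"),
  wind = c(35, -9999, 50),
  pres = c(-9999, 1002, 990)
)

# Replace the sentinel with NA in every numeric column;
# the character column is left untouched
toy_clean <- mutate(toy, across(where(is.numeric), ~ ifelse(.x == -9999, NA, .x)))

toy_clean
```

Only the two cells holding -9999 become NA; every other value is carried over as-is.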

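Drop the first row, exclude the Eastern Pacific (EP) and North Indian (NI) basins, and add columns for year and decade

storms <- storms[-1,] %>% # drop the first row
  filter(! BASIN %in% c("EP","NI")) %>% # exclude the EP and NI basins
  mutate(Year=as.numeric(SEASON)) %>% # season as a numeric year
  mutate(decade=floor(Year/10)*10) # first year of the decade, e.g. 1987 -> 1980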
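
The decade arithmetic can be checked on a few values. This is a small sketch in base R; the vector `years` is a hypothetical sample chosen to cover decade boundaries.

```r
# Hypothetical season years, chosen to cover decade boundaries
years <- c(1950, 1959, 1987, 2000, 2019)

# Divide by 10, drop the fractional part, then scale back up
# to get the first year of each decade
decade <- floor(years / 10) * 10

data.frame(years, decade)
```

Every year within a decade maps to the same value, which is what lets facet_wrap(~decade) group the storm points into one panel per decade.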

Use st_bbox() to identify the bounding box of the storm data and save as ‘region’

region <- st_bbox(storms)
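
To illustrate what st_bbox() returns and why the plot later indexes it by position, here is a minimal sketch. The points in `pts` are hypothetical stand-ins for storm positions.

```r
library(sf)

# Hypothetical longitude/latitude points standing in for storm positions
pts <- st_as_sf(
  data.frame(lon = c(-90, -60, -75), lat = c(15, 40, 28)),
  coords = c("lon", "lat"), crs = 4326
)

# The bounding box is a named vector in the order xmin, ymin, xmax, ymax
bb <- st_bbox(pts)

xlim <- bb[c(1, 3)] # xmin, xmax
ylim <- bb[c(2, 4)] # ymin, ymax
```

Elements 1 and 3 give the longitude range and elements 2 and 4 the latitude range, matching the xlim and ylim arguments passed to coord_sf().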

Plotting

Make the plot

Use ggplot() to draw the world polygon layer, overlay a 2-D binned count of the storm points, and split the plot into one panel per decade

ggplot()+
  geom_sf(data=world,inherit.aes=F,size=.1,fill="grey",colour="black")+ # world polygon layer
  facet_wrap(~decade,ncol=4)+ # multiple plot panels for multiple decades
  stat_bin2d(data=storms,aes(y=st_coordinates(storms)[,2],x=st_coordinates(storms)[,1]),bins=100)+ # storm data
  scale_fill_distiller(palette="YlOrRd",trans="log",direction=-1,breaks=c(1,10,100,1000))+ # color ramp
  coord_sf(ylim=region[c(2,4)],xlim=region[c(1,3)]) # crop plot to region of storm data

The plot is displayed in the Result section.

Calculations

Build a table of the five states with the most storms

Use st_transform to reproject ‘us_states’ to the coordinate reference system of ‘storms’

states <- st_transform(us_states, crs=st_crs(storms))

Rename the NAME column from ‘states’ to ‘StateName’ to avoid confusion with NAME column in ‘storms’

states <- rename(states,StateName=NAME)

Perform a spatial join between ‘storms’ and ‘states’

storm_states <- st_join(storms,states,join=st_intersects,left=F) 
## although coordinates are longitude/latitude, st_intersects assumes that they are planar

Find the top 5 states where the most storms occurred

storm_states <- group_by(storm_states,StateName) %>% # group by StateName
  mutate(n=length(unique(SID))) %>% # count the unique storms (SID) in each state
  slice_head(n=1) # keep one row per state; its point geometry is retained

storm_states <- storm_states[order(-storm_states$n),] # sort by the number of storms in each state, descending

storm_states <- storm_states[1:5,c("StateName","n")] # keep only the top 5 states (the sticky geometry column is kept too)
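
The join-then-count logic can be sketched on toy data. Everything here is hypothetical: the two square "states", the storm IDs, and the planar coordinates with no CRS. The sketch is also an alternative formulation rather than the method used above: it drops the geometry and counts distinct state/storm pairs with distinct() and count().

```r
library(sf)
library(dplyr)

# Helper for a unit square with its lower-left corner at (x0, y0) (toy geometry)
square <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Two hypothetical "states", side by side
toy_states <- st_sf(
  StateName = c("A", "B"),
  geometry = st_sfc(square(0, 0), square(1, 0))
)

# Hypothetical track points: s1 has two points in A and one in B,
# s2 has one point in A, and s3 lies outside both squares
toy_storms <- st_as_sf(
  data.frame(
    SID = c("s1", "s1", "s1", "s2", "s3"),
    x = c(0.2, 0.4, 1.5, 0.6, 5),
    y = c(0.5, 0.5, 0.5, 0.5, 5)
  ),
  coords = c("x", "y")
)

toy_counts <- st_join(toy_storms, toy_states, join = st_intersects, left = FALSE) %>%
  st_drop_geometry() %>%          # geometry is not needed for counting
  distinct(StateName, SID) %>%    # one row per storm per state
  count(StateName, name = "n", sort = TRUE)

toy_counts
```

Storm s1 contributes only once to state A even though two of its points fall there, which mirrors the use of length(unique(SID)) above. The inner join (left = FALSE) discards the point that falls in no state.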

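Produce a table of the five state names and their storm counts with kable() (shown in the Result section). The third column, Location, is the point geometry of the single storm record kept for each state, not a state centroid.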

knitr::kable(storm_states,col.names=c("Top Five States","Number of Storms 1950-present","Location"))

Result

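| Top Five States | Number of Storms 1950-present | Location             |
|:----------------|------------------------------:|:---------------------|
| Florida         |                           131 | POINT (-82.6 28.4)   |
| North Carolina  |                            84 | POINT (-80.6 35.4)   |
| Texas           |                            69 | POINT (-99 26.6)     |
| Georgia         |                            68 | POINT (-82.34 30.71) |
| Louisiana       |                            63 | POINT (-92.8 29.7)   |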

Conclusion

The historical storm data were plotted by decade, and the table lists the five states where the most storms have occurred since 1950: Florida (131), North Carolina (84), Texas (69), Georgia (68), and Louisiana (63).