SYNOPSIS

What effects did extreme weather have on the United States from 1950 to 2011? I answer this question in terms of human and economic cost in the following three graphs:

  1. Graph 1 - A scatter plot showing the frequency and the cost of each event type in the data
  2. Graph 2 - The human cost in injuries and fatalities of each event type
  3. Graph 3 - How the frequency of these events has changed from 1950 to 2011

DATA PROCESSING

#CODE TO REPRODUCE OUTPUT

library(downloader)
library(tidyverse)
## -- Attaching packages ---------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.0     v purrr   0.3.2
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   0.8.3     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts ------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggrepel)
library(ggthemes)
library(lubridate)
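url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
#The file is bzip2-compressed, so it keeps its .bz2 extension; read.csv() decompresses it transparently
download(url, destfile = "repdata_data_StormData.csv.bz2", mode = "wb")
#stringsAsFactors = FALSE keeps EVTYPE as text, so the replacements below behave the same on any R version
storm <- read.csv("repdata_data_StormData.csv.bz2", stringsAsFactors = FALSE)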
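
The download step above fetches the file on every run. A minimal sketch of a guard that skips the download when the file already exists is shown below; `fetch_once` and its `fetch` argument are hypothetical names introduced for illustration, not part of the original analysis.

```r
# Hypothetical helper: download only when the destination file is missing.
# `fetch` defaults to base R's download.file(); it is a parameter so the
# function can be exercised without network access.
fetch_once <- function(url, destfile, fetch = utils::download.file) {
  if (!file.exists(destfile)) {
    fetch(url, destfile, mode = "wb")
  }
  invisible(destfile)
}

# Possible usage with the variables defined above (not run here):
# fetch_once(url, "repdata_data_StormData.csv.bz2")
```

Passing the download function as an argument is a design choice made only to keep the sketch testable offline.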

#The event type column (EVTYPE) is messy, so I fold the largest variant
#spellings into their official names, based on the supporting documentation

storm$EVTYPE[storm$EVTYPE %in% "TSTM WIND"] <- "THUNDERSTORM WIND"
storm$EVTYPE[storm$EVTYPE %in% "THUNDERSTORM WINDS"] <- "THUNDERSTORM WIND"
storm$EVTYPE[storm$EVTYPE %in% "MARINE TSTM WIND"] <- "MARINE THUNDERSTORM WIND"
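
The three replacements above cover only the largest categories. A more general cleanup could normalise case and whitespace first and then apply a lookup table. The sketch below runs on a small made-up vector; `normalise_evtype` and `synonyms` are hypothetical names, and applying this to the real data would merge more variants than the original analysis did, so the counts would shift.

```r
# Hypothetical lookup of variant spellings -> canonical event type.
synonyms <- c(
  "TSTM WIND"          = "THUNDERSTORM WIND",
  "THUNDERSTORM WINDS" = "THUNDERSTORM WIND",
  "MARINE TSTM WIND"   = "MARINE THUNDERSTORM WIND"
)

# Upper-case, trim, collapse repeated whitespace, then map known variants.
normalise_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- gsub("\\s+", " ", x)
  hit <- x %in% names(synonyms)
  x[hit] <- unname(synonyms[x[hit]])
  x
}

# Toy input, not taken from the storm data.
example <- c("tstm wind", " Thunderstorm  Winds", "HAIL", "MARINE TSTM WIND")
cleaned <- normalise_evtype(example)
print(cleaned)
```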


#First we look at what the top 20 most common events are in the data

count <- head(arrange(storm %>% count(EVTYPE), desc(n)), n=20)
count$rank <- seq(20)
count$top <- "X"


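#Next we see which caused the most property and crop damage
#(PROPDMG and CROPDMG are summed as recorded; the PROPDMGEXP/CROPDMGEXP magnitude codes are not applied)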
storm$dmg <- storm$PROPDMG+storm$CROPDMG

propdmg <- head(arrange(storm %>% 
                                group_by(EVTYPE) %>% 
                                summarize(dmg = sum(dmg)),
                        desc(dmg)), n = 10)

#merge() keeps only the event types that are also among the 20 most frequent
propdmg <- arrange(merge(propdmg, count), desc(dmg))
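
The damage total above adds PROPDMG and CROPDMG as recorded. In the storm data these columns are accompanied by PROPDMGEXP and CROPDMGEXP, which hold magnitude codes such as K, M and B. A minimal sketch of converting to dollars is shown below on made-up rows. `to_dollars` is a hypothetical helper, and treating every unrecognised code as a multiplier of 1 is an assumption, not a rule taken from the documentation.

```r
# Map a magnitude code to a multiplier.
# Assumption: codes other than H, K, M, B (any case) count as 1.
to_dollars <- function(value, exp_code) {
  code <- toupper(as.character(exp_code))
  mult <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)[code]
  mult[is.na(mult)] <- 1
  value * unname(mult)
}

# Toy rows, not taken from the storm data.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  CROPDMG    = c(0, 1, 0, 5),
  CROPDMGEXP = c("", "k", "", "K"),
  stringsAsFactors = FALSE
)
toy$dmg_usd <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP) +
  to_dollars(toy$CROPDMG, toy$CROPDMGEXP)
print(toy$dmg_usd)
```

Scaling the figures this way would likely change the damage ranking, so the plot and its interpretation would need to be rechecked if it were adopted.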


################################################

#INJURIES AND FATALITIES

inj <- head(arrange(storm %>% 
                            group_by(EVTYPE) %>% 
                            summarize(INJURIES = sum(INJURIES)),
                    desc(INJURIES)), n = 20)

ftl <- head(arrange(storm %>% 
                            group_by(EVTYPE) %>% 
                            summarize(FATALITIES = sum(FATALITIES)),
                    desc(FATALITIES)), n = 20)

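#Combine the two rankings, then keep the 10 event types with the most casualties
#(merge() sorts by EVTYPE, so the ranking has to happen before head())
hurt <- merge(inj, ftl)
hurt$totals <- hurt$INJURIES + hurt$FATALITIES
hurt <- head(arrange(hurt, desc(totals)), n = 10)

#Reshape to long format; the combined total is dropped first so the value column can reuse the name totals
hurt2 <- gather(select(hurt, -totals), key = Incident, value = totals, -EVTYPE)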

####Changes over time and the composition of these disasters

storm$date <- mdy_hms(storm$BGN_DATE)
storm$year <- year(storm$date)

top <- filter(merge(storm, count), top == "X" & rank < 11)
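
The merge above attaches the rank to every row of the full data set before filtering. Since the later plot only needs the rows whose event type is in the top 10, the same subset can be selected with `%in%`. The sketch below shows the idea on toy data; `events`, `tally`, `keep_n` and `top_events` are hypothetical names, and the values are made up.

```r
# Toy stand-ins for the storm data and the ranked frequency table.
events <- data.frame(
  EVTYPE = c("HAIL", "HAIL", "TORNADO", "FLOOD", "HAIL", "TORNADO"),
  year   = c(1990, 1991, 1991, 1992, 1992, 1992),
  stringsAsFactors = FALSE
)

# Count events per type and rank them from most to least frequent.
tally <- as.data.frame(table(EVTYPE = events$EVTYPE), stringsAsFactors = FALSE)
tally <- tally[order(-tally$Freq), ]
tally$rank <- seq_len(nrow(tally))

# Keep only rows whose event type is among the top `keep_n`.
keep_n <- 2
top_types  <- tally$EVTYPE[tally$rank <= keep_n]
top_events <- events[events$EVTYPE %in% top_types, ]
print(table(top_events$EVTYPE))
```

Unlike the merge, this keeps the original columns only, so the rank is not carried along with each row.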

RESULTS

First, I wanted to know the most common storm "event types" in the US during this period as well as their economic cost, so I plotted both on a single scatter plot. Here, we can see that despite being a common event, hail has lower costs, while tornadoes show the inverse pattern: less common yet costly.

ggplot(propdmg, aes(x = dmg, y = n))+
        geom_point()+
        #label each point with its event type
        geom_label_repel(aes(label = EVTYPE), size = 2.5)+
        labs(x = "Property and Crop Damage",
             y = "Number of Events",
             title = "Damage by Event Type (TOP 10)*",
             caption = "*Ranked by total recorded property and crop damage")+
        theme_economist()

The severity of tornadoes is further reflected in the next chart showing injuries and fatalities. Here we see that of the top 10 disasters, tornadoes are deadliest, followed by thunderstorm winds as a distant second.

ggplot(hurt2, aes(EVTYPE, totals))+
        geom_bar(aes(fill = Incident), position = position_stack(reverse = TRUE),
                 stat="identity")+coord_flip()+
        labs(y = "Total Individuals",
             x = NULL,
             title = "Fatalities and Injuries by Type of Event (TOP 10)*",
             caption = "*In terms of total injuries and fatalities")+
        theme_igray()

Finally, since the data span six decades and extreme weather is a highly relevant subject in the era of climate change, I plotted these events chronologically. Here we can see that, assuming a constant recording methodology, these events are increasing in frequency.

y <- ggplot(top, aes(year))
y+geom_bar(aes(fill=EVTYPE))+
        labs(x = "Year",
             y = "Number of Events",
             title = "Number of Events by Year (TOP 10)*",
             caption = "*By number of occurrences")+
             scale_fill_brewer(palette="RdYlBu")
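
The trend reading above rests on the assumption of a constant methodology. One way to probe that assumption is to tabulate, per year, how many events and how many distinct event types were recorded: a sudden jump in distinct types would point to a change in record keeping rather than in the weather. The sketch below uses made-up rows; `toy`, `types_per_year` and `events_per_year` are hypothetical names.

```r
# Toy rows, not taken from the storm data.
toy <- data.frame(
  year   = c(1950, 1950, 1951, 1996, 1996, 1996, 1996),
  EVTYPE = c("TORNADO", "TORNADO", "TORNADO", "TORNADO", "HAIL", "FLOOD", "HAIL"),
  stringsAsFactors = FALSE
)

# Number of distinct event types recorded in each year.
types_per_year <- tapply(toy$EVTYPE, toy$year, function(x) length(unique(x)))

# Number of events recorded in each year.
events_per_year <- table(toy$year)

print(types_per_year)
print(events_per_year)
```

On the real data, the same two tables could be built from the year and EVTYPE columns created earlier.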

This graph further suggests that the cost of extreme weather will likely continue to increase, potentially reaching hundreds of billions of dollars a year, as Yale researchers pointed out in April 2019. https://www.yaleclimateconnections.org/2019/04/climate-change-could-cost-u-s-economy-billions/