The Adverse Health & Economic Effects of Categorical Weather Events (1950-2011)

Synopsis

The data concerning the adverse impact of weather events on the health and economic well-being of the US population between 1950 and 2011 paints a complex picture. Cumulatively, tornadoes have been responsible for more deaths and injuries by a wide margin - more than the next five weather events combined. However, on a per-event basis, among the 10 event types with the most cumulative deaths, heat is responsible for more deaths and injuries than any other event type. From an economic perspective, flooding has caused the highest cumulative property damage costs by a wide margin, and drought has caused the highest cumulative crop damage costs by a significant margin. On a per-event basis, excluding rare event types (i.e. those recorded 10 or fewer times), the picture changes: hurricanes/typhoons have cost the most in terms of both property and crop damage.

Data Processing

The data for this analysis was sourced from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The data was loaded into R by downloading the csv.bz2 file from its URL and reading it with the base R read.csv() function, which can read bzip2-compressed files directly. To facilitate data processing and transformation, several packages were installed and loaded: tidyverse (for data tidying and manipulation), ggplot2 (for data visualization), and forcats (for factor recoding).

# Download the storm data once; the local copy keeps a .csv name but is still bzip2-compressed
if (!file.exists("weatherevents.csv")) {
    download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile="weatherevents.csv", method="curl")
}

df <- read.csv("weatherevents.csv")
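
The download step saves a bzip2-compressed file under a plain .csv name, and read.csv() still reads it because base R connections detect the compression from the file's contents rather than its extension. The following is a minimal sketch of that behaviour on made-up data; the `toy` data frame and the temporary file are illustrative only and are not part of the original analysis.

```r
# Write a tiny bzip2-compressed CSV under a ".csv" name, then read it back.
tmp <- tempfile(fileext = ".csv")
toy <- data.frame(EVTYPE = c("TORNADO", "HEAT"),
                  FATALITIES = c(3, 1),
                  stringsAsFactors = FALSE)

con <- bzfile(tmp, open = "w")          # compress on write
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which recognises bzip2 by content
back <- read.csv(tmp, stringsAsFactors = FALSE)
print(back)

unlink(tmp)                             # remove the temporary file
```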

list.of.packages <- c("tidyverse", "ggplot2", "forcats")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[, "Package"])]
if(length(new.packages)) install.packages(new.packages)
library(tidyverse); library(ggplot2); library(forcats)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag():    dplyr, stats

Results

To examine the effects of weather events on population health, the data was grouped by event type (e.g. Tornado, Flood, etc.) and the cumulative number of deaths and injuries for each event type was calculated. In addition, the average number of deaths and injuries per individual event was calculated to examine which event types have the greatest impact on health on a per-event basis. The 10 event types with the highest cumulative death counts were then extracted for further analysis.

healthEffect <- df %>%
    group_by(type=EVTYPE) %>%
    summarize(events=n(), deaths=sum(FATALITIES), injuries=sum(INJURIES),
              deathByEvent=deaths/events, injuryByEvent=injuries/events) %>%
    arrange(desc(deaths)) %>%
    head(10)

Deaths and injuries data was combined into a single column to examine the cumulative casualty count by event type.

# desc() takes a single column; type is passed to arrange() as a separate tie-breaker
casualtyEffect <- healthEffect %>% gather(deaths, injuries, key="casualtyType", value="casualties") %>% group_by(type) %>% arrange(desc(casualties), type)
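
The reshaping step above stacks the deaths and injuries columns into one casualties column. Since tidyr 1.0.0, gather() is superseded by pivot_longer(), so the following sketch shows the same wide-to-long reshape with the newer function. The `toy` data frame and its numbers are made up for illustration and do not come from the storm database.

```r
library(tidyr)

# Made-up totals in the same wide shape as healthEffect
toy <- data.frame(type = c("TORNADO", "HEAT"),
                  deaths = c(50, 9),
                  injuries = c(900, 20),
                  stringsAsFactors = FALSE)

# One row per (type, casualtyType) pair; row order may differ from gather()
long <- pivot_longer(toy, cols = c(deaths, injuries),
                     names_to = "casualtyType", values_to = "casualties")
print(long)
```

Each event type now contributes two rows, which is the shape geom_bar() needs to stack deaths and injuries with fill=casualtyType.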

The graph below illustrates the cumulative casualty counts for the 10 deadliest weather event types. By far, tornadoes have resulted in the greatest number of deaths and injuries.

ggplot(casualtyEffect, aes(x=reorder(type,-casualties), y=casualties,fill=casualtyType)) + geom_bar(stat="identity") + labs(x="Event", y="Casualties") + ggtitle("Casualties by Event (Deaths & Injuries)") + theme(axis.text.x=element_text(angle = -90, hjust = 0)) + scale_fill_discrete(name="", labels=c("Deaths","Injuries"))

While tornadoes may have resulted in the highest number of total deaths and injuries, heat and excessive heat are more detrimental to public health on a per event basis.

The table below ranks the 10 event types with the most cumulative deaths by deaths per event. As you can see, Heat, Excessive Heat, Rip Currents, and Avalanches are far more lethal per event than Tornadoes:

healthEffect %>% select(type, events, deathByEvent) %>% arrange(desc(deathByEvent)) %>% head(10)
## # A tibble: 10 x 3
##              type events deathByEvent
##            <fctr>  <int>        <dbl>
##  1           HEAT    767  1.221642764
##  2 EXCESSIVE HEAT   1678  1.134088200
##  3    RIP CURRENT    470  0.782978723
##  4      AVALANCHE    386  0.580310881
##  5        TORNADO  60652  0.092874101
##  6      LIGHTNING  15754  0.051796369
##  7          FLOOD  25326  0.018558004
##  8    FLASH FLOOD  54277  0.018018682
##  9      HIGH WIND  20212  0.012269939
## 10      TSTM WIND 219940  0.002291534

The table below ranks the same event types by injuries per event. Again, Excessive Heat and Heat have the most adverse effect:

healthEffect %>% select(type, events, injuryByEvent) %>% arrange(desc(injuryByEvent)) %>% head(10)
## # A tibble: 10 x 3
##              type events injuryByEvent
##            <fctr>  <int>         <dbl>
##  1 EXCESSIVE HEAT   1678    3.88855781
##  2           HEAT    767    2.73794003
##  3        TORNADO  60652    1.50606740
##  4    RIP CURRENT    470    0.49361702
##  5      AVALANCHE    386    0.44041451
##  6      LIGHTNING  15754    0.33197918
##  7          FLOOD  25326    0.26806444
##  8      HIGH WIND  20212    0.05625371
##  9    FLASH FLOOD  54277    0.03273947
## 10      TSTM WIND 219940    0.03163135
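
Both per-event tables above are computed from healthEffect, which already holds only the 10 event types with the most cumulative deaths. An event type that is lethal per event but low in total deaths therefore cannot appear in them. The sketch below illustrates how the order of the two steps changes the ranking; the `toy` data and the cut-off of 3 are invented for the example.

```r
# Made-up counts: "RARE" is deadly per event but low in cumulative deaths
toy <- data.frame(type   = c("TORNADO", "HEAT", "WIND", "RARE"),
                  events = c(1000, 10, 5000, 20),
                  deaths = c(100, 12, 50, 10),
                  stringsAsFactors = FALSE)
toy$deathByEvent <- toy$deaths / toy$events

# Order used in the analysis: keep the top 3 by total deaths, then rank per event
top_total     <- head(toy[order(-toy$deaths), ], 3)
ranked_subset <- top_total[order(-top_total$deathByEvent), "type"]

# Alternative: rank every type per event first, then keep the top 3
ranked_all <- head(toy[order(-toy$deathByEvent), "type"], 3)

print(ranked_subset)
print(ranked_all)
```

In this toy case "RARE" appears only in the second ranking, so the per-event tables are best read as a comparison among the event types with the most cumulative deaths.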

To examine the effects of weather on US economic health, some data tidying was required prior to analysis. First, a subset of data containing event types, property damage and crop damage was extracted. Then, the property and crop damage exponent codes (e.g. “B” for billion, “5” for one hundred thousand, “K” for thousand, etc.) were recoded to their numerical multipliers. Property and crop damage values were multiplied by the converted multiplier to obtain the actual property and crop damage figures. Exponent codes not covered by the recoding (for example “h”, “H” and lower-case “m” in the property damage column) are coerced to NA, which triggers the warning shown below, and the affected records are excluded from the sums. The data was then grouped by event type and the cumulative property and crop damage costs for each event type were calculated. In addition, the average property damage and crop damage costs per event were calculated to examine which event types have the greatest economic impact on a per-event basis. The 10 event types with the highest cumulative property damage costs and the 10 with the highest cumulative crop damage costs were extracted for further analysis.

econEffect <- df %>% select(EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
econEffect1 <- econEffect %>% mutate(propDmgExp=fct_collapse(PROPDMGEXP, "1000000000"="B", "100000000"="8", "10000000"="7", "1000000"=c("M","6"), "100000"="5", "10000"="4", "1000"=c("K","3"), "100"="2", "10"="1", "1"=c("?","-","+","0","")))
econEffect2 <- econEffect1 %>% mutate(cropDmgExp=fct_collapse(CROPDMGEXP, "1000000000"="B", "1000000"=c("M","m"), "1000"=c("K","k"), "100"="2", "1"=c("0","?","")))
econEffect3 <- econEffect2 %>% mutate(propDmgExp=as.numeric(as.character(propDmgExp)), cropDmgExp=as.numeric(as.character(cropDmgExp)), propDmg=PROPDMG * propDmgExp, cropDmg=CROPDMG * cropDmgExp)
## Warning in eval(substitute(expr), envir, enclos): NAs introduced by
## coercion
propEffect <- econEffect3 %>% group_by(type=EVTYPE) %>% summarize(events=length(EVTYPE), propDmg=sum(propDmg, na.rm=TRUE), cropDmg=sum(cropDmg, na.rm=TRUE), propDmgByEvent=propDmg/events, cropDmgByEvent=cropDmg/events) %>% arrange(desc(propDmg)) %>% head(10)
cropEffect <- econEffect3 %>% group_by(type=EVTYPE) %>% summarize(events=length(EVTYPE), propDmg=sum(propDmg, na.rm=TRUE), cropDmg=sum(cropDmg, na.rm=TRUE), propDmgByEvent=propDmg/events, cropDmgByEvent=cropDmg/events) %>% arrange(desc(cropDmg)) %>% head(10)
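
The exponent recoding above can also be expressed as a plain lookup table, which makes explicit which codes are left unmapped. The following base R sketch uses the same multipliers as the fct_collapse() calls; the names `exp_lookup` and `to_multiplier` and the `toy` data are hypothetical and not part of the original analysis.

```r
# Multipliers as encoded in the analysis above; unlisted codes become NA
exp_lookup <- c("B" = 1e9, "8" = 1e8, "7" = 1e7, "M" = 1e6, "6" = 1e6,
                "5" = 1e5, "4" = 1e4, "K" = 1e3, "3" = 1e3, "2" = 1e2,
                "1" = 10, "0" = 1, "?" = 1, "-" = 1, "+" = 1)

to_multiplier <- function(code) {
  code <- as.character(code)
  mult <- unname(exp_lookup[code])  # NA for any code not in the table
  mult[code == ""] <- 1             # blank exponent: take the value as-is
  mult
}

# Made-up records; "h" is deliberately left unmapped, as in the original recoding
toy <- data.frame(PROPDMG = c(25, 2.5, 10, 3),
                  PROPDMGEXP = c("K", "M", "", "h"),
                  stringsAsFactors = FALSE)
toy$propDmg <- toy$PROPDMG * to_multiplier(toy$PROPDMGEXP)
print(toy)
```

Counting `sum(is.na(toy$propDmg))` on the real data would show how many records the NA coercion drops from the totals.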

The first graph below illustrates the 10 highest cumulative property damage costs by weather event type. By far, floods have resulted in the greatest amount of property damage in terms of economic costs. The second graph below illustrates the 10 highest cumulative crop damage costs by weather event type. By far, droughts have resulted in the greatest amount of crop damage in terms of economic costs.

# Damage totals are stored in dollars; divide by 1e9 so the axis matches its "Billions" label
ggplot(propEffect, aes(x=reorder(type, -propDmg), y=propDmg/1e9, fill=type)) + geom_bar(stat="identity") + labs(x="Event", y="Damage (in Billions of USD)") + ggtitle("Property Damage by Event") + theme(axis.text.x=element_text(angle = -90, hjust = 0), legend.position ="none")

ggplot(cropEffect, aes(x=reorder(type, -cropDmg), y=cropDmg/1e9, fill=type)) + geom_bar(stat="identity") + labs(x="Event", y="Damage (in Billions of USD)") + ggtitle("Crop Damage by Event") + theme(axis.text.x=element_text(angle = -90, hjust = 0), legend.position ="none")

Similar to the health effects data, the per-event figures tell a different story. The table below lists the 10 most costly event types on a per-event basis in terms of property damage. To exclude rare events, only event types recorded more than 10 times are included. As you can see, Hurricanes/Typhoons, Storm Surges and Severe Thunderstorms, although infrequent, have had disproportionately adverse property damage effects on a per-event basis.

propDamageByEvent <- econEffect3 %>% group_by(type=EVTYPE) %>% summarize(events=length(EVTYPE), propDmg=sum(propDmg, na.rm=TRUE), cropDmg=sum(cropDmg, na.rm=TRUE), propDmgByEvent=propDmg/events, cropDmgByEvent=cropDmg/events) %>% arrange(desc(propDmgByEvent))
propDamageByEvent %>% select(type, events, propDmgByEvent) %>% filter(events > 10) %>% arrange(desc(propDmgByEvent)) %>% head(10)
## # A tibble: 10 x 3
##                   type events propDmgByEvent
##                 <fctr>  <int>          <dbl>
##  1   HURRICANE/TYPHOON     88      787566364
##  2         STORM SURGE    261      165990559
##  3 SEVERE THUNDERSTORM     13       92720000
##  4           HURRICANE    174       68208730
##  5             TYPHOON     11       54566364
##  6    STORM SURGE/TIDE    148       31359378
##  7         RIVER FLOOD    173       29589280
##  8   FLASH FLOOD/FLOOD     22       12384091
##  9      TROPICAL STORM    690       11165059
## 10             TSUNAMI     20        7203100

The table below lists the 10 most costly event types on a per-event basis in terms of crop damage. Again, only event types recorded more than 10 times are included, to exclude rare events. Hurricanes/Typhoons, River Floods, and Hurricanes, although infrequent, have had disproportionately adverse crop damage effects on a per-event basis.

cropDamageByEvent <- econEffect3 %>% group_by(type=EVTYPE) %>% summarize(events=length(EVTYPE), propDmg=sum(propDmg, na.rm=TRUE), cropDmg=sum(cropDmg, na.rm=TRUE), propDmgByEvent=propDmg/events, cropDmgByEvent=cropDmg/events) %>% arrange(desc(cropDmgByEvent))
cropDamageByEvent %>% select(type, events, cropDmgByEvent) %>% filter(events > 10) %>% arrange(desc(cropDmgByEvent)) %>% head(10)
## # A tibble: 10 x 3
##                 type events cropDmgByEvent
##               <fctr>  <int>          <dbl>
##  1 HURRICANE/TYPHOON     88       29634918
##  2       RIVER FLOOD    173       29072017
##  3         HURRICANE    174       15758103
##  4            FREEZE     74        6030068
##  5           DROUGHT   2488        5615983
##  6         ICE STORM   2006        2503546
##  7       HEAVY RAINS     26        2326923
##  8      EXTREME COLD    655        1974005
##  9             FROST     53        1245283
## 10 UNSEASONABLY COLD     23        1088804
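
The two tables above group on the raw EVTYPE strings, so HURRICANE, HURRICANE/TYPHOON and TYPHOON are counted as separate event types. The sketch below shows one possible way to fold such labels together before summing. The mapping is an assumption made for illustration, not something the original analysis does, and the `toy` figures are invented.

```r
# Made-up damage figures for overlapping event labels
toy <- data.frame(EVTYPE  = c("HURRICANE", "HURRICANE/TYPHOON", "TYPHOON", "FLOOD"),
                  cropDmg = c(4, 6, 2, 1),
                  stringsAsFactors = FALSE)

# Hypothetical mapping: fold three tropical-cyclone labels into one
cyclone_labels <- c("HURRICANE", "HURRICANE/TYPHOON", "TYPHOON")
merged <- ifelse(toy$EVTYPE %in% cyclone_labels, "HURRICANE/TYPHOON", toy$EVTYPE)

# Sum damage per merged label
totals <- tapply(toy$cropDmg, merged, sum)
print(totals)
```

Whether labels such as these should be merged is a judgment call, and merging changes both the event counts and the per-event averages.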