Synopsis

In this report we attempt to identify the types of natural disasters that have the greatest impact on human health and the greatest economic consequences. We examine disaster data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. After excluding two extremely atypical events, we found that among events recorded since 1985, extreme summer conditions were responsible for the most human fatalities and flooding caused the most property and crop damage.

Data Processing

The NOAA data was downloaded from the web and manually unzipped to the hard drive.

# Adjust the setwd() path to the local machine. The download is commented out so it runs only once (large file).
setwd("C:/Dropbox/Professional/Coursera/5_ReproResearch/")
#download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", 
#              destfile = "noaa.csv.bz2")
noaa <- read.csv("noaa.csv", header=TRUE)
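
The manual unzip step may not be strictly necessary, since `read.csv()` can read a bzip2-compressed file directly. The following is a minimal sketch of that behaviour using a tiny stand-in file written to a temporary path; the toy columns and values are made up for illustration and are not NOAA data.

```r
# Write a tiny stand-in CSV to a bzip2-compressed temp file (toy data, not NOAA's).
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,3", "FLOOD,0"), con)
close(con)

# read.csv() opens the path through file(), which detects bzip2 compression
# when reading, so the file does not have to be unzipped first.
toy <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)
print(toy)
```

Under this assumption, the downloaded `noaa.csv.bz2` could be passed to `read.csv()` as-is, at the cost of a slower read.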

We converted the raw date to R's Date class and created numerical multipliers so that damage recorded in billions is expressed in millions (1 billion = 1,000 million). This will allow us later to combine crop and property damage costs as one measure of total economic cost, in millions of dollars.

library(dplyr); library(lattice)
## 
## Attaching package: 'dplyr'
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
noaa$date <- as.Date(noaa$BGN_DATE, format="%m/%d/%Y %H:%M:%S")
noaa$pxp <- ifelse(noaa$PROPDMGEXP == "B", 1000, 1)
noaa$cxp <- ifelse(noaa$CROPDMGEXP == "B", 1000, 1)
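
The multipliers above scale only the B code and leave every other code at 1. That is harmless when the value is coded M, but a K-coded crop figure sitting next to an M-coded property figure would be counted as millions. One possible alternative, sketched here on toy data, is a lookup table that converts each exponent code to millions. The name `to_millions`, the toy values, and the choice to treat unknown codes as zero are assumptions for illustration, not part of the original analysis.

```r
# Hypothetical lookup: multiplier that converts each exponent code to millions of dollars.
to_millions <- c(K = 1e-3, M = 1, B = 1e3)

# Toy records (made-up values) covering the three common codes and a blank code.
toy <- data.frame(PROPDMG    = c(25, 2.5, 1.2, 10),
                  PROPDMGEXP = c("K", "M", "B", ""),
                  stringsAsFactors = FALSE)

# Index the named vector by code; codes not in the lookup come back as NA.
mult <- unname(to_millions[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 0   # assumption: unknown or blank codes contribute no damage

toy$pcost <- toy$PROPDMG * mult   # property cost in millions of dollars
print(toy)
```

Applying such a lookup to the full data would change the totals reported below, so it is offered only as a possible refinement.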

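Broad classifications of natural disaster types were created: tornadoes, floods, extreme summer conditions, thunderstorms, hurricanes, and extreme winter conditions, with everything else labeled "Other". For example, extreme cold, ice and blizzards are all products of winter, whereas droughts and severe heat are products of summer conditions. The rules are applied in sequence, so an event type that matches several patterns keeps the label of the last matching rule.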

noaa$etype <- tolower(noaa$EVTYPE)
noaa$event <- "Other"
noaa$event <- ifelse(grepl("tornado", noaa$etype), "Tornados", noaa$event)
noaa$event <- ifelse(grepl("flood", noaa$etype), "Floods", noaa$event)
noaa$event <- ifelse(grepl("heat", noaa$etype) | grepl("hot", noaa$etype) | grepl("drought", noaa$etype) | 
                       grepl("high", noaa$etype) | grepl("driest", noaa$etype), "Summer", noaa$event)
noaa$event <- ifelse(grepl("tstm", noaa$etype) | grepl("thunder", noaa$etype), "Thunderstorms", noaa$event)
noaa$event <- ifelse(grepl("tropical", noaa$etype) | grepl("hurri", noaa$etype) | grepl("typh", noaa$etype),
                     "Hurricanes", noaa$event)
noaa$event <- ifelse(grepl("winter", noaa$etype) | grepl("snow", noaa$etype) | 
                       grepl("blizzard", noaa$etype) | grepl("cold", noaa$etype) | 
                       grepl("freeze", noaa$etype) | grepl("hail", noaa$etype) | 
                       grepl("ice", noaa$etype) | grepl("chill", noaa$etype) , 
                     "Winter", noaa$event)

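Because each `ifelse()` call above overwrites `noaa$event`, the order of the rules decides the label of event types that match more than one pattern. The sketch below reproduces that behaviour with a reduced set of patterns on a few example strings; `rules` and `classify` are hypothetical names introduced only for this illustration.

```r
# Ordered pattern -> label rules (a reduced subset of the patterns used above).
# Later entries override earlier ones, like the sequence of ifelse() calls.
rules <- c("tornado"                     = "Tornados",
           "flood"                       = "Floods",
           "heat|hot|drought"            = "Summer",
           "tstm|thunder"                = "Thunderstorms",
           "snow|blizzard|cold|hail|ice" = "Winter")

classify <- function(etype) {
  event <- rep("Other", length(etype))   # default label
  for (pattern in names(rules)) {
    event[grepl(pattern, etype)] <- rules[[pattern]]
  }
  event
}

examples <- c("flash flood", "thunderstorm wind/hail", "excessive heat", "lightning")
labels <- classify(examples)
print(data.frame(etype = examples, event = labels))
```

In this sketch "thunderstorm wind/hail" ends up in the Winter group because the hail pattern is applied after the thunderstorm pattern, which is worth keeping in mind when reading the group totals.
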
We restricted the data to events from 1985 onward. To investigate the impact on human health, we further restricted the data to events in which at least one person died. Separately, to investigate the economic consequences, we restricted the data to events whose property or crop damage was recorded in millions or billions of dollars (an exponent code of M or B).

hh <- noaa %>% 
        mutate(death = FATALITIES) %>% 
        filter(date >= "1985-01-01" & FATALITIES > 0)
dmg <- noaa %>% 
        mutate(pcost = PROPDMG*pxp, ccost = CROPDMG*cxp, tcost = pcost+ccost) %>%
        filter(date >= "1985-01-01" & 
              (PROPDMGEXP %in% c('B', 'M') | CROPDMGEXP %in% c('B', 'M')))   
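
The damage filter above selects on the exponent code, so a record coded M with a value below 1 would still be kept. If a strict threshold of at least $1 million were wanted, one option is to filter on the computed total instead. This is a minimal base R sketch on made-up values, with an explicit Date cutoff; `cutoff` and the toy data frame are assumptions for illustration.

```r
# Toy records (made-up): total cost already expressed in millions of dollars.
toy <- data.frame(
  date  = as.Date(c("1984-06-01", "1990-07-15", "2005-08-29")),
  tcost = c(50, 0.4, 1200)
)

cutoff <- as.Date("1985-01-01")   # explicit Date rather than a character string

# Keep events on or after the cutoff whose total cost is at least $1 million.
kept <- toy[toy$date >= cutoff & toy$tcost >= 1, ]
print(kept)
```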

Exclusion of outliers

The purpose of this report is to identify broad types of events that are most harmful to human health and have the greatest economic consequences. Two events, one that caused exceptional economic damage and one that caused exceptional fatalities, are not typical of the rest of the data and so are excluded from the analysis.

  1. A flood in Napa, California in 2006 that is recorded with over $100 billion in property damage. This high price tag is attributed to the region being very wealthy because of its world-renowned wine industry. Typical floods may do similar amounts of physical damage but are unlikely to cause as much economic damage in most other regions.
  2. The 1995 heat wave in Chicago. This tragic event saw temperatures in the 100s with very high humidity and resulted in over 700 heat-related deaths. However, it would not be considered typical, since other factors contributed to the high mortality, such as poverty and crime rates in the hardest-hit areas, which led many elderly people to refuse to open their windows at night.
dmg <- dmg[dmg$REFNUM!=605943,]
hh  <- hh[hh$REFNUM != 198690,]
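
The reference numbers used above have to be looked up before they can be excluded. A minimal sketch of one way to do that, on made-up data, is to locate the row with the largest value using `which.max()` and then drop it by its `REFNUM`. The toy values and the names `worst` and `trimmed` are illustrative assumptions.

```r
# Toy fatality records (made-up values); REFNUM is the unique record id.
toy <- data.frame(REFNUM = c(101, 102, 103, 104),
                  death  = c(2, 90, 5, 1))

# Find the single most extreme record and inspect it before deciding to drop it.
worst <- toy[which.max(toy$death), ]
print(worst)

# Exclude that record by id, as done for the two outliers above.
trimmed <- toy[toy$REFNUM != worst$REFNUM, ]
print(trimmed)
```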

Following the guidelines stated in the Storm Data Preparation documentation for this data, one record that was listed as a storm surge but was the result of Hurricane Katrina was reassigned to the Hurricane category.

dmg$event[dmg$REFNUM == 577616] <- "Hurricane"

Results

The following exploratory data analysis is based on 5986 events that caused at least one fatality and 10780 events causing at least $1M in damages.

Extreme summer conditions result in the highest fatalities.

We found that while a single tornado was responsible for the greatest number of deaths in one occurrence, disasters that are a product of summer conditions produce high death tolls more frequently. Further investigation into any predictable periodicity of extreme summer events would be warranted.

xyplot(death ~ date | event, data=hh, ylab="Deaths", xlab="Year", pch=16, alpha=.4, cex=.8,
       main="Number of Fatalities per disaster by disaster group")

Hurricanes result in the greatest economic cost.

On an individual event level, hurricanes cause by far the most property and crop damage. Hurricanes not only have high total costs per event but also produce high-cost events more frequently. The years around 2005 were especially bad for hurricanes.

xyplot(tcost ~ date | event, data=dmg, ylab="Cost (Millions)",xlab="Year", pch=16, alpha=.4, cex=.8,
       main="Total cost of damages per disaster by disaster group")

Long term impact

Frequent “low damage” disasters may end up causing more cumulative death and destruction than a single large event. To address this question we calculate total deaths and damages for each of the disaster types.

tdth <- hh  %>% group_by(event) %>% summarise(total_death=sum(death)) %>% arrange(desc(total_death))
tdmg <- dmg %>% group_by(event) %>% summarise(total_damage=sum(tcost)) %>% arrange(desc(total_damage))
tdth
## Source: local data frame [7 x 2]
## 
##           event total_death
## 1        Summer        3002
## 2         Other        2329
## 3      Tornados        2005
## 4        Floods        1525
## 5        Winter        1144
## 6 Thunderstorms         684
## 7    Hurricanes         201

This shows that since 1985, extreme summer conditions have been responsible for 3002 deaths, followed by other disasters not included in the specified groups with 2329 deaths, and tornadoes in third place with 2005 deaths.

tdmg
## Source: local data frame [8 x 2]
## 
##           event total_damage
## 1        Floods    197568.46
## 2    Hurricanes    106974.02
## 3        Winter    103119.66
## 4      Tornados     71671.59
## 5 Thunderstorms     49087.61
## 6         Other     36738.20
## 7        Summer     36376.36
## 8     Hurricane     31300.00

While hurricanes cause the most per-event damage, they are comparatively less frequent than floods and occur only in select regions of the country. Floods, however, are much more prevalent, and much agriculture is located in floodplains. Note that the Katrina storm-surge record appears as a separate “Hurricane” row because its label does not match “Hurricanes”; adding the two rows gives about $138B for hurricanes. Even so, floods have caused roughly $59B more in crop and property damage than hurricanes since 1985.
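
The damage table above lists “Hurricanes” and “Hurricane” as separate rows. As a worked check of the comparison between floods and hurricanes, the sketch below re-enters the printed totals by hand, merges the two labels, and recomputes the gap. The name `merged` and the relabeling step are illustrative and were not part of the original run.

```r
# Totals copied from the printed tdmg output above (millions of dollars).
tdmg <- data.frame(
  event = c("Floods", "Hurricanes", "Winter", "Tornados",
            "Thunderstorms", "Other", "Summer", "Hurricane"),
  total_damage = c(197568.46, 106974.02, 103119.66, 71671.59,
                   49087.61, 36738.20, 36376.36, 31300.00),
  stringsAsFactors = FALSE
)

# Fold the singular label into the plural one, then re-aggregate and re-sort.
tdmg$event[tdmg$event == "Hurricane"] <- "Hurricanes"
merged <- aggregate(total_damage ~ event, data = tdmg, FUN = sum)
merged <- merged[order(-merged$total_damage), ]
print(merged)

# Gap between floods and the combined hurricane total, in millions of dollars.
gap <- merged$total_damage[merged$event == "Floods"] -
       merged$total_damage[merged$event == "Hurricanes"]
print(gap)
```

With the two rows combined, hurricanes remain second to floods, so the ranking in the table is unchanged.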

Conclusions

While single large-scale disasters such as tornadoes and hurricanes can cause the most damage and loss of life in a single event, more “mundane” and frequent events such as extreme heat waves and floods can be more disastrous in the long run.