In this report we identify the types of natural disasters that have the greatest impact on human health and the greatest economic consequences. We examine disaster data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. After excluding two extremely atypical events, we found that for events since 1985, extreme summer conditions were responsible for the most human fatalities and flooding caused the most property and crop damage.
The NOAA data were downloaded from the web and manually unzipped to the hard drive.
# Adjust setwd() to the local machine. The download is commented out because
# the file is large; run it only once.
setwd("C:/Dropbox/Professional/Coursera/5_ReproResearch/")
#download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
# destfile = "noaa.csv.bz2")
noaa <- read.csv("noaa.csv", header=TRUE)
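The download call above is commented out and the archive was unzipped by hand. A minimal sketch of a guard that makes the step safe to re-run is shown here. `fetch_if_missing` is a hypothetical helper name, not part of the original analysis. The sketch also assumes the compressed file is read directly, which `read.csv` supports for bzip2 files, so manual unzipping would not be needed.

```r
# Hypothetical helper (not in the original analysis): download a file only
# when it is not already present, so the chunk can be re-run cheaply.
fetch_if_missing <- function(url, destfile) {
  if (!file.exists(destfile)) {
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Usage sketch, with the URL and file name taken from the commented-out call:
# fetch_if_missing(
#   "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#   "noaa.csv.bz2")
# noaa <- read.csv("noaa.csv.bz2", header = TRUE)  # reads the .bz2 directly
```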
We converted the raw date to a Date object and created numerical multipliers so that damage recorded in billions is expressed in millions (1 billion = 1,000 million). This later allows crop and property damage to be combined into a single measure of total economic cost.
library(dplyr); library(lattice)
##
## Attaching package: 'dplyr'
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
noaa$date <- as.Date(noaa$BGN_DATE, format="%m/%d/%Y %H:%M:%S")
noaa$pxp <- ifelse(noaa$PROPDMGEXP == "B", 1000, 1)
noaa$cxp <- ifelse(noaa$CROPDMGEXP == "B", 1000, 1)
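The two `ifelse()` calls map "B" to 1,000 and every other code to 1, which puts costs on a million-dollar scale only when the exponent is "M" or "B". As a sketch of a more general alternative, a lookup could also handle thousands ("K"). The helper name `exp_to_millions` and the choice to return `NA` for unrecognised codes are assumptions made for illustration, not part of the original analysis.

```r
# Hypothetical helper: convert NOAA damage exponent codes to a multiplier
# that expresses the amount in millions of dollars.
exp_to_millions <- function(code) {
  lookup <- c(K = 1e-3, M = 1, B = 1e3)
  # Codes are matched case-insensitively; anything else yields NA.
  unname(lookup[toupper(as.character(code))])
}

exp_to_millions(c("K", "M", "B", "m", ""))
```

Adopting such a lookup would change the totals reported here wherever one of the two damage fields is recorded in thousands, so it is offered only as a possible refinement.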
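Broad classifications of natural disaster types were created by matching keywords in the event type: tornadoes, floods, extreme summer conditions, thunderstorms, hurricanes, and extreme winter conditions, with all remaining events labelled “Other”. For example, extreme cold, ice and blizzards are all products of winter, whereas droughts and severe heat are products of summer conditions. Because the rules are applied in sequence, an event type that matches more than one group is assigned to the last group it matches.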
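As a compact illustration of this keyword approach, the sketch below expresses the same rules as an ordered list of regular expressions applied in a loop. The keywords and their order are copied from the chain of `ifelse()` calls used in this report. The names `rules` and `classify_event` are hypothetical.

```r
# Ordered rules mirroring the ifelse() chain: later rules override earlier ones.
rules <- list(
  Tornados      = "tornado",
  Floods        = "flood",
  Summer        = "heat|hot|drought|high|driest",
  Thunderstorms = "tstm|thunder",
  Hurricanes    = "tropical|hurri|typh",
  Winter        = "winter|snow|blizzard|cold|freeze|hail|ice|chill"
)

# Hypothetical helper: classify a vector of raw event-type strings.
classify_event <- function(evtype) {
  etype <- tolower(evtype)
  event <- rep("Other", length(etype))
  for (label in names(rules)) {
    event[grepl(rules[[label]], etype)] <- label
  }
  event
}

classify_event(c("TORNADO", "FLASH FLOOD", "EXCESSIVE HEAT",
                 "THUNDERSTORM WIND/HAIL", "HIGH WIND", "LIGHTNING"))
```

The sample inputs show two side effects of substring matching that are worth keeping in mind when reading the results: the keyword "high" places "HIGH WIND" in the Summer group, and "hail" moves a thunderstorm record into the Winter group.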
noaa$etype <- tolower(noaa$EVTYPE)
noaa$event <- "Other"
noaa$event <- ifelse(grepl("tornado", noaa$etype), "Tornados", noaa$event)
noaa$event <- ifelse(grepl("flood", noaa$etype), "Floods", noaa$event)
noaa$event <- ifelse(grepl("heat", noaa$etype) | grepl("hot", noaa$etype) | grepl("drought", noaa$etype) |
grepl("high", noaa$etype) | grepl("driest", noaa$etype), "Summer", noaa$event)
noaa$event <- ifelse(grepl("tstm", noaa$etype) | grepl("thunder", noaa$etype), "Thunderstorms", noaa$event)
noaa$event <- ifelse(grepl("tropical", noaa$etype) | grepl("hurri", noaa$etype) | grepl("typh", noaa$etype),
"Hurricanes", noaa$event)
noaa$event <- ifelse(grepl("winter", noaa$etype) | grepl("snow", noaa$etype) |
grepl("blizzard", noaa$etype) | grepl("cold", noaa$etype) |
grepl("freeze", noaa$etype) | grepl("hail", noaa$etype) |
grepl("ice", noaa$etype) | grepl("chill", noaa$etype) ,
"Winter", noaa$event)
We restricted the data to recent events, those beginning on or after January 1, 1985. To investigate the impact on human health, we further restricted the data to events in which at least one person died. Separately, to investigate the economic consequences, we restricted the data to events whose property or crop damage was recorded in millions (“M”) or billions (“B”) of dollars.
hh <- noaa %>%
mutate(death = FATALITIES) %>%
filter(date >= "1985-01-01" & FATALITIES > 0)
dmg <- noaa %>%
mutate(pcost = PROPDMG*pxp, ccost = CROPDMG*cxp, tcost = pcost+ccost) %>%
filter(date >= "1985-01-01" &
(PROPDMGEXP %in% c('B', 'M') | CROPDMGEXP %in% c('B', 'M')))
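Both filters use a fixed cutoff of 1985-01-01. If a window defined relative to the most recent record were preferred instead, one way to compute the cutoff is sketched here. `window_start` is a hypothetical helper and the dates are made up for illustration.

```r
# Hypothetical helper: the date `years` years before the latest date in `dates`.
window_start <- function(dates, years = 20) {
  latest <- max(dates, na.rm = TRUE)
  seq(latest, by = paste0("-", years, " years"), length.out = 2)[2]
}

# Made-up dates, for illustration only.
toy_dates <- as.Date(c("1990-05-01", "2003-07-15", "2010-06-30"))
cutoff <- window_start(toy_dates)

# Keep only the dates that fall inside the window.
toy_dates[toy_dates >= cutoff]
```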
The purpose of this report is to identify broad types of events that are most harmful to human health and have the greatest economic consequences. Two events, one that caused exceptionally high economic damage and one that caused an exceptionally high number of fatalities, are not typical of the rest of the events and so were excluded from the analysis.
dmg <- dmg[dmg$REFNUM!=605943,]
hh <- hh[hh$REFNUM != 198690,]
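The two excluded records are identified by their `REFNUM`. As a sketch of how such a record might be located before being dropped, the example below finds the most costly row in a small made-up data frame. The values are invented for illustration and are not taken from the NOAA data.

```r
# Made-up records, for illustration only (not taken from the NOAA data).
toy <- data.frame(REFNUM = c(101, 102, 103, 104),
                  tcost  = c(2.5, 40, 115000, 7))

# REFNUM of the single most costly record.
worst <- toy$REFNUM[which.max(toy$tcost)]

# Drop that record, in the same way the report drops REFNUM 605943 and 198690.
toy_trimmed <- toy[toy$REFNUM != worst, ]
```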
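Following the guidelines in the Storm Data Preparation documentation for this data, one record that was listed as a storm surge but resulted from Hurricane Katrina was reassigned to the hurricane category.
# Note: the label assigned here is "Hurricane" (singular), while the grouping
# code uses "Hurricanes", so this record is reported as its own row in the
# damage totals.
dmg$event[dmg$REFNUM == 577616] <- "Hurricane"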
The following exploratory data analysis is based on 5986 events that caused at least one fatality and 10780 events with damage recorded in millions or billions of dollars.
We found that while a single tornado was responsible for the greatest number of deaths in one occurrence, disasters that are a product of summer have high death tolls more frequently. Further investigation of any predictable periodicity in extreme summer events would be warranted.
xyplot(death ~ date | event, data=hh, ylab="Deaths", xlab="Year", pch=16, alpha=.4, cex=.8,
main="Number of Fatalities per disaster by disaster group")
It is clear that, at the level of individual events, hurricanes cause by far the most property and crop damage. Hurricanes not only have high costs per event but also produce high-cost events more often. The years around 2005 were especially bad for hurricanes.
xyplot(tcost ~ date | event, data=dmg, ylab="Cost (Millions)",xlab="Year", pch=16, alpha=.4, cex=.8,
main="Total cost of damages per disaster by disaster group")
Frequent “low damage” disasters may end up causing more cumulative death and destruction than a single large event. To address this question we calculate total deaths and total damage for each of the disaster types.
tdth <- hh %>% group_by(event) %>% summarise(total_death=sum(death)) %>% arrange(desc(total_death))
tdmg <- dmg %>% group_by(event) %>% summarise(total_damage=sum(tcost)) %>% arrange(desc(total_damage))
tdth
## Source: local data frame [7 x 2]
##
##           event total_death
## 1        Summer        3002
## 2         Other        2329
## 3      Tornados        2005
## 4        Floods        1525
## 5        Winter        1144
## 6 Thunderstorms         684
## 7    Hurricanes         201
This shows that over the period studied, extreme summer conditions were responsible for 3002 deaths, followed by other disasters not included in the specified groups (2329 deaths), with tornadoes third at 2005 deaths.
tdmg
## Source: local data frame [8 x 2]
##
##           event total_damage
## 1        Floods    197568.46
## 2    Hurricanes    106974.02
## 3        Winter    103119.66
## 4      Tornados     71671.59
## 5 Thunderstorms     49087.61
## 6         Other     36738.20
## 7        Summer     36376.36
## 8     Hurricane     31300.00
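The last row of this table, "Hurricane", holds the reassigned Katrina storm-surge record, which carries a singular label and is therefore totalled separately from "Hurricanes". A minimal sketch of folding the two rows together is shown here, using base R and the totals copied from the printed table. The object names `tdmg_printed` and `merged` are hypothetical.

```r
# Totals copied from the printed tdmg table (millions of dollars).
tdmg_printed <- data.frame(
  event = c("Floods", "Hurricanes", "Winter", "Tornados",
            "Thunderstorms", "Other", "Summer", "Hurricane"),
  total_damage = c(197568.46, 106974.02, 103119.66, 71671.59,
                   49087.61, 36738.20, 36376.36, 31300.00),
  stringsAsFactors = FALSE
)

# Fold the stray singular label into the plural group, then re-total.
tdmg_printed$event[tdmg_printed$event == "Hurricane"] <- "Hurricanes"
merged <- aggregate(total_damage ~ event, data = tdmg_printed, FUN = sum)
merged <- merged[order(-merged$total_damage), ]
merged
```

With the rows combined, hurricanes total 138274.02 million dollars and remain second to floods.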
While hurricanes cause the most per-event damage, they are less frequent than floods and occur only in select regions of the country. Floods, however, are much more prevalent, and since much agriculture is located in flood plains it is not surprising that floods caused roughly $60B more in crop and property damage than hurricanes over the period studied (counting both hurricane rows in the table).
While single large-scale disasters such as tornadoes and hurricanes can cause the most damage and the most harm to human health in a single event, more “mundane” and frequent events such as extreme heat waves and floods can be more disastrous in the long run.