Synopsis

This report uses the NOAA Storm database, which catalogs 902,297 severe weather events in the United States from 1950 to November 2011, to determine which event types have had the greatest public health and economic consequences for communities and municipalities. Fatalities, injuries, and financial cost (property plus crop damage) were totaled by event type, and the event types with the largest totals were identified.

The event types most harmful to population health are as follows:
1. Tornadoes (5,633 fatalities and 91,346 injuries)
2. Excessive heat and heat (2,840 fatalities and 8,625 injuries)
3. Floods and flash floods (1,448 fatalities and 8,566 injuries)
4. Lightning (816 fatalities and 5,230 injuries)
5. Thunderstorm wind and TSTM wind (637 fatalities and 8,445 injuries)

The event types with the greatest economic consequences (combined property and crop damage) are as follows:
1. Floods and flash floods ($167.9 billion)
2. Hurricanes and hurricane/typhoons ($86.5 billion)
3. Tornadoes ($57.3 billion)
4. Storm surge ($43.3 billion)
5. Hail ($18.8 billion)

Data Processing

The following R packages were used in this report:

library(ggplot2)
library(dplyr)
library(reshape2)
library(knitr)
library(scales)

The data was downloaded from the NOAA storm database as a bzip2-compressed CSV file and read into the storm data frame.

# make temp file to load storm data
temp <- tempfile()
# url for storm data
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# download data
download.file(url, temp, mode = "wb")
# read data into storm variable
storm <- read.csv(temp)
# delete temp file
unlink(temp)

Calculating the total financial cost requires an extra processing step. The property and crop damage values are each split over two columns in the storm dataset (e.g., PROPDMG and PROPDMGEXP): a number and a letter code for its magnitude. The following code converts the two EXP columns from a letter code (K / M / B) to a numerical multiplier (\(10^3\) / \(10^6\) / \(10^9\)). Other values in the EXP columns are not considered material to this analysis and are given a multiplier of 1.

# convert PROPDMGEXP & CROPDMGEXP columns from a letter to a numeric value
storm <- mutate(storm, propexp = if_else(PROPDMGEXP == "K", 10^3,
                                 if_else(PROPDMGEXP == "M", 10^6,
                                 if_else(PROPDMGEXP == "B", 10^9,
                                 1))))
storm <- mutate(storm, cropexp = if_else(CROPDMGEXP == "K", 10^3,
                                 if_else(CROPDMGEXP == "M", 10^6,
                                 if_else(CROPDMGEXP == "B", 10^9,
                                 1))))

# calculate actual property, crop and total cost per event
storm$propcalc <- storm$PROPDMG * storm$propexp 
storm$cropcalc <- storm$CROPDMG * storm$cropexp
storm$ecocalc <- storm$propcalc + storm$cropcalc
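
As a quick check of the conversion above, the sketch below applies the same K / M / B rule to a few made-up rows. The toy data frame and the lookup vector named multiplier are illustrative assumptions, not part of the original analysis, and a named-vector lookup in base R stands in for the nested if_else calls.

```r
# Toy rows (made-up values) using the same column names as the storm dataset
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: letter code -> numeric multiplier
multiplier <- c(K = 10^3, M = 10^6, B = 10^9)

# Look up each code; anything other than K / M / B yields NA here
toy$propexp <- unname(multiplier[toy$PROPDMGEXP])
# ...and is then given a multiplier of 1, as in the report's code
toy$propexp[is.na(toy$propexp)] <- 1

# Damage in dollars = reported number * multiplier
toy$propcalc <- toy$PROPDMG * toy$propexp
print(toy)
```

For instance, a row with PROPDMG = 25 and PROPDMGEXP = "K" works out to 25 * 10^3 = 25,000 dollars, while a row with an empty code keeps its raw value.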

The data was then grouped by event type (EVTYPE) and the total injuries, fatalities and financial cost were calculated. Three lists were generated, based on total fatalities (top.fat), total injuries (top.inj) and financial cost (top.cost).

# group storm data by EVTYPE
by_EVTYPE <- group_by(storm, EVTYPE)
# sum injuries, fatalities and damage amounts for each event type
# (stored as totals rather than sum, so the name of base R's sum() is not reused)
totals <- summarise(by_EVTYPE,
                    total.injuries = sum(INJURIES),
                    total.fatalities = sum(FATALITIES),
                    total.cost = sum(ecocalc),
                    total.property = sum(propcalc),
                    total.crop = sum(cropcalc))
# arrange the data by total fatalities and keep the top ten
arrange.fat <- arrange(totals, desc(total.fatalities))
top.fat <- arrange.fat[1:10, c(1, 3, 2)]
# arrange the data by total injuries and keep the top ten
arrange.inj <- arrange(totals, desc(total.injuries))
top.inj <- arrange.inj[1:10, 1:3]
# arrange the data by total economic cost and keep the top ten
arrange.eco <- arrange(totals, desc(total.cost))
top.cost <- arrange.eco[1:10, c(1, 4:6)]

Results

Effects on population health

The ten event types with the most fatalities and the ten with the most injuries are tabulated below. The fatality and injury totals for the top ten event types by fatalities are then plotted together on a log scale.

## display top ten event types by greatest number of fatalities
kable(top.fat, caption = "Event types by total fatalities")

Event types by total fatalities

EVTYPE            total.fatalities   total.injuries
TORNADO                       5633            91346
EXCESSIVE HEAT                1903             6525
FLASH FLOOD                    978             1777
HEAT                           937             2100
LIGHTNING                      816             5230
TSTM WIND                      504             6957
FLOOD                          470             6789
RIP CURRENT                    368              232
HIGH WIND                      248             1137
AVALANCHE                      224              170

## display top ten event types by greatest number of injuries
kable(top.inj, caption = "Event types by total injuries")

Event types by total injuries

EVTYPE            total.injuries   total.fatalities
TORNADO                    91346               5633
TSTM WIND                   6957                504
FLOOD                       6789                470
EXCESSIVE HEAT              6525               1903
LIGHTNING                   5230                816
HEAT                        2100                937
ICE STORM                   1975                 89
FLASH FLOOD                 1777                978
THUNDERSTORM WIND           1488                133
HAIL                        1361                 15

# plot total fatalities and injuries by EVTYPE
melted.fat <- melt(top.fat, id.vars = "EVTYPE")
ggplot(data = melted.fat,
       aes(x = EVTYPE, y = value, group = variable, color = variable)) +
  geom_line() +
  scale_y_log10() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = rel(0.9))) +
  labs(title = "Top Event Types by Total Fatalities and Injuries",
       x = "Event Type", y = "People Affected")

By far, the single event type most damaging to population health is the tornado, followed by excessive heat and heat, floods and flash floods, lightning, and thunderstorm wind and TSTM wind.
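
The combined categories named above (for example, excessive heat and heat) merge EVTYPE labels that the database records separately. The following is a minimal sketch of that roll-up in base R, using totals copied from the two tables. The category mapping and the names health, category, and combined are illustrative assumptions, not part of the original code.

```r
# Fatality and injury totals copied from the two tables above
health <- data.frame(
  EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "HEAT", "FLASH FLOOD", "FLOOD",
             "LIGHTNING", "TSTM WIND", "THUNDERSTORM WIND"),
  fatalities = c(5633, 1903, 937, 978, 470, 816, 504, 133),
  injuries   = c(91346, 6525, 2100, 1777, 6789, 5230, 6957, 1488),
  stringsAsFactors = FALSE
)

# Hypothetical mapping from raw EVTYPE labels to combined categories
category <- c("TORNADO"           = "Tornado",
              "EXCESSIVE HEAT"    = "Heat",
              "HEAT"              = "Heat",
              "FLASH FLOOD"       = "Flood",
              "FLOOD"             = "Flood",
              "LIGHTNING"         = "Lightning",
              "TSTM WIND"         = "Thunderstorm wind",
              "THUNDERSTORM WIND" = "Thunderstorm wind")
health$category <- unname(category[health$EVTYPE])

# Sum fatalities and injuries within each combined category
combined <- aggregate(cbind(fatalities, injuries) ~ category,
                      data = health, FUN = sum)

# Rank the combined categories by fatalities, largest first
combined <- combined[order(-combined$fatalities), ]
print(combined)
```

The heat category, for example, sums to 1,903 + 937 = 2,840 fatalities and 6,525 + 2,100 = 8,625 injuries.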

Effects on economic cost

The most damaging event types according to total economic cost are shown and graphed below. The total property and crop damage values are also plotted individually.

## display event types by greatest economic effect
# format only the numeric columns with thousands separators before display
kable(mutate_if(top.cost, is.numeric, comma), caption = "Event types by total dollar value", align = 'r')

Event types by total dollar value

EVTYPE                 total.cost    total.property       total.crop
FLOOD             150,319,678,257   144,657,709,807    5,661,968,450
HURRICANE/TYPHOON  71,913,712,800    69,305,840,000    2,607,872,800
TORNADO            57,340,614,060    56,925,660,790      414,953,270
STORM SURGE        43,323,541,000    43,323,536,000            5,000
HAIL               18,752,904,943    15,727,367,053    3,025,537,890
FLASH FLOOD        17,562,129,167    16,140,812,067    1,421,317,100
DROUGHT            15,018,672,000     1,046,106,000   13,972,566,000
HURRICANE          14,610,229,010    11,868,319,010    2,741,910,000
RIVER FLOOD        10,148,404,500     5,118,945,500    5,029,459,000
ICE STORM           8,967,041,360     3,944,927,860    5,022,113,500

# plot total economic cost by EVTYPE
melted.cost <- melt(top.cost, id.vars = "EVTYPE")
ggplot(data = melted.cost,
       aes(x = EVTYPE, y = value, group = variable, color = variable)) +
  geom_line() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = rel(0.9))) +
  labs(title = "Top Event Types by Total Economic Cost",
       x = "Event Type", y = "Dollar Value")

In dollar terms, floods and flash floods are the most damaging, followed by hurricanes and hurricane/typhoons. Tornadoes, storm surge, and hail round out the top five event types.
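
The combined flood and hurricane figures are sums over related EVTYPE rows in the table above. The short sketch below reproduces that arithmetic from the tabulated totals. The names cost, flood.total, and hurricane.total are illustrative assumptions, and RIVER FLOOD is left as a separate label here, which is what the grouping in the text implies.

```r
# Total cost in dollars, copied from the table above
cost <- c("FLOOD"             = 150319678257,
          "FLASH FLOOD"       = 17562129167,
          "HURRICANE/TYPHOON" = 71913712800,
          "HURRICANE"         = 14610229010)

# Combine the related labels and express each sum in billions of dollars
flood.total     <- sum(cost[c("FLOOD", "FLASH FLOOD")]) / 1e9
hurricane.total <- sum(cost[c("HURRICANE/TYPHOON", "HURRICANE")]) / 1e9

# Round to one decimal place for reporting
print(round(c(floods = flood.total, hurricanes = hurricane.total), 1))
```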