This report uses the NOAA Storm database, which catalogs 902,297 severe weather events in the United States from 1950 to November 2011, to determine which event types have had the greatest public health and economic consequences for communities and municipalities. Fatalities, injuries, and financial cost were totaled by event type, and the event types with the greatest population health and economic impact were then identified.
The event types most harmful to population health are as follows:
1. Tornadoes (5,633 fatalities and 91,346 injuries)
2. Excessive heat and heat (2,840 fatalities and 8,625 injuries)
3. Floods and flash floods (1,448 fatalities and 8,566 injuries)
4. Lightning (816 fatalities and 5,230 injuries)
5. Thunderstorm wind and TSTM wind (637 fatalities and 8,445 injuries)
The event types with the greatest economic consequences (combined property and crop damage) are as follows:
1. Floods and flash floods ($167.9 billion)
2. Hurricanes and hurricane/typhoons ($86.5 billion)
3. Tornadoes ($57.3 billion)
4. Hail ($18.8 billion)
5. Drought ($15.0 billion)
The following R packages were used in this report:
library(ggplot2)
library(dplyr)
library(reshape2)
library(knitr)
library(scales)
The data were downloaded from the NOAA storm database and read into a data frame named storm.
# make temp file to load storm data
temp <- tempfile()
# url for storm data
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# download data
download.file(url, temp, mode = "wb")
# read data into storm variable
storm <- read.csv(temp)
# delete temp file
unlink(temp)
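The download step passes a bzip2-compressed file straight to read.csv(). As a minimal sketch of why that works, the example below writes a small made-up data frame to a .bz2 file and reads it back without an explicit decompression step. The toy data and the variable names toy, path and toy_back are illustrative assumptions, not part of the original analysis.

```r
# Toy data frame standing in for the storm data (hypothetical values)
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(3, 0))

# Write it as a bzip2-compressed CSV, mimicking StormData.csv.bz2
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which detects bzip2
# compression and decompresses while reading
toy_back <- read.csv(path, stringsAsFactors = FALSE)
unlink(path)
```

The same transparent decompression applies to gzip and xz files, so no separate unzip step is needed before reading.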
Calculating the total financial cost requires an extra processing step. Property and crop damage values are each split over two columns in the storm dataset (e.g., PROPDMG and PROPDMGEXP): a numeric amount and a letter code for its magnitude. The following code converts the two EXP columns from a letter code (K / M / B) to a numerical multiplier (\(10^3\) / \(10^6\) / \(10^9\)). Any other value in the EXP columns is treated as a multiplier of 1; these values are not considered material to this analysis.
# convert PROPDMGEXP & CROPDMGEXP columns from a letter to a numeric value
storm <- mutate(storm, propexp = if_else(PROPDMGEXP == "K", 10^3,
                                 if_else(PROPDMGEXP == "M", 10^6,
                                 if_else(PROPDMGEXP == "B", 10^9,
                                         1))))
storm <- mutate(storm, cropexp = if_else(CROPDMGEXP == "K", 10^3,
                                 if_else(CROPDMGEXP == "M", 10^6,
                                 if_else(CROPDMGEXP == "B", 10^9,
                                         1))))
# calculate actual property, crop and total cost per event
storm$propcalc <- storm$PROPDMG * storm$propexp
storm$cropcalc <- storm$CROPDMG * storm$cropexp
storm$ecocalc <- storm$propcalc + storm$cropcalc
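The letter-to-multiplier conversion can also be written as a lookup table. The following is a minimal sketch in base R on made-up rows, not the code used for the results in this report. The helper exp_to_multiplier and the toy values are hypothetical; as in the analysis, any code other than K, M or B maps to a multiplier of 1.

```r
# Lookup table from exponent letter to multiplier (K / M / B as in the text)
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: multiplier for each code, defaulting to 1
exp_to_multiplier <- function(code) {
  m <- unname(multiplier[as.character(code)])
  m[is.na(m)] <- 1  # codes not in the table (including "") become 1
  m
}

# Toy rows (made-up values, not taken from the storm data)
toy <- data.frame(PROPDMG    = c(25, 2.5, 1.2, 10),
                  PROPDMGEXP = c("K", "M", "B", ""))

# Damage in dollars = amount * multiplier
toy$propcalc <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
```

A lookup table keeps the mapping in one place, which makes it easier to extend if other exponent codes are later judged material.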
The data was then grouped by event type (EVTYPE) and the total injuries, fatalities and financial cost were calculated. Three lists were generated, based on total fatalities (top.fat), total injuries (top.inj) and financial cost (top.cost).
# group storm data by EVTYPE
by_EVTYPE <- group_by(storm, EVTYPE)
# sum injuries, fatalities, and property and crop damage per event type
totals <- summarise(by_EVTYPE,
                    total.injuries = sum(INJURIES),
                    total.fatalities = sum(FATALITIES),
                    total.cost = sum(ecocalc),
                    total.property = sum(propcalc),
                    total.crop = sum(cropcalc)
)
# arrange the data by total fatalities and keep the top ten
arrange.fat <- arrange(totals, desc(total.fatalities))
top.fat <- arrange.fat[1:10, c(1, 3, 2)]
# arrange the data by total injuries and keep the top ten
arrange.inj <- arrange(totals, desc(total.injuries))
top.inj <- arrange.inj[1:10, 1:3]
# arrange the data by total cost (property plus crop damage) and keep the top ten
arrange.eco <- arrange(totals, desc(total.cost))
top.cost <- arrange.eco[1:10, c(1, 4:6)]
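To make the group, sum and rank pattern concrete, here is a small sketch on made-up events. It uses base R (aggregate and order) rather than the dplyr calls in the analysis, so it runs without extra packages; the toy data and the cutoff of two rows are assumptions for illustration only.

```r
# Toy events (made-up values, not taken from the storm data)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 3, 4, 0),
  INJURIES   = c(10, 0, 5, 1, 2)
)

# Sum both measures within each event type
toy.totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                        data = toy, FUN = sum)

# Sort by fatalities, largest first, and keep the top two event types
toy.top <- head(toy.totals[order(-toy.totals$FATALITIES), ], 2)
```

Here toy.top keeps TORNADO and HEAT, the two event types with the most fatalities in the toy data.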
The most damaging event types, ranked by fatalities and by injuries, are shown below. To assess which event types have the greatest impact on population health, fatalities and injuries for the top ten event types by fatalities were plotted together on a log scale.
## display top ten event types by greatest number of fatalities
kable(top.fat, caption = "Event types by total fatalities")
| EVTYPE | total.fatalities | total.injuries |
|---|---|---|
| TORNADO | 5633 | 91346 |
| EXCESSIVE HEAT | 1903 | 6525 |
| FLASH FLOOD | 978 | 1777 |
| HEAT | 937 | 2100 |
| LIGHTNING | 816 | 5230 |
| TSTM WIND | 504 | 6957 |
| FLOOD | 470 | 6789 |
| RIP CURRENT | 368 | 232 |
| HIGH WIND | 248 | 1137 |
| AVALANCHE | 224 | 170 |
## display top ten event types by greatest number of injuries
kable(top.inj, caption = "Event types by total injuries")
| EVTYPE | total.injuries | total.fatalities |
|---|---|---|
| TORNADO | 91346 | 5633 |
| TSTM WIND | 6957 | 504 |
| FLOOD | 6789 | 470 |
| EXCESSIVE HEAT | 6525 | 1903 |
| LIGHTNING | 5230 | 816 |
| HEAT | 2100 | 937 |
| ICE STORM | 1975 | 89 |
| FLASH FLOOD | 1777 | 978 |
| THUNDERSTORM WIND | 1488 | 133 |
| HAIL | 1361 | 15 |
# plot total fatalities and injuries by EVTYPE
melted.fat <- melt(top.fat, id.vars = "EVTYPE")
ggplot(data = melted.fat, aes(x = EVTYPE, y = value, group = variable, color = variable)) +
  geom_line() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = rel(0.9))) +
  scale_y_log10() +
  labs(title = "Top Event Types by Total Fatalities and Injuries", x = "Event Type", y = "People Affected")
Tornadoes are by far the event type most damaging to population health, followed by excessive heat and heat, floods and flash floods, lightning, and thunderstorm wind and TSTM wind.
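The combined figures for related event types are sums of the corresponding table rows. The sketch below reproduces that arithmetic using the totals from the tables above; the grouping labels (Heat, Flood, Thunderstorm wind) are an assumption about how the related EVTYPE labels were paired, and the names health and combined are illustrative.

```r
# Totals copied from the fatalities and injuries tables above
health <- data.frame(
  EVTYPE     = c("EXCESSIVE HEAT", "HEAT", "FLASH FLOOD", "FLOOD",
                 "TSTM WIND", "THUNDERSTORM WIND"),
  fatalities = c(1903, 937, 978, 470, 504, 133),
  injuries   = c(6525, 2100, 1777, 6789, 6957, 1488)
)

# Assumed pairing of related EVTYPE labels into one group each
health$group <- c("Heat", "Heat", "Flood", "Flood",
                  "Thunderstorm wind", "Thunderstorm wind")

# Sum fatalities and injuries within each group; row names are the groups
combined <- rowsum(health[, c("fatalities", "injuries")], group = health$group)
```

For example, combined["Heat", "fatalities"] is 1903 + 937 = 2840.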
The most damaging event types according to total economic cost are shown and graphed below. The total property and crop damage values are also plotted individually.
## display event types by greatest economic effect
kable(top.cost, format.args = list(big.mark = ",", scientific = FALSE), caption = "Event types by total dollar value", align = 'r')
| EVTYPE | total.cost | total.property | total.crop |
|---|---|---|---|
| FLOOD | 150,319,678,257 | 144,657,709,807 | 5,661,968,450 |
| HURRICANE/TYPHOON | 71,913,712,800 | 69,305,840,000 | 2,607,872,800 |
| TORNADO | 57,340,614,060 | 56,925,660,790 | 414,953,270 |
| STORM SURGE | 43,323,541,000 | 43,323,536,000 | 5,000 |
| HAIL | 18,752,904,943 | 15,727,367,053 | 3,025,537,890 |
| FLASH FLOOD | 17,562,129,167 | 16,140,812,067 | 1,421,317,100 |
| DROUGHT | 15,018,672,000 | 1,046,106,000 | 13,972,566,000 |
| HURRICANE | 14,610,229,010 | 11,868,319,010 | 2,741,910,000 |
| RIVER FLOOD | 10,148,404,500 | 5,118,945,500 | 5,029,459,000 |
| ICE STORM | 8,967,041,360 | 3,944,927,860 | 5,022,113,500 |
# plot total economic cost by EVTYPE
melted.cost <- melt(top.cost, id.vars = "EVTYPE")
ggplot(data = melted.cost, aes(x = EVTYPE, y = value, group = variable, color = variable)) +
  geom_line() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = rel(0.9))) +
  labs(title = "Top Event Types by Total Economic Cost", x = "Event Type", y = "Dollar Value")
In dollar terms, floods and flash floods are the most damaging, followed by hurricanes and hurricane/typhoons. Tornadoes, hail, and drought round out the top five event types.
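The combined dollar figures for floods and hurricanes come from adding the related rows of the cost table. A short sketch of that arithmetic, using the total.cost values from the table above, follows; the variable names are illustrative and the pairing of labels is an assumption based on the wording of the summary.

```r
# Total cost in dollars, copied from the economic cost table above
cost <- c("FLOOD"             = 150319678257,
          "FLASH FLOOD"       = 17562129167,
          "HURRICANE/TYPHOON" = 71913712800,
          "HURRICANE"         = 14610229010)

# Add the related event types together
flood_total     <- cost[["FLOOD"]] + cost[["FLASH FLOOD"]]
hurricane_total <- cost[["HURRICANE/TYPHOON"]] + cost[["HURRICANE"]]

# Express in billions of dollars, rounded to one decimal place
billions <- round(c(floods = flood_total, hurricanes = hurricane_total) / 1e9, 1)
print(billions)
```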