Synopsis

The National Climatic Data Center (NCDC) receives regular updates on adverse weather events. This “storm data” is supplied by the National Weather Service (NWS), which has 60 days after the end of each data month to submit it (1). The database used here runs from April 1950 to November 2011 and comprises 902,297 observations on 37 variables. These variables include event type, health outcomes (fatalities and injuries), and property damage. The accompanying documentation defines 48 event types (2).

This analysis aimed to identify which event types were most harmful to population health and which had the greatest economic consequences. To that end, total fatalities, injuries, and property damage were compared across event types.

The results demonstrate that tornadoes were the most harmful to population health and accounted for the greatest amount of property damage. These results can therefore help to inform allocations of limited resources, to minimize mortality, morbidity, and economic loss.

Data Processing

The .csv.bz2 file was downloaded from the source URL with download.file() and then read directly into R with read.csv(). The call used header = TRUE to take column names from the first row, sep = "," as the field separator, and stringsAsFactors = FALSE so that text columns stay as character vectors instead of being converted to factors, which simplifies subsequent plotting. Because the file is fairly large, this step was cached to speed up the analysis.

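# Download the bzip2-compressed CSV; mode = "wb" keeps the binary file intact on Windows
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "StormData.csv.bz2", mode = "wb")
# read.csv() reads the compressed file directly, with no separate decompression step
data <- read.csv("StormData.csv.bz2", header = TRUE, sep = ",", stringsAsFactors = FALSE)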
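
The download step can be made repeatable by skipping the transfer when a local copy already exists. The following is a minimal sketch of that idea, not necessarily how the original report cached this stage. The helper name `fetch_once`, its `downloader` argument, and the placeholder URL are hypothetical and exist only for illustration. The demonstration writes a tiny bzip2-compressed CSV locally so that it runs without network access.

```r
# Hypothetical helper: download `url` to `destfile` only when no local copy exists.
# `downloader` is injectable so the logic can be exercised without a network.
fetch_once <- function(url, destfile, downloader = utils::download.file) {
  if (!file.exists(destfile)) {
    downloader(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Write a small bzip2-compressed CSV to stand in for the real file (toy data)
demo_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(demo_path, open = "w")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,1", "FLOOD,2"), con)
close(con)

# The file already exists, so the downloader is never called
fetch_once("https://example.invalid/unused", demo_path,
           downloader = function(...) stop("should not download"))

# read.csv() reads the compressed file directly
demo <- read.csv(demo_path, header = TRUE, stringsAsFactors = FALSE)
print(demo)
```

With the real data, the same helper would be called with the storm data URL and a destination such as "StormData.csv.bz2" before the read.csv() step.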

Results

Across the United States, the ten event types that caused the most harm to population health are shown in Figures 1 and 2. Tornadoes caused the most fatalities, with 5,633 deaths. Other highly lethal events included excessive heat, flash floods, heat, and lightning (Fig. 1).

library(dplyr)
by_event <- group_by(data, EVTYPE)
fatal <- arrange(summarize(by_event, Fatalities=sum(FATALITIES)), desc(Fatalities))
head(fatal,20)
## # A tibble: 20 x 2
##    EVTYPE                  Fatalities
##    <chr>                        <dbl>
##  1 TORNADO                       5633
##  2 EXCESSIVE HEAT                1903
##  3 FLASH FLOOD                    978
##  4 HEAT                           937
##  5 LIGHTNING                      816
##  6 TSTM WIND                      504
##  7 FLOOD                          470
##  8 RIP CURRENT                    368
##  9 HIGH WIND                      248
## 10 AVALANCHE                      224
## 11 WINTER STORM                   206
## 12 RIP CURRENTS                   204
## 13 HEAT WAVE                      172
## 14 EXTREME COLD                   160
## 15 THUNDERSTORM WIND              133
## 16 HEAVY SNOW                     127
## 17 EXTREME COLD/WIND CHILL        125
## 18 STRONG WIND                    103
## 19 BLIZZARD                       101
## 20 HIGH SURF                      101
fatal10 <- fatal[1:10,]
par(mar=c(6,6,4,1))
barplot(fatal10$Fatalities, names.arg=fatal10$EVTYPE, oma=c(4,2,2,1), col="blue", main="Fig. 1: Fatalities by Event", ylab="Fatality", ylim=c(0, 6000), xlab="Event", las=2, cex.axis=0.8, cex.names=0.6, mgp=c(5,1,0))
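
The fatality table above lists several labels that appear to describe the same phenomenon, for example RIP CURRENT and RIP CURRENTS, or TSTM WIND and THUNDERSTORM WIND. A minimal sketch of merging such labels before summing is given below. The helper `normalize_evtype` and its recoding rules are illustrative assumptions, not part of the original analysis. The toy rows reuse counts from the table above, and base R is used so the block runs without extra packages.

```r
# Toy rows copied from the fatality table above (not the full data set)
toy <- data.frame(
  EVTYPE = c("TSTM WIND", "THUNDERSTORM WIND", "RIP CURRENT",
             "RIP CURRENTS", "TORNADO"),
  FATALITIES = c(504, 133, 368, 204, 5633),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the recoding rules below are illustrative assumptions
normalize_evtype <- function(x) {
  x <- toupper(trimws(x))
  x <- sub("^TSTM", "THUNDERSTORM", x)                      # expand abbreviation
  x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)  # singularize
  x <- sub("^RIP CURRENTS$", "RIP CURRENT", x)              # singularize
  x
}

toy$EVTYPE_CLEAN <- normalize_evtype(toy$EVTYPE)

# Sum fatalities per cleaned label and sort in descending order
merged <- aggregate(FATALITIES ~ EVTYPE_CLEAN, data = toy, FUN = sum)
merged <- merged[order(-merged$FATALITIES), ]
print(merged)
```

In this toy subset the two rip current rows combine to 572 fatalities and the two thunderstorm wind rows to 637, so merging labels can change the totals and possibly the ranking.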

Tornadoes also caused the most injuries, with more than 91,000 cases. Other events that caused many injuries included TSTM wind, floods, excessive heat, and lightning (Fig. 2).

injury <- arrange(summarize(by_event, Injuries=sum(INJURIES)), desc(Injuries))
head(injury, 20)
## # A tibble: 20 x 2
##    EVTYPE             Injuries
##    <chr>                 <dbl>
##  1 TORNADO               91346
##  2 TSTM WIND              6957
##  3 FLOOD                  6789
##  4 EXCESSIVE HEAT         6525
##  5 LIGHTNING              5230
##  6 HEAT                   2100
##  7 ICE STORM              1975
##  8 FLASH FLOOD            1777
##  9 THUNDERSTORM WIND      1488
## 10 HAIL                   1361
## 11 WINTER STORM           1321
## 12 HURRICANE/TYPHOON      1275
## 13 HIGH WIND              1137
## 14 HEAVY SNOW             1021
## 15 WILDFIRE                911
## 16 THUNDERSTORM WINDS      908
## 17 BLIZZARD                805
## 18 FOG                     734
## 19 WILD/FOREST FIRE        545
## 20 DUST STORM              440
injury10 <- injury[1:10,]
par(mar=c(6,6,4,1))
options(scipen=999)
barplot(injury10$Injuries, names.arg=injury10$EVTYPE, oma=c(4,2,2,1), col="blue", main="Fig. 2: Injuries by Event", ylab="Injuries", ylim=c(0, 100000), xlab="Event", las=2, cex.axis=0.8, cex.names=0.6, mgp=c(5,1,0))

Across the United States, the ten event types with the greatest economic consequences are shown in Figure 3. Tornadoes had the largest total, with a summed PROPDMG value of 3,212,258. This is a sum of the raw PROPDMG column, not a dollar amount, because the magnitude codes recorded in PROPDMGEXP were not applied. Other costly events included flash floods, TSTM wind, floods, and thunderstorm wind (Fig. 3).

damage <- arrange(summarize(by_event, Damage=sum(PROPDMG)), desc(Damage))
head(damage, 20)
## # A tibble: 20 x 2
##    EVTYPE                 Damage
##    <chr>                   <dbl>
##  1 TORNADO              3212258.
##  2 FLASH FLOOD          1420125.
##  3 TSTM WIND            1335966.
##  4 FLOOD                 899938.
##  5 THUNDERSTORM WIND     876844.
##  6 HAIL                  688693.
##  7 LIGHTNING             603352.
##  8 THUNDERSTORM WINDS    446293.
##  9 HIGH WIND             324732.
## 10 WINTER STORM          132721.
## 11 HEAVY SNOW            122252.
## 12 WILDFIRE               84459.
## 13 ICE STORM              66001.
## 14 STRONG WIND            62994.
## 15 HIGH WINDS             55625 
## 16 HEAVY RAIN             50842.
## 17 TROPICAL STORM         48424.
## 18 WILD/FOREST FIRE       39345.
## 19 FLASH FLOODING         28497.
## 20 URBAN/SML STREAM FLD   26052.
damage10 <- damage[1:10,]
par(mar=c(6,6,4,1))
options(scipen=999)
barplot(damage10$Damage, names.arg=damage10$EVTYPE, oma=c(4,2,2,1), col="blue", main="Fig. 3: Property Damage by Event", ylab="Property Damage", ylim=c(0, 3500000), xlab="Event", las=2, cex.axis=0.8, cex.names=0.6, mgp=c(5,1,0))
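
The property damage totals above sum the PROPDMG column as recorded. The storm data is understood to carry a companion column, PROPDMGEXP, that holds a magnitude code for each PROPDMG value (K for thousands, M for millions, B for billions). The sketch below shows one way such codes could be applied before summing. It uses made-up toy rows, and treating any other code as a multiplier of 1 is an assumption made for illustration only.

```r
# Toy rows (made-up values, not taken from the storm data)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 2.5, 1, 5),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Multipliers for the magnitude codes
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up the multiplier for each row; unrecognized codes yield NA
scale <- unname(multiplier[toupper(toy$PROPDMGEXP)])
scale[is.na(scale)] <- 1  # assumption: treat blank or unknown codes as 1

# Scale each row to dollars, then total by event type
toy$DAMAGE_USD <- toy$PROPDMG * scale
totals <- aggregate(DAMAGE_USD ~ EVTYPE, data = toy, FUN = sum)
totals <- totals[order(-totals$DAMAGE_USD), ]
print(totals)
```

Because a single row coded B outweighs many rows coded K, applying the codes can reorder event types relative to the unscaled sums.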

Conclusion

This analysis shows that tornadoes were the most harmful event type in terms of both population health and property damage. Evidence-based allocation of limited resources, targeting the most harmful event types, may minimize mortality, morbidity, and economic loss.

References

  1. National Climatic Data Center Storm Events FAQ: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf

  2. National Weather Service Storm Data Documentation: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf
