The National Climatic Data Center (NCDC) receives regular updates on adverse weather events. This “storm data” comes from the National Weather Service (NWS), which has 60 days after the end of each data month to submit it (1). The database covers April 1950 to November 2011 and comprises 902,297 observations on 37 variables. The variables include event type, health outcomes (fatalities and injuries), and property damage. The database documents 48 types of events (2).
This analysis aimed to identify which event types were most harmful to population health and which had the greatest economic consequences. To that end, fatalities, injuries, and property damage were totalled by event type and ranked.
The results demonstrate that tornadoes were the most harmful to population health and accounted for the greatest amount of property damage. These results can therefore help to inform allocations of limited resources, to minimize mortality, morbidity, and economic loss.
The .csv.bz2 file was downloaded with the download.file() function and then read directly into R with read.csv(). The call used header = TRUE to read the column names, sep = “,”, and stringsAsFactors = FALSE so that text columns were kept as character variables rather than converted to factors, which simplifies the subsequent plotting. Because the file is fairly large, this stage was cached to speed up the analysis.
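# Keep the .bz2 extension so the name matches the compressed contents; mode="wb" protects the binary file on Windows
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile="StormData.csv.bz2", mode="wb")
# read.csv() decompresses bz2 files transparently
data <- read.csv("StormData.csv.bz2", header = TRUE, sep = ",", stringsAsFactors=FALSE)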
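Because the download is large, it can help to guard it so that repeated runs reuse the local copy. The following is a minimal sketch of that idea, not the method used in the original analysis: `fetch_once` is a hypothetical helper name, the URL in the demonstration is a placeholder that is never contacted, and the demonstration reads a tiny stand-in bz2 file rather than the real dataset.

```r
# Hypothetical helper: download only when the local copy is missing.
fetch_once <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps the compressed file intact on Windows
    download.file(url, destfile = destfile, mode = "wb")
  }
  destfile
}

# Self-contained demonstration with a tiny stand-in file (not the real data).
demo_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(demo_path, open = "w")
write.csv(data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(2, 0)),
          con, row.names = FALSE)
close(con)

# The file already exists, so no download is attempted here.
local_copy <- fetch_once("https://example.invalid/not-used", demo_path)

# read.csv() decompresses bz2 files transparently.
demo <- read.csv(local_copy, stringsAsFactors = FALSE)
str(demo)
```

With the real dataset, the same helper would be called with the URL and destination file used above, followed by the existing read.csv() call.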
Across the United States, the event types causing the most harm to population health are shown in Figures 1 and 2 (the top ten by fatalities and the top fourteen by injuries). The deadliest event type was tornadoes, with 5,633 fatalities. Other highly lethal event types included excessive heat, flash floods, heat, and lightning (Fig. 1).
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
by_event <- group_by(data, EVTYPE)
fatal <- arrange(summarize(by_event, Fatalities=sum(FATALITIES)), desc(Fatalities))
head(fatal,20)
## # A tibble: 20 x 2
## EVTYPE Fatalities
## <chr> <dbl>
## 1 TORNADO 5633
## 2 EXCESSIVE HEAT 1903
## 3 FLASH FLOOD 978
## 4 HEAT 937
## 5 LIGHTNING 816
## 6 TSTM WIND 504
## 7 FLOOD 470
## 8 RIP CURRENT 368
## 9 HIGH WIND 248
## 10 AVALANCHE 224
## 11 WINTER STORM 206
## 12 RIP CURRENTS 204
## 13 HEAT WAVE 172
## 14 EXTREME COLD 160
## 15 THUNDERSTORM WIND 133
## 16 HEAVY SNOW 127
## 17 EXTREME COLD/WIND CHILL 125
## 18 STRONG WIND 103
## 19 BLIZZARD 101
## 20 HIGH SURF 101
fatal10 <- fatal[1:10,]
par(mar=c(6,6,4,1))
# oma is set through par(), not barplot(), so it is dropped from this call
barplot(fatal10$Fatalities, names.arg=fatal10$EVTYPE, col="blue", main="Fig. 1: Fatalities by Event", ylab="Fatality", ylim=c(0, 6000), xlab="Event", las=2, cex.axis=0.8, cex.names=0.6, mgp=c(5,1,0))
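The fatality table above lists several near-duplicate labels as separate event types, for example RIP CURRENT and RIP CURRENTS, or TSTM WIND and THUNDERSTORM WIND. One possible refinement, sketched here in base R under stated assumptions, is to normalize such labels before totalling. The function `normalize_evtype` and its two recoding rules are hypothetical and cover only these two pairs; a full analysis would need a more complete mapping. The counts are copied from the table above.

```r
# Rows copied from the fatality table above.
fatal_demo <- data.frame(
  EVTYPE = c("RIP CURRENT", "RIP CURRENTS", "TSTM WIND", "THUNDERSTORM WIND"),
  Fatalities = c(368, 204, 504, 133),
  stringsAsFactors = FALSE
)

# Hypothetical recoding rules; a real analysis would need a fuller mapping.
normalize_evtype <- function(x) {
  x <- toupper(trimws(x))
  x <- sub("^TSTM WIND$", "THUNDERSTORM WIND", x)
  x <- sub("^RIP CURRENTS$", "RIP CURRENT", x)
  x
}

fatal_demo$EVTYPE <- normalize_evtype(fatal_demo$EVTYPE)

# Re-total fatalities after merging the labels, largest first.
merged <- aggregate(Fatalities ~ EVTYPE, data = fatal_demo, FUN = sum)
merged <- merged[order(-merged$Fatalities), ]
print(merged)
```

For these four rows, merging gives 637 fatalities for thunderstorm wind (504 + 133) and 572 for rip currents (368 + 204), so label cleaning can shift the order of the lower-ranked event types.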
Tornadoes also caused the most injuries, with more than 91,000 cases. Other highly injurious event types included TSTM wind, flood, excessive heat, and lightning (Fig. 2).
injury <- arrange(summarize(by_event, Injuries=sum(INJURIES)), desc(Injuries))
head(injury, 20)
## # A tibble: 20 x 2
## EVTYPE Injuries
## <chr> <dbl>
## 1 TORNADO 91346
## 2 TSTM WIND 6957
## 3 FLOOD 6789
## 4 EXCESSIVE HEAT 6525
## 5 LIGHTNING 5230
## 6 HEAT 2100
## 7 ICE STORM 1975
## 8 FLASH FLOOD 1777
## 9 THUNDERSTORM WIND 1488
## 10 HAIL 1361
## 11 WINTER STORM 1321
## 12 HURRICANE/TYPHOON 1275
## 13 HIGH WIND 1137
## 14 HEAVY SNOW 1021
## 15 WILDFIRE 911
## 16 THUNDERSTORM WINDS 908
## 17 BLIZZARD 805
## 18 FOG 734
## 19 WILD/FOREST FIRE 545
## 20 DUST STORM 440
injury14 <- injury[1:14,]
par(mar=c(6,6,4,1))
options(scipen=999)
# oma is set through par(), not barplot(), so it is dropped from this call
barplot(injury14$Injuries, names.arg=injury14$EVTYPE, col="blue", main="Fig. 2: Injuries by Event", ylab="Injuries", ylim=c(0, 100000), xlab="Event", las=2, cex.axis=0.8, cex.names=0.6, mgp=c(5,1,0))
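The fatality and injury rankings follow the same steps: total a column by event type, sort in descending order, and keep the first few rows. The steps could be factored into a small helper so that the number of rows kept (ten for fatalities, fourteen for injuries) is an explicit argument. The sketch below uses base R rather than dplyr to stay dependency-free; `top_events` is a hypothetical name, and the data frame holds made-up rows for illustration only.

```r
# Hypothetical helper: total a numeric column by event type, keep the top n.
top_events <- function(df, column, n = 10) {
  totals <- vapply(split(df[[column]], df$EVTYPE), sum, numeric(1), na.rm = TRUE)
  totals <- sort(totals, decreasing = TRUE)
  result <- data.frame(EVTYPE = names(totals), Total = as.numeric(totals),
                       stringsAsFactors = FALSE)
  head(result, n)
}

# Made-up rows for illustration only (not values from the storm database).
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(3, 1, 2, 0, 1),
  INJURIES = c(10, 4, 7, 1, 0),
  stringsAsFactors = FALSE
)

top_fatal <- top_events(toy, "FATALITIES", n = 2)
top_injury <- top_events(toy, "INJURIES", n = 2)
print(top_fatal)
print(top_injury)
```

Applied to the storm data, a call such as top_events(data, "INJURIES", n = 14) would return the rows plotted in Fig. 2, with the total in a column named Total rather than Injuries.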
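Across the United States, the ten event types with the greatest economic consequences are shown in Figure 3. Tornadoes were the most costly, with a summed PROPDMG value of 3,212,258. This total adds the raw PROPDMG figures without applying the magnitude codes in the PROPDMGEXP column, so it should be read as an unscaled total rather than a dollar amount. Other costly event types included flash floods, TSTM wind, flood, and thunderstorm wind (Fig. 3).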
damage <- arrange(summarize(by_event, Damage=sum(PROPDMG)), desc(Damage))
head(damage, 20)
## # A tibble: 20 x 2
## EVTYPE Damage
## <chr> <dbl>
## 1 TORNADO 3212258.
## 2 FLASH FLOOD 1420125.
## 3 TSTM WIND 1335966.
## 4 FLOOD 899938.
## 5 THUNDERSTORM WIND 876844.
## 6 HAIL 688693.
## 7 LIGHTNING 603352.
## 8 THUNDERSTORM WINDS 446293.
## 9 HIGH WIND 324732.
## 10 WINTER STORM 132721.
## 11 HEAVY SNOW 122252.
## 12 WILDFIRE 84459.
## 13 ICE STORM 66001.
## 14 STRONG WIND 62994.
## 15 HIGH WINDS 55625
## 16 HEAVY RAIN 50842.
## 17 TROPICAL STORM 48424.
## 18 WILD/FOREST FIRE 39345.
## 19 FLASH FLOODING 28497.
## 20 URBAN/SML STREAM FLD 26052.
damage10 <- damage[1:10,]
par(mar=c(6,6,4,1))
options(scipen=999)
# oma is set through par(), not barplot(), so it is dropped from this call
barplot(damage10$Damage, names.arg=damage10$EVTYPE, col="blue", main="Fig. 3: Property Damage by Event", ylab="Property Damage", ylim=c(0, 3500000), xlab="Event", las=2, cex.axis=0.8, cex.names=0.6, mgp=c(5,1,0))
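In the storm database, PROPDMG is stored alongside a magnitude code in PROPDMGEXP, where K stands for thousands, M for millions, and B for billions of dollars. Summing PROPDMG alone therefore mixes different units. A minimal sketch of scaling the values before totalling is given below; `scale_damage` is a hypothetical helper, the choice to treat any other code (including a blank) as a multiplier of 1 is an assumption of this sketch, and the rows are made up for illustration.

```r
# K, M, and B follow the NWS documentation. Treating every other code
# as a multiplier of 1 is an assumption of this sketch.
scale_damage <- function(value, exponent) {
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
  m <- multipliers[toupper(exponent)]
  m[is.na(m)] <- 1
  value * unname(m)
}

# Made-up rows for illustration only (not values from the storm database).
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG = c(25, 1.5, 2, 500),
  PROPDMGEXP = c("K", "B", "M", ""),
  stringsAsFactors = FALSE
)

# Convert to dollars, then total by event type.
toy$PropDollars <- scale_damage(toy$PROPDMG, toy$PROPDMGEXP)
dollars <- aggregate(PropDollars ~ EVTYPE, data = toy, FUN = sum)
print(dollars[order(-dollars$PropDollars), ])
```

In this toy data HAIL has the largest raw PROPDMG value, yet FLOOD has by far the largest dollar total once the codes are applied, which illustrates why the ranking can change after scaling.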
This analysis shows that tornadoes were the most harmful event type in terms of both population health and property damage. Evidence-based allocation of limited resources, targeting the most harmful event types, may minimize mortality, morbidity, and economic loss.
(1) National Climatic Data Center Storm Events FAQ: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf
(2) National Weather Service Storm Data Documentation: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf