Weather events can have severe consequences for the population and economy of a country. This analysis aims to identify such events and their impact. The data comes from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database and covers 1950 to 2011.
The analysis investigates which types of severe weather events are most harmful to population health, measured by injuries and fatalities. The economic consequences are then analyzed by exploring the financial damage done to both general property and agriculture (i.e. crops).
The analysis concludes that tornadoes are the most destructive events with respect to population health by a wide margin. Floods, hurricanes/typhoons and tornadoes do the most property damage in the US, while droughts, floods and ice storms lead to the most crop damage.
Download the storm dataset from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. The documentation for the dataset can be found at:
suppressMessages(library(data.table))
suppressMessages(library(dplyr))
suppressMessages(library(ggplot2))
suppressMessages(library(gridExtra))
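# Download the compressed dataset only if it is not already present
if (!file.exists("./repdata%2Fdata%2FStormData.csv.bz2")) {
  url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
  # mode = "wb" keeps the binary .bz2 file intact on Windows
  download.file(url, "./repdata%2Fdata%2FStormData.csv.bz2", mode = "wb")
}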
Read the downloaded data:
storm <- read.csv("./repdata%2Fdata%2FStormData.csv.bz2", stringsAsFactors = FALSE)
Select only the variables describing health and economic consequences from the dataset:
setDT(storm)
stormdam <- copy(storm[, .(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)])
str(stormdam)
## Classes 'data.table' and 'data.frame': 902297 obs. of 7 variables:
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## - attr(*, ".internal.selfref")=<externalptr>
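As the output shows, property damage and crop damage each have a separate exponent variable (PROPDMGEXP and CROPDMGEXP) that encodes the magnitude of the recorded value, for example "K" for thousands, "M" for millions and "B" for billions. For this analysis, each exponent code is converted to a power of ten and multiplied with the recorded value to obtain the final damage in dollars. Codes that are not recognized are treated as a power of zero, i.e. a multiplier of 1.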
stormdam[, PROPDMGEXP := recode(PROPDMGEXP, "?"=0, "+"=0,"B"=9, "m"=6, "M"=6, "K"=3, "H"=2, "h"=2, "0"=0, "1"=1, "2"=2, "3"=3, "4"=4, "5"=5, "6"=6, "7"=7, "8"=8, .default = 0) ]
stormdam[, PROPDMG := PROPDMG * 10^(PROPDMGEXP)]
stormdam[, CROPDMGEXP := recode(CROPDMGEXP, "?"=0, "B"=9, "m"=6, "M"=6, "K"=3, "k"=3, "0"=0, "2"=2, .default = 0) ]
stormdam[, CROPDMG := CROPDMG * 10^(CROPDMGEXP)]
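The conversion can be checked on a few made-up rows. The sketch below is only an illustration in base R: the data frame `toy`, the lookup vector `exp_lookup` and the values in it are hypothetical and are not taken from the NOAA data. It uses a named-vector lookup instead of `recode`, but applies the same idea of mapping a code to a power of ten and defaulting to zero.

```r
# Toy illustration (hypothetical values, not rows from the NOAA data)
toy <- data.frame(
  DMG    = c(25, 2.5, 1.2, 10),
  DMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Map each exponent code to a power of ten
exp_lookup <- c(K = 3, M = 6, B = 9)
power <- unname(exp_lookup[toy$DMGEXP])

# Empty or unknown codes do not match any name and come back as NA;
# treat them as a power of 0, i.e. a multiplier of 1
power[is.na(power)] <- 0

# Final damage in dollars
toy$DMG_USD <- toy$DMG * 10^power
print(toy)
```

Under these assumptions, a recorded value of 25 with code "K" becomes 25,000 dollars, and a value with an empty code is left unchanged.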
Sum the damages by event type and order them by magnitude.
crop <- stormdam[, .(cropdmg=sum(CROPDMG)), by = EVTYPE][order(-cropdmg)]
prop <- stormdam[, .(propdmg=sum(PROPDMG)), by = EVTYPE][order(-propdmg)]
The fatalities and injuries are likewise summed by event type and ordered by magnitude.
fatal <- stormdam[, .(Fatalities=sum(FATALITIES)), by = EVTYPE][order(-Fatalities)]
injur <- stormdam[, .(Injuries=sum(INJURIES)), by = EVTYPE][order(-Injuries)]
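The group-then-order pattern used for all four summaries can be seen on a small example. This is a minimal sketch with made-up numbers; the table `toy` and the result `toy_injur` are hypothetical names used only for illustration.

```r
library(data.table)

# Toy table (hypothetical counts, not rows from the NOAA data)
toy <- data.table(
  EVTYPE   = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  INJURIES = c(15, 2, 6, 0, 1)
)

# First bracket: sum injuries within each event type.
# Second bracket: sort the summarized table in descending order.
toy_injur <- toy[, .(Injuries = sum(INJURIES)), by = EVTYPE][order(-Injuries)]
print(toy_injur)
```

Chaining the two brackets keeps the aggregation and the ordering in a single expression, so `head()` on the result returns the most harmful event types directly.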
The top 5 events by total fatalities and by total injuries are:
head(fatal,5)
## EVTYPE Fatalities
## 1: TORNADO 5633
## 2: EXCESSIVE HEAT 1903
## 3: FLASH FLOOD 978
## 4: HEAT 937
## 5: LIGHTNING 816
head(injur,5)
## EVTYPE Injuries
## 1: TORNADO 91346
## 2: TSTM WIND 6957
## 3: FLOOD 6789
## 4: EXCESSIVE HEAT 6525
## 5: LIGHTNING 5230
The bar plots below show the top 10 weather events by fatalities and by injuries, respectively.
g1 <- ggplot(head(fatal,10), aes(x=reorder(EVTYPE, Fatalities), y=Fatalities)) + geom_bar(stat = "identity",fill="blue") + coord_flip() + labs(x="Event Type") + ggtitle("Health Impact of Top Weather Events in US")
g2 <- ggplot(head(injur,10), aes(x=reorder(EVTYPE, Injuries), y=Injuries)) + geom_bar(stat = "identity",fill="violetred") + coord_flip() + labs(x="Event Type")
grid.arrange(g1, g2, nrow = 2)
The top 5 events by total property damage and by total crop damage are:
head(prop,5)
## EVTYPE propdmg
## 1: FLOOD 144657709807
## 2: HURRICANE/TYPHOON 69305840000
## 3: TORNADO 56947380677
## 4: STORM SURGE 43323536000
## 5: FLASH FLOOD 16822673979
head(crop,5)
## EVTYPE cropdmg
## 1: DROUGHT 13972566000
## 2: FLOOD 5661968450
## 3: RIVER FLOOD 5029459000
## 4: ICE STORM 5022113500
## 5: HAIL 3025954473
The bar plots below show the top 10 weather events by property damage and by crop damage, respectively.
g3 <- ggplot(head(prop,10), aes(x=reorder(EVTYPE, propdmg), y=propdmg/(1e9))) + geom_bar(stat = "identity",fill="brown4") + coord_flip() + labs(x="Event Type", y="Property Damage (Billion Dollars)") + ggtitle("Economic Impact of Top Weather Events in US")
g4 <- ggplot(head(crop,10), aes(x=reorder(EVTYPE, cropdmg), y=cropdmg/(1e9))) + geom_bar(stat = "identity",fill="darkorchid") + coord_flip() + labs(x="Event Type", y="Crop Damage (Billion Dollars)")
grid.arrange(g3, g4, nrow = 2)
As is evident from the above plots, tornadoes are the single most destructive weather event with respect to population health. They cause the most fatalities and the most injuries. Excessive heat and flash floods follow as the events causing the most fatalities.
Floods cause the most property damage, followed by hurricanes/typhoons and tornadoes. Droughts are responsible for the most crop damage, followed by floods and ice storms.