1. Synopsis

Severe weather events can have serious consequences for the population and economy of a country. This analysis aims to identify such events and their impact. The data comes from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database and covers the years 1950 to 2011.

The analysis investigates which types of severe weather events are most harmful to population health, measured by injuries and fatalities. It also examines the economic consequences by exploring the financial damage done to property and to agriculture (i.e. crops).

The analysis finds that tornadoes are by far the most harmful event type with respect to population health. Floods, hurricanes/typhoons and tornadoes cause the most property damage in the US, while droughts, floods and ice storms cause the most crop damage.

2. Data Processing

Load the required packages and download the storm dataset, which comes from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, if it is not already present. The documentation for the dataset can be found at :

suppressMessages(library(data.table))
suppressMessages(library(dplyr))
suppressMessages(library(ggplot2))
suppressMessages(library(gridExtra))
                 
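# Download the compressed dataset only if it is not already present (explicit binary mode)
if (!file.exists("./repdata%2Fdata%2FStormData.csv.bz2")) {
    url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
    download.file(url, "./repdata%2Fdata%2FStormData.csv.bz2", mode = "wb")
}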

Read the downloaded data:

storm <- read.csv("./repdata%2Fdata%2FStormData.csv.bz2", stringsAsFactors = FALSE)

Select only the variables related to health and economic consequences:

setDT(storm)
stormdam <- copy(storm[, .(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)])
str(stormdam)
## Classes 'data.table' and 'data.frame':   902297 obs. of  7 variables:
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  - attr(*, ".internal.selfref")=<externalptr>

2.1 Economic Impact

As the output above shows, property damage and crop damage each have a separate exponent variable (PROPDMGEXP and CROPDMGEXP). For this analysis, each exponent code is converted to a power of ten and multiplied by the recorded amount to obtain the final damage value in dollars.

stormdam[, PROPDMGEXP := recode(PROPDMGEXP, "?"=0, "+"=0,"B"=9, "m"=6, "M"=6, "K"=3, "H"=2, "h"=2, "0"=0, "1"=1, "2"=2, "3"=3, "4"=4, "5"=5, "6"=6, "7"=7, "8"=8, .default = 0) ]

stormdam[, PROPDMG := PROPDMG * 10^(PROPDMGEXP)]

stormdam[, CROPDMGEXP := recode(CROPDMGEXP, "?"=0, "B"=9, "m"=6, "M"=6, "K"=3, "k"=3, "0"=0, "2"=2, .default = 0) ]

stormdam[, CROPDMG := CROPDMG * 10^(CROPDMGEXP)]
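
As a quick sanity check of the conversion, the sketch below applies the same idea to a few toy values in base R. The vectors `propdmg` and `propdmgexp` and the lookup `exp_map` are hypothetical names with made-up values, and the lookup covers only the letter codes, so it is an illustration of the arithmetic rather than the exact recoding used above.

```r
# Toy values (hypothetical, not taken from the NOAA data)
propdmg    <- c(25, 2.5, 1.2, 10)
propdmgexp <- c("K", "M", "B", "")

# Letter code -> power of ten (digit codes are left out of this sketch)
exp_map <- c(H = 2, h = 2, K = 3, k = 3, M = 6, m = 6, B = 9)

# Look up each code; codes that are not in the lookup come back as NA
power <- unname(exp_map[propdmgexp])

# Treat unknown or empty codes as 10^0, i.e. leave the amount unchanged
power[is.na(power)] <- 0

# Final damage in dollars
damage <- propdmg * 10^power
print(damage)
```

With these toy inputs, 25 with code "K" becomes 25,000 dollars and the entry with an empty code stays at 10.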

Next, sum the damages by event type and order the totals from largest to smallest.

crop <- stormdam[, .(cropdmg=sum(CROPDMG)), by = EVTYPE][order(-cropdmg)]

prop <- stormdam[, .(propdmg=sum(PROPDMG)), by = EVTYPE][order(-propdmg)]
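
The chained `[...][...]` form used here can be easier to follow on a small table. The sketch below is a minimal illustration with a hypothetical `toy` table and made-up values: the first bracket sums within each event type, and the second sorts the result in descending order.

```r
library(data.table)

# Toy table (hypothetical values, not taken from the NOAA data)
toy <- data.table(
  EVTYPE  = c("FLOOD", "HAIL", "FLOOD", "DROUGHT", "HAIL"),
  CROPDMG = c(100, 20, 50, 500, 5)
)

# First bracket: one row per EVTYPE with the summed damage
# Second bracket: order the rows by that sum, largest first
toy_crop <- toy[, .(cropdmg = sum(CROPDMG)), by = EVTYPE][order(-cropdmg)]
print(toy_crop)
```

Note that grouping happens on the raw EVTYPE labels, so two differently spelled labels for the same kind of event would be summed separately.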

2.2 Population Impact

Fatalities and injuries are summed by event type and ordered from largest to smallest.

fatal <- stormdam[, .(Fatalities=sum(FATALITIES)), by = EVTYPE][order(-Fatalities)]

injur <- stormdam[, .(Injuries=sum(INJURIES)), by = EVTYPE][order(-Injuries)]

3. Results

3.1 Fatalities & Injuries

The five event types causing the most fatalities and injuries are:

head(fatal,5)
##            EVTYPE Fatalities
## 1:        TORNADO       5633
## 2: EXCESSIVE HEAT       1903
## 3:    FLASH FLOOD        978
## 4:           HEAT        937
## 5:      LIGHTNING        816
head(injur,5)
##            EVTYPE Injuries
## 1:        TORNADO    91346
## 2:      TSTM WIND     6957
## 3:          FLOOD     6789
## 4: EXCESSIVE HEAT     6525
## 5:      LIGHTNING     5230
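
To put the size of the gap in numbers, the sketch below divides the tornado totals by the second-ranked totals. The figures are copied from the two tables above; the variable names are only illustrative. The comparison is between raw EVTYPE labels, which this analysis does not consolidate (for example, HEAT and EXCESSIVE HEAT are counted separately).

```r
# Figures copied from the head(fatal, 5) and head(injur, 5) output
fatalities <- c("TORNADO" = 5633, "EXCESSIVE HEAT" = 1903)
injuries   <- c("TORNADO" = 91346, "TSTM WIND" = 6957)

# How many times larger the tornado total is than the runner-up
fatal_ratio <- round(fatalities[["TORNADO"]] / fatalities[["EXCESSIVE HEAT"]], 2)
injur_ratio <- round(injuries[["TORNADO"]] / injuries[["TSTM WIND"]], 2)

print(fatal_ratio)  # 2.96
print(injur_ratio)  # 13.13
```

Tornadoes account for roughly 3 times the fatalities and 13 times the injuries of the next-ranked event type.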

The bar plots below show the ten event types with the most fatalities and the most injuries, respectively.

g1 <- ggplot(head(fatal,10), aes(x=reorder(EVTYPE, Fatalities), y=Fatalities)) + geom_bar(stat = "identity",fill="blue") + coord_flip() + labs(x="Event Type") + ggtitle("Health Impact of Top Weather Events in US")

g2 <- ggplot(head(injur,10), aes(x=reorder(EVTYPE, Injuries), y=Injuries)) + geom_bar(stat = "identity",fill="violetred") + coord_flip() + labs(x="Event Type")

grid.arrange(g1, g2, nrow = 2)

3.2 Economic Impact

The five event types causing the most property damage and crop damage are:

head(prop,5)
##               EVTYPE      propdmg
## 1:             FLOOD 144657709807
## 2: HURRICANE/TYPHOON  69305840000
## 3:           TORNADO  56947380677
## 4:       STORM SURGE  43323536000
## 5:       FLASH FLOOD  16822673979
head(crop,5)
##         EVTYPE     cropdmg
## 1:     DROUGHT 13972566000
## 2:       FLOOD  5661968450
## 3: RIVER FLOOD  5029459000
## 4:   ICE STORM  5022113500
## 5:        HAIL  3025954473
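
The raw dollar totals are hard to read at this scale. As a small sketch, the figures from the two tables above can be expressed in billions of dollars by dividing by 1e9, which is the same scaling the plots in this section apply; the vector names here are only illustrative.

```r
# Figures copied from the head(prop, 5) and head(crop, 5) output
propdmg <- c("FLOOD" = 144657709807, "HURRICANE/TYPHOON" = 69305840000,
             "TORNADO" = 56947380677, "STORM SURGE" = 43323536000,
             "FLASH FLOOD" = 16822673979)
cropdmg <- c("DROUGHT" = 13972566000, "FLOOD" = 5661968450,
             "RIVER FLOOD" = 5029459000, "ICE STORM" = 5022113500,
             "HAIL" = 3025954473)

# Convert dollars to billions of dollars, rounded to two decimals
prop_bn <- round(propdmg / 1e9, 2)
crop_bn <- round(cropdmg / 1e9, 2)

print(prop_bn)
print(crop_bn)
```

In these units, floods account for about 144.66 billion dollars of property damage and droughts for about 13.97 billion dollars of crop damage.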

The bar plots below show the ten event types with the most property damage and the most crop damage, respectively.

g3 <- ggplot(head(prop,10), aes(x=reorder(EVTYPE, propdmg), y=propdmg/(1e9))) + geom_bar(stat = "identity",fill="brown4") + coord_flip() + labs(x="Event Type", y="Property Damage (Billion Dollars)") + ggtitle("Economic Impact of Top Weather Events in US")

g4 <- ggplot(head(crop,10), aes(x=reorder(EVTYPE, cropdmg), y=cropdmg/(1e9))) + geom_bar(stat = "identity",fill="darkorchid") + coord_flip() + labs(x="Event Type", y="Crop Damage (Billion Dollars)")

grid.arrange(g3, g4, nrow = 2)

4. Conclusion

As the plots above show, tornadoes are the single most harmful weather event with respect to population health, causing the most fatalities and the most injuries. Excessive heat and flash floods follow as the next largest causes of fatalities.

Floods cause the most property damage, followed by hurricanes/typhoons and tornadoes. Droughts are responsible for the most crop damage, followed by floods and ice storms.