Synopsis:

This report analyses data on storms and other significant weather phenomena intense enough to cause loss of life, injuries, property damage and/or other economic losses. The data were downloaded from “Storm Data”, an official publication of the National Oceanic and Atmospheric Administration (NOAA), and cover the recorded impact of natural disasters from 1950 to November 2011.

Two of the data elements, the fields holding the magnitude codes for property damage (PROPDMGEXP) and crop damage (CROPDMGEXP), had to be recoded to numerical values before computation. The values H, K, M and B and their lowercase equivalents were recoded to 10^2, 10^3, 10^6 and 10^9 respectively before the total damage was computed. The resulting damage expenses were then rolled up by event type to obtain the total damages. Similarly, the numbers of fatalities and injuries were rolled up by event type. The code used for the analysis is given below along with the plots and the final results.
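
The recoding described above can be illustrated on a few made-up values. This is only a sketch: the lookup vector `multipliers` and the toy vectors `dmg` and `code` are hypothetical names introduced here, and treating unrecognised codes as zero damage is an assumption of the sketch.

```r
# Hypothetical lookup table: magnitude code -> multiplier
multipliers <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Toy values standing in for PROPDMG and PROPDMGEXP (not real Storm Data records)
dmg  <- c(2.5, 10, 1.2, 3, 7)
code <- c("K", "m", "B", "h", "?")

# toupper() folds the lowercase variants; codes missing from the table index to NA
mult <- unname(multipliers[toupper(code)])
mult[is.na(mult)] <- 0   # assumption: unrecognised codes contribute no damage

total <- dmg * mult
print(total)
```

A named-vector lookup keeps the code-to-multiplier mapping in one place, which makes it easy to audit against the list of codes given in the text.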

Data Processing

The code that reads the data into R and recodes the required fields is shown below.

#The cache = TRUE option is not used because the machine this ran on had plenty of RAM
storm <- read.csv("repdata-data-StormData.csv")

#Use dplyr for efficiency
library(dplyr)
strm_df <- as_tibble(storm)  #as_tibble() replaces the deprecated tbl_df()
rm(storm)
   
#The PROPDMGEXP and CROPDMGEXP are recoded and the total damage expenses calculated
#as.character() guards against factor columns, where as.numeric() would return level codes
#Blank codes are treated like the other unusable symbols so that no NA reaches the sums
dmg_total <- function(dmg, expo) {
  expo <- as.character(expo)
  ifelse(expo %in% c('-', '?', '+', ' ', ''), 0,
  ifelse(expo %in% c('H', 'h'), dmg*100,
  ifelse(expo %in% c('K', 'k'), dmg*1000,
  ifelse(expo %in% c('M', 'm'), dmg*1000000,
  ifelse(expo %in% c('B', 'b'), dmg*1000000000,
         suppressWarnings(as.numeric(expo))*dmg)))))
}
strm_df_mutate <- strm_df %>% mutate(PROPDMGTOT = dmg_total(PROPDMG, PROPDMGEXP), CROPDMGTOT = dmg_total(CROPDMG, CROPDMGEXP))

#Aggregate the data by event
strm_by_evnt <- group_by(strm_df_mutate, EVTYPE)
hzrds_by_evnt <- summarise(strm_by_evnt, sum_ftl = sum(FATALITIES), sum_inj = sum(INJURIES))
hzrds_by_evnt <- arrange(hzrds_by_evnt, desc(sum_ftl), desc(sum_inj))

#Get the top 5 hazardous events based on fatalities
top_hlth_hzrd_evnts <- hzrds_by_evnt[1:5,]
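
The roll-up and ranking steps above can be checked on a small made-up table. This is a minimal sketch, assuming dplyr is installed; the data frame `toy` and its values are invented for illustration and are not taken from Storm Data.

```r
library(dplyr)

# Toy records (made-up values, not real Storm Data rows)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 2, 5, 1, 0),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries per event type, then rank by fatalities,
# using injuries only to break ties
toy_ranked <- toy %>%
  group_by(EVTYPE) %>%
  summarise(sum_ftl = sum(FATALITIES), sum_inj = sum(INJURIES)) %>%
  arrange(desc(sum_ftl), desc(sum_inj))

# Keep the two highest-ranked event types
top2 <- head(toy_ranked, 2)
print(top2)
```

Because the sort key is fatalities first, an event with many injuries but few fatalities can still fall outside the top rows.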

The code below produces the chart of fatality counts for the top 5 events by fatalities from 1950 to Nov 2011

barplot(top_hlth_hzrd_evnts$sum_ftl, names.arg = top_hlth_hzrd_evnts$EVTYPE, main = "Plot 1: Top Fatality Events", xlab = "Events", ylab = "Total Fatalities", col = "brown", cex.names = 0.75)

The chart below shows the injury counts for the same top 5 fatality events from 1950 to Nov 2011

barplot(top_hlth_hzrd_evnts$sum_inj, names.arg = top_hlth_hzrd_evnts$EVTYPE, main = "Plot 2: Injuries for Top Fatality Events", xlab = "Events", ylab = "Total Injuries", col = "orange", cex.names = 0.75)

The code below produces the chart showing the total expenses incurred, in millions of dollars, for the top 5 economically consequential events from 1950 to Nov 2011

exp_by_evnt <- summarise(strm_by_evnt, sum_exp = sum(PROPDMGTOT) + sum(CROPDMGTOT))
exp_by_evnt <- arrange(exp_by_evnt, desc(sum_exp))
top_exp_evnts <- exp_by_evnt[1:5,]
barplot(top_exp_evnts$sum_exp/1000000, names.arg = top_exp_evnts$EVTYPE, main = "Plot 3: Top Economic Consequence Events", xlab = "Events", ylab = "Total Cost ($ millions)", col = "green", cex.names = 0.7)
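
One thing worth checking before trusting a cost ranking of this kind is whether any per-record total is missing, because sum() returns NA as soon as one input is NA. The sketch below uses made-up numbers; `prop_tot` and `crop_tot` are hypothetical stand-ins for the PROPDMGTOT and CROPDMGTOT values of a single event type.

```r
# Toy damage totals with one missing value (made-up numbers)
prop_tot <- c(2500, NA, 1.2e9)
crop_tot <- c(0, 300, 5e6)

# A single NA makes the plain sum NA
naive_total <- sum(prop_tot) + sum(crop_tot)

# na.rm = TRUE drops missing values before summing
total_cost <- sum(prop_tot, na.rm = TRUE) + sum(crop_tot, na.rm = TRUE)

# Scale to millions, matching the scaling used for the plot
total_cost_millions <- total_cost / 1e6
print(total_cost_millions)
```

Whether dropping missing values is appropriate depends on why they are missing, so counting them first with sum(is.na(prop_tot)) is a sensible companion check.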

Results

Tornado is clearly the natural event that causes the most harm to population health, in terms of both fatalities and injuries. Excessive Heat comes a distant second, with fatalities and injuries less than half of those caused by Tornado. Plots 1 and 2 illustrate this.

In terms of economic consequences, Flood is the biggest culprit, followed by Hurricane/Typhoon and then by Tornado. This takes into consideration both property damage and crop damage. Plot 3 shows the total cost of the top 5 natural events, ranked by the damage expenses incurred from 1950 to November 2011.