Synopsis

In this report we analyze the storm event database from the U.S. National Oceanic and Atmospheric Administration (NOAA) to determine the impact of storms on human health and the economy between 1950 and 2011. The analysis finds that tornadoes are the most dangerous event type in terms of fatalities and injuries, followed by excessive heat and thunderstorm winds. In terms of economic impact, flash floods had the largest effect, followed by thunderstorm winds and tornadoes.

Data Processing

Data Loading: Load the data obtained from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database
## Load the required libraries
library(plyr)
library(ggplot2)
## Load the data
stormData = read.csv("./repdata-data-StormData.csv.bz2")
After loading the data, check its dimensions and display the first few rows. There are 902297 rows and 37 columns.
dim(stormData)
## [1] 902297     37
head(stormData[,1:5])
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY
## 1       1  4/18/1950 0:00:00     0130       CST     97
## 2       1  4/18/1950 0:00:00     0145       CST      3
## 3       1  2/20/1951 0:00:00     1600       CST     57
## 4       1   6/8/1951 0:00:00     0900       CST     89
## 5       1 11/15/1951 0:00:00     1500       CST     43
## 6       1 11/15/1951 0:00:00     2000       CST     77
Analysis #1: Determine which event types across the United States are most harmful to population health, using the fatality and injury counts
## Summarize fatalities and injuries for all events
harmfulEvent <- ddply(stormData, .(EVTYPE), summarize,
                    impact = sum(FATALITIES) +  sum(INJURIES))
## Get the top 10 events and display
harmfulEvent <- head(harmfulEvent[order(harmfulEvent$impact, decreasing = TRUE), ], 10)
print(harmfulEvent)  
##                EVTYPE impact
## 834           TORNADO  96979
## 130    EXCESSIVE HEAT   8428
## 856         TSTM WIND   7461
## 170             FLOOD   7259
## 464         LIGHTNING   6046
## 275              HEAT   3037
## 153       FLASH FLOOD   2755
## 427         ICE STORM   2064
## 760 THUNDERSTORM WIND   1621
## 972      WINTER STORM   1527
harmfulEvent$tmp <- reorder(harmfulEvent$EVTYPE, harmfulEvent$impact)
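
The summarize, sort, and reorder steps above can be illustrated on a small example. This is only a sketch: the `toy` data frame and its values are made up for illustration, and base R `aggregate()` is swapped in for `ddply()` so the block runs without plyr.

```r
# Toy data standing in for stormData (values are invented for illustration)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(2, 1, 0, 3, 4),
  INJURIES   = c(10, 0, 5, 1, 2),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries per event type using base R
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
totals$impact <- totals$FATALITIES + totals$INJURIES

# Sort from most to least harmful and keep the top 2 rows
top <- head(totals[order(totals$impact, decreasing = TRUE), ], 2)

# reorder() sorts the factor levels by impact, so a bar chart that maps
# this column to the x axis draws the bars in order of impact
top$tmp <- reorder(top$EVTYPE, top$impact)
print(top)
```

Sorting the rows alone does not control the bar order in a plot; the order comes from the factor levels, which is why the `tmp` column is created.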
Analysis #2: Determine which event types across the United States had the greatest economic impact, using the property and crop damage figures
## Property and crop damage must be calculated from PROPDMG/CROPDMG and the PROPDMGEXP/CROPDMGEXP exponent codes

## Function to calculate the actual exponent value 
exp_transform <- function(exp) {
  ## h -> hundred, k -> thousand, m -> million, b -> billion
  if (exp %in% c('h', 'H'))
    return(2)
  else if (exp %in% c('k', 'K'))
    return(3)
  else if (exp %in% c('m', 'M'))
    return(6)
  else if (exp %in% c('b', 'B'))
    return(9)
  else if (!is.na(as.numeric(exp))) ## if a digit
    return(as.numeric(exp))
  else  
    return(0) ## default return 0
}

## Use the function to calculate the actual damage (value * 10^exponent)
prop_dmg_exp <- sapply(stormData$PROPDMGEXP, FUN=exp_transform)
stormData$prop_dmg <- stormData$PROPDMG * (10 ^ prop_dmg_exp)
crop_dmg_exp <- sapply(stormData$CROPDMGEXP, FUN=exp_transform)
stormData$crop_dmg <- stormData$CROPDMG * (10 ^ crop_dmg_exp)
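
One point to check in the exponent mapping: `read.csv` created factors by default before R 4.0.0, and `as.numeric()` applied to a factor returns the internal level code, not the digit shown in the label. The sketch below is a hypothetical vectorized alternative that converts to character first. The name `exp_lookup` is an assumption introduced here, this is not the code that produced the figures in this report, and results computed with it may differ from those shown.

```r
# Hypothetical vectorized alternative to exp_transform(); illustration only
exp_lookup <- function(exp) {
  exp <- toupper(as.character(exp))       # factor -> character, fold case
  mult <- c(H = 2, K = 3, M = 6, B = 9)   # letter code -> power of ten
  out <- unname(mult[exp])                # NA where no letter code matches
  is_digit <- grepl("^[0-9]$", exp)       # single digits are used as exponents
  out[is_digit] <- as.numeric(exp[is_digit])
  out[is.na(out)] <- 0                    # blanks and symbols default to 0
  out
}

# Works on a factor as well as on a character vector
codes <- factor(c("K", "m", "", "5", "+", "B"))
exponents <- exp_lookup(codes)
print(exponents)

# Example: 25 with code "K" is 25 * 10^3 = 25000 dollars
print(25 * 10 ^ exp_lookup("K"))
```

Because the lookup operates on the whole column at once, it also avoids calling a function once per row, which matters for a data set of this size.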

## Calculate the total damage per event type and display the top 10 in billions of dollars
economicEvent <- ddply(stormData, .(EVTYPE), summarize,
                       impact = sum(prop_dmg) + sum(crop_dmg))
economicEvent <- head(economicEvent[order(economicEvent$impact, decreasing = TRUE), ], 10)
economicEvent$impact <- economicEvent$impact / 1e9
print(economicEvent)  
##                 EVTYPE      impact
## 153        FLASH FLOOD 68203.78828
## 786 THUNDERSTORM WINDS 20865.50750
## 834            TORNADO  1079.36622
## 244               HAIL   318.78181
## 464          LIGHTNING   172.95540
## 170              FLOOD   150.31968
## 411  HURRICANE/TYPHOON    71.91371
## 185           FLOODING    59.21711
## 670        STORM SURGE    43.32354
## 310         HEAVY SNOW    18.06724
economicEvent$tmp <- reorder(economicEvent$EVTYPE, economicEvent$impact)  

Results

Most Harmful Events: Plot the top 10 events that are most harmful with respect to population health

ggplot(data = harmfulEvent, aes(x = tmp, y = impact)) +
  geom_bar(fill = "blue", stat = "identity") +
  ggtitle("Top 10 Most Harmful Events") +
  xlab("Event type") + ylab("Number of Injuries and Fatalities") +
  theme(axis.text.x = element_text(angle=45, hjust=1))  

Conclusion: Tornadoes had the greatest impact on population health between 1950 and 2011 in terms of fatalities and injuries, followed by excessive heat and thunderstorm winds.

Events with Most Economic Consequence: Plot the top 10 events that had the greatest economic impact

ggplot(data = economicEvent, aes(x = tmp, y = impact)) +
  geom_bar(fill = "green", stat = "identity") +
  ggtitle("Top 10 Events with Most Economic Impact (Property and Crop Damage)") +
  xlab("Event type") + ylab("$ (Billion) Damage") +
  theme(axis.text.x = element_text(angle=45, hjust=1))  

Conclusion: Flash floods had the greatest impact on the economy between 1950 and 2011, followed by thunderstorm winds and tornadoes.