Synopsis
In this report we analyze the storm event database from the U.S. National Oceanic and Atmospheric Administration (NOAA) to determine the impact of storms on human health and the economy between 1950 and 2011. The analysis finds that tornadoes are the most dangerous events in terms of fatalities and injuries, followed by excessive heat and thunderstorm winds. In terms of economic impact, flash floods had the most significant impact, followed by thunderstorm winds and tornadoes.
Data Processing
Data Loading: Load the data obtained from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database
## Load the required libraries
library(plyr)
library(ggplot2)
## Load the data
stormData = read.csv("./repdata-data-StormData.csv.bz2")
After loading the data, check its dimensions and display the first few rows of the first five columns. There are 902297 rows and 37 columns.
dim(stormData)
## [1] 902297 37
head(stormData[,1:5])
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY
## 1       1  4/18/1950 0:00:00     0130       CST     97
## 2       1  4/18/1950 0:00:00     0145       CST      3
## 3       1  2/20/1951 0:00:00     1600       CST     57
## 4       1   6/8/1951 0:00:00     0900       CST     89
## 5       1 11/15/1951 0:00:00     1500       CST     43
## 6       1 11/15/1951 0:00:00     2000       CST     77
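The 1950 to 2011 span quoted in the synopsis can be checked against BGN_DATE. The following is a minimal sketch on a few date strings copied from the output above. The names `toy_dates` and `parsed` are illustrative only, and applying the same call to the full `stormData$BGN_DATE` column assumes every row uses this month/day/year layout.

```r
# A few BGN_DATE values as they appear in the output above
toy_dates <- c("4/18/1950 0:00:00", "2/20/1951 0:00:00", "11/15/1951 0:00:00")

# as.Date() reads only as much of each string as the format needs,
# so the trailing " 0:00:00" is ignored
parsed <- as.Date(toy_dates, format = "%m/%d/%Y")

# Earliest and latest year among the parsed dates
range(as.integer(format(parsed, "%Y")))
```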
Analysis #1: Determine which events across the United States are most harmful with respect to population health, using the fatality and injury counts
## Summarize fatalities and injuries for all events
harmfulEvent <- ddply(stormData, .(EVTYPE), summarize,
impact = sum(FATALITIES) + sum(INJURIES))
## Get the top 10 events and display
harmfulEvent <- head(harmfulEvent[order(harmfulEvent$impact, decreasing = T), ], 10)
print(harmfulEvent)
##                EVTYPE impact
## 834           TORNADO  96979
## 130    EXCESSIVE HEAT   8428
## 856         TSTM WIND   7461
## 170             FLOOD   7259
## 464         LIGHTNING   6046
## 275              HEAT   3037
## 153       FLASH FLOOD   2755
## 427         ICE STORM   2064
## 760 THUNDERSTORM WIND   1621
## 972      WINTER STORM   1527
harmfulEvent$tmp <- reorder(harmfulEvent$EVTYPE, harmfulEvent$impact)
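The `tmp` column exists so that the bars in the later plot appear sorted by impact rather than alphabetically. A small sketch of what `reorder()` does, using a made-up data frame (`toy` and its values are hypothetical and not taken from the storm database):

```r
# Toy data, not taken from the storm database
toy <- data.frame(event = c("A", "B", "C"), impact = c(30, 10, 20))

# reorder() returns a factor whose levels are sorted by the second argument
# (ascending), which is the order ggplot2 uses along a discrete axis
toy$ordered <- reorder(toy$event, toy$impact)

levels(toy$ordered)
```

Here the levels come out as B, C, A, that is, from the smallest impact to the largest.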
Analysis #2: Determine which events across the United States had the most significant economic impact, using the property and crop damage numbers
## Property and crop damage need to be calculated using the PROPDMG/CROPDMG and PROPDMGEXP/CROPDMGEXP values
## Function to calculate the actual exponent value
exp_transform <- function(exp) {
  ## h -> hundred, k -> thousand, m -> million, b -> billion
  if (exp %in% c('h', 'H'))
    return(2)
  else if (exp %in% c('k', 'K'))
    return(3)
  else if (exp %in% c('m', 'M'))
    return(6)
  else if (exp %in% c('b', 'B'))
    return(9)
  else if (!is.na(as.numeric(exp))) ## if a digit
    return(as.numeric(exp))
  else
    return(0) ## default return 0
}
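The same mapping can also be written as a vectorized lookup, which avoids calling the function once per row. The sketch below is an alternative under stated assumptions, not the code that produced the results in this report. `exp_lookup`, `exp_transform_vec`, `toy_exp` and `toy_dmg` are hypothetical names, and only single digits are treated as numeric exponents. It converts its input with `as.character()` first, because `as.numeric()` applied to a factor returns the internal level code rather than the digit, which may be worth checking if the exponent columns were read in as factors.

```r
# Hypothetical lookup table: exponent letter -> power of ten
exp_lookup <- c(h = 2, k = 3, m = 6, b = 9)

# Vectorized sketch of the exponent conversion (name is illustrative)
exp_transform_vec <- function(exp) {
  exp <- tolower(as.character(exp))   # as.character() guards against factor codes
  out <- unname(exp_lookup[exp])      # NA wherever the symbol is not a letter code
  digit <- grepl("^[0-9]$", exp)      # single digits are used as the exponent itself
  out[digit] <- as.numeric(exp[digit])
  out[is.na(out)] <- 0                # anything else defaults to 0
  out
}

# Toy values, not taken from the storm database
toy_exp <- c("K", "m", "", "5", "?", "B")
toy_dmg <- c(25, 2.5, 10, 1, 3, 0.1)

# Damage in dollars: 25 with exponent "K" becomes 25 * 10^3 = 25000
toy_dmg * 10^exp_transform_vec(toy_exp)
```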
## Use the function to calculate the actual damage
prop_dmg_exp <- sapply(stormData$PROPDMGEXP, FUN=exp_transform)
stormData$prop_dmg <- stormData$PROPDMG * (10 ** prop_dmg_exp)
crop_dmg_exp <- sapply(stormData$CROPDMGEXP, FUN=exp_transform)
stormData$crop_dmg <- stormData$CROPDMG * (10 ** crop_dmg_exp)
## Calculate the total damage per event, keep the top 10 events and display the values in $ (Billion)
economicEvent <- ddply(stormData, .(EVTYPE), summarize,
impact = sum(prop_dmg) + sum(crop_dmg))
economicEvent <- head(economicEvent[order(economicEvent$impact, decreasing = T), ], 10)
economicEvent$impact <- economicEvent$impact/1000000000
print(economicEvent)
##                 EVTYPE      impact
## 153        FLASH FLOOD 68203.78828
## 786 THUNDERSTORM WINDS 20865.50750
## 834            TORNADO  1079.36622
## 244               HAIL   318.78181
## 464          LIGHTNING   172.95540
## 170              FLOOD   150.31968
## 411  HURRICANE/TYPHOON    71.91371
## 185           FLOODING    59.21711
## 670        STORM SURGE    43.32354
## 310         HEAVY SNOW    18.06724
economicEvent$tmp <- reorder(economicEvent$EVTYPE, economicEvent$impact)
Results
Most Harmful Events: Plot the graph of top 10 events that are most harmful with respect to population
## Refer to columns by name inside aes() instead of using harmfulEvent$tmp
ggplot(data = harmfulEvent, aes(x = tmp, y = impact)) +
  geom_bar(fill = "blue", stat = "identity") +
  ggtitle("Top 10 Most Harmful Events") +
  xlab("Event type") + ylab("Number of Injuries and Fatalities") +
  theme(axis.text.x = element_text(angle=45, hjust=1))

Conclusion: Tornadoes had the most significant impact on the population between 1950 and 2011 in terms of fatalities and injuries, followed by Excessive Heat and Thunderstorm Winds.
Events with most Economic Consequence: Plot the graph of top 10 events that had the most significant economic impact
## Refer to columns by name inside aes() instead of using economicEvent$tmp
ggplot(data = economicEvent, aes(x = tmp, y = impact)) +
  geom_bar(fill = "green", stat = "identity") +
  ggtitle("Top 10 Events with Most Economic Impact (Property and Crop Damage)") +
  xlab("Event type") + ylab("$ (Billion) Damage") +
  theme(axis.text.x = element_text(angle=45, hjust=1))

Conclusion: Flash Floods had the most impact on the economy between 1950 and 2011, followed by Thunderstorm Winds and Tornadoes.