This document outlines the analysis performed on the NOAA Storm Database, covering the years 1950 through November 2011. The goal of this analysis is to determine:
Which storm event type has the greatest impact on population health
Which storm event type has the greatest economic impact
Recorded fatalities and injuries are used as indicators of an event's impact on population health. Similarly, the economic impact of a storm event is measured by the property damage and crop damage indicators.
The storm data is downloaded as a .bz2 file from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2 using the following R code. The file is then read into R with the read.csv command and stored in the variable NOAA.
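# Download the compressed file once; skip the download if it is already on disk.
if (!file.exists("NOAAStorm.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                destfile = "NOAAStorm.bz2", method = "libcurl", mode = "wb")  # "wb" keeps the binary file intact on Windows
}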
NOAA <- read.csv("NOAAStorm.bz2")
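The read step above relies on read.csv being able to open a bzip2-compressed file directly, without a separate decompression step. The following minimal sketch illustrates this on a toy file; the temporary file and the two-row data frame are made up for illustration and are not part of the NOAA data.

```r
# Toy illustration: write a tiny CSV through a bzip2 connection, then read it back.
tmp <- tempfile(fileext = ".csv.bz2")
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))

con <- bzfile(tmp, open = "w")          # bzip2-compressing text connection
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() receives only the path; the connection it opens detects
# the bzip2 compression from the file contents.
roundtrip <- read.csv(tmp, stringsAsFactors = FALSE)
unlink(tmp)                             # remove the temporary file
```

If the round trip works as expected, roundtrip holds the same two rows as toy.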
To obtain a unique set of storm event types, all event types were converted to uppercase, which eliminates duplicates that differ only in letter case.
NOAA$EVTYPE <- factor(toupper(NOAA$EVTYPE))
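As a small illustration of why this step matters, the sketch below applies the same toupper and factor calls to a made-up vector of labels (the values are hypothetical, not taken from the database).

```r
# Hypothetical labels that differ only in letter case.
raw_types <- c("Flood", "FLOOD", "flood", "Tornado", "TORNADO")

n_before <- length(unique(raw_types))   # 5 distinct spellings
normalized <- factor(toupper(raw_types))
n_after <- nlevels(normalized)          # 2 levels: "FLOOD" and "TORNADO"
```

Note that uppercasing only merges case variants. Labels that differ in spacing or spelling would remain separate levels.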
For data processing, the plyr and dplyr libraries were loaded.
library(plyr)
library(dplyr)
Fatalities and injuries for each storm event type were summed into the variables fatal and injure. Harm to population health was measured as the sum of total fatalities and injuries caused by a storm event type.
fatal <- tapply(NOAA$FATALITIES, NOAA$EVTYPE, sum)
injure <- tapply(NOAA$INJURIES, NOAA$EVTYPE, sum)
harm <- fatal + injure
worstevent <- names(which.max(harm))
maxfatal <- fatal[which.max(harm)]
maxinjure <- injure[which.max(harm)]
According to this storm database, the most harmful event type was TORNADO, with a total of 5,633 fatalities and 91,346 injuries over the period from 1950 through November 2011.
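The grouping logic behind this result can be checked by hand on a small example. The sketch below uses a made-up five-row data frame and toy_ prefixed names (both hypothetical) so that it does not overwrite the variables of the actual analysis.

```r
# Hypothetical records; the real analysis uses the NOAA data frame.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 3, 4, 0),
  INJURIES   = c(10, 0, 5, 1, 2)
)

# Sum each indicator within event type, then add the two totals.
toy_fatal  <- tapply(toy$FATALITIES, toy$EVTYPE, sum)
toy_injure <- tapply(toy$INJURIES, toy$EVTYPE, sum)
toy_harm   <- toy_fatal + toy_injure

# which.max() returns the position of the largest total; names() gives its label.
toy_worst <- names(which.max(toy_harm))
```

Here the totals are 3 for FLOOD, 5 for HEAT and 20 for TORNADO, so toy_worst is "TORNADO".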
It is also of interest to consider storm event types that caused more than 100 combined fatalities and injuries over the period recorded in this database. These data are stored in a data frame called mostharm. The plot below shows the event types that caused the largest number of fatalities and injuries, excluding TORNADO.
AllHarm <- data.frame(cbind(fatal,injure,harm))
AllHarm <- cbind(rownames(AllHarm), AllHarm)
AllHarm <- arrange(AllHarm, harm)
mostharm <- subset(AllHarm, harm > 100 & harm < (maxinjure + maxfatal))
par(mar = c(10,6,4,1))
barplot(mostharm$harm, names.arg = mostharm$`rownames(AllHarm)`, las = 2, cex.names = 0.67,
        ylab = "Fatalities and Injuries",
        main = "Event Types Causing >100 Fatalities/Injuries excluding Worst Event Type")
The storm data is subsetted into a data frame, economic, that keeps only monetary damage as measured by property damage (PROPDMG) and crop damage (CROPDMG). The units for the cost of damage, given as hundreds (H), thousands (K), millions (M) or billions (B) of dollars, are recorded in the variables PROPDMGEXP and CROPDMGEXP. Each unit code is converted to a power of ten, each damage value is multiplied by its corresponding power, and the two results are summed into the column totaldamage to compute the total monetary damage caused by an incident.
economic <- subset(NOAA, (PROPDMG>0 | CROPDMG>0), select = c(EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP))
economic <- mutate(economic, PROPDMGEXP = gsub("[hH]", "2", economic$PROPDMGEXP))
economic <- mutate(economic, PROPDMGEXP = gsub("[kK]", "3", economic$PROPDMGEXP))
economic <- mutate(economic, PROPDMGEXP = gsub("[mM]", "6", economic$PROPDMGEXP))
economic <- mutate(economic, PROPDMGEXP = gsub("[bB]", "9", economic$PROPDMGEXP))
economic <- mutate(economic, CROPDMGEXP = gsub("[hH]", "2", economic$CROPDMGEXP))
economic <- mutate(economic, CROPDMGEXP = gsub("[kK]", "3", economic$CROPDMGEXP))
economic <- mutate(economic, CROPDMGEXP = gsub("[mM]", "6", economic$CROPDMGEXP))
economic <- mutate(economic, CROPDMGEXP = gsub("[bB]", "9", economic$CROPDMGEXP))
economic$PROPDMGEXP <- as.numeric(economic$PROPDMGEXP)
## Warning: NAs introduced by coercion
economic$CROPDMGEXP <- as.numeric(economic$CROPDMGEXP)
## Warning: NAs introduced by coercion
economic[is.na(economic)] <- 0 #It is assumed that NA values render exponent 0 for multiplier of 1.
economic <- mutate(economic, totaldamage = PROPDMG * (10^PROPDMGEXP) + CROPDMG * (10^CROPDMGEXP))
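The same conversion rule can be written more compactly with a named lookup vector. The following is only a sketch of an alternative, not the method used above: the helper exp_to_power and the sample values are hypothetical. It follows the same assumptions as the code above, namely that digit codes are used as exponents directly and that any other unrecognized code gets exponent 0.

```r
# Hypothetical helper, not part of the original analysis.
exp_to_power <- function(code) {
  lookup <- c(H = 2, K = 3, M = 6, B = 9)
  code <- toupper(as.character(code))
  power <- unname(lookup[code])                 # NA for codes not in the table
  digit <- suppressWarnings(as.numeric(code))   # digit codes such as "5" pass through
  power[is.na(power)] <- digit[is.na(power)]
  power[is.na(power)] <- 0                      # anything else: exponent 0, multiplier 1
  power
}

# Made-up damage values and unit codes for illustration.
codes   <- c("K", "m", "B", "", "5", "?")
values  <- c(25, 2.5, 1, 10, 3, 7)
dollars <- values * 10^exp_to_power(codes)
```

With these inputs, 25 with code K becomes 25,000 dollars and 2.5 with code m becomes 2.5 million dollars, while the values with an empty or "?" code are left unscaled.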
The monetary damage for each storm event type was summed into the data frame damage.
damage <- tapply(economic$totaldamage, economic$EVTYPE, sum)
damage <- data.frame(damage)
damage <- cbind(rownames(damage),damage)
damage <- na.omit(damage)
damage <- arrange(damage, damage)
maxdamage <- damage$`rownames(damage)`[dim(damage)[1]]
costmaxdamage <- damage$damage[dim(damage)[1]]
According to the NOAA storm database, the greatest economic damage was caused by FLOOD, with a total cost of about $1.5031968 × 10^11 (roughly $150.3 billion) over the period from 1950 through November 2011.
It is also of interest to consider storm event types that caused more than one billion dollars of damage over the period recorded in this database. These data are stored in a data frame called mostdamage. The plot below shows the event types that caused the greatest economic damage, excluding FLOOD.
mostdamage <- subset(damage, damage > 1000000000 & damage < costmaxdamage)
mostdamage <- mutate(mostdamage, damage = damage/1000000000)
par(mar = c(10,6,4,4))
barplot(mostdamage$damage, names.arg = mostdamage$`rownames(damage)`, las=2, cex.names = 0.65, ylab = "Cost ($ Billion)", main = "Event Types Causing Greatest Economic Damage Excluding Worst Event Type")