Synopsis

In this document we analyze the health and economic consequences of storm events recorded in a large data set from the NOAA. We aim to find the type of event that has caused the greatest harm to population health, measured in injuries and fatalities, and then the type that has caused the greatest economic harm, measured as property and crop damage.

Data Processing

The first step is to load the data using ‘read.csv’. We also need to process the two exponent columns, CROPDMGEXP and PROPDMGEXP. The following code interprets the values in these two columns using a local function called “GetExponent” and then adjusts the corresponding costs, CROPDMG and PROPDMG, by multiplying each by 10 raised to the power of its exponent column. The two adjusted costs are then summed into a ‘TotalDamage’ column.

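library(dplyr)    # provides mutate()
library(ggplot2)  # provides ggplot() for the plots in the Results section

stormData <- read.csv("repdata_data_StormData.csv.bz2")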

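# Convert one exponent code to a power of 10: h = 2, k = 3, m = 6, b = 9.
# Blank, "+", "?" and "-" are treated as 0; digit codes are used as-is.
GetExponent <- function(x){
  # as.character() guards against factor columns, where as.numeric()
  # would otherwise return the level index instead of the digit.
  x <- tolower(as.character(x))
  calc <- x
  if (x == "k"){
    calc <- 3}
  else if (x == "m"){
    calc <- 6}
  else if (x == "b"){
    calc <- 9}
  else if (x == "h"){
    calc <- 2}
  else if (x %in% c("", "+", "?", "-")){
    calc <- 0}
  
  return(as.numeric(calc))
}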

stormData <- mutate(stormData, CROPDMGEXP = sapply(CROPDMGEXP, GetExponent))
stormData <- mutate(stormData, PROPDMGEXP = sapply(PROPDMGEXP, GetExponent))
stormData$CROPDMG <- stormData$CROPDMG * 10 ^ (stormData$CROPDMGEXP)
stormData$PROPDMG <- stormData$PROPDMG * 10 ^ (stormData$PROPDMGEXP)
stormData$TotalDamage <- stormData$CROPDMG + stormData$PROPDMG
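
The same code-to-exponent mapping can also be applied to a whole column at once rather than one element at a time. The following is a minimal sketch of that idea, not part of the analysis above: the names exponentLookup and ToExponent and the toy data frame are hypothetical, and the mapping assumes the same rules as GetExponent (letters map to 2, 3, 6 or 9, single digits are used as-is, and anything else counts as 0).

```r
# Hypothetical vectorized alternative to a per-element exponent function.
exponentLookup <- c(h = 2, k = 3, m = 6, b = 9)

ToExponent <- function(codes) {
  codes <- tolower(as.character(codes))
  # Letter codes come from the lookup table; unmatched codes give NA here.
  out <- unname(exponentLookup[codes])
  # Single-digit codes are used as the exponent directly.
  isDigit <- grepl("^[0-9]$", codes)
  out[isDigit] <- as.numeric(codes[isDigit])
  # Everything else (blank, "+", "?", "-") is treated as exponent 0.
  out[is.na(out)] <- 0
  out
}

# Toy data, invented for illustration; not rows from the NOAA file.
toy <- data.frame(PROPDMG = c(25, 2.5, 1, 10),
                  PROPDMGEXP = c("K", "m", "", "2"))
toyDamage <- toy$PROPDMG * 10 ^ ToExponent(toy$PROPDMGEXP)
toyDamage  # 25000, 2500000, 1, 1000
```

On a column with several hundred thousand rows, a lookup of this kind avoids calling a function once per row, which is the main reason to consider it.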

Results

First we’ll look at the population health effects, which are recorded as injuries and fatalities. Aggregating the injuries and fatalities separately for each type of event yields a result that is easy to interpret: tornadoes cause by far the greatest harm in both columns.

stormInjuries <- subset(aggregate(INJURIES ~ EVTYPE, data = stormData, sum), INJURIES > 0)
stormFatalities <- subset(aggregate(FATALITIES ~ EVTYPE, data = stormData, sum), FATALITIES > 0)

stormInjuries[stormInjuries$INJURIES == max(stormInjuries$INJURIES),]
##      EVTYPE INJURIES
## 834 TORNADO    91346
stormFatalities[stormFatalities$FATALITIES == max(stormFatalities$FATALITIES),]
##      EVTYPE FATALITIES
## 834 TORNADO       5633

Hence we conclude that tornadoes cause the greatest harm to population health of all event types.
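
The max() filter used above returns only the single worst event type. To see the runners-up as well, one option is to sort the aggregated table and take the first few rows. The sketch below shows the pattern on a toy table; the name toyTotals and its values are invented for illustration and are not results from the NOAA data.

```r
# Toy totals, invented for illustration; not values from the NOAA data.
toyTotals <- data.frame(
  EVTYPE = c("A", "B", "C", "D"),
  INJURIES = c(12, 340, 7, 95)
)

# Sort by descending total and keep the top three rows.
ranked <- toyTotals[order(toyTotals$INJURIES, decreasing = TRUE), ]
topThree <- head(ranked, 3)
topThree
```

The same two lines should work on stormInjuries or stormFatalities by swapping in the table and column names.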

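The following code and corresponding plots show how tornadoes stand out compared to the other event types that fall above the 95th percentile for either injuries or fatalities. Because the merge keeps only event types that appear in both tables, the comparison covers event types with at least one injury and at least one fatality.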

stormDataPersonalHarm <- merge(stormInjuries, stormFatalities, by="EVTYPE")
stormDataPersonalHarm.Significant <- subset(stormDataPersonalHarm,
                            INJURIES > quantile(stormDataPersonalHarm$INJURIES, 0.95)
                            | FATALITIES > quantile(stormDataPersonalHarm$FATALITIES, 0.95))
ggplot(data = stormDataPersonalHarm.Significant, aes(x = EVTYPE, y = INJURIES)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  ggtitle("Total Injuries from Worst 5% of Event Types")

ggplot(data = stormDataPersonalHarm.Significant, aes(x = EVTYPE, y = FATALITIES)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  ggtitle("Total Fatalities from Worst 5% of Event Types")
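
The merge() call above uses its default behavior, which is an inner join: an event type must appear in both the injuries table and the fatalities table to survive. The sketch below illustrates the difference between that default and a full join on two toy tables. All names and values here are invented for illustration, and filling missing counts with 0 is an assumption about how one might want to treat event types that are absent from one table.

```r
# Toy tables, invented for illustration; not values from the NOAA data.
toyInjuries   <- data.frame(EVTYPE = c("A", "B", "C"), INJURIES = c(5, 50, 500))
toyFatalities <- data.frame(EVTYPE = c("B", "C", "D"), FATALITIES = c(1, 20, 3))

# Default merge() is an inner join: only B and C appear in both tables.
inner <- merge(toyInjuries, toyFatalities, by = "EVTYPE")

# all = TRUE keeps every event type; counts missing from one table
# come back as NA, so they are set to 0 here.
outer <- merge(toyInjuries, toyFatalities, by = "EVTYPE", all = TRUE)
outer[is.na(outer)] <- 0

nrow(inner)  # 2
nrow(outer)  # 4
```

Which join is appropriate depends on the question: the inner join restricts the percentile thresholds to event types that caused both kinds of harm, while the full join would include event types that caused only one.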

For economic consequences we want to combine crop and property damage, which we do with the ‘TotalDamage’ column computed in the data processing step above. This column gives the total dollar amount of damage for each event, and we again aggregate by event type.

stormDataTotalDamage <- aggregate(TotalDamage ~ EVTYPE, data = stormData, sum)
stormDataTotalDamage[stormDataTotalDamage$TotalDamage == max(stormDataTotalDamage$TotalDamage),]
##     EVTYPE  TotalDamage
## 170  FLOOD 150319678257

Floods have caused the greatest economic damage, about $150 billion in total. As the following plot shows, even among only the top 1% of event types by economic damage, floods are far and away the most costly.

stormDataTotalDamage.Significant <- subset(stormDataTotalDamage,
                            TotalDamage > quantile(stormDataTotalDamage$TotalDamage, 0.99))
ggplot(stormDataTotalDamage.Significant, aes(x = EVTYPE, y = TotalDamage)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 30, vjust = 1, hjust = 1)) +
  ggtitle("Total Damage in Worst 1% of Event Types")