Reproducible Data Peer Assignment 2 - Storm Data Analysis for Human Health and Property Damage

This report evaluates the effects of storms on both human health and the economy, using the US National Oceanic and Atmospheric Administration's (NOAA) storm database. The tornado was the most harmful event type, as measured by both fatalities and injuries. I did not combine fatality and injury counts because they are fundamentally different types of loss. Economic impact was measured from the crop and property damage values. Because both are expressed as dollar amounts, I combined them. Floods, hurricanes/typhoons, and tornadoes had the greatest economic impact in terms of combined crop and property damage.

Data Processing

The storm database is distributed as a BZ2 (bzip2-compressed) file, and R can read the CSV file inside it directly. The property and crop damage fields each come with a second field that holds the exponent: K for thousand, M for million, and B for billion. I created a new field, named prop_norm, that holds the sum of the property and crop damage after each has been scaled by its exponent. Any other exponent code is treated as zero damage.
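The claim that R reads a bzip2-compressed CSV without a separate decompression step can be checked with a small round trip. The sketch below is only an illustration: the toy data frame and the temporary file are assumptions, not part of the storm data.

```r
# Toy stand-in for the storm file; the column names mirror the real data,
# the values are made up for illustration.
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), INJURIES = c(3, 1))

# Write the toy data as a bzip2-compressed CSV in a temporary location.
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() is given the compressed file directly; R detects the
# compression when it opens the file and decompresses on the fly.
round_trip <- read.csv(path)
print(round_trip)
```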

library(knitr)  # provides kable() for the result tables
raw <- read.csv("repdata-data-StormData.csv.bz2")

# Scale the crop and property values by their exponents.  The two can be
# combined because they are both dollar amounts.  Exponent codes other than
# K, M, and B contribute 0.
raw$prop_norm <- apply(raw[, c("PROPDMGEXP", "PROPDMG", "CROPDMGEXP", "CROPDMG")], 
    1, function(x) {
        result <- 0
        # Property damage
        if (x[1] == "K") {
            result <- as.numeric(x[2]) * 1000
        } else if (x[1] == "M") {
            result <- as.numeric(x[2]) * 1e+06
        } else if (x[1] == "B") {
            result <- as.numeric(x[2]) * 1e+09
        }
        # Crop damage
        if (x[3] == "K") {
            result <- result + as.numeric(x[4]) * 1000
        } else if (x[3] == "M") {
            result <- result + as.numeric(x[4]) * 1e+06
        } else if (x[3] == "B") {
            result <- result + as.numeric(x[4]) * 1e+09
        }

        result
    })
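The row-wise apply() above works, but it visits every record one at a time. A vectorized lookup is one possible alternative. The sketch below is hedged: exp_mult and to_dollars are hypothetical names introduced here, the toy data frame is made up, and the code is meant to mirror the logic above (only uppercase K, M, and B are scaled, everything else counts as 0) rather than replace the author's method.

```r
# Multipliers for the exponent codes the report handles. Any other code
# (blank, lowercase, digits) maps to 0, matching the row-wise logic above.
exp_mult <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale a vector of damage values by their exponent codes.
to_dollars <- function(value, exp_code) {
    mult <- unname(exp_mult[as.character(exp_code)])
    mult[is.na(mult)] <- 0  # unknown codes contribute nothing
    value * mult
}

# Made-up records using the same column names as the storm data.
toy <- data.frame(
    PROPDMG = c(25, 2.5, 1, 10), PROPDMGEXP = c("K", "M", "B", ""),
    CROPDMG = c(5, 0, 0, 3),     CROPDMGEXP = c("K", "", "", "k")
)
toy$prop_norm <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP) +
    to_dollars(toy$CROPDMG, toy$CROPDMGEXP)
print(toy)
```

The last toy row has a blank property exponent and a lowercase crop exponent, so its prop_norm is 0, just as it would be under the original if/else chain.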

I wanted to see which storm events caused the greatest number of injuries. To do this I summed injuries by event type (EVTYPE) and kept the five event types with the highest totals.

inj_sums <- aggregate(raw$INJURIES, by = list(Category = raw$EVTYPE), FUN = sum)
names(inj_sums) <- c("evtype", "count")
inj_sums <- inj_sums[order(-inj_sums$count), ]
inj_sums <- inj_sums[1:5, ]

I wanted to see which storm events caused the greatest number of fatalities. To do this I summed fatalities by event type (EVTYPE) and kept the five event types with the highest totals.

fatal_sums <- aggregate(raw$FATALITIES, by = list(Category = raw$EVTYPE), FUN = sum)
names(fatal_sums) <- c("evtype", "count")
fatal_sums <- fatal_sums[order(-fatal_sums$count), ]
fatal_sums <- fatal_sums[1:5, ]

I wanted to see which storm events caused the largest amount of combined property and crop damage. To do this I summed prop_norm by event type (EVTYPE) and kept the five event types with the highest totals.

propdmg_sums <- aggregate(raw$prop_norm, by = list(Category = raw$EVTYPE), FUN = sum)
names(propdmg_sums) <- c("evtype", "count")
propdmg_sums <- propdmg_sums[order(-propdmg_sums$count), ]
propdmg_sums <- propdmg_sums[1:5, ]
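The three summaries above repeat the same aggregate, rename, sort, and truncate steps. As a sketch only, they could be folded into one function. The name top_events and the toy data are assumptions made for illustration.

```r
# Hypothetical helper: total `column` by EVTYPE and keep the n largest totals.
top_events <- function(data, column, n = 5) {
    sums <- aggregate(data[[column]], by = list(evtype = data$EVTYPE), FUN = sum)
    names(sums) <- c("evtype", "count")
    sums <- sums[order(-sums$count), ]
    head(sums, n)  # head() returns fewer rows if there are fewer than n groups
}

# Made-up records with the same column names as the storm data.
toy <- data.frame(
    EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD", "HEAT"),
    INJURIES = c(10, 2, 5, 0, 4, 1)
)
top2 <- top_events(toy, "INJURIES", n = 2)
print(top2)
```

With the real data, calls such as top_events(raw, "INJURIES") would be expected to reproduce the injury summary, though that has not been verified here.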

Results

The results of this study are summarized in the following two sections, starting with human health.

Human Health

Weather can have a devastating impact on human health. To assess it I evaluated both injuries and fatalities. The tornado was the greatest contributor to both. Excessive heat was the second largest source of fatalities, and thunderstorm wind (TSTM WIND) was the second leading cause of injuries.

The following table shows the top five events for injuries.

kable(inj_sums, format = "markdown")
## |id   |evtype          |  count|
## |:----|:---------------|------:|
## |834  |TORNADO         |  91346|
## |856  |TSTM WIND       |   6957|
## |170  |FLOOD           |   6789|
## |130  |EXCESSIVE HEAT  |   6525|
## |464  |LIGHTNING       |   5230|

The following table shows the top five events for fatalities.

kable(fatal_sums, format = "markdown")
## |id   |evtype          |  count|
## |:----|:---------------|------:|
## |834  |TORNADO         |   5633|
## |130  |EXCESSIVE HEAT  |   1903|
## |153  |FLASH FLOOD     |    978|
## |275  |HEAT            |    937|
## |464  |LIGHTNING       |    816|
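To put the tornado's lead in proportion, the counts in the two tables above can be turned into shares. The sketch below copies the table values by hand, so the percentages are shares of each top-five total, not of all events in the database. They come out at roughly 78% of injuries and 55% of fatalities.

```r
# Counts copied from the two tables above (top five events only).
injuries <- c("TORNADO" = 91346, "TSTM WIND" = 6957, "FLOOD" = 6789,
              "EXCESSIVE HEAT" = 6525, "LIGHTNING" = 5230)
fatalities <- c("TORNADO" = 5633, "EXCESSIVE HEAT" = 1903, "FLASH FLOOD" = 978,
                "HEAT" = 937, "LIGHTNING" = 816)

# Tornado share of each top-five total, in percent.
tornado_injury_share <- 100 * injuries[["TORNADO"]] / sum(injuries)
tornado_fatality_share <- 100 * fatalities[["TORNADO"]] / sum(fatalities)

# EXCESSIVE HEAT and HEAT are separate EVTYPE labels in the data; even their
# combined total stays below the tornado count.
heat_fatalities <- fatalities[["EXCESSIVE HEAT"]] + fatalities[["HEAT"]]

print(round(c(tornado_injury_share, tornado_fatality_share), 1))
print(heat_fatalities)
```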

The following charts show fatalities (top) and injuries (bottom) for the top five events in each category.

par(mfrow = c(2, 1))
barplot(fatal_sums$count, names.arg = fatal_sums$evtype, xlab = "Event", ylab = "Fatalities")
barplot(inj_sums$count, names.arg = inj_sums$evtype, xlab = "Event", ylab = "Injuries")

[Figure: two stacked bar charts, fatalities by event (top) and injuries by event (bottom)]
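With five long event names on the x-axis, base R's barplot() may drop some labels when they would overlap. One possible remedy, sketched below with the fatality counts copied from the table, is to rotate the labels and widen the bottom margin. The margin and text sizes are guesses that may need tuning for a given output size.

```r
# Fatality counts and event names copied from the fatalities table.
fatal_counts <- c(5633, 1903, 978, 937, 816)
fatal_names <- c("TORNADO", "EXCESSIVE HEAT", "FLASH FLOOD", "HEAT", "LIGHTNING")

# Widen the bottom margin so rotated labels fit, and remember the old settings.
old_par <- par(mar = c(9, 4, 1, 1))

# las = 2 draws axis labels perpendicular to the axis; cex.names shrinks them.
# barplot() returns the x positions of the bar midpoints.
mids <- barplot(fatal_counts, names.arg = fatal_names, las = 2,
                cex.names = 0.8, ylab = "Fatalities")

par(old_par)  # restore the previous graphics settings
```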

Economic Consequences

Weather can have devastating effects on property and crops. To assess this damage I evaluated crop and property damage together. Floods, hurricanes/typhoons, and tornadoes had the greatest economic impact in terms of combined crop and property damage.

The following table shows the top five events for economic impact.

kable(propdmg_sums, format = "markdown")
## |id   |evtype             |      count|
## |:----|:------------------|----------:|
## |170  |FLOOD              |  1.503e+11|
## |411  |HURRICANE/TYPHOON  |  7.191e+10|
## |834  |TORNADO            |  5.734e+10|
## |670  |STORM SURGE        |  4.332e+10|
## |244  |HAIL               |  1.875e+10|
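The scientific notation in the table is compact but hard to read at a glance. As a small illustration, the values can be restated in billions of dollars. The numbers below are copied from the table, so they carry the same rounding to four significant digits.

```r
# Damage totals copied from the table above, in dollars.
damage <- c("FLOOD" = 1.503e11, "HURRICANE/TYPHOON" = 7.191e10,
            "TORNADO" = 5.734e10, "STORM SURGE" = 4.332e10, "HAIL" = 1.875e10)

# Restate each total in billions of dollars with two decimal places.
damage_labels <- sprintf("%s: $%.2f billion", names(damage), damage / 1e9)
print(damage_labels)
```

Read this way, floods account for about 150 billion dollars, roughly twice the hurricane/typhoon total.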

The following chart shows combined property and crop damage for the top five events.

par(mfrow = c(1, 1))  # single panel: only one chart is drawn here
barplot(propdmg_sums$count, names.arg = propdmg_sums$evtype, xlab = "Event", 
    ylab = "Property and Crop Damage")

[Figure: bar chart of combined property and crop damage by event]