Synopsis

Using the NOAA Storm Data, this report answers two questions:

  1. Across the United States, which types of events are most harmful with respect to population health?
  2. Across the United States, which types of events have the greatest economic consequences?

Knowing the answers to these questions helps in preparing for such events. Tornadoes cause the most injuries and fatalities. The combined event type TORNADOES, TSTM WIND, HAIL accounts for the most economic damage.

Data Processing

Let’s load the data, downloading the compressed file first if it is not already present.

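# Download the compressed file only if it is not already in the working directory
if (!file.exists("repdata-data-StormData.csv.bz2")) {
  download.file(url = "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                destfile = "repdata-data-StormData.csv.bz2",
                mode = "wb")  # binary mode keeps the archive intact on Windows
}
# read.csv() decompresses the .bz2 file while reading it
StormData <- read.csv("repdata-data-StormData.csv.bz2")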
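
As a quick check that read.csv() can read a bzip2-compressed file directly, the following sketch writes a tiny made-up table to a temporary .bz2 file and reads it back. The data frame toy and the temporary file are hypothetical stand-ins, not part of the NOAA data set.

```r
# Hypothetical miniature data set, not taken from the NOAA file
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                  FATALITIES = c(2, 0),
                  INJURIES = c(10, 1))

# Write the table through a bzip2 connection to a temporary file
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and decompresses while reading
toyBack <- read.csv(tmp, stringsAsFactors = FALSE)
str(toyBack)
unlink(tmp)
```

The same call pattern applies to the full file; only the file name changes.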

Question 1

Across the United States, which types of events are most harmful with respect to population health?

Let’s combine Fatalities and Injuries into a single count, Casualties. This allows for easy sorting by maximum value.

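# Avoid scientific notation when printing large totals
options(scipen = 9)
# Keep only the columns needed for the population health question
CasualtyData <- StormData[,c("EVTYPE","FATALITIES","INJURIES")]
# Casualties = fatalities + injuries for each event record
CasualtyData$CASUALTIES <- (CasualtyData$FATALITIES + CasualtyData$INJURIES)

# Sum all three measures for each event type
byCasualties <- aggregate(list(Casualties = CasualtyData$CASUALTIES,
                               Injuries = CasualtyData$INJURIES,
                               Fatalities = CasualtyData$FATALITIES),
                          list(EventType = CasualtyData$EVTYPE),
                          FUN = sum)
# Sort so the event types with the most casualties come first
byCasualties <- byCasualties[order(byCasualties$Casualties,decreasing = TRUE),]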
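
To illustrate what the aggregation and ordering steps do, here is a minimal sketch on a made-up data frame. The rows in toy are hypothetical, and the sketch uses the formula interface of aggregate() as an alternative to the list form; it is not the code that produced this report's results.

```r
# Hypothetical event records, for illustration only
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
                  FATALITIES = c(1, 0, 3, 2),
                  INJURIES = c(10, 4, 20, 0),
                  stringsAsFactors = FALSE)
toy$CASUALTIES <- toy$FATALITIES + toy$INJURIES

# Sum each measure per event type (formula interface of aggregate)
toySums <- aggregate(cbind(CASUALTIES, INJURIES, FATALITIES) ~ EVTYPE,
                     data = toy, FUN = sum)

# Sort so the most harmful event type comes first
toySums <- toySums[order(toySums$CASUALTIES, decreasing = TRUE), ]
toySums
```

The two TORNADO rows collapse into one row with their sums, and that row ends up at the top after sorting.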

Question 2

Across the United States, which types of events have the greatest economic consequences?

The provided data contains many inconsistently entered exponent values. The code below converts them to numeric powers of ten: the letter codes H, K, M, and B (in either case) become 2, 3, 6, and 9. The odd values “+”, “-”, and “?” are converted to 0, so the recorded damage figure is multiplied by 10^0 = 1 and counted as plain dollars rather than scaled up. Because so few events carry these values, the effect on the totals is nearly unnoticeable.

After calculating the dollar cost of Property and Crop damage, combine them into a single total so that event types can be sorted by damage.

economicData <- StormData[,c("EVTYPE","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]
economicData$PROPDMGEXP <- gsub(pattern = "+", replacement = "0", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "-", replacement = "0", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "?", replacement = "0", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "m", replacement = "M", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "k", replacement = "K", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "h", replacement = "H", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "M", replacement = "6", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "K", replacement = "3", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "H", replacement = "2", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PROPDMGEXP <- gsub(pattern = "B", replacement = "9", economicData$PROPDMGEXP, fixed = TRUE)
economicData$PropertyDamage <- (economicData$PROPDMG * (10^as.numeric(economicData$PROPDMGEXP)))

economicData$CROPDMGEXP <- gsub(pattern = "?", replacement = "0", economicData$CROPDMGEXP, fixed = TRUE)
economicData$CROPDMGEXP <- gsub(pattern = "k", replacement = "K", economicData$CROPDMGEXP, fixed = TRUE)
economicData$CROPDMGEXP <- gsub(pattern = "m", replacement = "M", economicData$CROPDMGEXP, fixed = TRUE)
economicData$CROPDMGEXP <- gsub(pattern = "B", replacement = "9", economicData$CROPDMGEXP, fixed = TRUE)
economicData$CROPDMGEXP <- gsub(pattern = "K", replacement = "3", economicData$CROPDMGEXP, fixed = TRUE)
economicData$CROPDMGEXP <- gsub(pattern = "M", replacement = "6", economicData$CROPDMGEXP, fixed = TRUE)
economicData$CropDamage <- (economicData$CROPDMG * (10^as.numeric(economicData$CROPDMGEXP)))

# Total damage is property damage plus crop damage
economicData$TotalDamage <- (economicData$PropertyDamage + economicData$CropDamage)
byDamage <- aggregate(list(TotalDamage = economicData$TotalDamage,
                           PropertyDamage = economicData$PropertyDamage,
                           CropDamage = economicData$CropDamage),
                      list(EventType = economicData$EVTYPE),
                      FUN = sum)
byDamage <- byDamage[order(byDamage$TotalDamage,decreasing = TRUE),]
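
The exponent conversion can also be written as a lookup table, which makes the handling of each code explicit. The sketch below is an alternative under stated assumptions, not the method used in this report: the helper name expToPower is hypothetical, and it assumes that blank and unrecognized codes should be treated as exponent 0. Note that as.numeric("") returns NA, so blank exponents need explicit handling if they are not to turn a damage figure into NA.

```r
# Hypothetical helper: map an exponent code to a power of ten.
# Assumption: blank, "+", "-", "?" and any other unknown code count as 0.
expToPower <- function(code) {
  code <- toupper(trimws(as.character(code)))
  lookup <- c(H = 2, K = 3, M = 6, B = 9)
  power <- unname(lookup[code])                 # NA where no letter code matches
  isDigit <- grepl("^[0-9]$", code)
  power[isDigit] <- as.numeric(code[isDigit])   # single digits are used as given
  power[is.na(power)] <- 0                      # everything else becomes 0
  power
}

# Made-up values, for illustration only
dmg <- c(25, 2.5, 1, 7, 3)
code <- c("K", "m", "B", "", "?")
dmg * 10^expToPower(code)
```

With a helper of this kind, the same function could be applied to both PROPDMGEXP and CROPDMGEXP instead of repeating the chain of gsub() calls.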

Results

Question 1

Across the United States, which types of events are most harmful with respect to population health?

with(byCasualties[1:3,],barplot(height = Casualties,
                                names.arg = EventType,
                                main = "Highest total injuries and fatalities", 
                                cex.names = .7,
                                xlab = "Event Type", 
                                ylab = "Total"))

byCasualties[1:5,]
##          EventType Casualties Injuries Fatalities
## 834        TORNADO      96979    91346       5633
## 130 EXCESSIVE HEAT       8428     6525       1903
## 856      TSTM WIND       7461     6957        504
## 170          FLOOD       7259     6789        470
## 464      LIGHTNING       6046     5230        816

We can see that tornadoes cause by far the most harm to people, with 96,979 casualties (91,346 injuries and 5,633 fatalities). Excessive heat and thunderstorm wind follow with 8,428 and 7,461 casualties, respectively; even combined (15,889), they amount to about one sixth of the tornado total.
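
The comparison can be checked directly from the totals printed in the table above. This short calculation only restates those figures; the variable names are illustrative.

```r
# Casualty totals copied from the table above
tornado <- 96979
excessiveHeat <- 8428
tstmWind <- 7461

# Combined total of the second and third event types
nextTwo <- excessiveHeat + tstmWind
nextTwo

# How many times larger the tornado total is
round(tornado / nextTwo, 1)
```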

Question 2

Across the United States, which types of events have the greatest economic consequences?

with(byDamage[1:3,],barplot(height = TotalDamage,
                           names.arg = EventType,
                           main = "Highest economic cost",
                           cex.names = .7,
                           xlab = "Event Type",
                           ylab = "Total cost"))

byDamage[1:5,]
##                      EventType TotalDamage PropertyDamage CropDamage
## 842 TORNADOES, TSTM WIND, HAIL  3200000000     1600000000    2500000
## 954                 WILD FIRES  1248200000      624100000         NA
## 271                  HAILSTORM   482000000      241000000         NA
## 392            HIGH WINDS/COLD   221000000      110500000    7000000
## 591             River Flooding   212310000      106155000         NA

As for economic cost, the combined event type TORNADOES, TSTM WIND, HAIL is the largest source of damage at $3,200,000,000. Wild fires and hailstorms place second and third at $1,248,200,000 and $482,000,000, respectively. Again, the leading source exceeds the second and third sources combined ($1,730,200,000), this time by a factor of roughly 1.85.
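
The dollar amounts are easier to read with thousands separators, and the ratio between the leading source and the next two can be computed from the table. The sketch below only reuses the TotalDamage figures printed above; the vector name damage is illustrative.

```r
# Total damage figures copied from the table above (in dollars)
damage <- c("TORNADOES, TSTM WIND, HAIL" = 3200000000,
            "WILD FIRES" = 1248200000,
            "HAILSTORM" = 482000000)

# Print the amounts with thousands separators for readability
format(damage, big.mark = ",", scientific = FALSE)

# Ratio of the largest source to the next two combined
round(damage[[1]] / (damage[[2]] + damage[[3]]), 2)
```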