Severe weather events take both human and economic tolls. This website analyzes both cumulatively, based on data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database covering 1950 through 2011.

Data Processing

The data were loaded from a bzip2-compressed CSV file of the events, available from NOAA’s website, and stored in a data frame called “raw”. A copy of the data was made immediately so that it could be modified without affecting the original.


     raw <- read.csv("repdata-data-StormData.csv.bz2", stringsAsFactors = FALSE)

     noaaData <- raw
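
As a side note on why no explicit decompression step appears above: read.csv() opens its input through a file connection, which detects bzip2 compression on its own. The sketch below is a toy round trip that illustrates this; the temporary file and the two-row data frame are made up for illustration and are not part of the NOAA data.

```r
## Toy round trip: write a small CSV through a bzip2 connection, then read it
## back with read.csv() using only the file path. All names here are
## hypothetical and exist only for this illustration.
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                  INJURIES = c(3, 1),
                  stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, "w")              ## open a compressed connection for writing
write.csv(toy, con, row.names = FALSE)
close(con)

## No bzfile() needed on the way in: the compression is detected automatically
roundTrip <- read.csv(tmp, stringsAsFactors = FALSE)
print(roundTrip)
```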


The data were then grouped by event type so that injuries, fatalities, crop damage, and property damage could each be summed separately. The code below sums the injury and fatality columns of the NOAA data by event type and stores the results in a data frame, giving cumulative injury and death totals for each event type over all the years for which records were kept.


     injuries <- lapply(split(noaaData$INJURIES,noaaData$EVTYPE),"sum") ## sums up injuries by event type
     fatalities <- lapply(split(noaaData$FATALITIES,noaaData$EVTYPE),"sum") ## sums up fatalities by event type
     
     ## Create an injuries and fatalities data frame from the sums data
     populationDf <- data.frame(1:length(unique(noaaData$EVTYPE)),1:length(unique(noaaData$EVTYPE)),1:length(unique(noaaData$EVTYPE)),1:length(unique(noaaData$EVTYPE)))
     names(populationDf) <- c("EVTYPE","INJURIES", "FATALITIES", "COMBINED")
     populationDf$EVTYPE <- names(injuries) ## Event type names, in the same order as the sums
     populationDf$INJURIES <- as.numeric(injuries)
     populationDf$FATALITIES <- as.numeric(fatalities)
     populationDf$COMBINED <- populationDf$INJURIES + populationDf$FATALITIES
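
The same per-event-type totals could also be produced in a single step. The following is a minimal sketch using aggregate() on a made-up four-row data frame (toyData is hypothetical, not the NOAA data); it is an alternative formulation, not the code that produced the results in this report. One difference to keep in mind is that the formula interface of aggregate() drops rows with missing values by default.

```r
## Hypothetical toy data with the same column names as the NOAA data
toyData <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  INJURIES   = c(10, 2, 5, 0),
  FATALITIES = c(1, 0, 2, 3),
  stringsAsFactors = FALSE
)

## Sum both columns by event type in one call; one row per event type
toyPopulationDf <- aggregate(cbind(INJURIES, FATALITIES) ~ EVTYPE,
                             data = toyData, FUN = sum)

## Combined total of injuries and fatalities, as in populationDf
toyPopulationDf$COMBINED <- toyPopulationDf$INJURIES + toyPopulationDf$FATALITIES
print(toyPopulationDf)
```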


Processing the crop and property damage columns was trickier because each damage amount is paired with a symbol giving its multiplier: “h” or “H” (hundreds of dollars), “k” or “K” (thousands), “m” or “M” (millions), or “b” or “B” (billions). The code below first creates a data frame that maps these letters to their monetary values, then multiplies the crop and property damage amounts by the matching value to form crop and property damage cost columns. Any other symbol, including a blank, is given a multiplier of 0, so those records contribute nothing to the totals. Finally, as above, the crop and property damage costs were summed by event type over all the years that NOAA records were kept.


     ## Create data frame to convert symbols to actual costs
     multipliers <- data.frame(100,1000,1000000,1000000000)
     names(multipliers) <- c("hH", "kK", "mM", "bB")
     
     ## Convert symbols (hH, kK, mM, bB) in CROPDMGEXP to actual numbers (100, 1000, 1e6, 1e9)
     data <- noaaData$CROPDMGEXP
     noaaData$cropDamageCosts <- as.numeric(lapply(data, function(x){
                 if(any(x == substring(names(multipliers),1,1) | x == substring(names(multipliers),2,2))){
                      value <- multipliers[[match(TRUE,(x == substring(names(multipliers),1,1) | x == substring(names(multipliers),2,2)))]]
                      }else{value <- 0} 
                 return(value)}))
     
     noaaData$cropDamageCosts <- noaaData$CROPDMG * noaaData$cropDamageCosts
     
     ## Convert symbols (hH, kK, mM, bB) in PROPDMGEXP to actual numbers (100, 1000, 1e6, 1e9)
     data <- noaaData$PROPDMGEXP
     noaaData$propertyDamageCosts <- as.numeric(lapply(data, function(x){
          if(any(x == substring(names(multipliers),1,1) | x == substring(names(multipliers),2,2))){
               value <- multipliers[[match(TRUE,(x == substring(names(multipliers),1,1) | x == substring(names(multipliers),2,2)))]]
          }else{value <- 0} 
          return(value)}))
     
     noaaData$propertyDamageCosts <- noaaData$PROPDMG * noaaData$propertyDamageCosts
     
     cropDamage <- lapply(split(noaaData$cropDamageCosts,noaaData$EVTYPE),"sum") ## sums up crop damage by event type
     propertyDamage <- lapply(split(noaaData$propertyDamageCosts,noaaData$EVTYPE),"sum") ## sums up property damage by event type
   
     ## Create a crop and property damage data frame from the sums data
     econDf <- data.frame(1:length(unique(noaaData$EVTYPE)),1:length(unique(noaaData$EVTYPE)),1:length(unique(noaaData$EVTYPE)),1:length(unique(noaaData$EVTYPE)))
     names(econDf) <- c("EVTYPE","cropDamage", "propertyDamage", "COMBINED")
     econDf$EVTYPE <- names(cropDamage)  ## Event type names, in the same order as the sums
     econDf$cropDamage <- as.numeric(cropDamage)/1000000000 ## Get money in billions for graphing purposes
     econDf$propertyDamage <- as.numeric(propertyDamage)/1000000000
     econDf$COMBINED <- econDf$cropDamage + econDf$propertyDamage
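
The symbol-to-multiplier conversion can also be written as a vectorized lookup. The sketch below assumes the same rule as the code above (h/H, k/K, m/M, b/B map to 100, 1000, 1e6, 1e9, and everything else maps to 0) and runs on made-up vectors; multiplierLookup, toyExp, and toyDmg are hypothetical names introduced only for this illustration.

```r
## Named vector used as a lookup table; upper-casing the symbol first means
## "k" and "K" share one entry
multiplierLookup <- c(H = 100, K = 1000, M = 1e6, B = 1e9)

## Hypothetical exponent symbols and damage amounts
toyExp <- c("K", "m", "", "B", "5", "h")
toyDmg <- c(25, 1.5, 10, 2, 3, 4)

## Indexing by name returns NA for symbols that are not in the table
## (including the empty string), so those are set to 0 afterwards
toyMultiplier <- unname(multiplierLookup[toupper(toyExp)])
toyMultiplier[is.na(toyMultiplier)] <- 0

toyCosts <- toyDmg * toyMultiplier
print(toyCosts)
```

Because the lookup operates on the whole column at once, it avoids calling a function once per record, which may matter for a data set with this many rows.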

Results

Below is the code used to generate several summary plots. This first code chunk creates bar plots of the 10 event types most harmful to people, measured by injuries, deaths, and the combined total of injuries and deaths. To do this, the event types were ranked and only the top 10 in each category were plotted, so that the plots weren’t overwhelmed by all 985 event types.


     par(mfrow=c(1,3), cex.main = 1, oma = c(3, 2, 3, 2), mar= c(8,4,4,2)+.01)
     
     populationDf$injuryRanks <- rank(populationDf$INJURIES,ties.method = "first") ## Create ranks column for events so that only most destructive event types are graphed
     sub <- populationDf[populationDf$injuryRanks > max(populationDf$injuryRanks)-10,] ## Keep only the 10 highest-ranked event types
     sub <- sub[order(sub$injuryRanks),] ## Rearrange subset in increasing order for plotting
     
     ## Create barplot of injuries by event types
     barPlot <- barplot(sub$INJURIES, axes = FALSE, axisnames = FALSE, main="Most Injurious Weather Events") 
     xLabels <- sub$EVTYPE
     text(barPlot, par('usr')[3], labels = xLabels, srt = 45, adj = c(1,1), xpd = TRUE, cex=.7) ## Creates rotated labels for x-axis
     axis(2) 
     
     populationDf$fatalityRanks <- rank(populationDf$FATALITIES,ties.method = "first") ## Create ranks column for events so that only most destructive event types are graphed
     sub <- populationDf[populationDf$fatalityRanks > max(populationDf$fatalityRanks)-10,] ## Subset top destructive data sorted by event type
     sub <- sub[order(sub$fatalityRanks),] ## Rearrange subset in increasing order for plotting
     
     ## Create barplot of deaths by event types
     barPlot <- barplot(sub$FATALITIES, axes = FALSE, axisnames = FALSE, main="Most Deadly Weather Events") 
     xLabels <- sub$EVTYPE
     text(barPlot, par('usr')[3], labels = xLabels, srt = 45, adj = c(1,1), xpd = TRUE, cex=.7) ## Creates rotated labels for x-axis
     axis(2)
     
     populationDf$combinedRanks <- rank(populationDf$COMBINED,ties.method = "first") ## Create ranks column for events so that only most destructive event types are graphed
     sub <- populationDf[populationDf$combinedRanks > max(populationDf$combinedRanks)-10,] ## Subset top destructive data sorted by event type
     sub <- sub[order(sub$combinedRanks),] ## Rearrange subset in increasing order for plotting
     
     ## Create barplot of TOTAL injuries & deaths by event types
     barPlot <- barplot(sub$COMBINED, axes = FALSE, axisnames = FALSE, main="Most Harmful Events (Combined)") 
     xLabels <- sub$EVTYPE
     text(barPlot, par('usr')[3], labels = xLabels, srt = 45, adj = c(1,1), xpd = TRUE, cex=.7) ## Creates rotated labels for x-axis
     axis(2)
     
     ## Add whole plot titles
     mtext("Events Most Harmful to Population Health", side=3, outer = TRUE, font=2)
     mtext("Number of People Injured or Killed", side=2, outer = TRUE, padj = 1, cex=.8, adj = .65, font=2)
     mtext(quote(bold("FIGURE 1: \n") ~ "The above graphs show the injuries and deaths caused by different natural events.\nFrom the data above, tornadoes are by far the most harmful events to human health."), side = 1, outer = TRUE, cex=.8, adj=.5)
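
The rank, subset, and reorder steps are repeated for every panel, so they could be factored into one helper. The function below is a hypothetical sketch (topEvents and the toy data frame are not part of the original analysis) that uses order() and head() instead of a ranks column; ties may be resolved differently than with rank(ties.method = "first").

```r
## Return the n rows with the largest values in `column`, arranged in
## increasing order so the tallest bar is drawn last, as in the plots above
topEvents <- function(df, column, n = 10) {
  sorted <- df[order(df[[column]], decreasing = TRUE), ]
  top <- head(sorted, n)
  top[order(top[[column]]), ]
}

## Hypothetical toy data
toyDf <- data.frame(EVTYPE = c("A", "B", "C", "D"),
                    INJURIES = c(5, 50, 0, 20),
                    stringsAsFactors = FALSE)

toyTop <- topEvents(toyDf, "INJURIES", n = 2)
print(toyTop)
```

With such a helper, each panel would only need one call, for example topEvents(populationDf, "FATALITIES"), followed by the barplot() and text() calls.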


Finally, the economic losses due to crop damage, property damage, and the two combined were ranked, and the top 10 event types in each category were graphed in bar plots.


     par(mfrow=c(1,3), cex.main = 1, oma = c(3, 2, 3, 2), mar= c(8,4,4,2)+.01)
     
     econDf$cropRanks <- rank(econDf$cropDamage,ties.method = "first") ## Create ranks column for events so that only most destructive event types are graphed
     sub <- econDf[econDf$cropRanks > max(econDf$cropRanks)-10,] ## Subset top destructive data sorted by event type
     sub <- sub[order(sub$cropRanks),] ## Rearrange subset in increasing order for plotting
    
     ## Create barplot of cropDamage by event types
     barPlot <- barplot(sub$cropDamage, axes = FALSE, axisnames = FALSE, main="Crop Losses by Event Type") 
     xLabels <- sub$EVTYPE
     text(barPlot, par("usr")[3], labels = xLabels, srt = 45, adj = c(1,1), xpd = TRUE, cex=.7) ## Creates rotated labels for x-axis
     axis(2) 
     
     econDf$propertyRanks <- rank(econDf$propertyDamage,ties.method = "first") ## Create ranks column for events so that only most destructive event types are graphed
     sub <- econDf[econDf$propertyRanks > max(econDf$propertyRanks)-10,] ## Subset top destructive data sorted by event type
     sub <- sub[order(sub$propertyRanks),] ## Rearrange subset in increasing order for plotting
     
     ## Create barplot of property damage by event types
     barPlot <- barplot(sub$propertyDamage, axes = FALSE, axisnames = FALSE, main="Property Losses by Event Type") 
     xLabels <- sub$EVTYPE
     text(barPlot, par("usr")[3], labels = xLabels, srt = 45, adj = c(1,1), xpd = TRUE, cex=.7) ## Creates rotated labels for x-axis
     axis(2)
     
     econDf$combinedRanks <- rank(econDf$COMBINED,ties.method = "first") ## Create ranks column for events so that only most destructive event types are graphed
     sub <- econDf[econDf$combinedRanks > max(econDf$combinedRanks)-10,] ## Subset top destructive data sorted by event type
     sub <- sub[order(sub$combinedRanks),] ## Rearrange subset in increasing order for plotting
     
     ## Create barplot of TOTAL crop & property damage by event types
     barPlot <- barplot(sub$COMBINED, axes = FALSE, axisnames = FALSE, main="Crop and Property Combined Losses") 
     xLabels <- sub$EVTYPE
     text(barPlot, par("usr")[3], labels = xLabels, srt = 45, adj = c(1,1), xpd = TRUE, cex=.7) ## Creates rotated labels for x-axis
     axis(2)
     
     ## Add whole plot titles
     mtext("Events Most Harmful to Economic Health", side=3, outer = TRUE, font=2)
     mtext("Amount of Losses in Billions USD", side=2, outer = TRUE, padj = 1, cex=.8, adj = .65, font=2)
     mtext(quote(bold("FIGURE 2: \n") ~ "The above graphs show the crop damage losses and property damage losses caused by different natural events.\nFrom the data above, floods and water-related events are by far the most harmful events to economic health."), side = 1, outer = TRUE, cex=.8, adj=.5)

Summary

The results show that tornadoes are by far the most dangerous weather events to human health. Economically, floods have historically caused the biggest losses in the US, with hurricanes/typhoons contributing significantly as well. Property damage accounts for far more economic loss than crop damage, though crop loss arguably harms population health as well.