Synopsis

Severe weather events have significant costs to human health and the economy in the United States. But which are the most severe?

Analysis of the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm data from 1996 to 2011 found that tornadoes have the biggest impact on human health (recorded fatalities and injuries).

Data from the same period show that hurricanes have the biggest economic cost (recorded property and crop damage combined).

Data Processing

The U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm data set for events from 1950 to November 2011 was downloaded and loaded into R. NOAA has recorded all storm event types only since January 1996, so only events from January 1996 to November 2011 (by event beginning date) were included in the analysis.
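As a minimal sketch of the date filter described above, the example below extracts the year from a few begin-date strings and keeps events from 1996 onward. The sample strings are illustrative and assume the raw file stores dates as month/day/year followed by a time; the names `bgn`, `event.year.example`, and `keep` are hypothetical.

```r
# Illustrative begin dates, assumed to follow the "m/d/Y H:M:S" layout
bgn <- c("4/18/1950 0:00:00", "1/6/1996 0:00:00", "11/28/2011 0:00:00")

# Parse the date part (the trailing time is ignored) and pull out the year
event.year.example <- as.integer(format(as.Date(bgn, format = "%m/%d/%Y"), "%Y"))

# Keep only events that began in 1996 or later
keep <- event.year.example >= 1996
bgn[keep]
```

Converting the year to an integer makes the comparison numeric rather than a comparison of character strings.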

Total economic damage was calculated as property damage plus crop damage, using the exponent code that accompanies each amount (K for thousands, M for millions, B for billions) to convert it to dollars.
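The conversion can be illustrated with a small lookup, shown here as a sketch rather than the code used in the analysis. The helper `damage_multiplier` and the sample data frame `damage.example` are hypothetical; any code other than K, M, or B is assumed to mean the amount is already in dollars.

```r
# Hypothetical helper: map an exponent code to a dollar multiplier.
# Codes other than K, M, B (including blank) get a multiplier of 1.
damage_multiplier <- function(exp_code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[as.character(exp_code)]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Made-up amounts, one per exponent code
damage.example <- data.frame(
  PROPDMG = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

damage.example$property.damage.dollars <-
  damage.example$PROPDMG * damage_multiplier(damage.example$PROPDMGEXP)
damage.example
```

A named vector keeps the code-to-multiplier mapping in one place, which is easier to extend than nested `ifelse` calls.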

Some data cleaning was performed. Event types were standardized by substitution (e.g. “tstm wind” -> “thunderstorm wind”, “hurricane/typhoon” -> “hurricane”). A 2005 flood in Napa Valley, which was coded as having 115 billion dollars in damage, was removed from further analysis (a review of the event remarks showed that this extreme value was clearly incorrect).

For more information about the original data source, go to the Storm Events Database website.
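The substitution step can be sketched on a few sample labels. The function name `clean_event_type` is hypothetical and covers only three of the pattern groups; the patterns themselves mirror those used in the analysis code.

```r
# Hypothetical helper: collapse variant spellings into one standard label.
# Order matters: the marine pattern must be tested before the general
# thunderstorm pattern, which would otherwise match it too.
clean_event_type <- function(event.type) {
  event.type <- tolower(event.type)
  ifelse(grepl("typhoon|hurricane", event.type), "hurricane",
  ifelse(grepl("marine tstm|marine thunderstorm", event.type), "marine thunderstorm wind",
  ifelse(grepl("tstm|thunderstorm", event.type), "thunderstorm wind",
         event.type)))
}

cleaned.example <- clean_event_type(
  c("TSTM WIND", "Hurricane/Typhoon", "marine tstm wind", "tornado")
)
cleaned.example
```

Labels that match no pattern, such as "tornado", pass through unchanged apart from being lowercased.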

library(ggplot2)
library(dplyr)
library(reshape2)

if(!file.exists("StormData.csv.bz2")) download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2","StormData.csv.bz2")
storms <- read.csv("StormData.csv.bz2")
##Keep only the columns needed, derive event year and lower-case event type, and filter to events from 1996 onward, when all event types were captured

storms <- select(storms,c(BGN_DATE,EVTYPE,FATALITIES,INJURIES,PROPDMG,PROPDMGEXP,CROPDMG,CROPDMGEXP)) 
storms <- mutate(storms,event.year=as.integer(format(as.Date(BGN_DATE, format="%m/%d/%Y"),"%Y")),event.type=tolower(EVTYPE))
storms <- filter(storms,event.year >= 1996)

##Calculate economic damage totals and remove the 2005 Napa Valley flood entry, which was coded with over 100 billion dollars in damage (the remarks in the source file show this is clearly a coding error)

storms <- mutate(storms,crop.damage.dollars=ifelse(CROPDMGEXP=="B",CROPDMG*1000000000,ifelse(CROPDMGEXP=="M",CROPDMG*1000000,ifelse(CROPDMGEXP=="K",CROPDMG*1000,CROPDMG))),
                  property.damage.dollars=ifelse(PROPDMGEXP=="B",PROPDMG*1000000000,ifelse(PROPDMGEXP=="M",PROPDMG*1000000,ifelse(PROPDMGEXP=="K",PROPDMG*1000,PROPDMG))),
                  total.damage.dollars=property.damage.dollars+crop.damage.dollars)
storms <- filter(storms,total.damage.dollars<100000000000)

##Summarize data by event type, then use substitution to standardize event type (grepl is applied after the summary for performance reasons)

events.summary <- storms %>% group_by(event.type) %>% summarise(total.casualties=sum(FATALITIES) + sum(INJURIES),total.fatalities=sum(FATALITIES),total.injuries=sum(INJURIES),total.damage.dollars.million=round(sum(total.damage.dollars)/1000000,2)) %>% filter(total.casualties>0 & total.damage.dollars.million>0) 

events.summary <- mutate(events.summary,event.type.clean = ifelse(grepl("typhoon|hurricane",event.type),"hurricane",ifelse(grepl("storm surge",event.type),"storm surge/tide",ifelse(grepl("fire",event.type),"wildfire",ifelse(grepl("marine tstm|marine thunderstorm",event.type),"marine thunderstorm wind",ifelse(grepl("tstm|thunderstorm",event.type),"thunderstorm wind",event.type))))))

events.summary <- events.summary %>% group_by(event.type.clean) %>% summarise(total.casualties=sum(total.casualties),total.fatalities=sum(total.fatalities),total.injuries=sum(total.injuries),total.damage.dollars.million=sum(total.damage.dollars.million))

Results

options(scipen = 100)

top.events.by.casualties <- events.summary %>% select(-total.damage.dollars.million) %>% arrange(desc(total.casualties))

top.events.by.casualties <- melt(top.events.by.casualties[1:10,],id=c("event.type.clean","total.casualties"))

g <- ggplot(top.events.by.casualties, aes(x=reorder(event.type.clean,total.casualties),y=value,fill=variable))
                                          

g + geom_bar(stat = "identity") + coord_flip() + ylab("Total Casualties (fatalities + injuries)") + xlab("Storm Event Type")+ labs(fill = "Casualty") + scale_fill_discrete(labels = c("Fatalities", "Injuries"))


Figure 1: Top Ten U.S. Storm Event Types by Human Casualties: 1996 to 2011

The storm event type with the greatest impact on human health was tornado, with 22178 total casualties (fatalities and injuries combined).

top.events.by.damage <- arrange(events.summary,desc(total.damage.dollars.million))[1:10,]

g <- ggplot(top.events.by.damage,aes(x=reorder(event.type.clean,total.damage.dollars.million),y=total.damage.dollars.million))

g + geom_bar(stat = "identity") + coord_flip() + ylab("Total Damage ($ Million)") + xlab("Storm Event Type")


Figure 2: Top Ten U.S. Storm Event Types by Economic Damage to Property & Crops: 1996 to 2011

The storm event type with the greatest impact on the economy was hurricane, with 87068.99 million U.S. dollars (about 87 billion) in combined property and crop damage.