The Health and Economic Impacts of Extreme Weather Events in the United States (1950-2011)

Synopsis

Extreme weather events can have serious consequences for the local economy as well as for the health of residents. This project uses the NOAA Storm Database, which is assembled from data provided by the National Weather Service (NWS) and covers the period from 1950 to 2011. The dataset was analyzed as a whole, without partitioning by time.
In analyses of economic damages, flooding caused the largest economic impacts by far. In analyses of public health costs, tornadoes were determined to cause the highest numbers of human casualties. Enhanced efforts to control/prevent flooding and better predict the occurrence and trajectories of tornadoes may be warranted based on the results of these preliminary analyses.

Data Processing

File Import

The code below sets the working directory, opens the compressed .bz2 file through a bzfile() connection, and reads it with read.table() into an R data frame.

setwd("~/coursera/lectures/5-ReproducibleResearch/Assignment2") 
storm <- read.table(bzfile("repdata-data-StormData.csv.bz2"), header=TRUE, sep=",", na.strings = "")

We are only going to keep the relevant columns for our data analysis. These columns are: “EVTYPE”, “FATALITIES”, “INJURIES”, “PROPDMG”, “PROPDMGEXP”, “CROPDMG”, “CROPDMGEXP”.

storm <- storm[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]

Economic Damages Calculation

The monetary damages in this dataset are provided as a number scaled by a power of ten (i.e. PROPDMG * 10^PROPDMGEXP). The format in which the exponent is recorded is quite variable, for example:

levels(storm$PROPDMGEXP)
##  [1] "-" "?" "+" "0" "1" "2" "3" "4" "5" "6" "7" "8" "B" "h" "H" "K" "m"
## [18] "M"

I will convert all exponents to integer powers of ten to allow for calculation of economic damages: “h”/“H” become 2, “K” becomes 3, “m”/“M” become 6, and “B” becomes 9. Ambiguous values such as “-”, “?”, “+” have been set to 1. I’ve also decided to set values of “0” to 1, because I believe a recorded exponent of 0 may be an input error. These exponents will then be used to calculate the property and crop damages.

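#relabel each exponent code with its power of ten (same order as levels() of each column)
levels(storm$PROPDMGEXP) <- c("1", "1", "1", "1", "1", "2", "3", "4", "5", "6", "7", "8", "9", "2", "2", "3", "6", "6")
levels(storm$CROPDMGEXP) <- c("1", "1", "2", "9", "3", "3", "6", "6")
#go through as.character(): as.integer() on a factor returns level codes, not the labels
#rows with a missing (NA) exponent produce NA damages
storm$totPropDmg <- storm$PROPDMG * 10^as.integer(as.character(storm$PROPDMGEXP))
storm$totCropDmg <- storm$CROPDMG * 10^as.integer(as.character(storm$CROPDMGEXP))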
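
One pitfall worth checking in this kind of conversion: as.integer() applied to a factor returns the internal level codes, not the numbers shown as labels. The sketch below uses made-up exponent codes (not the real data) and a hypothetical lookup vector, expMap, to illustrate the difference and one possible lookup-based alternative.

```r
# Toy exponent codes (made up for illustration), with explicit level order
codes <- factor(c("K", "M", "B", "K", "?"), levels = c("?", "B", "K", "M"))

# Relabel the levels with powers of ten, in level order
levels(codes) <- c("1", "9", "3", "6")

as.integer(codes)                # level codes (positions in levels())
as.integer(as.character(codes))  # the labels, converted to integers

# Alternative: a named lookup vector applied to the raw character codes.
# expMap is a hypothetical name, not part of the original analysis.
expMap <- c("?" = 1, "K" = 3, "M" = 6, "B" = 9)
raw <- c("K", "M", "B", "K", "?")
exponent <- unname(expMap[raw])
damage <- c(2.5, 1, 2, 10, 0) * 10^exponent
```

The lookup version does not depend on the order of levels(), which may make it easier to audit when the set of codes differs between columns.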

Total economic damage will be taken as the sum of property and crop damages. For the sake of simplicity, economic damage is not adjusted for inflation. Health damage will be taken as the total number of casualties which is the sum total of injuries and fatalities.

storm$economicDmg <- storm$totPropDmg + storm$totCropDmg
storm$healthDmg <- storm$FATALITIES + storm$INJURIES
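
Because the file is read with na.strings = "", a blank exponent becomes NA, and in R any sum that involves NA is itself NA. The sketch below, on made-up values, shows this behaviour next to one possible alternative, rowSums() with na.rm = TRUE. Treating a missing component as zero is an assumption made here for illustration, not something the analysis above does.

```r
# Made-up property and crop damage values with missing entries
prop <- c(1000, NA, 250)
crop <- c(NA, 50, 50)

# Plain addition propagates NA
plainSum <- prop + crop

# rowSums() with na.rm = TRUE treats a missing component as zero (assumption)
tolerantSum <- rowSums(cbind(prop, crop), na.rm = TRUE)
```

With plain addition, only the third row keeps a value; the NA rows are later dropped by na.rm = TRUE during aggregation.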

Event Types Cleaning

There are 985 levels in the “EVTYPE” column. A quick examination suggests that some data cleaning may be useful to consolidate similar entries and correct data entry mistakes. The following regular expressions reduce the number of factor levels to about 670.

levels(storm$EVTYPE) <- gsub("[^a-zA-Z]", " ", toupper(levels(storm$EVTYPE))) #replace non-alphabetical characters with spaces
levels(storm$EVTYPE) <- gsub("TSTM", "THUNDERSTORM", levels(storm$EVTYPE)) 
levels(storm$EVTYPE) <- gsub("FLOODING", "FLOOD", levels(storm$EVTYPE)) 
levels(storm$EVTYPE) <- gsub("WINDS", "WIND", levels(storm$EVTYPE))
levels(storm$EVTYPE) <- gsub("HURRICANE [A-Z]+", "HURRICANE TYPHOON", levels(storm$EVTYPE))
levels(storm$EVTYPE) <- gsub("CURRENTS", "CURRENT", levels(storm$EVTYPE))
levels(storm$EVTYPE) <- gsub("^\\s+", "", levels(storm$EVTYPE)) #strip leading whitespace
levels(storm$EVTYPE) <- gsub("\\s+$", "", levels(storm$EVTYPE)) #strip trailing whitespace
levels(storm$EVTYPE) <- gsub("\\s+", " ", levels(storm$EVTYPE)) #replace multiple spaces with a single space (gsub is already global, so no /g flag)
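
To see what this kind of cleaning does, the main steps can be bundled into a small helper and run on a few sample labels. The function name cleanEvtype and the sample strings are hypothetical; this is a sketch of the approach rather than the exact pipeline. Note that gsub() replaces every match on its own, so the whitespace pattern is written "\\s+" with no JavaScript-style /g flag.

```r
# Hypothetical helper bundling a subset of the cleaning steps
cleanEvtype <- function(x) {
  x <- gsub("[^A-Z]", " ", toupper(x))  # non-letters become spaces
  x <- gsub("TSTM", "THUNDERSTORM", x)
  x <- gsub("FLOODING", "FLOOD", x)
  x <- gsub("WINDS", "WIND", x)
  x <- gsub("\\s+", " ", x)             # collapse runs of whitespace
  trimws(x)                             # strip leading/trailing whitespace
}

cleaned <- cleanEvtype(c("tstm wind", " TSTM WINDS/HAIL",
                         "Coastal Flooding", "FLASH FLOOD/ FLOOD"))

# Assigning duplicate labels to levels() merges the corresponding levels
f <- factor(c("TSTM WIND", "tstm winds", "THUNDERSTORM WIND"))
levels(f) <- cleanEvtype(levels(f))
nlevels(f)
```

Because assigning duplicate labels merges factor levels, the three spellings in the last example collapse into a single level.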

Results

Top 20 Most Damaging Extreme Weather Events

To obtain the most damaging weather events, we will first sum the health and economic damage across event types using the aggregate() function.

#economic damages
ecoDmg <- aggregate(storm$economicDmg, by = list(storm$EVTYPE), sum, na.rm=TRUE)
colnames(ecoDmg) <- c("EventType", "Dollars")
#health damages
humDmg <- aggregate(storm$healthDmg, by = list(storm$EVTYPE), sum, na.rm=TRUE)
colnames(humDmg) <- c("EventType", "Casualties")

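To obtain the top 20 most damaging weather events, we’ll order the data by total damages in decreasing order and then subset the first 20 rows. The event type is converted to a factor whose levels follow that order, so the bars in the plots appear from most to least damaging.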

topEco <- ecoDmg[order(-ecoDmg$Dollars), ][1:20, ]
topEco$EventType <- factor(topEco$EventType, levels = topEco$EventType) #convert to factor
topHum <- humDmg[order(-humDmg$Casualties), ][1:20, ]
topHum$EventType <- factor(topHum$EventType, levels = topHum$EventType) #convert to factor
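
The aggregate, order, and subset steps can be checked on a small made-up data frame. The values below are invented for illustration and the name toy is hypothetical; the sketch keeps the top 2 rows instead of the top 20.

```r
# Made-up damages per event record
toy <- data.frame(
  EVTYPE = c("FLOOD", "TORNADO", "FLOOD", "HAIL", "TORNADO"),
  economicDmg = c(100, 40, 50, NA, 10)
)

# Sum damages per event type; na.rm = TRUE ignores missing values
agg <- aggregate(toy$economicDmg, by = list(toy$EVTYPE), sum, na.rm = TRUE)
colnames(agg) <- c("EventType", "Dollars")

# Order by decreasing damage and keep the first two rows
top2 <- agg[order(-agg$Dollars), ][1:2, ]

# Fix the factor level order so a bar plot keeps the ranking
top2$EventType <- factor(top2$EventType, levels = top2$EventType)
```

Without the final factor() call, ggplot2 would order the bars alphabetically by event type rather than by damage.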

Load Libraries

library(scales)
library(ggplot2)

Economic Costs by Extreme Weather Event

ggplot(topEco, aes(EventType, Dollars)) +
    geom_bar(stat="identity", aes(fill = Dollars)) +
    theme(axis.text.x = element_text(angle=45, hjust=1)) +
    labs(x="Extreme Weather Event", y="Economic Cost (U.S. Dollars)", title="Total Economic Costs of Top 20 Extreme Weather Events (1950-2011)") +
    scale_y_continuous(labels = dollar)

Figure: Total Economic Costs of Top 20 Extreme Weather Events (1950-2011)

By far the largest economic losses are caused by floods with total damages exceeding 100 billion U.S. dollars. Other extreme weather events causing large economic damages are Hurricanes/Typhoons, Tornadoes, and Hail.

Human Casualties by Extreme Weather Event

ggplot(topHum, aes(EventType, Casualties)) +
    geom_bar(stat="identity", aes(fill = Casualties)) +
    theme(axis.text.x = element_text(angle=45, hjust=1)) +
    labs(x="Extreme Weather Event", y="Human Casualties (Injuries + Fatalities)", title="Total Human Casualties of Top 20 Extreme Weather Events (1950-2011)") +
    scale_y_continuous(labels = comma)

Figure: Total Human Casualties of Top 20 Extreme Weather Events (1950-2011)

Extreme weather events also have a large impact on public health. In terms of total human casualties (sum total of injuries and fatalities), Tornadoes cause by far the largest toll. Other extreme weather events causing large numbers of human casualties are Thunderstorm Winds, Excessive Heat, Lightning, and Flash Floods.

Discussion

Economic Costs of Flooding

The high economic costs associated with floods in this study may be partially due to the fact that economic damages from floods are recorded differently from other weather events. The National Weather Service mandates that damage estimates for floods are always recorded even if no reasonably accurate estimates, such as those from an insurance company, are available. As a result, the economic costs of non-flood events are likely underestimated relative to flood-related damages.

Future Directions

In future analyses, the economic costs should be adjusted for inflation. Also, it may be beneficial to assess injuries and fatalities separately when assessing impacts on public health. It would be extremely interesting to analyze this dataset as a time series to assess whether event types and their associated health and economic costs have changed over the past ~60 years. It would also be interesting to analyze these data using the geographic locations provided.
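
As a starting point for assessing injuries and fatalities separately, the two columns can be aggregated side by side with the formula interface of aggregate(). The sketch below uses made-up counts and a hypothetical data frame named toy; it only illustrates the mechanics and says nothing about the real rankings.

```r
# Made-up casualty counts per event record
toy <- data.frame(
  EVTYPE = c("TORNADO", "HEAT", "TORNADO", "HEAT"),
  FATALITIES = c(1, 5, 2, 3),
  INJURIES = c(30, 2, 20, 1)
)

# Sum both outcome columns per event type in one call
byType <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank event types by each outcome separately
byFatalities <- byType[order(-byType$FATALITIES), ]
byInjuries <- byType[order(-byType$INJURIES), ]
```

In this toy example the two rankings disagree, which is the kind of difference a combined casualty count would hide.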