Economic and health consequences of severe weather events in the USA

Synopsis

This study identifies which types of severe weather events have been most harmful to population health and which have had the greatest economic consequences. The study is fully reproducible: all the code needed is provided within this document. The code has been kept as short and clear as possible to make it easy to follow.

Please excuse any language errors in the document (English is not my mother tongue).

Data Processing

Downloading the data

The data is downloaded automatically if it is not already present on the system. To speed up loading and reduce memory usage, only the required columns of the CSV file are loaded into a data frame.

# Compressed CSV file with the storm data
data_source <- 'https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2'
# Download only when no local copy exists
if (!file.exists('data.csv.bz2'))
    download.file(data_source, 'data.csv.bz2', method = 'curl')
# 'NULL' skips a column; NA lets read.csv infer its type
data_columns <- c(rep('NULL', 7), NA, rep('NULL', 14), rep(NA, 6), rep('NULL', 9))
df <- read.csv('data.csv.bz2', colClasses = data_columns)
str(df)
## 'data.frame':    902297 obs. of  7 variables:
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
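
The column selection above relies on how read.csv interprets colClasses: the string 'NULL' skips a column entirely, while NA lets the type be inferred. A minimal sketch on a made-up four-column CSV (the names csv_text, toy_classes and toy are illustrative only, not part of the analysis):

```r
# Toy CSV with four columns; only 'type' and 'count' are wanted.
csv_text <- "id,type,note,count
1,FLOOD,a,3
2,TORNADO,b,5"

# 'NULL' (the string) drops a column; NA lets read.csv guess its type.
toy_classes <- c('NULL', NA, 'NULL', NA)
toy <- read.csv(text = csv_text, colClasses = toy_classes)

# Only the two requested columns remain.
str(toy)
```

The same idea scales to the real file: one entry per column, in file order, with 'NULL' for every column that is not needed.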

Processing for population health

The sum of all the injuries and fatalities is calculated for each type of event.

health_df <- aggregate(cbind(INJURIES, FATALITIES) ~ EVTYPE, df, 'sum')

Then, for plotting the results, only the N_WORST event types with the highest totals are selected.

N_WORST <- 6
worst_fatalities <- head(health_df[order(-health_df$FATALITIES), c('EVTYPE', 'FATALITIES')], N_WORST)
worst_injuries <- head(health_df[order(-health_df$INJURIES), c('EVTYPE', 'INJURIES')], N_WORST)
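
To see what the aggregation and selection steps do, here is a small sketch on invented numbers. The data frame toy and the derived names are hypothetical; only the pattern (aggregate, then order and head) mirrors the code above.

```r
# Invented records: several rows per event type.
toy <- data.frame(
    EVTYPE = c('FLOOD', 'TORNADO', 'FLOOD', 'HEAT', 'TORNADO'),
    INJURIES = c(1, 10, 2, 4, 20),
    FATALITIES = c(0, 3, 1, 5, 2)
)

# One row per event type, with both columns summed.
toy_health <- aggregate(cbind(INJURIES, FATALITIES) ~ EVTYPE, toy, sum)

# Sort by injuries in decreasing order and keep the top two.
toy_worst <- head(toy_health[order(-toy_health$INJURIES), c('EVTYPE', 'INJURIES')], 2)
print(toy_worst)
```

In this toy case TORNADO (30 injuries) comes first and HEAT (4 injuries) second, because FLOOD only adds up to 3.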

Processing for economic consequences

Property and crop damage are each recorded as a rounded number (PROPDMG, CROPDMG) plus an exponent code (PROPDMGEXP, CROPDMGEXP). To obtain the actual cost, each number is multiplied by the factor its code stands for: hundreds (H), thousands (K), millions (M) or billions (B). Numbers with any other code are left unchanged.

A function has been created to perform this task over the data frame.

# Expand one row's damage figure using its exponent code
# (h/H = hundreds, k/K = thousands, m/M = millions, b/B = billions).
# Any other code leaves the number unchanged.
exp_multiply <- function(row, num, factor) {
    if (row[factor] %in% c('h', 'H'))
        return(as.numeric(row[num]) * 100)
    if (row[factor] %in% c('k', 'K'))
        return(as.numeric(row[num]) * 1000)
    if (row[factor] %in% c('m', 'M'))
        return(as.numeric(row[num]) * 1000000)
    if (row[factor] %in% c('b', 'B'))
        return(as.numeric(row[num]) * 1000000000)
    return(as.numeric(row[num]))
}
# apply() passes each row as a character vector, hence the as.numeric() calls
df['PROPDMG'] <- apply(df, 1, function(x) exp_multiply(x, 'PROPDMG', 'PROPDMGEXP'))
df['CROPDMG'] <- apply(df, 1, function(x) exp_multiply(x, 'CROPDMG', 'CROPDMGEXP'))
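
Applying a function row by row can be slow on a data frame of this size. As an alternative, the same multiplication could be done on whole columns with a lookup table. The sketch below is not the method used in this study; exp_multiplier and expand_damage are hypothetical names, and it assumes the same rule as above (any code other than h, k, m or b leaves the number unchanged).

```r
# Hypothetical lookup table: lower-case exponent code -> multiplier.
exp_multiplier <- c(h = 1e2, k = 1e3, m = 1e6, b = 1e9)

# Hypothetical helper: expand a whole column of damage figures at once.
expand_damage <- function(num, exp_code) {
    # Look up each lower-cased code; codes not in the table give NA.
    mult <- unname(exp_multiplier[tolower(as.character(exp_code))])
    # Unknown or empty codes leave the number unchanged.
    mult[is.na(mult)] <- 1
    num * mult
}

# Invented values covering known, empty and unknown codes.
toy <- data.frame(DMG = c(25, 2.5, 3, 1.5, 7),
                  EXP = c('K', 'm', 'B', '', '?'))
toy$DMG <- expand_damage(toy$DMG, toy$EXP)
print(toy)
```

On the real data this would be called as expand_damage(df$PROPDMG, df$PROPDMGEXP), and likewise for the crop columns.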

Following the same procedure as before, the sum of the economic damage is calculated for each type of event. In this case, property and crop costs are not separated.

economic_df <- aggregate(PROPDMG + CROPDMG ~ EVTYPE, df, 'sum')
colnames(economic_df) <- c('EVTYPE', 'ALLDMG')
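
The colnames call is needed because aggregate names the result column after the expression on the left-hand side of the formula. A small sketch on invented figures (toy, toy_econ and original_names are illustrative names):

```r
# Invented damage figures, already expanded to plain dollars.
toy <- data.frame(
    EVTYPE = c('FLOOD', 'TORNADO', 'FLOOD'),
    PROPDMG = c(100, 50, 25),
    CROPDMG = c(10, 0, 5)
)

# Sum property plus crop damage per event type.
toy_econ <- aggregate(PROPDMG + CROPDMG ~ EVTYPE, toy, sum)

# The value column is named after the formula expression.
original_names <- names(toy_econ)
print(original_names)

# Rename to something easier to reference.
colnames(toy_econ) <- c('EVTYPE', 'ALLDMG')
print(toy_econ)
```

Here FLOOD totals 140 (100 + 10 + 25 + 5) and TORNADO totals 50.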

Then, for plotting the results, only the N_WORST event types with the highest total damage are selected.

worst_dmg <- head(economic_df[order(-economic_df$ALLDMG), ], N_WORST)

Results

The RColorBrewer package is used to generate the color palette for the plots.

library(RColorBrewer)
palette <- rev(brewer.pal(N_WORST, "YlOrRd"))

The plots below show the weather events that caused the most injuries and fatalities between 1950 and 2011. Note that the numbers are in thousands of people.

par(mfrow = c(1, 2), mar = c(10, 4.1, 4.1, 2.1))
# Injuries
barplot(worst_injuries$INJURIES / 1000, col = palette,
        main = 'Injuries', ylab = 'Count (thousands)',
        names.arg = worst_injuries$EVTYPE, las = 2, cex.names = 0.8)
# Fatalities
barplot(worst_fatalities$FATALITIES / 1000, col = palette,
        main = 'Fatalities', ylab = 'Count (thousands)',
        names.arg = worst_fatalities$EVTYPE, las = 2, cex.names = 0.8)

Figure: bar plots of injuries (left) and fatalities (right) for the six worst event types, in thousands of people.

Tornados cause the highest numbers of both injuries and fatalities. The number of injuries caused by tornados is much higher than that caused by any other event type. Fatalities are less one-sided: surprisingly, excessive heat ranks second, with more than 35% of the number of fatalities caused by tornados.

The plot below shows the weather events that caused the worst economic damage between 1950 and 2011. Note that the costs are in billions of dollars.

par(mar = c(10, 4.1, 4.1, 2.1))
# Economic damage
barplot(worst_dmg$ALLDMG / 1e9, col = palette,
        main = 'Total economic damage', ylab = 'Costs (billions of dollars)',
        names.arg = worst_dmg$EVTYPE, las = 2, cex.names = 0.8)

Figure: bar plot of total economic damage for the six worst event types, in billions of dollars.

Although tornados are the worst events for human health, they rank only third in economic cost. The highest economic costs are caused by floods, followed by hurricanes/typhoons.