The 10 most harmful severe weather events

Synopsis

In this work, we study the public health and economic consequences of severe weather events across the USA. To do this, we analyzed a dataset provided by NOAA that includes the number of victims and the damage costs caused by different types of severe weather event.

To determine which event types have the worst impact on public health and the economy, we computed two top ten lists: one for the event types with the worst economic consequences (highest damage costs) and another for the event types with the worst health consequences (highest number of victims).

Tornadoes are the severe weather events with the worst consequences for public health and the third worst consequences for the economy.

Floods and hurricanes are the events with the worst economic consequences.

Four event types appear in both top ten lists and therefore cause great damage to both public health and the economy: tornado, flood, flash flood and ice storm.

Data Processing

We read the data from NOAA's Storm Data CSV file into a single data frame.


storm_data <- read.csv("repdata-data-StormData.csv")

Once the data is loaded into memory, we prepare two subsets for further analysis of the health and economic repercussions of severe weather events.

Public health damage

The dataset includes the number of injured people and the number of fatalities that each event caused.

We will explore the repercussions of severe weather events on public health by summing the total injuries and the total fatalities caused by each type of event.


if (!require(plyr)) {
    install.packages("plyr")
}
## Loading required package: plyr

require(plyr)

health_data <- storm_data[, names(storm_data) %in% c("EVTYPE", "FATALITIES", 
    "INJURIES")]

health_dmg <- ddply(health_data, .(EVTYPE), summarize, fatalities = sum(FATALITIES), 
    injuries = sum(INJURIES))
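For readers who do not have plyr installed, the same per-type summation can be sketched in base R with aggregate(). This is only an illustration on a toy data frame; the object names toy and toy_totals and all the values in it are made up for the example and do not come from the NOAA dataset.

```r
# Toy data with hypothetical values, for illustration only
toy <- data.frame(
    EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
    FATALITIES = c(2, 0, 1, 3, 1),
    INJURIES = c(10, 4, 5, 0, 2)
)

# Sum both victim counts within each event type
toy_totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

print(toy_totals)
```

The result has one row per event type, with the summed fatalities and injuries in separate columns, which is the same shape that the ddply call produces.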

As there are a huge number of event types (985), we will focus on the 10 most harmful ones. To do this, we sum fatalities and injuries (total damage) for each event type, sort the event types by decreasing total damage and keep the top 10. We also sum the injuries and fatalities caused by all the other event types and store those figures under an “others” event type.


# Compute the total number of victims
health_dmg <- ddply(health_dmg, .(EVTYPE), transform, total_dmg = sum(fatalities) + 
    sum(injuries))

# Sort by the total number of victims
health_dmg <- health_dmg[order(health_dmg$total_dmg, decreasing = TRUE), ]

# Compute the sum aggregate for types not in the top ten and name it 'others'.
# -(1:10) drops exactly the first ten rows; -11:0 would also drop row 11.
others <- data.frame(EVTYPE = "others", t(colSums(health_dmg[-(1:10), -1])))

# Select the top ten and add the others row

health_dmg <- rbind(health_dmg[1:10, ], others)
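The "top N plus others" procedure can be checked on a small toy data frame. The following is a minimal sketch, assuming hypothetical event names and counts and using n = 2 instead of 10; none of these values come from the NOAA dataset.

```r
# Toy totals with hypothetical values, already summarised per event type
toy <- data.frame(
    EVTYPE = c("A", "B", "C", "D", "E"),
    fatalities = c(1, 50, 3, 20, 2),
    injuries = c(4, 200, 1, 80, 6),
    stringsAsFactors = FALSE
)
toy$total_dmg <- toy$fatalities + toy$injuries

n <- 2
toy <- toy[order(toy$total_dmg, decreasing = TRUE), ]

# -(1:n) drops the first n rows. The parentheses matter: unary minus binds
# tighter than ':', so -1:n would expand to c(-1, 0, 1, ..., n), which R
# rejects as a mix of negative and positive subscripts.
rest <- toy[-(1:n), ]

# Collapse the remaining rows into a single 'others' row
others <- data.frame(EVTYPE = "others", t(colSums(rest[, -1])),
    stringsAsFactors = FALSE)
top_with_others <- rbind(toy[1:n, ], others)

# Sanity check: no victims were lost when collapsing the tail
stopifnot(sum(top_with_others$total_dmg) == sum(toy$total_dmg))
```

A useful property of this construction is that the grand total is preserved, so comparing the sum before and after is a cheap way to catch off-by-one errors in the row selection.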

Economic repercussions

The dataset includes the cost, in US dollars, of the damage that each event caused to crops and property.

We will explore the economic consequences of severe weather events by summing the total cost of property damage and the total cost of crop damage caused by each type of event.

Costs are represented by two columns for each damage type. One column is the cost figure and the other, whose name ends in EXP, is an exponent. The cost is computed as figure * 10^exponent.

The figure columns for property and crops are PROPDMG and CROPDMG respectively.

Exponents are stored in a character variable that contains digits for the actual exponent, but also letters such as “B”, “M”, “K” and “H”, meaning billions, millions, thousands and hundreds.
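As a worked illustration of this encoding, the following minimal sketch converts a few (figure, exponent) pairs into dollar amounts. The helper name exponent_to_multiplier and the sample values are assumptions made for this example and are not taken from the dataset. Treating unrecognised codes as exponent 0 is also an assumption of the sketch.

```r
# Hypothetical helper: map an exponent code to a power-of-ten multiplier
exponent_to_multiplier <- function(code) {
    lookup <- c(h = 2, k = 3, m = 6, b = 9)
    code <- tolower(as.character(code))
    # Letters are looked up; digits are used directly as the exponent
    exponent <- ifelse(code %in% names(lookup), lookup[code],
        suppressWarnings(as.numeric(code)))
    # Anything unrecognised (e.g. an empty string) is treated as exponent 0
    exponent[is.na(exponent)] <- 0
    10^exponent
}

# Hypothetical sample values
figures <- c(25, 2.5, 1.2, 7)
codes <- c("K", "M", "B", "")

# 25 * 10^3, 2.5 * 10^6, 1.2 * 10^9 and 7 * 10^0
costs <- figures * exponent_to_multiplier(codes)
print(format(costs, big.mark = ",", scientific = FALSE))
```

So a figure of 25 with exponent "K" stands for 25,000 dollars, and a figure with an empty exponent is taken at face value.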


economic_data <- storm_data[, names(storm_data) %in% c("EVTYPE", "PROPDMG", 
    "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]


# This function translates the character representation of exponents into
# its numeric equivalent.
translate_exponentials <- function(x) {
    x <- tolower(x)
    x[x == "b"] <- 9
    x[x == "h"] <- 2
    x[x == "k"] <- 3
    x[x == "m"] <- 6

    x <- suppressWarnings(as.numeric(x))
    x[is.na(x)] <- 0
    result = 10^x
    result

}


# Compute the damages in dollars
economic_data <- transform(economic_data, PROPDMG = PROPDMG * translate_exponentials(as.character(economic_data$PROPDMGEXP)), 
    CROPDMG = CROPDMG * translate_exponentials(as.character(economic_data$CROPDMGEXP)))

# Summarize by event type

economic_dmg <- ddply(economic_data, .(EVTYPE), summarize, property = sum(PROPDMG), 
    crops = sum(CROPDMG))

As in the case of public health consequences, we will keep only the top ten worst events. We proceed exactly as before.



economic_dmg <- ddply(economic_dmg, .(EVTYPE), transform, total_dmg = sum(property) + 
    sum(crops))

economic_dmg <- economic_dmg[order(economic_dmg$total_dmg, decreasing = TRUE), 
    ]

# -(1:10) drops exactly the first ten rows; -11:0 would also drop row 11.
others <- data.frame(EVTYPE = "others", t(colSums(economic_dmg[-(1:10), -1])))

economic_dmg <- rbind(economic_dmg[1:10, ], others)

Results

Public health

if (!require(ggplot2)) {
    install.packages("ggplot2")
}
## Loading required package: ggplot2

require(ggplot2)

if (!require(reshape2)) {
    install.packages("reshape2")
}
## Loading required package: reshape2

require(reshape2)

if (!require(scales)) {
    install.packages("scales")
}
## Loading required package: scales

require(scales)

# Reorganize the data to plot a bar graph using ggplot

# All damages must be in the same column and a factor indicating the damage
# type: fatality or injury is needed.

bar_graph_data <- melt(health_dmg, id.vars = c("EVTYPE"), measure.vars = c("fatalities", 
    "injuries"), value.name = "dmg", variable.name = "dmg_type")

# Recompute the total damage for each event type
bar_graph_data <- ddply(bar_graph_data, .(EVTYPE), transform, total_dmg = sum(dmg))

# Event names look better in lower case
bar_graph_data <- transform(bar_graph_data, EVTYPE = tolower(EVTYPE))

# We have lost the order. Reorder the data in decreasing order of total
# damage
bar_graph_data <- bar_graph_data[order(bar_graph_data$total_dmg, decreasing = TRUE), 
    ]

# And put the 'others' category in the last place
bar_graph_data <- rbind(bar_graph_data[bar_graph_data$EVTYPE != "others", ], 
    bar_graph_data[bar_graph_data$EVTYPE == "others", ])

# Plot it
ggplot(bar_graph_data, aes(x = reorder(bar_graph_data$EVTYPE, seq(nrow(bar_graph_data), 
    1)), y = dmg, fill = dmg_type)) + geom_bar(stat = "identity", position = "dodge") + 
    xlab("Event Type") + ylab("# victims") + scale_y_log10(breaks = c(10, 100, 
    1000, 1e+05), labels = comma) + labs(fill = "") + coord_flip()



Figure 1. Top 10 severe weather event types by number of victims (injured people plus fatalities). The number of victims is plotted on a base-10 logarithmic scale to make comparison easier.

Figure 1 shows that the events most harmful to public health are tornadoes. They alone cause between 1 and 2 orders of magnitude more victims than any of the other top ten event types.
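"Orders of magnitude" here refers to the difference between base-10 logarithms, which is what the distance between two bars on the logarithmic axis represents. A minimal sketch of the arithmetic, using made-up victim counts rather than the figures from this analysis:

```r
# Hypothetical victim counts, chosen only to illustrate the arithmetic
victims_a <- 90000
victims_b <- 3000

# Difference in orders of magnitude = log10 of the ratio
orders_apart <- log10(victims_a / victims_b)
print(round(orders_apart, 2))
```

A value between 1 and 2 means that the first count is between 10 and 100 times the second.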

Economic consequences



# Reorganize the data to plot a bar graph using ggplot

# All damages must be in the same column and a factor indicating the damage
# type (property or crops) is needed.

bar_graph_data <- melt(economic_dmg, id.vars = c("EVTYPE"), measure.vars = c("property", 
    "crops"), value.name = "dmg", variable.name = "dmg_type")

# Recompute the total damage for each event type
bar_graph_data <- ddply(bar_graph_data, .(EVTYPE), transform, total_dmg = sum(dmg))

# Event names look better in lower case
bar_graph_data <- transform(bar_graph_data, EVTYPE = tolower(EVTYPE))

# We have lost the order. Reorder the data in decreasing order of total
# damage
bar_graph_data <- bar_graph_data[order(bar_graph_data$total_dmg, decreasing = TRUE), 
    ]

# And put the 'others' category in the last place
bar_graph_data <- rbind(bar_graph_data[bar_graph_data$EVTYPE != "others", ], 
    bar_graph_data[bar_graph_data$EVTYPE == "others", ])

ggplot(bar_graph_data, aes(x = reorder(bar_graph_data$EVTYPE, seq(nrow(bar_graph_data), 
    1)), y = dmg, fill = dmg_type)) + geom_bar(stat = "identity", position = "dodge") + 
    xlab("Event Type") + ylab("Damage costs ($)") + scale_y_log10(breaks = c(10^3, 
    10^6, 10^9, 10^12), labels = comma) + labs(fill = "") + coord_flip()



Figure 2. Top 10 severe weather event types by economic losses (property plus crops). Cost is plotted on a base-10 logarithmic scale to make comparison easier.

Figure 2 shows that floods and hurricanes have the greatest impact. Interestingly, tornadoes, the events with the greatest impact on public health, are the third largest cause of economic losses.

Event types with great impact on both public health and economy

The following are the severe event types that cause both the highest economic losses and the highest number of victims:


tolower(intersect(health_dmg$EVTYPE[1:10], economic_dmg$EVTYPE[1:10]))
[1] "tornado"     "flood"       "flash flood" "ice storm"