Public health and economic problems caused by storms and other severe weather events

Synopsis

This analysis uses the data made available by the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database to show the public health and economic implications of storms and other weather events.

The NOAA database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The events in the database start in the year 1950 and end in November 2011.

Data processing

Download the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database

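library(R.utils)  # provides bunzip2()
library(ggplot2)

# Download and decompress the data only if the CSV is not already present
if (!file.exists("data/StormData.csv")) {
    dir.create("data", showWarnings = FALSE)  # make sure the target directory exists
    download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", 
        "data/StormData.csv.bz2", mode = "wb")
    bunzip2("data/StormData.csv.bz2")
}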

Load the data into R

storm.data <- read.csv("data/StormData.csv")
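
As an aside, base R can read a bzip2-compressed CSV directly, so the explicit decompression step is a convenience rather than a requirement. The sketch below demonstrates this on a small made-up file; the temporary file and its two rows are illustrative assumptions, not part of the storm data.

```r
# Write a tiny, hypothetical CSV through a bzip2 connection
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
write.csv(data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(1, 0)),
          con, row.names = FALSE)
close(con)

# read.csv() opens the file in read mode, which detects the compression
toy <- read.csv(tmp)
print(toy)
```

Applied to this report, the same idea would mean calling read.csv on "data/StormData.csv.bz2" instead of on the decompressed file.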

Filter to columns used in the analysis

data <- storm.data[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", 
    "CROPDMG", "CROPDMGEXP")]

Inspect the data set

str(data)
## 'data.frame':    902297 obs. of  7 variables:
##  $ EVTYPE    : Factor w/ 985 levels "?","ABNORMALLY DRY",..: 830 830 830 830 830 830 830 830 830 830 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
tail(data, n = 3)
##            EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
## 902295  HIGH WIND          0        0       0          K       0
## 902296   BLIZZARD          0        0       0          K       0
## 902297 HEAVY SNOW          0        0       0          K       0
##        CROPDMGEXP
## 902295          K
## 902296          K
## 902297          K

Tidy up the data set and calculate total economic damage

# Convert exponent codes to numeric powers of ten, so that they can be used
# to calculate the total damage
exponents.to.numeric <- function(codes) {
    codes <- toupper(as.character(codes))
    codes[codes == ""] <- 0                  # no code: multiplier 10^0 = 1
    codes[codes %in% c("+", "-", "?")] <- 1  # ambiguous symbols are treated as 10^1
    codes[codes == "H"] <- 2                 # hundreds
    codes[codes == "K"] <- 3                 # thousands
    codes[codes == "M"] <- 6                 # millions
    codes[codes == "B"] <- 9                 # billions
    as.numeric(codes)                        # digit codes are used as exponents as-is
}
data$PROPDMGEXP <- exponents.to.numeric(data$PROPDMGEXP)
data$CROPDMGEXP <- exponents.to.numeric(data$CROPDMGEXP)
data$ECONOMICDAMAGE <- data$PROPDMG * 10^data$PROPDMGEXP + data$CROPDMG * 10^data$CROPDMGEXP
tail(data, n = 3)
##            EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
## 902295  HIGH WIND          0        0       0          3       0
## 902296   BLIZZARD          0        0       0          3       0
## 902297 HEAVY SNOW          0        0       0          3       0
##        CROPDMGEXP ECONOMICDAMAGE
## 902295          3              0
## 902296          3              0
## 902297          3              0
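
To make the arithmetic concrete, the following sketch applies the same letter-code mapping to a few made-up rows, using a named lookup vector instead of repeated assignments. The `toy` data frame and the names `exponent.lookup` and `DAMAGE.USD` are illustrative assumptions, not part of the analysis above.

```r
# Hypothetical rows: a damage figure and its exponent code
toy <- data.frame(PROPDMG = c(25, 2.5, 1.2, 7),
                  PROPDMGEXP = c("K", "M", "B", ""),
                  stringsAsFactors = FALSE)

# Same mapping as exponents.to.numeric() for the letter codes
exponent.lookup <- c(H = 2, K = 3, M = 6, B = 9)
exponent <- unname(exponent.lookup[toupper(toy$PROPDMGEXP)])

# An empty code matches no name and yields NA; treat it as 10^0
exponent[toy$PROPDMGEXP == ""] <- 0

# Damage in USD = figure * 10^exponent
toy$DAMAGE.USD <- toy$PROPDMG * 10^exponent
print(toy)
```

A value of 25 with code K is therefore counted as 25,000 USD, while a value with no code is taken at face value.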

Results

Using the weather events data set, we investigate which events were the most harmful both economically and with respect to population.

Which types of events are most harmful with respect to population health?

To find the event types most harmful to population health, we compare the total number of fatalities and the total number of injuries per event type.

# Base function for all plots: bar chart of the six event types with the
# largest total of the given variable
weather.event.plot <- function(data, var, ylabel = paste("Number of", var), 
    title = paste(var, "per event type")) {
    values <- head(sort(tapply(data[[toupper(var)]], data$EVTYPE, sum), decreasing = TRUE))
    # Keep the totals in a local data frame with explicit column names
    totals <- data.frame(event = names(values), total = as.numeric(values))

    ggplot(data = totals, aes(x = event, y = total)) + 
        geom_bar(stat = "identity") + theme(axis.text.x = element_text(angle = 90)) + 
        xlab("Event Type") + ylab(ylabel) + ggtitle(title)
}
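
The aggregation inside the plotting function can be seen in isolation on a small example. The sketch below uses made-up records (the `toy` data frame is hypothetical) to show how tapply sums a variable per event type and how sort and head keep the largest totals.

```r
# Hypothetical toy records, not taken from the storm database
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "TORNADO"),
                  FATALITIES = c(3, 1, 2, 4, 0, 1))

# Sum fatalities within each event type
totals <- tapply(toy$FATALITIES, toy$EVTYPE, sum)

# Order from largest to smallest and keep the top two
ranked <- sort(totals, decreasing = TRUE)
top <- head(ranked, n = 2)
print(top)
```

In the report itself head is called without n, so the six largest totals are plotted.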

Fatalities

weather.event.plot(data, "Fatalities")

[Figure: Fatalities per event type]

Injuries

weather.event.plot(data, "Injuries")

[Figure: Injuries per event type]

Tornadoes are clearly the most harmful event type for population health, in terms of both fatalities and injuries. Weather events such as excessive heat and floods also play an important role.

Which types of events have the greatest economic consequences?

weather.event.plot(data, "economicdamage", ylabel = "Economic damage (USD)", 
    title = "Economic damage per event type")

[Figure: Economic damage per event type]

Floods are by far the event type with the greatest economic impact; tornadoes and typhoons also account for a significant share of the damage.