Health and Economic Impact of Weather Events in the US

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Synopsis

Analysis of the storm event database shows that tornadoes are the most harmful weather event for population health, causing by far the most fatalities and injuries. Excessive heat is the second deadliest event type. The economic impact of weather events was also analyzed. Flash floods and thunderstorm winds caused the largest property damage between 1950 and 2011. Drought caused the largest crop damage, followed by flood, river flood, and ice storm.

Data Processing

The analysis uses the Storm Events Database, provided by the National Climatic Data Center. The data come as a bzip2-compressed comma-separated-value file available here. There is also some documentation of the data available here.

Reading the data into a data frame.

storm_data <- read.csv(bzfile("data/repdata_data_StormData.csv.bz2"))
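The same reading pattern can be tried without the full download. The following is a minimal sketch: the temporary file and the two-row data frame are made up for illustration and are not part of the storm data.

```r
# Write a tiny, made-up table to a bzip2-compressed CSV file
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(data.frame(EVTYPE = c("TORNADO", "HAIL"),
                     FATALITIES = c(1, 0)),
          con, row.names = FALSE)
close(con)

# Read it back the same way the storm data is read above
roundtrip <- read.csv(bzfile(tmp), stringsAsFactors = FALSE)
nrow(roundtrip)
## [1] 2
```

Setting stringsAsFactors = FALSE keeps text columns as character vectors, which is the default in R 4.0.0 and later.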

The data need cleaning. Event types do not follow a consistent format: the same event can appear with different capitalization or punctuation.

# number of unique event types
length(unique(storm_data$EVTYPE))
## [1] 985
# translate all letters to lowercase
event_types <- tolower(storm_data$EVTYPE)
# replace each blank or punctuation character with a space
event_types <- gsub("[[:blank:][:punct:]+]", " ", event_types)
length(unique(event_types))
## [1] 874
# update the data frame
storm_data$EVTYPE <- event_types
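To see what this normalization does, here is a small sketch on a handful of made-up labels (the vector raw_types is hypothetical and only imitates the kind of variation found in EVTYPE).

```r
# Made-up labels that differ only in case or punctuation
raw_types <- c("TSTM WIND", "Tstm Wind", "HURRICANE/TYPHOON",
               "hurricane-typhoon", "Frost/Freeze")

# Same two steps as above: lowercase, then punctuation to spaces
clean_types <- tolower(raw_types)
clean_types <- gsub("[[:blank:][:punct:]+]", " ", clean_types)

length(unique(raw_types))
## [1] 5
unique(clean_types)
## [1] "tstm wind"         "hurricane typhoon" "frost freeze"
```

Labels that differ in spelling or abbreviation, such as "tstm wind" and "thunderstorm wind", remain separate after this step.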

Section - I: Effect on Population Health

We first aggregate the number of casualties by event type to find the event types that are most harmful to population health.

library(plyr)
casualties <- ddply(storm_data, .(EVTYPE), summarize,
                    fatalities = sum(FATALITIES),
                    injuries = sum(INJURIES))

# Find the 10 event types that caused the most deaths and the most injuries
fatal_events <- head(casualties[order(casualties$fatalities, decreasing = TRUE), ], 10)
injury_events <- head(casualties[order(casualties$injuries, decreasing = TRUE), ], 10)
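The aggregate-then-rank step can be illustrated on a toy table. This is a sketch only: the data frame toy is made up, and base R's aggregate() is swapped in for ddply() so the example runs without plyr.

```r
# Made-up events; column names mirror the storm data
toy <- data.frame(
  EVTYPE     = c("tornado", "heat", "tornado", "flood", "heat"),
  FATALITIES = c(3, 1, 2, 0, 3),
  INJURIES   = c(10, 0, 5, 2, 1)
)

# Sum both casualty columns within each event type
toy_casualties <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                            data = toy, FUN = sum)

# Rank event types by total fatalities, largest first
toy_ranked <- toy_casualties[order(toy_casualties$FATALITIES,
                                   decreasing = TRUE), ]
as.character(toy_ranked$EVTYPE)
## [1] "tornado" "heat"    "flood"
```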

The 10 event types that caused the largest number of deaths are:

fatal_events[, c("EVTYPE", "fatalities")]
##             EVTYPE fatalities
## 741        tornado       5633
## 116 excessive heat       1903
## 138    flash flood        978
## 240           heat        937
## 410      lightning        816
## 762      tstm wind        504
## 154          flood        470
## 515    rip current        368
## 314      high wind        248
## 19       avalanche        224

The 10 event types that caused the largest number of injuries are:

injury_events[, c("EVTYPE", "injuries")]
##                EVTYPE injuries
## 741           tornado    91346
## 762         tstm wind     6957
## 154             flood     6789
## 116    excessive heat     6525
## 410         lightning     5230
## 240              heat     2100
## 382         ice storm     1975
## 138       flash flood     1777
## 671 thunderstorm wind     1488
## 209              hail     1361

Section - II: Effect on the Economy

Property damage and crop damage estimates are used to analyze the impact of weather events on the economy.

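# Convert a damage magnitude code to a power of ten.
# (Named damage_exponent so that it does not mask base::exp.)
damage_exponent <- function(code) {
    # h -> hundred, k -> thousand, m -> million, b -> billion
    if (code %in% c('h', 'H'))
        return(2)
    else if (code %in% c('k', 'K'))
        return(3)
    else if (code %in% c('m', 'M'))
        return(6)
    else if (code %in% c('b', 'B'))
        return(9)
    else if (!is.na(as.numeric(code)))
        # numeric codes: as.numeric() of the code is used as the power of ten
        # (for a factor column, as.numeric() gives the level index, not the digit)
        return(as.numeric(code))
    else if (code %in% c('', '-', '?', '+'))
        # blank or symbol codes: no multiplier
        return(0)
    else {
        stop("Invalid value")
    }
}
prop_dmg_exp <- sapply(storm_data$PROPDMGEXP, FUN=damage_exponent)
storm_data$prop_dmg <- storm_data$PROPDMG * (10 ^ prop_dmg_exp)
crop_dmg_exp <- sapply(storm_data$CROPDMGEXP, FUN=damage_exponent)
storm_data$crop_dmg <- storm_data$CROPDMG * (10 ^ crop_dmg_exp)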
# Compute the economic loss by event type
library(plyr)
econ_loss <- ddply(storm_data, .(EVTYPE), summarize,
                   prop_dmg = sum(prop_dmg),
                   crop_dmg = sum(crop_dmg))

# filter out events that caused no economic loss
econ_loss <- econ_loss[(econ_loss$prop_dmg > 0 | econ_loss$crop_dmg > 0), ]
prop_dmg_events <- head(econ_loss[order(econ_loss$prop_dmg, decreasing = TRUE), ], 10)
crop_dmg_events <- head(econ_loss[order(econ_loss$crop_dmg, decreasing = TRUE), ], 10)
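As a worked example of the magnitude conversion used in this section, a damage figure of 25 with code "K" stands for 25 * 10^3 = 25,000 dollars. The sketch below is a vectorized alternative built on a lookup table, not the function used in this report: the names magnitude_power and to_power are hypothetical, and unlike the function above it maps unknown codes to 0 instead of stopping.

```r
# Hypothetical lookup table: magnitude letter -> power of ten
magnitude_power <- c(h = 2, k = 3, m = 6, b = 9)

# Convert a vector of magnitude codes to powers of ten
to_power <- function(codes) {
  codes <- tolower(as.character(codes))   # also safe for factor columns
  power <- unname(magnitude_power[codes]) # NA for codes not in the table
  is_digit <- grepl("^[0-9]$", codes)     # a single digit is used as the power
  power[is_digit] <- as.numeric(codes[is_digit])
  power[is.na(power)] <- 0                # '', '-', '?', '+' and anything else
  power
}

# Made-up damage figures and codes
damage  <- c(25, 2.5, 1.2, 10, 3)
codes   <- c("K", "M", "B", "", "2")
dollars <- damage * 10^to_power(codes)
# 25 K -> 25,000; 2.5 M -> 2,500,000; 1.2 B -> 1,200,000,000; 10 -> 10; 3 with "2" -> 300
```

Converting to character first avoids depending on whether the column was read as a factor or as text.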

The 10 event types that caused the most property damage (in dollars) are:

prop_dmg_events[, c("EVTYPE", "prop_dmg")]
##                 EVTYPE     prop_dmg
## 138        flash flood 6.820237e+13
## 697 thunderstorm winds 2.086532e+13
## 741            tornado 1.078951e+12
## 209               hail 3.157558e+11
## 410          lightning 1.729433e+11
## 154              flood 1.446577e+11
## 366  hurricane typhoon 6.930584e+10
## 166           flooding 5.920826e+10
## 585        storm surge 4.332354e+10
## 270         heavy snow 1.793259e+10

Similarly, the 10 event types that caused the most crop damage (in dollars) are:

crop_dmg_events[, c("EVTYPE", "crop_dmg")]
##                EVTYPE    crop_dmg
## 84            drought 13972566000
## 154             flood  5661968450
## 519       river flood  5029459000
## 382         ice storm  5022113500
## 209              hail  3025974480
## 357         hurricane  2741910000
## 366 hurricane typhoon  2607872800
## 138       flash flood  1421317100
## 125      extreme cold  1312973000
## 185      frost freeze  1094186000

Results

Health impact of weather events

The following plot shows the 10 weather event types with the most fatalities and the 10 with the most injuries.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.1.3
library(gridExtra)
## Loading required package: grid
# Set the levels in order
p1 <- ggplot(data=fatal_events,
             aes(x=reorder(EVTYPE, fatalities), y=fatalities, fill=fatalities)) +
    geom_bar(stat="identity") +
    coord_flip() +
    ylab("Total number of fatalities") +
    xlab("Event type") +
    theme(legend.position="none")

p2 <- ggplot(data=injury_events,
             aes(x=reorder(EVTYPE, injuries), y=injuries, fill=injuries)) +
    geom_bar(stat="identity") +
    coord_flip() + 
    ylab("Total number of injuries") +
    xlab("Event type") +
    theme(legend.position="none")

# gridExtra 2.0.0 and later take the title via `top`; older releases used `main`
grid.arrange(p1, p2, top="Top deadly weather events in the US (1950-2011)")

Economic impact of weather events

The following plot shows the weather event types that have caused the greatest economic cost since 1950. Property damage is plotted on a log scale.

library(ggplot2)
library(gridExtra)
# Set the levels in order
p1 <- ggplot(data=prop_dmg_events,
             aes(x=reorder(EVTYPE, prop_dmg), y=log10(prop_dmg), fill=prop_dmg )) +
    geom_bar(stat="identity") +
    coord_flip() +
    xlab("Event type") +
    ylab("Property damage in dollars (log-scale)") +
    theme(legend.position="none")

p2 <- ggplot(data=crop_dmg_events,
             aes(x=reorder(EVTYPE, crop_dmg), y=crop_dmg, fill=crop_dmg)) +
    geom_bar(stat="identity") +
    coord_flip() + 
    xlab("Event type") +
    ylab("Crop damage in dollars") + 
    theme(legend.position="none")

# gridExtra 2.0.0 and later take the title via `top`; older releases used `main`
grid.arrange(p1, p2, top="Weather costs to the US economy (1950-2011)")