Fatalities, injuries, and economic costs to the US economy from 1950 to 2011

INTRODUCTION

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Synopsis

The analysis of the storm event database shows that tornadoes are the most dangerous weather event type in the US in terms of both injuries and fatalities. In terms of economic consequences, the analysis shows that tornadoes caused the most property damage over these 61 years, while hail caused the most crop damage.

Data Information

The data for this report come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. You can download the file from the course web site:

Storm Data

There is also some documentation of the database available, which explains how some of the variables are constructed and defined:

National Weather Service Storm Data Documentation
National Climatic Data Center Storm Events FAQ

The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

DATA PROCESSING

Because the data is a comma-separated-value file compressed with bzip2, the first step is to read it into a data frame through a bzfile connection. A call to summary() is included but commented out, since its output is long.

storm <- read.csv(bzfile("/Users/isabelmendez/Documents/R/ExploratoryDataAnalysis/project2/repdata-data-StormData.csv.bz2"), na.strings = "NA")
#summary(storm)
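
As a small illustration of the reading step, the sketch below writes a toy data frame to a bzip2-compressed CSV in a temporary file and reads it back with the same read.csv(bzfile(...)) pattern. The toy data frame and the temporary path are made up for the example and are not part of the storm data.

```r
# Hypothetical toy data standing in for the storm file
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(2, 0))

# Write the CSV through a bzip2 connection to a temporary file
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Read it back the same way the report reads the storm data
toy_back <- read.csv(bzfile(path), na.strings = "NA")
print(toy_back)
```

The round trip returns the same two rows, which suggests that no separate decompression step is needed before read.csv.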

Preprocessing data

The weather event type labels were normalized so that different spellings of the same event type are counted together.

# number of unique event types
length(unique(storm$EVTYPE))
## [1] 985
# translate all letters to lowercase
event_types <- tolower(storm$EVTYPE)
# replace each blank or punctuation character with a space
event_types <- gsub("[[:blank:][:punct:]+]", " ", event_types)
length(unique(event_types))
## [1] 874
# update the data frame
storm$EVTYPE <- event_types
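
To see what this normalization does, here is a minimal sketch on four made-up labels (they are illustrative, not taken from the database) using the same tolower() and gsub() calls as above.

```r
# Hypothetical raw labels that differ only in case or punctuation
raw <- c("TSTM WIND", "Tstm Wind",
         "THUNDERSTORM WINDS/HAIL", "Thunderstorm Winds Hail")

# Same two steps as in the report: lowercase, then map each blank or
# punctuation character to a space
cleaned <- gsub("[[:blank:][:punct:]+]", " ", tolower(raw))

print(length(unique(raw)))      # distinct labels before cleaning
print(length(unique(cleaned)))  # distinct labels after cleaning
print(unique(cleaned))
```

Labels that differ in wording, such as "tstm wind" and "thunderstorm wind", are still treated as separate event types after this step.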

Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

To find the event types that are most harmful to population health, the numbers of fatalities and injuries are summed by event type.

library(plyr)
casualties <- ddply(storm, .(EVTYPE), summarize,
                    fatalities = sum(FATALITIES),
                    injuries = sum(INJURIES))

# Find the events that caused the most deaths and injuries
fatal_events <- head(casualties[order(casualties$fatalities, decreasing = TRUE), ], 10)
injury_events <- head(casualties[order(casualties$injuries, decreasing = TRUE), ], 10)
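
The same sum-by-group step can be sketched in base R with aggregate() instead of plyr. This is an alternative shown on invented toy values, not the code used to produce the tables in this report.

```r
# Toy data; column names follow the storm data, values are invented
toy <- data.frame(
  EVTYPE     = c("tornado", "tornado", "flood", "heat"),
  FATALITIES = c(3, 2, 1, 4),
  INJURIES   = c(10, 5, 0, 2)
)

# Sum both casualty columns within each event type
toy_casualties <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                            data = toy, FUN = sum)

# Rank event types by total fatalities, largest first
toy_top <- toy_casualties[order(toy_casualties$FATALITIES, decreasing = TRUE), ]
print(toy_top)
```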

The top 10 events by number of deaths are listed below. Tornadoes are the leading cause of death.

fatal_events[, c("EVTYPE", "fatalities")]
##             EVTYPE fatalities
## 741        tornado       5633
## 116 excessive heat       1903
## 138    flash flood        978
## 240           heat        937
## 410      lightning        816
## 762      tstm wind        504
## 154          flood        470
## 515    rip current        368
## 314      high wind        248
## 19       avalanche        224

The top 10 events by number of injuries are listed next. Tornadoes are again the most harmful.

injury_events[, c("EVTYPE", "injuries")]
##                EVTYPE injuries
## 741           tornado    91346
## 762         tstm wind     6957
## 154             flood     6789
## 116    excessive heat     6525
## 410         lightning     5230
## 240              heat     2100
## 382         ice storm     1975
## 138       flash flood     1777
## 671 thunderstorm wind     1488
## 209              hail     1361

Across the United States, which types of events have the greatest economic consequences?

library(plyr)
econ_loss <- ddply(storm, .(EVTYPE), summarize,
                   prop_dmg = sum(PROPDMG),
                   crop_dmg = sum(CROPDMG))

# filter out events that caused no economic loss
econ_loss <- econ_loss[(econ_loss$prop_dmg > 0 | econ_loss$crop_dmg > 0), ]
prop_dmg_events <- head(econ_loss[order(econ_loss$prop_dmg, decreasing = TRUE), ], 10)
crop_dmg_events <- head(econ_loss[order(econ_loss$crop_dmg, decreasing = TRUE), ], 10)

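The 10 events that caused the most property damage, measured by the summed PROPDMG values, are listed below. Tornadoes rank first. Note that these totals use PROPDMG as recorded, without applying the magnitude codes stored in PROPDMGEXP.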

library(plyr)
prop_dmg_events[, c("EVTYPE", "prop_dmg")]
##                 EVTYPE  prop_dmg
## 741            tornado 3212258.2
## 138        flash flood 1420124.6
## 762          tstm wind 1335995.6
## 154              flood  899938.5
## 671  thunderstorm wind  876844.2
## 209               hail  688693.4
## 410          lightning  603351.8
## 697 thunderstorm winds  446293.2
## 314          high wind  324731.6
## 866       winter storm  132720.6

For crop damage, hail is the leading cause among the top 10. It is worth noting that tornadoes rank only fifth here.

crop_dmg_events[, c("EVTYPE", "crop_dmg")]
##                 EVTYPE  crop_dmg
## 209               hail 579596.28
## 138        flash flood 179200.46
## 154              flood 168037.88
## 762          tstm wind 109202.60
## 741            tornado 100018.52
## 671  thunderstorm wind  66791.45
## 84             drought  33898.62
## 697 thunderstorm winds  18684.93
## 314          high wind  17283.21
## 250         heavy rain  11122.80
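
The totals above sum PROPDMG and CROPDMG exactly as recorded. The storm data also carries PROPDMGEXP and CROPDMGEXP columns with magnitude codes, which this analysis does not apply. The sketch below shows one way such codes could be turned into multipliers. It assumes K, M and B stand for thousands, millions and billions, and that any other code means a multiplier of 1, which is a simplification. The helper name exp_to_multiplier and the toy rows are hypothetical.

```r
# Toy rows; column names follow the storm data, values are invented
toy <- data.frame(
  EVTYPE     = c("tornado", "flood", "hail"),
  PROPDMG    = c(25, 2.5, 500),
  PROPDMGEXP = c("K", "M", ""),
  stringsAsFactors = FALSE
)

# Assumed mapping: K = thousands, M = millions, B = billions;
# any other code (including an empty one) is treated as 1
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Damage scaled by the assumed multiplier
toy$prop_dmg_usd <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
print(toy)
```

Rankings computed from scaled values may differ from the tables in this report, so the two should not be mixed.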

RESULTS

Most harmful weather events with respect to population health

The following plot shows the most dangerous weather event types. Tornadoes were the main cause of both death and injury during those 61 years, with more than 5,600 deaths and more than 91,000 injuries in the US over that period. By fatalities, the next two most dangerous event types were excessive heat and flash floods.

library(ggplot2)
library(gridExtra)
# Set the levels in order
p1 <- ggplot(data=fatal_events,
             aes(x=reorder(EVTYPE, fatalities), y=fatalities, fill=fatalities)) +
    geom_bar(stat="identity") +
    coord_flip() + xlab("Weather Event Type") + ylab("Total Number of Fatalities") + theme(legend.position="none")

p2 <- ggplot(data=injury_events,
             aes(x=reorder(EVTYPE, injuries), y=injuries, fill=injuries)) +
    geom_bar(stat="identity") +
    coord_flip() + xlab("Weather Event Type") + ylab("Total Number of Injuries") + theme(legend.position="none")

grid.arrange(p1, p2, top="TOP FATALITIES AND INJURIES IN THE US FROM 1950 TO 2011")
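
The bars come out sorted because reorder() orders the factor levels of EVTYPE by the accompanying totals. A minimal sketch of that behaviour, using four fatality totals from the table above:

```r
# Event types and their fatality totals, taken from the table above
ev     <- c("tornado", "lightning", "flood", "rip current")
deaths <- c(5633, 816, 470, 368)

# reorder() returns a factor whose levels are sorted by the totals
# (ascending), rather than alphabetically
ev_sorted <- reorder(ev, deaths)
print(levels(ev_sorted))
```

With coord_flip(), the first level is drawn at the bottom, so the largest total ends up at the top of the chart.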

Weather events with the greatest economic consequences

The following plot shows the weather events with the greatest economic consequences. Tornadoes caused the most property damage. For crop damage, hail ranks first and tornadoes rank fifth. Note that the property damage panel is drawn on a log10 scale, while the crop damage panel uses a linear scale.

library(ggplot2)
#library(gridExtra)
# Set the levels in order
p1 <- ggplot(data=prop_dmg_events,
             aes(x=reorder(EVTYPE, prop_dmg), y=log10(prop_dmg), fill=prop_dmg )) +
    geom_bar(stat="identity") +
    coord_flip() +
    xlab("Weather Event Type") +
    ylab("Property Damage in Dollars (log-scale)") +
    theme(legend.position="none")

p2 <- ggplot(data=crop_dmg_events,
             aes(x=reorder(EVTYPE, crop_dmg), y=crop_dmg, fill=crop_dmg)) +
    geom_bar(stat="identity") +
    coord_flip() + 
    xlab("Weather Event Type") +
    ylab("Crop Damage in Dollars") + 
    theme(legend.position="none")
grid.arrange(p1, p2, top="ECONOMIC COSTS TO THE US ECONOMY FROM 1950 TO 2011")
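
Because the property damage panel plots log10(prop_dmg), bar lengths there understate the differences between event types. The short sketch below uses the first and tenth property damage totals from the table above to show how a large ratio turns into a small difference on the log scale.

```r
# First and tenth property damage totals from the table above
tornado_dmg      <- 3212258.2
winter_storm_dmg <- 132720.6

# Ratio on the original scale versus difference on the log10 scale
ratio    <- tornado_dmg / winter_storm_dmg
log_diff <- log10(tornado_dmg) - log10(winter_storm_dmg)

print(ratio)
print(log_diff)
```

A difference of one unit on the log10 axis corresponds to a factor of ten in the underlying totals.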