Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Synopsis

The analysis shows that tornadoes are the most dangerous events for population health, followed by excessive heat. Flash floods and thunderstorm winds caused billions of dollars in damage between 1950 and 2011. Drought caused the largest crop damage, followed by flood and hail.

Data Processing

The database used can be downloaded here, and its description can be found here.

library(ggplot2)
library(gridExtra)
library(plyr)
data <- read.csv(bzfile("repdata_data_StormData.csv.bz2"))

Before starting, the EVTYPE (event type) variable needs to be normalized so that the same event falls into the same category regardless of how it was logged (uppercase/lowercase, blanks, or punctuation).

eventTypes <- tolower(data$EVTYPE)
eventTypes <- gsub("[[:blank:][:punct:]+]", " ", eventTypes)
data$EVTYPE <- eventTypes
length(unique(data$EVTYPE))
## [1] 874
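
The effect of this normalization can be checked on a small vector. The sketch below is illustrative only: the raw labels are invented for the example rather than taken from the database, and `normalizeEventType` is a hypothetical helper that wraps the same two calls used above.

```r
# Hypothetical raw labels, invented for illustration (not read from the database)
rawTypes <- c("TSTM WIND", "Tstm Wind", "tstm wind", "FLASH FLOOD", "Flash-Flood")

# Hypothetical helper wrapping the same tolower() and gsub() calls used above
normalizeEventType <- function(x) {
  x <- tolower(x)
  # Replace each blank or punctuation character with a single space
  gsub("[[:blank:][:punct:]+]", " ", x)
}

normalized <- normalizeEventType(rawTypes)

# Number of distinct labels before and after normalization
length(unique(rawTypes))
length(unique(normalized))
```

This step only merges labels that differ in case, blanks, or punctuation. Different spellings of the same event stay separate, which is why both "tstm wind" and "thunderstorm wind" appear in the injuries table further down.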

Most Harmful Events with Respect to Population Health Across the United States

Results

To find the most harmful events, we sum the fatalities and injuries grouped by event type.

casualtiesByEvent <- ddply(data, .(EVTYPE), summarize, fatalities = sum(FATALITIES), injuries = sum(INJURIES))
fatalEvents <- head(casualtiesByEvent[order(casualtiesByEvent$fatalities, decreasing = TRUE), ], 10)
injuryEvents <- head(casualtiesByEvent[order(casualtiesByEvent$injuries, decreasing = TRUE), ], 10)
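
The group-and-sum step can be tried on a tiny data frame before running it on the full database. This is a minimal sketch that uses base R's `aggregate` instead of `ddply`. The `toy` data frame and its values are invented for illustration, and the result keeps the original column names `FATALITIES` and `INJURIES`.

```r
# Toy data, invented for illustration (not real storm records)
toy <- data.frame(
  EVTYPE     = c("tornado", "tornado", "flood", "heat", "flood"),
  FATALITIES = c(3, 1, 0, 2, 1),
  INJURIES   = c(10, 5, 2, 0, 1)
)

# Sum both casualty columns within each event type
toyCasualties <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                           data = toy, FUN = sum)

# Sort so that the deadliest event type comes first
toyCasualties <- toyCasualties[order(toyCasualties$FATALITIES, decreasing = TRUE), ]
toyCasualties
```

Sorting the same table by `INJURIES` instead gives the second ranking.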

Top 10 events that caused the largest number of deaths:

fatalEvents[, c("EVTYPE", "fatalities")]
##             EVTYPE fatalities
## 741        tornado       5633
## 116 excessive heat       1903
## 138    flash flood        978
## 240           heat        937
## 410      lightning        816
## 762      tstm wind        504
## 154          flood        470
## 515    rip current        368
## 314      high wind        248
## 19       avalanche        224

Top 10 events that caused the largest number of injuries:

injuryEvents[, c("EVTYPE", "injuries")]
##                EVTYPE injuries
## 741           tornado    91346
## 762         tstm wind     6957
## 154             flood     6789
## 116    excessive heat     6525
## 410         lightning     5230
## 240              heat     2100
## 382         ice storm     1975
## 138       flash flood     1777
## 671 thunderstorm wind     1488
## 209              hail     1361

p1 <- ggplot(data=fatalEvents,
             aes(x=reorder(EVTYPE, fatalities), y=fatalities, fill=fatalities)) +
    geom_bar(stat="identity") +
    coord_flip() +
    ylab("Total number of fatalities") +
    xlab("Event")

p2 <- ggplot(data=injuryEvents,
             aes(x=reorder(EVTYPE, injuries), y=injuries, fill=injuries)) +
    geom_bar(stat="identity") +
    coord_flip() + 
    ylab("Total number of injuries") +
    xlab("Event")

grid.arrange(p1, p2, top="Top deadly weather events")

Economic Effects of Weather Events

Results

Property Damage

propertyDamageData <- aggregate(PROPDMG ~ EVTYPE, data = data, FUN=sum)
propertyDamageData <- arrange(propertyDamageData, desc(propertyDamageData[, 2]))
top10PropertyDamageData <- propertyDamageData[1:10,]


ggplot(top10PropertyDamageData, aes(x = reorder(EVTYPE, -PROPDMG), y = PROPDMG)) +
  geom_bar(stat = "identity") +
  xlab("Weather Event Type") +
  ylab("Property Damage") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ggtitle('Top 10 Property Damage')

Crop Damage

cropDamageData <- aggregate(CROPDMG ~ EVTYPE, data = data, FUN=sum)
cropDamageData <- arrange(cropDamageData, desc(cropDamageData[, 2]))
top10CropDamageData <- cropDamageData[1:10,]

ggplot(top10CropDamageData, aes(x = reorder(EVTYPE, -CROPDMG), y = CROPDMG)) +
  geom_bar(stat = "identity") +
  xlab("Weather Event Type") +
  ylab("Crop Damage") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ggtitle('Top 10 Crop Damage')
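
The totals above sum PROPDMG and CROPDMG exactly as recorded. If the database also stores a magnitude code next to each amount, the amounts would need to be scaled before summing to obtain dollar values. The sketch below illustrates that idea on toy data under stated assumptions: the column name `PROPDMGEXP`, the codes K, M, and B, and the helper `magnitudeToMultiplier` are not part of the analysis above, and treating every other code as a multiplier of 1 is a simplification.

```r
# Hypothetical helper: map a magnitude code to a dollar multiplier.
# Assumption: K = thousands, M = millions, B = billions; any other code counts as 1.
magnitudeToMultiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  multiplier <- lookup[toupper(as.character(code))]
  multiplier[is.na(multiplier)] <- 1
  unname(multiplier)
}

# Toy data, invented for illustration (not real storm records)
toyDamage <- data.frame(
  EVTYPE     = c("flood", "flood", "hail", "drought"),
  PROPDMG    = c(2.5, 10, 500, 1),
  PROPDMGEXP = c("M", "K", "K", "B")
)

# Scale each recorded amount to dollars, then sum by event type
toyDamage$propertyDollars <- toyDamage$PROPDMG *
  magnitudeToMultiplier(toyDamage$PROPDMGEXP)
dollarsByEvent <- aggregate(propertyDollars ~ EVTYPE, data = toyDamage, FUN = sum)
dollarsByEvent <- dollarsByEvent[order(dollarsByEvent$propertyDollars, decreasing = TRUE), ]
dollarsByEvent
```

In this toy example, hail has the largest raw amount (500) but the smallest dollar total, so the ranking can change once the magnitude is applied.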