Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This study addresses the following questions:

  1. Across the United States, which types of events are most harmful with respect to population health?
  2. Across the United States, which types of events have the greatest economic consequences?

To answer these questions, this analysis uses the public storm database from NOAA (the National Oceanic and Atmospheric Administration). The database contains characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Data Processing

The data used here were downloaded from the Coursera Reproducible Research course. The file is a comma-separated-value file compressed with the bzip2 algorithm to reduce its size.

Storm Data https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2

Documentation from National Weather Service https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf

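# Packages used below: dplyr (%>%, group_by, summarise), ggplot2, gridExtra (grid.arrange)
library(dplyr)
library(ggplot2)
library(gridExtra)

stormData <- read.csv(file = "repdata-data-StormData.csv")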
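
Because the course file is distributed as a bzip2 archive, one possible way to make the loading step reproducible is to download it once and read it without unpacking it first. The following is a minimal sketch in base R, not the method used in this report: the helper name `load_storm_data` and the default local file name `StormData.csv.bz2` are hypothetical. It relies on `read.csv` being able to read bzip2-compressed files directly.

```r
# URL taken from the "Storm Data" link above.
storm_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"

# Hypothetical helper: download the archive only if it is not already on disk,
# then read it. read.csv() detects the bzip2 compression when opening the file.
load_storm_data <- function(url, dest = "StormData.csv.bz2") {
  if (!file.exists(dest)) {
    download.file(url, destfile = dest, mode = "wb")  # "wb" keeps the binary intact
  }
  read.csv(dest, stringsAsFactors = FALSE)
}

# Usage (commented out because it triggers a large download):
# stormData <- load_storm_data(storm_url)
```

Caching the download in this way avoids fetching the file again on every run of the report.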

Converting the date variables

stormData$BGN_DATE <- as.Date(stormData$BGN_DATE, "%m/%d/%Y")
stormData$END_DATE <- as.Date(stormData$END_DATE, "%m/%d/%Y")
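
To illustrate what this conversion does, here is a small sketch on two made-up strings. It assumes the raw date columns carry a trailing time component (for example `4/18/1950 0:00:00`), which `as.Date` ignores once the `%m/%d/%Y` part has been matched. The vector names `raw`, `parsed`, and `years` are illustrative only.

```r
# Made-up example values in the assumed raw format (month/day/year plus a time).
raw <- c("4/18/1950 0:00:00", "12/31/2011 0:00:00")

# Only the date part is parsed; the trailing time is not covered by the format.
parsed <- as.Date(raw, format = "%m/%d/%Y")

# Once the column is a Date, the year can be extracted for grouping or axis labels.
years <- as.integer(format(parsed, "%Y"))

print(parsed)
print(years)
```

Having the columns as `Date` objects is what allows the later plots to use `BGN_DATE` on the x axis and to filter by date.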

Results

1. Across the United States, which types of events are most harmful with respect to population health?

Two variables indicate how harmful each event type is with respect to population health: FATALITIES and INJURIES. Both are summed per event type:

# Total fatalities and injuries per event type
stormDataPop <- stormData %>%
  group_by(EVTYPE) %>%
  summarise(TOT_FATALITIES = sum(FATALITIES, na.rm = TRUE),
            TOT_INJURIES   = sum(INJURIES, na.rm = TRUE))

# Rank event types by total fatalities and show the top six
stormDataPop <- stormDataPop[order(stormDataPop$TOT_FATALITIES, decreasing = TRUE),]
head(stormDataPop[,c(1,2)])
## Source: local data frame [6 x 2]
## 
##           EVTYPE TOT_FATALITIES
##           (fctr)          (dbl)
## 1        TORNADO           5633
## 2 EXCESSIVE HEAT           1903
## 3    FLASH FLOOD            978
## 4           HEAT            937
## 5      LIGHTNING            816
## 6      TSTM WIND            504

Tornadoes cause by far the largest number of fatalities across the US (5,633), followed by excessive heat (1,903) and flash floods (978).

# Re-rank event types by total injuries and show the top six
stormDataPop <- stormDataPop[order(stormDataPop$TOT_INJURIES, decreasing = TRUE),]
head(stormDataPop[,c(1,3)])
## Source: local data frame [6 x 2]
## 
##           EVTYPE TOT_INJURIES
##           (fctr)        (dbl)
## 1        TORNADO        91346
## 2      TSTM WIND         6957
## 3          FLOOD         6789
## 4 EXCESSIVE HEAT         6525
## 5      LIGHTNING         5230
## 6           HEAT         2100

As with fatalities, tornadoes cause the largest number of injuries (91,346), this time followed by thunderstorm wind (6,957) and floods (6,789).
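
The rankings above group on EVTYPE exactly as recorded, so closely related labels (for example TSTM WIND, or HEAT and EXCESSIVE HEAT) are counted separately. A possible refinement, which is not part of this analysis, is to normalise the labels before grouping. The sketch below uses base R on a made-up toy data frame. The helper `normalise_evtype` is hypothetical, and treating the prefix "TSTM" as an abbreviation of "THUNDERSTORM" is an assumption.

```r
# Hypothetical helper: harmonise case, whitespace, and one known abbreviation.
normalise_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))   # same case, no leading/trailing spaces
  x <- gsub("\\s+", " ", x)               # collapse repeated whitespace
  sub("^TSTM", "THUNDERSTORM", x)         # assumption: TSTM means THUNDERSTORM
}

# Toy data with made-up counts, only to show the effect of the clean-up.
toy <- data.frame(
  EVTYPE   = c("TSTM WIND", "Thunderstorm  Wind", " TORNADO"),
  INJURIES = c(2, 3, 1)
)
toy$EVTYPE <- normalise_evtype(toy$EVTYPE)

# After normalisation the two wind labels fall into a single group.
totals <- aggregate(INJURIES ~ EVTYPE, data = toy, FUN = sum)
print(totals)
```

Whether labels such as HEAT and EXCESSIVE HEAT should also be merged is a judgment call best checked against the National Weather Service documentation linked above.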

plotFatalities <- ggplot(data = stormData[stormData$EVTYPE=="TORNADO",], aes(x = BGN_DATE, y = FATALITIES)) +
  geom_point() +
  ggtitle("Tornado fatalities through time") +
  xlab("Year") +
  ylab("Fatalities")

plotInjuries <- ggplot(data = stormData[stormData$EVTYPE=="TORNADO",], aes(x = BGN_DATE, y = INJURIES)) +
  geom_point() +
  ggtitle("Tornado injuries through time") +
  xlab("Year") +
  ylab("Injuries")

grid.arrange(plotFatalities, plotInjuries, ncol=2)

2. Across the United States, which types of events have the greatest economic consequences?

Two variables indicate the economic consequences of each event type: PROPDMG (property damage) and CROPDMG (crop damage). Both are summed per event type:

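# Total property and crop damage per event type.
# Note: these are sums of the raw PROPDMG / CROPDMG values as stored in the data.
stormDataDmg <- stormData %>%
  group_by(EVTYPE) %>%
  summarise(TOT_PROPDMG = sum(PROPDMG, na.rm = TRUE),
            TOT_CROPDMG = sum(CROPDMG, na.rm = TRUE))

# Rank event types by total property damage and show the top six
stormDataDmg <- stormDataDmg[order(stormDataDmg$TOT_PROPDMG, decreasing = TRUE),]
head(stormDataDmg[,c(1,2)])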
## Source: local data frame [6 x 2]
## 
##              EVTYPE TOT_PROPDMG
##              (fctr)       (dbl)
## 1           TORNADO   3212258.2
## 2       FLASH FLOOD   1420124.6
## 3         TSTM WIND   1335965.6
## 4             FLOOD    899938.5
## 5 THUNDERSTORM WIND    876844.2
## 6              HAIL    688693.4

Tornadoes cause the most property damage across the US, followed by flash floods and thunderstorm wind.

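# Re-rank event types by total crop damage.
# Note: c(1,2) displays TOT_PROPDMG; c(1,3) would display TOT_CROPDMG instead.
stormDataDmg <- stormDataDmg[order(stormDataDmg$TOT_CROPDMG, decreasing = TRUE),]
head(stormDataDmg[,c(1,2)])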
## Source: local data frame [6 x 2]
## 
##              EVTYPE TOT_PROPDMG
##              (fctr)       (dbl)
## 1              HAIL    688693.4
## 2       FLASH FLOOD   1420124.6
## 3             FLOOD    899938.5
## 4         TSTM WIND   1335965.6
## 5           TORNADO   3212258.2
## 6 THUNDERSTORM WIND    876844.2

Ranked by total crop damage, hail is the most damaging event type, followed by flash floods and floods.
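
The totals above sum PROPDMG and CROPDMG as stored. The storm database also provides PROPDMGEXP and CROPDMGEXP columns holding magnitude codes, which this analysis does not apply. A possible extension is to scale each value by its code before summing. The sketch below is hedged accordingly: the helper `exp_multiplier` is hypothetical, the code table (H, K, M, B) is an assumption to verify against the National Weather Service documentation, any other code is left unscaled, and the toy values are made up.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumed codes: H = hundreds, K = thousands, M = millions, B = billions.
exp_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1   # blank or unrecognised codes: keep the raw value
  unname(mult)
}

# Toy rows with made-up values, only to show the scaling step.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "FLOOD"),
  PROPDMG    = c(25, 2.5, 10),
  PROPDMGEXP = c("K", "M", "")
)
toy$PROP_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

Applying such a scaling to the full data set could change the rankings, so the tables in this section should be read as rankings of the unscaled values.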

plotProperty <- ggplot(data = stormData[stormData$EVTYPE=="TORNADO" & stormData$BGN_DATE >= "1990-01-01", ], aes(x = BGN_DATE, y = PROPDMG)) +
  geom_point() +
  ggtitle("Tornado damage ($) through time") +
  xlab("Year") +
  ylab("Damage ($)")

plotCrop <- ggplot(data = stormData[stormData$EVTYPE=="HAIL" & stormData$BGN_DATE >= "1990-01-01", ], aes(x = BGN_DATE, y = CROPDMG)) +
  geom_point() +
  ggtitle("Hail damage ($) through time") +
  xlab("Year") +
  ylab("Damage ($)")

grid.arrange(plotProperty, plotCrop, ncol=2)

Conclusion

This analysis shows that tornadoes are the most harmful event type in terms of both economic consequences and population health. Hail accounts for a significant share of the economic damage, especially to crops, and should also be taken seriously. Further analyses can build on these results by reproducing this one.