Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This study addresses two questions: which types of events are most harmful to population health, and which types of events have the greatest economic consequences.
To answer these questions, this analysis uses the public storm database from NOAA (the National Oceanic and Atmospheric Administration). The database records characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
The data used here were downloaded from the Coursera Reproducible Research course. The file is a comma-separated-value (CSV) file compressed with the bzip2 algorithm to reduce its size.
Storm Data https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
Documentation from National Weather Service https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf
library(dplyr)
library(ggplot2)
library(gridExtra)
stormData <- read.csv(file = "repdata-data-StormData.csv")
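Because the download is bzip2-compressed, the decompression step can also be folded into the read. The following is a minimal sketch, not the code used for this report: `load_storm_data` and its default `path` are hypothetical names, and it assumes the Storm Data URL listed above is still reachable.

```r
# Hypothetical helper (not part of the original analysis).
# Downloads the archive only if it is not already on disk, then reads it.
# read.csv() can read through a bzfile() connection, so the file does not
# need to be decompressed by hand first.
load_storm_data <- function(
    path = "StormData.csv.bz2",  # assumed local file name
    url  = "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2") {
  if (!file.exists(path)) {
    download.file(url, destfile = path, mode = "wb")
  }
  read.csv(bzfile(path), stringsAsFactors = FALSE)
}

# Usage (commented out because it triggers a large download):
# stormData <- load_storm_data()
```

Reading the archive directly keeps the raw download untouched, which may make the analysis easier to reproduce from scratch.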
Converting date variables
stormData$BGN_DATE <- as.Date(stormData$BGN_DATE, "%m/%d/%Y")
stormData$END_DATE <- as.Date(stormData$END_DATE, "%m/%d/%Y")
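The conversion above can be checked on a few values before applying it to the full columns. The sketch below assumes the raw strings look like month/day/year followed by a time stamp, which should be verified against the file; the sample values are made up for illustration. `as.Date()` ignores any trailing characters after the part matched by the format, and returns `NA` for entries that do not match.

```r
# Illustrative values only; the raw format is an assumption to verify.
raw_dates <- c("4/18/1950 0:00:00", "12/5/2011 0:00:00", "")

# Trailing time text is ignored; the empty string cannot be parsed.
parsed <- as.Date(raw_dates, format = "%m/%d/%Y")

# Count entries that failed to parse, as a quick sanity check.
unparsed <- sum(is.na(parsed))
print(parsed)
print(unparsed)
```

Counting the `NA` values after conversion is a cheap way to spot rows with missing or differently formatted dates.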
1. Across the United States, which types of events are most harmful with respect to population health?
Two variables indicate how harmful each event type is to population health: FATALITIES and INJURIES.
stormDataPop <- stormData %>%
  group_by(EVTYPE) %>%
  summarise(TOT_FATALITIES = sum(FATALITIES, na.rm = TRUE),
            TOT_INJURIES = sum(INJURIES, na.rm = TRUE))
stormDataPop <- stormDataPop[order(stormDataPop$TOT_FATALITIES, decreasing = T),]
head(stormDataPop[,c(1,2)])
## Source: local data frame [6 x 2]
##
## EVTYPE TOT_FATALITIES
## (fctr) (dbl)
## 1 TORNADO 5633
## 2 EXCESSIVE HEAT 1903
## 3 FLASH FLOOD 978
## 4 HEAT 937
## 5 LIGHTNING 816
## 6 TSTM WIND 504
Tornadoes cause by far the largest number of fatalities across the US, followed by Excessive Heat and Flash Flood.
stormDataPop <- stormDataPop[order(stormDataPop$TOT_INJURIES, decreasing = T),]
head(stormDataPop[,c(1,3)])
## Source: local data frame [6 x 2]
##
## EVTYPE TOT_INJURIES
## (fctr) (dbl)
## 1 TORNADO 91346
## 2 TSTM WIND 6957
## 3 FLOOD 6789
## 4 EXCESSIVE HEAT 6525
## 5 LIGHTNING 5230
## 6 HEAT 2100
As with fatalities, tornadoes cause the largest number of injuries, followed this time by Thunderstorm Wind (TSTM WIND) and Flood.
plotFatalities <- ggplot(data = stormData[stormData$EVTYPE=="TORNADO",], aes(x = BGN_DATE, y = FATALITIES)) +
  geom_point() +
  ggtitle("Tornado fatalities through time") +
  xlab("Year") +
  ylab("Fatalities")
plotInjuries <- ggplot(data = stormData[stormData$EVTYPE=="TORNADO",], aes(x = BGN_DATE, y = INJURIES)) +
  geom_point() +
  ggtitle("Tornado injuries through time") +
  xlab("Year") +
  ylab("Injuries")
grid.arrange(plotFatalities, plotInjuries, ncol=2)
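The plots above draw one point per recorded event. If a per-year trend is of interest, the events can be summed by year first. The following base R sketch uses a small made-up data frame (`tornado`) as a stand-in for the tornado rows of `stormData`; the `YEAR` column and the toy values are assumptions for illustration only.

```r
# Toy stand-in for the tornado rows of stormData (values are illustrative).
tornado <- data.frame(
  BGN_DATE   = as.Date(c("1950-04-18", "1950-06-08", "1951-05-01")),
  FATALITIES = c(1, 2, 4)
)

# Extract the calendar year from the date, then sum fatalities per year.
tornado$YEAR <- as.integer(format(tornado$BGN_DATE, "%Y"))
yearly <- aggregate(FATALITIES ~ YEAR, data = tornado, FUN = sum)
print(yearly)
```

The resulting `yearly` table has one row per year and could be passed to `ggplot()` with `geom_line()` or `geom_col()` in place of the per-event scatter.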
2. Across the United States, which types of events have the greatest economic consequences?
Two variables indicate the economic consequences of each event type: PROPDMG (property damage) and CROPDMG (crop damage).
stormDataDmg <- stormData %>%
  group_by(EVTYPE) %>%
  summarise(TOT_PROPDMG = sum(PROPDMG, na.rm = TRUE),
            TOT_CROPDMG = sum(CROPDMG, na.rm = TRUE))
stormDataDmg <- stormDataDmg[order(stormDataDmg$TOT_PROPDMG, decreasing = T),]
head(stormDataDmg[,c(1,2)])
## Source: local data frame [6 x 2]
##
## EVTYPE TOT_PROPDMG
## (fctr) (dbl)
## 1 TORNADO 3212258.2
## 2 FLASH FLOOD 1420124.6
## 3 TSTM WIND 1335965.6
## 4 FLOOD 899938.5
## 5 THUNDERSTORM WIND 876844.2
## 6 HAIL 688693.4
Tornadoes cause the greatest property damage across the US, followed by Flash Flood and Thunderstorm Wind (TSTM WIND).
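The property damage listing shows TSTM WIND and THUNDERSTORM WIND as separate rows, which appear to be two spellings of the same event type. Assuming they are, the labels could be normalized before grouping. This is a minimal sketch on made-up data, not part of the original analysis; `normalize_evtype` is a hypothetical helper, and merging labels this way would change the totals and possibly the ranking.

```r
# Hypothetical helper: tidy case and whitespace, then map one known
# abbreviation to its long form. Further mappings would be added as needed.
normalize_evtype <- function(evtype) {
  x <- toupper(trimws(as.character(evtype)))
  x <- gsub("\\s+", " ", x)                    # collapse repeated spaces
  x[x == "TSTM WIND"] <- "THUNDERSTORM WIND"   # assumed to be the same event
  x
}

# Toy data (illustrative values only).
events <- data.frame(
  EVTYPE  = c("TSTM WIND", "Thunderstorm  Wind ", "TORNADO"),
  PROPDMG = c(10, 5, 20)
)
events$EVTYPE <- normalize_evtype(events$EVTYPE)
totals <- aggregate(PROPDMG ~ EVTYPE, data = events, FUN = sum)
print(totals)
```

With the two spellings merged, the toy thunderstorm rows are summed into a single total instead of being split across two rows.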
stormDataDmg <- stormDataDmg[order(stormDataDmg$TOT_CROPDMG, decreasing = T),]
head(stormDataDmg[,c(1,2)])
## Source: local data frame [6 x 2]
##
## EVTYPE TOT_PROPDMG
## (fctr) (dbl)
## 1 HAIL 688693.4
## 2 FLASH FLOOD 1420124.6
## 3 FLOOD 899938.5
## 4 TSTM WIND 1335965.6
## 5 TORNADO 3212258.2
## 6 THUNDERSTORM WIND 876844.2
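For crop damage, the most costly event is Hail, followed by Flash Flood and Flood. The listing above is ordered by TOT_CROPDMG but prints the TOT_PROPDMG column, because the head() call selects columns c(1,2); selecting c(1,3) would print the crop totals instead.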
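The totals in this section sum PROPDMG and CROPDMG as stored. In the NOAA file each of these columns is accompanied by a magnitude column (PROPDMGEXP and CROPDMGEXP) whose codes include K, M, and B for thousands, millions, and billions of dollars. The sketch below shows one way the raw values could be scaled to dollars before summing. It is not part of the original analysis: `exp_multiplier` is a hypothetical helper, the toy values are made up, and treating blank or unrecognized codes as a multiplier of 1 is an assumption. Applying such a scaling could change the rankings reported here.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  m <- lookup[toupper(as.character(code))]
  m[is.na(m)] <- 1   # assumption: blank or unknown codes mean "no scaling"
  unname(m)
}

# Toy data (illustrative values only).
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", "")
)
toy$PROP_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

The same helper could be applied to CROPDMG with CROPDMGEXP, after which the grouped sums would be expressed in dollars rather than in mixed units.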
plotProperty <- ggplot(data = stormData[stormData$EVTYPE=="TORNADO" & stormData$BGN_DATE >= as.Date("1990-01-01"), ], aes(x = BGN_DATE, y = PROPDMG)) +
  geom_point() +
  ggtitle("Tornado damage ($) through time") +
  xlab("Year") +
  ylab("Damage ($)")
plotCrop <- ggplot(data = stormData[stormData$EVTYPE=="HAIL" & stormData$BGN_DATE >= as.Date("1990-01-01"), ], aes(x = BGN_DATE, y = CROPDMG)) +
  geom_point() +
  ggtitle("Hail damage ($) through time") +
  xlab("Year") +
  ylab("Damage ($)")
grid.arrange(plotProperty, plotCrop, ncol=2)
This analysis shows that tornadoes are the most damaging event type in terms of both economic consequences and harm to population health. Hail also causes a significant amount of economic damage, particularly to crops, and deserves careful attention as well. Further analyses can build on these results by reproducing and extending this one.