This report identifies the event types with the highest average deaths and injuries, as well as those with the highest average property and crop damage.
The storm dataset is large, but concentrating on the top five event types for each average metric makes patterns visible. The top-five lists often overlap across metrics.
The analysis showed that the deadliest events were all ones that occur regularly, including Tornadoes, Cold and Snow, and Extreme Heat. The top injury-producing events were Heat Wave, a specific tropical storm, and Wild Fires. Because tropical storms are less common, municipalities should focus their efforts on the more frequent events.
The top property damage events all involve water damage and include Coastal Erosion, Heavy Rain and Flooding. The top crop damage events were drought-related and included Dust Storms, Winds and Forest Fires. Municipalities looking to reduce damage costs should note that all of these events are predicted to increase with climate change, and should start preparing now.
Storms and other weather events happen all the time and mostly produce small amounts of property damage and often few deaths or injuries. However, several times each year there are severe events that can cause catastrophic damage and destruction. Preventing these results is a key concern of many municipalities.
Download data into working directory.
StormDataFile <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
StormDataDoc <- "https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf"
StormDataFAQ <- "https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf"
download.file(StormDataFile, "FStormData.csv.bz2")
trying URL 'https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2'
Content type 'application/bzip2' length 49177144 bytes (46.9 MB)
downloaded 46.9 MB
AccessDate <- Sys.Date()
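Because the archive is roughly 47 MB, it can be convenient to skip the download when the file is already in the working directory. The following is a minimal sketch and not part of the original analysis; the helper name `fetch_once` and its `downloader` argument are hypothetical and exist only for illustration.

```r
# Hypothetical helper (not in the original report): download a file only
# if it is not already present at the destination path.
fetch_once <- function(url, dest, downloader = download.file) {
  if (!file.exists(dest)) {
    # mode = "wb" writes the compressed archive in binary mode
    downloader(url, dest, mode = "wb")
  }
  invisible(dest)
}

# Intended usage with the names defined in the report (left commented out
# so that this sketch does not trigger the download by itself):
# fetch_once(StormDataFile, "FStormData.csv.bz2")
```

The `downloader` argument defaults to `download.file` and is exposed only so the helper can be exercised without network access.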
Load data into R.
FstormData <- read.csv("FStormData.csv.bz2", stringsAsFactors = FALSE)
Load appropriate libraries.
#library(lubridate)
library(dplyr)
library(knitr)
library(stringr)
library(ggplot2)
library(gridExtra)
The following R code builds summary tables of the event types with the highest average fatalities and injuries, then plots them.
# Average fatalities and injuries per event type
HumDam <- FstormData %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP) %>%
  group_by(EVTYPE) %>%
  summarise(avgF = mean(FATALITIES), avgI = mean(INJURIES))
# Top five event types by average fatalities (top_n() keeps ties)
HumFatal <- HumDam %>%
  top_n(5, avgF) %>%
  mutate(label = "TopFatalities") %>%
  arrange(desc(avgF))
# Top five event types by average injuries
HumInj <- HumDam %>%
  top_n(5, avgI) %>%
  mutate(label = "TopInjuries") %>%
  arrange(desc(avgI))
# Combine both top-five tables so each plot shows the same event types
HumTotal <- rbind(HumFatal, HumInj)
p2 <- qplot(EVTYPE, avgI, data = HumTotal, color = label, ylab = "Average Number of Injuries") +
  ggtitle("Highest Average Injuries") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p1 <- qplot(EVTYPE, avgF, data = HumTotal, color = label, ylab = "Average Number of Fatalities") +
  ggtitle("Highest Average Fatalities") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
grid.arrange(p1, p2, nrow = 1)
As the plots show, there is considerable cross-over between the events with the highest average fatalities and those with the highest average injuries. Thus, it is worthwhile to concentrate on both of these metrics.
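The cross-over can also be checked directly by intersecting the two top lists instead of reading it off the plots. The sketch below uses a small made-up data frame in place of FstormData and takes the top two event types per metric rather than five; the names `toy`, `avg`, `top_fatal`, `top_injury` and `both` are illustrative and do not appear in the original analysis.

```r
library(dplyr)

# Made-up toy data standing in for FstormData (values are illustrative only)
toy <- data.frame(
  EVTYPE     = c("A", "A", "B", "B", "C", "C", "D", "D"),
  FATALITIES = c(4, 2, 0, 1, 5, 5, 0, 0),
  INJURIES   = c(10, 20, 1, 1, 0, 2, 30, 40),
  stringsAsFactors = FALSE
)

# Average fatalities and injuries per event type, as in the report
avg <- toy %>%
  group_by(EVTYPE) %>%
  summarise(avgF = mean(FATALITIES), avgI = mean(INJURIES))

# Top two event types per metric (the report uses five)
top_fatal  <- avg %>% top_n(2, avgF) %>% pull(EVTYPE)
top_injury <- avg %>% top_n(2, avgI) %>% pull(EVTYPE)

# Event types that appear in both lists
both <- intersect(top_fatal, top_injury)
print(both)
```

Applied to the real tables, the same idea would be `intersect(HumFatal$EVTYPE, HumInj$EVTYPE)`.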
The following R code builds summary tables of the event types with the highest average property and crop damage, then plots them.
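# Average the recorded damage figures per event type.
# Note: PROPDMG and CROPDMG are averaged as recorded; the PROPDMGEXP and
# CROPDMGEXP magnitude codes are not applied here.
PropDam <- FstormData %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP) %>%
  group_by(EVTYPE) %>%
  summarise(avgP = mean(PROPDMG), avgC = mean(CROPDMG))
# Top five event types by average property damage (top_n() keeps ties)
PD1 <- PropDam %>%
  top_n(5, avgP) %>%
  mutate(label = "TopProperty") %>%
  arrange(desc(avgP))
# Top five event types by average crop damage
PD2 <- PropDam %>%
  top_n(5, avgC) %>%
  mutate(label = "TopCrop") %>%
  arrange(desc(avgC))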
PropTotal <- rbind(PD1, PD2)
# The y axes show the averaged PROPDMG and CROPDMG values as recorded,
# without the PROPDMGEXP / CROPDMGEXP multipliers applied
p3 <- qplot(EVTYPE, avgP, data = PropTotal, color = label, ylab = "Average PROPDMG (as recorded)") +
  ggtitle("Highest Average Property Damage") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p4 <- qplot(EVTYPE, avgC, data = PropTotal, color = label, ylab = "Average CROPDMG (as recorded)") +
  ggtitle("Highest Average Crop Damage") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
grid.arrange(p3, p4, nrow = 1)
Unlike the previous analysis, there is little relationship between the events with high average property damage and those with high average crop damage. Additionally, a large number of events tie for fifth place in average property damage, which is unusual; because top_n() keeps all tied rows, the property list contains more than five events. Thus, a municipality must consider the location of its land when deciding whether to mitigate property damage or crop damage, as it does not appear that both can be addressed by targeting the same events.
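One caveat applies to the damage averages. In the NOAA storm data, PROPDMG and CROPDMG hold the numeric part of each estimate, while PROPDMGEXP and CROPDMGEXP hold a magnitude code (K for thousands, M for millions, B for billions). The averages above are computed on the recorded values alone. The sketch below shows one possible way to scale a damage figure before averaging; it is not the method used in this report. The helper `to_dollars` and the `PropDollars` column are hypothetical, the rows are made up, and treating every code other than K, M or B as a multiplier of 1 is an assumption that may not suit every analysis.

```r
# Hypothetical helper: scale a damage figure by its magnitude code.
# Assumption: only K, M and B (case-insensitive) are interpreted; any other
# code, including a blank, is treated as a multiplier of 1.
to_dollars <- function(value, exp_code) {
  multiplier <- c(K = 1e3, M = 1e6, B = 1e9)[toupper(exp_code)]
  multiplier[is.na(multiplier)] <- 1
  value * unname(multiplier)
}

# Made-up rows for illustration only
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)
toy$PropDollars <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP)
print(toy)
```

With a scaled column in place, `mean(PropDollars)` could replace `mean(PROPDMG)` in the summarise step, and the same helper could be applied to CROPDMG and CROPDMGEXP.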