Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This project explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database.
This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
library("data.table")
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(url, "StormData.csv.bz2")
library(R.utils)
bunzip2("StormData.csv.bz2", "StormData.csv")
storm_data1 <- read.csv("StormData.csv")
library(dplyr)
storm_data2 <- select(storm_data1, EVTYPE, FATALITIES, INJURIES)
storm_data3 <- select(storm_data1, EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
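The download and decompression steps above run again each time the report is rebuilt. A minimal sketch of a cached variant follows. The helper name `fetch_once` is hypothetical and not part of the original analysis. It also relies on `read.csv` being able to read a bzip2-compressed file directly, which would make the separate `bunzip2` step optional.

```r
# Hypothetical helper (not part of the original analysis): download a file
# only when no local copy exists, then return the local path.
fetch_once <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps binary files intact on Windows
    download.file(url, dest, mode = "wb")
  }
  dest
}

# Usage, left commented out so the sketch does not trigger a large download:
# path <- fetch_once(url, "StormData.csv.bz2")
# storm_data1 <- read.csv(path)  # compressed input is detected on read
```

Skipping the download when the file is already present keeps repeated runs fast, at the cost of never refreshing a stale local copy.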
This subset of data looks at the top five weather events resulting in injuries and fatalities. A direct fatality or injury is defined as a fatality or injury directly attributable to the hydro-meteorological event itself, or impact by airborne/falling/moving debris, i.e., missiles generated by wind, water, ice, lightning, tornado, etc. Indirect injuries may be entered into a field within the Storm Data software, but they will not be tallied in official Storm Data statistics.
library(dplyr)
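# Combine injuries and fatalities into one casualty count per record
casualties <- storm_data2 %>% mutate(total_casualties = INJURIES + FATALITIES)
# Sum the combined casualty count for each event type
harmful <- casualties %>% group_by(EVTYPE) %>% summarise(total_harm = sum(total_casualties))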
arrange(harmful, desc(total_harm))
## # A tibble: 985 x 2
## EVTYPE total_harm
## <fct> <dbl>
## 1 TORNADO 96979
## 2 EXCESSIVE HEAT 8428
## 3 TSTM WIND 7461
## 4 FLOOD 7259
## 5 LIGHTNING 6046
## 6 HEAT 3037
## 7 FLASH FLOOD 2755
## 8 ICE STORM 2064
## 9 THUNDERSTORM WIND 1621
## 10 WINTER STORM 1527
## # ... with 975 more rows
topfive <- head(arrange(harmful,desc(total_harm)), 5)
head(topfive)
## # A tibble: 5 x 2
## EVTYPE total_harm
## <fct> <dbl>
## 1 TORNADO 96979
## 2 EXCESSIVE HEAT 8428
## 3 TSTM WIND 7461
## 4 FLOOD 7259
## 5 LIGHTNING 6046
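The ranking above lists `TSTM WIND` and `THUNDERSTORM WIND` as separate rows, although both appear to describe the same kind of event, so some totals may be split across spellings. A minimal sketch of normalising labels before grouping follows, using made-up toy data. The helper name `normalize_evtype` and its single abbreviation rule are assumptions for illustration, not the cleaning used in this analysis.

```r
# Hypothetical helper: upper-case, trim, collapse whitespace, expand "TSTM".
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- gsub("\\s+", " ", x, perl = TRUE)
  gsub("\\bTSTM\\b", "THUNDERSTORM", x, perl = TRUE)
}

# Toy data, not taken from the NOAA file.
toy <- data.frame(
  EVTYPE     = c("TSTM WIND", "Thunderstorm Wind", " TORNADO", "TORNADO"),
  FATALITIES = c(1, 2, 3, 4),
  INJURIES   = c(10, 20, 30, 40)
)
toy$EVTYPE <- normalize_evtype(toy$EVTYPE)
toy$total_harm <- toy$FATALITIES + toy$INJURIES

# Base R equivalent of group_by(EVTYPE) followed by summarise(sum(total_harm)).
harm_by_event <- aggregate(total_harm ~ EVTYPE, data = toy, FUN = sum)
harm_by_event[order(-harm_by_event$total_harm), ]
```

Base R is used here only so the sketch runs without extra packages. The same normalisation could be applied inside a `mutate()` call before `group_by(EVTYPE)`.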
### Figure 1. Bar graph of total injuries and fatalities by weather event
par(mar=c(4,4,4,4), bg="gray100", family="HersheySans", lwd=2)
figure1 <- barplot(height=topfive$total_harm, xlab="Weather Event", names.arg=c("Tornado", "Excessive Heat","TSTM Wind", "Flood", "Lightning"), ylab="Injuries and Fatalities (count)", ylim=c(0,100000), main="Weather Events Causing the Most Injuries and Fatalities", col=c("red4", "red3", "red2", "red1","red"), axisnames=TRUE)
text(figure1, 8500, topfive$total_harm,cex=1, pos=3)
This subset of data looks at the top five weather events causing monetary damage to property and crops. Property damage typically refers to damage inflicted on private property (structures, objects, vegetation) as well as public infrastructure and facilities. Crop damage refers to damage from freezing temperatures, hail, wind, and other weather events that result in damaged crops. Property and crop damage figures should be treated as broad estimates.
library(dplyr)
character <- sort(unique(as.character(storm_data3$PROPDMGEXP)))
exp <- c(0,0,0,1,10,10,10,10,10,10,10,10,10,10^9,10^2,10^2,10^3,10^6,10^6)
multiplier <- data.frame(character, exp)
storm_data3$property<- multiplier$exp[match(storm_data3$PROPDMGEXP, multiplier$character)]
storm_data3$crop <- multiplier$exp[match(storm_data3$CROPDMGEXP, multiplier$character)]
storm_data3 <- storm_data3 %>% mutate(PROPDMG = PROPDMG*property) %>% mutate(CROPDMG = CROPDMG*crop) %>% mutate(total_damage = PROPDMG+CROPDMG)
damage <- storm_data3 %>% group_by(EVTYPE) %>% summarize(damage_by_event = sum(total_damage))
# Dividing by 1e6 converts dollars to millions of dollars
millions <- select(damage, damage_by_event, EVTYPE) %>% mutate(damage_millions = damage_by_event/1e6)
topfive_costs <- head(arrange(millions,desc(damage_millions)), 5)
head(topfive_costs)
## # A tibble: 5 x 3
## damage_by_event EVTYPE damage_millions
## <dbl> <fct> <dbl>
## 1 150319678250 FLOOD 150320.
## 2 71913712800 HURRICANE/TYPHOON 71914.
## 3 57352117607 TORNADO 57352.
## 4 43323541000 STORM SURGE 43324.
## 5 17562132111 FLASH FLOOD 17562.
### Figure 2. Top five events causing monetary damages (US$) to property and crops
par(mar=c(4,4,4,4), bg="gray100", family="HersheySans", lwd=2)
figure2 <- barplot(height=topfive_costs$damage_millions, xlab="Weather Event", names.arg=c("Flood", "Hurricane/Typhoon","Tornado", "Storm Surge", "Flash Flood"), axisnames=TRUE, ylab="Economic Costs (Millions of US$)", ylim=c(0,200000), main="Economic Losses (in Millions of US$) to Property and Crops", col=c("green4", "green3", "green2", "green1","green"))
# Label the bars using the midpoints returned by this plot (figure2)
text(figure2, 0, round(topfive_costs$damage_millions),cex=1, pos=3)
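The multiplier table built earlier pairs `sort(unique(PROPDMGEXP))` with a hand-ordered vector, so it depends on the locale's sort order. Any symbol that occurs only in `CROPDMGEXP` (a lowercase `k`, for example, if the column contains one) gets no match, and the resulting `NA` propagates through `sum()`. A minimal sketch of an explicit, case-insensitive lookup keyed by symbol follows. The function name `exp_to_multiplier` is hypothetical, the toy data is made up, and the mapping simply mirrors the values used above rather than any official NOAA definition.

```r
# Hypothetical helper: map a damage-exponent symbol to a numeric multiplier.
exp_to_multiplier <- function(symbol) {
  lookup <- c("+" = 1, "H" = 1e2, "K" = 1e3, "M" = 1e6, "B" = 1e9)
  lookup[as.character(0:8)] <- 10   # digits map to 10, as in the table above
  sym <- toupper(as.character(symbol))
  out <- unname(lookup[sym])
  out[is.na(out)] <- 0              # "", "-", "?" and unknown symbols count as 0
  out
}

# Toy data, not taken from the NOAA file.
toy <- data.frame(
  PROPDMG = c(25, 2.5, 1, 5), PROPDMGEXP = c("K", "M", "", "B"),
  CROPDMG = c(10, 0, 3, 0),   CROPDMGEXP = c("k", "", "M", "?")
)
toy$total_damage <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP) +
  toy$CROPDMG * exp_to_multiplier(toy$CROPDMGEXP)
toy$total_damage
```

Because the lookup is keyed by name, it does not depend on sort order. Unrecognised symbols fall back to 0 instead of `NA`, so a single unmatched row cannot blank out an event type's total.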
The five weather events causing the greatest economic costs through damage to property and crops were floods, hurricanes/typhoons, tornadoes, storm surges, and flash floods.
Overall, the weather events with the greatest impact on population health, measured as combined injuries and fatalities, were tornadoes (96,979) and excessive heat (8,428). In economic terms, flooding (US$150,319,678,250) and hurricanes/typhoons (US$71,913,712,800) caused the most damage to property and crops.