Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
The purpose of this report is to explore severe weather events in the NOAA Storm Database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
The data for this report come as a comma-separated-value (CSV) file, compressed with the bzip2 algorithm to reduce its size, which can be downloaded here:
There is also some documentation of the database available. Here you will find information on how the variables are constructed and defined.
The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
Population health is operationalized as counts of fatalities (FATALITIES) and injuries (INJURIES), summed by event type (EVTYPE). Economic consequences are operationalized as the total amount of property damage (PROPDMG), also summed by event type. Additionally, because the most recent years are considered more complete, only events occurring in 2008 or later are included in this analysis. The following lines of code read in the raw data and summarize fatalities, injuries, and property damage by event type.
storm<-read.csv(bzfile("C:/Users/Dell - User/Downloads/repdata-data-StormData.csv.bz2"), header = TRUE)
storm$cofips<-paste(formatC(storm$STATE__,width=2,flag="0"),formatC(storm$COUNTY,width=3,flag="0"),sep="")
Originally, I was thinking of creating a map using the leaflet library, but decided against it because I could only map one event at a time.
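The county identifier built above concatenates a zero-padded two-digit state code with a zero-padded three-digit county code. As a small illustration of that padding, the sketch below applies the same `formatC()` calls to made-up state and county numbers (hypothetical values, not rows from the database).

```r
# Hypothetical state and county codes, invented for illustration only.
state  <- c(1, 6, 48)
county <- c(73, 1, 201)

# Pad the state code to 2 characters and the county code to 3, then concatenate.
cofips <- paste(formatC(state, width = 2, flag = "0"),
                formatC(county, width = 3, flag = "0"),
                sep = "")
print(cofips)
```

Each result is a five-character string, so codes such as state 1 and county 73 keep their leading zeros instead of collapsing to a shorter number.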
storm$year<-as.numeric(format(as.Date(storm$BGN_DATE,format="%m/%d/%Y %H:%M:%S"),"%Y"))
myvars<-c("FATALITIES","INJURIES","PROPDMG","EVTYPE")
storm.sub<-subset(storm,year>=2008,select=myvars)
library(dplyr)
by_all<-group_by(storm.sub,EVTYPE)
# summarize_each() and funs() are deprecated in current dplyr; across() replaces them
storm.agg<-summarise(by_all,across(everything(),sum))
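To make the year filter and the per-event totals easier to follow, here is a minimal sketch on a toy data frame. The rows and the names `toy`, `toy.sub`, and `toy.agg` are invented for illustration, and base R's `aggregate()` stands in for the dplyr calls so that the example runs without extra packages.

```r
# Toy stand-in for the storm data; all values are invented for illustration.
toy <- data.frame(
  BGN_DATE   = c("4/18/1950 0:00:00", "5/22/2008 0:00:00",
                 "4/27/2011 0:00:00", "7/4/2010 0:00:00"),
  EVTYPE     = c("TORNADO", "TORNADO", "TORNADO", "HEAT"),
  FATALITIES = c(1, 2, 3, 4),
  INJURIES   = c(10, 20, 30, 40),
  PROPDMG    = c(5, 6, 7, 8),
  stringsAsFactors = FALSE
)

# Extract the four-digit year from the begin date, as in the report.
toy$year <- as.numeric(format(as.Date(toy$BGN_DATE, format = "%m/%d/%Y %H:%M:%S"), "%Y"))

# Keep events from 2008 onward and only the columns needed for the summary.
toy.sub <- subset(toy, year >= 2008,
                  select = c("FATALITIES", "INJURIES", "PROPDMG", "EVTYPE"))

# Sum each measure within event type (base R equivalent of group_by + summarise).
toy.agg <- aggregate(cbind(FATALITIES, INJURIES, PROPDMG) ~ EVTYPE,
                     data = toy.sub, FUN = sum)
print(toy.agg)
```

The 1950 row is dropped by the year filter, so the TORNADO totals come only from the 2008 and 2011 rows.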
The goal of this section is to summarize fatalities and injuries in order to identify the weather events most harmful to population health. The premise is that events producing many fatalities and injuries are more harmful to population health than events producing few. Similarly, events causing more property damage are considered to have greater economic consequences than events causing little property damage.
To answer this question, the total numbers of fatalities and injuries recorded between 2008 and 2011 were summarized in bar plots. For ease of display, only the top seven events for each measure are shown.
par(mfrow=c(2,1))
sort1<-storm.agg[order(-storm.agg$FATALITIES),]
barplot(sort1$FATALITIES[1:7], main="Top Seven Most Fatal Weather Events in the U.S., 2008-2011",xlab="Weather Event",ylab="Counts of Fatalities",names.arg=c("TORNADO","FLASH FLOOD","RIP CURRENT","HEAT","FLOOD","LIGHTNING","THUNDERSTORM WIND"),col="#9ebcda",cex.names=0.8)
sort2<-storm.agg[order(-storm.agg$INJURIES),]
barplot(sort2$INJURIES[1:7], main="Top Seven Weather Events with the most Injuries in the U.S., 2008-2011",xlab="Weather Event",ylab="Counts of Injuries",names.arg=c("TORNADO","THUNDERSTORM WIND","LIGHTNING","HEAT","EXCESSIVE HEAT","FLASH FLOOD", "WILDFIRE"),col="#9ebcda", cex.names=0.8)
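The plots above type the event labels in by hand, which relies on them matching the sorted totals. One possible alternative, sketched here on invented totals (the data frame `agg` and the cutoff `n` are hypothetical), is to take the labels from the same sorted rows as the bar heights.

```r
# Hypothetical per-event totals, invented for illustration only.
agg <- data.frame(
  EVTYPE     = c("HAIL", "TORNADO", "FLOOD", "HEAT", "LIGHTNING"),
  FATALITIES = c(0, 50, 12, 30, 8),
  stringsAsFactors = FALSE
)

# Sort by descending fatalities and keep the first n rows.
n   <- 3
top <- head(agg[order(-agg$FATALITIES), ], n)
print(top)

# Heights and labels come from the same rows, so they cannot drift apart.
# The plot is only drawn in an interactive session.
if (interactive()) {
  barplot(top$FATALITIES, names.arg = top$EVTYPE,
          col = "#9ebcda", cex.names = 0.8)
}
```

With this approach, changing the year cutoff or the number of events shown would update the labels automatically.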
Tornados, Flash Floods, Heat, and Lightning appear to be among the events most harmful to population health: each ranks in the top seven for both fatalities and injuries. Between 2008 and 2011, Tornados accounted for 782 fatalities and 8,949 injuries, almost as many fatalities as the other top seven events combined and 2.5 times as many injuries as the other top seven injury events combined.
To identify the events with the greatest economic consequences, the total amount of property damage recorded between 2008 and 2011 was summarized in a bar plot. For ease of display, only the top five events are shown.
sort3<-storm.agg[order(-storm.agg$PROPDMG),]
par(mfrow=c(1,1))
options(scipen=999)
barplot(sort3$PROPDMG[1:5], main="Top Five Weather Events with the most Property Damage in the U.S., 2008-2011",xlab="$",ylab="Weather Event",names.arg=c("TSTORM WIND","TORNADO","FLASH FLOOD","FLOOD","HAIL"),horiz=TRUE,col="#9ebcda",cex.names=0.8)
Wind from Thunderstorms shows the largest property damage total, with a summed PROPDMG of around 746K between 2008 and 2011. Hail ranked fifth, with a summed PROPDMG of around 95K. These sums should not be read as dollar amounts: in the Storm Database, each PROPDMG value has to be multiplied by the magnitude code in PROPDMGEXP (K for thousands, M for millions, B for billions), which the code above does not do, so the ranking should be treated as provisional. Tornados, Flash Flood, and Flood events appear among both the top five for property damage and the top seven for fatalities or injuries, making them some of the most damaging events in the U.S.
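As a rough sketch of how the property damage figures could be put on a dollar scale, the example below multiplies PROPDMG by the magnitude coded in PROPDMGEXP before summing. The rows and the names `toy`, `mult`, and `damage_usd` are invented for illustration. Treating blank or unrecognized codes as a multiplier of 1 is an assumption, not something taken from the database documentation. Applying this to the report's data would also require adding PROPDMGEXP to the selected columns.

```r
# Toy rows with invented values; PROPDMGEXP holds the magnitude code.
toy <- data.frame(
  EVTYPE     = c("HAIL", "HAIL", "TORNADO", "FLOOD"),
  PROPDMG    = c(25, 1.5, 2, 500),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Map magnitude codes to multipliers: thousands, millions, billions.
mult  <- c(K = 1e3, M = 1e6, B = 1e9)
scale <- unname(mult[toupper(toy$PROPDMGEXP)])

# Assumption: blank or unrecognized codes are treated as a multiplier of 1.
scale[is.na(scale)] <- 1

# Damage in dollars, then totals by event type.
toy$damage_usd <- toy$PROPDMG * scale
toy.dmg <- aggregate(damage_usd ~ EVTYPE, data = toy, FUN = sum)
print(toy.dmg[order(-toy.dmg$damage_usd), ])
```

In this toy data the single TORNADO row coded B outweighs everything else, which shows how much the ranking can change once the magnitude codes are applied.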