The Effects of Severe Weather Events on Public Health and the Economy in the United States

Synopsis

This report examines the impact of severe weather events on public health and the economy in the United States (US). Storm data from 1996 to 2011 provided by the National Oceanic and Atmospheric Administration (NOAA) was analyzed to identify the weather events with the most significant impact. The criteria used were: total number of injuries, total number of fatalities, total value of property damage and total value of crop damage. This report finds that Tornadoes, Excessive Heat and Floods caused the most casualties, while Floods, Hurricanes and Storm Surges had the greatest economic consequences.

Required Packages

library(R.utils)
library(dplyr)
library(ggplot2)
library(grid)
library(gridExtra)

Data Processing

First, the NOAA storm data was read into R. If the CSV file does not already exist, the BZIP2 file is downloaded (unless it is already present) and then unzipped.

if (!file.exists("repdata-data-StormData.csv")) {
    # Download the archive only if it is not already on disk
    if (!file.exists("repdata-data-StormData.csv.bz2")) {
        download.file("http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "repdata-data-StormData.csv.bz2")
    }
    # bunzip2() is provided by the R.utils package
    bunzip2("repdata-data-StormData.csv.bz2", overwrite = TRUE)
}

data <- read.csv("repdata-data-StormData.csv", stringsAsFactors = FALSE)

Second, to facilitate data analysis, the variable names were converted to lower case.

names(data) <- tolower(names(data))

Third, data entries before 1996 were removed from the dataset. This is because in Jan 1996, the NOAA began using the Paradox relational database for storing data. This was substantially more comprehensive than before; from Jan 1950 to Dec 1995, data was stored in unformatted text files.

data$bgn_date <- sub(" .*", "", data$bgn_date)
data$bgn_date <- as.Date(data$bgn_date, "%m/%d/%Y")
data <- subset(data, data$bgn_date >= as.Date("1996-01-01"))
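
As a small illustration of the date handling above, the sketch below applies the same three steps to a few made-up strings. The sample values are assumptions chosen to match the "month/day/year time" layout that the code expects; they are not rows taken from the dataset.

```r
# Hypothetical raw values in the "m/d/Y H:M:S" layout assumed by the code above
raw_dates <- c("4/18/1950 0:00:00", "12/31/1995 0:00:00", "1/1/1996 0:00:00")

# Drop everything from the first space onwards, leaving only the date part
date_part <- sub(" .*", "", raw_dates)

# Parse as month/day/four-digit year
parsed <- as.Date(date_part, "%m/%d/%Y")

# Keep only entries on or after 1 January 1996
kept <- parsed[parsed >= as.Date("1996-01-01")]
print(kept)
```

Only the last of the three sample dates survives the filter, since the first two fall before the 1996 cutoff.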

Value of Damage

The variables propdmgexp and cropdmgexp signify the magnitude of the damage. For example, 1.55B represents $1,550,000,000 worth of damage. These variables take multiple values, some of which have limited interpretation.

table(data$propdmgexp)
## 
##             0      B      K      M 
## 276185      1     32 369938   7374
table(data$cropdmgexp)
## 
##             B      K      M 
## 373069      4 278686   1771

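Thus, only the values "", "K", "M" and "B" were treated as meaningful. These were converted to 1, 1000, 1000000 and 1000000000 respectively. The single "0" entry in propdmgexp was left unchanged, so it acts as a multiplier of 0.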

data$propdmgexp <- gsub("^$", 1, data$propdmgexp)
data$propdmgexp <- gsub("K", 1000, data$propdmgexp)
data$propdmgexp <- gsub("M", 1000000, data$propdmgexp)
data$propdmgexp <- gsub("B", 1000000000, data$propdmgexp)
data$propdmgexp <- as.numeric(data$propdmgexp)

data$cropdmgexp <- gsub("^$", 1, data$cropdmgexp)
data$cropdmgexp <- gsub("K", 1000, data$cropdmgexp)
data$cropdmgexp <- gsub("M", 1000000, data$cropdmgexp)
data$cropdmgexp <- gsub("B", 1000000000, data$cropdmgexp)
data$cropdmgexp <- as.numeric(data$cropdmgexp)

table(data$propdmgexp)
## 
##      0      1   1000  1e+06  1e+09 
##      1 276185 369938   7374     32
table(data$cropdmgexp)
## 
##      1   1000  1e+06  1e+09 
## 373069 278686   1771      4

Next, the absolute values of damage were calculated by multiplying the three-significant-figure values (the original propdmg and cropdmg) by their respective multiplication factors (propdmgexp and cropdmgexp).

data$propdmg <- data$propdmg * data$propdmgexp
data$cropdmg <- data$cropdmg * data$cropdmgexp
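
The arithmetic can be checked on a few made-up values. The sketch below is an alternative formulation, not the code used in this report: it maps the exponent codes to multipliers with `match()` instead of chained `gsub()` calls. The names `codes`, `factors` and `to_dollars` are hypothetical, and unlike the code above, any unrecognised code (such as "0") yields `NA` rather than a multiplier of 0.

```r
# Hypothetical lookup: exponent code -> multiplier ("" means no scaling)
codes   <- c("", "K", "M", "B")
factors <- c(1, 1e3, 1e6, 1e9)

# Multiply each value by the factor matching its exponent code;
# codes that are not in the lookup give NA
to_dollars <- function(value, exp_code) {
    value * factors[match(exp_code, codes)]
}

# Made-up example values, including the 1.55B case from the text
example <- to_dollars(c(1.55, 25, 3, 1), c("B", "K", "", "0"))
print(example)
```

The first element corresponds to the 1.55B example given earlier, that is, $1,550,000,000.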

Storm Events Data

The values in the event type data (evtype) had inconsistent names. As a first step, the values were converted to lower case:

data$evtype <- tolower(data$evtype)

Second, monthly summaries were removed from the data:

data <- data[!grepl("summary", data$evtype), ]

Next, various entries were standardized, and several categories were combined based on similarity. One standardization to note was the conversion of the string “tstm” to “thunderstorm”. This was significant because there were 81403 instances of “thunderstorm wind” and 128664 instances of “tstm wind”. After standardizing and combining several categories, the “thunderstorm” category comprised 211204 observations.

Standardizations were performed starting from the longest string to the shortest string. This was to ensure that intermediate strings were not replaced prematurely.
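
To see why the ordering matters, consider the single value "tstm wind" and two of the replacement rules. This is a minimal sketch on a made-up input; the variable names are hypothetical.

```r
x <- "tstm wind"

# Shortest pattern first: "tstm" is expanded before the longer rule runs,
# so the longer pattern "tstm wind" no longer matches anything
short_first <- gsub("\\btstm\\b", "thunderstorm", x)
short_first <- gsub("\\btstm wind\\b", "thunderstorm", short_first)

# Longest pattern first: the whole phrase is collapsed in one step
long_first <- gsub("\\btstm wind\\b", "thunderstorm", x)
long_first <- gsub("\\btstm\\b", "thunderstorm", long_first)

print(short_first)  # "thunderstorm wind"
print(long_first)   # "thunderstorm"
```

Applying the shorter rule first leaves a stray "wind" behind, which is the premature replacement that the longest-first ordering avoids.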

data$evtype <- gsub("\\bbitter wind chill temperatures\\b", "bitter wind chill", data$evtype)
data$evtype <- gsub("\\bextreme windchill temperatures\\b", "extreme windchill", data$evtype)
data$evtype <- gsub("\\bcold wind chill temperatures\\b", "cold", data$evtype)
data$evtype <- gsub("\\blight snow/freezing precip\\b", "light snow", data$evtype)
data$evtype <- gsub("\\bcoastal  flooding/erosion\\b", "coastal flooding/erosion", data$evtype)
data$evtype <- gsub("\\bunseasonably warm and dry\\b", "dry", data$evtype)
data$evtype <- gsub("\\bgusty thunderstorm winds\\b", "gusty winds", data$evtype)
data$evtype <- gsub("\\bmarine thunderstorm wind\\b", "marine tstm wind", data$evtype)
data$evtype <- gsub("\\bextreme cold/wind chill\\b", "cold", data$evtype)
data$evtype <- gsub("\\bgusty thunderstorm wind\\b", "gusty winds", data$evtype)
data$evtype <- gsub("\\bthunderstorm wind \\(g40\\)\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\btstm wind and lightning\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bunseasonably cool & wet\\b", "unseasonably cold", data$evtype)
data$evtype <- gsub("\\bunseasonably warm & wet\\b", "heat", data$evtype)
data$evtype <- gsub("\\bexcessive heat/drought\\b", "heat", data$evtype)
data$evtype <- gsub("\\bnon-severe wind damage\\b", "no severe weather", data$evtype)
data$evtype <- gsub("\\bunseasonably warm year\\b", "heat", data$evtype)
data$evtype <- gsub("\\b   high surf advisory\\b", "high surf advisory", data$evtype)
data$evtype <- gsub("\\bcstl flooding/erosion\\b", "coastal flooding/erosion", data$evtype)
data$evtype <- gsub("\\bunseasonably warm/wet\\b", "heat", data$evtype)
data$evtype <- gsub("\\bunusual/record warmth\\b", "heat", data$evtype)
data$evtype <- gsub("\\burban/small strm fldg\\b", "urban/sml stream fld", data$evtype)
data$evtype <- gsub("\\burban/sml stream fldg\\b", "urban/sml stream fld", data$evtype)
data$evtype <- gsub("\\burban/street flooding\\b", "urban/sml stream fld", data$evtype)
data$evtype <- gsub("\\bheavy rain/high surf\\b", "heavy rain", data$evtype)
data$evtype <- gsub("\\bheavy surf/high surf\\b", "heavy surf", data$evtype)
data$evtype <- gsub("\\bhigh surf advisories\\b", "high surf", data$evtype)
data$evtype <- gsub("\\bice jam flood \\(minor\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\blate season snowfall\\b", "late snow", data$evtype)
data$evtype <- gsub("\\blate-season snowfall\\b", "late snow", data$evtype)
data$evtype <- gsub("\\bmild and dry pattern\\b", "dry", data$evtype)
data$evtype <- gsub("\\bsevere thunderstorms\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bagricultural freeze\\b", "frost", data$evtype)
data$evtype <- gsub("\\bfreezing rain/sleet\\b", "freezing rain", data$evtype)
data$evtype <- gsub("\\bgusty wind/hvy rain\\b", "gusty winds", data$evtype)
data$evtype <- gsub("\\bheavy precipitation\\b", "heavy rain", data$evtype)
data$evtype <- gsub("\\bheavy rain and wind\\b", "heavy rain", data$evtype)
data$evtype <- gsub("\\bheavy surf and wind\\b", "heavy surf", data$evtype)
data$evtype <- gsub("\\blight snow/flurries\\b", "light snow", data$evtype)
data$evtype <- gsub("\\brecord temperatures\\b", "record temperature", data$evtype)
data$evtype <- gsub("\\bsevere thunderstorm\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bsleet/freezing rain\\b", "sleet", data$evtype)
data$evtype <- gsub("\\bunseasonal low temp\\b", "cold", data$evtype)
data$evtype <- gsub("\\bunusually late snow\\b", "unseasonably cold", data$evtype)
data$evtype <- gsub("\\berosion/cstl flood\\b", "coastal flooding/erosion", data$evtype)
data$evtype <- gsub("\\bextreme wind chill\\b", "extreme windchill", data$evtype)
data$evtype <- gsub("\\bheavy rain effects\\b", "heavy rain", data$evtype)
data$evtype <- gsub("\\bheavy snow squalls\\b", "heavy snow", data$evtype)
data$evtype <- gsub("\\bhigh surf advisory\\b", "high surf", data$evtype)
data$evtype <- gsub("\\bmarine strong wind\\b", "marine tstm wind", data$evtype)
data$evtype <- gsub("\\bmudslide/landslide\\b", "mudslide", data$evtype)
data$evtype <- gsub("\\brecord warm temps.\\b", "heat", data$evtype)
data$evtype <- gsub("\\brecord winter snow\\b", "record snowfall", data$evtype)
data$evtype <- gsub("\\bsnow/freezing rain\\b", "snow", data$evtype)
data$evtype <- gsub("\\btemperature record\\b", "record temperature", data$evtype)
data$evtype <- gsub("\\bthundersnow shower\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bvolcanic ash plume\\b", "volcanic ash", data$evtype)
data$evtype <- gsub("\\bwinter weather mix\\b", "winter weather", data$evtype)
data$evtype <- gsub("\\bwinter weather/mix\\b", "winter weather", data$evtype)
data$evtype <- gsub("\\bcold temperatures\\b", "cold", data$evtype)
data$evtype <- gsub("\\bflash flood/flood\\b", "flash flood", data$evtype)
data$evtype <- gsub("\\bflood/flash flood\\b", "flash flood", data$evtype)
data$evtype <- gsub("\\bflood/flash/flood\\b", "flash flood", data$evtype)
data$evtype <- gsub("\\bflood/strong wind\\b", "flood", data$evtype)
data$evtype <- gsub("\\bheavy snow shower\\b", "heavy snow", data$evtype)
data$evtype <- gsub("\\bhurricane edouard\\b", "hurricane", data$evtype)
data$evtype <- gsub("\\bhurricane/typhoon\\b", "hurricane", data$evtype)
data$evtype <- gsub("\\bicestorm/blizzard\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\bsnow accumulation\\b", "snow", data$evtype)
data$evtype <- gsub("\\bsnow/blowing snow\\b", "snow", data$evtype)
data$evtype <- gsub("\\bthunderstorm wind\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bunseasonable cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\bunseasonably cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\bunseasonably cool\\b", "unseasonably cold", data$evtype)
data$evtype <- gsub("\\bunseasonably warm\\b", "heat", data$evtype)
data$evtype <- gsub("\\bvolcanic eruption\\b", "volcanic ash", data$evtype)
data$evtype <- gsub("\\b tstm wind \\(g45\\)\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bblizzard summary\\b", "blizzard", data$evtype)
data$evtype <- gsub("\\bcoastal flooding\\b", "coastal flood", data$evtype)
data$evtype <- gsub("\\bcold temperature\\b", "cold", data$evtype)
data$evtype <- gsub("\\blake effect snow\\b", "lake-effect snow", data$evtype)
data$evtype <- gsub("\\blate season snow\\b", "late snow", data$evtype)
data$evtype <- gsub("\\brecord dry month\\b", "dry", data$evtype)
data$evtype <- gsub("\\bstorm surge/tide\\b", "storm surge", data$evtype)
data$evtype <- gsub("\\bstrong wind gust\\b", "strong wind", data$evtype)
data$evtype <- gsub("\\btstm wind  \\(g45\\)\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bunseasonably dry\\b", "dry", data$evtype)
data$evtype <- gsub("\\bunseasonably hot\\b", "heat", data$evtype)
data$evtype <- gsub("\\bvolcanic ashfall\\b", "volcanic ash", data$evtype)
data$evtype <- gsub("\\bwild/forest fire\\b", "wildfire", data$evtype)
data$evtype <- gsub("\\babnormal warmth\\b", "heat", data$evtype)
data$evtype <- gsub("\\bcold/wind chill\\b", "cold", data$evtype)
data$evtype <- gsub("\\bdamaging freeze\\b", "frost", data$evtype)
data$evtype <- gsub("\\bexcessively dry\\b", "dry", data$evtype)
data$evtype <- gsub("\\bgusty wind/hail\\b", "gusty winds", data$evtype)
data$evtype <- gsub("\\bgusty wind/rain\\b", "gusty winds", data$evtype)
data$evtype <- gsub("\\bheavy rain/wind\\b", "heavy rain", data$evtype)
data$evtype <- gsub("\\bhigh wind \\(g40\\)\\b", "high wind", data$evtype)
data$evtype <- gsub("\\bnon severe hail\\b", "no severe weather", data$evtype)
data$evtype <- gsub("\\brecord may snow\\b", "record snowfall", data$evtype)
data$evtype <- gsub("\\brecord snowfall\\b", "record snow", data$evtype)
data$evtype <- gsub("\\btstm heavy rain\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\btstm wind \\(g35\\)\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\btstm wind \\(g40\\)\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\btstm wind \\(g45\\)\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bunseasonal rain\\b", "unseasonably wet", data$evtype)
data$evtype <- gsub("\\b coastal flood\\b", "coastal flood", data$evtype)
data$evtype <- gsub("\\babnormally dry\\b", "dry", data$evtype)
data$evtype <- gsub("\\bblow-out tides\\b", "blow-out tide", data$evtype)
data$evtype <- gsub("\\bcold and frost\\b", "frost", data$evtype)
data$evtype <- gsub("\\bdry conditions\\b", "dry", data$evtype)
data$evtype <- gsub("\\bdry microburst\\b", "dry", data$evtype)
data$evtype <- gsub("\\bexcessive cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\bexcessive heat\\b", "heat", data$evtype)
data$evtype <- gsub("\\bexcessive rain\\b", "excessive rainfall", data$evtype)
data$evtype <- gsub("\\bflash flooding\\b", "flash flood", data$evtype)
data$evtype <- gsub("\\bheavy rainfall\\b", "heavy rain", data$evtype)
data$evtype <- gsub("\\blight snowfall\\b", "light snow", data$evtype)
data$evtype <- gsub("\\bprolong warmth\\b", "heat", data$evtype)
data$evtype <- gsub("\\brecord dryness\\b", "dry", data$evtype)
data$evtype <- gsub("\\briver flooding\\b", "river flood", data$evtype)
data$evtype <- gsub("\\bsnow and sleet\\b", "snow", data$evtype)
data$evtype <- gsub("\\btornado debris\\b", "tornado", data$evtype)
data$evtype <- gsub("\\btstm wind \\(41\\)\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\btstm wind/hail\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bunusual warmth\\b", "heat", data$evtype)
data$evtype <- gsub("\\bunusually cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\bunusually warm\\b", "heat", data$evtype)
data$evtype <- gsub("\\burban flooding\\b", "urban/sml stream fld", data$evtype)
data$evtype <- gsub("\\bcold and snow\\b", "cold", data$evtype)
data$evtype <- gsub("\\bextended cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\bfunnel clouds\\b", "funnel cloud", data$evtype)
data$evtype <- gsub("\\bmoderate snow\\b", "moderate snowfall", data$evtype)
data$evtype <- gsub("\\bnon tstm wind\\b", "no severe weather", data$evtype)
data$evtype <- gsub("\\bnon-tstm wind\\b", "no severe weather", data$evtype)
data$evtype <- gsub("\\brecord warmth\\b", "heat", data$evtype)
data$evtype <- gsub("\\bsnow advisory\\b", "snow", data$evtype)
data$evtype <- gsub("\\bthunderstorms\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\btstm wind g45\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bwet micoburst\\b", "wet microburst", data$evtype)
data$evtype <- gsub("\\bwind advisory\\b", "wind", data$evtype)
data$evtype <- gsub("\\bwind and wave\\b", "wind", data$evtype)
data$evtype <- gsub("\\b flash flood\\b", "flash flood", data$evtype)
data$evtype <- gsub("\\bblowing snow\\b", "snow", data$evtype)
data$evtype <- gsub("\\bcoastalflood\\b", "coastal flood", data$evtype)
data$evtype <- gsub("\\bcoastalstorm\\b", "coastal storm", data$evtype)
data$evtype <- gsub("\\bcold weather\\b", "cold", data$evtype)
data$evtype <- gsub("\\bextreme cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\bfrost/freeze\\b", "frost", data$evtype)
data$evtype <- gsub("\\bhigh  swells\\b", "high surf", data$evtype)
data$evtype <- gsub("\\bmixed precip\\b", "mixed precipitation", data$evtype)
data$evtype <- gsub("\\bprolong cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\brain \\(heavy\\)\\b", "rain", data$evtype)
data$evtype <- gsub("\\brecord  cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\brip currents\\b", "rip current", data$evtype)
data$evtype <- gsub("\\bsnow and ice\\b", "snow", data$evtype)
data$evtype <- gsub("\\bsnow drought\\b", "snow", data$evtype)
data$evtype <- gsub("\\bsnow showers\\b", "snow", data$evtype)
data$evtype <- gsub("\\bsnow squalls\\b", "snow", data$evtype)
data$evtype <- gsub("\\bstrong winds\\b", "strong wind", data$evtype)
data$evtype <- gsub("\\bthunderstorm\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\btstm wind 40\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\btstm wind 45\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bwarm weather\\b", "heat", data$evtype)
data$evtype <- gsub("\\bwinter storm\\b", "winter weather", data$evtype)
data$evtype <- gsub("\\b waterspout\\b", "waterspout", data$evtype)
data$evtype <- gsub("\\bdry weather\\b", "dry", data$evtype)
data$evtype <- gsub("\\bearly frost\\b", "frost", data$evtype)
data$evtype <- gsub("\\bfirst frost\\b", "frost", data$evtype)
data$evtype <- gsub("\\bhard freeze\\b", "frost", data$evtype)
data$evtype <- gsub("\\bhigh swells\\b", "high surf", data$evtype)
data$evtype <- gsub("\\bhot and dry\\b", "dry", data$evtype)
data$evtype <- gsub("\\bhot weather\\b", "heat", data$evtype)
data$evtype <- gsub("\\bice on road\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\bice pellets\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\blate freeze\\b", "frost", data$evtype)
data$evtype <- gsub("\\brain damage\\b", "rain", data$evtype)
data$evtype <- gsub("\\brecord cold\\b", "cold", data$evtype)
data$evtype <- gsub("\\brecord cool\\b", "record cold", data$evtype)
data$evtype <- gsub("\\brecord heat\\b", "heat", data$evtype)
data$evtype <- gsub("\\brecord high\\b", "record heat", data$evtype)
data$evtype <- gsub("\\brecord warm\\b", "heat", data$evtype)
data$evtype <- gsub("\\bsleet storm\\b", "sleet", data$evtype)
data$evtype <- gsub("\\bsnow squall\\b", "snow", data$evtype)
data$evtype <- gsub("\\burban flood\\b", "urban/sml stream fld", data$evtype)
data$evtype <- gsub("\\bwaterspouts\\b", "waterspout", data$evtype)
data$evtype <- gsub("\\bwind damage\\b", "wind", data$evtype)
data$evtype <- gsub("\\bwintery mix\\b", "winter weather", data$evtype)
data$evtype <- gsub("\\b lightning\\b", "lightning", data$evtype)
data$evtype <- gsub("\\b tstm wind\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bdust devel\\b", "dust devil", data$evtype)
data$evtype <- gsub("\\bgusty wind\\b", "gusty winds", data$evtype)
data$evtype <- gsub("\\bhail\\(0.75\\)\\b", "hail", data$evtype)
data$evtype <- gsub("\\bheavy seas\\b", "high surf", data$evtype)
data$evtype <- gsub("\\bhigh water\\b", "high surf", data$evtype)
data$evtype <- gsub("\\bhigh winds\\b", "high wind", data$evtype)
data$evtype <- gsub("\\blandslides\\b", "landslide", data$evtype)
data$evtype <- gsub("\\brough surf\\b", "rough seas", data$evtype)
data$evtype <- gsub("\\bsnow/sleet\\b", "snow", data$evtype)
data$evtype <- gsub("\\btstm winds\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bwind chill\\b", "wind", data$evtype)
data$evtype <- gsub("\\bwind gusts\\b", "wind", data$evtype)
data$evtype <- gsub("\\bwinter mix\\b", "winter weather", data$evtype)
data$evtype <- gsub("\\bwintry mix\\b", "winter weather", data$evtype)
data$evtype <- gsub("\\bdry spell\\b", "dry", data$evtype)
data$evtype <- gsub("\\bhail/wind\\b", "hail", data$evtype)
data$evtype <- gsub("\\bheat wave\\b", "heat", data$evtype)
data$evtype <- gsub("\\bheatburst\\b", "heat", data$evtype)
data$evtype <- gsub("\\bhigh seas\\b", "high surf", data$evtype)
data$evtype <- gsub("\\bhot spell\\b", "heat", data$evtype)
data$evtype <- gsub("\\bice roads\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\blandslump\\b", "landslide", data$evtype)
data$evtype <- gsub("\\bmud slide\\b", "mudslide", data$evtype)
data$evtype <- gsub("\\bmudslides\\b", "mudslide", data$evtype)
data$evtype <- gsub("\\brain/snow\\b", "rain", data$evtype)
data$evtype <- gsub("\\btstm wind\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bvery warm\\b", "heat", data$evtype)
data$evtype <- gsub("\\bice/snow\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\bsnow/ice\\b", "snow", data$evtype)
data$evtype <- gsub("\\btstm wnd\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bvery dry\\b", "dry", data$evtype)
data$evtype <- gsub("\\bdryness\\b", "dry", data$evtype)
data$evtype <- gsub("\\bice fog\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\bice jam\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\bfreeze\\b", "frost", data$evtype)
data$evtype <- gsub("\\b wind\\b", "wind", data$evtype)
data$evtype <- gsub("\\bwinds\\b", "wind", data$evtype)
data$evtype <- gsub("\\btstm\\b", "thunderstorm", data$evtype)
data$evtype <- gsub("\\bice\\b", "ice storm", data$evtype)
data$evtype <- gsub("\\bwnd\\b", "wind", data$evtype)
data$evtype <- gsub("heat", "excessive heat", data$evtype)
data$evtype <- gsub("cold", "excessive cold", data$evtype)
data$evtype <- gsub("dry", "excessive dryness", data$evtype)
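
A long chain of replacements like the one above could also be driven from a table of rules. The following is only a sketch of that idea using three of the rules; it is not the code that produced the results in this report. The names `rules` and `standardize` are hypothetical, and the `trimws()` call is an addition that removes leading and trailing spaces up front instead of handling them with separate patterns.

```r
# Hypothetical rule table: each pattern is replaced by the matching replacement
rules <- data.frame(
    pattern     = c("\\btstm\\b", "\\btstm wind\\b", "\\bthunderstorm wind\\b"),
    replacement = c("thunderstorm", "thunderstorm", "thunderstorm"),
    stringsAsFactors = FALSE
)

# Apply the longest patterns first, as described in the text
rules <- rules[order(-nchar(rules$pattern)), ]

standardize <- function(x, rules) {
    # Lower-case and strip surrounding whitespace before matching
    x <- trimws(tolower(x))
    for (i in seq_len(nrow(rules))) {
        x <- gsub(rules$pattern[i], rules$replacement[i], x)
    }
    x
}

cleaned <- standardize(c(" TSTM WIND", "Thunderstorm Wind", "tstm"), rules)
print(cleaned)
```

All three made-up inputs end up as "thunderstorm". Keeping the rules in a data frame makes it easier to review, sort and extend them than editing a sequence of individual calls.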

After standardization, there were 147 unique event types.

length(unique(data$evtype))
## [1] 147

Data Conversion

The data was then converted to a tibble (dplyr's data frame tbl) for processing with the dplyr package.

tbl_data <- as_tibble(data)

Subsequently, the following relevant variables were selected:

tbl_data <- select(tbl_data, evtype, state, fatalities, injuries, propdmg, cropdmg)

Results

Effect on Public Health

For the purposes of this report, the effect of weather events on public health will be measured by fatalities and injuries. The data was first grouped by event type, and then summarized by injuries and fatalities separately. Next, the event types were ranked by total injuries and total fatalities separately.

ph_data <- group_by(tbl_data, evtype)
inj_data <- summarize(ph_data, injuries = sum(injuries, na.rm = TRUE))
ftl_data <- summarize(ph_data, fatalities = sum(fatalities, na.rm = TRUE))
inj_data <- arrange(inj_data, desc(injuries))
ftl_data <- arrange(ftl_data, desc(fatalities))
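
The group, sum and rank steps can be illustrated on a toy table. The sketch below uses base R's `aggregate()` and `order()` as stand-ins for `group_by()`, `summarize()` and `arrange()`, so that it runs without any packages; the data values are made up.

```r
# Made-up records: one row per event, with its injury count
toy <- data.frame(
    evtype   = c("tornado", "flood", "tornado", "lightning", "flood"),
    injuries = c(10, 3, 5, 1, 4),
    stringsAsFactors = FALSE
)

# Sum injuries within each event type
totals <- aggregate(injuries ~ evtype, data = toy, FUN = sum)

# Rank event types from most to fewest injuries
totals <- totals[order(-totals$injuries), ]
print(totals)
```

Here "tornado" comes first with 15 injuries, followed by "flood" with 7 and "lightning" with 1.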

The panel diagram below summarizes the 10 most severe weather events in terms of total injuries and total fatalities.

inj_plot <- ggplot(inj_data[1:10, ], aes(x = reorder(evtype, -injuries), y = injuries)) +
    geom_bar(stat = "identity", fill = "dodgerblue4") +
    theme(panel.background = element_rect(fill = "grey94")) +
    ggtitle("Total Injuries") +
    theme(plot.title = element_text(size = 16)) +
    theme(axis.title.y = element_text(vjust = 1, size = 12)) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, colour = "grey30")) +
    theme(axis.text.y = element_text(colour = "grey30")) +
    scale_x_discrete(labels = c("Tornado",
                                "Excessive Heat",
                                "Flood",
                                "Thunderstorm",
                                "Lightning",
                                "Winter Weather",
                                "Flash Flood",
                                "Wildfire",
                                "Hurricane",
                                "High Wind")) +
    ylab("Number of Injuries") +
    xlab("")

ftl_plot <- ggplot(ftl_data[1:10, ], aes(x = reorder(evtype, -fatalities), y = fatalities)) +
    geom_bar(stat = "identity", fill = "indianred3") +
    theme(panel.background = element_rect(fill = "grey94")) +
    ggtitle("Total Fatalities") +
    theme(plot.title = element_text(size = 16)) +
    theme(axis.title.y = element_text(vjust = 1, size = 12)) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, colour = "grey30")) +
    theme(axis.text.y = element_text(colour = "grey30")) +
    scale_x_discrete(labels = c("Excessive Heat",
                                "Tornado",
                                "Flash Flood",
                                "Lightning",
                                "Rip Current",
                                "Flood",
                                "Thunderstorm",
                                "Excessive Cold",
                                "Winter Weather",
                                "High Wind")) +
    ylab("Number of Fatalities") +
    xlab("")
# gridExtra 2.0.0 and later use top/bottom in place of main/sub
grid.arrange(inj_plot, ftl_plot,
             ncol = 2,
             top = textGrob(
                 "\nImpact of Weather Events on Public Health in the\nUnited States, 1996 - 2011",
                 gp = gpar(fontface = "bold", fontsize = 20)),
             bottom = textGrob("Event Type\n", gp = gpar(fontsize = 12))
             )

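[Figure: Impact of Weather Events on Public Health in the United States, 1996 - 2011. Panels: Total Injuries; Total Fatalities.]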

The diagram shows that Tornadoes, Floods, Excessive Heat, Thunderstorms, Lightning, Winter Weather, Flash Floods and High Winds appeared among the top 10 causes of both injuries and fatalities. In general, Tornadoes and Excessive Heat, which occupy the top two positions in both panels, had a substantially larger impact on public health than any other weather event.
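
The axis labels in the plotting code are typed by hand in rank order, so they would need to be updated if the ranking changed. One possible alternative, sketched here under the assumption that simple capitalisation of each word is acceptable, is to derive the labels from the event names. The helper `title_case` is hypothetical and is not part of the original analysis.

```r
# Hypothetical helper: capitalise the first letter of every word
title_case <- function(x) {
    gsub("\\b([a-z])", "\\U\\1", x, perl = TRUE)
}

# A few event names in the lower-case form used in evtype
evtypes <- c("tornado", "excessive heat", "flash flood")
axis_labels <- title_case(evtypes)
print(axis_labels)
```

Because `scale_x_discrete()` also accepts a function for its `labels` argument, a helper of this kind could be passed directly as `labels = title_case`.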

Effect on the Economy

For the purposes of this report, the effect of weather on the economy will be assessed based on property and crop damage. As before, the data was first grouped by event type, and then summarized by property damage and crop damage separately. Next, the event types were ranked by total property damage and by total crop damage separately.

e_data <- group_by(tbl_data, evtype)
prop_data <- summarize(e_data, propdmg = sum(propdmg, na.rm = TRUE))
crop_data <- summarize(e_data, cropdmg = sum(cropdmg, na.rm = TRUE))
prop_data <- arrange(prop_data, desc(propdmg))
crop_data <- arrange(crop_data, desc(cropdmg))

The nominal values of property damage and crop damage are summarized in the panel diagram below.

prop_plot <- ggplot(prop_data[1:10, ], aes(x = reorder(evtype, -propdmg), y = propdmg/1000000)) +
    geom_bar(stat = "identity", fill = "burlywood3") +
    theme(panel.background = element_rect(fill = "grey94")) +
    ggtitle("Value of Property Damage") +
    theme(plot.title = element_text(size = 16)) +
    theme(axis.title.y = element_text(vjust = 1, size = 12)) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, colour = "grey30")) +
    theme(axis.text.y = element_text(colour = "grey30")) +
    scale_x_discrete(labels = c("Flood",
                                "Hurricane",
                                "Storm Surge",
                                "Tornado",
                                "Flash Flood",
                                "Hail",
                                "Thunderstorm",
                                "Wildfire",
                                "Tropical Storm",
                                "High Wind")) +
    ylab("Value of Damage (in Million US Dollars)") +
    xlab("")

crop_plot <- ggplot(crop_data[1:10, ], aes(x = reorder(evtype, -cropdmg), y = cropdmg/1000000)) +
    geom_bar(stat = "identity", fill = "seagreen") +
    theme(panel.background = element_rect(fill = "grey94")) +
    ggtitle("Value of Crop Damage") +
    theme(plot.title = element_text(size = 16)) +
    theme(axis.title.y = element_text(vjust = 1, size = 12)) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, colour = "grey30")) +
    theme(axis.text.y = element_text(colour = "grey30")) +
    scale_x_discrete(labels = c("Drought",
                                "Hurricane",
                                "Flood",
                                "Hail",
                                "Frost",
                                "Excessive Cold",
                                "Flash Flood",
                                "Thunderstorm",
                                "Heavy Rain",
                                "Tropical Storm")) +
    ylab("Value of Damage (in Million US Dollars)") +
    xlab("")
# gridExtra 2.0.0 and later use top/bottom in place of main/sub
grid.arrange(prop_plot, crop_plot,
             ncol = 2,
             top = textGrob(
                 "\nEconomic Impact of Weather Events in the\nUnited States, 1996 - 2011",
                 gp = gpar(fontface = "bold", fontsize = 20)),
             bottom = textGrob("Event Type\n", gp = gpar(fontsize = 12))
             )

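[Figure: Economic Impact of Weather Events in the United States, 1996 - 2011. Panels: Value of Property Damage; Value of Crop Damage.]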

In general, weather events caused far more property damage than crop damage, as shown by the larger scale of the property panel. Although cropland covers larger areas than built property, property is arguably more valuable than crops on an area-for-area basis, which would explain this observation. Droughts had a much greater impact on crops than on property because of the nature of agriculture. Floods and hurricanes ranked in the top three for both property and crop damage, and so had a substantially greater impact on the economy than the other event types, with the exception of droughts in the case of crops.

Aggregate Effects

To identify the types of weather events that caused the greatest public health and economic consequences, the fatalities and injuries data, and the property damage and crop damage data were aggregated separately. Thus, the aggregate measures of the impact of weather events were: Total Casualties and Total Value of Damage.

health_data <- group_by(tbl_data, evtype)
health_data <- summarize(health_data, injuries = sum(injuries, na.rm = TRUE), fatalities = sum(fatalities, na.rm = TRUE), health = injuries + fatalities)
health_data <- arrange(health_data, desc(health))

econ_data <- group_by(tbl_data, evtype)
econ_data <- summarize(econ_data, propdmg = sum(propdmg, na.rm = TRUE), cropdmg = sum(cropdmg, na.rm = TRUE), econ = propdmg + cropdmg)
econ_data <- arrange(econ_data, desc(econ))

The panel diagram below shows the 10 weather events with the greatest impact on public health and, separately, the 10 with the greatest impact on the economy.

health_plot <- ggplot(health_data[1:10, ], aes(x = reorder(evtype, -health), y = health)) +
    geom_bar(stat = "identity", fill = "coral3") +
    theme(panel.background = element_rect(fill = "grey94")) +
    ggtitle("Total Casualties") +
    theme(plot.title = element_text(size = 16)) +
    theme(axis.title.y = element_text(vjust = 1, size = 12)) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, colour = "grey30")) +
    theme(axis.text.y = element_text(colour = "grey30")) +
    scale_x_discrete(labels = c("Tornado",
                                "Excessive Heat",
                                "Flood",
                                "Thunderstorm",
                                "Lightning",
                                "Flash Flood",
                                "Winter Weather",
                                "Wildfire",
                                "Hurricane",
                                "High Wind")) +
    ylab("Number of Casualties") +
    xlab("")

econ_plot <- ggplot(econ_data[1:10, ], aes(x = reorder(evtype, -econ), y = econ/1000000)) +
    geom_bar(stat = "identity", fill = "orchid4") +
    theme(panel.background = element_rect(fill = "grey94")) +
    ggtitle("Total Value of Damage") +
    theme(plot.title = element_text(size = 16)) +
    theme(axis.title.y = element_text(vjust = 1, size = 12)) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, colour = "grey30")) +
    theme(axis.text.y = element_text(colour = "grey30")) +
    scale_x_discrete(labels = c("Flood",
                                "Hurricane",
                                "Storm Surge",
                                "Tornado",
                                "Hail",
                                "Flash Flood",
                                "Drought",
                                "Thunderstorm",
                                "Tropical Storm",
                                "Wildfire")) +
    ylab("Value of Damage (in Million US Dollars)") +
    xlab("")
# gridExtra 2.0.0 and later use top/bottom in place of main/sub
grid.arrange(health_plot, econ_plot,
             ncol = 2,
             top = textGrob(
                 "\nImpact of Weather Events on Public Health and the \nEconomy in the United States, 1996 - 2011",
                 gp = gpar(fontface = "bold", fontsize = 20)),
             bottom = textGrob("Event Type\n", gp = gpar(fontsize = 12))
             )

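[Figure: Impact of Weather Events on Public Health and the Economy in the United States, 1996 - 2011. Panels: Total Casualties; Total Value of Damage.]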

The data shows that Tornadoes, Floods, Thunderstorms, Flash Floods, Wildfires, and Hurricanes appeared in both top-10 sets, that is, among the weather events that most significantly affected public health and among those that most significantly affected the economy.
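
The overlap between the two rankings can be checked with `intersect()`. The sketch below uses the event names as they are labelled in the two aggregate plots, in rank order; the vector names are hypothetical.

```r
# Top-10 event types as labelled in the Total Casualties plot, in rank order
health_top10 <- c("Tornado", "Excessive Heat", "Flood", "Thunderstorm",
                  "Lightning", "Flash Flood", "Winter Weather", "Wildfire",
                  "Hurricane", "High Wind")

# Top-10 event types as labelled in the Total Value of Damage plot, in rank order
econ_top10 <- c("Flood", "Hurricane", "Storm Surge", "Tornado", "Hail",
                "Flash Flood", "Drought", "Thunderstorm", "Tropical Storm",
                "Wildfire")

# Event types present in both rankings, in the order of the first vector
common <- intersect(health_top10, econ_top10)
print(common)
```

With the full data, the same call could be applied to `health_data$evtype[1:10]` and `econ_data$evtype[1:10]`.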

Conclusion

In conclusion, this study used data from the NOAA to identify weather events that were the most harmful with respect to public health and the economy in the United States from 1996 to 2011. The 10 most harmful weather events were identified based on the total number of injuries, the total number of fatalities, the total value of property damage, the total value of crop damage, the total number of casualties and the total value of damage. The weather events that had significant consequences for both public health and the economy were Tornadoes, Floods, Thunderstorms, Flash Floods, Wildfires, and Hurricanes.