Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This report explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property and crop damage.
This report will focus on those three areas: fatalities, injuries, and economic damage (crop and property damage). For each of these topics, this report shows which weather event types were the most fatal, most injuring, and most costly for the U.S. as a whole. Additionally, this report reviews the most fatal, most injuring, and most costly event type by state. Although the events in the database start in the year 1950, prior to 1996, the only event types that were available were “Tornado, Thunderstorm Wind and Hail” (1955-1996) or “Tornado” (1950-1954). From 1996 forward, 48 event types are recorded as defined in NWS Directive 10-1605, and so this report will focus on this period of time.
This initial code chunk reads in the data and records where the data were downloaded from.
setwd("~/Documents/Coursera/5_ReproducibleResearch/PeerAssessment2")
#This section downloads the data and writes it to a csv. Because the data file is quite large,
#this section can be commented out after the first download, and the reviewer can then start by
#reading in the csv that was previously created.
temp <- tempfile()
link <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(link, temp, method = "curl")
data <- read.csv(bzfile(temp))
unlink(temp)
write.csv(data, file = "StormData.csv", row.names = FALSE)
data <- read.csv("StormData.csv", stringsAsFactors = FALSE)
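The comments above suggest skipping the download after the first run. A minimal sketch of that idea follows, assuming a hypothetical helper `load_cached()` and wrapper `fetch_storm()` that are not part of the original analysis. The download step only runs when the local csv is missing.

```r
# Hypothetical helper (not part of the original analysis): read a local csv
# if it exists; otherwise call `fetch` to build the data frame and cache it.
load_cached <- function(path, fetch) {
  if (!file.exists(path)) {
    write.csv(fetch(), file = path, row.names = FALSE)
  }
  read.csv(path, stringsAsFactors = FALSE)
}

# Hypothetical wrapper around the download step shown above.
fetch_storm <- function() {
  link <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
  temp <- tempfile()
  on.exit(unlink(temp))  # remove the temporary file even if reading fails
  download.file(link, temp, method = "curl")
  read.csv(bzfile(temp))
}

# Usage (downloads only on the first call):
# data <- load_cached("StormData.csv", fetch_storm)
```

Passing the download step in as a function keeps the caching logic separate from the data source, so it can be tried out on a small data frame first.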
This code chunk first subsets the data to keep only the columns used in this analysis. Next, the character variables are all converted to uppercase, and the column with the beginning date is converted to a date-time variable. Finally, the data are subset by date to include only observations since January 1996. A total of 248790 observations are dropped, leaving 653507.
#Libraries used:
library(stringdist)
library(maps)
library(mapdata)
library(scales)
library(knitr)
library(ggplot2)
library(RColorBrewer)
#Subset DF to columns of interest
ColsInt <- c("BGN_DATE", "STATE", "EVTYPE", "MAG", "FATALITIES", "INJURIES", "PROPDMG",
"PROPDMGEXP", "CROPDMG", "CROPDMGEXP", "LATITUDE", "LONGITUDE")
SubData <- data[,ColsInt] #902297 observations of 12 variables
#Ensure character variables are uppercase
SubData$STATE <- toupper(SubData$STATE)
SubData$EVTYPE <- toupper(SubData$EVTYPE)
SubData$PROPDMGEXP <- toupper(SubData$PROPDMGEXP)
SubData$CROPDMGEXP <- toupper(SubData$CROPDMGEXP)
#Convert BGN_DATE variable to date-time variable
SubData$BGN_DATE <- gsub(" 0:00:00", "", SubData$BGN_DATE)
SubData$BGN_DATE <- strptime(SubData$BGN_DATE, "%m/%d/%Y")
#Subset data by date
MoreComplete <- as.POSIXlt("1996-01-01 00:00:00")
SubData <- SubData[(SubData$BGN_DATE > MoreComplete),]
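The date filter above uses a strict `>` against midnight on 1 January 1996, so an event whose begin date is exactly 1996-01-01 does not pass it. The toy example below illustrates the difference between the strict and inclusive comparisons. It uses `as.Date` instead of `strptime` for brevity, and the three dates are made up for illustration.

```r
# Toy dates in the same "%m/%d/%Y" format as BGN_DATE (illustrative values).
bgn <- c("12/31/1995", "1/1/1996", "1/2/1996")
bgn_date <- as.Date(bgn, format = "%m/%d/%Y")
cutoff <- as.Date("1996-01-01")

strict    <- bgn_date >  cutoff  # drops events that began on 1996-01-01
inclusive <- bgn_date >= cutoff  # keeps them

data.frame(bgn, strict, inclusive)
```

Whether the boundary day should be kept is a judgment call; the row counts reported in this document correspond to the strict comparison used in the chunk above.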
The following code chunk converts the PROPDMG and CROPDMG values back to dollar amounts using their exponent columns and then adds them to create a Total Economic Damage column (TtlEcnDmg). The resulting column is then converted to billions of dollars and rounded to two decimal places.
##Processing for Property and Crop Damage variables
K <- 1000
M <- 1000000
B <- 1000000000
SubData[(SubData$PROPDMGEXP == "K"),]$PROPDMG <- SubData[(SubData$PROPDMGEXP == "K"),]$PROPDMG*K
SubData[(SubData$PROPDMGEXP == "M"),]$PROPDMG <- SubData[(SubData$PROPDMGEXP == "M"),]$PROPDMG*M
SubData[(SubData$PROPDMGEXP == "B"),]$PROPDMG <- SubData[(SubData$PROPDMGEXP == "B"),]$PROPDMG*B
SubData[(SubData$CROPDMGEXP == "K"),]$CROPDMG <- SubData[(SubData$CROPDMGEXP == "K"),]$CROPDMG*K
SubData[(SubData$CROPDMGEXP == "M"),]$CROPDMG <- SubData[(SubData$CROPDMGEXP == "M"),]$CROPDMG*M
SubData[(SubData$CROPDMGEXP == "B"),]$CROPDMG <- SubData[(SubData$CROPDMGEXP == "B"),]$CROPDMG*B
#Coerce PROPDMG itself to numeric (assigning it to CROPDMG would overwrite the crop damage values)
SubData$PROPDMG <- as.numeric(SubData$PROPDMG)
SubData$CROPDMG <- as.numeric(SubData$CROPDMG)
SubData$TtlEcnDmg <- SubData$PROPDMG + SubData$CROPDMG
#convert to Billions
SubData$TtlEcnDmg <- SubData$TtlEcnDmg/B
SubData$TtlEcnDmg <- round(SubData$TtlEcnDmg, digits = 2)
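The exponent handling above can also be expressed as a lookup. The sketch below assumes a hypothetical helper `exp_multiplier()` (not in the original code) and a made-up toy data frame. Codes other than K, M, and B, including the empty string, are given a multiplier of 1, which mirrors the chunk above, where such rows are left unchanged.

```r
# Hypothetical helper: map exponent codes to multipliers.
# Codes not listed (including "" and NA) fall back to a multiplier of 1.
exp_multiplier <- function(exp_code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(exp_code)])
  mult[is.na(mult)] <- 1
  mult
}

# Toy rows (illustrative values only).
toy <- data.frame(PROPDMG = c(2.5, 10, 1, 7),
                  PROPDMGEXP = c("K", "m", "B", ""),
                  stringsAsFactors = FALSE)
toy$PROPDMG_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
toy
```

A single vectorized multiplication avoids repeating the subsetting line once per code and per column.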
This section of the document processes the EVTYPE column by using a csv file that contains the 48 event types as defined in NWS Directive 10-1605 (found at http://www.ncdc.noaa.gov/stormevents/pd01016005curr.pdf). The review of unique event types below revealed that there were 438 unique event types. There are five code chunks in this section. The first code chunk reads in the csv file and creates a data frame of the unique event types and their frequencies in the data.
#Create a vector of Event Types as they are classified in the National Weather Service Documentation
EventTypes <- read.csv("EventTypesNWS.csv", header = FALSE)
EventTypes <- toupper(EventTypes$V1)
#Review unique Event Types
UnqFreq <- as.data.frame(table(SubData$EVTYPE))
UnqFreq <- UnqFreq[order(UnqFreq$Var1),]
UnqFreq <- UnqFreq[order(UnqFreq$Freq, decreasing = TRUE),]
This code chunk includes processing that is based on the initial visual inspection of the unique event types and frequencies.
#Replace "\" with "/"
SubData$EVTYPE <- gsub("\\\\", "/", SubData$EVTYPE)
#Remove Parens and periods
SubData$EVTYPE <- gsub("\\(", "", SubData$EVTYPE)
SubData$EVTYPE <- gsub("\\)", "", SubData$EVTYPE)
SubData$EVTYPE <- gsub("\\.", "", SubData$EVTYPE)
#Remove MPH
SubData$EVTYPE <- gsub("MPH", "", SubData$EVTYPE)
#Remove all numbers
SubData$EVTYPE <- gsub("[0-9]", "", SubData$EVTYPE)
#TSTM to Thunderstorm
SubData$EVTYPE <- gsub("TSTM", "THUNDERSTORM", SubData$EVTYPE)
#Rain and Snow - Rain to "Heavy Rain", Snow to "Heavy Snow"
SubData$EVTYPE[SubData$EVTYPE=="RAIN"]<-"HEAVY RAIN"
SubData$EVTYPE[SubData$EVTYPE=="SNOW"]<-"HEAVY SNOW"
#Any case of Microburst to Thunderstorm Winds
microburst <- function(evtypes) {
  tgtList <- unique(grep("MICROBURST", evtypes, value = TRUE))
  #seq_along() skips the loop cleanly when nothing matches
  for (i in seq_along(tgtList)) {
    evtypes[(evtypes == tgtList[i])] <- "THUNDERSTORM WINDS"
  }
  evtypes
}
SubData$EVTYPE <- microburst(SubData$EVTYPE)
#Any case of Hurricane or Typhoon to "Hurricane/Typhoon"
hurricane <- function(evtypes) {
  tgtList <- unique(grep("HURRICANE|TYPHOON", evtypes, value = TRUE))
  for (i in seq_along(tgtList)) {
    evtypes[(evtypes == tgtList[i])] <- "HURRICANE/TYPHOON"
  }
  evtypes
}
SubData$EVTYPE <- hurricane(SubData$EVTYPE)
The goal of this section is to identify EVTYPE observations that are very close to one of the EventTypes. The unique event types from the data set are looped through and compared to the event types listed in the NWS documentation using the function stringdist from the stringdist package. The default method for stringdist is optimal string alignment. The optimal string alignment distance (osa) counts the number of deletions, insertions, and substitutions necessary to turn string b into string a, and also allows transposition of adjacent characters. Each substring may be edited only once (for example, a character cannot be transposed twice to move it forward in the string).
UniqueEvt <- unique(SubData$EVTYPE)
UnqEVtype <- c()
TxtDist <- c()
NewEvt <- c()
for (i in 1:length(UniqueEvt)) {
Distances <- stringdist(UniqueEvt[i], EventTypes)
minDist <- min(Distances)
minEventType <- EventTypes[which.min(Distances)]
UnqEVtype <- c(UnqEVtype, UniqueEvt[i])
TxtDist <- c(TxtDist, minDist)
NewEvt <- c(NewEvt, minEventType)
}
UnqEvts <- data.frame(EVTYPE = UnqEVtype, TXTDIST = TxtDist, NEWEVT = NewEvt,
stringsAsFactors = FALSE)
OffByUnder3 <- UnqEvts[(UnqEvts$TXTDIST < 3 & UnqEvts$TXTDIST > 0),]
OffByUnder3 #these look good
## EVTYPE TXTDIST NEWEVT
## 19 RIP CURRENTS 1 RIP CURRENT
## 33 THUNDERSTORM WINDS 1 THUNDERSTORM WIND
## 49 COASTALFLOOD 1 COASTAL FLOOD
## 80 STRONG WINDS 1 STRONG WIND
## 95 COASTAL FLOOD 1 COASTAL FLOOD
## 160 THUNDERSTORM WIND G 2 THUNDERSTORM WIND
## 170 THUNDERSTORM WIND 1 THUNDERSTORM WIND
## 175 THUNDERSTORM WND 1 THUNDERSTORM WIND
## 177 THUNDERSTORM WIND 1 THUNDERSTORM WIND
## 189 LAKE EFFECT SNOW 1 LAKE-EFFECT SNOW
## 200 FUNNEL CLOUDS 1 FUNNEL CLOUD
## 201 WATERSPOUTS 1 WATERSPOUT
## 217 HIGH WINDS 1 HIGH WIND
## 219 LIGHTNING 1 LIGHTNING
## 231 HIGH WIND G 2 HIGH WIND
## 314 FLASH FLOOD 1 FLASH FLOOD
## 327 WATERSPOUT 1 WATERSPOUT
## 338 DUST DEVEL 1 DUST DEVIL
DiffEvts3up <- UnqEvts[(UnqEvts$TXTDIST > 2),]
#Count the number of Event types in the SubData that have a stringdist of 3 or more.
sum(SubData$EVTYPE %in% DiffEvts3up$EVTYPE) #12584
## [1] 12584
#Checking the frequencies of the remaining unique values that have a stringdist greater than 2
UnqFreq <- as.data.frame(table(SubData$EVTYPE))
UnqFreq3plus <- UnqFreq[(UnqFreq$Var1 %in% DiffEvts3up$EVTYPE),]
UnqFreq3plus <- UnqFreq3plus[order(UnqFreq3plus$Freq, decreasing = TRUE),]
head(UnqFreq3plus, n=10)
## Var1 Freq
## 341 URBAN/SML STREAM FLD 3391
## 362 WILD/FOREST FIRE 1443
## 375 WINTER WEATHER/MIX 1104
## 309 THUNDERSTORM WIND/HAIL 1026
## 68 EXTREME COLD 617
## 158 LANDSLIDE 588
## 85 FOG 532
## 363 WIND 326
## 267 STORM SURGE 253
## 126 HEAVY SURF/HIGH SURF 228
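The nearest-match loop above can also be written without growing vectors inside a loop. The sketch below is a base R stand-in, not the author's method: it uses `adist()` from the utils package, which computes a generalized Levenshtein distance and, unlike the OSA distance used above, does not count a transposition of adjacent characters as a single edit. The helper name `nearest_type()` and the three-element `official` vector are assumptions for illustration.

```r
# Hypothetical helper: for each observed label, find the closest official label.
# adist() is generalized Levenshtein, so transpositions cost two edits here,
# whereas stringdist's default OSA method would count them as one.
nearest_type <- function(observed, official) {
  d <- adist(observed, official)        # rows: observed, columns: official
  idx <- apply(d, 1, which.min)         # column index of the closest official label
  data.frame(EVTYPE = observed,
             TXTDIST = d[cbind(seq_along(observed), idx)],
             NEWEVT = official[idx],
             stringsAsFactors = FALSE)
}

official <- c("RIP CURRENT", "DUST DEVIL", "HIGH WIND")  # illustrative subset
nearest_type(c("RIP CURRENTS", "DUST DEVEL", "HIGH WINDS"), official)
```

For labels that differ only by insertions, deletions, or substitutions, as in the examples here, the two distance measures agree.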
Based on a review of the frequencies for unique event types with a stringdist of three or more, manually correcting the top ten accounts for 9508 of the 12584 observations (76%). The code chunk below replaces the EVTYPE values where appropriate and drops rows of data for which the appropriate event type is unclear.
#Top Ten based on the frequencies: URBAN/SML STREAM FLD (3391), WILD/FOREST FIRE (1443), WINTER
#WEATHER/MIX (1104), THUNDERSTORM WIND/HAIL (1026), EXTREME COLD (617), LANDSLIDE (588), FOG (532),
#WIND (326), STORM SURGE (253), HEAVY SURF/HIGH SURF (228) -- Manually edit these where appropriate
#"Heavy rain situations, resulting in urban and/or small stream flooding, should be classified as a
#Heavy Rain event.." according to the NWS documentation
SubData[(SubData$EVTYPE == "URBAN/SML STREAM FLD"),]$EVTYPE <- "HEAVY RAIN"
#"Any significant forest fire, grassland fire, rangeland fire, or wildland-urban interface fire..."
#according to the NWS documentation this should be "WILDFIRE"
SubData[(SubData$EVTYPE == "WILD/FOREST FIRE"),]$EVTYPE <- "WILDFIRE"
#"A Winter Weather event could result from one or more winter precipitation types (snow, or blowing/
#drifting snow, or freezing rain/drizzle), on a widespread or localized basis ..."
#according to the NWS documentation this should be "WINTER WEATHER"
SubData[(SubData$EVTYPE == "WINTER WEATHER/MIX"),]$EVTYPE <- "WINTER WEATHER"
#THUNDERSTORM WIND/HAIL -- It is not clear whether this should be "THUNDERSTORM WIND" or "HAIL", so
#These will be dropped.
SubData <- SubData[!(SubData$EVTYPE == "THUNDERSTORM WIND/HAIL"),] #1026 rows
#EXTREME COLD to "EXTREME COLD/WIND CHILL"
SubData[(SubData$EVTYPE == "EXTREME COLD"),]$EVTYPE <- "EXTREME COLD/WIND CHILL"
#LANDSLIDE should be renamed to "DEBRIS FLOW"
SubData[(SubData$EVTYPE == "LANDSLIDE"),]$EVTYPE <- "DEBRIS FLOW"
SubData$EVTYPE <- gsub("LANDSLIDE", "DEBRIS FLOW", SubData$EVTYPE)
#FOG to DENSE FOG
SubData[(SubData$EVTYPE == "FOG"),]$EVTYPE <- "DENSE FOG"
#WIND to STRONG WIND or HIGH WIND based on MAG value from dataset - those with MAG values
#greater than 50 assigned to "HIGH WIND", those with values of 50 or less assigned to "STRONG WIND"
SubData[(SubData$EVTYPE == "WIND" & SubData$MAG > 50),]$EVTYPE <- "HIGH WIND"
SubData[(SubData$EVTYPE == "WIND" & SubData$MAG <= 50),]$EVTYPE <- "STRONG WIND"
#STORM SURGE
SubData[(SubData$EVTYPE == "STORM SURGE"),]$EVTYPE <- "STORM SURGE/TIDE"
#HEAVY SURF/HIGH SURF
SubData[(SubData$EVTYPE == "HEAVY SURF/HIGH SURF"),]$EVTYPE <- "HIGH SURF"
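Most of the one-to-one replacements above could be collected in a single named vector, which keeps the mapping readable in one place. This is a sketch only: `apply_recode()` is a hypothetical helper, and the vector lists just the straightforward renames from the chunk above (the WIND split by MAG and the dropped THUNDERSTORM WIND/HAIL rows need their own handling).

```r
# Recode table: names are the labels found in the data, values are the
# NWS event types chosen in the chunk above.
recode <- c("URBAN/SML STREAM FLD" = "HEAVY RAIN",
            "WILD/FOREST FIRE"     = "WILDFIRE",
            "WINTER WEATHER/MIX"   = "WINTER WEATHER",
            "EXTREME COLD"         = "EXTREME COLD/WIND CHILL",
            "FOG"                  = "DENSE FOG",
            "STORM SURGE"          = "STORM SURGE/TIDE",
            "HEAVY SURF/HIGH SURF" = "HIGH SURF")

# Hypothetical helper: replace exact matches, leave everything else untouched.
apply_recode <- function(evtype, recode) {
  hit <- evtype %in% names(recode)
  evtype[hit] <- unname(recode[evtype[hit]])
  evtype
}

apply_recode(c("FOG", "HAIL", "STORM SURGE"), recode)
```

Only exact matches are replaced, so a label such as HAIL passes through unchanged.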
In the final code chunk in the Event Types section, the string distances are recalculated. Visual inspection confirmed that all of the unique EVTYPES with a stringdist of less than three were matched to the correct event from the NWS Event Type vector. The EVTYPES in the data were then updated. The remaining rows of data that had a stringdist of three or more were dropped from the data.
#Re-calculate the string distances
UniqueEvt <- unique(SubData$EVTYPE)
UnqEVtype <- c()
TxtDist <- c()
NewEvt <- c()
for (i in 1:length(UniqueEvt)) {
Distances <- stringdist(UniqueEvt[i], EventTypes)
minDist <- min(Distances)
minEventType <- EventTypes[which.min(Distances)]
UnqEVtype <- c(UnqEVtype, UniqueEvt[i])
TxtDist <- c(TxtDist, minDist)
NewEvt <- c(NewEvt, minEventType)
}
UnqEvts <- data.frame(EVTYPE = UnqEVtype, TXTDIST = TxtDist, NEWEVT = NewEvt,
stringsAsFactors = FALSE)
DiffEvts3up <- UnqEvts[(UnqEvts$TXTDIST > 2),]
#off by less than three -- Check again after modifications to confirm that all of these are matched
#to the correct event from the NWS Event Type vector. Update EVTYPES in Subdata
OffByUnder3 <- UnqEvts[(UnqEvts$TXTDIST < 3 & UnqEvts$TXTDIST > 0),]
OffByUnder3
## EVTYPE TXTDIST NEWEVT
## 17 RIP CURRENTS 1 RIP CURRENT
## 31 THUNDERSTORM WINDS 1 THUNDERSTORM WIND
## 46 COASTALFLOOD 1 COASTAL FLOOD
## 77 STRONG WINDS 1 STRONG WIND
## 92 COASTAL FLOOD 1 COASTAL FLOOD
## 157 THUNDERSTORM WIND G 2 THUNDERSTORM WIND
## 167 THUNDERSTORM WIND 1 THUNDERSTORM WIND
## 172 THUNDERSTORM WND 1 THUNDERSTORM WIND
## 173 THUNDERSTORM WIND 1 THUNDERSTORM WIND
## 185 LAKE EFFECT SNOW 1 LAKE-EFFECT SNOW
## 196 FUNNEL CLOUDS 1 FUNNEL CLOUD
## 197 WATERSPOUTS 1 WATERSPOUT
## 210 DEBRIS FLOWS 1 DEBRIS FLOW
## 213 HIGH WINDS 1 HIGH WIND
## 215 LIGHTNING 1 LIGHTNING
## 227 HIGH WIND G 2 HIGH WIND
## 310 FLASH FLOOD 1 FLASH FLOOD
## 323 WATERSPOUT 1 WATERSPOUT
## 334 DUST DEVEL 1 DUST DEVIL
for (i in 1:nrow(OffByUnder3)) {
SubData[(SubData$EVTYPE == OffByUnder3$EVTYPE[i]),]$EVTYPE <- OffByUnder3$NEWEVT[i]
}
#Count the number of Event types in the SubData that have a stringdist of 3 or more.
sum(SubData$EVTYPE %in% DiffEvts3up$EVTYPE) #3074 rows will be dropped
## [1] 3074
#3074 out of the 652481 remaining observations in the dataset is less than 0.5%. This small
#percentage of observations will be dropped from the dataset as cleaning them will be excessively time consuming
SubData <- SubData[(!(SubData$EVTYPE %in% DiffEvts3up$EVTYPE)),]
UnqFreq <- as.data.frame(table(SubData$EVTYPE)) #48 Unique Event Types remain
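A quick way to confirm that only official event types remain after cleaning is to list any labels that fall outside the NWS vector. The helper below, `unofficial_types()`, is a hypothetical name, and the inputs are toy values rather than the real data.

```r
# Hypothetical check: return the labels that are not in the official list.
# An empty result means every remaining label is an official event type.
unofficial_types <- function(evtype, official) {
  setdiff(unique(evtype), official)
}

official_toy <- c("FLOOD", "HAIL", "TORNADO")  # illustrative subset
unofficial_types(c("FLOOD", "HAIL", "HAIL", "GUSTNADO"), official_toy)
```

Applied to the cleaned data, this would be called as `unofficial_types(SubData$EVTYPE, EventTypes)`.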
The following code chunk sums the total fatalities by event type and the total injuries by event type. The top ten of each are shown in the table below. Excessive Heat and Tornadoes are the most fatal events by a wide margin in the U.S. as a whole. The table also shows that Tornadoes cause more than three times as many injuries as the next event type (Flood).
Fatalities <- aggregate(FATALITIES~EVTYPE, data = SubData, sum)
Fatalities <- Fatalities[order(Fatalities$FATALITIES, decreasing = TRUE),]
Injuries <- aggregate(INJURIES~EVTYPE, data = SubData, sum)
Injuries <- Injuries[order(Injuries$INJURIES, decreasing = TRUE),]
TopTen <- cbind(Fatalities[c(1:10),], Injuries[c(1:10),])
kable(head(TopTen, n=10), format = "html", row.names = FALSE,
col.names = c("Event Type", "Total Fatalities", "Event Type", "Total Injuries"),
align = c("l", "c", "l", "c"),
caption = c("Top Ten Event Types that are Most Harmful to Human Health in the US"))
Event Type | Total Fatalities | Event Type | Total Injuries |
---|---|---|---|
EXCESSIVE HEAT | 1797 | TORNADO | 20667 |
TORNADO | 1511 | FLOOD | 6758 |
FLASH FLOOD | 887 | EXCESSIVE HEAT | 6391 |
LIGHTNING | 650 | THUNDERSTORM WIND | 5058 |
RIP CURRENT | 542 | LIGHTNING | 4140 |
FLOOD | 414 | FLASH FLOOD | 1674 |
THUNDERSTORM WIND | 376 | WILDFIRE | 1456 |
EXTREME COLD/WIND CHILL | 240 | HURRICANE/TYPHOON | 1328 |
HEAT | 237 | WINTER STORM | 1292 |
HIGH WIND | 235 | HEAT | 1222 |
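The aggregate-then-sort pattern used for this table can be tried on a small example. The sketch below sums both outcome columns in one `aggregate()` call and uses a hypothetical helper, `top_n()`, to rank by either column. The toy values are made up for illustration.

```r
# Toy data (illustrative values only).
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
                  FATALITIES = c(2, 1, 3, 4),
                  INJURIES = c(10, 0, 5, 1),
                  stringsAsFactors = FALSE)

# Sum both outcome columns per event type in a single call.
harm <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, sum)

# Hypothetical helper: the n rows with the largest values in column `col`.
top_n <- function(df, col, n = 10) {
  head(df[order(df[[col]], decreasing = TRUE), ], n)
}

top_n(harm, "FATALITIES", 2)
top_n(harm, "INJURIES", 2)
```

Ranking each outcome separately, as the chunk above does, matters because the order of event types differs between fatalities and injuries.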
Total Fatalities by Event Type for U.S. States and Territories
The map (Figure 1) below shows the most fatal event type for each of the contiguous 48 states, and the table provides the information for the states and territories that are not on the map but for which there is data. The map shows some patterns that we would expect to see: Tornadoes are most fatal in the areas that see large numbers of tornadoes. One interesting note is that Wildfire is not the most fatal event type for any state, nor is Hurricane/Typhoon. It seems likely that this is due to effective evacuation efforts and some ability to provide warning.
StatesCensus <- read.table("state.txt", header = TRUE, sep = "|")
names(StatesCensus) <- c("STATE_CODE", "STATE", "STATE_NAME", "STATENS")
FatalState <- aggregate(FATALITIES~STATE + EVTYPE, data = SubData, sum)
FatalState.agg <- aggregate(FATALITIES~STATE, data=FatalState, max)
FatalState.max <- merge(FatalState.agg, FatalState)
#Merge the Maximum dataset to the US Census Bureau state data based on the State abbreviations
FatalState.max <- merge(FatalState.max, StatesCensus, by = "STATE")
ForMapFtSt.max <- FatalState.max
ForMapFtSt.max$STATE_NAME <- tolower(ForMapFtSt.max$STATE_NAME)
if (require("maps")) {
states <- map_data("state")
names(states) <- c("long", "lat", "group", "order","STATE_NAME", "subregion")
choro <- merge(ForMapFtSt.max, states, sort = FALSE, by = "STATE_NAME")
choro <- choro[order(choro$order), ]
colors <- rainbow(length(unique(choro$EVTYPE)))
qplot(long, lat, data = choro, group = group, fill = EVTYPE, xlab = "", ylab = "",
geom = "polygon", main = "Figure 1: Most Fatal Event Type for U.S. States")
}
NonLower48F <- FatalState.max[!(ForMapFtSt.max$STATE_NAME %in% states$STATE_NAME),]
NonLower48F <- NonLower48F[,c(5, 3, 2)]
kable(head(NonLower48F, n=nrow(NonLower48F)), format = "html", row.names = FALSE,
col.names = c("State or Territory", "Event Type", "Total Fatalities"),
align = c("l", "l", "c"),
caption = c("Total Fatalities by Event Type for Regions not included in Map"))
State or Territory | Event Type | Total Fatalities |
---|---|---|
Alaska | AVALANCHE | 33 |
American Samoa | TSUNAMI | 32 |
Guam | RIP CURRENT | 38 |
Hawaii | HIGH SURF | 17 |
Puerto Rico | FLASH FLOOD | 34 |
U.S. Virgin Islands | HIGH SURF | 3 |
Total Injuries by Event Type for U.S. States and Territories
This section maps the event type that had the greatest number of injuries for each of the contiguous 48 states, and the table provides the information for the states and territories that are not on the map but for which there is data. As with the previous map, there are some patterns that we would expect to see. However, it is interesting to note that some of the unexpected results (Heat in Michigan) are likely due to a lack of preparation for unexpected weather events.
InjState <- aggregate(INJURIES~STATE + EVTYPE, data = SubData, sum)
InjState.agg <- aggregate(INJURIES~STATE, data=InjState, max)
InjState.max <- merge(InjState.agg, InjState)
#Merge the Maximum dataset to the US Census Bureau state data based on the State abbreviations
InjState.max <- merge(InjState.max, StatesCensus, by = "STATE")
ForMapIjSt.max <- InjState.max
ForMapIjSt.max$STATE_NAME <- tolower(ForMapIjSt.max$STATE_NAME)
if (require("maps")) {
states <- map_data("state")
names(states) <- c("long", "lat", "group", "order","STATE_NAME", "subregion")
choro <- merge(ForMapIjSt.max, states, sort = FALSE, by = "STATE_NAME")
choro <- choro[order(choro$order), ]
colors <- rainbow(length(unique(choro$EVTYPE)))
qplot(long, lat, data = choro, group = group, fill = EVTYPE, xlab = "", ylab = "",
geom = "polygon", main = "Figure 2: Weather Event Type with Most Injuries for U.S. States")
}
NonLower48I <- InjState.max[!(ForMapIjSt.max$STATE_NAME %in% states$STATE_NAME),]
NonLower48I <- NonLower48I[,c(5, 3, 2)]
kable(head(NonLower48I, n=nrow(NonLower48I)), format = "html", row.names = FALSE,
col.names = c("State or Territory", "Event Type", "Total Injuries"),
align = c("l", "l", "c"),
caption = c("Total Injuries by Event Type for Regions not included in Map"))
State or Territory | Event Type | Total Injuries |
---|---|---|
Alaska | ICE STORM | 34 |
American Samoa | TSUNAMI | 129 |
Guam | HURRICANE/TYPHOON | 339 |
Hawaii | HIGH SURF | 28 |
Puerto Rico | HEAVY RAIN | 10 |
U.S. Virgin Islands | RIP CURRENT | 1 |
U.S. Virgin Islands | LIGHTNING | 1 |
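The two U.S. Virgin Islands rows in the table above come from a tie: merging the per-state maximum back onto the per-state, per-event totals keeps every event type that reaches the maximum. The toy example below reproduces that behaviour and shows one possible way to keep a single row per state. The tie-break (first row wins) is an arbitrary assumption, not something the original analysis does.

```r
# Toy per-state, per-event totals (illustrative values only).
toy <- data.frame(STATE = c("VI", "VI", "VI", "AK", "AK"),
                  EVTYPE = c("RIP CURRENT", "LIGHTNING", "FLOOD",
                             "ICE STORM", "AVALANCHE"),
                  INJURIES = c(1, 1, 0, 34, 2),
                  stringsAsFactors = FALSE)

by_state <- aggregate(INJURIES ~ STATE, data = toy, max)

# merge() joins on the shared columns STATE and INJURIES, so tied event
# types all survive.
with_ties <- merge(by_state, toy)

# Assumed tie-break: keep the first matching row per state.
one_each <- with_ties[!duplicated(with_ties$STATE), ]

with_ties
one_each
```

Keeping the ties, as the table above does, is arguably the more transparent choice when the maximum is as small as one injury.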
This code chunk aggregates total economic damage by event type. The twenty most costly weather event types are listed in the table below. Given the propensity of Americans for building near water, it is not surprising that the top three event types all occur primarily near water. It seems that limiting the construction of homes and businesses on or very near bodies of water could help reduce the economic costs of Flood, Hurricane/Typhoon, and Storm Surge/Tide events.
TecnDMGst <- aggregate(TtlEcnDmg~EVTYPE, data = SubData, sum)
TecnDMGst <- TecnDMGst[order(TecnDMGst$TtlEcnDmg, decreasing = TRUE),]
TopTwenty <- TecnDMGst[1:20,]
kable(head(TopTwenty, n=20), format = "html", row.names = FALSE,
col.names = c("Event Type", "Approximate Total Economic Cost"),
align = c("l", "c"),
caption = c("Top Twenty Event Types by Approximate Total
Economic Cost (in Billions) Across the U.S."))
Event Type | Approximate Total Economic Cost |
---|---|
FLOOD | 284.98 |
HURRICANE/TYPHOON | 163.34 |
STORM SURGE/TIDE | 95.61 |
TORNADO | 45.30 |
HAIL | 27.15 |
FLASH FLOOD | 26.46 |
TROPICAL STORM | 15.10 |
WILDFIRE | 15.04 |
THUNDERSTORM WIND | 10.36 |
HIGH WIND | 9.73 |
ICE STORM | 7.11 |
WINTER STORM | 2.73 |
DROUGHT | 2.08 |
HEAVY RAIN | 1.06 |
HEAVY SNOW | 1.03 |
BLIZZARD | 0.97 |
DEBRIS FLOW | 0.58 |
COASTAL FLOOD | 0.46 |
TSUNAMI | 0.29 |
LIGHTNING | 0.26 |
Most Damaging Event Type and Total Cost for U.S. States and Territories
This code chunk calculates the most damaging event type per state. The table below shows each state, the most economically damaging event type for that state, and the total cost of that event type in billions for the state.
CostState <- aggregate(TtlEcnDmg~STATE + EVTYPE, data = SubData, sum)
CostState.agg <- aggregate(TtlEcnDmg~STATE, data=CostState, max)
CostState.max <- merge(CostState.agg, CostState)
CostState.max <- merge(CostState.max, StatesCensus, by = "STATE")
ForTable.max <- CostState.max[,c(5, 3, 2)]
kable(ForTable.max, format = "html", row.names = FALSE,
col.names = c("State or Territory", "Event Type", "Total Cost"),
align = c("l", "l", "c"),
caption = c("Most Damaging Event Type and Total Cost in Billions by State"))
State or Territory | Event Type | Total Cost |
---|---|---|
Alaska | FLOOD | 0.24 |
Alabama | TORNADO | 9.78 |
Arkansas | TORNADO | 2.94 |
American Samoa | TSUNAMI | 0.16 |
Arizona | HAIL | 5.66 |
California | FLOOD | 233.41 |
Colorado | HAIL | 2.70 |
Connecticut | TROPICAL STORM | 0.12 |
District of Columbia | TROPICAL STORM | 0.25 |
Delaware | COASTAL FLOOD | 0.08 |
Florida | HURRICANE/TYPHOON | 56.88 |
Georgia | TORNADO | 1.69 |
Guam | HURRICANE/TYPHOON | 1.72 |
Hawaii | FLASH FLOOD | 0.30 |
Iowa | FLOOD | 2.44 |
Idaho | FLOOD | 0.21 |
Illinois | FLASH FLOOD | 1.49 |
Indiana | FLOOD | 1.57 |
Kansas | TORNADO | 1.32 |
Kentucky | HAIL | 1.20 |
Louisiana | STORM SURGE/TIDE | 63.65 |
Massachusetts | TORNADO | 0.92 |
Maryland | TROPICAL STORM | 1.06 |
Maine | ICE STORM | 0.64 |
Michigan | TORNADO | 0.60 |
Minnesota | FLOOD | 2.47 |
Missouri | TORNADO | 7.05 |
Mississippi | HURRICANE/TYPHOON | 28.35 |
Montana | HAIL | 0.18 |
North Carolina | HURRICANE/TYPHOON | 11.01 |
North Dakota | FLOOD | 7.72 |
Nebraska | HAIL | 1.60 |
New Hampshire | ICE STORM | 0.12 |
New Jersey | FLOOD | 4.18 |
New Mexico | WILDFIRE | 3.07 |
Nevada | FLOOD | 1.34 |
New York | FLASH FLOOD | 3.29 |
Ohio | FLASH FLOOD | 2.21 |
Oklahoma | TORNADO | 3.35 |
Oregon | FLOOD | 1.42 |
Pennsylvania | FLASH FLOOD | 2.67 |
Puerto Rico | HURRICANE/TYPHOON | 3.65 |
Rhode Island | FLOOD | 0.18 |
South Carolina | ICE STORM | 0.30 |
South Dakota | BLIZZARD | 0.12 |
Tennessee | FLOOD | 8.44 |
Texas | TROPICAL STORM | 10.95 |
Utah | FLOOD | 0.65 |
Virginia | HURRICANE/TYPHOON | 1.27 |
U.S. Virgin Islands | HURRICANE/TYPHOON | 0.05 |
Vermont | FLOOD | 2.12 |
Washington | FLOOD | 0.39 |
Wisconsin | HAIL | 1.82 |
West Virginia | FLASH FLOOD | 0.78 |
Wyoming | HAIL | 0.20 |
Most Economically Damaging Event Type for U.S. States and Territories
This section maps the event type that had the greatest economic cost for each of the contiguous 48 states. The table above includes the information for the states and territories that are not on the map but for which there is data. As in Figure 2 above, there are some patterns here that one would expect to see, such as Tornadoes in OK, MO, AR, and KS. However, given that Coastal Flood, Flash Flood, Flood, Hurricane/Typhoon, Storm Surge/Tide, Tropical Storm, and Tsunami are the most economically damaging event types for 35 of the 55 states and territories, planning for flood control and for hurricane, typhoon, and tsunami preparedness appears to be important.
ForMapCostSt.max <- CostState.max
ForMapCostSt.max$STATE_NAME <- tolower(ForMapCostSt.max$STATE_NAME)
if (require("maps")) {
states <- map_data("state")
names(states) <- c("long", "lat", "group", "order","STATE_NAME", "subregion")
choro <- merge(ForMapCostSt.max, states, sort = FALSE, by = "STATE_NAME")
choro <- choro[order(choro$order), ]
colors <- rainbow(length(unique(choro$EVTYPE)))
mapping <- qplot(long, lat, data = choro, group = group, fill = EVTYPE, geom = "polygon",
xlab = "", ylab = "",
main = "Figure 3: Weather Event Type with Greatest
Economic Cost for U.S. States")
mapping + scale_fill_brewer(palette="Set3")
}
uniqueCostEVTYPES <- unique(CostState.max$EVTYPE)
uniqueCostEVTYPES
## [1] "FLOOD" "TORNADO" "TSUNAMI"
## [4] "HAIL" "TROPICAL STORM" "COASTAL FLOOD"
## [7] "HURRICANE/TYPHOON" "FLASH FLOOD" "STORM SURGE/TIDE"
## [10] "ICE STORM" "WILDFIRE" "BLIZZARD"
Watery <- c("COASTAL FLOOD", "FLASH FLOOD", "FLOOD", "HURRICANE/TYPHOON", "STORM SURGE/TIDE",
"TROPICAL STORM", "TSUNAMI")
WaterRelated <- ForMapCostSt.max[(ForMapCostSt.max$EVTYPE %in% Watery), c(1:3)]
nrow(WaterRelated) #35 States/Territories
## [1] 35
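The count above can also be expressed as a share. The sketch below shows the `%in%` pattern on a toy vector (the values are illustrative) and then applies the same arithmetic to the 35 of 55 reported in this document.

```r
# Toy example: share of entries that fall in a set of categories.
evtypes <- c("FLOOD", "TORNADO", "HAIL", "FLASH FLOOD")  # illustrative values
watery_toy <- c("FLOOD", "FLASH FLOOD")
share_toy <- mean(evtypes %in% watery_toy)  # proportion of TRUE values
share_toy

# Same arithmetic for the counts reported above: 35 of 55 states/territories.
round(100 * 35 / 55, 1)
```

Taking the mean of a logical vector gives the proportion of matches directly, which avoids a separate `nrow()` division.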