Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This report explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property and crop damage.

This report focuses on three outcomes: fatalities, injuries, and economic damage (crop and property damage combined). For each outcome, it shows which weather event types were the most fatal, most injuring, and most costly for the U.S. as a whole, and then identifies the most fatal, most injuring, and most costly event type for each state. Although the database starts in 1950, the only event types recorded before 1996 were “Tornado” (1950-1954) and “Tornado, Thunderstorm Wind and Hail” (1955-1996). From 1996 forward, 48 event types are recorded as defined in NWS Directive 10-1605, so this report focuses on that period.

Data Processing

This initial code chunk reads in the data and records where the data were downloaded from.

setwd("~/Documents/Coursera/5_ReproducibleResearch/PeerAssessment2")

#This section downloads the data and writes the data to a csv. Because the data file is quite large, 
#this section can be commented out after the first download, and the reviewer can then start by 
#loading the csv that was previously created.
temp <- tempfile()
link <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(link, temp, method = "curl")
data <- read.csv(bzfile(temp))
unlink(temp)

write.csv(data, file = "StormData.csv", row.names = FALSE) 
data <- read.csv("StormData.csv", stringsAsFactors = FALSE)

This code chunk first subsets the data to keep only the columns used in this analysis. Next, the character variables are converted to uppercase, and the column with the beginning date (BGN_DATE) is converted to a date-time variable. Finally, the data are subset by date to include only observations from 1996 onward. A total of 248790 observations are dropped, leaving 653507.

#Libraries used:
library(stringdist)
library(maps)
library(mapdata)
library(scales)
library(knitr)
library(ggplot2)
library(RColorBrewer)

#Subset DF to columns of interest
ColsInt <- c("BGN_DATE", "STATE", "EVTYPE", "MAG", "FATALITIES", "INJURIES", "PROPDMG", 
             "PROPDMGEXP", "CROPDMG", "CROPDMGEXP", "LATITUDE", "LONGITUDE")

SubData <- data[,ColsInt] #902297 observations of 12 variables

#Ensure character variables are uppercase
SubData$STATE <- toupper(SubData$STATE)
SubData$EVTYPE <- toupper(SubData$EVTYPE)
SubData$PROPDMGEXP <- toupper(SubData$PROPDMGEXP)
SubData$CROPDMGEXP <- toupper(SubData$CROPDMGEXP)

#Convert BGN_DATE variable to date-time variable
SubData$BGN_DATE <- gsub(" 0:00:00", "", SubData$BGN_DATE)
SubData$BGN_DATE <- strptime(SubData$BGN_DATE, "%m/%d/%Y")

#Subset data by date
MoreComplete <- as.POSIXlt("1996-01-01 00:00:00")
SubData <- SubData[(SubData$BGN_DATE > MoreComplete),]
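The date filter above can be illustrated on a small made-up data frame. This is only a sketch: the `toy` rows, `cutoff`, `strict`, and `inclusive` are hypothetical names that do not appear in the report, and `as.Date` is used here in place of `strptime`. It shows that a strict `>` comparison against midnight on 1996-01-01 excludes events dated exactly on that day, which is how the counts in this report were produced, whereas `>=` would keep them.

```r
# Toy stand-in for the BGN_DATE column (values are made up for illustration)
toy <- data.frame(BGN_DATE = c("12/31/1995 0:00:00", "1/1/1996 0:00:00", "4/18/2011 0:00:00"),
                  stringsAsFactors = FALSE)

# Trailing characters after the matched date part (" 0:00:00") are ignored
toy$BGN_DATE <- as.Date(toy$BGN_DATE, format = "%m/%d/%Y")

cutoff <- as.Date("1996-01-01")
strict    <- toy[toy$BGN_DATE >  cutoff, , drop = FALSE]  # drops events dated exactly 1996-01-01
inclusive <- toy[toy$BGN_DATE >= cutoff, , drop = FALSE]  # keeps them
```

Which comparison is appropriate depends on whether events that began on January 1, 1996 should count as part of the 1996-onward period.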

Data Processing: Property and Crop Damage

The following code chunk converts the PROPDMG and CROPDMG values back to dollar amounts using their exponent columns (K, M, or B) and then adds them to create a Total Economic Damage column (TtlEcnDmg). The resulting column is then converted to billions of dollars and rounded to two decimal places.

##Processing for Property and Crop Damage variables

K <- 1000
M <- 1000000
B <- 1000000000

SubData[(SubData$PROPDMGEXP == "K"),]$PROPDMG <- SubData[(SubData$PROPDMGEXP == "K"),]$PROPDMG*K
SubData[(SubData$PROPDMGEXP == "M"),]$PROPDMG <- SubData[(SubData$PROPDMGEXP == "M"),]$PROPDMG*M
SubData[(SubData$PROPDMGEXP == "B"),]$PROPDMG <- SubData[(SubData$PROPDMGEXP == "B"),]$PROPDMG*B
SubData[(SubData$CROPDMGEXP == "K"),]$CROPDMG <- SubData[(SubData$CROPDMGEXP == "K"),]$CROPDMG*K
SubData[(SubData$CROPDMGEXP == "M"),]$CROPDMG <- SubData[(SubData$CROPDMGEXP == "M"),]$CROPDMG*M
SubData[(SubData$CROPDMGEXP == "B"),]$CROPDMG <- SubData[(SubData$CROPDMGEXP == "B"),]$CROPDMG*B

#Ensure both damage columns are numeric; each column is converted in place
SubData$PROPDMG <- as.numeric(SubData$PROPDMG)
SubData$CROPDMG <- as.numeric(SubData$CROPDMG)

SubData$TtlEcnDmg <- SubData$PROPDMG + SubData$CROPDMG

#convert to Billions
SubData$TtlEcnDmg <- SubData$TtlEcnDmg/B
SubData$TtlEcnDmg <- round(SubData$TtlEcnDmg, digits = 2)
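As a compact illustration of the exponent conversion, the sketch below uses a named lookup vector instead of one assignment per exponent code. The `toy` data, `mult`, `scale_by`, `PropDollars`, `rowRounded`, and `byType` are hypothetical names and made-up values, not part of the report. It also compares two orders of operation: rounding each row to 0.01 billion before summing, as the chunk above does, and summing in dollars before rounding once.

```r
# Toy rows (made-up values) that reuse the report's column names
toy <- data.frame(EVTYPE     = c("FLOOD", "FLOOD", "FLOOD", "HAIL"),
                  PROPDMG    = c(2.5, 4, 4, 300),
                  PROPDMGEXP = c("B", "M", "M", "K"),
                  stringsAsFactors = FALSE)

# Named lookup: exponent code -> multiplier; codes other than K, M, B are left unscaled
mult <- c(K = 1e3, M = 1e6, B = 1e9)
scale_by <- unname(mult[toy$PROPDMGEXP])
scale_by[is.na(scale_by)] <- 1
toy$PropDollars <- toy$PROPDMG * scale_by

# Option A: round each row to 0.01 billion first, then sum (small events round to zero)
rowRounded <- sum(round(toy$PropDollars[toy$EVTYPE == "FLOOD"] / 1e9, 2))

# Option B: sum in dollars first, then convert to billions and round once
byType <- aggregate(PropDollars ~ EVTYPE, data = toy, sum)
byType$Billions <- round(byType$PropDollars / 1e9, 2)
```

In this toy case the two 4-million-dollar floods contribute nothing under Option A (total 2.5) but are retained under Option B (total 2.51), so the order of rounding and aggregation can matter when many events are individually small.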

Data Processing: Event Types

This section of the document processes the EVTYPE column using a csv file that contains the 48 event types defined in NWS Directive 10-1605 (found at http://www.ncdc.noaa.gov/stormevents/pd01016005curr.pdf). A review of the unique event types revealed 438 unique event types. There are five code chunks in this section. The first code chunk reads the csv file and creates a data frame of unique event types and their frequencies in the data.

#Create a vector of Event Types as they are classified in the National Weather Service Documentation
EventTypes <- read.csv("EventTypesNWS.csv", header = FALSE)
EventTypes <- toupper(EventTypes$V1)

#Review unique Event Types
UnqFreq <- as.data.frame(table(SubData$EVTYPE))
UnqFreq <- UnqFreq[order(UnqFreq$Var1),]
UnqFreq <- UnqFreq[order(UnqFreq$Freq, decreasing = TRUE),]

This code chunk includes processing that is based on the initial visual inspection of the unique event types and frequencies.

#Replace "\" with "/"
SubData$EVTYPE <- gsub("\\\\", "/", SubData$EVTYPE)

#Remove Parens and periods
SubData$EVTYPE <- gsub("\\(", "", SubData$EVTYPE)
SubData$EVTYPE <- gsub("\\)", "", SubData$EVTYPE)
SubData$EVTYPE <- gsub("\\.", "", SubData$EVTYPE)

#Remove MPH
SubData$EVTYPE <- gsub("MPH", "", SubData$EVTYPE)

#Remove all numbers
SubData$EVTYPE <- gsub("[0-9]", "", SubData$EVTYPE)

#TSTM to Thunderstorm 
SubData$EVTYPE <- gsub("TSTM", "THUNDERSTORM", SubData$EVTYPE)

#Rain and Snow - Rain to "Heavy Rain", Snow to "Heavy Snow"
SubData$EVTYPE[SubData$EVTYPE=="RAIN"]<-"HEAVY RAIN"
SubData$EVTYPE[SubData$EVTYPE=="SNOW"]<-"HEAVY SNOW"

#Any case of Microburst to Thunderstorm Winds
microburst <- function(evtypes) {
    tgtList <- unique((grep("MICROBURST", evtypes, value = TRUE)))
    for (i in seq_along(tgtList)) { #seq_along skips the loop when nothing matches
        evtypes[(evtypes == tgtList[i])] <- "THUNDERSTORM WINDS"
    }
    evtypes
}
SubData$EVTYPE <- microburst(SubData$EVTYPE)

#Any case of Hurricane or Typhoon to "Hurricane/Typhoon"
hurricane <- function(evtypes) {
    tgtList <- unique(grep("HURRICANE|TYPHOON", evtypes, value = TRUE))
    for (i in seq_along(tgtList)) { #seq_along skips the loop when nothing matches
        evtypes[(evtypes == tgtList[i])] <- "HURRICANE/TYPHOON"
    }
    evtypes
}
SubData$EVTYPE <- hurricane(SubData$EVTYPE)

The goal of this section is to identify EVTYPE values that are very close to one of the official EventTypes. Each unique event type in the data set is compared with the event types listed in the NWS documentation using the function stringdist from the stringdist package. The default method for stringdist is optimal string alignment (osa), which counts the number of deletions, insertions, and substitutions needed to turn one string into the other, and also allows transposition of adjacent characters. Each substring may be edited only once (for example, a character cannot be transposed twice to move it forward in the string).

UniqueEvt <- unique(SubData$EVTYPE)
UnqEVtype <- c()
TxtDist <- c()
NewEvt <- c() 

for (i in 1:length(UniqueEvt)) {
    Distances <- stringdist(UniqueEvt[i], EventTypes)
    minDist <- min(Distances)
    minEventType <- EventTypes[which.min(Distances)]
    UnqEVtype <- c(UnqEVtype, UniqueEvt[i])
    TxtDist <- c(TxtDist, minDist)
    NewEvt <- c(NewEvt, minEventType)
}

UnqEvts <- data.frame(EVTYPE = UnqEVtype, TXTDIST = TxtDist, NEWEVT = NewEvt, 
                      stringsAsFactors = FALSE)
OffByUnder3 <- UnqEvts[(UnqEvts$TXTDIST < 3 & UnqEvts$TXTDIST > 0),] 
OffByUnder3 #these look good
##                  EVTYPE TXTDIST            NEWEVT
## 19         RIP CURRENTS       1       RIP CURRENT
## 33   THUNDERSTORM WINDS       1 THUNDERSTORM WIND
## 49         COASTALFLOOD       1     COASTAL FLOOD
## 80         STRONG WINDS       1       STRONG WIND
## 95        COASTAL FLOOD       1     COASTAL FLOOD
## 160 THUNDERSTORM WIND G       2 THUNDERSTORM WIND
## 170  THUNDERSTORM WIND        1 THUNDERSTORM WIND
## 175    THUNDERSTORM WND       1 THUNDERSTORM WIND
## 177   THUNDERSTORM WIND       1 THUNDERSTORM WIND
## 189    LAKE EFFECT SNOW       1  LAKE-EFFECT SNOW
## 200       FUNNEL CLOUDS       1      FUNNEL CLOUD
## 201         WATERSPOUTS       1        WATERSPOUT
## 217          HIGH WINDS       1         HIGH WIND
## 219           LIGHTNING       1         LIGHTNING
## 231         HIGH WIND G       2         HIGH WIND
## 314         FLASH FLOOD       1       FLASH FLOOD
## 327          WATERSPOUT       1        WATERSPOUT
## 338          DUST DEVEL       1        DUST DEVIL
DiffEvts3up <- UnqEvts[(UnqEvts$TXTDIST > 2),]

#Count the number of Event types in the SubData that have a stringdist of 3 or more.
sum(SubData$EVTYPE %in% DiffEvts3up$EVTYPE) #12584
## [1] 12584
#Checking the frequencies of the remaining unique values that have a stringdist greater than 2
UnqFreq <- as.data.frame(table(SubData$EVTYPE))
UnqFreq3plus <- UnqFreq[(UnqFreq$Var1 %in% DiffEvts3up$EVTYPE),]
UnqFreq3plus <- UnqFreq3plus[order(UnqFreq3plus$Freq, decreasing = TRUE),]
head(UnqFreq3plus, n=10)
##                       Var1 Freq
## 341   URBAN/SML STREAM FLD 3391
## 362       WILD/FOREST FIRE 1443
## 375     WINTER WEATHER/MIX 1104
## 309 THUNDERSTORM WIND/HAIL 1026
## 68            EXTREME COLD  617
## 158              LANDSLIDE  588
## 85                     FOG  532
## 363                   WIND  326
## 267            STORM SURGE  253
## 126   HEAVY SURF/HIGH SURF  228
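The nearest-match idea used above can be sketched without a loop. The example below is an assumption-laden stand-in, not the report's method: it uses base R's `utils::adist`, which computes a generalized Levenshtein distance and, unlike the osa method of stringdist, does not count a transposition of adjacent characters as a single edit. The `official` and `raw` vectors are small illustrative subsets, and `d`, `nearest`, `mindist`, `matches`, and `close` are hypothetical names.

```r
# A few official NWS types and a few raw labels of the kind seen in the data
official <- c("THUNDERSTORM WIND", "RIP CURRENT", "DUST DEVIL", "HIGH WIND")
raw      <- c("THUNDERSTORM WND", "RIP CURRENTS", "DUST DEVEL", "WILD/FOREST FIRE")

# Distance matrix: one row per raw label, one column per official type
d <- utils::adist(raw, official)

# For each raw label, the closest official type and its distance
nearest <- official[apply(d, 1, which.min)]
mindist <- apply(d, 1, min)

matches <- data.frame(EVTYPE = raw, TXTDIST = mindist, NEWEVT = nearest,
                      stringsAsFactors = FALSE)

# Labels within two edits of an official type are candidates for automatic correction
close <- matches[matches$TXTDIST > 0 & matches$TXTDIST < 3, ]
```

Here the three misspelled or pluralized labels are each one edit away from their official type, while WILD/FOREST FIRE is far from every candidate and would need manual handling.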

Based on a review of the frequencies for unique event types with a stringdist of three or more, manually correcting the ten most frequent accounts for 9508 of the 12584 observations (76%). The code chunk below replaces the EVTYPE values where appropriate and drops rows for which the appropriate event type is unclear.

#Top Ten based on the frequencies: URBAN/SML STREAM FLD (3391), WILD/FOREST FIRE (1443), WINTER 
#WEATHER/MIX (1104), THUNDERSTORM WIND/HAIL (1026), EXTREME COLD (617), LANDSLIDE (588), FOG (532), 
#WIND (326), STORM SURGE (253), HEAVY SURF/HIGH SURF (228) -- Manually edit these where appropriate

#"Heavy rain situations, resulting in urban and/or small stream flooding, should be classified as a 
#Heavy Rain event.." according to the NWS documentation
SubData[(SubData$EVTYPE == "URBAN/SML STREAM FLD"),]$EVTYPE <- "HEAVY RAIN"

#"Any significant forest fire, grassland fire, rangeland fire, or wildland-urban interface fire..." 
#according to the NWS documentation this should be "WILDFIRE"
SubData[(SubData$EVTYPE == "WILD/FOREST FIRE"),]$EVTYPE <- "WILDFIRE"

#"A Winter Weather event could result from one or more winter precipitation types (snow, or blowing/
#drifting snow, or freezing rain/drizzle), on a widespread or localized basis ..." 
#according to the NWS documentation this should be "WINTER WEATHER"
SubData[(SubData$EVTYPE == "WINTER WEATHER/MIX"),]$EVTYPE <- "WINTER WEATHER"

#THUNDERSTORM WIND/HAIL -- It is not clear whether this should be "THUNDERSTORM WIND" or "HAIL", so
#These will be dropped.
SubData <- SubData[!(SubData$EVTYPE == "THUNDERSTORM WIND/HAIL"),] #1026 rows

#EXTREME COLD to "EXTREME COLD/WIND CHILL"
SubData[(SubData$EVTYPE == "EXTREME COLD"),]$EVTYPE <- "EXTREME COLD/WIND CHILL"

#LANDSLIDE should be renamed to "DEBRIS FLOW"
SubData[(SubData$EVTYPE == "LANDSLIDE"),]$EVTYPE <- "DEBRIS FLOW"
SubData$EVTYPE <- gsub("LANDSLIDE", "DEBRIS FLOW", SubData$EVTYPE)

#FOG to DENSE FOG
SubData[(SubData$EVTYPE == "FOG"),]$EVTYPE <- "DENSE FOG"

#WIND to STRONG WIND or HIGH WIND based on MAG value from dataset - those with MAG values
#greater than 50 assigned to "HIGH WIND", those with values of 50 or less assigned to STRONG WIND
SubData[(SubData$EVTYPE == "WIND" & SubData$MAG > 50),]$EVTYPE <- "HIGH WIND"
SubData[(SubData$EVTYPE == "WIND" & SubData$MAG <= 50),]$EVTYPE <- "STRONG WIND"

#STORM SURGE 
SubData[(SubData$EVTYPE == "STORM SURGE"),]$EVTYPE <- "STORM SURGE/TIDE"

#HEAVY SURF/HIGH SURF
SubData[(SubData$EVTYPE == "HEAVY SURF/HIGH SURF"),]$EVTYPE <- "HIGH SURF"
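Several of the one-to-one replacements above could also be expressed as a single named lookup. The sketch below is illustrative only: `recode`, `ev`, and `hit` are hypothetical names, and the lookup contains just three of the mappings used in this report.

```r
# Named lookup: raw label -> official NWS event type (three of the mappings used above)
recode <- c("WILD/FOREST FIRE" = "WILDFIRE",
            "FOG"              = "DENSE FOG",
            "STORM SURGE"      = "STORM SURGE/TIDE")

# Toy vector standing in for SubData$EVTYPE
ev <- c("FOG", "HAIL", "WILD/FOREST FIRE", "FOG")

# Replace only the labels that appear in the lookup; all others are left untouched
hit <- ev %in% names(recode)
ev[hit] <- unname(recode[ev[hit]])
```

Keeping the mappings in one vector makes them easier to review against the NWS documentation, and the replacement simply does nothing when a label is absent from the data.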

In the final code chunk in the Event Types section, the string distances are recalculated. Visual inspection confirmed that all of the unique EVTYPES with a stringdist of less than three were matched to the correct event from the NWS Event Type vector. The EVTYPES in the data were then updated. The remaining rows of data that had a stringdist of three or more were dropped from the data.

#Re-calculate the string distances
UniqueEvt <- unique(SubData$EVTYPE)
UnqEVtype <- c()
TxtDist <- c()
NewEvt <- c() 

for (i in 1:length(UniqueEvt)) {
    Distances <- stringdist(UniqueEvt[i], EventTypes)
    minDist <- min(Distances)
    minEventType <- EventTypes[which.min(Distances)]
    UnqEVtype <- c(UnqEVtype, UniqueEvt[i])
    TxtDist <- c(TxtDist, minDist)
    NewEvt <- c(NewEvt, minEventType)
}

UnqEvts <- data.frame(EVTYPE = UnqEVtype, TXTDIST = TxtDist, NEWEVT = NewEvt, 
                      stringsAsFactors = FALSE)
DiffEvts3up <- UnqEvts[(UnqEvts$TXTDIST > 2),]

#off by less than three -- Check again after modifications to confirm that all of these are matched 
#to the correct event from the NWS Event Type vector.  Update EVTYPES in Subdata
OffByUnder3 <- UnqEvts[(UnqEvts$TXTDIST < 3 & UnqEvts$TXTDIST > 0),]
OffByUnder3
##                  EVTYPE TXTDIST            NEWEVT
## 17         RIP CURRENTS       1       RIP CURRENT
## 31   THUNDERSTORM WINDS       1 THUNDERSTORM WIND
## 46         COASTALFLOOD       1     COASTAL FLOOD
## 77         STRONG WINDS       1       STRONG WIND
## 92        COASTAL FLOOD       1     COASTAL FLOOD
## 157 THUNDERSTORM WIND G       2 THUNDERSTORM WIND
## 167  THUNDERSTORM WIND        1 THUNDERSTORM WIND
## 172    THUNDERSTORM WND       1 THUNDERSTORM WIND
## 173   THUNDERSTORM WIND       1 THUNDERSTORM WIND
## 185    LAKE EFFECT SNOW       1  LAKE-EFFECT SNOW
## 196       FUNNEL CLOUDS       1      FUNNEL CLOUD
## 197         WATERSPOUTS       1        WATERSPOUT
## 210        DEBRIS FLOWS       1       DEBRIS FLOW
## 213          HIGH WINDS       1         HIGH WIND
## 215           LIGHTNING       1         LIGHTNING
## 227         HIGH WIND G       2         HIGH WIND
## 310         FLASH FLOOD       1       FLASH FLOOD
## 323          WATERSPOUT       1        WATERSPOUT
## 334          DUST DEVEL       1        DUST DEVIL
for (i in 1:nrow(OffByUnder3)) {
    SubData[(SubData$EVTYPE == OffByUnder3$EVTYPE[i]),]$EVTYPE <- OffByUnder3$NEWEVT[i]
}

#Count the number of Event types in the SubData that have a stringdist of 3 or more.
sum(SubData$EVTYPE %in% DiffEvts3up$EVTYPE) #3074 rows will be dropped
## [1] 3074
#3074 out of the 652481 remaining observations in the dataset is less than 0.5%.  This small 
#percentage of observations will be dropped from the dataset as cleaning them will be excessively time consuming
SubData <- SubData[(!(SubData$EVTYPE %in% DiffEvts3up$EVTYPE)),]

UnqFreq <- as.data.frame(table(SubData$EVTYPE)) #48 Unique Event Types remain


Results

Results: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

The following code chunk sums the total fatalities and the total injuries by event type. The top ten of each are shown in the table below. Excessive Heat and Tornadoes are by far the most fatal event types in the U.S. as a whole, with 1797 and 1511 fatalities respectively. Tornadoes also cause more than three times as many injuries as the next event type (Flood).

Fatalities <- aggregate(FATALITIES~EVTYPE, data = SubData, sum)
Fatalities <- Fatalities[order(Fatalities$FATALITIES, decreasing = TRUE),]
Injuries <- aggregate(INJURIES~EVTYPE, data = SubData, sum)
Injuries <- Injuries[order(Injuries$INJURIES, decreasing = TRUE),]
TopTen <- cbind(Fatalities[c(1:10),], Injuries[c(1:10),])

kable(head(TopTen, n=10), format = "html", row.names = FALSE, 
      col.names = c("Event Type", "Total Fatalities", "Event Type", "Total Injuries"), 
      align = c("l", "c", "l", "c"), 
      caption = c("Top Ten Event Types that are Most Harmful to Human Health in the US"))
Top Ten Event Types that are Most Harmful to Human Health in the US
Event Type                 Total Fatalities    Event Type           Total Injuries
EXCESSIVE HEAT             1797                TORNADO              20667
TORNADO                    1511                FLOOD                6758
FLASH FLOOD                887                 EXCESSIVE HEAT       6391
LIGHTNING                  650                 THUNDERSTORM WIND    5058
RIP CURRENT                542                 LIGHTNING            4140
FLOOD                      414                 FLASH FLOOD          1674
THUNDERSTORM WIND          376                 WILDFIRE             1456
EXTREME COLD/WIND CHILL    240                 HURRICANE/TYPHOON    1328
HEAT                       237                 WINTER STORM         1292
HIGH WIND                  235                 HEAT                 1222


Total Fatalities by Event Type for U.S. States and Territories

The map (Figure 1) below shows the most fatal event type for each of the contiguous 48 states, and the table provides the information for the states and territories that are not on the map but for which there is data. The map shows some patterns that we would expect to see - Tornadoes are most fatal in the areas that see large numbers of tornadoes. One interesting note is that Wildfires are not a most fatal event for any state, nor are Hurricane/Typhoons. It seems likely that this is due to effective evacuation efforts and some ability to provide warning.

StatesCensus <- read.table("state.txt", header = TRUE, sep = "|")
names(StatesCensus) <- c("STATE_CODE", "STATE", "STATE_NAME", "STATENS")

FatalState <- aggregate(FATALITIES~STATE + EVTYPE, data = SubData, sum)
FatalState.agg <- aggregate(FATALITIES~STATE, data=FatalState, max)
FatalState.max <- merge(FatalState.agg, FatalState)

#Merge the Maximum dataset to the US Census Bureau state data based on the State abbreviations 
FatalState.max <- merge(FatalState.max, StatesCensus, by = "STATE")
ForMapFtSt.max <- FatalState.max
ForMapFtSt.max$STATE_NAME <- tolower(ForMapFtSt.max$STATE_NAME)

if (require("maps")) {
    states <- map_data("state")
    names(states) <- c("long", "lat", "group", "order","STATE_NAME", "subregion")
    choro <- merge(ForMapFtSt.max, states, sort = FALSE, by = "STATE_NAME")
    choro <- choro[order(choro$order), ]
    colors <- rainbow(length(unique(choro$EVTYPE)))
    qplot(long, lat, data = choro, group = group, fill = EVTYPE, xlab = "", ylab = "", 
          geom = "polygon", main = "Figure 1: Most Fatal Event Type for U.S. States")
}

NonLower48F <- FatalState.max[!(ForMapFtSt.max$STATE_NAME %in% states$STATE_NAME),]
NonLower48F <- NonLower48F[,c(5, 3, 2)]
kable(head(NonLower48F, n=nrow(NonLower48F)), format = "html", row.names = FALSE, 
      col.names = c("State or Territory", "Event Type", "Total Fatalities"), 
      align = c("l", "l", "c"), 
      caption = c("Total Fatalities by Event Type for Regions not included in Map"))
Total Fatalities by Event Type for Regions not included in Map
State or Territory     Event Type     Total Fatalities
Alaska                 AVALANCHE      33
American Samoa         TSUNAMI        32
Guam                   RIP CURRENT    38
Hawaii                 HIGH SURF      17
Puerto Rico            FLASH FLOOD    34
U.S. Virgin Islands    HIGH SURF      3


Total Injuries by Event Type for U.S. States and Territories

This section maps the event type that caused the greatest number of injuries in each of the contiguous 48 states; the table provides the same information for the states and territories that are not on the map but for which there are data. As with the previous map, some of the patterns are expected. Some of the unexpected results (such as Heat in Michigan) are likely due to lack of preparation for unusual weather events.

InjState <- aggregate(INJURIES~STATE + EVTYPE, data = SubData, sum)
InjState.agg <- aggregate(INJURIES~STATE, data=InjState, max)
InjState.max <- merge(InjState.agg, InjState)

#Merge the Maximum dataset to the US Census Bureau state data based on the State abbreviations 
InjState.max <- merge(InjState.max, StatesCensus, by = "STATE")
ForMapIjSt.max <- InjState.max
ForMapIjSt.max$STATE_NAME <- tolower(ForMapIjSt.max$STATE_NAME)

if (require("maps")) {
    states <- map_data("state")
    names(states) <- c("long", "lat", "group", "order","STATE_NAME", "subregion")
    choro <- merge(ForMapIjSt.max, states, sort = FALSE, by = "STATE_NAME")
    choro <- choro[order(choro$order), ]
    colors <- rainbow(length(unique(choro$EVTYPE)))
    qplot(long, lat, data = choro, group = group, fill = EVTYPE, xlab = "", ylab = "", 
          geom = "polygon", main = "Figure 2: Weather Event Type with Most Injuries for U.S. States")
} 

NonLower48I <- InjState.max[!(ForMapIjSt.max$STATE_NAME %in% states$STATE_NAME),]
NonLower48I <- NonLower48I[,c(5, 3, 2)]
kable(head(NonLower48I, n=nrow(NonLower48I)), format = "html", row.names = FALSE, 
      col.names = c("State or Territory", "Event Type", "Total Injuries"), 
      align = c("l", "l", "c"), 
      caption = c("Total Injuries by Event Type for Regions not included in Map"))
Total Injuries by Event Type for Regions not included in Map
State or Territory     Event Type           Total Injuries
Alaska                 ICE STORM            34
American Samoa         TSUNAMI              129
Guam                   HURRICANE/TYPHOON    339
Hawaii                 HIGH SURF            28
Puerto Rico            HEAVY RAIN           10
U.S. Virgin Islands    RIP CURRENT          1
U.S. Virgin Islands    LIGHTNING            1
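The U.S. Virgin Islands appear twice in the injuries table because two event types tie for the maximum. The sketch below reproduces that behavior of the aggregate-then-merge pattern on a small toy data frame. The `toy` rows are partly taken from the table (the tied Virgin Islands values and the Hawaii maximum) and partly made up (the FLOOD rows); `best`, `top`, and `one` are hypothetical names.

```r
# Toy per-state, per-event injury totals
toy <- data.frame(STATE    = c("VI", "VI", "VI", "HI", "HI"),
                  EVTYPE   = c("RIP CURRENT", "LIGHTNING", "FLOOD", "HIGH SURF", "FLOOD"),
                  INJURIES = c(1, 1, 0, 28, 3),
                  stringsAsFactors = FALSE)

# Step 1: maximum injuries per state
best <- aggregate(INJURIES ~ STATE, data = toy, max)

# Step 2: join back on the shared columns (STATE and INJURIES) to recover the event type;
# every event type that attains the maximum is returned, so ties yield several rows
top <- merge(best, toy)

# Optional: keep only the first row per state if exactly one row per state is required
one <- top[!duplicated(top$STATE), ]
```

Keeping tied rows, as the report does, is the more transparent choice for a table, but a map can show only one fill value per state, so a tie would need an explicit rule there.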


Results: Across the United States, which types of events have the greatest economic consequences?

This code chunk aggregates total economic damage by event type. The twenty most costly weather event types are listed in the table below. Given the propensity of Americans for building near water, it is not surprising that the top three event types all occur on or near water. Limiting the construction of homes and businesses on or very near water could help reduce the economic costs of Flood, Hurricane/Typhoon, and Storm Surge/Tide events.

TecnDMGst <- aggregate(TtlEcnDmg~EVTYPE, data = SubData, sum)
TecnDMGst <- TecnDMGst[order(TecnDMGst$TtlEcnDmg, decreasing = TRUE),]

TopTwenty <- TecnDMGst[1:20,]

kable(head(TopTwenty, n=20), format = "html", row.names = FALSE, 
      col.names = c("Event Type", "Approximate Total Economic Cost"), 
      align = c("l", "c"), 
      caption = c("Top Twenty Event Types by Approximate Total 
                  Economic Cost (in Billions) Across the U.S."))
Top Twenty Event Types by Approximate Total Economic Cost (in Billions) Across the U.S.
Event Type           Approximate Total Economic Cost
FLOOD                284.98
HURRICANE/TYPHOON    163.34
STORM SURGE/TIDE     95.61
TORNADO              45.30
HAIL                 27.15
FLASH FLOOD          26.46
TROPICAL STORM       15.10
WILDFIRE             15.04
THUNDERSTORM WIND    10.36
HIGH WIND            9.73
ICE STORM            7.11
WINTER STORM         2.73
DROUGHT              2.08
HEAVY RAIN           1.06
HEAVY SNOW           1.03
BLIZZARD             0.97
DEBRIS FLOW          0.58
COASTAL FLOOD        0.46
TSUNAMI              0.29
LIGHTNING            0.26


Most Damaging Event Type and Total Cost for U.S. States and Territories

This code chunk calculates the most damaging event type per state. The table below shows each state, the most economically damaging event type for that state, and the total cost of that event type in billions for the state.

CostState <- aggregate(TtlEcnDmg~STATE + EVTYPE, data = SubData, sum)
CostState.agg <- aggregate(TtlEcnDmg~STATE, data=CostState, max)
CostState.max <- merge(CostState.agg, CostState)
CostState.max <- merge(CostState.max, StatesCensus, by = "STATE")

ForTable.max <- CostState.max[,c(5, 3, 2)]

kable(ForTable.max, format = "html", row.names = FALSE,
      col.names = c("State or Territory", "Event Type", "Total Cost"), 
      align = c("l", "l", "c"), 
      caption = c("Most Damaging Event Type and Total Cost in Billions by State"))
Most Damaging Event Type and Total Cost in Billions by State
State or Territory      Event Type           Total Cost
Alaska                  FLOOD                0.24
Alabama                 TORNADO              9.78
Arkansas                TORNADO              2.94
American Samoa          TSUNAMI              0.16
Arizona                 HAIL                 5.66
California              FLOOD                233.41
Colorado                HAIL                 2.70
Connecticut             TROPICAL STORM       0.12
District of Columbia    TROPICAL STORM       0.25
Delaware                COASTAL FLOOD        0.08
Florida                 HURRICANE/TYPHOON    56.88
Georgia                 TORNADO              1.69
Guam                    HURRICANE/TYPHOON    1.72
Hawaii                  FLASH FLOOD          0.30
Iowa                    FLOOD                2.44
Idaho                   FLOOD                0.21
Illinois                FLASH FLOOD          1.49
Indiana                 FLOOD                1.57
Kansas                  TORNADO              1.32
Kentucky                HAIL                 1.20
Louisiana               STORM SURGE/TIDE     63.65
Massachusetts           TORNADO              0.92
Maryland                TROPICAL STORM       1.06
Maine                   ICE STORM            0.64
Michigan                TORNADO              0.60
Minnesota               FLOOD                2.47
Missouri                TORNADO              7.05
Mississippi             HURRICANE/TYPHOON    28.35
Montana                 HAIL                 0.18
North Carolina          HURRICANE/TYPHOON    11.01
North Dakota            FLOOD                7.72
Nebraska                HAIL                 1.60
New Hampshire           ICE STORM            0.12
New Jersey              FLOOD                4.18
New Mexico              WILDFIRE             3.07
Nevada                  FLOOD                1.34
New York                FLASH FLOOD          3.29
Ohio                    FLASH FLOOD          2.21
Oklahoma                TORNADO              3.35
Oregon                  FLOOD                1.42
Pennsylvania            FLASH FLOOD          2.67
Puerto Rico             HURRICANE/TYPHOON    3.65
Rhode Island            FLOOD                0.18
South Carolina          ICE STORM            0.30
South Dakota            BLIZZARD             0.12
Tennessee               FLOOD                8.44
Texas                   TROPICAL STORM       10.95
Utah                    FLOOD                0.65
Virginia                HURRICANE/TYPHOON    1.27
U.S. Virgin Islands     HURRICANE/TYPHOON    0.05
Vermont                 FLOOD                2.12
Washington              FLOOD                0.39
Wisconsin               HAIL                 1.82
West Virginia           FLASH FLOOD          0.78
Wyoming                 HAIL                 0.20


Most Economically Damaging Event Type for U.S. States and Territories

This section maps the event type that had the greatest economic cost for each of the contiguous 48 states. The table above includes the information for the states and territories that are not on the map but for which there are data. As in Figure 2 above, there are some patterns here that one would expect to see, such as Tornadoes in OK, MO, AR, and KS. However, given that water-related event types (Coastal Flood, Flash Flood, Flood, Hurricane/Typhoon, Storm Surge/Tide, Tropical Storm, and Tsunami) are the most economically damaging for 35 of the 55 states and territories, planning for flood control and Hurricane/Typhoon/Tsunami preparedness appears to be important.

ForMapCostSt.max <- CostState.max
ForMapCostSt.max$STATE_NAME <- tolower(ForMapCostSt.max$STATE_NAME)
if (require("maps")) {
    states <- map_data("state")
    names(states) <- c("long", "lat", "group", "order","STATE_NAME", "subregion")
    choro <- merge(ForMapCostSt.max, states, sort = FALSE, by = "STATE_NAME")
    choro <- choro[order(choro$order), ]
    colors <- rainbow(length(unique(choro$EVTYPE)))
    mapping <- qplot(long, lat, data = choro, group = group, fill = EVTYPE, geom = "polygon", 
                     xlab = "", ylab = "", 
                     main = "Figure 3: Weather Event Type with Greatest 
                                Economic Cost for U.S. States")
    mapping + scale_fill_brewer(palette="Set3")
} 

uniqueCostEVTYPES <- unique(CostState.max$EVTYPE)
uniqueCostEVTYPES
##  [1] "FLOOD"             "TORNADO"           "TSUNAMI"          
##  [4] "HAIL"              "TROPICAL STORM"    "COASTAL FLOOD"    
##  [7] "HURRICANE/TYPHOON" "FLASH FLOOD"       "STORM SURGE/TIDE" 
## [10] "ICE STORM"         "WILDFIRE"          "BLIZZARD"
Watery <- c("COASTAL FLOOD", "FLASH FLOOD", "FLOOD", "HURRICANE/TYPHOON", "STORM SURGE/TIDE",
            "TROPICAL STORM", "TSUNAMI")

WaterRelated <- ForMapCostSt.max[(ForMapCostSt.max$EVTYPE %in% Watery), c(1:3)]
nrow(WaterRelated) #35 States/Territories
## [1] 35
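As a cross-check on the count of 35, the sketch below tallies the per-state table by event type and computes the water-related share. The `counts` vector was tallied by hand from the state table above and is therefore an assumption of this sketch, as are the names `counts`, `nWater`, and `share`; `Watery` repeats the vector defined in the chunk above.

```r
# Number of states/territories per most-costly event type, tallied from the state table
counts <- c("FLOOD" = 15, "TORNADO" = 8, "HAIL" = 7, "HURRICANE/TYPHOON" = 7,
            "FLASH FLOOD" = 6, "TROPICAL STORM" = 4, "ICE STORM" = 3,
            "COASTAL FLOOD" = 1, "STORM SURGE/TIDE" = 1, "TSUNAMI" = 1,
            "WILDFIRE" = 1, "BLIZZARD" = 1)

Watery <- c("COASTAL FLOOD", "FLASH FLOOD", "FLOOD", "HURRICANE/TYPHOON", "STORM SURGE/TIDE",
            "TROPICAL STORM", "TSUNAMI")

nWater <- sum(counts[Watery])                    # states/territories with a water-related maximum
share  <- round(100 * nWater / sum(counts), 1)   # percentage of all 55 rows
```

Under this tally the water-related types account for 35 of 55 rows, or about 63.6%, and the figure of 35 includes American Samoa's Tsunami entry.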