Economic and Health Impact of Storm Events in the United States

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This report explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage for the period from 1950 through to 2011.

Underlying storm events are aggregated and analyzed to identify those events which have the most significant impact on population health and the economy, with a view to giving policy makers appropriate data for deciding how to allocate resources to prevention and mitigation activity.

Data Processing

The storm data is a copy of the NOAA storm database made available from the Reproducible Research course website as a bz2-compressed CSV file. The original raw data is published by the NOAA National Centers for Environmental Information in the Storm Events Database.

We download the file and load it into a data table. Because downloading and parsing are slow, both the downloaded file and the parsed table are persisted after the first run and reused thereafter.

library(data.table)

storm.data <- (function() {
    parsed.file <- './raw-data/storm.data.rds'

    # Reuse the parsed table if an earlier run already saved it
    if (file.exists(parsed.file)) {
        return(readRDS(parsed.file))
    }

    if (!file.exists('raw-data')) {
        dir.create(file.path(getwd(), 'raw-data'))
    }

    url <- 'https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2'
    file <- './raw-data/repdata-data-StormData.csv.bz2'

    # Download only when no local copy exists, but always parse and cache
    if (!file.exists(file)) {
        download.file(url, file, mode = 'wb')
    }

    storm.data <- data.table(read.csv(bzfile(file)))
    setkey(storm.data, 'REFNUM')
    saveRDS(storm.data, parsed.file)
    storm.data
})()
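
The caching idea used here can be shown in isolation. The following is a minimal sketch rather than the code used for this report: `load.cached` and `build` are hypothetical names, and a temporary file stands in for the real cache path.

```r
# Hypothetical helper: return the cached object if present, otherwise
# build it once, save it, and return it.
load.cached <- function(cache.file, build) {
    if (file.exists(cache.file)) {
        return(readRDS(cache.file))
    }
    result <- build()
    saveRDS(result, cache.file)
    result
}

# Toy stand-in for the expensive download-and-parse step
calls <- 0
build <- function() {
    calls <<- calls + 1
    data.frame(REFNUM = 1:3)
}

cache.file <- tempfile(fileext = ".rds")
first  <- load.cached(cache.file, build)  # builds and saves
second <- load.cached(cache.file, build)  # served from the cache
calls
```

The second call never invokes `build`, which is the behaviour wanted for a slow download and parse.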

Data on storm events is provided from 1950 through November 2011. During this period 902297 weather events were recorded, which collectively represent `r length(unique(storm.data$EVTYPE))` different event types. There are, however, a number of inconsistencies in the data, so additional processing was performed to clean it prior to analysis.

Prior to 1996 the data is limited in scope and only captures Tornado Data (1950 - 1995 from the Storm Prediction Center) and Thunderstorm Wind and Hail Data (1959 - 1995 from the Storm Prediction Center). It was not until 1996 that the capture of storm data was standardized against the current Storm Data Event definitions, so data prior to 1996 was discarded to avoid skewing the analysis towards a narrow range of events.

storm.data <- storm.data[as.Date(BGN_DATE, "%m/%d/%Y") >= "1996-01-01"]
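
The date cut-off can be checked on a handful of values. This sketch assumes that BGN_DATE holds strings of the form month/day/year followed by a time; the sample values below are illustrative, not rows taken from the database.

```r
# Illustrative BGN_DATE-style strings (assumed format: m/d/Y H:M:S)
bgn.date <- c("4/18/1950 0:00:00", "12/31/1995 0:00:00",
              "1/1/1996 0:00:00", "11/28/2011 0:00:00")

# as.Date() parses the leading month/day/year and ignores the trailing time
parsed <- as.Date(bgn.date, format = "%m/%d/%Y")

# Keep only events on or after the 1996 cut-off
keep <- parsed >= as.Date("1996-01-01")
bgn.date[keep]
```

Only the last two values survive the filter, so an event dated 31 December 1995 is excluded and one dated 1 January 1996 is retained.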

As we are looking at the economic and health impact of events, we can discard any events that have neither. We remove observations with no health impact (0 fatalities and 0 injuries) and no economic damage (0 property damage and 0 crop damage).

storm.data <- storm.data[FATALITIES > 0 | INJURIES > 0 | PROPDMG > 0 | CROPDMG > 0]

Event categorization lacks standardization: equivalent events are described with varying case, punctuation and wording. We cleaned the event descriptors as follows:

  • convert all event descriptors to lowercase;
  • replace any punctuation with a space;
  • strip any leading spaces;
  • remove any digits;
  • remove trailing spaces and replace multiple spaces with a single space;
  • remove summary events;
  • remove unspecified event types, i.e. other and none;
  • remove events whose descriptor ends in a stray fragment of one or two characters.

storm.data$EVTYPE <- tolower(storm.data$EVTYPE)
storm.data$EVTYPE <- gsub("[[:blank:][:punct:]+]", " ", storm.data$EVTYPE)
storm.data$EVTYPE <- gsub("^\\s+", "", storm.data$EVTYPE)
storm.data$EVTYPE <- gsub("[0-9]+", "", storm.data$EVTYPE)
storm.data$EVTYPE <- gsub("\\s+$", "", storm.data$EVTYPE)
storm.data$EVTYPE <- gsub("\\s+", " ", storm.data$EVTYPE)

storm.data <- storm.data[!EVTYPE %like% "summary"]
storm.data <- storm.data[!(EVTYPE %like% "none" | EVTYPE %like% "other")]
storm.data <- storm.data[!grep("\\s.{1,2}$", EVTYPE)]
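
The text-cleaning steps can be tried on a few descriptors. The following is a condensed sketch in base R: `normalize.evtype` is a hypothetical helper that folds the `gsub` calls into one function (in a slightly different order), and the input strings are illustrative rather than taken from the database.

```r
# Hypothetical helper combining the text-cleaning steps
normalize.evtype <- function(x) {
    x <- tolower(x)
    x <- gsub("[[:blank:][:punct:]]+", " ", x)  # punctuation and blanks to one space
    x <- gsub("[0-9]+", "", x)                  # drop digits
    x <- gsub("^\\s+|\\s+$", "", x)             # trim both ends
    gsub("\\s+", " ", x)                        # collapse repeated spaces
}

raw <- c("TSTM WIND/HAIL", "  Coastal Flooding/Erosion",
         "High Wind (G40)", "FLASH FLOOD")
cleaned <- normalize.evtype(raw)
cleaned

# Descriptors ending in a one- or two-character fragment, which the
# final filtering step removes
flagged <- grepl("\\s.{1,2}$", cleaned)
cleaned[!flagged]
```

Here "High Wind (G40)" becomes "high wind g" once the punctuation and digits are stripped, which is the kind of residue the last filter targets.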

Subsequent to this event normalization there are

length(unique(storm.data$EVTYPE))
## [1] 163

distinct event types covering

nrow(storm.data)
## [1] 196024

observations. Prior to further analysis we need to normalize these to the 48 standardized event types listed in Table 1 (Storm Data Event Table, section 2.1.1, p. 6) of NOAA’s NWS Documentation. We have done this by producing a lookup table which maps each remaining event type to a standardized value. This was achieved using a technique described by Daniel Falster based on lookup tables and implemented in addNewData.R.

The source for Daniel Falster’s implementation follows:

##' Modifies 'data' by adding new values supplied in newDataFileName
##'
##' newDataFileName is expected to have columns 
##' c(lookupVariable,lookupValue,newVariable,newValue,source)
##' 
##' Within the column 'newVariable', replace values that
##' match 'lookupValue' within column 'lookupVariable' with the value
##' 'newValue'.  If 'lookupVariable' is NA, then replace *all* elements
##' of 'newVariable' with the value 'newValue'.
##'
##' Note that lookupVariable can be the same as newVariable.
##'
##' @param newDataFileName name of lookup table
##' @param data existing data.frame
##' @param allowedVars vector of permissible variable names for newVariable
##' @return modified data.frame
addNewData <- function(newDataFileName, data, allowedVars){

  import <- readNewData(newDataFileName, allowedVars)

  if( !is.null(import)){    
    for(i in seq_len(nrow(import))){  #Make replacements
      col.to <- import$newVariable[i] 
      col.from <- import$lookupVariable[i]
      if(is.na(col.from)){ # apply to whole column
        data[col.to] <- import$newValue[i]
      } else { # apply to subset
        rows <- data[[col.from]] == import$lookupValue[i]
        data[rows,col.to] <- import$newValue[i]
      }
    }   
  }      
  data
}

##' Utility function to read/process newDataFileName for addNewData
##' 
##' @param newDataFileName name of lookup table
##' @param allowedVars vector of permissible variable names for newVariable
##' @return data.frame with columns c(lookupVariable,lookupValue,newVariable,newValue,source)
readNewData <- function(newDataFileName, allowedVars){
  
  if( file.exists(newDataFileName)){
    import <- read.csv(newDataFileName, header=TRUE, stringsAsFactors=FALSE,
                       strip.white=TRUE)
    if( nrow(import)> 0 ){
      
      #Check columns names for import are right
      expectedColumns<- c("lookupVariable","lookupValue","newVariable","newValue")
      nameIsOK <-  expectedColumns %in% names(import)
      if(any(!nameIsOK))
        stop("Incorrect name in lookup table for ",
             newDataFileName, "--> ", paste(expectedColumns[!nameIsOK],
                                            collapse=", "))
      
      #Check values of newVariable are in list of allowed variables
      import$lookupVariable[import$lookupVariable == ""] <- NA
      nameIsOK <- import$newVariable %in% allowedVars
      if(any(!nameIsOK))
        stop("Incorrect name(s) in newVariable column of ",
             newDataFileName, "--> ", paste(import$newVariable[!nameIsOK],
                                      collapse=", "))
    } else {
      import <- NULL
    }
  } else {
    import <- NULL
  }
  import
}

The data mappings used to normalize the remaining events were:

event.map <- read.csv(file="analysis-data/eventmap.csv",head=TRUE,sep=",")
event.map[, c(2, 4)]
##                  lookupValue                 newValue
## 1                          x                    Other
## 2     astronomical high tide         Storm Surge/Tide
## 3      astronomical low tide    Astronomical Low Tide
## 4                  avalanche                Avalanche
## 5              beach erosion            Coastal Flood
## 6                  black ice             Frost/Freeze
## 7                   blizzard                 Blizzard
## 8               blowing dust               Dust Storm
## 9               blowing snow                Ice Storm
## 10                brush fire                 Wildfire
## 11           coastal erosion            Coastal Flood
## 12             coastal flood            Coastal Flood
## 13          coastal flooding            Coastal Flood
## 14  coastal flooding erosion            Coastal Flood
## 15             coastal storm            Coastal Flood
## 16              coastalstorm            Coastal Flood
## 17                      cold          Cold/Wind Chill
## 18             cold and snow          Cold/Wind Chill
## 19          cold temperature          Cold/Wind Chill
## 20              cold weather          Cold/Wind Chill
## 21           cold wind chill          Cold/Wind Chill
## 22                 dam break                    Flood
## 23           damaging freeze             Frost/Freeze
## 24                 dense fog                Dense Fog
## 25               dense smoke              Dense Smoke
## 26                 downburst               Heavy Rain
## 27                   drought                  Drought
## 28                  drowning                    Other
## 29            dry microburst        Thunderstorm Wind
## 30                dust devil               Dust Devil
## 31                dust storm               Dust Storm
## 32        erosion cstl flood            Coastal Flood
## 33            excessive heat           Excessive Heat
## 34            excessive snow               Heavy Snow
## 35             extended cold  Extreme Cold/Wind Chill
## 36              extreme cold  Extreme Cold/Wind Chill
## 37   extreme cold wind chill  Extreme Cold/Wind Chill
## 38         extreme windchill  Extreme Cold/Wind Chill
## 39          falling snow ice                Ice Storm
## 40               flash flood              Flash Flood
## 41         flash flood flood              Flash Flood
## 42                     flood                    Flood
## 43         flood flash flood              Flash Flood
## 44                       fog                Dense Fog
## 45                    freeze             Frost/Freeze
## 46          freezing drizzle             Frost/Freeze
## 47              freezing fog             Freezing Fog
## 48             freezing rain                    Sleet
## 49            freezing spray                    Sleet
## 50                     frost             Frost/Freeze
## 51              frost freeze             Frost/Freeze
## 52              funnel cloud             Funnel Cloud
## 53                     glaze             Frost/Freeze
## 54             gradient wind               High Wind 
## 55                gusty wind               High Wind 
## 56           gusty wind hail               High Wind 
## 57       gusty wind hvy rain               High Wind 
## 58           gusty wind rain               High Wind 
## 59               gusty winds               High Wind 
## 60                      hail                     Hail
## 61            hazardous surf                High Surf
## 62                      heat                     Heat
## 63                 heat wave                     Heat
## 64                heavy rain               Heavy Rain
## 65      heavy rain high surf               Heavy Rain
## 66                heavy seas                High Surf
## 67                heavy snow               Heavy Snow
## 68         heavy snow shower               Heavy Snow
## 69                heavy surf                High Surf
## 70       heavy surf and wind                High Surf
## 71      heavy surf high surf                High Surf
## 72                 high seas                High Surf
## 73                 high surf                High Surf
## 74        high surf advisory                High Surf
## 75               high swells                High Surf
## 76                high water         Storm Surge/Tide
## 77                 high wind               High Wind 
## 78                high winds               High Wind 
## 79                 hurricane      Hurricane (Typhoon)
## 80         hurricane edouard      Hurricane (Typhoon)
## 81         hurricane typhoon      Hurricane (Typhoon)
## 82     hyperthermia exposure  Extreme Cold/Wind Chill
## 83      hypothermia exposure  Extreme Cold/Wind Chill
## 84       ice jam flood minor                    Flood
## 85               ice on road             Frost/Freeze
## 86                 ice roads             Frost/Freeze
## 87                 ice storm             Frost/Freeze
## 88                 icy roads             Frost/Freeze
## 89          lake effect snow         Lake-Effect Snow
## 90           lakeshore flood          Lakeshore Flood
## 91                 landslide              Debris Flow
## 92                landslides              Debris Flow
## 93                 landslump              Debris Flow
## 94                 landspout              Debris Flow
## 95          late season snow               Heavy Snow
## 96       light freezing rain                    Sleet
## 97                light snow          Winter Weather 
## 98            light snowfall          Winter Weather 
## 99                 lightning                Lightning
## 100          marine accident                    Other
## 101              marine hail              Marine Hail
## 102         marine high wind         Marine High Wind
## 103       marine strong wind       Marine Strong Wind
## 104         marine tstm wind Marine Thunderstorm Wind
## 105 marine thunderstorm wind Marine Thunderstorm Wind
## 106               microburst        Thunderstorm Wind
## 107             mixed precip               Heavy Rain
## 108      mixed precipitation               Heavy Rain
## 109                mud slide              Debris Flow
## 110                 mudslide              Debris Flow
## 111                mudslides              Debris Flow
## 112   non severe wind damage               High Wind 
## 113            non tstm wind               High Wind 
## 114    non thunderstorm wind               High Wind 
## 115                    other                    Other
## 116                     rain               Heavy Rain
## 117                rain snow               Heavy Rain
## 118              record heat           Excessive Heat
## 119              rip current              Rip Current
## 120             rip currents              Rip Current
## 121              river flood                    Flood
## 122           river flooding                    Flood
## 123               rock slide              Debris Flow
## 124               rogue wave                High Surf
## 125               rough seas                High Surf
## 126               rough surf                High Surf
## 127                   seiche                   Seiche
## 128               small hail                     Hail
## 129                     snow               Heavy Snow
## 130             snow and ice                Ice Storm
## 131              snow squall                Ice Storm
## 132             snow squalls                Ice Storm
## 133              storm surge         Storm Surge/Tide
## 134         storm surge tide         Storm Surge/Tide
## 135              strong wind              Strong Wind
## 136             strong winds              Strong Wind
## 137             thunderstorm        Thunderstorm Wind
## 138        thunderstorm wind        Thunderstorm Wind
## 139           tidal flooding         Storm Surge/Tide
## 140                  tornado                  Tornado
## 141      torrential rainfall               Heavy Rain
## 142      tropical depression      Tropical Depression
## 143           tropical storm           Tropical Storm
## 144                tstm wind        Thunderstorm Wind
## 145  tstm wind and lightning        Thunderstorm Wind
## 146           tstm wind hail        Thunderstorm Wind
## 147                  tsunami                  Tsunami
## 148                  typhoon      Hurricane (Typhoon)
## 149        unseasonably warm                     Heat
## 150     urban sml stream fld                    Flood
## 151             volcanic ash             Volcanic Ash
## 152             warm weather                     Heat
## 153               waterspout               Waterspout
## 154           wet microburst        Thunderstorm Wind
## 155                whirlwind              Strong Wind
## 156         wild forest fire                 Wildfire
## 157                 wildfire                 Wildfire
## 158                     wind               High Wind 
## 159            wind and wave               High Wind 
## 160              wind damage               High Wind 
## 161                    winds               High Wind 
## 162             winter storm             Winter Storm
## 163           winter weather          Winter Weather 
## 164       winter weather mix          Winter Weather 
## 165               wintry mix          Winter Weather

Applying these mappings to the filtered storm data

allowedVars<-c("EVTYPE")
storm.data <- addNewData("analysis-data/eventmap.csv", storm.data, allowedVars)

results in

length(unique(storm.data$EVTYPE))
## [1] 49

distinct event types which represent the 48 standard types and an additional classification of Other to account for event types that it was not possible to map.
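
The idea behind a lookup-table replacement can be illustrated with a named vector in base R. This is a simplified sketch, not the addNewData implementation: the three mappings are taken from the table above, while the unmatched descriptor and the fallback to Other are assumptions made for illustration.

```r
# Lookup from cleaned descriptor to standardized event type
event.lookup <- c("tstm wind"      = "Thunderstorm Wind",
                  "river flooding" = "Flood",
                  "heat wave"      = "Heat")

# The last descriptor is hypothetical and has no entry in the lookup
evtype <- c("tstm wind", "river flooding", "heat wave", "unknown event")

# Indexing by name returns NA for descriptors without a mapping
mapped <- unname(event.lookup[evtype])

# Assumption: anything left unmapped is classified as Other
mapped[is.na(mapped)] <- "Other"
mapped
```

Keeping the mappings in a table rather than in code means they can be reviewed and extended without touching the analysis script.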

Results

Storm Events Impact on Population Health

Storm events impacting population health can be considered to be those events resulting in injuries or fatalities. During the sample period there were a total of

nrow(storm.data[FATALITIES > 0 | INJURIES > 0])
## [1] 12760

events which had an impact on population health. Isolating those events to a discrete data frame

health.impact <- storm.data[FATALITIES > 0 | INJURIES > 0]

and determining the total impact of each health event, given that some events produce both injuries and fatalities

# Treat missing counts as zero so the sum is always defined
health.impact$INJURIES[is.na(health.impact$INJURIES)] <- 0
health.impact$FATALITIES[is.na(health.impact$FATALITIES)] <- 0
health.impact <- health.impact[, IMPACT := INJURIES + FATALITIES]

we are then able to plot the results, restricting each plot to the events having the greatest impact

library(ggplot2)
library(gridExtra)

injuries <- ggplot(health.impact[, head(.SD, 20), by=INJURIES],
                   aes(x = EVTYPE, y = INJURIES)) + 
                   geom_bar(stat = "identity",  
                   aes(fill = INJURIES),
                   position = "dodge") + 
                   theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
                   xlab("Event") + 
                   ylab("Injuries")

fatalities <- ggplot(health.impact[, head(.SD, 20), by=FATALITIES],
                     aes(x = EVTYPE, y = FATALITIES)) + 
                     geom_bar(stat = "identity", 
                     aes(fill = FATALITIES),
                     position = "dodge") + 
                     theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
                     xlab("Event") + 
                     ylab("Fatalities")

combined <- ggplot(health.impact[, head(.SD, 20), by=IMPACT], 
                   aes(x = EVTYPE, y = IMPACT)) + 
                   geom_bar(stat = "identity", 
                   aes(fill = IMPACT),
                   position = "dodge") + 
                   theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
                   xlab("Event") + 
                   ylab("Health Impact")

# gridExtra 2.0 and later take the overall title via 'top' (formerly 'main')
grid.arrange(injuries, fatalities, combined, ncol=1, top = "Storm Events Impacting Population Health in the United States (1996-2011)")

The bar charts indicate that the Tornado is the event type with the greatest impact on population health, in terms of both injuries and fatalities.
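
Ranking event types by total health impact amounts to summing injuries and fatalities over all events of each type. The following is a minimal sketch of that aggregation in base R, using invented toy figures rather than values from the storm database; it is one possible approach, not the code that produced the plots.

```r
# Toy data (invented values, for illustration only)
toy <- data.frame(
    EVTYPE     = c("Tornado", "Flood", "Tornado", "Excessive Heat", "Flood"),
    INJURIES   = c(10, 2, 5, 0, 1),
    FATALITIES = c(1, 0, 2, 3, 1),
    stringsAsFactors = FALSE
)

# Sum injuries and fatalities per event type
totals <- aggregate(cbind(INJURIES, FATALITIES) ~ EVTYPE, data = toy, FUN = sum)

# Combined impact, sorted with the largest first
totals$IMPACT <- totals$INJURIES + totals$FATALITIES
totals <- totals[order(-totals$IMPACT), ]
totals
```

Taking the first rows of the sorted table then gives the event types with the highest totals, ready for plotting.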

Storm Events Economic Impact

Economic damage caused by storm events is classified into two distinct groups: property damage (PROPDMG) and crop damage (CROPDMG). The dollar value of the damage is given by the amount together with an exponent (PROPDMGEXP or CROPDMGEXP). The exponent is expected to be either numeric or a letter code, e.g. m or M to represent 10^6. To facilitate the subsequent analysis the exponents were normalized to a numeric representation.

map.exponent.as.numeric <- function(exponent, ...) {
    exponent <- as.factor(exponent)
    levels(exponent) <- list(...)
    exponent
}

PROPDMGEXP has

unique(storm.data$PROPDMGEXP)
## [1] K   M B
## Levels:  - ? + 0 1 2 3 4 5 6 7 8 B h H K m M

values to be mapped in the exponent.

storm.data$PROPDMGEXP <- map.exponent.as.numeric(storm.data$PROPDMGEXP, 
                                                 "1"=c("1"), "2"=c("2"), "3"=c("3"), "4"=c("4"), "5"=c("5"), "6"=c("6"), "7"=c("7"), "8"=c("8"), "9"=c("9"), 
                                                 "100"=c('h', 'H'), "1000"=c('k','K'), "1000000"=c('m','M'), "1000000000"=c('b','B'))

CROPDMGEXP has

unique(storm.data$CROPDMGEXP)
## [1] K   M B
## Levels:  ? 0 2 B k K m M

values to be mapped in the exponent.

storm.data$CROPDMGEXP <- map.exponent.as.numeric(storm.data$CROPDMGEXP, 
                                                 "1"=c("1"), "2"=c("2"), "3"=c("3"), "4"=c("4"), "5"=c("5"), "6"=c("6"), "7"=c("7"), "8"=c("8"), "9"=c("9"), 
                                                 "100"=c('h', 'H'), "1000"=c('k','K'), "1000000"=c('m','M'), "1000000000"=c('b','B'))

Having mapped the values to the damage exponents we are able to determine values for the economic damage associated with impacting events by multiplying the damage amounts (PROPDMG and CROPDMG) by the respective exponents (PROPDMGEXP and CROPDMGEXP).
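
As a worked example of this multiplication, a property damage amount of 25 with exponent K represents 25 x 1000 = 25,000 USD. The sketch below covers the letter codes only; `exponent.value` and `to.dollars` are hypothetical names, and treating blank or unrecognised exponents as zero is an assumption that mirrors the NA-to-zero replacement applied later in this report.

```r
# Multipliers for the letter codes (case-insensitive)
exponent.value <- c(h = 1e2, k = 1e3, m = 1e6, b = 1e9)

# Hypothetical helper: convert amount and exponent code to dollars
to.dollars <- function(amount, exponent) {
    multiplier <- unname(exponent.value[tolower(as.character(exponent))])
    # Assumption: blank or unrecognised exponents contribute nothing
    multiplier[is.na(multiplier)] <- 0
    amount * multiplier
}

# Illustrative amounts and exponent codes
to.dollars(c(25, 2.5, 1.2, 10), c("K", "M", "B", ""))
```

The four results correspond to 25 thousand, 2.5 million and 1.2 billion dollars, with the blank exponent giving zero.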

Extracting that subset of the data which covers events with an economic impact

economic.impact <- storm.data[PROPDMG > 0 | CROPDMG > 0]

we have a total of

nrow(economic.impact)
## [1] 189232

events which have caused an economic impact. We determine the notional dollar value of each of these events by multiplying the damage amounts by the mapped exponents and, given that some events cause both crop and property damage, summing the two to arrive at the total economic impact.

# Convert each factor exponent to its numeric multiplier and scale the amount
economic.impact[, PROPDMG1 := PROPDMG * as.numeric(levels(PROPDMGEXP))[PROPDMGEXP]]
economic.impact[, CROPDMG1 := CROPDMG * as.numeric(levels(CROPDMGEXP))[CROPDMGEXP]]
# Unmapped exponents yield NA; count those as zero damage
economic.impact[is.na(PROPDMG1), PROPDMG1 := 0]
economic.impact[is.na(CROPDMG1), CROPDMG1 := 0]
economic.impact[, TOTALDMG := CROPDMG1 + PROPDMG1]

we are then able to plot the notional impact (USD) of the events

cropdmg <- ggplot(economic.impact[, head(.SD, 10), by=CROPDMG1],
                   aes(x = EVTYPE, y = CROPDMG1)) + 
                   geom_bar(stat = "identity",  
                   aes(fill = CROPDMG1),
                   position = "dodge") + 
                   theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
                   xlab("Event") + 
                   ylab("Crop Damage (USD)")

propdmg <- ggplot(economic.impact[, head(.SD, 10), by=PROPDMG1],
                     aes(x = EVTYPE, y = PROPDMG1)) + 
                     geom_bar(stat = "identity", 
                     aes(fill = PROPDMG1),
                     position = "dodge") + 
                     theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
                     xlab("Event") + 
                     ylab("Property Damage (USD)")

combined <- ggplot(economic.impact[, head(.SD, 10), by=TOTALDMG], 
                   aes(x = EVTYPE, y = TOTALDMG)) + 
                   geom_bar(stat = "identity", 
                   aes(fill = TOTALDMG),
                   position = "dodge") + 
                   theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
                   xlab("Event") + 
                   ylab("Total Damage (USD)")

grid.arrange(cropdmg, propdmg, combined, ncol=1, top = "Economic Impact of Storm Events in the United States (1996-2011)")

The bar charts indicate that Hurricane (Typhoon) events cause the most crop damage and Debris Flow events cause the most property damage. However, in aggregate, it is the Flood which causes the most damage overall.

Conclusion

From 1996 through 2011, the Tornado is the weather event which has had the most impact on human health, in terms of both injuries and fatalities, while the Flood is the weather event which has caused the most economic damage over the same period.