Health and Economic Impact of Weather Events in the US

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Synopsis

The analysis of the storm event database shows that tornadoes are the most harmful weather event for population health, causing the most fatalities and injuries by a wide margin. Excessive heat ranks second in fatalities. The economic impact of weather events was also analyzed. Floods caused the largest property damage between 1950 and 2011 (about 145 billion dollars), followed by hurricanes/typhoons and tornadoes. The largest crop damage was caused by drought, followed by floods, ice storms, and hail.

Data Processing

The analysis was performed on the Storm Events Database, provided by the National Climatic Data Center. The data come from a bzip2-compressed comma-separated-value file, available here. Documentation for the data is also available here.

The first step is to read the data into a data frame.

dat <- read.csv('data/repdata_data_StormData.csv.bz2')

Before the analysis, the data need some preprocessing. Event types don’t have a specific format. For instance, there are events with types Frost/Freeze, FROST/FREEZE and FROST\\FREEZE which obviously refer to the same type of event.

# number of unique event types
length(unique(dat$EVTYPE))
## [1] 985
event_types <- dat$EVTYPE
event_types <- gsub("[[:blank:][:punct:]+]", " ", event_types)
#Update the data frame
dat$EVTYPE <- event_types

No further data preprocessing was performed, although the event type field could be processed further, for example by normalizing letter case or by merging event types such as tstm wind and thunderstorm wind. After the cleaning, the number of unique event types decreases, because labels that differed only in punctuation now coincide. The cleaned event types are used in the rest of the analysis.
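The further processing mentioned above could look like the sketch below. It is only an illustration: `normalize_evtype` is a hypothetical helper that is not part of the original analysis, and expanding `TSTM` to `THUNDERSTORM` is an assumption about which labels should be merged.

```r
# Hypothetical helper: normalize raw event-type labels before grouping.
normalize_evtype <- function(x) {
  x <- toupper(x)                                    # ignore case differences
  x <- gsub("[[:punct:]]", " ", x)                   # punctuation -> space
  x <- gsub("[[:space:]]+", " ", x)                  # collapse runs of whitespace
  x <- trimws(x)                                     # drop leading/trailing spaces
  x <- gsub("\\bTSTM\\b", "THUNDERSTORM", x, perl = TRUE)  # assumed abbreviation
  x
}

raw <- c("Frost/Freeze", "FROST/FREEZE", "FROST\\FREEZE",
         " TSTM WIND", "Thunderstorm  Wind")
cleaned <- normalize_evtype(raw)
unique(cleaned)
```

Applying such a helper to `dat$EVTYPE` would change the number of unique event types and the row indices printed in the tables of this report, so the figures shown here correspond to the simpler cleaning step only.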

Dangerous Events with respect to Population Health

To find the event types that are most harmful to population health, the numbers of fatalities and injuries are summed by event type.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
eventype_fatalities <- as.data.frame(dat %>% group_by(EVTYPE) %>% summarise(fatalities = sum(FATALITIES), injuries = sum(INJURIES)))
#Show the 10 event types with the most fatalities
head(eventype_fatalities[order(eventype_fatalities$fatalities, decreasing = TRUE), c('EVTYPE', "fatalities")], 10)
##             EVTYPE fatalities
## 810        TORNADO       5633
## 124 EXCESSIVE HEAT       1903
## 151    FLASH FLOOD        978
## 268           HEAT        937
## 446      LIGHTNING        816
## 830      TSTM WIND        504
## 167          FLOOD        470
## 564    RIP CURRENT        368
## 337      HIGH WIND        248
## 19       AVALANCHE        224
#Show the 10 event types with the most injuries
head(eventype_fatalities[order(eventype_fatalities$injuries, decreasing = TRUE), c("EVTYPE","injuries")], 10)
##                EVTYPE injuries
## 810           TORNADO    91346
## 830         TSTM WIND     6957
## 167             FLOOD     6789
## 124    EXCESSIVE HEAT     6525
## 446         LIGHTNING     5230
## 268              HEAT     2100
## 417         ICE STORM     1975
## 151       FLASH FLOOD     1777
## 740 THUNDERSTORM WIND     1488
## 238              HAIL     1361
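For readers who prefer to avoid the dplyr dependency, the same group-and-sum step can be sketched in base R with `aggregate()`. The data frame `toy` below holds made-up values for illustration, not NOAA records.

```r
# Toy data (made-up values), not the NOAA records.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(3, 1, 2, 0, 2),
  INJURIES   = c(10, 0, 5, 2, 1),
  stringsAsFactors = FALSE
)

# Sum both casualty columns within each event type.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by fatalities, largest first.
totals[order(totals$FATALITIES, decreasing = TRUE), ]
```

Note that the formula interface of `aggregate()` drops rows containing `NA` by default, which may or may not be the desired behavior for a given data set.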

Economic Effects of Weather Events

To analyze the impact of weather events on the economy, the available property damage and crop damage reports and estimates were used.

In the raw data, the property damage is represented with two fields, a number PROPDMG in dollars and the exponent PROPDMGEXP. Similarly, the crop damage is represented using two fields, CROPDMG and CROPDMGEXP. The first step in the analysis is to calculate the property and crop damage for each event.

#Let's first create a function which will return the exponent digit given a letter indicating the exponent
exp_function <- function(x){
  if(x %in% c('k', 'K')){
    return(3)
  }
  else if(x %in% c('M', 'm')){
    return(6)
  }
  else if(x %in% c('B', 'b')){
    return(9)
  }
  else if(x %in% c('h','H')){
    return(2)
  }
  else if(!is.na(as.numeric(x))){
    return(as.numeric(x))
  }
  else if(x %in% c('', '-', '?', '+')){
    return(0)
  }
  else{
    stop('Invalid entry')
  }
}
#Now let's apply the conversion to the data frame
#as.numeric() warns for the codes '-', '?' and '+', which reach the numeric check before the symbol branch; they still map to 0
prop_dmg_exp <- sapply(dat$PROPDMGEXP, FUN = exp_function)
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
dat$prop_dmg <- dat$PROPDMG * (10**prop_dmg_exp)
crop_dmg_exp <- sapply(dat$CROPDMGEXP, FUN = exp_function)
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
dat$crop_dmg <- dat$CROPDMG * (10**crop_dmg_exp)
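As an alternative to calling `exp_function` once per row through `sapply()`, the exponent codes can be decoded in a single vectorized pass. The sketch below is a hypothetical variant, not the code used for the results in this report: `decode_exp` and `exp_lookup` are illustrative names, and unlike `exp_function` it maps any unrecognized code to 0 instead of stopping with an error.

```r
# Map exponent letters to powers of ten with a named lookup vector.
# Assumption: blank and symbol codes ("", "-", "?", "+") mean exponent 0,
# matching exp_function above.
exp_lookup <- c(H = 2, K = 3, M = 6, B = 9)

decode_exp <- function(code) {
  code <- toupper(as.character(code))
  out <- rep(0, length(code))                   # default exponent
  is_digit <- grepl("^[0-9]$", code)
  out[is_digit] <- as.numeric(code[is_digit])   # single digits are used as-is
  is_letter <- code %in% names(exp_lookup)
  out[is_letter] <- exp_lookup[code[is_letter]] # letters via the lookup
  out
}

decode_exp(c("K", "m", "B", "h", "5", "", "?", "+"))

# Worked example: a reported value of 2.5 with code "M" is 2.5 million dollars.
2.5 * 10^decode_exp("M")
```

Because only strings made of a single digit are passed to `as.numeric()`, this version does not emit coercion warnings.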


#Next, let's calculate the total economic loss per event type
econ_loss <- as.data.frame(dat %>% group_by(EVTYPE) %>% summarise(crop_loss = sum(crop_dmg), prop_loss = sum(prop_dmg)))
#Show the 10 event types with the highest crop losses
head(econ_loss[order(econ_loss$crop_loss, decreasing = TRUE),c('EVTYPE','crop_loss')], 10)
##                EVTYPE   crop_loss
## 91            DROUGHT 13972566000
## 167             FLOOD  5661968450
## 568       RIVER FLOOD  5029459000
## 417         ICE STORM  5022113500
## 238              HAIL  3025954473
## 379         HURRICANE  2741910000
## 387 HURRICANE TYPHOON  2607872800
## 151       FLASH FLOOD  1421317100
## 132      EXTREME COLD  1292973000
## 196      FROST FREEZE  1094086000
#Show the 10 event types with the highest property losses
head(econ_loss[order(econ_loss$prop_loss, decreasing = TRUE),c('EVTYPE','prop_loss')], 10)
##                EVTYPE    prop_loss
## 167             FLOOD 144657709807
## 387 HURRICANE TYPHOON  69305840000
## 810           TORNADO  56947380677
## 643       STORM SURGE  43323536000
## 151       FLASH FLOOD  16822673979
## 238              HAIL  15735267513
## 379         HURRICANE  11868319010
## 823    TROPICAL STORM   7703890550
## 941      WINTER STORM   6688497251
## 337         HIGH WIND   5270046295

Results

Health impact of weather events

The following plots show the most dangerous weather event types. Only the 10 event types with the most fatalities and the 10 with the most injuries are considered.

library(ggplot2)
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
eventype_fatalities_filter <- eventype_fatalities[order(eventype_fatalities$fatalities, decreasing = TRUE),]
eventype_injuries_filter <- eventype_fatalities[order(eventype_fatalities$injuries, decreasing = TRUE),]

p1 <- ggplot(data = head(eventype_fatalities_filter,10), aes(x = EVTYPE, y = fatalities)) + geom_bar(stat = 'identity') + labs(x = 'Event type', y = 'Number of fatalities', title = 'Fatalities as a function of the Event type') + coord_flip()
p2 <- ggplot(data = head(eventype_injuries_filter,10), aes(x = EVTYPE, y = injuries)) + geom_bar(stat = 'identity') + labs(x = 'Event type', y = 'Number of injuries', title = 'Injuries as a function of the Event type') + coord_flip()
grid.arrange(p1, p2, top="Top deadly weather events in the US (1950-2011)")
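In the plots above the bars follow the alphabetical order of `EVTYPE`, not the size of the totals. One way to sort the bars by value is `reorder()`, sketched below on made-up numbers. The data frame `toy` is hypothetical, and the plotting part runs only if ggplot2 is installed.

```r
# Toy totals (made-up values), standing in for the top rows of eventype_fatalities.
toy <- data.frame(
  EVTYPE     = c("HEAT", "FLOOD", "TORNADO", "LIGHTNING"),
  fatalities = c(90, 40, 560, 80),
  stringsAsFactors = FALSE
)

# reorder() (from stats) sorts factor levels by a numeric variable, so the
# largest bar ends up at the top after coord_flip().
toy$EVTYPE <- reorder(toy$EVTYPE, toy$fatalities)
levels(toy$EVTYPE)

# Plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = EVTYPE, y = fatalities)) +
    geom_col() +
    coord_flip() +
    labs(x = "Event type", y = "Number of fatalities")
  print(p)
}
```

The same idea applies to the injury plot by reordering on the `injuries` column instead.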

Economic impact of weather events

The following plots show the weather event types with the highest economic cost since 1950.

econ_loss_prop_filter <- head(econ_loss[order(econ_loss$prop_loss, decreasing = TRUE),c('EVTYPE','prop_loss')], 10)
econ_loss_crop_filter <- head(econ_loss[order(econ_loss$crop_loss, decreasing = TRUE),c('EVTYPE','crop_loss')], 10)

p1 <- ggplot(data = econ_loss_prop_filter, aes(x = EVTYPE, y = prop_loss)) + geom_bar(stat = 'identity') + labs(x = 'Event type', y = 'Property losses', title = 'Property losses as a function of the Event type') + coord_flip()
p2 <- ggplot(data = head(econ_loss_crop_filter,10), aes(x = EVTYPE, y = crop_loss)) + geom_bar(stat = 'identity') + labs(x = 'Event type', y = 'Crop losses', title = 'Crop losses as a function of the Event type') + coord_flip()
grid.arrange(p1, p2, top="Weather costs to the US economy (1950-2011)")

The data show that floods caused the largest property damage among weather-related natural disasters (about 145 billion dollars), followed by hurricanes/typhoons, tornadoes, and storm surges. The plots use a linear scale, so the largest categories dominate the smaller ones. Note that, because the event-type field is untidy, flood and flash flood are recorded as separate values and should be merged for more accurate data-driven conclusions.
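The merge of flood categories suggested above could be sketched as follows. The values in `toy` are made up, and the rule that every label containing `FLOOD` belongs to a single category is an assumption that may be too broad for the real data.

```r
# Toy losses (made-up values), not figures from this report.
toy <- data.frame(
  EVTYPE    = c("FLOOD", "FLASH FLOOD", "TORNADO", "RIVER FLOOD"),
  prop_loss = c(100, 20, 50, 5),
  stringsAsFactors = FALSE
)

# Assumption: every label containing "FLOOD" belongs to one flood category.
toy$group <- ifelse(grepl("FLOOD", toy$EVTYPE), "FLOOD (ALL)", toy$EVTYPE)

# Re-aggregate the losses on the merged labels and rank them.
merged <- aggregate(prop_loss ~ group, data = toy, FUN = sum)
merged[order(merged$prop_loss, decreasing = TRUE), ]
```

Whether flash floods and river floods should be treated as one category is a modeling choice; keeping them separate preserves detail that a merged total hides.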

The most severe weather event in terms of crop damage is drought, which caused about 14 billion dollars in crop damage between 1950 and 2011. Other event types with large crop losses are floods, river floods, ice storms, and hail.