Risk Assessment of Severe Weather Events based on NOAA historic data (1950-2011)

iair kleiman

1. Synopsis

The objective of this study is to identify the most lethal weather events and those with the greatest economic impact. To do this, I used NOAA historic data from 1950 to 2011. Some significant data processing and correction was needed: the raw database had misspellings and different names for the same weather events, and the economic impact was recorded in different units (hundreds, thousands, millions, billions).

The most dangerous events (by fatalities) were tornadoes, hot weather, floods, storms and lightning. The most expensive events were floods, hurricanes, storms, tornadoes and hail.

2. Data Processing

library(dplyr)
## 
## Attaching package: 'dplyr'
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lubridate)
library(ggplot2)
library(pander)  # A better looking alternative to XTABLE
## Warning: package 'pander' was built under R version 3.1.3
library(gridExtra) # package that enables arranging multiple ggplots
## Warning: package 'gridExtra' was built under R version 3.1.3
## Loading required package: grid
storm_data <- read.csv(bzfile("repdata-data-StormData.csv.bz2")) # import the compressed data file
storm_tbl <- tbl_df(storm_data) # convert the data.frame to a dplyr tbl

To avoid even more name fixing, I converted EVTYPE, PROPDMGEXP and CROPDMGEXP to lowercase.

storm_tbl <- storm_tbl %>% mutate(EVTYPE = tolower(EVTYPE),
                                  PROPDMGEXP = tolower(PROPDMGEXP),
                                  CROPDMGEXP = tolower(CROPDMGEXP))

The damage amounts are not all in the same units: some are in hundreds of dollars, others in thousands, millions or even billions. Let’s bring everything to millions of dollars.

# Convert property damage to millions of dollars; any other exponent code gives 0
storm_tbl <- storm_tbl %>% mutate(Property_Damage = ifelse(PROPDMGEXP=="h", PROPDMG * 100/1000000,
        ifelse(PROPDMGEXP=="k", PROPDMG * 1/1000, ifelse(PROPDMGEXP=="m", PROPDMG * 1,
        ifelse(PROPDMGEXP=="b", PROPDMG * 1000, 0)))))

# Same conversion for crop damage
storm_tbl <- storm_tbl %>% mutate(Crop_Damage = ifelse(CROPDMGEXP=="h", CROPDMG * 100/1000000,
        ifelse(CROPDMGEXP=="k", CROPDMG * 1/1000, ifelse(CROPDMGEXP=="m", CROPDMG * 1,
        ifelse(CROPDMGEXP=="b", CROPDMG * 1000, 0)))))
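
The same conversion can also be sketched with a lookup table instead of nested ifelse calls. This is only an illustration in base R: exp_multiplier and to_millions are hypothetical names that do not appear in the analysis, and, like the code above, the sketch assumes that any exponent code other than h, k, m or b contributes 0.

```r
# Hypothetical lookup table: dollar multiplier for each exponent code.
exp_multiplier <- c(h = 1e2, k = 1e3, m = 1e6, b = 1e9)

# Hypothetical helper: convert an amount and its exponent code to millions of dollars.
to_millions <- function(amount, exp_code) {
    # Codes missing from the table (e.g. "", "?", digits) index to NA.
    mult <- unname(exp_multiplier[tolower(exp_code)])
    mult[is.na(mult)] <- 0  # assumption: unknown codes count as 0
    amount * mult / 1e6
}

# 25 thousand, 2.5 million and 1 billion dollars give 0.025, 2.5 and 1000 million.
to_millions(c(25, 2.5, 1), c("K", "m", "b"))
```

With a lookup table the unit codes live in one place, so handling another code would only mean adding an entry.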

Before cleaning the EVTYPE names, I wanted to know how many different EVTYPE names there were.

sorted <- unique(storm_tbl$EVTYPE) 
sorted <- sort(sorted)
head(sorted, 150)
##   [1] "   high surf advisory"          " coastal flood"                
##   [3] " flash flood"                   " lightning"                    
##   [5] " tstm wind"                     " tstm wind (g45)"              
##   [7] " waterspout"                    " wind"                         
##   [9] "?"                              "abnormal warmth"               
##  [11] "abnormally dry"                 "abnormally wet"                
##  [13] "accumulated snowfall"           "agricultural freeze"           
##  [15] "apache county"                  "astronomical high tide"        
##  [17] "astronomical low tide"          "avalance"                      
##  [19] "avalanche"                      "beach erosin"                  
##  [21] "beach erosion"                  "beach erosion/coastal flood"   
##  [23] "beach flood"                    "below normal precipitation"    
##  [25] "bitter wind chill"              "bitter wind chill temperatures"
##  [27] "black ice"                      "blizzard"                      
##  [29] "blizzard and extreme wind chil" "blizzard and heavy snow"       
##  [31] "blizzard summary"               "blizzard weather"              
##  [33] "blizzard/freezing rain"         "blizzard/heavy snow"           
##  [35] "blizzard/high wind"             "blizzard/winter storm"         
##  [37] "blow-out tide"                  "blow-out tides"                
##  [39] "blowing dust"                   "blowing snow"                  
##  [41] "blowing snow- extreme wind chi" "blowing snow & extreme wind ch"
##  [43] "blowing snow/extreme wind chil" "breakup flooding"              
##  [45] "brush fire"                     "brush fires"                   
##  [47] "coastal  flooding/erosion"      "coastal erosion"               
##  [49] "coastal flood"                  "coastal flooding"              
##  [51] "coastal flooding/erosion"       "coastal storm"                 
##  [53] "coastal surge"                  "coastal/tidal flood"           
##  [55] "coastalflood"                   "coastalstorm"                  
##  [57] "cold"                           "cold air funnel"               
##  [59] "cold air funnels"               "cold air tornado"              
##  [61] "cold and frost"                 "cold and snow"                 
##  [63] "cold and wet conditions"        "cold temperature"              
##  [65] "cold temperatures"              "cold wave"                     
##  [67] "cold weather"                   "cold wind chill temperatures"  
##  [69] "cold/wind chill"                "cold/winds"                    
##  [71] "cool and wet"                   "cool spell"                    
##  [73] "cstl flooding/erosion"          "dam break"                     
##  [75] "dam failure"                    "damaging freeze"               
##  [77] "deep hail"                      "dense fog"                     
##  [79] "dense smoke"                    "downburst"                     
##  [81] "downburst winds"                "driest month"                  
##  [83] "drifting snow"                  "drought"                       
##  [85] "drought/excessive heat"         "drowning"                      
##  [87] "dry"                            "dry conditions"                
##  [89] "dry hot weather"                "dry microburst"                
##  [91] "dry microburst 50"              "dry microburst 53"             
##  [93] "dry microburst 58"              "dry microburst 61"             
##  [95] "dry microburst 84"              "dry microburst winds"          
##  [97] "dry mircoburst winds"           "dry pattern"                   
##  [99] "dry spell"                      "dry weather"                   
## [101] "dryness"                        "dust devel"                    
## [103] "dust devil"                     "dust devil waterspout"         
## [105] "dust storm"                     "dust storm/high winds"         
## [107] "duststorm"                      "early freeze"                  
## [109] "early frost"                    "early rain"                    
## [111] "early snow"                     "early snowfall"                
## [113] "erosion/cstl flood"             "excessive"                     
## [115] "excessive cold"                 "excessive heat"                
## [117] "excessive heat/drought"         "excessive precipitation"       
## [119] "excessive rain"                 "excessive rainfall"            
## [121] "excessive snow"                 "excessive wetness"             
## [123] "excessively dry"                "extended cold"                 
## [125] "extreme cold"                   "extreme cold/wind chill"       
## [127] "extreme heat"                   "extreme wind chill"            
## [129] "extreme wind chill/blowing sno" "extreme wind chills"           
## [131] "extreme windchill"              "extreme windchill temperatures"
## [133] "extreme/record cold"            "extremely wet"                 
## [135] "falling snow/ice"               "first frost"                   
## [137] "first snow"                     "flash flood"                   
## [139] "flash flood - heavy rain"       "flash flood from ice jams"     
## [141] "flash flood landslides"         "flash flood winds"             
## [143] "flash flood/"                   "flash flood/ flood"            
## [145] "flash flood/ street"            "flash flood/flood"             
## [147] "flash flood/heavy rain"         "flash flood/landslide"         
## [149] "flash flooding"                 "flash flooding/flood"
summary(sorted)
##    Length     Class      Mode 
##       898 character character

There are many inconsistencies in the EVTYPE naming. For example, thunderstorm is written in several ways (including tstm and misspellings), and wind appears as wind, winds and wnd. Some names have words such as record or excessive added. There are also spelling errors and even leading spaces before a word. I will make some fixes, but there is a lot of work to be done.
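
Leading and repeated spaces can be handled before any pattern matching. Below is a minimal sketch in base R, using three names taken from the listing above. This step is not part of the analysis itself, and trimws() assumes R 3.2.0 or later.

```r
# Three raw names from the listing: leading spaces and a doubled inner space.
raw_names <- c("   high surf advisory", " tstm wind", "coastal  flooding/erosion")

# trimws() removes leading/trailing whitespace; gsub() collapses inner runs of spaces.
clean_names <- gsub("\\s+", " ", trimws(raw_names))
print(clean_names)
```

After this step " tstm wind" and "tstm wind" would count as the same name.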

The first fix is to drop the rows whose event type includes the text “summary”.

storm_tbl <- storm_tbl %>% filter(!grepl("summary", EVTYPE))

Now let the heavy name fixing begin! I will try to cluster the event type categories so that similar events and their consequences can be added together later.

storm_tbl <- storm_tbl %>%
        mutate(EVTYPE = ifelse(grepl("tstm|thund|thundeerstorm|tunderstorm|thundertsorm",
                                     EVTYPE), "thunderstorm", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("tornado|torndao", EVTYPE),"tornado", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("hail", EVTYPE), "hail", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("lightning|lighting|ligntning", EVTYPE),
                   "lightning", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("blizzard", EVTYPE),"blizzard", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("rain", EVTYPE), "rain", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("cold", EVTYPE),"cold weather", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("mud", EVTYPE),"mudslide", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("hurricane", EVTYPE),"hurricane", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("hot", EVTYPE),"hot weather", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("wind|wnd", EVTYPE),"winds", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("flood|fld", EVTYPE), "flood", EVTYPE)) %>%  
        mutate(EVTYPE = ifelse(grepl("snow", EVTYPE),"snow", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("winter storm", EVTYPE), 
                               "winter storm", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("tropical ", EVTYPE),
                   "tropical storm", EVTYPE)) %>%
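        # catch-all: relabels every name still containing "storm", including the
        # "thunderstorm", "winter storm" and "tropical storm" groups built above
        mutate(EVTYPE = ifelse(grepl("storm", EVTYPE),"storm", EVTYPE)) %>%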
        mutate(EVTYPE = ifelse(grepl("volcanic", EVTYPE),"Volcanic Activity", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("wild", EVTYPE),"wildfire", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("heat", EVTYPE),"hot weather", EVTYPE)) %>%
        mutate(EVTYPE = ifelse(grepl("current", EVTYPE),"rip current", EVTYPE))   
sorted <- unique(storm_tbl$EVTYPE) 
sorted <- sort(sorted)
sorted
##   [1] "   high surf advisory"         " waterspout"                  
##   [3] "?"                             "abnormal warmth"              
##   [5] "abnormally dry"                "abnormally wet"               
##   [7] "agricultural freeze"           "apache county"                
##   [9] "astronomical high tide"        "astronomical low tide"        
##  [11] "avalance"                      "avalanche"                    
##  [13] "beach erosin"                  "beach erosion"                
##  [15] "below normal precipitation"    "black ice"                    
##  [17] "blizzard"                      "blow-out tide"                
##  [19] "blow-out tides"                "blowing dust"                 
##  [21] "brush fire"                    "brush fires"                  
##  [23] "coastal erosion"               "coastal surge"                
##  [25] "cold weather"                  "cool and wet"                 
##  [27] "cool spell"                    "dam break"                    
##  [29] "dam failure"                   "damaging freeze"              
##  [31] "dense fog"                     "dense smoke"                  
##  [33] "downburst"                     "driest month"                 
##  [35] "drought"                       "drowning"                     
##  [37] "dry"                           "dry conditions"               
##  [39] "dry microburst"                "dry microburst 50"            
##  [41] "dry microburst 53"             "dry microburst 58"            
##  [43] "dry microburst 61"             "dry microburst 84"            
##  [45] "dry pattern"                   "dry spell"                    
##  [47] "dry weather"                   "dryness"                      
##  [49] "dust devel"                    "dust devil"                   
##  [51] "dust devil waterspout"         "early freeze"                 
##  [53] "early frost"                   "excessive"                    
##  [55] "excessive precipitation"       "excessive wetness"            
##  [57] "excessively dry"               "extremely wet"                
##  [59] "first frost"                   "flash floooding"              
##  [61] "flood"                         "fog"                          
##  [63] "forest fires"                  "freeze"                       
##  [65] "freezing drizzle"              "freezing drizzle and freezing"
##  [67] "freezing fog"                  "freezing spray"               
##  [69] "frost"                         "frost/freeze"                 
##  [71] "frost\\freeze"                 "funnel"                       
##  [73] "funnel cloud"                  "funnel cloud."                
##  [75] "funnel clouds"                 "funnels"                      
##  [77] "glaze"                         "glaze ice"                    
##  [79] "grass fires"                   "gustnado"                     
##  [81] "gustnado and"                  "hail"                         
##  [83] "hard freeze"                   "hazardous surf"               
##  [85] "heavy mix"                     "heavy precipatation"          
##  [87] "heavy precipitation"           "heavy seas"                   
##  [89] "heavy shower"                  "heavy showers"                
##  [91] "heavy surf"                    "heavy surf/high surf"         
##  [93] "heavy swells"                  "high"                         
##  [95] "high  swells"                  "high seas"                    
##  [97] "high surf"                     "high surf advisories"         
##  [99] "high surf advisory"            "high swells"                  
## [101] "high temperature record"       "high tides"                   
## [103] "high water"                    "high waves"                   
## [105] "hot weather"                   "hurricane"                    
## [107] "hyperthermia/exposure"         "hypothermia"                  
## [109] "hypothermia/exposure"          "ice"                          
## [111] "ice floes"                     "ice fog"                      
## [113] "ice jam"                       "ice on road"                  
## [115] "ice pellets"                   "ice roads"                    
## [117] "icy roads"                     "landslide"                    
## [119] "landslides"                    "landslump"                    
## [121] "landspout"                     "large wall cloud"             
## [123] "late freeze"                   "lightning"                    
## [125] "low temperature"               "low temperature record"       
## [127] "marine accident"               "marine mishap"                
## [129] "microburst"                    "mild and dry pattern"         
## [131] "mild pattern"                  "mild/dry pattern"             
## [133] "mixed precip"                  "mixed precipitation"          
## [135] "monthly precipitation"         "monthly temperature"          
## [137] "mudslide"                      "no severe weather"            
## [139] "none"                          "normal precipitation"         
## [141] "northern lights"               "other"                        
## [143] "patchy dense fog"              "patchy ice"                   
## [145] "prolong warmth"                "rain"                         
## [147] "rapidly rising water"          "record cool"                  
## [149] "record dry month"              "record dryness"               
## [151] "record high"                   "record high temperature"      
## [153] "record high temperatures"      "record low"                   
## [155] "record precipitation"          "record temperature"           
## [157] "record temperatures"           "record warm"                  
## [159] "record warm temps."            "record warmth"                
## [161] "red flag criteria"             "red flag fire wx"             
## [163] "remnants of floyd"             "rip current"                  
## [165] "rock slide"                    "rogue wave"                   
## [167] "rotating wall cloud"           "rough seas"                   
## [169] "rough surf"                    "saharan dust"                 
## [171] "seiche"                        "severe turbulence"            
## [173] "sleet"                         "small stream"                 
## [175] "small stream and"              "smoke"                        
## [177] "snow"                          "southeast"                    
## [179] "storm"                         "temperature record"           
## [181] "tornado"                       "tsunami"                      
## [183] "typhoon"                       "unseasonably cool"            
## [185] "unseasonably cool & wet"       "unseasonably dry"             
## [187] "unseasonably warm"             "unseasonably warm & wet"      
## [189] "unseasonably warm and dry"     "unseasonably warm year"       
## [191] "unseasonably warm/wet"         "unseasonably wet"             
## [193] "unseasonal low temp"           "unusual warmth"               
## [195] "unusual/record warmth"         "unusually warm"               
## [197] "urban and small"               "urban and small stream"       
## [199] "urban small"                   "urban/small"                  
## [201] "urban/small stream"            "very dry"                     
## [203] "very warm"                     "vog"                          
## [205] "Volcanic Activity"             "wall cloud"                   
## [207] "wall cloud/funnel cloud"       "warm dry conditions"          
## [209] "warm weather"                  "water spout"                  
## [211] "waterspout"                    "waterspout-"                  
## [213] "waterspout funnel cloud"       "waterspout/"                  
## [215] "waterspouts"                   "wayterspout"                  
## [217] "wet micoburst"                 "wet microburst"               
## [219] "wet month"                     "wet weather"                  
## [221] "wet year"                      "wildfire"                     
## [223] "winds"                         "winter mix"                   
## [225] "winter weather"                "winter weather mix"           
## [227] "winter weather/mix"            "wintery mix"                  
## [229] "wintry mix"
summary(sorted)
##    Length     Class      Mode 
##       229 character character

After this light cleaning of the event type names, the number of distinct event types dropped from 898 to 229.
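
The order of the replacements matters, because each rule sees the output of the previous one. The sketch below illustrates this with an ordered rule table in base R. The rules data frame and the apply_rules function are hypothetical names introduced only for this example, and only four of the patterns are reproduced.

```r
# Hypothetical rule table: each pattern is replaced by its label, top to bottom.
rules <- data.frame(
    pattern = c("tstm|thund", "tornado|torndao", "wind|wnd", "storm"),
    label   = c("thunderstorm", "tornado", "winds", "storm"),
    stringsAsFactors = FALSE
)

# Apply the rules in order; later rules see the labels written by earlier ones.
apply_rules <- function(evtype, rules) {
    for (i in seq_len(nrow(rules))) {
        hit <- grepl(rules$pattern[i], evtype)
        evtype[hit] <- rules$label[i]
    }
    evtype
}

cleaned <- apply_rules(c("tstm wind", "torndao", "high winds", "winter storm", "hail"),
                       rules)
print(cleaned)
```

In this sketch "tstm wind" first becomes "thunderstorm" and is then relabeled "storm" by the last rule, which matches the cleaned listing above, where "storm" appears but "thunderstorm" does not.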

At this point I will select the most important variables for this study. I will keep the event initial date, the State, the Event Type, the number of fatalities, the number of injuries, Property Damage (million dollars) and crop damage (million dollars). I’m also grouping the data by Event Type.

# Data Grouping by Type of Event
storm_tbl$BGN_DATE <- mdy_hms(storm_tbl$BGN_DATE)

storm_flt <- storm_tbl %>% select(BGN_DATE, STATE, EVTYPE, FATALITIES, 
                                 INJURIES, Property_Damage, Crop_Damage)

storm_grp <- group_by(storm_flt, EVTYPE)
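
mdy_hms() expects a month/day/year date followed by a time. As an illustration, assuming the BGN_DATE values look like "4/18/1950 0:00:00" (an assumption about the raw file, whose dates are not shown above), the same parsing can be sketched in base R. The two sample dates are made up for the example.

```r
# Made-up sample values in the assumed month/day/year hour:minute:second layout.
sample_dates <- c("4/18/1950 0:00:00", "11/28/2011 0:00:00")

# Parse with an explicit format; UTC is used only to keep the example deterministic.
parsed <- as.POSIXct(sample_dates, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Extract the years, e.g. to check the period covered by the data.
years <- as.integer(format(parsed, "%Y"))
range(years)
```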

Now I will sum the fatalities for every event type, sum the injuries as well, and add a column with the total of fatalities plus injuries. The resulting table is sorted by number of fatalities.

# Event Sorting based on Casualties and Injuries
storm_health <- storm_grp %>%
        summarize(FATALITIES = sum(FATALITIES, na.rm = TRUE),
                  INJURIES = sum(INJURIES, na.rm = TRUE)) %>%
        arrange(desc(FATALITIES), desc(INJURIES))
storm_health <- mutate(storm_health, Total_Incidents = FATALITIES + INJURIES)
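
To see what the summarize and arrange steps do, here is a small sketch on made-up numbers, written in base R so that it runs without dplyr. The toy data frame and its values are invented for illustration only.

```r
# Invented toy data: five events of three types, one missing fatality count.
toy <- data.frame(
    EVTYPE     = c("tornado", "flood", "tornado", "flood", "hail"),
    FATALITIES = c(3, 1, 2, NA, 0),
    INJURIES   = c(10, 4, 5, 2, 1),
    stringsAsFactors = FALSE
)

# Sum both columns per event type; na.rm = TRUE is passed on to sum().
totals <- aggregate(toy[c("FATALITIES", "INJURIES")],
                    by = list(EVTYPE = toy$EVTYPE), FUN = sum, na.rm = TRUE)
totals$Total_Incidents <- totals$FATALITIES + totals$INJURIES

# Sort by fatalities, then injuries, both descending.
totals <- totals[order(-totals$FATALITIES, -totals$INJURIES), ]
print(totals)
```

Here the missing value is skipped, so flood ends up with 1 fatality and 6 injuries, and tornado comes first with 5 fatalities.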

I’m repeating the process, now for the economic impact.

# Event sorting based on Economic Impact
storm_dmg <- storm_grp %>%
        summarize(Property_Damage = sum(Property_Damage, na.rm = TRUE),
                  Crop_Damage = sum(Crop_Damage, na.rm = TRUE)) %>%
        mutate(Total_Damage = Property_Damage + Crop_Damage) %>%
        arrange(desc(Total_Damage), desc(Property_Damage))

Now that the data is sorted, I’m selecting the 15 event types with the biggest impact.

storm_health_15 <- head(storm_health, 15)
storm_dmg_15 <- head(storm_dmg, 15)

3. Results

Across the United States, which types of events are most harmful with respect to population health?

Table 1. Top 15 Harmful Event Types ordered by Fatalities and Injuries

pander(head(storm_health, 15))
EVTYPE         FATALITIES   INJURIES   Total_Incidents
tornado              5636      91407             97043
hot weather          3138       9224             12362
flood                1552       8683             10235
storm                1177      13741             14918
lightning             817       5231              6048
rip current           577        529              1106
winds                 473       1954              2427
cold weather          451        320               771
avalanche             224        170               394
snow                  143       1119              1262
hurricane             135       1328              1463
rain                  114        305               419
high surf             104        156               260
blizzard              101        806               907
wildfire               90       1606              1696

Across the United States, which types of events have the greatest economic consequences?

Table 2. Top 15 Event Types by economic consequences, ordered by Total Damage (millions of US dollars)

pander(head(storm_dmg, 15))
EVTYPE         Property_Damage   Crop_Damage   Total_Damage
flood                   167566         12275         179841
hurricane                84756          5515          90271
storm                    78898          7023          85920
tornado                  56993           415          57408
hail                     15975          3047          19021
drought                   1046         13973          15019
wildfire                  8492         402.8           8894
winds                     6146         772.8           6918
rain                      3265         919.3           4185
cold weather             246.6          1417           1663
snow                      1010         134.7           1145
frost/freeze             10.48          1094           1105
lightning                938.7         12.09          950.8
hot weather              20.33         904.5          924.8
blizzard                 664.9         112.1            777

# Bar chart of fatalities for the 15 most harmful event types
health_plot <- ggplot(storm_health_15, aes(x = reorder(EVTYPE, -FATALITIES), y = FATALITIES)) +
        geom_bar(fill = "blue", stat = "identity") +
        theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
        ggtitle("Top 15 Harmful Event Types") + ylab("Fatalities") + xlab("Event Type")
# Bar chart of total damage for the 15 most expensive event types
dmg_plot <- ggplot(storm_dmg_15, aes(x = reorder(EVTYPE, -Total_Damage), y = Total_Damage)) +
        geom_bar(fill = "forestgreen", stat = "identity") +
        theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
        ggtitle("Top 15 Expensive Event Types") +
        ylab("Total Economic Damage (Million Dollars)") + xlab("Event Type")

Figure 1. Fatalities and Economic Impact of the most important Event Types

grid.arrange(health_plot, dmg_plot, ncol=2,nrow=1)