THE IMPACT OF WEATHER EVENTS IN THE US

Introduction

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Data processing

Loading required packages

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.0.3
library("gridExtra")
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following object is masked from 'package:gridExtra':
## 
##     combine
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(knitr)
library(lubridate)
## Warning: package 'lubridate' was built under R version 4.0.3
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union

Setting the working directory, loading and reading the data

The source data file is downloaded from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2. Comprehensive documentation for the dataset is available: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pd

dataset_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(dataset_url, "StormData.csv.bz2")
storm <- read.csv("StormData.csv.bz2")

setwd("C:/Users/Inspiron 5537pro/Desktop/Project/Reproducible_Research_P")
storm <- read.csv("repdata_data_StormData.csv")
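To keep repeated runs from downloading the file again, the download step could be wrapped in a small guard. This is only a sketch: `fetch_if_missing` is a hypothetical helper name, not part of the original analysis, and the usage lines are left as comments so that nothing is downloaded when the block is run.

```r
# Hypothetical helper: download `url` to `dest` only when `dest` is absent.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed file intact on Windows
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# Usage (commented out to avoid a large download):
# dataset_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# storm <- read.csv(fetch_if_missing(dataset_url, "StormData.csv.bz2"))
```

`read.csv` can read the `.bz2` file directly, so no separate decompression step is needed.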

Subsetting a new dataset

There are 902,297 observations with 37 variables in the raw file. Only a subset is required for the analysis:

  1. The relevant variables are the starting date (BGN_DATE), the event type (EVTYPE), the health impact counters (FATALITIES and INJURIES), the monetary impact on property and crops (PROPDMG and CROPDMG), and their corresponding units/exponents (PROPDMGEXP and CROPDMGEXP).
  2. According to NOAA (https://www.ncdc.noaa.gov/stormevents/details.jsp), the full set of weather events is available only from 1996 onward. Between 1950 and 1995 only a subset of event types (Tornado, Thunderstorm Wind and Hail) was recorded in the storm database. To have a comparable basis for the analysis, the dataset is limited to observations recorded between 1996 and 2011.

# select the required fields only
stormsub <- select(storm, BGN_DATE, EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP, FATALITIES, INJURIES)
# Format the BGN_DATE variable as a date
stormsub$BGN_DATE <- as.Date(stormsub$BGN_DATE, "%m/%d/%Y")
stormsub$YEAR <- year(stormsub$BGN_DATE)
# Only use events since 1996
stormsub2 <- filter(stormsub, YEAR >= 1996)
# start looking at what the data look like
dim(stormsub2)
## [1] 653530      9
summary(stormsub2)
##     BGN_DATE             EVTYPE             PROPDMG         PROPDMGEXP       
##  Min.   :1996-01-01   Length:653530      Min.   :   0.00   Length:653530     
##  1st Qu.:2000-11-21   Class :character   1st Qu.:   0.00   Class :character  
##  Median :2005-05-14   Mode  :character   Median :   0.00   Mode  :character  
##  Mean   :2004-10-25                      Mean   :  11.69                     
##  3rd Qu.:2008-08-22                      3rd Qu.:   1.00                     
##  Max.   :2011-11-30                      Max.   :5000.00                     
##     CROPDMG         CROPDMGEXP          FATALITIES           INJURIES       
##  Min.   :  0.000   Length:653530      Min.   :  0.00000   Min.   :0.00e+00  
##  1st Qu.:  0.000   Class :character   1st Qu.:  0.00000   1st Qu.:0.00e+00  
##  Median :  0.000   Mode  :character   Median :  0.00000   Median :0.00e+00  
##  Mean   :  1.839                      Mean   :  0.01336   Mean   :8.87e-02  
##  3rd Qu.:  0.000                      3rd Qu.:  0.00000   3rd Qu.:0.00e+00  
##  Max.   :990.000                      Max.   :158.00000   Max.   :1.15e+03  
##       YEAR     
##  Min.   :1996  
##  1st Qu.:2000  
##  Median :2005  
##  Mean   :2004  
##  3rd Qu.:2008  
##  Max.   :2011
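The date handling above can be checked on a few toy values. This sketch assumes the raw BGN_DATE strings look like "1/6/1996 0:00:00" (month/day/year followed by a time); `as.Date` with the format "%m/%d/%Y" reads the date part and ignores the trailing time. The year is extracted here with base R's `format` rather than `lubridate::year`, purely to keep the example dependency-free.

```r
# Toy values standing in for the raw BGN_DATE column (format is an assumption)
raw_dates <- c("1/6/1996 0:00:00", "12/31/1995 0:00:00", "11/30/2011 0:00:00")

# Parse the date part; the trailing " 0:00:00" is ignored
parsed <- as.Date(raw_dates, format = "%m/%d/%Y")

# Extract the year and keep only events from 1996 onward
years <- as.integer(format(parsed, "%Y"))
keep <- years >= 1996
parsed[keep]
```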

This subset contains 653,530 observations with 9 variables.

Cleaning data

We first keep only the rows where FATALITIES, INJURIES and the damage amounts (PROPDMG and CROPDMG) are greater than or equal to zero, which drops any rows with negative values.

stormsub2 <- filter(stormsub2, PROPDMG>=0 & CROPDMG >=0 & FATALITIES >=0 & INJURIES>=0)
length(unique(stormsub2$EVTYPE))
## [1] 516
unique(stormsub2$EVTYPE)
##   [1] "WINTER STORM"                   "TORNADO"                       
##   [3] "TSTM WIND"                      "HAIL"                          
##   [5] "HIGH WIND"                      "HEAVY RAIN"                    
##   [7] "FLASH FLOOD"                    "FREEZING RAIN"                 
##   [9] "EXTREME COLD"                   "EXCESSIVE HEAT"                
##  [11] "LIGHTNING"                      "FUNNEL CLOUD"                  
##  [13] "EXTREME WINDCHILL"              "BLIZZARD"                      
##  [15] "URBAN/SML STREAM FLD"           "FLOOD"                         
##  [17] "TSTM WIND/HAIL"                 "WATERSPOUT"                    
##  [19] "RIP CURRENTS"                   "HEAVY SNOW"                    
##  [21] "Other"                          "Record dry month"              
##  [23] "Temperature record"             "WILD/FOREST FIRE"              
##  [25] "Minor Flooding"                 "ICE STORM"                     
##  [27] "STORM SURGE"                    "Ice jam flood (minor"          
##  [29] "High Wind"                      "DUST STORM"                    
##  [31] "STRONG WIND"                    "DUST DEVIL"                    
##  [33] "Tstm Wind"                      "DROUGHT"                       
##  [35] "DRY MICROBURST"                 "FOG"                           
##  [37] "ROUGH SURF"                     "Wind"                          
##  [39] "THUNDERSTORMS"                  "Heavy Surf"                    
##  [41] "HEAVY SURF"                     "Dust Devil"                    
##  [43] "Wind Damage"                    "Marine Accident"               
##  [45] "Snow"                           "AVALANCHE"                     
##  [47] "Freeze"                         "TROPICAL STORM"                
##  [49] "Snow Squalls"                   "Coastal Flooding"              
##  [51] "Heavy Rain"                     "Strong Wind"                   
##  [53] "WINDS"                          "WIND"                          
##  [55] "COASTAL FLOOD"                  "COASTAL STORM"                 
##  [57] "COASTALFLOOD"                   "Erosion/Cstl Flood"            
##  [59] "Heavy Rain and Wind"            "Light Snow/Flurries"           
##  [61] "Wet Month"                      "Wet Year"                      
##  [63] "Tidal Flooding"                 "River Flooding"                
##  [65] "SNOW"                           "DAMAGING FREEZE"               
##  [67] "Damaging Freeze"                "HURRICANE"                     
##  [69] "Beach Erosion"                  "Hot and Dry"                   
##  [71] "Flood/Flash Flood"              "Icy Roads"                     
##  [73] "High Surf"                      "Heavy Rain/High Surf"          
##  [75] "HIGH SURF"                      "Thunderstorm Wind"             
##  [77] "Rain Damage"                    "ICE JAM"                       
##  [79] "Unseasonable Cold"              "Early Frost"                   
##  [81] "Wintry Mix"                     "blowing snow"                  
##  [83] "STREET FLOODING"                "Record Cold"                   
##  [85] "Extreme Cold"                   "Ice Fog"                       
##  [87] "Excessive Cold"                 "Torrential Rainfall"           
##  [89] "Freezing Rain"                  "Landslump"                     
##  [91] "Late-season Snowfall"           "Hurricane Edouard"             
##  [93] "Coastal Storm"                  "Flood"                         
##  [95] "HEAVY RAIN/WIND"                "TIDAL FLOODING"                
##  [97] "Winter Weather"                 "Snow squalls"                  
##  [99] "Strong Winds"                   "Strong winds"                  
## [101] "RECORD WARM TEMPS."             "Ice/Snow"                      
## [103] "Mudslide"                       "Glaze"                         
## [105] "Extended Cold"                  "Snow Accumulation"             
## [107] "Freezing Fog"                   "Drifting Snow"                 
## [109] "Whirlwind"                      "Heavy snow shower"             
## [111] "Heavy rain"                     "COASTAL FLOODING"              
## [113] "LATE SNOW"                      "Record May Snow"               
## [115] "Record Winter Snow"             "Heavy Precipitation"           
## [117] " COASTAL FLOOD"                 "Record temperature"            
## [119] "Light snow"                     "Late Season Snowfall"          
## [121] "Gusty Wind"                     "small hail"                    
## [123] "Light Snow"                     "MIXED PRECIP"                  
## [125] "Black Ice"                      "Mudslides"                     
## [127] "Gradient wind"                  "Snow and Ice"                  
## [129] "COLD"                           "Freezing Spray"                
## [131] "DOWNBURST"                      "Summary Jan 17"                
## [133] "Summary of March 14"            "Summary of March 23"           
## [135] "Summary of March 24"            "Summary of April 3rd"          
## [137] "Summary of April 12"            "Summary of April 13"           
## [139] "Summary of April 21"            "Summary August 11"             
## [141] "Summary of April 27"            "Summary of May 9-10"           
## [143] "Summary of May 10"              "Summary of May 13"             
## [145] "Summary of May 14"              "Summary of May 22 am"          
## [147] "Summary of May 22 pm"           "Heatburst"                     
## [149] "Summary of May 26 am"           "Summary of May 26 pm"          
## [151] "Metro Storm, May 26"            "Summary of May 31 am"          
## [153] "Summary of May 31 pm"           "Summary of June 3"             
## [155] "Summary of June 4"              "Summary June 5-6"              
## [157] "Summary June 6"                 "Summary of June 11"            
## [159] "Summary of June 12"             "Summary of June 13"            
## [161] "Summary of June 15"             "Summary of June 16"            
## [163] "Summary June 18-19"             "Summary of June 23"            
## [165] "Summary of June 24"             "Summary of June 30"            
## [167] "Summary of July 2"              "Summary of July 3"             
## [169] "Summary of July 11"             "Summary of July 22"            
## [171] "Summary July 23-24"             "Summary of July 26"            
## [173] "Summary of July 29"             "Summary of August 1"           
## [175] "Summary August 2-3"             "Summary August 7"              
## [177] "Summary August 9"               "Summary August 10"             
## [179] "Summary August 17"              "Summary August 21"             
## [181] "Summary August 28"              "Summary September 4"           
## [183] "Summary September 20"           "Summary September 23"          
## [185] "Summary Sept. 25-26"            "Summary: Oct. 20-21"           
## [187] "Summary: October 31"            "Summary: Nov. 6-7"             
## [189] "Summary: Nov. 16"               "Microburst"                    
## [191] "wet micoburst"                  "HAIL/WIND"                     
## [193] "Hail(0.75)"                     "Funnel Cloud"                  
## [195] "Urban Flooding"                 "No Severe Weather"             
## [197] "Urban flood"                    "Urban Flood"                   
## [199] "Cold"                           "WINTER WEATHER"                
## [201] "Summary of May 22"              "Summary of June 6"             
## [203] "Summary August 4"               "Summary of June 10"            
## [205] "Summary of June 18"             "Summary September 3"           
## [207] "Summary: Sept. 18"              "Coastal Flood"                 
## [209] "coastal flooding"               "Small Hail"                    
## [211] "Record Temperatures"            "Light Snowfall"                
## [213] "Freezing Drizzle"               "Gusty wind/rain"               
## [215] "GUSTY WIND/HVY RAIN"            "Blowing Snow"                  
## [217] "Early snowfall"                 "Monthly Snowfall"              
## [219] "Record Heat"                    "Seasonal Snowfall"             
## [221] "Monthly Rainfall"               "Cold Temperature"              
## [223] "Sml Stream Fld"                 "Heat Wave"                     
## [225] "MUDSLIDE/LANDSLIDE"             "Saharan Dust"                  
## [227] "Volcanic Ash"                   "Volcanic Ash Plume"            
## [229] "Thundersnow shower"             "NONE"                          
## [231] "COLD AND SNOW"                  "DAM BREAK"                     
## [233] "RAIN"                           "RAIN/SNOW"                     
## [235] "OTHER"                          "FREEZE"                        
## [237] "TSTM WIND (G45)"                "RECORD WARMTH"                 
## [239] "STRONG WINDS"                   "FREEZING DRIZZLE"              
## [241] "UNSEASONABLY WARM"              "SLEET/FREEZING RAIN"           
## [243] "BLACK ICE"                      "WINTRY MIX"                    
## [245] "BLOW-OUT TIDES"                 "UNSEASONABLY COLD"             
## [247] "UNSEASONABLY COOL"              "TSTM HEAVY RAIN"               
## [249] "UNSEASONABLY DRY"               "Gusty Winds"                   
## [251] "GUSTY WIND"                     "TSTM WIND 40"                  
## [253] "TSTM WIND 45"                   "HARD FREEZE"                   
## [255] "TSTM WIND (41)"                 "HEAT"                          
## [257] "RIVER FLOOD"                    "TSTM WIND (G40)"               
## [259] "RIP CURRENT"                    "TSTM WND"                      
## [261] "DENSE FOG"                      "Wintry mix"                    
## [263] " TSTM WIND"                     "MUD SLIDE"                     
## [265] "MUDSLIDES"                      "MUDSLIDE"                      
## [267] "Frost"                          "Frost/Freeze"                  
## [269] "SNOW AND ICE"                   "WIND DAMAGE"                   
## [271] "RAIN (HEAVY)"                   "Record Warmth"                 
## [273] "Prolong Cold"                   "Cold and Frost"                
## [275] "RECORD COLD"                    "PROLONG COLD"                  
## [277] "AGRICULTURAL FREEZE"            "URBAN/SML STREAM FLDG"         
## [279] "SNOW SQUALL"                    "HEAVY SNOW SQUALLS"            
## [281] "SNOW/ICE"                       "GUSTY WINDS"                   
## [283] "SMALL HAIL"                     "SNOW SQUALLS"                  
## [285] "LAKE EFFECT SNOW"               "STRONG WIND GUST"              
## [287] "LATE FREEZE"                    "RECORD TEMPERATURES"           
## [289] "ICY ROADS"                      "RECORD SNOWFALL"               
## [291] "BLOW-OUT TIDE"                  "THUNDERSTORM"                  
## [293] "Hypothermia/Exposure"           "HYPOTHERMIA/EXPOSURE"          
## [295] "Lake Effect Snow"               "Mixed Precipitation"           
## [297] "Record High"                    "COASTALSTORM"                  
## [299] "LIGHT SNOW"                     "Snow and sleet"                
## [301] "Freezing rain"                  "Gusty winds"                   
## [303] "FUNNEL CLOUDS"                  "WATERSPOUTS"                   
## [305] "Blizzard Summary"               "FROST"                         
## [307] "ICE"                            "SUMMARY OF MARCH 24-25"        
## [309] "SUMMARY OF MARCH 27"            "SUMMARY OF MARCH 29"           
## [311] "GRADIENT WIND"                  "Icestorm/Blizzard"             
## [313] "Flood/Strong Wind"              "TSTM WIND AND LIGHTNING"       
## [315] "gradient wind"                  "SEVERE THUNDERSTORMS"          
## [317] "EXCESSIVE RAIN"                 "Freezing drizzle"              
## [319] "Mountain Snows"                 "URBAN/SMALL STRM FLDG"         
## [321] "WET MICROBURST"                 "Heavy surf and wind"           
## [323] "Mild and Dry Pattern"           "COLD AND FROST"                
## [325] "RECORD HEAT"                    "TYPHOON"                       
## [327] "LANDSLIDES"                     "HIGH SWELLS"                   
## [329] "HIGH  SWELLS"                   "VOLCANIC ASH"                  
## [331] "HIGH WINDS"                     "DRY SPELL"                     
## [333] " LIGHTNING"                     "BEACH EROSION"                 
## [335] "UNSEASONAL RAIN"                "EARLY RAIN"                    
## [337] "PROLONGED RAIN"                 "WINTERY MIX"                   
## [339] "COASTAL FLOODING/EROSION"       "UNSEASONABLY WET"              
## [341] "HOT SPELL"                      "HEAT WAVE"                     
## [343] "UNSEASONABLY HOT"               "UNSEASONABLY WARM AND DRY"     
## [345] " TSTM WIND (G45)"               "TSTM WIND  (G45)"              
## [347] "HIGH WIND (G40)"                "TSTM WIND (G35)"               
## [349] "DRY WEATHER"                    "TSTM WINDS"                    
## [351] "FREEZING RAIN/SLEET"            "ABNORMAL WARMTH"               
## [353] "UNUSUAL WARMTH"                 "GLAZE"                         
## [355] "WAKE LOW WIND"                  "MONTHLY RAINFALL"              
## [357] "COLD TEMPERATURES"              "COLD WIND CHILL TEMPERATURES"  
## [359] "MODERATE SNOW"                  "MODERATE SNOWFALL"             
## [361] "URBAN/STREET FLOODING"          "COASTAL EROSION"               
## [363] "UNUSUAL/RECORD WARMTH"          "BITTER WIND CHILL"             
## [365] "BITTER WIND CHILL TEMPERATURES" "SEICHE"                        
## [367] "TSTM"                           "COASTAL  FLOODING/EROSION"     
## [369] "SNOW DROUGHT"                   "UNSEASONABLY WARM YEAR"        
## [371] "HYPERTHERMIA/EXPOSURE"          "SNOW/SLEET"                    
## [373] "ROCK SLIDE"                     "ICE PELLETS"                   
## [375] "URBAN FLOOD"                    "PATCHY DENSE FOG"              
## [377] "RECORD COOL"                    "RECORD WARM"                   
## [379] "HOT WEATHER"                    "RIVER FLOODING"                
## [381] "RECORD TEMPERATURE"             "SAHARAN DUST"                  
## [383] "TROPICAL DEPRESSION"            "VOLCANIC ERUPTION"             
## [385] "COOL SPELL"                     "WIND ADVISORY"                 
## [387] "GUSTY WIND/HAIL"                "RED FLAG FIRE WX"              
## [389] "FIRST FROST"                    "EXCESSIVELY DRY"               
## [391] "HEAVY SEAS"                     "FLASH FLOOD/FLOOD"             
## [393] "SNOW AND SLEET"                 "LIGHT SNOW/FREEZING PRECIP"    
## [395] "VOG"                            "EXCESSIVE RAINFALL"            
## [397] "FLASH FLOODING"                 "MONTHLY PRECIPITATION"         
## [399] "MONTHLY TEMPERATURE"            "RECORD DRYNESS"                
## [401] "EXTREME WINDCHILL TEMPERATURES" "MIXED PRECIPITATION"           
## [403] "EXTREME WIND CHILL"             "DRY CONDITIONS"                
## [405] "HEAVY RAINFALL"                 "REMNANTS OF FLOYD"             
## [407] "EARLY SNOWFALL"                 "FREEZING FOG"                  
## [409] "LANDSPOUT"                      "DRIEST MONTH"                  
## [411] "RECORD  COLD"                   "LATE SEASON HAIL"              
## [413] "EXCESSIVE SNOW"                 "WINTER MIX"                    
## [415] "DRYNESS"                        "FLOOD/FLASH/FLOOD"             
## [417] "WIND AND WAVE"                  "SEVERE THUNDERSTORM"           
## [419] "LIGHT FREEZING RAIN"            " WIND"                         
## [421] "MONTHLY SNOWFALL"               "DRY"                           
## [423] "RECORD RAINFALL"                "RECORD PRECIPITATION"          
## [425] "ICE ROADS"                      "HIGH SEAS"                     
## [427] "SLEET"                          "ROUGH SEAS"                    
## [429] "UNSEASONABLY WARM/WET"          "UNSEASONABLY COOL & WET"       
## [431] "UNUSUALLY WARM"                 "TSTM WIND G45"                 
## [433] "NON SEVERE HAIL"                "RECORD SNOW"                   
## [435] "SNOW/FREEZING RAIN"             "SNOW/BLOWING SNOW"             
## [437] "NON-SEVERE WIND DAMAGE"         "UNUSUALLY COLD"                
## [439] "WARM WEATHER"                   "LANDSLUMP"                     
## [441] "THUNDERSTORM WIND (G40)"        "LANDSLIDE"                     
## [443] "WALL CLOUD"                     "HIGH WATER"                    
## [445] "UNSEASONABLY WARM & WET"        " FLASH FLOOD"                  
## [447] "LOCALLY HEAVY RAIN"             "WIND GUSTS"                    
## [449] "UNSEASONAL LOW TEMP"            "HIGH SURF ADVISORY"            
## [451] "LATE SEASON SNOW"               "GUSTY LAKE WIND"               
## [453] "ABNORMALLY DRY"                 "WINTER WEATHER MIX"            
## [455] "RED FLAG CRITERIA"              "WND"                           
## [457] "CSTL FLOODING/EROSION"          "SMOKE"                         
## [459] " WATERSPOUT"                    "SNOW ADVISORY"                 
## [461] "EXTREMELY WET"                  "UNUSUALLY LATE SNOW"           
## [463] "VERY DRY"                       "RECORD LOW RAINFALL"           
## [465] "ROGUE WAVE"                     "SNOWMELT FLOODING"             
## [467] "PROLONG WARMTH"                 "ACCUMULATED SNOWFALL"          
## [469] "FALLING SNOW/ICE"               "DUST DEVEL"                    
## [471] "NON-TSTM WIND"                  "NON TSTM WIND"                 
## [473] "BRUSH FIRE"                     "GUSTY THUNDERSTORM WINDS"      
## [475] "PATCHY ICE"                     "SNOW SHOWERS"                  
## [477] "HEAVY RAIN EFFECTS"             "BLOWING DUST"                  
## [479] "EXCESSIVE HEAT/DROUGHT"         "NORTHERN LIGHTS"               
## [481] "MARINE TSTM WIND"               "   HIGH SURF ADVISORY"         
## [483] "WIND CHILL"                     "HAZARDOUS SURF"                
## [485] "WILDFIRE"                       "FROST/FREEZE"                  
## [487] "WINTER WEATHER/MIX"             "ASTRONOMICAL HIGH TIDE"        
## [489] "COLD WEATHER"                   "WHIRLWIND"                     
## [491] "VERY WARM"                      "ABNORMALLY WET"                
## [493] "TORNADO DEBRIS"                 "EXTREME COLD/WIND CHILL"       
## [495] "ICE ON ROAD"                    "FIRST SNOW"                    
## [497] "ICE/SNOW"                       "DROWNING"                      
## [499] "GUSTY THUNDERSTORM WIND"        "MARINE HAIL"                   
## [501] "HIGH SURF ADVISORIES"           "HURRICANE/TYPHOON"             
## [503] "HEAVY SURF/HIGH SURF"           "SLEET STORM"                   
## [505] "STORM SURGE/TIDE"               "COLD/WIND CHILL"               
## [507] "LAKE-EFFECT SNOW"               "MARINE HIGH WIND"              
## [509] "THUNDERSTORM WIND"              "TSUNAMI"                       
## [511] "DENSE SMOKE"                    "LAKESHORE FLOOD"               
## [513] "MARINE THUNDERSTORM WIND"       "MARINE STRONG WIND"            
## [515] "ASTRONOMICAL LOW TIDE"          "VOLCANIC ASHFALL"

Some differences are caused by upper and lower case, leading or trailing whitespace, slashes or hyphens between two words, etc. We scanned through the most frequent event types as well as the most obvious near-duplicates (like Coastal flood, Coastal flood / erosion, coastal flooding…) and abbreviations (TSTM for Thunderstorm). We then regroup several keywords under broader categories. The rules are applied in sequence, so an event type that matches several keywords ends up in the category of the first rule that matches it. We could regroup further, for example by merging ICE / SNOW / WINTER into a broad WINTER category, but we prefer to keep a few more specific types.

stormsub3 <- mutate(stormsub2, EVTYPE = toupper(trimws(EVTYPE, which = "both", whitespace = "[ \t\r\n]")))
stormsub3$EVTYPE <- gsub(".*WINTER.*", "WINTER", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(COLD|COOL|HYPOTHERM).*", "COLD", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(HAIL|SLEET).*", "HAIL", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(ICE|ICY|FRO*ST|FREEZ).*", "ICE", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(SNOW|AVALANCH|BLIZZARD|WINTE*R).*", "SNOW", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(FIRE|SMOKE).*", "FIRE", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(SLIDE|MUD|LANDSL).*", "LANDSLIDE", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(FLO*D|FLOYD|DAM).*", "FLOOD", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(HEAT|HOT|TEMPERATUR|WARM|HYPERTHERM|RECORD HIGH).*", "HEAT", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(DRY|DRIEST|DROUGHT|SEICHE).*", "DROUGHT", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(STO*RM|TORNADO|DEPRESSION|CYCLON|TYPHOON|HURRICAN|BURST).*", "TORNADO", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(WI*ND).*", "WIND", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(RAIN|PRECIP|WET|THUNDERST|TSTM).*", "RAIN", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(SEA|WAVE|TSUNAMI|SWELL|SURF|TIDE|CURRENT|BEACH|COAST|DROWN|MARIN).*", "SEA", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(FOG|VOG|CLOUD).*", "FOG", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*VOLCA.*", "VOLCANO", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*DUST.*", "DUST", stormsub3$EVTYPE)
stormsub3$EVTYPE <- gsub(".*(SUMMARY|MONTHLY|NONE).*", "OTHER", stormsub3$EVTYPE)
length(unique(stormsub3$EVTYPE))
## [1] 26
unique(stormsub3$EVTYPE)
##  [1] "SNOW"              "TORNADO"           "WIND"             
##  [4] "HAIL"              "RAIN"              "FLOOD"            
##  [7] "ICE"               "COLD"              "HEAT"             
## [10] "LIGHTNING"         "FOG"               "WATERSPOUT"       
## [13] "SEA"               "OTHER"             "DROUGHT"          
## [16] "FIRE"              "DUST"              "LANDSLIDE"        
## [19] "GLAZE"             "NO SEVERE WEATHER" "VOLCANO"          
## [22] "WATERSPOUTS"       "LANDSPOUT"         "HIGH WATER"       
## [25] "RED FLAG CRITERIA" "NORTHERN LIGHTS"

We have reduced the number of event types to 26.
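Because each `gsub` call overwrites EVTYPE in place, the order of the rules determines where an ambiguous label ends up. The sketch below illustrates this on three toy labels with four of the patterns used above; `normalise_evtype` and the `rules` vector are hypothetical names introduced only for this illustration.

```r
# A few of the patterns from the analysis, stored as an ordered named vector
rules <- c(
  SNOW    = "SNOW|AVALANCH|BLIZZARD|WINTE*R",
  FLOOD   = "FLO*D|FLOYD|DAM",
  TORNADO = "STO*RM|TORNADO|DEPRESSION|CYCLON|TYPHOON|HURRICAN|BURST",
  WIND    = "WI*ND"
)

# Hypothetical helper: upper-case, trim, then apply each rule in order
normalise_evtype <- function(x, rules) {
  x <- toupper(trimws(x))
  for (label in names(rules)) {
    x <- gsub(paste0(".*(", rules[[label]], ").*"), label, x)
  }
  x
}

toy <- c(" TSTM WIND", "Winter Storm", "Coastal Flooding")

# Same order as in the analysis: "WINTER STORM" is caught by the SNOW rule
in_order <- normalise_evtype(toy, rules)

# TORNADO rule first: "WINTER STORM" now matches STORM before WINTER
reordered <- normalise_evtype(toy, rules[c("TORNADO", "SNOW", "FLOOD", "WIND")])
```

Here `in_order` classifies "Winter Storm" as SNOW, while `reordered` classifies it as TORNADO, so the rule order is effectively part of the category definition.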

We now want to check damage amounts. There are two columns, one containing a figure and the second containing the unit. Let’s check the unit:

unique(stormsub3$CROPDMGEXP)
## [1] "K" ""  "M" "B"
unique(stormsub3$PROPDMGEXP)
## [1] "K" ""  "M" "B" "0"

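The exponent columns also contain empty strings and, for property damage, some “0” values. This does not really matter, as a quick exploration of the data reveals that they go together with nil amounts in the damage columns (the code below multiplies them by 10^3, which leaves a zero amount unchanged).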

The meaning of the units/exponents is the following:

* K or k: thousand dollars (10^3)
* M or m: million dollars (10^6)
* B or b: billion dollars (10^9)

We then convert these letters to the corresponding power of ten (only upper-case letters occur in this subset) and create new columns holding the property and crop damage in dollars, plus their total:

stormsub3 <- mutate(stormsub3,
               PROPDAMAGE = ifelse(PROPDMGEXP == "B", PROPDMG * 10^9, ifelse(
                 PROPDMGEXP == "M", PROPDMG * 10^6, PROPDMG * 10^3
               ) 
               ),
               
               CROPDAMAGE = ifelse(CROPDMGEXP == "B", CROPDMG * 10^9, ifelse(
                 CROPDMGEXP == "M", CROPDMG * 10^6, CROPDMG * 10^3
               )
               ),
               TOTDAMAGE = PROPDAMAGE + CROPDAMAGE
)
head(stormsub3)
##     BGN_DATE  EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP FATALITIES INJURIES
## 1 1996-01-06    SNOW     380          K      38          K          0        0
## 2 1996-01-11 TORNADO     100          K       0                     0        0
## 3 1996-01-11    WIND       3          K       0                     0        0
## 4 1996-01-11    WIND       5          K       0                     0        0
## 5 1996-01-11    WIND       2          K       0                     0        0
## 6 1996-01-18    HAIL       0                  0                     0        0
##   YEAR PROPDAMAGE CROPDAMAGE TOTDAMAGE
## 1 1996     380000      38000    418000
## 2 1996     100000          0    100000
## 3 1996       3000          0      3000
## 4 1996       5000          0      5000
## 5 1996       2000          0      2000
## 6 1996          0          0         0
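As an alternative to the nested `ifelse` calls, the exponent letters could be translated through a named lookup vector. This is a sketch, not the method used above: `exp_to_multiplier` is a hypothetical helper, and mapping empty or unrecognised codes to 10^3 is an assumption chosen only to mirror the behaviour of the nested `ifelse`.

```r
# Hypothetical helper: map exponent codes to numeric multipliers
exp_to_multiplier <- function(exp_code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(exp_code)]
  # Empty strings and codes such as "0" do not match any name and give NA;
  # fall back to 10^3 to mirror the nested ifelse (an assumption)
  mult[is.na(mult)] <- 1e3
  unname(mult)
}

# Toy amounts and codes, not taken from the dataset
amounts <- c(380, 100, 0)
codes   <- c("K", "M", "")
damage  <- amounts * exp_to_multiplier(codes)
```

The lookup keeps the letter-to-multiplier mapping in one place, which makes it easier to extend if other codes need to be handled.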

We now have a clean database.

To figure out which events cause (1) the most fatalities and injuries and (2) the largest economic impact, we need to extract new datasets from our clean database. This means aggregating the health impact and the economic impact per event type, then sorting the data in descending order and keeping the top 10.

Impact on the population health in terms of fatalities and injuries:
fatalities <- aggregate(FATALITIES ~ EVTYPE, stormsub3, sum)
fatalities10 <- fatalities[order(-fatalities$FATALITIES), ][1:10, ]
injuries<-aggregate(INJURIES ~ EVTYPE, stormsub3, sum)
injuries10 <- injuries[order(-injuries$INJURIES), ][1:10, ]
Economic impact in terms of damage on properties and crops:
ecocost <- aggregate(TOTDAMAGE ~ EVTYPE, stormsub3, sum)
ecocost <- transform(ecocost, TOTDAMAGE=TOTDAMAGE/10^9)
ecocost <- transform(ecocost, TOTDAMAGE=round(TOTDAMAGE,0))
ecocost10 <- ecocost[order(-ecocost$TOTDAMAGE), ][1:10, ]
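The aggregate-then-sort pattern is repeated for each measure, so it could be factored into one function. A minimal sketch on toy data follows; `top_events` is a hypothetical helper and the toy values are made up for illustration.

```r
# Hypothetical helper: total `column` per EVTYPE and return the n largest
top_events <- function(data, column, n = 10) {
  totals <- aggregate(data[[column]], by = list(EVTYPE = data$EVTYPE), FUN = sum)
  names(totals)[2] <- column
  head(totals[order(-totals[[column]]), ], n)
}

# Toy data, not taken from the dataset
toy <- data.frame(
  EVTYPE = c("HEAT", "FLOOD", "HEAT", "WIND"),
  FATALITIES = c(3, 1, 2, 0)
)
top2 <- top_events(toy, "FATALITIES", n = 2)
```

With the real data, a call such as `top_events(stormsub3, "INJURIES")` would be expected to reproduce the same kind of table as `injuries10`.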

Results

Hail, wind and tornadoes were the most frequent severe weather events in the US between 1996 and 2011.

Number of occurrences between 1996 and 2011

sort(table(stormsub3$EVTYPE), decreasing = TRUE)[1:10]
## 
##      HAIL      WIND   TORNADO     FLOOD      SNOW LIGHTNING      RAIN       FOG 
##    209335    159429    112271     79591     37985     13204     11697      7798 
##      FIRE       ICE 
##      4199      3700
  1. Impact on population health in terms of fatalities and injuries

Let’s visualize the results as tables and plots:
fatalities10
##       EVTYPE FATALITIES
## 9       HEAT       2037
## 22   TORNADO       1863
## 5      FLOOD       1337
## 20       SEA        732
## 21      SNOW        664
## 26      WIND        655
## 14 LIGHTNING        651
## 1       COLD        379
## 11       ICE         96
## 18      RAIN         96
injuries10
##       EVTYPE INJURIES
## 22   TORNADO    24180
## 5      FLOOD     8527
## 9       HEAT     7702
## 26      WIND     5148
## 14 LIGHTNING     4141
## 21      SNOW     3147
## 4       FIRE     1458
## 20       SEA      888
## 6        FOG      856
## 8       HAIL      818

Results in the form of plots

G1<-ggplot(data=fatalities10, aes(x=reorder(EVTYPE, FATALITIES),y =FATALITIES))+ coord_flip() +geom_bar(fill="violet",stat="identity")+labs(title = "Top 10 Fatality causing Events in US",x = "Weather Event", y ="Number of Fatalities")
G2 = ggplot(data=injuries10,aes(x=reorder(EVTYPE, INJURIES),y =INJURIES))+coord_flip()+geom_bar(fill = "green",stat = "identity")+labs(title = "Top 10 Injury causing Events in US",x = "Weather Event", y=" Number of Injuries")

# Draw the two plots generated above, stacked in two rows
grid.arrange(G1, G2, nrow = 2)

Fatal Weather Event

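Heat caused the most fatalities in the US between 1996 and 2011 (2,037), followed by tornadoes (1,863) and floods (1,337). Tornadoes caused by far the most injuries (24,180), followed by floods (8,527) and heat (7,702). Note that the TORNADO category as regrouped above also includes hurricanes, typhoons and other storms.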

  2. Economic impact in terms of property and crop damage

Let’s visualize the results as a plot:

# fill is set as a constant in geom_bar (not inside aes) so the bars are actually drawn in green
G3 <- ggplot(ecocost10, aes(x = reorder(EVTYPE, -TOTDAMAGE), y = TOTDAMAGE)) + coord_flip() + geom_bar(stat = "identity", fill = "green") + theme(axis.text.x = element_text(angle = 30, hjust = 1)) + labs(x = "Weather Events", y = "Economic impact (USDbn)", title = "Top 10 weather events causing economic impact")
print(G3)

CONCLUSION

  1. TORNADO causes the most harm to population health overall: by far the most injuries and the second-highest number of fatalities, after HEAT.
  2. FLOOD has the highest property damage cost.
  3. DROUGHT has the highest crop damage cost.
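The analysis above aggregates only the combined TOTDAMAGE, whereas the conclusion separates property damage from crop damage. A minimal sketch of how the two could be totalled separately per event type is shown here on toy data. The values are made up for illustration and are not taken from the NOAA dataset; the column names PROPDAMAGE and CROPDAMAGE are the ones created during cleaning.

```r
# Toy data with the same column names as stormsub3 (values are invented)
toy <- data.frame(
  EVTYPE     = c("FLOOD", "DROUGHT", "FLOOD", "DROUGHT"),
  PROPDAMAGE = c(5e6, 1e5, 2e6, 0),
  CROPDAMAGE = c(1e5, 4e6, 0, 3e6)
)

# Sum both damage columns per event type in one call
by_type <- aggregate(cbind(PROPDAMAGE, CROPDAMAGE) ~ EVTYPE, data = toy, FUN = sum)

# Event type with the largest total in each column
top_prop <- as.character(by_type$EVTYPE[which.max(by_type$PROPDAMAGE)])
top_crop <- as.character(by_type$EVTYPE[which.max(by_type$CROPDAMAGE)])
```

Running the same `aggregate` call with `data = stormsub3` would give the per-type property and crop totals needed to check points 2 and 3.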