Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The events in the database start in the year 1950 and end in November 2011.

The aim of this project is to analyze this database to identify the event types with the greatest impact on public health and on the economy. In the following sections the data is downloaded and cleaned, and the relevant analysis is carried out. Impact on public health is measured by the number of fatalities and injuries, while economic impact is measured by crop damage and property damage. It is found that tornadoes cause the greatest harm to public health and floods cause the greatest economic damage.

Packages used

library(dplyr)
library(ggplot2)

Data processing

The data is available at this link and the documentation is available at this link. The following code downloads the data, saves it on the local machine, and reads it into R.

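fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# Keep the destination in one variable so download.file() and read.csv() use the same path
destFile <- "~/DataScientistsToolbox/05ReproducibleResearch/Week4/Assignment_Final/repdata_data_StormData.csv.bz2"
download.file(url=fileUrl,destfile=destFile)
# stringsAsFactors=TRUE so that levels(stormdata$EVTYPE) below also works in R >= 4.0
stormdata <- read.csv(destFile,stringsAsFactors=TRUE)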
dim(stormdata)
## [1] 902297     37
length(levels(stormdata$EVTYPE))
## [1] 985
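
To avoid downloading the archive again on every run, the download step can be guarded by an existence check. The following is only a sketch of that idea; `fetch_if_missing` is a hypothetical helper and is not part of the original analysis.

```r
# Hypothetical helper: download `url` to `dest` only when `dest` does not exist yet.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed archive intact on Windows
    download.file(url, destfile = dest, mode = "wb")
  }
  invisible(dest)
}

# Usage (the file name is illustrative):
# dest <- fetch_if_missing(fileUrl, "repdata_data_StormData.csv.bz2")
# stormdata <- read.csv(dest)   # read.csv can read the .bz2 file directly
```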

The above shows that there are 902297 observations of 37 variables. The variables of interest are EVTYPE, which gives the event type (storm, rain, hail etc.); CROPDMG and CROPDMGEXP, which describe damage to crops; PROPDMG and PROPDMGEXP, which describe damage to property; and FATALITIES and INJURIES. There are 985 unique entries in the EVTYPE variable. This number is high because a single event type has been entered in many different ways. For example, “Wnterstorm”, “Winter Storm”, “winterstorm”, “winterstorms” etc. all refer to the single event type “Winter Storm”.
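
One way to inspect such variants is to search the unique values for a keyword. The sketch below does this on a small made-up vector rather than on the real column; the vector `evtype` and its contents are assumptions for illustration only.

```r
# Toy stand-in for stormdata$EVTYPE; the values are illustrative only.
evtype <- c("WINTER STORM", "Winter Storm", "WINTER STORMS", "HAIL", "TORNADO")

# List every distinct spelling that mentions "winter", ignoring case.
winter_variants <- grep("winter", unique(evtype), ignore.case = TRUE, value = TRUE)
winter_variants
```

On the real data the same call would be applied to `unique(stormdata$EVTYPE)`.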

This extreme variability is the main challenge in processing the data. To complicate matters, some of the entries in the EVTYPE variable cannot be unequivocally assigned to a single weather event. For example, it is not clear if “heavysnowhighwinds&flood” specifically refers to heavy snow, high winds, or floods.

A few things can be done straightaway to reduce this variability. The first is to replace all upper-case letters with lower-case letters. The next is to remove all blank spaces, characters such as “(”, “)”, “-” and “/”, and all digits.

# Normalise EVTYPE: lower-case everything first
stormdata$EVTYPE <- tolower(stormdata$EVTYPE)
# Then strip spaces, "/", "(", ")", digits and "-" (the trailing "-" in the class is a literal hyphen)
stormdata$EVTYPE <- gsub("[ /()0-9-]","",stormdata$EVTYPE)
stormdata$EVTYPE <- as.factor(stormdata$EVTYPE)
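
A quick check on a few made-up strings shows what this normalisation does. The helper `normalise_evtype` below is hypothetical and simply applies the same steps (lower-casing, then removing spaces, slashes, parentheses, digits and hyphens); the input strings are illustrative.

```r
# Hypothetical helper applying the same normalisation steps to a character vector.
normalise_evtype <- function(x) {
  x <- tolower(x)
  # One character class: space, "/", "(", ")", digits, and a literal "-"
  gsub("[ /()0-9-]", "", x)
}

cleaned <- normalise_evtype(c("Winter Storm", "TSTM WIND (G45)",
                              "Frost/Freeze", "Lake-Effect Snow"))
cleaned
```

Entries that differ only in case, spacing or punctuation collapse to the same string, which is what makes the later matching against lists of variants practical.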

Page 6 of the documentation lists the following 48 main categories of event type:

  1. Astronomical Low Tide
  2. Avalanche
  3. Blizzard
  4. Coastal Flood
  5. Cold/Wind Chill
  6. Debris Flow
  7. Dense Fog
  8. Dense Smoke
  9. Drought
  10. Dust Devil
  11. Dust Storm
  12. Excessive Heat
  13. Extreme Cold/Wind Chill
  14. Flash Flood
  15. Flood
  16. Frost/Freeze
  17. Funnel Cloud
  18. Freezing Fog
  19. Hail
  20. Heat
  21. Heavy Rain
  22. Heavy Snow
  23. High Surf
  24. High Wind
  25. Hurricane (Typhoon)
  26. Ice Storm
  27. Lake-Effect Snow
  28. Lakeshore Flood
  29. Lightning
  30. Marine Hail
  31. Marine High Wind
  32. Marine Strong Wind
  33. Marine Thunderstorm Wind
  34. Rip Current
  35. Seiche
  36. Sleet
  37. Storm Surge/Tide
  38. Strong Wind
  39. Thunderstorm Wind
  40. Tornado
  41. Tropical Depression
  42. Tropical Storm
  43. Tsunami
  44. Volcanic Ash
  45. Waterspout
  46. Wildfire
  47. Winter Storm
  48. Winter Weather

The general strategy is as follows. As far as possible, each entry in the EVTYPE variable is assigned to one of the categories in the above list. For example, if an entry reads “wintryweather” or “winterweather”, it is assigned to “winterweather” (the 48th element in the above list). If an entry belongs to multiple categories, it is grouped into a category that combines them. For example, “Heavyrain/winterstorm” and “Winterstorm Severerain” would be grouped into the category “heavyrainwinterstorm”. How the consequences are extracted on the basis of this strategy is described later.
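
The assignment strategy can be sketched compactly as a lookup table that maps each canonical name to its known spellings. This is only an illustration under stated assumptions: `evtype_map` and `recode_evtype` are hypothetical names, and the table holds a tiny illustrative subset rather than the full set of variants handled in the sections below.

```r
# Hypothetical lookup table: canonical name -> raw spellings (illustrative subset).
evtype_map <- list(
  winterweather = c("winterweather", "wintryweather"),
  avalanche     = c("avalanche", "avalance")
)

# Replace every listed variant by its canonical name; other entries stay untouched.
recode_evtype <- function(x, map) {
  for (canonical in names(map)) {
    x[x %in% map[[canonical]]] <- canonical
  }
  x
}

recoded <- recode_evtype(c("wintryweather", "avalance", "hail"), evtype_map)
recoded
```

The sections that follow do the same thing one category at a time, with one vector of variants and one `%in%` assignment per category.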

Astronomical Low Tide

This event has been properly entered.

Avalanche

stormdata$EVTYPE <- as.character(stormdata$EVTYPE)
avalanche_cases <- c("avalance","avalanche")
stormdata$EVTYPE[stormdata$EVTYPE %in% avalanche_cases] <- "avalanche"

Blizzard

bliz_cases <- c("blizzard","blizzardfreezingrain","blizzardsummary","blizzardweather","groundblizzard")
stormdata$EVTYPE[stormdata$EVTYPE %in% bliz_cases] <- "blizzard"

Coastal Flood

coastalflood_cases <-
c("beacherosioncoastalflood","coastalflood","coastalflooding","coastalfloodingerosion","coastaltidalflood","cstlfloodingerosion","beachflood")
stormdata$EVTYPE[stormdata$EVTYPE %in% coastalflood_cases] <- "coastalflood"

Cold / Wind chill

# Note: these entries are folded into "extremecoldwindchill" in a later section
coldwindchill_cases <-
c("bitterwindchill","bitterwindchilltemperatures","cold","coldtemperature",
"coldtemperatures","coldwave","coldweather","coldwindchill","coldwindchilltemperatures",
"coldwinds","excessivecold","extendedcold","extremecold","extremecoldwindchill",
"extremerecordcold","extremewindchill","extremewindchillblowingsno","extremewindchills",
"extremewindchilltemperatures","prolongcold","recordcold","severecold",
"unseasonablecold","unseasonablycold","unusuallycold","windchill","unseasonablycool","blowingsnow&extremewindch","blowingsnowextremewindchil")
stormdata$EVTYPE[stormdata$EVTYPE %in% coldwindchill_cases] <- "coldwindchill"

Debris Flow

This event has been properly entered.

Dense fog

densefog_cases <- c("densefog","fog","patchydensefog","vog")
stormdata$EVTYPE[stormdata$EVTYPE %in% densefog_cases] <- "densefog"

Drought

drought_cases <- c("drought","snowdrought")
stormdata$EVTYPE[stormdata$EVTYPE %in% drought_cases] <- "drought"

Heat drought

heatdrought_cases <- c("droughtexcessiveheat","excessiveheatdrought","heatdrought","heatwavedrought")
stormdata$EVTYPE[stormdata$EVTYPE %in% heatdrought_cases] <- "heatdrought"

Dust devil

dust_cases <- c("blowingdust","dustdevel","dustdevil")
stormdata$EVTYPE[stormdata$EVTYPE %in% dust_cases] <- "dustdevil"

Dust storm

This event has been properly entered.

Excessive heat

heat_cases <-
c("excessiveheat","extremeheat","heat","heatburst","heatwave","heatwaves",
"recordexcessiveheat","recordheat","recordheatwave","recordwarmtemps.","recordwarm","unusuallywarm","unseasonablywarmyear","unseasonablyhot","abnormalwarmth","hotweather","hightemperaturerecord","recordhightemperatures")
stormdata$EVTYPE[stormdata$EVTYPE %in% heat_cases] <- "excessiveheat"

Extreme cold / wind chill

coldwindchill_cases <-
c("bitterwindchill","bitterwindchilltemperatures","cold","coldandsnow","coldandwetconditions","coldtemperature","coldtemperatures","coldwave","coldweather","coldwindchill","coldwindchilltemperatures","coldwinds","excessivecold","extendedcold","extremecold","extremecoldwindchill","extremerecordcold","extremewindchill","extremewindchillblowingsno","extremewindchills","extremewindchilltemperatures","fogandcoldtemperatures","highwindlowwindchill","lowwindchill","prolongcold","prolongcoldsnow","recordcold","recordsnowcold","severecold","snowandcold","snowbittercold","snowcold","snow\\cold","unseasonablecold","unseasonablycold","unusuallycold","windchill")
stormdata$EVTYPE[stormdata$EVTYPE %in% coldwindchill_cases] <- "extremecoldwindchill"

Flash Flood

flash_cases <- 
c("flashflood","flashfloodflood", "flashfloodfromicejams","flashfloodheavyrain",
"flashflooding","flashfloodingflood","flashfloodingthunderstormwi",
"flashfloodlandslide","flashfloodlandslides","flashfloods","flashfloodstreet",
"flashfloodwinds","flashfloooding","floodflash","floodflashflood","floodflashflooding",
"floodfloodflash","icestormflashflood","localflashflood")
stormdata$EVTYPE[stormdata$EVTYPE %in% flash_cases] <- "flashflood"

Flood

flood_cases <-
c("flood","flooding","floodrainwind","floodrainwinds","floodriverflood","floods","floodwatch","localflood","majorflood","minorflood","minorflooding","riverandstreamflood","riverflood","riverflooding","ruralflood","smallstreamflood","smallstreamflooding","snowmeltflooding","streamflooding","streetflood","streetflooding","smallstreamandurbanflood","smallstreamandurbanfloodin","smallstreamurbanflood","urbanandsmall","urbanandsmallstream","urbanandsmallstreamflood","urbanandsmallstreamfloodin","urbanflood","urbanflooding","urbanfloods","urbansmall","urbansmallflooding","urbansmallstream","urbansmallstreamflood","urbansmallstreamflooding","urbansmallstrmfldg","urbansmlstreamfld","urbansmlstreamfldg","urbanstreetflooding","icejamflooding")
stormdata$EVTYPE[stormdata$EVTYPE %in% flood_cases] <- "flood"

Frost/Freeze

frostfreeze_cases <-
c("coldandfrost","earlyfrost","firstfrost","frost","frostfreeze",
"frost\\freeze","recordcoldfrost","agriculturalfreeze","damagingfreeze",
"earlyfreeze","freeze","hardfreeze","latefreeze")
stormdata$EVTYPE[stormdata$EVTYPE %in% frostfreeze_cases] <- "frostfreeze"

Funnel Cloud

funnel_cases <- c("coldairfunnel","coldairfunnels","funnel","funnelcloud","funnelcloud.","funnelclouds","funnels","wallcloudfunnelcloud","funnelcloudhail")
stormdata$EVTYPE[stormdata$EVTYPE %in% funnel_cases] <- "funnelcloud"

Freezing Fog

This event has been properly entered.

Hail

hail_cases <-
c("deephail","hail","hail.","hailaloft","haildamage","hailicyroads",
"hailstorm","hailstorms","hailwind","hailwinds","lateseasonhail",
"nonseverehail","smallhail","thunderstormhail","windhail")
stormdata$EVTYPE[stormdata$EVTYPE %in% hail_cases] <- "hail"

Heat

This event has been included in “excessive heat”.

Heavy Rain

heavyrain_cases <-
c("heavyprecipatation","heavyprecipitation","heavyshower","heavyshowers","excessiverain","excessiverainfall","heavyrain","heavyrainandwind","heavyraineffects","heavyrainfall","heavyrains","heavyrainsevereweather","heavyrainsmallstreamurban","heavyrainwind","hvyrain","locallyheavyrain","prolongedrain","rain","raindamage","rainheavy","rainstorm","rainwind","recordexcessiverainfall","recordrainfall","torrentialrain","torrentialrainfall",
"unseasonalrain","excessiveprecipitation")
stormdata$EVTYPE[stormdata$EVTYPE %in% heavyrain_cases] <- "heavyrain"

Heavy snow and ice

heavysnowice_cases <- 
c("heavysnow","heavysnowand","heavysnowandblowingsnow","heavysnowandice",
"heavysnowblowingsnow","heavysnowhigh","heavysnowice","heavysnow&ice",
"heavysnowpack","heavysnowshower","snowandheavysnow",
"snowadvisory","snowheavysnow","snowice","iceandsnow","icesnow","snowsquall","snowsqualls","snowstorm","snowandice")
stormdata$EVTYPE[stormdata$EVTYPE %in% heavysnowice_cases] <- "heavysnowandice"

High Surf

surf_cases <- 
c("hazardoussurf","heavysurf","heavysurfhighsurf","highsurf","highsurfadvisories","highsurfadvisory","roughsurf")
stormdata$EVTYPE[stormdata$EVTYPE %in% surf_cases] <- "highsurf"

High Wind

highwind_cases <-c("highwind","highwindandseas","highwinddamage","highwindg","highwinds","highwindscold","highwindseas","highwindssnow","snowhighwinds")
stormdata$EVTYPE[stormdata$EVTYPE %in% highwind_cases] <- "highwind"

Hurricane / Typhoon

hurtyp_cases <- 
c("hurricane","hurricaneedouard","hurricaneemily","hurricaneerin","hurricanefelix","hurricanegeneratedswells","hurricanegordon","hurricaneopal","hurricaneopalhighwinds","hurricanetyphoon","typhoon")
stormdata$EVTYPE[stormdata$EVTYPE %in% hurtyp_cases] <- "hurricanetyphoon"

Ice Storm

icestorm_cases <- c("glazeicestorm","icestorm")
stormdata$EVTYPE[stormdata$EVTYPE %in% icestorm_cases] <- "icestorm"

Lake effect snow

lakeeffectsnow_cases <- c("heavylakesnow","lakeeffectsnow")
stormdata$EVTYPE[stormdata$EVTYPE %in% lakeeffectsnow_cases] <- "lakeeffectsnow"

Lake shore flood

lakeshoreflood_cases <- c("lakeflood","lakeshoreflood")
stormdata$EVTYPE[stormdata$EVTYPE %in% lakeshoreflood_cases] <- "lakeshoreflood"

Lightning

lightning_cases <- c("lightning","lightning.","lightningandwinds","lightningdamage","lightningfire","lightninginjury","lightningwauseon")
stormdata$EVTYPE[stormdata$EVTYPE %in% lightning_cases] <- "lightning"

Marine hail

This event has been properly entered.

Marine high wind

This event has been properly entered.

Marine strong wind

This event has been properly entered.

Marine thunder storm wind

marinethunderstormwind_cases <- c("marinethunderstormwind","marinetstmwind")
stormdata$EVTYPE[stormdata$EVTYPE %in% marinethunderstormwind_cases] <- "marinethunderstormwind"

Rip Current

rip_cases <- c("ripcurrent","ripcurrents")
stormdata$EVTYPE[stormdata$EVTYPE %in% rip_cases] <- "ripcurrent"

Seiche

This event has been properly entered.

Sleet

sleet_cases <- c("sleet","sleetstorm","snowsleet","snowandsleet","lightsnowandsleet","sleetsnow","snowsleetrain")
stormdata$EVTYPE[stormdata$EVTYPE %in% sleet_cases] <- "sleet"

Storm surge/tide

tide_cases <- c("astronomicalhightide","blowouttide","blowouttides","hightides","stormsurgetide","coastalsurge","stormsurge","tidalflooding")
stormdata$EVTYPE[stormdata$EVTYPE %in% tide_cases] <- "stormsurgetide"

Strong wind

strongwind_cases <- c("stormforcewinds","strongwind","strongwinds","strongwindgust","windadvisory","gustnado","windgusts","gustywind","wind")
stormdata$EVTYPE[stormdata$EVTYPE %in% strongwind_cases] <- "strongwind"

Thunder storm wind

thunderstormwind_cases <-
 c("gustythunderstormwind","gustythunderstormwinds","severethunderstorm","severethunderstorms","severethunderstormwinds","thunderestormwinds","thunderstorm","thunderstormdamage","thunderstormdamageto","thunderstorms","thunderstormswind","thunderstormswinds","thunderstormw","thunderstormwind","thunderstormwind.","thunderstormwindawning","thunderstormwindg","thunderstormwindmph","thunderstormwindmph.","thunderstormwinds","thunderstormwinds.","thunderstormwindsand","thunderstormwindsg","thunderstormwindslecen","thunderstormwindslightning","thunderstormwindsmph","thunderstormwindss","thunderstormwindssmallstrea","thunderstormwindtree","thunderstormwindtrees","thunderstormwins","thunderstormwwinds","thunderstromwind","thunderstromwinds","thundertormwinds","thundertsormwind","tstm","tstmheavyrain","tstmw","tstmwind","tstmwindandlightning","tstmwinddamage","tstmwindg","tstmwindhail","tstmwinds","tstmwnd","thundeerstormwinds","tunderstormwind")    
stormdata$EVTYPE[stormdata$EVTYPE %in% thunderstormwind_cases] <- "thunderstormwind"
# Open question: should tstmwindandlightning, tstmheavyrain and thunderstormwindslightning be left out of this group?

Tornado

tornado_cases <-
c("coldairtornado","tornado","tornadodebris","tornadoes","tornadof","tornados","torndao")
stormdata$EVTYPE[stormdata$EVTYPE %in% tornado_cases] <- "tornado"

Tropical depression

This event has been properly entered.

Tropical storm

tropicalstorm_cases <- c("tropicalstorm","tropicalstormalberto","tropicalstormdean","tropicalstormgordon" ,"tropicalstormjerry")
stormdata$EVTYPE[stormdata$EVTYPE %in% tropicalstorm_cases] <- "tropicalstorm"

Tsunami

This event has been properly entered.

Volcanic ash

volc_cases <- c("volcanicash","volcanicashfall","volcanicashplume","volcaniceruption")
stormdata$EVTYPE[stormdata$EVTYPE %in% volc_cases] <- "volcanicash"

Water spout

waterspout_cases <- 
c("waterspout","waterspoutfunnelcloud","waterspouts","waterspouttornado","wayterspout",
"tornadowaterspout")
stormdata$EVTYPE[stormdata$EVTYPE %in% waterspout_cases] <- "waterspout"

Wild fire

wildfire_cases <- 
c("brushfire","brushfires","forestfires","grassfires","redflagfirewx","wildfire","wildfires","wildforestfire","wildforestfires")
stormdata$EVTYPE[stormdata$EVTYPE %in% wildfire_cases] <- "wildfire"

Winter storm

winterstorm_cases <- c("winterstorm","winterstorms")
stormdata$EVTYPE[stormdata$EVTYPE %in% winterstorm_cases] <- "winterstorm"

Winter weather

winter_cases <- c("wintermix","winterweather","winterweathermix","winterymix")
stormdata$EVTYPE[stormdata$EVTYPE %in% winter_cases] <- "winterweather"

Combined events

#mudslide and landslide
slide_cases <- 
c("landslide","landslides","mudrockslide",
"mudslide","mudslidelandslide","mudslides","rockslide")
stormdata$EVTYPE[stormdata$EVTYPE %in% slide_cases] <- "landslide"
#thunderstormwindlightning
thunderstormwindlightning_cases <- c("thunderstormwindlightning","lightningandthunderstormwin")
stormdata$EVTYPE[stormdata$EVTYPE %in% thunderstormwindlightning_cases] <- "thunderstormwindlightning"
#heavyrainlightning
heavyrainlightning_cases <- c("lightningandheavyrain","heavyrainlightning","lightningheavyrain")
stormdata$EVTYPE[stormdata$EVTYPE %in% heavyrainlightning_cases] <- "heavyrainlightning"
#winterstormhighwind
winterstormhighwind_cases <-
c("winterstormhighwind","winterstormhighwinds")
stormdata$EVTYPE[stormdata$EVTYPE %in% winterstormhighwind_cases] <- "winterstormhighwind"
#highwindwindchill
highwindwindchill_cases <- 
c("highwindwindchill","highwindsandwindchill","windchillhighwind","snowhighwindwindchill")
stormdata$EVTYPE[stormdata$EVTYPE %in% highwindwindchill_cases] <- "highwindwindchill"
#highwindblizzard
highwindblizzard_cases <- 
c("highwindblizzard","highwindblizzardfreezingra","blizzardhighwind")
stormdata$EVTYPE[stormdata$EVTYPE %in% highwindblizzard_cases] <- "highwindblizzard"
#heavysnowhighwind
heavysnowhighwind_cases <-
c("heavysnowhighwind","heavysnowhighwindsfreezing","heavysnowhighwinds","highwindandheavysnow","highwindheavysnow")
stormdata$EVTYPE[stormdata$EVTYPE %in% heavysnowhighwind_cases] <- "heavysnowhighwind"
#highwindheavyrain
highwindheavyrain_cases <- 
c("highwindheavyrain","highwindsheavyrains","highwindsheavyrain")
stormdata$EVTYPE[stormdata$EVTYPE %in% highwindheavyrain_cases] <- "highwindheavyrain"
#unseasonablywarmwet
stormdata$EVTYPE[stormdata$EVTYPE %in% c("unseasonablywarmwet","unseasonablywarm&wet")] <- "unseasonablywarmwet"
#heavysnowicestorm
stormdata$EVTYPE[stormdata$EVTYPE %in% c("snowandicestorm","heavysnowandicestorm","heavysnowicestorm","icestormandsnow")] <- "heavysnowicestorm"
#dryweather
dry_cases <- c("excessivelydry","recorddryness","dryconditions","driestmonth","dryness","abnormallydry","verydry","dryspell","mildanddrypattern","drypattern","hotdrypattern","milddrypattern","unseasonablydry")
stormdata$EVTYPE[stormdata$EVTYPE %in% dry_cases] <- "dryweather"

stormdata$EVTYPE <- as.character(stormdata$EVTYPE)
length(unique(stormdata$EVTYPE))
## [1] 329
unique(stormdata$EVTYPE)
##   [1] "tornado"                     "thunderstormwind"           
##   [3] "hail"                        "freezingrain"               
##   [5] "snow"                        "flashflood"                 
##   [7] "heavysnowandice"             "winterstorm"                
##   [9] "hurricanetyphoon"            "extremecoldwindchill"       
##  [11] "heavyrain"                   "lightning"                  
##  [13] "densefog"                    "ripcurrent"                 
##  [15] "highwind"                    "funnelcloud"                
##  [17] "thunderstormwindshail"       "excessiveheat"              
##  [19] "strongwind"                  "lighting"                   
##  [21] "heavyrainlightning"          "wallcloud"                  
##  [23] "flood"                       "waterspout"                 
##  [25] "blizzard"                    "breakupflooding"            
##  [27] "highwindblizzard"            "frostfreeze"                
##  [29] "coastalflood"                "highwindandhightides"       
##  [31] "stormsurgetide"              "heavysnowhighwind"          
##  [33] "recordcoldandhighwind"       "recordhightemperature"      
##  [35] "recordhigh"                  "highwindheavyrain"          
##  [37] "icestorm"                    "recordlow"                  
##  [39] "highwindwindchill"           "lowtemperaturerecord"       
##  [41] "avalanche"                   "marinemishap"               
##  [43] "highwindwindchillblizzard"   "highseas"                   
##  [45] "severeturbulence"            "recordsnowfall"             
##  [47] "recordwarmth"                "heavysnowwind"              
##  [49] "winddamage"                  "duststorm"                  
##  [51] "apachecounty"                "sleet"                      
##  [53] "dustdevil"                   "thunderstormwindsfunnelclou"
##  [55] "winterstormhighwind"         "gustywinds"                 
##  [57] "floodingheavyrain"           "snowandwind"                
##  [59] "heavysurfcoastalflooding"    "highsurf"                   
##  [61] "wildfire"                    "high"                       
##  [63] "highwindsduststorm"          "landslide"                  
##  [65] "drymicroburst"               "winds"                      
##  [67] "microburst"                  "ice"                        
##  [69] "downburst"                   "gustnadoand"                
##  [71] "wetmicroburst"               "downburstwinds"             
##  [73] "drymicroburstwinds"          "drymircoburstwinds"         
##  [75] "microburstwinds"             "blizzardheavysnow"          
##  [77] "blowingsnow"                 "freezingdrizzle"            
##  [79] "lightningthunderstormwindss" "heavyrainflooding"          
##  [81] "glaze"                       "firstsnow"                  
##  [83] "freezingrainandsleet"        "dryweather"                 
##  [85] "unseasonablywet"             "wintrymix"                  
##  [87] "winterweather"               "ripcurrentsheavysurf"       
##  [89] "sleetrainsnow"               "unseasonablywarm"           
##  [91] "drought"                     "normalprecipitation"        
##  [93] "highwindsflooding"           "dry"                        
##  [95] "rainsnow"                    "snowrainsleet"              
##  [97] "tornadoes,tstmwind,hail"     "tropicalstorm"              
##  [99] "lightningthunderstormwinds"  "thunderstormwindlightning"  
## [101] "ligntning"                   "freezingrainsnow"           
## [103] "thundersnow"                 "coolandwet"                 
## [105] "heavyrainsnow"               "snowsleetfreezingrain"      
## [107] "glazeice"                    "earlysnow"                  
## [109] "smallstreamand"              "excessivewetness"           
## [111] "gradientwinds"               "sleeticestorm"              
## [113] "thunderstormwindsurbanflood" "rotatingwallcloud"          
## [115] "largewallcloud"              "blowingsnowextremewindchi"  
## [117] "freezingrainsleet"           "heavysnowblizzard"          
## [119] "windstorm"                   "lakeshoreflood"             
## [121] "heavysnowicestorm"           "heavysnowsleet"             
## [123] "heatdrought"                 "thundestormwinds"           
## [125] "warmdryconditions"           "highwindscoastalflood"      
## [127] "snowrain"                    "icefloes"                   
## [129] "highwaves"                   "lakeeffectsnow"             
## [131] "heavysnowfreezingrain"       "heavywetsnow"               
## [133] "dustdevilwaterspout"         "thunderstormwindsheavyrain" 
## [135] "blizzardandheavysnow"        "blizzardandextremewindchil" 
## [137] "mudslidesurbanflooding"      "heavysnowwinterstorm"       
## [139] "blizzardwinterstorm"         "duststormhighwinds"         
## [141] "icejam"                      "heavysnowandhighwinds"      
## [143] "heavysnowhighwinds&flood"    "hailflooding"               
## [145] "thunderstormwindsflashflood" "wetsnow"                    
## [147] "heavyrainandflood"           "rainandwind"                
## [149] "snowicestorm"                "belownormalprecipitation"   
## [151] "lightsnow"                   "recordtemperatures"         
## [153] "other"                       "recordsnow"                 
## [155] "heavysnowsqualls"            "icyroads"                   
## [157] "heavymix"                    "snowfreezingrain"           
## [159] "lackofsnow"                  "damfailure"                 
## [161] "thuderstormwinds"            "freezingrainandsnow"        
## [163] "freezingrainsleetand"        "southeast"                  
## [165] "freezingdrizzleandfreezing"  "heavyrain;urbanfloodwinds;" 
## [167] "highwater"                   "snowshowers"                
## [169] "heavysnowblizzardavalanche"  "wetweather"                 
## [171] "unseasonablywarmanddry"      "freezingrainsleetandlight"  
## [173] "tidalflood"                  "beacherosin"                
## [175] "lowtemperature"              "sleet&freezingrain"         
## [177] "heavyrainsflooding"          "thunderstormwindsflooding"  
## [179] "highwayflooding"             "hypothermia"                
## [181] "thunerstormwinds"            "heavyrainmudslidesflood"    
## [183] "dryhotweather"               "rapidlyrisingwater"         
## [185] "icestrongwinds"              "heavysnowandstrongwinds"    
## [187] "snowaccumulation"            "snowblowingsnow"            
## [189] "thunderstormwindhail"        "thunderstormwindsflood"     
## [191] "nearrecordsnow"              "excessive"                  
## [193] "heavyseas"                   "flood&heavyrain"            
## [195] "?"                           "hotpattern"                 
## [197] "snowfallrecord"              "mildpattern"                
## [199] "saharandust"                 "urbanfloodlandslide"        
## [201] "heavyswells"                 "smallstream"                
## [203] "heavyrainurbanflood"         "landslideurbanflood"        
## [205] "recorddrymonth"              "temperaturerecord"          
## [207] "icejamfloodminor"            "marineaccident"             
## [209] "coastalstorm"                "erosioncstlflood"           
## [211] "lightsnowflurries"           "wetmonth"                   
## [213] "wetyear"                     "beacherosion"               
## [215] "hotanddry"                   "heavyrainhighsurf"          
## [217] "icefog"                      "landslump"                  
## [219] "lateseasonsnowfall"          "freezingfog"                
## [221] "driftingsnow"                "whirlwind"                  
## [223] "latesnow"                    "recordmaysnow"              
## [225] "recordwintersnow"            "recordtemperature"          
## [227] "mixedprecip"                 "blackice"                   
## [229] "gradientwind"                "freezingspray"              
## [231] "summaryjan"                  "summaryofmarch"             
## [233] "summaryofaprilrd"            "summaryofapril"             
## [235] "summaryaugust"               "summaryofmay"               
## [237] "summaryofmayam"              "summaryofmaypm"             
## [239] "metrostorm,may"              "summaryofjune"              
## [241] "summaryjune"                 "summaryofjuly"              
## [243] "summaryjuly"                 "summaryofaugust"            
## [245] "summaryseptember"            "summarysept."               
## [247] "summary:oct."                "summary:october"            
## [249] "summary:nov."                "wetmicoburst"               
## [251] "nosevereweather"             "summary:sept."              
## [253] "lightsnowfall"               "gustywindrain"              
## [255] "gustywindhvyrain"            "earlysnowfall"              
## [257] "monthlysnowfall"             "seasonalsnowfall"           
## [259] "monthlyrainfall"             "smlstreamfld"               
## [261] "volcanicash"                 "thundersnowshower"          
## [263] "none"                        "dambreak"                   
## [265] "sleetfreezingrain"           "hypothermiaexposure"        
## [267] "mixedprecipitation"          "icestormblizzard"           
## [269] "floodstrongwind"             "mountainsnows"              
## [271] "heavysurfandwind"            "highswells"                 
## [273] "earlyrain"                   "hotspell"                   
## [275] "unusualwarmth"               "wakelowwind"                
## [277] "moderatesnow"                "moderatesnowfall"           
## [279] "coastalerosion"              "unusualrecordwarmth"        
## [281] "seiche"                      "hyperthermiaexposure"       
## [283] "icepellets"                  "recordcool"                 
## [285] "tropicaldepression"          "coolspell"                  
## [287] "gustywindhail"               "lightsnowfreezingprecip"    
## [289] "monthlyprecipitation"        "monthlytemperature"         
## [291] "remnantsoffloyd"             "landspout"                  
## [293] "excessivesnow"               "windandwave"                
## [295] "lightfreezingrain"           "recordprecipitation"        
## [297] "iceroads"                    "roughseas"                  
## [299] "unseasonablywarmwet"         "unseasonablycool&wet"       
## [301] "nonseverewinddamage"         "warmweather"                
## [303] "unseasonallowtemp"           "lateseasonsnow"             
## [305] "gustylakewind"               "redflagcriteria"            
## [307] "wnd"                         "smoke"                      
## [309] "extremelywet"                "unusuallylatesnow"          
## [311] "recordlowrainfall"           "roguewave"                  
## [313] "prolongwarmth"               "accumulatedsnowfall"        
## [315] "fallingsnowice"              "nontstmwind"                
## [317] "patchyice"                   "northernlights"             
## [319] "marinethunderstormwind"      "verywarm"                   
## [321] "abnormallywet"               "iceonroad"                  
## [323] "drowning"                    "marinehail"                 
## [325] "marinehighwind"              "tsunami"                    
## [327] "densesmoke"                  "marinestrongwind"           
## [329] "astronomicallowtide"

Results

Impact on public health

The numbers of injuries and fatalities are used to identify the events with the greatest impact on public health. The following code sums injuries and fatalities separately for each event type and plots the ten event types with the most injuries and the ten with the most fatalities.

# Injuries

event_injury <- aggregate(INJURIES~EVTYPE,data=stormdata,sum)
event_injury <- arrange(event_injury,desc(INJURIES))
head(event_injury,10)
##              EVTYPE INJURIES
## 1           tornado    91364
## 2  thunderstormwind     9507
## 3     excessiveheat     9209
## 4             flood     6873
## 5         lightning     5231
## 6          icestorm     1990
## 7        flashflood     1802
## 8          wildfire     1608
## 9          highwind     1506
## 10             hail     1371
injury_top10 <- event_injury[1:10,]

# Fatalities
event_fatality <- aggregate(FATALITIES~EVTYPE,data=stormdata,sum)
event_fatality <- arrange(event_fatality,desc(FATALITIES))
head(event_fatality,10)
##                  EVTYPE FATALITIES
## 1               tornado       5633
## 2         excessiveheat       3132
## 3            flashflood       1035
## 4             lightning        817
## 5      thunderstormwind        711
## 6            ripcurrent        572
## 7                 flood        511
## 8  extremecoldwindchill        468
## 9              highwind        293
## 10            avalanche        225
fatality_top10 <- event_fatality[1:10,]
g_inj <- ggplot(injury_top10,aes(x=reorder(EVTYPE,-INJURIES),y=INJURIES,fill=INJURIES))
g_inj + geom_bar(stat="identity") + theme(axis.text.x  = element_text(angle=90,vjust=0.5,hjust=1)) + xlab("") + ggtitle("Top 10 events with maximum injuries")

g_fat <- ggplot(fatality_top10,aes(x=reorder(EVTYPE,-FATALITIES),y=FATALITIES,fill=FATALITIES))
g_fat + geom_bar(stat="identity") + theme(axis.text.x  = element_text(angle=90,vjust=0.5,hjust=1)) + xlab("") + ggtitle("Top 10 events with maximum fatalities")

Tornadoes have caused the most harm to public health, with 5,633 fatalities and 91,364 injuries recorded from 1950 to 2011.
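The two aggregations above can also be done in a single call. The following is a minimal sketch in base R on a small made-up data frame (`toy` and its values are illustrative assumptions, not rows from the NOAA data). It uses `cbind()` on the left-hand side of the formula so that `aggregate()` sums both columns at once.

```r
# Toy data standing in for stormdata; the values are invented for illustration.
toy <- data.frame(
  EVTYPE     = c("tornado", "flood", "tornado", "lightning", "flood"),
  FATALITIES = c(3, 1, 2, 1, 0),
  INJURIES   = c(40, 5, 10, 2, 1)
)

# Sum both health measures per event type in one pass.
health_totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                           data = toy, FUN = sum)

# Sort by fatalities, breaking ties by injuries (both descending).
health_totals <- health_totals[order(-health_totals$FATALITIES,
                                     -health_totals$INJURIES), ]
print(health_totals)
```

Keeping both totals in one data frame makes it easier to compare the fatality and injury rankings side by side, although the report above keeps them separate because each measure is plotted on its own.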

Impact on the economy

Damage to crops and property is used to identify the events with the greatest impact on the economy.

According to page 12 of the documentation, crop damage is given to three significant digits in the CROPDMG column and its exponent is given in the CROPDMGEXP column. For property damage the corresponding columns are PROPDMG and PROPDMGEXP. However, this convention was not followed consistently when the data was entered, so some care is needed when extracting the dollar amounts. The following code treats the letters h, k, m and b (in either case) as exponents 2, 3, 6 and 9, treats a single digit as the exponent itself, and treats blank entries and the symbols "+", "-" and "?" as exponent 0.

# Damage to crops:

stormdata$CROPDMGEXP_ACTUAL <- 0

for (ii in 0:9) {
  stormdata$CROPDMGEXP_ACTUAL[stormdata$CROPDMGEXP == as.character(ii)] <- ii
}
stormdata$CROPDMGEXP_ACTUAL[stormdata$CROPDMGEXP %in% c("","+","-","?")] <- 0
stormdata$CROPDMGEXP_ACTUAL[stormdata$CROPDMGEXP %in% c("h","H")] <- 2
stormdata$CROPDMGEXP_ACTUAL[stormdata$CROPDMGEXP %in% c("k","K")] <- 3
stormdata$CROPDMGEXP_ACTUAL[stormdata$CROPDMGEXP %in% c("m","M")] <- 6
stormdata$CROPDMGEXP_ACTUAL[stormdata$CROPDMGEXP %in% c("b","B")] <- 9

stormdata$CROPDMG_ACTUAL <- stormdata$CROPDMG * 10^stormdata$CROPDMGEXP_ACTUAL


# Damage to property:

stormdata$PROPDMGEXP_ACTUAL <- 0

for (ii in 0:9) {
  stormdata$PROPDMGEXP_ACTUAL[stormdata$PROPDMGEXP == as.character(ii)] <- ii
}
stormdata$PROPDMGEXP_ACTUAL[stormdata$PROPDMGEXP %in% c("","+","-","?")] <- 0
stormdata$PROPDMGEXP_ACTUAL[stormdata$PROPDMGEXP %in% c("h","H")] <- 2
stormdata$PROPDMGEXP_ACTUAL[stormdata$PROPDMGEXP %in% c("k","K")] <- 3
stormdata$PROPDMGEXP_ACTUAL[stormdata$PROPDMGEXP %in% c("m","M")] <- 6
stormdata$PROPDMGEXP_ACTUAL[stormdata$PROPDMGEXP %in% c("b","B")] <- 9

stormdata$PROPDMG_ACTUAL <- stormdata$PROPDMG * 10^stormdata$PROPDMGEXP_ACTUAL

# Total damage:
stormdata$TOTAL_DAMAGE <- stormdata$CROPDMG_ACTUAL + stormdata$PROPDMG_ACTUAL
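The same decoding rule can be written once and reused for both columns. The sketch below is one possible way to do this, not the code used to produce the results in this report: `decode_exponent` is a hypothetical helper, and `toy_dmg` and `toy_exp` are made-up values. It follows the same assumptions as the code above (letters map to 2, 3, 6 and 9, a single digit is used as the exponent, anything else becomes 0).

```r
# Hypothetical helper: convert an exponent code to a power of ten.
decode_exponent <- function(code) {
  code <- toupper(as.character(code))
  lookup <- c(H = 2, K = 3, M = 6, B = 9)
  power <- unname(lookup[code])            # NA for codes not in the lookup
  is_digit <- grepl("^[0-9]$", code)       # single digits are used as-is
  power[is_digit] <- as.numeric(code[is_digit])
  power[is.na(power)] <- 0                 # "", "+", "-", "?" and other codes
  power
}

# Made-up damage figures and exponent codes for illustration.
toy_dmg <- c(25, 2.5, 1.2, 10, 5)
toy_exp <- c("K", "m", "B", "", "2")

toy_actual <- toy_dmg * 10^decode_exponent(toy_exp)
print(toy_actual)
```

With such a helper, the crop and property columns could each be decoded with a single call, which avoids keeping two copies of the mapping in sync.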

The columns stormdata$CROPDMG_ACTUAL and stormdata$PROPDMG_ACTUAL give the damage in US dollars to crops and property, respectively, and TOTAL_DAMAGE is the sum of these two quantities. The following code sums the total damage for each event type and plots the ten event types with the highest totals.

event_damage <- arrange(aggregate(TOTAL_DAMAGE~EVTYPE,data=stormdata,sum),desc(TOTAL_DAMAGE))
head(event_damage,10)
##              EVTYPE TOTAL_DAMAGE
## 1             flood 161063274407
## 2  hurricanetyphoon  90872527810
## 3           tornado  57367113946
## 4    stormsurgetide  47975517150
## 5        flashflood  19122009246
## 6              hail  19024483136
## 7           drought  15018672000
## 8  thunderstormwind  12449411764
## 9          icestorm   8967041360
## 10         wildfire   8899910130
damage_top10 <- event_damage[1:10,]
gdam <- ggplot(damage_top10,aes(x=reorder(EVTYPE,-TOTAL_DAMAGE),y=TOTAL_DAMAGE/1e9,fill=TOTAL_DAMAGE/1e9))
gdam + geom_bar(stat="identity") + theme(axis.text.x  = element_text(angle=90,vjust=0.5,hjust=1)) + xlab("") + ylab("DAMAGE (BILLION USD)") + scale_fill_continuous("BILLION USD") + ggtitle("Top 10 events with maximum economic damage")

Floods have caused the greatest economic damage, with property and crop losses of more than 160 billion USD from 1950 to 2011.

References and Notes

  1. This assignment is part of the course “Reproducible Research” offered by Coursera at https://www.coursera.org/learn/reproducible-research and taught by Prof. Roger D. Peng, Johns Hopkins University.

  2. The data is available at https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2 and the documentation is available at https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf.