This page documents in more detail how the EVTYPE variable was cleaned.

Calling unique() on the EVTYPE variable returns 222 event types, far more than the 48 listed in the Storm Data Documentation. This suggests that typos, inconsistent capitalization, and other entry errors have produced redundant labels.

unique(storm_harmful$EVTYPE)
##   [1] "WINTER STORM"              "TORNADO"                  
##   [3] "TSTM WIND"                 "HIGH WIND"                
##   [5] "FLASH FLOOD"               "FREEZING RAIN"            
##   [7] "EXTREME COLD"              "LIGHTNING"                
##   [9] "HAIL"                      "FLOOD"                    
##  [11] "TSTM WIND/HAIL"            "EXCESSIVE HEAT"           
##  [13] "RIP CURRENTS"              "Other"                    
##  [15] "HEAVY SNOW"                "WILD/FOREST FIRE"         
##  [17] "ICE STORM"                 "BLIZZARD"                 
##  [19] "STORM SURGE"               "Ice jam flood (minor"     
##  [21] "DUST STORM"                "STRONG WIND"              
##  [23] "DUST DEVIL"                "Tstm Wind"                
##  [25] "URBAN/SML STREAM FLD"      "FOG"                      
##  [27] "ROUGH SURF"                "Heavy Surf"               
##  [29] "Dust Devil"                "HEAVY RAIN"               
##  [31] "Marine Accident"           "AVALANCHE"                
##  [33] "Freeze"                    "DRY MICROBURST"           
##  [35] "Strong Wind"               "WINDS"                    
##  [37] "COASTAL STORM"             "Erosion/Cstl Flood"       
##  [39] "River Flooding"            "WATERSPOUT"               
##  [41] "DAMAGING FREEZE"           "Damaging Freeze"          
##  [43] "HURRICANE"                 "TROPICAL STORM"           
##  [45] "Beach Erosion"             "High Surf"                
##  [47] "Heavy Rain/High Surf"      "Unseasonable Cold"        
##  [49] "Early Frost"               "Wintry Mix"               
##  [51] "Extreme Cold"              "DROUGHT"                  
##  [53] "Coastal Flooding"          "Torrential Rainfall"      
##  [55] "Landslump"                 "Hurricane Edouard"        
##  [57] "Coastal Storm"             "TIDAL FLOODING"           
##  [59] "Tidal Flooding"            "Strong Winds"             
##  [61] "EXTREME WINDCHILL"         "Glaze"                    
##  [63] "Extended Cold"             "Whirlwind"                
##  [65] "Heavy snow shower"         "Light snow"               
##  [67] "COASTAL FLOOD"             "Light Snow"               
##  [69] "MIXED PRECIP"              "COLD"                     
##  [71] "Freezing Spray"            "DOWNBURST"                
##  [73] "Mudslides"                 "Microburst"               
##  [75] "Mudslide"                  "Cold"                     
##  [77] "SNOW"                      "Coastal Flood"            
##  [79] "Snow Squalls"              "Wind Damage"              
##  [81] "Light Snowfall"            "Freezing Drizzle"         
##  [83] "Gusty wind/rain"           "GUSTY WIND/HVY RAIN"      
##  [85] "Wind"                      "Cold Temperature"         
##  [87] "Heat Wave"                 "Snow"                     
##  [89] "COLD AND SNOW"             "HEAVY SURF"               
##  [91] "RAIN/SNOW"                 "WIND"                     
##  [93] "FREEZE"                    "TSTM WIND (G45)"          
##  [95] "Gusty Winds"               "GUSTY WIND"               
##  [97] "TSTM WIND 40"              "TSTM WIND 45"             
##  [99] "HARD FREEZE"               "TSTM WIND (41)"           
## [101] "HEAT"                      "RIVER FLOOD"              
## [103] "TSTM WIND (G40)"           "RIP CURRENT"              
## [105] "HIGH SURF"                 "MUD SLIDE"                
## [107] "Frost/Freeze"              "SNOW AND ICE"             
## [109] "COASTAL FLOODING"          "AGRICULTURAL FREEZE"      
## [111] "WINTER WEATHER"            "STRONG WINDS"             
## [113] "SNOW SQUALL"               "ICY ROADS"                
## [115] "OTHER"                     "THUNDERSTORM"             
## [117] "Hypothermia/Exposure"      "HYPOTHERMIA/EXPOSURE"     
## [119] "Lake Effect Snow"          "Freezing Rain"            
## [121] "Mixed Precipitation"       "BLACK ICE"                
## [123] "COASTALSTORM"              "LIGHT SNOW"               
## [125] "DAM BREAK"                 "Gusty winds"              
## [127] "blowing snow"              "FREEZING DRIZZLE"         
## [129] "FROST"                     "GRADIENT WIND"            
## [131] "UNSEASONABLY COLD"         "GUSTY WINDS"              
## [133] "TSTM WIND AND LIGHTNING"   "gradient wind"            
## [135] "Gradient wind"             "Freezing drizzle"         
## [137] "WET MICROBURST"            "Heavy surf and wind"      
## [139] "FUNNEL CLOUD"              "TYPHOON"                  
## [141] "LANDSLIDES"                "HIGH SWELLS"              
## [143] "HIGH WINDS"                "SMALL HAIL"               
## [145] "UNSEASONAL RAIN"           "COASTAL FLOODING/EROSION" 
## [147] " TSTM WIND (G45)"          "TSTM WIND  (G45)"         
## [149] "HIGH WIND (G40)"           "TSTM WIND (G35)"          
## [151] "GLAZE"                     "COASTAL EROSION"          
## [153] "UNSEASONABLY WARM"         "SEICHE"                   
## [155] "COASTAL  FLOODING/EROSION" "HYPERTHERMIA/EXPOSURE"    
## [157] "WINTRY MIX"                "RIVER FLOODING"           
## [159] "ROCK SLIDE"                "GUSTY WIND/HAIL"          
## [161] "HEAVY SEAS"                " TSTM WIND"               
## [163] "LANDSPOUT"                 "RECORD HEAT"              
## [165] "EXCESSIVE SNOW"            "LAKE EFFECT SNOW"         
## [167] "FLOOD/FLASH/FLOOD"         "MIXED PRECIPITATION"      
## [169] "WIND AND WAVE"             "FLASH FLOOD/FLOOD"        
## [171] "LIGHT FREEZING RAIN"       "ICE ROADS"                
## [173] "HIGH SEAS"                 "RAIN"                     
## [175] "ROUGH SEAS"                "TSTM WIND G45"            
## [177] "NON-SEVERE WIND DAMAGE"    "WARM WEATHER"             
## [179] "THUNDERSTORM WIND (G40)"   "LANDSLIDE"                
## [181] "HIGH WATER"                " FLASH FLOOD"             
## [183] "LATE SEASON SNOW"          "WINTER WEATHER MIX"       
## [185] "ROGUE WAVE"                "FALLING SNOW/ICE"         
## [187] "NON-TSTM WIND"             "NON TSTM WIND"            
## [189] "MUDSLIDE"                  "BRUSH FIRE"               
## [191] "BLOWING DUST"              "VOLCANIC ASH"             
## [193] "   HIGH SURF ADVISORY"     "HAZARDOUS SURF"           
## [195] "WILDFIRE"                  "COLD WEATHER"             
## [197] "WHIRLWIND"                 "ICE ON ROAD"              
## [199] "SNOW SQUALLS"              "DROWNING"                 
## [201] "EXTREME COLD/WIND CHILL"   "MARINE TSTM WIND"         
## [203] "HURRICANE/TYPHOON"         "DENSE FOG"                
## [205] "WINTER WEATHER/MIX"        "FROST/FREEZE"             
## [207] "ASTRONOMICAL HIGH TIDE"    "HEAVY SURF/HIGH SURF"     
## [209] "TROPICAL DEPRESSION"       "LAKE-EFFECT SNOW"         
## [211] "MARINE HIGH WIND"          "THUNDERSTORM WIND"        
## [213] "TSUNAMI"                   "STORM SURGE/TIDE"         
## [215] "COLD/WIND CHILL"           "LAKESHORE FLOOD"          
## [217] "MARINE THUNDERSTORM WIND"  "MARINE STRONG WIND"       
## [219] "ASTRONOMICAL LOW TIDE"     "DENSE SMOKE"              
## [221] "MARINE HAIL"               "FREEZING FOG"

Cleaning Process

Compare the 222 event types in storm_harmful with the Storm Data Event Table (section 2.1.1 of the Storm Data Documentation) and reduce the number of event types step by step. The command length(unique(storm_harmful$EVTYPE)) reports the number of distinct event types after each cleaning step.

  1. Convert all the letters to upper case.
storm_harmful$EVTYPE=toupper(storm_harmful$EVTYPE)
length(unique(storm_harmful$EVTYPE))
## [1] 186
  2. Some event types contain leading whitespace, for example " FLASH FLOOD", which should be "FLASH FLOOD". Remove it with str_trim, which strips both leading and trailing whitespace by default.
library(stringr)
storm_harmful$EVTYPE=str_trim(storm_harmful$EVTYPE)
length(unique(storm_harmful$EVTYPE))
## [1] 183
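
As a rough illustration of the two normalization steps above, the sketch below applies them to a handful of values copied from the listing. It uses base R's trimws() in place of str_trim() so that it runs without stringr (a substitution made here, not the code used above). The final gsub() step, which collapses repeated internal spaces, is an optional extra that the process above does not perform.

```r
# Toy sample of raw values copied from the unique() listing above.
raw <- c("TSTM WIND", "Tstm Wind", " TSTM WIND",
         "FLASH FLOOD", " FLASH FLOOD",
         "TSTM WIND (G45)", "TSTM WIND  (G45)")

upper   <- toupper(raw)    # step 1: unify letter case
trimmed <- trimws(upper)   # step 2: drop leading and trailing whitespace

# Optional extra (an assumption, not part of the original process):
# collapse runs of spaces so "TSTM WIND  (G45)" equals "TSTM WIND (G45)".
squeezed <- gsub(" +", " ", trimmed)

# Number of distinct labels after each stage.
counts <- c(raw      = length(unique(raw)),
            upper    = length(unique(upper)),
            trimmed  = length(unique(trimmed)),
            squeezed = length(unique(squeezed)))
print(counts)
```

In this toy sample each stage removes at least one redundant label, which mirrors the drop from 222 to 186 to 183 reported above.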
  3. Check and replace redundant event types (roughly in the alphabetical order of the Storm Data Event Table).

Principles:

  1. If an event type name is not an effective name, check the Storm Data Documentation to see whether it is covered by an effective event type. For example, LANDSLIDES is not an effective event type name, but it is described in the DEBRIS FLOW section, so DEBRIS FLOW is used as the effective event type name.
  2. If two event types appear in one event type name:
    1. If both events fall under one effective event type, use that effective event type name, e.g. SNOW AND ICE -> WINTER WEATHER.
    2. If one is an effective name but the other is not, take the effective one, e.g. GUSTY WIND/HAIL -> HAIL.
    3. Otherwise, take whichever comes first, e.g. HEAVY RAIN/HIGH SURF -> HEAVY RAIN.
  3. Keep event type OTHER, and put event types that cannot be sorted into this group.
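
The rules for compound names in principle 2 can be sketched as a small function. This is only an illustration of the decision order, not the code used below: the names effective, same_type, and resolve_compound are hypothetical, and effective holds just a few entries of the 48-entry table.

```r
# Hypothetical subset of the effective names, for illustration only.
effective <- c("HAIL", "HEAVY RAIN", "HIGH SURF", "WINTER WEATHER")

# Hypothetical lookup for principle 2.1: compound names whose parts
# both fall under a single effective event type.
same_type <- c("SNOW AND ICE" = "WINTER WEATHER")

resolve_compound <- function(name) {
  # Principle 2.1: both parts belong to one effective type.
  if (name %in% names(same_type)) {
    return(unname(same_type[name]))
  }
  # Split the compound name on "/" or " AND ".
  parts <- trimws(strsplit(name, "/| AND ")[[1]])
  hits  <- parts[parts %in% effective]
  # Principle 2.2 (one part is effective) and 2.3 (both are, take the first).
  if (length(hits) > 0) {
    return(hits[1])
  }
  # Neither part is effective: return the first part for a later pass.
  parts[1]
}

resolved <- vapply(c("SNOW AND ICE", "GUSTY WIND/HAIL", "HEAVY RAIN/HIGH SURF"),
                   resolve_compound, character(1))
print(resolved)
```

In practice the cleaning below applies these rules by hand with grep patterns, which is easier to audit for a few dozen labels than a general-purpose resolver.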

Clean event types related to COASTAL FLOOD.

table(storm_harmful[grep("COASTAL",storm_harmful$EVTYPE),]$EVTYPE)
## 
## COASTAL  FLOODING/EROSION           COASTAL EROSION 
##                         1                         1 
##             COASTAL FLOOD          COASTAL FLOODING 
##                       153                        35 
##  COASTAL FLOODING/EROSION             COASTAL STORM 
##                         3                         4 
##              COASTALSTORM 
##                         1
storm_harmful[grep("COASTAL",storm_harmful$EVTYPE),]$EVTYPE="COASTAL FLOOD"
length(unique(storm_harmful$EVTYPE))
## [1] 177
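
The row-subset assignment used above works because the pattern matches at least one row. As far as I can tell, when grep() returns no matches the same idiom stops with a "replacement has 1 row, data has 0" error. A minimal sketch of a more forgiving variant, using a hypothetical helper named relabel() that indexes the column directly with grepl():

```r
# Hypothetical helper: relabel every element of x that matches `pattern`.
relabel <- function(x, pattern, new_label) {
  hit <- grepl(pattern, x)   # logical vector; all FALSE when nothing matches
  x[hit] <- new_label        # assigning to an empty selection changes nothing
  x
}

# Toy vector standing in for storm_harmful$EVTYPE.
evtype <- c("COASTAL FLOOD", "COASTALSTORM", "COASTAL EROSION", "TORNADO")
evtype <- relabel(evtype, "COASTAL", "COASTAL FLOOD")
evtype <- relabel(evtype, "NO SUCH TYPE", "OTHER")   # no match: silent no-op
print(evtype)
```

On the real data this would be called as storm_harmful$EVTYPE <- relabel(storm_harmful$EVTYPE, "COASTAL", "COASTAL FLOOD").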

Clean event types related to COLD/WINDCHILL.

table(storm_harmful[grep("COLD|HYPOTHERMIA|WINDCHILL",storm_harmful$EVTYPE),]$EVTYPE)
## 
##                    COLD           COLD AND SNOW        COLD TEMPERATURE 
##                      20                       1                       2 
##            COLD WEATHER         COLD/WIND CHILL           EXTENDED COLD 
##                       1                      90                       1 
##            EXTREME COLD EXTREME COLD/WIND CHILL       EXTREME WINDCHILL 
##                     166                     111                      19 
##    HYPOTHERMIA/EXPOSURE       UNSEASONABLE COLD       UNSEASONABLY COLD 
##                       6                       1                       3
storm_harmful[grep("^COLD|HYPOTHERMIA",storm_harmful$EVTYPE),]$EVTYPE="COLD/WIND CHILL"
storm_harmful[grep(" COLD|WINDCHILL",storm_harmful$EVTYPE),]$EVTYPE="EXTREME COLD/WIND CHILL"
length(unique(storm_harmful$EVTYPE))
## [1] 167
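
The two replacements above depend on the ^ anchor and on the order in which they run. The sketch below replays them on the twelve names from the table, using direct vector indexing instead of the data frame, to show which names each pattern captures.

```r
# The twelve cold-related names from the table above.
cold <- c("COLD", "COLD AND SNOW", "COLD TEMPERATURE", "COLD WEATHER",
          "COLD/WIND CHILL", "EXTENDED COLD", "EXTREME COLD",
          "EXTREME COLD/WIND CHILL", "EXTREME WINDCHILL",
          "HYPOTHERMIA/EXPOSURE", "UNSEASONABLE COLD", "UNSEASONABLY COLD")

cleaned <- cold
# First pass: names that start with COLD, plus hypothermia.
cleaned[grepl("^COLD|HYPOTHERMIA", cleaned)] <- "COLD/WIND CHILL"
# Second pass: names with a word before COLD, plus WINDCHILL.
# "COLD/WIND CHILL" is safe here because it has no space before COLD.
cleaned[grepl(" COLD|WINDCHILL", cleaned)] <- "EXTREME COLD/WIND CHILL"

print(table(cleaned))

# Without the anchor, the first pattern would capture almost everything.
n_unanchored <- sum(grepl("COLD|HYPOTHERMIA", cold))
print(n_unanchored)
```

Twelve names collapse into two, which agrees with the drop from 177 to 167 shown above.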

Clean event types related to DEBRIS FLOW.

table(storm_harmful[grep("SLIDE|SLUMP",storm_harmful$EVTYPE),]$EVTYPE)
## 
##  LANDSLIDE LANDSLIDES  LANDSLUMP  MUD SLIDE   MUDSLIDE  MUDSLIDES 
##        190          1          1          2          5          1 
## ROCK SLIDE 
##          1
storm_harmful[grep("SLIDE|SLUMP",storm_harmful$EVTYPE),]$EVTYPE="DEBRIS FLOW"
length(unique(storm_harmful$EVTYPE))
## [1] 161

Clean event types related to FOG.

table(storm_harmful[grep("FOG",storm_harmful$EVTYPE),]$EVTYPE)
## 
##    DENSE FOG          FOG FREEZING FOG 
##           58          101            7
library(ggplot2)
fog=storm_harmful[grep("FOG",storm_harmful$EVTYPE),]
qplot(INJURIES,data=fog,facets=.~EVTYPE)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

[Figure: histograms of INJURIES, faceted by EVTYPE (DENSE FOG, FOG, FREEZING FOG)]

qplot(FATALITIES,data=fog,facets=.~EVTYPE)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

[Figure: histograms of FATALITIES, faceted by EVTYPE (DENSE FOG, FOG, FREEZING FOG)]

qplot(PROP,data=fog,facets=.~EVTYPE)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

[Figure: histograms of PROP, faceted by EVTYPE (DENSE FOG, FOG, FREEZING FOG)]

qplot(CROP,data=fog,facets=.~EVTYPE)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

[Figure: histograms of CROP, faceted by EVTYPE (DENSE FOG, FOG, FREEZING FOG)]

Although only DENSE FOG and FREEZING FOG are effective event types according to the Storm Data Event Table, FOG appears much more often in the data. Exploratory plots of INJURIES, FATALITIES, PROP, and CROP did not show clear distinctions among DENSE FOG, FOG, and FREEZING FOG, so there was no basis for assigning FOG to either effective type, and I decided to keep FOG as it is.
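
A numeric summary can complement the histograms when comparing the three groups. The sketch below uses base R's aggregate() on a toy data frame; the values in fog_toy are invented purely for illustration and are not taken from storm_harmful.

```r
# Toy data with INVENTED values, standing in for the fog subset.
fog_toy <- data.frame(
  EVTYPE     = c("DENSE FOG", "DENSE FOG", "FOG", "FOG", "FREEZING FOG"),
  INJURIES   = c(0, 4, 1, 3, 2),
  FATALITIES = c(1, 1, 0, 2, 0)
)

# Mean injuries and fatalities per event type.
summary_by_type <- aggregate(cbind(INJURIES, FATALITIES) ~ EVTYPE,
                             data = fog_toy, FUN = mean)
print(summary_by_type)
```

On the real data, the same call with data = fog (and PROP and CROP added inside cbind()) would give one row of group means per fog type.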

Clean event types related to DUST.

table(storm_harmful[grep("DUST",storm_harmful$EVTYPE),]$EVTYPE)
## 
## BLOWING DUST   DUST DEVIL   DUST STORM 
##            1           84           96
storm_harmful[grep("BLOWING DUST",storm_harmful$EVTYPE),]$EVTYPE="DUST STORM"
length(unique(storm_harmful$EVTYPE))
## [1] 160

Clean event types related to HEAT.

table(storm_harmful[grep("HEAT|HYPERTHERMIA|WARM",storm_harmful$EVTYPE),]$EVTYPE)
## 
##        EXCESSIVE HEAT                  HEAT             HEAT WAVE 
##                   685                   164                     1 
## HYPERTHERMIA/EXPOSURE           RECORD HEAT     UNSEASONABLY WARM 
##                     1                     1                     5 
##          WARM WEATHER 
##                     1
storm_harmful[grep("HYPERTHERMIA|WARM",storm_harmful$EVTYPE),]$EVTYPE="HEAT"
storm_harmful[grep("HEAT WAVE|RECORD HEAT",storm_harmful$EVTYPE),]$EVTYPE="EXCESSIVE HEAT"
length(unique(storm_harmful$EVTYPE))
## [1] 155

Clean event types related to FLASH FLOOD.

table(storm_harmful[grep("FLASH|DAM BREAK|HIGH WATER",storm_harmful$EVTYPE),]$EVTYPE)
## 
##         DAM BREAK       FLASH FLOOD FLASH FLOOD/FLOOD FLOOD/FLASH/FLOOD 
##                 2             19012                 1                 1 
##        HIGH WATER 
##                 2
storm_harmful[grep("FLASH|DAM BREAK|HIGH WATER",storm_harmful$EVTYPE),]$EVTYPE="FLASH FLOOD"
length(unique(storm_harmful$EVTYPE))
## [1] 151

Clean event types related to FLOOD.

table(storm_harmful[grep("FLOOD",storm_harmful$EVTYPE),]$EVTYPE)
## 
##        COASTAL FLOOD   EROSION/CSTL FLOOD          FLASH FLOOD 
##                  198                    2                19018 
##                FLOOD ICE JAM FLOOD (MINOR      LAKESHORE FLOOD 
##                 9513                    1                    5 
##          RIVER FLOOD       RIVER FLOODING       TIDAL FLOODING 
##                   80                    6                    4
storm_harmful[grep("RIVER",storm_harmful$EVTYPE),]$EVTYPE="FLOOD"
storm_harmful[grep("CSTL|TIDAL FLOODING",storm_harmful$EVTYPE),]$EVTYPE="COASTAL FLOOD"
storm_harmful[grep("ICE JAM",storm_harmful$EVTYPE),]$EVTYPE="FLASH FLOOD"
length(unique(storm_harmful$EVTYPE))
## [1] 146

Clean event types related to FROST/FREEZE.

table(storm_harmful[grep("FROST|FREEZE",storm_harmful$EVTYPE),]$EVTYPE)
## 
## AGRICULTURAL FREEZE     DAMAGING FREEZE         EARLY FROST 
##                   3                   3                   1 
##              FREEZE               FROST        FROST/FREEZE 
##                  14                   1                 117 
##         HARD FREEZE 
##                   1
storm_harmful[grep("FROST|FREEZE",storm_harmful$EVTYPE),]$EVTYPE="FROST/FREEZE"
length(unique(storm_harmful$EVTYPE))
## [1] 140

Clean event types related to HAIL.

table(storm_harmful[grep("HAIL",storm_harmful$EVTYPE),]$EVTYPE)
## 
## GUSTY WIND/HAIL            HAIL     MARINE HAIL      SMALL HAIL 
##               1           22679               2              11 
##  TSTM WIND/HAIL 
##             441
storm_harmful[grep("SMALL HAIL",storm_harmful$EVTYPE),]$EVTYPE="HAIL"
length(unique(storm_harmful$EVTYPE))
## [1] 139

Clean event types related to GUSTY WIND.

GUSTY WIND is not an effective event type. Related event types that contain HAIL are therefore re-labeled as HAIL, those that contain RAIN are re-labeled as HEAVY RAIN, and the rest are re-labeled as STRONG WIND.

table(storm_harmful[grep("GUSTY",storm_harmful$EVTYPE),]$EVTYPE)
## 
##          GUSTY WIND     GUSTY WIND/HAIL GUSTY WIND/HVY RAIN 
##                  13                   1                   1 
##     GUSTY WIND/RAIN         GUSTY WINDS 
##                   1                  30
storm_harmful[grep("GUSTY WIND/HAIL",storm_harmful$EVTYPE),]$EVTYPE="HAIL"
storm_harmful[grep("GUSTY WIND/HVY RAIN|GUSTY WIND/RAIN",storm_harmful$EVTYPE),]$EVTYPE="HEAVY RAIN"
storm_harmful[grep("GUSTY WIND",storm_harmful$EVTYPE),]$EVTYPE="STRONG WIND"
length(unique(storm_harmful$EVTYPE))
## [1] 134

Clean event types related to RAIN.

table(storm_harmful[grep("RAIN|DRIZZLE",storm_harmful$EVTYPE),]$EVTYPE)
## 
##     FREEZING DRIZZLE        FREEZING RAIN           HEAVY RAIN 
##                    5                    9                 1049 
## HEAVY RAIN/HIGH SURF  LIGHT FREEZING RAIN                 RAIN 
##                    1                   22                    3 
##            RAIN/SNOW  TORRENTIAL RAINFALL      UNSEASONAL RAIN 
##                    2                    1                    2
storm_harmful[grep("FREEZING RAIN|RAIN/SNOW|FREEZING DRIZZLE",storm_harmful$EVTYPE),]$EVTYPE="WINTER WEATHER"
storm_harmful[grep("RAIN",storm_harmful$EVTYPE),]$EVTYPE="HEAVY RAIN"
length(unique(storm_harmful$EVTYPE))
## [1] 126

Clean event types related to SNOW.

table(storm_harmful[grep("SNOW",storm_harmful$EVTYPE),]$EVTYPE)
## 
##      BLOWING SNOW    EXCESSIVE SNOW  FALLING SNOW/ICE        HEAVY SNOW 
##                 1                25                 2              1029 
## HEAVY SNOW SHOWER  LAKE-EFFECT SNOW  LAKE EFFECT SNOW  LATE SEASON SNOW 
##                 1               194                 4                 1 
##        LIGHT SNOW    LIGHT SNOWFALL              SNOW      SNOW AND ICE 
##               141                 1                16                 1 
##       SNOW SQUALL      SNOW SQUALLS 
##                 3                 5
storm_harmful[grep("EXCESSIVE SNOW|LATE SEASON SNOW|HEAVY SNOW SHOWER",storm_harmful$EVTYPE),]$EVTYPE="HEAVY SNOW"
storm_harmful[agrep("LAKE-EFFECT",storm_harmful$EVTYPE),]$EVTYPE="LAKE-EFFECT SNOW"
storm_harmful[grep("BLOWING SNOW|LIGHT SNOW|SNOW SQUALL|FALLING SNOW/ICE|SNOW AND ICE",storm_harmful$EVTYPE),]$EVTYPE="WINTER WEATHER"
storm_harmful$EVTYPE[which(storm_harmful$EVTYPE=="SNOW")]="HEAVY SNOW"
length(unique(storm_harmful$EVTYPE))
## [1] 114
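
The LAKE-EFFECT replacement above relies on agrep(), which performs approximate matching and so picks up both the hyphenated and the unhyphenated spelling. Because approximate matching can also catch strings that were not intended, an explicit character class is a stricter alternative. A minimal sketch on a toy vector:

```r
# Toy vector with both spellings from the table above.
snow <- c("LAKE-EFFECT SNOW", "LAKE EFFECT SNOW", "HEAVY SNOW", "LIGHT SNOW")

# "[- ]" accepts either a hyphen or a space between the two words.
is_lake_effect <- grepl("LAKE[- ]EFFECT", snow)
snow[is_lake_effect] <- "LAKE-EFFECT SNOW"

print(is_lake_effect)
print(unique(snow))
```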

Clean event types related to HIGH SURF.

table(storm_harmful[grep("SURF",storm_harmful$EVTYPE),]$EVTYPE)
## 
##       HAZARDOUS SURF           HEAVY SURF  HEAVY SURF AND WIND 
##                    1                   29                    1 
## HEAVY SURF/HIGH SURF            HIGH SURF   HIGH SURF ADVISORY 
##                   50                  128                    1 
##           ROUGH SURF 
##                    2
storm_harmful[grep("SURF",storm_harmful$EVTYPE),]$EVTYPE="HIGH SURF"
length(unique(storm_harmful$EVTYPE))
## [1] 108

Clean event types related to HIGH WIND.

table(storm_harmful[grep("HIGH WIND",storm_harmful$EVTYPE),]$EVTYPE)
## 
##        HIGH WIND  HIGH WIND (G40)       HIGH WINDS MARINE HIGH WIND 
##             5402                2                1               19
storm_harmful[grep("^HIGH WIND",storm_harmful$EVTYPE),]$EVTYPE="HIGH WIND"
length(unique(storm_harmful$EVTYPE))
## [1] 106
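
The ^ anchor in the replacement above is what keeps MARINE HIGH WIND, a separate effective event type, from being merged into HIGH WIND. A short check on the four names from the table:

```r
wind <- c("HIGH WIND", "HIGH WIND (G40)", "HIGH WINDS", "MARINE HIGH WIND")

anchored   <- grepl("^HIGH WIND", wind)  # only names that start with HIGH WIND
unanchored <- grepl("HIGH WIND", wind)   # also matches MARINE HIGH WIND

print(wind[anchored])
print(wind[unanchored & !anchored])      # names the anchor protects
```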

Clean event types related to HURRICANE(TYPHOON).

table(storm_harmful[grep("HURRICANE|TYPHOON",storm_harmful$EVTYPE),]$EVTYPE)
## 
##         HURRICANE HURRICANE EDOUARD HURRICANE/TYPHOON           TYPHOON 
##               126                 1                72                 9
storm_harmful[grep("HURRICANE|TYPHOON",storm_harmful$EVTYPE),]$EVTYPE="HURRICANE(TYPHOON)"
length(unique(storm_harmful$EVTYPE))
## [1] 103

Clean event types related to RIP CURRENT.

table(storm_harmful[grep("RIP|GRADIENT",storm_harmful$EVTYPE),]$EVTYPE)
## 
## GRADIENT WIND   RIP CURRENT  RIP CURRENTS 
##             6           364           239
storm_harmful[grep("RIP|GRADIENT",storm_harmful$EVTYPE),]$EVTYPE="RIP CURRENT"
length(unique(storm_harmful$EVTYPE))
## [1] 101

Clean event types related to STORM SURGE/TIDE.

table(storm_harmful[grep("SURGE|TIDE",storm_harmful$EVTYPE),]$EVTYPE)
## 
## ASTRONOMICAL HIGH TIDE  ASTRONOMICAL LOW TIDE            STORM SURGE 
##                      8                      2                    169 
##       STORM SURGE/TIDE 
##                     47
storm_harmful[grep("STORM SURGE",storm_harmful$EVTYPE),]$EVTYPE="STORM SURGE/TIDE"
storm_harmful[grep("ASTRONOMICAL HIGH",storm_harmful$EVTYPE),]$EVTYPE="COASTAL FLOOD"
length(unique(storm_harmful$EVTYPE))
## [1] 99

Clean event types related to STRONG WIND.

table(storm_harmful[grep("STRONG WIND",storm_harmful$EVTYPE),]$EVTYPE)
## 
## MARINE STRONG WIND        STRONG WIND       STRONG WINDS 
##                 46               3412                 45
storm_harmful[grep("^STRONG WIND",storm_harmful$EVTYPE),]$EVTYPE="STRONG WIND"
length(unique(storm_harmful$EVTYPE))
## [1] 98

Clean event types related to THUNDERSTORM WIND.

table(storm_harmful[grep("THUNDERSTORM|TSTM|BURST",storm_harmful$EVTYPE),]$EVTYPE)
## 
##                DOWNBURST           DRY MICROBURST MARINE THUNDERSTORM WIND 
##                        1                       75                       33 
##         MARINE TSTM WIND               MICROBURST            NON-TSTM WIND 
##                      109                        1                        1 
##            NON TSTM WIND             THUNDERSTORM        THUNDERSTORM WIND 
##                        1                        2                    43097 
##  THUNDERSTORM WIND (G40)                TSTM WIND         TSTM WIND  (G45) 
##                        1                    61778                        1 
##           TSTM WIND (41)          TSTM WIND (G35)          TSTM WIND (G40) 
##                        1                        1                        9 
##          TSTM WIND (G45)             TSTM WIND 40             TSTM WIND 45 
##                       37                        1                        1 
##  TSTM WIND AND LIGHTNING            TSTM WIND G45           TSTM WIND/HAIL 
##                        1                        1                      441 
##           WET MICROBURST 
##                        3
storm_harmful[grep("^THUNDERSTORM|^TSTM|BURST",storm_harmful$EVTYPE),]$EVTYPE="THUNDERSTORM WIND"
storm_harmful[grep("MARINE TSTM",storm_harmful$EVTYPE),]$EVTYPE="MARINE THUNDERSTORM WIND"
length(unique(storm_harmful$EVTYPE))
## [1] 80
storm_harmful[agrep("NON-TSTM WIND",storm_harmful$EVTYPE),]$EVTYPE="OTHER"
length(unique(storm_harmful$EVTYPE))
## [1] 78

Clean event types related to TORNADO.

table(storm_harmful[grep("TORNADO|LANDSPOUT",storm_harmful$EVTYPE),]$EVTYPE)
## 
## LANDSPOUT   TORNADO 
##         2     12366
storm_harmful[grep("TORNADO|LANDSPOUT",storm_harmful$EVTYPE),]$EVTYPE="TORNADO"
length(unique(storm_harmful$EVTYPE))
## [1] 77

Clean event types related to WILDFIRE.

table(storm_harmful[grep("FIRE",storm_harmful$EVTYPE),]$EVTYPE)
## 
##       BRUSH FIRE WILD/FOREST FIRE         WILDFIRE 
##                1              381              847
storm_harmful[grep("FIRE",storm_harmful$EVTYPE),]$EVTYPE="WILDFIRE"
length(unique(storm_harmful$EVTYPE))
## [1] 75

Clean event types related to WINTER WEATHER.

table(storm_harmful[grep("WINTER WEATHER|ICE|ICY|GLAZE|MIXED PRECIP|WINTRY|FREEZING SPRAY",storm_harmful$EVTYPE),]$EVTYPE)
## 
##           BLACK ICE      FREEZING SPRAY               GLAZE 
##                   1                   1                  16 
##         ICE ON ROAD           ICE ROADS           ICE STORM 
##                   1                   1                 631 
##           ICY ROADS        MIXED PRECIP MIXED PRECIPITATION 
##                  18                   6                  18 
##      WINTER WEATHER  WINTER WEATHER MIX  WINTER WEATHER/MIX 
##                 597                   2                 139 
##          WINTRY MIX 
##                   4
storm_harmful[grep("WINTER WEATHER|BLACK ICE|GLAZE|ROAD|MIXED PRECIP|WINTRY|FREEZING SPRAY",storm_harmful$EVTYPE),]$EVTYPE="WINTER WEATHER"
length(unique(storm_harmful$EVTYPE))
## [1] 64

Clean event types related to WIND.

table(storm_harmful[grep("WIND",storm_harmful$EVTYPE),]$EVTYPE)
## 
##          COLD/WIND CHILL  EXTREME COLD/WIND CHILL                HIGH WIND 
##                      120                      301                     5405 
##         MARINE HIGH WIND       MARINE STRONG WIND MARINE THUNDERSTORM WIND 
##                       19                       46                      142 
##   NON-SEVERE WIND DAMAGE              STRONG WIND        THUNDERSTORM WIND 
##                        1                     3457                   105452 
##                WHIRLWIND                     WIND            WIND AND WAVE 
##                        3                       67                        1 
##              WIND DAMAGE                    WINDS 
##                        1                        1
storm_harmful[grep("^WIND|WHIRLWIND|WIND DAMAGE",storm_harmful$EVTYPE),]$EVTYPE="OTHER"
length(unique(storm_harmful$EVTYPE))
## [1] 58

Clean the remaining event types.

storm_harmful[grep("URBAN",storm_harmful$EVTYPE),]$EVTYPE="HEAVY RAIN"
storm_harmful[grep("DROWNING|MARINE ACCIDENT|BEACH EROSION",storm_harmful$EVTYPE),]$EVTYPE="OTHER"
storm_harmful[grep("SWELLS|ROGUE|SEAS",storm_harmful$EVTYPE),]$EVTYPE="HIGH SURF"
length(unique(storm_harmful$EVTYPE))
## [1] 49

After Cleaning

The event types remaining after cleaning are listed below. Two event types that are not in the Storm Data Event Table, OTHER and FOG, were added, and the event type SLEET does not appear in the storm_harmful dataset. The total number of event types after cleaning is therefore 48 + 2 - 1 = 49.

sort(unique(storm_harmful$EVTYPE))
##  [1] "ASTRONOMICAL LOW TIDE"    "AVALANCHE"               
##  [3] "BLIZZARD"                 "COASTAL FLOOD"           
##  [5] "COLD/WIND CHILL"          "DEBRIS FLOW"             
##  [7] "DENSE FOG"                "DENSE SMOKE"             
##  [9] "DROUGHT"                  "DUST DEVIL"              
## [11] "DUST STORM"               "EXCESSIVE HEAT"          
## [13] "EXTREME COLD/WIND CHILL"  "FLASH FLOOD"             
## [15] "FLOOD"                    "FOG"                     
## [17] "FREEZING FOG"             "FROST/FREEZE"            
## [19] "FUNNEL CLOUD"             "HAIL"                    
## [21] "HEAT"                     "HEAVY RAIN"              
## [23] "HEAVY SNOW"               "HIGH SURF"               
## [25] "HIGH WIND"                "HURRICANE(TYPHOON)"      
## [27] "ICE STORM"                "LAKE-EFFECT SNOW"        
## [29] "LAKESHORE FLOOD"          "LIGHTNING"               
## [31] "MARINE HAIL"              "MARINE HIGH WIND"        
## [33] "MARINE STRONG WIND"       "MARINE THUNDERSTORM WIND"
## [35] "OTHER"                    "RIP CURRENT"             
## [37] "SEICHE"                   "STORM SURGE/TIDE"        
## [39] "STRONG WIND"              "THUNDERSTORM WIND"       
## [41] "TORNADO"                  "TROPICAL DEPRESSION"     
## [43] "TROPICAL STORM"           "TSUNAMI"                 
## [45] "VOLCANIC ASH"             "WATERSPOUT"              
## [47] "WILDFIRE"                 "WINTER STORM"            
## [49] "WINTER WEATHER"
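
The 48 + 2 - 1 = 49 count can be checked mechanically with setdiff(). In the sketch below, official is the Storm Data Event Table typed in by hand, written with the HURRICANE(TYPHOON) spelling used on this page (an assumption made so that the names compare equal), and cleaned is a stand-in for sort(unique(storm_harmful$EVTYPE)).

```r
# The 48 event types of the Storm Data Event Table, typed by hand.
official <- c(
  "ASTRONOMICAL LOW TIDE", "AVALANCHE", "BLIZZARD", "COASTAL FLOOD",
  "COLD/WIND CHILL", "DEBRIS FLOW", "DENSE FOG", "DENSE SMOKE",
  "DROUGHT", "DUST DEVIL", "DUST STORM", "EXCESSIVE HEAT",
  "EXTREME COLD/WIND CHILL", "FLASH FLOOD", "FLOOD", "FREEZING FOG",
  "FROST/FREEZE", "FUNNEL CLOUD", "HAIL", "HEAT", "HEAVY RAIN",
  "HEAVY SNOW", "HIGH SURF", "HIGH WIND", "HURRICANE(TYPHOON)",
  "ICE STORM", "LAKE-EFFECT SNOW", "LAKESHORE FLOOD", "LIGHTNING",
  "MARINE HAIL", "MARINE HIGH WIND", "MARINE STRONG WIND",
  "MARINE THUNDERSTORM WIND", "RIP CURRENT", "SEICHE", "SLEET",
  "STORM SURGE/TIDE", "STRONG WIND", "THUNDERSTORM WIND", "TORNADO",
  "TROPICAL DEPRESSION", "TROPICAL STORM", "TSUNAMI", "VOLCANIC ASH",
  "WATERSPOUT", "WILDFIRE", "WINTER STORM", "WINTER WEATHER"
)

# Stand-in for sort(unique(storm_harmful$EVTYPE)) as printed above.
cleaned <- sort(c(setdiff(official, "SLEET"), "FOG", "OTHER"))

extra   <- setdiff(cleaned, official)   # labels not in the official table
missing <- setdiff(official, cleaned)   # official types absent from the data

print(extra)
print(missing)
print(length(cleaned))
```

Running the same two setdiff() calls against the real cleaned column is a quick way to confirm that no unofficial label other than FOG and OTHER survived the cleaning.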