knitr::opts_chunk$set(echo = TRUE, results = "markup", cache = TRUE)

Title: Analysis of the U.S. National Oceanic and Atmospheric Administration's (NOAA) Storm Database - Pay More Attention to Convection- and Flood-Related Events

Synopsis

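In this report, we explore the NOAA Storm Database to answer two basic questions about severe weather events: which types of events are most harmful to population health, and which have the greatest economic consequences.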

Our analysis shows that convection-related events are the most harmful with respect to population health: they caused 109,410 injuries and 7,872 fatalities in the United States from 1950 to 1993. Flood-related events had the greatest economic consequences over the same period, causing 167.1 billion US dollars in property damage and 12.2 billion in crop damage (total: 179.3 billion).

Data Processing

Reading the data

The NOAA database is distributed as a comma-separated-value (CSV) file compressed with the bzip2 algorithm. The read.csv command can read the compressed file directly, so no separate decompression step is needed.

data <- read.csv("repdata-data-StormData.csv.bz2", sep = ",", header = TRUE, 
    stringsAsFactors = FALSE)
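
As a minimal sketch of why no separate decompression step is needed, the block below writes a tiny, made-up CSV to a temporary bzip2 file and reads it back with read.csv. The file contents and the names tmp and mini are illustrative assumptions, not part of the NOAA data.

```r
# Hypothetical miniature of the storm file, written to a temporary bzip2 archive.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
writeLines(c("EVTYPE,FATALITIES,INJURIES",
             "TORNADO,0,15",
             "HAIL,0,0"), con)
close(con)

# read.csv() opens the compressed file by path, just as with the real file.
mini <- read.csv(tmp, sep = ",", header = TRUE, stringsAsFactors = FALSE)
dim(mini)
```

The same call shape is used above for the full database; only the file name differs.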

Once loaded, the dataset has 902,297 rows and 37 columns.

dim(data)
## [1] 902297     37

First, let's look at the structure of the data:

str(data)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...

Processing EVTYPE column

To answer the question of which types of events are most harmful with respect to population health, we are interested in the EVTYPE, FATALITIES and INJURIES columns.

We check the total number of unique EVTYPE values in the database and look at the first few.

length(unique(data$EVTYPE))
## [1] 985
head(unique(data$EVTYPE))
## [1] "TORNADO"               "TSTM WIND"             "HAIL"                 
## [4] "FREEZING RAIN"         "SNOW"                  "ICE STORM/FLASH FLOOD"

There are 985 distinct terms in the EVTYPE column. To simplify the analysis, we first convert all terms to lowercase and trim leading and trailing whitespace.

data$EVTYPE = tolower(data$EVTYPE)
data$EVTYPE = gsub("^\\s+|\\s+$", "", data$EVTYPE)  # remove leading and trailing whitespace
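
To illustrate what this normalization buys, here is a small sketch on made-up values (the vector raw is an assumption for illustration, not taken from the database). It shows spelling variants that differ only in case or surrounding spaces collapsing into one term, and checks that base R's trimws() gives the same result as the gsub call on this input.

```r
# Made-up EVTYPE-like values; not taken from the database.
raw <- c("TORNADO", " tornado", "Tornado ", "HAIL", "hail")

# Every raw string is distinct before cleaning.
length(unique(raw))

# Same two steps as in the report: lowercase, then strip outer whitespace.
clean <- gsub("^\\s+|\\s+$", "", tolower(raw))
unique(clean)

# trimws() is a base R helper that should agree with the gsub() call here.
identical(clean, trimws(tolower(raw)))
```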

Because the EVTYPE column contains so many different terms, it is better to group similar events together. Using the 2009 Annual Summaries from the National Climatic Data Center as a reference, we propose the following categories:

Category              Related terms
Convection            lightning, lighting, ligntning, tornado, torndao, funnel, thunderstorm wind, hail
Extreme Temperatures  cold, low, cool, heat, high temperature, record high, warm, hyperthermia, hot
Flood                 flood
Marine                coastal storm, tsunami, rip current
Tropical Cyclones     tropical storm, hurricane
Winter                winter storm, snow, blizzard, freez, ice, icy, avalanche, avalance
Other                 drought, dust storm, duststorm, dust devil, rain, shower, fog, vog, high wind, wind, wnd, waterspout, water spout, wayterspout, fire weather, dry, mud slide, mudslide, volcanic ash

convectionTerm = c("lightning", "lighting", "ligntning", "tornado", "torndao", 
    "funnel", "thunderstorm", "wind", "hail")

extremeTempTerm = c("cold", "low", "cool", "heat", "high temperature", "record", 
    "high", "warm", "hyperthermia", "hot")

floodTerm = c("flood")

marineTerm = c("coastal", "storm", "tsunami", "rip current")

tropicalCyloneTerm = c("tropical storm", "hurricane")

winterTerm = c("winter storm", "snow", "blizzard", "freez", "ice", "icy", "avalanche", 
    "avalance")

otherTerm = c("drought", "dust storm", "duststorm", "dust devil", "rain", "shower", 
    "fog", "vog", "high wind", "wind", "wnd", "waterspout", "water spout", "wayterspout", 
    "fire weather", "dry", "mud slide", "mudslide", "volcanic ash")

We define one function that builds the regular-expression search pattern and another that shows the unique EVTYPE values related to a particular category.

# formSearchPattern: build a regular-expression alternation from search terms.
#   terms - a character vector, e.g. c('Peter', 'Paul', 'Mary')
# Returns a single pattern string, e.g. '(Peter|Paul|Mary)'.
formSearchPattern <- function(terms) {
    paste0("(", paste(terms, collapse = "|"), ")")
}

# uniqueTerm: list the unique values of a column that match any search term.
#   dfCol - the character vector to search (e.g. data$EVTYPE)
#   terms - a character vector of search terms
# Returns the unique elements of dfCol that contain at least one of the terms.
uniqueTerm <- function(dfCol, terms) {
    pattern = formSearchPattern(terms)
    matched = grepl(pattern, dfCol)
    unique(dfCol[matched])
}
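
A short usage sketch may help here. The two helpers are restated compactly so the block runs on its own, and the events vector is a made-up example rather than database content. Note that grepl matches substrings, so a short term such as "wind" also picks up "wind chill".

```r
# Compact restatements of the two helpers defined above.
formSearchPattern <- function(terms) {
    paste0("(", paste(terms, collapse = "|"), ")")
}
uniqueTerm <- function(dfCol, terms) {
    unique(dfCol[grepl(formSearchPattern(terms), dfCol)])
}

# Made-up, already lowercased event labels.
events <- c("tornado", "hail", "wind chill", "tornado", "flash flood")

formSearchPattern(c("tornado", "hail"))
# [1] "(tornado|hail)"

uniqueTerm(events, c("tornado", "wind"))
# [1] "tornado"    "wind chill"
```

Because matching is by substring, the listings that follow include every EVTYPE value that merely contains one of the category's terms.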

EVTYPE values related to convection:

uniqueTerm(data$EVTYPE, convectionTerm)
##   [1] "tornado"                        "tstm wind"                     
##   [3] "hail"                           "hurricane opal/high winds"     
##   [5] "thunderstorm winds"             "lightning"                     
##   [7] "thunderstorm wind"              "thunderstorm wins"             
##   [9] "high winds"                     "funnel cloud"                  
##  [11] "tornado f0"                     "thunderstorm winds lightning"  
##  [13] "thunderstorm winds/hail"        "wind"                          
##  [15] "lighting"                       "lightning and heavy rain"      
##  [17] "funnel"                         "thunderstorm winds hail"       
##  [19] "heavy rain/lightning"           "flash flooding/thunderstorm wi"
##  [21] "wall cloud/funnel cloud"        "thunderstorm"                  
##  [23] "hail 1.75)"                     "lightning/heavy rain"          
##  [25] "high wind"                      "wind chill"                    
##  [27] "high wind/blizzard"             "high wind and high tides"      
##  [29] "high wind/blizzard/freezing ra" "high wind and heavy snow"      
##  [31] "record cold and high wind"      "high winds heavy rains"        
##  [33] "high wind/ blizzard"            "blizzard/high wind"            
##  [35] "high wind/low wind chill"       "high winds and wind chill"     
##  [37] "heavy snow/high winds/freezing" "wind chill/high wind"          
##  [39] "high wind/wind chill/blizzard"  "high wind/wind chill"          
##  [41] "high wind/heavy snow"           "high wind/seas"                
##  [43] "high winds/heavy rain"          "heavy snow/wind"               
##  [45] "wind damage"                    "hail storm"                    
##  [47] "funnel clouds"                  "thunderstorm winds/funnel clou"
##  [49] "winter storm/high wind"         "winter storm/high winds"       
##  [51] "gusty winds"                    "strong winds"                  
##  [53] "snow and wind"                  "high winds dust storm"         
##  [55] "winter storm high winds"        "severe thunderstorm"           
##  [57] "severe thunderstorms"           "severe thunderstorm winds"     
##  [59] "thunderstorms winds"            "flood/rain/winds"              
##  [61] "winds"                          "thunderstorms"                 
##  [63] "flash flood winds"              "strong wind"                   
##  [65] "high wind damage"               "flood/rain/wind"               
##  [67] "downburst winds"                "dry microburst winds"          
##  [69] "dry mircoburst winds"           "microburst winds"              
##  [71] "high winds 57"                  "high winds 66"                 
##  [73] "high winds 76"                  "high winds 63"                 
##  [75] "high winds 67"                  "heavy snow/high winds"         
##  [77] "high winds 82"                  "high winds 80"                 
##  [79] "high winds 58"                  "lightning thunderstorm windss" 
##  [81] "hail 75"                        "high winds 73"                 
##  [83] "high winds 55"                  "thunderstorm winds 60"         
##  [85] "thunderstorm windss"            "tornados"                      
##  [87] "high winds/flooding"            "waterspout/tornado"            
##  [89] "waterspout tornado"             "waterspout-tornado"            
##  [91] "tornadoes, tstm wind, hail"     "lightning thunderstorm winds"  
##  [93] "lightning injury"               "lightning and thunderstorm win"
##  [95] "thunderstorm winds53"           "thunderstorm winds 13"         
##  [97] "small hail"                     "heavy snow/high wind"          
##  [99] "ligntning"                      "high winds/"                   
## [101] "extreme wind chills"            "high  winds"                   
## [103] "hail 80"                        "extreme wind chill"            
## [105] "gradient winds"                 "thunderstorm winds urban flood"
## [107] "thunderstorm winds small strea" "cold air funnel"               
## [109] "cold air funnels"               "blowing snow- extreme wind chi"
## [111] "snow- high wind- wind chill"    "cold air tornado"              
## [113] "thunderstorm winds 2"           "funnel cloud/hail"             
## [115] "tstm wind 51"                   "tstm wind 50"                  
## [117] "tstm wind 52"                   "tstm wind 55"                  
## [119] "thunderstorm winds 61"          "hail 0.75"                     
## [121] "thunderstorm damage"            "thundertorm winds"             
## [123] "hail 1.00"                      "hail/winds"                    
## [125] "wind storm"                     "hail/wind"                     
## [127] "hail 1.75"                      "thunderstormw 50"              
## [129] "wind/hail"                      "thunderstorms wind"            
## [131] "thunderstorm  winds"            "tunderstorm wind"              
## [133] "thundertsorm wind"              "thunderstorm winds/ hail"      
## [135] "thunderstorm wind/lightning"    "thundestorm winds"             
## [137] "waterspout/ tornado"            "lightning."                    
## [139] "high wind 63"                   "high winds/coastal flood"      
## [141] "thunderstorm wind g50"          "lightning fire"                
## [143] "thunderstorm winds/heavy rain"  "thunderstrom winds"            
## [145] "thunderstorm winds      le cen" "hail 225"                      
## [147] "blizzard and extreme wind chil" "low wind chill"                
## [149] "blowing snow & extreme wind ch" "tornado f3"                    
## [151] "funnel cloud."                  "torndao"                       
## [153] "hail 0.88"                      "tornado f1"                    
## [155] "thunderstorm winds g"           "deep hail"                     
## [157] "dust storm/high winds"          "thunderstorm wind g60"         
## [159] "thunderstorm winds."            "hail 88"                       
## [161] "hail 175"                       "hail 100"                      
## [163] "hail 150"                       "hail 075"                      
## [165] "thunderstorm wind g55"          "hail 125"                      
## [167] "thunderstorm winds g60"         "hail 200"                      
## [169] "thunderstorm winds funnel clou" "thunderstorm winds 62"         
## [171] "heavy snow and high winds"      "heavy snow/high winds & flood" 
## [173] "hail flooding"                  "thunderstorm winds/flash flood"
## [175] "high wind 70"                   "thunderstorm winds 53"         
## [177] "tornado/waterspout"             "rain and wind"                 
## [179] "thunderstorm wind 59"           "thunderstorm wind 52"          
## [181] "thunderstorm wind 69"           "hail damage"                   
## [183] "lightning damage"               "lightning and winds"           
## [185] "tstm wind g58"                  "thunderstormw winds"           
## [187] "thunderstorm wind 60 mph"       "thunderstorm wind 65mph"       
## [189] "thunderstorm wind/ trees"       "thunderstorm wind/awning"      
## [191] "thunderstorm wind 98 mph"       "thunderstorm wind trees"       
## [193] "tornado f2"                     "thunderstorm wind 59 mph"      
## [195] "thunderstorm winds 63 mph"      "thunderstorm wind/ tree"       
## [197] "thunderstorm damage to"         "thunderstorm wind 65 mph"      
## [199] "thunderstorm wind."             "thunderstorm wind 59 mph."     
## [201] "thunderstorm hail"              "hail 088"                      
## [203] "thunderstorm windshail"         "lightning  wauseon"            
## [205] "thuderstorm winds"              "storm force winds"             
## [207] "thunderstorm winds and"         "hail/icy roads"                
## [209] "heavy rain; urban flood winds;" "tstm wind damage"              
## [211] "rain/wind"                      "thunderstorm winds 50"         
## [213] "thunderstorm wind g52"          "thunderstorm winds 52"         
## [215] "thunderstorm wind g51"          "thunderstorm wind g61"         
## [217] "thunderestorm winds"            "thunderstorm winds/flooding"   
## [219] "thundeerstorm winds"            "thunderstorm w inds"           
## [221] "thunderstorm wind 50"           "thunerstorm winds"             
## [223] "high winds/cold"                "cold/winds"                    
## [225] "thunderstorm wind 56"           "hail aloft"                    
## [227] "ice/strong winds"               "extreme wind chill/blowing sno"
## [229] "snow/high winds"                "high winds/snow"               
## [231] "heavy snow and strong winds"    "blowing snow/extreme wind chil"
## [233] "tornadoes"                      "thunderstorm wind/hail"        
## [235] "hail 275"                       "hail 450"                      
## [237] "thunderstormw"                  "hailstorm"                     
## [239] "tstm winds"                     "hailstorms"                    
## [241] "funnels"                        "tstm wind 65)"                 
## [243] "thunderstorm winds/ flood"      "high wind and seas"            
## [245] "thunderstormwinds"              "thunderstorm winds heavy rain" 
## [247] "thunderstrom wind"              "high wind 48"                  
## [249] "waterspout funnel cloud"        "extreme windchill"             
## [251] "tstm wind/hail"                 "heavy rain and wind"           
## [253] "heavy rain/wind"                "whirlwind"                     
## [255] "gusty wind"                     "gradient wind"                 
## [257] "hail(0.75)"                     "gusty wind/rain"               
## [259] "gusty wind/hvy rain"            "tstm wind (g45)"               
## [261] "tstm wind 40"                   "tstm wind 45"                  
## [263] "tstm wind (41)"                 "tstm wind (g40)"               
## [265] "strong wind gust"               "flood/strong wind"             
## [267] "tstm wind and lightning"        "heavy surf and wind"           
## [269] "tstm wind  (g45)"               "high wind (g40)"               
## [271] "tstm wind (g35)"                "wake low wind"                 
## [273] "cold wind chill temperatures"   "bitter wind chill"             
## [275] "bitter wind chill temperatures" "wind advisory"                 
## [277] "gusty wind/hail"                "extreme windchill temperatures"
## [279] "late season hail"               "wind and wave"                 
## [281] "tstm wind g45"                  "non severe hail"               
## [283] "non-severe wind damage"         "thunderstorm wind (g40)"       
## [285] "wind gusts"                     "gusty lake wind"               
## [287] "non-tstm wind"                  "non tstm wind"                 
## [289] "gusty thunderstorm winds"       "marine tstm wind"              
## [291] "tornado debris"                 "extreme cold/wind chill"       
## [293] "gusty thunderstorm wind"        "marine hail"                   
## [295] "cold/wind chill"                "marine high wind"              
## [297] "marine thunderstorm wind"       "marine strong wind"

EVTYPE values related to extreme temperatures:

uniqueTerm(data$EVTYPE, extremeTempTerm)
##   [1] "hurricane opal/high winds"      "record cold"                   
##   [3] "high winds"                     "heat"                          
##   [5] "cold"                           "extreme cold"                  
##   [7] "high wind"                      "high wind/blizzard"            
##   [9] "high wind and high tides"       "high wind/blizzard/freezing ra"
##  [11] "high tides"                     "high wind and heavy snow"      
##  [13] "record cold and high wind"      "record high temperature"       
##  [15] "record high"                    "high winds heavy rains"        
##  [17] "high wind/ blizzard"            "blizzard/high wind"            
##  [19] "high wind/low wind chill"       "heavy snow/high"               
##  [21] "record low"                     "high winds and wind chill"     
##  [23] "heavy snow/high winds/freezing" "low temperature record"        
##  [25] "wind chill/high wind"           "high wind/wind chill/blizzard" 
##  [27] "high wind/wind chill"           "high wind/heavy snow"          
##  [29] "high temperature record"        "record high temperatures"      
##  [31] "high wind/seas"                 "high winds/heavy rain"         
##  [33] "high seas"                      "record rainfall"               
##  [35] "record snowfall"                "record warmth"                 
##  [37] "extreme heat"                   "excessive heat"                
##  [39] "winter storm/high wind"         "winter storm/high winds"       
##  [41] "high surf"                      "blowing dust"                  
##  [43] "high"                           "high winds dust storm"         
##  [45] "winter storm high winds"        "high wind damage"              
##  [47] "high winds 57"                  "high winds 66"                 
##  [49] "high winds 76"                  "high winds 63"                 
##  [51] "high winds 67"                  "heavy snow/high winds"         
##  [53] "blowing snow"                   "high winds 82"                 
##  [55] "high winds 80"                  "high winds 58"                 
##  [57] "high winds 73"                  "high winds 55"                 
##  [59] "record heat"                    "heat wave"                     
##  [61] "unseasonably cold"              "extreme/record cold"           
##  [63] "unseasonably warm"              "high winds/flooding"           
##  [65] "heavy snow/high wind"           "high winds/"                   
##  [67] "cool and wet"                   "severe cold"                   
##  [69] "cold wave"                      "high  winds"                   
##  [71] "cold and wet conditions"        "heavy snow/blowing snow"       
##  [73] "cold air funnel"                "cold air funnels"              
##  [75] "blowing snow- extreme wind chi" "snow- high wind- wind chill"   
##  [77] "cold air tornado"               "prolong cold"                  
##  [79] "drought/excessive heat"         "warm dry conditions"           
##  [81] "high wind 63"                   "high winds/coastal flood"      
##  [83] "high waves"                     "heavy snow andblowing snow"    
##  [85] "low wind chill"                 "blowing snow & extreme wind ch"
##  [87] "dust storm/high winds"          "record heat wave"              
##  [89] "heavy snow and high winds"      "heavy snow/high winds & flood" 
##  [91] "high wind 70"                   "below normal precipitation"    
##  [93] "record/excessive heat"          "heat waves"                    
##  [95] "record temperatures"            "fog and cold temperatures"     
##  [97] "record snow"                    "snow/cold"                     
##  [99] "record cold/frost"              "high water"                    
## [101] "heat wave drought"              "record snow/cold"              
## [103] "unseasonably warm and dry"      "record/excessive rainfall"     
## [105] "low temperature"                "highway flooding"              
## [107] "high winds/cold"                "cold/winds"                    
## [109] "snow/ bitter cold"              "dry hot weather"               
## [111] "cold weather"                   "extreme wind chill/blowing sno"
## [113] "snow/high winds"                "high winds/snow"               
## [115] "blowing snow/extreme wind chil" "snow/blowing snow"             
## [117] "heat/drought"                   "heat drought"                  
## [119] "near record snow"               "high wind and seas"            
## [121] "snow and cold"                  "hot pattern"                   
## [123] "prolong cold/snow"              "snow\\cold"                    
## [125] "snowfall record"                "hot/dry pattern"               
## [127] "high wind 48"                   "record dry month"              
## [129] "temperature record"             "hot and dry"                   
## [131] "heavy rain/high surf"           "unseasonable cold"             
## [133] "excessive cold"                 "record warm temps."            
## [135] "extended cold"                  "record may snow"               
## [137] "record winter snow"             "record temperature"            
## [139] "heatburst"                      "cold temperature"              
## [141] "cold and snow"                  "blow-out tides"                
## [143] "unseasonably cool"              "cold and frost"                
## [145] "blow-out tide"                  "high swells"                   
## [147] "high  swells"                   "hot spell"                     
## [149] "unseasonably hot"               "high wind (g40)"               
## [151] "abnormal warmth"                "unusual warmth"                
## [153] "wake low wind"                  "cold temperatures"             
## [155] "cold wind chill temperatures"   "unusual/record warmth"         
## [157] "unseasonably warm year"         "hyperthermia/exposure"         
## [159] "record cool"                    "record warm"                   
## [161] "hot weather"                    "cool spell"                    
## [163] "record dryness"                 "record  cold"                  
## [165] "record precipitation"           "unseasonably warm/wet"         
## [167] "unseasonably cool & wet"        "unusually warm"                
## [169] "unusually cold"                 "warm weather"                  
## [171] "unseasonably warm & wet"        "unseasonal low temp"           
## [173] "high surf advisory"             "record low rainfall"           
## [175] "prolong warmth"                 "excessive heat/drought"        
## [177] "astronomical high tide"         "very warm"                     
## [179] "extreme cold/wind chill"        "high surf advisories"          
## [181] "heavy surf/high surf"           "cold/wind chill"               
## [183] "marine high wind"               "astronomical low tide"

EVTYPE values related to floods:

uniqueTerm(data$EVTYPE, floodTerm)
##  [1] "ice storm/flash flood"          "flash flood"                   
##  [3] "flash flooding"                 "flooding"                      
##  [5] "flood"                          "flash flooding/thunderstorm wi"
##  [7] "breakup flooding"               "river flood"                   
##  [9] "coastal flood"                  "flood watch/"                  
## [11] "flash floods"                   "flooding/heavy rain"           
## [13] "heavy surf coastal flooding"    "urban flooding"                
## [15] "urban/small flooding"           "local flood"                   
## [17] "flood/flash flood"              "flood/rain/winds"              
## [19] "flash flood winds"              "urban/small stream flooding"   
## [21] "stream flooding"                "flash flood/"                  
## [23] "flood/rain/wind"                "small stream urban flood"      
## [25] "urban flood"                    "heavy rain/flooding"           
## [27] "coastal flooding"               "high winds/flooding"           
## [29] "urban/small stream flood"       "minor flooding"                
## [31] "urban/small stream  flood"      "urban and small stream flood"  
## [33] "small stream flooding"          "floods"                        
## [35] "small stream and urban floodin" "small stream/urban flood"      
## [37] "small stream and urban flood"   "rural flood"                   
## [39] "thunderstorm winds urban flood" "major flood"                   
## [41] "ice jam flooding"               "street flood"                  
## [43] "small stream flood"             "lake flood"                    
## [45] "urban and small stream floodin" "river and stream flood"        
## [47] "minor flood"                    "high winds/coastal flood"      
## [49] "river flooding"                 "flood/river flood"             
## [51] "mud slides urban flooding"      "heavy snow/high winds & flood" 
## [53] "hail flooding"                  "thunderstorm winds/flash flood"
## [55] "heavy rain and flood"           "local flash flood"             
## [57] "flood/flash flooding"           "coastal/tidal flood"           
## [59] "flash flood/flood"              "flash flood from ice jams"     
## [61] "flash flood - heavy rain"       "flash flood/ street"           
## [63] "flash flood/heavy rain"         "heavy rain; urban flood winds;"
## [65] "flood flash"                    "flood flood/flash"             
## [67] "tidal flood"                    "flood/flash"                   
## [69] "heavy rains/flooding"           "thunderstorm winds/flooding"   
## [71] "highway flooding"               "flash flood/ flood"            
## [73] "heavy rain/mudslides/flood"     "beach erosion/coastal flood"   
## [75] "snowmelt flooding"              "flash flooding/flood"          
## [77] "beach flood"                    "thunderstorm winds/ flood"     
## [79] "flood & heavy rain"             "flood/flashflood"              
## [81] "urban small stream flood"       "urban flood landslide"         
## [83] "urban floods"                   "heavy rain/urban flood"        
## [85] "flash flood/landslide"          "landslide/urban flood"         
## [87] "flash flood landslides"         "ice jam flood (minor"          
## [89] "coastalflood"                   "erosion/cstl flood"            
## [91] "tidal flooding"                 "street flooding"               
## [93] "flood/strong wind"              "coastal flooding/erosion"      
## [95] "urban/street flooding"          "coastal  flooding/erosion"     
## [97] "flood/flash/flood"              "cstl flooding/erosion"         
## [99] "lakeshore flood"

EVTYPE values related to marine events:

uniqueTerm(data$EVTYPE, marineTerm)
##   [1] "ice storm/flash flood"          "winter storm"                  
##   [3] "thunderstorm winds"             "thunderstorm wind"             
##   [5] "rip current"                    "thunderstorm wins"             
##   [7] "thunderstorm winds lightning"   "thunderstorm winds/hail"       
##   [9] "thunderstorm winds hail"        "flash flooding/thunderstorm wi"
##  [11] "thunderstorm"                   "coastal flood"                 
##  [13] "ice storm"                      "dust storm"                    
##  [15] "hail storm"                     "thunderstorm winds/funnel clou"
##  [17] "winter storm/high wind"         "winter storm/high winds"       
##  [19] "heavy surf coastal flooding"    "high winds dust storm"         
##  [21] "winter storm high winds"        "winter storms"                 
##  [23] "rainstorm"                      "severe thunderstorm"           
##  [25] "severe thunderstorms"           "severe thunderstorm winds"     
##  [27] "thunderstorms winds"            "thunderstorms"                 
##  [29] "lightning thunderstorm windss"  "thunderstorm winds 60"         
##  [31] "thunderstorm windss"            "coastal flooding"              
##  [33] "rip currents heavy surf"        "storm surge"                   
##  [35] "tropical storm alberto"         "tropical storm"                
##  [37] "tropical storm gordon"          "tropical storm jerry"          
##  [39] "lightning thunderstorm winds"   "lightning and thunderstorm win"
##  [41] "thunderstorm winds53"           "thunderstorm winds 13"         
##  [43] "sleet/ice storm"                "thunderstorm winds urban flood"
##  [45] "thunderstorm winds small strea" "thunderstorm winds 2"          
##  [47] "thunderstorm winds 61"          "thunderstorm damage"           
##  [49] "wind storm"                     "snowstorm"                     
##  [51] "thunderstormw 50"               "snow and ice storm"            
##  [53] "thunderstorms wind"             "thunderstorm  winds"           
##  [55] "tunderstorm wind"               "tropical storm dean"           
##  [57] "thunderstorm winds/ hail"       "thunderstorm wind/lightning"   
##  [59] "thundestorm winds"              "heavy snow/ice storm"          
##  [61] "coastal surge"                  "heavy snow and ice storm"      
##  [63] "high winds/coastal flood"       "thunderstorm wind g50"         
##  [65] "thunderstorm winds/heavy rain"  "thunderstorm winds      le cen"
##  [67] "ice storm and snow"             "thunderstorm winds g"          
##  [69] "glaze/ice storm"                "heavy snow/winter storm"       
##  [71] "blizzard/winter storm"          "dust storm/high winds"         
##  [73] "thunderstorm wind g60"          "thunderstorm winds."           
##  [75] "thunderstorm wind g55"          "thunderstorm winds g60"        
##  [77] "thunderstorm winds funnel clou" "thunderstorm winds 62"         
##  [79] "thunderstorm winds/flash flood" "thunderstorm winds 53"         
##  [81] "thunderstorm wind 59"           "thunderstorm wind 52"          
##  [83] "coastal/tidal flood"            "snow/ice storm"                
##  [85] "rip currents/heavy surf"        "thunderstorm wind 69"          
##  [87] "thunderstormw winds"            "thunderstorm wind 60 mph"      
##  [89] "thunderstorm wind 65mph"        "thunderstorm wind/ trees"      
##  [91] "thunderstorm wind/awning"       "thunderstorm wind 98 mph"      
##  [93] "thunderstorm wind trees"        "rip currents"                  
##  [95] "thunderstorm wind 59 mph"       "thunderstorm winds 63 mph"     
##  [97] "thunderstorm wind/ tree"        "thunderstorm damage to"        
##  [99] "thunderstorm wind 65 mph"       "thunderstorm wind."            
## [101] "thunderstorm wind 59 mph."      "thunderstorm hail"             
## [103] "thunderstorm windshail"         "thuderstorm winds"             
## [105] "storm force winds"              "thunderstorm winds and"        
## [107] "thunderstorm winds 50"          "thunderstorm wind g52"         
## [109] "thunderstorm winds 52"          "thunderstorm wind g51"         
## [111] "thunderstorm wind g61"          "thunderestorm winds"           
## [113] "thunderstorm winds/flooding"    "thundeerstorm winds"           
## [115] "thunderstorm w inds"            "thunderstorm wind 50"          
## [117] "thunerstorm winds"              "beach erosion/coastal flood"   
## [119] "thunderstorm wind 56"           "thunderstorm wind/hail"        
## [121] "thunderstormw"                  "hailstorm"                     
## [123] "hailstorms"                     "thunderstorm winds/ flood"     
## [125] "thunderstormwinds"              "thunderstorm winds heavy rain" 
## [127] "duststorm"                      "coastal storm"                 
## [129] "coastalflood"                   "metro storm, may 26"           
## [131] "coastalstorm"                   "icestorm/blizzard"             
## [133] "coastal flooding/erosion"       "coastal erosion"               
## [135] "coastal  flooding/erosion"      "thunderstorm wind (g40)"       
## [137] "gusty thunderstorm winds"       "gusty thunderstorm wind"       
## [139] "sleet storm"                    "storm surge/tide"              
## [141] "tsunami"                        "marine thunderstorm wind"

EVTYPE values related to winter

uniqueTerm(data$EVTYPE, winterTerm)
##   [1] "freezing rain"                  "snow"                          
##   [3] "ice storm/flash flood"          "snow/ice"                      
##   [5] "winter storm"                   "blizzard"                      
##   [7] "blizzard weather"               "high wind/blizzard"            
##   [9] "heavy snow"                     "freeze"                        
##  [11] "high wind/blizzard/freezing ra" "high wind and heavy snow"      
##  [13] "high wind/ blizzard"            "ice storm"                     
##  [15] "blizzard/high wind"             "heavy snow/high"               
##  [17] "heavy snow/high winds/freezing" "avalanche"                     
##  [19] "high wind/wind chill/blizzard"  "high wind/heavy snow"          
##  [21] "record snowfall"                "heavy snow/wind"               
##  [23] "winter storm/high wind"         "winter storm/high winds"       
##  [25] "snow and wind"                  "winter storm high winds"       
##  [27] "winter storms"                  "heavy snowpack"                
##  [29] "ice"                            "blizzard/heavy snow"           
##  [31] "heavy snow/high winds"          "blowing snow"                  
##  [33] "freezing drizzle"               "light snow and sleet"          
##  [35] "first snow"                     "freezing rain and sleet"       
##  [37] "sleet/rain/snow"                "rain/snow"                     
##  [39] "snow/rain/sleet"                "damaging freeze"               
##  [41] "heavy snow/high wind"           "freezing rain/snow"            
##  [43] "thundersnow"                    "heavy rain/snow"               
##  [45] "snow/sleet/freezing rain"       "glaze ice"                     
##  [47] "early snow"                     "heavy snow/blowing snow"       
##  [49] "sleet/ice storm"                "blowing snow- extreme wind chi"
##  [51] "snow and heavy snow"            "ground blizzard"               
##  [53] "snow/heavy snow"                "freezing rain/sleet"           
##  [55] "ice jam flooding"               "snow- high wind- wind chill"   
##  [57] "ice/snow"                       "heavy snow/blizzard"           
##  [59] "snow and ice"                   "snowstorm"                     
##  [61] "snow and ice storm"             "heavy snow/sleet"              
##  [63] "agricultural freeze"            "heavy snow/ice storm"          
##  [65] "heavy snow and ice storm"       "snow/rain"                     
##  [67] "ice floes"                      "snow squalls"                  
##  [69] "snow squall"                    "blizzard/freezing rain"        
##  [71] "heavy lake snow"                "heavy snow/freezing rain"      
##  [73] "lake effect snow"               "heavy wet snow"                
##  [75] "blizzard and heavy snow"        "heavy snow and ice"            
##  [77] "ice storm and snow"             "heavy snow andblowing snow"    
##  [79] "heavy snow/ice"                 "blizzard and extreme wind chil"
##  [81] "blowing snow & extreme wind ch" "glaze/ice storm"               
##  [83] "heavy snow/winter storm"        "avalance"                      
##  [85] "blizzard/winter storm"          "ice jam"                       
##  [87] "frost\\freeze"                  "hard freeze"                   
##  [89] "heavy snow and high winds"      "heavy snow/high winds & flood" 
##  [91] "wet snow"                       "snow/ice storm"                
##  [93] "light snow"                     "record snow"                   
##  [95] "snow/cold"                      "flash flood from ice jams"     
##  [97] "heavy snow squalls"             "heavy snow/squalls"            
##  [99] "heavy snow-squalls"             "icy roads"                     
## [101] "snow freezing rain"             "lack of snow"                  
## [103] "snow/sleet"                     "snow/freezing rain"            
## [105] "snow drought"                   "heavy snow   freezing rain"    
## [107] "ice and snow"                   "freezing rain and snow"        
## [109] "freezing rain sleet and"        "heavy snow & ice"              
## [111] "freezing drizzle and freezing"  "hail/icy roads"                
## [113] "snow showers"                   "heavy snow/blizzard/avalanche" 
## [115] "record snow/cold"               "freezing rain sleet and light" 
## [117] "sleet & freezing rain"          "snow/ bitter cold"             
## [119] "snow sleet"                     "early freeze"                  
## [121] "ice/strong winds"               "snow/high winds"               
## [123] "high winds/snow"                "snowmelt flooding"             
## [125] "heavy snow and strong winds"    "snow accumulation"             
## [127] "blowing snow/extreme wind chil" "snow/ ice"                     
## [129] "snow/blowing snow"              "near record snow"              
## [131] "sleet/snow"                     "snow/sleet/rain"               
## [133] "snow and cold"                  "prolong cold/snow"             
## [135] "snow\\cold"                     "snowfall record"               
## [137] "heavy snow and"                 "lake-effect snow"              
## [139] "ice jam flood (minor"           "light snow/flurries"           
## [141] "ice fog"                        "late-season snowfall"          
## [143] "freezing fog"                   "drifting snow"                 
## [145] "heavy snow shower"              "late snow"                     
## [147] "record may snow"                "record winter snow"            
## [149] "late season snowfall"           "black ice"                     
## [151] "freezing spray"                 "light snowfall"                
## [153] "early snowfall"                 "monthly snowfall"              
## [155] "seasonal snowfall"              "thundersnow shower"            
## [157] "cold and snow"                  "sleet/freezing rain"           
## [159] "frost/freeze"                   "late freeze"                   
## [161] "snow and sleet"                 "blizzard summary"              
## [163] "icestorm/blizzard"              "mountain snows"                
## [165] "moderate snow"                  "moderate snowfall"             
## [167] "ice pellets"                    "light snow/freezing precip"    
## [169] "excessive snow"                 "light freezing rain"           
## [171] "ice roads"                      "late season snow"              
## [173] "snow advisory"                  "unusually late snow"           
## [175] "accumulated snowfall"           "falling snow/ice"              
## [177] "patchy ice"                     "ice on road"

EVTYPE values related to other

uniqueTerm(data$EVTYPE, otherTerm)
##   [1] "tstm wind"                      "freezing rain"                 
##   [3] "hurricane opal/high winds"      "thunderstorm winds"            
##   [5] "heavy rain"                     "thunderstorm wind"             
##   [7] "dense fog"                      "high winds"                    
##   [9] "thunderstorm winds lightning"   "thunderstorm winds/hail"       
##  [11] "wind"                           "heavy rains"                   
##  [13] "lightning and heavy rain"       "thunderstorm winds hail"       
##  [15] "heavy rain/lightning"           "waterspout"                    
##  [17] "lightning/heavy rain"           "high wind"                     
##  [19] "wind chill"                     "high wind/blizzard"            
##  [21] "high wind and high tides"       "high wind/blizzard/freezing ra"
##  [23] "high wind and heavy snow"       "record cold and high wind"     
##  [25] "high winds heavy rains"         "high wind/ blizzard"           
##  [27] "blizzard/high wind"             "high wind/low wind chill"      
##  [29] "high winds and wind chill"      "heavy snow/high winds/freezing"
##  [31] "wind chill/high wind"           "high wind/wind chill/blizzard" 
##  [33] "high wind/wind chill"           "high wind/heavy snow"          
##  [35] "high wind/seas"                 "high winds/heavy rain"         
##  [37] "record rainfall"                "heavy snow/wind"               
##  [39] "wind damage"                    "dust storm"                    
##  [41] "dust devil"                     "thunderstorm winds/funnel clou"
##  [43] "winter storm/high wind"         "winter storm/high winds"       
##  [45] "gusty winds"                    "strong winds"                  
##  [47] "flooding/heavy rain"            "snow and wind"                 
##  [49] "water spout"                    "high winds dust storm"         
##  [51] "winter storm high winds"        "mudslides"                     
##  [53] "rainstorm"                      "severe thunderstorm winds"     
##  [55] "thunderstorms winds"            "dry microburst"                
##  [57] "flood/rain/winds"               "winds"                         
##  [59] "dry microburst 61"              "flash flood winds"             
##  [61] "strong wind"                    "high wind damage"              
##  [63] "flood/rain/wind"                "downburst winds"               
##  [65] "dry microburst winds"           "dry mircoburst winds"          
##  [67] "dry microburst 53"              "microburst winds"              
##  [69] "high winds 57"                  "dry microburst 50"             
##  [71] "high winds 66"                  "high winds 76"                 
##  [73] "high winds 63"                  "high winds 67"                 
##  [75] "heavy snow/high winds"          "high winds 82"                 
##  [77] "high winds 80"                  "high winds 58"                 
##  [79] "lightning thunderstorm windss"  "dry microburst 58"             
##  [81] "high winds 73"                  "high winds 55"                 
##  [83] "dry microburst 84"              "thunderstorm winds 60"         
##  [85] "heavy rain/flooding"            "thunderstorm windss"           
##  [87] "freezing rain and sleet"        "unseasonably dry"              
##  [89] "sleet/rain/snow"                "drought"                       
##  [91] "high winds/flooding"            "dry"                           
##  [93] "rain/snow"                      "snow/rain/sleet"               
##  [95] "waterspout/tornado"             "waterspouts"                   
##  [97] "waterspout tornado"             "waterspout-tornado"            
##  [99] "waterspout-"                    "tornadoes, tstm wind, hail"    
## [101] "lightning thunderstorm winds"   "wayterspout"                   
## [103] "thunderstorm winds53"           "thunderstorm winds 13"         
## [105] "heavy snow/high wind"           "mud slide"                     
## [107] "freezing rain/snow"             "high winds/"                   
## [109] "extreme wind chills"            "heavy rain/snow"               
## [111] "snow/sleet/freezing rain"       "high  winds"                   
## [113] "mud slides"                     "extreme wind chill"            
## [115] "gradient winds"                 "thunderstorm winds urban flood"
## [117] "thunderstorm winds small strea" "blowing snow- extreme wind chi"
## [119] "freezing rain/sleet"            "snow- high wind- wind chill"   
## [121] "fog"                            "thunderstorm winds 2"          
## [123] "tstm wind 51"                   "tstm wind 50"                  
## [125] "tstm wind 52"                   "tstm wind 55"                  
## [127] "thunderstorm winds 61"          "thundertorm winds"             
## [129] "hail/winds"                     "wind storm"                    
## [131] "hail/wind"                      "wind/hail"                     
## [133] "thunderstorms wind"             "thunderstorm  winds"           
## [135] "drought/excessive heat"         "tunderstorm wind"              
## [137] "thundertsorm wind"              "thunderstorm winds/ hail"      
## [139] "thunderstorm wind/lightning"    "heavy rain/severe weather"     
## [141] "thundestorm winds"              "waterspout/ tornado"           
## [143] "warm dry conditions"            "high wind 63"                  
## [145] "high winds/coastal flood"       "rain"                          
## [147] "snow/rain"                      "thunderstorm wind g50"         
## [149] "blizzard/freezing rain"         "heavy snow/freezing rain"      
## [151] "dust devil waterspout"          "thunderstorm winds/heavy rain" 
## [153] "thunderstrom winds"             "thunderstorm winds      le cen"
## [155] "blizzard and extreme wind chil" "low wind chill"                
## [157] "blowing snow & extreme wind ch" "waterspout/"                   
## [159] "mud slides urban flooding"      "thunderstorm winds g"          
## [161] "dust storm/high winds"          "thunderstorm wind g60"         
## [163] "thunderstorm winds."            "hvy rain"                      
## [165] "thunderstorm wind g55"          "thunderstorm winds g60"        
## [167] "thunderstorm winds funnel clou" "thunderstorm winds 62"         
## [169] "heavy snow and high winds"      "heavy snow/high winds & flood" 
## [171] "thunderstorm winds/flash flood" "high wind 70"                  
## [173] "heavy rain and flood"           "thunderstorm winds 53"         
## [175] "tornado/waterspout"             "rain and wind"                 
## [177] "thunderstorm wind 59"           "thunderstorm wind 52"          
## [179] "excessive rain"                 "thunderstorm wind 69"          
## [181] "lightning and winds"            "fog and cold temperatures"     
## [183] "tstm wind g58"                  "mudslide"                      
## [185] "snow freezing rain"             "snow/freezing rain"            
## [187] "snow drought"                   "thunderstormw winds"           
## [189] "thunderstorm wind 60 mph"       "thunderstorm wind 65mph"       
## [191] "thunderstorm wind/ trees"       "thunderstorm wind/awning"      
## [193] "thunderstorm wind 98 mph"       "thunderstorm wind trees"       
## [195] "torrential rain"                "thunderstorm wind 59 mph"      
## [197] "thunderstorm winds 63 mph"      "thunderstorm wind/ tree"       
## [199] "thunderstorm wind 65 mph"       "flash flood - heavy rain"      
## [201] "thunderstorm wind."             "thunderstorm wind 59 mph."     
## [203] "heavy snow   freezing rain"     "thunderstorm windshail"        
## [205] "thuderstorm winds"              "storm force winds"             
## [207] "freezing rain and snow"         "freezing rain sleet and"       
## [209] "thunderstorm winds and"         "flash flood/heavy rain"        
## [211] "heavy rain; urban flood winds;" "tstm wind damage"              
## [213] "rain/wind"                      "thunderstorm winds 50"         
## [215] "thunderstorm wind g52"          "thunderstorm winds 52"         
## [217] "snow showers"                   "thunderstorm wind g51"         
## [219] "heat wave drought"              "unseasonably warm and dry"     
## [221] "freezing rain sleet and light"  "record/excessive rainfall"     
## [223] "thunderstorm wind g61"          "sleet & freezing rain"         
## [225] "heavy rains/flooding"           "thunderestorm winds"           
## [227] "thunderstorm winds/flooding"    "thundeerstorm winds"           
## [229] "thunderstorm wind 50"           "thunerstorm winds"             
## [231] "heavy rain/mudslides/flood"     "high winds/cold"               
## [233] "cold/winds"                     "thunderstorm wind 56"          
## [235] "dry hot weather"                "ice/strong winds"              
## [237] "extreme wind chill/blowing sno" "snow/high winds"               
## [239] "high winds/snow"                "heavy snow and strong winds"   
## [241] "blowing snow/extreme wind chil" "thunderstorm wind/hail"        
## [243] "excessive rainfall"             "tstm winds"                    
## [245] "tstm wind 65)"                  "thunderstorm winds/ flood"     
## [247] "heavy rainfall"                 "heat/drought"                  
## [249] "heat drought"                   "high wind and seas"            
## [251] "thunderstormwinds"              "thunderstorm winds heavy rain" 
## [253] "snow/sleet/rain"                "duststorm"                     
## [255] "flood & heavy rain"             "thunderstrom wind"             
## [257] "hot/dry pattern"                "dry pattern"                   
## [259] "mild/dry pattern"               "heavy showers"                 
## [261] "high wind 48"                   "waterspout funnel cloud"       
## [263] "heavy shower"                   "heavy rain/urban flood"        
## [265] "heavy rain/small stream urban"  "extreme windchill"             
## [267] "tstm wind/hail"                 "record dry month"              
## [269] "heavy rain and wind"            "hot and dry"                   
## [271] "heavy rain/high surf"           "rain damage"                   
## [273] "ice fog"                        "torrential rainfall"           
## [275] "heavy rain/wind"                "freezing fog"                  
## [277] "whirlwind"                      "heavy snow shower"             
## [279] "gusty wind"                     "gradient wind"                 
## [281] "gusty wind/rain"                "gusty wind/hvy rain"           
## [283] "monthly rainfall"               "mudslide/landslide"            
## [285] "volcanic ash"                   "volcanic ash plume"            
## [287] "thundersnow shower"             "tstm wind (g45)"               
## [289] "sleet/freezing rain"            "tstm heavy rain"               
## [291] "tstm wind 40"                   "tstm wind 45"                  
## [293] "tstm wind (41)"                 "tstm wind (g40)"               
## [295] "tstm wnd"                       "rain (heavy)"                  
## [297] "strong wind gust"               "flood/strong wind"             
## [299] "tstm wind and lightning"        "heavy surf and wind"           
## [301] "mild and dry pattern"           "dry spell"                     
## [303] "unseasonal rain"                "early rain"                    
## [305] "prolonged rain"                 "tstm wind  (g45)"              
## [307] "high wind (g40)"                "tstm wind (g35)"               
## [309] "dry weather"                    "wake low wind"                 
## [311] "cold wind chill temperatures"   "bitter wind chill"             
## [313] "bitter wind chill temperatures" "patchy dense fog"              
## [315] "wind advisory"                  "gusty wind/hail"               
## [317] "excessively dry"                "vog"                           
## [319] "record dryness"                 "extreme windchill temperatures"
## [321] "dry conditions"                 "dryness"                       
## [323] "wind and wave"                  "light freezing rain"           
## [325] "tstm wind g45"                  "non-severe wind damage"        
## [327] "thunderstorm wind (g40)"        "locally heavy rain"            
## [329] "wind gusts"                     "gusty lake wind"               
## [331] "abnormally dry"                 "wnd"                           
## [333] "very dry"                       "record low rainfall"           
## [335] "non-tstm wind"                  "non tstm wind"                 
## [337] "gusty thunderstorm winds"       "heavy rain effects"            
## [339] "excessive heat/drought"         "marine tstm wind"              
## [341] "extreme cold/wind chill"        "gusty thunderstorm wind"       
## [343] "cold/wind chill"                "marine high wind"              
## [345] "marine thunderstorm wind"       "marine strong wind"            
## [347] "volcanic ashfall"

To ease further analysis, we added one logical column per category, seven in total: isConvection, isExtremeTemp, isFlood, isMarine, isTropicalCyclone, isWinter, and isOther. Each EVTYPE value is checked against every category. If it is related to a particular category (say, convection), TRUE is assigned to the corresponding column (i.e. isConvection); otherwise, FALSE is assigned.

# isRelated: flag the values that are related to a category
#   dfCol - character vector in which to look for related terms
#   terms - character vector containing the search terms of the category
# Returns a logical vector (TRUE/FALSE), one element per value of dfCol,
# showing which values are related to the category.
isRelated <- function(dfCol, terms) {
    pattern <- formSearchPattern(terms)
    grepl(pattern, dfCol)
}

data["isConvection"] = isRelated(data$EVTYPE, convectionTerm)
data["isExtremeTemp"] = isRelated(data$EVTYPE, extremeTempTerm)
data["isFlood"] = isRelated(data$EVTYPE, floodTerm)
data["isMarine"] = isRelated(data$EVTYPE, marineTerm)
data["isTropicalCyclone"] = isRelated(data$EVTYPE, tropicalCyloneTerm)
data["isWinter"] = isRelated(data$EVTYPE, winterTerm)
data["isOther"] = isRelated(data$EVTYPE, otherTerm)
options(scipen = 999)
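The flagging step can be illustrated on a handful of event types. This is only a minimal sketch: the term lists below are hypothetical stand-ins for the report's real lists, and `to_pattern` assumes that formSearchPattern simply joins the terms with "|", which may differ from the actual helper defined earlier.

```r
# Toy event types (hypothetical sample, not the full EVTYPE column)
evtype <- c("thunderstorm wind", "flash flood", "heavy snow", "hail", "tornado")

# Hypothetical search terms; the report's real term lists are defined earlier
convection_terms <- c("thunderstorm", "tornado", "hail")
flood_terms <- c("flood")

# Assumption: the search pattern is the terms joined with "|" (regex alternation)
to_pattern <- function(terms) paste(terms, collapse = "|")

# One logical flag per event type and category
is_convection <- grepl(to_pattern(convection_terms), evtype)
is_flood <- grepl(to_pattern(flood_terms), evtype)

print(data.frame(evtype, is_convection, is_flood))
```

Because grepl matches substrings, a value such as "thunderstorm winds/flash flood" would be flagged in both columns, which is why overlaps between categories have to be handled later.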

The percentage of events related to each category:

Category               Events Related (%)
Convection             81.7609
Extreme Temperatures    3.2002
Flood                   9.1689
Marine                 13.9949
Tropical Cyclones       0.1092
Winter                  4.0156
Other                  42.8164
Not in any category     1.9107

As the table shows, 81.8% of the events are related to convection, and around 2% do not fall into any category. The percentages sum to more than 100% because some events (e.g. flash flooding/thunderstorm wi) fall into more than one category.
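How such percentages can exceed 100% in total is easy to see on a toy example. The sketch below uses a made-up data frame with only two flag columns; it is an assumption about how the shares could be computed, not the report's own code.

```r
# Made-up flags for four events; the second event matches both categories
flags <- data.frame(
  isConvection = c(TRUE, TRUE, FALSE, FALSE),
  isFlood      = c(FALSE, TRUE, TRUE, FALSE)
)

# Share of events matching each category, in percent
pct <- colMeans(flags) * 100

# Share of events matching no category at all, in percent
pct_none <- mean(rowSums(flags) == 0) * 100

print(pct)
print(pct_none)
print(sum(pct) + pct_none)  # above 100 because one event is counted twice
```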

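We defined a priority rule so that each event type is assigned to exactly one category. From highest to lowest priority, the order is: convection, extreme temperature, flood, marine, tropical cyclone, winter, other. An event that matches several categories is assigned to the one with the highest priority.

An event that does not fall into any of the above categories is categorized as misc.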

# convection > extremeTemp > flood > marine > tropicalCyclone > winter >
# other
data["category"] = "misc"
data[which(data$isConvection == TRUE), "category"] = "convection"
data[which(data$category == "misc" & data$isExtremeTemp == TRUE), "category"] = "extremeTemp"
data[which(data$category == "misc" & data$isFlood == TRUE), "category"] = "flood"
data[which(data$category == "misc" & data$isMarine == TRUE), "category"] = "marine"
data[which(data$category == "misc" & data$isTropicalCyclone == TRUE), "category"] = "tropicalCyclone"
data[which(data$category == "misc" & data$isWinter == TRUE), "category"] = "winter"
data[which(data$category == "misc" & data$isOther == TRUE), "category"] = "other"
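The same priority rule can be sketched as a loop over the flag columns. The data frame and the `priority` vector below are hypothetical and cover only three categories; the idea is to walk the list from lowest to highest priority so that higher-priority categories overwrite lower ones.

```r
# Made-up flags for four events
flags <- data.frame(
  isConvection = c(TRUE, FALSE, FALSE, FALSE),
  isFlood      = c(TRUE, TRUE, FALSE, FALSE),
  isWinter     = c(FALSE, TRUE, TRUE, FALSE)
)

# Flag column -> category label, listed from highest to lowest priority
priority <- c(isConvection = "convection", isFlood = "flood", isWinter = "winter")

# Start with "misc", then let higher-priority categories overwrite lower ones
category <- rep("misc", nrow(flags))
for (col in rev(names(priority))) {
  category[flags[[col]]] <- priority[[col]]
}

print(category)
```

This should give the same assignment as the chain of `which()` statements, while keeping the priority order in a single place.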

After forcing each EVTYPE value into exactly one category, the percentage of events related to each category is:

Category               Events Related (%)
Convection             81.7609
Extreme Temperatures    0.5938
Flood                   9.1666
Marine                  1.7544
Tropical Cyclones       0.0318
Winter                  2.5101
Other                   2.2716
Misc.                   1.9107

Processing FATALITIES column

First, we check whether there is any missing value in the FATALITIES column.

any(is.na(data$FATALITIES))
## [1] FALSE

There is no missing value in the FATALITIES column, so no further processing is needed for it.

Processing INJURIES column

Next, we check whether there is any missing value in the INJURIES column.

any(is.na(data$INJURIES))
## [1] FALSE

There is no missing value in the INJURIES column, so no preprocessing is needed for it either.
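Both columns can also be checked in one step. The sketch below uses a made-up data frame (with one NA added on purpose) to show what the check reports when a value is missing.

```r
# Made-up data with one deliberately missing injury count
toy <- data.frame(FATALITIES = c(0, 1, 0), INJURIES = c(2, NA, 0))

# Number of missing values per column
na_counts <- colSums(is.na(toy[, c("FATALITIES", "INJURIES")]))

print(na_counts)
```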

About Property and Crop Damage

There are four columns related to property and crop damage: PROPDMG, PROPDMGEXP, CROPDMG, and CROPDMGEXP.

According to the National Weather Service Storm Data Documentation, the estimated damage amount is rounded to 3 significant digits and followed by an alphabetical character that indicates the magnitude of the number, for example 1.55B for $1,550,000,000. The value 1.55B is separated into two parts, the numeric part (e.g. 1.55) and the alphabetical part (e.g. B), which are stored in two different columns (PROPDMG/CROPDMG and PROPDMGEXP/CROPDMGEXP). The alphabetical characters used in this analysis are H (hundreds), K (thousands), M (millions), and B (billions).

First, we check whether there is any missing or invalid value in PROPDMGEXP and CROPDMGEXP.

unique(data$PROPDMGEXP)
##  [1] "K" "M" ""  "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-"
## [18] "1" "8"
unique(data$CROPDMGEXP)
## [1] ""  "M" "K" "m" "B" "?" "0" "k" "2"

As we can see, there are some invalid values (e.g. ?, 0) in PROPDMGEXP / CROPDMGEXP. We defined the conversion rule as follows:

  1. Convert the value in PROPDMGEXP / CROPDMGEXP to uppercase.

  2. Check whether it is a valid character (i.e. H, K, M, B).

    2.1. If it is valid, multiply the value of PROPDMG / CROPDMG by the corresponding factor and store the result in a new column, propDmgAmt / cropDmgAmt.

    2.2. Otherwise, store the value of PROPDMG / CROPDMG unchanged in the new column propDmgAmt / cropDmgAmt.

getAmt <- function(dfDMG, dfDMGEXP) {
    res <- sapply(dfDMGEXP, function(char) {
        char = toupper(char)
        if (char == "H") {
            100
        } else if (char == "K") {
            1000
        } else if (char == "M") {
            1000000
        } else if (char == "B") {
            1000000000
        } else {
            1
        }
    })
    res * as.numeric(dfDMG)
}

data["propDmgAmt"] = getAmt(data$PROPDMG, data$PROPDMGEXP)
data["cropDmgAmt"] = getAmt(data$CROPDMG, data$CROPDMGEXP)
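The conversion rule can also be written as a vectorized lookup, which avoids calling a function once per row. This is a sketch under the same rule as above; `multiplier` and `get_amt` are hypothetical names introduced for illustration.

```r
# Multiplier for each valid magnitude code
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

get_amt <- function(dmg, dmg_exp) {
  # Look up every code at once; unknown or empty codes yield NA
  mult <- unname(multiplier[toupper(dmg_exp)])
  # Invalid codes leave the amount unchanged (factor of 1)
  mult[is.na(mult)] <- 1
  as.numeric(dmg) * mult
}

# 1.55B, 25k, an invalid code and an empty code
amounts <- get_amt(c(1.55, 25, 3, 10), c("B", "k", "?", ""))
print(amounts)
```

Here 1.55 with code B becomes 1,550,000,000, matching the example from the documentation, while the values with "?" and "" stay at 3 and 10.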

Summary statistics of the property and crop damage amounts:

summary(data$propDmgAmt)
##         Min.      1st Qu.       Median         Mean      3rd Qu. 
##            0            0            0       474000          500 
##         Max. 
## 115000000000
summary(data$cropDmgAmt)
##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
##          0          0          0      54400          0 5000000000

Processing BGN_DATE column

BGN_DATE is stored as character strings, so we convert it to Date format.

data$BGN_DATE = as.Date(data$BGN_DATE, format = "%m/%d/%Y %k:%M:%S")

summary(data$BGN_DATE)
##         Min.      1st Qu.       Median         Mean      3rd Qu. 
## "1950-01-03" "1995-04-20" "2002-03-18" "1998-12-27" "2007-07-28" 
##         Max. 
## "2011-11-30"

The event dates range from 1950-01-03 to 2011-11-30.
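As a small illustration of the conversion, the sketch below parses two strings in the same layout as BGN_DATE. It relies on the fact that date parsing ignores trailing characters beyond the format, so a date-only format is enough here; the sample strings are made up to mirror the column's layout.

```r
# Strings in the same "month/day/year time" layout as BGN_DATE
raw <- c("4/18/1950 0:00:00", "11/30/2011 0:00:00")

# Only the date part is parsed; the trailing time is ignored
dates <- as.Date(raw, format = "%m/%d/%Y")

print(dates)
print(range(dates))
```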

Filter irrelevant states

Since we focus on the 50 states of the United States, we checked the number of distinct values in the STATE column and found 72, more than the 50 expected.

length(unique(data$STATE))
## [1] 72

Thus, it is necessary to filter the irrelevant states out of the dataset.

data = data[data$STATE %in% state.abb, ]

After filtering out the irrelevant states, there are 883186 valid records.

nrow(data)
## [1] 883186
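To see which codes such a filter drops, the values can be compared against the built-in `state.abb` vector. The sketch below uses a made-up vector of codes; note that `state.abb` holds only the 50 state abbreviations, so a code such as DC would be dropped as well.

```r
# Made-up sample of STATE codes, including two non-state codes
states <- c("AL", "PR", "TX", "GU", "AL")

# Codes that are not among the 50 state abbreviations
non_states <- setdiff(unique(states), state.abb)

# Records that survive the filter
kept <- states[states %in% state.abb]

print(non_states)
print(kept)
```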

Filter and reshape data

To make plotting easier, we built a new dataset from the existing one by dropping the irrelevant columns and reshaping the result.

# subsetting data
subsetting <- function(data) {
    # extract the data
    d = data[, c("BGN_DATE", "STATE", "FATALITIES", "INJURIES", "propDmgAmt", 
        "cropDmgAmt", "category")]

    # change column names
    colnames(d) <- c("date", "state", "fatality", "injury", "propDmgAmt", "cropDmgAmt", 
        "category")
    d
}

ds = subsetting(data)

# calculate total fatality per category
totalFatalityPerCat = aggregate(fatality ~ category, sum, data = ds)

# calculate total injury per category
totalInjuryPerCat = aggregate(injury ~ category, sum, data = ds)

# calculate total propDmgAmt
totalPropDmgAmt = aggregate(propDmgAmt ~ category, sum, data = ds)

# calculate total cropDmgAmt
totalCropDmgAmt = aggregate(cropDmgAmt ~ category, sum, data = ds)

# create statistic per category
statPerCat = totalFatalityPerCat
statPerCat["injury"] = totalInjuryPerCat$injury
statPerCat["propDmgAmt"] = totalPropDmgAmt$propDmgAmt/1000000000  #rescale in billion
statPerCat["cropDmgAmt"] = totalCropDmgAmt$cropDmgAmt/1000000000  #rescale in billion
library(reshape)
statPerCat = melt(statPerCat, id = "category")
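The aggregation and reshaping steps can be sketched with base R only. The example below uses made-up data and two measures; combining `cbind()` with the formula interface of `aggregate()` sums several columns in one call, and the long table is then built by hand to mimic what `melt` produces (columns category, variable, value).

```r
# Made-up per-event data
ds_toy <- data.frame(
  category = c("flood", "flood", "winter"),
  fatality = c(1, 2, 0),
  injury   = c(5, 0, 3)
)

# Sum both measures per category in a single call
totals <- aggregate(cbind(fatality, injury) ~ category, data = ds_toy, FUN = sum)

# Long format: one row per (category, variable) pair
long <- data.frame(
  category = rep(totals$category, times = 2),
  variable = rep(c("fatality", "injury"), each = nrow(totals)),
  value    = c(totals$fatality, totals$injury)
)

print(totals)
print(long)
```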

Results

Which types of events are most harmful with respect to population health?

First, we look at which types of events are most harmful to population health. The following figure shows the total number of fatalities and injuries from different event categories in United States, 1950 - 1993:

library(ggplot2)

ggplot(subset(statPerCat, variable == "fatality" | variable == "injury"), aes(category, 
    value, fill = variable)) + geom_bar(stat = "identity") + xlab("Event Category") + 
    ylab("Number of People") + ggtitle("Total number of fatalities and injuries\nfrom different event category in United States, 1950 - 1993") + 
    scale_x_discrete(breaks = c("convection", "extremeTemp", "flood", "marine", 
        "misc", "other", "tropicalCyclone", "winter"), labels = c("Convect-\nion", 
        "Extreme\nTemp.", "Flood", "Marine", "Misc.", "Other", "Tropical\nCyclone", 
        "Winter")) + scale_fill_discrete(name = "", breaks = c("fatality", "injury"), 
    labels = c("Fatality", "Injury"))

[Figure: Total number of fatalities and injuries from different event categories in United States, 1950 - 1993]

Referring to the figure above, we can see that events related to convection are the most harmful with respect to population health. They caused 109,410 injuries and 7,872 fatalities in United States from 1950 to 1993.
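The ranking behind such a statement can also be read off the long table directly instead of from the plot. The sketch below uses made-up numbers and placeholder category names (not the report's results) and sums fatalities and injuries per category before sorting.

```r
# Hypothetical long-format totals (made-up numbers, not the report's results)
stat_toy <- data.frame(
  category = c("a", "b", "a", "b"),
  variable = c("fatality", "fatality", "injury", "injury"),
  value    = c(10, 4, 120, 300)
)

# Total health impact per category, sorted from largest to smallest
health <- aggregate(value ~ category, data = stat_toy, FUN = sum)
health <- health[order(-health$value), ]

print(health)
top_category <- as.character(health$category[1])
print(top_category)
```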

Which types of events have the greatest economic consequences?

We now look at which types of events have the greatest economic consequences. The following figure shows the property and crop damage from different event categories in United States, 1950 - 1993:

ggplot(subset(statPerCat, variable == "propDmgAmt" | variable == "cropDmgAmt"), 
    aes(category, value, fill = variable)) + geom_bar(stat = "identity") + xlab("Event Category") + 
    ylab("Cost (in billion USD)") + ggtitle("Total property and crop damage from different event category\nin United States, 1950 - 1993") + 
    scale_x_discrete(breaks = c("convection", "extremeTemp", "flood", "marine", 
        "misc", "other", "tropicalCyclone", "winter"), labels = c("Convect-\nion", 
        "Extreme\nTemp.", "Flood", "Marine", "Misc.", "Other", "Tropical\nCyclone", 
        "Winter")) + scale_fill_discrete(name = "Damage", breaks = c("propDmgAmt", 
    "cropDmgAmt"), labels = c("Property", "Crop"))

[Figure: Total property and crop damage from different event categories in United States, 1950 - 1993]

From this figure, we can see that events related to flood have the greatest economic consequences. They caused 167.1 billion in property damage and 12.2 billion in crop damage (total: 179.3 billion) in United States from 1950 to 1993.

Reference: