Severe weather events cause health problems and property damage, affecting lives and causing economic loss. The National Oceanic and Atmospheric Administration (NOAA) collects data on such events and makes it available to users for analysis. This project analyzes NOAA data from 1950 to 2011 to determine which event type is most harmful to public health and which is responsible for the greatest economic loss. The study concludes that, over the study period, tornadoes had the most adverse impact on public health and floods were the most destructive in terms of property damage.
The data was downloaded from the link provided. The downloaded file is compressed, with the extension ‘bz2’, and is located in the folder course5 under the working directory. read.csv() can load this compressed file directly. Alternatively, an application similar to WinZip can be used to decompress it and extract the csv file ‘repdata_data_StormData.csv’.
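As a small illustration of reading a bz2 file without unpacking it first, the sketch below writes a tiny compressed csv to a temporary location and reads it back with read.csv(). The sample rows and the temporary file are made up for the example and are not part of the NOAA data.

```r
# Made-up sample rows, used only to demonstrate the round trip
sample_rows <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                          FATALITIES = c(1, 0),
                          stringsAsFactors = FALSE)

# Hypothetical temporary file; the real analysis reads the file under course5
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")          # bzip2-compressed text connection
write.csv(sample_rows, con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which detects bzip2 compression
round_trip <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)
print(round_trip)

unlink(tmp)                             # remove the temporary file
```

Reading the compressed file directly avoids keeping a second, much larger uncompressed copy on disk.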
Before loading and processing the data, load the libraries used in the analysis.
suppressMessages(library(dplyr))
library('scales')
Load data from the csv to the dataframe ‘weather.data.full’
weather.data.full <- read.csv("course5/repdata_data_StormData.csv.bz2", header = TRUE)
dim(weather.data.full)
## [1] 902297 37
names(weather.data.full)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
str(weather.data.full)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
The data set is large and occupies a lot of memory. A quick glance at the column names shows that not all of these columns are needed to complete the analysis. Only the following columns are required:
    STATE
    EVTYPE
    FATALITIES
    INJURIES
    PROPDMG
    PROPDMGEXP
    CROPDMG
    CROPDMGEXP
Create another data frame called ‘weather.data.selected’ with only the columns required for the analysis and remove the original dataframe to clear memory
columns <- c("STATE", "EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")
weather.data.selected <- select(weather.data.full, all_of(columns))
rm(weather.data.full)
# Count the missing (NA) values in each selected column
for (i in seq_along(columns)) {
print(paste("Number of missing values in", columns[i], ":",
sum(is.na(weather.data.selected[, i]))))
}
## [1] "Number of missing values in STATE : 0"
## [1] "Number of missing values in EVTYPE : 0"
## [1] "Number of missing values in FATALITIES : 0"
## [1] "Number of missing values in INJURIES : 0"
## [1] "Number of missing values in PROPDMG : 0"
## [1] "Number of missing values in PROPDMGEXP : 0"
## [1] "Number of missing values in CROPDMG : 0"
## [1] "Number of missing values in CROPDMGEXP : 0"
The result shows that there are no missing (NA) values in the selected columns. Blank strings are not counted as NA by this check.
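The check above counts NA entries only. A minimal sketch on a made-up toy data frame (not the NOAA data) shows that a blank string passes an is.na() check and has to be counted separately:

```r
# Toy frame with made-up values: one blank exponent code, no NA anywhere
toy <- data.frame(PROPDMG = c(25, 2.5, 0),
                  PROPDMGEXP = c("K", "", "M"),
                  stringsAsFactors = FALSE)

# NA entries per column
na_counts <- colSums(is.na(toy))

# Empty-string entries per column
blank_counts <- sapply(toy, function(col) sum(col == ""))

print(na_counts)
print(blank_counts)
```

This matters for the exponent columns handled next, where a blank code is a legitimate value meaning no data or zero.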
The NOAA directive ‘Operations and Services Performance, NWSPD 10-16’ was used as the basis for cleansing the data.
According to the directive, the column PROPDMG holds the quantity of damage and the column PROPDMGEXP holds the abbreviation for its magnitude. The abbreviations are:
    B    for Billion
    M    for Million
    K    for Thousand
    H    for Hundred
    Blank    No data or zero
table(weather.data.selected$PROPDMGEXP)
##
##             -      ?      +      0      1      2      3      4      5      6
## 465934      1      8      5    216     25     13      4      4     28      4
##      7      8      B      h      H      K      m      M
##      5      1     40      1      6 424665      7  11330
This shows that there are some incorrect entries. All lower case abbreviations were converted to corresponding upper case abbreviations
weather.data.selected$PROPDMGEXP <- toupper(weather.data.selected$PROPDMGEXP)
table(weather.data.selected$PROPDMGEXP)
##
##             -      ?      +      0      1      2      3      4      5      6
## 465934      1      8      5    216     25     13      4      4     28      4
##      7      8      B      H      K      M
##      5      1     40      7 424665  11337
There are some undefined codes.
sum(as.matrix(table(weather.data.selected$PROPDMGEXP, exclude = c("B", "M", "K", "H", ""))))
## [1] 314
Only 314 records carry one of these undefined abbreviations. Their quantities are treated as plain dollar amounts.
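The same kind of count can also be obtained without building a table. The sketch below uses %in% on a short made-up vector of codes, so the value it prints is not the 314 reported above.

```r
# Made-up sample of exponent codes (the real column has about 900,000 entries)
codes <- c("K", "", "M", "5", "+", "B", "?", "H")

# Codes defined by the directive, plus blank
defined <- c("B", "M", "K", "H", "")

# Number of entries whose code is not in the defined set
n_undefined <- sum(!(codes %in% defined))
print(n_undefined)
```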
The next step is to standardize these amounts. Taking $1000 as the base unit, every amount is converted to thousands of dollars by multiplying it by the multiplier listed below for its abbreviation.
    B    1000000
    M    1000
    K    1
    H    0.1
    Blank    0
    Undefined    0.001
weather.data.selected$PROPDMGVAL <-
ifelse(weather.data.selected$PROPDMGEXP == "B",
weather.data.selected$PROPDMG * 1000000,
ifelse(weather.data.selected$PROPDMGEXP == "M",
weather.data.selected$PROPDMG * 1000,
ifelse(weather.data.selected$PROPDMGEXP == "K",
weather.data.selected$PROPDMG,
ifelse(weather.data.selected$PROPDMGEXP == "H",
weather.data.selected$PROPDMG / 10,
ifelse(weather.data.selected$PROPDMGEXP == "", 0,
ifelse(weather.data.selected$PROPDMG %in% c("", 0), 0,
weather.data.selected$PROPDMG / 1000))))))
The column PROPDMGVAL now has the Property Damage in $1000s
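The nested ifelse() above can also be written as a lookup against a named vector of multipliers. The following is a minimal sketch under that assumption. The helper to_thousands() and the sample values are hypothetical and not part of the original analysis.

```r
# Multipliers that express each defined code in thousands of dollars
multiplier <- c(B = 1e6, M = 1e3, K = 1, H = 0.1)

# Hypothetical helper: convert amount/code pairs to thousands of dollars
to_thousands <- function(amount, code) {
  code <- toupper(code)
  # Name lookup returns NA for blank and undefined codes
  result <- amount * unname(multiplier[code])
  # Undefined codes: treat the amount as plain dollars
  undefined <- is.na(result) & code != ""
  result[undefined] <- amount[undefined] / 1000
  # Blank code: no data or zero
  result[code == ""] <- 0
  result
}

# Made-up sample values covering every branch
converted <- to_thousands(c(25, 2.5, 3, 7, 500, 10),
                          c("K", "M", "B", "h", "5", ""))
print(converted)
```

Because the helper takes the amount and the code as arguments, the same function could be applied to the crop damage columns without repeating the logic.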
Repeat the same process for crop damage.
table(weather.data.selected$CROPDMGEXP)
##
##             ?      0      2      B      k      K      m      M
## 618413      7     19      1      9     21 281832      1   1994
weather.data.selected$CROPDMGEXP <- toupper(weather.data.selected$CROPDMGEXP)
table(weather.data.selected$CROPDMGEXP)
##
##             ?      0      2      B      K      M
## 618413      7     19      1      9 281853   1995
sum(as.matrix(table(weather.data.selected$CROPDMGEXP, exclude = c("B", "M", "K", "H", ""))))
## [1] 27
weather.data.selected$CROPDMGVAL <-
ifelse(weather.data.selected$CROPDMGEXP == "B",
weather.data.selected$CROPDMG * 1000000,
ifelse(weather.data.selected$CROPDMGEXP == "M",
weather.data.selected$CROPDMG * 1000,
ifelse(weather.data.selected$CROPDMGEXP == "K",
weather.data.selected$CROPDMG,
ifelse(weather.data.selected$CROPDMGEXP == "H",
weather.data.selected$CROPDMG / 10,
ifelse(weather.data.selected$CROPDMGEXP == "", 0,
ifelse(weather.data.selected$CROPDMG %in% c("", 0), 0,
weather.data.selected$CROPDMG / 1000))))))
Crop Damage has 27 undefined abbreviations. The column CROPDMGVAL has the Crop Damage in $1000s
A quick look at the values in the EVTYPE column shows that the data was entered as free-form text.
sort(unique(weather.data.selected$EVTYPE))
## [1] " HIGH SURF ADVISORY" " COASTAL FLOOD"
## [3] " FLASH FLOOD" " LIGHTNING"
## [5] " TSTM WIND" " TSTM WIND (G45)"
## [7] " WATERSPOUT" " WIND"
## [9] "?" "ABNORMAL WARMTH"
## [11] "ABNORMALLY DRY" "ABNORMALLY WET"
## [13] "ACCUMULATED SNOWFALL" "AGRICULTURAL FREEZE"
## [15] "APACHE COUNTY" "ASTRONOMICAL HIGH TIDE"
## [17] "ASTRONOMICAL LOW TIDE" "AVALANCE"
## [19] "AVALANCHE" "BEACH EROSIN"
## [21] "Beach Erosion" "BEACH EROSION"
## [23] "BEACH EROSION/COASTAL FLOOD" "BEACH FLOOD"
## [25] "BELOW NORMAL PRECIPITATION" "BITTER WIND CHILL"
## [27] "BITTER WIND CHILL TEMPERATURES" "Black Ice"
## [29] "BLACK ICE" "BLIZZARD"
## [31] "BLIZZARD AND EXTREME WIND CHIL" "BLIZZARD AND HEAVY SNOW"
## [33] "Blizzard Summary" "BLIZZARD WEATHER"
## [35] "BLIZZARD/FREEZING RAIN" "BLIZZARD/HEAVY SNOW"
## [37] "BLIZZARD/HIGH WIND" "BLIZZARD/WINTER STORM"
## [39] "BLOW-OUT TIDE" "BLOW-OUT TIDES"
## [41] "BLOWING DUST" "blowing snow"
## [43] "Blowing Snow" "BLOWING SNOW"
## [45] "BLOWING SNOW- EXTREME WIND CHI" "BLOWING SNOW & EXTREME WIND CH"
## [47] "BLOWING SNOW/EXTREME WIND CHIL" "BREAKUP FLOODING"
## [49] "BRUSH FIRE" "BRUSH FIRES"
## [51] "COASTAL FLOODING/EROSION" "COASTAL EROSION"
## [53] "Coastal Flood" "COASTAL FLOOD"
## [55] "coastal flooding" "Coastal Flooding"
## [57] "COASTAL FLOODING" "COASTAL FLOODING/EROSION"
## [59] "Coastal Storm" "COASTAL STORM"
## [61] "COASTAL SURGE" "COASTAL/TIDAL FLOOD"
## [63] "COASTALFLOOD" "COASTALSTORM"
## [65] "Cold" "COLD"
## [67] "COLD AIR FUNNEL" "COLD AIR FUNNELS"
## [69] "COLD AIR TORNADO" "Cold and Frost"
## [71] "COLD AND FROST" "COLD AND SNOW"
## [73] "COLD AND WET CONDITIONS" "Cold Temperature"
## [75] "COLD TEMPERATURES" "COLD WAVE"
## [77] "COLD WEATHER" "COLD WIND CHILL TEMPERATURES"
## [79] "COLD/WIND CHILL" "COLD/WINDS"
## [81] "COOL AND WET" "COOL SPELL"
## [83] "CSTL FLOODING/EROSION" "DAM BREAK"
## [85] "DAM FAILURE" "Damaging Freeze"
## [87] "DAMAGING FREEZE" "DEEP HAIL"
## [89] "DENSE FOG" "DENSE SMOKE"
## [91] "DOWNBURST" "DOWNBURST WINDS"
## [93] "DRIEST MONTH" "Drifting Snow"
## [95] "DROUGHT" "DROUGHT/EXCESSIVE HEAT"
## [97] "DROWNING" "DRY"
## [99] "DRY CONDITIONS" "DRY HOT WEATHER"
## [101] "DRY MICROBURST" "DRY MICROBURST 50"
## [103] "DRY MICROBURST 53" "DRY MICROBURST 58"
## [105] "DRY MICROBURST 61" "DRY MICROBURST 84"
## [107] "DRY MICROBURST WINDS" "DRY MIRCOBURST WINDS"
## [109] "DRY PATTERN" "DRY SPELL"
## [111] "DRY WEATHER" "DRYNESS"
## [113] "DUST DEVEL" "Dust Devil"
## [115] "DUST DEVIL" "DUST DEVIL WATERSPOUT"
## [117] "DUST STORM" "DUST STORM/HIGH WINDS"
## [119] "DUSTSTORM" "EARLY FREEZE"
## [121] "Early Frost" "EARLY FROST"
## [123] "EARLY RAIN" "EARLY SNOW"
## [125] "Early snowfall" "EARLY SNOWFALL"
## [127] "Erosion/Cstl Flood" "EXCESSIVE"
## [129] "Excessive Cold" "EXCESSIVE HEAT"
## [131] "EXCESSIVE HEAT/DROUGHT" "EXCESSIVE PRECIPITATION"
## [133] "EXCESSIVE RAIN" "EXCESSIVE RAINFALL"
## [135] "EXCESSIVE SNOW" "EXCESSIVE WETNESS"
## [137] "EXCESSIVELY DRY" "Extended Cold"
## [139] "Extreme Cold" "EXTREME COLD"
## [141] "EXTREME COLD/WIND CHILL" "EXTREME HEAT"
## [143] "EXTREME WIND CHILL" "EXTREME WIND CHILL/BLOWING SNO"
## [145] "EXTREME WIND CHILLS" "EXTREME WINDCHILL"
## [147] "EXTREME WINDCHILL TEMPERATURES" "EXTREME/RECORD COLD"
## [149] "EXTREMELY WET" "FALLING SNOW/ICE"
## [151] "FIRST FROST" "FIRST SNOW"
## [153] "FLASH FLOOD" "FLASH FLOOD - HEAVY RAIN"
## [155] "FLASH FLOOD FROM ICE JAMS" "FLASH FLOOD LANDSLIDES"
## [157] "FLASH FLOOD WINDS" "FLASH FLOOD/"
## [159] "FLASH FLOOD/ FLOOD" "FLASH FLOOD/ STREET"
## [161] "FLASH FLOOD/FLOOD" "FLASH FLOOD/HEAVY RAIN"
## [163] "FLASH FLOOD/LANDSLIDE" "FLASH FLOODING"
## [165] "FLASH FLOODING/FLOOD" "FLASH FLOODING/THUNDERSTORM WI"
## [167] "FLASH FLOODS" "FLASH FLOOODING"
## [169] "Flood" "FLOOD"
## [171] "FLOOD & HEAVY RAIN" "FLOOD FLASH"
## [173] "FLOOD FLOOD/FLASH" "FLOOD WATCH/"
## [175] "FLOOD/FLASH" "Flood/Flash Flood"
## [177] "FLOOD/FLASH FLOOD" "FLOOD/FLASH FLOODING"
## [179] "FLOOD/FLASH/FLOOD" "FLOOD/FLASHFLOOD"
## [181] "FLOOD/RAIN/WIND" "FLOOD/RAIN/WINDS"
## [183] "FLOOD/RIVER FLOOD" "Flood/Strong Wind"
## [185] "FLOODING" "FLOODING/HEAVY RAIN"
## [187] "FLOODS" "FOG"
## [189] "FOG AND COLD TEMPERATURES" "FOREST FIRES"
## [191] "Freeze" "FREEZE"
## [193] "Freezing drizzle" "Freezing Drizzle"
## [195] "FREEZING DRIZZLE" "FREEZING DRIZZLE AND FREEZING"
## [197] "Freezing Fog" "FREEZING FOG"
## [199] "Freezing rain" "Freezing Rain"
## [201] "FREEZING RAIN" "FREEZING RAIN AND SLEET"
## [203] "FREEZING RAIN AND SNOW" "FREEZING RAIN SLEET AND"
## [205] "FREEZING RAIN SLEET AND LIGHT" "FREEZING RAIN/SLEET"
## [207] "FREEZING RAIN/SNOW" "Freezing Spray"
## [209] "Frost" "FROST"
## [211] "Frost/Freeze" "FROST/FREEZE"
## [213] "FROST\\FREEZE" "FUNNEL"
## [215] "Funnel Cloud" "FUNNEL CLOUD"
## [217] "FUNNEL CLOUD." "FUNNEL CLOUD/HAIL"
## [219] "FUNNEL CLOUDS" "FUNNELS"
## [221] "Glaze" "GLAZE"
## [223] "GLAZE ICE" "GLAZE/ICE STORM"
## [225] "gradient wind" "Gradient wind"
## [227] "GRADIENT WIND" "GRADIENT WINDS"
## [229] "GRASS FIRES" "GROUND BLIZZARD"
## [231] "GUSTNADO" "GUSTNADO AND"
## [233] "GUSTY LAKE WIND" "GUSTY THUNDERSTORM WIND"
## [235] "GUSTY THUNDERSTORM WINDS" "Gusty Wind"
## [237] "GUSTY WIND" "GUSTY WIND/HAIL"
## [239] "GUSTY WIND/HVY RAIN" "Gusty wind/rain"
## [241] "Gusty winds" "Gusty Winds"
## [243] "GUSTY WINDS" "HAIL"
## [245] "HAIL 0.75" "HAIL 0.88"
## [247] "HAIL 075" "HAIL 088"
## [249] "HAIL 1.00" "HAIL 1.75"
## [251] "HAIL 1.75)" "HAIL 100"
## [253] "HAIL 125" "HAIL 150"
## [255] "HAIL 175" "HAIL 200"
## [257] "HAIL 225" "HAIL 275"
## [259] "HAIL 450" "HAIL 75"
## [261] "HAIL 80" "HAIL 88"
## [263] "HAIL ALOFT" "HAIL DAMAGE"
## [265] "HAIL FLOODING" "HAIL STORM"
## [267] "Hail(0.75)" "HAIL/ICY ROADS"
## [269] "HAIL/WIND" "HAIL/WINDS"
## [271] "HAILSTORM" "HAILSTORMS"
## [273] "HARD FREEZE" "HAZARDOUS SURF"
## [275] "HEAT" "HEAT DROUGHT"
## [277] "Heat Wave" "HEAT WAVE"
## [279] "HEAT WAVE DROUGHT" "HEAT WAVES"
## [281] "HEAT/DROUGHT" "Heatburst"
## [283] "HEAVY LAKE SNOW" "HEAVY MIX"
## [285] "HEAVY PRECIPATATION" "Heavy Precipitation"
## [287] "HEAVY PRECIPITATION" "Heavy rain"
## [289] "Heavy Rain" "HEAVY RAIN"
## [291] "HEAVY RAIN AND FLOOD" "Heavy Rain and Wind"
## [293] "HEAVY RAIN EFFECTS" "HEAVY RAIN/FLOODING"
## [295] "Heavy Rain/High Surf" "HEAVY RAIN/LIGHTNING"
## [297] "HEAVY RAIN/MUDSLIDES/FLOOD" "HEAVY RAIN/SEVERE WEATHER"
## [299] "HEAVY RAIN/SMALL STREAM URBAN" "HEAVY RAIN/SNOW"
## [301] "HEAVY RAIN/URBAN FLOOD" "HEAVY RAIN/WIND"
## [303] "HEAVY RAIN; URBAN FLOOD WINDS;" "HEAVY RAINFALL"
## [305] "HEAVY RAINS" "HEAVY RAINS/FLOODING"
## [307] "HEAVY SEAS" "HEAVY SHOWER"
## [309] "HEAVY SHOWERS" "HEAVY SNOW"
## [311] "HEAVY SNOW-SQUALLS" "HEAVY SNOW FREEZING RAIN"
## [313] "HEAVY SNOW & ICE" "HEAVY SNOW AND"
## [315] "HEAVY SNOW AND HIGH WINDS" "HEAVY SNOW AND ICE"
## [317] "HEAVY SNOW AND ICE STORM" "HEAVY SNOW AND STRONG WINDS"
## [319] "HEAVY SNOW ANDBLOWING SNOW" "Heavy snow shower"
## [321] "HEAVY SNOW SQUALLS" "HEAVY SNOW/BLIZZARD"
## [323] "HEAVY SNOW/BLIZZARD/AVALANCHE" "HEAVY SNOW/BLOWING SNOW"
## [325] "HEAVY SNOW/FREEZING RAIN" "HEAVY SNOW/HIGH"
## [327] "HEAVY SNOW/HIGH WIND" "HEAVY SNOW/HIGH WINDS"
## [329] "HEAVY SNOW/HIGH WINDS & FLOOD" "HEAVY SNOW/HIGH WINDS/FREEZING"
## [331] "HEAVY SNOW/ICE" "HEAVY SNOW/ICE STORM"
## [333] "HEAVY SNOW/SLEET" "HEAVY SNOW/SQUALLS"
## [335] "HEAVY SNOW/WIND" "HEAVY SNOW/WINTER STORM"
## [337] "HEAVY SNOWPACK" "Heavy Surf"
## [339] "HEAVY SURF" "Heavy surf and wind"
## [341] "HEAVY SURF COASTAL FLOODING" "HEAVY SURF/HIGH SURF"
## [343] "HEAVY SWELLS" "HEAVY WET SNOW"
## [345] "HIGH" "HIGH SWELLS"
## [347] "HIGH WINDS" "HIGH SEAS"
## [349] "High Surf" "HIGH SURF"
## [351] "HIGH SURF ADVISORIES" "HIGH SURF ADVISORY"
## [353] "HIGH SWELLS" "HIGH TEMPERATURE RECORD"
## [355] "HIGH TIDES" "HIGH WATER"
## [357] "HIGH WAVES" "High Wind"
## [359] "HIGH WIND" "HIGH WIND (G40)"
## [361] "HIGH WIND 48" "HIGH WIND 63"
## [363] "HIGH WIND 70" "HIGH WIND AND HEAVY SNOW"
## [365] "HIGH WIND AND HIGH TIDES" "HIGH WIND AND SEAS"
## [367] "HIGH WIND DAMAGE" "HIGH WIND/ BLIZZARD"
## [369] "HIGH WIND/BLIZZARD" "HIGH WIND/BLIZZARD/FREEZING RA"
## [371] "HIGH WIND/HEAVY SNOW" "HIGH WIND/LOW WIND CHILL"
## [373] "HIGH WIND/SEAS" "HIGH WIND/WIND CHILL"
## [375] "HIGH WIND/WIND CHILL/BLIZZARD" "HIGH WINDS"
## [377] "HIGH WINDS 55" "HIGH WINDS 57"
## [379] "HIGH WINDS 58" "HIGH WINDS 63"
## [381] "HIGH WINDS 66" "HIGH WINDS 67"
## [383] "HIGH WINDS 73" "HIGH WINDS 76"
## [385] "HIGH WINDS 80" "HIGH WINDS 82"
## [387] "HIGH WINDS AND WIND CHILL" "HIGH WINDS DUST STORM"
## [389] "HIGH WINDS HEAVY RAINS" "HIGH WINDS/"
## [391] "HIGH WINDS/COASTAL FLOOD" "HIGH WINDS/COLD"
## [393] "HIGH WINDS/FLOODING" "HIGH WINDS/HEAVY RAIN"
## [395] "HIGH WINDS/SNOW" "HIGHWAY FLOODING"
## [397] "Hot and Dry" "HOT PATTERN"
## [399] "HOT SPELL" "HOT WEATHER"
## [401] "HOT/DRY PATTERN" "HURRICANE"
## [403] "HURRICANE-GENERATED SWELLS" "Hurricane Edouard"
## [405] "HURRICANE EMILY" "HURRICANE ERIN"
## [407] "HURRICANE FELIX" "HURRICANE GORDON"
## [409] "HURRICANE OPAL" "HURRICANE OPAL/HIGH WINDS"
## [411] "HURRICANE/TYPHOON" "HVY RAIN"
## [413] "HYPERTHERMIA/EXPOSURE" "HYPOTHERMIA"
## [415] "Hypothermia/Exposure" "HYPOTHERMIA/EXPOSURE"
## [417] "ICE" "ICE AND SNOW"
## [419] "ICE FLOES" "Ice Fog"
## [421] "ICE JAM" "Ice jam flood (minor"
## [423] "ICE JAM FLOODING" "ICE ON ROAD"
## [425] "ICE PELLETS" "ICE ROADS"
## [427] "ICE STORM" "ICE STORM AND SNOW"
## [429] "ICE STORM/FLASH FLOOD" "Ice/Snow"
## [431] "ICE/SNOW" "ICE/STRONG WINDS"
## [433] "Icestorm/Blizzard" "Icy Roads"
## [435] "ICY ROADS" "LACK OF SNOW"
## [437] "LAKE-EFFECT SNOW" "Lake Effect Snow"
## [439] "LAKE EFFECT SNOW" "LAKE FLOOD"
## [441] "LAKESHORE FLOOD" "LANDSLIDE"
## [443] "LANDSLIDE/URBAN FLOOD" "LANDSLIDES"
## [445] "Landslump" "LANDSLUMP"
## [447] "LANDSPOUT" "LARGE WALL CLOUD"
## [449] "Late-season Snowfall" "LATE FREEZE"
## [451] "LATE SEASON HAIL" "LATE SEASON SNOW"
## [453] "Late Season Snowfall" "LATE SNOW"
## [455] "LIGHT FREEZING RAIN" "Light snow"
## [457] "Light Snow" "LIGHT SNOW"
## [459] "LIGHT SNOW AND SLEET" "Light Snow/Flurries"
## [461] "LIGHT SNOW/FREEZING PRECIP" "Light Snowfall"
## [463] "LIGHTING" "LIGHTNING"
## [465] "LIGHTNING WAUSEON" "LIGHTNING AND HEAVY RAIN"
## [467] "LIGHTNING AND THUNDERSTORM WIN" "LIGHTNING AND WINDS"
## [469] "LIGHTNING DAMAGE" "LIGHTNING FIRE"
## [471] "LIGHTNING INJURY" "LIGHTNING THUNDERSTORM WINDS"
## [473] "LIGHTNING THUNDERSTORM WINDSS" "LIGHTNING."
## [475] "LIGHTNING/HEAVY RAIN" "LIGNTNING"
## [477] "LOCAL FLASH FLOOD" "LOCAL FLOOD"
## [479] "LOCALLY HEAVY RAIN" "LOW TEMPERATURE"
## [481] "LOW TEMPERATURE RECORD" "LOW WIND CHILL"
## [483] "MAJOR FLOOD" "Marine Accident"
## [485] "MARINE HAIL" "MARINE HIGH WIND"
## [487] "MARINE MISHAP" "MARINE STRONG WIND"
## [489] "MARINE THUNDERSTORM WIND" "MARINE TSTM WIND"
## [491] "Metro Storm, May 26" "Microburst"
## [493] "MICROBURST" "MICROBURST WINDS"
## [495] "Mild and Dry Pattern" "MILD PATTERN"
## [497] "MILD/DRY PATTERN" "MINOR FLOOD"
## [499] "Minor Flooding" "MINOR FLOODING"
## [501] "MIXED PRECIP" "Mixed Precipitation"
## [503] "MIXED PRECIPITATION" "MODERATE SNOW"
## [505] "MODERATE SNOWFALL" "MONTHLY PRECIPITATION"
## [507] "Monthly Rainfall" "MONTHLY RAINFALL"
## [509] "Monthly Snowfall" "MONTHLY SNOWFALL"
## [511] "MONTHLY TEMPERATURE" "Mountain Snows"
## [513] "MUD SLIDE" "MUD SLIDES"
## [515] "MUD SLIDES URBAN FLOODING" "MUD/ROCK SLIDE"
## [517] "Mudslide" "MUDSLIDE"
## [519] "MUDSLIDE/LANDSLIDE" "Mudslides"
## [521] "MUDSLIDES" "NEAR RECORD SNOW"
## [523] "No Severe Weather" "NON-SEVERE WIND DAMAGE"
## [525] "NON-TSTM WIND" "NON SEVERE HAIL"
## [527] "NON TSTM WIND" "NONE"
## [529] "NORMAL PRECIPITATION" "NORTHERN LIGHTS"
## [531] "Other" "OTHER"
## [533] "PATCHY DENSE FOG" "PATCHY ICE"
## [535] "Prolong Cold" "PROLONG COLD"
## [537] "PROLONG COLD/SNOW" "PROLONG WARMTH"
## [539] "PROLONGED RAIN" "RAIN"
## [541] "RAIN (HEAVY)" "RAIN AND WIND"
## [543] "Rain Damage" "RAIN/SNOW"
## [545] "RAIN/WIND" "RAINSTORM"
## [547] "RAPIDLY RISING WATER" "RECORD COLD"
## [549] "Record Cold" "RECORD COLD"
## [551] "RECORD COLD AND HIGH WIND" "RECORD COLD/FROST"
## [553] "RECORD COOL" "Record dry month"
## [555] "RECORD DRYNESS" "Record Heat"
## [557] "RECORD HEAT" "RECORD HEAT WAVE"
## [559] "Record High" "RECORD HIGH"
## [561] "RECORD HIGH TEMPERATURE" "RECORD HIGH TEMPERATURES"
## [563] "RECORD LOW" "RECORD LOW RAINFALL"
## [565] "Record May Snow" "RECORD PRECIPITATION"
## [567] "RECORD RAINFALL" "RECORD SNOW"
## [569] "RECORD SNOW/COLD" "RECORD SNOWFALL"
## [571] "Record temperature" "RECORD TEMPERATURE"
## [573] "Record Temperatures" "RECORD TEMPERATURES"
## [575] "RECORD WARM" "RECORD WARM TEMPS."
## [577] "Record Warmth" "RECORD WARMTH"
## [579] "Record Winter Snow" "RECORD/EXCESSIVE HEAT"
## [581] "RECORD/EXCESSIVE RAINFALL" "RED FLAG CRITERIA"
## [583] "RED FLAG FIRE WX" "REMNANTS OF FLOYD"
## [585] "RIP CURRENT" "RIP CURRENTS"
## [587] "RIP CURRENTS HEAVY SURF" "RIP CURRENTS/HEAVY SURF"
## [589] "RIVER AND STREAM FLOOD" "RIVER FLOOD"
## [591] "River Flooding" "RIVER FLOODING"
## [593] "ROCK SLIDE" "ROGUE WAVE"
## [595] "ROTATING WALL CLOUD" "ROUGH SEAS"
## [597] "ROUGH SURF" "RURAL FLOOD"
## [599] "Saharan Dust" "SAHARAN DUST"
## [601] "Seasonal Snowfall" "SEICHE"
## [603] "SEVERE COLD" "SEVERE THUNDERSTORM"
## [605] "SEVERE THUNDERSTORM WINDS" "SEVERE THUNDERSTORMS"
## [607] "SEVERE TURBULENCE" "SLEET"
## [609] "SLEET & FREEZING RAIN" "SLEET STORM"
## [611] "SLEET/FREEZING RAIN" "SLEET/ICE STORM"
## [613] "SLEET/RAIN/SNOW" "SLEET/SNOW"
## [615] "small hail" "Small Hail"
## [617] "SMALL HAIL" "SMALL STREAM"
## [619] "SMALL STREAM AND" "SMALL STREAM AND URBAN FLOOD"
## [621] "SMALL STREAM AND URBAN FLOODIN" "SMALL STREAM FLOOD"
## [623] "SMALL STREAM FLOODING" "SMALL STREAM URBAN FLOOD"
## [625] "SMALL STREAM/URBAN FLOOD" "Sml Stream Fld"
## [627] "SMOKE" "Snow"
## [629] "SNOW" "SNOW- HIGH WIND- WIND CHILL"
## [631] "Snow Accumulation" "SNOW ACCUMULATION"
## [633] "SNOW ADVISORY" "SNOW AND COLD"
## [635] "SNOW AND HEAVY SNOW" "Snow and Ice"
## [637] "SNOW AND ICE" "SNOW AND ICE STORM"
## [639] "Snow and sleet" "SNOW AND SLEET"
## [641] "SNOW AND WIND" "SNOW DROUGHT"
## [643] "SNOW FREEZING RAIN" "SNOW SHOWERS"
## [645] "SNOW SLEET" "SNOW SQUALL"
## [647] "Snow squalls" "Snow Squalls"
## [649] "SNOW SQUALLS" "SNOW/ BITTER COLD"
## [651] "SNOW/ ICE" "SNOW/BLOWING SNOW"
## [653] "SNOW/COLD" "SNOW/FREEZING RAIN"
## [655] "SNOW/HEAVY SNOW" "SNOW/HIGH WINDS"
## [657] "SNOW/ICE" "SNOW/ICE STORM"
## [659] "SNOW/RAIN" "SNOW/RAIN/SLEET"
## [661] "SNOW/SLEET" "SNOW/SLEET/FREEZING RAIN"
## [663] "SNOW/SLEET/RAIN" "SNOW\\COLD"
## [665] "SNOWFALL RECORD" "SNOWMELT FLOODING"
## [667] "SNOWSTORM" "SOUTHEAST"
## [669] "STORM FORCE WINDS" "STORM SURGE"
## [671] "STORM SURGE/TIDE" "STREAM FLOODING"
## [673] "STREET FLOOD" "STREET FLOODING"
## [675] "Strong Wind" "STRONG WIND"
## [677] "STRONG WIND GUST" "Strong winds"
## [679] "Strong Winds" "STRONG WINDS"
## [681] "Summary August 10" "Summary August 11"
## [683] "Summary August 17" "Summary August 2-3"
## [685] "Summary August 21" "Summary August 28"
## [687] "Summary August 4" "Summary August 7"
## [689] "Summary August 9" "Summary Jan 17"
## [691] "Summary July 23-24" "Summary June 18-19"
## [693] "Summary June 5-6" "Summary June 6"
## [695] "Summary of April 12" "Summary of April 13"
## [697] "Summary of April 21" "Summary of April 27"
## [699] "Summary of April 3rd" "Summary of August 1"
## [701] "Summary of July 11" "Summary of July 2"
## [703] "Summary of July 22" "Summary of July 26"
## [705] "Summary of July 29" "Summary of July 3"
## [707] "Summary of June 10" "Summary of June 11"
## [709] "Summary of June 12" "Summary of June 13"
## [711] "Summary of June 15" "Summary of June 16"
## [713] "Summary of June 18" "Summary of June 23"
## [715] "Summary of June 24" "Summary of June 3"
## [717] "Summary of June 30" "Summary of June 4"
## [719] "Summary of June 6" "Summary of March 14"
## [721] "Summary of March 23" "Summary of March 24"
## [723] "SUMMARY OF MARCH 24-25" "SUMMARY OF MARCH 27"
## [725] "SUMMARY OF MARCH 29" "Summary of May 10"
## [727] "Summary of May 13" "Summary of May 14"
## [729] "Summary of May 22" "Summary of May 22 am"
## [731] "Summary of May 22 pm" "Summary of May 26 am"
## [733] "Summary of May 26 pm" "Summary of May 31 am"
## [735] "Summary of May 31 pm" "Summary of May 9-10"
## [737] "Summary Sept. 25-26" "Summary September 20"
## [739] "Summary September 23" "Summary September 3"
## [741] "Summary September 4" "Summary: Nov. 16"
## [743] "Summary: Nov. 6-7" "Summary: Oct. 20-21"
## [745] "Summary: October 31" "Summary: Sept. 18"
## [747] "Temperature record" "THUDERSTORM WINDS"
## [749] "THUNDEERSTORM WINDS" "THUNDERESTORM WINDS"
## [751] "THUNDERSNOW" "Thundersnow shower"
## [753] "THUNDERSTORM" "THUNDERSTORM WINDS"
## [755] "THUNDERSTORM DAMAGE" "THUNDERSTORM DAMAGE TO"
## [757] "THUNDERSTORM HAIL" "THUNDERSTORM W INDS"
## [759] "Thunderstorm Wind" "THUNDERSTORM WIND"
## [761] "THUNDERSTORM WIND (G40)" "THUNDERSTORM WIND 50"
## [763] "THUNDERSTORM WIND 52" "THUNDERSTORM WIND 56"
## [765] "THUNDERSTORM WIND 59" "THUNDERSTORM WIND 59 MPH"
## [767] "THUNDERSTORM WIND 59 MPH." "THUNDERSTORM WIND 60 MPH"
## [769] "THUNDERSTORM WIND 65 MPH" "THUNDERSTORM WIND 65MPH"
## [771] "THUNDERSTORM WIND 69" "THUNDERSTORM WIND 98 MPH"
## [773] "THUNDERSTORM WIND G50" "THUNDERSTORM WIND G51"
## [775] "THUNDERSTORM WIND G52" "THUNDERSTORM WIND G55"
## [777] "THUNDERSTORM WIND G60" "THUNDERSTORM WIND G61"
## [779] "THUNDERSTORM WIND TREES" "THUNDERSTORM WIND."
## [781] "THUNDERSTORM WIND/ TREE" "THUNDERSTORM WIND/ TREES"
## [783] "THUNDERSTORM WIND/AWNING" "THUNDERSTORM WIND/HAIL"
## [785] "THUNDERSTORM WIND/LIGHTNING" "THUNDERSTORM WINDS"
## [787] "THUNDERSTORM WINDS LE CEN" "THUNDERSTORM WINDS 13"
## [789] "THUNDERSTORM WINDS 2" "THUNDERSTORM WINDS 50"
## [791] "THUNDERSTORM WINDS 52" "THUNDERSTORM WINDS 53"
## [793] "THUNDERSTORM WINDS 60" "THUNDERSTORM WINDS 61"
## [795] "THUNDERSTORM WINDS 62" "THUNDERSTORM WINDS 63 MPH"
## [797] "THUNDERSTORM WINDS AND" "THUNDERSTORM WINDS FUNNEL CLOU"
## [799] "THUNDERSTORM WINDS G" "THUNDERSTORM WINDS G60"
## [801] "THUNDERSTORM WINDS HAIL" "THUNDERSTORM WINDS HEAVY RAIN"
## [803] "THUNDERSTORM WINDS LIGHTNING" "THUNDERSTORM WINDS SMALL STREA"
## [805] "THUNDERSTORM WINDS URBAN FLOOD" "THUNDERSTORM WINDS."
## [807] "THUNDERSTORM WINDS/ FLOOD" "THUNDERSTORM WINDS/ HAIL"
## [809] "THUNDERSTORM WINDS/FLASH FLOOD" "THUNDERSTORM WINDS/FLOODING"
## [811] "THUNDERSTORM WINDS/FUNNEL CLOU" "THUNDERSTORM WINDS/HAIL"
## [813] "THUNDERSTORM WINDS/HEAVY RAIN" "THUNDERSTORM WINDS53"
## [815] "THUNDERSTORM WINDSHAIL" "THUNDERSTORM WINDSS"
## [817] "THUNDERSTORM WINS" "THUNDERSTORMS"
## [819] "THUNDERSTORMS WIND" "THUNDERSTORMS WINDS"
## [821] "THUNDERSTORMW" "THUNDERSTORMW 50"
## [823] "THUNDERSTORMW WINDS" "THUNDERSTORMWINDS"
## [825] "THUNDERSTROM WIND" "THUNDERSTROM WINDS"
## [827] "THUNDERTORM WINDS" "THUNDERTSORM WIND"
## [829] "THUNDESTORM WINDS" "THUNERSTORM WINDS"
## [831] "TIDAL FLOOD" "Tidal Flooding"
## [833] "TIDAL FLOODING" "TORNADO"
## [835] "TORNADO DEBRIS" "TORNADO F0"
## [837] "TORNADO F1" "TORNADO F2"
## [839] "TORNADO F3" "TORNADO/WATERSPOUT"
## [841] "TORNADOES" "TORNADOES, TSTM WIND, HAIL"
## [843] "TORNADOS" "TORNDAO"
## [845] "TORRENTIAL RAIN" "Torrential Rainfall"
## [847] "TROPICAL DEPRESSION" "TROPICAL STORM"
## [849] "TROPICAL STORM ALBERTO" "TROPICAL STORM DEAN"
## [851] "TROPICAL STORM GORDON" "TROPICAL STORM JERRY"
## [853] "TSTM" "TSTM HEAVY RAIN"
## [855] "Tstm Wind" "TSTM WIND"
## [857] "TSTM WIND (G45)" "TSTM WIND (41)"
## [859] "TSTM WIND (G35)" "TSTM WIND (G40)"
## [861] "TSTM WIND (G45)" "TSTM WIND 40"
## [863] "TSTM WIND 45" "TSTM WIND 50"
## [865] "TSTM WIND 51" "TSTM WIND 52"
## [867] "TSTM WIND 55" "TSTM WIND 65)"
## [869] "TSTM WIND AND LIGHTNING" "TSTM WIND DAMAGE"
## [871] "TSTM WIND G45" "TSTM WIND G58"
## [873] "TSTM WIND/HAIL" "TSTM WINDS"
## [875] "TSTM WND" "TSTMW"
## [877] "TSUNAMI" "TUNDERSTORM WIND"
## [879] "TYPHOON" "Unseasonable Cold"
## [881] "UNSEASONABLY COLD" "UNSEASONABLY COOL"
## [883] "UNSEASONABLY COOL & WET" "UNSEASONABLY DRY"
## [885] "UNSEASONABLY HOT" "UNSEASONABLY WARM"
## [887] "UNSEASONABLY WARM & WET" "UNSEASONABLY WARM AND DRY"
## [889] "UNSEASONABLY WARM YEAR" "UNSEASONABLY WARM/WET"
## [891] "UNSEASONABLY WET" "UNSEASONAL LOW TEMP"
## [893] "UNSEASONAL RAIN" "UNUSUAL WARMTH"
## [895] "UNUSUAL/RECORD WARMTH" "UNUSUALLY COLD"
## [897] "UNUSUALLY LATE SNOW" "UNUSUALLY WARM"
## [899] "URBAN AND SMALL" "URBAN AND SMALL STREAM"
## [901] "URBAN AND SMALL STREAM FLOOD" "URBAN AND SMALL STREAM FLOODIN"
## [903] "Urban flood" "Urban Flood"
## [905] "URBAN FLOOD" "URBAN FLOOD LANDSLIDE"
## [907] "Urban Flooding" "URBAN FLOODING"
## [909] "URBAN FLOODS" "URBAN SMALL"
## [911] "URBAN SMALL STREAM FLOOD" "URBAN/SMALL"
## [913] "URBAN/SMALL FLOODING" "URBAN/SMALL STREAM"
## [915] "URBAN/SMALL STREAM FLOOD" "URBAN/SMALL STREAM FLOOD"
## [917] "URBAN/SMALL STREAM FLOODING" "URBAN/SMALL STRM FLDG"
## [919] "URBAN/SML STREAM FLD" "URBAN/SML STREAM FLDG"
## [921] "URBAN/STREET FLOODING" "VERY DRY"
## [923] "VERY WARM" "VOG"
## [925] "Volcanic Ash" "VOLCANIC ASH"
## [927] "Volcanic Ash Plume" "VOLCANIC ASHFALL"
## [929] "VOLCANIC ERUPTION" "WAKE LOW WIND"
## [931] "WALL CLOUD" "WALL CLOUD/FUNNEL CLOUD"
## [933] "WARM DRY CONDITIONS" "WARM WEATHER"
## [935] "WATER SPOUT" "WATERSPOUT"
## [937] "WATERSPOUT-" "WATERSPOUT-TORNADO"
## [939] "WATERSPOUT FUNNEL CLOUD" "WATERSPOUT TORNADO"
## [941] "WATERSPOUT/" "WATERSPOUT/ TORNADO"
## [943] "WATERSPOUT/TORNADO" "WATERSPOUTS"
## [945] "WAYTERSPOUT" "wet micoburst"
## [947] "WET MICROBURST" "Wet Month"
## [949] "WET SNOW" "WET WEATHER"
## [951] "Wet Year" "Whirlwind"
## [953] "WHIRLWIND" "WILD FIRES"
## [955] "WILD/FOREST FIRE" "WILD/FOREST FIRES"
## [957] "WILDFIRE" "WILDFIRES"
## [959] "Wind" "WIND"
## [961] "WIND ADVISORY" "WIND AND WAVE"
## [963] "WIND CHILL" "WIND CHILL/HIGH WIND"
## [965] "Wind Damage" "WIND DAMAGE"
## [967] "WIND GUSTS" "WIND STORM"
## [969] "WIND/HAIL" "WINDS"
## [971] "WINTER MIX" "WINTER STORM"
## [973] "WINTER STORM HIGH WINDS" "WINTER STORM/HIGH WIND"
## [975] "WINTER STORM/HIGH WINDS" "WINTER STORMS"
## [977] "Winter Weather" "WINTER WEATHER"
## [979] "WINTER WEATHER MIX" "WINTER WEATHER/MIX"
## [981] "WINTERY MIX" "Wintry mix"
## [983] "Wintry Mix" "WINTRY MIX"
## [985] "WND"
We can remove special characters, collapse double spaces and convert the values to upper case to make the column (EVTYPE) more uniform.
# Upper-case, strip punctuation, then squeeze runs of two or more spaces to one
weather.data.selected$EVTYPE <- gsub(" {2,}", " ", (gsub("[[:punct:]]","",
toupper(weather.data.selected$EVTYPE))))
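A minimal sketch of this clean-up on a few sample strings, assuming the intent is to squeeze runs of two or more spaces into one. The helper normalize_evtype() is hypothetical. The sample strings are modelled on values from the listing above.

```r
# Hypothetical helper that applies the three clean-up steps in order
normalize_evtype <- function(x) {
  x <- toupper(x)                    # unify case
  x <- gsub("[[:punct:]]", "", x)    # drop punctuation such as "/" and "."
  gsub(" {2,}", " ", x)              # squeeze runs of spaces to a single space
}

raw <- c("Coastal Flood",
         "COASTAL/TIDAL FLOOD",
         paste0("HIGH", strrep(" ", 2), "WINDS"),  # two spaces between words
         "FUNNEL CLOUD.")

cleaned <- normalize_evtype(raw)
print(cleaned)
```

Note that deleting "/" fuses the neighbouring words, so "COASTAL/TIDAL FLOOD" becomes "COASTALTIDAL FLOOD". Any pattern applied afterwards has to match the fused spelling.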
According to the NOAA directive (NWSPD 10-16), the recognized weather events are:
    Astronomical Low Tide
    Avalanche
    Blizzard
    Coastal Flood
    Cold/Wind Chill
    Debris Flow
    Dense Fog
    Dense Smoke
    Drought
    Dust Devil
    Dust Storm
    Excessive Heat
    Extreme Cold/Wind Chill
    Flash Flood
    Flood
    Frost/Freeze
    Funnel Cloud
    Freezing Fog
    Hail
    Heat
    Heavy Rain
    Heavy Snow
    High Surf
    High Wind
    Hurricane (Typhoon)
    Ice Storm
    Lake-Effect Snow
    Lakeshore Flood
    Lightning
    Marine Hail
    Marine High Wind
    Marine Strong Wind
    Marine Thunderstorm Wind
    Rip Current
    Seiche
    Sleet
    Storm Surge/Tide
    Strong Wind
    Thunderstorm Wind
    Tornado
    Tropical Depression
    Tropical Storm
    Tsunami
    Volcanic Ash
    Waterspout
    Wildfire
    Winter Storm
    Winter Weather
These events are stored in a character vector ‘Events’.
Events <- c(
"Astronomical Low Tide",
"Avalanche",
"Blizzard",
"Coastal Flood",
"Cold/Wind Chill",
"Debris Flow",
"Dense Fog",
"Dense Smoke",
"Drought",
"Dust Devil",
"Dust Storm",
"Excessive Heat",
"Extreme Cold/Wind Chill",
"Flash Flood",
"Flood",
"Frost/Freeze",
"Funnel Cloud",
"Freezing Fog",
"Hail",
"Heat",
"Heavy Rain",
"Heavy Snow",
"High Surf",
"High Wind",
"Hurricane (Typhoon)",
"Ice Storm",
"Lake-Effect Snow",
"Lakeshore Flood",
"Lightning",
"Marine Hail",
"Marine High Wind",
"Marine Strong Wind",
"Marine Thunderstorm Wind",
"Rip Current",
"Seiche",
"Sleet",
"Storm Surge/Tide",
"Strong Wind",
"Thunderstorm Wind",
"Tornado",
"Tropical Depression",
"Tropical Storm",
"Tsunami",
"Volcanic Ash",
"Waterspout",
"Wildfire",
"Winter Storm",
"Winter Weather")
Using pattern matching, the values stored in EVTYPE are classified into the events listed above and stored in a new column EVENT_TYPE. Any value that cannot be matched to an event in the list is classified as ‘Other’
weather.data.selected$EVENT_TYPE <-
ifelse(grepl("ASTRONOMICAL LOW TIDE", weather.data.selected$EVTYPE), "Astronomical Low Tide",
ifelse(grepl("AVALANCHE|AVALANCE", weather.data.selected$EVTYPE), "Avalanche",
ifelse(grepl("BLIZZARD", weather.data.selected$EVTYPE), "Blizzard",
ifelse(grepl("COASTAL FLOOD|COASTALFLOOD|COASTALTIDAL FLOOD", weather.data.selected$EVTYPE), "Coastal Flood",
ifelse(grepl("DENSE FOG", weather.data.selected$EVTYPE), "Dense Fog",
ifelse(grepl("DENSE SMOKE", weather.data.selected$EVTYPE), "Dense Smoke",
ifelse(grepl("DROUGHT|ABNORMALLY DRY|DRY|DRIEST MONTH", weather.data.selected$EVTYPE), "Drought",
ifelse(grepl("DUST DEVIL|DUST DEVEL", weather.data.selected$EVTYPE), "Dust Devil",
ifelse(grepl("DUST STORM", weather.data.selected$EVTYPE), "Dust Storm",
ifelse(grepl("LANDSLIDE|MUD SLIDE|ROCK SLIDE|MUDSLIDE|DAM", weather.data.selected$EVTYPE), "Debris Flow",
ifelse(grepl("EXCESSIVE HEAT", weather.data.selected$EVTYPE), "Excessive Heat",
ifelse(grepl("EXCESSIVE COLD|EXTREME COLD|EXTREME WINDCHILL|EXTREMERECORD COLD|THERMIA|RECORD COOL", weather.data.selected$EVTYPE), "Extreme Cold/Wind Chill",
weather.data.selected$EVTYPE))))))))))))
weather.data.selected$EVENT_TYPE2 <-
ifelse(grepl("COLD|WIND CHILL|COLD AIR|LOW TEMP|UNSEASONABLY COOL|COOL SPELL", weather.data.selected$EVENT_TYPE), "Cold/Wind Chill",
ifelse(grepl("FLASH FLOOD|FLOOD FLASH|FLOODFLASH|FLASH FLOOO", weather.data.selected$EVENT_TYPE), "Flash Flood",
ifelse(grepl("LAKESHORE FLOOD|LAKE FLOOD", weather.data.selected$EVENT_TYPE), "Lakeshore Flood",
ifelse(grepl("FLOOD|HIGH WATER|RISING WATER|STREAM FLD|STRM FLD", weather.data.selected$EVENT_TYPE), "Flood",
ifelse(grepl("FREEZING FOG", weather.data.selected$EVENT_TYPE), "Freezing Fog",
ifelse(grepl("FREEZ|FROST", weather.data.selected$EVENT_TYPE), "Frost/Freeze",
ifelse(grepl("FUNNEL CLOUD", weather.data.selected$EVENT_TYPE), "Funnel Cloud",
ifelse(grepl("MARINE HAIL", weather.data.selected$EVENT_TYPE), "Marine Hail",
ifelse(grepl("HAIL", weather.data.selected$EVENT_TYPE), "Hail",
ifelse(grepl("HEAT|HOT|HIGH TEMP|UNSEASONABLY HOT", weather.data.selected$EVENT_TYPE), "Heat",
ifelse(grepl("HEAVY RAIN|HEAVY PRECI|HEAVY SHOWER|EXCESSIVE RAIN|EXCESSIVE PRECIP|HVY RAIN|RAIN HEAVY|TORRENTIAL|RECORD RAIN|RECORD PRECI|RAINSTORM", weather.data.selected$EVENT_TYPE), "Heavy Rain",
ifelse(grepl("HEAVY SNOW|HEAVY WET SNOW", weather.data.selected$EVENT_TYPE), "Heavy Snow",
ifelse(grepl("HIGH SURF|HEAVY SURF|ROUGH SURF|HAZARDOUS SURF|HIGH WAVES|ROGUE WAVE|BEACH EROS|HIGH SWELLS|COASTAL SURGE", weather.data.selected$EVENT_TYPE), "High Surf",
ifelse(grepl("MARINE HIGH WIND", weather.data.selected$EVENT_TYPE), "Marine High Wind",
ifelse(grepl("HIGH WIND", weather.data.selected$EVENT_TYPE), "High Wind",
ifelse(grepl("HURRICANE|TYPHOON", weather.data.selected$EVENT_TYPE), "Hurricane (Typhoon)",
ifelse(grepl("ICE", weather.data.selected$EVENT_TYPE), "Ice Storm",
ifelse(grepl("LAKEEFFECT SNOW|LAKE EFFECT SNOW", weather.data.selected$EVENT_TYPE), "Lake-Effect Snow",
ifelse(grepl("LIGHTNING|LIGHTING|LIGNTNING", weather.data.selected$EVENT_TYPE), "Lightning",
ifelse(grepl("MARINE STRONG WIND", weather.data.selected$EVENT_TYPE), "Marine Strong Wind",
ifelse(grepl("MARINE THUNDERSTORM WIND", weather.data.selected$EVENT_TYPE), "Marine Thunderstorm Wind",
ifelse(grepl("RIP CURRENT", weather.data.selected$EVENT_TYPE), "Rip Current",
weather.data.selected$EVENT_TYPE))))))))))))))))))))))
weather.data.selected$EVENT_TYPE <-
ifelse(grepl("SEICHE", weather.data.selected$EVENT_TYPE2), "Seiche",
ifelse(grepl("SLEET", weather.data.selected$EVENT_TYPE2), "Sleet",
ifelse(grepl("STORM SURGE|TIDE", weather.data.selected$EVENT_TYPE2), "Storm Surge/Tide",
ifelse(grepl("STRONG WIND|WIND|WND|MICROBURST", weather.data.selected$EVENT_TYPE2), "Strong Wind",
ifelse(grepl("THUNDERSTORM|TSTM|THUNDERSTROM|THUNDERTORM|THUNDEERSTORM|THUNDERESTORM|THUNDERTSORM|THUNERSTORM|THUDERSTORM|THUNDESTORM|TUNDERSTORM", weather.data.selected$EVENT_TYPE2), "Thunderstorm Wind",
ifelse(grepl("TORNADO|TORNDAO|FUNNEL", weather.data.selected$EVENT_TYPE2), "Tornado",
ifelse(grepl("TROPICAL DEPRESSION", weather.data.selected$EVENT_TYPE2), "Tropical Depression",
ifelse(grepl("TROPICAL STORM", weather.data.selected$EVENT_TYPE2), "Tropical Storm",
ifelse(grepl("TSUNAMI", weather.data.selected$EVENT_TYPE2), "Tsunami",
ifelse(grepl("VOLCANIC", weather.data.selected$EVENT_TYPE2), "Volcanic Ash",
ifelse(grepl("WATERSPOUT|WATER SPOUT|WAYTERSPOUT", weather.data.selected$EVENT_TYPE2), "Waterspout",
ifelse(grepl("WILD FIRE|WILDFIRE|FOREST FIRE|BRUSH FIRE|GRASS FIRE", weather.data.selected$EVENT_TYPE2), "Wildfire",
ifelse(grepl("WINTER STORM", weather.data.selected$EVENT_TYPE2), "Winter Storm",
ifelse(grepl("WINTER|SNOW|WINTRY MIX|ICY ROADS", weather.data.selected$EVENT_TYPE2), "Winter Weather",
weather.data.selected$EVENT_TYPE2))))))))))))))
weather.data.selected$EVENT_TYPE <-
ifelse(weather.data.selected$EVENT_TYPE %in% Events,
weather.data.selected$EVENT_TYPE, "Other")
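The nested ifelse() calls above follow a first-match-wins rule: each value takes the label of the first pattern it matches, so more specific patterns (such as MARINE HAIL) have to be tested before more general ones (such as HAIL). A minimal sketch of the same idea with a small rule table is shown below. The names `rules` and `classify_event` and the shortened patterns are illustrative assumptions, not the code used in this analysis.

```r
# Hypothetical rule table: names are target events, values are regex patterns.
# Order matters, because the first matching rule wins (as in nested ifelse()).
rules <- c(
  "Flash Flood" = "FLASH FLOOD|FLOOD FLASH",
  "Flood"       = "FLOOD|HIGH WATER",
  "Marine Hail" = "MARINE HAIL",
  "Hail"        = "HAIL"
)

classify_event <- function(evtype, rules) {
  result <- rep("Other", length(evtype))    # default label
  unmatched <- rep(TRUE, length(evtype))    # rows not yet classified
  for (event in names(rules)) {
    hit <- unmatched & grepl(rules[[event]], evtype)
    result[hit] <- event
    unmatched[hit] <- FALSE
  }
  result
}

# Made-up cleaned EVTYPE values
toy_types <- c("FLASH FLOOD", "RIVER FLOOD", "MARINE HAIL", "SMALL HAIL", "FOG")
classified <- classify_event(toy_types, rules)
classified
```

Keeping the patterns in one table makes the matching order easy to inspect and reorder.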
Calculate total Fatalities per event and store in an array FATAL_Sum
Calculate total Injuries per event and store in another array INJURY_Sum
Total of Fatalities and Injuries will give us the total health harm
FATAL_Sum <- with(weather.data.selected, tapply(FATALITIES, EVENT_TYPE, sum))
INJURY_Sum <- with(weather.data.selected, tapply(INJURIES, EVENT_TYPE, sum))
sort(FATAL_Sum + INJURY_Sum)
## Astronomical Low Tide Dense Smoke Freezing Fog
## 0 0 0
## Lake-Effect Snow Lakeshore Flood Marine Hail
## 0 0 0
## Seiche Tropical Depression Volcanic Ash
## 0 0 0
## Marine High Wind Sleet Funnel Cloud
## 2 2 3
## Coastal Flood Waterspout Marine Strong Wind
## 13 32 36
## Marine Thunderstorm Wind Thunderstorm Wind Dust Devil
## 36 40 45
## Frost/Freeze Storm Surge/Tide Drought
## 54 67 86
## Debris Flow Tsunami Cold/Wind Chill
## 106 162 236
## Dense Fog Heavy Rain Avalanche
## 360 380 396
## High Surf Tropical Storm Dust Storm
## 416 449 462
## Extreme Cold/Wind Chill Winter Weather Blizzard
## 573 816 906
## Rip Current Other Heavy Snow
## 1101 1133 1163
## Hurricane (Typhoon) Hail Winter Storm
## 1466 1512 1554
## Wildfire High Wind Ice Storm
## 1698 1813 2256
## Flash Flood Heat Lightning
## 2837 3896 6049
## Flood Excessive Heat Strong Wind
## 7390 8445 10639
## Tornado
## 97043
The above shows that Tornado is the event most harmful to public health, with 97,043 deaths and injuries in total
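To make the aggregation step concrete, here is a minimal sketch of the same tapply() pattern on a made-up data frame. The `toy` data and its values are invented for illustration only.

```r
# Toy data (made up) standing in for weather.data.selected
toy <- data.frame(
  EVENT_TYPE = c("Tornado", "Flood", "Tornado", "Heat", "Flood"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 0, 5, 1, 2)
)

# Sum each measure within every event type
fatal  <- with(toy, tapply(FATALITIES, EVENT_TYPE, sum))
injury <- with(toy, tapply(INJURIES, EVENT_TYPE, sum))

# Both results are named and ordered by the same group levels,
# so they can be added element by element and then sorted.
harm <- sort(fatal + injury, decreasing = TRUE)
harm
```

Because both calls group by the same column, the two results line up by event name before they are added.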
Calculate total Property Damage per event and store in an array PROPDMG_Sum
Calculate total Crop Damage per event and store in another array CROPDMG_Sum
Total of Property Damage and Crop Damage will give us the total material damage
PROPDMG_Sum <- with(weather.data.selected, tapply(PROPDMGVAL, EVENT_TYPE, sum))
CROPDMG_Sum <- with(weather.data.selected, tapply(CROPDMGVAL, EVENT_TYPE, sum))
sort(PROPDMG_Sum + CROPDMG_Sum)
## Marine Hail Dense Smoke Rip Current
## 4.00 100.00 163.00
## Funnel Cloud Astronomical Low Tide Marine Strong Wind
## 194.60 320.00 418.33
## Marine Thunderstorm Wind Volcanic Ash Dust Devil
## 486.40 500.00 719.13
## Seiche Marine High Wind Sleet
## 980.00 1297.01 1350.00
## Tropical Depression Freezing Fog Lakeshore Flood
## 1737.00 2182.00 7570.00
## Avalanche Dust Storm Waterspout
## 8721.80 9199.00 9564.20
## Dense Fog Lake-Effect Snow Winter Weather
## 9674.00 40182.00 65237.25
## High Surf Tsunami Other
## 102080.00 144082.00 182069.60
## Cold/Wind Chill Heat Coastal Flood
## 273986.60 424383.55 432672.06
## Excessive Heat Debris Flow Blizzard
## 500155.70 653535.10 771973.95
## Lightning Heavy Snow Thunderstorm Wind
## 946082.03 1076680.24 1226666.55
## Extreme Cold/Wind Chill Frost/Freeze Heavy Rain
## 1407163.40 1728503.50 4029041.94
## Winter Storm High Wind Tropical Storm
## 6716441.25 6733542.94 8409286.55
## Wildfire Ice Storm Strong Wind
## 8899910.13 8989753.71 11172061.95
## Drought Flash Flood Hail
## 15025675.38 18440074.12 20734050.36
## Storm Surge/Tide Tornado Hurricane (Typhoon)
## 47975004.15 57408059.95 90762527.81
## Flood
## 161096780.38
The above shows that Flood caused the most damage in terms of combined property and crop loss
Draw a bar chart to show the events most harmful to public health
par(mar= c(9,4,2,1))
barplot(rbind(FATAL_Sum, INJURY_Sum), col = c("purple2", "turquoise2"), las = 2, main = "Deaths and Injuries due to Weather Events",
cex.names = 0.8, cex.axis = 0.8, beside = FALSE)
mtext(text = "Weather Events", side = 1, line = 8)
mtext(text = "Deaths and Injuries", side = 2, line = 3)
legend("center", c("Deaths", "Injuries"), text.col = c("purple2", "turquoise2"), bty = "n")
Display the result in text form
paste("The event which was most harmful to public health is",
names((FATAL_Sum + INJURY_Sum)[(FATAL_Sum + INJURY_Sum) == max(FATAL_Sum + INJURY_Sum)]), "with a total deaths and injuries of",
as.character((FATAL_Sum + INJURY_Sum)[(FATAL_Sum + INJURY_Sum) == max(FATAL_Sum + INJURY_Sum)]))
## [1] "The event which was most harmful to public health is Tornado with a total deaths and injuries of 97043"
Draw a Bar chart to depict the property damage
options(scipen=999)
par(mar= c(9,6,2,1))
barplot(rbind(PROPDMG_Sum, CROPDMG_Sum), col = c("green3", "blue4"), las = 2,
main = "Property Damage due to Weather Events",
cex.names = 0.8, cex.axis = 0.8, beside = FALSE)
mtext(text = "Weather Events", side = 1, line = 8)
mtext(text = "Damage in 1000s", side = 2, line = 5)
legend("topright", c("Property Damage", "Crop Damage"), text.col = c("green3", "blue4"), bty = "n")
Display the result in text form
paste("The event which is most destructive in terms of property loss is",
names((PROPDMG_Sum + CROPDMG_Sum)[(PROPDMG_Sum + CROPDMG_Sum) == max(PROPDMG_Sum + CROPDMG_Sum)]), "with a total loss of",
as.character(dollar_format()((PROPDMG_Sum + CROPDMG_Sum)[(PROPDMG_Sum + CROPDMG_Sum) == max(PROPDMG_Sum + CROPDMG_Sum)] * 1000)))
## [1] "The event which is most destructive in terms of property loss is Flood with a total loss of $161,096,780,378"
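The lookup of the top event can also be written with which.max(), which returns the position of the largest value and keeps its name. The sketch below uses made-up totals and base R formatting instead of the scales package. The values and variable names are assumptions for illustration.

```r
# Made-up totals (in thousands of dollars) standing in for PROPDMG_Sum + CROPDMG_Sum
total_damage <- c(Flood = 1500.5, Hail = 320, Tornado = 980.25)

top <- which.max(total_damage)            # position of the largest total, name kept
top_event <- names(top)
top_value <- total_damage[[top]] * 1000   # convert thousands to dollars

# format() with big.mark is a base R way to add thousands separators
msg <- paste("Most destructive event:", top_event,
             "with a total loss of", paste0("$", format(top_value, big.mark = ",")))
msg
```

One difference to keep in mind: which.max() returns only the first maximum, whereas comparing against max() returns every tied event.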
unique(weather.data.selected$STATE)
## [1] "AL" "AZ" "AR" "CA" "CO" "CT" "DE" "DC" "FL" "GA" "HI" "ID" "IL" "IN" "IA"
## [16] "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH" "NJ"
## [31] "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT" "VT"
## [46] "VA" "WA" "WV" "WI" "WY" "PR" "AK" "ST" "AS" "GU" "MH" "VI" "AM" "LC" "PH"
## [61] "GM" "PZ" "AN" "LH" "LM" "LE" "LS" "SL" "LO" "PM" "PK" "XX"
This list shows that the data has been collected across all territories under the control of the United States, including places such as Puerto Rico and Guam.
What will be the result if we exclude all regions outside the geographical United States (the 50 states and the District of Columbia)?
The data pertaining to these states is extracted and the analysis is repeated
states <- c("AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FL", "GA", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY")
weather.data.selected_USA <- filter(weather.data.selected, STATE %in% states)
PROPDMG_Sum_USA <- with(weather.data.selected_USA, tapply(PROPDMGVAL, EVENT_TYPE, sum))
CROPDMG_Sum_USA <- with(weather.data.selected_USA, tapply(CROPDMGVAL, EVENT_TYPE, sum))
FATAL_Sum_USA <- with(weather.data.selected_USA, tapply(FATALITIES, EVENT_TYPE, sum))
INJURY_Sum_USA <- with(weather.data.selected_USA, tapply(INJURIES, EVENT_TYPE, sum))
paste("The event which was most harmful to public health within geographical USA is",
names((FATAL_Sum_USA + INJURY_Sum_USA)[(FATAL_Sum_USA + INJURY_Sum_USA) == max(FATAL_Sum_USA + INJURY_Sum_USA)]), "with a total deaths and injuries of",
as.character((FATAL_Sum_USA + INJURY_Sum_USA)[(FATAL_Sum_USA + INJURY_Sum_USA) == max(FATAL_Sum_USA + INJURY_Sum_USA)]))
## [1] "The event which was most harmful to public health within geographical USA is Tornado with a total deaths and injuries of 97043"
paste("The event which is most destructive in terms of property loss within geographical USA is",
names((PROPDMG_Sum_USA + CROPDMG_Sum_USA)[(PROPDMG_Sum_USA + CROPDMG_Sum_USA) == max(PROPDMG_Sum_USA + CROPDMG_Sum_USA)]), "with a total loss of",
as.character(dollar_format()((PROPDMG_Sum_USA + CROPDMG_Sum_USA)[(PROPDMG_Sum_USA + CROPDMG_Sum_USA) == max(PROPDMG_Sum_USA + CROPDMG_Sum_USA)] * 1000)))
## [1] "The event which is most destructive in terms of property loss within geographical USA is Flood with a total loss of $160,931,361,328"
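As an aside, the list of state codes does not have to be typed by hand. R ships a built-in constant, state.abb, with the 50 two-letter state abbreviations. The sketch below builds an equivalent list and filters a made-up data frame with base R subsetting. The `toy` data and the name `states_builtin` are illustrative assumptions.

```r
# state.abb is a built-in R constant with the 50 two-letter state abbreviations;
# DC is not part of it, so it is appended here.
states_builtin <- c(state.abb, "DC")

# Toy rows (made up) standing in for weather.data.selected
toy <- data.frame(
  STATE = c("AL", "PR", "TX", "GU", "DC"),
  FATALITIES = c(1, 2, 3, 4, 5)
)

# Keep only rows whose STATE code is in the list
toy_usa <- toy[toy$STATE %in% states_builtin, ]
toy_usa$STATE
```

The resulting list has 51 entries, the same number as the hand-typed `states` vector above.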
The result of the analysis is the same for both data sets (full and geographical USA)
Most harmful to public health: Tornado
Most destructive to property: Flood