I analysed severe weather events to determine the economic damage and the harm to population health caused by each type of event. The analysis uses the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. I restrict it to events from 1996 onwards, because higher-quality standardised data is available from that year.
#download the dataset to working directory
download.file("http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
"StormData.csv.bz2")
#download storm data documentation and FAQ
download.file("http://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf", "StormDoc.pdf")
download.file("http://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf", "FAQ.pdf")
#read data in
stormData = read.csv("StormData.csv.bz2")
#Quick look at data
dim(stormData)
## [1] 902297 37
head(stormData)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL
## 2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL
## 3 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL
## 4 1 6/8/1951 0:00:00 0900 CST 89 MADISON AL
## 5 1 11/15/1951 0:00:00 1500 CST 43 CULLMAN AL
## 6 1 11/15/1951 0:00:00 2000 CST 77 LAUDERDALE AL
## EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO 0 0
## 2 TORNADO 0 0
## 3 TORNADO 0 0
## 4 TORNADO 0 0
## 5 TORNADO 0 0
## 6 TORNADO 0 0
## COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1 NA 0 14.0 100 3 0 0
## 2 NA 0 2.0 150 2 0 0
## 3 NA 0 0.1 123 2 0 0
## 4 NA 0 0.0 100 2 0 0
## 5 NA 0 0.0 150 2 0 0
## 6 NA 0 1.5 177 2 0 0
## INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1 15 25.0 K 0
## 2 0 2.5 K 0
## 3 2 25.0 K 0
## 4 2 2.5 K 0
## 5 2 2.5 K 0
## 6 6 2.5 K 0
## LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1 3040 8812 3051 8806 1
## 2 3042 8755 0 0 2
## 3 3340 8742 0 0 3
## 4 3458 8626 0 0 4
## 5 3412 8642 0 0 5
## 6 3450 8748 0 0 6
colnames(stormData)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
summary(stormData)
## STATE__ BGN_DATE BGN_TIME
## Min. : 1.0 5/25/2011 0:00:00: 1202 12:00:00 AM: 10163
## 1st Qu.:19.0 4/27/2011 0:00:00: 1193 06:00:00 PM: 7350
## Median :30.0 6/9/2011 0:00:00 : 1030 04:00:00 PM: 7261
## Mean :31.2 5/30/2004 0:00:00: 1016 05:00:00 PM: 6891
## 3rd Qu.:45.0 4/4/2011 0:00:00 : 1009 12:00:00 PM: 6703
## Max. :95.0 4/2/2006 0:00:00 : 981 03:00:00 PM: 6700
## (Other) :895866 (Other) :857229
## TIME_ZONE COUNTY COUNTYNAME STATE
## CST :547493 Min. : 0 JEFFERSON : 7840 TX : 83728
## EST :245558 1st Qu.: 31 WASHINGTON: 7603 KS : 53440
## MST : 68390 Median : 75 JACKSON : 6660 OK : 46802
## PST : 28302 Mean :101 FRANKLIN : 6256 MO : 35648
## AST : 6360 3rd Qu.:131 LINCOLN : 5937 IA : 31069
## HST : 2563 Max. :873 MADISON : 5632 NE : 30271
## (Other): 3631 (Other) :862369 (Other):621339
## EVTYPE BGN_RANGE BGN_AZI
## HAIL :288661 Min. : 0 :547332
## TSTM WIND :219940 1st Qu.: 0 N : 86752
## THUNDERSTORM WIND: 82563 Median : 0 W : 38446
## TORNADO : 60652 Mean : 1 S : 37558
## FLASH FLOOD : 54277 3rd Qu.: 1 E : 33178
## FLOOD : 25326 Max. :3749 NW : 24041
## (Other) :170878 (Other):134990
## BGN_LOCATI END_DATE END_TIME
## :287743 :243411 :238978
## COUNTYWIDE : 19680 4/27/2011 0:00:00: 1214 06:00:00 PM: 9802
## Countywide : 993 5/25/2011 0:00:00: 1196 05:00:00 PM: 8314
## SPRINGFIELD : 843 6/9/2011 0:00:00 : 1021 04:00:00 PM: 8104
## SOUTH PORTION: 810 4/4/2011 0:00:00 : 1007 12:00:00 PM: 7483
## NORTH PORTION: 784 5/30/2004 0:00:00: 998 11:59:00 PM: 7184
## (Other) :591444 (Other) :653450 (Other) :622432
## COUNTY_END COUNTYENDN END_RANGE END_AZI
## Min. :0 Mode:logical Min. : 0 :724837
## 1st Qu.:0 NA's:902297 1st Qu.: 0 N : 28082
## Median :0 Median : 0 S : 22510
## Mean :0 Mean : 1 W : 20119
## 3rd Qu.:0 3rd Qu.: 0 E : 20047
## Max. :0 Max. :925 NE : 14606
## (Other): 72096
## END_LOCATI LENGTH WIDTH F
## :499225 Min. : 0.0 Min. : 0 Min. :0
## COUNTYWIDE : 19731 1st Qu.: 0.0 1st Qu.: 0 1st Qu.:0
## SOUTH PORTION : 833 Median : 0.0 Median : 0 Median :1
## NORTH PORTION : 780 Mean : 0.2 Mean : 8 Mean :1
## CENTRAL PORTION: 617 3rd Qu.: 0.0 3rd Qu.: 0 3rd Qu.:1
## SPRINGFIELD : 575 Max. :2315.0 Max. :4400 Max. :5
## (Other) :380536 NA's :843563
## MAG FATALITIES INJURIES PROPDMG
## Min. : 0 Min. : 0 Min. : 0.0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0.0 1st Qu.: 0
## Median : 50 Median : 0 Median : 0.0 Median : 0
## Mean : 47 Mean : 0 Mean : 0.2 Mean : 12
## 3rd Qu.: 75 3rd Qu.: 0 3rd Qu.: 0.0 3rd Qu.: 0
## Max. :22000 Max. :583 Max. :1700.0 Max. :5000
##
## PROPDMGEXP CROPDMG CROPDMGEXP WFO
## :465934 Min. : 0.0 :618413 :142069
## K :424665 1st Qu.: 0.0 K :281832 OUN : 17393
## M : 11330 Median : 0.0 M : 1994 JAN : 13889
## 0 : 216 Mean : 1.5 k : 21 LWX : 13174
## B : 40 3rd Qu.: 0.0 0 : 19 PHI : 12551
## 5 : 28 Max. :990.0 B : 9 TSA : 12483
## (Other): 84 (Other): 9 (Other):690738
## STATEOFFIC
## :248769
## TEXAS, North : 12193
## ARKANSAS, Central and North Central: 11738
## IOWA, Central : 11345
## KANSAS, Southwest : 11212
## GEORGIA, North and Central : 11120
## (Other) :595920
## ZONENAMES
## :594029
## :205988
## GREATER RENO / CARSON CITY / M - GREATER RENO / CARSON CITY / M : 639
## GREATER LAKE TAHOE AREA - GREATER LAKE TAHOE AREA : 592
## JEFFERSON - JEFFERSON : 303
## MADISON - MADISON : 302
## (Other) :100444
## LATITUDE LONGITUDE LATITUDE_E LONGITUDE_
## Min. : 0 Min. :-14451 Min. : 0 Min. :-14455
## 1st Qu.:2802 1st Qu.: 7247 1st Qu.: 0 1st Qu.: 0
## Median :3540 Median : 8707 Median : 0 Median : 0
## Mean :2875 Mean : 6940 Mean :1452 Mean : 3509
## 3rd Qu.:4019 3rd Qu.: 9605 3rd Qu.:3549 3rd Qu.: 8735
## Max. :9706 Max. : 17124 Max. :9706 Max. :106220
## NA's :47 NA's :40
## REMARKS REFNUM
## :287433 Min. : 1
## : 24013 1st Qu.:225575
## Trees down.\n : 1110 Median :451149
## Several trees were blown down.\n : 568 Mean :451149
## Trees were downed.\n : 446 3rd Qu.:676723
## Large trees and power lines were blown down.\n: 432 Max. :902297
## (Other) :588295
colnames(stormData) = tolower(colnames(stormData))
#The data documentation indicates that standardised reporting did not begin until 1996, so I focus on data from 1996 onwards.
#split the dates out and bind them to the dataframe
library(reshape2)
library(dplyr)
library(ggplot2)
library(stringr)
dates = colsplit(stormData$bgn_date, " ", c("date", "time"))
stormData = cbind(stormData, dates)
#prepare dates
library(lubridate)
stormData$date = mdy(stormData$date)
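As a hedged alternative to the split-then-parse steps above, base R can read the date straight from the raw `bgn_date` strings. This is only a sketch on two sample values taken from the output above; the names `raw`, `dates` and `years` are illustrative and not part of the analysis.

```r
# Sample values copied from the bgn_date column shown above.
raw = c("4/18/1950 0:00:00", "5/25/2011 0:00:00")

# as.Date() reads the leading month/day/year and ignores the trailing time text.
dates = as.Date(raw, format = "%m/%d/%Y")

# Extract the year as an integer so it can be used in a filter.
years = as.integer(format(dates, "%Y"))

# Keep only events from 1996 onwards, mirroring the year(date) > 1995 filter.
kept = raw[years > 1995]
print(kept)
```

This avoids the extra `date` and `time` columns, at the cost of not using lubridate's more forgiving parser.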
#The columns of interest are "EVTYPE", plus "FATALITIES" and "INJURIES" for population health and "PROPDMG" and "CROPDMG" for economic consequences.
df = select(stormData, evtype, fatalities, injuries, propdmg, cropdmg,date)
df = filter(df, year(date) > 1995)
df = na.omit(df)
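Note that `propdmg` and `cropdmg` are stored alongside the unit columns `propdmgexp` and `cropdmgexp`, which the summary above shows holding codes such as K, M and B. The selection above keeps only the raw numbers. A minimal sketch of how they might be scaled to dollars follows, assuming K, M and B mean thousands, millions and billions and treating every other code as a multiplier of 1. The helper `exp_to_multiplier` and the toy data frame are hypothetical, and the rarer codes should be checked against the storm data documentation.

```r
# Hypothetical helper: map an exponent code to a numeric multiplier.
# Codes other than K/M/B (in any case) are treated as 1, which is an assumption.
exp_to_multiplier = function(code) {
  code = toupper(as.character(code))
  mult = c(K = 1e3, M = 1e6, B = 1e9)[code]
  mult[is.na(mult)] = 1
  unname(mult)
}

# Toy rows standing in for the propdmg / propdmgexp columns.
toy = data.frame(propdmg = c(25, 2.5, 1),
                 propdmgexp = c("K", "M", ""),
                 stringsAsFactors = FALSE)

# Scale the raw figure by its unit to get a dollar amount.
toy$propdmg_usd = toy$propdmg * exp_to_multiplier(toy$propdmgexp)
print(toy)
```

To apply something like this to the real data, `propdmgexp` and `cropdmgexp` would have to be kept in the `select()` call.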
The raw evtype values are inconsistent (mixed case, abbreviations and near-duplicate labels), so the column needs cleaning before events can be grouped.
#Cleaning EVTYPE
df = arrange(df, evtype)
unique(df$evtype)
## [1] HIGH SURF ADVISORY COASTAL FLOOD
## [3] FLASH FLOOD LIGHTNING
## [5] TSTM WIND TSTM WIND (G45)
## [7] WATERSPOUT WIND
## [9] ABNORMAL WARMTH ABNORMALLY DRY
## [11] ABNORMALLY WET ACCUMULATED SNOWFALL
## [13] AGRICULTURAL FREEZE ASTRONOMICAL HIGH TIDE
## [15] ASTRONOMICAL LOW TIDE AVALANCHE
## [17] Beach Erosion BEACH EROSION
## [19] BITTER WIND CHILL BITTER WIND CHILL TEMPERATURES
## [21] Black Ice BLACK ICE
## [23] BLIZZARD Blizzard Summary
## [25] BLOW-OUT TIDE BLOW-OUT TIDES
## [27] BLOWING DUST blowing snow
## [29] Blowing Snow BRUSH FIRE
## [31] COASTAL FLOODING/EROSION COASTAL EROSION
## [33] Coastal Flood COASTAL FLOOD
## [35] coastal flooding Coastal Flooding
## [37] COASTAL FLOODING COASTAL FLOODING/EROSION
## [39] Coastal Storm COASTAL STORM
## [41] COASTALFLOOD COASTALSTORM
## [43] Cold COLD
## [45] Cold and Frost COLD AND FROST
## [47] COLD AND SNOW Cold Temperature
## [49] COLD TEMPERATURES COLD WEATHER
## [51] COLD WIND CHILL TEMPERATURES COLD/WIND CHILL
## [53] COOL SPELL CSTL FLOODING/EROSION
## [55] DAM BREAK Damaging Freeze
## [57] DAMAGING FREEZE DENSE FOG
## [59] DENSE SMOKE DOWNBURST
## [61] DRIEST MONTH Drifting Snow
## [63] DROUGHT DROWNING
## [65] DRY DRY CONDITIONS
## [67] DRY MICROBURST DRY SPELL
## [69] DRY WEATHER DRYNESS
## [71] DUST DEVEL Dust Devil
## [73] DUST DEVIL DUST STORM
## [75] Early Frost EARLY RAIN
## [77] Early snowfall EARLY SNOWFALL
## [79] Erosion/Cstl Flood Excessive Cold
## [81] EXCESSIVE HEAT EXCESSIVE HEAT/DROUGHT
## [83] EXCESSIVE RAIN EXCESSIVE RAINFALL
## [85] EXCESSIVE SNOW EXCESSIVELY DRY
## [87] Extended Cold Extreme Cold
## [89] EXTREME COLD EXTREME COLD/WIND CHILL
## [91] EXTREME WIND CHILL EXTREME WINDCHILL
## [93] EXTREME WINDCHILL TEMPERATURES EXTREMELY WET
## [95] FALLING SNOW/ICE FIRST FROST
## [97] FIRST SNOW FLASH FLOOD
## [99] FLASH FLOOD/FLOOD FLASH FLOODING
## [101] Flood FLOOD
## [103] Flood/Flash Flood FLOOD/FLASH/FLOOD
## [105] Flood/Strong Wind FOG
## [107] Freeze FREEZE
## [109] Freezing drizzle Freezing Drizzle
## [111] FREEZING DRIZZLE Freezing Fog
## [113] FREEZING FOG Freezing rain
## [115] Freezing Rain FREEZING RAIN
## [117] FREEZING RAIN/SLEET Freezing Spray
## [119] Frost FROST
## [121] Frost/Freeze FROST/FREEZE
## [123] Funnel Cloud FUNNEL CLOUD
## [125] FUNNEL CLOUDS Glaze
## [127] GLAZE gradient wind
## [129] Gradient wind GRADIENT WIND
## [131] GUSTY LAKE WIND GUSTY THUNDERSTORM WIND
## [133] GUSTY THUNDERSTORM WINDS Gusty Wind
## [135] GUSTY WIND GUSTY WIND/HAIL
## [137] GUSTY WIND/HVY RAIN Gusty wind/rain
## [139] Gusty winds Gusty Winds
## [141] GUSTY WINDS HAIL
## [143] Hail(0.75) HAIL/WIND
## [145] HARD FREEZE HAZARDOUS SURF
## [147] HEAT Heat Wave
## [149] HEAT WAVE Heatburst
## [151] Heavy Precipitation Heavy rain
## [153] Heavy Rain HEAVY RAIN
## [155] Heavy Rain and Wind HEAVY RAIN EFFECTS
## [157] Heavy Rain/High Surf HEAVY RAIN/WIND
## [159] HEAVY RAINFALL HEAVY SEAS
## [161] HEAVY SNOW Heavy snow shower
## [163] HEAVY SNOW SQUALLS Heavy Surf
## [165] HEAVY SURF Heavy surf and wind
## [167] HEAVY SURF/HIGH SURF HIGH SWELLS
## [169] HIGH SEAS High Surf
## [171] HIGH SURF HIGH SURF ADVISORIES
## [173] HIGH SURF ADVISORY HIGH SWELLS
## [175] HIGH WATER High Wind
## [177] HIGH WIND HIGH WIND (G40)
## [179] HIGH WINDS Hot and Dry
## [181] HOT SPELL HOT WEATHER
## [183] HURRICANE Hurricane Edouard
## [185] HURRICANE/TYPHOON HYPERTHERMIA/EXPOSURE
## [187] Hypothermia/Exposure HYPOTHERMIA/EXPOSURE
## [189] ICE Ice Fog
## [191] ICE JAM Ice jam flood (minor
## [193] ICE ON ROAD ICE PELLETS
## [195] ICE ROADS ICE STORM
## [197] Ice/Snow ICE/SNOW
## [199] Icestorm/Blizzard Icy Roads
## [201] ICY ROADS Lake Effect Snow
## [203] LAKE EFFECT SNOW LAKE-EFFECT SNOW
## [205] LAKESHORE FLOOD LANDSLIDE
## [207] LANDSLIDES Landslump
## [209] LANDSLUMP LANDSPOUT
## [211] LATE FREEZE LATE SEASON HAIL
## [213] LATE SEASON SNOW Late Season Snowfall
## [215] LATE SNOW Late-season Snowfall
## [217] LIGHT FREEZING RAIN Light snow
## [219] Light Snow LIGHT SNOW
## [221] Light Snow/Flurries LIGHT SNOW/FREEZING PRECIP
## [223] Light Snowfall LIGHTNING
## [225] LOCALLY HEAVY RAIN Marine Accident
## [227] MARINE HAIL MARINE HIGH WIND
## [229] MARINE STRONG WIND MARINE THUNDERSTORM WIND
## [231] MARINE TSTM WIND Metro Storm, May 26
## [233] Microburst Mild and Dry Pattern
## [235] Minor Flooding MIXED PRECIP
## [237] Mixed Precipitation MIXED PRECIPITATION
## [239] MODERATE SNOW MODERATE SNOWFALL
## [241] MONTHLY PRECIPITATION Monthly Rainfall
## [243] MONTHLY RAINFALL Monthly Snowfall
## [245] MONTHLY SNOWFALL MONTHLY TEMPERATURE
## [247] Mountain Snows MUD SLIDE
## [249] Mudslide MUDSLIDE
## [251] MUDSLIDE/LANDSLIDE Mudslides
## [253] MUDSLIDES No Severe Weather
## [255] NON SEVERE HAIL NON TSTM WIND
## [257] NON-SEVERE WIND DAMAGE NON-TSTM WIND
## [259] NONE NORTHERN LIGHTS
## [261] Other OTHER
## [263] PATCHY DENSE FOG PATCHY ICE
## [265] Prolong Cold PROLONG COLD
## [267] PROLONG WARMTH PROLONGED RAIN
## [269] RAIN RAIN (HEAVY)
## [271] Rain Damage RAIN/SNOW
## [273] RECORD COLD Record Cold
## [275] RECORD COLD RECORD COOL
## [277] Record dry month RECORD DRYNESS
## [279] Record Heat RECORD HEAT
## [281] Record High RECORD LOW RAINFALL
## [283] Record May Snow RECORD PRECIPITATION
## [285] RECORD RAINFALL RECORD SNOW
## [287] RECORD SNOWFALL Record temperature
## [289] RECORD TEMPERATURE Record Temperatures
## [291] RECORD TEMPERATURES RECORD WARM
## [293] RECORD WARM TEMPS. Record Warmth
## [295] RECORD WARMTH Record Winter Snow
## [297] RED FLAG CRITERIA RED FLAG FIRE WX
## [299] REMNANTS OF FLOYD RIP CURRENT
## [301] RIP CURRENTS RIVER FLOOD
## [303] River Flooding RIVER FLOODING
## [305] ROCK SLIDE ROGUE WAVE
## [307] ROUGH SEAS ROUGH SURF
## [309] Saharan Dust SAHARAN DUST
## [311] Seasonal Snowfall SEICHE
## [313] SEVERE THUNDERSTORM SEVERE THUNDERSTORMS
## [315] SLEET SLEET STORM
## [317] SLEET/FREEZING RAIN small hail
## [319] Small Hail SMALL HAIL
## [321] Sml Stream Fld SMOKE
## [323] Snow SNOW
## [325] Snow Accumulation SNOW ADVISORY
## [327] Snow and Ice SNOW AND ICE
## [329] Snow and sleet SNOW AND SLEET
## [331] SNOW DROUGHT SNOW SHOWERS
## [333] SNOW SQUALL Snow squalls
## [335] Snow Squalls SNOW SQUALLS
## [337] SNOW/BLOWING SNOW SNOW/FREEZING RAIN
## [339] SNOW/ICE SNOW/SLEET
## [341] SNOWMELT FLOODING STORM SURGE
## [343] STORM SURGE/TIDE STREET FLOODING
## [345] Strong Wind STRONG WIND
## [347] STRONG WIND GUST Strong winds
## [349] Strong Winds STRONG WINDS
## [351] Summary August 10 Summary August 11
## [353] Summary August 17 Summary August 2-3
## [355] Summary August 21 Summary August 28
## [357] Summary August 4 Summary August 7
## [359] Summary August 9 Summary Jan 17
## [361] Summary July 23-24 Summary June 18-19
## [363] Summary June 5-6 Summary June 6
## [365] Summary of April 12 Summary of April 13
## [367] Summary of April 21 Summary of April 27
## [369] Summary of April 3rd Summary of August 1
## [371] Summary of July 11 Summary of July 2
## [373] Summary of July 22 Summary of July 26
## [375] Summary of July 29 Summary of July 3
## [377] Summary of June 10 Summary of June 11
## [379] Summary of June 12 Summary of June 13
## [381] Summary of June 15 Summary of June 16
## [383] Summary of June 18 Summary of June 23
## [385] Summary of June 24 Summary of June 3
## [387] Summary of June 30 Summary of June 4
## [389] Summary of June 6 Summary of March 14
## [391] Summary of March 23 Summary of March 24
## [393] SUMMARY OF MARCH 24-25 SUMMARY OF MARCH 27
## [395] SUMMARY OF MARCH 29 Summary of May 10
## [397] Summary of May 13 Summary of May 14
## [399] Summary of May 22 Summary of May 22 am
## [401] Summary of May 22 pm Summary of May 26 am
## [403] Summary of May 26 pm Summary of May 31 am
## [405] Summary of May 31 pm Summary of May 9-10
## [407] Summary Sept. 25-26 Summary September 20
## [409] Summary September 23 Summary September 3
## [411] Summary September 4 Summary: Nov. 16
## [413] Summary: Nov. 6-7 Summary: Oct. 20-21
## [415] Summary: October 31 Summary: Sept. 18
## [417] Temperature record Thundersnow shower
## [419] THUNDERSTORM Thunderstorm Wind
## [421] THUNDERSTORM WIND THUNDERSTORM WIND (G40)
## [423] THUNDERSTORMS Tidal Flooding
## [425] TIDAL FLOODING TORNADO
## [427] TORNADO DEBRIS Torrential Rainfall
## [429] TROPICAL DEPRESSION TROPICAL STORM
## [431] TSTM TSTM HEAVY RAIN
## [433] Tstm Wind TSTM WIND
## [435] TSTM WIND (G45) TSTM WIND (41)
## [437] TSTM WIND (G35) TSTM WIND (G40)
## [439] TSTM WIND (G45) TSTM WIND 40
## [441] TSTM WIND 45 TSTM WIND AND LIGHTNING
## [443] TSTM WIND G45 TSTM WIND/HAIL
## [445] TSTM WINDS TSTM WND
## [447] TSUNAMI TYPHOON
## [449] Unseasonable Cold UNSEASONABLY COLD
## [451] UNSEASONABLY COOL UNSEASONABLY COOL & WET
## [453] UNSEASONABLY DRY UNSEASONABLY HOT
## [455] UNSEASONABLY WARM UNSEASONABLY WARM & WET
## [457] UNSEASONABLY WARM AND DRY UNSEASONABLY WARM YEAR
## [459] UNSEASONABLY WARM/WET UNSEASONABLY WET
## [461] UNSEASONAL LOW TEMP UNSEASONAL RAIN
## [463] UNUSUAL WARMTH UNUSUAL/RECORD WARMTH
## [465] UNUSUALLY COLD UNUSUALLY LATE SNOW
## [467] UNUSUALLY WARM Urban flood
## [469] Urban Flood URBAN FLOOD
## [471] Urban Flooding URBAN/SMALL STRM FLDG
## [473] URBAN/SML STREAM FLD URBAN/SML STREAM FLDG
## [475] URBAN/STREET FLOODING VERY DRY
## [477] VERY WARM VOG
## [479] Volcanic Ash VOLCANIC ASH
## [481] Volcanic Ash Plume VOLCANIC ASHFALL
## [483] VOLCANIC ERUPTION WAKE LOW WIND
## [485] WALL CLOUD WARM WEATHER
## [487] WATERSPOUT WATERSPOUTS
## [489] wet micoburst WET MICROBURST
## [491] Wet Month Wet Year
## [493] Whirlwind WHIRLWIND
## [495] WILD/FOREST FIRE WILDFIRE
## [497] Wind WIND
## [499] WIND ADVISORY WIND AND WAVE
## [501] WIND CHILL Wind Damage
## [503] WIND DAMAGE WIND GUSTS
## [505] WINDS WINTER MIX
## [507] WINTER STORM Winter Weather
## [509] WINTER WEATHER WINTER WEATHER MIX
## [511] WINTER WEATHER/MIX WINTERY MIX
## [513] Wintry mix Wintry Mix
## [515] WINTRY MIX WND
## 985 Levels: HIGH SURF ADVISORY COASTAL FLOOD ... WND
df = group_by(df, evtype)
#Group related event types under a common label using case-insensitive matching.
#evtype is a factor, so assigning a label that is not already one of its levels
#produces NA (the warnings below); those rows are removed by na.omit() further down.
#Coastal
df$evtype[str_detect(df$evtype, regex("coastal", ignore_case = TRUE))] = "Coastal"
## Warning: invalid factor level, NA generated
#Flood
df$evtype[str_detect(df$evtype, regex("flood", ignore_case = TRUE))] = "Flood"
#Thunderstorm
df$evtype[str_detect(df$evtype, regex("thunder", ignore_case = TRUE))] = "Thunderstorm"
## Warning: invalid factor level, NA generated
#Winter
df$evtype[str_detect(df$evtype, regex("winter", ignore_case = TRUE))] = "Winter"
## Warning: invalid factor level, NA generated
#Wind
df$evtype[str_detect(df$evtype, regex("wind", ignore_case = TRUE))] = "Wind"
#Summary
df$evtype[str_detect(df$evtype, regex("summary", ignore_case = TRUE))] = "Summary"
## Warning: invalid factor level, NA generated
#Unseasonable
df$evtype[str_detect(df$evtype, regex("unseason", ignore_case = TRUE))] = "Unseasonable weather"
## Warning: invalid factor level, NA generated
#Heavy Rain
df$evtype[str_detect(df$evtype, regex("heavy rain", ignore_case = TRUE))] = "Heavy Rain"
#Volcanic activity
df$evtype[str_detect(df$evtype, regex("volcan", ignore_case = TRUE))] = "Volcanic Activity"
## Warning: invalid factor level, NA generated
#Snow
df$evtype[str_detect(df$evtype, regex("snow", ignore_case = TRUE))] = "Snow Related"
## Warning: invalid factor level, NA generated
#Tidal
df$evtype[str_detect(df$evtype, regex("tide", ignore_case = TRUE))] = "Tidal"
## Warning: invalid factor level, NA generated
#Surf
df$evtype[str_detect(df$evtype, regex("surf", ignore_case = TRUE))] = "Surf"
## Warning: invalid factor level, NA generated
df = na.omit(df)
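The "invalid factor level" warnings above appear when a label that is not an existing level is assigned into a factor. The affected entries become NA and are then dropped by `na.omit()`. A minimal sketch of one way to avoid this is to convert the column to character before relabelling. It uses a toy vector rather than the real `df`, and base R's `grepl()` in place of `str_detect()` so that it runs without extra packages.

```r
# Toy factor standing in for df$evtype; the labels are taken from the list above.
evtype = factor(c("COASTAL FLOOD", "Coastal Storm", "HAIL"))

# Convert to character so new labels are not restricted to the existing levels.
evtype = as.character(evtype)

# Case-insensitive match, then relabel every matching entry.
evtype[grepl("coastal", evtype, ignore.case = TRUE)] = "Coastal"
print(evtype)
```

With a character column the matching rows keep a usable label instead of turning into NA, so they would survive the later `na.omit()` call.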
df = group_by(df, evtype)
summary(df)
## evtype fatalities injuries propdmg
## HAIL :207715 Min. : 0.00 Min. : 0.0 Min. : 0
## Wind :162036 1st Qu.: 0.00 1st Qu.: 0.0 1st Qu.: 0
## Flood : 75432 Median : 0.00 Median : 0.0 Median : 0
## TORNADO : 23154 Mean : 0.02 Mean : 0.1 Mean : 12
## LIGHTNING : 13203 3rd Qu.: 0.00 3rd Qu.: 0.0 3rd Qu.: 1
## Heavy Rain: 11536 Max. :158.00 Max. :1150.0 Max. :5000
## (Other) : 35700
## cropdmg date
## Min. : 0.0 Min. :1996-01-01 00:00:00
## 1st Qu.: 0.0 1st Qu.:2000-05-03 00:00:00
## Median : 0.0 Median :2004-02-07 00:00:00
## Mean : 2.1 Mean :2003-12-13 02:53:26
## 3rd Qu.: 0.0 3rd Qu.:2007-07-05 00:00:00
## Max. :990.0 Max. :2011-11-30 00:00:00
##
evtypeoccurence = summarise(df,
sum_fatalities = sum(fatalities),
sum_injuries = sum(injuries),
sum_propdmg = sum(propdmg),
sum_cropdmg = sum(cropdmg),
n = n())
evtypeoccurence = arrange(evtypeoccurence, desc(n))
head(evtypeoccurence, 10)
## Source: local data frame [10 x 6]
##
## evtype sum_fatalities sum_injuries sum_propdmg
## 1 HAIL 7 713 575317.3
## 2 Wind 880 5286 1730344.6
## 3 Flood 1303 8434 2082705.4
## 4 TORNADO 1511 20667 1187878.2
## 5 LIGHTNING 651 4141 488561.8
## 6 Heavy Rain 94 230 47016.3
## 7 FUNNEL CLOUD 0 1 134.1
## 8 URBAN/SML STREAM FLD 28 79 26051.9
## 9 WATERSPOUT 2 2 5730.2
## 10 WILDFIRE 75 911 83007.3
## Variables not shown: sum_cropdmg (dbl), n (int)
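To judge how much of the total harm the most frequent event types account for, one could compute their share of each column total. The sketch below does this on a toy table; the values are illustrative only and are not taken from the storm data, and `toy`, `top_k` and `share` are hypothetical names.

```r
# Toy stand-in for evtypeoccurence, already sorted by descending n.
toy = data.frame(evtype = c("A", "B", "C", "D"),
                 sum_fatalities = c(50, 30, 15, 5),
                 n = c(400, 300, 200, 100))

# Number of leading event types to include in the share.
top_k = 2

# Fraction of all fatalities attributable to the top_k most frequent types.
share = sum(toy$sum_fatalities[1:top_k]) / sum(toy$sum_fatalities)
print(share)
```

The same calculation could be repeated for injuries, property damage and crop damage.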
The ten most frequent event types account for the majority of the economic and population damage. The charts below show total fatalities and total injuries for these ten event types.
ggplot(evtypeoccurence[1:10,], aes(x = evtype, y = sum_fatalities)) + ggtitle("Total Fatalities by EVType") + xlab("EVType") + ylab("Total Fatalities") +
geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle=90))
ggplot(evtypeoccurence[1:10,], aes(x = evtype, y = sum_injuries)) + ggtitle("Total Injuries by EVType") + xlab("EVType") + ylab("Total Injuries") +
geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle=90))
These charts indicate that tornadoes are the deadliest event type, with floods and wind also causing many fatalities. Tornadoes also cause by far the most injuries.
economicdmg = select(evtypeoccurence, evtype, sum_propdmg, sum_cropdmg)
economicdmg = mutate(economicdmg, totaldmg = sum_propdmg + sum_cropdmg)
economicdmg = arrange(economicdmg, desc(totaldmg))
economicdmg = economicdmg[1:20,]
economicdmg = select(economicdmg, evtype, sum_propdmg, sum_cropdmg)
economicdf = melt(economicdmg, id = "evtype")
ggplot(data = economicdf, aes(x = evtype, y = value, fill = variable)) +
geom_bar(stat = "identity") +
ggtitle("EVType and Economic Damage") +
xlab("EV Type") +
ylab("Total Damage") +
theme(axis.text.x = element_text(angle=90))
#alternatively, to generate a panel plot from the melted data frame:
#ggplot(data = economicdf, aes(x = evtype, y = value)) +
# geom_bar(stat = "identity") +
# ggtitle("EVType and Economic Damage") +
# xlab("EV Type") +
# ylab("Total Damage") +
# theme(axis.text.x = element_text(angle=90)) +
# facet_grid(variable ~.)
The above chart indicates that floods cause the most economic damage, with wind and tornadoes also causing substantial damage. Further, property damage is by far the largest component of total economic damage.