In this report, we assess the damage that different weather events cause to human health (injuries and fatalities) and to the economy (damage to property and crops). We found that tornadoes are by far the most damaging weather event in terms of injuries and fatalities. Tornadoes also cause heavy property and crop damage, but at levels closer to those of other weather events.
First, we uncompress the bz2 file and load the data from the csv file.
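library(R.utils) # for bunzip2
csvfile <- '~/data/StormData.csv'
if (!file.exists(csvfile)) {
  filename <- '~/data/repdata_data_StormData.csv.bz2'
  # write the decompressed file to the same path that is checked above and read below
  bunzip2(filename, destname = csvfile, remove = FALSE)
}
# EVTYPE is treated as a factor below; R >= 4.0 no longer converts strings by default
d <- read.csv(csvfile, stringsAsFactors = TRUE)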
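As a possible alternative to the explicit decompression step, base R connections can read bzip2-compressed text directly, so `read.csv` may be pointed at a `.bz2` file itself. The sketch below demonstrates this on a small temporary file; the table `toy` and its three rows are made up for illustration and are not part of the storm data.

```r
# Illustration only: a tiny made-up table written to a bzip2-compressed temp file
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL", "TSTM WIND"),
                  FATALITIES = c(3, 0, 1))
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv opens the path with file(), which detects the compression when reading
toy.back <- read.csv(tmp)
print(toy.back)
unlink(tmp)
```

Reading the archive directly avoids keeping a second, uncompressed copy on disk, at the cost of decompressing on every load.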
The event type is stored in the EVTYPE column, a factor whose levels name the different kinds of damaging weather events. This factor has 985 levels. Closer examination shows that many of them are near-duplicates of the same event type, differing only in case, spelling, or an appended measurement. Notice how many levels mention wind:
e <- levels(factor(d$EVTYPE)) # factor() makes this work whether EVTYPE was read as factor or character
e[grepl('wind', e, ignore.case = TRUE)]
## [1] " TSTM WIND" " TSTM WIND (G45)"
## [3] " WIND" "BITTER WIND CHILL"
## [5] "BITTER WIND CHILL TEMPERATURES" "BLIZZARD AND EXTREME WIND CHIL"
## [7] "BLIZZARD/HIGH WIND" "BLOWING SNOW- EXTREME WIND CHI"
## [9] "BLOWING SNOW & EXTREME WIND CH" "BLOWING SNOW/EXTREME WIND CHIL"
## [11] "COLD WIND CHILL TEMPERATURES" "COLD/WIND CHILL"
## [13] "COLD/WINDS" "DOWNBURST WINDS"
## [15] "DRY MICROBURST WINDS" "DRY MIRCOBURST WINDS"
## [17] "DUST STORM/HIGH WINDS" "EXTREME COLD/WIND CHILL"
## [19] "EXTREME WIND CHILL" "EXTREME WIND CHILL/BLOWING SNO"
## [21] "EXTREME WIND CHILLS" "EXTREME WINDCHILL"
## [23] "EXTREME WINDCHILL TEMPERATURES" "FLASH FLOOD WINDS"
## [25] "FLOOD/RAIN/WIND" "FLOOD/RAIN/WINDS"
## [27] "Flood/Strong Wind" "gradient wind"
## [29] "Gradient wind" "GRADIENT WIND"
## [31] "GRADIENT WINDS" "GUSTY LAKE WIND"
## [33] "GUSTY THUNDERSTORM WIND" "GUSTY THUNDERSTORM WINDS"
## [35] "Gusty Wind" "GUSTY WIND"
## [37] "GUSTY WIND/HAIL" "GUSTY WIND/HVY RAIN"
## [39] "Gusty wind/rain" "Gusty winds"
## [41] "Gusty Winds" "GUSTY WINDS"
## [43] "HAIL/WIND" "HAIL/WINDS"
## [45] "Heavy Rain and Wind" "HEAVY RAIN/WIND"
## [47] "HEAVY RAIN; URBAN FLOOD WINDS;" "HEAVY SNOW AND HIGH WINDS"
## [49] "HEAVY SNOW AND STRONG WINDS" "HEAVY SNOW/HIGH WIND"
## [51] "HEAVY SNOW/HIGH WINDS" "HEAVY SNOW/HIGH WINDS & FLOOD"
## [53] "HEAVY SNOW/HIGH WINDS/FREEZING" "HEAVY SNOW/WIND"
## [55] "Heavy surf and wind" "HIGH WINDS"
## [57] "High Wind" "HIGH WIND"
## [59] "HIGH WIND (G40)" "HIGH WIND 48"
## [61] "HIGH WIND 63" "HIGH WIND 70"
## [63] "HIGH WIND AND HEAVY SNOW" "HIGH WIND AND HIGH TIDES"
## [65] "HIGH WIND AND SEAS" "HIGH WIND DAMAGE"
## [67] "HIGH WIND/ BLIZZARD" "HIGH WIND/BLIZZARD"
## [69] "HIGH WIND/BLIZZARD/FREEZING RA" "HIGH WIND/HEAVY SNOW"
## [71] "HIGH WIND/LOW WIND CHILL" "HIGH WIND/SEAS"
## [73] "HIGH WIND/WIND CHILL" "HIGH WIND/WIND CHILL/BLIZZARD"
## [75] "HIGH WINDS" "HIGH WINDS 55"
## [77] "HIGH WINDS 57" "HIGH WINDS 58"
## [79] "HIGH WINDS 63" "HIGH WINDS 66"
## [81] "HIGH WINDS 67" "HIGH WINDS 73"
## [83] "HIGH WINDS 76" "HIGH WINDS 80"
## [85] "HIGH WINDS 82" "HIGH WINDS AND WIND CHILL"
## [87] "HIGH WINDS DUST STORM" "HIGH WINDS HEAVY RAINS"
## [89] "HIGH WINDS/" "HIGH WINDS/COASTAL FLOOD"
## [91] "HIGH WINDS/COLD" "HIGH WINDS/FLOODING"
## [93] "HIGH WINDS/HEAVY RAIN" "HIGH WINDS/SNOW"
## [95] "HURRICANE OPAL/HIGH WINDS" "ICE/STRONG WINDS"
## [97] "LIGHTNING AND WINDS" "LIGHTNING THUNDERSTORM WINDS"
## [99] "LIGHTNING THUNDERSTORM WINDSS" "LOW WIND CHILL"
## [101] "MARINE HIGH WIND" "MARINE STRONG WIND"
## [103] "MARINE THUNDERSTORM WIND" "MARINE TSTM WIND"
## [105] "MICROBURST WINDS" "NON-SEVERE WIND DAMAGE"
## [107] "NON-TSTM WIND" "NON TSTM WIND"
## [109] "RAIN AND WIND" "RAIN/WIND"
## [111] "RECORD COLD AND HIGH WIND" "SEVERE THUNDERSTORM WINDS"
## [113] "SNOW- HIGH WIND- WIND CHILL" "SNOW AND WIND"
## [115] "SNOW/HIGH WINDS" "STORM FORCE WINDS"
## [117] "Strong Wind" "STRONG WIND"
## [119] "STRONG WIND GUST" "Strong winds"
## [121] "Strong Winds" "STRONG WINDS"
## [123] "THUDERSTORM WINDS" "THUNDEERSTORM WINDS"
## [125] "THUNDERESTORM WINDS" "THUNDERSTORM WINDS"
## [127] "Thunderstorm Wind" "THUNDERSTORM WIND"
## [129] "THUNDERSTORM WIND (G40)" "THUNDERSTORM WIND 50"
## [131] "THUNDERSTORM WIND 52" "THUNDERSTORM WIND 56"
## [133] "THUNDERSTORM WIND 59" "THUNDERSTORM WIND 59 MPH"
## [135] "THUNDERSTORM WIND 59 MPH." "THUNDERSTORM WIND 60 MPH"
## [137] "THUNDERSTORM WIND 65 MPH" "THUNDERSTORM WIND 65MPH"
## [139] "THUNDERSTORM WIND 69" "THUNDERSTORM WIND 98 MPH"
## [141] "THUNDERSTORM WIND G50" "THUNDERSTORM WIND G51"
## [143] "THUNDERSTORM WIND G52" "THUNDERSTORM WIND G55"
## [145] "THUNDERSTORM WIND G60" "THUNDERSTORM WIND G61"
## [147] "THUNDERSTORM WIND TREES" "THUNDERSTORM WIND."
## [149] "THUNDERSTORM WIND/ TREE" "THUNDERSTORM WIND/ TREES"
## [151] "THUNDERSTORM WIND/AWNING" "THUNDERSTORM WIND/HAIL"
## [153] "THUNDERSTORM WIND/LIGHTNING" "THUNDERSTORM WINDS"
## [155] "THUNDERSTORM WINDS LE CEN" "THUNDERSTORM WINDS 13"
## [157] "THUNDERSTORM WINDS 2" "THUNDERSTORM WINDS 50"
## [159] "THUNDERSTORM WINDS 52" "THUNDERSTORM WINDS 53"
## [161] "THUNDERSTORM WINDS 60" "THUNDERSTORM WINDS 61"
## [163] "THUNDERSTORM WINDS 62" "THUNDERSTORM WINDS 63 MPH"
## [165] "THUNDERSTORM WINDS AND" "THUNDERSTORM WINDS FUNNEL CLOU"
## [167] "THUNDERSTORM WINDS G" "THUNDERSTORM WINDS G60"
## [169] "THUNDERSTORM WINDS HAIL" "THUNDERSTORM WINDS HEAVY RAIN"
## [171] "THUNDERSTORM WINDS LIGHTNING" "THUNDERSTORM WINDS SMALL STREA"
## [173] "THUNDERSTORM WINDS URBAN FLOOD" "THUNDERSTORM WINDS."
## [175] "THUNDERSTORM WINDS/ FLOOD" "THUNDERSTORM WINDS/ HAIL"
## [177] "THUNDERSTORM WINDS/FLASH FLOOD" "THUNDERSTORM WINDS/FLOODING"
## [179] "THUNDERSTORM WINDS/FUNNEL CLOU" "THUNDERSTORM WINDS/HAIL"
## [181] "THUNDERSTORM WINDS/HEAVY RAIN" "THUNDERSTORM WINDS53"
## [183] "THUNDERSTORM WINDSHAIL" "THUNDERSTORM WINDSS"
## [185] "THUNDERSTORMS WIND" "THUNDERSTORMS WINDS"
## [187] "THUNDERSTORMW WINDS" "THUNDERSTORMWINDS"
## [189] "THUNDERSTROM WIND" "THUNDERSTROM WINDS"
## [191] "THUNDERTORM WINDS" "THUNDERTSORM WIND"
## [193] "THUNDESTORM WINDS" "THUNERSTORM WINDS"
## [195] "TORNADOES, TSTM WIND, HAIL" "Tstm Wind"
## [197] "TSTM WIND" "TSTM WIND (G45)"
## [199] "TSTM WIND (41)" "TSTM WIND (G35)"
## [201] "TSTM WIND (G40)" "TSTM WIND (G45)"
## [203] "TSTM WIND 40" "TSTM WIND 45"
## [205] "TSTM WIND 50" "TSTM WIND 51"
## [207] "TSTM WIND 52" "TSTM WIND 55"
## [209] "TSTM WIND 65)" "TSTM WIND AND LIGHTNING"
## [211] "TSTM WIND DAMAGE" "TSTM WIND G45"
## [213] "TSTM WIND G58" "TSTM WIND/HAIL"
## [215] "TSTM WINDS" "TUNDERSTORM WIND"
## [217] "WAKE LOW WIND" "Whirlwind"
## [219] "WHIRLWIND" "Wind"
## [221] "WIND" "WIND ADVISORY"
## [223] "WIND AND WAVE" "WIND CHILL"
## [225] "WIND CHILL/HIGH WIND" "Wind Damage"
## [227] "WIND DAMAGE" "WIND GUSTS"
## [229] "WIND STORM" "WIND/HAIL"
## [231] "WINDS" "WINTER STORM HIGH WINDS"
## [233] "WINTER STORM/HIGH WIND" "WINTER STORM/HIGH WINDS"
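Some of this duplication is purely cosmetic. As a rough sketch (not part of the analysis in this report), trimming whitespace and upper-casing the labels collapses variants that differ only in case or stray spaces. The vector `sample.types` below is a hand-picked subset of the listing above, and the variable names are illustrative.

```r
# A few labels copied from the listing above
sample.types <- c(" WIND", "Wind", "WIND", "Gusty winds", "Gusty Winds",
                  "GUSTY WINDS", "Strong Wind", "STRONG WIND")

# Trim surrounding whitespace, then upper-case
normalized <- toupper(trimws(sample.types))

length(unique(sample.types))  # distinct labels before normalizing
length(unique(normalized))    # distinct labels after normalizing
unique(normalized)
```

This kind of normalization does nothing for misspellings such as "THUDERSTORM WINDS" or for suffixes such as "G45", which is why pattern matching is used to group the event types.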
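Next, we use regular expressions to combine these event types into a small number of categories. We first use grepl to build a logical mask for each category, then overwrite the matching entries with a single consistent name. Because the assignments are applied in sequence, an event that matches several patterns ends up with the label assigned last.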
evt <- as.character(d$EVTYPE) ## work on plain strings so any label can be assigned
## logical masks, one per category
wind <- grepl('wind', evt, ignore.case = TRUE)
heat <- grepl('warm|heat|hot', evt, ignore.case = TRUE)
snow <- grepl('winter|snow|freezing|ice|blizzard', evt, ignore.case = TRUE)
flood <- grepl('flood|rain', evt, ignore.case = TRUE)
flood <- flood & !snow ## separate water damage from ice damage
tornado <- grepl('tornado|waterspou|funnel', evt, ignore.case = TRUE)
lightning <- grepl('lightning', evt, ignore.case = TRUE)
hurricane <- grepl('hurricane|tropical', evt, ignore.case = TRUE)
hail <- grepl('hail', evt, ignore.case = TRUE)
## relabel the matching entries
evt[wind] <- 'WIND'
evt[heat] <- 'HEAT'
evt[snow] <- 'SNOW'
evt[flood] <- 'FLOOD'
evt[tornado] <- 'TORNADO'
evt[lightning] <- 'LIGHTNING'
evt[hurricane] <- 'HURRICANE'
evt[hail] <- 'HAIL'
d$evt <- evt ## add this back to the dataframe
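To make the effect of the assignment order concrete, the sketch below applies a subset of the same patterns to four labels. The vector `toy` is illustrative: its first entry is taken from the wind listing above, while the remaining three are typical-looking labels chosen for the example.

```r
toy <- c("TORNADOES, TSTM WIND, HAIL", "HIGH WIND", "FREEZING RAIN", "HEAVY RAIN")

# Same patterns as in the analysis, restricted to five categories
wind    <- grepl('wind', toy, ignore.case = TRUE)
snow    <- grepl('winter|snow|freezing|ice|blizzard', toy, ignore.case = TRUE)
flood   <- grepl('flood|rain', toy, ignore.case = TRUE) & !snow
tornado <- grepl('tornado|waterspou|funnel', toy, ignore.case = TRUE)
hail    <- grepl('hail', toy, ignore.case = TRUE)

# Assign in the same relative order as the analysis
label <- toy
label[wind]    <- 'WIND'
label[snow]    <- 'SNOW'
label[flood]   <- 'FLOOD'
label[tornado] <- 'TORNADO'
label[hail]    <- 'HAIL'

data.frame(original = toy, label = label)
```

The first entry matches the wind, tornado and hail masks and is labelled HAIL because that assignment comes last, while "FREEZING RAIN" is kept out of FLOOD by the `!snow` exclusion. A different assignment order would move such mixed events into other categories.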
First, we will look at the most damaging event types in terms of human health impact. We will compare the total number of injuries and fatalities for each event type.
health.damage <- tapply(d$FATALITIES + d$INJURIES, d$evt, sum, na.rm = TRUE)
health.damage <- sort(health.damage, decreasing = TRUE) ## sort to make plotting easier
barplot(health.damage[1:10], cex.names = 0.7, las = 2, main = 'Injuries and Fatalities from Weather')
We see that tornadoes cause far more injuries and fatalities than any other type of weather event.
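The combined total weights an injury the same as a fatality. One way to check whether that choice drives the ranking would be to tabulate the two counts separately. The sketch below does so on a made-up data frame whose column names mirror those used above; the numbers are invented for illustration and are not results from the storm data.

```r
# Made-up values, for illustration only
toy <- data.frame(evt = c("TORNADO", "TORNADO", "HEAT", "FLOOD", "HEAT"),
                  FATALITIES = c(2, 1, 4, 1, 0),
                  INJURIES = c(30, 12, 3, 2, 1))

# Sum fatalities and injuries separately for each event type
by.type <- aggregate(cbind(FATALITIES, INJURIES) ~ evt, data = toy, FUN = sum)

# Rank by fatalities alone
by.type[order(-by.type$FATALITIES), ]
```

In this toy example the ranking by fatalities differs from the ranking by injuries, which is the kind of discrepancy a separate tabulation would reveal.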
Next, we will examine the economic impact of these events, by looking at the total property and crop damage they cause.
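## note: these are the raw CROPDMG and PROPDMG values, not scaled by CROPDMGEXP or PROPDMGEXP
property.damage <- tapply(d$CROPDMG + d$PROPDMG, d$evt, sum, na.rm = TRUE)
property.damage <- sort(property.damage, decreasing = TRUE)
barplot(property.damage[1:10], cex.names = 0.7, las = 2, main = 'Economic Impact of Weather')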
Notice that wind, flood and tornado events are much closer to each other in total economic damage than they are in health impact.
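The totals above add the raw PROPDMG and CROPDMG values. The storm data also carries PROPDMGEXP and CROPDMGEXP columns, which are generally understood to hold a magnitude code (K for thousands, M for millions, B for billions). The sketch below shows one way such codes might be applied. The helper name `dmg.multiplier`, the toy values, and the decision to treat any other code as a multiplier of 1 are all assumptions made for illustration, and applying this scaling to the full data could change the ranking reported here.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumption: K/M/B mean 1e3/1e6/1e9; every other code is treated as 1.
dmg.multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  m <- lookup[toupper(as.character(code))]
  m[is.na(m)] <- 1
  unname(m)
}

# Made-up values, for illustration only
toy <- data.frame(PROPDMG = c(25, 2.5, 1, 10),
                  PROPDMGEXP = c("K", "M", "B", ""))
toy$prop.dollars <- toy$PROPDMG * dmg.multiplier(toy$PROPDMGEXP)
toy
```

The same helper could be applied to CROPDMG and CROPDMGEXP before summing by event type.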
Tornadoes are by far the most damaging event type in terms of human health, but wind and flood events cause about as much economic damage. Hail, lightning and snow also have a significant economic impact.