The Storm Database of severe weather events, maintained by the National Oceanic and Atmospheric Administration (NOAA), is analysed to determine which events are most dangerous for people and which have the greatest economic impact. Tornadoes and hurricanes rank first in both human and economic damage. Storms, floods and winds follow; heat also ranks high in fatalities and injuries, but not in economic losses.
The first step is to read the data and attach the necessary packages in R for the analysis.
library(plyr)
# read.csv reads the bz2-compressed file directly and uses CSV-appropriate quoting
data<-read.csv("repdata-data-StormData.csv.bz2", header=TRUE)
n<-nrow(data)
As there are many different types of weather events, the most frequent ones are inspected first, and a new factor is then defined by grouping similar types together. For example, tornado, funnel cloud, waterspout, hurricane and typhoon are joined in the category tornado/hurricane (the patterns also catch some misspelled variants).
head(sort(table(data$EVTYPE),decreasing=TRUE),20)
##
##                     HAIL                TSTM WIND        THUNDERSTORM WIND
##                   288661                   219940                    82563
##                  TORNADO              FLASH FLOOD                    FLOOD
##                    60652                    54277                    25326
##       THUNDERSTORM WINDS                HIGH WIND                LIGHTNING
##                    20843                    20212                    15754
##               HEAVY SNOW               HEAVY RAIN             WINTER STORM
##                    15708                    11723                    11433
##           WINTER WEATHER             FUNNEL CLOUD         MARINE TSTM WIND
##                     7026                     6839                     6175
## MARINE THUNDERSTORM WIND               WATERSPOUT              STRONG WIND
##                     5812                     3796                     3566
##     URBAN/SML STREAM FLD                 WILDFIRE
##                     3392                     2761
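# Recode on a lower-cased copy. Each grep below sees the results of the earlier
# ones, so the order of the statements matters (e.g. "winter storm" becomes "storm").
data$eventnew<-tolower(data$EVTYPE)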
data$eventnew[grep("hail",data$eventnew)]<-"hail"
data$eventnew[grep("storm|thund|lightning",data$eventnew)]<-"storm"
data$eventnew[grep("torn|funnel|spout|hurricane|typhoon",data$eventnew)]<-"tornado/hurricane"
data$eventnew[grep("flood|fld",data$eventnew)]<-"flood"
data$eventnew[grep("wind",data$eventnew)]<-"winds"
data$eventnew[grep("snow|blizzard|ice|freez|frost|icy|avalanche",data$eventnew)]<-"snow/ice"
data$eventnew[grep("rain",data$eventnew)]<-"rain"
data$eventnew[grep("winter|cold",data$eventnew)]<-"winter"
data$eventnew[grep("fire|smoke",data$eventnew)]<-"fire"
data$eventnew[grep("drought|dry",data$eventnew)]<-"drought"
data$eventnew[grep("heat|warm|hot",data$eventnew)]<-"heat"
data$eventnew[grep("fog",data$eventnew)]<-"fog"
data$eventnew[grep("surf",data$eventnew)]<-"surf"
data$eventnew[grep("landslide",data$eventnew)]<-"landslide"
data$eventnew[grep("rip",data$eventnew)]<-"rip current"
# Everything that did not match a pattern above is collected in "other"
data$eventnew[!data$eventnew %in% c("hail","storm","tornado/hurricane","flood","winds","snow/ice","rain","winter",
"fire","drought","heat","fog","surf","landslide","rip current")]<-"other"
data$eventnew<-factor(data$eventnew)
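Because each grep call runs on the already recoded column, the order of the statements decides where an ambiguous label ends up. The following is a minimal sketch on a made-up vector (the name `toy` and its values are illustrative, not taken from the data set) that applies four of the rules above in the same order:

```r
# Toy labels, made up for illustration only
toy <- tolower(c("WINTER STORM", "THUNDERSTORM WIND", "HIGH WIND", "TORNADO"))

# Same order as in the analysis: storms first, then tornadoes, winds, winter
toy[grep("storm|thund|lightning", toy)] <- "storm"
toy[grep("torn|funnel|spout|hurricane|typhoon", toy)] <- "tornado/hurricane"
toy[grep("wind", toy)] <- "winds"
toy[grep("winter|cold", toy)] <- "winter"   # no match left, so nothing changes

print(toy)
```

In this sketch both "winter storm" and "thunderstorm wind" end up in the group storm, because the storm rule is applied before the winds and winter rules. Reordering the statements would move them to other groups.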
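Finally, three new data frames are created, holding the number of fatalities, the number of injuries and the property damage for each newly defined event group, sorted in decreasing order. The damage total is the sum of the PROPDMG column as recorded, without applying the magnitude codes stored in PROPDMGEXP.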
total_fatalities<-ddply(data,.(eventnew),summarise,total.fatalities=sum(FATALITIES))
total_injuries<-ddply(data,.(eventnew),summarise,total.injuries=sum(INJURIES))
total_damage<-ddply(data,.(eventnew),summarise,total.damage=sum(PROPDMG))
total_fatalities<-total_fatalities[order(total_fatalities$total.fatalities,decreasing=TRUE),]
total_injuries<-total_injuries[order(total_injuries$total.injuries,decreasing=TRUE),]
total_damage<-total_damage[order(total_damage$total.damage,decreasing=TRUE),]
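For readers without plyr, the same group-and-sort step can be sketched in base R. This is an illustrative equivalent on made-up numbers, not the code used for the results in this report; the data frame `toy` and the name `toy_fatalities` are assumptions.

```r
# Made-up miniature of the storm data, for illustration only
toy <- data.frame(
  eventnew   = c("flood", "heat", "flood", "storm"),
  FATALITIES = c(2, 5, 1, 0)
)

# Sum fatalities per event group (base R counterpart of ddply + summarise)
toy_fatalities <- aggregate(FATALITIES ~ eventnew, data = toy, FUN = sum)

# Sort from the deadliest group downwards
toy_fatalities <- toy_fatalities[order(toy_fatalities$FATALITIES, decreasing = TRUE), ]
print(toy_fatalities)
```

Here heat comes first with 5 fatalities, followed by flood with 3 and storm with 0.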
The most dangerous weather events for human health are tornadoes and hurricanes: they account for more than 38% of fatalities and more than 66% of injuries caused by weather events. Storms, heat, floods and winds also cause substantial harm. Together, these five types of events are responsible for 86.7% of deaths and 93.6% of injuries due to weather.
par(mfrow = c(2,1),las=2)
barplot(total_fatalities$total.fatalities,names.arg=total_fatalities$eventnew,col=2,cex.names=.8,
cex.axis=0.8,main="Fatalities caused by weather events")
barplot(total_injuries$total.injuries,names.arg=total_injuries$eventnew,col=2,cex.names=.8,
cex.axis=0.8,main="Injuries caused by weather events")
Figure 1. Number of fatalities (upper) and injuries (lower) caused by the different types of weather events
total_fatalities$total.fatalities<-total_fatalities$total.fatalities/sum(total_fatalities$total.fatalities)
total_fatalities
## eventnew total.fatalities
## 14 tornado/hurricane 0.381248
## 6 heat 0.207527
## 3 flood 0.102542
## 12 storm 0.095873
## 15 winds 0.079696
## 10 rip current 0.037768
## 11 snow/ice 0.033873
## 16 winter 0.018224
## 13 surf 0.010763
## 8 other 0.006603
## 9 rain 0.006603
## 2 fire 0.005943
## 4 fog 0.005282
## 5 hail 0.002971
## 7 landslide 0.002575
## 1 drought 0.002509
sum(total_fatalities$total.fatalities[1:5])
## [1] 0.8669
total_injuries$total.injuries<-total_injuries$total.injuries/sum(total_injuries$total.injuries)
total_injuries
## eventnew total.injuries
## 14 tornado/hurricane 0.6601674
## 12 storm 0.0848443
## 6 heat 0.0656666
## 15 winds 0.0636813
## 3 flood 0.0617742
## 11 snow/ice 0.0165874
## 2 fire 0.0114426
## 5 hail 0.0104392
## 4 fog 0.0076568
## 16 winter 0.0058209
## 10 rip current 0.0037644
## 8 other 0.0037003
## 9 rain 0.0019925
## 13 surf 0.0017505
## 7 landslide 0.0003771
## 1 drought 0.0003345
sum(total_injuries$total.injuries[1:5])
## [1] 0.9361
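The top-five sums above are single points on a cumulative-share curve. As a small sketch, assuming totals that are already sorted in decreasing order (the vector `toy_totals` is made up), prop.table and cumsum give every share and every running total at once, without overwriting the original counts:

```r
# Made-up totals, already sorted in decreasing order (illustration only)
toy_totals <- c(a = 50, b = 30, c = 15, d = 5)

shares <- prop.table(toy_totals)   # each total divided by the grand total
cum_shares <- cumsum(shares)       # running share of the top 1, 2, ... groups

print(round(cum_shares, 2))
```

The k-th element of `cum_shares` is the share covered by the k largest groups, so the value for the first two groups here is 0.8.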
Tornadoes and hurricanes also cause the most property damage (about 30% of the summed PROPDMG values), followed by floods, storms and winds, which also rank high in fatalities and injuries. Together these four types account for about 89% of the total. Heat, in contrast, is a major cause of harm to people but a negligible cause of property damage.
par(las=2)
barplot(total_damage$total.damage,names.arg=total_damage$eventnew,col=2,cex.names=.8,
cex.axis=0.8,main="Damage caused by weather events")
Figure 2. Property damage (summed PROPDMG values) caused by the different types of weather events
total_damage$total.damage<-total_damage$total.damage/sum(total_damage$total.damage)
total_damage
## eventnew total.damage
## 14 tornado/hurricane 2.987e-01
## 3 flood 2.262e-01
## 12 storm 2.036e-01
## 15 winds 1.650e-01
## 5 hail 6.425e-02
## 11 snow/ice 1.761e-02
## 2 fire 1.151e-02
## 9 rain 4.912e-03
## 16 winter 2.368e-03
## 7 landslide 1.752e-03
## 4 fog 1.569e-03
## 8 other 1.078e-03
## 13 surf 5.926e-04
## 1 drought 5.546e-04
## 6 heat 2.786e-04
## 10 rip current 1.498e-05
sum(total_damage$total.damage[1:4])
## [1] 0.8935
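The shares above are based on the PROPDMG column alone. In the storm data the magnitude of each amount is stored separately in PROPDMGEXP, where K, M and B stand for thousands, millions and billions. The following sketch shows one possible way to convert to dollar amounts before summing. The rows in `toy` are made up, `damage_usd` is a hypothetical column name, and treating every other code as a multiplier of 1 is an assumption, not a documented rule.

```r
# Made-up rows for illustration; PROPDMG and PROPDMGEXP are columns of the storm data
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Multipliers for the documented codes; anything else is left at 1 (an assumption)
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)
factor_by_row <- multiplier[toupper(toy$PROPDMGEXP)]
factor_by_row[is.na(factor_by_row)] <- 1

# Hypothetical new column holding the damage in dollars
toy$damage_usd <- toy$PROPDMG * unname(factor_by_row)
print(toy)
```

Applied to the full data set, such a conversion could change the ranking in Figure 2, since a single event recorded in billions would outweigh many events recorded in thousands.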