Storm Data Analysis: Which Storms Are Most Destructive?
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(knitr)
## Warning: package 'knitr' was built under R version 3.2.2
Synopsis
We examine NOAA Storm data to determine what kinds of storms are most destructive both in terms of human health and in terms of economic costs.
Load data
storm.df<-read.csv("repdata-data-StormData.csv",header=TRUE,nrows=368798)
Data Processing: Pre-Process and Prepare data for analysis
Select just the relevant columns: EVTYPE plus the information on injuries, fatalities, and damage. Convert the property damage and crop damage into dollar terms for analysis. Add property damage and crop damage together to get total damage.
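# Keep the event type plus the casualty and damage columns, then drop rows
# whose damage-exponent codes are ambiguous or missing.
storm.df2<-select(storm.df,EVTYPE,FATALITIES,INJURIES:CROPDMGEXP)%>%
filter(PROPDMGEXP!="-",PROPDMGEXP!="?",PROPDMGEXP!="+",CROPDMGEXP!="?",
CROPDMGEXP!="2",PROPDMGEXP!="")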
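# Exponent lookup tables. Each exponent is matched by position to the order in
# which unique() returns the codes, so these vectors must be rechecked if the
# rows read or the filters above change.
Prop.exp<-data.frame(PROPDMGEXP=unique(storm.df2$PROPDMGEXP),
Prop.exp=c(3,6,9,6,0,5,6,4,2,3,2,7,2,1,8))
Crop.exp<-data.frame(CROPDMGEXP=unique(storm.df2$CROPDMGEXP),Crop.exp=c(0,6,3,6,0,3,9))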
storm.df2<-left_join(storm.df2,Prop.exp,by="PROPDMGEXP")%>%
left_join(Crop.exp,by="CROPDMGEXP")
storm.df2<-storm.df2%>%mutate(Prop.Damage=PROPDMG*(10^Prop.exp),
Crop.Damage=CROPDMG*(10^Crop.exp), Total.Damage=Prop.Damage+Crop.Damage)
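The lookup tables above pair each exponent with a code by position. A minimal sketch of the same conversion keyed by the code itself is given here, in base R. The helper name exp_code_to_power is hypothetical, and the reading of the letter codes (H = hundreds, K = thousands, M = millions, B = billions) is an assumption inferred from the exponents used above, not something stated in this analysis.

```r
# Hypothetical helper (not part of the original analysis): map a damage
# exponent code to a power of ten by its value rather than by its position.
exp_code_to_power <- function(code) {
  code <- toupper(trimws(as.character(code)))
  # Assumed letter codes: H = 10^2, K = 10^3, M = 10^6, B = 10^9.
  letter_powers <- c(H = 2, K = 3, M = 6, B = 9)
  power <- unname(letter_powers[code])           # NA for anything that is not a letter code
  is_digit <- grepl("^[0-9]$", code)
  power[is_digit] <- as.numeric(code[is_digit])  # a digit d is read as 10^d, as in the tables above
  power[which(code == "")] <- 0                  # blank read as 10^0, as in Crop.exp above
  power                                          # unrecognised codes such as "?" stay NA
}

# Small made-up example: amounts and their exponent codes.
codes   <- c("K", "m", "B", "h", "5", "", "?")
amounts <- c(25, 2.5, 1, 3, 1, 10, 7)
powers  <- exp_code_to_power(codes)
damage  <- amounts * 10^powers
```

Under these assumptions the helper could replace the two joins, for example inside mutate() as PROPDMG*10^exp_code_to_power(PROPDMGEXP). Codes it does not recognise yield NA instead of being silently mismatched.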
storm.health<-storm.df2%>%group_by(EVTYPE)%>%
summarise(Avg.Fat=mean(FATALITIES,na.rm=TRUE),Avg.Inj=mean(INJURIES,na.rm=TRUE))%>%
arrange(desc(Avg.Fat),desc(Avg.Inj))
head(storm.health,10)
## Source: local data frame [10 x 3]
##
## EVTYPE Avg.Fat Avg.Inj
## 1 TORNADOES, TSTM WIND, HAIL 25.000 0.00
## 2 WINTER STORMS 10.000 17.00
## 3 TROPICAL STORM GORDON 8.000 43.00
## 4 HEAT 4.400 94.00
## 5 HEAT WAVE 4.375 10.75
## 6 HEAT WAVE DROUGHT 4.000 15.00
## 7 HIGH WIND/SEAS 4.000 0.00
## 8 HIGH WIND AND SEAS 3.000 20.00
## 9 SNOW AND ICE 3.000 0.00
## 10 HURRICANE OPAL/HIGH WINDS 2.000 0.00
storm.health2<-storm.health%>%arrange(desc(Avg.Inj),desc(Avg.Fat))
head(storm.health2,10)
## Source: local data frame [10 x 3]
##
## EVTYPE Avg.Fat Avg.Inj
## 1 HEAT 4.40 94.0
## 2 TROPICAL STORM GORDON 8.00 43.0
## 3 WILD FIRES 0.75 37.5
## 4 THUNDERSTORMW 0.00 27.0
## 5 HIGH WIND AND SEAS 3.00 20.0
## 6 SNOW/HIGH WINDS 0.00 18.0
## 7 WINTER STORMS 10.00 17.0
## 8 HEAT WAVE DROUGHT 4.00 15.0
## 9 WINTER STORM HIGH WINDS 1.00 15.0
## 10 DENSE FOG 0.90 12.0
health.plot<-storm.health[1:10,]%>%select(-Avg.Inj)
Create Health Plots
with(health.plot,barplot(Avg.Fat,names.arg=EVTYPE,xlab="Storm Type",
main="Average Fatalities by Storm Type (Top 10)",col="blue"))
health.plot2<-storm.health2[1:10,]%>%select(-Avg.Fat)
with(health.plot2,barplot(Avg.Inj,names.arg=EVTYPE,xlab="Storm Type",
main="Average Injuries by Storm Type (Top 10)",col="red"))
The combined event type "TORNADOES, TSTM WIND, HAIL" has the highest average fatalities per event (25), but HEAT has the highest average injuries per event (94). HEAT also ranks fourth for average fatalities (4.4), so heat appears to be the most harmful event type for population health.
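The health averages do not show how many reports each event type has, so a single extreme report can top the ranking. A minimal sketch of reporting the count and total next to the mean follows, in the same spirit as the count filter applied to the damage table later in this analysis. The data are invented for illustration and are not taken from the NOAA file. Base R is used so the sketch runs without extra packages.

```r
# Toy data, invented for illustration only.
toy <- data.frame(
  EVTYPE     = c("RARE EVENT", "COMMON EVENT", "COMMON EVENT", "COMMON EVENT"),
  FATALITIES = c(25, 4, 0, 2)
)

# Split fatalities by event type, then compute mean, total, and report count.
groups <- split(toy$FATALITIES, toy$EVTYPE)
health_summary <- data.frame(
  EVTYPE    = names(groups),
  Avg.Fat   = sapply(groups, mean),
  Total.Fat = sapply(groups, sum),
  count     = sapply(groups, length),
  row.names = NULL
)

# Rank by average fatalities: the single-report type comes first.
ranked <- health_summary[order(-health_summary$Avg.Fat), ]

# Keep only types with more than one report (the threshold is arbitrary here).
frequent <- health_summary[health_summary$count > 1, ]
```

In this toy example the type with one report leads the ranking by mean, while the type with three reports is the only one that survives the count filter.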
We consider Property Damage and Crop Damage combined to determine the greatest economic impact.
storm.econ<-storm.df2%>%group_by(EVTYPE)%>%
summarise(Avg.Dmg=mean(Total.Damage,na.rm=TRUE),count=n())%>%
arrange(desc(Avg.Dmg))
head(storm.econ,10)
## Source: local data frame [10 x 3]
##
## EVTYPE Avg.Dmg count
## 1 HEAVY RAIN/SEVERE WEATHER 2500000000 1
## 2 TORNADOES, TSTM WIND, HAIL 1602500000 1
## 3 HURRICANE OPAL 398980750 8
## 4 SEVERE THUNDERSTORM 200893333 6
## 5 WILD FIRES 156025000 4
## 6 HURRICANE 153923443 70
## 7 HURRICANE OPAL/HIGH WINDS 110000000 1
## 8 RIVER FLOOD 98521160 103
## 9 HAILSTORM 80333333 3
## 10 TYPHOON 66783889 9
Now we consider only event types with more than 100 occurrences, since the previous list is dominated by extreme one-off cases. Note, however, that large storms had the most extreme impacts.
storm.econ2<-storm.df2%>%group_by(EVTYPE)%>%
summarise(Avg.Dmg=mean(Total.Damage,na.rm=TRUE),count=n())%>%
filter(count>100)%>%arrange(desc(Avg.Dmg))
head(storm.econ2,10)
## Source: local data frame [10 x 3]
##
## EVTYPE Avg.Dmg count
## 1 RIVER FLOOD 98521160.2 103
## 2 ICE STORM 27353411.5 220
## 3 WINTER STORM 16091667.9 330
## 4 WILD/FOREST FIRE 6743077.3 110
## 5 FLOOD 5090194.5 2153
## 6 HEAVY SNOW 1252927.7 588
## 7 HEAVY RAIN 1115425.6 197
## 8 FLASH FLOODING 1088594.7 292
## 9 HIGH WINDS 1041499.6 603
## 10 FLOOD/FLASH FLOOD 975254.6 275
Plot Total Damage
with(head(storm.econ2,5),barplot(Avg.Dmg,names.arg=EVTYPE,xlab="Storm Type",
main="Average Total Damage by Storm Type (Top 5)",col="green"))
Now we see that various kinds of flooding and heavy storms are generally the most expensive in terms of economic impact. River floods are by far the largest here, with about $98.5M in average damage per event.
Results
Overall, we see that heat is the most damaging in terms of human health, but flooding, especially river flooding, and heavy storms cost the most.