This study investigates which storm events are most harmful to population health in the USA and which cause the most economic damage, using data from the NOAA Storm Database. After loading and cleaning the data, we plot the five most damaging event types for each question. Tornadoes are by far the most damaging to population health, while floods cause the most economic damage.
We begin by loading the data.
data<-read.csv("repdata-data-StormData.csv.bz2")
We must multiply the property and crop damage figures (PROPDMG and CROPDMG) by the orders of magnitude encoded in PROPDMGEXP and CROPDMGEXP, where K, M and B stand for thousands, millions and billions. First we must correct those two columns, as they contain invalid entries.
# Keep only the recognised exponent codes (K, M, B, 0); every other entry becomes NA.
data$PROPDMGEXP <- factor(data$PROPDMGEXP, levels = c("K", "M", "B", "0"))
data$CROPDMGEXP <- factor(data$CROPDMGEXP, levels = c("K", "M", "B", "0"))
# Treat the invalid (NA) entries as exponent 0, i.e. no scaling.
data$PROPDMGEXP[is.na(data$PROPDMGEXP)] <- "0"
data$CROPDMGEXP[is.na(data$CROPDMGEXP)] <- "0"
# Scale property damage: K = 10^3, M = 10^6, B = 10^9.
data$PROPDMG[data$PROPDMGEXP == "K"] <- data$PROPDMG[data$PROPDMGEXP == "K"] * (10^3)
data$PROPDMG[data$PROPDMGEXP == "M"] <- data$PROPDMG[data$PROPDMGEXP == "M"] * (10^6)
data$PROPDMG[data$PROPDMGEXP == "B"] <- data$PROPDMG[data$PROPDMGEXP == "B"] * (10^9)
# Scale crop damage in the same way.
data$CROPDMG[data$CROPDMGEXP == "K"] <- data$CROPDMG[data$CROPDMGEXP == "K"] * (10^3)
data$CROPDMG[data$CROPDMGEXP == "M"] <- data$CROPDMG[data$CROPDMGEXP == "M"] * (10^6)
data$CROPDMG[data$CROPDMGEXP == "B"] <- data$CROPDMG[data$CROPDMGEXP == "B"] * (10^9)
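The same scaling can also be written as a single vectorized lookup. The sketch below is an illustration on a small made-up data frame, not the code used for the results in this report; the names `toy`, `multipliers` and `scale_damage` are hypothetical. It also upper-cases the exponent codes so that entries such as "k" or "m" are read as "K" and "M". That is an assumption about how those entries should be interpreted, and it differs from the cleaning above, where any code other than K, M, B and 0 receives no multiplier.

```r
# Hypothetical toy data standing in for the PROPDMG / PROPDMGEXP columns.
toy <- data.frame(
  PROPDMG    = c(2.5, 10, 1, 3, 7),
  PROPDMGEXP = c("K", "M", "B", "k", "?"),
  stringsAsFactors = FALSE
)

# Multiplier for each recognised exponent code.
multipliers <- c(K = 1e3, M = 1e6, B = 1e9)

# Scale a damage column by its exponent column.
# Codes not found in `multipliers` (for example "?", "" or "0") get a multiplier of 1.
scale_damage <- function(dmg, exp_code) {
  code <- toupper(as.character(exp_code))  # assumption: "k" is read as "K"
  mult <- unname(multipliers[code])        # NA where the code is unrecognised
  mult[is.na(mult)] <- 1
  dmg * mult
}

toy$PROPDMG_USD <- scale_damage(toy$PROPDMG, toy$PROPDMGEXP)
print(toy)
```

Writing the result to a new column, as here, leaves the raw figures intact, so the scaling step can be re-run without multiplying the same values twice.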
Now that the data has been processed we can proceed to investigate the questions at hand.
We assess the effect on population health by finding which event types cause the most fatalities and injuries combined.
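# Total fatalities plus injuries for each event type.
pophealthtable <- tapply(data$FATALITIES + data$INJURIES, data$EVTYPE, sum)
# Keep the five event types with the largest totals.
pophealthdata <- pophealthtable[order(pophealthtable, decreasing = TRUE)][1:5]
# Enlarge the bottom margin and move the axis titles outward so the rotated labels fit.
par(mar = c(6.5, 5.1, 4.1, 0.5), mgp = c(4, 1, 0))
# Break the two longest labels across two lines.
names(pophealthdata)[3] <- "TSTM\nWIND"
names(pophealthdata)[2] <- "EXCESSIVE\nHEAT"
barplot(pophealthdata, main = "Population damage by \nmost damaging 5 events", ylab = "Sum of injuries and fatalities", names.arg = names(pophealthdata), las = 2)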
This shows that tornadoes are by far the most damaging events to population health, dwarfing the effects of the next four most harmful event types.
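The aggregation pattern used above, summing a measure per event type with tapply and keeping the largest totals, can be illustrated on a small made-up data frame. This is only a sketch: the event names and counts in `toy` are invented for illustration and are not taken from the NOAA data.

```r
# Hypothetical toy records; the values are made up for illustration only.
toy <- data.frame(
  EVTYPE     = c("A", "B", "A", "C", "B", "A"),
  FATALITIES = c(1, 0, 2, 0, 1, 0),
  INJURIES   = c(10, 3, 5, 1, 0, 4)
)

# Sum fatalities and injuries within each event type.
totals <- tapply(toy$FATALITIES + toy$INJURIES, toy$EVTYPE, sum)

# Order the totals from largest to smallest and keep the first two.
top2 <- totals[order(totals, decreasing = TRUE)][1:2]
print(top2)
```

Indexing with `1:2` (or `1:5` in the analysis) assumes at least that many event types exist; with fewer, the extra positions come back as NA.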
We investigate which events have the worst economic consequences by examining which event types incur the largest combined property and crop damage.
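# Total property plus crop damage (in dollars) for each event type.
ecotable <- tapply(data$PROPDMG + data$CROPDMG, data$EVTYPE, sum)
# Keep the five event types with the largest totals.
ecodata <- ecotable[order(ecotable, decreasing = TRUE)][1:5]
# Break the two longest labels across two lines.
names(ecodata)[2] <- "HURRICANE\n/TYPHOON"
names(ecodata)[4] <- "STORM\nSURGE"
# Enlarge the bottom margin and move the axis titles outward so the rotated labels fit.
par(mar = c(6.5, 5.1, 4.1, 0.5), mgp = c(4, 1, 0))
barplot(ecodata, main = "Economic damage by \nmost damaging 5 events", ylab = "Sum of crop and property damage, $", names.arg = names(ecodata), las = 2)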
Thus we observe that floods cause the most economic damage, approximately twice as much as the next most damaging event type, hurricanes/typhoons. Tornadoes also cause significant economic damage, ranking third, and their extreme toll on population health makes them a very serious event type.