Which weather event is most damaging? (Reproducible Research: Peer Assessment 2)

Synopsis

This analysis attempts to determine which weather event type is most damaging to human health and to the economy using the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. The data in the database spans from 1950 to 2011.

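The data clearly suggests that tornados are the most damaging weather events to human health. By a smaller margin, the data suggests floods are the most damaging to the economy, with about $180 billion in combined property and crop damage, roughly twice the total for hurricanes.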

Data Processing

The following code, commented out here, downloads the data into R’s working directory and names it “Storm_Data.csv”. The download is bzip2-compressed, but read.csv can read it directly.

#fileURL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
#download.file(fileURL,destfile="Storm_Data.csv")
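Because the download lines above are commented out, a rerun depends on the file already being present. One way to leave the step enabled without downloading again each time is a small guard, sketched here. The name fetch_if_missing is a hypothetical helper, not part of the original analysis.

```r
# Hypothetical helper (not part of the original analysis): download a file
# only when no local copy exists yet.
fetch_if_missing <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # mode = "wb" writes the compressed bytes unchanged on every platform.
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Usage with the URL and file name from this report:
# fetch_if_missing(
#   "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#   "Storm_Data.csv"
# )
```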

The following code will read the data into R as a data frame called “Storms”.

Storms<-read.csv("Storm_Data.csv",stringsAsFactors=FALSE)

Since the analysis to determine which type of storm is most damaging to health and to the economy requires that storm types be clearly defined, cleaning of the variable EVTYPE is necessary. The following code capitalizes all letters in EVTYPE and then maps character strings signifying the same storm type to one type. For example, entries containing “SNOW”, “ICE”, “ICY”, or “FREEZ” are converted to “WINTER STORM” because they all describe similar weather phenomena. The replacements run in order, so an entry such as “THUNDERSTORM WINDS” is labeled “HIGH WIND” before the thunderstorm rule is reached.

Storms$EVTYPE<-toupper(Storms$EVTYPE)

Storms$EVTYPE[grep("HURRICANE",Storms$EVTYPE)]<-"HURRICANE"
Storms$EVTYPE[grep("TROPICAL STORM",Storms$EVTYPE)]<-"TROPICAL STORM"
Storms$EVTYPE[grep("TORNADO",Storms$EVTYPE)]<-"TORNADO"
Storms$EVTYPE[grep("SPOUT",Storms$EVTYPE)]<-"TORNADO"
Storms$EVTYPE[grep("WIND",Storms$EVTYPE)]<-"HIGH WIND"
Storms$EVTYPE[grep("WND",Storms$EVTYPE)]<-"HIGH WIND"
Storms$EVTYPE[grep("HEAT",Storms$EVTYPE)]<-"HEAT"
Storms$EVTYPE[grep("FIRE",Storms$EVTYPE)]<-"FIRE"
Storms$EVTYPE[grep("SNOW",Storms$EVTYPE)]<-"WINTER STORM"
Storms$EVTYPE[grep("FREEZ",Storms$EVTYPE)]<-"WINTER STORM"
Storms$EVTYPE[grep("ICE",Storms$EVTYPE)]<-"WINTER STORM"
Storms$EVTYPE[grep("ICY",Storms$EVTYPE)]<-"WINTER STORM"
Storms$EVTYPE[grep("WINT",Storms$EVTYPE)]<-"WINTER STORM"
Storms$EVTYPE[grep("BLIZZARD",Storms$EVTYPE)]<-"WINTER STORM"
Storms$EVTYPE[grep("FLOOD",Storms$EVTYPE)]<-"FLOOD"
Storms$EVTYPE[grep("FLD",Storms$EVTYPE)]<-"FLOOD"
Storms$EVTYPE[grep("THUNDERSTO",Storms$EVTYPE)]<-"THUNDERSTORM"
Storms$EVTYPE[grep("LIGHTN",Storms$EVTYPE)]<-"THUNDERSTORM"
Storms$EVTYPE[grep("HAIL",Storms$EVTYPE)]<-"THUNDERSTORM"
Storms$EVTYPE[grep("RIP C",Storms$EVTYPE)]<-"RIP CURRENT"
Storms$EVTYPE[grep("AVALAN",Storms$EVTYPE)]<-"AVALANCHE"
Storms$EVTYPE[grep("SURGE",Storms$EVTYPE)]<-"STORM SURGE"
Storms$EVTYPE[grep("TIDE",Storms$EVTYPE)]<-"STORM SURGE"
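The same kind of cleaning can be driven by a lookup of pattern and label pairs, which makes the order of the rules explicit. The following is a minimal sketch on a toy vector, using only a few of the patterns above. The names rules, recode_evtype, and toy are illustrative assumptions, not part of the original analysis.

```r
# Illustrative subset of the rules above: names are patterns, values are labels.
# Rules are applied top to bottom, so earlier rules win.
rules <- c(
  TORNADO    = "TORNADO",
  SPOUT      = "TORNADO",
  WIND       = "HIGH WIND",
  THUNDERSTO = "THUNDERSTORM",
  HAIL       = "THUNDERSTORM"
)

# Hypothetical helper: uppercase the input, then apply each rule in order.
recode_evtype <- function(x, rules) {
  x <- toupper(x)
  for (i in seq_along(rules)) {
    x[grepl(names(rules)[i], x, fixed = TRUE)] <- rules[[i]]
  }
  x
}

toy <- c("Tornado F2", "waterspout", "Thunderstorm Winds",
         "Thunderstorm", "Hail 1.75", "DROUGHT")
cleaned <- recode_evtype(toy, rules)
print(cleaned)
```

In this toy run “Thunderstorm Winds” ends up as “HIGH WIND” because the WIND rule fires before the THUNDERSTO rule, which mirrors the ordering of the replacements above. Entries that match no rule, such as “DROUGHT”, are left unchanged.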

The analysis to determine which storm type is most damaging to the economy relies on the variables PROPDMG and CROPDMG, signifying damage to property and damage to crops. The units of these variables are stored in the variables PROPDMGEXP and CROPDMGEXP. In these variables, “K” signifies that PROPDMG/CROPDMG is reported in thousands, “M” in millions, and “B” in billions. The following R code keeps only the records with nonzero property or crop damage and scales the PROPDMG and CROPDMG values based on the PROPDMGEXP and CROPDMGEXP variables. Records with any other non-empty code in either variable are dropped by the subsetting.

# Multipliers for the unit codes: K = thousands, M = millions, B = billions.
k <- function(a) a * 1e3
m <- function(a) a * 1e6
b <- function(a) a * 1e9

# Keep only records that report some property or crop damage.
Storms2 <- subset(Storms, Storms$PROPDMG > 0 | Storms$CROPDMG > 0)

# Split by property-damage code; rows with any other code are dropped.
group1 <- subset(Storms2, Storms2$PROPDMGEXP == "K")
group2 <- subset(Storms2, Storms2$PROPDMGEXP == "M")
group3 <- subset(Storms2, Storms2$PROPDMGEXP == "B")
group4 <- subset(Storms2, Storms2$PROPDMGEXP == "")

# The multipliers are vectorized, so they can be applied to whole columns.
group1$PROPDMG <- k(group1$PROPDMG)
group2$PROPDMG <- m(group2$PROPDMG)
group3$PROPDMG <- b(group3$PROPDMG)

Storms2 <- rbind(group1, group2, group3, group4)

# Repeat the same steps for the crop-damage code.
group5 <- subset(Storms2, Storms2$CROPDMGEXP == "K")
group6 <- subset(Storms2, Storms2$CROPDMGEXP == "M")
group7 <- subset(Storms2, Storms2$CROPDMGEXP == "B")
group8 <- subset(Storms2, Storms2$CROPDMGEXP == "")

group5$CROPDMG <- k(group5$CROPDMG)
group6$CROPDMG <- m(group6$CROPDMG)
group7$CROPDMG <- b(group7$CROPDMG)

Storms2 <- rbind(group5, group6, group7, group8)
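The scaling can also be written as a single lookup instead of splitting and recombining the data. The following is a minimal sketch on a toy data frame, assuming the same four codes as above (“K”, “M”, “B”, and empty). The names scale_damage and toy are hypothetical and not part of the original analysis.

```r
# Hypothetical helper: map each unit code to a multiplier and scale the
# damage values in one vectorized step. Codes outside K/M/B/"" give NA.
scale_damage <- function(dmg, exp_code) {
  codes <- c("K", "M", "B", "")
  mult  <- c(1e3, 1e6, 1e9, 1)[match(exp_code, codes)]
  dmg * mult
}

# Toy data, not taken from the storm database.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10, 3),
  PROPDMGEXP = c("K", "M", "B", "", "h"),
  stringsAsFactors = FALSE
)
toy$PROPDMG_USD <- scale_damage(toy$PROPDMG, toy$PROPDMGEXP)

# Dropping the NA rows mirrors the subsetting approach, which also
# discards records with any other code.
toy_kept <- toy[!is.na(toy$PROPDMG_USD), ]
print(toy_kept)
```

A lookup like this keeps the rows in their original order and makes the treatment of unexpected codes visible as NA values rather than as silently missing rows.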

Results

Plotting injuries and fatalities by storm type shows clearly that in this dataset, tornados by far cause the most damage to human health. The figure below illustrates this point.

inj<-as.data.frame(sort(tapply(Storms$INJURIES,Storms$EVTYPE,sum)))
names(inj)=c("injuries")
inj$EVTYPE=row.names(inj)

fat<-as.data.frame(sort(tapply(Storms$FATALITIES,Storms$EVTYPE,sum)))
names(fat)=c("fatalities")
fat$EVTYPE=row.names(fat)

par(mfrow=c(1, 2),oma=c(0,0,2,0))
# Twelve gray bars plus a red bar for the largest total, which is plotted last.
colvect=c(rep("gray",12),"red")
# Rows 303:315 are the 13 largest totals; this assumes 315 distinct EVTYPE values after cleaning.
barplot(inj[303:315,1],names.arg=inj[303:315,2], horiz=TRUE,cex.names=.6, las=1, main="Injuries by Storm Type",xpd=FALSE,col=colvect)
barplot(fat[303:315,1],names.arg=fat[303:315,2], horiz=TRUE,cex.names=.6, las=1, main="Fatalities by Storm Type",col=colvect)
title("Fig 1-Storm Damage to Health by Type", outer=TRUE)

Fig 1-Storm Damage to Health by Type (panels: Injuries by Storm Type, Fatalities by Storm Type)
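The plotting code above selects the largest totals with fixed row numbers, which only works while the number of distinct storm types stays the same. A minimal sketch of selecting the top entries by count instead, on toy data; top_n_totals, highlight_cols, and the toy vectors are hypothetical names, not part of the original analysis.

```r
# Hypothetical helper: sum values per group and return the n largest totals,
# sorted ascending so the largest is plotted last (top bar when horiz = TRUE).
top_n_totals <- function(values, groups, n = 13) {
  totals <- vapply(split(values, groups), sum, numeric(1))
  tail(sort(totals), n)
}

# Hypothetical helper: gray bars with the last (largest) bar in red.
highlight_cols <- function(n) {
  c(rep("gray", n - 1), "red")
}

# Toy data, not taken from the storm database.
injuries <- c(5, 1, 3, 2, 10)
evtype   <- c("A", "B", "A", "C", "B")
top2 <- top_n_totals(injuries, evtype, n = 2)
print(top2)

# The result can be passed to barplot() directly, for example:
# barplot(top2, horiz = TRUE, las = 1, col = highlight_cols(length(top2)))
```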

Plotting property and crop damage by storm type shows that floods are the most damaging weather events to the economy. The following code will create this plot with costs listed in billions.

prop<-tapply(Storms2$PROPDMG,Storms2$EVTYPE,sum)
crop<-tapply(Storms2$CROPDMG,Storms2$EVTYPE,sum)
damdata<-as.data.frame(cbind(prop,crop))
damdata$EVTYPE=row.names(damdata)
damdata$Tot=damdata$prop+damdata$crop
damdata<-damdata[with(damdata,order(Tot)),]

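# Convert dollars to billions of dollars.
d <- function(a) a / 1e9

# Rows 81:90 are the ten largest totals; this assumes damdata has 90 rows after sorting.
plotdata <- as.matrix(damdata[81:90, 1:2])
plotdata[, 1] <- d(plotdata[, 1])
plotdata[, 2] <- d(plotdata[, 2])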
plotdata<-t(plotdata)
row.names(plotdata)=c("Property Damage","Crop Damage")      
par(mfrow=c(1,1))
barplot(plotdata,names.arg=damdata[81:90,3],main="Fig 2-Storm Damage to Property and Crops by Type",legend.text=row.names(plotdata),horiz=TRUE,cex.names=.6, las=1,xlab="Cost in Billions",args.legend = list(x="bottomright"))

Fig 2-Storm Damage to Property and Crops by Type (stacked bars: Property Damage, Crop Damage; x-axis: Cost in Billions)
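The stacked bars in Fig 2 come from a two-row matrix with one column per storm type. A minimal sketch of building such a matrix on toy numbers, with the division by one billion applied to the whole matrix at once; to_billions and the damage data frame are hypothetical, and the values are made up for illustration.

```r
# Toy totals in dollars, not taken from the storm database.
damage <- data.frame(
  prop = c(2e9, 5e8, 1.5e10),
  crop = c(1e9, 2e9, 5e8),
  row.names = c("FIRE", "DROUGHT", "FLOOD")
)

# Sort ascending by combined damage so the largest type is plotted last.
damage <- damage[order(damage$prop + damage$crop), ]

# Hypothetical helper: transpose to one column per storm type and convert
# to billions. The row labels assume exactly two columns (property, crop).
to_billions <- function(df, cols = c("prop", "crop")) {
  m <- t(as.matrix(df[, cols])) / 1e9
  rownames(m) <- c("Property Damage", "Crop Damage")
  m
}

plot_matrix <- to_billions(damage)
print(plot_matrix)

# barplot(plot_matrix, horiz = TRUE, las = 1, xlab = "Cost in Billions",
#         legend.text = rownames(plot_matrix))
```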

The following table shows the values for the events responsible for the most property and crop damage.

damdata$Tot=damdata$prop+damdata$crop
damdata<-damdata[with(damdata,order(Tot, decreasing=TRUE)),]
damdata[1:10,]
##                     prop      crop         EVTYPE       Tot
## FLOOD          1.676e+11 1.227e+10          FLOOD 1.798e+11
## HURRICANE      8.474e+10 5.505e+09      HURRICANE 9.024e+10
## TORNADO        5.859e+10 3.675e+08        TORNADO 5.896e+10
## STORM SURGE    4.797e+10 8.550e+05    STORM SURGE 4.798e+10
## THUNDERSTORM   1.811e+10 3.052e+09   THUNDERSTORM 2.116e+10
## WINTER STORM   1.240e+10 7.205e+09   WINTER STORM 1.961e+10
## HIGH WIND      1.604e+10 2.147e+09      HIGH WIND 1.819e+10
## DROUGHT        1.046e+09 1.397e+10        DROUGHT 1.502e+10
## FIRE           8.502e+09 4.033e+08           FIRE 8.905e+09
## TROPICAL STORM 7.714e+09 6.949e+08 TROPICAL STORM 8.409e+09