Synopsis

The NOAA storm database was processed and analysed to determine, by event type, the total harm to human health and the total economic damage over the entire recorded period (1950-2011).

It was found that tornadoes posed the greatest risk to human health and flooding caused the most property damage, while drought was responsible for the most crop damage. Flooding contributed substantially to all metrics, although it was less dangerous to people than excessive heat exposure.

Data Processing

The U.S. National Oceanic and Atmospheric Administration’s storm database was used for this analysis. The database was downloaded from the following link as a compressed CSV file and loaded into RStudio.

As this analysis focusses primarily on how human health and economic damage relate to event type, only the relevant columns were retained to reduce processing time. The variables considered are: event type, fatalities, injuries, crop damage and property damage.

Property and crop damage are each stored as a value plus a separate exponent column, in which a letter (h for hundred, k for thousand, m for million, b for billion) gives the order of magnitude. A function is therefore needed to convert each letter into a numeric multiplier before damages can be compared. Once the multiplier is known, the true damage value is found by multiplying it by the PROPDMG and CROPDMG data.

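library(ggplot2)  # needed for the figures in the Results section
# Read the compressed CSV directly and keep only the columns used in the analysis
stormData <- read.csv(bzfile("repdata%2Fdata%2FStormData.csv.bz2"))
stormDataSub <- stormData[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]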


# Convert an exponent code (h, k, m, b; either case) to its numeric multiplier.
# Blank or "0" exponents give a multiplier of 0; any other code is left as NA.
expToNum <- function(exp) {
  code <- tolower(trimws(as.character(exp)))
  mult <- unname(c(h = 1e2, k = 1e3, m = 1e6, b = 1e9)[code])
  mult[code %in% c("", "0")] <- 0
  mult
}

stormDataSub$PROPEXPNUM <- expToNum(stormDataSub$PROPDMGEXP)
stormDataSub$CROPEXPNUM <- expToNum(stormDataSub$CROPDMGEXP)

# True damage in dollars = recorded value x multiplier
stormDataSub$Propdamage <- stormDataSub$PROPDMG * stormDataSub$PROPEXPNUM
stormDataSub$Cropdamage <- stormDataSub$CROPDMG * stormDataSub$CROPEXPNUM
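
Because only the letters h, k, m and b carry a multiplier, it can be worth checking which exponent codes fall outside that mapping. The sketch below works through the arithmetic on a few made-up records and then tabulates the codes that received no multiplier. The data frame `toy`, its values and the names `multiplier` and `unmapped` are assumptions for illustration only, not part of the original analysis.

```r
# Made-up records for illustration; not taken from the NOAA data.
toy <- data.frame(
  PROPDMG    = c(2.5, 10, 1.2, 7, 4),
  PROPDMGEXP = c("K", "m", "B", "?", "+"),
  stringsAsFactors = FALSE
)

# Letter-to-multiplier table (h = hundred, k = thousand, m = million, b = billion).
multiplier <- c(h = 1e2, k = 1e3, m = 1e6, b = 1e9)

# Look up each code case-insensitively; codes missing from the table give NA.
toy$Propdamage <- toy$PROPDMG * unname(multiplier[tolower(toy$PROPDMGEXP)])
print(toy)

# Tabulate the codes that received no multiplier so they can be reviewed.
unmapped <- table(toy$PROPDMGEXP[is.na(toy$Propdamage)])
print(unmapped)
```

In this toy example, 2.5 with code K becomes 2,500 dollars and 10 with code m becomes 10,000,000 dollars, while the rows coded "?" and "+" remain NA and are listed for inspection.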

Results

Effects on human health

# Total fatalities and injuries for each event type
fatalities <- aggregate(FATALITIES ~ EVTYPE, stormDataSub, FUN = sum)
injuries <- aggregate(INJURIES ~ EVTYPE, stormDataSub, FUN = sum)

# Keep the ten event types with the highest totals
fatalitiesTop <- fatalities[order(-fatalities$FATALITIES), ][1:10, ]
injuriesTop <- injuries[order(-injuries$INJURIES), ][1:10, ]

ggplot(fatalitiesTop, aes(x = reorder(EVTYPE, FATALITIES), y = FATALITIES)) +
  geom_bar(fill = "skyblue2", stat = "identity") +
  ggtitle("Figure 1: Total Fatalities by Event type, 1950-2011") +
  xlab("Event Type") + ylab("Total Fatalities") + coord_flip()

ggplot(injuriesTop, aes(x = reorder(EVTYPE, INJURIES), y = INJURIES)) +
  geom_bar(fill = "skyblue4", stat = "identity") +
  ggtitle("Figure 2: Total Injuries by Event type, 1950-2011") +
  xlab("Event Type") + ylab("Total Injuries") + coord_flip()

Tornadoes account for the largest share of both injuries and fatalities, followed by heat, flood and thunderstorm wind.
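
To put a number on each event type's contribution, its total can be divided by the overall total. The sketch below shows the calculation on made-up counts. The data frame `fatalitiesToy`, its values and the column name `Share` are assumptions for illustration and are not NOAA results.

```r
# Made-up totals standing in for an aggregated fatalities table.
fatalitiesToy <- data.frame(
  EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "FLOOD", "LIGHTNING"),
  FATALITIES = c(50, 30, 15, 5)
)

# Share of all fatalities attributable to each event type, in percent.
fatalitiesToy$Share <- 100 * fatalitiesToy$FATALITIES / sum(fatalitiesToy$FATALITIES)

# List the event types from largest to smallest share.
print(fatalitiesToy[order(-fatalitiesToy$Share), ])
```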

Effects on economic activity

# Total property and crop damage (in dollars) for each event type
propdmg <- aggregate(Propdamage ~ EVTYPE, stormDataSub, FUN = sum)
cropdmg <- aggregate(Cropdamage ~ EVTYPE, stormDataSub, FUN = sum)

# Keep the ten event types with the highest totals
propdmgTop <- propdmg[order(-propdmg$Propdamage), ][1:10, ]
cropdmgTop <- cropdmg[order(-cropdmg$Cropdamage), ][1:10, ]

ggplot(propdmgTop, aes(x = reorder(EVTYPE, Propdamage), y = Propdamage)) +
  geom_bar(fill = "skyblue2", stat = "identity") +
  ggtitle("Figure 3: Total Property Damage by Event type, 1950-2011") +
  xlab("Event Type") + ylab("Total Property Damage (dollars)") + coord_flip()

ggplot(cropdmgTop, aes(x = reorder(EVTYPE, Cropdamage), y = Cropdamage)) +
  geom_bar(fill = "skyblue4", stat = "identity") +
  ggtitle("Figure 4: Total Crop Damage by Event type, 1950-2011") +
  xlab("Event Type") + ylab("Total Crop Damage (dollars)") + coord_flip()

Most property damage was caused by flooding, followed by tornadoes and hurricanes. Drought, hail and flooding caused the most crop damage.

A potential issue with the economic analysis is that the time value of money is not accounted for, so damage recorded in earlier decades is understated relative to recent events. An uneven distribution of events through time and outliers could also affect the accuracy of the analysis. Restricting the analysis to a shorter time period or correcting for inflation may improve accuracy.
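
One way to restrict the analysis to a shorter time period is to derive the year from the event date and keep only recent records before aggregating. The sketch below assumes a date column named BGN_DATE in month/day/year format, which is not among the columns retained in this report, and uses a hypothetical cutoff year of 1996 on made-up rows.

```r
# Made-up rows; the BGN_DATE column and its date format are assumptions.
toy <- data.frame(
  EVTYPE = c("FLOOD", "TORNADO", "FLOOD", "HAIL"),
  BGN_DATE = c("4/18/1950 0:00:00", "6/1/1996 0:00:00",
               "8/29/2005 0:00:00", "5/22/2011 0:00:00"),
  Propdamage = c(1e4, 2e6, 5e8, 3e5),
  stringsAsFactors = FALSE
)

# Extract the calendar year from the date string (the time part is ignored).
toy$YEAR <- as.integer(format(as.Date(toy$BGN_DATE, format = "%m/%d/%Y"), "%Y"))

# Keep only events from the hypothetical cutoff year onward, then aggregate.
cutoffYear <- 1996
recent <- toy[toy$YEAR >= cutoffYear, ]
recentDamage <- aggregate(Propdamage ~ EVTYPE, recent, FUN = sum)
print(recentDamage)
```

The cutoff is a free choice; any year could be substituted, and the same filtered data frame could feed the plots used elsewhere in the report.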

The analysis also does not account for geographical differences in event location, which is a major factor when using the data to plan ahead for disaster management.
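
A geographical breakdown could be obtained by adding a location variable to the aggregation formula. The sketch below assumes a STATE column, which is not part of the subset used in this report, and runs on made-up rows. The names `toyGeo`, `byState` and `worst` are hypothetical.

```r
# Made-up rows; the STATE column is an assumption for illustration.
toyGeo <- data.frame(
  STATE = c("TX", "TX", "OK", "OK", "FL"),
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "TORNADO", "FLOOD"),
  FATALITIES = c(3, 1, 5, 2, 4),
  stringsAsFactors = FALSE
)

# Total fatalities for every state and event type combination.
byState <- aggregate(FATALITIES ~ STATE + EVTYPE, toyGeo, FUN = sum)

# For each state, keep the event type with the highest fatality total.
worst <- do.call(rbind, lapply(split(byState, byState$STATE),
                               function(d) d[which.max(d$FATALITIES), ]))
print(worst)
```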