Analysis of Natural Disaster Data from NOAA

Synopsis

As an exercise in reproducible research, this report investigates the severity of natural disasters based on the harm to health and the economic damage that they cause. The data for this exercise can be found at https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2 with the documentation at https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf.

The analysis will take into account injuries and deaths caused by weather events under the Health section and crop and property damage under the Economy section.

Data Processing

Import and conditioning of data for analysis

The entire analysis has been conducted in RStudio. R version 3.3.1 (2016-06-21), platform x86_64-apple-darwin13.4.0.

Download and load code and related libraries
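
The loading code itself is not reproduced in this document. A minimal sketch of how the stormdata data frame might be created is given here; the helper name load_storm_data, the local file name and the stringsAsFactors = FALSE setting are assumptions rather than the original code. The plots further down call ggplot() and grid.arrange(), which come from the ggplot2 and gridExtra packages.

```r
# Hypothetical helper: download the archive once, then read it into a data frame.
load_storm_data <- function(url = "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                            destfile = "StormData.csv.bz2") {
  # Skip the download when a local copy already exists
  if (!file.exists(destfile)) {
    download.file(url, destfile = destfile, mode = "wb")
  }
  # bzfile() decompresses on the fly; read.csv() opens and closes the connection
  read.csv(bzfile(destfile), stringsAsFactors = FALSE)
}

# Usage (commented out because it downloads a large file):
# library(ggplot2)    # ggplot(), geom_bar(), coord_flip(), ...
# library(gridExtra)  # grid.arrange()
# stormdata <- load_storm_data()
```

Keeping the download behind a file.exists() check means that re-running the analysis does not fetch the archive again.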

Data Conditioning

For the analysis, only the relevant columns of data are kept, and the “EXP” column values (K, M, B or a digit) are applied as multipliers to the crop and property damage figures in order to get sensible numerical values.

trimmed<-stormdata[,c("EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]

trimmed$CROPDMGEXP<-toupper(trimmed$CROPDMGEXP)

trimmed$cropnumeric<-ifelse(trimmed$CROPDMGEXP=="K",1000*trimmed$CROPDMG,ifelse(trimmed$CROPDMGEXP=="M",1000000*trimmed$CROPDMG,ifelse(trimmed$CROPDMGEXP=="B",1000000000*trimmed$CROPDMG,ifelse(trimmed$CROPDMGEXP=="2",100*trimmed$CROPDMG,trimmed$CROPDMG))))

trimmed$PROPDMGEXP<-toupper(trimmed$PROPDMGEXP)

trimmed<-subset(trimmed,trimmed$PROPDMGEXP %in% c("K","M","B","0","5","6","4","2","3","7","1","8",""))

# Convert PROPDMG to a numeric amount using the PROPDMGEXP code (K, M, B, or a power of ten)
trimmed$propnumeric<-ifelse(trimmed$PROPDMGEXP=="K",1000*trimmed$PROPDMG,ifelse(trimmed$PROPDMGEXP=="M",1000000*trimmed$PROPDMG,ifelse(trimmed$PROPDMGEXP=="B",1000000000*trimmed$PROPDMG,ifelse(trimmed$PROPDMGEXP=="2",100*trimmed$PROPDMG,ifelse(trimmed$PROPDMGEXP=="5",trimmed$PROPDMG*100000,ifelse(trimmed$PROPDMGEXP=="6",1000000*trimmed$PROPDMG,ifelse(trimmed$PROPDMGEXP=="",trimmed$PROPDMG,(10^as.numeric(trimmed$PROPDMGEXP))*trimmed$PROPDMG)))))))
## Warning in ifelse(trimmed$PROPDMGEXP == "", trimmed$PROPDMG,
## (10^as.numeric(trimmed$PROPDMGEXP)) * : NAs introduced by coercion
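
The nested ifelse() calls above map each exponent code to a multiplier. As an alternative sketch, not the code used for the results in this report, the same mapping can be written as a lookup that only converts digit codes to numbers and so avoids the coercion warning. The function name exp_multiplier is hypothetical, and, unlike the subset() step above, unrecognised codes are kept with a multiplier of 1.

```r
# Hypothetical helper: translate exponent codes into numeric multipliers.
exp_multiplier <- function(exp_code) {
  code <- toupper(as.character(exp_code))
  letters_lookup <- c(K = 1e3, M = 1e6, B = 1e9)

  # Default of 1 covers blank and unrecognised codes
  mult <- rep(1, length(code))

  # Letter codes: thousands, millions, billions
  is_letter <- code %in% names(letters_lookup)
  mult[is_letter] <- unname(letters_lookup[code[is_letter]])

  # Single digits are read as powers of ten, as in the property damage code above
  is_digit <- grepl("^[0-9]$", code)
  mult[is_digit] <- 10^as.numeric(code[is_digit])

  mult
}

# Example with made-up values: 2.5 with code "M" is 2.5 million
damage <- c(2.5, 10, 3)
codes  <- c("M", "k", "2")
damage * exp_multiplier(codes)
```

With such a helper, both damage columns could be converted by one multiplication each, for example trimmed$PROPDMG * exp_multiplier(trimmed$PROPDMGEXP).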

Results

Investigating Effects on Health

Rather than the total numbers of injuries and deaths related to weather events, only the mean number per recorded event is reported for each event type. This allows all the data in the dataset to be used, since earlier years may not include data on all event types. Tropical Storm Gordon has been excluded from the figures because it is recorded as its own event type despite not being an event type per se.

fatalities<-aggregate(FATALITIES~EVTYPE,data=trimmed,FUN=mean)
fatalities<-fatalities[order(fatalities$FATALITIES,decreasing=TRUE),]

injuries<-aggregate(INJURIES~EVTYPE,data=trimmed,FUN=mean)
injuries<-injuries[order(injuries$INJURIES,decreasing=TRUE),]

plotfatal<-ggplot(data = fatalities[c(6,5,4,2,1),], aes(x = reorder(EVTYPE,FATALITIES), y = FATALITIES)) +geom_bar(stat = "identity", position = "stack")+coord_flip()+ggtitle("Mean Fatalities per Event")+ theme(axis.title.y=element_blank())

plotinjure<-ggplot(data = injuries[c(6,5,4,3,1),], aes(x = reorder(EVTYPE,INJURIES), y = INJURIES)) +geom_bar(stat = "identity", position = "stack")+coord_flip()+ggtitle("Mean Injuries per Event")+ theme(axis.title.y=element_blank())

grid.arrange(plotfatal,plotinjure,ncol=2)
Top 5 most harmful Weather Events

As can be seen in the plots, Heat Waves appear to be the most dangerous events in terms of health risks. It must be borne in mind that these are mean values per recorded event, which do not take into account how frequently each event type occurs.
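
Because a mean says nothing about how often an event type occurs, it can help to look at the mean, the total and the number of records side by side. The sketch below does this for a small made-up data frame; the toy values and the helper name summarise_by_event are illustrative only and are not results from the NOAA data.

```r
# Hypothetical helper: mean, total and record count of one column per event type.
summarise_by_event <- function(df, column) {
  # tapply() groups by EVTYPE; all three results share the same group order
  means  <- tapply(df[[column]], df$EVTYPE, mean)
  totals <- tapply(df[[column]], df$EVTYPE, sum)
  counts <- tapply(df[[column]], df$EVTYPE, length)

  out <- data.frame(
    EVTYPE = names(means),
    mean   = as.numeric(means),
    total  = as.numeric(totals),
    n      = as.integer(counts),
    stringsAsFactors = FALSE
  )
  # Highest mean first, matching the ordering used in this report
  out[order(out$mean, decreasing = TRUE), ]
}

# Made-up records: one severe heat wave, three tornadoes of varying severity
toy <- data.frame(
  EVTYPE     = c("HEAT WAVE", "TORNADO", "TORNADO", "TORNADO"),
  FATALITIES = c(4, 1, 0, 5),
  stringsAsFactors = FALSE
)
summarise_by_event(toy, "FATALITIES")
```

In this toy case the heat wave ranks first by mean, while the tornadoes account for more fatalities in total, which is the distinction the caveat above refers to.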

Investigating Effects on Economy

A similar approach was taken to investigate the economic effects of weather events. As in the Health section, a misclassified entry, Hurricane Opal, was removed from the plots.

cropdmg<-aggregate(cropnumeric~EVTYPE,data=trimmed,FUN=mean)
cropdmg<-cropdmg[order(cropdmg$cropnumeric,decreasing=TRUE),]

propdmg<-aggregate(propnumeric~EVTYPE,data=trimmed,FUN=mean)
propdmg<-propdmg[order(propdmg$propnumeric,decreasing=TRUE),]

plotcrop<-ggplot(data = cropdmg[c(5,4,3,2,1),], aes(x = reorder(EVTYPE,cropnumeric), y = cropnumeric)) +geom_bar(stat = "identity", position = "stack")+coord_flip()+ggtitle("Crop Damage")+ylab("Damage in USD")+ theme(axis.title.y=element_blank())

plotprop<-ggplot(data = propdmg[c(6,5,3,2,1),], aes(x = reorder(EVTYPE,propnumeric), y = propnumeric)) +geom_bar(stat = "identity", position = "stack")+coord_flip()+ggtitle("Property Damage")+ylab("Damage in USD")+ theme(axis.title.y=element_blank())

grid.arrange(plotprop,plotcrop,ncol=2)
Top 5 most Economically harmful Weather Events (amounts in USD)
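
The plots in both sections pick their rows by position, for example c(6,5,3,2,1), which changes meaning if the ordering of the aggregated table changes. A name-based exclusion is one possible alternative. The sketch below uses a hypothetical helper top_events and made-up values; the exact EVTYPE labels to exclude would need to be checked against the data.

```r
# Hypothetical helper: drop event types matching a pattern, then keep the top n.
top_events <- function(agg, value_col, exclude_pattern = NULL, n = 5) {
  if (!is.null(exclude_pattern)) {
    # Case-insensitive match on the event type label
    keep <- !grepl(exclude_pattern, agg$EVTYPE, ignore.case = TRUE)
    agg <- agg[keep, , drop = FALSE]
  }
  # Sort by the chosen value column, largest first
  agg <- agg[order(agg[[value_col]], decreasing = TRUE), , drop = FALSE]
  head(agg, n)
}

# Made-up aggregated table, standing in for propdmg
toy_agg <- data.frame(
  EVTYPE      = c("FLOOD", "HURRICANE OPAL", "TORNADO", "HAIL"),
  propnumeric = c(300, 900, 500, 100),
  stringsAsFactors = FALSE
)
top_events(toy_agg, "propnumeric", exclude_pattern = "OPAL", n = 2)
```

The result could be passed to ggplot() in place of an index expression such as propdmg[c(6,5,3,2,1),].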