Reproducible Research: Peer Assessment 2

Health and Economic Impacts of Storms Since 1950 Using NOAA Data

Synopsis

The data set provides two quantitative measures for each outcome: fatalities and injuries for population health, and property and crop damage for economic impact. For each outcome, the two measures were summed by event type and merged into a new data frame. The ten event types with the largest combined impact were then plotted as stacked bar graphs, with each bar filled according to the measure. Tornadoes caused by far the largest combined count of fatalities and injuries, while floods caused the greatest combined property and crop damage.

Data Processing

The raw data set was read into R with read.csv and cached for later use. No other preprocessing was done at this stage.

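# The exponent lookup for Question 2 calls levels(), so text columns must be read as factors
# (stringsAsFactors defaults to FALSE from R 4.0.0 onward).
setwd("~/R Files")
stormdata <- read.csv("~/R Files/repdata%2Fdata%2FStormData.csv", header = TRUE, stringsAsFactors = TRUE)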
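
The report does not show how the caching was done. In a knitr document it is commonly handled with the cache=TRUE chunk option. Outside knitr, one possible approach is to store the parsed data frame as an .rds file and reuse it on later runs. The sketch below illustrates that idea on a tiny temporary file; the helper name read_cached and the cache file location are assumptions for illustration, not part of the original analysis.

```r
# Hypothetical helper: parse a CSV once, then reuse a cached .rds copy.
read_cached <- function(csv_path, cache_path = paste0(csv_path, ".rds")) {
  if (file.exists(cache_path)) {
    return(readRDS(cache_path))
  }
  data <- read.csv(csv_path, stringsAsFactors = TRUE)
  saveRDS(data, cache_path)
  data
}

# Demonstration on a small temporary file rather than the real storm data
demo_csv <- tempfile(fileext = ".csv")
write.csv(data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(2, 1)),
          demo_csv, row.names = FALSE)

first_read  <- read_cached(demo_csv)  # parses the CSV and writes the cache
second_read <- read_cached(demo_csv)  # served from the .rds cache
```

The cache is not invalidated when the CSV changes, so the .rds file would need to be deleted by hand after any update to the raw data.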

Question 1: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

The two quantitative measures (FATALITIES and INJURIES) were each summed by event type and merged into a single data frame. The data frame was then ordered by the combined total of fatalities and injuries for each event type.

library(ggplot2)
library(gridExtra)
library(reshape2)

# Sum fatalities and injuries per event type, then join the two totals on EVTYPE
stormdata_health <- merge(
  aggregate(stormdata$FATALITIES, by = list(EVTYPE = stormdata$EVTYPE), FUN = sum),
  aggregate(stormdata$INJURIES, by = list(EVTYPE = stormdata$EVTYPE), FUN = sum),
  by = "EVTYPE")
names(stormdata_health) <- c("EVTYPE", "FATALITIES", "INJURIES")
stormdata_health$EVTYPE <- as.character(stormdata_health$EVTYPE)

# Order by combined impact, largest first
stormdata_health <- stormdata_health[order(stormdata_health$FATALITIES + stormdata_health$INJURIES, decreasing = TRUE), ]
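
As an aside, the aggregate-and-merge step can also be written as a single call using the formula interface of aggregate. The sketch below shows this on made-up toy data, not the storm data set. One difference to keep in mind: the formula method drops rows containing NA by default, which the two-call version above does not do.

```r
# Toy data standing in for the storm data set (values are made up)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 2, 5, 1, 3)
)

# Sum both measures per event type in one call
toy_health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Order by combined impact, largest first
toy_health <- toy_health[order(toy_health$FATALITIES + toy_health$INJURIES,
                               decreasing = TRUE), ]
print(toy_health)
```

The result already carries the column names FATALITIES and INJURIES, so the renaming step is not needed in this variant.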

With the data aggregated by event type and ordered, the ten event types with the largest combined impact were listed. The data frame was then melted to long format so that the contributions of fatalities and injuries could be distinguished, and the list was used to subset the events needed for the final graph.

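# Keep the ten event types with the largest combined impact
event_list <- stormdata_health[1:10, 1]
# Melt to long format: one row per event type and measure
stormdata_health <- melt(stormdata_health, id.vars = "EVTYPE", measure.vars = c("FATALITIES", "INJURIES"))
stormdata_health_top <- stormdata_health[which(stormdata_health$EVTYPE %in% event_list), ]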
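
To make the reshaping step concrete, the sketch below builds the same long layout by hand on made-up toy data. It uses base R instead of reshape2::melt so that it runs without extra packages; the toy values and the choice of a top two (rather than a top ten) are assumptions for illustration.

```r
# Toy wide table, already ordered by combined impact (values are made up)
wide <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(5, 1, 4),
  INJURIES   = c(15, 5, 1),
  stringsAsFactors = FALSE
)

top_events <- wide$EVTYPE[1:2]  # top two here; the report takes the top ten

# Base-R equivalent of melt(): one row per (event type, measure) pair
long <- data.frame(
  EVTYPE   = rep(wide$EVTYPE, times = 2),
  variable = rep(c("FATALITIES", "INJURIES"), each = nrow(wide)),
  value    = c(wide$FATALITIES, wide$INJURIES)
)

# Keep only the rows belonging to the top event types
long_top <- long[long$EVTYPE %in% top_events, ]
print(long_top)
```

Each event type now appears once per measure, which is the shape ggplot2 needs in order to map the measure to the fill colour of a stacked bar.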

Question 1 Results

##                EVTYPE   variable value
## 1             TORNADO FATALITIES  5633
## 2      EXCESSIVE HEAT FATALITIES  1903
## 3           TSTM WIND FATALITIES   504
## 4               FLOOD FATALITIES   470
## 5           LIGHTNING FATALITIES   816
## 6                HEAT FATALITIES   937
## 7         FLASH FLOOD FATALITIES   978
## 8           ICE STORM FATALITIES    89
## 9   THUNDERSTORM WIND FATALITIES   133
## 10       WINTER STORM FATALITIES   206
## 986           TORNADO   INJURIES 91346
## 987    EXCESSIVE HEAT   INJURIES  6525
## 988         TSTM WIND   INJURIES  6957
## 989             FLOOD   INJURIES  6789
## 990         LIGHTNING   INJURIES  5230
## 991              HEAT   INJURIES  2100
## 992       FLASH FLOOD   INJURIES  1777
## 993         ICE STORM   INJURIES  1975
## 994 THUNDERSTORM WIND   INJURIES  1488
## 995      WINTER STORM   INJURIES  1321

Using ggplot2, the top ten event types were plotted as a stacked bar graph, with the fill distinguishing injury counts from fatality counts.

g1 <- ggplot(stormdata_health_top, aes(x = reorder(EVTYPE, -value), y = value, fill = variable))
g1 + geom_bar(stat = "identity") + theme(axis.text.x = element_text(size = 8, angle = 45)) +
  xlab("Storm Event Type") + ylab("Combined Impact") +
  ggtitle("Health Outcomes from Storm Events (Top 10 Combined Impacts)")

Question 2: Across the United States, which types of events have the greatest economic consequences?

The same methodology was used to answer this question, with one caveat: the two quantitative variables (PROPDMG and CROPDMG) first had to be multiplied by their corresponding orders of magnitude (encoded in PROPDMGEXP and CROPDMGEXP, respectively) so that the economic impact of each was measured accurately.

# Multiplier for each PROPDMGEXP code; FACTOR is positional and must follow the order of levels()
expmatch_PROPDMG <- data.frame(EXP = levels(stormdata$PROPDMGEXP), FACTOR = c(1,1,1,1,1,1e1,1e2,1e3,1e4,1e5,1e6,1e7,1e8,1e9,1e2,1e2,1e3,1e6,1e6))
stormdata$PROPDMG <- stormdata$PROPDMG * expmatch_PROPDMG$FACTOR[match(stormdata$PROPDMGEXP, expmatch_PROPDMG$EXP)]

# Same lookup for the CROPDMGEXP codes
expmatch_CROPDMG <- data.frame(EXP = levels(stormdata$CROPDMGEXP), FACTOR = c(1,1,1,1e2,1e9,1e3,1e3,1e6,1e6))
stormdata$CROPDMG <- stormdata$CROPDMG * expmatch_CROPDMG$FACTOR[match(stormdata$CROPDMGEXP, expmatch_CROPDMG$EXP)]
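
The lookup tables above pair each multiplier with a factor level by position, so they depend on the order returned by levels(), which can vary with the locale's sort order. A less fragile alternative is to key the multipliers by the exponent code itself. The sketch below is one way to do that, assuming the codes are the digits 0 to 8 and the letters H, K, M and B in either case, with every other code treated as a multiplier of 1 (the same values the tables above appear to assign). The names exp_factor and scale_damage are hypothetical.

```r
# Multipliers keyed by exponent code; codes not listed here fall back to 1.
exp_factor <- c(
  "0" = 1, "1" = 1e1, "2" = 1e2, "3" = 1e3, "4" = 1e4, "5" = 1e5,
  "6" = 1e6, "7" = 1e7, "8" = 1e8,
  "H" = 1e2, "K" = 1e3, "M" = 1e6, "B" = 1e9
)

# Hypothetical helper: scale damage amounts by their exponent codes
scale_damage <- function(amount, exp_code) {
  code <- toupper(as.character(exp_code))  # treat "k" and "K" alike
  multiplier <- unname(exp_factor[code])
  multiplier[is.na(multiplier)] <- 1       # "", "-", "?", "+" and anything unknown
  amount * multiplier
}

scaled <- scale_damage(c(25, 2.5, 10, 7), c("K", "m", "", "?"))
print(scaled)
```

Because the lookup is by name, the same function works for both PROPDMGEXP and CROPDMGEXP, and for character as well as factor columns.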

Once the damage values had been scaled by their orders of magnitude, the same aggregation, ordering, and melting steps were applied as for Question 1.

# Sum property and crop damage per event type, then join the two totals on EVTYPE
stormdata_econ <- merge(
  aggregate(stormdata$PROPDMG, by = list(EVTYPE = stormdata$EVTYPE), FUN = sum),
  aggregate(stormdata$CROPDMG, by = list(EVTYPE = stormdata$EVTYPE), FUN = sum),
  by = "EVTYPE")
names(stormdata_econ) <- c("EVTYPE", "PROPDMG", "CROPDMG")
stormdata_econ$EVTYPE <- as.character(stormdata_econ$EVTYPE)
# Order by combined damage, largest first
stormdata_econ <- stormdata_econ[order(stormdata_econ$PROPDMG + stormdata_econ$CROPDMG, decreasing = TRUE), ]

# Keep the top ten event types and melt to long format for plotting
event_list2 <- stormdata_econ[1:10, 1]
stormdata_econ <- melt(stormdata_econ, id.vars = "EVTYPE", measure.vars = c("PROPDMG", "CROPDMG"))
stormdata_econ_top <- stormdata_econ[which(stormdata_econ$EVTYPE %in% event_list2), ]

Question 2 Results

##                EVTYPE variable        value
## 1               FLOOD  PROPDMG 144657709807
## 2   HURRICANE/TYPHOON  PROPDMG  69305840000
## 3             TORNADO  PROPDMG  56947380677
## 4         STORM SURGE  PROPDMG  43323536000
## 5                HAIL  PROPDMG  15735267513
## 6         FLASH FLOOD  PROPDMG  16822673979
## 7             DROUGHT  PROPDMG   1046106000
## 8           HURRICANE  PROPDMG  11868319010
## 9         RIVER FLOOD  PROPDMG   5118945500
## 10          ICE STORM  PROPDMG   3944927860
## 986             FLOOD  CROPDMG   5661968450
## 987 HURRICANE/TYPHOON  CROPDMG   2607872800
## 988           TORNADO  CROPDMG    414953270
## 989       STORM SURGE  CROPDMG         5000
## 990              HAIL  CROPDMG   3025954473
## 991       FLASH FLOOD  CROPDMG   1421317100
## 992           DROUGHT  CROPDMG  13972566000
## 993         HURRICANE  CROPDMG   2741910000
## 994       RIVER FLOOD  CROPDMG   5029459000
## 995         ICE STORM  CROPDMG   5022113500

The results were then plotted in a similar way.

g2 <- ggplot(stormdata_econ_top, aes(x = reorder(EVTYPE, -value), y = value, fill = variable))
g2 + geom_bar(stat = "identity") + theme(axis.text.x = element_text(size = 8, angle = 45)) +
  xlab("Storm Event Type") + ylab("Combined Impact") +
  ggtitle("Economic Outcomes from Storm Events (Top 10 Combined Impacts)")
