Natural Disasters’ Impact on Health and Property

Synopsis

This analysis examines the impact of natural disasters on health and property across the US. First, the data are prepared by aggregating three impact measures (fatalities, injuries and propdmg, i.e. property damage) by disaster type. Next, the top 10 disaster types by fatalities are selected for further analysis. Then, the rank of each type of damage within this selection is added to the table. Finally, the ranks are plotted to compare the three measures.

Data processing

First, read the data and convert event types to factors:

stormData <- read.csv('repdata-data-StormData.csv')
stormData$EVTYPE <- as.factor(stormData$EVTYPE)

Second, aggregate the data by event type for the three categories of impact we want to analyze: fatalities, injuries and propdmg.

harmfulData <- aggregate(FATALITIES ~ EVTYPE, stormData, sum)
harmfulData$INJURIES <- aggregate(INJURIES ~ EVTYPE, stormData, sum)$INJURIES
harmfulData$PROPDMG <- aggregate(PROPDMG ~ EVTYPE, stormData, sum)$PROPDMG
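The three separate aggregate() calls rely on each result listing the event types in the same order. As a minimal sketch on invented toy data (the `toy` data frame, its values and the name `toyHarm` are illustrative, not taken from the storm dataset), the same kind of table can be built in one call by binding the three columns on the left-hand side of the formula:

```r
# Toy data standing in for stormData (values are invented for illustration)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 2, 5, 1, 3),
  PROPDMG    = c(100, 50, 25, 0, 75)
)

# cbind() on the left-hand side sums all three columns per event type
# in a single call, so the rows cannot get out of step
toyHarm <- aggregate(cbind(FATALITIES, INJURIES, PROPDMG) ~ EVTYPE, toy, sum)
print(toyHarm)
```

One difference to keep in mind: with the formula interface, a row with a missing value in any of the three columns is dropped from all three sums, whereas separate calls drop it only from the affected column.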

Third, make sure the aggregated columns contain numeric values:

harmfulData$FATALITIES <- as.numeric(harmfulData$FATALITIES)
harmfulData$INJURIES <- as.numeric(harmfulData$INJURIES)
harmfulData$PROPDMG <- as.numeric(harmfulData$PROPDMG)

Fourth, retain only the top 10 disaster categories, ranked by fatalities, for further analysis:

library(dplyr)

fatTop <- top_n(harmfulData, n=10, FATALITIES)
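As a sketch of what this selection does, the following base R snippet picks the top rows by one column from a small invented data frame (`toy`, `n` and `topByFatalities` are illustrative names, not part of the analysis). Unlike top_n(), which keeps all tied rows and can therefore return more than n rows, this version always returns exactly n rows.

```r
# Toy data (values are invented for illustration)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HEAT", "LIGHTNING"),
  FATALITIES = c(5, 1, 4, 2)
)

# Sort by FATALITIES, largest first, and keep the first n rows
n <- 2
topByFatalities <- head(toy[order(toy$FATALITIES, decreasing = TRUE), ], n)
print(topByFatalities)
```

In recent dplyr versions the documentation marks top_n() as superseded by slice_max(), which may be worth considering if the report is rerun.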

Fifth, add the rank of each category within this top 10 selection back into the table:

fatTop$FATALITIES_rank <- rank(fatTop$FATALITIES)
fatTop$INJURIES_rank <- rank(fatTop$INJURIES)
fatTop$PROPDMG_rank <- rank(fatTop$PROPDMG)
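To make the direction of the ranking explicit, here is a small sketch with invented numbers (the vectors `fatalities` and `tied` are illustrative). rank() assigns 1 to the smallest value, which is why the highest rank in a selection of 10 marks the largest total.

```r
# Invented totals for three event types
fatalities <- c(FLOOD = 1, HEAT = 4, TORNADO = 5)

# The smallest value gets rank 1, the largest gets rank 3
ranks <- rank(fatalities)
print(ranks)

# By default, tied values share the average of the ranks they would occupy
tied <- rank(c(5, 5, 1))
print(tied)
```

Note that the injury and property damage ranks are computed only among the 10 event types selected by fatalities, not across the full dataset.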

Sixth, melt the three rank columns into long format so that they can be plotted together:

library(reshape2)
# Drop the raw totals (columns 2 to 4) and keep one row per event type and rank variable
meltdf <- melt(fatTop[-(2:4)], id.vars = "EVTYPE")
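To make the reshaping concrete, here is a base R sketch of the long format, using an invented two-event table (`wide`, `stacked` and `long` are illustrative names). It uses stack() instead of reshape2, so it approximates the shape of the melted table rather than reproducing melt() itself.

```r
# Wide table: one row per event type, one column per rank (invented values)
wide <- data.frame(
  EVTYPE          = c("FLOOD", "TORNADO"),
  FATALITIES_rank = c(1, 2),
  INJURIES_rank   = c(1, 2)
)

# stack() concatenates the rank columns into `values`
# and records the source column name in `ind`
stacked <- stack(wide[c("FATALITIES_rank", "INJURIES_rank")])

# Long table: one row per (event type, rank variable) pair
long <- data.frame(
  EVTYPE   = rep(wide$EVTYPE, times = 2),
  variable = stacked$ind,
  value    = stacked$values
)
print(long)
```

The long layout is what lets ggplot2 map the `variable` column to color and draw one line per rank measure.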

Results

This plot shows how the 10 selected event types rank on the three damage measures: fatalities, injuries and propdmg. Rank 10 marks the event type with the highest total and rank 1 the lowest, within this top 10 selection. Only 10 event types are included to keep the plot readable. The plot shows that all three measures agree that tornadoes were the most damaging. For the other event types, the ranks vary between measures. Thus, the plot illustrates how the various disasters affect life and property, and where the measures agree.

library(ggplot2)
f <- which.max(harmfulData$FATALITIES)
i <- which.max(harmfulData$INJURIES)
harmfulData[f,]

##      EVTYPE FATALITIES INJURIES PROPDMG
## 830 TORNADO       5633    91346 3212258

harmfulData[i,]

##      EVTYPE FATALITIES INJURIES PROPDMG
## 830 TORNADO       5633    91346 3212258

g <- ggplot(meltdf, aes(x = EVTYPE, y = value, color = variable, group = variable)) +
  geom_line() +
  scale_y_continuous(breaks = seq(0, 10, 1)) +
  labs(title = "How disaster categories rank") +
  ylab('Impact rank (10=high)') +
  xlab('Disaster category')
g


LB

22 August 2016
