Author: Marcos de Aguiar
This study analyzes the consequences of natural disasters, especially storm-related events, using storm data from 1950 to 2011 provided by NOAA (U.S. National Oceanic and Atmospheric Administration). It addresses two questions. The first is which kind of event has the greatest consequences for public health. The second is which kind has the greatest economic consequences.
library(dplyr)
First download and unzip the file from the following link:
Then load the uncompressed data into R.
stormDS <- read.csv("repdata-data-StormData.csv")
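Only five columns are used in the rest of this study (EVTYPE, FATALITIES, INJURIES, PROPDMG and CROPDMG), so one option is to skip the others at load time. The sketch below shows one way to do that with read.csv. The helper name read_storm_columns is hypothetical, and it assumes the file has a header row containing those column names.

```r
# Hypothetical helper: read only the columns named in `keep` from a CSV file.
read_storm_columns <- function(path, keep) {
  # Read the header plus one row, just to learn the column names.
  header <- names(read.csv(path, nrows = 1, check.names = FALSE))
  missing_cols <- setdiff(keep, header)
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
  # "NULL" tells read.csv to skip a column; NA lets it guess the type.
  classes <- ifelse(header %in% keep, NA, "NULL")
  read.csv(path, colClasses = classes, check.names = FALSE)
}

keep <- c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "CROPDMG")
# Usage with the file from this study (commented out so the sketch runs alone):
# stormDS <- read_storm_columns("repdata-data-StormData.csv", keep)
```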
We will assume that the harm to the population health is the sum of fatalities and injuries.
Create one dataset with the sum of fatalities per event type, then another with the sum of injuries.
fatalitiesDS <- aggregate(stormDS$FATALITIES, by=list(EVTYPE=stormDS$EVTYPE), FUN=sum, na.rm=TRUE)
injuriesDS <- aggregate(stormDS$INJURIES, by=list(EVTYPE=stormDS$EVTYPE), FUN=sum, na.rm=TRUE)
Merge the two together and create a column with their sum.
harmDS <- merge(fatalitiesDS, injuriesDS, by = "EVTYPE")
harmDS$total <- (harmDS$x.x + harmDS$x.y)
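The two aggregate calls and the merge can also be collapsed into a single step. Below is a minimal sketch in base R that uses a small made-up data frame in place of stormDS (toyDS and its values are hypothetical). The formula interface keeps the original column names, so the result has FATALITIES and INJURIES columns rather than x.x and x.y.

```r
# Toy data standing in for stormDS; the values are invented for illustration.
toyDS <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(2, 1, 0, 4),
  INJURIES   = c(10, 5, 3, NA),
  stringsAsFactors = FALSE
)

# Sum both columns per event type in one call.
# na.action = na.pass keeps rows with missing values so that
# na.rm = TRUE (passed on to sum) can ignore them.
toyHarm <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                     data = toyDS, FUN = sum,
                     na.rm = TRUE, na.action = na.pass)

# Combined harm, sorted from most to least harmful.
toyHarm$total <- toyHarm$FATALITIES + toyHarm$INJURIES
toyHarm <- toyHarm[order(-toyHarm$total), ]
```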
Get the top 10 most harmful events.
head(arrange(harmDS,desc(total)), 10)
##               EVTYPE  x.x   x.y total
## 1            TORNADO 5633 91346 96979
## 2     EXCESSIVE HEAT 1903  6525  8428
## 3          TSTM WIND  504  6957  7461
## 4              FLOOD  470  6789  7259
## 5          LIGHTNING  816  5230  6046
## 6               HEAT  937  2100  3037
## 7        FLASH FLOOD  978  1777  2755
## 8          ICE STORM   89  1975  2064
## 9  THUNDERSTORM WIND  133  1488  1621
## 10      WINTER STORM  206  1321  1527
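The table lists TSTM WIND and THUNDERSTORM WIND as separate rows, which suggests that some event types are recorded under more than one label. Below is a minimal sketch of folding such variants together before aggregating, assuming these labels describe the same kind of event. The helper normalize_evtype and its mapping rules are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: map a few spelling variants onto one label.
normalize_evtype <- function(evtype) {
  out <- toupper(trimws(as.character(evtype)))  # uniform case, no outer spaces
  out <- gsub("\\s+", " ", out)                 # collapse repeated spaces
  out <- sub("^TSTM WINDS?$", "THUNDERSTORM WIND", out)
  out <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", out)
  out
}

# Made-up labels that exercise each rule.
evLabels <- c("TSTM WIND", "Thunderstorm Winds", " THUNDERSTORM  WIND", "TORNADO")
cleanLabels <- normalize_evtype(evLabels)
```

In the analysis itself, this would be applied with something like stormDS$EVTYPE <- normalize_evtype(stormDS$EVTYPE) before the aggregate calls.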
Bar graph showing the four worst events.
top4DS <- head(arrange(harmDS,desc(total)), 4)
barplot(top4DS$total, names.arg = top4DS$EVTYPE, xlab = "Event type", ylab = "Health consequences", main = "Total health consequences by event.")
As the table and plot show, tornadoes (96,979 combined fatalities and injuries) and excessive heat (8,428) are the most harmful events for the health of the general population.
We will assume that the economic consequences are the sum of property damage and crop damage.
Create one dataset with the sum of property damage per event type, then another with the sum of crop damage.
propertydamDS <- aggregate(stormDS$PROPDMG, by=list(EVTYPE=stormDS$EVTYPE), FUN=sum, na.rm=TRUE)
cropdamDS <- aggregate(stormDS$CROPDMG, by=list(EVTYPE=stormDS$EVTYPE), FUN=sum, na.rm=TRUE)
Merge the two together and create a column with their sum.
damageDS <- merge(propertydamDS, cropdamDS, by = "EVTYPE")
damageDS$total <- (damageDS$x.x + damageDS$x.y)
Get the top 10 most economically damaging events.
head(arrange(damageDS,desc(total)), 10)
##                EVTYPE       x.x       x.y     total
## 1             TORNADO 3212258.2 100018.52 3312276.7
## 2         FLASH FLOOD 1420124.6 179200.46 1599325.1
## 3           TSTM WIND 1335965.6 109202.60 1445168.2
## 4                HAIL  688693.4 579596.28 1268289.7
## 5               FLOOD  899938.5 168037.88 1067976.4
## 6   THUNDERSTORM WIND  876844.2  66791.45  943635.6
## 7           LIGHTNING  603351.8   3580.61  606932.4
## 8  THUNDERSTORM WINDS  446293.2  18684.93  464978.1
## 9           HIGH WIND  324731.6  17283.21  342014.8
## 10       WINTER STORM  132720.6   1978.99  134699.6
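The totals above sum the raw PROPDMG and CROPDMG values. In the NOAA storm data these columns are accompanied by PROPDMGEXP and CROPDMGEXP, which carry a magnitude code. The sketch below assumes the common reading of K, M and B as thousands, millions and billions, and treats every other code as a multiplier of 1. The helper damage_multiplier and the toy rows are hypothetical. They only illustrate the idea and do not reproduce the results of this study.

```r
# Hypothetical helper: turn a magnitude code into a numeric multiplier.
# Assumption: K/M/B mean 1e3/1e6/1e9; anything else (blank, digits,
# symbols, NA) is treated as 1.
damage_multiplier <- function(exp_code) {
  code <- toupper(trimws(as.character(exp_code)))
  mult <- c(K = 1e3, M = 1e6, B = 1e9)[code]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy data standing in for stormDS; the values are invented for illustration.
toyDamageDS <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 1.5, 500, 10),
  PROPDMGEXP = c("K", "M", "K", ""),
  CROPDMG    = c(0, 2, 0, 5),
  CROPDMGEXP = c("", "k", "", "K"),
  stringsAsFactors = FALSE
)

# Damage in dollars per row, then summed per event type.
toyDamageDS$damage <-
  toyDamageDS$PROPDMG * damage_multiplier(toyDamageDS$PROPDMGEXP) +
  toyDamageDS$CROPDMG * damage_multiplier(toyDamageDS$CROPDMGEXP)
toyDamage <- aggregate(damage ~ EVTYPE, data = toyDamageDS, FUN = sum)
toyDamage <- toyDamage[order(-toyDamage$damage), ]
```

Applying a conversion like this before aggregating would put the totals in a single unit, which matters when comparing event types whose damage is recorded at different magnitudes.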
Bar graph showing the four worst events.
top4DS <- head(arrange(damageDS,desc(total)), 4)
barplot(top4DS$total, names.arg = top4DS$EVTYPE, xlab = "Event type", ylab = "Economic damage", main = "Total economic damage by event.")
## Result:
As the table and plot show, tornadoes and flash floods are the most economically damaging events.