This document reports an analysis of “Storm Data” drawn from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. The results suggest that “TORNADO” is the most harmful event type for both population health and economic damage.
The data processing steps are as follows.
1. Read the file.
2. Extract the columns needed for the subsequent analysis.
3. Calculate the total health impact (“FATALITIES” + “INJURIES”) for each event type (EVTYPE).
4. Calculate the total property damage (“PROPDMG”) for each event type (EVTYPE).
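# Load dplyr for select(), then read the raw Storm Data file
library(dplyr)
Storm <- read.csv("repdata_data_StormData.csv")
# Keep only the columns used in this analysis
StormEv <- select(Storm, EVTYPE, FATALITIES, INJURIES, PROPDMG)
# Total fatalities plus injuries, and total property damage, per event type
EVHealth <- aggregate(FATALITIES+INJURIES~EVTYPE, data = StormEv, sum)
EVEcon <- aggregate(PROPDMG~EVTYPE, data = StormEv, sum)
# Six event types with the highest health totals
head(EVHealth[order(EVHealth$`FATALITIES + INJURIES`, decreasing = TRUE),])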
##             EVTYPE FATALITIES + INJURIES
## 834        TORNADO                 96979
## 130 EXCESSIVE HEAT                  8428
## 856      TSTM WIND                  7461
## 170          FLOOD                  7259
## 464      LIGHTNING                  6046
## 275           HEAT                  3037
The table shows that “TORNADO” caused by far the most fatalities and injuries combined (96,979), more than ten times the total for the next event type, “EXCESSIVE HEAT” (8,428).
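The aggregation behind this table can also be written with an explicitly named total column, which avoids the backtick-quoted `FATALITIES + INJURIES` name. The following is a minimal sketch on made-up rows, not the real data; the names `toy`, `HEALTH`, and `health_by_event` are hypothetical and chosen only for illustration.

```r
# Toy rows standing in for the Storm Data columns (values are invented)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(1, 0, 2, 3),
  INJURIES   = c(10, 2, 5, 0),
  stringsAsFactors = FALSE
)

# Combine the two health variables into one named column (hypothetical name)
toy$HEALTH <- toy$FATALITIES + toy$INJURIES

# Sum the combined column per event type, then sort from largest to smallest
health_by_event <- aggregate(HEALTH ~ EVTYPE, data = toy, FUN = sum)
health_by_event <- health_by_event[order(health_by_event$HEALTH, decreasing = TRUE), ]

print(health_by_event)
```

With a plain column name, later steps can refer to `health_by_event$HEALTH` directly, and the name survives `data.frame()` without being rewritten.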
# Six event types with the highest summed PROPDMG
head(EVEcon[order(EVEcon$PROPDMG, decreasing = TRUE),])
##                EVTYPE   PROPDMG
## 834           TORNADO 3212258.2
## 153       FLASH FLOOD 1420124.6
## 856         TSTM WIND 1335965.6
## 170             FLOOD  899938.5
## 760 THUNDERSTORM WIND  876844.2
## 244              HAIL  688693.4
The table shows that “TORNADO” also has the largest summed PROPDMG value, so it ranks first for property damage as well.
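One caveat about summing PROPDMG directly: in the Storm Data file each PROPDMG value is paired with a magnitude code in the PROPDMGEXP column ("K" for thousands, "M" for millions, "B" for billions), which the analysis above does not use. The sketch below shows, on made-up rows, how the ranking can change once that code is applied. The names `toy`, `multiplier`, `PROP_USD`, and `usd_by_event` are hypothetical, and treating every other code (blank, digits, symbols) as a multiplier of 1 is an assumption, not a documented rule.

```r
# Toy rows (invented values); PROPDMGEXP holds the magnitude code for PROPDMG
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(25, 1.5, 2.5, 500),
  PROPDMGEXP = c("K", "B", "M", ""),
  stringsAsFactors = FALSE
)

# Lookup table for the three letter codes; anything else falls back to 1 (assumption)
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)
mult <- unname(multiplier[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1

# Property damage in dollars
toy$PROP_USD <- toy$PROPDMG * mult

# Compare the ranking by raw PROPDMG with the ranking by dollar amount
raw_by_event <- aggregate(PROPDMG ~ EVTYPE, data = toy, FUN = sum)
raw_by_event <- raw_by_event[order(raw_by_event$PROPDMG, decreasing = TRUE), ]
usd_by_event <- aggregate(PROP_USD ~ EVTYPE, data = toy, FUN = sum)
usd_by_event <- usd_by_event[order(usd_by_event$PROP_USD, decreasing = TRUE), ]

print(raw_by_event)
print(usd_by_event)
```

In this toy data, "HAIL" leads on raw PROPDMG while "FLOOD" leads once the magnitude code is applied, so a ranking based on PROPDMG alone is best read as a ranking of recorded values rather than of dollar amounts.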
Bar plots of the top six event types for each question.
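# Combine the two top-six tables; data.frame() prefixes the columns with "Health." and "Economy."
# and rewrites "FATALITIES + INJURIES" as "FATALITIES...INJURIES"
Top6Event <- data.frame("Health" = head(EVHealth[order(EVHealth$`FATALITIES + INJURIES`, decreasing = TRUE),]), "Economy" = head(EVEcon[order(EVEcon$PROPDMG, decreasing = TRUE),]))
library(ggplot2)
# Bar plot of total fatalities plus injuries per event type
g1 <- ggplot(Top6Event, aes(Health.EVTYPE, Health.FATALITIES...INJURIES))
g1 + geom_bar(stat = "identity") + labs(x = NULL, y = "Total", title = "Damage on Population Health")
# Bar plot of summed PROPDMG per event type
g2 <- ggplot(Top6Event, aes(Economy.EVTYPE, Economy.PROPDMG))
g2 + geom_bar(stat = "identity") + labs(x = NULL, y = "Total", title = "Damage on Property")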
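By default ggplot2 draws a discrete x axis in alphabetical (factor level) order, so the bars above are not sorted by size. A minimal sketch of one way to sort them follows, using `reorder()` on the health totals reported in the first table. The names `top` and `TOTAL` are hypothetical, and the plot is only built when ggplot2 is installed.

```r
# Top six health totals copied from the first table in this report
top <- data.frame(
  EVTYPE = c("EXCESSIVE HEAT", "FLOOD", "HEAT", "LIGHTNING", "TORNADO", "TSTM WIND"),
  TOTAL  = c(8428, 7259, 3037, 6046, 96979, 7461),
  stringsAsFactors = FALSE
)

# Turn EVTYPE into a factor whose levels run from the largest total to the smallest
top$EVTYPE <- reorder(top$EVTYPE, -top$TOTAL)
print(levels(top$EVTYPE))

# Build the bar plot if ggplot2 is available; call print(p) to draw it
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(top, aes(EVTYPE, TOTAL)) +
    geom_col() +
    labs(x = NULL, y = "Total", title = "Damage on Population Health")
}
```

`geom_col()` is equivalent to `geom_bar(stat = "identity")`; the same `reorder()` step would apply to the property damage plot.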