This report was prepared for an assignment in the Coursera course ‘Reproducible Research’. It explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to identify the storm event types most harmful to 1) population health and 2) the economy. Harm to population health was measured as the number of casualties (deaths plus injuries) attributed to each storm event type across the country. Harm to the economy was measured as the property damage (PROPDMG) recorded for each storm event type. On both measures, tornadoes had the greatest impact.
The data file repdata-data-StormData.csv.bz2 was read with the read.csv function, and the variables EVTYPE, FATALITIES, INJURIES and PROPDMG were selected. A new variable, TOTAL_CASUALTIES, was then created as the sum of FATALITIES and INJURIES. Finally, TOTAL_CASUALTIES and PROPDMG were summed for each storm event type, and the event types were ranked by each total to answer the question: which type of storm event is most harmful to human health and to the economy? PROPDMG was summed as recorded, without applying the magnitude codes stored in PROPDMGEXP, so the totals are not exact dollar amounts. For readability, only the five most harmful event types are shown.
library(ggplot2)
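# Read the compressed CSV; read.csv can read .bz2 files directly
storm <- read.csv("repdata-data-StormData.csv.bz2", header = TRUE)
# Keep only the variables needed for the analysis
data <- storm[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG")]
data$TOTAL_CASUALTIES <- data$FATALITIES + data$INJURIES
# Sum each numeric variable within each event type
data2 <- aggregate(data[, 2:5], list(data$EVTYPE), sum)
names(data2)[1] <- "EVENT_TYPE"
# Top 5 event types by total casualties
health <- data2[order(data2$TOTAL_CASUALTIES, decreasing = TRUE), c("EVENT_TYPE", "TOTAL_CASUALTIES")]
health <- health[1:5, ]
# Set factor levels in rank order so the plot keeps this ordering
health$EVENT_TYPE <- factor(health$EVENT_TYPE, levels = health$EVENT_TYPE)
# Top 5 event types by summed property damage
economy <- data2[order(data2$PROPDMG, decreasing = TRUE), ]
economy <- economy[1:5, c("EVENT_TYPE", "PROPDMG")]
economy$EVENT_TYPE <- factor(economy$EVENT_TYPE, levels = economy$EVENT_TYPE)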
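The aggregation above sums PROPDMG as recorded. In the NOAA file, PROPDMG is accompanied by a PROPDMGEXP column holding a magnitude code, so a dollar figure needs both columns. The following is a minimal sketch of that conversion, not part of the original analysis. It assumes the codes K, M and B stand for thousands, millions and billions, and treats any other code as a multiplier of 1. The toy rows and the names `toy`, `multiplier` and `PROPDMG_USD` are hypothetical.

```r
# Toy rows only: these values are invented for illustration,
# they are not taken from the NOAA file.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HAIL", "HAIL", "LIGHTNING"),
  PROPDMG    = c(25, 2.5, 500, 1.2, 3),
  PROPDMGEXP = c("K", "M", "K", "M", ""),
  stringsAsFactors = FALSE
)

# Assumed mapping from magnitude code to multiplier.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each row's multiplier; codes missing from the mapping
# (including the empty string) come back as NA and fall back to 1.
mult <- unname(multiplier[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1

toy$PROPDMG_USD <- toy$PROPDMG * mult

# Sum the dollar amounts per event type and rank from largest to smallest.
usd <- aggregate(PROPDMG_USD ~ EVTYPE, data = toy, FUN = sum)
usd <- usd[order(usd$PROPDMG_USD, decreasing = TRUE), ]
print(usd)
```

Applied to the full data set, the same steps would replace PROPDMG with the converted column before the call to aggregate. The real file may contain other codes that need their own handling.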
Table 1 shows the five most harmful storm event types (EVENT_TYPE) by number of casualties (TOTAL_CASUALTIES; people killed or injured). The same data are displayed in Figure 1. Tornado events caused the highest number of deaths and injuries across the US.
print("Table 1")
## [1] "Table 1"
print(health)
##         EVENT_TYPE TOTAL_CASUALTIES
## 834        TORNADO            96979
## 130 EXCESSIVE HEAT             8428
## 856      TSTM WIND             7461
## 170          FLOOD             7259
## 464      LIGHTNING             6046
ggplot(health, aes(EVENT_TYPE, TOTAL_CASUALTIES / 10^3)) + geom_point() +
  ggtitle("Fig. 1 Deaths and injuries from the 5 most harmful storm types") +
  ylab("Deaths and injuries (thousands)") +
  xlab("Type of storm")
Table 2 shows the five most harmful storm event types by property damage (summed PROPDMG). The same data are displayed in Figure 2. As with human health, tornado events had the greatest impact on the economy across the US.
print("Table 2")
## [1] "Table 2"
print(economy)
##            EVENT_TYPE   PROPDMG
## 834           TORNADO 3212258.2
## 153       FLASH FLOOD 1420124.6
## 856         TSTM WIND 1335965.6
## 170             FLOOD  899938.5
## 760 THUNDERSTORM WIND  876844.2
ggplot(economy, aes(EVENT_TYPE, PROPDMG / 10^3)) + geom_point() +
  ylim(0, 3500) +
  ggtitle("Fig. 2 Property damage from the 5 most harmful storm types") +
  ylab("Property damage (summed PROPDMG, thousands)") +
  xlab("Type of storm")
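Table 2 lists TSTM WIND and THUNDERSTORM WIND as separate rows, although the two labels appear to describe the same kind of event. The following is a minimal sketch of how the two rows could be combined, assuming that "TSTM WIND" is simply an abbreviation of "THUNDERSTORM WIND". The values are copied from Table 2, and the names `economy_tbl` and `merged` are hypothetical.

```r
# Values copied from Table 2.
economy_tbl <- data.frame(
  EVENT_TYPE = c("TORNADO", "FLASH FLOOD", "TSTM WIND", "FLOOD", "THUNDERSTORM WIND"),
  PROPDMG    = c(3212258.2, 1420124.6, 1335965.6, 899938.5, 876844.2),
  stringsAsFactors = FALSE
)

# Assumption: "TSTM WIND" abbreviates "THUNDERSTORM WIND", so relabel it.
economy_tbl$EVENT_TYPE[economy_tbl$EVENT_TYPE == "TSTM WIND"] <- "THUNDERSTORM WIND"

# Re-aggregate so the two former rows are summed, then rank again.
merged <- aggregate(PROPDMG ~ EVENT_TYPE, data = economy_tbl, FUN = sum)
merged <- merged[order(merged$PROPDMG, decreasing = TRUE), ]
print(merged)
```

With these five rows, the combined thunderstorm wind total is 2212809.8, which moves it to second place behind TORNADO. This sketch only merges rows already in the top five. In a full analysis the relabeling would be applied to EVTYPE before the aggregation, where other spelling variants could also be involved.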