The U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database was explored to find out which weather events had the most severe effects on public health and the economy. This analysis can help the government prepare for severe weather events.
Loading the Data
storm <- read.csv("./repdata%2Fdata%2FStormData.csv", na.strings = c("", "NA"))
Aggregating and ordering the total injuries and fatalities across the USA by weather event.
injurydata <- aggregate(INJURIES ~ EVTYPE, data = storm, sum)
injurydata <- injurydata[order(-injurydata$INJURIES),]
fatalitydata <- aggregate(FATALITIES ~ EVTYPE, data = storm, sum)
fatalitydata <- fatalitydata[order(-fatalitydata$FATALITIES),]
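The aggregate-then-order pattern above can be checked on a small example. The sketch below uses a made-up `toy` data frame (the name and values are assumptions for illustration, not rows from the NOAA file) and applies the same two steps.

```r
# Toy data standing in for the storm table (values are made up for illustration)
toy <- data.frame(
  EVTYPE   = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  INJURIES = c(10, 3, 5, 2, 4)
)

# Sum injuries per event type, then sort from the largest total to the smallest
toy_injuries <- aggregate(INJURIES ~ EVTYPE, data = toy, sum)
toy_injuries <- toy_injuries[order(-toy_injuries$INJURIES), ]
print(toy_injuries)
```

The row names printed on the left are the positions in the unsorted aggregate, which is why the real output later shows numbers such as 834 next to TORNADO.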
Pre-processing to calculate the actual amount of damages (in $)
summary(storm$PROPDMGEXP)
## - ? + 0 1 2 3 4 5 6
## 1 8 5 216 25 13 4 4 28 4
## 7 8 B h H K m M NA's
## 5 1 40 1 6 424665 7 11330 465934
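The per-code counts above come from calling `summary()` on a factor column. In R 4.0 and later, `read.csv()` reads text columns as character by default, and `summary()` on a character vector reports only length, class, and mode. A minimal sketch of an alternative that gives the counts in either case, using a made-up `expo` vector as an assumption:

```r
# Toy exponent column (made-up values), stored as character
expo <- c("K", "M", NA, "k", "K", "B", NA)

# table() counts each distinct code; useNA = "ifany" adds a count for missing values
counts <- table(expo, useNA = "ifany")
print(counts)
```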
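# Normalise case so that "K"/"k", "M"/"m" and "H"/"h" are treated alike
storm$PROPDMGEXP <- as.character(tolower(storm$PROPDMGEXP))
# Symbols with no clear meaning are treated as exponent 0 (multiplier 1)
storm$PROPDMGEXP[storm$PROPDMGEXP == "-"] <- 0
storm$PROPDMGEXP[storm$PROPDMGEXP == "?"] <- 0
storm$PROPDMGEXP[storm$PROPDMGEXP == "+"] <- 0
# Letter codes become powers of ten: billion, thousand, million, hundred
storm$PROPDMGEXP[storm$PROPDMGEXP == "b"] <- 9
storm$PROPDMGEXP[storm$PROPDMGEXP == "k"] <- 3
storm$PROPDMGEXP[storm$PROPDMGEXP == "m"] <- 6
storm$PROPDMGEXP[storm$PROPDMGEXP == "h"] <- 2
# Digit codes ("0" to "8") convert directly; missing exponents stay NA
storm$PROPDMGEXP <- as.numeric(storm$PROPDMGEXP)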
summary(storm$CROPDMGEXP)
## ? 0 2 B k K m M NA's
## 7 19 1 9 21 281832 1 1994 618413
storm$CROPDMGEXP <- as.character(tolower(storm$CROPDMGEXP))
storm$CROPDMGEXP[storm$CROPDMGEXP == "?"] <- 0
storm$CROPDMGEXP[storm$CROPDMGEXP == "b"] <- 9
storm$CROPDMGEXP[storm$CROPDMGEXP == "k"] <- 3
storm$CROPDMGEXP[storm$CROPDMGEXP == "m"] <- 6
storm$CROPDMGEXP <- as.numeric(storm$CROPDMGEXP)
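The same recoding is applied to both exponent columns, so it could be expressed once as a lookup. The sketch below is an alternative formulation, not the code used in this analysis; `exp_to_power` is a hypothetical helper name, and the mapping simply mirrors the assignments above.

```r
# Hypothetical helper (not part of the original analysis): map an exponent
# code to a power of ten using a named lookup vector
exp_to_power <- function(x) {
  x <- tolower(as.character(x))
  lookup <- c("-" = 0, "?" = 0, "+" = 0, "h" = 2, "k" = 3, "m" = 6, "b" = 9)
  out <- suppressWarnings(as.numeric(x))   # digit codes convert directly
  is_symbol <- x %in% names(lookup)        # NA is never matched, so it stays NA
  out[is_symbol] <- lookup[x[is_symbol]]
  out
}

powers <- exp_to_power(c("K", "m", "B", "h", "5", "?", NA))
print(powers)
```

Under this sketch the two columns would be recoded with `storm$PROPDMGEXP <- exp_to_power(storm$PROPDMGEXP)` and the equivalent call for `CROPDMGEXP`.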
Calculating the actual damage (in $)
storm$PROPDMG <- storm$PROPDMG*10^storm$PROPDMGEXP
storm$CROPDMG <- storm$CROPDMG*10^storm$CROPDMGEXP
# Assign zero where the computed damage is NA (for example, rows with a missing exponent)
storm$PROPDMG[which(is.na(storm$PROPDMG))] <- 0
storm$CROPDMG[which(is.na(storm$CROPDMG))] <- 0
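A short worked example of the conversion, using made-up values rather than rows from the database: a coefficient of 25 with exponent 3 becomes 25 × 10^3 = 25,000 dollars. Note that a row whose exponent is missing yields `NA` and is then counted as 0, even if its coefficient is non-zero; whether that is the desired treatment is an assumption worth checking against the data.

```r
# Made-up rows: a damage coefficient and an exponent already mapped to a power of ten
dmg   <- c(25, 1.5, 10)
power <- c(3, 6, NA)        # 3 = thousands, 6 = millions, NA = no exponent recorded

dollars <- dmg * 10^power   # 25 * 10^3, 1.5 * 10^6, and NA for the third row
dollars[is.na(dollars)] <- 0
print(dollars)
```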
Total damages (in $) due to the different weather events.
Economicdata <- aggregate(PROPDMG + CROPDMG ~ EVTYPE, data = storm, sum)
names(Economicdata)[2] <- "Overalldmg"
Economicdata <- Economicdata[order(-Economicdata$Overalldmg),]
library(ggplot2)
library(gridExtra)
head(injurydata)
## EVTYPE INJURIES
## 834 TORNADO 91346
## 856 TSTM WIND 6957
## 170 FLOOD 6789
## 130 EXCESSIVE HEAT 6525
## 464 LIGHTNING 5230
## 275 HEAT 2100
head(fatalitydata)
## EVTYPE FATALITIES
## 834 TORNADO 5633
## 130 EXCESSIVE HEAT 1903
## 153 FLASH FLOOD 978
## 275 HEAT 937
## 464 LIGHTNING 816
## 856 TSTM WIND 504
p1 <- ggplot(head(injurydata), aes(EVTYPE, INJURIES)) +
  geom_bar(stat = "identity") +
  xlab("Weather Events") +
  ylab("Number of Injuries") +
  ggtitle("Most harmful events with respect to population health")
p2 <- ggplot(head(fatalitydata), aes(EVTYPE, FATALITIES)) +
  geom_bar(stat = "identity") +
  xlab("Weather Events") +
  ylab("Number of Fatalities")
grid.arrange(p1, p2, ncol=1)
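By default ggplot2 places the bars in alphabetical order of `EVTYPE`, not in order of magnitude. A minimal sketch of one way to sort the bars by value, using `reorder()` from the stats package; the `top_injuries` data frame is an assumed name holding the six totals printed above.

```r
# Top six injury totals copied from the head(injurydata) output
top_injuries <- data.frame(
  EVTYPE   = c("TORNADO", "TSTM WIND", "FLOOD", "EXCESSIVE HEAT", "LIGHTNING", "HEAT"),
  INJURIES = c(91346, 6957, 6789, 6525, 5230, 2100)
)

# reorder() sorts the factor levels by a numeric key; negating it gives descending order
top_injuries$EVTYPE <- reorder(top_injuries$EVTYPE, -top_injuries$INJURIES)
print(levels(top_injuries$EVTYPE))

# Build the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(top_injuries, ggplot2::aes(EVTYPE, INJURIES)) +
    ggplot2::geom_bar(stat = "identity") +
    ggplot2::xlab("Weather Events") +
    ggplot2::ylab("Number of Injuries")
}
```

Because the ordering is stored in the factor levels, the same data frame can be passed to the plotting code without any other change.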
From the above plots, we can see the six weather events that cause the most harm to human health. TORNADO is clearly the most harmful event with respect to population health, leading in both injuries (91,346) and fatalities (5,633).
head(Economicdata)
## EVTYPE Overalldmg
## 170 FLOOD 150319678250
## 411 HURRICANE/TYPHOON 71913712800
## 834 TORNADO 57362333944
## 670 STORM SURGE 43323541000
## 244 HAIL 18761221926
## 153 FLASH FLOOD 18243990872
ggplot(head(Economicdata), aes(EVTYPE, Overalldmg)) +
  geom_bar(stat = "identity") +
  xlab("Weather Events") +
  ylab("Total damages (in Dollars)") +
  ggtitle("Most harmful events with the greatest economic consequences") +
  theme(axis.text.x=element_text(size=7))
From the above plot, we can see the six weather events that have the greatest economic consequences. FLOOD ranks first, with roughly $150 billion in combined property and crop damage.
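Totals of this size are easier to compare in billions of dollars. The sketch below rescales the six totals printed by `head(Economicdata)`; the `top_damage` data frame and the `Billions` column are names assumed for illustration.

```r
# Six largest damage totals copied from the head(Economicdata) output
top_damage <- data.frame(
  EVTYPE     = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO",
                 "STORM SURGE", "HAIL", "FLASH FLOOD"),
  Overalldmg = c(150319678250, 71913712800, 57362333944,
                 43323541000, 18761221926, 18243990872)
)

# Convert dollars to billions of dollars, rounded to one decimal place
top_damage$Billions <- round(top_damage$Overalldmg / 1e9, 1)
print(top_damage[, c("EVTYPE", "Billions")])
```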