SYNOPSIS

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This analysis uses the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database (1950-2011) to characterize major storms and weather events in the United States and to answer two basic questions: which types of events are most harmful with respect to population health, and which have the greatest impact on the economy?

The analysis shows that tornadoes are the weather event most harmful to population health, followed by excessive heat, thunderstorm winds (recorded as TSTM WIND), and floods. Floods caused the most economic damage of all severe weather event types, mostly through property losses, followed by hurricanes/typhoons and tornadoes.

DATA PROCESSING

  1. Load the required packages
library(plyr)
library(ggplot2)
  1. Read the storm dataset from NOAA
data <- read.csv("StormData.csv", header = TRUE, stringsAsFactors = FALSE)
  1. Extract and simplify the data that will be used
eventtype <- c('EVTYPE','FATALITIES','INJURIES','PROPDMG','PROPDMGEXP','CROPDMG','CROPDMGEXP')
stormdata <- data[, eventtype]
  1. Clean the data
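# Helper that strips leading and trailing whitespace from a character vector
newdata <- function(x) gsub("^\\s+|\\s+$", "", x)
# Upper-case and trim the labels so spelling variants of one event type merge
stormdata$EVTYPE <- toupper(stormdata$EVTYPE)
stormdata$EVTYPE <- newdata(stormdata$EVTYPE)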
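The effect of the cleaning step can be checked on a small vector. The sketch below uses made-up labels (not rows from the NOAA file) and a hypothetical helper name, `trim`, and compares the regex with base R's `trimws()`, which should give the same result for ordinary spaces.

```r
# Hypothetical raw labels, invented for illustration only
raw <- c("  tstm wind", "TSTM WIND ", "Tornado")

# Same regex as the cleaning step: strip leading and trailing whitespace
trim <- function(x) gsub("^\\s+|\\s+$", "", x)
cleaned <- trim(toupper(raw))

# trimws() is the base R equivalent (available since R 3.2.0)
same_as_trimws <- identical(cleaned, trimws(toupper(raw)))

# After cleaning, the two spellings of "TSTM WIND" collapse into one label
distinct_labels <- unique(cleaned)
```

Without this normalization, `"  tstm wind"` and `"TSTM WIND "` would be summed as separate event types in the later aggregation.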
  1. Calculate the Property Damage
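# Map exponent codes to multipliers: blank, -, + and ? -> 1; H -> 100; K -> 1e3; M -> 1e6; B -> 1e9.
# Digit codes are left untouched and are later converted to their face value by as.numeric().
stormdata$PROPDMGEXP[is.na(stormdata$PROPDMGEXP)] <- 0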
stormdata$PROPDMGEXP[stormdata$PROPDMGEXP == ""] <- 1
stormdata$PROPDMGEXP[grep("[-+?]", stormdata$PROPDMGEXP)] <- 1
stormdata$PROPDMGEXP[grep("[Hh]", stormdata$PROPDMGEXP)] <- 100
stormdata$PROPDMGEXP[grep("[Kk]", stormdata$PROPDMGEXP)] <- 1000
stormdata$PROPDMGEXP[grep("[Mm]", stormdata$PROPDMGEXP)] <- 1e+06
stormdata$PROPDMGEXP[grep("[Bb]", stormdata$PROPDMGEXP)] <- 1e+09
stormdata$PROPDMGEXP <- as.numeric(stormdata$PROPDMGEXP)
stormdata$PROPDMG <- stormdata$PROPDMGEXP * stormdata$PROPDMG
  1. Calculate the Crop Damage
# Same exponent-to-multiplier mapping as used for property damage
stormdata$CROPDMGEXP[is.na(stormdata$CROPDMGEXP)] <- 0
stormdata$CROPDMGEXP[stormdata$CROPDMGEXP == ""] <- 1
stormdata$CROPDMGEXP[grep("[-+?]", stormdata$CROPDMGEXP)] <- 1
stormdata$CROPDMGEXP[grep("[Hh]", stormdata$CROPDMGEXP)] <- 100
stormdata$CROPDMGEXP[grep("[Kk]", stormdata$CROPDMGEXP)] <- 1000
stormdata$CROPDMGEXP[grep("[Mm]", stormdata$CROPDMGEXP)] <- 1e+06
stormdata$CROPDMGEXP[grep("[Bb]", stormdata$CROPDMGEXP)] <- 1e+09
stormdata$CROPDMGEXP <- as.numeric(stormdata$CROPDMGEXP)
stormdata$CROPDMG <- stormdata$CROPDMGEXP * stormdata$CROPDMG
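Because the property and crop steps apply the same mapping twice, the conversion could be expressed once as a lookup. The following is a minimal sketch under that assumption; `multipliers` and `exp_to_multiplier` are hypothetical names that do not appear in the original analysis, and the sample codes and amounts are invented. It mirrors the rules of the steps above (missing -> 0; blank, `-`, `+`, `?` -> 1; digit codes at face value), except that it matches single-character codes exactly rather than by pattern.

```r
# Multipliers for the letter codes, keyed by the upper-cased code
multipliers <- c(H = 100, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: turn a vector of exponent codes into numeric multipliers
exp_to_multiplier <- function(code) {
  missing <- is.na(code)
  code <- toupper(ifelse(missing, "", code))
  # Digit codes keep their face value; everything else becomes NA for now
  out <- suppressWarnings(as.numeric(code))
  # Blank and symbol codes count as a multiplier of 1
  out[code %in% c("", "-", "+", "?")] <- 1
  # Letter codes are looked up in the table
  known <- code %in% names(multipliers)
  out[known] <- multipliers[code[known]]
  # Missing codes zero out the damage, as in the steps above
  out[missing] <- 0
  out
}

# Invented sample values for illustration
codes   <- c("K", "m", "", "B", "+", "5", NA)
amounts <- c(25, 2.5, 10, 1, 3, 2, 7)
dollars <- amounts * exp_to_multiplier(codes)
```

With such a helper, each damage column would need a single line, for example `stormdata$PROPDMG * exp_to_multiplier(stormdata$PROPDMGEXP)`.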
  1. Calculate the sums of each event type
health <- ddply(stormdata, .(EVTYPE), summarize, fatalities = sum(FATALITIES), injuries = sum(INJURIES), totalhealth = sum(FATALITIES + INJURIES))
rankedfatalities <- head(health[order(health$fatalities, decreasing = TRUE), ], n = 10)[c(1,2)]
rankedinjuries <- head(health[order(health$injuries, decreasing = TRUE), ], n = 10)[c(1,3)]
rankedhealth <- head(health[order(health$totalhealth, decreasing = TRUE), ], n = 10)[c(1,4)]

damage <- ddply(stormdata, .(EVTYPE), summarize, prop = sum(PROPDMG), crop = sum(CROPDMG), totaldamage = sum(CROPDMG + PROPDMG))
rankedprop <- head(damage[order(damage$prop, decreasing = TRUE), ], n = 10)[c(1,2)]
rankedcrop <- head(damage[order(damage$crop, decreasing = TRUE), ], n = 10)[c(1,3)]
rankeddamage <- head(damage[order(damage$totaldamage, decreasing = TRUE), ], n = 10)
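The sum-then-rank pattern used above can be illustrated on a toy data frame. This sketch uses base R's `aggregate()` in place of `ddply()`, purely as an alternative that needs no extra package; the rows in `toy` are invented and the names `totals` and `ranked` are hypothetical.

```r
# Invented rows standing in for the storm data
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 0, 5, 1, 2),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries per event type
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
totals$totalhealth <- totals$FATALITIES + totals$INJURIES

# Sort by the combined count and keep the top rows
ranked <- head(totals[order(totals$totalhealth, decreasing = TRUE), ], n = 2)
```

Here `ranked` lists TORNADO first (5 fatalities plus 15 injuries) and HEAT second, which is the same ordering logic applied to the full dataset with `n = 10`.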
  1. Plot the events with the highest fatalities and injuries
ggplot(rankedfatalities, aes(x = EVTYPE, y = fatalities, fill = EVTYPE)) + geom_bar(stat = "identity") + labs(x = "Type of Events", y = "Fatalities", fill = "Type of Events") + ggtitle("Events with Highest Fatalities") + theme(axis.text.x = element_blank())

ggplot(rankedinjuries, aes(x = EVTYPE, y = injuries, fill = EVTYPE)) + geom_bar(stat = "identity") + labs(x = "Type of Events", y = "Injuries", fill = "Type of Events") + ggtitle("Events with Highest Injuries") + theme(axis.text.x = element_blank())

  1. Plot the events with the highest property and crop damages
ggplot(rankedprop, aes(x = EVTYPE, y = prop, fill = EVTYPE)) + geom_bar(stat = "identity") + labs(x = "Type of Events", y = "Property Damages", fill = "Type of Events") + ggtitle("Events with Highest Property Damages") + theme(axis.text.x = element_blank())

ggplot(rankedcrop, aes(x = EVTYPE, y = crop, fill = EVTYPE)) + geom_bar(stat = "identity") + labs(x = "Type of Events", y = "Crop Damages", fill = "Type of Events") + ggtitle("Events with Highest Crop Damages") + theme(axis.text.x = element_blank())

RESULTS AND CONCLUSIONS

  1. Events that are most harmful with respect to population health
ggplot(rankedhealth, aes(x = EVTYPE, y = totalhealth, fill = EVTYPE)) + geom_bar(stat = "identity") + labs(x = "Type of Events", y = "Fatalities and Injuries", fill = "Type of Events") + ggtitle("Top 10 Most Harmful Events to Human Health") + theme(axis.text.x = element_blank())

  1. Events that have the greatest impact on the economy
ggplot(rankeddamage, aes(x = EVTYPE, y = totaldamage, fill = EVTYPE)) + geom_bar(stat = "identity") + labs(x = "Type of Events", y = "Property and Crop Damages", fill = "Type of Events") + ggtitle("Top 10 Events with the Greatest Impact on the Economy") + theme(axis.text.x = element_blank())
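In the plots above, ggplot2 arranges the bars by the alphabetical order of `EVTYPE`, so the ranking is not visible from bar position alone. One possible refinement, sketched here with invented totals, is to reorder the factor levels by value with base R's `reorder()` before plotting. The data frame `ranked` and its numbers are hypothetical, and the plot is only built when ggplot2 is installed.

```r
# Invented totals for illustration only
ranked <- data.frame(
  EVTYPE      = c("FLOOD", "HEAT", "TORNADO"),
  totalhealth = c(3, 5, 20),
  stringsAsFactors = FALSE
)

# Order factor levels by descending total so bars run from largest to smallest
ranked$EVTYPE <- reorder(ranked$EVTYPE, -ranked$totalhealth)
level_order <- levels(ranked$EVTYPE)

# Build the plot only if ggplot2 is available; call print(p) to draw it
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ranked, aes(x = EVTYPE, y = totalhealth, fill = EVTYPE)) +
    geom_bar(stat = "identity") +
    labs(x = "Type of Events", y = "Fatalities and Injuries", fill = "Type of Events")
}
```

The legend then follows the same descending order as the bars, which makes the blank x-axis labels easier to read against it.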