Introduction / Synopsis:

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Analysis of the NOAA data identified tornadoes as the single event type with the greatest impact, both in damage (property plus crops) and in harm to people (injuries plus fatalities).

Setup and load files

library(dplyr)
library(reshape2)
library(ggplot2)
library(ggthemes)
# Download the compressed file (binary mode, so the archive is not corrupted on Windows)
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "Storm.csv.bz2", mode = "wb")
storm <- read.csv(bzfile("Storm.csv.bz2"))
unlink("Storm.csv.bz2")

Data Processing

# Selects only the data columns which include type of disaster, fatality and injury counts.
keep <- c("EVTYPE", "FATALITIES","INJURIES")
storm_data <- storm[keep]

# Sum fatalities and injuries per event type
# (passing sum directly avoids the deprecated funs() helper and its warning)
fatalities_and_injuries_sum <- storm_data %>% group_by(EVTYPE) %>% summarise_all(sum)

# sort by number of fatalities and injuries
fatalities_and_injuries_sum <- fatalities_and_injuries_sum %>% mutate(total = FATALITIES + INJURIES)
fatalities_and_injuries_sum <- arrange(fatalities_and_injuries_sum, -total)
fatalities_and_injuries_sum$total <- NULL

# Keep only the top 20 event types by fatalities plus injuries
top20_events_by_fatlities_injuries <- fatalities_and_injuries_sum[1:20,]
top20_events_by_fatlities_injuries_melt <- melt(top20_events_by_fatlities_injuries, id.vars = "EVTYPE")
# The first 20 melted rows hold the event types in ranked order; use them as factor levels so plots keep that order
top20_events_by_fatlities_injuries_melt$EVTYPE <- factor(top20_events_by_fatlities_injuries_melt$EVTYPE, levels = top20_events_by_fatlities_injuries_melt$EVTYPE[1:20])

# Selects only data columns which include event type, property damage and crop damage
keep <- c("EVTYPE", "PROPDMG","CROPDMG")
storm_data <- storm[keep]

# Sum property and crop damage per event type, then sort by the combined total
property_and_crop_sum <- storm_data %>% group_by(EVTYPE) %>% summarise_all(sum)
property_and_crop_sum <- property_and_crop_sum %>% mutate(total = PROPDMG + CROPDMG)
property_and_crop_sum <- arrange(property_and_crop_sum, -total)

# Keep only the top 20 event types by property plus crop damage
top20_property_and_crop <- property_and_crop_sum[1:20,]
top20_property_and_crop$total <- NULL
top20_property_and_crop_melt <- melt(top20_property_and_crop, id.vars = "EVTYPE")
# The first 20 melted rows hold the event types in ranked order; use them as factor levels so plots keep that order
top20_property_and_crop_melt$EVTYPE <- factor(top20_property_and_crop_melt$EVTYPE, levels = top20_property_and_crop_melt$EVTYPE[1:20])
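
The damage aggregation above sums PROPDMG and CROPDMG as stored. In the NOAA file these columns are paired with PROPDMGEXP and CROPDMGEXP, which hold a magnitude code such as K, M, or B. The following is a minimal sketch of converting the values to dollars before summing, run on a made-up three-row data frame rather than the real data. The helper name exp_multiplier, the prop_usd and crop_usd column names, and the choice to treat every unrecognized code as a multiplier of 1 are assumptions for illustration; the figures reported in this document were not produced this way.

```r
# Map a damage-exponent code to a numeric multiplier.
# Assumption: only H/K/M/B (any case) are decoded; every other code,
# including blanks, digits, and symbols, is treated as a multiplier of 1.
exp_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  mult[is.na(mult)] <- 1
  mult
}

# Hypothetical toy rows that mimic the relevant storm columns
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD"),
  PROPDMG    = c(25, 2.5, 1),
  PROPDMGEXP = c("K", "M", "B"),
  CROPDMG    = c(0, 10, 5),
  CROPDMGEXP = c("", "k", "M"),
  stringsAsFactors = FALSE
)

# Scale each damage value by its multiplier, then sum per event type
toy$prop_usd <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
toy$crop_usd <- toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)
damage_usd <- aggregate(cbind(prop_usd, crop_usd) ~ EVTYPE, data = toy, FUN = sum)
damage_usd$total_usd <- damage_usd$prop_usd + damage_usd$crop_usd
```

On the real data the same two multiplications would be applied to storm before grouping by EVTYPE; rankings based on scaled values could differ from rankings based on raw sums.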

Plots

Two separate bar plots show the top 20 event types by fatalities and injuries, and the top 20 event types by property and crop damage.
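
The plotting code is not reproduced in this section. As a rough sketch of how one such stacked bar plot could be drawn with ggplot2, the function below takes a melted data frame with the columns EVTYPE, variable, and value, which is the shape melt() produces in the processing step. The function name plot_top_events, the axis labels, and the toy values are assumptions, not the original plotting code.

```r
library(ggplot2)

# Draw one stacked bar per event type, split by measure (e.g. FATALITIES / INJURIES).
# Bars follow the factor level order of EVTYPE, so a ranked factor gives a ranked plot.
plot_top_events <- function(melted, title, y_label) {
  ggplot(melted, aes(x = EVTYPE, y = value, fill = variable)) +
    geom_col() +
    labs(title = title, x = "Event type", y = y_label, fill = NULL) +
    theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
}

# Hypothetical toy data in the same long format as the melted top-20 tables
toy_melt <- data.frame(
  EVTYPE   = factor(rep(c("EVENT_A", "EVENT_B"), times = 2),
                    levels = c("EVENT_A", "EVENT_B")),
  variable = rep(c("FATALITIES", "INJURIES"), each = 2),
  value    = c(10, 3, 40, 7)
)

p <- plot_top_events(toy_melt, "Top events by fatalities and injuries (toy data)", "Count")
```

Calling print(p) renders the figure; passing top20_events_by_fatlities_injuries_melt or top20_property_and_crop_melt instead of toy_melt would plot the tables built earlier.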

Conclusion:

Tornadoes have both the highest combined fatalities and injuries and the highest economic impact in the data analyzed. Excessive heat ranks second for fatalities plus injuries, at ~8.5% of the tornado total. Flash floods rank second for property plus crop damage, at ~48% of the tornado total.
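
The percentages in the conclusion express each event type's total as a share of the top-ranked event's total. A small sketch of that arithmetic follows; the helper name relative_to_top and the toy totals are hypothetical and are not values from the storm data.

```r
# Express each total as a percentage of the largest total, highest first
relative_to_top <- function(totals) {
  sorted <- sort(totals, decreasing = TRUE)
  round(100 * sorted / sorted[1], 1)
}

# Hypothetical totals for three made-up event types
toy_totals <- c(EVENT_A = 200, EVENT_B = 17, EVENT_C = 50)
shares <- relative_to_top(toy_totals)
```

Applied to the total column of fatalities_and_injuries_sum or property_and_crop_sum (before that column is dropped), the second element would give the runner-up's share of the top event.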