Introduction

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project explored the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which tracks characteristics of major storms and weather events in the United States, including estimates of fatalities, injuries, and property damage, from 1950 to 2011. The exploration sought to answer two questions about severe weather events across the United States: which types of events cause the most casualties (fatalities and injuries), and which cause the most economic damage (property and crop damage)?

Synopsis

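The CSV file containing the raw data was downloaded from the internet and loaded into R, where the event type, casualty, and damage columns were selected and the damage exponent codes were converted to numeric multipliers. Casualties and economic damage were then totaled by event type, and the ten event types with the highest totals were plotted.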

Data Processing

library(dplyr)
library(ggplot2)
library(forcats)
# download data, if necessary
data_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
file_name <- "storm_data_csv.bz2"

if(!file.exists(file_name)){
        download.file(data_url, destfile = file_name, method = "curl")
}

# load data from disk
raw_data <- read.csv(file_name, header = TRUE)
# process data for analysis: keep only the columns needed and convert the
# damage exponent codes (K, M, B) to numeric multipliers
cleaned_data <- raw_data %>%
        select(EVTYPE, FATALITIES, INJURIES,
               PROPDMG, PROPDMGEXP, CROPDMG,
               CROPDMGEXP) %>%
        mutate(EVTYPE = as.factor(EVTYPE),
               # any code other than "K", "M", or "B" gets a multiplier of 0
               PROPDMGEXP = case_when(PROPDMGEXP == "K" ~ 1e3,
                                      PROPDMGEXP == "M" ~ 1e6,
                                      PROPDMGEXP == "B" ~ 1e9,
                                      TRUE ~ 0
                                      ),
               CROPDMGEXP = case_when(CROPDMGEXP == "K" ~ 1e3,
                                      CROPDMGEXP == "M" ~ 1e6,
                                      CROPDMGEXP == "B" ~ 1e9,
                                      TRUE ~ 0
                                      )
               )
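
To make the exponent conversion easier to follow, here is a minimal sketch on a toy data frame with invented values. The helper `exp_to_multiplier()` is hypothetical and not part of the original analysis. It also assumes that lowercase codes such as "m" should be read like their uppercase forms, which differs from the processing step above, where lowercase codes receive a multiplier of 0.

```r
# Hypothetical helper (an assumption, not the report's code): map a damage
# exponent code to a numeric multiplier using a named lookup vector.
exp_to_multiplier <- function(code) {
        lookup <- c(K = 1e3, M = 1e6, B = 1e9)
        # toupper() is the assumed extension: "k"/"m"/"b" are treated like "K"/"M"/"B"
        mult <- unname(lookup[toupper(as.character(code))])
        mult[is.na(mult)] <- 0  # unrecognized or blank codes contribute nothing
        mult
}

# Toy rows with invented values, for illustration only
toy <- data.frame(PROPDMG = c(25, 2.5, 10, 5),
                  PROPDMGEXP = c("K", "M", "", "m"))

# Damage in dollars = reported amount * multiplier
toy$prop_damage <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
print(toy)
```

Whether to accept lowercase codes is a judgment call. It changes only a small number of rows in a data set of this kind, but it should be stated explicitly if adopted.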

Results

# plot total casualties for the top 10 weather events
cleaned_data %>%
        group_by(EVTYPE) %>%
        summarise(casualties = sum(FATALITIES + INJURIES, na.rm = TRUE)) %>%
        ungroup() %>%
        # keep the 10 event types with the most casualties
        slice_max(casualties, n = 10) %>%
        ggplot(aes(x=casualties, y=fct_reorder(EVTYPE, casualties))) +
        geom_col() +
        labs(y = "Severe Weather Event",
             x = "Total Number of Direct or Indirect Casualties",
             title = "Top 10 Severe Weather Events with the Most Casualties in the United States from 1950 - 2011"
             ) +
        theme_classic() +
        theme(axis.title.x = element_text(size=11, face="bold", color = "black"),
              axis.title.y = element_text(size=11, face="bold", color = "black"),
              plot.title = element_text(size=12, face="bold", color = "black")
              )

Findings/Conclusion #1

Tornadoes were the severe weather events that resulted in the most casualties (fatalities plus injuries) in the United States from 1950 to 2011.
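
The ranking behind a plot like this can also be checked as a plain table. The sketch below uses base R on a toy data frame whose values are invented for illustration. It is not the report's data or code, and the names `toy`, `totals`, and `top3` are assumptions.

```r
# Toy data with invented values, for illustration only
toy <- data.frame(
        EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "HAIL"),
        FATALITIES = c(2, 1, 0, 3, 0, 0),
        INJURIES = c(10, 0, 5, 1, 2, 0)
)

# Casualties per record, then summed by event type
toy$casualties <- toy$FATALITIES + toy$INJURIES
totals <- aggregate(casualties ~ EVTYPE, data = toy, FUN = sum)

# Sort from most to fewest casualties and keep the top 3
ranked <- totals[order(totals$casualties, decreasing = TRUE), ]
top3 <- head(ranked, 3)
print(top3)
```

Printing the table alongside the bar chart makes it easy to confirm that the bar order matches the underlying totals.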

# plot economic damage (property + crop) for the top 10 weather events
cleaned_data %>%
        group_by(EVTYPE) %>%
        summarise(econ_damage = sum((PROPDMG * PROPDMGEXP) + (CROPDMG * CROPDMGEXP),
                                    na.rm = TRUE)) %>%
        ungroup() %>%
        # keep the 10 event types with the most economic damage
        slice_max(econ_damage, n = 10) %>%
        ggplot(aes(x=econ_damage, y=fct_reorder(EVTYPE, econ_damage))) +
        geom_col() +
        labs(y = "Severe Weather Event",
             x = "Total Economic Damage (USD)",
             title = "Top 10 Severe Weather Events with the Most Economic Damage from 1950 - 2011"
             ) +
        theme_classic() +
        theme(axis.title.x = element_text(size=11, face="bold", color = "black"),
              axis.title.y = element_text(size=11, face="bold", color = "black"),
              plot.title = element_text(size=12, face="bold", color = "black")
              )

Findings/Conclusion #2

Floods were the severe weather events that resulted in the most economic damage (property plus crop damage) in the United States from 1950 to 2011.