Synopsis

This analysis explores the NOAA Storm Database to identify the most harmful weather events in the United States in terms of public health and economic impact. The data includes events from 1950 to 2011. The analysis focuses on two main questions:

  1. Which types of events are most harmful to population health (fatalities and injuries)?
  2. Which types of events have the greatest economic consequences (property and crop damage)?

Results show that tornadoes cause the most combined fatalities and injuries, while floods cause the greatest combined property and crop damage.

Data Processing

knitr::opts_chunk$set(echo = TRUE, cache = TRUE)
library(dplyr)
library(ggplot2)
library(readr)
# Load the raw data
data <- read.csv("repdata_data_StormData.csv.bz2")
# Select relevant columns
storm_data <- data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

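# Convert a damage exponent code to a numeric multiplier (case-insensitive).
# K, M, and B mean thousands, millions, and billions; H is assumed to mean hundreds.
# Any other code (blank, digit, or symbol) is treated as a multiplier of 1.
exp_converter <- function(exp) {
  exp <- toupper(exp)
  ifelse(exp %in% "H", 1e2,
         ifelse(exp %in% "K", 1e3,
                ifelse(exp %in% "M", 1e6,
                       ifelse(exp %in% "B", 1e9, 1))))
}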

# Calculate damage
storm_data <- storm_data %>%
  mutate(
    PROPDMGVAL = PROPDMG * exp_converter(PROPDMGEXP),
    CROPDMGVAL = CROPDMG * exp_converter(CROPDMGEXP),
    TOTALDMG = PROPDMGVAL + CROPDMGVAL
  )
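
To make the conversion concrete, the following minimal sketch applies the same K/M/B mapping to a small made-up data frame. The values and the name `toy` are illustrative assumptions, not rows from the NOAA file.

```r
# Toy rows (made up for illustration; not taken from the NOAA data)
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Same K/M/B mapping as exp_converter above; a blank code counts as 1
multiplier <- ifelse(toy$PROPDMGEXP %in% c("K", "k"), 1e3,
              ifelse(toy$PROPDMGEXP %in% c("M", "m"), 1e6,
              ifelse(toy$PROPDMGEXP %in% c("B", "b"), 1e9, 1)))

# Damage in US dollars = reported figure times its multiplier
toy$PROPDMGVAL <- toy$PROPDMG * multiplier
print(toy)
```

A row reported as 25 with code "K" therefore becomes 25,000 dollars, while a row with a blank code keeps its reported figure unchanged.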

Results

Most Harmful Events to Population Health

health_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(
    Fatalities = sum(FATALITIES, na.rm = TRUE),
    Injuries = sum(INJURIES, na.rm = TRUE)
  ) %>%
  mutate(Total = Fatalities + Injuries) %>%
  arrange(desc(Total)) %>%
  top_n(10, Total)

ggplot(health_impact, aes(x = reorder(EVTYPE, -Total), y = Total)) +
  geom_bar(stat = "identity", fill = "tomato") +
  labs(title = "Top 10 Weather Events Causing Most Health Impact",
       x = "Event Type", y = "Total Injuries + Fatalities") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
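
The ranking above sums fatalities and injuries per event type and then sorts by the combined total. As a minimal sketch of that aggregation, here is the same pipeline on a few made-up records. The data frame `toy_events` and its values are hypothetical and chosen only so the arithmetic is easy to follow.

```r
library(dplyr)

# Made-up event records for illustration only
toy_events <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(20, 5, 10, 1, 2),
  stringsAsFactors = FALSE
)

# Sum each measure per event type, combine them, and rank by the total
toy_health <- toy_events %>%
  group_by(EVTYPE) %>%
  summarise(
    Fatalities = sum(FATALITIES, na.rm = TRUE),
    Injuries   = sum(INJURIES, na.rm = TRUE)
  ) %>%
  mutate(Total = Fatalities + Injuries) %>%
  arrange(desc(Total))

print(toy_health)
```

Because the total weights one fatality the same as one injury, an event type with few deaths but many injuries can outrank one with more deaths. Keeping the `Fatalities` and `Injuries` columns in the summary makes that trade-off visible.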

Events with the Greatest Economic Consequences

economic_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(Economic_Damage = sum(TOTALDMG, na.rm = TRUE)) %>%
  arrange(desc(Economic_Damage)) %>%
  top_n(10, Economic_Damage)

ggplot(economic_impact, aes(x = reorder(EVTYPE, -Economic_Damage), y = Economic_Damage / 1e9)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(title = "Top 10 Weather Events Causing Most Economic Damage",
       x = "Event Type", y = "Damage (in Billion USD)") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
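
Both rankings group by the raw EVTYPE label, so labels that differ only in letter case or surrounding whitespace would be counted as separate event types. The analysis above does not check whether this occurs. A minimal sketch of a normalization step, using made-up labels rather than values from the dataset, might look like this:

```r
# Made-up labels showing case and whitespace variants (hypothetical values)
raw_labels <- c("TORNADO", "Tornado", " FLOOD", "flood ", "HEAT")

# Trim surrounding whitespace, then convert to upper case
clean_labels <- toupper(trimws(raw_labels))

# Count how many records fall under each normalized label
print(table(clean_labels))
```

Applying such a step to EVTYPE before `group_by()` would merge these variants. It would not merge labels that are spelled differently.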

Conclusion

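Tornadoes are the most harmful event type for population health, and floods cause the greatest economic damage. This information can help municipalities prioritize resources for disaster preparedness and response.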