Synopsis

This analysis identifies which types of severe weather event in the United States do the most harm to public health and the economy. Using the NOAA Storm Database (1950–2011), we summed fatalities and injuries by event type to measure health impact, and summed property and crop damage to measure economic impact. The damage figures first had to be cleaned: each amount is stored with a letter code for hundreds, thousands, millions, or billions, which we converted to a numeric multiplier. Tornadoes are the most harmful to population health, causing the highest combined number of fatalities and injuries. Floods cause the greatest economic damage, followed by Hurricanes/Typhoons and Storm Surges.

Data Processing

1. Loading the Data

We download the dataset from the provided URL, skipping the download if the file already exists, and read it into R. Because the file is large, this code chunk uses the cache = TRUE option to speed up subsequent knits of the document.

library(dplyr)
library(ggplot2)
# Using a clean URL string
fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
destfile <- "StormData.csv.bz2"

# Download using "libcurl" which is more reliable for HTTPS
if (!file.exists(destfile)) {
    download.file(fileUrl, destfile, method = "libcurl")
}

# This step might take a few minutes!
stormData <- read.csv(destfile)
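
The call to read.csv() needs no separate decompression step because R's file connections detect bzip2 compression when a file is opened for reading. The sketch below illustrates this on a tiny temporary file; the names toy_path and toy_storms are hypothetical, and the toy rows are invented for illustration rather than taken from the NOAA data.

```r
# Write a tiny bzip2-compressed CSV to a temporary file (toy data, not NOAA's)
toy_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(toy_path, open = "w")
write.csv(data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1)),
          con, row.names = FALSE)
close(con)

# read.csv() is given the compressed path directly; the file is
# decompressed on the fly while it is read
toy_storms <- read.csv(toy_path)
str(toy_storms)

# Clean up the temporary file
unlink(toy_path)
```

The same mechanism is what lets the report pass StormData.csv.bz2 straight to read.csv().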

2. Cleaning Health Impact Data

To address the question of which events are most harmful to population health, we sum the FATALITIES and INJURIES columns by event type (EVTYPE). We focus on the top 10 events to ensure the final visualization is clear.

health_impact <- stormData %>%
    group_by(EVTYPE) %>%
    summarise(Total_Fatalities = sum(FATALITIES), 
              Total_Injuries = sum(INJURIES),
              Total_Health = sum(FATALITIES + INJURIES)) %>%
    arrange(desc(Total_Health))

top_health <- head(health_impact, 10)
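
As a sanity check on the aggregation logic, the same kind of totals can be reproduced in base R on a small hand-made table. This is only a sketch: the toy rows are invented, the names toy and toy_health are hypothetical, and aggregate() stands in for the dplyr pipeline rather than replacing it.

```r
# Toy data (invented for illustration; not taken from the NOAA file)
toy <- data.frame(
    EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
    FATALITIES = c(3, 1, 2, 4, 0),
    INJURIES   = c(10, 0, 5, 1, 2)
)

# Sum fatalities and injuries within each event type
toy_health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                        data = toy, FUN = sum)

# Combined health impact, sorted from most to least harmful
toy_health$Total_Health <- toy_health$FATALITIES + toy_health$INJURIES
toy_health <- toy_health[order(-toy_health$Total_Health), ]
print(toy_health)
```

Here the two TORNADO rows combine to 5 fatalities and 15 injuries, so TORNADO ranks first with a combined total of 20.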

3. Cleaning Economic Impact Data

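Economic consequences are measured using property damage (PROPDMG) and crop damage (CROPDMG). Each amount is scaled by a letter code stored in a companion column (PROPDMGEXP and CROPDMGEXP): H for hundreds, K for thousands, M for millions, and B for billions. We convert these codes to numeric multipliers to calculate the total cost in dollars; any other code, including a blank, is treated as a multiplier of 1.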

# Function to convert character codes to numeric multipliers
get_multiplier <- function(exp) {
    exp <- toupper(exp)
    if (exp == 'H') return(100)
    if (exp == 'K') return(1000)
    if (exp == 'M') return(1e+06)
    if (exp == 'B') return(1e+09)
    return(1)
}

# Apply multipliers to create total damage columns
stormData$PROP_MULT <- sapply(stormData$PROPDMGEXP, get_multiplier)
stormData$CROP_MULT <- sapply(stormData$CROPDMGEXP, get_multiplier)

economic_impact <- stormData %>%
    mutate(Total_Damage = (PROPDMG * PROP_MULT) + (CROPDMG * CROP_MULT)) %>%
    group_by(EVTYPE) %>%
    summarise(Grand_Total_Damage = sum(Total_Damage)) %>%
    arrange(desc(Grand_Total_Damage))

top_econ <- head(economic_impact, 10)
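
Calling get_multiplier() once per row through sapply() can be slow on a table of this size. One possible alternative, sketched here under the assumption that the same four codes and the same default of 1 are wanted, is a named-vector lookup that converts a whole column in one step. The names exp_lookup and get_multiplier_vec are hypothetical, and the toy rows are invented for illustration.

```r
# Multipliers for the letter codes handled by get_multiplier()
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical vectorized counterpart of get_multiplier()
get_multiplier_vec <- function(exp) {
    mult <- unname(exp_lookup[toupper(as.character(exp))])
    mult[is.na(mult)] <- 1   # unknown or blank codes fall back to 1
    mult
}

# Toy rows (invented): a damage amount and its scale code
toy <- data.frame(PROPDMG    = c(25, 2.5, 1.2, 40),
                  PROPDMGEXP = c("K", "m", "B", ""))

# Dollar value = amount * multiplier, computed for the whole column at once
toy$PROP_DOLLARS <- toy$PROPDMG * get_multiplier_vec(toy$PROPDMGEXP)
print(toy)
```

In this toy table, 25 with code K becomes 25,000 dollars, 2.5 with code m becomes 2,500,000 dollars, and the blank code leaves 40 unchanged.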

Results

1. Events Most Harmful to Population Health

Health impact is measured as the combined total of fatalities and injuries for each event type. In the NOAA database, Tornadoes account for more fatalities and injuries than any other event type in the United States, followed by Excessive Heat and TSTM Wind (thunderstorm wind).

library(ggplot2)
ggplot(top_health, aes(x = reorder(EVTYPE, -Total_Health), y = Total_Health)) +
    geom_bar(stat = "identity", fill = "firebrick") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    labs(title = "Top 10 Most Harmful Weather Events (1950-2011)",
         subtitle = "Calculated as the sum of Fatalities and Injuries",
         x = "Event Type",
         y = "Total Fatalities and Injuries")

Figure 1: Top 10 Weather Events Affecting Population Health

As shown in Figure 1, the combined fatalities and injuries caused by Tornadoes far exceed those of every other category, making them the primary public health concern among severe weather events.

2. Events with Greatest Economic Consequences

To determine which events have the greatest economic impact, we analyzed the sum of property and crop damage. After normalizing the damage multipliers, we found that Floods cause the highest total economic loss.

ggplot(top_econ, aes(x = reorder(EVTYPE, -Grand_Total_Damage), y = Grand_Total_Damage / 1e+09)) +
    geom_bar(stat = "identity", fill = "steelblue") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    labs(title = "Top 10 Weather Events with Greatest Economic Impact",
         subtitle = "Total damage in Billions of USD (Property + Crops)",
         x = "Event Type",
         y = "Total Damage (Billions of USD)")

Figure 2: Top 10 Weather Events with Greatest Economic Impact

As illustrated in Figure 2, Floods are responsible for over $150 billion in damage, well above the totals for Hurricanes/Typhoons or Storm Surges. While Tornadoes are the most dangerous to people, Floods and tropical storm systems are the most destructive to property and crops.