Population Health and Economic Impact of Weather Events - NOAA Storm Data, 1950-2011

Synopsis

This analysis explores the U.S. National Oceanic and Atmospheric Administration (NOAA) Storm Database from 1950 to November 2011 to identify the severe weather events with the greatest population health and economic impact. The raw CSV data file is loaded and processed entirely within this document. Population health harm is calculated as the sum of fatalities and injuries. Economic damage is calculated as the sum of property damage and crop damage, after converting the exponent codes (e.g., ‘K’, ‘M’, ‘B’) into numeric dollar multipliers. The results show that tornadoes are the most harmful event type for population health, with 96,979 combined fatalities and injuries. Floods are the most costly event type, with about $150 billion in total damage, followed by hurricanes/typhoons and tornadoes. These findings can help municipal managers prioritize resources when preparing for severe weather.

Data Processing

Load libraries

library(dplyr)
library(ggplot2)
library(knitr)

Set working directory

setwd("C:\\Users\\aliss\\OneDrive\\Documents\\04-Professional Development\\Coursera_Data Science - Foundations using R Specialization_202507\\Course 5 - Reproducible Research\\Course project 2")

Load the data

storm_data <- read.csv("repdata_data_StormData.csv", header = TRUE)

Select only columns needed for analysis

storm_data_filtered <- storm_data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

Display the first few rows of the filtered data

head(storm_data_filtered)
##    EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO          0       15    25.0          K       0           
## 2 TORNADO          0        0     2.5          K       0           
## 3 TORNADO          0        2    25.0          K       0           
## 4 TORNADO          0        2     2.5          K       0           
## 5 TORNADO          0        2     2.5          K       0           
## 6 TORNADO          0        6     2.5          K       0

Create function to convert exponent character codes to numeric multipliers

exponent_to_multiplier <- function(exp) {
  # Convert to uppercase for case-insensitivity
  exp <- toupper(exp)
  
  # Define the multiplier based on the character code
  multiplier <- case_when(
    exp == "H" ~ 100,      # Hundred
    exp == "K" ~ 1000,     # Thousand
    exp == "M" ~ 10^6,     # Million
    exp == "B" ~ 10^9,     # Billion
    exp %in% c("", "0", "+", "-", "?") ~ 1, # Blank, "0", or symbol code: no scaling applied
    TRUE ~ 0               # Any other code (e.g., digits 1-8) is treated as invalid, so the damage becomes 0
  )
  return(multiplier)
}
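
As a quick illustration of the mapping above, the following base-R sketch reimplements the same rules with a named lookup vector and applies them to a few sample codes. The names `multiplier_lookup` and `exponent_to_multiplier_lookup` are hypothetical and not part of the original analysis, and the sample codes are made up for illustration.

```r
# Hypothetical lookup-table variant of exponent_to_multiplier() (base R only).
multiplier_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

exponent_to_multiplier_lookup <- function(exp) {
  exp <- toupper(as.character(exp))
  # Codes that are not in the lookup vector come back as NA
  out <- unname(multiplier_lookup[exp])
  # Blank, "0", and symbol codes leave the damage value unscaled
  out[exp %in% c("", "0", "+", "-", "?")] <- 1
  # Everything else (for example digits 1-8) zeroes the damage, like the catch-all
  out[is.na(out)] <- 0
  out
}

# Made-up sample codes, including a lowercase code and an unrecognized digit
sample_codes <- c("k", "M", "B", "", "5")
sample_multipliers <- exponent_to_multiplier_lookup(sample_codes)
print(data.frame(code = sample_codes, multiplier = sample_multipliers))
```

Printing the codes next to their multipliers makes it easy to spot which entries the catch-all rule sets to zero before the damage totals are computed.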

Create actual property and crop damage costs (in USD), health harm, and economic damage variables

storm_data_processed <- storm_data_filtered %>%
  mutate(
    # Calculate property damage cost
    prop_damage_multiplier = exponent_to_multiplier(PROPDMGEXP),
    property_damage = PROPDMG * prop_damage_multiplier,
    
    # Calculate crop damage cost
    crop_damage_multiplier = exponent_to_multiplier(CROPDMGEXP),
    crop_damage = CROPDMG * crop_damage_multiplier,
    
    # Calculate total harm to population health
    health_harm = FATALITIES + INJURIES,
    
    # Calculate total economic damage
    economic_consequence = property_damage + crop_damage
  ) %>%
  # Filter out events with zero impact (for efficiency, though not strictly required)
  filter(health_harm > 0 | economic_consequence > 0) %>%
  # Simplify the EVTYPE variable by converting to uppercase and trimming whitespace
  mutate(EVTYPE = toupper(trimws(EVTYPE)))

Display the calculated damage fields

head(storm_data_processed %>% select(EVTYPE, property_damage, crop_damage, health_harm))
##    EVTYPE property_damage crop_damage health_harm
## 1 TORNADO           25000           0          15
## 2 TORNADO            2500           0           0
## 3 TORNADO           25000           0           2
## 4 TORNADO            2500           0           2
## 5 TORNADO            2500           0           2
## 6 TORNADO            2500           0           6

Summarize impact by event type

# --- 1. Health Harm ---
health_harm_summary <- storm_data_processed %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Fatalities = sum(FATALITIES),
    Total_Injuries = sum(INJURIES),
    Total_Health_Harm = sum(health_harm)
  ) %>%
  arrange(desc(Total_Health_Harm))

top_10_health <- head(health_harm_summary, 10)

# --- 2. Economic Consequences ---
economic_consequence_summary <- storm_data_processed %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Property_Damage = sum(property_damage) / 10^9, # Convert to billions of USD
    Total_Crop_Damage = sum(crop_damage) / 10^9,         # Convert to billions of USD
    Total_Economic_Consequence = sum(economic_consequence) / 10^9 # Convert to billions of USD
  ) %>%
  arrange(desc(Total_Economic_Consequence))

top_10_economic <- head(economic_consequence_summary, 10)
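
The grouping above uses the EVTYPE labels as recorded (after upper-casing and trimming), so close variants such as TSTM WIND and THUNDERSTORM WIND, or HURRICANE and HURRICANE/TYPHOON, are totaled separately. A minimal sketch of how such variants might be merged before grouping is shown here. The helper `consolidate_evtype`, its two patterns, and the toy data are illustrative assumptions, not part of the original analysis and not an official NOAA mapping.

```r
# Hypothetical helper: collapse a few spelling variants of EVTYPE.
# The two patterns are illustrative assumptions only.
consolidate_evtype <- function(evtype) {
  evtype <- toupper(trimws(as.character(evtype)))
  evtype[grepl("^(TSTM|THUNDERSTORM) WIND", evtype)] <- "THUNDERSTORM WIND"
  evtype[grepl("^(HURRICANE|TYPHOON)", evtype)] <- "HURRICANE/TYPHOON"
  evtype
}

# Toy data (made up) with variant labels for the same kind of event
toy <- data.frame(
  EVTYPE = c("TSTM WIND", "Thunderstorm Wind ", "HURRICANE",
             "HURRICANE/TYPHOON", "FLOOD"),
  health_harm = c(3, 2, 1, 6, 4)
)

# Merge the variants, then total the harm per consolidated label
toy$EVTYPE <- consolidate_evtype(toy$EVTYPE)
totals <- aggregate(health_harm ~ EVTYPE, data = toy, FUN = sum)
totals <- totals[order(-totals$health_harm), ]
print(totals)
```

Applying a step like this before `group_by(EVTYPE)` would change the rankings, so any mapping should be reviewed against the event type definitions in the NOAA documentation before it is used.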

Results

Most harmful events for population health

# Display the top 10 events most harmful to population health
kable(top_10_health, 
      caption = "Top 10 Most Harmful Severe Weather Events (Fatalities + Injuries)",
      format = "markdown")
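
Top 10 Most Harmful Severe Weather Events (Fatalities + Injuries)

| EVTYPE            | Total_Fatalities | Total_Injuries | Total_Health_Harm |
|:------------------|-----------------:|---------------:|------------------:|
| TORNADO           |             5633 |          91346 |             96979 |
| EXCESSIVE HEAT    |             1903 |           6525 |              8428 |
| TSTM WIND         |              504 |           6957 |              7461 |
| FLOOD             |              470 |           6789 |              7259 |
| LIGHTNING         |              816 |           5230 |              6046 |
| HEAT              |              937 |           2100 |              3037 |
| FLASH FLOOD       |              978 |           1777 |              2755 |
| ICE STORM         |               89 |           1975 |              2064 |
| THUNDERSTORM WIND |              133 |           1488 |              1621 |
| WINTER STORM      |              206 |           1321 |              1527 |
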
# Plotting the results (Figure 1 - Health Harm)
# Reshape to long format so ggplot can draw fatalities and injuries as stacked bar segments
top_10_health_plot <- top_10_health %>%
  select(EVTYPE, Total_Fatalities, Total_Injuries) %>%
  tidyr::pivot_longer(cols = starts_with("Total"), names_to = "Harm_Type", values_to = "Count")

ggplot(top_10_health_plot, aes(x = reorder(EVTYPE, Count), y = Count, fill = Harm_Type)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(
    title = "Figure 1: Top 10 Severe Weather Events by Total Population Health Harm",
    subtitle = "Aggregated Fatalities and Injuries, 1950-2011",
    x = "Event Type",
    y = "Total Number of People Affected (Fatalities or Injuries)",
    fill = "Type of Harm"
  ) +
  theme_minimal() +
  scale_fill_manual(values = c("Total_Fatalities" = "darkred", "Total_Injuries" = "salmon"))
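
As a rough check on how dominant tornadoes are, the combined totals from the table above can be turned into percentage shares. This is a sketch that uses only the ten printed totals, so the shares are relative to the top 10 event types and not to all events in the database. The names `top_10_totals`, `share_pct`, and `tornado_share` are hypothetical.

```r
# Combined fatalities + injuries for the top 10 event types,
# copied from the table above.
top_10_totals <- c(
  "TORNADO" = 96979, "EXCESSIVE HEAT" = 8428, "TSTM WIND" = 7461,
  "FLOOD" = 7259, "LIGHTNING" = 6046, "HEAT" = 3037,
  "FLASH FLOOD" = 2755, "ICE STORM" = 2064,
  "THUNDERSTORM WIND" = 1621, "WINTER STORM" = 1527
)

# Percentage share of each event type within the top 10 only
share_pct <- round(100 * top_10_totals / sum(top_10_totals), 1)
tornado_share <- share_pct[["TORNADO"]]
print(share_pct)
```

Under this calculation, tornadoes account for about 70.7 percent of the harm recorded across the ten listed event types.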

Most harmful events for economic consequences

# Display the top 10 events with the greatest economic consequences
kable(top_10_economic, 
      caption = "Top 10 Severe Weather Events by Total Economic Consequence (Billions USD)",
      digits = 2,
      format = "markdown")

Top 10 Severe Weather Events by Total Economic Consequence (Billions USD)

| EVTYPE            | Total_Property_Damage | Total_Crop_Damage | Total_Economic_Consequence |
|:------------------|----------------------:|------------------:|---------------------------:|
| FLOOD             |                144.66 |              5.66 |                     150.32 |
| HURRICANE/TYPHOON |                 69.31 |              2.61 |                      71.91 |
| TORNADO           |                 56.94 |              0.41 |                      57.35 |
| STORM SURGE       |                 43.32 |              0.00 |                      43.32 |
| HAIL              |                 15.73 |              3.03 |                      18.76 |
| FLASH FLOOD       |                 16.14 |              1.42 |                      17.56 |
| DROUGHT           |                  1.05 |             13.97 |                      15.02 |
| HURRICANE         |                 11.87 |              2.74 |                      14.61 |
| RIVER FLOOD       |                  5.12 |              5.03 |                      10.15 |
| ICE STORM         |                  3.94 |              5.02 |                       8.97 |

# Plotting the results (Figure 2 - Economic Consequences)
# Reshape to long format so ggplot can draw property and crop damage as stacked bar segments
top_10_economic_plot <- top_10_economic %>%
  select(EVTYPE, Total_Property_Damage, Total_Crop_Damage) %>%
  tidyr::pivot_longer(cols = starts_with("Total"), names_to = "Damage_Type", values_to = "Cost_Billion_USD")

ggplot(top_10_economic_plot, aes(x = reorder(EVTYPE, Cost_Billion_USD), y = Cost_Billion_USD, fill = Damage_Type)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(
    title = "Figure 2: Top 10 Severe Weather Events by Total Economic Damage",
    subtitle = "Property and Crop Damage (in Billions USD), 1950-2011",
    x = "Event Type",
    y = "Total Cost (Billions USD)",
    fill = "Type of Damage"
  ) +
  theme_minimal() +
  scale_fill_manual(values = c("Total_Property_Damage" = "darkblue", "Total_Crop_Damage" = "goldenrod"))

Conclusions

Tornadoes are by far the most harmful events for population health, with the highest totals for both fatalities (5,633) and injuries (91,346). Excessive heat, thunderstorm wind (recorded as TSTM WIND), and floods follow, each with fewer than 9,000 combined fatalities and injuries.

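Floods cause the greatest total economic damage (about $150 billion), almost all of it property damage. Hurricanes/typhoons (about $72 billion) and tornadoes (about $57 billion) rank second and third, again with most of the damage to property. Drought does not appear among the top 10 events for health harm, but it is a major economic factor because of its crop losses (about $14 billion), the largest crop damage total among the top 10 event types. Because event types are grouped by their recorded labels, related categories such as HURRICANE and HURRICANE/TYPHOON, or TSTM WIND and THUNDERSTORM WIND, are ranked separately.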