This analysis examines U.S. NOAA storm event records to identify which event types are most harmful to population health and which create the largest economic losses.
Population health impact is defined as the combined total of fatalities and injuries.
Economic impact is defined as the combined total of property damage and crop damage after converting NOAA damage exponents (for example K, M, B) to numeric multipliers.
The analysis starts from the raw compressed CSV file and performs all data processing in this document for reproducibility.
Results are summarized with ranked tables and two figures showing the top 10 event types in each category.
Under these definitions, tornadoes cause by far the largest health burden (96,979 combined fatalities and injuries), while floods account for the largest total economic losses (about $150.3 billion in combined property and crop damage).

Data Processing

required_packages <- c("dplyr", "ggplot2", "knitr", "scales")
missing_packages <- required_packages[!sapply(required_packages, requireNamespace, quietly = TRUE)]

if (length(missing_packages) > 0) {
  stop(
    paste(
      "Please install missing packages before knitting:",
      paste(missing_packages, collapse = ", ")
    )
  )
}

library(dplyr)
library(ggplot2)
library(knitr)
library(scales)

# Start from the raw NOAA data file in the working directory
data_file <- "repdata_data_StormData.csv.bz2"
storm_raw <- read.csv(data_file, stringsAsFactors = FALSE)

# Convert NOAA exponent codes to numeric multipliers
exp_to_multiplier <- function(exp_code) {
  exp_code <- toupper(trimws(exp_code))
  mult <- rep(1, length(exp_code))
  mult[exp_code == "H"] <- 1e2
  mult[exp_code == "K"] <- 1e3
  mult[exp_code == "M"] <- 1e6
  mult[exp_code == "B"] <- 1e9

  is_digit <- grepl("^[0-9]$", exp_code)
  mult[is_digit] <- 10^as.numeric(exp_code[is_digit])
  mult
}

# Keep needed variables and create analysis metrics
storm <- storm_raw %>%
  transmute(
    EVTYPE = toupper(trimws(EVTYPE)),
    FATALITIES = as.numeric(FATALITIES),
    INJURIES = as.numeric(INJURIES),
    PROPDMG = as.numeric(PROPDMG),
    PROPDMGEXP = as.character(PROPDMGEXP),
    CROPDMG = as.numeric(CROPDMG),
    CROPDMGEXP = as.character(CROPDMGEXP)
  ) %>%
  mutate(
    health_impact = FATALITIES + INJURIES,
    property_damage = PROPDMG * exp_to_multiplier(PROPDMGEXP),
    crop_damage = CROPDMG * exp_to_multiplier(CROPDMGEXP),
    economic_impact = property_damage + crop_damage
  )

# Summarize impacts by event type
health_by_event <- storm %>%
  group_by(EVTYPE) %>%
  summarise(
    total_health_impact = sum(health_impact, na.rm = TRUE),
    total_fatalities = sum(FATALITIES, na.rm = TRUE),
    total_injuries = sum(INJURIES, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(total_health_impact > 0) %>%
  arrange(desc(total_health_impact))

econ_by_event <- storm %>%
  group_by(EVTYPE) %>%
  summarise(
    total_economic_impact = sum(economic_impact, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(total_economic_impact > 0) %>%
  arrange(desc(total_economic_impact))

top_health <- head(health_by_event, 10)
top_econ <- head(econ_by_event, 10)

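Data transformations were limited to trimming and uppercasing event names for consistent grouping, and converting damage exponents to numeric multipliers so that property and crop losses can be added on a common dollar scale. The codes H, K, M, and B map to 1e2, 1e3, 1e6, and 1e9, a single digit n maps to 10^n, and blank or unrecognized symbols are treated as a multiplier of 1, which leaves the recorded value unscaled. Event types were not otherwise consolidated, so near-duplicate labels such as TSTM WIND and THUNDERSTORM WIND are ranked separately.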
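
As a quick check of the exponent conversion used in this report, the sketch below applies a copy of exp_to_multiplier() to a handful of made-up inputs. The sample codes and amounts are illustrative assumptions, not records from the NOAA file.

```r
# Copy of the helper defined earlier in this report, repeated here so the
# example runs on its own.
exp_to_multiplier <- function(exp_code) {
  exp_code <- toupper(trimws(exp_code))
  mult <- rep(1, length(exp_code))
  mult[exp_code == "H"] <- 1e2
  mult[exp_code == "K"] <- 1e3
  mult[exp_code == "M"] <- 1e6
  mult[exp_code == "B"] <- 1e9

  is_digit <- grepl("^[0-9]$", exp_code)
  mult[is_digit] <- 10^as.numeric(exp_code[is_digit])
  mult
}

# Hypothetical exponent codes and damage amounts (not taken from the data)
sample_codes <- c("K", "m", "B", "", "?", "2", "h")
sample_amounts <- c(25, 1.5, 2, 500, 10, 3, 4)

example <- data.frame(
  amount = sample_amounts,
  exp_code = sample_codes,
  multiplier = exp_to_multiplier(sample_codes)
)

# Dollar value = recorded amount times the decoded multiplier
example$dollars <- example$amount * example$multiplier
print(example)
```

Lowercase codes are matched because the helper uppercases its input, and the blank and "?" rows keep their recorded amount unchanged.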

Results

Across the United States, the event type with the largest population health impact is TORNADO with 96,979 combined fatalities and injuries.
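
The headline figure is simply fatalities plus injuries for the top-ranked event type. A minimal sketch of that arithmetic, using the TORNADO counts reported in the table in this section (the object names tornado, headline, and injury_share are hypothetical):

```r
# Counts copied from the TORNADO row of the health impact table
tornado <- data.frame(
  EVTYPE = "TORNADO",
  total_fatalities = 5633,
  total_injuries = 91346
)

# Health impact as defined in this report: fatalities + injuries
tornado$total_health_impact <- tornado$total_fatalities + tornado$total_injuries

# Format the total with a thousands separator for use in running text
headline <- sprintf(
  "%s: %s combined fatalities and injuries",
  tornado$EVTYPE,
  format(tornado$total_health_impact, big.mark = ",")
)
print(headline)

# Share of the combined total that comes from injuries rather than deaths
injury_share <- tornado$total_injuries / tornado$total_health_impact
print(round(injury_share, 3))
```

Because injuries make up most of the combined total, a ranking by fatalities alone could order the event types differently.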

kable(
  top_health,
  caption = "Top 10 event types by total population health impact (fatalities + injuries)."
)

Top 10 event types by total population health impact (fatalities + injuries).

|EVTYPE            | total_health_impact| total_fatalities| total_injuries|
|:-----------------|-------------------:|----------------:|--------------:|
|TORNADO           |               96979|             5633|          91346|
|EXCESSIVE HEAT    |                8428|             1903|           6525|
|TSTM WIND         |                7461|              504|           6957|
|FLOOD             |                7259|              470|           6789|
|LIGHTNING         |                6046|              816|           5230|
|HEAT              |                3037|              937|           2100|
|FLASH FLOOD       |                2755|              978|           1777|
|ICE STORM         |                2064|               89|           1975|
|THUNDERSTORM WIND |                1621|              133|           1488|
|WINTER STORM      |                1527|              206|           1321|

ggplot(top_health, aes(x = reorder(EVTYPE, total_health_impact), y = total_health_impact)) +
  geom_col(fill = "tomato") +
  coord_flip() +
  labs(
    x = "Event type",
    y = "Total fatalities + injuries",
    title = "Most Harmful Event Types for Population Health"
  ) +
  theme_minimal()

Top 10 event types by total population health impact in the NOAA storm dataset.

Across the United States, the event type with the greatest total economic consequence is FLOOD, with approximately $150.3 billion ($150,319,678,257) in combined property and crop damage.
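
Totals of this size are easier to read in billions. The sketch below shows one way to do that conversion in base R, using totals reported in the table in this section; the helper name to_billions is a hypothetical choice for illustration, not part of the original analysis.

```r
# Totals copied from the economic impact table in this report (USD)
econ_totals <- c(
  "FLOOD" = 150319678257,
  "FLASH FLOOD" = 18244041079,
  "RIVER FLOOD" = 10148404500
)

# Hypothetical helper: express a dollar amount in billions with a fixed
# number of decimal places
to_billions <- function(x, digits = 1) {
  paste0("$", formatC(x / 1e9, format = "f", digits = digits), " billion")
}

# Headline figure for the top-ranked event type
print(to_billions(econ_totals[["FLOOD"]]))

# Combined total for the three flood-labelled event types in the top 10
flood_related <- sum(econ_totals)
print(to_billions(flood_related))
```

Summing the flood-labelled rows is only a rough grouping by name; the ranking in this report treats each EVTYPE label separately.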

kable(
  top_econ,
  caption = "Top 10 event types by total economic impact (property + crop damage)."
)

Top 10 event types by total economic impact (property + crop damage).

|EVTYPE            | total_economic_impact|
|:-----------------|---------------------:|
|FLOOD             |          150319678257|
|HURRICANE/TYPHOON |           71913712800|
|TORNADO           |           57362333947|
|STORM SURGE       |           43323541000|
|HAIL              |           18761221986|
|FLASH FLOOD       |           18244041079|
|DROUGHT           |           15018672000|
|HURRICANE         |           14610229010|
|RIVER FLOOD       |           10148404500|
|ICE STORM         |            8967041360|

ggplot(top_econ, aes(x = reorder(EVTYPE, total_economic_impact), y = total_economic_impact)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  scale_y_continuous(labels = dollar_format(prefix = "$", scale = 1e-9, suffix = "B")) +
  labs(
    x = "Event type",
    y = "Total economic damage (USD, billions)",
    title = "Event Types with the Greatest Economic Consequences"
  ) +
  theme_minimal()

Top 10 event types by total economic losses in the NOAA storm dataset.