Synopsis

This analysis explores the NOAA Storm Database to identify which types of severe weather events are most harmful to population health and which have the greatest economic consequences in the United States between 1950 and 2011.

Data Processing

In this section, the raw NOAA Storm Database file is downloaded (if needed), read into R, and processed for analysis.

# 1) Download the raw data file (only if not already present)
file_url  <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
file_name <- "StormData.csv.bz2"

if (!file.exists(file_name)) {
  download.file(file_url, destfile = file_name, mode = "wb")
}

# 2) Read the raw CSV (directly from the .bz2, no pre-unzipping needed)
storm <- read.csv(file_name, stringsAsFactors = FALSE)

# Quick check
dim(storm)
## [1] 902297     37
names(storm)[1:10]
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"

Results

# Load dplyr, which provides %>%, select(), group_by(), summarise(), and arrange()
library(dplyr)

# Keep only relevant columns
health <- storm %>%
  select(EVTYPE, FATALITIES, INJURIES)

# Aggregate by event type
health_summary <- health %>%
  group_by(EVTYPE) %>%
  summarise(
    fatalities = sum(FATALITIES, na.rm = TRUE),
    injuries   = sum(INJURIES, na.rm = TRUE),
    total_harm = fatalities + injuries
  ) %>%
  arrange(desc(total_harm))

# Top 10 most harmful events
top_health <- head(health_summary, 10)

top_health
## # A tibble: 10 × 4
##    EVTYPE            fatalities injuries total_harm
##    <chr>                  <dbl>    <dbl>      <dbl>
##  1 TORNADO                 5633    91346      96979
##  2 EXCESSIVE HEAT          1903     6525       8428
##  3 TSTM WIND                504     6957       7461
##  4 FLOOD                    470     6789       7259
##  5 LIGHTNING                816     5230       6046
##  6 HEAT                     937     2100       3037
##  7 FLASH FLOOD              978     1777       2755
##  8 ICE STORM                 89     1975       2064
##  9 THUNDERSTORM WIND        133     1488       1621
## 10 WINTER STORM             206     1321       1527

Impact on Population Health

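The results show that a handful of event types dominate health impacts. Tornadoes are by far the most harmful, with 5,633 fatalities and 91,346 injuries (96,979 combined), more than eleven times the combined total of the next category, excessive heat (8,428). Thunderstorm wind, floods, and lightning follow. EVTYPE labels are not standardized: TSTM WIND and THUNDERSTORM WIND, and HEAT and EXCESSIVE HEAT, appear as separate rows, so the totals for those hazards are split across labels. These rankings suggest that emergency preparedness and public health resources should prioritize tornadoes, heat, wind, and flooding.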
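How concentrated the harm is in the top event types can be checked by computing each type's share of the total and a running cumulative share. The sketch below uses base R and a small hypothetical data frame (`toy_summary`, with made-up counts, not NOAA figures) that mimics the `EVTYPE` and `total_harm` columns of `health_summary`; the `share` and `cum_share` column names are illustrative assumptions.

```r
# Hypothetical toy summary (NOT the real NOAA totals); same column names
# as the EVTYPE / total_harm columns of health_summary
toy_summary <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "FOG", "HAIL"),
  total_harm = c(200, 700, 40, 60),
  stringsAsFactors = FALSE
)

# Sort from most to least harmful
toy_summary <- toy_summary[order(-toy_summary$total_harm), ]

# Share of all harm per event type, and running (cumulative) share
toy_summary$share     <- toy_summary$total_harm / sum(toy_summary$total_harm)
toy_summary$cum_share <- cumsum(toy_summary$share)

print(toy_summary)
```

Applied to the full summary table rather than the toy data, the `cum_share` value in row 10 would give the fraction of all recorded fatalities and injuries covered by the top ten event types.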

Economic Consequences

The NOAA storm database records property damage (PROPDMG) and crop damage (CROPDMG) together with exponent fields (PROPDMGEXP, CROPDMGEXP) that give the multiplier (e.g., K = thousands, M = millions, B = billions). To estimate economic impact, we convert these fields into numeric dollar amounts and then aggregate total damage by event type.
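As a worked illustration of the arithmetic, a record with PROPDMG = 25 and PROPDMGEXP = "K" represents 25 × 1,000 = 25,000 dollars. The base R sketch below applies the letter codes to a few hypothetical records (the `toy` data frame and the `multipliers` lookup vector are illustrative names, and the values are made up, not taken from the NOAA file). It assumes that blank or unrecognized codes mean a multiplier of 1.

```r
# Lookup table for the letter codes described above
multipliers <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical toy records, not taken from the NOAA file
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "b", ""),
  stringsAsFactors = FALSE
)

# Match codes case-insensitively; codes missing from the lookup
# (including the empty string) come back as NA
mult <- unname(multipliers[toupper(toy$PROPDMGEXP)])

# Assumption: unrecognized or blank codes mean "no scaling"
mult[is.na(mult)] <- 1

# Dollar amount = reported figure times its multiplier
toy$prop_dmg <- toy$PROPDMG * mult
print(toy)
```

A named-vector lookup keeps the mapping in one place and is vectorized, so the same lines would work unchanged on a full column.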

# Helper: convert exponent codes to numeric multipliers
exp_to_mult <- function(exp) {
  exp <- toupper(exp)
  dplyr::case_when(
    exp == "H" ~ 1e2,
    exp == "K" ~ 1e3,
    exp == "M" ~ 1e6,
    exp == "B" ~ 1e9,
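    # case_when() evaluates the right-hand side for every row, so silence the
    # "NAs introduced by coercion" warning raised by non-digit codes
    exp %in% as.character(0:9) ~ 10^suppressWarnings(as.numeric(exp)),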
    TRUE ~ 1
  )
}

# Keep only relevant columns
econ <- storm %>%
  select(EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

# Convert exponents -> multipliers, compute dollar damages
econ <- econ %>%
  mutate(
    prop_mult = exp_to_mult(PROPDMGEXP),
    crop_mult = exp_to_mult(CROPDMGEXP),
    prop_dmg  = PROPDMG * prop_mult,
    crop_dmg  = CROPDMG * crop_mult,
    total_dmg = prop_dmg + crop_dmg
  )

# Aggregate by event type
econ_summary <- econ %>%
  group_by(EVTYPE) %>%
  summarise(
    property_damage = sum(prop_dmg, na.rm = TRUE),
    crop_damage     = sum(crop_dmg, na.rm = TRUE),
    total_damage    = sum(total_dmg, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(total_damage))

# Top 10 by total economic damage
top_econ <- head(econ_summary, 10)
top_econ
## # A tibble: 10 × 4
##    EVTYPE            property_damage crop_damage  total_damage
##    <chr>                       <dbl>       <dbl>         <dbl>
##  1 FLOOD               144657709807   5661968450 150319678257 
##  2 HURRICANE/TYPHOON    69305840000   2607872800  71913712800 
##  3 TORNADO              56947380676.   414953270  57362333946.
##  4 STORM SURGE          43323536000         5000  43323541000 
##  5 HAIL                 15735267513.  3025954473  18761221986.
##  6 FLASH FLOOD          16822673978.  1421317100  18243991078.
##  7 DROUGHT               1046106000  13972566000  15018672000 
##  8 HURRICANE            11868319010   2741910000  14610229010 
##  9 RIVER FLOOD           5118945500   5029459000  10148404500 
## 10 ICE STORM             3944927860   5022113500   8967041360

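Floods cause by far the largest combined property and crop losses, about $150 billion, followed by hurricanes/typhoons (about $72 billion), tornadoes (about $57 billion), and storm surge (about $43 billion). Drought and ice storms are the only event types in the top ten where crop damage exceeds property damage. As with the health results, related hazards are split across labels (FLOOD, FLASH FLOOD, and RIVER FLOOD; HURRICANE/TYPHOON and HURRICANE), so the ranking understates the combined burden of flooding and hurricanes. The ranking shows which hazards create the greatest financial burden nationally and can help prioritize mitigation and preparedness investments.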

library(ggplot2)

ggplot(top_econ, aes(x = reorder(EVTYPE, total_damage), y = total_damage)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    x = "Event type",
    y = "Total economic damage (USD)"
  )
Figure 2. Top 10 storm event types by total economic damage in the United States (1950–2011).
