1 Synopsis

This analysis examines the NOAA Storm Database to identify which types of severe weather events pose the greatest threats to population health and economic stability across the United States. The findings are intended to inform government and municipal managers responsible for preparing for severe weather events and allocating resources for emergency response and disaster mitigation.

The analysis addresses two primary questions:

  1. Which types of weather events are most harmful to population health?
  2. Which types of events have the greatest economic consequences?

Key Findings:

  • Tornadoes cause the most casualties (97,068 fatalities and injuries combined)
  • Floods result in the highest economic damage (approximately $180.7 billion)
  • Excessive heat ranks second in fatalities (3,178) and is a significant but often underestimated threat to public health

2 Data Processing

2.1 Data Source

The data for this analysis comes from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including estimates of fatalities, injuries, and property damage.

Data URL: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2

The processed data (.rds files) are available at https://github.com/DrNWM/Storm-Analysis
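For readers who want to start from the raw file rather than the processed .rds files, the following is a minimal sketch of one way to fetch and read it. The helper names `fetch_storm_raw` and `read_storm_raw` and the local file name are hypothetical; only the URL comes from this report. It relies on base R's `read.csv()` reading bzip2-compressed files directly.

```r
# Hypothetical helpers; only the URL below is taken from the report.
storm_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"

# Download the compressed file once and reuse the local copy afterwards.
fetch_storm_raw <- function(url = storm_url, dest = "StormData.csv.bz2") {
  if (!file.exists(dest)) {
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# read.csv() detects bzip2 compression when opening the file,
# so no manual decompression step is needed.
read_storm_raw <- function(path) {
  read.csv(path, stringsAsFactors = FALSE)
}

# Usage (triggers a large download, so it is left commented out):
# storm_raw <- read_storm_raw(fetch_storm_raw())
```

Keeping the download and the read in separate functions means the slow network step runs only once, while the read step can be repeated freely.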

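# Load packages (versions are listed under Session Information)
library(tidyverse)  # dplyr, tidyr, ggplot2
library(here)       # project-relative file paths
library(knitr)      # kable()
library(scales)     # comma and dollar axis labels

# Load processed data
storm_data <- readRDS(here("data", "processed", "storm_data_processed.rds"))
summary_stats <- readRDS(here("data", "processed", "summary_statistic.rds"))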

cat("Total weather events analyzed:", nrow(storm_data), "\n")
## Total weather events analyzed: 902297
cat("Date range:", format(min(storm_data$BGN_DATE, na.rm = TRUE), "%Y"), 
    "to", format(max(storm_data$BGN_DATE, na.rm = TRUE), "%Y"), "\n")
## Date range: 1950 to 2011

2.2 Data Cleaning Steps

The following steps were taken to prepare the data for analysis:

  1. Variable Selection: Selected relevant variables including event type, date, location, fatalities, injuries, and economic damage estimates

  2. Damage Calculation: Converted property and crop damage values from coded format (e.g., “25K”, “5M”) to actual numeric values in USD

  3. Event Type Standardization: Consolidated similar event types (e.g., “TSTM WIND” and “THUNDERSTORM WIND” both mapped to “THUNDERSTORM”)

  4. Derived Variables: Created total casualty count (fatalities + injuries) and total economic damage (property + crop damage)
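The cleaning steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code that produced the processed files: the raw column names (`EVTYPE`, `PROPDMG`, `PROPDMGEXP`, `CROPDMG`, `CROPDMGEXP`) are those of the raw NOAA file, the helper names are hypothetical, the handling of unrecognised exponent codes (multiplier of 1) is an assumption, and only the single thunderstorm mapping given as an example above is implemented.

```r
# Map an exponent code to a multiplier. H, K, M and B are handled;
# any other code (including blank) is treated as 1, which is an assumption.
exp_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(code)]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Normalise case and whitespace, then apply the one mapping named in the text.
standardize_evtype <- function(x) {
  x <- gsub("\\s+", " ", toupper(trimws(x)))
  x[x %in% c("TSTM WIND", "THUNDERSTORM WIND")] <- "THUNDERSTORM"
  x
}

# Build the cleaned columns used later in the report.
clean_storm <- function(raw) {
  out <- data.frame(
    EVTYPE_CLEAN = standardize_evtype(raw$EVTYPE),
    FATALITIES = raw$FATALITIES,
    INJURIES = raw$INJURIES,
    PROPERTY_DAMAGE = raw$PROPDMG * exp_multiplier(raw$PROPDMGEXP),
    CROP_DAMAGE = raw$CROPDMG * exp_multiplier(raw$CROPDMGEXP),
    stringsAsFactors = FALSE
  )
  out$TOTAL_CASUALTIES <- out$FATALITIES + out$INJURIES
  out$TOTAL_DAMAGE <- out$PROPERTY_DAMAGE + out$CROP_DAMAGE
  out
}

# Toy input (made-up rows, not taken from the database).
toy <- data.frame(
  EVTYPE = c("TSTM WIND", " thunderstorm wind", "FLOOD"),
  FATALITIES = c(0, 1, 2), INJURIES = c(3, 0, 5),
  PROPDMG = c(25, 5, 1.5), PROPDMGEXP = c("K", "M", "B"),
  CROPDMG = c(0, 2, 10), CROPDMGEXP = c("", "K", "m"),
  stringsAsFactors = FALSE
)
clean_storm(toy)
```

In the toy input, a value of 25 with code "K" becomes 25,000 USD and 5 with code "M" becomes 5,000,000 USD, matching the “25K” and “5M” examples in step 2.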

# Create summary table
data_summary <- data.frame(
  Metric = c("Total Events", "Unique Event Types", "Total Fatalities", 
             "Total Injuries", "Total Economic Damage (Billions USD)"),
  Value = c(
    format(nrow(storm_data), big.mark = ","),
    length(unique(storm_data$EVTYPE_CLEAN)),
    format(sum(storm_data$FATALITIES, na.rm = TRUE), big.mark = ","),
    format(sum(storm_data$INJURIES, na.rm = TRUE), big.mark = ","),
    paste0("$", format(round(sum(storm_data$TOTAL_DAMAGE, na.rm = TRUE) / 1e9, 2), 
                      big.mark = ","))
  )
)

kable(data_summary, caption = "Dataset Overview", align = c("l", "r"))
Dataset Overview

| Metric                               |   Value |
|:-------------------------------------|--------:|
| Total Events                         | 902,297 |
| Unique Event Types                   |     343 |
| Total Fatalities                     |  15,145 |
| Total Injuries                       | 140,528 |
| Total Economic Damage (Billions USD) | $477.33 |

3 Results

3.1 Question 1: Events Most Harmful to Population Health

To assess population health impact, we examined fatalities and injuries caused by different weather event types across the entire United States.

health_impact <- storm_data %>%
  group_by(EVTYPE_CLEAN) %>%
  summarise(
    Fatalities = sum(FATALITIES, na.rm = TRUE),
    Injuries = sum(INJURIES, na.rm = TRUE),
    Total_Casualties = sum(TOTAL_CASUALTIES, na.rm = TRUE),
    Events = n()
  ) %>%
  filter(Total_Casualties > 0) %>%
  arrange(desc(Total_Casualties)) %>%
  slice_head(n = 15)

kable(health_impact, 
      format.args = list(big.mark = ","),
      caption = "Top 15 Weather Events by Total Casualties",
      col.names = c("Event Type", "Fatalities", "Injuries", "Total Casualties", "Number of Events"))
Top 15 Weather Events by Total Casualties

| Event Type     | Fatalities | Injuries | Total Casualties | Number of Events |
|:---------------|-----------:|---------:|-----------------:|-----------------:|
| TORNADO        |      5,661 |   91,407 |           97,068 |           60,700 |
| HIGH WIND      |      1,424 |   11,498 |           12,922 |          364,869 |
| EXCESSIVE HEAT |      3,178 |    9,243 |           12,421 |            2,975 |
| FLOOD          |      1,553 |    8,683 |           10,236 |           86,127 |
| WINTER STORM   |        639 |    5,956 |            6,595 |           42,099 |
| LIGHTNING      |        817 |    5,232 |            6,049 |           15,776 |
| WILDFIRE       |         90 |    1,608 |            1,698 |            4,239 |
| HURRICANE      |        135 |    1,333 |            1,468 |              299 |
| HAIL           |         15 |    1,371 |            1,386 |          289,276 |
| FOG            |         62 |      734 |              796 |              538 |
| RIP CURRENT    |        368 |      232 |              600 |              470 |
| RIP CURRENTS   |        204 |      297 |              501 |              304 |
| DUST STORM     |         22 |      440 |              462 |              427 |
| TROPICAL STORM |         58 |      340 |              398 |              690 |
| AVALANCHE      |        224 |      170 |              394 |              386 |

health_impact %>%
  pivot_longer(cols = c(Fatalities, Injuries), 
               names_to = "Type", values_to = "Count") %>%
  ggplot(aes(x = reorder(EVTYPE_CLEAN, Count), y = Count, fill = Type)) +
  geom_col(position = "dodge") +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  scale_fill_manual(values = c("Fatalities" = "#d73027", "Injuries" = "#fee090"),
                    name = "Impact Type") +
  labs(
    title = "Top 15 Weather Events by Population Health Impact",
    subtitle = "Total fatalities and injuries across the United States",
    x = NULL,
    y = "Number of People Affected"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )
Weather events with the highest impact on population health

Findings:

  • Tornadoes are by far the most harmful weather event type for population health, causing 97,068 total casualties
  • Excessive heat is the second deadliest event type (3,178 fatalities) and has a far higher ratio of fatalities to injuries than tornadoes, although it ranks third in total casualties
  • High wind (12,922 casualties) and floods (10,236 casualties) also pose significant risks to public health
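To make the contrast between fatalities and injuries concrete, here is a short worked sketch using only the four largest rows of the casualty table above. The column name `fatal_share` (fatalities divided by total casualties) is introduced here for illustration and is not part of the original analysis.

```r
# Values copied from the "Top 15 Weather Events by Total Casualties" table.
health <- data.frame(
  event = c("TORNADO", "HIGH WIND", "EXCESSIVE HEAT", "FLOOD"),
  fatalities = c(5661, 1424, 3178, 1553),
  injuries = c(91407, 11498, 9243, 8683)
)

# Total casualties and the share of them that were fatal.
health$casualties <- health$fatalities + health$injuries
health$fatal_share <- round(health$fatalities / health$casualties, 3)

# Sort so the event with the highest fatal share comes first.
health[order(-health$fatal_share), ]
```

Roughly a quarter of excessive heat casualties are fatalities, compared with about 6% for tornadoes, which is why heat ranks higher by deaths than by total casualties.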

3.2 Question 2: Events with Greatest Economic Consequences

Economic impact was measured by combining property damage and crop damage for each event type.

economic_impact <- storm_data %>%
  group_by(EVTYPE_CLEAN) %>%
  summarise(
    Property_Damage = sum(PROPERTY_DAMAGE, na.rm = TRUE),
    Crop_Damage = sum(CROP_DAMAGE, na.rm = TRUE),
    Total_Damage = sum(TOTAL_DAMAGE, na.rm = TRUE),
    Events = n()
  ) %>%
  filter(Total_Damage > 0) %>%
  arrange(desc(Total_Damage)) %>%
  slice_head(n = 15) %>%
  mutate(
    Property_Damage_B = Property_Damage / 1e9,
    Crop_Damage_B = Crop_Damage / 1e9,
    Total_Damage_B = Total_Damage / 1e9
  )

economic_table <- economic_impact %>%
  select(EVTYPE_CLEAN, Property_Damage_B, Crop_Damage_B, Total_Damage_B, Events)

kable(economic_table, 
      digits = 2,
      format.args = list(big.mark = ","),
      caption = "Top 15 Weather Events by Economic Damage (Billions USD)",
      col.names = c("Event Type", "Property Damage", "Crop Damage", 
                   "Total Damage", "Number of Events"))
Top 15 Weather Events by Economic Damage (Billions USD)

| Event Type                | Property Damage | Crop Damage | Total Damage | Number of Events |
|:--------------------------|----------------:|------------:|-------------:|-----------------:|
| FLOOD                     |          168.27 |       12.39 |       180.66 |           86,127 |
| HURRICANE                 |           85.36 |        5.52 |        90.87 |              299 |
| TORNADO                   |           58.60 |        0.42 |        59.02 |           60,700 |
| STORM SURGE               |           43.32 |        0.00 |        43.32 |              261 |
| HAIL                      |           15.98 |        3.05 |        19.02 |          289,276 |
| HIGH WIND                 |           16.24 |        2.03 |        18.28 |          364,869 |
| WINTER STORM              |           12.36 |        5.31 |        17.67 |           42,099 |
| DROUGHT                   |            1.05 |       13.97 |        15.02 |            2,488 |
| WILDFIRE                  |            8.50 |        0.40 |         8.90 |            4,239 |
| TROPICAL STORM            |            7.70 |        0.68 |         8.38 |              690 |
| STORM SURGE/TIDE          |            4.64 |        0.00 |         4.64 |              148 |
| HEAVY RAIN/SEVERE WEATHER |            2.50 |        0.00 |         2.50 |                2 |
| HEAVY RAIN                |            0.69 |        0.73 |         1.43 |           11,742 |
| EXTREME COLD              |            0.07 |        1.31 |         1.38 |              657 |
| THUNDERSTORM WIND         |            1.21 |        0.02 |         1.23 |               98 |

economic_impact %>%
  pivot_longer(cols = c(Property_Damage_B, Crop_Damage_B), 
               names_to = "Type", values_to = "Damage") %>%
  mutate(Type = case_when(
    Type == "Property_Damage_B" ~ "Property",
    Type == "Crop_Damage_B" ~ "Crop"
  )) %>%
  ggplot(aes(x = reorder(EVTYPE_CLEAN, Damage), y = Damage, fill = Type)) +
  geom_col(position = "stack") +
  coord_flip() +
  scale_y_continuous(labels = dollar) +
  scale_fill_manual(values = c("Property" = "#4575b4", "Crop" = "#91cf60"),
                    name = "Damage Type") +
  labs(
    title = "Top 15 Weather Events by Economic Impact",
    subtitle = "Total property and crop damage across the United States",
    x = NULL,
    y = "Economic Damage (Billions USD)"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )
Weather events with the highest economic impact

Findings:

  • Floods cause the greatest economic damage, totaling approximately $180.7 billion
  • Hurricanes rank second in economic impact, with damage of approximately $90.9 billion
  • Tornadoes, while the deadliest event type, rank third in economic consequences
  • Drought damage falls almost entirely on agriculture: $13.97 billion of its $15.02 billion total is crop damage
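The split between property and crop damage can be checked with a short worked sketch that uses four rows of the economic damage table above. The column name `crop_share` is introduced here for illustration only.

```r
# Values (billions USD) copied from the economic damage table.
damage <- data.frame(
  event = c("FLOOD", "HURRICANE", "TORNADO", "DROUGHT"),
  property_b = c(168.27, 85.36, 58.60, 1.05),
  crop_b = c(12.39, 5.52, 0.42, 13.97)
)

# Totals are recomputed from the rounded table values, so they can differ
# from the published totals by 0.01 (for example, hurricanes).
damage$total_b <- damage$property_b + damage$crop_b
damage$crop_share <- round(damage$crop_b / damage$total_b, 3)

# Sort so the event with the highest crop share comes first.
damage[order(-damage$crop_share), ]
```

Drought is the only one of these events where crop losses dominate (about 93% of its total); for floods, hurricanes and tornadoes, crop damage is under 10% of the total.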

4 Conclusion

This analysis reveals important distinctions between weather events that threaten population health versus those with the greatest economic impact:

Population Health Priority (ranked by fatalities):

  1. Tornadoes
  2. Excessive Heat
  3. Floods

Economic Damage Priority:

  1. Floods
  2. Hurricanes
  3. Tornadoes

Implications for Resource Allocation:

While tornadoes dominate casualty statistics, floods rank first in economic damage and third in fatalities, which makes them the largest combined threat to public safety and economic stability. Flood mitigation and emergency response systems therefore warrant priority attention in resource allocation decisions.
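One rough way to compare events on both dimensions at once is to rank them separately by fatalities and by damage and add the ranks. The sketch below does this for five events that appear in both top-15 tables. The unweighted rank sum is an assumption made for illustration, not a method used in this analysis; it ignores magnitudes, and it excludes excessive heat because that event does not appear in the economic table.

```r
# Values copied from the two top-15 tables in this report.
combined <- data.frame(
  event = c("TORNADO", "FLOOD", "HURRICANE", "HIGH WIND", "WINTER STORM"),
  fatalities = c(5661, 1553, 135, 1424, 639),
  damage_b = c(59.02, 180.66, 90.87, 18.28, 17.67)
)

# rank() assigns 1 to the smallest value, so negate to rank the largest first.
combined$fatality_rank <- rank(-combined$fatalities)
combined$damage_rank <- rank(-combined$damage_b)

# Illustrative unweighted score: lower means worse on both dimensions combined.
combined$rank_sum <- combined$fatality_rank + combined$damage_rank
combined[order(combined$rank_sum), ]
```

Among these five events, floods have the lowest rank sum, with tornadoes next, which is consistent with the combined-threat reading above.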

Heat-related events, despite their high fatality rate, may be underaddressed compared to more dramatic weather phenomena. Early warning systems and public cooling centers during heat waves could be highly cost-effective interventions.

The diverse nature of weather threats across the United States necessitates flexible, multi-hazard preparedness strategies rather than focusing resources on a single event type.


5 Session Information

sessionInfo()
## R version 4.4.3 (2025-02-28 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: Asia/Singapore
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] knitr_1.50        scales_1.4.0      data.table_1.17.0 lubridate_1.9.4  
##  [5] forcats_1.0.0     stringr_1.5.1     dplyr_1.1.4       purrr_1.0.4      
##  [9] readr_2.1.5       tidyr_1.3.1       tibble_3.2.1      ggplot2_4.0.1    
## [13] tidyverse_2.0.0   here_1.0.2       
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6       jsonlite_2.0.0     compiler_4.4.3     tidyselect_1.2.1  
##  [5] jquerylib_0.1.4    yaml_2.3.10        fastmap_1.2.0      R6_2.6.1          
##  [9] labeling_0.4.3     generics_0.1.3     rprojroot_2.1.0    bslib_0.9.0       
## [13] pillar_1.10.1      RColorBrewer_1.1-3 tzdb_0.5.0         rlang_1.1.5       
## [17] stringi_1.8.7      cachem_1.1.0       xfun_0.52          sass_0.4.9        
## [21] S7_0.2.1           timechange_0.3.0   cli_3.6.4          withr_3.0.2       
## [25] magrittr_2.0.3     digest_0.6.37      grid_4.4.3         rstudioapi_0.17.1 
## [29] hms_1.1.3          lifecycle_1.0.4    vctrs_0.6.5        evaluate_1.0.3    
## [33] glue_1.8.0         farver_2.1.2       rmarkdown_2.29     tools_4.4.3       
## [37] pkgconfig_2.0.3    htmltools_0.5.8.1

Note: This analysis is based on historical data and should be considered alongside current climate trends and regional risk assessments when making resource allocation decisions.