This analysis examines the NOAA Storm Database to identify which types of severe weather events pose the greatest threats to population health and economic stability across the United States. The findings are intended to inform government and municipal managers responsible for preparing for severe weather events and allocating resources for emergency response and disaster mitigation.
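The analysis addresses two primary questions: which types of weather events are most harmful to population health, and which have the greatest economic consequences.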
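Key Findings: Tornadoes caused the most casualties (5,661 fatalities and 91,407 injuries), while floods caused the greatest economic damage ($180.66 billion in combined property and crop losses).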
The data for this analysis comes from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including estimates of fatalities, injuries, and property damage.
Data URL: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
The processed data (.rds) is available at https://github.com/DrNWM/Storm-Analysis
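For readers who want to start from the raw file rather than the processed .rds, the following is a minimal sketch of one way to fetch and read it. The helper names `fetch_storm_data` and `read_storm_data` and the `data/raw` location are hypothetical and not taken from the original analysis. The demonstration runs offline on a tiny compressed file, so the real download is shown only as a comment.

```r
# Hypothetical helper: download the raw file only if it is not already present.
fetch_storm_data <- function(url, dest) {
  if (!file.exists(dest)) {
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# read.csv() opens bzip2-compressed files transparently, so no manual
# decompression step is needed.
read_storm_data <- function(path) {
  read.csv(path, stringsAsFactors = FALSE)
}

# Offline demonstration on a tiny bzip2-compressed CSV (made-up rows).
demo_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(demo_path, open = "wt")
writeLines(c("EVTYPE,FATALITIES,INJURIES", "TORNADO,0,15", "HAIL,0,0"), con)
close(con)
demo <- read_storm_data(demo_path)

# Real usage (not run here; the destination path is an assumption):
# url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# raw_path <- fetch_storm_data(url, file.path("data", "raw", "StormData.csv.bz2"))
# storm_raw <- read_storm_data(raw_path)
```

Checking `file.exists()` before downloading keeps repeated knits from fetching the file again.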
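# Load packages used throughout the analysis
library(tidyverse)  # dplyr, tidyr, ggplot2
library(here)       # project-relative paths
library(knitr)      # kable()
library(scales)     # comma, dollar

# Load processed data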
storm_data <- readRDS(here("data", "processed", "storm_data_processed.rds"))
summary_stats <- readRDS(here("data", "processed", "summary_statistic.rds"))
cat("Total weather events analyzed:", nrow(storm_data), "\n")
## Total weather events analyzed: 902297
cat("Date range:", format(min(storm_data$BGN_DATE, na.rm = TRUE), "%Y"),
    "to", format(max(storm_data$BGN_DATE, na.rm = TRUE), "%Y"), "\n")
## Date range: 1950 to 2011
The following steps were taken to prepare the data for analysis:
1. Variable Selection: Selected relevant variables including event type, date, location, fatalities, injuries, and economic damage estimates.
2. Damage Calculation: Converted property and crop damage values from coded format (e.g., “25K”, “5M”) to actual numeric values in USD.
3. Event Type Standardization: Consolidated similar event types (e.g., “TSTM WIND” and “THUNDERSTORM WIND” both mapped to “THUNDERSTORM”).
4. Derived Variables: Created total casualty count (fatalities + injuries) and total economic damage (property + crop damage).
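The processing steps above can be sketched as follows. This is an illustrative reconstruction in base R, not the author's actual script. It assumes the raw columns are named `PROPDMG`, `PROPDMGEXP`, `CROPDMG`, and `CROPDMGEXP`, uses made-up toy rows, treats unrecognised exponent codes as a multiplier of 1, and applies a single standardization rule. The helper names `exp_multiplier` and `standardise_evtype` are hypothetical.

```r
# Toy rows standing in for the raw data (values are made up; column names
# are assumed to match the raw NOAA file).
raw <- data.frame(
  EVTYPE     = c("TSTM WIND", "thunderstorm wind ", "FLOOD", "HAIL"),
  FATALITIES = c(0, 1, 2, 0),
  INJURIES   = c(3, 0, 5, 0),
  PROPDMG    = c(25, 5, 1.5, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  CROPDMG    = c(0, 2, 10, 0),
  CROPDMGEXP = c("", "K", "M", ""),
  stringsAsFactors = FALSE
)

# Map an exponent code to a numeric multiplier.
# Assumption: blank or unrecognised codes mean "no scaling" (multiplier 1).
exp_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(code)]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Consolidate spelling variants into one label. Only one rule is shown;
# a full mapping would need many more patterns.
standardise_evtype <- function(x) {
  x <- toupper(trimws(x))
  x[grepl("^(TSTM|THUNDERSTORM)", x)] <- "THUNDERSTORM"
  x
}

processed <- transform(
  raw,
  EVTYPE_CLEAN    = standardise_evtype(EVTYPE),
  PROPERTY_DAMAGE = PROPDMG * exp_multiplier(PROPDMGEXP),
  CROP_DAMAGE     = CROPDMG * exp_multiplier(CROPDMGEXP)
)

# Derived variables
processed$TOTAL_CASUALTIES <- processed$FATALITIES + processed$INJURIES
processed$TOTAL_DAMAGE     <- processed$PROPERTY_DAMAGE + processed$CROP_DAMAGE
```

A lookup vector keeps the code-to-multiplier mapping in one place, which makes it easy to audit how irregular exponent codes are handled.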
# Create summary table
data_summary <- data.frame(
  Metric = c("Total Events", "Unique Event Types", "Total Fatalities",
             "Total Injuries", "Total Economic Damage (Billions USD)"),
  Value = c(
    format(nrow(storm_data), big.mark = ","),
    length(unique(storm_data$EVTYPE_CLEAN)),
    format(sum(storm_data$FATALITIES, na.rm = TRUE), big.mark = ","),
    format(sum(storm_data$INJURIES, na.rm = TRUE), big.mark = ","),
    paste0("$", format(round(sum(storm_data$TOTAL_DAMAGE, na.rm = TRUE) / 1e9, 2),
                       big.mark = ","))
  )
)
kable(data_summary, caption = "Dataset Overview", align = c("l", "r"))

| Metric | Value |
|---|---|
| Total Events | 902,297 |
| Unique Event Types | 343 |
| Total Fatalities | 15,145 |
| Total Injuries | 140,528 |
| Total Economic Damage (Billions USD) | $477.33 |
To assess population health impact, we examined fatalities and injuries caused by different weather event types across the entire United States.
health_impact <- storm_data %>%
  group_by(EVTYPE_CLEAN) %>%
  summarise(
    Fatalities = sum(FATALITIES, na.rm = TRUE),
    Injuries = sum(INJURIES, na.rm = TRUE),
    Total_Casualties = sum(TOTAL_CASUALTIES, na.rm = TRUE),
    Events = n()
  ) %>%
  filter(Total_Casualties > 0) %>%
  arrange(desc(Total_Casualties)) %>%
  slice_head(n = 15)
kable(health_impact,
      format.args = list(big.mark = ","),
      caption = "Top 15 Weather Events by Total Casualties",
      col.names = c("Event Type", "Fatalities", "Injuries", "Total Casualties", "Number of Events"))

| Event Type | Fatalities | Injuries | Total Casualties | Number of Events |
|---|---|---|---|---|
| TORNADO | 5,661 | 91,407 | 97,068 | 60,700 |
| HIGH WIND | 1,424 | 11,498 | 12,922 | 364,869 |
| EXCESSIVE HEAT | 3,178 | 9,243 | 12,421 | 2,975 |
| FLOOD | 1,553 | 8,683 | 10,236 | 86,127 |
| WINTER STORM | 639 | 5,956 | 6,595 | 42,099 |
| LIGHTNING | 817 | 5,232 | 6,049 | 15,776 |
| WILDFIRE | 90 | 1,608 | 1,698 | 4,239 |
| HURRICANE | 135 | 1,333 | 1,468 | 299 |
| HAIL | 15 | 1,371 | 1,386 | 289,276 |
| FOG | 62 | 734 | 796 | 538 |
| RIP CURRENT | 368 | 232 | 600 | 470 |
| RIP CURRENTS | 204 | 297 | 501 | 304 |
| DUST STORM | 22 | 440 | 462 | 427 |
| TROPICAL STORM | 58 | 340 | 398 | 690 |
| AVALANCHE | 224 | 170 | 394 | 386 |
health_impact %>%
  pivot_longer(cols = c(Fatalities, Injuries),
               names_to = "Type", values_to = "Count") %>%
  ggplot(aes(x = reorder(EVTYPE_CLEAN, Count), y = Count, fill = Type)) +
  geom_col(position = "dodge") +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  scale_fill_manual(values = c("Fatalities" = "#d73027", "Injuries" = "#fee090"),
                    name = "Impact Type") +
  labs(
    title = "Top 15 Weather Events by Population Health Impact",
    subtitle = "Total fatalities and injuries across the United States",
    x = NULL,
    y = "Number of People Affected"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )

Figure: Weather events with the highest impact on population health
Findings:
Economic impact was measured by combining property damage and crop damage for each event type.
economic_impact <- storm_data %>%
  group_by(EVTYPE_CLEAN) %>%
  summarise(
    Property_Damage = sum(PROPERTY_DAMAGE, na.rm = TRUE),
    Crop_Damage = sum(CROP_DAMAGE, na.rm = TRUE),
    Total_Damage = sum(TOTAL_DAMAGE, na.rm = TRUE),
    Events = n()
  ) %>%
  filter(Total_Damage > 0) %>%
  arrange(desc(Total_Damage)) %>%
  slice_head(n = 15) %>%
  mutate(
    Property_Damage_B = Property_Damage / 1e9,
    Crop_Damage_B = Crop_Damage / 1e9,
    Total_Damage_B = Total_Damage / 1e9
  )
economic_table <- economic_impact %>%
  select(EVTYPE_CLEAN, Property_Damage_B, Crop_Damage_B, Total_Damage_B, Events)
kable(economic_table,
      digits = 2,
      format.args = list(big.mark = ","),
      caption = "Top 15 Weather Events by Economic Damage (Billions USD)",
      col.names = c("Event Type", "Property Damage", "Crop Damage",
                    "Total Damage", "Number of Events"))

| Event Type | Property Damage | Crop Damage | Total Damage | Number of Events |
|---|---|---|---|---|
| FLOOD | 168.27 | 12.39 | 180.66 | 86,127 |
| HURRICANE | 85.36 | 5.52 | 90.87 | 299 |
| TORNADO | 58.60 | 0.42 | 59.02 | 60,700 |
| STORM SURGE | 43.32 | 0.00 | 43.32 | 261 |
| HAIL | 15.98 | 3.05 | 19.02 | 289,276 |
| HIGH WIND | 16.24 | 2.03 | 18.28 | 364,869 |
| WINTER STORM | 12.36 | 5.31 | 17.67 | 42,099 |
| DROUGHT | 1.05 | 13.97 | 15.02 | 2,488 |
| WILDFIRE | 8.50 | 0.40 | 8.90 | 4,239 |
| TROPICAL STORM | 7.70 | 0.68 | 8.38 | 690 |
| STORM SURGE/TIDE | 4.64 | 0.00 | 4.64 | 148 |
| HEAVY RAIN/SEVERE WEATHER | 2.50 | 0.00 | 2.50 | 2 |
| HEAVY RAIN | 0.69 | 0.73 | 1.43 | 11,742 |
| EXTREME COLD | 0.07 | 1.31 | 1.38 | 657 |
| THUNDERSTORM WIND | 1.21 | 0.02 | 1.23 | 98 |
economic_impact %>%
  pivot_longer(cols = c(Property_Damage_B, Crop_Damage_B),
               names_to = "Type", values_to = "Damage") %>%
  mutate(Type = case_when(
    Type == "Property_Damage_B" ~ "Property",
    Type == "Crop_Damage_B" ~ "Crop"
  )) %>%
  ggplot(aes(x = reorder(EVTYPE_CLEAN, Damage), y = Damage, fill = Type)) +
  geom_col(position = "stack") +
  coord_flip() +
  scale_y_continuous(labels = dollar) +
  scale_fill_manual(values = c("Property" = "#4575b4", "Crop" = "#91cf60"),
                    name = "Damage Type") +
  labs(
    title = "Top 15 Weather Events by Economic Impact",
    subtitle = "Total property and crop damage across the United States",
    x = NULL,
    y = "Economic Damage (Billions USD)"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )

Figure: Weather events with the highest economic impact
Findings:
This analysis reveals important distinctions between the weather events that most threaten population health and those with the greatest economic impact:
Population Health Priority:
Economic Damage Priority:
Implications for Resource Allocation:
While tornadoes dominate casualty statistics, floods represent the single largest combined threat to both public safety and economic stability. This suggests that flood mitigation and emergency response systems warrant priority attention in resource allocation decisions.
Heat-related events, despite their high fatality rate, may be under-addressed compared with more dramatic weather phenomena. Early warning systems and public cooling centers during heat waves could be highly cost-effective interventions.
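The point about the heat-related fatality rate can be checked with a quick calculation. The sketch below copies four rows from the casualties table and divides fatalities by the number of recorded events. The data frame and its column names are illustrative only, and fatalities per recorded event is just one possible way to define a rate.

```r
# Values copied from the "Top 15 Weather Events by Total Casualties" table.
events <- data.frame(
  event      = c("TORNADO", "HIGH WIND", "EXCESSIVE HEAT", "FLOOD"),
  fatalities = c(5661, 1424, 3178, 1553),
  n_events   = c(60700, 364869, 2975, 86127)
)

# Average fatalities per recorded event
events$fatalities_per_event <- events$fatalities / events$n_events

# Sort from most to least deadly per event
events[order(-events$fatalities_per_event), ]
```

On these figures, excessive heat averages about 1.07 fatalities per recorded event, compared with about 0.09 for tornadoes and about 0.02 for floods.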
The diverse nature of weather threats across the United States necessitates flexible, multi-hazard preparedness strategies rather than focusing resources on a single event type.
## R version 4.4.3 (2025-02-28 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: Asia/Singapore
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.50 scales_1.4.0 data.table_1.17.0 lubridate_1.9.4
## [5] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.4
## [9] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 ggplot2_4.0.1
## [13] tidyverse_2.0.0 here_1.0.2
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 jsonlite_2.0.0 compiler_4.4.3 tidyselect_1.2.1
## [5] jquerylib_0.1.4 yaml_2.3.10 fastmap_1.2.0 R6_2.6.1
## [9] labeling_0.4.3 generics_0.1.3 rprojroot_2.1.0 bslib_0.9.0
## [13] pillar_1.10.1 RColorBrewer_1.1-3 tzdb_0.5.0 rlang_1.1.5
## [17] stringi_1.8.7 cachem_1.1.0 xfun_0.52 sass_0.4.9
## [21] S7_0.2.1 timechange_0.3.0 cli_3.6.4 withr_3.0.2
## [25] magrittr_2.0.3 digest_0.6.37 grid_4.4.3 rstudioapi_0.17.1
## [29] hms_1.1.3 lifecycle_1.0.4 vctrs_0.6.5 evaluate_1.0.3
## [33] glue_1.8.0 farver_2.1.2 rmarkdown_2.29 tools_4.4.3
## [37] pkgconfig_2.0.3 htmltools_0.5.8.1
Note: This analysis is based on historical data and should be considered alongside current climate trends and regional risk assessments when making resource allocation decisions.