This report analyzes the NOAA Storm Events database for 2025 to identify the most harmful weather events, their geographic distribution, and their seasonal patterns across the United States. The analysis integrates three NOAA datasets—event details, fatalities, and locations—using a common event identifier.
Public health impact is measured using total deaths and injuries, while economic impact is assessed using property and crop damage estimates. The results show that excessive heat and tornadoes caused the most combined deaths and injuries, while thunderstorm wind was by far the most frequent event type and led the event counts in many states. Seasonal patterns vary markedly throughout the year, with winter events dominating the early months and convective storms increasing during warmer periods.
The analysis also reveals that certain events, such as tornadoes and wildfires, contribute disproportionately to economic damage. These findings highlight the importance of considering multiple dimensions of risk—frequency, severity, and timing—when planning for severe weather events.
This section describes how the raw NOAA Storm Events files were loaded, joined, cleaned, transformed, and prepared for analysis. The project uses the 2025 files because the assignment asks for the most recent matching details, fatalities, and locations files.
All transformations were performed to ensure consistency and interpretability of the data. Direct and indirect deaths and injuries were combined to capture total health impact, while damage values were converted into numeric form to allow accurate comparison across event types. Event-level aggregation was used to prevent double counting caused by multiple location or fatality records per event. These steps ensure that the analysis is both accurate and reproducible.
The packages below are used for reading CSV files, manipulating data, handling strings, formatting tables, and creating visualizations.
library(dplyr)
library(readr)
library(ggplot2)
library(stringr)
library(forcats)
library(scales)
library(knitr)
library(tidyr)
The analysis begins with the raw CSV files. The three files should be saved in the same folder as this R Markdown document.
folder_path <- "."
details_file <- file.path(folder_path, "StormEvents_details-ftp_v1.0_d2025_c20260323.csv")
fatalities_file <- file.path(folder_path, "StormEvents_fatalities-ftp_v1.0_d2025_c20260323.csv")
locations_file <- file.path(folder_path, "StormEvents_locations-ftp_v1.0_d2025_c20260323.csv")
details <- read_csv(details_file, show_col_types = FALSE)
fatalities <- read_csv(fatalities_file, show_col_types = FALSE)
locations <- read_csv(locations_file, show_col_types = FALSE)
dim(details)
## [1] 72241 51
dim(fatalities)
## [1] 895 10
dim(locations)
## [1] 51870 11
The details file contains the main event-level records,
including state, event type, date, injuries, deaths, and damage
estimates. The fatalities file contains fatality-level
records where available. The locations file contains
additional geographic information for events.
names(details)
## [1] "BEGIN_YEARMONTH" "BEGIN_DAY" "BEGIN_TIME"
## [4] "END_YEARMONTH" "END_DAY" "END_TIME"
## [7] "EPISODE_ID" "EVENT_ID" "STATE"
## [10] "STATE_FIPS" "YEAR" "MONTH_NAME"
## [13] "EVENT_TYPE" "CZ_TYPE" "CZ_FIPS"
## [16] "CZ_NAME" "WFO" "BEGIN_DATE_TIME"
## [19] "CZ_TIMEZONE" "END_DATE_TIME" "INJURIES_DIRECT"
## [22] "INJURIES_INDIRECT" "DEATHS_DIRECT" "DEATHS_INDIRECT"
## [25] "DAMAGE_PROPERTY" "DAMAGE_CROPS" "SOURCE"
## [28] "MAGNITUDE" "MAGNITUDE_TYPE" "FLOOD_CAUSE"
## [31] "CATEGORY" "TOR_F_SCALE" "TOR_LENGTH"
## [34] "TOR_WIDTH" "TOR_OTHER_WFO" "TOR_OTHER_CZ_STATE"
## [37] "TOR_OTHER_CZ_FIPS" "TOR_OTHER_CZ_NAME" "BEGIN_RANGE"
## [40] "BEGIN_AZIMUTH" "BEGIN_LOCATION" "END_RANGE"
## [43] "END_AZIMUTH" "END_LOCATION" "BEGIN_LAT"
## [46] "BEGIN_LON" "END_LAT" "END_LON"
## [49] "EPISODE_NARRATIVE" "EVENT_NARRATIVE" "DATA_SOURCE"
names(fatalities)
## [1] "FAT_YEARMONTH" "FAT_DAY" "FAT_TIME"
## [4] "FATALITY_ID" "EVENT_ID" "FATALITY_TYPE"
## [7] "FATALITY_DATE" "FATALITY_AGE" "FATALITY_SEX"
## [10] "FATALITY_LOCATION"
names(locations)
## [1] "YEARMONTH" "EPISODE_ID" "EVENT_ID" "LOCATION_INDEX"
## [5] "RANGE" "AZIMUTH" "LOCATION" "LATITUDE"
## [9] "LONGITUDE" "LAT2" "LON2"
The assignment requires the three NOAA files to be joined by
EVENT_ID. The code below first creates the direct joined
file required for the project. However, for analysis, the locations and
fatalities files are also summarized to the event level before joining.
This prevents events with multiple location or fatality records from
being double-counted in event-frequency summaries.
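# Direct join required by the assignment. Events with several location or
# fatality records are repeated, so this table has more rows than details.
# EPISODE_ID exists in both details and locations, so dplyr suffixes it (.x, .y).
StormEvents_joined_data <- details %>%
  left_join(locations, by = "EVENT_ID") %>%
  left_join(fatalities, by = "EVENT_ID")

write_csv(StormEvents_joined_data, "StormEvents_joined_data.csv")

# Collapse locations to one row per event
locations_by_event <- locations %>%
  group_by(EVENT_ID) %>%
  summarise(
    location_record_count = n(),
    location_names = paste(sort(unique(LOCATION)), collapse = "; "),
    .groups = "drop"
  )

# Collapse fatalities to one row per event
fatalities_by_event <- fatalities %>%
  group_by(EVENT_ID) %>%
  summarise(
    fatality_record_count = n(),
    fatality_types = paste(sort(unique(FATALITY_TYPE)), collapse = "; "),
    .groups = "drop"
  )

# Event-level table used for analysis: one row per EVENT_ID
storm_event_level <- details %>%
  left_join(locations_by_event, by = "EVENT_ID") %>%
  left_join(fatalities_by_event, by = "EVENT_ID")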
dim(StormEvents_joined_data)
## [1] 94364 70
dim(storm_event_level)
## [1] 72241 55
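The row counts above differ because a direct join repeats an event once per matching location or fatality record. The following is a minimal sketch of that effect on toy data; the table names and values are hypothetical and are not taken from the NOAA files.

```r
library(dplyr)

# Toy data (hypothetical values, not from the NOAA files)
details_toy <- tibble(
  EVENT_ID   = c(1, 2, 3),
  EVENT_TYPE = c("Tornado", "Hail", "Flood")
)
locations_toy <- tibble(
  EVENT_ID = c(1, 1, 1, 2),
  LOCATION = c("A", "B", "C", "D")
)

# Direct join: event 1 appears once per location record
joined_direct <- left_join(details_toy, locations_toy, by = "EVENT_ID")
nrow(joined_direct)

# Summarise locations to one row per event, then join
locations_toy_by_event <- locations_toy %>%
  group_by(EVENT_ID) %>%
  summarise(location_record_count = n(), .groups = "drop")

joined_event_level <- left_join(details_toy, locations_toy_by_event, by = "EVENT_ID")
nrow(joined_event_level)

# Row counts overstate event frequency in the direct join;
# distinct EVENT_ID counts do not
count(joined_direct, EVENT_TYPE)
summarise(joined_direct, events = n_distinct(EVENT_ID))
```

In this toy case the direct join has five rows for three events, while the event-level join keeps one row per event. Event 3 has no location record, so its location_record_count is NA rather than zero.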
Several new variables are created to support the analysis:
- total_deaths: direct deaths plus indirect deaths.
- total_injuries: direct injuries plus indirect injuries.
- total_health_harm: total deaths plus total injuries.
- property_damage_value: property damage converted from NOAA abbreviations such as K, M, and B into numeric dollars.
- crop_damage_value: crop damage converted into numeric dollars.
- total_damage_value: property damage plus crop damage.
- event_month: month name ordered from January through December.

These transformations are necessary because the raw files separate direct and indirect health outcomes and store damage estimates as text values.
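# Convert NOAA damage strings such as "1.00K" or "2.5M" into numeric dollars.
# Only a trailing magnitude letter is used, so stray letters elsewhere in the
# string cannot select the wrong multiplier.
parse_damage <- function(x) {
  x_chr <- str_to_upper(str_trim(as.character(x)))
  value <- parse_number(x_chr)
  suffix <- str_extract(x_chr, "[KMBT]$")
  multiplier <- case_when(
    suffix == "K" ~ 1e3,
    suffix == "M" ~ 1e6,
    suffix == "B" ~ 1e9,
    suffix == "T" ~ 1e12,
    TRUE ~ 1  # no suffix (or missing value): leave the number unscaled
  )
  value * multiplier
}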
storm_clean <- storm_event_level %>%
  mutate(
    total_deaths = coalesce(DEATHS_DIRECT, 0) + coalesce(DEATHS_INDIRECT, 0),
    total_injuries = coalesce(INJURIES_DIRECT, 0) + coalesce(INJURIES_INDIRECT, 0),
    total_health_harm = total_deaths + total_injuries,
    property_damage_value = parse_damage(DAMAGE_PROPERTY),
    crop_damage_value = parse_damage(DAMAGE_CROPS),
    total_damage_value = coalesce(property_damage_value, 0) + coalesce(crop_damage_value, 0),
    event_month = factor(MONTH_NAME, levels = month.name),
    EVENT_TYPE = str_to_title(EVENT_TYPE),
    STATE = str_to_title(STATE)
  )
glimpse(storm_clean)
## Rows: 72,241
## Columns: 62
## $ BEGIN_YEARMONTH <dbl> 202503, 202503, 202501, 202501, 202501, 202501, …
## $ BEGIN_DAY <dbl> 31, 30, 5, 3, 3, 3, 3, 3, 3, 3, 19, 13, 13, 13, …
## $ BEGIN_TIME <dbl> 1104, 1552, 1800, 1300, 1300, 1300, 1547, 1527, …
## $ END_YEARMONTH <dbl> 202503, 202503, 202501, 202501, 202501, 202501, …
## $ END_DAY <dbl> 31, 30, 6, 3, 3, 3, 3, 3, 3, 3, 19, 13, 13, 13, …
## $ END_TIME <dbl> 1106, 1555, 2227, 1900, 1900, 1900, 1619, 1619, …
## $ EPISODE_ID <dbl> 201366, 200337, 197733, 197761, 197761, 197761, …
## $ EVENT_ID <dbl> 1252415, 1241136, 1222851, 1223112, 1223113, 122…
## $ STATE <chr> "Georgia", "Michigan", "Virginia", "Maryland", "…
## $ STATE_FIPS <dbl> 13, 26, 51, 24, 24, 24, 24, 51, 24, 24, 27, 27, …
## $ YEAR <dbl> 2025, 2025, 2025, 2025, 2025, 2025, 2025, 2025, …
## $ MONTH_NAME <chr> "March", "March", "January", "January", "January…
## $ EVENT_TYPE <chr> "Thunderstorm Wind", "Tornado", "Winter Storm", …
## $ CZ_TYPE <chr> "C", "C", "Z", "Z", "Z", "Z", "Z", "Z", "Z", "Z"…
## $ CZ_FIPS <dbl> 45, 27, 56, 506, 504, 503, 14, 53, 5, 505, 89, 7…
## $ CZ_NAME <chr> "CARROLL", "CASS", "SPOTSYLVANIA", "CENTRAL AND …
## $ WFO <chr> "FFC", "IWX", "LWX", "LWX", "LWX", "LWX", "LWX",…
## $ BEGIN_DATE_TIME <chr> "31-MAR-25 11:04:00", "30-MAR-25 15:52:00", "05-…
## $ CZ_TIMEZONE <chr> "EST-5", "EST-5", "EST-5", "EST-5", "EST-5", "ES…
## $ END_DATE_TIME <chr> "31-MAR-25 11:06:00", "30-MAR-25 15:55:00", "06-…
## $ INJURIES_DIRECT <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ INJURIES_INDIRECT <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ DEATHS_DIRECT <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ DEATHS_INDIRECT <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ DAMAGE_PROPERTY <chr> "1.00K", "100.00K", NA, NA, NA, NA, "0.00K", NA,…
## $ DAMAGE_CROPS <chr> NA, "0.00K", NA, NA, NA, NA, "0.00K", NA, NA, NA…
## $ SOURCE <chr> "Emergency Manager", "NWS Storm Survey", "Traine…
## $ MAGNITUDE <dbl> 52.0, NA, NA, NA, NA, NA, NA, NA, NA, NA, 38.0, …
## $ MAGNITUDE_TYPE <chr> "EG", NA, NA, NA, NA, NA, NA, NA, NA, NA, "MS", …
## $ FLOOD_CAUSE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ CATEGORY <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ TOR_F_SCALE <chr> NA, "EF1", NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ TOR_LENGTH <dbl> NA, 2.59, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ TOR_WIDTH <dbl> NA, 100, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ TOR_OTHER_WFO <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ TOR_OTHER_CZ_STATE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ TOR_OTHER_CZ_FIPS <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ TOR_OTHER_CZ_NAME <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ BEGIN_RANGE <dbl> 2.22, 1.24, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ BEGIN_AZIMUTH <chr> "W", "SW", NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ BEGIN_LOCATION <chr> "TYUS", "EDWARDSBURG", NA, NA, NA, NA, NA, NA, N…
## $ END_RANGE <dbl> 2.22, 1.47, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ END_AZIMUTH <chr> "W", "NNE", NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ END_LOCATION <chr> "TYUS", "EDWARDSBURG", NA, NA, NA, NA, NA, NA, N…
## $ BEGIN_LAT <dbl> 33.4757, 41.7900, NA, NA, NA, NA, NA, NA, NA, NA…
## $ BEGIN_LON <dbl> -85.238, -86.100, NA, NA, NA, NA, NA, NA, NA, NA…
## $ END_LAT <dbl> 33.4757, 41.8200, NA, NA, NA, NA, NA, NA, NA, NA…
## $ END_LON <dbl> -85.238, -86.070, NA, NA, NA, NA, NA, NA, NA, NA…
## $ EPISODE_NARRATIVE <chr> "A cold-front initiated a line of thunderstorms …
## $ EVENT_NARRATIVE <chr> "Tree down at the intersection of highway 5 and …
## $ DATA_SOURCE <chr> "CSV", "CSV", "CSV", "CSV", "CSV", "CSV", "CSV",…
## $ location_record_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ location_names <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ fatality_record_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ fatality_types <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ total_deaths <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ total_injuries <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ total_health_harm <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ property_damage_value <dbl> 1e+03, 1e+05, NA, NA, NA, NA, 0e+00, NA, NA, NA,…
## $ crop_damage_value <dbl> NA, 0, NA, NA, NA, NA, 0, NA, NA, NA, 0, 0, 0, 0…
## $ total_damage_value <dbl> 1e+03, 1e+05, 0e+00, 0e+00, 0e+00, 0e+00, 0e+00,…
## $ event_month <fct> March, March, January, January, January, January…
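As a quick check on the damage conversion, the sketch below applies the same suffix-to-multiplier idea to a few sample strings. The helper name damage_to_dollars is hypothetical; "1.00K", "100.00K", and "0.00K" appear in the output above, while "2.5M" is an invented value used only to exercise the millions case.

```r
library(dplyr)
library(readr)
library(stringr)

# Hypothetical helper that restates the conversion for illustration only
damage_to_dollars <- function(x) {
  x_chr <- str_to_upper(str_trim(as.character(x)))
  suffix <- str_extract(x_chr, "[KMB]$")   # trailing magnitude letter, if any
  multiplier <- case_when(
    suffix == "K" ~ 1e3,
    suffix == "M" ~ 1e6,
    suffix == "B" ~ 1e9,
    TRUE ~ 1                               # no suffix or missing value
  )
  parse_number(x_chr) * multiplier
}

sample_values <- c("1.00K", "100.00K", "2.5M", "0.00K", NA)
converted <- damage_to_dollars(sample_values)
converted
```

The missing input stays NA after conversion, which is why the missing-value handling for damage is a separate, explicit step.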
The missing value summary focuses on the variables used in the
analysis. Injury and death count fields are treated as zero where no
value is recorded because the absence of a count in these fields means
there is no reported injury or death for that event record. Damage
values are converted into numeric estimates, and missing or blank values
are treated as zero in total_damage_value. Because a missing damage field
may mean that no estimate was reported rather than that no damage
occurred, the damage totals are best read as totals of reported damage.
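The zero-fill rule matters because ordinary addition in R propagates missing values. A minimal sketch with made-up numbers (the vectors property and crop are hypothetical):

```r
library(dplyr)

# Hypothetical damage values for three events
property <- c(1000, NA, 100000)
crop     <- c(NA, NA, 0)

# Plain addition returns NA whenever either component is missing
naive_total <- property + crop

# coalesce() substitutes 0 for missing components before adding
filled_total <- coalesce(property, 0) + coalesce(crop, 0)

# Optional flag that separates "no estimate reported" from a reported zero
no_estimate <- is.na(property) & is.na(crop)

tibble(property, crop, naive_total, filled_total, no_estimate)
```

Keeping a flag like no_estimate is one possible way to distinguish events with no damage estimate from events with a reported value of zero; the report itself does not use such a flag.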
missing_summary <- storm_clean %>%
  select(
    EVENT_ID, STATE, EVENT_TYPE, event_month,
    DEATHS_DIRECT, DEATHS_INDIRECT,
    INJURIES_DIRECT, INJURIES_INDIRECT,
    total_deaths, total_injuries, total_health_harm,
    property_damage_value, crop_damage_value, total_damage_value
  ) %>%
  summarise(across(everything(), ~sum(is.na(.)))) %>%
  pivot_longer(
    cols = everything(),
    names_to = "variable",
    values_to = "missing_count"
  ) %>%
  arrange(desc(missing_count))

kable(missing_summary, caption = "Missing values in key analysis variables")
| variable | missing_count |
|---|---|
| crop_damage_value | 17146 |
| property_damage_value | 16399 |
| EVENT_ID | 0 |
| STATE | 0 |
| EVENT_TYPE | 0 |
| event_month | 0 |
| DEATHS_DIRECT | 0 |
| DEATHS_INDIRECT | 0 |
| INJURIES_DIRECT | 0 |
| INJURIES_INDIRECT | 0 |
| total_deaths | 0 |
| total_injuries | 0 |
| total_health_harm | 0 |
| total_damage_value | 0 |
Population health impact is measured as the sum of direct deaths, indirect deaths, direct injuries, and indirect injuries. This combined measure is more informative than looking at deaths alone because some event types cause large injury burdens even when deaths are relatively limited.
health_by_event <- storm_clean %>%
  group_by(EVENT_TYPE) %>%
  summarise(
    event_count = n_distinct(EVENT_ID),
    deaths = sum(total_deaths, na.rm = TRUE),
    injuries = sum(total_injuries, na.rm = TRUE),
    total_health_harm = sum(total_health_harm, na.rm = TRUE),
    harm_per_event = total_health_harm / event_count,
    .groups = "drop"
  ) %>%
  arrange(desc(total_health_harm))

top_health_events <- health_by_event %>%
  slice_head(n = 10)

kable(
  top_health_events,
  caption = "Top 10 event types by total population health impact in 2025"
)
| EVENT_TYPE | event_count | deaths | injuries | total_health_harm | harm_per_event |
|---|---|---|---|---|---|
| Excessive Heat | 1439 | 90 | 326 | 416 | 0.2890896 |
| Tornado | 1591 | 64 | 257 | 321 | 0.2017599 |
| Flash Flood | 5393 | 209 | 20 | 229 | 0.0424625 |
| Heat | 2864 | 163 | 51 | 214 | 0.0747207 |
| Thunderstorm Wind | 21807 | 41 | 141 | 182 | 0.0083459 |
| Winter Weather | 4436 | 31 | 123 | 154 | 0.0347160 |
| Lightning | 288 | 21 | 98 | 119 | 0.4131944 |
| Wildfire | 350 | 63 | 43 | 106 | 0.3028571 |
| Dust Storm | 320 | 17 | 78 | 95 | 0.2968750 |
| Rip Current | 72 | 39 | 49 | 88 | 1.2222222 |
ggplot(top_health_events,
       aes(x = fct_reorder(EVENT_TYPE, total_health_harm),
           y = total_health_harm)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Event Types by Population Health Impact",
    subtitle = "Total health harm = deaths + injuries",
    x = "Event Type",
    y = "Total Health Harm"
  ) +
  scale_y_continuous(labels = comma) +
  theme_minimal()
Figure 1. Top 10 storm event types by total population health impact in 2025. Total health impact equals direct and indirect deaths plus direct and indirect injuries.
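Excessive heat produced the largest combined total (416 deaths and injuries), followed by tornadoes (321) and flash floods (229), while flash floods caused the most deaths (209). Heat and Excessive Heat are recorded as separate event types, so heat-related harm is split across two rows of the table. Frequency and harm also diverge: thunderstorm wind had by far the most events (21,807) but very low harm per event, whereas rip currents had only 72 events but the highest harm per event (about 1.2). These hazards are especially important for emergency management because they represent the largest observed public health burden in the data.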
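The per-event rate can change the ranking considerably. The sketch below recomputes harm_per_event for four rows copied from the table above and sorts by the rate instead of the total; the object name health_toy is hypothetical.

```r
library(dplyr)

# Four rows copied from the top-10 health table (not the full data)
health_toy <- tibble(
  EVENT_TYPE        = c("Excessive Heat", "Thunderstorm Wind", "Lightning", "Rip Current"),
  event_count       = c(1439, 21807, 288, 72),
  total_health_harm = c(416, 182, 119, 88)
)

health_by_rate <- health_toy %>%
  mutate(harm_per_event = total_health_harm / event_count) %>%
  arrange(desc(harm_per_event))

health_by_rate
```

Sorted this way, Rip Current moves to the top and Thunderstorm Wind to the bottom, even though Thunderstorm Wind has the larger total. Rates based on small event counts are less stable, so they are best read alongside the totals rather than instead of them.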
This section counts unique storm events by state and event type.
Unique EVENT_ID values are used so that events are not
double-counted because of multiple location records.
state_event_counts <- storm_clean %>%
  group_by(STATE, EVENT_TYPE) %>%
  summarise(event_count = n_distinct(EVENT_ID), .groups = "drop") %>%
  arrange(desc(event_count))

top_state_event_combinations <- state_event_counts %>%
  slice_head(n = 15)

kable(
  top_state_event_combinations,
  caption = "Top 15 state-event combinations by number of unique storm events in 2025"
)
| STATE | EVENT_TYPE | event_count |
|---|---|---|
| Alabama | Thunderstorm Wind | 1532 |
| Texas | Hail | 1453 |
| Texas | Thunderstorm Wind | 1205 |
| Virginia | Thunderstorm Wind | 1096 |
| Georgia | Thunderstorm Wind | 1025 |
| Pennsylvania | Thunderstorm Wind | 1010 |
| Illinois | Thunderstorm Wind | 978 |
| Missouri | Thunderstorm Wind | 886 |
| Oklahoma | Hail | 836 |
| Kansas | Thunderstorm Wind | 832 |
| South Dakota | Thunderstorm Wind | 749 |
| North Carolina | Thunderstorm Wind | 734 |
| Atlantic North | Marine Thunderstorm Wind | 724 |
| Ohio | Thunderstorm Wind | 717 |
| Indiana | Thunderstorm Wind | 714 |
top_state_event_combinations <- top_state_event_combinations %>%
  mutate(state_event = paste(STATE, EVENT_TYPE, sep = " - "))

ggplot(top_state_event_combinations,
       aes(x = fct_reorder(state_event, event_count),
           y = event_count)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Most Frequent State-Event Type Combinations",
    subtitle = "Counts based on unique EVENT_ID values",
    x = "State and Event Type",
    y = "Number of Events"
  ) +
  scale_y_continuous(labels = comma) +
  theme_minimal()
Figure 2. Top state-event type combinations by number of unique storm events in 2025.
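This result is useful because emergency planning is often implemented at the state or municipal level. Thunderstorm wind accounts for 12 of the 15 most frequent combinations, with hail in Texas and Oklahoma as the main exception. Because the table ranks state-event combinations overall, it shows where the largest event counts occur rather than the most frequent event type in every state. The STATE field also includes marine areas such as Atlantic North, which are not states.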
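To read off the single most frequent event type for each state, the counts can be reduced to one row per state. The following is a sketch on four rows copied from the table above; state_counts_toy and top_type_by_state are hypothetical names, and in the report the same steps would be applied to the full state_event_counts table.

```r
library(dplyr)

# Four rows copied from the top-15 table (not the full data)
state_counts_toy <- tibble(
  STATE       = c("Alabama", "Texas", "Texas", "Oklahoma"),
  EVENT_TYPE  = c("Thunderstorm Wind", "Hail", "Thunderstorm Wind", "Hail"),
  event_count = c(1532, 1453, 1205, 836)
)

# Keep the highest-count event type within each state
top_type_by_state <- state_counts_toy %>%
  group_by(STATE) %>%
  slice_max(event_count, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(desc(event_count))

top_type_by_state
```

Setting with_ties = FALSE guarantees exactly one row per state. If ties should be reported, with_ties = TRUE would keep all tied event types.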
This question examines seasonality by counting event types across months. To keep the heatmap readable, the analysis focuses on the 10 most frequent event types overall.
top_event_types_overall <- storm_clean %>%
  count(EVENT_TYPE, sort = TRUE) %>%
  slice_head(n = 10) %>%
  pull(EVENT_TYPE)

monthly_event_counts <- storm_clean %>%
  filter(EVENT_TYPE %in% top_event_types_overall) %>%
  group_by(event_month, EVENT_TYPE) %>%
  summarise(event_count = n_distinct(EVENT_ID), .groups = "drop")

kable(
  monthly_event_counts %>% arrange(EVENT_TYPE, event_month),
  caption = "Monthly counts for the 10 most frequent event types in 2025"
)
| event_month | EVENT_TYPE | event_count |
|---|---|---|
| January | Drought | 202 |
| February | Drought | 240 |
| March | Drought | 241 |
| April | Drought | 294 |
| May | Drought | 251 |
| June | Drought | 186 |
| July | Drought | 162 |
| August | Drought | 123 |
| September | Drought | 402 |
| October | Drought | 432 |
| November | Drought | 425 |
| December | Drought | 325 |
| January | Flash Flood | 69 |
| February | Flash Flood | 267 |
| March | Flash Flood | 124 |
| April | Flash Flood | 725 |
| May | Flash Flood | 624 |
| June | Flash Flood | 919 |
| July | Flash Flood | 1570 |
| August | Flash Flood | 550 |
| September | Flash Flood | 324 |
| October | Flash Flood | 149 |
| November | Flash Flood | 58 |
| December | Flash Flood | 14 |
| January | Flood | 144 |
| February | Flood | 592 |
| March | Flood | 86 |
| April | Flood | 289 |
| May | Flood | 247 |
| June | Flood | 204 |
| July | Flood | 213 |
| August | Flood | 144 |
| September | Flood | 83 |
| October | Flood | 54 |
| November | Flood | 73 |
| December | Flood | 132 |
| January | Hail | 4 |
| February | Hail | 42 |
| March | Hail | 1084 |
| April | Hail | 1820 |
| May | Hail | 2835 |
| June | Hail | 1384 |
| July | Hail | 768 |
| August | Hail | 494 |
| September | Hail | 544 |
| October | Hail | 100 |
| November | Hail | 128 |
| December | Hail | 2 |
| April | Heat | 2 |
| May | Heat | 11 |
| June | Heat | 539 |
| July | Heat | 1399 |
| August | Heat | 894 |
| September | Heat | 10 |
| December | Heat | 9 |
| January | High Wind | 282 |
| February | High Wind | 576 |
| March | High Wind | 1669 |
| April | High Wind | 231 |
| May | High Wind | 230 |
| June | High Wind | 76 |
| July | High Wind | 33 |
| August | High Wind | 19 |
| September | High Wind | 39 |
| October | High Wind | 147 |
| November | High Wind | 240 |
| December | High Wind | 1061 |
| January | Marine Thunderstorm Wind | 28 |
| February | Marine Thunderstorm Wind | 56 |
| March | Marine Thunderstorm Wind | 232 |
| April | Marine Thunderstorm Wind | 131 |
| May | Marine Thunderstorm Wind | 375 |
| June | Marine Thunderstorm Wind | 392 |
| July | Marine Thunderstorm Wind | 466 |
| August | Marine Thunderstorm Wind | 215 |
| September | Marine Thunderstorm Wind | 64 |
| October | Marine Thunderstorm Wind | 110 |
| November | Marine Thunderstorm Wind | 38 |
| December | Marine Thunderstorm Wind | 19 |
| January | Thunderstorm Wind | 56 |
| February | Thunderstorm Wind | 575 |
| March | Thunderstorm Wind | 2390 |
| April | Thunderstorm Wind | 2766 |
| May | Thunderstorm Wind | 3895 |
| June | Thunderstorm Wind | 5266 |
| July | Thunderstorm Wind | 3793 |
| August | Thunderstorm Wind | 1629 |
| September | Thunderstorm Wind | 783 |
| October | Thunderstorm Wind | 207 |
| November | Thunderstorm Wind | 143 |
| December | Thunderstorm Wind | 304 |
| January | Winter Storm | 1120 |
| February | Winter Storm | 693 |
| March | Winter Storm | 230 |
| April | Winter Storm | 62 |
| May | Winter Storm | 5 |
| June | Winter Storm | 1 |
| October | Winter Storm | 8 |
| November | Winter Storm | 357 |
| December | Winter Storm | 475 |
| January | Winter Weather | 1006 |
| February | Winter Weather | 1369 |
| March | Winter Weather | 294 |
| April | Winter Weather | 134 |
| May | Winter Weather | 12 |
| June | Winter Weather | 2 |
| September | Winter Weather | 1 |
| October | Winter Weather | 8 |
| November | Winter Weather | 336 |
| December | Winter Weather | 1274 |
ggplot(monthly_event_counts,
       aes(x = event_month,
           y = fct_rev(EVENT_TYPE),
           fill = event_count)) +
  geom_tile(color = "white") +
  labs(
    title = "Seasonality of Major Storm Event Types",
    subtitle = "Monthly counts for the 10 most frequent event types",
    x = "Month",
    y = "Event Type",
    fill = "Event Count"
  ) +
  # The default gradient runs dark (low) to light (high); set the endpoints
  # explicitly so that darker cells mean higher counts
  scale_fill_gradient(low = "#deebf7", high = "#08306b", labels = comma) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Figure 3. Monthly distribution of the 10 most frequent storm event types in 2025. Darker cells indicate higher event counts.
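The heatmap shows clear seasonal concentration for most event types. Thunderstorm wind peaks in June (5,266 events), hail in May (2,835), and flash floods in July (1,570), while heat events fall almost entirely in June through August. Winter storm and winter weather counts are highest from November through February, and high wind peaks in March (1,669) and December (1,061). Drought is the least seasonal, with at least 123 events in every month. This information matters for public managers because seasonal timing affects public communication, staffing, and resource readiness.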
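The monthly table lists only month and event type pairs that occurred, so a pair with no events (for example, Heat in January) has no row and would show as an empty tile. One possible way to make those zeros explicit is tidyr::complete(), sketched here on three Heat rows copied from the table; monthly_toy and monthly_complete are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Three rows copied from the monthly table (Heat only)
monthly_toy <- tibble(
  event_month = factor(c("June", "July", "August"), levels = month.name),
  EVENT_TYPE  = "Heat",
  event_count = c(539, 1399, 894)
)

# complete() expands to every factor level of event_month and
# fills the missing combinations with an explicit zero
monthly_complete <- monthly_toy %>%
  complete(event_month, EVENT_TYPE, fill = list(event_count = 0))

monthly_complete
```

Because event_month is a factor with all twelve month levels, the result has one row per month even though only three months were observed in the toy input.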
For the custom research question, this report examines which storm event types produced the highest estimated economic damage. Economic damage is measured as property damage plus crop damage after converting NOAA damage abbreviations into numeric dollar values.
damage_by_event <- storm_clean %>%
  group_by(EVENT_TYPE) %>%
  summarise(
    event_count = n_distinct(EVENT_ID),
    property_damage = sum(property_damage_value, na.rm = TRUE),
    crop_damage = sum(crop_damage_value, na.rm = TRUE),
    total_damage = sum(total_damage_value, na.rm = TRUE),
    avg_damage_per_event = total_damage / event_count,
    .groups = "drop"
  ) %>%
  arrange(desc(total_damage))

top_damage_events <- damage_by_event %>%
  slice_head(n = 10)

kable(
  top_damage_events,
  caption = "Top 10 event types by estimated economic damage in 2025",
  format.args = list(big.mark = ",")
)
| EVENT_TYPE | event_count | property_damage | crop_damage | total_damage | avg_damage_per_event |
|---|---|---|---|---|---|
| Tornado | 1,591 | 1,906,326,500 | 3,373,000 | 1,909,699,500 | 1,200,313.953 |
| Flash Flood | 5,393 | 1,297,150,550 | 785,000 | 1,297,935,550 | 240,670.415 |
| Wildfire | 350 | 788,932,110 | 193,415,000 | 982,347,110 | 2,806,706.029 |
| Thunderstorm Wind | 21,807 | 210,492,330 | 58,624,250 | 269,116,580 | 12,340.835 |
| Flood | 2,261 | 92,357,950 | 45,000 | 92,402,950 | 40,868.178 |
| Hail | 9,205 | 60,072,500 | 2,300,000 | 62,372,500 | 6,775.937 |
| Debris Flow | 163 | 50,600,200 | 1,000 | 50,601,200 | 310,436.810 |
| Drought | 3,283 | 37,133,250 | 2,370,000 | 39,503,250 | 12,032.668 |
| Lightning | 288 | 22,600,150 | 15,400 | 22,615,550 | 78,526.215 |
| High Wind | 4,603 | 12,162,600 | 49,000 | 12,211,600 | 2,652.965 |
ggplot(top_damage_events,
       aes(x = fct_reorder(EVENT_TYPE, total_damage),
           y = total_damage)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Event Types by Estimated Economic Damage",
    subtitle = "Total damage = property damage + crop damage",
    x = "Event Type",
    y = "Estimated Damage"
  ) +
  scale_y_continuous(labels = dollar) +
  theme_minimal()
Figure 4. Top 10 storm event types by estimated economic damage in 2025. Total damage combines property and crop damage estimates.
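This economic analysis adds another dimension to the public health analysis. Tornadoes rank near the top on both measures, but the rankings otherwise diverge: excessive heat caused the most combined deaths and injuries yet does not appear among the top 10 event types for damage, while wildfires rank eighth for health harm but third for total damage and first among the top 10 for average damage per event (about $2.8 million). Considering both dimensions provides a more complete understanding of severe weather risk.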
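One way to make the comparison between the two dimensions concrete is to rank event types on each measure and look at the gap. The sketch below uses the five event types that appear in both top-10 tables, with values copied from those tables; risk_toy and risk_ranks are hypothetical names, and the ranks are relative to this five-row subset only.

```r
library(dplyr)

# Event types present in both top-10 tables, values copied from them
risk_toy <- tibble(
  EVENT_TYPE        = c("Tornado", "Flash Flood", "Thunderstorm Wind", "Lightning", "Wildfire"),
  total_health_harm = c(321, 229, 182, 119, 106),
  total_damage      = c(1909699500, 1297935550, 269116580, 22615550, 982347110)
)

risk_ranks <- risk_toy %>%
  mutate(
    health_rank = min_rank(desc(total_health_harm)),
    damage_rank = min_rank(desc(total_damage)),
    # Positive gap: the event type ranks higher on damage than on health harm
    rank_gap    = health_rank - damage_rank
  ) %>%
  arrange(health_rank)

risk_ranks
```

Within this subset, Tornado and Flash Flood hold the same rank on both measures, while Wildfire ranks two places higher on damage than on health harm.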
This report analyzed the 2025 NOAA Storm Events data by linking the
details, locations, and fatalities files using EVENT_ID.
The results identify the event types with the greatest public health
impact, the state-event combinations with the highest event counts, the
seasonal timing of major storm event types, and the event types
associated with the greatest estimated economic damage.

The central finding is that severe weather risk is multi-dimensional and
cannot be evaluated on frequency alone. Thunderstorm wind events occur
most often, but excessive heat and tornadoes account for more deaths and
injuries, and economic damage is driven by a different set of event
types, led by tornadoes, flash floods, and wildfires. Emergency managers
should therefore consider frequency, severity, and timing together
rather than relying only on event counts.

The project is reproducible because it begins with the raw CSV files,
documents each transformation, and shows the code used to generate each
table and figure. By combining public health, geographic, seasonal, and
economic perspectives, this report provides a more complete
understanding of storm event risk in the United States. These insights
can support policymakers and emergency management agencies in making
more informed, data-driven decisions.