Severe Weather Event Impacts Across the United States in 2025

Synopsis

This analysis examines severe weather events across the United States in 2025 using the raw Storm Events CSV file. The main question is which event types caused the most harm to population health, measured by deaths and injuries. Because the file contains joined data, some events appear more than once, so the analysis keeps one row per EVENT_ID before summarizing event-level results. Excessive Heat, Tornado, Flash Flood, Heat, and Thunderstorm Wind had the largest combined counts of deaths and injuries. Among the event types whose totals are tabulated below, Thunderstorm Wind, Hail, Flash Flood, High Wind, and Winter Weather were the most frequently reported. Several event types were strongly seasonal, such as Heat in July and August and winter events in December, January, and February. The analysis also tabulates the most frequent combinations of state and event type. As an additional question, the report examines which event types caused the greatest property and crop damage.

Data Processing

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.1     ✔ readr     2.1.6
## ✔ forcats   1.0.1     ✔ stringr   1.5.2
## ✔ ggplot2   4.0.3     ✔ tibble    3.3.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lubridate)
library(scales)
## 
## Attaching package: 'scales'
## 
## The following object is masked from 'package:purrr':
## 
##     discard
## 
## The following object is masked from 'package:readr':
## 
##     col_factor
storm <- read_csv("StormEvents_joined_data.csv")
## Rows: 94364 Columns: 70
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (31): STATE, MONTH_NAME, EVENT_TYPE, CZ_TYPE, CZ_NAME, WFO, BEGIN_DATE_T...
## dbl (38): BEGIN_YEARMONTH, BEGIN_DAY, BEGIN_TIME, END_YEARMONTH, END_DAY, EN...
## lgl  (1): CATEGORY
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Check structure
glimpse(storm)
## Rows: 94,364
## Columns: 70
## $ BEGIN_YEARMONTH    <dbl> 202503, 202503, 202501, 202501, 202501, 202501, 202…
## $ BEGIN_DAY          <dbl> 31, 30, 5, 3, 3, 3, 3, 3, 3, 3, 19, 13, 13, 13, 13,…
## $ BEGIN_TIME         <dbl> 1104, 1552, 1800, 1300, 1300, 1300, 1547, 1527, 130…
## $ END_YEARMONTH      <dbl> 202503, 202503, 202501, 202501, 202501, 202501, 202…
## $ END_DAY            <dbl> 31, 30, 6, 3, 3, 3, 3, 3, 3, 3, 19, 13, 13, 13, 13,…
## $ END_TIME           <dbl> 1106, 1555, 2227, 1900, 1900, 1900, 1619, 1619, 190…
## $ EPISODE_ID.x       <dbl> 201366, 200337, 197733, 197761, 197761, 197761, 197…
## $ EVENT_ID           <dbl> 1252415, 1241136, 1222851, 1223112, 1223113, 122311…
## $ STATE              <chr> "GEORGIA", "MICHIGAN", "VIRGINIA", "MARYLAND", "MAR…
## $ STATE_FIPS         <dbl> 13, 26, 51, 24, 24, 24, 24, 51, 24, 24, 27, 27, 27,…
## $ YEAR               <dbl> 2025, 2025, 2025, 2025, 2025, 2025, 2025, 2025, 202…
## $ MONTH_NAME         <chr> "March", "March", "January", "January", "January", …
## $ EVENT_TYPE         <chr> "Thunderstorm Wind", "Tornado", "Winter Storm", "Wi…
## $ CZ_TYPE            <chr> "C", "C", "Z", "Z", "Z", "Z", "Z", "Z", "Z", "Z", "…
## $ CZ_FIPS            <dbl> 45, 27, 56, 506, 504, 503, 14, 53, 5, 505, 89, 71, …
## $ CZ_NAME            <chr> "CARROLL", "CASS", "SPOTSYLVANIA", "CENTRAL AND SOU…
## $ WFO                <chr> "FFC", "IWX", "LWX", "LWX", "LWX", "LWX", "LWX", "L…
## $ BEGIN_DATE_TIME    <chr> "3/31/2025 11:04", "3/30/2025 15:52", "1/5/2025 18:…
## $ CZ_TIMEZONE        <chr> "EST-5", "EST-5", "EST-5", "EST-5", "EST-5", "EST-5…
## $ END_DATE_TIME      <chr> "3/31/2025 11:06", "3/30/2025 15:55", "1/6/2025 22:…
## $ INJURIES_DIRECT    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ INJURIES_INDIRECT  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ DEATHS_DIRECT      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ DEATHS_INDIRECT    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ DAMAGE_PROPERTY    <chr> "1.00K", "100.00K", NA, NA, NA, NA, "0.00K", NA, NA…
## $ DAMAGE_CROPS       <chr> NA, "0.00K", NA, NA, NA, NA, "0.00K", NA, NA, NA, "…
## $ SOURCE             <chr> "Emergency Manager", "NWS Storm Survey", "Trained S…
## $ MAGNITUDE          <dbl> 52.0, NA, NA, NA, NA, NA, NA, NA, NA, NA, 38.0, NA,…
## $ MAGNITUDE_TYPE     <chr> "EG", NA, NA, NA, NA, NA, NA, NA, NA, NA, "MS", NA,…
## $ FLOOD_CAUSE        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ CATEGORY           <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ TOR_F_SCALE        <chr> NA, "EF1", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ TOR_LENGTH         <dbl> NA, 2.59, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ TOR_WIDTH          <dbl> NA, 100, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ TOR_OTHER_WFO      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ TOR_OTHER_CZ_STATE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ TOR_OTHER_CZ_FIPS  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ TOR_OTHER_CZ_NAME  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ BEGIN_RANGE        <dbl> 2.22, 1.24, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ BEGIN_AZIMUTH      <chr> "W", "SW", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ BEGIN_LOCATION     <chr> "TYUS", "EDWARDSBURG", NA, NA, NA, NA, NA, NA, NA, …
## $ END_RANGE          <dbl> 2.22, 1.47, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ END_AZIMUTH        <chr> "W", "NNE", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ END_LOCATION       <chr> "TYUS", "EDWARDSBURG", NA, NA, NA, NA, NA, NA, NA, …
## $ BEGIN_LAT          <dbl> 33.4757, 41.7900, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ BEGIN_LON          <dbl> -85.238, -86.100, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ END_LAT            <dbl> 33.4757, 41.8200, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ END_LON            <dbl> -85.238, -86.070, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ EPISODE_NARRATIVE  <chr> "A cold-front initiated a line of thunderstorms acr…
## $ EVENT_NARRATIVE    <chr> "Tree down at the intersection of highway 5 and old…
## $ DATA_SOURCE        <chr> "CSV", "CSV", "CSV", "CSV", "CSV", "CSV", "CSV", "C…
## $ YEARMONTH          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ EPISODE_ID.y       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ LOCATION_INDEX     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ RANGE              <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ AZIMUTH            <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ LOCATION           <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ LATITUDE           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ LONGITUDE          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ LAT2               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ LON2               <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FAT_YEARMONTH      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FAT_DAY            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FAT_TIME           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FATALITY_ID        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FATALITY_TYPE      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FATALITY_DATE      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FATALITY_AGE       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FATALITY_SEX       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ FATALITY_LOCATION  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
# Because this is joined data, some EVENT_ID values appear more than once.
# To avoid double-counting deaths, injuries, and damages, keep one row per event.
storm_events <- storm %>%
  distinct(EVENT_ID, .keep_all = TRUE)
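
The effect of keeping one row per EVENT_ID can be checked on a small example. The sketch below uses made-up rows, and it assumes the repeats arise because event-level columns are copied onto every joined fatality or location row; the names `joined`, `dup_ids`, `deaths_naive`, and `deaths_dedup` are illustrative and not part of the original analysis.

```r
library(dplyr)
library(tibble)

# Hypothetical joined data: event 2 matches two fatality rows, so its
# event-level columns (including DEATHS_DIRECT) are repeated twice.
joined <- tibble(
  EVENT_ID      = c(1, 2, 2, 3),
  EVENT_TYPE    = c("Hail", "Flash Flood", "Flash Flood", "Tornado"),
  DEATHS_DIRECT = c(0, 2, 2, 1),
  FATALITY_ID   = c(NA, 101, 102, 103)
)

# EVENT_IDs that occur on more than one row
dup_ids <- joined %>%
  count(EVENT_ID, name = "rows") %>%
  filter(rows > 1)

# Summing the joined rows counts event 2's deaths twice
deaths_naive <- sum(joined$DEATHS_DIRECT)

# Keeping the first row per EVENT_ID counts each event once
events <- joined %>% distinct(EVENT_ID, .keep_all = TRUE)
deaths_dedup <- sum(events$DEATHS_DIRECT)
```

Note that `distinct(..., .keep_all = TRUE)` keeps the first row it meets for each EVENT_ID, so this approach is only safe for columns that are identical across an event's repeated rows.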

# Convert health impact columns to numeric and replace missing values with 0
storm_events <- storm_events %>%
  mutate(
    injuries_direct = as.numeric(INJURIES_DIRECT),
    injuries_indirect = as.numeric(INJURIES_INDIRECT),
    deaths_direct = as.numeric(DEATHS_DIRECT),
    deaths_indirect = as.numeric(DEATHS_INDIRECT),
    injuries_direct = replace_na(injuries_direct, 0),
    injuries_indirect = replace_na(injuries_indirect, 0),
    deaths_direct = replace_na(deaths_direct, 0),
    deaths_indirect = replace_na(deaths_indirect, 0),
    total_injuries = injuries_direct + injuries_indirect,
    total_deaths = deaths_direct + deaths_indirect,
    total_health_impact = total_injuries + total_deaths
  )

# Function to convert damage values such as 1.00K, 2.50M, or 1.00B
convert_damage <- function(x) {
  x <- toupper(as.character(x))
  number <- as.numeric(str_extract(x, "[0-9.]+"))
  multiplier <- case_when(
    str_detect(x, "K") ~ 1000,
    str_detect(x, "M") ~ 1000000,
    str_detect(x, "B") ~ 1000000000,
    TRUE ~ 1
  )
  replace_na(number * multiplier, 0)
}

storm_events <- storm_events %>%
  mutate(
    property_damage_num = convert_damage(DAMAGE_PROPERTY),
    crop_damage_num = convert_damage(DAMAGE_CROPS),
    total_damage = property_damage_num + crop_damage_num
  )
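
To make the damage conversion concrete, here is a sketch of a slightly stricter variant applied to a few sample strings. The function name `parse_damage` is hypothetical and is not the function used above; it assumes the unit letter, when present, is the last character of the string, whereas `convert_damage()` matches the letter anywhere in the value.

```r
library(stringr)

# Hypothetical variant of convert_damage(): the multiplier is read only
# from a trailing K, M, or B.
parse_damage <- function(x) {
  x <- str_to_upper(str_trim(as.character(x)))
  number <- as.numeric(str_extract(x, "^[0-9.]+"))
  suffix <- str_extract(x, "[KMB]$")
  # Look up the multiplier by suffix; a missing suffix yields NA here
  multiplier <- c(K = 1e3, M = 1e6, B = 1e9)[suffix]
  multiplier[is.na(multiplier)] <- 1
  out <- number * multiplier
  # As in the analysis above, missing damage values are treated as 0
  out[is.na(out)] <- 0
  unname(out)
}

sample_values <- c("1.00K", "2.50M", "1.00B", NA, "0.00K", "500")
parsed <- parse_damage(sample_values)
```

Treating a missing value as 0 keeps the sums defined, but it means an unreported damage figure and a reported zero are indistinguishable in the totals.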

Results

1. Event types most harmful to population health

health_summary <- storm_events %>%
  group_by(EVENT_TYPE) %>%
  summarise(
    total_deaths = sum(total_deaths),
    total_injuries = sum(total_injuries),
    total_health_impact = sum(total_health_impact),
    number_of_events = n(),
    .groups = "drop"
  ) %>%
  arrange(desc(total_health_impact))

health_summary %>%
  slice_head(n = 10)
## # A tibble: 10 × 5
##    EVENT_TYPE   total_deaths total_injuries total_health_impact number_of_events
##    <chr>               <dbl>          <dbl>               <dbl>            <int>
##  1 Excessive H…           90            326                 416             1439
##  2 Tornado                64            257                 321             1591
##  3 Flash Flood           209             20                 229             5393
##  4 Heat                  163             51                 214             2864
##  5 Thunderstor…           41            141                 182            21807
##  6 Winter Weat…           31            123                 154             4436
##  7 Lightning              21             98                 119              288
##  8 Wildfire               63             43                 106              350
##  9 Dust Storm             17             78                  95              320
## 10 Rip Current            39             49                  88               72
health_summary %>%
  slice_head(n = 10) %>%
  ggplot(aes(x = reorder(EVENT_TYPE, total_health_impact),
             y = total_health_impact)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Weather Event Types by Population Health Impact",
    x = "Event Type",
    y = "Deaths + Injuries"
  )

Interpretation:

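Excessive Heat caused the greatest combined number of deaths and injuries (416), followed by Tornado (321) and Flash Flood (229). The ranking changes when deaths are considered alone: Flash Flood (209 deaths) and Heat (163 deaths) were the deadliest event types, while the Excessive Heat and Tornado totals were driven mainly by injuries. For a government or municipal manager, this means that both dramatic events, such as tornadoes and flash floods, and less visually dramatic events, such as heat, should be taken seriously in emergency planning.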
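
The combined total is only one way to rank harm. As a minimal sketch, the rows of the top-10 table above can be re-ranked by deaths alone and by harm per reported event. The values are copied from that table, with truncated event type names written out in full; the names `health_top10`, `by_deaths`, and `per_event` are illustrative.

```r
library(dplyr)
library(tibble)

# Values copied from the top-10 health table above
health_top10 <- tribble(
  ~EVENT_TYPE,         ~total_deaths, ~total_injuries, ~number_of_events,
  "Excessive Heat",      90,           326,              1439,
  "Tornado",             64,           257,              1591,
  "Flash Flood",        209,            20,              5393,
  "Heat",               163,            51,              2864,
  "Thunderstorm Wind",   41,           141,             21807,
  "Winter Weather",      31,           123,              4436,
  "Lightning",           21,            98,               288,
  "Wildfire",            63,            43,               350,
  "Dust Storm",          17,            78,               320,
  "Rip Current",         39,            49,                72
)

# Rank by deaths only
by_deaths <- health_top10 %>% arrange(desc(total_deaths))

# Rank by deaths plus injuries per reported event
per_event <- health_top10 %>%
  mutate(impact_per_event = (total_deaths + total_injuries) / number_of_events) %>%
  arrange(desc(impact_per_event))
```

Within these ten rows, Flash Flood leads on deaths, and Rip Current leads on harm per event (88 deaths and injuries across 72 events), even though it is last by combined total.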

2. Event types most common by state

state_event_summary <- storm_events %>%
  group_by(STATE, EVENT_TYPE) %>%
  summarise(number_of_events = n(), .groups = "drop") %>%
  arrange(desc(number_of_events))

state_event_summary %>%
  slice_head(n = 20)
## # A tibble: 20 × 3
##    STATE          EVENT_TYPE               number_of_events
##    <chr>          <chr>                               <int>
##  1 ALABAMA        Thunderstorm Wind                    1532
##  2 TEXAS          Hail                                 1453
##  3 TEXAS          Thunderstorm Wind                    1205
##  4 VIRGINIA       Thunderstorm Wind                    1096
##  5 GEORGIA        Thunderstorm Wind                    1025
##  6 PENNSYLVANIA   Thunderstorm Wind                    1010
##  7 ILLINOIS       Thunderstorm Wind                     978
##  8 MISSOURI       Thunderstorm Wind                     886
##  9 OKLAHOMA       Hail                                  836
## 10 KANSAS         Thunderstorm Wind                     832
## 11 SOUTH DAKOTA   Thunderstorm Wind                     749
## 12 NORTH CAROLINA Thunderstorm Wind                     734
## 13 ATLANTIC NORTH Marine Thunderstorm Wind              724
## 14 OHIO           Thunderstorm Wind                     717
## 15 INDIANA        Thunderstorm Wind                     714
## 16 MISSISSIPPI    Thunderstorm Wind                     698
## 17 WEST VIRGINIA  Thunderstorm Wind                     681
## 18 KENTUCKY       Thunderstorm Wind                     677
## 19 TENNESSEE      Thunderstorm Wind                     658
## 20 NEW YORK       Thunderstorm Wind                     635
state_event_summary %>%
  slice_head(n = 15) %>%
  ggplot(aes(x = reorder(paste(STATE, EVENT_TYPE, sep = " - "), number_of_events),
             y = number_of_events)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Most Frequent State and Event Type Combinations",
    x = "State and Event Type",
    y = "Number of Events"
  )

Interpretation:

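Alabama had the single highest count, with 1,532 Thunderstorm Wind events. Texas appears twice in the top three, for Hail (1,453) and Thunderstorm Wind (1,205), and Virginia ranked fourth with 1,096 Thunderstorm Wind events. Thunderstorm Wind accounts for 17 of the 20 most frequent combinations; the exceptions are Hail in Texas and Oklahoma and Marine Thunderstorm Wind in ATLANTIC NORTH, a marine area that the STATE column lists alongside the states. Because one event type dominates the ranking, this table says more about where Thunderstorm Wind reports are concentrated than about each state's overall severe weather profile.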
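
Ranking all state and event type combinations together lets one frequent event type crowd out the rest. A minimal sketch of the two per-group views, using five rows copied from the table above, is shown below; the names `combos`, `top_state_by_type`, and `top_type_by_state` are illustrative and not part of the original analysis.

```r
library(dplyr)
library(tibble)

# Five rows copied from the state and event type table above
combos <- tribble(
  ~STATE,     ~EVENT_TYPE,         ~number_of_events,
  "ALABAMA",  "Thunderstorm Wind", 1532,
  "TEXAS",    "Hail",              1453,
  "TEXAS",    "Thunderstorm Wind", 1205,
  "VIRGINIA", "Thunderstorm Wind", 1096,
  "OKLAHOMA", "Hail",               836
)

# State with the most events for each event type
top_state_by_type <- combos %>%
  group_by(EVENT_TYPE) %>%
  slice_max(number_of_events, n = 1, with_ties = FALSE) %>%
  ungroup()

# Most frequent event type within each state
top_type_by_state <- combos %>%
  group_by(STATE) %>%
  slice_max(number_of_events, n = 1, with_ties = FALSE) %>%
  ungroup()
```

Applied to the full `state_event_summary`, the same two pipelines would give one row per event type and one row per state, respectively.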

3. Event types by month

monthly_event_summary <- storm_events %>%
  group_by(MONTH_NAME, EVENT_TYPE) %>%
  summarise(number_of_events = n(), .groups = "drop") %>%
  arrange(desc(number_of_events))

monthly_event_summary %>%
  slice_head(n = 20)
## # A tibble: 20 × 3
##    MONTH_NAME EVENT_TYPE        number_of_events
##    <chr>      <chr>                        <int>
##  1 June       Thunderstorm Wind             5266
##  2 May        Thunderstorm Wind             3895
##  3 July       Thunderstorm Wind             3793
##  4 May        Hail                          2835
##  5 April      Thunderstorm Wind             2766
##  6 March      Thunderstorm Wind             2390
##  7 April      Hail                          1820
##  8 March      High Wind                     1669
##  9 August     Thunderstorm Wind             1629
## 10 July       Flash Flood                   1570
## 11 July       Heat                          1399
## 12 June       Hail                          1384
## 13 February   Winter Weather                1369
## 14 December   Winter Weather                1274
## 15 January    Winter Storm                  1120
## 16 March      Hail                          1084
## 17 December   High Wind                     1061
## 18 January    Winter Weather                1006
## 19 June       Flash Flood                    919
## 20 August     Heat                           894
top_month_events <- storm_events %>%
  count(EVENT_TYPE, sort = TRUE) %>%
  slice_head(n = 8) %>%
  pull(EVENT_TYPE)

storm_events %>%
  filter(EVENT_TYPE %in% top_month_events) %>%
  mutate(
    MONTH_NAME = factor(
      MONTH_NAME,
      levels = month.name
    )
  ) %>%
  count(MONTH_NAME, EVENT_TYPE) %>%
  ggplot(aes(x = MONTH_NAME, y = n, group = EVENT_TYPE)) +
  geom_line() +
  facet_wrap(~ EVENT_TYPE, scales = "free_y") +
  labs(
    title = "Seasonal Patterns for Common Weather Event Types",
    x = "Month",
    y = "Number of Events"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Interpretation:

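The data show clear seasonal patterns. Heat events were concentrated in summer: July (1,399) and August (894) together account for 2,293 of the 2,864 Heat events, or about 80%. Winter Weather and Winter Storm events were concentrated in December, January, and February. Thunderstorm Wind peaked in June (5,266 events) and Hail in May (2,835). Tornado, with 1,591 events, is not among the eight most frequent event types, so it does not appear in the plot above, and no Tornado month reaches the top 20 in the table. The expected spring and early-summer tornado peak in April, May, and June would need a separate monthly count to confirm.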
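
The faceted plot draws only the eight most frequent event types, and at least eight types in the tables above have more events than Tornado (1,591), so Tornado is not drawn. A monthly count for one such type can be sketched as follows. The rows here are made up purely to show the mechanics, and `toy_events` and `tornado_by_month` are illustrative names.

```r
library(dplyr)
library(tibble)

# Hypothetical rows standing in for storm_events; the counts are made up
toy_events <- tibble(
  EVENT_TYPE = c("Tornado", "Tornado", "Tornado", "Hail"),
  MONTH_NAME = c("April", "May", "May", "June")
)

tornado_by_month <- toy_events %>%
  filter(EVENT_TYPE == "Tornado") %>%
  # Calendar order instead of alphabetical order
  mutate(MONTH_NAME = factor(MONTH_NAME, levels = month.name)) %>%
  # .drop = FALSE keeps months with zero events as rows with n = 0
  count(MONTH_NAME, .drop = FALSE) %>%
  mutate(share = n / sum(n))
```

Keeping the zero-count months makes the result safe to plot as a line across all twelve months, and the `share` column gives the fraction of the year's events that fall in each month.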

4. Which event types caused the most economic damage?

damage_summary <- storm_events %>%
  group_by(EVENT_TYPE) %>%
  summarise(
    total_damage = sum(total_damage),
    property_damage = sum(property_damage_num),
    crop_damage = sum(crop_damage_num),
    number_of_events = n(),
    .groups = "drop"
  ) %>%
  arrange(desc(total_damage))

damage_summary %>%
  slice_head(n = 10)
## # A tibble: 10 × 5
##    EVENT_TYPE        total_damage property_damage crop_damage number_of_events
##    <chr>                    <dbl>           <dbl>       <dbl>            <int>
##  1 Tornado             1909699500      1906326500     3373000             1591
##  2 Flash Flood         1297935550      1297150550      785000             5393
##  3 Wildfire             982347110       788932110   193415000              350
##  4 Thunderstorm Wind    269116580       210492330    58624250            21807
##  5 Flood                 92402950        92357950       45000             2261
##  6 Hail                  62372500        60072500     2300000             9205
##  7 Debris Flow           50601200        50600200        1000              163
##  8 Drought               39503250        37133250     2370000             3283
##  9 Lightning             22615550        22600150       15400              288
## 10 High Wind             12211600        12162600       49000             4603
damage_summary %>%
  slice_head(n = 10) %>%
  ggplot(aes(x = reorder(EVENT_TYPE, total_damage),
             y = total_damage)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(labels = dollar) +
  labs(
    title = "Top 10 Weather Event Types by Property and Crop Damage",
    x = "Event Type",
    y = "Total Damage"
  )

Interpretation:

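Tornadoes caused the greatest reported economic damage in this dataset (about $1.91 billion), followed by Flash Floods (about $1.30 billion) and Wildfires (about $0.98 billion). Wildfire also had by far the largest crop damage, at about $193 million. This matters because the event types most harmful to health are not always the ones that cause the most property and crop damage: Excessive Heat and Heat rank first and fourth for health impact but do not appear in the damage top ten.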
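
The overlap between the two rankings can be made explicit with set operations. The sketch below copies the event type names from the two top-10 tables above, with truncated names written out in full; the vector names are illustrative.

```r
# Event type names copied from the health and damage top-10 tables above
health_types <- c("Excessive Heat", "Tornado", "Flash Flood", "Heat",
                  "Thunderstorm Wind", "Winter Weather", "Lightning",
                  "Wildfire", "Dust Storm", "Rip Current")
damage_types <- c("Tornado", "Flash Flood", "Wildfire", "Thunderstorm Wind",
                  "Flood", "Hail", "Debris Flow", "Drought",
                  "Lightning", "High Wind")

# Event types in both top-10 lists
in_both <- intersect(health_types, damage_types)

# Event types that rank highly on only one measure
health_only <- setdiff(health_types, damage_types)
damage_only <- setdiff(damage_types, health_types)
```

Five event types appear in both lists (Tornado, Flash Flood, Thunderstorm Wind, Lightning, and Wildfire), while the two heat categories, Winter Weather, Dust Storm, and Rip Current rank highly only for health impact.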

Conclusion

The analysis shows that severe weather risk should be understood in multiple ways. Excessive Heat, Tornadoes, Flash Floods, Heat, and Thunderstorm Wind were among the most harmful event types for population health. However, Tornadoes, Flash Floods, and Wildfires caused the greatest economic damage. The results also show that severe weather events are strongly seasonal and geographically concentrated. For example, heat-related events mostly occurred in summer months, while winter events occurred mainly in January, February, and December. Alabama, Texas, Virginia, Georgia, and Pennsylvania had the highest state-level event counts, mostly for Thunderstorm Wind and, in Texas, also for Hail. Overall, the data suggest that emergency planning should consider both human health impacts and economic impacts, while also accounting for the season and the specific risks faced by each state.