This project analyzes NOAA storm event data from 2025 to find which weather events are the most dangerous, which occur most often, when they occur, and where the damage is concentrated. Flash floods caused the most direct deaths and injuries (1,132 combined) and the most property damage ($4.6 billion). Thunderstorm winds were the most frequent event type (22,246 events) and peak in June. These results can help cities decide which weather events to prepare for.
I used three NOAA data files: event details, fatalities, and locations. I merged them on the EVENT_ID column. I added direct deaths and direct injuries to get the total health impact. I also converted damage strings such as “200.00M” into numeric dollar values.
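The merge described above keys on EVENT_ID. A join like this repeats a details row whenever the other table has more than one row for that key, which the locations file can have. A minimal sketch with made-up toy tables (`details_toy` and `locations_toy` are hypothetical stand-ins, not the NOAA files) shows how to spot the repetition and one possible way to avoid it:

```r
library(dplyr)

# Toy stand-ins for the details and locations files (made-up values).
details_toy <- data.frame(
  EVENT_ID = c(1, 2),
  EVENT_TYPE = c("Flash Flood", "Hail"),
  DEATHS_DIRECT = c(3, 0)
)
locations_toy <- data.frame(
  EVENT_ID = c(1, 1, 1, 2),
  LOCATION_INDEX = c(1, 2, 3, 1)
)

# Keys that appear more than once in the lookup table.
dup_keys <- locations_toy %>% count(EVENT_ID) %>% filter(n > 1)

# A plain left join repeats the details row once per matching location.
joined <- left_join(details_toy, locations_toy, by = "EVENT_ID")
inflated_deaths <- sum(joined$DEATHS_DIRECT)

# Keeping one location row per event restores one row per event.
first_location <- locations_toy %>% distinct(EVENT_ID, .keep_all = TRUE)
joined_once <- left_join(details_toy, first_location, by = "EVENT_ID")
true_deaths <- sum(joined_once$DEATHS_DIRECT)
```

In this toy case the plain join sums to 9 deaths for an event that had 3. Whether keeping the first location row is the right fix depends on the analysis; skipping the locations join is another option when coordinates are not needed.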
# readr provides read_csv(), dplyr the pipe and joins, lubridate month()
library(readr)
library(dplyr)
library(lubridate)
folder_path <- "C:/Users/25kdi/Downloads/NOAA_Data"
details <- read_csv(file.path(folder_path, "StormEvents_details-ftp_v1.0_d2025_c20260323.csv"))
fatalities <- read_csv(file.path(folder_path, "StormEvents_fatalities-ftp_v1.0_d2025_c20260323.csv"))
locations <- read_csv(file.path(folder_path, "StormEvents_locations-ftp_v1.0_d2025_c20260323.csv"))
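# Count fatality records per event
fatality_counts <- fatalities %>%
  group_by(EVENT_ID) %>%
  summarise(FATALITY_COUNT = n())
# Join locations and fatality counts onto the event details.
# Note: the locations file can hold several rows per EVENT_ID, so this join
# can repeat an event's row and inflate the counts and sums computed later.
merged_data <- details %>%
  left_join(locations, by = "EVENT_ID") %>%
  left_join(fatality_counts, by = "EVENT_ID")
# Events with no fatality records get a count of 0
merged_data$FATALITY_COUNT[is.na(merged_data$FATALITY_COUNT)] <- 0
merged_data$TOTAL_HEALTH_IMPACT <- merged_data$DEATHS_DIRECT + merged_data$INJURIES_DIRECT
# Parse the event start time and extract the month number (1-12)
merged_data$BEGIN_DATE <- as.Date(merged_data$BEGIN_DATE_TIME, format = "%d-%b-%y %H:%M:%S")
merged_data$MONTH <- month(merged_data$BEGIN_DATE)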
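# Convert damage strings such as "200.00M" into dollar amounts
convert_damage <- function(x) {
  # Keep only digits and the decimal point; missing or empty values become 0
  num <- as.numeric(gsub("[^0-9.]", "", x))
  num[is.na(num)] <- 0
  # Pick the multiplier from the suffix letter (K, M, or B)
  multiplier <- ifelse(grepl("K", x), 1000,
                ifelse(grepl("M", x), 1000000,
                ifelse(grepl("B", x), 1000000000, 1)))
  num * multiplier
}
merged_data$DAMAGE_PROPERTY_NUM <- convert_damage(merged_data$DAMAGE_PROPERTY)
merged_data$DAMAGE_CROPS_NUM <- convert_damage(merged_data$DAMAGE_CROPS)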
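As a quick sanity check on the conversion step, the sketch below applies the same logic to a few sample strings. `parse_damage` is a hypothetical copy of the function above, defined here only so the example runs on its own; the input strings are made up for illustration.

```r
# Hypothetical stand-alone copy of the damage conversion logic.
parse_damage <- function(x) {
  # Strip everything except digits and the decimal point.
  num <- as.numeric(gsub("[^0-9.]", "", x))
  # Missing or empty values count as zero damage.
  num[is.na(num)] <- 0
  multiplier <- ifelse(grepl("K", x), 1e3,
                ifelse(grepl("M", x), 1e6,
                ifelse(grepl("B", x), 1e9, 1)))
  num * multiplier
}

# Sample inputs: millions, thousands, billions, no suffix, and a missing value.
samples <- c("200.00M", "1.5K", "2B", "500", NA)
parsed <- parse_damage(samples)
```

With these inputs, `parsed` holds 200000000, 1500, 2000000000, 500, and 0. The suffix match is case sensitive, so a lowercase "k" or "m" would be treated as having no multiplier.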
health_impact <- merged_data %>%
  group_by(EVENT_TYPE) %>%
  summarise(
    deaths = sum(DEATHS_DIRECT, na.rm = TRUE),
    injuries = sum(INJURIES_DIRECT, na.rm = TRUE),
    total = sum(TOTAL_HEALTH_IMPACT, na.rm = TRUE)
  )
health_impact <- health_impact[order(-health_impact$total), ][1:10, ]
print(health_impact)
## # A tibble: 10 × 4
##    EVENT_TYPE        deaths injuries total
##    <chr>              <dbl>    <dbl> <dbl>
##  1 Flash Flood         1075       57  1132
##  2 Tornado               75      348   423
##  3 Excessive Heat        37      326   363
##  4 Thunderstorm Wind     38      144   182
##  5 Lightning             20       95   115
##  6 Heat                  62       51   113
##  7 Wildfire              61       39   100
##  8 Rip Current           38       48    86
##  9 Flood                 39       14    53
## 10 High Surf             15       13    28
library(ggplot2)
ggplot(health_impact, aes(x = reorder(EVENT_TYPE, total), y = total)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Weather Events by Health Impact",
    x = "Event Type",
    y = "Total Number of Deaths and Injuries",
    caption = "Values are the combined total of direct deaths and direct injuries caused by each weather event type."
  )
Flash floods were the most harmful weather event in 2025, with 1,132 total deaths and injuries. Tornadoes were second with 423, followed by excessive heat with 363. Overall, flash floods harmed more people than any other weather event in the data.
event_frequency <- merged_data %>%
  group_by(EVENT_TYPE) %>%
  summarise(count = n())
event_frequency <- event_frequency[order(-event_frequency$count), ][1:10, ]
print(event_frequency)
## # A tibble: 10 × 2
##    EVENT_TYPE        count
##    <chr>             <int>
##  1 Thunderstorm Wind 22246
##  2 Flash Flood       19304
##  3 Hail               9319
##  4 Flood              7493
##  5 High Wind          4603
##  6 Winter Weather     4436
##  7 Drought            3283
##  8 Winter Storm       2951
##  9 Heat               2864
## 10 Tornado            2426
The most common weather event in 2025 was thunderstorm wind, with 22,246 occurrences. Flash floods were second with 19,304, followed by hail with 9,319. Wind- and flood-related events are therefore far more frequent than the other event types in the data.
top_events <- event_frequency$EVENT_TYPE[1:5]
monthly_patterns <- merged_data[merged_data$EVENT_TYPE %in% top_events, ]
monthly_patterns <- monthly_patterns %>%
  group_by(EVENT_TYPE, MONTH) %>%
  summarise(count = n())
print(monthly_patterns)
## # A tibble: 60 × 3
## # Groups:   EVENT_TYPE [5]
##    EVENT_TYPE  MONTH count
##    <chr>       <dbl> <int>
##  1 Flash Flood     1   253
##  2 Flash Flood     2   769
##  3 Flash Flood     3   124
##  4 Flash Flood     4  1924
##  5 Flash Flood     5  2165
##  6 Flash Flood     6  3669
##  7 Flash Flood     7  5888
##  8 Flash Flood     8  2222
##  9 Flash Flood     9  1365
## 10 Flash Flood    10   628
## # ℹ 50 more rows
Flash floods occur throughout the year but peak in the summer, especially in July (5,888 events) and June (3,669 events). Other events, such as thunderstorm wind and hail, also show higher activity during the warmer months. This shows that many of the most common weather events are strongly seasonal.
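The peak month for each event type can also be pulled out directly instead of being read off the printed table. The sketch below is one way to do that with `slice_max()`. The Flash Flood counts for June through August are copied from the output above; the Hail rows are made-up placeholders, and `monthly_toy` is a hypothetical name.

```r
library(dplyr)

# Flash Flood rows come from the printed output; Hail rows are placeholders.
monthly_toy <- data.frame(
  EVENT_TYPE = c(rep("Flash Flood", 3), rep("Hail", 3)),
  MONTH = c(6, 7, 8, 4, 5, 6),
  count = c(3669, 5888, 2222, 10, 30, 20)
)

# Keep the single busiest month per event type and attach its name.
peak_months <- monthly_toy %>%
  group_by(EVENT_TYPE) %>%
  slice_max(count, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  mutate(MONTH_NAME = month.name[MONTH])
```

For the Flash Flood rows this returns month 7 (July) with 5,888 events, matching the reading of the table.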
damage_by_event <- merged_data %>%
  group_by(EVENT_TYPE) %>%
  summarise(
    property = sum(DAMAGE_PROPERTY_NUM, na.rm = TRUE) / 1e6,
    crops = sum(DAMAGE_CROPS_NUM, na.rm = TRUE) / 1e6
  )
damage_by_event$total <- damage_by_event$property + damage_by_event$crops
damage_by_event <- damage_by_event[order(-damage_by_event$total), ][1:10, ]
print(damage_by_event)
## # A tibble: 10 × 4
##    EVENT_TYPE        property    crops  total
##    <chr>                <dbl>    <dbl>  <dbl>
##  1 Flash Flood         4611.    3.14   4614.
##  2 Tornado             3578.    4.04   3582.
##  3 Wildfire             789.  193.      982.
##  4 Thunderstorm Wind    260.   58.7     319.
##  5 Flood                314.    0.175   314.
##  6 Debris Flow          302.    0.001   302.
##  7 Hail                  71.6   2.5      74.1
##  8 Drought               37.1   2.37     39.5
##  9 Lightning             22.6   0.0154   22.6
## 10 High Wind             12.2   0.049    12.2
Flash floods caused the most economic damage in 2025, about $4.6 billion in total, almost all of it property damage. Tornadoes were second at about $3.6 billion, followed by wildfires at about $982 million. Flash floods and tornadoes are therefore the most economically damaging weather events in the data.
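The table reports damage in millions of dollars, while the text quotes billions for the largest totals. A small sketch of that unit conversion follows; `label_millions` is a hypothetical helper, and the three totals are copied from the printed table.

```r
# Totals in millions of dollars, copied from the printed table above.
totals_millions <- c("Flash Flood" = 4614, "Tornado" = 3582, "Wildfire" = 982)

# Hypothetical helper: turn a value in millions of dollars into a text label,
# switching to billions at 1,000 million.
label_millions <- function(m) {
  ifelse(m >= 1000,
         sprintf("$%.1f billion", m / 1000),
         sprintf("$%.0f million", m))
}

labels <- label_millions(totals_millions)
```

This gives "$4.6 billion", "$3.6 billion", and "$982 million" for the three event types.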
damage_by_state <- merged_data %>%
  group_by(STATE) %>%
  summarise(total_damage = sum(DAMAGE_PROPERTY_NUM + DAMAGE_CROPS_NUM, na.rm = TRUE) / 1e6)
damage_by_state <- damage_by_state[order(-damage_by_state$total_damage), ][1:10, ]
print(damage_by_state)
## # A tibble: 10 × 2
##    STATE          total_damage
##    <chr>                 <dbl>
##  1 MISSOURI              3389.
##  2 TEXAS                 1815.
##  3 ILLINOIS              1340.
##  4 WISCONSIN              715.
##  5 WASHINGTON             546.
##  6 NEW MEXICO             318.
##  7 ARIZONA                216.
##  8 COLORADO               183.
##  9 NORTH CAROLINA         174.
## 10 NEBRASKA               143.
Missouri had the highest total storm damage at about $3.39 billion, followed by Texas with $1.82 billion and Illinois with $1.34 billion. Storm damage is not evenly distributed across the US: a few states account for far more of it than the others.
Most harmful event: Flash Flood
Most frequent event: Thunderstorm Wind
Most damaging event: Flash Flood