This report analyzes storm events in the United States using NOAA data. It identifies the weather events most harmful to human health and those with the greatest economic consequences. After transforming the data (e.g., extracting year and computing total damages), we examine event frequency over time, summarize the damage by event type, and visualize tornado magnitudes and the most affected states. Tornadoes were found to be the most dangerous for human health, while floods and hurricanes caused the most financial damage.
library(tidyverse)
library(lubridate)
library(ggplot2)
# Load raw data file
repdata <- read_csv("repdata_data_StormData 2.csv")
# Parse date, extract year, compute total damage
repdata <- repdata %>%
mutate(
BGN_DATE = mdy_hms(BGN_DATE),
YEAR = year(BGN_DATE),
DAMAGE_TOTAL = coalesce(PROPDMG, 0) + coalesce(CROPDMG, 0),
EVENT_TYPE = EVTYPE
)
repdata %>%
group_by(YEAR, EVENT_TYPE) %>%
summarise(count = n(), .groups = "drop") %>%
ggplot(aes(x = YEAR, y = count, color = EVENT_TYPE)) +
geom_line(alpha = 0.6, linewidth = 0.8, show.legend = FALSE) +
labs(title = "Events per year and type",
x = "Year", y = "Number of events",
caption = "Each line represents one event type over time.")
repdata %>%
group_by(EVENT_TYPE) %>%
summarise(total_damage = sum(DAMAGE_TOTAL, na.rm = TRUE)) %>%
slice_max(total_damage, n = 10) %>%
ggplot(aes(x = reorder(EVENT_TYPE, total_damage), y = total_damage / 1e6)) +
geom_col(fill = "steelblue") +
coord_flip() +
labs(title = "Top 10 events by total damage",
x = "Event type", y = "Damage (in millions of USD)",
caption = "Combined property and crop damage in USD.")
repdata %>%
filter(EVENT_TYPE == "TORNADO", !is.na(MAG), MAG >= 0, MAG <= 10) %>%
ggplot(aes(x = MAG)) +
geom_histogram(binwidth = 1, fill = "tomato", color = "white") +
labs(title = "Distribution of tornado magnitude (filtered)",
x = "Magnitude", y = "Frequency",
caption = "MAG values filtered to range 0–10")
repdata %>%
group_by(STATE) %>%
summarise(
events = n(),
damage = sum(DAMAGE_TOTAL, na.rm = TRUE)
) %>%
slice_max(events, n = 10) %>%
ggplot(aes(x = reorder(STATE, events), y = events)) +
geom_col(fill = "darkgreen") +
coord_flip() +
labs(title = "Top 10 states by number of incidents",
x = "State", y = "Number of events",
caption = "States with the highest reported storm event counts.")
Tornadoes are the most frequent and harmful events in terms of human health impacts, showing high counts and strong magnitudes. In contrast, economic losses are largely driven by flooding and hurricanes. The southern and midwestern states report the most incidents. These findings can inform mitigation strategies and resource allocation for disaster preparedness.