This report analyzes the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to identify which types of severe weather events are most harmful to population health and which have the greatest economic consequences across the United States. The database covers weather events from 1950 to November 2011. After loading and cleaning the raw data, we aggregated fatalities and injuries by event type to assess public health impact, and computed total property and crop damage (scaled by the recorded magnitude exponents) to assess economic impact. Tornadoes were by far the most harmful event type for population health, with the highest combined fatalities and injuries. Floods caused the greatest total property and crop losses, followed by hurricanes/typhoons and tornadoes. These findings can help government and municipal managers prioritize resource allocation and emergency preparedness planning for severe weather events.
library(ggplot2)
library(dplyr)
library(tidyr)
library(scales)
The data is distributed on the course website as a bzip2-compressed CSV file. The code below reads a local copy named StormData.csv into R.
storm_data <- read.csv('StormData.csv')
dim(storm_data)
## [1] 902297 37
str(storm_data[, c("EVTYPE", "FATALITIES", "INJURIES",
                   "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")])
## 'data.frame': 902297 obs. of 7 variables:
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
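Since the source file is bzip2-compressed, it may help to note that R can read such a file through a bzfile() connection without a separate decompression step. The sketch below is a minimal, self-contained illustration on a temporary toy file; the toy rows and the temporary file path are assumptions for illustration and are not part of the storm data.

```r
# Toy stand-in for the compressed storm file (hypothetical contents).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(1, 0, 0),
  stringsAsFactors = FALSE
)

# Write the toy data through a bzip2 connection to a temporary .bz2 file.
bz_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(bz_path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Read the compressed file back through a bzfile() connection;
# read.csv() opens and closes the connection itself.
toy_back <- read.csv(bzfile(bz_path), stringsAsFactors = FALSE)
print(dim(toy_back))
```

The same pattern should apply to the full compressed storm file, assuming it is saved locally under a known path.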
1. Standardize event type names by converting to uppercase and trimming whitespace to reduce duplicates caused by inconsistent formatting.
storm_data$EVTYPE <- trimws(toupper(storm_data$EVTYPE))
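2. Convert damage exponent columns (PROPDMGEXP, CROPDMGEXP) to numeric multipliers. The raw data uses characters like H (hundreds), K (thousands), M (millions), and B (billions) to represent magnitudes; digit codes 0-9 are treated as powers of ten, and blank or unrecognized codes as a multiplier of 1.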
exp_to_numeric <- function(exp) {
  exp <- toupper(trimws(exp))
  # Parse digit codes once. Non-digit codes become NA here, so the coercion
  # warning is suppressed; those codes are handled by the other branches.
  digit <- suppressWarnings(as.numeric(exp))
  case_when(
    exp == "K" ~ 1e3,
    exp == "M" ~ 1e6,
    exp == "B" ~ 1e9,
    exp == "H" ~ 1e2,
    exp %in% as.character(0:9) ~ 10^digit,
    TRUE ~ 1  # blank or unrecognized codes leave the value unscaled
  )
}
storm_data <- storm_data %>%
  mutate(
    PROP_MULT = exp_to_numeric(PROPDMGEXP),
    CROP_MULT = exp_to_numeric(CROPDMGEXP),
    PROP_DAMAGE = PROPDMG * PROP_MULT,
    CROP_DAMAGE = CROPDMG * CROP_MULT,
    TOTAL_DAMAGE = PROP_DAMAGE + CROP_DAMAGE
  )
3. Subset to relevant columns for efficiency.
storm_clean <- storm_data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROP_DAMAGE, CROP_DAMAGE, TOTAL_DAMAGE)
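To make the cleaning steps above concrete, here is a minimal sketch applied to a few made-up rows. The toy values and the lookup vector (mult_lookup, a hypothetical name) are assumptions for illustration. The lookup is a base R stand-in that mirrors the H/K/M/B mapping described above; it is not the function used in this report, and it omits digit codes for brevity.

```r
# Made-up rows: inconsistent event names and mixed exponent codes.
toy <- data.frame(
  EVTYPE     = c(" Tornado", "tornado ", "FLOOD", "Flood"),
  PROPDMG    = c(2.5, 25, 1.2, 3),
  PROPDMGEXP = c("K", "k", "M", ""),
  stringsAsFactors = FALSE
)

# Step 1: uppercase and trim so formatting variants collapse to one label.
toy$EVTYPE <- trimws(toupper(toy$EVTYPE))

# Step 2: map exponent codes to multipliers via a named lookup vector.
mult_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
code <- toupper(trimws(toy$PROPDMGEXP))
mult <- unname(mult_lookup[code])
mult[is.na(mult)] <- 1  # blank or unrecognized codes leave the value unscaled

toy$PROP_DAMAGE <- toy$PROPDMG * mult
print(toy)
```

In this toy data, 2.5 with code K becomes 2,500 dollars and 1.2 with code M becomes 1,200,000 dollars, while the blank code leaves the value 3 unchanged. The four raw event labels collapse to two: TORNADO and FLOOD.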
Health impact: Sum fatalities and injuries per event type, then compute total harm.
health_impact <- storm_clean %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Fatalities = sum(FATALITIES, na.rm = TRUE),
    Total_Injuries = sum(INJURIES, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(Total_Harm = Total_Fatalities + Total_Injuries) %>%
  arrange(desc(Total_Harm))
top_health <- head(health_impact, 10)
Economic impact: Sum property, crop, and total damage per event type.
econ_impact <- storm_clean %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Property = sum(PROP_DAMAGE, na.rm = TRUE),
    Total_Crop = sum(CROP_DAMAGE, na.rm = TRUE),
    Total_Damage = sum(TOTAL_DAMAGE, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(Total_Damage))
top_econ <- head(econ_impact, 10)
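The same group-and-sum pattern underlies both rankings above. As a minimal sketch of how it behaves, the example below applies it to five made-up records; the toy values and the names toy and toy_rank are hypothetical and do not come from the storm database.

```r
library(dplyr)

# Made-up event records (hypothetical values, not from the storm database).
toy <- tibble(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HEAT", "FLOOD"),
  FATALITIES = c(2, 0, 1, 3, 0),
  INJURIES   = c(10, 5, 0, 1, 2)
)

# Collapse to one row per event type, then rank by combined harm.
toy_rank <- toy %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Fatalities = sum(FATALITIES, na.rm = TRUE),
    Total_Injuries   = sum(INJURIES, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(Total_Harm = Total_Fatalities + Total_Injuries) %>%
  arrange(desc(Total_Harm))

print(toy_rank)
```

In this toy data, TORNADO ranks first with a combined total of 17, followed by HEAT (4) and FLOOD (3). An event with few fatalities can still rank highest when its injuries dominate the sum.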
The figure below shows the top 10 weather event types ranked by total health impact (fatalities + injuries combined), with bars stacked to show the breakdown between fatalities and injuries.
top_health_long <- top_health %>%
  pivot_longer(cols = c(Total_Fatalities, Total_Injuries),
               names_to = "Type", values_to = "Count") %>%
  mutate(
    EVTYPE = factor(EVTYPE, levels = rev(top_health$EVTYPE)),
    Type = recode(Type,
                  Total_Fatalities = "Fatalities",
                  Total_Injuries = "Injuries")
  )
ggplot(top_health_long, aes(x = EVTYPE, y = Count, fill = Type)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  scale_fill_manual(values = c("Fatalities" = "#c0392b", "Injuries" = "#e67e22")) +
  labs(
    title = "Top 10 Weather Events by Population Health Impact (1950–2011)",
    subtitle = "Combined fatalities and injuries across the United States",
    x = "Event Type",
    y = "Total Casualties",
    fill = "Casualty Type"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold"),
    legend.position = "bottom"
  )
Figure 1: Top 10 weather event types by total population health impact (fatalities + injuries), 1950–2011. Tornadoes dominate all other event types by a wide margin.
Finding: Tornadoes are overwhelmingly the most harmful event type for population health, causing 96,979 combined fatalities and injuries (5,633 fatalities and 91,346 injuries), more than eleven times the total for any other event type. Excessive heat and thunderstorm winds rank second and third.
The figure below shows the top 10 weather event types ranked by total economic damage (property + crop damage combined), with bars stacked to show the breakdown.
top_econ_long <- top_econ %>%
  pivot_longer(cols = c(Total_Property, Total_Crop),
               names_to = "Type", values_to = "Damage") %>%
  mutate(
    EVTYPE = factor(EVTYPE, levels = rev(top_econ$EVTYPE)),
    Type = recode(Type,
                  Total_Property = "Property Damage",
                  Total_Crop = "Crop Damage")
  )
ggplot(top_econ_long, aes(x = EVTYPE, y = Damage / 1e9, fill = Type)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_y_continuous(labels = dollar_format(suffix = "B")) +
  scale_fill_manual(values = c("Property Damage" = "#2980b9", "Crop Damage" = "#27ae60")) +
  labs(
    title = "Top 10 Weather Events by Economic Damage (1950–2011)",
    subtitle = "Combined property and crop damage across the United States",
    x = "Event Type",
    y = "Total Damage (USD Billions)",
    fill = "Damage Type"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold"),
    legend.position = "bottom"
  )
Figure 2: Top 10 weather event types by total economic damage (property + crop damage in USD), 1950–2011. Floods cause the greatest total economic losses.
Finding: Floods cause the greatest total economic damage, with over $150 billion in combined property and crop losses. Hurricanes/typhoons and tornadoes rank second and third, followed by storm surges. Drought is notable for its disproportionately high crop damage relative to property damage.
cat("=== Top 5 Events by Population Health Impact ===\n")
## === Top 5 Events by Population Health Impact ===
top_health %>%
  head(5) %>%
  select(EVTYPE, Total_Fatalities, Total_Injuries, Total_Harm) %>%
  knitr::kable(col.names = c("Event Type", "Fatalities", "Injuries", "Total Harm"),
               format.args = list(big.mark = ","))
| Event Type | Fatalities | Injuries | Total Harm |
|---|---|---|---|
| TORNADO | 5,633 | 91,346 | 96,979 |
| EXCESSIVE HEAT | 1,903 | 6,525 | 8,428 |
| TSTM WIND | 504 | 6,957 | 7,461 |
| FLOOD | 470 | 6,789 | 7,259 |
| LIGHTNING | 816 | 5,230 | 6,046 |
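To put the gap between tornadoes and the other event types in numbers, the short sketch below works directly from the Total Harm column of the table above. The variable names (total_harm, ratio_to_tornado, others_combined) are hypothetical, and the figures are typed in by hand from the table rather than recomputed from the raw data.

```r
# Total harm figures copied from the table above.
total_harm <- c(
  TORNADO          = 96979,
  `EXCESSIVE HEAT` = 8428,
  `TSTM WIND`      = 7461,
  FLOOD            = 7259,
  LIGHTNING        = 6046
)

# How many times larger is the tornado total than each other event's total?
ratio_to_tornado <- round(total_harm[["TORNADO"]] / total_harm[-1], 1)
print(ratio_to_tornado)

# Combined total of the other four events, for comparison with tornadoes.
others_combined <- sum(total_harm[-1])
print(others_combined)
```

On these figures, the tornado total is roughly 11.5 times the excessive heat total and more than three times the other four events combined (29,194).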
cat("=== Top 5 Events by Economic Damage ===\n")
## === Top 5 Events by Economic Damage ===
top_econ %>%
  head(5) %>%
  mutate(across(c(Total_Property, Total_Crop, Total_Damage),
                ~ scales::dollar(., scale = 1e-9, suffix = "B"))) %>%
  select(EVTYPE, Total_Property, Total_Crop, Total_Damage) %>%
  knitr::kable(col.names = c("Event Type", "Property Damage", "Crop Damage", "Total Damage"))
| Event Type | Property Damage | Crop Damage | Total Damage |
|---|---|---|---|
| FLOOD | $144.66B | $5.66B | $150.32B |
| HURRICANE/TYPHOON | $69.31B | $2.61B | $71.91B |
| TORNADO | $56.95B | $0.41B | $57.36B |
| STORM SURGE | $43.32B | $0.00B | $43.32B |
| HAIL | $15.74B | $3.03B | $18.76B |
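The split between property and crop damage differs considerably across event types. As a rough illustration, the sketch below computes the crop share of combined damage from the rounded figures in the table above. The data frame econ and the column Crop_Share_Pct are hypothetical names, and the inputs are typed in by hand from the table rather than taken from econ_impact.

```r
# Damage figures in billions of USD, copied from the table above.
econ <- data.frame(
  EVTYPE   = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO", "STORM SURGE", "HAIL"),
  Property = c(144.66, 69.31, 56.95, 43.32, 15.74),
  Crop     = c(5.66, 2.61, 0.41, 0.00, 3.03),
  stringsAsFactors = FALSE
)

# Crop damage as a percentage of each event's combined damage.
econ$Crop_Share_Pct <- round(100 * econ$Crop / (econ$Property + econ$Crop), 1)
print(econ)
```

Among these five event types, hail has by far the largest crop share (about 16%), while storm surge damage is essentially all property. Because each column in the table is rounded independently, property plus crop can differ from the printed total by $0.01B, as in the hurricane/typhoon and hail rows.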