This analysis explores the NOAA Storm Database covering 902,297 weather events across the United States between 1950 and 2011. Tornadoes were found to be the most harmful events to population health, causing over 5,600 fatalities and 91,000 injuries. Floods caused the greatest economic damage, exceeding $150 billion in combined property and crop losses. Notably, the events most deadly to humans differ from those most economically destructive, highlighting the need for multi-dimensional emergency preparedness strategies.
library(ggplot2)
library(dplyr)
storm <- read.csv("repdata_data_StormData.csv.bz2")
dim(storm)
## [1] 902297 37
# How many unique event types?
length(unique(storm$EVTYPE))
## [1] 985
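Part of that count may reflect inconsistent labelling rather than distinct phenomena. The following is a minimal sketch, using a hypothetical vector of labels (not values read from the file), of how casing and stray whitespace can inflate a unique count. The use of `trimws()` is an assumption added for illustration; the analysis itself only applies `toupper()`.

```r
# Hypothetical labels for illustration only; not read from the storm data.
raw_labels <- c("TSTM WIND", " TSTM WIND", "Tstm Wind", "HEAT", "Heat")

# Count distinct labels before any normalisation.
n_raw <- length(unique(raw_labels))

# Normalise case and trim surrounding whitespace, then count again.
clean_labels <- toupper(trimws(raw_labels))
n_clean <- length(unique(clean_labels))

c(raw = n_raw, clean = n_clean)
```

Running the same two counts on `storm$EVTYPE` would show how much of the spread is down to formatting alone.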
# Top 10 events by fatalities
fatal <- aggregate(FATALITIES ~ EVTYPE, data = storm, sum)
fatal <- fatal[order(-fatal$FATALITIES),]
head(fatal, 10)
##             EVTYPE FATALITIES
## 834        TORNADO       5633
## 130 EXCESSIVE HEAT       1903
## 153    FLASH FLOOD        978
## 275           HEAT        937
## 464      LIGHTNING        816
## 856      TSTM WIND        504
## 170          FLOOD        470
## 585    RIP CURRENT        368
## 359      HIGH WIND        248
## 19       AVALANCHE        224
# Top 10 events by injuries
injury <- aggregate(INJURIES ~ EVTYPE, data = storm, sum)
injury <- injury[order(-injury$INJURIES),]
head(injury, 10)
##                EVTYPE INJURIES
## 834           TORNADO    91346
## 856         TSTM WIND     6957
## 170             FLOOD     6789
## 130    EXCESSIVE HEAT     6525
## 464         LIGHTNING     5230
## 275              HEAT     2100
## 427         ICE STORM     1975
## 153       FLASH FLOOD     1777
## 760 THUNDERSTORM WIND     1488
## 244              HAIL     1361
# Convert to uppercase to fix capitalisation inconsistencies
storm$EVTYPE <- toupper(storm$EVTYPE)
# Merge heat events
storm$EVTYPE[grepl("HEAT", storm$EVTYPE)] <- "HEAT"
# Merge wind events
storm$EVTYPE[grepl("TSTM WIND|THUNDERSTORM WIND|HIGH WIND", storm$EVTYPE)] <- "WIND"
# Merge flood events
storm$EVTYPE[grepl("FLASH FLOOD", storm$EVTYPE)] <- "FLASH FLOOD"
storm$EVTYPE[grepl("^FLOOD", storm$EVTYPE)] <- "FLOOD"
# Merge tornado
storm$EVTYPE[grepl("TORNADO", storm$EVTYPE)] <- "TORNADO"
# Merge lightning
storm$EVTYPE[grepl("LIGHTNING", storm$EVTYPE)] <- "LIGHTNING"
# Verify - check unique count dropped
length(unique(storm$EVTYPE))
## [1] 687
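The order of the merge rules matters, and the anchored `^FLOOD` pattern deliberately leaves some labels alone. A small sketch of the same rules applied to a handful of hypothetical labels (chosen to exercise each pattern, not taken from the data) makes this visible:

```r
# Hypothetical labels chosen to exercise each pattern; not read from the data.
labels <- c("FLASH FLOOD", "FLOOD/FLASH FLOOD", "FLOODING", "COASTAL FLOOD",
            "Excessive Heat", "TSTM WIND/HAIL")

merged <- toupper(labels)
merged[grepl("HEAT", merged)] <- "HEAT"
merged[grepl("TSTM WIND|THUNDERSTORM WIND|HIGH WIND", merged)] <- "WIND"
# Flash floods are relabelled first, so "FLOOD/FLASH FLOOD" no longer
# starts with "FLOOD" by the time the next rule runs.
merged[grepl("FLASH FLOOD", merged)] <- "FLASH FLOOD"
# Anchored at the start: "COASTAL FLOOD" is left unchanged.
merged[grepl("^FLOOD", merged)] <- "FLOOD"

data.frame(raw = labels, merged = merged)
```

Swapping the two flood rules would send "FLOOD/FLASH FLOOD" to "FLOOD" instead, so the ordering is a modelling choice rather than a neutral detail.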
# Recalculate after cleaning
fatal <- aggregate(FATALITIES ~ EVTYPE, data = storm, sum)
fatal <- fatal[order(-fatal$FATALITIES),]
injury <- aggregate(INJURIES ~ EVTYPE, data = storm, sum)
injury <- injury[order(-injury$INJURIES),]
# Keep the top 10 event types for each measure
top_fatal <- head(fatal, 10)
top_injury <- head(injury, 10)
# Plot fatalities
ggplot(top_fatal, aes(x = reorder(EVTYPE, FATALITIES), y = FATALITIES)) +
  geom_col(fill = "firebrick") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Fatalities",
       x = "Event Type", y = "Total Fatalities") +
  theme_minimal()
# Plot injuries
ggplot(top_injury, aes(x = reorder(EVTYPE, INJURIES), y = INJURIES)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Injuries",
       x = "Event Type", y = "Total Injuries") +
  theme_minimal()
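# Decode damage exponents into dollar amounts (K = 1e3, M = 1e6, B = 1e9).
# Any other code (blank, digits, symbols, other letters) leaves the amount unscaled.
decode <- function(amount, exp) {
  exp <- toupper(exp)
  ifelse(exp == "K", amount * 1e3,
         ifelse(exp == "M", amount * 1e6,
                ifelse(exp == "B", amount * 1e9, amount)))
}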
storm$PROP_TOTAL <- decode(storm$PROPDMG, storm$PROPDMGEXP)
storm$CROP_TOTAL <- decode(storm$CROPDMG, storm$CROPDMGEXP)
storm$TOTAL_DMG <- storm$PROP_TOTAL + storm$CROP_TOTAL
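# decode() scales only K, M and B; every other exponent code passes through
# unscaled. A rough sketch of how to check how many records that affects,
# shown here on a hypothetical vector standing in for storm$PROPDMGEXP:

```r
# Hypothetical exponent codes for illustration; not read from the storm data.
exp_codes <- c("K", "k", "M", "", "B", "5", "H", "?", "K")

codes <- toupper(exp_codes)
recognised <- codes %in% c("K", "M", "B")

# Codes that decode() would leave unscaled, with their frequencies.
table(codes[!recognised])

# Share of records whose damage amount is taken at face value.
share_unscaled <- mean(!recognised)
share_unscaled
```

# On the real data the same check would use storm$PROPDMGEXP and
# storm$CROPDMGEXP in place of exp_codes.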
# Top 10 by total damage
econ <- aggregate(TOTAL_DMG ~ EVTYPE, data = storm, sum)
econ <- econ[order(-econ$TOTAL_DMG),]
top_econ <- head(econ, 10)
# Convert to billions for readability
top_econ$TOTAL_DMG <- top_econ$TOTAL_DMG / 1e9
ggplot(top_econ, aes(x = reorder(EVTYPE, TOTAL_DMG), y = TOTAL_DMG)) +
  geom_col(fill = "darkgreen") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Economic Damage",
       x = "Event Type", y = "Total Damage (Billions USD)") +
  theme_minimal()
Across the United States between 1950 and 2011, tornadoes were by far the most harmful weather events to human health. They caused 5,636 fatalities and 91,407 injuries, far exceeding any other event type.
Heat-related events ranked second in fatalities with 3,138 deaths, highlighting the silent but deadly nature of extreme temperature exposure. Wind events ranked second in injuries with 11,102 injuries.
ggplot(top_fatal, aes(x = reorder(EVTYPE, FATALITIES), y = FATALITIES)) +
  geom_col(fill = "firebrick") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Fatalities (1950-2011)",
       x = "Event Type", y = "Total Fatalities") +
  theme_minimal()
ggplot(top_injury, aes(x = reorder(EVTYPE, INJURIES), y = INJURIES)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Injuries (1950-2011)",
       x = "Event Type", y = "Total Injuries") +
  theme_minimal()
Floods caused the greatest economic damage overall, with total losses exceeding $150 billion in property and crop damage combined. Hurricanes and typhoons ranked second, followed by tornadoes in third place.
Notably, the events most harmful to human life are not necessarily the ones most damaging to the economy, which suggests that emergency preparedness strategies must address both dimensions independently.
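The divergence between the two rankings can be checked directly with set operations. The sketch below uses only the ranks stated in the text; the exact label `"HURRICANE/TYPHOON"` is an assumption about how that category is spelled in the data. In the full analysis the inputs would be `top_fatal$EVTYPE` and `top_econ$EVTYPE`.

```r
# Leading event types as stated in the text above.
# The label "HURRICANE/TYPHOON" is an assumed spelling.
health_top <- c("TORNADO", "HEAT")
econ_top   <- c("FLOOD", "HURRICANE/TYPHOON", "TORNADO")

# Event types that appear in both rankings.
shared <- intersect(health_top, econ_top)

# Event types that lead on one dimension only.
health_only <- setdiff(health_top, econ_top)
econ_only   <- setdiff(econ_top, health_top)

list(shared = shared, health_only = health_only, econ_only = econ_only)
```

Here only tornadoes appear on both lists, which is consistent with treating health and economic impact as separate planning dimensions.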
ggplot(top_econ, aes(x = reorder(EVTYPE, TOTAL_DMG), y = TOTAL_DMG)) +
  geom_col(fill = "darkgreen") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Economic Damage (1950-2011)",
       x = "Event Type", y = "Total Damage (Billions USD)") +
  theme_minimal()