Synopsis

This analysis explores the NOAA Storm Database covering 902,297 weather events across the United States between 1950 and 2011. Tornadoes were found to be the most harmful events to population health, causing over 5,600 fatalities and 91,000 injuries. Floods caused the greatest economic damage, exceeding $150 billion in combined property and crop losses. Notably, the events most deadly to humans differ from those most economically destructive, highlighting the need for multi-dimensional emergency preparedness strategies.

Data Processing

library(ggplot2)
library(dplyr)
storm <- read.csv("repdata_data_StormData.csv.bz2")
dim(storm)
## [1] 902297     37

Exploring Data

# How many unique event types?
length(unique(storm$EVTYPE))
## [1] 985
# Top 10 events by fatalities
fatal <- aggregate(FATALITIES ~ EVTYPE, data = storm, sum)
fatal <- fatal[order(-fatal$FATALITIES),]
head(fatal, 10)
##             EVTYPE FATALITIES
## 834        TORNADO       5633
## 130 EXCESSIVE HEAT       1903
## 153    FLASH FLOOD        978
## 275           HEAT        937
## 464      LIGHTNING        816
## 856      TSTM WIND        504
## 170          FLOOD        470
## 585    RIP CURRENT        368
## 359      HIGH WIND        248
## 19       AVALANCHE        224
# Top 10 events by injuries
injury <- aggregate(INJURIES ~ EVTYPE, data = storm, sum)
injury <- injury[order(-injury$INJURIES),]
head(injury, 10)
##                EVTYPE INJURIES
## 834           TORNADO    91346
## 856         TSTM WIND     6957
## 170             FLOOD     6789
## 130    EXCESSIVE HEAT     6525
## 464         LIGHTNING     5230
## 275              HEAT     2100
## 427         ICE STORM     1975
## 153       FLASH FLOOD     1777
## 760 THUNDERSTORM WIND     1488
## 244              HAIL     1361
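
The tables above list HEAT and EXCESSIVE HEAT, and TSTM WIND and THUNDERSTORM WIND, as separate rows, so totals for what is arguably one hazard are split across near-duplicate labels. The sketch below illustrates the effect of pattern-based merging on a small made-up data frame. The `toy` data and its values are hypothetical and are not drawn from the NOAA file.

```r
# Hypothetical toy data, not taken from the NOAA file
toy <- data.frame(
  EVTYPE = c("HEAT", "Excessive Heat", "TSTM WIND",
             "THUNDERSTORM WIND", "FLASH FLOOD", "FLOOD"),
  FATALITIES = c(2, 3, 1, 4, 5, 6),
  stringsAsFactors = FALSE
)

# Before merging: one row per raw label
raw_totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, sum)

# Normalise case, then collapse labels that match a pattern
toy$EVTYPE <- toupper(toy$EVTYPE)
toy$EVTYPE[grepl("HEAT", toy$EVTYPE)] <- "HEAT"
toy$EVTYPE[grepl("TSTM WIND|THUNDERSTORM WIND", toy$EVTYPE)] <- "WIND"

# After merging: fewer labels, larger per-label totals
merged_totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, sum)
merged_totals <- merged_totals[order(-merged_totals$FATALITIES), ]
print(merged_totals)
```

In this toy case six raw labels collapse to four, and the two heat rows combine into a single total, which is why rankings can shift once labels are merged.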

Cleaning Data

# Convert to uppercase to fix capitalisation inconsistencies
storm$EVTYPE <- toupper(storm$EVTYPE)

# Merge heat events
storm$EVTYPE[grepl("HEAT", storm$EVTYPE)] <- "HEAT"

# Merge wind events
storm$EVTYPE[grepl("TSTM WIND|THUNDERSTORM WIND|HIGH WIND", storm$EVTYPE)] <- "WIND"

# Merge flood events
storm$EVTYPE[grepl("FLASH FLOOD", storm$EVTYPE)] <- "FLASH FLOOD"
storm$EVTYPE[grepl("^FLOOD", storm$EVTYPE)] <- "FLOOD"

# Merge tornado
storm$EVTYPE[grepl("TORNADO", storm$EVTYPE)] <- "TORNADO"

# Merge lightning
storm$EVTYPE[grepl("LIGHTNING", storm$EVTYPE)] <- "LIGHTNING"

# Verify - check unique count dropped
length(unique(storm$EVTYPE))
## [1] 687
# Recalculate after cleaning
fatal <- aggregate(FATALITIES ~ EVTYPE, data = storm, sum)
fatal <- fatal[order(-fatal$FATALITIES),]

injury <- aggregate(INJURIES ~ EVTYPE, data = storm, sum)
injury <- injury[order(-injury$INJURIES),]

# Keep the top 10 event types for fatalities and for injuries
top_fatal <- head(fatal, 10)
top_injury <- head(injury, 10)

# Plot fatalities
ggplot(top_fatal, aes(x = reorder(EVTYPE, FATALITIES), y = FATALITIES)) +
  geom_bar(stat = "identity", fill = "firebrick") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Fatalities",
       x = "Event Type", y = "Total Fatalities") +
  theme_minimal()

# Plot injuries
ggplot(top_injury, aes(x = reorder(EVTYPE, INJURIES), y = INJURIES)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Injuries",
       x = "Event Type", y = "Total Injuries") +
  theme_minimal()

# Decode the exponent codes (K, M, B) into dollar multipliers.
# Any other code, including a blank, leaves the amount unscaled.
decode <- function(amount, exp) {
  exp <- toupper(exp)
  ifelse(exp == "K", amount * 1e3,
         ifelse(exp == "M", amount * 1e6,
                ifelse(exp == "B", amount * 1e9, amount)))
}

storm$PROP_TOTAL <- decode(storm$PROPDMG, storm$PROPDMGEXP)
storm$CROP_TOTAL <- decode(storm$CROPDMG, storm$CROPDMGEXP)
storm$TOTAL_DMG <- storm$PROP_TOTAL + storm$CROP_TOTAL

# Top 10 by total damage
econ <- aggregate(TOTAL_DMG ~ EVTYPE, data = storm, sum)
econ <- econ[order(-econ$TOTAL_DMG),]
top_econ <- head(econ, 10)

# Convert to billions for readability
top_econ$TOTAL_DMG <- top_econ$TOTAL_DMG / 1e9

ggplot(top_econ, aes(x = reorder(EVTYPE, TOTAL_DMG), y = TOTAL_DMG)) +
  geom_bar(stat = "identity", fill = "darkgreen") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Economic Damage",
       x = "Event Type", y = "Total Damage (Billions USD)") +
  theme_minimal()
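
To make the multiplier decoding concrete, the following sketch applies the same `decode()` logic to a few made-up rows. The `toy` data frame and its values are hypothetical. The function is repeated here only so the block runs on its own.

```r
# Copy of the report's decode() so this block is self-contained
decode <- function(amount, exp) {
  exp <- toupper(exp)
  ifelse(exp == "K", amount * 1e3,
         ifelse(exp == "M", amount * 1e6,
                ifelse(exp == "B", amount * 1e9, amount)))
}

# Hypothetical rows, not taken from the NOAA file
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "m", "B", ""),
  stringsAsFactors = FALSE
)

# Scale each amount by its exponent code
toy$PROP_TOTAL <- decode(toy$PROPDMG, toy$PROPDMGEXP)
print(toy)
```

The lowercase "m" is handled because the code is uppercased first, and the blank code falls through to the final branch, so that amount stays at 10.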

Results

Question 1: Which weather events are most harmful to population health?

Across the United States between 1950 and 2011, tornadoes were by far the most harmful weather events to human health. After event labels were merged, tornadoes accounted for 5,636 fatalities and 91,407 injuries, far exceeding any other event type.

Heat-related events ranked second in fatalities with 3,138 deaths, underscoring how deadly extreme temperature exposure can be. Wind events ranked second in injuries with approximately 11,100 injuries.
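
When totals like these are inserted into report text, formatting them with thousands separators keeps large counts readable as plain integers rather than scientific notation. A minimal sketch, with the two fatality totals quoted in this section hard-coded for illustration (the `counts` vector and its names are hypothetical):

```r
# Fatality totals quoted in this section, hard-coded for illustration
counts <- c(tornado_fatalities = 5636, heat_fatalities = 3138)

# format = "d" prints whole numbers; big.mark adds thousands separators
formatted <- formatC(counts, format = "d", big.mark = ",")

sentence <- sprintf("Tornadoes caused %s fatalities and heat caused %s.",
                    formatted[["tornado_fatalities"]],
                    formatted[["heat_fatalities"]])
print(sentence)
```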

ggplot(top_fatal, aes(x = reorder(EVTYPE, FATALITIES), y = FATALITIES)) +
  geom_bar(stat = "identity", fill = "firebrick") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Fatalities (1950-2011)",
       x = "Event Type", y = "Total Fatalities") +
  theme_minimal()

ggplot(top_injury, aes(x = reorder(EVTYPE, INJURIES), y = INJURIES)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Injuries (1950-2011)",
       x = "Event Type", y = "Total Injuries") +
  theme_minimal()

Question 2: Which weather events have the greatest economic consequences?

Floods caused the greatest economic damage overall, with total losses exceeding $150 billion in property and crop damage combined. Hurricanes and typhoons ranked second, followed by tornadoes in third place.

Notably, the events most harmful to human life are not the same as those most damaging to the economy: tornadoes lead in deaths and injuries, while floods lead in financial losses. Emergency preparedness strategies therefore need to address both dimensions.
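
One way to check this kind of divergence is to rank each event type on both measures and compare the ranks side by side. The sketch below uses small made-up totals. The `health_toy` and `econ_toy` data frames are hypothetical and do not reproduce the report's figures, although they are arranged to mirror the pattern described above.

```r
# Hypothetical illustrative totals, not the report's actual results
health_toy <- data.frame(EVTYPE = c("TORNADO", "HEAT", "FLOOD"),
                         FATALITIES = c(50, 30, 10))
econ_toy <- data.frame(EVTYPE = c("TORNADO", "HEAT", "FLOOD"),
                       TOTAL_DMG = c(20, 1, 90))

# Join the two summaries on event type
both <- merge(health_toy, econ_toy, by = "EVTYPE")

# Rank 1 = largest total on each measure
both$fatal_rank <- rank(-both$FATALITIES)
both$dmg_rank   <- rank(-both$TOTAL_DMG)

print(both[order(both$fatal_rank), ])
```

Event types whose two ranks differ sharply are the ones for which a health-focused plan and a damage-focused plan would set different priorities.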

ggplot(top_econ, aes(x = reorder(EVTYPE, TOTAL_DMG), y = TOTAL_DMG)) +
  geom_bar(stat = "identity", fill = "darkgreen") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Economic Damage (1950-2011)",
       x = "Event Type", y = "Total Damage (Billions USD)") +
  theme_minimal()