This report analyzes the U.S. National Oceanic and Atmospheric Administration (NOAA) Storm Database to identify which types of weather events are most harmful to population health and which have the greatest economic consequences. The database covers storm events from 1950 to November 2011. After cleaning and standardizing event types, we computed total fatalities and injuries per event type, as well as total property and crop damage. The results show that tornadoes cause the highest combined number of fatalities and injuries, far exceeding every other event type. Excessive heat, thunderstorm winds, floods, and lightning also contribute significantly to health impacts. In terms of economic damage, floods are the costliest, followed by hurricanes/typhoons, tornadoes, and storm surges. These findings can help government and municipal managers prioritize resources for preparedness and response.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)
The raw CSV file repdata-data-StormData.csv is read into R. We set stringsAsFactors = FALSE so that text columns such as EVTYPE are read as character strings rather than factors, which simplifies the string cleaning that follows.
storm_data <- read.csv("repdata-data-StormData.csv", stringsAsFactors = FALSE)
dim(storm_data)
## [1] 902297 37
The analysis focuses on event type (EVTYPE), fatalities (FATALITIES), injuries (INJURIES), property damage (PROPDMG and PROPDMGEXP), and crop damage (CROPDMG and CROPDMGEXP).
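The EVTYPE column contains many variations (e.g., “TORNADO”, “Tornado”, “TORNADOES”). We convert all values to uppercase and then collapse similar names into standard categories by pattern matching. The rules are checked in order, so the first matching pattern determines the category, and event types that match no rule keep their uppercase name. This ensures consistent aggregation.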
# Convert EVTYPE to uppercase
storm_data$EVTYPE <- toupper(storm_data$EVTYPE)
# Reference list of the standard categories targeted by the recoding.
# It is not exhaustive and is kept for documentation only: the case_when()
# rules further down perform the actual recoding.
evtype_map <- list(
  "TORNADO" = "TORNADO",
  "THUNDERSTORM WIND" = "THUNDERSTORM WIND",
  "HAIL" = "HAIL",
  "FLASH FLOOD" = "FLASH FLOOD",
  "FLOOD" = "FLOOD",
  "LIGHTNING" = "LIGHTNING",
  "HIGH WIND" = "HIGH WIND",
  "HEAT" = "EXCESSIVE HEAT",
  "EXCESSIVE HEAT" = "EXCESSIVE HEAT",
  "COLD" = "EXTREME COLD",
  "EXTREME COLD" = "EXTREME COLD",
  "WINTER STORM" = "WINTER STORM",
  "BLIZZARD" = "BLIZZARD",
  "ICE STORM" = "ICE STORM",
  "HURRICANE" = "HURRICANE/TYPHOON",
  "TYPHOON" = "HURRICANE/TYPHOON",
  "TROPICAL STORM" = "TROPICAL STORM",
  "WILDFIRE" = "WILDFIRE",
  "DUST STORM" = "DUST STORM",
  "FOG" = "DENSE FOG",
  "AVALANCHE" = "AVALANCHE",
  "LANDSLIDE" = "DEBRIS FLOW",
  "DEBRIS FLOW" = "DEBRIS FLOW",
  "RIP CURRENT" = "RIP CURRENT",
  "SURF" = "HIGH SURF",
  "TSUNAMI" = "TSUNAMI",
  "VOLCANIC ASH" = "VOLCANIC ASH"
)
# For unmatched types, keep the original uppercase name (but consolidate common variants)
# We'll create a new column 'EVTYPE_CLEAN'
storm_data <- storm_data %>%
  mutate(EVTYPE_CLEAN = case_when(
    grepl("TORNADO", EVTYPE) ~ "TORNADO",
    grepl("THUNDERSTORM WIND|TSTM WIND", EVTYPE) ~ "THUNDERSTORM WIND",
    grepl("HAIL", EVTYPE) ~ "HAIL",
    grepl("FLASH FLOOD", EVTYPE) ~ "FLASH FLOOD",
    grepl("FLOOD", EVTYPE) & !grepl("FLASH", EVTYPE) ~ "FLOOD",
    grepl("LIGHTNING", EVTYPE) ~ "LIGHTNING",
    grepl("HIGH WIND", EVTYPE) ~ "HIGH WIND",
    # "HEAT" already matches "EXCESSIVE HEAT"
    grepl("HEAT", EVTYPE) ~ "EXCESSIVE HEAT",
    # "COLD" already matches "EXTREME COLD"
    grepl("COLD|WIND CHILL", EVTYPE) ~ "EXTREME COLD",
    grepl("WINTER STORM", EVTYPE) ~ "WINTER STORM",
    grepl("BLIZZARD", EVTYPE) ~ "BLIZZARD",
    grepl("ICE STORM", EVTYPE) ~ "ICE STORM",
    grepl("HURRICANE|TYPHOON", EVTYPE) ~ "HURRICANE/TYPHOON",
    grepl("TROPICAL STORM", EVTYPE) ~ "TROPICAL STORM",
    grepl("WILDFIRE", EVTYPE) ~ "WILDFIRE",
    grepl("DUST STORM", EVTYPE) ~ "DUST STORM",
    grepl("FOG", EVTYPE) ~ "DENSE FOG",
    grepl("AVALANCHE", EVTYPE) ~ "AVALANCHE",
    grepl("DEBRIS FLOW|LANDSLIDE", EVTYPE) ~ "DEBRIS FLOW",
    grepl("RIP CURRENT", EVTYPE) ~ "RIP CURRENT",
    grepl("SURF", EVTYPE) ~ "HIGH SURF",
    grepl("TSUNAMI", EVTYPE) ~ "TSUNAMI",
    grepl("VOLCANIC ASH", EVTYPE) ~ "VOLCANIC ASH",
    # Anything unmatched keeps its uppercase name
    TRUE ~ EVTYPE
  ))
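Because case_when() assigns each row the label of the first condition that is TRUE, the order of the rules matters. The sketch below mimics that first-match behaviour in base R on a handful of made-up event names. The names `rules`, `classify_first_match`, and `toy` are illustrative assumptions rather than part of the analysis, and only a subset of the rules above is included.

```r
# Ordered pattern -> label rules (a subset of the rules used in the analysis)
rules <- c(
  "TORNADO"                     = "TORNADO",
  "THUNDERSTORM WIND|TSTM WIND" = "THUNDERSTORM WIND",
  "HAIL"                        = "HAIL",
  "FLASH FLOOD"                 = "FLASH FLOOD",
  "FLOOD"                       = "FLOOD"
)

# Hypothetical helper: label each value with the first rule whose pattern matches;
# values that match no rule are returned unchanged.
classify_first_match <- function(evtype, rules) {
  out  <- evtype
  done <- rep(FALSE, length(evtype))
  for (pattern in names(rules)) {
    hit <- !done & grepl(pattern, evtype)
    out[hit] <- rules[[pattern]]
    done <- done | hit
  }
  out
}

# Made-up event names, not rows from the NOAA file
toy <- toupper(c("Tornado F0", "TSTM WIND/HAIL", "Hail",
                 "FLASH FLOODING", "RIVER FLOOD", "DROUGHT"))
cleaned <- classify_first_match(toy, rules)
print(data.frame(raw = toy, cleaned = cleaned))
```

Note that "TSTM WIND/HAIL" is labelled THUNDERSTORM WIND rather than HAIL only because the wind rule comes first; reordering the rules would change which category absorbs such mixed entries.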
The columns PROPDMGEXP and CROPDMGEXP contain codes for hundreds (H), thousands (K), millions (M), and billions (B). We convert these to numeric multipliers; any other value, including a blank, is treated as a multiplier of 1.
# Define multiplier function
damage_multiplier <- function(exp) {
  ifelse(exp %in% c("K", "k"), 1e3,
    ifelse(exp %in% c("M", "m"), 1e6,
      ifelse(exp %in% c("B", "b"), 1e9,
        ifelse(exp %in% c("H", "h"), 1e2, 1))))
}
storm_data <- storm_data %>%
  mutate(
    PROPDMGEXP_U = toupper(PROPDMGEXP),
    CROPDMGEXP_U = toupper(CROPDMGEXP),
    PROP_MULT = damage_multiplier(PROPDMGEXP_U),
    CROP_MULT = damage_multiplier(CROPDMGEXP_U),
    PROP_DAMAGE = PROPDMG * PROP_MULT,
    CROP_DAMAGE = CROPDMG * CROP_MULT
  )
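To make the conversion concrete, here is a minimal sketch of the same idea using a named lookup vector instead of nested ifelse() calls. The values in `exp_codes` and `amounts` are made up for illustration, and `multiplier_lookup` is a hypothetical name; the behaviour for unrecognised codes (multiplier of 1) follows the function above.

```r
# Made-up exponent codes and amounts, not rows from the NOAA file
exp_codes <- c("K", "m", "B", "h", "", "5", "?")
amounts   <- c(25, 2.5, 1, 3, 10, 4, 7)

# Hypothetical lookup table: code -> multiplier
multiplier_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Look up each code; codes missing from the table come back as NA
mult <- unname(multiplier_lookup[toupper(exp_codes)])
mult[is.na(mult)] <- 1  # unrecognised codes (blank, digits, symbols) -> 1

damage <- amounts * mult
print(data.frame(exp_codes, amounts, mult, damage))
```

Treating unrecognised codes as 1 is a simplifying choice: it keeps every row in the totals but may understate damage for records whose code carries some other meaning.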
We sum fatalities and injuries by cleaned event type, then compute total health impact (fatalities + injuries).
health_impact <- storm_data %>%
  group_by(EVTYPE_CLEAN) %>%
  summarise(
    total_fatalities = sum(FATALITIES, na.rm = TRUE),
    total_injuries = sum(INJURIES, na.rm = TRUE),
    total_health = total_fatalities + total_injuries
  ) %>%
  arrange(desc(total_health))
Similarly, we sum property and crop damage by event type.
economic_impact <- storm_data %>%
  group_by(EVTYPE_CLEAN) %>%
  summarise(
    total_prop_damage = sum(PROP_DAMAGE, na.rm = TRUE),
    total_crop_damage = sum(CROP_DAMAGE, na.rm = TRUE),
    total_damage = total_prop_damage + total_crop_damage
  ) %>%
  arrange(desc(total_damage))
Health Impact: Top Event Types
The table below shows the top 10 event types causing the greatest combined fatalities and injuries.
head(health_impact, 10)
## # A tibble: 10 × 4
## EVTYPE_CLEAN total_fatalities total_injuries total_health
## <chr> <dbl> <dbl> <dbl>
## 1 TORNADO 5661 91407 97068
## 2 EXCESSIVE HEAT 3138 9224 12362
## 3 THUNDERSTORM WIND 728 9493 10221
## 4 FLOOD 490 6802 7292
## 5 LIGHTNING 817 5232 6049
## 6 FLASH FLOOD 1035 1802 2837
## 7 ICE STORM 89 1990 2079
## 8 HIGH WIND 299 1523 1822
## 9 WINTER STORM 216 1338 1554
## 10 HURRICANE/TYPHOON 133 1333 1466
The bar chart below displays the top 10 event types by total health impact (fatalities + injuries). Tornadoes dominate, accounting for over 97,000 fatalities and injuries combined. Excessive heat, thunderstorm winds, floods, and lightning also rank high.
health_top10 <- head(health_impact, 10)
ggplot(health_top10, aes(x = reorder(EVTYPE_CLEAN, total_health), y = total_health)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Weather Event Types by Total Fatalities and Injuries",
    x = "Event Type",
    y = "Total Fatalities + Injuries"
  ) +
  theme_minimal()
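The combined total weights a fatality and an injury equally. As a rough sensitivity check, the ten rows printed in the health table can be re-ranked by fatalities alone. The sketch below re-enters those printed figures by hand; `health_top10_tbl` and `by_fatalities` are hypothetical names, and the ranking covers only these ten rows, not the full set of event types.

```r
# Figures copied from the top-10 health table printed above
health_top10_tbl <- data.frame(
  event = c("TORNADO", "EXCESSIVE HEAT", "THUNDERSTORM WIND", "FLOOD",
            "LIGHTNING", "FLASH FLOOD", "ICE STORM", "HIGH WIND",
            "WINTER STORM", "HURRICANE/TYPHOON"),
  fatalities = c(5661, 3138, 728, 490, 817, 1035, 89, 299, 216, 133),
  injuries   = c(91407, 9224, 9493, 6802, 5232, 1802, 1990, 1523, 1338, 1333),
  stringsAsFactors = FALSE
)

# Re-rank the same ten rows by fatalities only
by_fatalities <- health_top10_tbl[order(-health_top10_tbl$fatalities), ]
print(head(by_fatalities, 5), row.names = FALSE)
```

Among these ten rows, flash floods and lightning move ahead of thunderstorm winds and floods when only deaths are counted, so the choice of metric affects the ordering below the top two.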
The table below lists the top 10 event types by total economic damage (property + crop damage). Floods cause the most damage, at roughly $161 billion in combined property and crop losses. Hurricanes/typhoons, tornadoes, and storm surges also cause massive losses.
head(economic_impact, 10)
## # A tibble: 10 × 4
## EVTYPE_CLEAN total_prop_damage total_crop_damage total_damage
## <chr> <dbl> <dbl> <dbl>
## 1 FLOOD 150621787745 10847881950 161469669695
## 2 HURRICANE/TYPHOON 85256410010 5506117800 90762527810
## 3 TORNADO 58593098029. 417461520 59010559549.
## 4 STORM SURGE 43323536000 5000 43323541000
## 5 HAIL 15974569543. 3046937623 19021507166.
## 6 FLASH FLOOD 16906908187. 1532197150 18439105337.
## 7 DROUGHT 1046106000 13972566000 15018672000
## 8 THUNDERSTORM WIND 9766600056. 1253447988 11020048044.
## 9 ICE STORM 3946027860 5022113500 8968141360
## 10 TROPICAL STORM 7714390550 694896000 8409286550
The bar chart below illustrates the top 10 event types by total economic damage (in billions of dollars). Floods are by far the most costly.
economic_top10 <- head(economic_impact, 10) %>%
  mutate(total_damage_billions = total_damage / 1e9)
ggplot(economic_top10, aes(x = reorder(EVTYPE_CLEAN, total_damage_billions), y = total_damage_billions)) +
  geom_col(fill = "darkorange") +
  coord_flip() +
  labs(
    title = "Top 10 Weather Event Types by Total Economic Damage",
    x = "Event Type",
    y = "Total Damage (Billions of US Dollars)"
  ) +
  theme_minimal()
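The totals above combine property and crop damage, which can hide how different the mix is across event types. The sketch below computes the crop share of total damage from the figures printed in the economic table, re-entered by hand and rounded to whole dollars as displayed. The names `economic_top10_tbl`, `crop_share`, and `crop_heavy` are hypothetical.

```r
# Figures copied from the top-10 economic table printed above (whole dollars)
economic_top10_tbl <- data.frame(
  event = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO", "STORM SURGE", "HAIL",
            "FLASH FLOOD", "DROUGHT", "THUNDERSTORM WIND", "ICE STORM",
            "TROPICAL STORM"),
  prop = c(150621787745, 85256410010, 58593098029, 43323536000, 15974569543,
           16906908187, 1046106000, 9766600056, 3946027860, 7714390550),
  crop = c(10847881950, 5506117800, 417461520, 5000, 3046937623,
           1532197150, 13972566000, 1253447988, 5022113500, 694896000),
  stringsAsFactors = FALSE
)

# Share of each event type's total damage that comes from crops
economic_top10_tbl$crop_share <- with(economic_top10_tbl, crop / (prop + crop))

# Event types where crop losses make up more than half of the total
crop_heavy <- economic_top10_tbl$event[economic_top10_tbl$crop_share > 0.5]
print(crop_heavy)
```

In this table only drought and ice storms are dominated by crop losses; for the other eight event types, property damage accounts for most of the total.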
Health: Tornadoes are the most dangerous to human health, followed by excessive heat, thunderstorm winds, floods, and lightning. Preparedness efforts should prioritize tornado warning systems, heat action plans, and flood safety education.
Economy: Floods cause the greatest economic losses, exceeding hurricanes and tornadoes. Investment in flood control infrastructure and land-use planning may yield high returns.
These results are based on available NOAA data from 1950–2011. More recent events are not included, and we have not tested whether the relative rankings hold for later years.