Executive Summary

This report analyzes the U.S. National Oceanic and Atmospheric Administration (NOAA) Storm Database to identify which types of weather events are most harmful to population health and which have the greatest economic consequences. The database covers storm events from 1950 to November 2011. After standardizing event types, we computed total fatalities and injuries per event type, as well as total property and crop damage. Tornadoes cause by far the highest combined number of fatalities and injuries (about 97,000), nearly eight times the total for the next category. Excessive heat, thunderstorm winds, floods, and lightning follow. In terms of economic damage, floods are the costliest (about $161 billion), followed by hurricanes/typhoons, tornadoes, and storm surges. These findings can help government and municipal managers prioritize resources for preparedness and response.

Data Processing

Loading Required Libraries

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)

Loading the Data

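The raw CSV file repdata-data-StormData.csv is read into R. We set stringsAsFactors = FALSE so that text columns such as EVTYPE are kept as character vectors rather than converted to factors, which avoids the conversion cost on this large file and keeps the later string matching straightforward.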

storm_data <- read.csv("repdata-data-StormData.csv", stringsAsFactors = FALSE)
dim(storm_data)
## [1] 902297     37
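
Because the file is large and the analysis uses only a handful of its 37 columns, one option is to skip the unused columns at read time. The sketch below is an illustration, not the method used in this report: read_selected is a hypothetical helper, and it relies on the colClasses argument of read.csv, where "NULL" drops a column and NA lets R infer the type.

```r
# Hypothetical helper (not part of the original analysis): read only the
# columns named in `keep`, skipping the rest at parse time.
read_selected <- function(path, keep) {
  # Read a single row just to learn the column names
  header <- names(read.csv(path, nrows = 1, stringsAsFactors = FALSE))
  # NA = let read.csv infer the type; "NULL" = skip the column entirely
  classes <- rep(NA_character_, length(header))
  classes[!(header %in% keep)] <- "NULL"
  read.csv(path, colClasses = classes, stringsAsFactors = FALSE)
}

# Small self-contained demonstration on a temporary file
tmp <- tempfile(fileext = ".csv")
write.csv(
  data.frame(EVTYPE = c("TORNADO", "HAIL"),
             FATALITIES = c(1, 0),
             REMARKS = c("long text", "more text")),
  tmp, row.names = FALSE
)
small <- read_selected(tmp, keep = c("EVTYPE", "FATALITIES"))
names(small)
```

With the real file, keep would list the seven columns used in this report: EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, and CROPDMGEXP.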

Data Cleaning

The analysis focuses on event type (EVTYPE), fatalities (FATALITIES), injuries (INJURIES), property damage (PROPDMG and PROPDMGEXP), and crop damage (CROPDMG and CROPDMGEXP).

Standardizing Event Types

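The EVTYPE column contains many variations (e.g., “TORNADO”, “Tornado”, “TORNADOES”). We convert all values to uppercase and then collapse similar names into standard categories by pattern matching (grepl inside case_when). Patterns are tested in order and the first match wins, so, for example, “FLASH FLOOD” is checked before “FLOOD”. Event types that match no pattern keep their uppercase name. This ensures consistent aggregation.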

# Convert EVTYPE to uppercase
storm_data$EVTYPE <- toupper(storm_data$EVTYPE)

# Reference list of the target categories (raw keyword = standard category).
# It is not exhaustive and is not used directly below; the case_when() call
# applies the same groupings through pattern matching.
evtype_map <- list(
  "TORNADO" = "TORNADO",
  "THUNDERSTORM WIND" = "THUNDERSTORM WIND",
  "HAIL" = "HAIL",
  "FLASH FLOOD" = "FLASH FLOOD",
  "FLOOD" = "FLOOD",
  "LIGHTNING" = "LIGHTNING",
  "HIGH WIND" = "HIGH WIND",
  "HEAT" = "EXCESSIVE HEAT",
  "EXCESSIVE HEAT" = "EXCESSIVE HEAT",
  "COLD" = "EXTREME COLD",
  "EXTREME COLD" = "EXTREME COLD",
  "WINTER STORM" = "WINTER STORM",
  "BLIZZARD" = "BLIZZARD",
  "ICE STORM" = "ICE STORM",
  "HURRICANE" = "HURRICANE/TYPHOON",
  "TYPHOON" = "HURRICANE/TYPHOON",
  "TROPICAL STORM" = "TROPICAL STORM",
  "WILDFIRE" = "WILDFIRE",
  "DUST STORM" = "DUST STORM",
  "FOG" = "DENSE FOG",
  "AVALANCHE" = "AVALANCHE",
  "LANDSLIDE" = "DEBRIS FLOW",
  "DEBRIS FLOW" = "DEBRIS FLOW",
  "RIP CURRENT" = "RIP CURRENT",
  "SURF" = "HIGH SURF",
  "TSUNAMI" = "TSUNAMI",
  "VOLCANIC ASH" = "VOLCANIC ASH"
)

# For unmatched types, keep the original uppercase name (but consolidate common variants)
# We'll create a new column 'EVTYPE_CLEAN'
storm_data <- storm_data %>%
  mutate(EVTYPE_CLEAN = case_when(
    grepl("TORNADO", EVTYPE) ~ "TORNADO",
    grepl("THUNDERSTORM WIND|TSTM WIND", EVTYPE) ~ "THUNDERSTORM WIND",
    grepl("HAIL", EVTYPE) ~ "HAIL",
    grepl("FLASH FLOOD", EVTYPE) ~ "FLASH FLOOD",
    grepl("FLOOD", EVTYPE) & !grepl("FLASH", EVTYPE) ~ "FLOOD",
    grepl("LIGHTNING", EVTYPE) ~ "LIGHTNING",
    grepl("HIGH WIND", EVTYPE) ~ "HIGH WIND",
    grepl("HEAT", EVTYPE) ~ "EXCESSIVE HEAT",
    grepl("COLD|WIND CHILL", EVTYPE) ~ "EXTREME COLD",
    grepl("WINTER STORM", EVTYPE) ~ "WINTER STORM",
    grepl("BLIZZARD", EVTYPE) ~ "BLIZZARD",
    grepl("ICE STORM", EVTYPE) ~ "ICE STORM",
    grepl("HURRICANE|TYPHOON", EVTYPE) ~ "HURRICANE/TYPHOON",
    grepl("TROPICAL STORM", EVTYPE) ~ "TROPICAL STORM",
    grepl("WILDFIRE", EVTYPE) ~ "WILDFIRE",
    grepl("DUST STORM", EVTYPE) ~ "DUST STORM",
    grepl("FOG", EVTYPE) ~ "DENSE FOG",
    grepl("AVALANCHE", EVTYPE) ~ "AVALANCHE",
    grepl("DEBRIS FLOW|LANDSLIDE", EVTYPE) ~ "DEBRIS FLOW",
    grepl("RIP CURRENT", EVTYPE) ~ "RIP CURRENT",
    grepl("SURF", EVTYPE) ~ "HIGH SURF",
    grepl("TSUNAMI", EVTYPE) ~ "TSUNAMI",
    grepl("VOLCANIC ASH", EVTYPE) ~ "VOLCANIC ASH",
    TRUE ~ EVTYPE
  ))
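
To see how the first-match rule behaves, the following minimal sketch applies a shortened version of the patterns above to a few made-up event labels. The input strings are invented for illustration and are not taken from the database.

```r
library(dplyr)

# Made-up raw labels, for illustration only
raw <- c("tornado f1", "FLASH FLOODING", "River Flood", "TSTM WIND/HAIL", "DROUGHT")
ev <- toupper(raw)

# Shortened copy of the pattern list; conditions are evaluated top to bottom
# and each element takes the value of the first condition that is TRUE
clean <- case_when(
  grepl("TORNADO", ev) ~ "TORNADO",
  grepl("THUNDERSTORM WIND|TSTM WIND", ev) ~ "THUNDERSTORM WIND",
  grepl("HAIL", ev) ~ "HAIL",
  grepl("FLASH FLOOD", ev) ~ "FLASH FLOOD",
  grepl("FLOOD", ev) & !grepl("FLASH", ev) ~ "FLOOD",
  TRUE ~ ev
)
clean
# "TORNADO" "FLASH FLOOD" "FLOOD" "THUNDERSTORM WIND" "DROUGHT"
```

Note that "TSTM WIND/HAIL" is assigned to THUNDERSTORM WIND rather than HAIL because the wind pattern comes first, and "DROUGHT" passes through unchanged because no pattern matches it.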

Converting Damage Multipliers

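The columns PROPDMGEXP and CROPDMGEXP contain codes for the magnitude of the damage figures: hundreds (H), thousands (K), millions (M), and billions (B). We convert these to numeric multipliers; any other value, including a blank, is given a multiplier of 1.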

# Define multiplier function
damage_multiplier <- function(exp) {
  ifelse(exp %in% c("K", "k"), 1e3,
    ifelse(exp %in% c("M", "m"), 1e6,
      ifelse(exp %in% c("B", "b"), 1e9,
        ifelse(exp %in% c("H", "h"), 1e2, 1))))
}

storm_data <- storm_data %>%
  mutate(
    PROPDMGEXP_U = toupper(PROPDMGEXP),
    CROPDMGEXP_U = toupper(CROPDMGEXP),
    PROP_MULT = damage_multiplier(PROPDMGEXP_U),
    CROP_MULT = damage_multiplier(CROPDMGEXP_U),
    PROP_DAMAGE = PROPDMG * PROP_MULT,
    CROP_DAMAGE = CROPDMG * CROP_MULT
  )
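
The same conversion can also be written as a named-vector lookup, which some readers may find easier to check than nested ifelse calls. This is an alternative sketch, not the code used for the results in this report; exp_lookup and to_multiplier are hypothetical names, and the codes and amounts are made up.

```r
# Hypothetical lookup table: exponent code -> numeric multiplier
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

to_multiplier <- function(exp) {
  # Look up each code by name; codes not in the table (blank, digits,
  # symbols) come back as NA and are replaced by 1
  mult <- unname(exp_lookup[toupper(exp)])
  mult[is.na(mult)] <- 1
  mult
}

# Made-up values for illustration
codes   <- c("K", "m", "B", "", "5", "?")
amounts <- c(25, 2.5, 1, 10, 3, 7)
amounts * to_multiplier(codes)
# 25000 2500000 1000000000 10 3 7
```

Running table(storm_data$PROPDMGEXP) before the conversion is a quick way to see how many records carry codes outside H, K, M, and B.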

Aggregating Health Impacts

We sum fatalities and injuries by cleaned event type, then compute total health impact (fatalities + injuries).

health_impact <- storm_data %>%
  group_by(EVTYPE_CLEAN) %>%
  summarise(
    total_fatalities = sum(FATALITIES, na.rm = TRUE),
    total_injuries = sum(INJURIES, na.rm = TRUE),
    total_health = total_fatalities + total_injuries
  ) %>%
  arrange(desc(total_health))
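
As a sanity check on the grouped sums, the same aggregation can be reproduced in base R on a small data frame. The sketch below uses rowsum() instead of the dplyr pipeline, and the toy values are invented; it is meant only to show how the missing value is skipped and how the totals combine.

```r
# Invented toy data with one missing fatality count
toy <- data.frame(
  EVTYPE_CLEAN = c("TORNADO", "FLOOD", "TORNADO", "FLOOD", "HAIL"),
  FATALITIES   = c(3, 1, 2, NA, 0),
  INJURIES     = c(10, 0, 5, 4, 1),
  stringsAsFactors = FALSE
)

# Sum each column within event type; na.rm = TRUE skips the missing value
sums <- rowsum(toy[, c("FATALITIES", "INJURIES")],
               group = toy$EVTYPE_CLEAN, na.rm = TRUE)

# Combined health impact, sorted from largest to smallest
sums$total_health <- sums$FATALITIES + sums$INJURIES
sums <- sums[order(-sums$total_health), ]
sums
```

Here TORNADO totals 20, FLOOD 5, and HAIL 1. The FLOOD row with a missing fatality count still contributes its 4 injuries.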

Aggregating Economic Impacts

Similarly, we sum property and crop damage by event type.

economic_impact <- storm_data %>%
  group_by(EVTYPE_CLEAN) %>%
  summarise(
    total_prop_damage = sum(PROP_DAMAGE, na.rm = TRUE),
    total_crop_damage = sum(CROP_DAMAGE, na.rm = TRUE),
    total_damage = total_prop_damage + total_crop_damage
  ) %>%
  arrange(desc(total_damage))

Results

Health Impact: Top Event Types

The table below shows the top 10 event types causing the greatest combined fatalities and injuries.

head(health_impact, 10)
## # A tibble: 10 × 4
##    EVTYPE_CLEAN      total_fatalities total_injuries total_health
##    <chr>                        <dbl>          <dbl>        <dbl>
##  1 TORNADO                       5661          91407        97068
##  2 EXCESSIVE HEAT                3138           9224        12362
##  3 THUNDERSTORM WIND              728           9493        10221
##  4 FLOOD                          490           6802         7292
##  5 LIGHTNING                      817           5232         6049
##  6 FLASH FLOOD                   1035           1802         2837
##  7 ICE STORM                       89           1990         2079
##  8 HIGH WIND                      299           1523         1822
##  9 WINTER STORM                   216           1338         1554
## 10 HURRICANE/TYPHOON              133           1333         1466

Figure 1 displays the top 10 event types by total health impact (fatalities + injuries). Tornadoes dominate, accounting for about 97,000 fatalities and injuries combined (5,661 fatalities and 91,407 injuries). Excessive heat, thunderstorm winds, floods, and lightning also rank high.

health_top10 <- head(health_impact, 10)

ggplot(health_top10, aes(x = reorder(EVTYPE_CLEAN, total_health), y = total_health)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Weather Event Types by Total Fatalities and Injuries",
    x = "Event Type",
    y = "Total Fatalities + Injuries"
  ) +
  theme_minimal()
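
To put the tornado figure in proportion, the short calculation below works out its share of the top 10 total and its ratio to the second-ranked category. The numbers are copied from the total_health column of the health table above; nothing else is assumed.

```r
# Combined fatalities + injuries for the top 10 event types (from the table)
total_health <- c(
  "TORNADO" = 97068, "EXCESSIVE HEAT" = 12362, "THUNDERSTORM WIND" = 10221,
  "FLOOD" = 7292, "LIGHTNING" = 6049, "FLASH FLOOD" = 2837,
  "ICE STORM" = 2079, "HIGH WIND" = 1822, "WINTER STORM" = 1554,
  "HURRICANE/TYPHOON" = 1466
)

# Tornado share of the top 10 total, in percent
tornado_share <- 100 * total_health[["TORNADO"]] / sum(total_health)
round(tornado_share, 1)   # 68

# Ratio of tornadoes to the second-ranked category
ratio_to_heat <- total_health[["TORNADO"]] / total_health[["EXCESSIVE HEAT"]]
round(ratio_to_heat, 1)   # 7.9
```

Tornadoes account for about 68% of the combined casualties among the top 10 event types, nearly eight times the figure for excessive heat.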

Economic Impact: Top Event Types

The table below lists the top 10 event types by total economic damage (property + crop damage). Floods cause the most damage, about $161 billion in total. Hurricanes/typhoons (about $91 billion), tornadoes (about $59 billion), and storm surges (about $43 billion) also cause massive losses.

head(economic_impact, 10)
## # A tibble: 10 × 4
##    EVTYPE_CLEAN      total_prop_damage total_crop_damage  total_damage
##    <chr>                         <dbl>             <dbl>         <dbl>
##  1 FLOOD                 150621787745        10847881950 161469669695 
##  2 HURRICANE/TYPHOON      85256410010         5506117800  90762527810 
##  3 TORNADO                58593098029.         417461520  59010559549.
##  4 STORM SURGE            43323536000               5000  43323541000 
##  5 HAIL                   15974569543.        3046937623  19021507166.
##  6 FLASH FLOOD            16906908187.        1532197150  18439105337.
##  7 DROUGHT                 1046106000        13972566000  15018672000 
##  8 THUNDERSTORM WIND       9766600056.        1253447988  11020048044.
##  9 ICE STORM               3946027860         5022113500   8968141360 
## 10 TROPICAL STORM          7714390550          694896000   8409286550

Figure 2 illustrates the top 10 event types by total economic damage (in billions of dollars). Floods are by far the most costly, at about 1.8 times the total for hurricanes/typhoons.

economic_top10 <- head(economic_impact, 10) %>%
  mutate(total_damage_billions = total_damage / 1e9)

ggplot(economic_top10, aes(x = reorder(EVTYPE_CLEAN, total_damage_billions), y = total_damage_billions)) +
  geom_col(fill = "darkorange") +
  coord_flip() +
  labs(
    title = "Top 10 Weather Event Types by Total Economic Damage",
    x = "Event Type",
    y = "Total Damage (Billions of US Dollars)"
  ) +
  theme_minimal()
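
The totals hide how the damage splits between property and crops. The sketch below takes four rows of the economic table above and computes each event type's total in billions of dollars and the crop share of that total. The values are copied from the table; the column names prop, crop, total_billions, and crop_share are introduced here for illustration.

```r
# Property and crop damage in dollars, copied from the economic table
dmg <- data.frame(
  event = c("FLOOD", "HURRICANE/TYPHOON", "DROUGHT", "ICE STORM"),
  prop  = c(150621787745, 85256410010, 1046106000, 3946027860),
  crop  = c(10847881950, 5506117800, 13972566000, 5022113500),
  stringsAsFactors = FALSE
)

# Total damage in billions and the fraction of it that is crop damage
dmg$total_billions <- (dmg$prop + dmg$crop) / 1e9
dmg$crop_share     <- dmg$crop / (dmg$prop + dmg$crop)

data.frame(
  event          = dmg$event,
  total_billions = round(dmg$total_billions, 2),
  crop_pct       = round(100 * dmg$crop_share, 1)
)
```

For floods and hurricanes/typhoons, crop damage is a small fraction of the total (under 7%), whereas for drought it is about 93% and for ice storms about 56%. A ranking by property damage alone would therefore place drought much lower.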

Summary of Findings

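Health: Tornadoes are the most harmful to human health, followed by excessive heat, thunderstorm winds, floods, and lightning when fatalities and injuries are combined. Among the top 10 event types, flash floods rank third by fatalities alone (1,035). Preparedness efforts should prioritize tornado warning systems, heat action plans, and flood safety education.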

Economy: Floods cause the greatest economic losses, exceeding hurricanes and tornadoes. Investment in flood control infrastructure and land-use planning may yield high returns.

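These results are based on NOAA data from 1950 to November 2011. More recent events are not included, and damage figures are summed as recorded, without adjustment for inflation, so the rankings should be re-checked against newer data before being treated as stable.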