Synopsis

This report analyzes the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to identify which types of severe weather events are most harmful to population health and which have the greatest economic consequences. The dataset spans from 1950 to November 2011, covering major storm events across the United States. After loading and cleaning the data, event types were aggregated by total fatalities, injuries, and property/crop damage costs. Tornadoes were found to be the most harmful event type with respect to population health, accounting for the highest combined fatalities and injuries. Floods caused the greatest economic damage overall when combining property and crop damage estimates. These findings can help government and municipal managers prioritize emergency preparedness and resource allocation for the most impactful weather events.


Data Processing

Loading the Data

The data is downloaded from the course website as a bzip2-compressed CSV file and read directly into R using only base R functions. No external packages are required.

url  <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
file <- "StormData.csv.bz2"

if (!file.exists(file)) {
  download.file(url, destfile = file, method = "auto")
}

storm_data <- read.csv(file, stringsAsFactors = FALSE)
cat("Rows:", nrow(storm_data), "\n")
## Rows: 902297
cat("Columns:", ncol(storm_data), "\n")
## Columns: 37
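
Only seven of the 37 columns are used in the rest of this report. As an optional variation, and not what the code above does, base R can skip unwanted columns at read time by setting their colClasses entry to "NULL". The sketch below shows the idea on a small in-memory CSV with made-up values; csv_text, keep, and subset_data are hypothetical names introduced for illustration.

```r
# Toy CSV standing in for the real file (hypothetical values).
csv_text <- "STATE,EVTYPE,FATALITIES,INJURIES,REMARKS
AL,TORNADO,0,15,none
TX,FLOOD,1,0,none"

# Columns we want to keep (a subset of the key columns, for brevity).
keep <- c("EVTYPE", "FATALITIES", "INJURIES")

# Read a single row just to learn the column names.
header <- names(read.csv(text = csv_text, nrows = 1, stringsAsFactors = FALSE))

# "NULL" drops a column at read time; NA lets read.csv infer the type.
col_classes <- ifelse(header %in% keep, NA, "NULL")

subset_data <- read.csv(text = csv_text, colClasses = col_classes,
                        stringsAsFactors = FALSE)
print(names(subset_data))
```

With the real file, the same colClasses vector could be passed alongside the file name instead of text =, which should reduce memory use, though the whole file still has to be decompressed and scanned.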

Inspecting Key Columns

key_cols <- c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")
head(storm_data[, key_cols])
##    EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO          0       15    25.0          K       0           
## 2 TORNADO          0        0     2.5          K       0           
## 3 TORNADO          0        2    25.0          K       0           
## 4 TORNADO          0        2     2.5          K       0           
## 5 TORNADO          0        2     2.5          K       0           
## 6 TORNADO          0        6     2.5          K       0

Processing Health Data

Fatalities and injuries are summed by event type using base R aggregate(), then merged and sorted by total harm.

fatalities <- aggregate(FATALITIES ~ EVTYPE, data = storm_data, FUN = sum)
injuries   <- aggregate(INJURIES   ~ EVTYPE, data = storm_data, FUN = sum)

health_data <- merge(fatalities, injuries, by = "EVTYPE")
health_data$Total <- health_data$FATALITIES + health_data$INJURIES
health_data <- health_data[order(-health_data$Total), ]
rownames(health_data) <- NULL

top_health <- head(health_data, 10)
print(top_health)
##               EVTYPE FATALITIES INJURIES Total
## 1            TORNADO       5633    91346 96979
## 2     EXCESSIVE HEAT       1903     6525  8428
## 3          TSTM WIND        504     6957  7461
## 4              FLOOD        470     6789  7259
## 5          LIGHTNING        816     5230  6046
## 6               HEAT        937     2100  3037
## 7        FLASH FLOOD        978     1777  2755
## 8          ICE STORM         89     1975  2064
## 9  THUNDERSTORM WIND        133     1488  1621
## 10      WINTER STORM        206     1321  1527
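
The two aggregate() calls followed by merge() can also be written as a single call by placing cbind() on the left-hand side of the formula. The sketch below illustrates this on a toy data frame with made-up values (toy and health are hypothetical names); it is an equivalent formulation for complete data like this, not the code that produced the table above.

```r
# Toy data standing in for storm_data (hypothetical values).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(2, 1, 3, 4),
  INJURIES   = c(10, 0, 5, 1),
  stringsAsFactors = FALSE
)

# cbind() on the left-hand side sums both columns per event type in one call.
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Combine the two measures and rank event types by total harm.
health$Total <- health$FATALITIES + health$INJURIES
health <- health[order(-health$Total), ]
rownames(health) <- NULL
print(health)
```

Note that the formula interface drops rows with missing values in any of the listed columns, so the two approaches could differ if FATALITIES and INJURIES had NAs in different rows.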

Processing Economic Data

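The PROPDMGEXP and CROPDMGEXP columns encode the magnitude of each damage figure, either as letter codes (H = hundreds, K = thousands, M = millions, B = billions) or as the digits 2 through 7, which the code below treats as powers of ten. Any other value, including a blank, falls back to a multiplier of 1. We convert these codes to numeric multipliers and compute total economic damage per event type.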

exp_to_num <- function(exp_vec) {
  exp_vec <- toupper(trimws(exp_vec))
  result  <- rep(1, length(exp_vec))
  result[exp_vec == "H"] <- 1e2
  result[exp_vec == "K"] <- 1e3
  result[exp_vec == "2"] <- 1e2
  result[exp_vec == "3"] <- 1e3
  result[exp_vec == "4"] <- 1e4
  result[exp_vec == "5"] <- 1e5
  result[exp_vec == "6"] <- 1e6
  result[exp_vec == "7"] <- 1e7
  result[exp_vec == "M"] <- 1e6
  result[exp_vec == "B"] <- 1e9
  return(result)
}

storm_data$PropDamage  <- storm_data$PROPDMG * exp_to_num(storm_data$PROPDMGEXP)
storm_data$CropDamage  <- storm_data$CROPDMG * exp_to_num(storm_data$CROPDMGEXP)
storm_data$TotalDamage <- storm_data$PropDamage + storm_data$CropDamage

prop_agg  <- aggregate(PropDamage  ~ EVTYPE, data = storm_data, FUN = sum)
crop_agg  <- aggregate(CropDamage  ~ EVTYPE, data = storm_data, FUN = sum)
total_agg <- aggregate(TotalDamage ~ EVTYPE, data = storm_data, FUN = sum)

economic_data <- merge(prop_agg,      crop_agg,  by = "EVTYPE")
economic_data <- merge(economic_data, total_agg, by = "EVTYPE")
economic_data <- economic_data[order(-economic_data$TotalDamage), ]
rownames(economic_data) <- NULL

top_economic <- head(economic_data, 10)
print(top_economic)
##               EVTYPE   PropDamage  CropDamage  TotalDamage
## 1              FLOOD 144657709807  5661968450 150319678257
## 2  HURRICANE/TYPHOON  69305840000  2607872800  71913712800
## 3            TORNADO  56947380677   414953270  57362333947
## 4        STORM SURGE  43323536000        5000  43323541000
## 5               HAIL  15735267513  3025954473  18761221986
## 6        FLASH FLOOD  16822673979  1421317100  18243991079
## 7            DROUGHT   1046106000 13972566000  15018672000
## 8          HURRICANE  11868319010  2741910000  14610229010
## 9        RIVER FLOOD   5118945500  5029459000  10148404500
## 10         ICE STORM   3944927860  5022113500   8967041360
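
The exponent mapping in exp_to_num() can also be expressed as a named lookup vector, which keeps the code-to-multiplier table in one place and makes it easy to list which codes fall through to the default. The sketch below mirrors the same multipliers; exp_lookup, exp_to_num_lookup, and sample_codes are hypothetical names, and the sample codes are made up for illustration.

```r
# Named lookup vector mirroring the multipliers used in exp_to_num().
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9,
                "2" = 1e2, "3" = 1e3, "4" = 1e4,
                "5" = 1e5, "6" = 1e6, "7" = 1e7)

exp_to_num_lookup <- function(exp_vec) {
  codes  <- toupper(trimws(exp_vec))
  result <- unname(exp_lookup[codes])  # NA where the code is not in the table
  result[is.na(result)] <- 1           # same fallback as exp_to_num(): multiplier of 1
  result
}

# Hypothetical sample of codes, including a blank and an unrecognized symbol.
sample_codes <- c("K", "m", "", "B", "?", "5", " h ")
multipliers  <- exp_to_num_lookup(sample_codes)
print(multipliers)

# Codes that fall through to the default multiplier; useful for auditing.
unmapped <- setdiff(unique(toupper(trimws(sample_codes))), names(exp_lookup))
print(unmapped)
```

Running the same setdiff() check against the real PROPDMGEXP and CROPDMGEXP columns would show which codes in the dataset are being treated as a multiplier of 1.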

Results

Question 1: Which Event Types Are Most Harmful to Population Health?

par(mar = c(11, 5, 4, 2))

bar_matrix          <- rbind(top_health$FATALITIES, top_health$INJURIES)
colnames(bar_matrix) <- top_health$EVTYPE

barplot(
  bar_matrix,
  beside      = FALSE,
  col         = c("#d73027", "#fc8d59"),
  las         = 2,
  cex.names   = 0.8,
  main        = "Top 10 Weather Events Most Harmful to Population Health\n(USA, 1950-2011)",
  ylab        = "Count (Fatalities + Injuries)",
  legend.text = c("Fatalities", "Injuries"),
  args.legend = list(x = "topright", bty = "n")
)
Figure 1: Top 10 weather event types by total fatalities and injuries (1950-2011). Tornadoes are by far the most harmful event type for population health, followed by excessive heat and thunderstorm winds.

Key Finding: Tornadoes are overwhelmingly the most harmful weather event for population health, responsible for about 91,000 injuries and about 5,600 fatalities. Excessive heat, thunderstorm winds (TSTM WIND), and floods follow at a much lower magnitude.
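
One caveat on the ranking: the top-10 health table lists TSTM WIND and THUNDERSTORM WIND as separate rows, because EVTYPE labels were aggregated exactly as recorded. Adding just those two rows gives 7,461 + 1,621 = 9,082, which would exceed EXCESSIVE HEAT (8,428), so the order below tornadoes is sensitive to label cleanup. The sketch below shows one way such variants might be consolidated before aggregating. normalize_evtype is a hypothetical helper that handles only the thunderstorm-wind spellings, it was not part of this analysis, and the toy values are made up.

```r
# Hypothetical helper: collapse a few spelling variants of thunderstorm wind.
normalize_evtype <- function(evtype) {
  x <- toupper(trimws(evtype))
  x <- gsub("[[:space:]]+", " ", x)                       # squeeze repeated whitespace
  x <- sub("^TSTM WIND.*$", "THUNDERSTORM WIND", x)       # abbreviated form
  x <- sub("^THUNDERSTORM WINDS?.*$", "THUNDERSTORM WIND", x)  # plural and suffixed forms
  x
}

# Toy data with made-up counts.
toy <- data.frame(
  EVTYPE   = c("TSTM WIND", "THUNDERSTORM WIND", "Thunderstorm Winds", " tornado "),
  INJURIES = c(3, 2, 1, 4),
  stringsAsFactors = FALSE
)

toy$EVTYPE <- normalize_evtype(toy$EVTYPE)
merged <- aggregate(INJURIES ~ EVTYPE, data = toy, FUN = sum)
print(merged)
```

A fuller cleanup would need a mapping for every event category, which is beyond what this sketch attempts.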


Question 2: Which Event Types Have the Greatest Economic Consequences?

par(mar = c(11, 5, 4, 2))

prop_bil <- top_economic$PropDamage  / 1e9
crop_bil <- top_economic$CropDamage  / 1e9

bar_econ            <- rbind(prop_bil, crop_bil)
colnames(bar_econ)  <- top_economic$EVTYPE

barplot(
  bar_econ,
  beside      = FALSE,
  col         = c("#4575b4", "#74add1"),
  las         = 2,
  cex.names   = 0.8,
  main        = "Top 10 Weather Events with Greatest Economic Consequences\n(USA, 1950-2011)",
  ylab        = "Total Damage (Billions USD)",
  legend.text = c("Property Damage", "Crop Damage"),
  args.legend = list(x = "topright", bty = "n")
)
Figure 2: Top 10 weather event types by total economic damage (property + crop) in billions USD (1950-2011). Floods cause the greatest total economic damage, followed by hurricanes/typhoons and tornadoes.

Key Finding: Floods cause the greatest total economic damage (~$150 billion), driven primarily by property damage. Hurricanes/typhoons (~$72 billion) and tornadoes (~$57 billion) follow. Drought stands out because nearly all of its ~$15 billion in damage is crop damage rather than property damage.
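
The contrast between floods and drought can be made concrete by computing each event type's property and crop shares of total damage. The sketch below uses the FLOOD and DROUGHT figures copied from the economic table in this report; econ, PropShare, and CropShare are names introduced here for illustration.

```r
# Figures copied from the economic damage table in this report (USD).
econ <- data.frame(
  EVTYPE     = c("FLOOD", "DROUGHT"),
  PropDamage = c(144657709807, 1046106000),
  CropDamage = c(5661968450, 13972566000),
  stringsAsFactors = FALSE
)

# Share of each damage category in the event type's total, in percent.
econ$TotalDamage <- econ$PropDamage + econ$CropDamage
econ$PropShare   <- round(100 * econ$PropDamage / econ$TotalDamage, 1)
econ$CropShare   <- round(100 * econ$CropDamage / econ$TotalDamage, 1)

print(econ[, c("EVTYPE", "PropShare", "CropShare")], row.names = FALSE)
```

Property damage makes up roughly 96% of the flood total, while crop damage makes up roughly 93% of the drought total.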


Summary Table

cat("=== Top 5 Events by Health Impact ===\n")
## === Top 5 Events by Health Impact ===
print(head(health_data[, c("EVTYPE", "FATALITIES", "INJURIES", "Total")], 5),
      row.names = FALSE)
##          EVTYPE FATALITIES INJURIES Total
##         TORNADO       5633    91346 96979
##  EXCESSIVE HEAT       1903     6525  8428
##       TSTM WIND        504     6957  7461
##           FLOOD        470     6789  7259
##       LIGHTNING        816     5230  6046
cat("\n=== Top 5 Events by Economic Impact (Billions USD) ===\n")
## 
## === Top 5 Events by Economic Impact (Billions USD) ===
top5_econ <- head(economic_data, 5)
top5_print <- data.frame(
  EVTYPE         = top5_econ$EVTYPE,
  "Property_B$"  = round(top5_econ$PropDamage  / 1e9, 2),
  "Crop_B$"      = round(top5_econ$CropDamage  / 1e9, 2),
  "Total_B$"     = round(top5_econ$TotalDamage / 1e9, 2),
  check.names    = FALSE
)
print(top5_print, row.names = FALSE)
##             EVTYPE Property_B$ Crop_B$ Total_B$
##              FLOOD      144.66    5.66   150.32
##  HURRICANE/TYPHOON       69.31    2.61    71.91
##            TORNADO       56.95    0.41    57.36
##        STORM SURGE       43.32    0.00    43.32
##               HAIL       15.74    3.03    18.76

Conclusions

Based on the analysis of the NOAA Storm Database (1950-2011):

  1. Population Health: Tornadoes pose the greatest threat to human life and safety, accounting for the most fatalities and injuries by a wide margin. Emergency managers should prioritize tornado preparedness, warning systems, and sheltering infrastructure.

  2. Economic Impact: Floods cause the most total economic damage, making flood mitigation, infrastructure hardening, and flood insurance programs especially important for reducing financial losses.

These results highlight that different event types dominate different categories of harm, suggesting that a multi-hazard approach is necessary for comprehensive disaster preparedness planning.