This report analyzes the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to identify which types of severe weather events are most harmful to population health and which have the greatest economic consequences. The dataset covers major storm events across the United States from 1950 to November 2011. After loading the data and converting the damage exponent codes to dollar amounts, fatalities, injuries, and property and crop damage were totaled by event type, using the event labels as recorded. Tornadoes were the most harmful event type for population health, with 5,633 fatalities and 91,346 injuries. Floods caused the greatest economic damage, about $150 billion in combined property and crop losses. These findings can help government and municipal managers prioritize emergency preparedness and resource allocation for the most damaging weather events.
The data is downloaded from the course website as a bzip2-compressed CSV file and read directly into R using only base R functions. No external packages are required.
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
file <- "StormData.csv.bz2"
if (!file.exists(file)) {
  download.file(url, destfile = file, method = "auto")
}
storm_data <- read.csv(file, stringsAsFactors = FALSE)
cat("Rows:", nrow(storm_data), "\n")
## Rows: 902297
cat("Columns:", ncol(storm_data), "\n")
## Columns: 37
key_cols <- c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")
head(storm_data[, key_cols])
## EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO 0 15 25.0 K 0
## 2 TORNADO 0 0 2.5 K 0
## 3 TORNADO 0 2 25.0 K 0
## 4 TORNADO 0 2 2.5 K 0
## 5 TORNADO 0 2 2.5 K 0
## 6 TORNADO 0 6 2.5 K 0
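The loading step relies on read.csv() decompressing the bzip2 file on its own. A minimal sketch of that behavior, using a made-up two-row table written to a temporary file (the toy values and the names `tmp`, `con`, and `toy` are illustrative assumptions, not part of the storm data):

```r
# Write a tiny CSV through a bzip2 connection, then read it back with read.csv().
# The two rows below are invented for illustration only.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
write.csv(
  data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(1, 2)),
  con,
  row.names = FALSE
)
close(con)

# read.csv() opens the path with file(), which detects the compression itself,
# so no explicit decompression step is needed
toy <- read.csv(tmp, stringsAsFactors = FALSE)
str(toy)
```

The same call pattern is what reads StormData.csv.bz2 above, only on a much larger file.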
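Fatalities and injuries are summed by event type with base R aggregate(), then merged and sorted by their combined total. Event labels are used exactly as recorded, so variants such as TSTM WIND and THUNDERSTORM WIND are counted separately.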
fatalities <- aggregate(FATALITIES ~ EVTYPE, data = storm_data, FUN = sum)
injuries <- aggregate(INJURIES ~ EVTYPE, data = storm_data, FUN = sum)
health_data <- merge(fatalities, injuries, by = "EVTYPE")
health_data$Total <- health_data$FATALITIES + health_data$INJURIES
health_data <- health_data[order(-health_data$Total), ]
rownames(health_data) <- NULL
top_health <- head(health_data, 10)
print(top_health)
## EVTYPE FATALITIES INJURIES Total
## 1 TORNADO 5633 91346 96979
## 2 EXCESSIVE HEAT 1903 6525 8428
## 3 TSTM WIND 504 6957 7461
## 4 FLOOD 470 6789 7259
## 5 LIGHTNING 816 5230 6046
## 6 HEAT 937 2100 3037
## 7 FLASH FLOOD 978 1777 2755
## 8 ICE STORM 89 1975 2064
## 9 THUNDERSTORM WIND 133 1488 1621
## 10 WINTER STORM 206 1321 1527
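The aggregate, merge, and sort steps used for the health totals can be followed on a small example. This is a sketch on an invented five-row data frame (`toy`, `fat`, `inj`, and `toy_health` are hypothetical names, and the counts are made up), not a rerun of the analysis:

```r
# Invented records: two tornado rows, two flood rows, one heat row
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 3, 4, 0),
  INJURIES   = c(10, 0, 5, 1, 2),
  stringsAsFactors = FALSE
)

# Sum each measure within event type
fat <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)
inj <- aggregate(INJURIES ~ EVTYPE, data = toy, FUN = sum)

# Join the two summaries on the event label and add a combined total
toy_health <- merge(fat, inj, by = "EVTYPE")
toy_health$Total <- toy_health$FATALITIES + toy_health$INJURIES

# Sort from most to least harmful and reset the row names
toy_health <- toy_health[order(-toy_health$Total), ]
rownames(toy_health) <- NULL
print(toy_health)
```

A single call such as aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum) should give the same two sums without the merge; the two-step version is shown here because it mirrors the code above.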
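The PROPDMGEXP and CROPDMGEXP columns hold exponent codes for the damage figures: letters (H = hundreds, K = thousands, M = millions, B = billions) and the digits 2 through 7, which are treated as powers of ten. Any other value, including a blank, is treated as a multiplier of 1. We convert the codes to numeric multipliers and compute total economic damage per event type.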
exp_to_num <- function(exp_vec) {
  # Normalize case and whitespace so "k" and " K" both match "K"
  exp_vec <- toupper(trimws(exp_vec))
  # Default multiplier of 1 for blanks and unrecognized codes
  result <- rep(1, length(exp_vec))
  # Letter codes
  result[exp_vec == "H"] <- 1e2
  result[exp_vec == "K"] <- 1e3
  # Digit codes, read as powers of ten
  result[exp_vec == "2"] <- 1e2
  result[exp_vec == "3"] <- 1e3
  result[exp_vec == "4"] <- 1e4
  result[exp_vec == "5"] <- 1e5
  result[exp_vec == "6"] <- 1e6
  result[exp_vec == "7"] <- 1e7
  result[exp_vec == "M"] <- 1e6
  result[exp_vec == "B"] <- 1e9
  return(result)
}
storm_data$PropDamage <- storm_data$PROPDMG * exp_to_num(storm_data$PROPDMGEXP)
storm_data$CropDamage <- storm_data$CROPDMG * exp_to_num(storm_data$CROPDMGEXP)
storm_data$TotalDamage <- storm_data$PropDamage + storm_data$CropDamage
prop_agg <- aggregate(PropDamage ~ EVTYPE, data = storm_data, FUN = sum)
crop_agg <- aggregate(CropDamage ~ EVTYPE, data = storm_data, FUN = sum)
total_agg <- aggregate(TotalDamage ~ EVTYPE, data = storm_data, FUN = sum)
economic_data <- merge(prop_agg, crop_agg, by = "EVTYPE")
economic_data <- merge(economic_data, total_agg, by = "EVTYPE")
economic_data <- economic_data[order(-economic_data$TotalDamage), ]
rownames(economic_data) <- NULL
top_economic <- head(economic_data, 10)
print(top_economic)
## EVTYPE PropDamage CropDamage TotalDamage
## 1 FLOOD 144657709807 5661968450 150319678257
## 2 HURRICANE/TYPHOON 69305840000 2607872800 71913712800
## 3 TORNADO 56947380677 414953270 57362333947
## 4 STORM SURGE 43323536000 5000 43323541000
## 5 HAIL 15735267513 3025954473 18761221986
## 6 FLASH FLOOD 16822673979 1421317100 18243991079
## 7 DROUGHT 1046106000 13972566000 15018672000
## 8 HURRICANE 11868319010 2741910000 14610229010
## 9 RIVER FLOOD 5118945500 5029459000 10148404500
## 10 ICE STORM 3944927860 5022113500 8967041360
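The exponent conversion can also be written as a lookup table, which keeps the code-to-multiplier mapping in one place. The sketch below is an alternative formulation rather than the function used in this report; `exp_lookup` and `exp_to_num_lookup` are hypothetical names, and the sample codes and amounts are invented:

```r
# Code-to-multiplier table covering the same codes the report handles
exp_lookup <- c(
  H = 1e2, K = 1e3, M = 1e6, B = 1e9,
  "2" = 1e2, "3" = 1e3, "4" = 1e4, "5" = 1e5, "6" = 1e6, "7" = 1e7
)

exp_to_num_lookup <- function(exp_vec) {
  key <- toupper(trimws(exp_vec))
  mult <- unname(exp_lookup[key])  # NA for codes that are not in the table
  mult[is.na(mult)] <- 1           # same fallback as the report: multiplier 1
  mult
}

# Invented sample: lowercase, blank, digit, and unrecognized codes
codes  <- c("K", "m", "B", "", "5", "?")
values <- c(25, 2.5, 1, 10, 3, 7)
damage <- values * exp_to_num_lookup(codes)
print(damage)
```

Blank and unrecognized codes fall through to a multiplier of 1 here, which matches the default in exp_to_num() above.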
par(mar = c(11, 5, 4, 2))
bar_matrix <- rbind(top_health$FATALITIES, top_health$INJURIES)
colnames(bar_matrix) <- top_health$EVTYPE
barplot(
  bar_matrix,
  beside = FALSE,
  col = c("#d73027", "#fc8d59"),
  las = 2,
  cex.names = 0.8,
  main = "Top 10 Weather Events Most Harmful to Population Health\n(USA, 1950-2011)",
  ylab = "Count (Fatalities + Injuries)",
  legend.text = c("Fatalities", "Injuries"),
  args.legend = list(x = "topright", bty = "n")
)
Figure 1: Top 10 weather event types by total fatalities and injuries (1950-2011). Tornadoes are by far the most harmful event type for population health, followed by excessive heat and thunderstorm winds.
Key Finding: Tornadoes are overwhelmingly the most harmful weather event for population health, responsible for over 90,000 injuries and more than 5,000 fatalities. Excessive heat (8,428 combined), thunderstorm winds (TSTM WIND, 7,461), and floods (7,259) follow at less than a tenth of the tornado total.
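The health table ranks TSTM WIND and THUNDERSTORM WIND, and likewise HEAT and EXCESSIVE HEAT, as separate event types. A rough sensitivity check is sketched below using only the ten totals printed in that table. The mapping in `label_map` is an assumption made for illustration, not an official NOAA recoding, and the full dataset may contain further label variants that this sketch does not cover:

```r
# Combined fatalities + injuries, copied from the top-10 health table
top10 <- data.frame(
  EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND", "FLOOD", "LIGHTNING",
             "HEAT", "FLASH FLOOD", "ICE STORM", "THUNDERSTORM WIND",
             "WINTER STORM"),
  Total  = c(96979, 8428, 7461, 7259, 6046, 3037, 2755, 2064, 1621, 1527),
  stringsAsFactors = FALSE
)

# Hypothetical grouping of near-duplicate labels (illustrative only)
label_map <- c("TSTM WIND" = "THUNDERSTORM WIND", "EXCESSIVE HEAT" = "HEAT")

# Replace mapped labels, keep all others unchanged
hit <- top10$EVTYPE %in% names(label_map)
top10$Group <- top10$EVTYPE
top10$Group[hit] <- unname(label_map[top10$EVTYPE[hit]])

# Re-total by the grouped label and re-rank
grouped <- aggregate(Total ~ Group, data = top10, FUN = sum)
grouped <- grouped[order(-grouped$Total), ]
rownames(grouped) <- NULL
print(grouped)
```

Under this assumed grouping tornadoes still rank first by a wide margin; only the totals of the runner-up categories change.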
par(mar = c(11, 5, 4, 2))
prop_bil <- top_economic$PropDamage / 1e9
crop_bil <- top_economic$CropDamage / 1e9
bar_econ <- rbind(prop_bil, crop_bil)
colnames(bar_econ) <- top_economic$EVTYPE
barplot(
  bar_econ,
  beside = FALSE,
  col = c("#4575b4", "#74add1"),
  las = 2,
  cex.names = 0.8,
  main = "Top 10 Weather Events with Greatest Economic Consequences\n(USA, 1950-2011)",
  ylab = "Total Damage (Billions USD)",
  legend.text = c("Property Damage", "Crop Damage"),
  args.legend = list(x = "topright", bty = "n")
)
Figure 2: Top 10 weather event types by total economic damage (property + crop) in billions USD (1950-2011). Floods cause the greatest total economic damage, followed by hurricanes/typhoons and tornadoes.
Key Finding: Floods cause the greatest total economic damage (about $150 billion), driven primarily by property damage (about $145 billion). Hurricanes/typhoons (about $72 billion) and tornadoes (about $57 billion) follow. Drought stands out because about $14.0 billion of its $15.0 billion total is crop damage.
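The remark about drought can be made concrete by computing the crop share of total damage for each of the ten event types. The sketch below copies the property and crop figures from the economic table above; `top10_econ` and `CropShare` are names introduced here for illustration:

```r
# Property and crop damage in USD, copied from the top-10 economic table
top10_econ <- data.frame(
  EVTYPE = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO", "STORM SURGE", "HAIL",
             "FLASH FLOOD", "DROUGHT", "HURRICANE", "RIVER FLOOD", "ICE STORM"),
  PropDamage = c(144657709807, 69305840000, 56947380677, 43323536000,
                 15735267513, 16822673979, 1046106000, 11868319010,
                 5118945500, 3944927860),
  CropDamage = c(5661968450, 2607872800, 414953270, 5000, 3025954473,
                 1421317100, 13972566000, 2741910000, 5029459000, 5022113500),
  stringsAsFactors = FALSE
)

# Fraction of each event type's total damage that is crop damage
top10_econ$CropShare <- top10_econ$CropDamage /
  (top10_econ$PropDamage + top10_econ$CropDamage)

# Rank from most to least crop-dominated
top10_econ <- top10_econ[order(-top10_econ$CropShare), ]
print(top10_econ[, c("EVTYPE", "CropShare")], row.names = FALSE, digits = 2)
```

On these figures drought has the highest crop share, roughly 93% of its total, while floods sit below 5%.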
cat("=== Top 5 Events by Health Impact ===\n")
## === Top 5 Events by Health Impact ===
print(head(health_data[, c("EVTYPE", "FATALITIES", "INJURIES", "Total")], 5),
      row.names = FALSE)
## EVTYPE FATALITIES INJURIES Total
## TORNADO 5633 91346 96979
## EXCESSIVE HEAT 1903 6525 8428
## TSTM WIND 504 6957 7461
## FLOOD 470 6789 7259
## LIGHTNING 816 5230 6046
cat("\n=== Top 5 Events by Economic Impact (Billions USD) ===\n")
##
## === Top 5 Events by Economic Impact (Billions USD) ===
top5_econ <- head(economic_data, 5)
top5_print <- data.frame(
  EVTYPE = top5_econ$EVTYPE,
  "Property_B$" = round(top5_econ$PropDamage / 1e9, 2),
  "Crop_B$" = round(top5_econ$CropDamage / 1e9, 2),
  "Total_B$" = round(top5_econ$TotalDamage / 1e9, 2),
  check.names = FALSE
)
print(top5_print, row.names = FALSE)
## EVTYPE Property_B$ Crop_B$ Total_B$
## FLOOD 144.66 5.66 150.32
## HURRICANE/TYPHOON 69.31 2.61 71.91
## TORNADO 56.95 0.41 57.36
## STORM SURGE 43.32 0.00 43.32
## HAIL 15.74 3.03 18.76
Based on the analysis of the NOAA Storm Database (1950-2011):
Population Health: Tornadoes pose the greatest threat to human life and safety, accounting for the most fatalities and injuries by a wide margin. Emergency managers should prioritize tornado preparedness, warning systems, and sheltering infrastructure.
Economic Impact: Floods cause the most total economic damage, making flood mitigation, infrastructure hardening, and flood insurance programs especially important for reducing financial losses.
These results highlight that different event types dominate different categories of harm, suggesting that a multi-hazard approach is necessary for comprehensive disaster preparedness planning.