Severe Weather Events and Their Impact on Public Health and Economy
Synopsis
This analysis explores the U.S. NOAA Storm Database to identify which weather events are most harmful to population health and which have the greatest economic consequences. The dataset includes records of severe weather events across the United States. Population health impact is measured using fatalities and injuries, while economic impact is assessed using property and crop damage. After the data are processed and cleaned, records are aggregated by event type to determine total impact. The results show that tornadoes are the most harmful to population health, causing the highest number of injuries and fatalities. Flood-related events and hurricanes contribute significantly to economic damage. The findings highlight the importance of prioritizing preparedness for high-impact event types. Visualizations support the conclusions. The analysis is reproducible and includes all necessary code.
Data Processing
knitr::opts_chunk$set(echo = TRUE)
# Load required libraries
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
# Load data (starting from raw file)
file_path <- "repdata_data_StormData.csv.bz2"
storm_data <- read.csv(file_path, stringsAsFactors = FALSE)
# Inspect structure
str(storm_data)
## 'data.frame': 902297 obs. of 37 variables:
## $ STATE__ : num 1 1 1 1 1 1 1 1 1 1 ...
## $ BGN_DATE : chr "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
## $ BGN_TIME : chr "0130" "0145" "1600" "0900" ...
## $ TIME_ZONE : chr "CST" "CST" "CST" "CST" ...
## $ COUNTY : num 97 3 57 89 43 77 9 123 125 57 ...
## $ COUNTYNAME: chr "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
## $ STATE : chr "AL" "AL" "AL" "AL" ...
## $ EVTYPE : chr "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
## $ BGN_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ BGN_AZI : chr "" "" "" "" ...
## $ BGN_LOCATI: chr "" "" "" "" ...
## $ END_DATE : chr "" "" "" "" ...
## $ END_TIME : chr "" "" "" "" ...
## $ COUNTY_END: num 0 0 0 0 0 0 0 0 0 0 ...
## $ COUNTYENDN: logi NA NA NA NA NA NA ...
## $ END_RANGE : num 0 0 0 0 0 0 0 0 0 0 ...
## $ END_AZI : chr "" "" "" "" ...
## $ END_LOCATI: chr "" "" "" "" ...
## $ LENGTH : num 14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
## $ WIDTH : num 100 150 123 100 150 177 33 33 100 100 ...
## $ F : int 3 2 2 2 2 2 2 1 3 3 ...
## $ MAG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ INJURIES : num 15 0 2 2 2 6 1 0 14 0 ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ PROPDMGEXP: chr "K" "K" "K" "K" ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ CROPDMGEXP: chr "" "" "" "" ...
## $ WFO : chr "" "" "" "" ...
## $ STATEOFFIC: chr "" "" "" "" ...
## $ ZONENAMES : chr "" "" "" "" ...
## $ LATITUDE : num 3040 3042 3340 3458 3412 ...
## $ LONGITUDE : num 8812 8755 8742 8626 8642 ...
## $ LATITUDE_E: num 3051 0 0 0 0 ...
## $ LONGITUDE_: num 8806 0 0 0 0 ...
## $ REMARKS : chr "" "" "" "" ...
## $ REFNUM : num 1 2 3 4 5 6 7 8 9 10 ...
# Select relevant variables
storm_subset <- storm_data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP,
         CROPDMG, CROPDMGEXP)
# Convert damage multipliers to numeric values
convert_exp <- function(exp) {
  # Vectorized lookup: H = hundreds, K = thousands, M = millions, B = billions
  mult <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)[toupper(exp)]
  # Any other code (blank, digits, symbols) is treated as a multiplier of 1
  mult[is.na(mult)] <- 1
  unname(mult)
}
storm_subset$PROP_MULT <- convert_exp(storm_subset$PROPDMGEXP)
storm_subset$CROP_MULT <- convert_exp(storm_subset$CROPDMGEXP)
# Calculate total damage
storm_subset <- storm_subset %>%
  mutate(
    PROP_DAMAGE = PROPDMG * PROP_MULT,
    CROP_DAMAGE = CROPDMG * CROP_MULT,
    TOTAL_DAMAGE = PROP_DAMAGE + CROP_DAMAGE
  )
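To make the multiplier step concrete, the sketch below applies the same H/K/M/B convention to a few made-up rows. The data frame `toy` and its values are hypothetical. Treating every other code (blank, digits, symbols) as a multiplier of 1 mirrors the function above, but it is an assumption of this analysis rather than a documented NOAA rule.

```r
# Hypothetical rows, not taken from the NOAA file
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "m", "B", ""),
  stringsAsFactors = FALSE
)

# Named lookup of the recognized exponent codes
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Match case-insensitively; anything unmatched falls back to 1 (assumption)
mult <- unname(exp_lookup[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1

# Damage in USD: 25 "K" becomes 25000, 2.5 "m" becomes 2500000, and so on
toy$PROP_DAMAGE <- toy$PROPDMG * mult
print(toy)
```

The blank code in the last row leaves the value at 10, which shows how unrecognized codes pass through unscaled.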
# Aggregate health impact
health_impact <- storm_subset %>%
  group_by(EVTYPE) %>%
  summarise(
    total_fatalities = sum(FATALITIES, na.rm = TRUE),
    total_injuries = sum(INJURIES, na.rm = TRUE),
    total_health = total_fatalities + total_injuries
  ) %>%
  arrange(desc(total_health))
# Top 10 health-impacting events
top_health <- head(health_impact, 10)
# Aggregate economic impact
economic_impact <- storm_subset %>%
  group_by(EVTYPE) %>%
  summarise(
    total_damage = sum(TOTAL_DAMAGE, na.rm = TRUE)
  ) %>%
  arrange(desc(total_damage))
# Top 10 economic-impacting events
top_economic <- head(economic_impact, 10)
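The aggregation above groups on the raw EVTYPE labels, so differences in case or spacing would split one event type across several groups. A minimal sketch of a normalization step, assuming that only case and whitespace need harmonizing, is shown below. The `toy` data frame and the `EVTYPE_CLEAN` column are hypothetical and use base R only.

```r
# Hypothetical labels illustrating inconsistent spelling of one event type
toy <- data.frame(
  EVTYPE     = c("TORNADO", " Tornado", "tornado  ", "HEAVY  RAIN", "Heavy Rain"),
  FATALITIES = c(2, 1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Upper-case, trim both ends, and collapse internal runs of whitespace
toy$EVTYPE_CLEAN <- gsub("\\s+", " ", trimws(toupper(toy$EVTYPE)))

# Totals per cleaned label
totals <- aggregate(FATALITIES ~ EVTYPE_CLEAN, data = toy, FUN = sum)
print(totals)
```

This does not merge genuine synonyms or abbreviations of the same event type. Combining those would need an explicit mapping table, which is not attempted here.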
Results
1. Most Harmful Events to Population Health
ggplot(top_health, aes(x = reorder(EVTYPE, total_health), y = total_health)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Weather Events by Population Health Impact",
    x = "Event Type",
    y = "Total (Fatalities + Injuries)"
  )
Findings: Tornadoes are the most harmful event type, followed by
excessive heat and floods. These events cause the highest combined
number of injuries and fatalities, making them critical targets for
public safety planning.
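The combined total weights one fatality the same as one injury, so the ranking can differ when each measure is considered on its own. The sketch below illustrates this with made-up totals. The `toy_health` data frame and its event labels are hypothetical and are not results from the NOAA data.

```r
# Hypothetical per-event totals, not results from the NOAA data
toy_health <- data.frame(
  EVTYPE           = c("A", "B", "C"),
  total_fatalities = c(50, 200, 10),
  total_injuries   = c(900, 100, 300),
  stringsAsFactors = FALSE
)

# Rank each measure separately instead of by their sum
by_fatalities <- toy_health[order(-toy_health$total_fatalities), ]
by_injuries   <- toy_health[order(-toy_health$total_injuries), ]

by_fatalities$EVTYPE  # "B" "A" "C"
by_injuries$EVTYPE    # "A" "C" "B"
```

Applying the same two orderings to `health_impact` would show whether the leading event types hold their position under both measures.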
2. Events with the Greatest Economic Consequences
ggplot(top_economic, aes(x = reorder(EVTYPE, total_damage), y = total_damage)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Weather Events by Economic Damage",
    x = "Event Type",
    y = "Total Damage (USD)"
  )
Findings: Floods, hurricanes/typhoons, and storm surges are responsible for the highest economic losses. These events often cause large-scale infrastructure damage and agricultural loss.
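Damage totals of this magnitude are easier to read in billions of dollars than in raw USD. The sketch below shows one way to rescale them, assuming a hypothetical `toy_economic` data frame with made-up values. The `damage_billion` column is an illustrative name, not part of the analysis above.

```r
# Hypothetical totals in USD, not results from the NOAA data
toy_economic <- data.frame(
  EVTYPE       = c("A", "B"),
  total_damage = c(1.5e11, 7.2e10),
  stringsAsFactors = FALSE
)

# Rescale to billions so axis labels avoid scientific notation
toy_economic$damage_billion <- toy_economic$total_damage / 1e9
print(toy_economic)
```

In a plot, mapping the y aesthetic to the rescaled column and labelling the axis "Total Damage (billion USD)" would be the corresponding change.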
Conclusion
The analysis shows that tornadoes have the greatest impact on population health, while floods and hurricanes dominate in terms of economic damage. These findings can help government and municipal managers prioritize preparedness strategies and allocate resources more effectively.