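This report analyzes the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database to answer two key questions: which types of weather events are most harmful to population health, and which have the greatest economic consequences?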
The analysis covers weather events from 1950 through 2011. After processing and cleaning the data, we find that tornadoes are the most dangerous events for human health, causing the most fatalities and injuries. For economic impacts, floods result in the greatest damage to property and crops. These findings can help government officials prioritize resources for severe weather preparedness.
Loading the Data

We begin with the raw data file without any preprocessing:
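library(dplyr)    # mutate(), case_when(), group_by(), summarise()
library(ggplot2)  # plots in the Results section
storm_data <- read.csv("repdata_data_StormData.csv.bz2")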
The event type (EVTYPE) variable required standardization:
storm_data <- storm_data %>%
  mutate(
    EVTYPE = tolower(EVTYPE),
    EVTYPE = trimws(EVTYPE),
    EVTYPE = case_when(
      grepl("tornado|funnel", EVTYPE) ~ "tornado",
      grepl("heat|hot", EVTYPE) ~ "excessive heat",
      grepl("flood", EVTYPE) ~ "flood",
      grepl("hurricane|typhoon", EVTYPE) ~ "hurricane",
      grepl("thunderstorm|tstm", EVTYPE) ~ "thunderstorm",
      grepl("lightning", EVTYPE) ~ "lightning",
      grepl("blizzard", EVTYPE) ~ "winter storm",
      TRUE ~ EVTYPE
    )
  )
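Because `case_when()` returns the label of the first pattern that matches, the order of the rules matters. The sketch below reproduces the same first-match logic in base R on a few made-up labels. The `rules` vector, the `standardize_evtype()` helper, and `sample_types` are illustrative names that are not part of the analysis itself.

```r
# Same patterns and labels as the rules above, in the same order
rules <- c(
  "tornado|funnel"    = "tornado",
  "heat|hot"          = "excessive heat",
  "flood"             = "flood",
  "hurricane|typhoon" = "hurricane",
  "thunderstorm|tstm" = "thunderstorm",
  "lightning"         = "lightning",
  "blizzard"          = "winter storm"
)

# Hypothetical helper: the first matching pattern wins, unmatched labels are kept
standardize_evtype <- function(x) {
  x <- trimws(tolower(x))
  out <- x
  matched <- rep(FALSE, length(x))
  for (pattern in names(rules)) {
    hit <- !matched & grepl(pattern, x)
    out[hit] <- rules[[pattern]]
    matched <- matched | hit
  }
  out
}

# Made-up labels in the style of the raw EVTYPE column
sample_types <- c(" TORNADO F2", "Funnel Cloud", "TSTM WIND", "Flash Flood", "HEAVY RAIN")
standardized <- standardize_evtype(sample_types)
print(data.frame(raw = sample_types, standardized = standardized))
```

Labels that match none of the patterns, such as "heavy rain" here, pass through unchanged apart from lower-casing and trimming.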
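Property and crop damage values are stored as a number plus an exponent code (K, M, or B), so they needed conversion to dollar amounts:
storm_data <- storm_data %>%
  mutate(
    # Exponent codes appear in mixed case (e.g. "m" and "M"), so upper-case them first
    PROPDMGEXP = toupper(PROPDMGEXP),
    CROPDMGEXP = toupper(CROPDMGEXP),
    PROPDMG_adj = PROPDMG * case_when(
      PROPDMGEXP == "K" ~ 1000,
      PROPDMGEXP == "M" ~ 1e6,
      PROPDMGEXP == "B" ~ 1e9,
      TRUE ~ 1  # any other code (including blank) leaves the value unscaled
    ),
    CROPDMG_adj = CROPDMG * case_when(
      CROPDMGEXP == "K" ~ 1000,
      CROPDMGEXP == "M" ~ 1e6,
      CROPDMGEXP == "B" ~ 1e9,
      TRUE ~ 1
    ),
    TOTAL_DAMAGE = PROPDMG_adj + CROPDMG_adj
  )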
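To make the conversion concrete, here is a minimal base R sketch of the same K/M/B mapping applied to made-up rows. The `multiplier` lookup, the `to_dollars()` helper, and the `toy` data frame are hypothetical names introduced for illustration. Upper-casing the code before the lookup is an assumption about how lowercase codes such as "m" should be read.

```r
# Multipliers for the three exponent codes handled in the conversion above
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale a damage figure by its exponent code.
# Codes outside K/M/B (including blanks) fall back to a multiplier of 1.
to_dollars <- function(value, exp_code) {
  m <- multiplier[toupper(exp_code)]
  m[is.na(m)] <- 1
  value * unname(m)
}

# Made-up rows, not taken from the storm database
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 500),
  PROPDMGEXP = c("K", "m", "B", ""),
  stringsAsFactors = FALSE
)
toy$PROPDMG_adj <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP)
print(toy)
```

A named lookup vector keeps the mapping in one place, which may be easier to extend than repeating the `case_when()` branches for each column.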
Preparing Analysis Data:
# For health impacts
health_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(
    Fatalities = sum(FATALITIES),
    Injuries = sum(INJURIES),
    Total_Health = sum(FATALITIES + INJURIES)
  ) %>%
  arrange(desc(Total_Health)) %>%
  filter(Total_Health > 0)

# For economic impacts
econ_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(
    Property_Damage = sum(PROPDMG_adj),
    Crop_Damage = sum(CROPDMG_adj),
    Total_Damage = sum(TOTAL_DAMAGE)
  ) %>%
  arrange(desc(Total_Damage)) %>%
  filter(Total_Damage > 0)
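The group, sum, filter, and sort pattern used for both tables can be checked on a toy data set. The sketch below uses base R (`aggregate()` and `order()`) rather than dplyr so that it runs without extra packages. The `toy` rows are made up and the result is only meant to show the shape of the output.

```r
# Made-up event records, not taken from the storm database
toy <- data.frame(
  EVTYPE     = c("tornado", "flood", "tornado", "lightning", "hail"),
  FATALITIES = c(3, 1, 2, 1, 0),
  INJURIES   = c(40, 5, 10, 2, 0),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries within each event type
toy_health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
toy_health$Total_Health <- toy_health$FATALITIES + toy_health$INJURIES

# Drop event types with no recorded harm, then sort from most to least harmful
toy_health <- toy_health[toy_health$Total_Health > 0, ]
toy_health <- toy_health[order(-toy_health$Total_Health), ]
print(toy_health)
```

In this toy table the two tornado rows collapse into one, and "hail" disappears because its total is zero.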
Results

Most Harmful Events to Population Health:
health_top10 <- head(health_impact, 10)
ggplot(health_top10, aes(x = reorder(EVTYPE, Total_Health), y = Total_Health)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Most Harmful Weather Events to Population Health",
    x = "",
    y = "Total Fatalities + Injuries"
  ) +
  theme_minimal()
Key Findings:
Tornadoes cause the most harm (`r format(health_top10$Total_Health[1], big.mark = ",")` total fatalities and injuries)
Excessive heat is the second most dangerous event type
Floods and thunderstorms are also significant
Events with Greatest Economic Consequences:
econ_top10 <- head(econ_impact, 10)
ggplot(econ_top10, aes(x = reorder(EVTYPE, Total_Damage), y = Total_Damage/1e9)) +
  geom_col(fill = "darkorange") +
  coord_flip() +
  labs(
    title = "Top 10 Most Costly Weather Events",
    x = "",
    y = "Total Damage (Billions USD)"
  ) +
  theme_minimal()
Key Findings:
Floods cause the most damage (`r format(econ_top10$Total_Damage[1]/1e9, digits = 2)` billion USD)
Hurricanes are second (`r format(econ_top10$Total_Damage[2]/1e9, digits = 2)` billion USD)
Tornadoes are third (`r format(econ_top10$Total_Damage[3]/1e9, digits = 2)` billion USD)
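The figures in the findings are produced with inline `format()` calls. The following sketch illustrates the two variants used in this report. The values are made up for illustration and are not results from the storm data.

```r
# Made-up values for illustration only
total_cases  <- 1234567   # a count of fatalities plus injuries
total_damage <- 98.76e9   # a damage total in dollars

# big.mark inserts a thousands separator
cases_label <- format(total_cases, big.mark = ",")

# Dividing by 1e9 expresses dollars in billions;
# digits = 2 requests two significant digits
damage_label <- format(total_damage / 1e9, digits = 2)

print(cases_label)
print(damage_label)
```

Note that `digits` sets a minimum number of significant digits, so whole-number digits are never dropped: a value of 150.3 is shown as "150", not "1.5e+02".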
Conclusion

Based on the analysis of NOAA storm data from 1950 through 2011:
For Public Health Protection, priority should be given to:
Tornado warning systems and shelters
Heat wave response plans
Flood safety measures
For Economic Protection, focus should be on:
Flood prevention infrastructure
Hurricane-resistant building codes
Agricultural protection systems
These findings provide evidence-based guidance for severe weather preparedness planning.