Synopsis

This analysis explores the NOAA Storm Database from 1950 to 2011 to determine which types of severe weather events are most harmful to public health and which cause the greatest economic damage in the United States. The data are read from the original compressed CSV file, and fatalities, injuries, and total economic loss (property plus crop damage) are aggregated by event type. Results show that tornadoes cause the highest number of injuries and fatalities, while floods and hurricanes lead to the largest economic losses. The findings aim to inform decision-makers responsible for disaster preparedness and resource prioritization.

Data Processing

The dataset was downloaded as a compressed .csv.bz2 file and loaded directly into R. The variables PROPDMG and CROPDMG hold the damage amounts, and their associated exponent codes (PROPDMGEXP, CROPDMGEXP) define the magnitude (e.g., K = thousands, M = millions, B = billions). Each amount was multiplied by the value of its exponent code to obtain a dollar figure for analysis; digit codes were treated as powers of ten, and blank or unrecognized codes as a multiplier of 1.

library(dplyr)
library(ggplot2)
library(readr)
data <- read.csv(bzfile("C:/Users/AAmaya/Downloads/repdata_data_StormData.csv.bz2"))

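# Convert a single exponent code to its numeric multiplier.
# K/M/B are thousands/millions/billions, digit codes are powers of ten,
# and anything else (blank, "+", "-", "?") falls back to 1.
convert_exp <- function(exp) {
  if (is.na(exp)) return(1)  # guard: comparing NA inside if() would raise an error
  exp <- toupper(exp)
  if (exp == "K") return(1e3)
  if (exp == "M") return(1e6)
  if (exp == "B") return(1e9)
  if (grepl("^[0-9]+$", exp)) return(10 ^ as.numeric(exp))
  return(1)
}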

# Apply conversion
data$PROPDMGEXP <- sapply(data$PROPDMGEXP, convert_exp)
data$CROPDMGEXP <- sapply(data$CROPDMGEXP, convert_exp)

# Calculate total damage
data <- data %>%
  mutate(
    prop_damage = PROPDMG * PROPDMGEXP,
    crop_damage = CROPDMG * CROPDMGEXP,
    total_damage = prop_damage + crop_damage
  )
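
The element-by-element sapply() call above can be slow on a dataset of this size. As a possible alternative, the sketch below applies the same mapping in a vectorized way using base R only. The function name exp_to_multiplier is hypothetical and is not part of the original analysis, and the input vector is a toy example rather than values from the Storm Database.

```r
# Hypothetical vectorized counterpart to convert_exp(); base R only.
exp_to_multiplier <- function(exp) {
  exp <- toupper(as.character(exp))
  mult <- rep(1, length(exp))                 # default multiplier for blank or unrecognized codes
  mult[exp %in% "K"] <- 1e3                   # thousands
  mult[exp %in% "M"] <- 1e6                   # millions
  mult[exp %in% "B"] <- 1e9                   # billions
  is_digit <- grepl("^[0-9]+$", exp)          # digit codes are treated as powers of ten
  mult[is_digit] <- 10 ^ as.numeric(exp[is_digit])
  mult
}

# Toy input for illustration only
multipliers <- exp_to_multiplier(c("K", "m", "B", "2", "", "?"))
print(multipliers)
```

Under this sketch, data$PROPDMGEXP and data$CROPDMGEXP could be passed to the function directly, without sapply().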

Results

health <- data %>%
  group_by(EVTYPE) %>%
  summarise(
    fatalities = sum(FATALITIES, na.rm = TRUE),
    injuries = sum(INJURIES, na.rm = TRUE),
    total_health_impact = fatalities + injuries
  ) %>%
  arrange(desc(total_health_impact)) %>%
  slice(1:10)

# Plot
ggplot(health, aes(x=reorder(EVTYPE, -total_health_impact), y=total_health_impact, fill=EVTYPE)) +
  geom_col() +
  theme(axis.text.x = element_text(angle=45, hjust=1)) +
  labs(
    title = "Top 10 Weather Events Impacting Population Health",
    x = "Event Type",
    y = "Fatalities + Injuries"
  )
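
The economic side of the question can be summarized the same way as the health impact. The following is a minimal sketch, assuming the total_damage column computed in the Data Processing step; the helper name top_damage_events and the toy data frame are hypothetical and are used only to keep the example self-contained.

```r
library(dplyr)

# Hypothetical helper: rank event types by their summed total_damage.
top_damage_events <- function(df, n = 10) {
  df %>%
    group_by(EVTYPE) %>%
    summarise(total_damage = sum(total_damage, na.rm = TRUE), .groups = "drop") %>%
    arrange(desc(total_damage)) %>%
    head(n)
}

# Toy rows for illustration only; these are not values from the Storm Database.
toy <- data.frame(
  EVTYPE = c("FLOOD", "TORNADO", "FLOOD", "HAIL"),
  total_damage = c(2e6, 5e5, 1e6, 1e3)
)
top_toy <- top_damage_events(toy, n = 2)
print(top_toy)
```

On the full dataset, a call such as top_damage_events(data) after the mutate() step would give the ten costliest event types, which could then be plotted with geom_col() in the same way as the health chart.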

Conclusion

Tornadoes are the most harmful weather events to population health in the United States, causing the highest combined number of fatalities and injuries. Floods, hurricanes, and tornadoes are the most economically damaging event types. This information can help policymakers and emergency planners prioritize efforts and resources for mitigating the effects of future severe weather events.