Synopsis

This analysis explores the NOAA Storm Database to identify which weather events are most harmful to public health and the economy. We processed data from 1950 to 2011, focusing on fatalities, injuries, and property and crop damage. The results show that tornadoes are the leading cause of fatalities and injuries, while floods and hurricanes cause the most economic damage.

Data Processing

# Load the packages used in the analysis
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)

# 1. Load data
storm_data <- read.csv("repdata_data_StormData.csv.bz2")

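# 2. Map damage exponent codes (H, K, M, B) to numeric multipliers.
#    Any other code, including blanks and NA, is treated as 1.
exp_transform <- function(e) {
  e <- toupper(as.character(e))
  multipliers <- c(H = 100, K = 1000, M = 1e+06, B = 1e+09)
  mult <- unname(multipliers[e])  # unmatched codes come back as NA
  mult[is.na(mult)] <- 1
  mult
}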

# 3. Clean and Calculate totals
important_data <- storm_data %>% 
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP) %>%
  mutate(prop_mult = sapply(PROPDMGEXP, exp_transform),
         crop_mult = sapply(CROPDMGEXP, exp_transform),
         PROPTOTAL = PROPDMG * prop_mult,
         CROPTOTAL = CROPDMG * crop_mult,
         TOTAL_ECON = PROPTOTAL + CROPTOTAL)

# 4. Group by event for Health (Top 10)
health_impact <- important_data %>%
  group_by(EVTYPE) %>%
  summarise(FATALITIES = sum(FATALITIES), INJURIES = sum(INJURIES)) %>%
  arrange(desc(FATALITIES + INJURIES)) %>%
  slice(1:10)

# 5. Group by event for Economy (Top 10)
econ_impact <- important_data %>%
  group_by(EVTYPE) %>%
  summarise(TOTAL_ECON = sum(TOTAL_ECON)) %>%
  arrange(desc(TOTAL_ECON)) %>%
  slice(1:10)
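
To make the exponent handling concrete, the sketch below applies the same kind of multiplier mapping to a few rows of a toy data frame. The damage values are made up for illustration and are not taken from the NOAA data; the lookup-table form is just one way to write the mapping.

```r
# Toy rows; the values are made up for illustration only
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "b", ""),
  stringsAsFactors = FALSE
)

# Exponent codes mapped to multipliers via a named lookup vector
multipliers <- c(H = 100, K = 1000, M = 1e+06, B = 1e+09)
mult <- unname(multipliers[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1  # blank or unrecognised codes count as 1

# Damage in dollars = reported figure times its multiplier
toy$PROPTOTAL <- toy$PROPDMG * mult
print(toy)
```

Lower-case codes such as "b" are upper-cased before the lookup, and a blank code leaves the reported figure unchanged.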

Results

1. Most Harmful Events to Population Health

The following figure illustrates the top 10 weather events that caused the highest number of fatalities and injuries combined across the United States.

# Reshape the data to long format so fatalities and injuries can be plotted together
health_long <- health_impact %>%
  pivot_longer(cols = c(FATALITIES, INJURIES), names_to = "Type", values_to = "Count")

ggplot(health_long, aes(x = reorder(EVTYPE, Count), y = Count, fill = Type)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 10 Health Impacts by Weather Event",
       x = "Event Type", 
       y = "Total Number of People Affected") +
  theme_minimal()
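
The reshaping step is what allows the two measures to be stacked in a single bar per event. A minimal sketch with made-up counts (the event names and numbers here are hypothetical) shows the wide-to-long change that `pivot_longer` performs:

```r
library(tidyr)

# Made-up counts for two event types (illustration only)
wide <- data.frame(
  EVTYPE     = c("EVENT A", "EVENT B"),
  FATALITIES = c(10, 3),
  INJURIES   = c(40, 7)
)

# One row per event and measure: 2 events x 2 measures = 4 rows
long <- pivot_longer(wide,
                     cols = c(FATALITIES, INJURIES),
                     names_to = "Type",
                     values_to = "Count")
print(long)
```

Each event now contributes one row per measure, so `fill = Type` can colour the fatalities and injuries segments separately.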

Public Health Impact

The chart above ranks the top 10 weather events most harmful to the population, counting both fatalities and injuries. Tornadoes are the leading cause of weather-related fatalities and injuries in the US.