Synopsis

This analysis explores the NOAA Storm Database to identify which weather event types had the greatest impact on public health and the economy in the United States between 1950 and 2011. Public health impact is measured by fatalities and injuries, and economic impact by property and crop damage. Data transformations include converting the damage magnitude codes to numeric multipliers and aggregating totals by event type. Tornadoes are found to be the most harmful to population health, while floods and hurricanes cause the highest economic damage. All data processing and plotting were performed in R, using packages such as dplyr, ggplot2, and knitr. The analysis begins with the raw data file and is fully reproducible. Figures are limited to three and include appropriate captions. The final figures illustrate the most harmful and most costly weather events. This report aims to inform stakeholders involved in disaster preparedness. The code and results are included in this document.

Data Processing

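### Install Packages ####
# Install only the packages that are missing, so re-running the report
# does not reinstall them every time. readr and readxl are not needed:
# the data is read with base R's read.csv().
required_pkgs <- c("dplyr", "ggplot2")
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) install.packages(missing_pkgs)

### Libraries ####

library(dplyr)
library(ggplot2)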

# Read the data from the compressed file (read.csv decompresses .bz2 directly)
storm_data <- read.csv("repdata_data_StormData.csv.bz2")

# Keep only the relevant columns
storm_df <- storm_data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

# Convert magnitude codes to numeric multipliers (vectorized).
# K = thousands, M = millions, B = billions; blanks, symbols, digits,
# and any other code are treated as a multiplier of 1.
convert_exp <- function(exp) {
  exp <- toupper(as.character(exp))
  case_when(
    exp == "K" ~ 1e3,
    exp == "M" ~ 1e6,
    exp == "B" ~ 1e9,
    TRUE ~ 1
  )
}

# Apply the conversion and compute damage in US dollars
storm_df <- storm_df %>%
  mutate(
    PROP_DAMAGE = PROPDMG * convert_exp(PROPDMGEXP),
    CROP_DAMAGE = CROPDMG * convert_exp(CROPDMGEXP)
  )
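
As a quick sanity check of the magnitude conversion, the following sketch applies the same K/M/B mapping to a small made-up data frame. The `toy` data and the `multiplier_for` helper are illustrative assumptions, not part of the analysis. The helper uses a base R lookup vector rather than `convert_exp` so that the snippet runs on its own.

```r
# Toy data (invented for illustration): damage amounts with magnitude codes
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the multiplier for each code, defaulting to 1
multiplier_for <- function(codes) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(codes)])
  mult[is.na(mult)] <- 1  # unknown or blank codes leave the amount unchanged
  mult
}

toy$PROP_DAMAGE <- toy$PROPDMG * multiplier_for(toy$PROPDMGEXP)
print(toy)
```

Comparing the `PROPDMG`, `PROPDMGEXP`, and `PROP_DAMAGE` columns side by side makes it easy to confirm that each code scales the amount as intended.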

Results

Events most harmful to public health

health_impact <- storm_df %>%
  group_by(EVTYPE) %>%
  summarise(
    Fatalities = sum(FATALITIES),
    Injuries = sum(INJURIES)
  ) %>%
  arrange(desc(Fatalities + Injuries)) %>%
  top_n(10, Fatalities + Injuries)

# Figure 1
ggplot(health_impact, aes(x = reorder(EVTYPE, Fatalities + Injuries), y = Fatalities + Injuries)) +
  geom_bar(stat = "identity", fill = "darkred") +
  coord_flip() +
  labs(title = "Events Most Harmful to Public Health",
       x = "Event Type", y = "Total Fatalities + Injuries")
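
Grouping by `EVTYPE` treats labels that differ only in letter case or surrounding whitespace as separate event types. The sketch below, on invented data, shows one possible way to normalize labels before aggregating. It is an assumption about how the grouping could be tightened, not a step performed in this analysis, and the `toy_events` values are made up.

```r
library(dplyr)

# Invented example: the same event type spelled three ways
toy_events <- data.frame(
  EVTYPE     = c("TORNADO", "Tornado", " TORNADO ", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4),
  stringsAsFactors = FALSE
)

# Normalize case and trim whitespace before grouping
toy_summary <- toy_events %>%
  mutate(EVTYPE = toupper(trimws(EVTYPE))) %>%
  group_by(EVTYPE) %>%
  summarise(Fatalities = sum(FATALITIES)) %>%
  arrange(desc(Fatalities))

print(toy_summary)
```

Without the `mutate()` step, the three tornado spellings would be summed separately and could each rank lower than the combined total.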

Events with the greatest economic impact

economic_impact <- storm_df %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Damage = sum(PROP_DAMAGE + CROP_DAMAGE)
  ) %>%
  arrange(desc(Total_Damage)) %>%
  top_n(10, Total_Damage)

# Figure 2
ggplot(economic_impact, aes(x = reorder(EVTYPE, Total_Damage), y = Total_Damage)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Events with the Greatest Economic Impact",
       x = "Event Type", y = "Total Economic Damage (USD)")
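
Because total damage runs into the billions of dollars, the default axis labels may be hard to read. The following is a minimal sketch, using invented totals, of labeling the damage axis in billions of USD with a custom label function. The names `toy_damage` and `to_billions` are illustrative and not part of the analysis.

```r
library(ggplot2)

# Invented totals for illustration only
toy_damage <- data.frame(
  EVTYPE       = c("EVENT A", "EVENT B", "EVENT C"),
  Total_Damage = c(1.5e11, 9.0e10, 5.7e10)
)

# Hypothetical label helper: format dollar amounts as billions
to_billions <- function(x) paste0("$", x / 1e9, "B")

p <- ggplot(toy_damage, aes(x = reorder(EVTYPE, Total_Damage), y = Total_Damage)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  scale_y_continuous(labels = to_billions) +
  labs(x = "Event Type", y = "Total Economic Damage (USD)")

print(p)
```

The same `scale_y_continuous(labels = ...)` layer could be added to the economic impact plot if compact labels are preferred.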