Synopsis

This analysis explores the U.S. National Oceanic and Atmospheric Administration (NOAA) Storm Database to identify the weather events in the United States that are most harmful to public health and the economy. The data cover events from 1950 to 2011. The analysis addresses two questions: (1) Which types of events are most harmful to population health, measured by fatalities and injuries? (2) Which types of events have the greatest economic consequences, measured by property and crop damage? The results show that tornadoes cause the most fatalities and injuries combined, while floods cause the most economic damage. All processing and visualization were done in R.

Data Processing

knitr::opts_chunk$set(echo = TRUE, cache = TRUE)
library(dplyr)
library(ggplot2)
library(readr)
# Load the raw data
data <- read.csv("repdata_data_StormData (1).csv.bz2")

# Check column names
colnames(data)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"
# Select relevant columns
storm_data <- data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP,
         CROPDMG, CROPDMGEXP)

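# Convert damage exponent codes to numeric multipliers:
# K/k = thousands, M/m = millions, B/b = billions.
# Any other value (including blanks) is treated as a multiplier of 1.
exp_converter <- function(exp) {
  ifelse(exp %in% c("K", "k"), 1e3,
  ifelse(exp %in% c("M", "m"), 1e6,
  ifelse(exp %in% c("B", "b"), 1e9, 1)))
}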

storm_data <- storm_data %>%
  mutate(PROPDMGVAL = PROPDMG * exp_converter(PROPDMGEXP),
         CROPDMGVAL = CROPDMG * exp_converter(CROPDMGEXP),
         TOTALDMG = PROPDMGVAL + CROPDMGVAL)
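
The exponent conversion above treats every code other than K, M, and B as a multiplier of 1. As a minimal sketch of how that mapping could be expressed as a lookup table and checked on a handful of codes, consider the following. The names `exp_lookup` and `exp_to_multiplier` are hypothetical, the sample codes are invented for illustration, and the `H` = 100 entry is an assumption that is not part of the analysis above.

```r
# Hypothetical lookup-table variant of the exponent conversion.
# ASSUMPTION: "H"/"h" is read as hundreds; the analysis above maps it to 1.
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

exp_to_multiplier <- function(exp) {
  # Match codes case-insensitively against the lookup table
  mult <- exp_lookup[toupper(as.character(exp))]
  # Unknown or empty codes fall back to a multiplier of 1
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Invented sample codes, for illustration only
codes <- c("K", "m", "B", "h", "", "?", "5")
multipliers <- exp_to_multiplier(codes)
data.frame(code = codes, multiplier = multipliers)
```

Running `table(storm_data$PROPDMGEXP)` on the real data would show which codes actually occur and how many rows fall through to the default multiplier.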

Results

Most Harmful Events to Population Health

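# Sum fatalities and injuries per event type, then keep the 10 event
# types with the largest combined total (sorted in descending order)
health_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(Fatalities = sum(FATALITIES, na.rm = TRUE),
            Injuries = sum(INJURIES, na.rm = TRUE)) %>%
  mutate(Total = Fatalities + Injuries) %>%
  slice_max(Total, n = 10)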

ggplot(health_impact, aes(x = reorder(EVTYPE, -Total), y = Total)) +
  geom_bar(stat = "identity", fill = "tomato") +
  labs(title = "Top 10 Weather Events Causing Most Health Impact",
       x = "Event Type", y = "Total Injuries + Fatalities") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
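
To make the ranking step easier to follow, here is a minimal sketch of the same group, sum, and rank pattern on a toy data frame. The values in `toy` are invented for illustration and do not come from the Storm Database. The sketch uses `slice_max()`, which assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Toy data, invented for illustration only
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "HAIL"),
  FATALITIES = c(3, 1, 2, 4, 0, 0),
  INJURIES   = c(10, 2, 5, 1, 3, 0)
)

toy_health <- toy %>%
  group_by(EVTYPE) %>%                                   # one group per event type
  summarise(Fatalities = sum(FATALITIES, na.rm = TRUE),
            Injuries   = sum(INJURIES, na.rm = TRUE)) %>%
  mutate(Total = Fatalities + Injuries) %>%              # combined health impact
  slice_max(Total, n = 2)                                # two largest totals, descending

toy_health
```

Here TORNADO (total 20) and FLOOD (total 6) are kept. Note that `slice_max()` keeps tied rows by default, so it can return more than `n` rows when totals are equal.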

Events with the Greatest Economic Consequences

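# Sum combined property and crop damage per event type, then keep the
# 10 event types with the largest totals (sorted in descending order)
economic_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(Economic_Damage = sum(TOTALDMG, na.rm = TRUE)) %>%
  slice_max(Economic_Damage, n = 10)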

ggplot(economic_impact, aes(x = reorder(EVTYPE, -Economic_Damage), y = Economic_Damage / 1e9)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(title = "Top 10 Weather Events Causing Most Economic Damage",
       x = "Event Type", y = "Damage (in Billion USD)") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
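
The total above combines property and crop damage into one figure. As a rough sketch of how the two components could be examined separately, the example below computes per-event property and crop sums and the crop share of the total. The data frame `toy` and its values are invented for illustration, and `Crop_Share` is a hypothetical column name not used elsewhere in this analysis.

```r
library(dplyr)

# Toy data, invented for illustration only (values in USD)
toy <- data.frame(
  EVTYPE     = c("FLOOD", "DROUGHT", "FLOOD", "HAIL"),
  PROPDMGVAL = c(5e9, 1e8, 3e9, 2e8),
  CROPDMGVAL = c(1e9, 9e8, 1e9, 3e8)
)

toy_split <- toy %>%
  group_by(EVTYPE) %>%
  summarise(Property = sum(PROPDMGVAL, na.rm = TRUE),
            Crop     = sum(CROPDMGVAL, na.rm = TRUE)) %>%
  mutate(Total      = Property + Crop,
         Crop_Share = Crop / Total) %>%   # fraction of damage that is crop damage
  arrange(desc(Total))

toy_split
```

In this toy example FLOOD ranks first by total damage but has the smallest crop share, while DROUGHT is dominated by crop damage. A split like this can show whether an event type's ranking is driven by property or crop losses.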

Conclusion

The analysis reveals that:

- Tornadoes are the leading cause of human injuries and fatalities.
- Floods, followed by hurricanes and tornadoes, cause the most economic damage.

This information can help municipalities prioritize resources for disaster preparedness and response.