Synopsis

This analysis explores the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database to determine which types of weather events are most harmful to population health and which have the greatest economic consequences. Tornadoes are by far the most harmful to health, with 5,633 fatalities and 91,346 injuries recorded. Floods cause the most economic damage (about $150 billion in combined property and crop losses), followed by hurricanes/typhoons (about $72 billion). These patterns can help prioritize disaster preparedness and resource allocation.

Data Processing

The dataset is loaded into R and only the variables needed for the analysis are kept: event type (EVTYPE), fatalities, injuries, and the property and crop damage amounts with their exponent codes. Dropping the remaining columns at this stage keeps the later steps simpler.

# Load libraries
library(dplyr)
# Load data
data <- read.csv("repdata_data_StormData.csv")

# Select relevant columns
storm <- data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

Clean damage values

Property and crop damage are recorded as a numeric amount plus an exponent code giving its magnitude (H for hundreds, K for thousands, M for millions, B for billions). Each code is converted to a numeric multiplier; a digit code n is treated as 10^n, and blank or unrecognized codes are treated as 1. Multiplying the amount by this multiplier gives the damage in dollars, which is needed to compute total economic impact accurately.

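# Convert a single damage exponent code to its numeric multiplier
convert_exp <- function(exp) {
  # Missing or blank codes leave the recorded value unchanged
  if (is.na(exp) || exp == "") return(1)
  if (exp %in% c("H", "h")) return(1e2)  # hundreds
  if (exp %in% c("K", "k")) return(1e3)  # thousands
  if (exp %in% c("M", "m")) return(1e6)  # millions
  if (exp %in% c("B", "b")) return(1e9)  # billions
  # A digit string n is treated as a power of ten, 10^n
  if (grepl("^[0-9]+$", exp)) return(10^as.numeric(exp))
  # Any other symbol is treated as a multiplier of 1
  return(1)
}

# Replace each exponent code with its multiplier
storm$PROPDMGEXP <- sapply(storm$PROPDMGEXP, convert_exp)
storm$CROPDMGEXP <- sapply(storm$CROPDMGEXP, convert_exp)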

storm$PROPDMG <- storm$PROPDMG * storm$PROPDMGEXP
storm$CROPDMG <- storm$CROPDMG * storm$CROPDMGEXP
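
The element-by-element conversion above can also be written as a vectorized lookup, which may be faster on a large table. The following is a minimal sketch on a small made-up data frame, not the analysis code itself. The names `exp_to_multiplier` and `toy` are hypothetical, and the mapping is assumed to follow the same rules as `convert_exp` (H, K, M, B in either case, digit codes as powers of ten, anything else as 1).

```r
# Hypothetical vectorized counterpart to convert_exp(); assumes the same mapping
exp_to_multiplier <- function(exp) {
  exp <- toupper(as.character(exp))
  # Named lookup for the letter codes
  letters_map <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(letters_map[exp])  # NA wherever no letter code matches
  # Digit strings are treated as powers of ten
  is_digit <- !is.na(exp) & grepl("^[0-9]+$", exp)
  mult[is_digit] <- 10^as.numeric(exp[is_digit])
  # Blank, missing, or unrecognized codes leave the amount unscaled
  mult[is.na(mult)] <- 1
  mult
}

# Made-up rows for illustration only (not taken from the NOAA data)
toy <- data.frame(
  PROPDMG    = c(25, 1.5, 2, 10, 3),
  PROPDMGEXP = c("K", "M", "", "2", "?"),
  stringsAsFactors = FALSE
)
toy$PROPDMG_USD <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
toy$PROPDMG_USD
```

In this toy example, an amount of 25 with code K becomes 25,000 and an amount of 1.5 with code M becomes 1,500,000, while the blank and `?` codes leave their amounts unchanged.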

Results

1. Most harmful events to population health

To identify the weather events most harmful to population health, total fatalities and total injuries are computed for each event type. Event types are then sorted by combined casualties (fatalities plus injuries), and the ten highest are kept.

health <- storm %>%
  group_by(EVTYPE) %>%
  summarise(
    fatalities = sum(FATALITIES, na.rm = TRUE),
    injuries = sum(INJURIES, na.rm = TRUE)
  ) %>%
  arrange(desc(fatalities + injuries)) %>%
  head(10)

health
## # A tibble: 10 × 3
##    EVTYPE            fatalities injuries
##    <chr>                  <dbl>    <dbl>
##  1 TORNADO                 5633    91346
##  2 EXCESSIVE HEAT          1903     6525
##  3 TSTM WIND                504     6957
##  4 FLOOD                    470     6789
##  5 LIGHTNING                816     5230
##  6 HEAT                     937     2100
##  7 FLASH FLOOD              978     1777
##  8 ICE STORM                 89     1975
##  9 THUNDERSTORM WIND        133     1488
## 10 WINTER STORM             206     1321
barplot(health$fatalities + health$injuries,
        names.arg = health$EVTYPE,
        las = 2,
        col = "red",
        main = "Top 10 Most Harmful Events to Population Health",
        xlab = "Event Type",
        ylab = "Total Casualties")

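The plot shows that tornadoes dominate total casualties: 5,633 fatalities and 91,346 injuries, more than ten times the combined total of the next event type, EXCESSIVE HEAT (8,428). Event types are grouped by their recorded labels, so closely related labels such as TSTM WIND and THUNDERSTORM WIND are counted separately.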
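
To put a number on how far tornadoes stand out, the sketch below recomputes combined casualties from the figures in the printed table and expresses each event type as a share of the top-10 total. The data frame name `top10` and the column `share_pct` are introduced here for illustration only. The shares are relative to the ten listed event types, not to the full database.

```r
# Top-10 figures copied from the printed health table
top10 <- data.frame(
  EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND", "FLOOD", "LIGHTNING",
             "HEAT", "FLASH FLOOD", "ICE STORM", "THUNDERSTORM WIND",
             "WINTER STORM"),
  fatalities = c(5633, 1903, 504, 470, 816, 937, 978, 89, 133, 206),
  injuries   = c(91346, 6525, 6957, 6789, 5230, 2100, 1777, 1975, 1488, 1321),
  stringsAsFactors = FALSE
)

# Combined casualties per event type
top10$casualties <- top10$fatalities + top10$injuries

# Share of the top-10 total (not of all events in the database)
top10$share_pct <- round(100 * top10$casualties / sum(top10$casualties), 1)

top10[, c("EVTYPE", "casualties", "share_pct")]
```

With these figures, tornadoes account for roughly 71% of casualties among the ten listed event types.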


2. Events with greatest economic consequences

To evaluate the economic impact of weather events, the total property and crop damage is computed for each event type. These values are aggregated and sorted to determine which events result in the highest financial losses.

economic <- storm %>%
  group_by(EVTYPE) %>%
  summarise(
    # Sum each column separately so an NA in one does not discard the other
    damage = sum(PROPDMG, na.rm = TRUE) + sum(CROPDMG, na.rm = TRUE)
  ) %>%
  arrange(desc(damage)) %>%
  head(10)

economic
## # A tibble: 10 × 2
##    EVTYPE                   damage
##    <chr>                     <dbl>
##  1 FLOOD             150319678257 
##  2 HURRICANE/TYPHOON  71913712800 
##  3 TORNADO            57362333946.
##  4 STORM SURGE        43323541000 
##  5 HAIL               18761221986.
##  6 FLASH FLOOD        18243991078.
##  7 DROUGHT            15018672000 
##  8 HURRICANE          14610229010 
##  9 RIVER FLOOD        10148404500 
## 10 ICE STORM           8967041360
barplot(economic$damage / 1e6,
        names.arg = economic$EVTYPE,
        las = 2,
        col = "blue",
        main = "Top 10 Events with Greatest Economic Damage",
        xlab = "Event Type",
        ylab = "Total Damage (Millions USD)")

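The plot shows that floods cause the largest losses, about $150 billion, followed by HURRICANE/TYPHOON (about $72 billion) and tornadoes (about $57 billion). HURRICANE is recorded as a separate event type and adds roughly a further $14.6 billion.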
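
Because the table lists HURRICANE/TYPHOON and HURRICANE as separate event types, their totals are not combined in the ranking. As a rough illustration, and assuming the two labels describe the same category of event (an assumption, not something this analysis establishes), the sketch below adds the two printed totals and compares the result with floods. The names `damage` and `hurricane_combined` are hypothetical.

```r
# Totals in USD copied from the printed economic table
damage <- c(
  "FLOOD"             = 150319678257,
  "HURRICANE/TYPHOON" = 71913712800,
  "HURRICANE"         = 14610229010
)

# Assumption: both hurricane labels describe the same event category
hurricane_combined <- damage[["HURRICANE/TYPHOON"]] + damage[["HURRICANE"]]

# Express both totals in billions of USD for readability
round(c(FLOOD = damage[["FLOOD"]],
        HURRICANE_COMBINED = hurricane_combined) / 1e9, 1)
```

Under that assumption the combined hurricane total is about $86.5 billion, which is still below the flood total of about $150.3 billion, so the ranking of the top two categories would not change.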