Synopsis

This analysis explores the U.S. NOAA Storm Database to identify the most harmful weather event types in terms of population health and economic impact.

Data Processing

library(dplyr)
library(ggplot2)

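# read.csv() decompresses the .bz2 archive on the fly
storm_data <- read.csv("repdata_data_StormData.csv.bz2")

# Lower-case the event types so labels that differ only by case are grouped together
storm_data$EVTYPE <- tolower(storm_data$EVTYPE)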
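
To illustrate why the event types are lower-cased before grouping, here is a small check on made-up labels (the vector `labels` is hypothetical and not taken from the storm database). It only shows that labels differing by case collapse into one group; it does not address misspellings or other variants.

```r
# Made-up labels for illustration; not read from the storm database.
labels <- c("TORNADO", "Tornado", "tornado", "FLOOD")

# Number of distinct labels before and after lower-casing
n_before <- length(unique(labels))
n_after  <- length(unique(tolower(labels)))

print(c(before = n_before, after = n_after))
```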

Results

Types of Weather Events with the most harmful impact on population health

First, the weather events are ranked by their impact on population health. Here population health impact is quantified as the combined number of fatalities and injuries, and the ten event types with the highest totals are plotted.

storm_data <- storm_data %>%
  mutate(HealthImpact = FATALITIES + INJURIES)

health_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(TotalHealthImpact = sum(HealthImpact, na.rm = TRUE)) %>%
  arrange(desc(TotalHealthImpact))

top_health <- health_impact[1:10, ]

# Horizontal bar chart, with event types ordered by their total
ggplot(top_health, aes(reorder(EVTYPE, TotalHealthImpact), TotalHealthImpact)) +
  geom_col() +
  coord_flip() +
  labs(title = "Top 10 Event Types by Health Impact", x = "Event Type", y = "Total Fatalities and Injuries")

Weather events with the greatest economic consequences across the USA

Another metric to assess the impact of weather events is their economic consequences. The dataset analysed in this report records property damage and crop damage. Each type of damage is stored as an estimated monetary loss with a separate unit code, so the two are converted to plain numbers, summed, and plotted by the weather event causing the loss.

# Note that the unit codes need to be converted to numbers before assessing the money lost.
# Here k is thousand, m is million and b is billion; any other code is treated as a multiplier of 1.
convert_to_number <- function(value, exp) {
  multiplier <- case_when(
    exp %in% c("K", "k") ~ 1e3,
    exp %in% c("M", "m") ~ 1e6,
    exp %in% c("B", "b") ~ 1e9,
    TRUE ~ 1
  )
  value * multiplier
}
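
As a quick sanity check of the conversion rule, the sketch below applies the same K/M/B logic to a few made-up rows. It is a base-R stand-in that uses a lookup vector instead of `case_when()`, so that it runs without dplyr; the names `toy`, `multipliers` and `toy_multiplier` are hypothetical and the values are not from the storm database.

```r
# Toy rows for illustration only; not from the storm database.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "m", "B", ""),
  stringsAsFactors = FALSE
)

# Base-R stand-in for the case_when() rule: upper-case the code, look it up,
# and fall back to 1 for anything that is not K, M or B.
multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
toy_multiplier <- unname(multipliers[toupper(toy$PROPDMGEXP)])
toy_multiplier[is.na(toy_multiplier)] <- 1

toy$PropDamage <- toy$PROPDMG * toy_multiplier
print(toy)
```

The last row shows the fallback: an empty unit code leaves the value unchanged, which matches the `TRUE ~ 1` branch of `convert_to_number()`.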

storm_data <- storm_data %>%
  mutate(
    PropDamage = convert_to_number(PROPDMG, PROPDMGEXP),
    CropDamage = convert_to_number(CROPDMG, CROPDMGEXP),
    EconImpact = PropDamage + CropDamage
  )

economic_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(TotalEconImpact = sum(EconImpact, na.rm = TRUE)) %>%
  arrange(desc(TotalEconImpact))


top_econ <- economic_impact[1:10, ]

# Horizontal bar chart, with event types ordered by their total
ggplot(top_econ, aes(reorder(EVTYPE, TotalEconImpact), TotalEconImpact)) +
  geom_col() +
  coord_flip() +
  labs(title = "Top 10 Event Types by Economic Impact", x = "Event Type", y = "Total Economic Impact (USD)")