Synopsis:

In this analysis, we use the NOAA Storm Database to identify the severe weather events with the greatest impact on public health and the economy in the United States. We download the raw data and keep the key variables: event type, fatalities, injuries, and property and crop damage. After converting the damage figures to dollars, we rank the event types and report the top 10 by health impact and by economic impact. Tornadoes cause by far the most harm to public health, while floods and hurricanes/typhoons cause the largest economic losses. These findings can help government and municipal managers allocate resources and prioritize planning for different types of severe weather. The results are also shown as bar plots so that the relative impacts are easy to compare.

Data Processing:

# Load the packages used below (dplyr, ggplot2)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.0
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (http://conflicted.r-lib.org/) to force all conflicts to become errors

# Download and Read the Data

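url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# Skip the download if the file already exists; mode = "wb" keeps the archive intact on Windows.
if (!file.exists("StormData.csv.bz2")) download.file(url, "StormData.csv.bz2", mode = "wb")
# read.csv() reads the bz2-compressed file directly.
data <- read.csv("StormData.csv.bz2", stringsAsFactors = FALSE)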


# Data Processing: keep only the relevant columns, upper-case the event types, and convert the damage exponent codes (K, M, B) to numeric multipliers.

clean_data <- data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP) %>%
  mutate(
    EVTYPE = toupper(EVTYPE),
    PROPDMGEXP = case_when(
      PROPDMGEXP %in% c("K", "k") ~ 1e3,
      PROPDMGEXP %in% c("M", "m") ~ 1e6,
      PROPDMGEXP %in% c("B", "b") ~ 1e9,
      TRUE ~ 0  # any other code, including blank, is treated as 0, so those records add no property damage
    ),
    CROPDMGEXP = case_when(
      CROPDMGEXP %in% c("K", "k") ~ 1e3,
      CROPDMGEXP %in% c("M", "m") ~ 1e6,
      CROPDMGEXP %in% c("B", "b") ~ 1e9,
      TRUE ~ 0  # any other code, including blank, is treated as 0, so those records add no crop damage
    ),
    PropertyDamage = PROPDMG * PROPDMGEXP,
    CropDamage = CROPDMG * CROPDMGEXP
  )
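
The exponent handling above can be tried in isolation on a few made-up rows. This is only a sketch: the helper name exp_to_multiplier and the sample values are hypothetical and not part of the original analysis. The mapping mirrors the case_when() rules (K, M, B in either case; everything else becomes 0).

```r
# Hypothetical helper: map an exponent code to a numeric multiplier.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(code)])
  mult[is.na(mult)] <- 0  # unrecognised codes (including blank) become 0, as in the pipeline
  mult
}

# Made-up sample rows, for illustration only.
sample_damage <- data.frame(
  PROPDMG = c(25, 2.5, 10, 5),
  PROPDMGEXP = c("K", "M", "b", ""),
  stringsAsFactors = FALSE
)

# 25 K -> 25,000; 2.5 M -> 2,500,000; 10 b -> 10,000,000,000; blank -> 0
sample_damage$PropertyDamage <-
  sample_damage$PROPDMG * exp_to_multiplier(sample_damage$PROPDMGEXP)
sample_damage
```

The last row shows the consequence of the catch-all rule: a record with a blank exponent contributes nothing to the totals, whatever its PROPDMG value.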


# Question 1: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

events_health <- clean_data %>%
  group_by(EVTYPE) %>%
  summarise(Fatalities = sum(FATALITIES), Injuries = sum(INJURIES), Total = Fatalities + Injuries) %>%
  arrange(desc(Total)) %>%
  head(10)

events_health
## # A tibble: 10 × 4
##    EVTYPE            Fatalities Injuries Total
##    <chr>                  <dbl>    <dbl> <dbl>
##  1 TORNADO                 5633    91346 96979
##  2 EXCESSIVE HEAT          1903     6525  8428
##  3 TSTM WIND                504     6957  7461
##  4 FLOOD                    470     6789  7259
##  5 LIGHTNING                816     5230  6046
##  6 HEAT                     937     2100  3037
##  7 FLASH FLOOD              978     1777  2755
##  8 ICE STORM                 89     1975  2064
##  9 THUNDERSTORM WIND        133     1488  1621
## 10 WINTER STORM             206     1321  1527
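
The table lists TSTM WIND and THUNDERSTORM WIND as separate rows. As a rough check of how much this matters, the sketch below re-enters the ten rows shown above, treats TSTM WIND as an abbreviation of THUNDERSTORM WIND (an assumption), and re-aggregates. The object names are hypothetical. It only combines the rows printed here; a full fix would recode EVTYPE in clean_data before grouping, and the complete dataset may contain further spelling variants.

```r
# Top-10 health rows copied from the table above.
events_health_top <- data.frame(
  EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND", "FLOOD", "LIGHTNING",
             "HEAT", "FLASH FLOOD", "ICE STORM", "THUNDERSTORM WIND", "WINTER STORM"),
  Fatalities = c(5633, 1903, 504, 470, 816, 937, 978, 89, 133, 206),
  Injuries = c(91346, 6525, 6957, 6789, 5230, 2100, 1777, 1975, 1488, 1321),
  stringsAsFactors = FALSE
)

# Assumption: "TSTM WIND" is the same event type as "THUNDERSTORM WIND".
is_tstm <- events_health_top$EVTYPE == "TSTM WIND"
events_health_top$EVTYPE[is_tstm] <- "THUNDERSTORM WIND"

# Re-aggregate by the recoded label and re-rank by total casualties.
merged <- aggregate(cbind(Fatalities, Injuries) ~ EVTYPE,
                    data = events_health_top, FUN = sum)
merged$Total <- merged$Fatalities + merged$Injuries
merged <- merged[order(-merged$Total), ]
head(merged, 3)
```

With the two labels combined, thunderstorm wind totals 9,082 (637 fatalities and 8,445 injuries), which places it above excessive heat (8,428) among these rows.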

# Question 2: Across the United States, which types of events have the greatest economic consequences?

events_economic <- clean_data %>%
  group_by(EVTYPE) %>%
  summarise(PropertyDamage = sum(PropertyDamage), CropDamage = sum(CropDamage), Total = PropertyDamage + CropDamage) %>%
  arrange(desc(Total)) %>%
  head(10)

events_economic
## # A tibble: 10 × 4
##    EVTYPE            PropertyDamage  CropDamage        Total
##    <chr>                      <dbl>       <dbl>        <dbl>
##  1 FLOOD               144657709800  5661968450 150319678250
##  2 HURRICANE/TYPHOON    69305840000  2607872800  71913712800
##  3 TORNADO              56937160480   414953110  57352113590
##  4 STORM SURGE          43323536000        5000  43323541000
##  5 HAIL                 15732266720  3025954450  18758221170
##  6 FLASH FLOOD          16140811510  1421317100  17562128610
##  7 DROUGHT               1046106000 13972566000  15018672000
##  8 HURRICANE            11868319010  2741910000  14610229010
##  9 RIVER FLOOD           5118945500  5029459000  10148404500
## 10 ICE STORM             3944927810  5022113500   8967041310
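
The raw dollar figures are hard to compare by eye. The sketch below re-enters the ten rows shown above and adds two derived columns: the total in billions of dollars and the share of the total that is crop damage. The object and column names (events_economic_top, TotalBillions, CropShare) are hypothetical and not part of the original analysis.

```r
# Top-10 economic rows copied from the table above.
events_economic_top <- data.frame(
  EVTYPE = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO", "STORM SURGE", "HAIL",
             "FLASH FLOOD", "DROUGHT", "HURRICANE", "RIVER FLOOD", "ICE STORM"),
  PropertyDamage = c(144657709800, 69305840000, 56937160480, 43323536000,
                     15732266720, 16140811510, 1046106000, 11868319010,
                     5118945500, 3944927810),
  CropDamage = c(5661968450, 2607872800, 414953110, 5000, 3025954450,
                 1421317100, 13972566000, 2741910000, 5029459000, 5022113500),
  stringsAsFactors = FALSE
)

events_economic_top$Total <-
  events_economic_top$PropertyDamage + events_economic_top$CropDamage

# Total in billions of dollars, rounded to one decimal place.
events_economic_top$TotalBillions <- round(events_economic_top$Total / 1e9, 1)

# Fraction of each total that comes from crop damage.
events_economic_top$CropShare <-
  round(events_economic_top$CropDamage / events_economic_top$Total, 2)

events_economic_top[, c("EVTYPE", "TotalBillions", "CropShare")]
```

Seen this way, flood losses come to roughly 150.3 billion dollars, and drought is the one event type in the list whose losses are almost entirely crop damage.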

Results:

# Visualizing the results with bar plots.

ggplot(events_health, aes(x = reorder(EVTYPE, -Total), y = Total)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  xlab("Event Type") +
  ylab("Total Health Impact (Fatalities + Injuries)") +
  ggtitle("Top 10 Event Types with Highest Health Impact")

ggplot(events_economic, aes(x = reorder(EVTYPE, -Total), y = Total)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  xlab("Event Type") +
  ylab("Total Economic Impact (Property Damage + Crop Damage, USD)") +
  ggtitle("Top 10 Event Types with Highest Economic Impact")

Summary of Results:

Our analysis of the NOAA Storm Database revealed the top 10 weather events with the highest impact on public health and the economy in the United States. The results are as follows:

Health Impact (Fatalities + Injuries):

Tornadoes have by far the highest health impact, with 5,633 fatalities and 91,346 injuries, for a total of 96,979 casualties. That is more than ten times the total for the second-ranked event, excessive heat, which caused 1,903 fatalities and 6,525 injuries (8,428 in total). The rest of the top 10 consists of TSTM wind, flood, lightning, heat, flash flood, ice storm, thunderstorm wind, and winter storm. Note that TSTM WIND and THUNDERSTORM WIND are two labels for the same kind of event, so thunderstorm wind is understated when the labels are counted separately.

Economic Impact (Property Damage + Crop Damage):

Floods cause the greatest economic losses, with $144,657,709,800 in property damage and $5,661,968,450 in crop damage, for a total of $150,319,678,250 (about $150 billion). Hurricanes/typhoons rank second, with $69,305,840,000 in property damage and $2,607,872,800 in crop damage, for a total of $71,913,712,800. The rest of the top 10 consists of tornadoes, storm surge, hail, flash flood, drought, hurricane, river flood, and ice storm. Drought stands out because its losses are almost entirely crop damage ($13,972,566,000 of $15,018,672,000).

These findings can help government and municipal decision-makers allocate resources and prioritize planning for different types of severe weather. The bar plots in the Results section show the relative impacts of the event types on both population health and the economy.