Synopsis

This report analyzes the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which records major storms and severe weather events from 1950 to November 2011. The goal is to identify which event types are most harmful to population health and which have the greatest economic consequences. The analysis aggregates fatalities, injuries, and property and crop damage by event type in R and plots the ten largest totals. Tornadoes are the leading cause of fatalities and injuries, while floods and hurricanes have the largest economic impact.

Data Processing

library(dplyr)
library(ggplot2)

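# Load the raw storm data; the CSV is expected in the working directory.
# stringsAsFactors = FALSE keeps text columns as character on R versions before 4.0.
storm <- read.csv("repdata_data_StormData.csv", stringsAsFactors = FALSE)

# Keep only the columns needed for the health and economic analyses
storm_sub <- storm %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP,
         CROPDMG, CROPDMGEXP)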
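EVTYPE is the grouping key for every total in this report, and the labels are used exactly as recorded. The results list both TSTM WIND and THUNDERSTORM WIND as separate event types, so spelling variants split what is arguably one category. The following is a minimal sketch of one way to merge such variants before aggregating. It runs on toy data rather than the NOAA file, `normalize_evtype` is a hypothetical helper, and the alias rules are illustrative assumptions, not the cleaning used in this report.

```r
# Toy data, not the NOAA file: three spellings of one event plus a heat record.
toy <- data.frame(
  EVTYPE     = c("TSTM WIND", " Thunderstorm Wind", "THUNDERSTORM WINDS", "HEAT"),
  FATALITIES = c(2, 1, 0, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: trim, upper-case, collapse whitespace, then apply
# a small hand-written set of alias rules (assumed, for illustration only).
normalize_evtype <- function(x) {
  x <- toupper(trimws(x))
  x <- gsub("[[:space:]]+", " ", x)
  x <- sub("^TSTM WINDS?$", "THUNDERSTORM WIND", x)
  x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)
  x
}

toy$EVTYPE_clean <- normalize_evtype(toy$EVTYPE)

# Aggregate on the cleaned label instead of the raw one.
totals <- aggregate(FATALITIES ~ EVTYPE_clean, data = toy, FUN = sum)
print(totals)
```

Whether related labels such as HEAT and EXCESSIVE HEAT should also be merged is a judgment call, so any alias table of this kind should be reported alongside the results.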

Impact on Population Health

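# Total fatalities and injuries per event type, ranked by their combined count
health_impact <- storm_sub %>%
  group_by(EVTYPE) %>%
  summarise(
    fatalities = sum(FATALITIES, na.rm = TRUE),
    injuries   = sum(INJURIES, na.rm = TRUE)
  ) %>%
  mutate(total_health = fatalities + injuries) %>%
  arrange(desc(total_health))

# Ten event types with the highest combined casualties
top_health <- health_impact[1:10, ]
top_health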
## # A tibble: 10 × 4
##    EVTYPE            fatalities injuries total_health
##    <chr>                  <dbl>    <dbl>        <dbl>
##  1 TORNADO                 5633    91346        96979
##  2 EXCESSIVE HEAT          1903     6525         8428
##  3 TSTM WIND                504     6957         7461
##  4 FLOOD                    470     6789         7259
##  5 LIGHTNING                816     5230         6046
##  6 HEAT                     937     2100         3037
##  7 FLASH FLOOD              978     1777         2755
##  8 ICE STORM                 89     1975         2064
##  9 THUNDERSTORM WIND        133     1488         1621
## 10 WINTER STORM             206     1321         1527

# Horizontal bar chart of the ten event types with the highest combined casualties
ggplot(top_health, aes(x=reorder(EVTYPE, total_health), y=total_health)) +
  geom_col(fill="tomato") +
  coord_flip() +
  labs(title="Top 10 Events Harmful to Population Health",
       x="Event Type", y="Total Fatalities & Injuries")
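The ranking above sorts by the combined count, which injuries dominate. As a rough check, fatalities and injuries can also be ranked separately. The sketch below does this with the ten rows copied from the table above rather than recomputing from the raw file, and `health_tbl` is a name introduced only for this example. Because it is restricted to those ten rows, an event type with many fatalities but few injuries that falls outside the combined top ten would not appear.

```r
# Values copied from the top-10 table in this report (not recomputed from the raw file).
health_tbl <- data.frame(
  EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND", "FLOOD", "LIGHTNING",
             "HEAT", "FLASH FLOOD", "ICE STORM", "THUNDERSTORM WIND", "WINTER STORM"),
  fatalities = c(5633, 1903, 504, 470, 816, 937, 978, 89, 133, 206),
  injuries   = c(91346, 6525, 6957, 6789, 5230, 2100, 1777, 1975, 1488, 1321),
  stringsAsFactors = FALSE
)

# Rank each measure on its own, largest first.
by_fatalities <- health_tbl[order(-health_tbl$fatalities), c("EVTYPE", "fatalities")]
by_injuries   <- health_tbl[order(-health_tbl$injuries), c("EVTYPE", "injuries")]

print(head(by_fatalities, 5))
print(head(by_injuries, 5))
```

Within these ten rows, tornadoes rank first on both measures, while the order below them differs: heat and flash flood events rank higher by fatalities, and wind and flood events rank higher by injuries.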

Impact on Economy

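# Recode exponent values: K/k = thousands, M/m = millions, B/b = billions
exp_map <- c("K"=1000, "M"=1e6, "B"=1e9,
             "k"=1000, "m"=1e6, "b"=1e9)

storm_sub <- storm_sub %>%
  mutate(
    # Any code not listed in exp_map, including a blank entry, gets a multiplier of 1.
    # as.character() ensures the lookup is by name even if the column is a factor.
    PROPDMGEXP = ifelse(PROPDMGEXP %in% names(exp_map), exp_map[as.character(PROPDMGEXP)], 1),
    CROPDMGEXP = ifelse(CROPDMGEXP %in% names(exp_map), exp_map[as.character(CROPDMGEXP)], 1),
    # Dollar cost = reported amount times its multiplier
    prop_cost = PROPDMG * as.numeric(PROPDMGEXP),
    crop_cost = CROPDMG * as.numeric(CROPDMGEXP),
    total_cost = prop_cost + crop_cost
  )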
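
The recoding above turns each damage amount and its exponent code into a dollar figure. The sketch below works through the same arithmetic on four made-up rows in base R. `exp_to_multiplier` is a hypothetical helper, and the toy values are not taken from the NOAA data. It follows the same rule as the code in this report: K, M, and B in either case are scaled, and every other code falls back to a multiplier of 1.

```r
# Multipliers for the letter codes handled in this report.
exp_map <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: map an exponent code to a numeric multiplier.
# Codes other than K, M or B (in either case) fall back to 1.
exp_to_multiplier <- function(code) {
  mult <- unname(exp_map[toupper(as.character(code))])
  mult[is.na(mult)] <- 1
  mult
}

# Made-up rows for illustration only.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 500),
  PROPDMGEXP = c("K", "m", "B", ""),
  stringsAsFactors = FALSE
)

# 25 K -> 25,000; 2.5 m -> 2,500,000; 1.2 B -> 1,200,000,000; 500 (blank) -> 500
toy$prop_cost <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
print(toy)
```

Treating unrecognized codes as 1 keeps those rows in the totals at face value. It is a simplifying assumption, so it is worth tabulating which codes actually occur before relying on the result.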

econ_impact <- storm_sub %>%
  group_by(EVTYPE) %>%
  summarise(total_cost = sum(total_cost, na.rm=TRUE)) %>%
  arrange(desc(total_cost))

top_econ <- econ_impact[1:10, ]
top_econ
## # A tibble: 10 × 2
##    EVTYPE               total_cost
##    <chr>                     <dbl>
##  1 FLOOD             150319678257 
##  2 HURRICANE/TYPHOON  71913712800 
##  3 TORNADO            57352114049.
##  4 STORM SURGE        43323541000 
##  5 HAIL               18758221521.
##  6 FLASH FLOOD        17562129167.
##  7 DROUGHT            15018672000 
##  8 HURRICANE          14610229010 
##  9 RIVER FLOOD        10148404500 
## 10 ICE STORM           8967041360

# Horizontal bar chart of the ten costliest event types, in billions of USD
ggplot(top_econ, aes(x=reorder(EVTYPE, total_cost), y=total_cost/1e9)) +
  geom_col(fill="steelblue") +
  coord_flip() +
  labs(title="Top 10 Events with Greatest Economic Consequences",
       x="Event Type", y="Total Cost (in Billions USD)")

Conclusion

The analysis demonstrates that tornadoes are the most harmful to population health, causing the highest number of fatalities and injuries.

Floods, hurricanes, and storm surges cause the greatest economic damage.