Synopsis

This report analyzes the U.S. National Oceanic and Atmospheric Administration (NOAA) Storm Database to identify the types of severe weather events that are most harmful to human health and those that cause the greatest economic losses. Using data from 1950 to 2011, we aggregated fatalities, injuries, and property and crop damage by event type. Tornadoes caused by far the highest combined fatalities and injuries (96,979 in total), while floods resulted in the largest economic damage (roughly 150 billion USD).

Loading the Libraries Required for This Analysis

library(dplyr)
library(ggplot2)
library(lattice)

Loading and Processing Data

This chunk of code reads and processes the data from a CSV file. First, we read the file and store the result in a variable called fsdata. The table is very large and our questions need only a few of its columns, so we use the pipe operator with select() to keep just those columns and overwrite fsdata with the result.

Next, we replace NA values with 0 and keep only the rows in which at least one of FATALITIES, INJURIES, PROPDMG, or CROPDMG is greater than 0, since events with no recorded harm or damage do not contribute to either total.

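Economic damage was computed by converting the exponent columns (PROPDMGEXP and CROPDMGEXP) to numeric multipliers (K = 1,000; M = 1,000,000; B = 1,000,000,000, in either upper or lower case; any other value is treated as a multiplier of 1) and multiplying each damage figure by its multiplier. Finally, we grouped the data by event type to obtain total health and economic impacts.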

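# Read the raw storm data
fsdata <- read.csv("fStormData.csv")
# Keep only the columns needed for the health and economic questions
fsdata <- fsdata %>%
  select(EVTYPE, FATALITIES, INJURIES,
         PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
# Replace missing values with 0
fsdata[is.na(fsdata)] <- 0
# Keep events with at least one recorded fatality, injury, or damage amount
fsdata <- fsdata %>%
  filter(FATALITIES > 0 | INJURIES > 0 | PROPDMG > 0 | CROPDMG > 0)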
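
To illustrate the NA replacement and row filter described above, here is a minimal sketch on a made-up data frame. The rows in toy are hypothetical and are not taken from the NOAA file; only the two processing steps mirror the report's code.

```r
library(dplyr)

# Hypothetical toy rows, not taken from the NOAA file
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "FLOOD", "HEAT"),
  FATALITIES = c(2, 0, NA, 0),
  INJURIES   = c(10, 0, 0, 0),
  PROPDMG    = c(25, 0, 5, NA),
  CROPDMG    = c(0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Replace every NA with 0, as in the report's code
toy[is.na(toy)] <- 0

# Keep rows where at least one impact measure is positive
toy_kept <- toy %>%
  filter(FATALITIES > 0 | INJURIES > 0 | PROPDMG > 0 | CROPDMG > 0)

print(toy_kept$EVTYPE)
```

In this sketch the HAIL and HEAT rows are dropped because every impact column is 0 once the NA values are replaced, while FLOOD is kept because of its property damage.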

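# Convert a damage exponent code to its numeric multiplier.
# K, M, and B (in either case) map to 1e3, 1e6, and 1e9;
# any other value is treated as a multiplier of 1.
exp_to_num <- function(x){
  x <- toupper(x)
  ifelse(x == "K", 1e3,
  ifelse(x == "M", 1e6,
  ifelse(x == "B", 1e9, 1)))
}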

fsdata <- fsdata %>%
  mutate(
    prop_cost = PROPDMG * exp_to_num(PROPDMGEXP),
    crop_cost = CROPDMG * exp_to_num(CROPDMGEXP),
    total_cost = prop_cost + crop_cost
  )
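
As a quick check of the multiplier conversion, the sketch below applies the same helper to a few made-up inputs. The values in dmg and code are hypothetical and chosen only to exercise each branch, including a lowercase code and two codes that fall through to a multiplier of 1.

```r
# Same helper as in the report, repeated so this snippet runs on its own
exp_to_num <- function(x){
  x <- toupper(x)
  ifelse(x == "K", 1e3,
  ifelse(x == "M", 1e6,
  ifelse(x == "B", 1e9, 1)))
}

# Hypothetical damage values and exponent codes (not real records)
dmg  <- c(25, 2.5, 1.2, 300, 10)
code <- c("K", "m", "B", "", "5")

# Cost in USD: damage figure times its multiplier
cost <- dmg * exp_to_num(code)
print(cost)
```

Here 25 with code K becomes 25,000 USD, 2.5 with code m becomes 2.5 million USD, and the last two values are left unchanged because their codes are not K, M, or B.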

Results

Q1. Which types of events are most harmful with respect to population health?

health_summary <- fsdata %>%
  group_by(EVTYPE) %>%
  summarise(
    total_fatalities = sum(FATALITIES),
    total_injuries   = sum(INJURIES),
    total_harm = total_fatalities + total_injuries,
    .groups = "drop"
  ) %>%
  arrange(desc(total_harm))

head(health_summary, 10)
## # A tibble: 10 × 4
##    EVTYPE            total_fatalities total_injuries total_harm
##    <chr>                        <dbl>          <dbl>      <dbl>
##  1 TORNADO                       5633          91346      96979
##  2 EXCESSIVE HEAT                1903           6525       8428
##  3 TSTM WIND                      504           6957       7461
##  4 FLOOD                          470           6789       7259
##  5 LIGHTNING                      816           5230       6046
##  6 HEAT                           937           2100       3037
##  7 FLASH FLOOD                    978           1777       2755
##  8 ICE STORM                       89           1975       2064
##  9 THUNDERSTORM WIND              133           1488       1621
## 10 WINTER STORM                   206           1321       1527

top_health <- head(health_summary, 10)
ggplot(top_health, aes(x = reorder(EVTYPE, total_harm), y = total_harm)) +
  geom_col(fill = "tomato") +
  coord_flip() +
  labs(x = "Event type", y = "Total fatalities + injuries",
       title = "Top 10 Weather Events Harmful to Population Health")
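
The grouping step used for the health summary can be sketched on a small, made-up data frame. The rows in toy are hypothetical; the pipeline is the same group_by() and summarise() pattern as above, where total_harm is built from the two totals computed just before it in the same summarise() call.

```r
library(dplyr)

# Hypothetical toy rows, not taken from the NOAA file
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(1, 2, 3, 4),
  INJURIES   = c(10, 0, 5, 1),
  stringsAsFactors = FALSE
)

toy_summary <- toy %>%
  group_by(EVTYPE) %>%
  summarise(
    total_fatalities = sum(FATALITIES),
    total_injuries   = sum(INJURIES),
    # Uses the two columns created just above
    total_harm = total_fatalities + total_injuries,
    .groups = "drop"
  ) %>%
  arrange(desc(total_harm))

print(toy_summary)
```

The two TORNADO rows collapse into one row with a total_harm of 19, which sorts ahead of HEAT (5) and FLOOD (2).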

Q2. Which types of events have the greatest economic consequences?

econ_summary <- fsdata %>%
  group_by(EVTYPE) %>%
  summarise(total_econ_damage = sum(total_cost),
            .groups = "drop") %>%
  arrange(desc(total_econ_damage))

head(econ_summary, 10)
## # A tibble: 10 × 2
##    EVTYPE            total_econ_damage
##    <chr>                         <dbl>
##  1 FLOOD                 150319678257 
##  2 HURRICANE/TYPHOON      71913712800 
##  3 TORNADO                57352114049.
##  4 STORM SURGE            43323541000 
##  5 HAIL                   18758221521.
##  6 FLASH FLOOD            17562129167.
##  7 DROUGHT                15018672000 
##  8 HURRICANE              14610229010 
##  9 RIVER FLOOD            10148404500 
## 10 ICE STORM               8967041360

top_econ <- head(econ_summary, 10)
ggplot(top_econ, aes(x = reorder(EVTYPE, total_econ_damage),
                     y = total_econ_damage/1e9)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(x = "Event type",
       y = "Total economic damage (billion USD)",
       title = "Top 10 Weather Events by Economic Consequences")
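
To make the axis scaling in the plot concrete, the sketch below converts the three largest totals from the table above into billions of USD. The figures are copied from the printed output, so they are rounded as displayed there; the name top3 is introduced here only for illustration.

```r
# Totals copied from the head(econ_summary, 10) output in this report (USD),
# rounded as printed
top3 <- c(FLOOD = 150319678257,
          `HURRICANE/TYPHOON` = 71913712800,
          TORNADO = 57352114049)

# Same scaling as the y aesthetic in the plot: divide by 1e9
top3_billion <- round(top3 / 1e9, 1)
print(top3_billion)
```

On this scale floods account for about 150.3 billion USD, hurricanes/typhoons for about 71.9 billion USD, and tornadoes for about 57.4 billion USD.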