Synopsis

This analysis explores the NOAA Storm Database to determine the types of weather events that are most harmful to population health and those with the greatest economic consequences across the United States. By examining data on fatalities, injuries, and property/crop damage from 1950 to 2011, the report aims to support municipal managers in making informed decisions for disaster preparedness. Data transformations were minimal and only used to clean and summarize results. Key findings are visualized using simple plots.

Data Processing

# Load packages
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(readr)

# Read the data (read.csv decompresses .bz2 files directly)
storm_data <- read.csv("repdata_data_StormData.csv.bz2")

# View structure of the dataset
str(storm_data[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")])
## 'data.frame':    902297 obs. of  7 variables:
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
# Keep rows where the property or crop damage exponent is K, M, or B
valid_data <- storm_data %>%
  filter(PROPDMGEXP %in% c("K", "M", "B") | CROPDMGEXP %in% c("K", "M", "B"))

# Convert EVTYPE to uppercase so case variants count as one event type
valid_data$EVTYPE <- toupper(valid_data$EVTYPE)

# Defining multipliers
exp_vals <- c("K"=1e3, "M"=1e6, "B"=1e9)

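# Replace exponents with multipliers. A missing or unrecognised exponent
# indexes to NA, which would make TOTALDAMAGE NA and cause aggregate() to
# drop the row, so treat it as a multiplier of 0 instead.
valid_data <- valid_data %>%
  mutate(
    prop_mult = coalesce(unname(exp_vals[as.character(PROPDMGEXP)]), 0),
    crop_mult = coalesce(unname(exp_vals[as.character(CROPDMGEXP)]), 0),
    PROPDMGTOTAL = PROPDMG * prop_mult,
    CROPDMGTOTAL = CROPDMG * crop_mult,
    TOTALDAMAGE = PROPDMGTOTAL + CROPDMGTOTAL,
    TOTALHEALTH = FATALITIES + INJURIES
  )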

In this section, I processed the NOAA storm dataset to focus on the health and economic impacts of severe weather events. First, I kept only the rows where the exponent for property or crop damage is clearly defined as “K” (thousand), “M” (million), or “B” (billion), which simplifies the conversion of damages to actual dollar amounts. Next, I cleaned the event type names by converting them all to uppercase, so that the same event written in different letter cases is not counted twice. Then I converted the exponents into numeric multipliers, calculated the actual property and crop damages in dollars, and created two new columns: one for total economic damage (property + crop) and another for total health impact (fatalities + injuries).
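As a minimal sketch of the exponent conversion described above, the following runs the same steps on a small made-up data frame. The rows in `toy` are hypothetical and not taken from the NOAA data, and the choice to count a blank exponent as zero damage is an assumption of this sketch.

```r
library(dplyr)

# Hypothetical toy rows, not taken from the NOAA data
toy <- data.frame(
  EVTYPE     = c("Tornado", "TORNADO", "flood"),
  PROPDMG    = c(25, 2.5, 1),
  PROPDMGEXP = c("K", "M", "B"),
  CROPDMG    = c(0, 10, 0),
  CROPDMGEXP = c("", "K", ""),
  stringsAsFactors = FALSE
)

# Same multipliers as in the analysis
exp_vals <- c("K" = 1e3, "M" = 1e6, "B" = 1e9)

toy <- toy %>%
  mutate(
    # Merge case variants such as "Tornado" and "TORNADO"
    EVTYPE = toupper(EVTYPE),
    # An exponent that is not K, M, or B (here the empty string) indexes
    # to NA; this sketch assumes it should count as a multiplier of 0
    prop_mult = coalesce(unname(exp_vals[PROPDMGEXP]), 0),
    crop_mult = coalesce(unname(exp_vals[CROPDMGEXP]), 0),
    TOTALDAMAGE = PROPDMG * prop_mult + CROPDMG * crop_mult
  )

print(toy[, c("EVTYPE", "TOTALDAMAGE")])
```

On these toy rows the total damage works out to 25,000, 2,510,000, and 1,000,000,000 dollars, and the two tornado spellings collapse into a single event type. Without the `coalesce()` step, the first and third rows would have an `NA` total because their crop exponent is blank.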

Results

In this section, I identified the types of weather events that have the most significant impact on public health and the economy across the United States. I summed fatalities and injuries by event type to see which events are most harmful to population health, and summed property and crop damage by event type to find which events have the greatest economic consequences.

# Group the data by event type and sum the total health impact
health_impact <- aggregate(TOTALHEALTH ~ EVTYPE, data = valid_data, sum)

# Sorting the data in descending order of total health impact
health_impact <- health_impact[order(-health_impact$TOTALHEALTH), ]

# Top 10 most harmful event types
top10_health <- head(health_impact, 10)

# Plot the results as a horizontal bar chart
ggplot(top10_health, aes(x = reorder(EVTYPE, TOTALHEALTH), y = TOTALHEALTH)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Top 10 Weather Events Affecting Population Health",
    x = "Event Type",
    y = "Total Health Impact (Fatalities + Injuries)"
  )

# Group the data by event type and sum the total economic damage
economic_damage <- aggregate(TOTALDAMAGE ~ EVTYPE, data = valid_data, sum)

# Sorting the data by total damage in descending order
economic_damage <- economic_damage[order(-economic_damage$TOTALDAMAGE), ]

# Top 10 event types
top10_damage <- head(economic_damage, 10)

# Plot
ggplot(top10_damage, aes(x = reorder(EVTYPE, TOTALDAMAGE), y = TOTALDAMAGE / 1e9)) +
  geom_bar(stat = "identity", fill = "yellow") +
  coord_flip() +
  labs(
    title = "Top 10 Weather Events by Economic Damage",
    x = "Event Type",
    y = "Total Damage (Billions USD)"
  )
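
The sum-then-rank pattern used for both charts can be checked on a small made-up data frame. This is only an illustrative sketch: the values in `toy` are hypothetical, and `top2` stands in for the top-10 tables of the analysis.

```r
# Hypothetical toy data, not taken from the NOAA data
toy <- data.frame(
  EVTYPE      = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  TOTALHEALTH = c(15, 2, 14, 20, 1)
)

# Sum the health impact within each event type
by_event <- aggregate(TOTALHEALTH ~ EVTYPE, data = toy, sum)

# Rank from most to least harmful and keep the top two
by_event <- by_event[order(-by_event$TOTALHEALTH), ]
top2 <- head(by_event, 2)

print(top2)
```

Here TORNADO (15 + 14 = 29) ranks first and HEAT (20) second, while FLOOD (2 + 1 = 3) falls outside the top two.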

Conclusion

The analysis shows that tornadoes are the most harmful to population health (fatalities plus injuries), while floods and hurricanes cause the greatest economic damage (property plus crop). These findings can help prioritize preparedness and resource allocation for disaster management.