#Synopsis

This report analyzes the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database to identify which types of severe weather events are most harmful to population health and which have the greatest economic impact across the United States. The analysis covers data from 1950 to 2011, focusing on injuries, fatalities, and economic damage caused by various event types.

The raw data was processed to convert the damage exponent codes into numeric multipliers and to compute total health and economic impacts for each event type. The findings show that tornadoes are the most harmful events with respect to population health, causing the highest number of fatalities and injuries. Floods and hurricanes cause the greatest economic damage, measured as combined property and crop losses.

Two bar charts compare the top 10 event types by total harm and by financial loss. These insights can help government and emergency management agencies prioritize resource allocation and disaster preparedness strategies.

#Data Processing

We begin by loading the necessary libraries and reading in the raw data file provided by the NOAA Storm Database. The source file is a compressed CSV; the analysis reads it into R from repdata_data_StormData.csv in the working directory.

# Load required packages
library(dplyr)
library(ggplot2)
library(readr)
# Read the storm data (CSV file in the working directory)
storm_data <- read.csv("repdata_data_StormData.csv")

# View the structure of the data
str(storm_data)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
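
As a side note on the loading step: base R can read a compressed CSV without unpacking it first, because read.csv() opens the path through file(), which detects the compression. The following is a minimal sketch using a small made-up data frame and a temporary file rather than the real database; the names toy, path, and toy_back are illustrative only.

```r
# Toy data standing in for the storm database (values are made up)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(1, 0))

# Write the toy data to a bzip2-compressed temporary file
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() reads the compressed file directly
toy_back <- read.csv(path)
print(toy_back)
```

The same call works on an uncompressed .csv, so the loading code does not need to change depending on which form of the file is present.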

To address the questions posed, we are interested in the following variables:

- EVTYPE: Type of weather event
- FATALITIES: Number of fatalities
- INJURIES: Number of injuries
- PROPDMG, PROPDMGEXP: Property damage and its multiplier
- CROPDMG, CROPDMGEXP: Crop damage and its multiplier

We filter and transform the data to calculate the total health impact (fatalities + injuries) and total economic damage (property + crop damage). The damage exponent fields need to be mapped to actual multipliers.

# Select relevant columns
storm_data_clean <- storm_data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

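# Convert exponent fields to actual multipliers.
# H = hundreds, K = thousands, M = millions, B = billions (either case);
# "", "+", and "0" leave the amount unchanged (multiplier 1).
# Any other code maps to 0, so damage recorded with an unrecognized
# exponent is excluded from the totals.
exp_map <- function(exp) {
  ifelse(exp %in% c("h", "H"), 100,
  ifelse(exp %in% c("k", "K"), 1e3,
  ifelse(exp %in% c("m", "M"), 1e6,
  ifelse(exp %in% c("b", "B"), 1e9,
  ifelse(exp %in% c("", "+", "0"), 1,
  0)))))
}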

storm_data_clean <- storm_data_clean %>%
  mutate(
    PROPDMGEXP = exp_map(PROPDMGEXP),
    CROPDMGEXP = exp_map(CROPDMGEXP),
    PROPDMGTOT = PROPDMG * PROPDMGEXP,
    CROPDMGTOT = CROPDMG * CROPDMGEXP,
    TOTAL_DAMAGE = PROPDMGTOT + CROPDMGTOT,
    TOTAL_HEALTH = FATALITIES + INJURIES
  )
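
To make the exponent mapping concrete, here is a small worked example. It is a sketch only: it uses a named lookup vector instead of nested ifelse() calls, and the names multipliers, to_multiplier, and demo are hypothetical and not part of the analysis. The rules are meant to mirror exp_map(): letter codes scale the amount, "", "+", and "0" give a multiplier of 1, and anything else gives 0.

```r
# Hypothetical lookup table covering the letter codes handled by exp_map()
multipliers <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# to_multiplier() is an illustrative name, not part of the original analysis
to_multiplier <- function(code) {
  code <- toupper(code)
  out <- unname(multipliers[code])      # NA for codes missing from the table
  out[code %in% c("", "+", "0")] <- 1   # same rule as exp_map(): multiplier 1
  out[is.na(out)] <- 0                  # unrecognized codes contribute nothing
  out
}

# Made-up records: 25 K, 2.5 m, 1.2 B, and one unrecognized code
demo <- data.frame(
  PROPDMG = c(25, 2.5, 1.2, 5),
  PROPDMGEXP = c("K", "m", "B", "?")
)
demo$PROPDMGTOT <- demo$PROPDMG * to_multiplier(demo$PROPDMGEXP)
print(demo)
```

In this toy data the first record becomes 25,000 dollars, the second 2.5 million, the third 1.2 billion, and the last one drops to 0 because "?" is not a recognized code.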

Now, we group the data by event type (EVTYPE) and summarize the total damage and health impact per event type.

event_summary <- storm_data_clean %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Fatalities = sum(FATALITIES, na.rm = TRUE),
    Total_Injuries = sum(INJURIES, na.rm = TRUE),
    Total_Health = sum(TOTAL_HEALTH, na.rm = TRUE),
    Total_Property_Damage = sum(PROPDMGTOT, na.rm = TRUE),
    Total_Crop_Damage = sum(CROPDMGTOT, na.rm = TRUE),
    Total_Economic_Damage = sum(TOTAL_DAMAGE, na.rm = TRUE)
  ) %>%
  arrange(desc(Total_Health))
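
The behaviour of this group-and-summarize step is easier to check on a handful of rows. The sketch below applies the same dplyr verbs to made-up records; toy and toy_summary are illustrative names, and the values are not taken from the database.

```r
suppressPackageStartupMessages(library(dplyr))

# Made-up records standing in for the full data set
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(1, 0, 2, 3),
  INJURIES = c(15, 4, 6, 0)
)

# One row per event type, sorted by combined fatalities and injuries
toy_summary <- toy %>%
  mutate(TOTAL_HEALTH = FATALITIES + INJURIES) %>%
  group_by(EVTYPE) %>%
  summarise(
    Total_Fatalities = sum(FATALITIES, na.rm = TRUE),
    Total_Injuries = sum(INJURIES, na.rm = TRUE),
    Total_Health = sum(TOTAL_HEALTH, na.rm = TRUE)
  ) %>%
  arrange(desc(Total_Health))

print(toy_summary)
```

The two TORNADO rows collapse into a single row with 3 fatalities and 21 injuries, which places it first in the sorted summary.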

This processed data will be used in the next section to identify which types of weather events are most harmful to population health and which cause the greatest economic damage.
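
Note that the grouping code uses EVTYPE exactly as recorded, so labels that differ only in case or spacing would be counted as separate event types. One possible way to normalize labels before grouping is sketched below; normalize_evtype() and the sample labels are illustrative assumptions and are not part of the analysis reported here.

```r
# Illustrative helper: unify case and whitespace in event labels
normalize_evtype <- function(x) {
  x <- toupper(trimws(x))   # drop outer whitespace, unify case
  gsub("\\s+", " ", x)      # collapse repeated inner whitespace
}

# Made-up labels showing the kinds of variation this handles
labels <- c("TORNADO", " Tornado", "HEAVY  RAIN", "heavy rain")
normalized <- normalize_evtype(labels)
print(table(normalized))
```

This only handles case and whitespace. Merging genuinely different spellings or abbreviations of the same event would need an explicit mapping, which is a judgment call outside the scope of this sketch.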

#Results

##Most Harmful Events to Population Health

To determine which types of events are most harmful to population health, we consider both fatalities and injuries. We select the top 10 event types with the highest combined total of fatalities and injuries.

# Top 10 events for health impact
top_health <- event_summary %>%
  slice_max(Total_Health, n = 10) %>%
  arrange(desc(Total_Health))

# Plot
ggplot(top_health, aes(x = reorder(EVTYPE, -Total_Health), y = Total_Health, fill = EVTYPE)) +
  geom_col() +
  labs(title = "Top 10 Weather Events Affecting Population Health",
       x = "Event Type",
       y = "Total Fatalities and Injuries") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position = "none")
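
One detail of the top-10 selection is worth knowing: ties at the cutoff are kept by default, so the result can contain more than 10 rows. The sketch below illustrates this with made-up totals and n = 2; it assumes dplyr 1.0.0 or later for slice_max(), and the names toy, with_ties, and exactly_n are illustrative.

```r
suppressPackageStartupMessages(library(dplyr))

# Made-up totals with a tie at the second-highest value
toy <- data.frame(
  EVTYPE = c("A", "B", "C", "D"),
  Total_Health = c(9, 5, 5, 1)
)

# Default behaviour: both events tied at the cutoff are returned
with_ties <- toy %>% slice_max(Total_Health, n = 2)

# with_ties = FALSE returns exactly n rows
exactly_n <- toy %>% slice_max(Total_Health, n = 2, with_ties = FALSE)

print(nrow(with_ties))
print(nrow(exactly_n))
```

For totals as large as those in the storm data, exact ties are unlikely, so this mainly matters when the same code is reused on smaller or rounded data.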

Figure 1: This bar chart shows the top 10 weather events with the greatest health impact in the United States, based on total number of fatalities and injuries. Tornadoes are by far the most harmful, followed by excessive heat and floods.

##Events with the Greatest Economic Consequences

To determine which types of events have the greatest economic consequences, we sum both property and crop damage for each event type. The top 10 events with the highest total economic damage are shown below.

# Top 10 events for economic damage
top_econ <- event_summary %>%
  slice_max(Total_Economic_Damage, n = 10) %>%
  arrange(desc(Total_Economic_Damage))

# Plot
ggplot(top_econ, aes(x = reorder(EVTYPE, -Total_Economic_Damage), y = Total_Economic_Damage / 1e9, fill = EVTYPE)) +
  geom_col() +
  labs(title = "Top 10 Weather Events by Economic Damage",
       x = "Event Type",
       y = "Total Damage (Billion USD)") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position = "none")

Figure 2: This chart displays the top 10 events with the greatest economic impact. Floods cause the most economic damage, followed by hurricanes and storm surges.

#Conclusion

This analysis of the NOAA Storm Database reveals that tornadoes are the most harmful weather events to population health in the United States, causing the highest number of injuries and fatalities. In terms of economic consequences, floods result in the greatest total damage, followed by hurricanes and storm surges. These findings highlight the importance of prioritizing resources and preparedness efforts for these high-impact events.