This report explores the NOAA Storm Database to identify which weather events are most harmful to population health and which have the greatest economic consequences in the United States. The database covers 1950 to November 2011 and records fatalities, injuries, property damage, and crop damage for each event. The goal is to help municipal or government officials prioritize resource allocation for disaster preparedness.
# Load necessary libraries
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(data.table)
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:dplyr':
##
## between, first, last
# Load the data (fread reads .bz2 files directly when the R.utils package is installed)
data <- fread("repdata_data_StormData.csv.bz2")
# Check columns
names(data)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
# Summarizing population health impact
health_data <- data %>%
  group_by(EVTYPE) %>%
  summarise(Fatalities = sum(FATALITIES, na.rm = TRUE),
            Injuries = sum(INJURIES, na.rm = TRUE)) %>%
  mutate(TotalHarm = Fatalities + Injuries) %>%
  arrange(desc(TotalHarm))
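The summary above groups on the raw `EVTYPE` label, so labels that differ only in case or surrounding whitespace would be counted as separate events. The following is a minimal sketch of one way to collapse such variants before grouping. The toy data frame and the `clean_evtype` helper are illustrative assumptions and are not part of the original analysis.

```r
library(dplyr)

# Toy data standing in for the storm table; all values are made up.
toy <- data.frame(
  EVTYPE     = c("TORNADO", " Tornado", "tornado ", "FLOOD", "Flood"),
  FATALITIES = c(3, 1, 2, 4, 1),
  INJURIES   = c(10, 0, 5, 2, 3)
)

# Hypothetical helper: trim whitespace and uppercase the label so that
# case and spacing variants fall into the same group.
clean_evtype <- function(x) toupper(trimws(x))

toy_health <- toy %>%
  mutate(EVTYPE = clean_evtype(EVTYPE)) %>%
  group_by(EVTYPE) %>%
  summarise(Fatalities = sum(FATALITIES, na.rm = TRUE),
            Injuries = sum(INJURIES, na.rm = TRUE)) %>%
  mutate(TotalHarm = Fatalities + Injuries) %>%
  arrange(desc(TotalHarm))

print(toy_health)
```

This only handles case and whitespace. Synonymous labels, such as abbreviations of the same event, would need an explicit mapping.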
# Summarizing economic impact
# Convert property and crop damage using exponents
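convert_exp <- function(val, exp) {
  # Exponent codes are matched case-insensitively
  exp <- toupper(exp)
  multiplier <- case_when(
    exp == "K" ~ 1e3,  # thousands
    exp == "M" ~ 1e6,  # millions
    exp == "B" ~ 1e9,  # billions
    TRUE ~ 1           # any other code (blank, digit, symbol) leaves the value unscaled
  )
  val * multiplier
}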
data$PROPDMGVAL <- convert_exp(data$PROPDMG, data$PROPDMGEXP)
data$CROPDMGVAL <- convert_exp(data$CROPDMG, data$CROPDMGEXP)
economic_data <- data %>%
  group_by(EVTYPE) %>%
  summarise(Property = sum(PROPDMGVAL, na.rm = TRUE),
            Crop = sum(CROPDMGVAL, na.rm = TRUE)) %>%
  mutate(TotalDamage = Property + Crop) %>%
  arrange(desc(TotalDamage))
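Because the conversion treats every exponent code other than K, M, or B as a multiplier of 1, it can be useful to see how many records fall through to that default. The sketch below copies the K/M/B mapping so that it runs on its own, applies it to a few made-up values, and tabulates the codes that were not recognized. The sample vectors and the `fallthrough` name are assumptions for illustration only.

```r
library(dplyr)

# Copy of the K/M/B mapping used in the report, repeated so this block is self-contained.
convert_exp <- function(val, exp) {
  exp <- toupper(exp)
  multiplier <- case_when(
    exp == "K" ~ 1e3,
    exp == "M" ~ 1e6,
    exp == "B" ~ 1e9,
    TRUE ~ 1
  )
  val * multiplier
}

# Made-up damage values and exponent codes, including lowercase, blank, and symbol codes.
sample_val <- c(25, 2.5, 1, 10, 7)
sample_exp <- c("K", "m", "B", "", "?")

converted <- convert_exp(sample_val, sample_exp)
print(converted)

# Count the codes that are not K, M, or B and therefore default to a multiplier of 1.
upper_exp <- toupper(sample_exp)
fallthrough <- table(upper_exp[!upper_exp %in% c("K", "M", "B")])
print(fallthrough)
```

Running the same tabulation on `data$PROPDMGEXP` and `data$CROPDMGEXP` would show whether the default branch affects enough records to matter for the totals.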
top_health <- health_data[1:10, ]
ggplot(top_health, aes(x = reorder(EVTYPE, -TotalHarm), y = TotalHarm)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  theme_minimal() +
  labs(title = "Top 10 Events Most Harmful to Population Health",
       x = "Event Type", y = "Total Fatalities + Injuries") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
top_economic <- economic_data[1:10, ]
ggplot(top_economic, aes(x = reorder(EVTYPE, -TotalDamage), y = TotalDamage)) +
  geom_bar(stat = "identity", fill = "darkgreen") +
  theme_minimal() +
  labs(title = "Top 10 Events with Greatest Economic Consequences",
       x = "Event Type", y = "Total Property + Crop Damage (USD)") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
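Damage totals in raw dollars can produce axis labels in scientific notation that are hard to read. One possible adjustment, sketched here on made-up totals, is to rescale the values to billions before plotting. The `toy_economic` data frame and the `DamageBillions` column are hypothetical names introduced for this example, and the figures are not results from the analysis.

```r
library(dplyr)
library(ggplot2)

# Made-up totals for illustration; these are not the report's results.
toy_economic <- data.frame(
  EVTYPE      = c("FLOOD", "HURRICANE", "TORNADO"),
  TotalDamage = c(1.5e11, 9.0e10, 5.7e10)
)

# Rescale to billions of USD so the axis shows small, readable numbers.
toy_plot_data <- toy_economic %>%
  mutate(DamageBillions = TotalDamage / 1e9)

p <- ggplot(toy_plot_data,
            aes(x = reorder(EVTYPE, -DamageBillions), y = DamageBillions)) +
  geom_col(fill = "darkgreen") +
  theme_minimal() +
  labs(x = "Event Type", y = "Total Damage (billions of USD)") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p)
```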
From the analysis above:

- Tornadoes are the most harmful to population health, measured as combined fatalities and injuries.
- Floods have the highest economic impact, measured as combined property and crop damage.
These rankings can help policymakers and emergency services decide where to focus preparedness efforts.