This analysis explores the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database to determine which types of events are most harmful to population health and which have the greatest economic consequences. The results show that tornadoes are the most harmful in terms of injuries and fatalities, while floods and hurricanes cause the most economic damage. Understanding these patterns helps prioritize disaster preparedness and resource allocation.
The dataset is loaded into R, and only the variables needed for the analysis are kept: event type, fatalities, injuries, and the property and crop damage values with their exponent columns. Dropping the unused columns at this stage reduces complexity in the later steps.
# Load libraries
library(dplyr)
# Load data
data <- read.csv("repdata_data_StormData.csv")
# Select relevant columns
storm <- data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
Property and crop damage values are recorded alongside exponent indicators that give their magnitude (H for hundreds, K for thousands, M for millions, B for billions). A numeric indicator is treated as a power of ten, and a missing or unrecognized indicator is treated as a multiplier of 1. Each damage value is multiplied by its converted exponent to obtain the actual dollar amount, which is needed for an accurate total of economic impact.
# Function to convert damage exponent
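convert_exp <- function(exp) {
  # Missing or empty indicator: leave the value unscaled
  if (is.na(exp) || exp == "") return(1)
  if (exp %in% c("K", "k")) return(1e3)  # thousands
  if (exp %in% c("M", "m")) return(1e6)  # millions
  if (exp %in% c("B", "b")) return(1e9)  # billions
  if (exp %in% c("H", "h")) return(1e2)  # hundreds
  # A numeric indicator is treated as a power of ten
  if (grepl("^[0-9]+$", exp)) return(10^as.numeric(exp))
  # Any other symbol: leave the value unscaled
  return(1)
}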
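# Convert each exponent indicator to its numeric multiplier
# (vapply guarantees a plain numeric vector without element names)
storm$PROPDMGEXP <- vapply(storm$PROPDMGEXP, convert_exp, numeric(1), USE.NAMES = FALSE)
storm$CROPDMGEXP <- vapply(storm$CROPDMGEXP, convert_exp, numeric(1), USE.NAMES = FALSE)
# Scale the recorded values to actual dollar amounts
storm$PROPDMG <- storm$PROPDMG * storm$PROPDMGEXP
storm$CROPDMG <- storm$CROPDMG * storm$CROPDMGEXP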
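The conversion above calls convert_exp once per row, which can be slow on a large table. As a rough sketch only, the same mapping could be expressed in vectorized form with a lookup vector. The function name exp_to_multiplier and the sample values are hypothetical and not part of the original analysis; the rules mirror those of convert_exp.

```r
# Hypothetical vectorized counterpart of convert_exp() (illustrative name)
exp_to_multiplier <- function(exp) {
  exp <- toupper(as.character(exp))
  # Default multiplier of 1 covers NA, "", and unrecognized symbols
  mult <- rep(1, length(exp))
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  known <- exp %in% names(lookup)
  mult[known] <- unname(lookup[exp[known]])
  # Numeric indicators are treated as powers of ten
  digit <- grepl("^[0-9]+$", exp)
  mult[digit] <- 10^as.numeric(exp[digit])
  mult
}

# Made-up sample records: 25 with "K", 2.5 with "M", 10 with no indicator
damage <- c(25, 2.5, 10)
exps <- c("K", "M", "")
scaled <- damage * exp_to_multiplier(exps)
scaled  # 25000, 2500000, 10
```

Because the lookup handles the whole column at once, no per-row function call is needed; whether the speed-up matters depends on the size of the data.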
To identify the weather events most harmful to population health, fatalities and injuries are summed for each event type. The event types are then ranked by combined casualties (fatalities plus injuries), and the top ten are kept.
health <- storm %>%
  group_by(EVTYPE) %>%
  summarise(
    fatalities = sum(FATALITIES, na.rm = TRUE),
    injuries = sum(INJURIES, na.rm = TRUE)
  ) %>%
  arrange(desc(fatalities + injuries)) %>%
  head(10)
health
## # A tibble: 10 × 3
##    EVTYPE            fatalities injuries
##    <chr>                  <dbl>    <dbl>
##  1 TORNADO                 5633    91346
##  2 EXCESSIVE HEAT          1903     6525
##  3 TSTM WIND                504     6957
##  4 FLOOD                    470     6789
##  5 LIGHTNING                816     5230
##  6 HEAT                     937     2100
##  7 FLASH FLOOD              978     1777
##  8 ICE STORM                 89     1975
##  9 THUNDERSTORM WIND        133     1488
## 10 WINTER STORM             206     1321
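The ranking is by the sum of the two columns, which the table itself does not display. A minimal base R sketch of the same group, sum, and rank logic, with the combined total kept as an explicit column, might look like the following. The toy data frame is invented for illustration and is not drawn from the NOAA file.

```r
# Toy records, invented for illustration only
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 3, 4, 0),
  INJURIES = c(10, 5, 20, 1, 2)
)

# Sum both measures per event type
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Keep the combined total as its own column, then rank by it
totals$casualties <- totals$FATALITIES + totals$INJURIES
totals <- totals[order(-totals$casualties), ]
totals
```

Keeping the combined column makes the sort key visible in the output, so a reader can verify the ordering directly.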
barplot(health$fatalities + health$injuries,
        names.arg = health$EVTYPE,
        las = 2,
        col = "red",
        main = "Top 10 Most Harmful Events to Population Health",
        xlab = "Event Type",
        ylab = "Total Casualties")
The plot shows that tornadoes dominate total casualties: 96,979 (5,633 fatalities and 91,346 injuries), more than eleven times the 8,428 attributed to excessive heat, the second-ranked event type.
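With las = 2 the event names are drawn perpendicular to the axis, and long names can run past the default bottom margin. One possible adjustment, sketched here with base graphics, is to enlarge the margin through par(mar = ...) before plotting. The margin sizes and the choice to plot in thousands are assumptions made for illustration; the three totals are taken from the first three rows of the table above.

```r
# Combined casualties for the top three event types in the table above
casualties <- c(
  "TORNADO" = 5633 + 91346,
  "EXCESSIVE HEAT" = 1903 + 6525,
  "TSTM WIND" = 504 + 6957
)

# Enlarge the bottom margin so perpendicular labels fit (sizes are a guess)
op <- par(mar = c(9, 5, 4, 2))
mids <- barplot(casualties / 1000,
                las = 2,
                col = "red",
                main = "Most Harmful Events to Population Health",
                ylab = "Total casualties (thousands)")
par(op)  # restore the previous graphics settings
```

The x-axis title is omitted in this sketch because it would overlap the rotated labels in the enlarged margin.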
To evaluate the economic impact of weather events, the total property and crop damage is computed for each event type. These values are aggregated and sorted to determine which events result in the highest financial losses.
economic <- storm %>%
  group_by(EVTYPE) %>%
  summarise(
    damage = sum(PROPDMG + CROPDMG, na.rm = TRUE)
  ) %>%
  arrange(desc(damage)) %>%
  head(10)
economic
## # A tibble: 10 × 2
##    EVTYPE                  damage
##    <chr>                    <dbl>
##  1 FLOOD             150319678257
##  2 HURRICANE/TYPHOON  71913712800
##  3 TORNADO            57362333946.
##  4 STORM SURGE        43323541000
##  5 HAIL               18761221986.
##  6 FLASH FLOOD        18243991078.
##  7 DROUGHT            15018672000
##  8 HURRICANE          14610229010
##  9 RIVER FLOOD        10148404500
## 10 ICE STORM           8967041360
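One detail of sum(PROPDMG + CROPDMG, na.rm = TRUE) is worth noting: the addition happens row by row before the sum, so a row with a missing value in either column is dropped entirely. Whether the NOAA damage columns contain missing values is not established here; the vectors below are made up purely to show the difference between the two ways of summing.

```r
# Made-up values: one record is missing crop damage, one is missing property damage
prop <- c(100, NA, 50)
crop <- c(10, 5, NA)

# Row-wise addition first: rows 2 and 3 become NA and are dropped
rowwise_total <- sum(prop + crop, na.rm = TRUE)

# Summing each column separately keeps the non-missing half of those rows
columnwise_total <- sum(prop, na.rm = TRUE) + sum(crop, na.rm = TRUE)

rowwise_total     # 110
columnwise_total  # 165
```

If neither column has missing values, both forms give the same total, so the distinction only matters for incomplete records.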
barplot(economic$damage / 1e6,
        names.arg = economic$EVTYPE,
        las = 2,
        col = "blue",
        main = "Top 10 Events with Greatest Economic Damage",
        xlab = "Event Type",
        ylab = "Total Damage (Millions USD)")
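The plot shows that floods, at roughly $150 billion, account for about twice the losses of the second-ranked category, hurricanes/typhoons (about $72 billion). Tornadoes and storm surge follow at about $57 billion and $43 billion, respectively.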