This analysis explores the National Oceanic and Atmospheric Administration (NOAA) Storm Database from 1950 to 2011 to identify which types of severe weather events most affect public health and cause the greatest economic damage in the United States. The data includes fatalities, injuries, property damage, and crop damage. The analysis processes the raw data, calculates total health and economic impacts, and presents the 10 most damaging event types in each category. The results show that tornadoes harm population health the most, while floods cause the greatest economic damage. This information can help government and municipal managers prioritize resources for severe weather preparedness.
The raw data was downloaded from the provided URL and read into R
using read.csv(). The relevant columns were then selected: event
type (EVTYPE), fatalities (FATALITIES), injuries (INJURIES), property
damage (PROPDMG), crop damage (CROPDMG), and their exponent columns
(PROPDMGEXP, CROPDMGEXP). Property and crop damage were converted to
dollar values using the exponent codes in the dataset (K = thousand,
M = million, B = billion); values with any other code were left
unscaled. Total economic damage was calculated by summing the
converted property and crop damage. The data was then grouped by
event type to total fatalities, injuries, and economic damage, and the
event types were ranked by combined fatalities and injuries and by
total economic damage.
library(dplyr)
library(ggplot2)
library(knitr)
file_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
file_name <- "StormData.csv.bz2"
if (!file.exists(file_name)) {
  download.file(file_url, file_name)
}
storm_data <- read.csv(file_name)
dim(storm_data)
## [1] 902297 37
names(storm_data)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
# Keep only the columns needed for the health and economic summaries
storm_data <- storm_data %>%
  select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

# Convert damage figures to dollars using the exponent codes
storm_data <- storm_data %>%
  mutate(PROPDMGEXP = toupper(PROPDMGEXP),
         CROPDMGEXP = toupper(CROPDMGEXP),
         prop_multiplier = case_when(
           PROPDMGEXP == "K" ~ 1e3,
           PROPDMGEXP == "M" ~ 1e6,
           PROPDMGEXP == "B" ~ 1e9,
           TRUE ~ 1),  # any other code leaves the value unscaled
         crop_multiplier = case_when(
           CROPDMGEXP == "K" ~ 1e3,
           CROPDMGEXP == "M" ~ 1e6,
           CROPDMGEXP == "B" ~ 1e9,
           TRUE ~ 1),
         total_prop_damage = PROPDMG * prop_multiplier,
         total_crop_damage = CROPDMG * crop_multiplier,
         total_economic_damage = total_prop_damage + total_crop_damage)
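As a quick check of the conversion above, the following base R sketch applies the same K/M/B rule to a few made-up rows. The helper name exp_to_multiplier and the toy values are assumptions for illustration only; the analysis itself applies case_when() inline.

```r
# Hypothetical helper (not part of the original analysis): map an
# exponent code to a dollar multiplier. Unrecognized codes fall back to 1.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(code)])
  mult[is.na(mult)] <- 1
  mult
}

# Toy rows with made-up values (not taken from the NOAA data)
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "m", "B", "")
)

# Damage in dollars: the lowercase "m" is treated as "M",
# and the empty code is left unscaled
toy$prop_usd <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
print(toy)
```

Wrapping the rule in one function would also let the property and crop columns share a single definition instead of two parallel case_when() blocks.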
# Total fatalities and injuries per event type, ranked by their sum
health_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(FATALITIES = sum(FATALITIES, na.rm = TRUE),
            INJURIES = sum(INJURIES, na.rm = TRUE)) %>%
  arrange(desc(FATALITIES + INJURIES))
top_health_events <- head(health_impact, 10)
kable(top_health_events,
      caption = "Top 10 events affecting public health")
| EVTYPE | FATALITIES | INJURIES |
|---|---|---|
| TORNADO | 5633 | 91346 |
| EXCESSIVE HEAT | 1903 | 6525 |
| TSTM WIND | 504 | 6957 |
| FLOOD | 470 | 6789 |
| LIGHTNING | 816 | 5230 |
| HEAT | 937 | 2100 |
| FLASH FLOOD | 978 | 1777 |
| ICE STORM | 89 | 1975 |
| THUNDERSTORM WIND | 133 | 1488 |
| WINTER STORM | 206 | 1321 |
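The table lists TSTM WIND and THUNDERSTORM WIND as separate rows, because event types are grouped exactly as recorded. If an analyst decided these labels should be combined, one possible approach is sketched below in base R. The helper name normalize_evtype, the single merge rule, and the toy values are all assumptions; whether any two labels really describe the same event type is a judgment call that this analysis does not make.

```r
# Hypothetical helper (not part of the original analysis): standardize
# case and whitespace, then merge one pair of equivalent-looking labels.
normalize_evtype <- function(evtype) {
  x <- toupper(trimws(evtype))
  sub("^TSTM WIND$", "THUNDERSTORM WIND", x)
}

# Toy rows with made-up injury counts (not taken from the NOAA data)
toy <- data.frame(
  EVTYPE   = c("TSTM WIND", "Thunderstorm Wind", " TSTM WIND", "FLOOD"),
  INJURIES = c(3, 2, 1, 4)
)

toy$EVTYPE <- normalize_evtype(toy$EVTYPE)

# Sum injuries per normalized label
totals <- aggregate(INJURIES ~ EVTYPE, data = toy, FUN = sum)
print(totals)
```

Applied before the group_by(EVTYPE) step, a rule like this would change the totals and possibly the ranking, so any merge rules should be listed explicitly alongside the results.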
ggplot(top_health_events, aes(x = reorder(EVTYPE, FATALITIES + INJURIES),
                              y = FATALITIES + INJURIES)) +
  geom_col(fill = "#f68060",
           alpha = .8,
           width = .9) +
  coord_flip() +
  labs(title = "Top 10 Events Affecting Population Health",
       x = "Event Type",
       y = "Total Fatalities & Injuries",
       caption = "Source: NOAA") +
  theme_bw()
# Total property + crop damage per event type, ranked from highest to lowest
economic_impact <- storm_data %>%
  group_by(EVTYPE) %>%
  summarise(total_damage = sum(total_economic_damage, na.rm = TRUE)) %>%
  arrange(desc(total_damage))
top_economic_events <- head(economic_impact, 10)
kable(top_economic_events,
      caption = "Top 10 events with significant economic impact")
| EVTYPE | total_damage |
|---|---|
| FLOOD | 150319678257 |
| HURRICANE/TYPHOON | 71913712800 |
| TORNADO | 57352114049 |
| STORM SURGE | 43323541000 |
| HAIL | 18758221521 |
| FLASH FLOOD | 17562129167 |
| DROUGHT | 15018672000 |
| HURRICANE | 14610229010 |
| RIVER FLOOD | 10148404500 |
| ICE STORM | 8967041360 |
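The totals in the table are raw dollar amounts, which are hard to compare at a glance. As a small illustration, the sketch below copies the first three rows of the table and expresses them in billions of dollars, the same scaling used for the plot axis. The column name damage_billion_usd is an assumption introduced here.

```r
# First three rows copied from the table above
top3 <- data.frame(
  EVTYPE       = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO"),
  total_damage = c(150319678257, 71913712800, 57352114049)
)

# Express the totals in billions of dollars, rounded to one decimal place
top3$damage_billion_usd <- round(top3$total_damage / 1e9, 1)
print(top3)
```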
ggplot(top_economic_events, aes(x = reorder(EVTYPE, total_damage),
                                y = total_damage / 1e9)) +
  geom_col(fill = "lightgreen") +
  coord_flip() +
  labs(title = "Top 10 Events with Greatest Economic Impact",
       x = "Event Type",
       y = "Total Damage Value (Property + Crop) (Billions, USD)",
       caption = "Source: NOAA") +
  theme_bw()
This analysis of the NOAA Storm Database shows which severe weather events had the greatest impact on public health and the economy of the United States between 1950 and 2011. Tornadoes were by far the most harmful event type for population health, with 5,633 recorded fatalities and 91,346 injuries. Floods were the leading cause of economic damage, at roughly $150 billion in combined property and crop losses.
These findings underscore the importance of targeted preparedness and mitigation strategies. Municipal and government managers should prioritize resources based on the specific risks these event types pose. For instance, enhanced tornado warning systems and public shelters may be crucial for minimizing health impacts, while improved flood control measures and insurance programs could help mitigate economic losses.
Two limitations apply. The data spans several decades, over which weather patterns and reporting practices may have changed. Event types are also grouped exactly as recorded, so closely related labels (for example, TSTM WIND and THUNDERSTORM WIND, or HURRICANE and HURRICANE/TYPHOON) are counted separately. Future research could explore recent trends and incorporate additional factors, such as population density and infrastructure vulnerability, to give a more complete picture of severe weather impacts. Within these limits, the study offers a foundation for informed decisions on disaster preparedness and response.