This report explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) Storm Database to identify which types of severe weather events are most harmful to population health and which cause the greatest economic damage. The dataset includes events from 1950 to 2011. We analyze total fatalities and injuries to assess population health impact, and we combine property and crop damage to estimate economic consequences. The results are summarized with tables and visualized with bar plots.
The original data file repdata_data_StormData.csv.bz2 is loaded directly using read.csv(). No pre-processing was done outside this document.
# Download the compressed data file if it is not already present
if (!file.exists("repdata_data_StormData.csv.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                destfile = "repdata_data_StormData.csv.bz2", mode = "wb")
}
# Read the data; read.csv() decompresses the .bz2 file transparently
storm <- read.csv("repdata_data_StormData.csv.bz2")
We use the dplyr and ggplot2 packages for data manipulation and visualization.
library(dplyr)
library(ggplot2)
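We convert the event types to uppercase so that labels differing only in case are counted together. Variant spellings, such as TSTM WIND and THUNDERSTORM WIND, remain separate categories.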
storm$EVTYPE <- toupper(storm$EVTYPE)
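Variant spellings could be merged before aggregation, for example the TSTM WIND, THUNDERSTORM WIND, and THUNDERSTORM WINDS rows that appear separately in the tables in this report. The following is a minimal sketch of one way to do that. The helper name normalize_evtype and its rewrite rules are illustrative assumptions, not part of the original analysis, and applying them would change the totals reported here.

```r
# Hypothetical helper (not part of the original analysis): collapse a few
# spelling variants of thunderstorm wind into a single label.
normalize_evtype <- function(x) {
  x <- toupper(trimws(x))                 # unify case, drop surrounding spaces
  x <- gsub("[[:space:]]+", " ", x)       # collapse repeated internal spaces
  x <- sub("^TSTM WINDS?$", "THUNDERSTORM WIND", x)
  x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)
  x
}

# Toy input for illustration only
normalize_evtype(c("tstm wind", " Thunderstorm Winds", "THUNDERSTORM  WIND", "HAIL"))
```

In this report the helper would replace the toupper() call, for example storm$EVTYPE <- normalize_evtype(storm$EVTYPE). The rules cover only the variants visible in the top-10 tables; the full file may contain others.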
To determine which types of events are most harmful to population health, we sum fatalities and injuries for each event type and display the top 10.
health <- storm %>%
  group_by(EVTYPE) %>%
  summarise(
    Fatalities = sum(FATALITIES, na.rm = TRUE),
    Injuries = sum(INJURIES, na.rm = TRUE)
  ) %>%
  mutate(Total = Fatalities + Injuries) %>%
  arrange(desc(Total)) %>%
  slice(1:10)
## `summarise()` ungrouping output (override with `.groups` argument)
print(health)
## # A tibble: 10 x 4
##    EVTYPE            Fatalities Injuries Total
##    <chr>                  <dbl>    <dbl> <dbl>
##  1 TORNADO                 5633    91346 96979
##  2 EXCESSIVE HEAT          1903     6525  8428
##  3 TSTM WIND                504     6957  7461
##  4 FLOOD                    470     6789  7259
##  5 LIGHTNING                816     5230  6046
##  6 HEAT                     937     2100  3037
##  7 FLASH FLOOD              978     1777  2755
##  8 ICE STORM                 89     1975  2064
##  9 THUNDERSTORM WIND        133     1488  1621
## 10 WINTER STORM             206     1321  1527
ggplot(health, aes(x = reorder(EVTYPE, -Total), y = Total)) +
  geom_col(fill = "red") +
  coord_flip() +
  labs(title = "Top 10 Weather Events Harmful to Population Health",
       x = "Event Type", y = "Total Fatalities and Injuries")
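To assess economic damage, we sum the PROPDMG and CROPDMG columns for each event type and report the 10 largest combined totals. The values are summed as recorded; the magnitude codes in PROPDMGEXP and CROPDMGEXP are not applied, so the totals are not dollar amounts.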
econ <- storm %>%
  group_by(EVTYPE) %>%
  summarise(
    Property = sum(PROPDMG, na.rm = TRUE),
    Crop = sum(CROPDMG, na.rm = TRUE)
  ) %>%
  mutate(Total = Property + Crop) %>%
  arrange(desc(Total)) %>%
  slice(1:10)
## `summarise()` ungrouping output (override with `.groups` argument)
print(econ)
## # A tibble: 10 x 4
##    EVTYPE             Property    Crop    Total
##    <chr>                 <dbl>   <dbl>    <dbl>
##  1 TORNADO            3212258. 100019. 3312277.
##  2 FLASH FLOOD        1420125. 179200. 1599325.
##  3 TSTM WIND          1335996. 109203. 1445198.
##  4 HAIL                688693. 579596. 1268290.
##  5 FLOOD               899938. 168038. 1067976.
##  6 THUNDERSTORM WIND   876844.  66791.  943636.
##  7 LIGHTNING           603352.   3581.  606932.
##  8 THUNDERSTORM WINDS  446293.  18685.  464978.
##  9 HIGH WIND           324732.  17283.  342015.
## 10 WINTER STORM        132721.   1979.  134700.
ggplot(econ, aes(x = reorder(EVTYPE, -Total), y = Total)) +
  geom_col(fill = "darkgreen") +
  coord_flip() +
  labs(title = "Top 10 Weather Events with Greatest Economic Consequences",
       x = "Event Type", y = "Total Property and Crop Damage")
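The damage totals above sum the PROPDMG and CROPDMG columns as recorded. In the NOAA file these values are paired with magnitude codes in PROPDMGEXP and CROPDMGEXP (for example K for thousands, M for millions, B for billions), so a dollar-scale ranking would need to apply them first. The sketch below shows one way this could be done. The helper exp_to_multiplier, the toy data frame, and the choice to treat unrecognized codes as a multiplier of 1 are assumptions for illustration, not the method used in this report.

```r
# Hypothetical helper: convert a magnitude code to a numeric multiplier.
# Assumption: H = 1e2, K = 1e3, M = 1e6, B = 1e9; any other code
# (blank, digits, symbols) is treated as a multiplier of 1.
exp_to_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  mult[is.na(mult)] <- 1
  mult
}

# Toy rows for illustration only (not taken from the NOAA file)
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "FLOOD"),
  PROPDMG    = c(2.5, 500, 10),
  PROPDMGEXP = c("B", "K", ""),
  CROPDMG    = c(1, 0, 3),
  CROPDMGEXP = c("M", "", "k"),
  stringsAsFactors = FALSE
)

# Scale each recorded value by its magnitude code
toy$PropertyUSD <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
toy$CropUSD     <- toy$CROPDMG * exp_to_multiplier(toy$CROPDMGEXP)

# Total scaled damage per event type
aggregate(cbind(PropertyUSD, CropUSD) ~ EVTYPE, data = toy, FUN = sum)
```

Applied to the full data, the scaled columns would replace PROPDMG and CROPDMG inside the summarise() step. How to handle codes other than H, K, M, and B is a judgment call that should be stated alongside any results.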
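The data show that tornadoes are by far the most harmful weather event to population health, with 96,979 combined fatalities and injuries, more than eleven times the total for excessive heat, the next event type. Tornadoes also rank first in the damage table, followed by flash floods and thunderstorm winds (TSTM WIND). Because that table sums PROPDMG and CROPDMG as recorded, without the magnitude codes in PROPDMGEXP and CROPDMGEXP, it ranks recorded damage figures rather than dollar amounts. These insights can help policy makers and emergency managers prioritize preparedness for future severe weather events.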