Synopsis

This analysis explores the National Oceanic and Atmospheric Administration (NOAA) Storm Database from 1950 to 2011 to identify which types of severe weather events most harm public health and which have the greatest economic consequences in the United States. The data include details on fatalities, injuries, property damage, and crop damage. The analysis processes the raw data, calculates total health and economic impacts, and presents the ten most damaging event types for each. The results show that tornadoes harm population health the most, while floods cause the greatest economic damage. This information can help government and municipal managers prioritize resources for severe weather preparedness.

Data Processing

The raw data was downloaded from the provided URL and read into R using read.csv(). Relevant columns were selected: event type (EVTYPE), fatalities (FATALITIES), injuries (INJURIES), property damage (PROPDMG), crop damage (CROPDMG), and the two exponent columns (PROPDMGEXP, CROPDMGEXP). The property and crop damage values were converted to dollar amounts using the exponent codes (K = thousand, M = million, B = billion) provided in the dataset. Total economic damage was then calculated by summing the converted property and crop damage values. The data was grouped by event type, and the total fatalities, injuries, and economic damage were calculated for each type. Finally, the event types were ranked by combined fatalities and injuries for health impact and by total economic damage for economic impact.

Load necessary libraries

library(dplyr)
library(ggplot2)
library(knitr)

Download and read the data

file_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
file_name <- "StormData.csv.bz2"
if (!file.exists(file_name)) {
        download.file(file_url, file_name)
}

storm_data <- read.csv(file_name)
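
The read step can also be limited to the columns that are needed. The following is a minimal sketch, not the approach used in this report: it writes a tiny made-up bzip2 file to a temporary path (the `toy` data frame and its values are invented for illustration) and reads it back with read.csv(), dropping one column at read time through colClasses.

```r
# Build a tiny stand-in file so the sketch runs without the full download.
# All values here are made up for illustration.
tmp <- tempfile(fileext = ".csv.bz2")
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD"),
  FATALITIES = c(1, 0),
  REMARKS    = c("made-up remark", "another made-up remark")
)
con <- bzfile(tmp, "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() decompresses the bzip2 file transparently.
# A "NULL" entry in colClasses skips that column while reading.
small <- read.csv(tmp,
                  colClasses = c(EVTYPE = "character",
                                 FATALITIES = "numeric",
                                 REMARKS = "NULL"))
str(small)
unlink(tmp)
```

Applied to the real file, the same idea would skip wide text columns such as REMARKS, which may reduce memory use and read time.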

Dimensions

dim(storm_data)
## [1] 902297     37

Variable Names

names(storm_data)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"

Select relevant columns

storm_data <- storm_data %>%
        select(EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

Convert damage exponent to numeric multiplier

storm_data <- storm_data %>%
        # Upper-case the exponent codes so "k", "m", "b" match "K", "M", "B"
        mutate(PROPDMGEXP = toupper(PROPDMGEXP),
               CROPDMGEXP = toupper(CROPDMGEXP),
               # Any value other than K, M, or B falls through to a multiplier of 1
               prop_multiplier = case_when(
                       PROPDMGEXP == "K" ~ 1e3,
                       PROPDMGEXP == "M" ~ 1e6,
                       PROPDMGEXP == "B" ~ 1e9,
                       TRUE ~ 1),
               crop_multiplier = case_when(
                       CROPDMGEXP == "K" ~ 1e3,
                       CROPDMGEXP == "M" ~ 1e6,
                       CROPDMGEXP == "B" ~ 1e9,
                       TRUE ~ 1),
               # Damage in US dollars
               total_prop_damage = PROPDMG * prop_multiplier,
               total_crop_damage = CROPDMG * crop_multiplier,
               total_economic_damage = total_prop_damage + total_crop_damage)
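
To make the conversion rule concrete, here is a small base R sketch on made-up values. It uses a named-vector lookup instead of case_when(), so it is an alternative illustration of the same mapping rather than the code used above. The `toy` data frame, `exp_lookup`, and `damage_usd` are hypothetical names introduced only for this example.

```r
# Toy stand-in for PROPDMG / PROPDMGEXP (values are made up).
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10, 3),
  PROPDMGEXP = c("K", "m", "B", "", "H"),
  stringsAsFactors = FALSE
)

# Lookup table for the three exponent codes handled in this report.
exp_lookup <- c(K = 1e3, M = 1e6, B = 1e9)

# Map each code to its multiplier; codes not in the lookup come back as NA.
multiplier <- unname(exp_lookup[toupper(toy$PROPDMGEXP)])

# Mirror the TRUE ~ 1 fallback: unrecognised codes get a multiplier of 1.
multiplier[is.na(multiplier)] <- 1

toy$damage_usd <- toy$PROPDMG * multiplier
print(toy)
```

Note that the blank code and "H" both end up with a multiplier of 1 here, exactly as they would under the fallback branch of case_when().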

Results

Events most harmful to population health:

Aggregate by event type

health_impact <- storm_data %>% 
        group_by(EVTYPE) %>% 
        summarise(FATALITIES = sum(FATALITIES, na.rm = TRUE),
                  INJURIES = sum(INJURIES, na.rm = TRUE)) %>%
        arrange(desc(FATALITIES + INJURIES))

Top 10 events affecting public health

top_health_events <- head(health_impact, 10)
kable(top_health_events[1:10, ],
      caption = "Top 10 events affecting public health")
Table: Top 10 events affecting public health

|EVTYPE            | FATALITIES| INJURIES|
|:-----------------|----------:|--------:|
|TORNADO           |       5633|    91346|
|EXCESSIVE HEAT    |       1903|     6525|
|TSTM WIND         |        504|     6957|
|FLOOD             |        470|     6789|
|LIGHTNING         |        816|     5230|
|HEAT              |        937|     2100|
|FLASH FLOOD       |        978|     1777|
|ICE STORM         |         89|     1975|
|THUNDERSTORM WIND |        133|     1488|
|WINTER STORM      |        206|     1321|
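
As a quick check on the ranking, the table values can be re-entered by hand and the combined harm recomputed. This is only a sketch: the `top10` data frame is typed in from the table above rather than derived from `storm_data`, and the `TOTAL` column and `tornado_share` variable are illustrative names that do not appear in the analysis itself.

```r
# Values copied from the top-10 health table.
top10 <- data.frame(
  EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND", "FLOOD", "LIGHTNING",
             "HEAT", "FLASH FLOOD", "ICE STORM", "THUNDERSTORM WIND",
             "WINTER STORM"),
  FATALITIES = c(5633, 1903, 504, 470, 816, 937, 978, 89, 133, 206),
  INJURIES   = c(91346, 6525, 6957, 6789, 5230, 2100, 1777, 1975, 1488, 1321),
  stringsAsFactors = FALSE
)

# Combined harm is the quantity the ranking is sorted on.
top10$TOTAL <- top10$FATALITIES + top10$INJURIES

# Confirm the rows are in descending order of combined harm.
stopifnot(!is.unsorted(rev(top10$TOTAL)))

# Share of the top-10 combined harm attributable to tornadoes.
tornado_share <- top10$TOTAL[1] / sum(top10$TOTAL)

print(top10[, c("EVTYPE", "TOTAL")])
print(round(100 * tornado_share, 1))
```

Within these ten rows, tornadoes account for about 71 percent of the combined fatalities and injuries.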

Plot

ggplot(top_health_events, aes(x = reorder(EVTYPE, FATALITIES + INJURIES), 
                              y = FATALITIES + INJURIES)) +
        geom_bar(stat = "identity", 
                 fill = "#f68060",
                 alpha = .8, 
                 width = .9) +
        coord_flip() +
        labs(title ="Top 10 Events Affecting Population Health", 
             x = "Event Type", 
             y = "Total Fatalities & Injuries",
             caption = "Source: NOAA") +
        theme_bw()

Events with greatest economic consequences:

Aggregate by event type

economic_impact <- storm_data %>% 
        group_by(EVTYPE) %>% 
        summarise(total_damage = sum(total_economic_damage, na.rm = TRUE)) %>%
        arrange(desc(total_damage))

Top 10 events with significant economic impact

top_economic_events <- head(economic_impact, 10)
kable(top_economic_events[1:10, ],
      caption = "Top 10 events with significant economic impact")
Table: Top 10 events with significant economic impact

|EVTYPE            | total_damage|
|:-----------------|------------:|
|FLOOD             | 150319678257|
|HURRICANE/TYPHOON |  71913712800|
|TORNADO           |  57352114049|
|STORM SURGE       |  43323541000|
|HAIL              |  18758221521|
|FLASH FLOOD       |  17562129167|
|DROUGHT           |  15018672000|
|HURRICANE         |  14610229010|
|RIVER FLOOD       |  10148404500|
|ICE STORM         |   8967041360|
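
Some labels in this table appear to overlap, for example HURRICANE and HURRICANE/TYPHOON. The sketch below is not part of the analysis: it re-enters the table values by hand and shows, under the assumption that those two labels describe the same kind of event, how a combined category would compare in billions of USD. The `group` column and the "HURRICANE (combined)" label are hypothetical, and only rows from the top ten are included, so the totals are partial.

```r
# Values copied from the top-10 economic table.
econ <- data.frame(
  EVTYPE = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO", "STORM SURGE", "HAIL",
             "FLASH FLOOD", "DROUGHT", "HURRICANE", "RIVER FLOOD", "ICE STORM"),
  total_damage = c(150319678257, 71913712800, 57352114049, 43323541000,
                   18758221521, 17562129167, 15018672000, 14610229010,
                   10148404500, 8967041360),
  stringsAsFactors = FALSE
)

# Hypothetical grouping: treat the two hurricane labels as one category.
econ$group <- ifelse(econ$EVTYPE %in% c("HURRICANE", "HURRICANE/TYPHOON"),
                     "HURRICANE (combined)", econ$EVTYPE)

# Sum damage within each group, sort, and express in billions of USD.
by_group <- tapply(econ$total_damage, econ$group, sum)
by_group <- sort(by_group, decreasing = TRUE) / 1e9
print(round(by_group, 2))
```

Under this assumed grouping the combined hurricane category still ranks second, behind FLOOD, so the headline finding is unchanged.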

Plot

ggplot(top_economic_events, aes(x = reorder(EVTYPE, total_damage),
                                y = total_damage / 1e9)) +
        geom_bar(stat = "identity", 
                 fill = "lightgreen") +
        coord_flip() +
        labs(title = "Top 10 Events with Greatest Economic Impact", 
             x = "Event Type", 
             y = "Total Damage Value (Property + Crop) (Billions, USD)",
             caption = "Source: NOAA") +
        theme_bw()

Conclusion

This analysis of the NOAA Storm Database shows how severe weather events affect public health and the economy of the United States. Tornadoes were by far the most harmful event type for population health, with 5,633 fatalities and 91,346 injuries recorded. Floods were the leading cause of economic damage, with roughly $150 billion in combined property and crop losses.

These findings underscore the importance of targeted preparedness and mitigation strategies. Municipal and government managers should prioritize resources based on the specific risks these event types pose. For instance, enhanced tornado warning systems and public shelters may be crucial for minimizing health impacts, while improved flood control measures and insurance programs could help mitigate economic losses.

This analysis is based on data spanning 1950 to 2011, and weather patterns and reporting practices may have changed over that period. Future research could explore recent trends and incorporate additional factors, such as population density and infrastructure vulnerability, to provide a more comprehensive understanding of severe weather impacts. This study offers a foundation for informed decision-making in disaster preparedness and response.

Reference