Analysis of Storm Data: Impact on Population Health and Economic Consequences

In this analysis I found that the event type “TORNADO” in the stormdata database from the NOAA Satellite and Information Service is the most harmful to population health in the US, ranked by a weighted sum of the people who were injured or died in events of this type between 1950 and 2011. Furthermore, I found that the event type “FLOOD” has the greatest economic consequences in the US: it was responsible for roughly 150 billion USD in property and crop damage.

Synopsis

In this document we evaluate which types of events in the stormdata database (NOAA Satellite and Information Service, 1950 to 2011) are most harmful with respect to population health across the United States. We then show which types of events have the greatest economic consequences.

Data Processing

Used Libraries

To perform the necessary analysis, the following libraries were used:

library(R.utils)
library(data.table)
library(dplyr)
library(ggplot2)
library(scales)

Reading the Data

The dataset used for this analysis was provided in the Coursera course project (week 4). It was loaded with the fread function from the data.table package, which reads large files efficiently; fread relies on the R.utils package to decompress the .bz2 file directly:

stormdata <- fread("repdata_data_StormData.csv.bz2")

Variables of Interest for Population Health and Economic Consequences

To assess the impact of weather events on population health and the economy, we first examine the column names of the dataset to identify relevant variables:

names(stormdata)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"

From the column names, we identify that:

- For evaluating the impact on population health, the variables of interest are FATALITIES and INJURIES.
- For evaluating the economic consequences, the relevant variables are PROPDMG, PROPDMGEXP, CROPDMG, and CROPDMGEXP.

Adjusting the Economic Damage Data

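The PROPDMGEXP and CROPDMGEXP columns hold letter codes that indicate the magnitude of the damage amounts: H for hundreds, K for thousands, M for millions, and B for billions, in upper or lower case. We need to convert these codes into numerical multipliers and multiply them with PROPDMG and CROPDMG to get the actual damage values. Any other code, including a blank entry, is treated as a multiplier of 1.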

# Function to convert the PROPDMGEXP and CROPDMGEXP to numerical values
convert_exp <- function(exp) {
  ifelse(exp %in% c('h', 'H'), 100, 
    ifelse(exp %in% c('k', 'K'), 1000, 
      ifelse(exp %in% c('m', 'M'), 1e6, 
        ifelse(exp %in% c('b', 'B'), 1e9, 1))))
}

# Applying the conversion function to the data
stormdata <- stormdata %>%
  mutate(PROPDMGEXP = convert_exp(PROPDMGEXP),
         CROPDMGEXP = convert_exp(CROPDMGEXP),
         TOTAL_PROPDMG = PROPDMG * PROPDMGEXP,
         TOTAL_CROPDMG = CROPDMG * CROPDMGEXP,
         TOTAL_ECONOMIC_DAMAGE = TOTAL_PROPDMG + TOTAL_CROPDMG)
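
To make the conversion concrete, the following minimal sketch applies the same mapping to a few made-up rows. The toy data frame and its values are hypothetical and not taken from the dataset. The sketch also shows that codes outside h/k/m/b (here an empty string and "?") fall back to a multiplier of 1.

```r
# Same nested-ifelse mapping as in the analysis, repeated here so the
# example runs on its own (base R only).
convert_exp <- function(exp) {
  ifelse(exp %in% c("h", "H"), 100,
    ifelse(exp %in% c("k", "K"), 1000,
      ifelse(exp %in% c("m", "M"), 1e6,
        ifelse(exp %in% c("b", "B"), 1e9, 1))))
}

# Hypothetical toy rows (assumption: not real records from stormdata).
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10, 5),
  PROPDMGEXP = c("K", "M", "B", "", "?"),
  stringsAsFactors = FALSE
)

# Translate each code into a multiplier, then scale the raw amount.
toy$multiplier    <- convert_exp(toy$PROPDMGEXP)
toy$TOTAL_PROPDMG <- toy$PROPDMG * toy$multiplier
print(toy)
```

In this sketch the multiplier is kept in a separate column so the original code stays visible next to the result, whereas the analysis overwrites PROPDMGEXP in place.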

Results

Impact on Population Health

To determine which types of weather events are most harmful to population health, we calculate a weighted sum of fatalities and injuries for each event type. Each fatality counts 10 times as much as an injury to reflect its greater severity.
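
As a small worked example of this weighting, the sketch below scores two made-up event types. The event names, the counts, and the variable names toy_health and fatality_weight are hypothetical; only the weight of 10 comes from the text.

```r
# Hypothetical counts for two made-up event types (not from the dataset).
toy_health <- data.frame(
  EVTYPE     = c("EVENT A", "EVENT B"),
  FATALITIES = c(3, 0),
  INJURIES   = c(20, 45),
  stringsAsFactors = FALSE
)

fatality_weight <- 10  # weight stated in the text: one fatality = 10 injuries

# Weighted score: fatalities scaled by the weight, injuries counted once.
toy_health$score <- toy_health$FATALITIES * fatality_weight + toy_health$INJURIES
print(toy_health)
```

Here EVENT A scores 3 * 10 + 20 = 50 and EVENT B scores 45, so EVENT A ranks higher even though EVENT B affected more people in total. A different weight could change the ranking.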

Calculation and Data Transformation

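The following code selects the relevant variables, applies the weighting, and summarizes the data by event type. Note that the resulting column fat_count holds the weighted score, not a plain count of fatalities: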

health <- stormdata %>%
          select(EVTYPE, FATALITIES, INJURIES) %>%
          mutate(fat_weight = FATALITIES * 10, inj_weight = INJURIES) %>%
          mutate(fat_inj_sum = fat_weight + inj_weight) %>%
          group_by(EVTYPE) %>%
          summarise(fat_count = sum(fat_inj_sum)) %>%
          arrange(desc(fat_count))

# Display the top 10 most harmful events
health_top10 <- head(health, 10)
health_top10
## # A tibble: 10 × 2
##    EVTYPE         fat_count
##    <chr>              <dbl>
##  1 TORNADO           147676
##  2 EXCESSIVE HEAT     25555
##  3 LIGHTNING          13390
##  4 TSTM WIND          11997
##  5 FLASH FLOOD        11557
##  6 FLOOD              11489
##  7 HEAT               11470
##  8 RIP CURRENT         3912
##  9 HIGH WIND           3617
## 10 WINTER STORM        3381

Visualization of the Most Harmful Events

The following plot visualizes the top 10 most harmful weather events in terms of their impact on population health:

ggplot(health_top10, aes(x = reorder(EVTYPE, fat_count), y = fat_count)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Health Impact",
       x = "Event Type",
       y = "Weighted Fatalities and Injuries") +
  scale_y_continuous(labels = scales::comma) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))

Economic Consequences

To evaluate the economic impact of weather events, we now use the adjusted damage values and summarize them by event type.

Calculation and Data Transformation

economic <- stormdata %>%
            group_by(EVTYPE) %>%
            summarise(total_damage = sum(TOTAL_ECONOMIC_DAMAGE)) %>%
            arrange(desc(total_damage))

# Display the top 10 events by economic damage
economic_top10 <- head(economic, 10)
economic_top10
## # A tibble: 10 × 2
##    EVTYPE             total_damage
##    <chr>                     <dbl>
##  1 FLOOD             150319678257 
##  2 HURRICANE/TYPHOON  71913712800 
##  3 TORNADO            57352114049.
##  4 STORM SURGE        43323541000 
##  5 HAIL               18758222016.
##  6 FLASH FLOOD        17562129167.
##  7 DROUGHT            15018672000 
##  8 HURRICANE          14610229010 
##  9 RIVER FLOOD        10148404500 
## 10 ICE STORM           8967041360
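
Since the raw totals are hard to read at a glance, a small sketch like the following rescales them to billions of USD. The three values are copied from the printed table above (the tornado total is used as printed, without its fractional part). The variable names top3 and billions are illustrative.

```r
# Top three totals copied from the printed table (USD).
top3 <- c(
  "FLOOD"             = 150319678257,
  "HURRICANE/TYPHOON" = 71913712800,
  "TORNADO"           = 57352114049
)

# Rescale to billions of USD and round to one decimal place.
billions <- round(top3 / 1e9, 1)
print(billions)
```

On this scale the flood total comes to about 150 billion USD, roughly twice the hurricane/typhoon total.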

Visualization of the Economic Impact

The following plot visualizes the top 10 weather events in terms of economic damage:

ggplot(economic_top10, aes(x = reorder(EVTYPE, total_damage), y = total_damage)) +
  geom_bar(stat = "identity", fill = "darkred") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Economic Damage",
       x = "Event Type",
       y = "Total Economic Damage (USD)") +
  scale_y_continuous(labels = scales::comma) +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))

Conclusion

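The analysis identified the most harmful weather events in terms of both population health and economic impact. Tornadoes are by far the largest contributor to fatalities and injuries, with a weighted score of 147,676, followed by excessive heat (25,555) and lightning (13,390); flash floods and floods also rank in the top ten. Floods account for the largest economic losses at roughly 150 billion USD, followed by hurricanes/typhoons at roughly 72 billion USD. The findings underscore the importance of targeted disaster preparedness and mitigation efforts.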