Title: Impact of Severe Weather Events on Public Health and the Economy in the U.S. (1950–2011)

Synopsis

Severe weather events pose significant risks to both public health and economic stability in the United States. This analysis explores data from the NOAA Storm Database, covering events from 1950 to 2011, to identify which weather events have had the greatest impact.

Two key aspects are analyzed: (1) the impact of weather events on population health, measured by fatalities and injuries, and (2) the economic consequences, assessed through property and crop damage estimates. The findings indicate that tornadoes are the most harmful weather events in terms of human health, accounting for the highest number of combined fatalities and injuries. Excessive heat, flash floods, and thunderstorms also contribute significantly to public health risks.

Regarding economic impact, floods cause the greatest financial damage, followed by hurricanes/typhoons, tornadoes, and storm surges. These events result in substantial property and agricultural losses, emphasizing their financial burden on communities and government agencies.

This report provides insights into the most impactful weather events, aiding policymakers and emergency management in resource allocation and disaster preparedness.

Data processing

1. Load the dataset

data <- read.csv("//vf-lumc-home.lumcnet.prod.intern/lumc-home$/cbecher/MyDocs/R/repdata_data_StormData.csv")

2. Briefly inspect the dataset

Take a quick look at the structure of the dataset to see which columns are available.

str(data)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
summary(data)
##     STATE__       BGN_DATE           BGN_TIME          TIME_ZONE        
##  Min.   : 1.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.:19.0   Class :character   Class :character   Class :character  
##  Median :30.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :31.2                                                           
##  3rd Qu.:45.0                                                           
##  Max.   :95.0                                                           
##                                                                         
##      COUNTY       COUNTYNAME           STATE              EVTYPE         
##  Min.   :  0.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.: 31.0   Class :character   Class :character   Class :character  
##  Median : 75.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :100.6                                                           
##  3rd Qu.:131.0                                                           
##  Max.   :873.0                                                           
##                                                                          
##    BGN_RANGE          BGN_AZI           BGN_LOCATI          END_DATE        
##  Min.   :   0.000   Length:902297      Length:902297      Length:902297     
##  1st Qu.:   0.000   Class :character   Class :character   Class :character  
##  Median :   0.000   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :   1.484                                                           
##  3rd Qu.:   1.000                                                           
##  Max.   :3749.000                                                           
##                                                                             
##    END_TIME           COUNTY_END COUNTYENDN       END_RANGE       
##  Length:902297      Min.   :0    Mode:logical   Min.   :  0.0000  
##  Class :character   1st Qu.:0    NA's:902297    1st Qu.:  0.0000  
##  Mode  :character   Median :0                   Median :  0.0000  
##                     Mean   :0                   Mean   :  0.9862  
##                     3rd Qu.:0                   3rd Qu.:  0.0000  
##                     Max.   :0                   Max.   :925.0000  
##                                                                   
##    END_AZI           END_LOCATI            LENGTH              WIDTH         
##  Length:902297      Length:902297      Min.   :   0.0000   Min.   :   0.000  
##  Class :character   Class :character   1st Qu.:   0.0000   1st Qu.:   0.000  
##  Mode  :character   Mode  :character   Median :   0.0000   Median :   0.000  
##                                        Mean   :   0.2301   Mean   :   7.503  
##                                        3rd Qu.:   0.0000   3rd Qu.:   0.000  
##                                        Max.   :2315.0000   Max.   :4400.000  
##                                                                              
##        F               MAG            FATALITIES          INJURIES        
##  Min.   :0.0      Min.   :    0.0   Min.   :  0.0000   Min.   :   0.0000  
##  1st Qu.:0.0      1st Qu.:    0.0   1st Qu.:  0.0000   1st Qu.:   0.0000  
##  Median :1.0      Median :   50.0   Median :  0.0000   Median :   0.0000  
##  Mean   :0.9      Mean   :   46.9   Mean   :  0.0168   Mean   :   0.1557  
##  3rd Qu.:1.0      3rd Qu.:   75.0   3rd Qu.:  0.0000   3rd Qu.:   0.0000  
##  Max.   :5.0      Max.   :22000.0   Max.   :583.0000   Max.   :1700.0000  
##  NA's   :843563                                                           
##     PROPDMG         PROPDMGEXP           CROPDMG         CROPDMGEXP       
##  Min.   :   0.00   Length:902297      Min.   :  0.000   Length:902297     
##  1st Qu.:   0.00   Class :character   1st Qu.:  0.000   Class :character  
##  Median :   0.00   Mode  :character   Median :  0.000   Mode  :character  
##  Mean   :  12.06                      Mean   :  1.527                     
##  3rd Qu.:   0.50                      3rd Qu.:  0.000                     
##  Max.   :5000.00                      Max.   :990.000                     
##                                                                           
##      WFO             STATEOFFIC         ZONENAMES            LATITUDE   
##  Length:902297      Length:902297      Length:902297      Min.   :   0  
##  Class :character   Class :character   Class :character   1st Qu.:2802  
##  Mode  :character   Mode  :character   Mode  :character   Median :3540  
##                                                           Mean   :2875  
##                                                           3rd Qu.:4019  
##                                                           Max.   :9706  
##                                                           NA's   :47    
##    LONGITUDE        LATITUDE_E     LONGITUDE_       REMARKS         
##  Min.   :-14451   Min.   :   0   Min.   :-14455   Length:902297     
##  1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0   Class :character  
##  Median :  8707   Median :   0   Median :     0   Mode  :character  
##  Mean   :  6940   Mean   :1452   Mean   :  3509                     
##  3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735                     
##  Max.   : 17124   Max.   :9706   Max.   :106220                     
##                   NA's   :40                                        
##      REFNUM      
##  Min.   :     1  
##  1st Qu.:225575  
##  Median :451149  
##  Mean   :451149  
##  3rd Qu.:676723  
##  Max.   :902297  
## 
head(data)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL TORNADO
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL TORNADO
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL TORNADO
##   BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1         0                                               0         NA
## 2         0                                               0         NA
## 3         0                                               0         NA
## 4         0                                               0         NA
## 5         0                                               0         NA
## 6         0                                               0         NA
##   END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1         0                      14.0   100 3   0          0       15    25.0
## 2         0                       2.0   150 2   0          0        0     2.5
## 3         0                       0.1   123 2   0          0        2    25.0
## 4         0                       0.0   100 2   0          0        2     2.5
## 5         0                       0.0   150 2   0          0        2     2.5
## 6         0                       1.5   177 2   0          0        6     2.5
##   PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1          K       0                                         3040      8812
## 2          K       0                                         3042      8755
## 3          K       0                                         3340      8742
## 4          K       0                                         3458      8626
## 5          K       0                                         3412      8642
## 6          K       0                                         3450      8748
##   LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1       3051       8806              1
## 2          0          0              2
## 3          0          0              3
## 4          0          0              4
## 5          0          0              5
## 6          0          0              6

Conclusion: The key variables needed to answer the questions are EVTYPE, FATALITIES, INJURIES, PROPDMG (property damage estimates), PROPDMGEXP (exponent for property damage), CROPDMG (crop damage estimates), and CROPDMGEXP (exponent for crop damage).

Further data processing steps include:

- Converting the coded exponents (K for thousand, M for million, B for billion) to numeric multipliers
- Calculating the total damage
- Grouping the data by event type
- Summarizing total property and crop damage
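The processing steps listed above can be sketched on a small made-up data frame. This is a minimal illustration in base R, not the code used in this report: the toy values, the `multiplier` lookup, and the `to_multiplier()` helper are assumptions introduced only for this example.

```r
# Toy data in the same layout as the storm data (values are made up)
toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(2.5, 1, 25, 10),
  PROPDMGEXP = c("M", "B", "K", ""),
  CROPDMG    = c(5, 0, 0, 3),
  CROPDMGEXP = c("K", "", "", "M"),
  stringsAsFactors = FALSE
)

# Step 1: translate exponent codes into numeric multipliers
# (hypothetical helper; empty or unknown codes are treated as 1)
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)
to_multiplier <- function(code) {
  m <- unname(multiplier[toupper(code)])
  m[is.na(m)] <- 1
  m
}

# Step 2: damage in USD = reported number * multiplier
toy$prop_damage  <- toy$PROPDMG * to_multiplier(toy$PROPDMGEXP)
toy$crop_damage  <- toy$CROPDMG * to_multiplier(toy$CROPDMGEXP)
toy$total_damage <- toy$prop_damage + toy$crop_damage

# Steps 3 and 4: group by event type, sum, and sort from largest to smallest
totals <- aggregate(total_damage ~ EVTYPE, data = toy, FUN = sum)
totals <- totals[order(-totals$total_damage), ]
totals
```

For example, the third toy row (PROPDMG = 25, PROPDMGEXP = "K") corresponds to 25 * 1,000 = 25,000 USD of property damage.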

library(dplyr)
## Warning: package 'dplyr' was built under R version 4.3.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.3
# Function to convert exponent values
convert_exp <- function(exp) {
  # Ensure exp is treated as character
  exp <- as.character(exp)
  exp <- toupper(exp) # Convert to uppercase for consistency

  # Check if exp is NA or an empty string and return 1 (meaning no multiplier)
  if (is.na(exp) || exp == "") {
    return(1)
  }
  
  # Map known exponents to their numerical values
  if (exp == "B") {
    return(1e9)  # Billion
  } else if (exp == "M") {
    return(1e6)  # Million
  } else if (exp == "K") {
    return(1e3)  # Thousand
  } else if (grepl("^[0-9]$", exp)) {  
    # If exp is a single digit (e.g., "2" means 10^2 = 100)
    return(10^as.numeric(exp))
  } else {
    return(1)  # Default: treat unknown values as 1
  }
}

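# Replace the exponent codes with numeric multipliers, one element at a time
# (USE.NAMES = FALSE avoids attaching the original codes as element names)
data$PROPDMGEXP <- sapply(data$PROPDMGEXP, convert_exp, USE.NAMES = FALSE)
data$CROPDMGEXP <- sapply(data$CROPDMGEXP, convert_exp, USE.NAMES = FALSE)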
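Because `convert_exp()` handles one value at a time, `sapply()` calls it once per row, which is roughly 900,000 calls per column. A vectorized variant is one possible alternative. The sketch below assumes the same mapping rules as the function above; the name `convert_exp_vec` is hypothetical and not part of the original analysis.

```r
# Hypothetical vectorized counterpart of convert_exp(), same mapping rules
convert_exp_vec <- function(exp) {
  exp <- toupper(as.character(exp))
  out <- rep(1, length(exp))           # default for NA, "", and unknown codes
  out[exp %in% "K"] <- 1e3             # thousand
  out[exp %in% "M"] <- 1e6             # million
  out[exp %in% "B"] <- 1e9             # billion
  digit <- grepl("^[0-9]$", exp)       # a single digit d is read as 10^d
  out[digit] <- 10^as.numeric(exp[digit])
  out
}

# Small check on a handful of codes, including lowercase, empty, and NA
codes <- c("K", "m", "B", "", NA, "5", "+", "?")
res <- convert_exp_vec(codes)
res  # multipliers: 1e3, 1e6, 1e9, 1, 1, 1e5, 1, 1
```

Running `table(data$PROPDMGEXP, useNA = "ifany")` before the conversion would show which codes actually occur and how many rows fall into the default case.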

Results

3. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

The top 10 weather events causing the highest number of fatalities and injuries are shown in Figure 1. Tornadoes have the most significant impact on population health, followed by excessive heat and thunderstorm winds.

# Summarize total fatalities and injuries per event type
health_impact <- data %>%
  group_by(EVTYPE) %>%
  summarise(
    total_fatalities = sum(FATALITIES, na.rm = TRUE),
    total_injuries = sum(INJURIES, na.rm = TRUE)
  ) %>%
  arrange(desc(total_fatalities + total_injuries))

# Show top 10 most harmful events
head(health_impact, 10)
## # A tibble: 10 × 3
##    EVTYPE            total_fatalities total_injuries
##    <chr>                        <dbl>          <dbl>
##  1 TORNADO                       5633          91346
##  2 EXCESSIVE HEAT                1903           6525
##  3 TSTM WIND                      504           6957
##  4 FLOOD                          470           6789
##  5 LIGHTNING                      816           5230
##  6 HEAT                           937           2100
##  7 FLASH FLOOD                    978           1777
##  8 ICE STORM                       89           1975
##  9 THUNDERSTORM WIND              133           1488
## 10 WINTER STORM                   206           1321

Let’s visualize this in a bar chart:

top_health_events <- head(health_impact, 10)
ggplot(top_health_events, aes(x = reorder(EVTYPE, -(total_fatalities + total_injuries)), 
                              y = total_fatalities + total_injuries, fill = total_fatalities)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 10 Most Harmful Weather Events to Population Health",
       x = "Event Type",
       y = "Total Fatalities and Injuries") +
  theme_minimal()

Figure 1. Impact of weather events on population health: the top 10 weather event types that caused the highest number of fatalities and injuries in the U.S. between 1950 and 2011.
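The top 10 table in this section lists TSTM WIND and THUNDERSTORM WIND as separate event types, although both labels appear to describe the same kind of event. One way such spelling variants could be collapsed before grouping is sketched below. The helper `normalize_evtype()` is hypothetical, it covers only this one abbreviation, and the two input rows are copied from the table rather than recomputed from the full dataset.

```r
# Hypothetical helper: collapse a few obvious spelling variants of EVTYPE
normalize_evtype <- function(evtype) {
  x <- toupper(trimws(evtype))
  x <- gsub("[[:space:]]+", " ", x)      # squeeze repeated whitespace
  x <- sub("^TSTM", "THUNDERSTORM", x)   # "TSTM WIND" -> "THUNDERSTORM WIND"
  x
}

# The two thunderstorm-wind rows, copied from the top 10 table
tstm <- data.frame(
  EVTYPE           = c("TSTM WIND", "THUNDERSTORM WIND"),
  total_fatalities = c(504, 133),
  total_injuries   = c(6957, 1488),
  stringsAsFactors = FALSE
)

# Normalize the labels, then sum both columns per (merged) event type
tstm$EVTYPE <- normalize_evtype(tstm$EVTYPE)
merged <- aggregate(cbind(total_fatalities, total_injuries) ~ EVTYPE,
                    data = tstm, FUN = sum)
merged
```

On these two rows alone the merged label has 637 fatalities and 8,445 injuries. The full dataset may contain further variants, so the effect on the ranking would need to be checked on the complete data.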

4. Economic Consequences: Across the United States, which types of events have the greatest economic consequences?

# Compute total property and crop damage
data$prop_damage <- data$PROPDMG * data$PROPDMGEXP
data$crop_damage <- data$CROPDMG * data$CROPDMGEXP

# Summarize total economic impact per event type
economic_impact <- data %>%
  group_by(EVTYPE) %>%
  summarise(
    total_property_damage = sum(prop_damage, na.rm = TRUE),
    total_crop_damage = sum(crop_damage, na.rm = TRUE),
    total_economic_loss = total_property_damage + total_crop_damage
  ) %>%
  arrange(desc(total_economic_loss))

# Display top 10 most economically damaging events
head(economic_impact, 10)
## # A tibble: 10 × 4
##    EVTYPE            total_property_damage total_crop_damage total_economic_loss
##    <chr>                             <dbl>             <dbl>               <dbl>
##  1 FLOOD                     144657709807         5661968450       150319678257 
##  2 HURRICANE/TYPHOON          69305840000         2607872800        71913712800 
##  3 TORNADO                    56947380676.         414953270        57362333946.
##  4 STORM SURGE                43323536000               5000        43323541000 
##  5 HAIL                       15735267018.        3025954473        18761221491.
##  6 FLASH FLOOD                16822673978.        1421317100        18243991078.
##  7 DROUGHT                     1046106000        13972566000        15018672000 
##  8 HURRICANE                  11868319010         2741910000        14610229010 
##  9 RIVER FLOOD                 5118945500         5029459000        10148404500 
## 10 ICE STORM                   3944927860         5022113500         8967041360
top_economic_events <- head(economic_impact, 10)
ggplot(top_economic_events, aes(x = reorder(EVTYPE, -total_economic_loss), y = total_economic_loss)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  coord_flip() +
  labs(title = "Top 10 Weather Events Causing the Greatest Economic Damage",
       x = "Event Type",
       y = "Total Economic Damage (USD)") +
  theme_minimal()

Figure 2. Economic Impact of Weather Events: Illustration of the top 10 weather event types that resulted in the greatest economic damage in the U.S. from 1950 to 2011
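The totals above are reported in raw USD, which makes the larger values hard to read. A minimal sketch of rescaling to billions of USD follows. It uses the three largest totals copied from the table (the tornado value is truncated as printed), and the column name `loss_billion_usd` is an assumption made for this example.

```r
# Three largest totals, copied from the economic impact table (raw USD)
top3 <- data.frame(
  EVTYPE = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO"),
  total_economic_loss = c(150319678257, 71913712800, 57362333946),
  stringsAsFactors = FALSE
)

# Rescale to billions of USD and round to one decimal place
top3$loss_billion_usd <- round(top3$total_economic_loss / 1e9, 1)
top3  # loss_billion_usd: 150.3, 71.9, 57.4
```

The same scaling could be applied in the plot by mapping `y` to `total_economic_loss / 1e9` and changing the axis label to billions of USD.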