1. Title

The Impact of Severe Weather Events on Health and Economy Across the United States

2. Synopsis

This analysis examines the impact of severe weather events in the United States using NOAA Storm Database data from 1950 to November 2011. It identifies the event types most harmful to population health and those causing the greatest economic damage. Health impact is measured as the sum of fatalities and injuries, and economic impact as the sum of property and crop damage. Tornadoes are the most harmful to health, with the highest combined total of fatalities and injuries; excessive heat and floods are also significant. Economically, floods cause the greatest financial losses, followed by hurricanes and tornadoes. These findings can guide policymakers in prioritizing resources and strategies to mitigate the effects of severe weather events.

3. Data Processing

1. Load the Data

  • We read the compressed .bz2 file directly into R using read.csv().
data <- read.csv("~/repdata_data_StormData.csv.bz2")
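
As a quick check that read.csv() decompresses .bz2 input on its own, the sketch below writes a tiny made-up table to a temporary .bz2 file and reads it back. The table and the temporary file are hypothetical stand-ins for the real Storm Data file.

```r
# Hypothetical toy table standing in for the Storm Data file
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(1, 0))

# Write it through a bzip2-compressed connection
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() is given only the path; the compression is detected on read
toy_back <- read.csv(tmp)
print(toy_back)
```

No explicit bzfile() call is needed on the reading side, which is why the single read.csv() line above is enough for the full data set.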

2. Inspect the Data

  • We explore the structure using str() and summary() to understand the variables, especially EVTYPE, FATALITIES, INJURIES, and the economic damage columns (PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP).
str(data)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
summary(data)
##     STATE__       BGN_DATE           BGN_TIME          TIME_ZONE        
##  Min.   : 1.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.:19.0   Class :character   Class :character   Class :character  
##  Median :30.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :31.2                                                           
##  3rd Qu.:45.0                                                           
##  Max.   :95.0                                                           
##                                                                         
##      COUNTY       COUNTYNAME           STATE              EVTYPE         
##  Min.   :  0.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.: 31.0   Class :character   Class :character   Class :character  
##  Median : 75.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :100.6                                                           
##  3rd Qu.:131.0                                                           
##  Max.   :873.0                                                           
##                                                                          
##    BGN_RANGE          BGN_AZI           BGN_LOCATI          END_DATE        
##  Min.   :   0.000   Length:902297      Length:902297      Length:902297     
##  1st Qu.:   0.000   Class :character   Class :character   Class :character  
##  Median :   0.000   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :   1.484                                                           
##  3rd Qu.:   1.000                                                           
##  Max.   :3749.000                                                           
##                                                                             
##    END_TIME           COUNTY_END COUNTYENDN       END_RANGE       
##  Length:902297      Min.   :0    Mode:logical   Min.   :  0.0000  
##  Class :character   1st Qu.:0    NA's:902297    1st Qu.:  0.0000  
##  Mode  :character   Median :0                   Median :  0.0000  
##                     Mean   :0                   Mean   :  0.9862  
##                     3rd Qu.:0                   3rd Qu.:  0.0000  
##                     Max.   :0                   Max.   :925.0000  
##                                                                   
##    END_AZI           END_LOCATI            LENGTH              WIDTH         
##  Length:902297      Length:902297      Min.   :   0.0000   Min.   :   0.000  
##  Class :character   Class :character   1st Qu.:   0.0000   1st Qu.:   0.000  
##  Mode  :character   Mode  :character   Median :   0.0000   Median :   0.000  
##                                        Mean   :   0.2301   Mean   :   7.503  
##                                        3rd Qu.:   0.0000   3rd Qu.:   0.000  
##                                        Max.   :2315.0000   Max.   :4400.000  
##                                                                              
##        F               MAG            FATALITIES          INJURIES        
##  Min.   :0.0      Min.   :    0.0   Min.   :  0.0000   Min.   :   0.0000  
##  1st Qu.:0.0      1st Qu.:    0.0   1st Qu.:  0.0000   1st Qu.:   0.0000  
##  Median :1.0      Median :   50.0   Median :  0.0000   Median :   0.0000  
##  Mean   :0.9      Mean   :   46.9   Mean   :  0.0168   Mean   :   0.1557  
##  3rd Qu.:1.0      3rd Qu.:   75.0   3rd Qu.:  0.0000   3rd Qu.:   0.0000  
##  Max.   :5.0      Max.   :22000.0   Max.   :583.0000   Max.   :1700.0000  
##  NA's   :843563                                                           
##     PROPDMG         PROPDMGEXP           CROPDMG         CROPDMGEXP       
##  Min.   :   0.00   Length:902297      Min.   :  0.000   Length:902297     
##  1st Qu.:   0.00   Class :character   1st Qu.:  0.000   Class :character  
##  Median :   0.00   Mode  :character   Median :  0.000   Mode  :character  
##  Mean   :  12.06                      Mean   :  1.527                     
##  3rd Qu.:   0.50                      3rd Qu.:  0.000                     
##  Max.   :5000.00                      Max.   :990.000                     
##                                                                           
##      WFO             STATEOFFIC         ZONENAMES            LATITUDE   
##  Length:902297      Length:902297      Length:902297      Min.   :   0  
##  Class :character   Class :character   Class :character   1st Qu.:2802  
##  Mode  :character   Mode  :character   Mode  :character   Median :3540  
##                                                           Mean   :2875  
##                                                           3rd Qu.:4019  
##                                                           Max.   :9706  
##                                                           NA's   :47    
##    LONGITUDE        LATITUDE_E     LONGITUDE_       REMARKS         
##  Min.   :-14451   Min.   :   0   Min.   :-14455   Length:902297     
##  1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0   Class :character  
##  Median :  8707   Median :   0   Median :     0   Mode  :character  
##  Mean   :  6940   Mean   :1452   Mean   :  3509                     
##  3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735                     
##  Max.   : 17124   Max.   :9706   Max.   :106220                     
##                   NA's   :40                                        
##      REFNUM      
##  Min.   :     1  
##  1st Qu.:225575  
##  Median :451149  
##  Mean   :451149  
##  3rd Qu.:676723  
##  Max.   :902297  
## 
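
Because summary() reports only length and class for character columns, it may help to tabulate the exponent codes before cleaning them. The sketch below does this on a hypothetical sample vector; on the real data the same calls could be applied to data$PROPDMGEXP and data$CROPDMGEXP.

```r
# Hypothetical sample of raw exponent codes, including a few messy ones
raw_exp <- c("K", "K", "M", "", "B", "k", "5", "?")

# Frequency of each code
code_counts <- table(raw_exp)
print(code_counts)

# Codes that are neither K/M/B (in either case) nor blank
known <- c("K", "k", "M", "m", "B", "b", "")
unexpected <- unique(raw_exp[!(raw_exp %in% known)])
print(unexpected)
```

Any code listed in unexpected is one that a cleaning step has to handle explicitly or map to a default.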

3. Clean and Transform

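  • Event types (EVTYPE) are grouped exactly as recorded; the code below does not merge variant spellings or capitalizations of the same event.
  • We convert the damage exponent codes (PROPDMGEXP, CROPDMGEXP) to numeric multipliers (K = 1,000, M = 1,000,000, B = 1,000,000,000); blank, missing, or unrecognized codes default to 1.
  • Finally, we create aggregate variables for total health impact (FATALITIES + INJURIES) and total economic impact (property damage + crop damage, each scaled by its multiplier).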
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.4.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# Define a helper function to handle the exponent mapping
convert_exp <- function(exp) {
  case_when(
    exp %in% c("K", "k") ~ 1e3,
    exp %in% c("M", "m") ~ 1e6,
    exp %in% c("B", "b") ~ 1e9,
    exp %in% c("", " ", NA) ~ 1, # Empty, missing, or blank values
    TRUE ~ 1                     # Default for unexpected values
  )
}

# Apply the helper function and calculate new columns
data <- data %>%
  mutate(
    PROPDMGEXP = convert_exp(PROPDMGEXP),
    CROPDMGEXP = convert_exp(CROPDMGEXP),
    Total_Economic_Impact = PROPDMG * PROPDMGEXP + CROPDMG * CROPDMGEXP,
    Total_Health_Impact = FATALITIES + INJURIES
  )
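
The transformation code above leaves EVTYPE unchanged. A minimal sketch of one way to standardize event labels follows, assuming that case and whitespace differences are the main inconsistency; the sample labels are hypothetical, and merging true synonyms would need additional rules.

```r
# Hypothetical raw labels showing typical inconsistencies
raw_evtype <- c("TORNADO", " Tornado", "tornado ", "FLASH FLOOD", "Flash  Flood")

# Upper-case, trim outer whitespace, and collapse repeated inner spaces
clean_evtype <- gsub("\\s+", " ", trimws(toupper(raw_evtype)))
print(unique(clean_evtype))
```

Applied to the real column, the same expression could be assigned back to data$EVTYPE before grouping, so that variants of one label are summed together.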

4. Verify the Transformation

  • After applying this code, we check the unique values of the transformed exponent columns and summarize the total economic impact:
# Check transformed exponents
unique(data$PROPDMGEXP)
## [1] 1e+03 1e+06 1e+00 1e+09
unique(data$CROPDMGEXP)
## [1] 1e+00 1e+06 1e+03 1e+09
# Summary of economic impact
summary(data$Total_Economic_Impact)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.00e+00 0.00e+00 0.00e+00 5.28e+05 1.00e+03 1.15e+11
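
A hand-computed spot check complements these summaries. The sketch below recomputes the economic impact of the first record from the values printed by str(data) (PROPDMG = 25, PROPDMGEXP = "K", CROPDMG = 0, CROPDMGEXP = ""). The helper to_multiplier() is a hypothetical base R stand-in for convert_exp(), used so the check runs without dplyr.

```r
# Values of the first record as shown by str(data) before the transformation
propdmg <- 25
propdmgexp <- "K"
cropdmg <- 0
cropdmgexp <- ""

# Lookup table with the same K/M/B mapping; anything else falls back to 1
multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
to_multiplier <- function(code) {
  m <- multipliers[toupper(code)]
  unname(ifelse(is.na(m), 1, m))
}

# 25 * 1000 + 0 * 1
expected <- propdmg * to_multiplier(propdmgexp) + cropdmg * to_multiplier(cropdmgexp)
print(expected)
```

The result can be compared with data$Total_Economic_Impact[1] in the transformed data frame.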

4. Results

Q1: Most Harmful Events for Population Health

top_health <- data %>%
  group_by(EVTYPE) %>%
  summarize(Total_Health_Impact = sum(Total_Health_Impact)) %>%
  arrange(desc(Total_Health_Impact)) %>%
  slice(1:10)

library(ggplot2)
ggplot(top_health, aes(x = reorder(EVTYPE, Total_Health_Impact), y = Total_Health_Impact)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 10 Most Harmful Weather Events to Population Health",
       x = "Event Type", y = "Total Health Impact (Fatalities + Injuries)")

  • Based on the analysis, tornadoes are the most harmful weather events to population health, causing the highest total number of fatalities and injuries across the United States. Excessive heat events rank second, significantly contributing to fatalities. These findings highlight the critical need for preparedness and targeted safety measures for these specific event types.
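
Since the ranking uses the sum of fatalities and injuries, it can be useful to look at the two measures separately. The sketch below does so on a hypothetical toy data frame; its numbers are made up for illustration and are not results from the Storm Database. Base R is used so the example runs without extra packages.

```r
# Hypothetical toy records; the real analysis would use the full data frame
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "EXCESSIVE HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 5, 1),
  INJURIES   = c(30, 10, 4, 3)
)

# Sum each health measure separately per event type
by_type <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by fatalities alone to compare against the combined ranking
ranked <- by_type[order(-by_type$FATALITIES), ]
print(ranked)
```

In this toy example the event type with the most fatalities is not the one with the largest combined total, which is the kind of difference a separate ranking would reveal.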

Q2: Greatest Economic Consequences

top_economic <- data %>%
  group_by(EVTYPE) %>%
  summarize(Total_Economic_Impact = sum(Total_Economic_Impact)) %>%
  arrange(desc(Total_Economic_Impact)) %>%
  slice(1:10)

ggplot(top_economic, aes(x = reorder(EVTYPE, Total_Economic_Impact), y = Total_Economic_Impact)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Top 10 Weather Events by Economic Impact",
       x = "Event Type", y = "Total Economic Impact (USD)")

  • According to the analysis, floods cause the greatest economic losses, with the highest combined damage to property and crops across the United States. Hurricanes and tornadoes follow, also contributing significantly to financial damage. These findings highlight the substantial economic risks posed by severe weather events and the need for robust infrastructure and resource allocation to mitigate these impacts.
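
Totals of this size are easier to read in billions of USD than in raw dollars. The sketch below shows the rescaling on a hypothetical data frame; the event names and amounts are placeholders, not results from the analysis.

```r
# Hypothetical totals in USD (made-up values, not results from the analysis)
toy_totals <- data.frame(
  EVTYPE = c("EVENT A", "EVENT B"),
  Total_Economic_Impact = c(2.5e9, 4e8)
)

# Express totals in billions of USD, rounded to two decimals
toy_totals$Impact_Billions <- round(toy_totals$Total_Economic_Impact / 1e9, 2)
print(toy_totals)
```

Plotting a column like Impact_Billions with a y-axis label such as "Total Economic Impact (billion USD)" would avoid scientific notation on the axis.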
