Health and Economic Impact of Severe Weather Events in the United States

Michelle Jaeger

January 25, 2015

Synopsis

This paper explores the NOAA (National Oceanic and Atmospheric Administration) storm database to answer two questions about severe weather events in the U.S.: Which types of events are most harmful to population health? Which types of events have the greatest economic consequences? The events in the database span 1950 to 2011.

Data Processing

The relevant variables for the questions at hand were event type, fatalities, injuries, property damage, and crop damage (along with the variables PROPDMGEXP and CROPDMGEXP, to be explained below).

# Download the bzip2-compressed CSV file
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
              destfile="storm_data.csv.bz2", method="curl")
# read.csv decompresses the .bz2 file and reads it into a data frame.
# Only 7 of the 37 columns are kept; a colClasses entry of 'NULL' skips a column.
col_classes <- rep('NULL', 37)
col_classes[8] <- 'factor'                   # EVTYPE
col_classes[c(23, 24, 25, 27)] <- 'numeric'  # FATALITIES, INJURIES, PROPDMG, CROPDMG
col_classes[c(26, 28)] <- 'factor'           # PROPDMGEXP, CROPDMGEXP
storm_data <- read.csv("storm_data.csv.bz2", nrows=903871, colClasses=col_classes)

The dataset was reduced for two reasons. First, the questions at hand ask which types of events cause the most damage, so it seemed sensible to focus on the records with the most significant damage. Second, National Weather Service documentation defines 48 types of severe weather events, yet 985 event types were found in the NOAA database. On examination, this appears to be due to inconsistent data recording, misspellings, and similar problems. Reducing the dataset made it somewhat easier to bundle events into similar categories.
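As a small illustration of how inconsistent recording inflates the number of event types, the sketch below counts distinct labels before and after a simple normalization. The labels are invented for the example and are not taken from the NOAA file. Trimming whitespace and unifying case is only one possible cleanup step, not the procedure used in this report.

```r
# Toy labels invented for illustration; they are not taken from the NOAA file.
evtype <- c("TSTM WIND", "Tstm Wind", " TSTM WIND", "FLOOD", "Flood", "HAIL")

# Raw labels: each distinct spelling counts as its own event type
n_raw <- length(unique(evtype))

# Normalized labels: trim whitespace and unify case before counting
normalized <- toupper(trimws(evtype))
n_normalized <- length(unique(normalized))

print(c(raw = n_raw, normalized = n_normalized))
```

Even this light normalization merges variants that differ only in case or spacing. Misspellings and abbreviations still need pattern-based merging.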

# Remove rows with no fatalities/injuries or property/crop damage
reduced_storm_data <- storm_data[storm_data$FATALITIES > 0 | storm_data$INJURIES > 0 | 
                      storm_data$PROPDMG > 0 | storm_data$CROPDMG > 0, ]

The data for property damage and crop damage contain occasional very large numbers. This is likely why each damage figure is stored as a small number in one column (PROPDMG or CROPDMG), while the next column (PROPDMGEXP or CROPDMGEXP) specifies the scale. For example, 250 in one column and “K” in the next indicates 250,000. Two new columns with the actual (large) numbers were created.
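The scaling described above amounts to multiplying the recorded figure by ten raised to a power given by the letter code. The following is a minimal sketch of that arithmetic on made-up values. `scale_damage` is a hypothetical helper introduced only for this illustration, and treating a blank code as "no scaling" is an assumption.

```r
# Hypothetical helper for illustration only.
# Letter codes map to powers of ten; anything else is assumed to mean no scaling.
scale_damage <- function(amount, exp_code) {
    powers <- c(H = 2, K = 3, M = 6, B = 9)
    power <- powers[toupper(exp_code)]   # unknown or blank codes give NA
    power[is.na(power)] <- 0             # assumption: no code means 10^0 = 1
    unname(amount * 10^power)
}

# Made-up figures: 250 with "K", 1.5 with "B", 10 with no code
scaled <- scale_damage(c(250, 1.5, 10), c("K", "B", ""))
print(scaled)
```

With these inputs, 250 with "K" becomes 250 * 10^3 = 250,000, matching the example in the text.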

library(dplyr)
# Translate an exponent code into a power of ten: H = 2, K = 3, M = 6, B = 9.
# Digit codes are used as given; blanks and symbols ("?", "-", "+") mean no scaling.
exp_to_power <- function(code) {
    # Convert factor labels to text so digits are not read as factor level codes
    code <- toupper(as.character(code))
    power <- suppressWarnings(as.numeric(code))  # digits convert, everything else is NA
    power[code == "H"] <- 2
    power[code == "K"] <- 3
    power[code == "M"] <- 6
    power[code == "B"] <- 9
    power[is.na(power)] <- 0
    power
}
prop_dmg_v <- exp_to_power(reduced_storm_data$PROPDMGEXP)
crop_dmg_v <- exp_to_power(reduced_storm_data$CROPDMGEXP)
# Actual damage = recorded figure x 10^power (e.g. 250 with "K" is 250 * 10^3 = 250,000)
new_reduced_storm_data <- mutate(reduced_storm_data, REALPROPDMG = PROPDMG * 10 ^ prop_dmg_v)
new_reduced_storm_data <- mutate(new_reduced_storm_data, REALCROPDMG = CROPDMG * 10 ^ crop_dmg_v)
# Further reduce dataset
frsd <- new_reduced_storm_data[new_reduced_storm_data$FATALITIES > 50 |
                               new_reduced_storm_data$INJURIES > 500 |
                               new_reduced_storm_data$REALPROPDMG >= 100000000 |
                               new_reduced_storm_data$REALCROPDMG >= 100000000, ]
# Merge together event types
reduction <- function(evtype) {
    evtype <- as.character(evtype)
    # Change plural to singular before matching, so "FLOODS" is treated like "FLOOD"
    evtype <- sub("^(.+)s$", "\\1", evtype, ignore.case = TRUE)
    # Patterns are tried in order and the first match wins, so specific patterns
    # ("extreme cold", "flash") come before general ones ("cold", "flood")
    categories <- c(
        "drought"        = "Drought",
        "extreme cold"   = "Extreme Cold",
        "cold"           = "Cold/Wind Chill",
        "fire"           = "Wildfire",
        "flash"          = "Flash Flood",
        "flood"          = "Flood",
        "fog"            = "Dense Fog",
        "hail"           = "Hail",
        "hurricane"      = "Hurricane",
        "rain"           = "Heavy Rain",
        "rip"            = "Rip Current",
        "surf"           = "High Surf",
        "surge"          = "Storm Surge/Tide",
        "thunderstorm"   = "Thunderstorm Wind",
        "TSTM"           = "Thunderstorm Wind",
        "tornado"        = "Tornado",
        "tropical storm" = "Tropical Storm",
        "typhoon"        = "Hurricane",
        "wave"           = "Excessive Heat",
        "extreme heat"   = "Excessive Heat",
        "heat"           = "Heat",
        "warm"           = "Heat"
    )
    for (pattern in names(categories)) {
        if (grepl(pattern, evtype, ignore.case = TRUE)) {
            return(categories[[pattern]])
        }
    }
    # Event types that match no pattern keep their (singular) name
    evtype
}

# vapply returns a plain character vector rather than a list column
frsd$EVTYPE <- vapply(frsd$EVTYPE, reduction, character(1), USE.NAMES = FALSE)
# Calculate Fatalities per Event Type
fatalities_vector <- c()
events_vector <- c()
for (ev in unique(frsd$EVTYPE)) {
    if (is.null(ev)) {
        ev <- "NA"
    }
    events_vector <- c(events_vector, ev)
    rows <- frsd[frsd$EVTYPE == ev,]
    total_fatalities <- sum(rows$FATALITIES)
    fatalities_vector <- c(fatalities_vector, total_fatalities)
}
fatalities <- fatalities_vector
events <- events_vector
df <- data.frame(events, fatalities)
new_df <- arrange(df, desc(fatalities))
top_df <- new_df[1:5,]
top_events <- as.character(top_df$events)
top_fatalities <- as.numeric(top_df$fatalities)

# Calculate Injuries per Event Type
injuries_vector <- c()
events_vector <- c()
for (ev in unique(frsd$EVTYPE)) {
  if (is.null(ev)) {
    ev <- "NA"
  }
  events_vector <- c(events_vector, ev)
  rows <- frsd[frsd$EVTYPE == ev,]
  total_injuries <- sum(rows$INJURIES)
  injuries_vector <- c(injuries_vector, total_injuries)
}
injuries <- injuries_vector
events <- events_vector
dfi <- data.frame(injuries, events)
dfi <- arrange(dfi, desc(injuries))
top_dfi <- dfi[1:5,]
top_events_i <- as.character(top_dfi$events)
top_injuries <- as.numeric(top_dfi$injuries)

# Calculate Property Damage per Event Type
property_vector <- c()
events_vector <- c()
for (ev in unique(frsd$EVTYPE)) {
  if (is.null(ev)) {
    ev <- "NA"
  }
  events_vector <- c(events_vector, ev)
  rows <- frsd[frsd$EVTYPE == ev,]
  total_prop_damage <- sum(rows$REALPROPDMG)
  property_vector <- c(property_vector, total_prop_damage)
}
prop_damage <- property_vector
events <- events_vector
dfp <- data.frame(events, prop_damage)
dfp <- arrange(dfp, desc(prop_damage))
top_dfp <- dfp[1:3,]
top_events_p <- as.character(top_dfp$events)
top_prop_damage <- as.numeric(top_dfp$prop_damage)

# Calculate Crop Damage per Event Type
crop_vector <- c()
events_vector <- c()
for (ev in unique(frsd$EVTYPE)) {
  if (is.null(ev)) {
    ev <- "NA"
  }
  events_vector <- c(events_vector, ev)
  rows <- frsd[frsd$EVTYPE == ev,]
  total_crop_damage <- sum(rows$REALCROPDMG)
  crop_vector <- c(crop_vector, total_crop_damage)
}
crop_damage <- crop_vector
events <- events_vector
dfc <- data.frame(events, crop_damage)
dfc <- arrange(dfc, desc(crop_damage))
top_dfc <- dfc[1:3,]
top_events_c <- as.character(top_dfc$events)
top_crop_damage <- as.numeric(top_dfc$crop_damage)
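The four loops above all follow the same pattern: group the rows by event type, sum one column, sort in descending order, and keep the top few. As a compact alternative, the sketch below does the same with base R's `aggregate()` on a toy data frame. The data are invented for illustration and the column names simply mirror those used in this report.

```r
# Toy data invented for illustration; not taken from the NOAA database.
toy <- data.frame(
    EVTYPE = c("Tornado", "Heat", "Tornado", "Flood", "Heat"),
    FATALITIES = c(10, 5, 7, 2, 4),
    stringsAsFactors = FALSE
)

# Sum fatalities within each event type
totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# Sort from largest to smallest total and keep the top two
totals <- totals[order(totals$FATALITIES, decreasing = TRUE), ]
top_two <- head(totals, 2)
print(top_two)
```

The same call works for injuries or damage by swapping the column on the left of the formula. Note that it requires the event type column to be an ordinary vector or factor rather than a list.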

Results

boxplot(top_fatalities~top_events, top_df, xlab="Event Type", ylab="Fatalities",
        main="Fig.1 Weather Events Causing Most Fatalities")

According to the analyzed data (see Figure 1), tornadoes were the largest cause of fatalities, and heat was the second largest.
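Because each event type contributes a single total to these figures, a bar chart is one alternative way to show the ranking. The sketch below uses base R's `barplot()` with invented totals. The numbers are hypothetical and are not results from this analysis.

```r
# Hypothetical totals, invented for illustration only
toy_totals <- c(Heat = 9, Tornado = 17, Flood = 2)

# Sort so the tallest bar comes first
toy_totals <- sort(toy_totals, decreasing = TRUE)

# barplot() draws one bar per named element and returns the bar midpoints
bar_mids <- barplot(toy_totals, xlab = "Event Type", ylab = "Fatalities",
                    main = "Toy example: fatalities by event type")
```

The names of the vector become the bar labels, so no separate label vector is needed.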

boxplot(top_injuries~top_events_i, top_dfi, xlab="Event Type", ylab="Injuries",
        main="Fig. 2 Weather Events Causing Most Injuries")

Tornadoes were also by far the largest cause of injuries (see Figure 2). Other notable events were floods, hurricanes, and heat.

boxplot(top_prop_damage~top_events_p, top_dfp, xlab="Event Type",
        ylab="Property Damage (E notation)",
        main="Fig. 3 Weather Events Causing Most Property Damage")

As shown in Figure 3, flash floods have caused the most property damage. Tornadoes, besides being the leading cause of harm to human health, have caused the third most property damage of the severe weather events examined. Drought appears to have caused the most damage to crops (a computed total of about 1x10^22), though extreme cold and flood have also done significant damage (3x10^19 and 4x10^18, respectively). These computed totals are far larger than any plausible dollar amount, so they are best read as a ranking of event types rather than as literal costs.

In sum, the severe weather events most detrimental to human health and property are tornadoes, floods, and drought. Extreme temperatures (heat and cold) can also have very significant effects on human and crop health.