###Health and Economic Impact of Weather Events in the US

##Synopsis

This analysis explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which records major weather events together with their consequences for population health and the economy.

##Data Processing

The first step is to read the data into a data frame.

storm <- read.csv("stormDT.csv")

Next, we normalize the event type labels so that different spellings of the same event are counted together:

#Unique weather events
length(unique(storm$EVTYPE))
## [1] 985
#To lower case
eventtypes <- tolower(storm$EVTYPE)

#Replace blanks and punctuation with spaces
eventtypes <- gsub("[[:blank:][:punct:]+]", " ", eventtypes)
length(unique(eventtypes))
## [1] 874
#Update
storm$EVTYPE <- eventtypes
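
To illustrate what this normalization does, here is a minimal sketch on a hypothetical vector of raw labels (the values in `raw` are made up for illustration and are not taken from the database). Because the `+` sits inside the bracket expression, each blank or punctuation character is replaced by its own space, so runs of separators are not collapsed into one.

```r
# Hypothetical raw labels, for illustration only
raw <- c("TSTM WIND", "Tstm Wind", "tstm-wind", "TSTM WIND/HAIL", "tstm wind hail")

# Same two steps as above: lower-case, then blanks/punctuation -> space
normalized <- gsub("[[:blank:][:punct:]+]", " ", tolower(raw))

length(unique(raw))         # number of distinct spellings before
length(unique(normalized))  # number of distinct labels after
unique(normalized)
```

In this toy case the five spellings reduce to two labels; labels that differ in wording (for example an abbreviation versus the full name) would still be counted separately.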

#Event impact on population health

To find the event types that are most harmful to population health, the numbers of fatalities and injuries are summed per event type.

library(plyr)
casualties <- ddply(storm, .(EVTYPE), summarize,
                    fatalities = sum(FATALITIES),
                    injuries = sum(INJURIES))

# We then select the event types with the most fatalities and the most injuries
deadlyevents <- head(casualties[order(casualties$fatalities, decreasing = TRUE), ], 3)
injuryevents <- head(casualties[order(casualties$injuries, decreasing = TRUE), ], 3)
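
The same group-and-rank step can be sketched in base R without `plyr`. The data frame `toy` and its values are hypothetical and serve only to make the example self-contained; note that the formula interface of `aggregate` drops rows with missing values by default, which may differ from the `ddply` call above.

```r
# Toy data, for illustration only (not from the storm database)
toy <- data.frame(
  EVTYPE     = c("tornado", "flood", "tornado", "heat", "flood"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 0, 5, 2, 1)
)

# Sum both casualty columns per event type
toy_casualties <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                            data = toy, FUN = sum)

# Rank by fatalities, highest first, and keep the top 2
toy_top <- head(toy_casualties[order(toy_casualties$FATALITIES,
                                     decreasing = TRUE), ], 2)
toy_top
```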

#The top 3 deadliest event types

deadlyevents[, c("EVTYPE", "fatalities")]
##             EVTYPE fatalities
## 741        tornado       5633
## 116 excessive heat       1903
## 138    flash flood        978

#The top 3 event types by number of injuries

injuryevents[, c("EVTYPE", "injuries")]
##        EVTYPE injuries
## 741   tornado    91346
## 762 tstm wind     6957
## 154     flood     6789

#Event impact on economy

To analyze the impact of weather events on the economy, we use the available property damage and crop damage estimates.

Property damage in US dollars is given by the value PROPDMG scaled by a power of ten encoded in PROPDMGEXP. Crop damage is given in the same way by CROPDMG and CROPDMGEXP.
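
As a worked illustration of this encoding, the sketch below converts a few hypothetical value/code pairs into dollar amounts with a named lookup vector. The names `dmg`, `dmg_exp` and `power_of_ten` are introduced here for illustration, and the sketch is a simplification: it handles only the letter codes and treats every other code as 10^0, whereas the function used in this analysis also accepts digit codes.

```r
# Hypothetical records, for illustration only
dmg     <- c(25, 2.5, 1.5, 10)
dmg_exp <- c("K", "M", "B", "")

# Letter code -> power of ten
power_of_ten <- c(h = 2, k = 3, m = 6, b = 9)

# Look up each code (case-insensitive); codes not in the table give NA
exponent <- unname(power_of_ten[tolower(dmg_exp)])

# Assumption: empty or unknown codes mean "no scaling" (10^0)
exponent[is.na(exponent)] <- 0

damage_usd <- dmg * 10^exponent
damage_usd
```

So a record with value 25 and code `K` stands for 25,000 dollars, and 2.5 with code `M` stands for 2.5 million dollars.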

We then calculate the estimated damage per event type:

exp_transform <- function(e) {
    # h -> hundred, k -> thousand, m -> million, b -> billion
    if (e %in% c('h', 'H'))
        return(2)
    else if (e %in% c('k', 'K'))
        return(3)
    else if (e %in% c('m', 'M'))
        return(6)
    else if (e %in% c('b', 'B'))
        return(9)
    else if (!is.na(as.numeric(e))) # digit code; as.numeric() warns on '-', '?', '+', which fall through to the next branch
        return(as.numeric(e))
    else if (e %in% c('', '-', '?', '+'))
        return(0)
    else {
        stop("Invalid exponent value.")
    }
}
prop_dmg_exp <- sapply(storm$PROPDMGEXP, FUN=exp_transform)
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## (the same warning is emitted 14 times)
storm$prop_dmg <- storm$PROPDMG * (10 ^ prop_dmg_exp)
crop_dmg_exp <- sapply(storm$CROPDMGEXP, FUN=exp_transform)
## Warning in FUN(X[[i]], ...): NAs introduced by coercion
## (the same warning is emitted 7 times)
storm$crop_dmg <- storm$CROPDMG * (10 ^ crop_dmg_exp)
library(plyr)
econ_loss <- ddply(storm, .(EVTYPE), summarize,
                   prop_dmg = sum(prop_dmg),
                   crop_dmg = sum(crop_dmg))

econ_loss <- econ_loss[(econ_loss$prop_dmg > 0 | econ_loss$crop_dmg > 0), ]
prop_dmg_events <- head(econ_loss[order(econ_loss$prop_dmg, decreasing = TRUE), ], 3)
crop_dmg_events <- head(econ_loss[order(econ_loss$crop_dmg, decreasing = TRUE), ], 3)

The top 3 event types by property damage (in dollars) are as follows:

prop_dmg_events[, c("EVTYPE", "prop_dmg")]
##                EVTYPE     prop_dmg
## 154             flood 144657709807
## 366 hurricane typhoon  69305840000
## 741           tornado  56947380677

Similarly, the top 3 event types by crop damage (in dollars) are:

crop_dmg_events[, c("EVTYPE", "crop_dmg")]
##          EVTYPE    crop_dmg
## 84      drought 13972566000
## 154       flood  5661968450
## 519 river flood  5029459000
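
Totals of this size are easier to compare when rescaled. As a small sketch, the block below copies four of the totals reported above into a hypothetical data frame `top_damage` and expresses them in billions of dollars, rounded to one decimal place.

```r
# Totals copied from the property and crop damage tables above
top_damage <- data.frame(
  EVTYPE = c("flood", "hurricane typhoon", "tornado", "drought"),
  damage = c("property", "property", "property", "crop"),
  usd    = c(144657709807, 69305840000, 56947380677, 13972566000)
)

# Rescale to billions of dollars for readability
top_damage$billion_usd <- round(top_damage$usd / 1e9, 1)
top_damage[, c("EVTYPE", "damage", "billion_usd")]
```

On this scale, flood property damage is roughly ten times the crop damage attributed to drought.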

##Results

#Event impact on population health

library(ggplot2)
library(gridExtra)
## Warning: package 'gridExtra' was built under R version 4.0.3
# Set the levels in order
p1 <- ggplot(data=deadlyevents,
             aes(x=reorder(EVTYPE, fatalities), y=fatalities, fill=fatalities)) +
    geom_bar(stat="identity") +
    coord_flip() +
    ylab("Total number of fatalities") +
    xlab("Event type") +
    theme(legend.position="none")
plot(p1)

p2 <- ggplot(data=injuryevents,
             aes(x=reorder(EVTYPE, injuries), y=injuries, fill=injuries)) +
    geom_bar(stat="identity") +
    coord_flip() + 
    ylab("Total number of injuries") +
    xlab("Event type") +
    theme(legend.position="none")
plot(p2)

#Event impact on economy

library(ggplot2)
library(gridExtra)
# Set the levels in order
p1 <- ggplot(data=prop_dmg_events,
             aes(x=reorder(EVTYPE, prop_dmg), y=log10(prop_dmg), fill=prop_dmg )) +
    geom_bar(stat="identity") +
    coord_flip() +
    xlab("Event type") +
    ylab("Property damage in dollars (log-scale)") +
    theme(legend.position="none")
plot(p1)

p2 <- ggplot(data=crop_dmg_events,
             aes(x=reorder(EVTYPE, crop_dmg), y=crop_dmg, fill=crop_dmg)) +
    geom_bar(stat="identity") +
    coord_flip() + 
    xlab("Event type") +
    ylab("Crop damage in dollars") + 
    theme(legend.position="none")
plot(p2)

##Conclusion

Tornadoes have the greatest impact on population health, causing both the most fatalities and the most injuries. Economically, floods cause the most property damage and droughts cause the most crop damage.