Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database that tracks characteristics of major storms and weather events in the United States between 1950 and November 2011, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The goal of this analysis is to answer the following questions about the effects of severe weather events:
1. Across the United States, which types of events are most harmful with respect to population health?
2. Across the United States, which types of events have the greatest economic consequences?

Data Processing

The data for this assignment can be downloaded from the course web site: Storm Data

Database documentation is also available: National Weather Service Storm Data Documentation
National Climatic Data Center Storm Events FAQ

The following packages were used for this analysis:

library(dplyr)
library(ggplot2)
library(gridExtra)
library(grid)

Download data set into current working directory and read into R

fileUrl<-"https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(fileUrl,"./stormData.csv.bz2")
stormData <- read.csv(bzfile("stormData.csv.bz2"))
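Re-running the analysis repeats the download each time, which can be slow for a compressed archive of this kind. A minimal sketch of a cached download is shown here; the helper name `fetchIfMissing` is hypothetical and is not part of the original analysis.

```r
# Hypothetical helper (not in the original analysis): download a file only
# when no local copy exists, then return the local path.
fetchIfMissing <- function(url, dest) {
    if (!file.exists(dest)) {
        # mode = "wb" keeps the binary .bz2 archive intact on Windows
        download.file(url, dest, mode = "wb")
    }
    dest
}

# Usage with the URL from this report (commented out to avoid a network call):
# path <- fetchIfMissing(fileUrl, "./stormData.csv.bz2")
# stormData <- read.csv(bzfile(path))
```

The function returns the destination path either way, so the same call works on the first run and on later runs.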

Subset data for columns pertaining to health and economic consequences of severe weather events

stormDatasub <- stormData[,c("EVTYPE","FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]

Next the effects on population health and economic consequences are investigated.

Population Health

Fatalities summarized by event type in descending order

fatalityData <-stormDatasub %>% 
    group_by(EVTYPE) %>% 
    summarize(Fatalities = sum(FATALITIES, na.rm = T)) %>% 
    arrange(desc(Fatalities))

Injuries summarized by event type in descending order

injuryData <-stormDatasub %>% 
    group_by(EVTYPE) %>% 
    summarize(Injuries = sum(INJURIES, na.rm = T)) %>% 
    arrange(desc(Injuries))
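The two summaries above follow the same pattern, so they could also be computed in one pass. The sketch below illustrates this on a small made-up data frame (`toy` and `healthToy` are hypothetical names, and the values are invented for illustration only).

```r
library(dplyr)

# Toy data standing in for stormDatasub (values are made up for illustration)
toy <- data.frame(
    EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
    FATALITIES = c(2, 1, 3, NA),
    INJURIES   = c(10, 0, 5, 4)
)

# Summarize both outcomes per event type in one pass;
# na.rm = TRUE treats the missing fatality count as zero in the sum
healthToy <- toy %>%
    group_by(EVTYPE) %>%
    summarize(Fatalities = sum(FATALITIES, na.rm = TRUE),
              Injuries   = sum(INJURIES, na.rm = TRUE)) %>%
    arrange(desc(Fatalities))

print(healthToy)
```

Keeping both totals in one table makes it easy to sort by either column without repeating the grouping step.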

Economic Consequences

To calculate financial damage, a function is needed to convert the exponent codes (letters, digits, or symbols) stored in a separate column into usable powers of ten

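getExp <- function(e) {
    # Work on the character value: read.csv may return a factor, and
    # as.numeric() on a factor returns the level code, not the digit shown.
    e <- as.character(e)
    if (e %in% c("h", "H"))
        return(2)    # hundreds
    else if (e %in% c("k", "K"))
        return(3)    # thousands
    else if (e %in% c("m", "M"))
        return(6)    # millions
    else if (e %in% c("b", "B"))
        return(9)    # billions
    else if (e %in% c("", "-", "?", "+"))
        return(0)    # blank or symbol: no multiplier
    else if (grepl("^[0-9]+$", e))
        return(as.numeric(e))    # a digit is used directly as the exponent
    else {
        stop("Invalid value.")
    }
}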

The function is then called to calculate property and crop damage

propExp <- sapply(stormDatasub$PROPDMGEXP, FUN = getExp)
stormDatasub$propDamage <- stormDatasub$PROPDMG * (10^propExp)
cropExp <- sapply(stormDatasub$CROPDMGEXP, FUN = getExp)
stormDatasub$cropDamage <- stormDatasub$CROPDMG * (10^cropExp)
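To make the arithmetic concrete: a recorded damage of 25 with exponent code "K" corresponds to 25 * 10^3 = 25,000 dollars. The sketch below shows one possible vectorized way to do the same conversion with a lookup table instead of calling a function once per row. The names `expLookup` and `toExponent` are hypothetical, the sample values are made up, and unlike `getExp` this version returns NA for unrecognized codes rather than stopping.

```r
# Hypothetical lookup table: exponent code -> power of ten
expLookup <- c(h = 2, H = 2, k = 3, K = 3, m = 6, M = 6, b = 9, B = 9)

# Hypothetical vectorized counterpart to getExp()
toExponent <- function(codes) {
    codes <- as.character(codes)          # guard against factor input
    out <- unname(expLookup[codes])       # NA for codes not in the table
    digits <- grepl("^[0-9]+$", codes)    # digit codes are used as-is
    out[digits] <- as.numeric(codes[digits])
    out[codes %in% c("", "-", "?", "+")] <- 0
    out
}

# Made-up records: 25 with "K", 2.5 with "M", 5 with "", 1 with "2"
dmg  <- c(25, 2.5, 5, 1)
code <- c("K", "M", "", "2")
dmg * 10^toExponent(code)
```

Because the lookup operates on whole vectors, it avoids the per-element `sapply` call, which may matter on a data set with many rows.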

Financial damage to crops and property is then summarized by event type

econDamage<-stormDatasub %>% 
            group_by(EVTYPE) %>% 
            summarize(propDamage =sum(propDamage), cropDamage = sum(cropDamage)) 

and events not causing any financial damage are omitted

econDamage <- econDamage[(econDamage$propDamage > 0) | (econDamage$cropDamage > 0), ]

Data is then sorted in decreasing order

propDmgSorted <- econDamage[order(econDamage$propDamage, decreasing = T), ]
cropDmgSorted <- econDamage[order(econDamage$cropDamage, decreasing = T), ]

Results

Effects on population health

The top 5 weather events by injuries and fatalities are as follows:

head(injuryData,5)
## # A tibble: 5 × 2
##           EVTYPE Injuries
##           <fctr>    <dbl>
## 1        TORNADO    91346
## 2      TSTM WIND     6957
## 3          FLOOD     6789
## 4 EXCESSIVE HEAT     6525
## 5      LIGHTNING     5230
head(fatalityData,5)
## # A tibble: 5 × 2
##           EVTYPE Fatalities
##           <fctr>      <dbl>
## 1        TORNADO       5633
## 2 EXCESSIVE HEAT       1903
## 3    FLASH FLOOD        978
## 4           HEAT        937
## 5      LIGHTNING        816

Plot of Top 10 Events

p1<-ggplot(head(injuryData,10), aes(x = reorder(EVTYPE,Injuries), y = Injuries)) +
               geom_bar(fill = "darkolivegreen",stat = "Identity")+
               coord_flip()+
               xlab("Event Type")+
               ylab("Total Number of Injuries")+
               ggtitle("Health Impact of Top 10 Weather Events in the US")

p2<-ggplot(head(fatalityData,10), aes(x = reorder(EVTYPE, Fatalities), y = Fatalities)) +
    geom_bar(fill = "goldenrod", stat = "Identity")+
    coord_flip()+
    xlab("Event Type")+
    ylab("Total Number of Fatalities")
    
grid.arrange(p1, p2,nrow = 2)    

Tornadoes are by far the most harmful events for population health, causing the most injuries (91,346) and fatalities (5,633), as the tables and plots above show.

Economic Consequences

The top 5 weather events causing financial damage to property and crops are as follows:

head(propDmgSorted[ ,c("EVTYPE","propDamage")],5)
## # A tibble: 5 × 2
##               EVTYPE   propDamage
##               <fctr>        <dbl>
## 1        FLASH FLOOD 6.820237e+13
## 2 THUNDERSTORM WINDS 2.086532e+13
## 3            TORNADO 1.078951e+12
## 4               HAIL 3.157558e+11
## 5          LIGHTNING 1.729433e+11
head(cropDmgSorted[ ,c("EVTYPE","cropDamage")],5)
## # A tibble: 5 × 2
##        EVTYPE  cropDamage
##        <fctr>       <dbl>
## 1     DROUGHT 13972566000
## 2       FLOOD  5661968450
## 3 RIVER FLOOD  5029459000
## 4   ICE STORM  5022113500
## 5        HAIL  3025974480

In the tables above, flash floods, thunderstorm winds, and tornadoes account for the most property damage, while drought and floods account for the most crop damage.

To confirm the findings above, plots of the Top 10 events for property and crop damage are shown below:

p1 <- ggplot(data=head(propDmgSorted,10), aes(x=reorder(EVTYPE, propDamage), y=log10(propDamage), fill=propDamage )) +
    geom_bar(fill="darkblue", stat="identity") + coord_flip() +
    xlab("Event type") + ylab("Property damage in dollars (log10)") +
    ggtitle("Economic impact of weather events in the US - Top 10") +
    theme(plot.title = element_text(hjust = 0))

p2 <- ggplot(data=head(cropDmgSorted,10), aes(x=reorder(EVTYPE, cropDamage), y=cropDamage, fill=cropDamage)) +
    geom_bar(fill="goldenrod", stat="identity") + coord_flip() + 
    xlab("Event type") + ylab("Crop damage in dollars") + 
    theme(legend.position="none")

grid.arrange(p1, p2, ncol=1, nrow =2)