Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. This analysis explores which severe event types contribute the most to negative population health effects and economic consequences. In this data, tornadoes are the most serious weather threat to both population health and the economy.

Data Processing

Download the NOAA storm data

download.file(url = paste("https://d396qusza40orc.cloudfront.net/", 
                          "repdata/data/StormData.csv.bz2", 
                          sep = ""), 
              destfile = "./StormData.csv.bz2", 
              method = "curl")
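
The compressed file is large, so re-running the report need not download it every time. Below is a minimal sketch of a guard around the download step. The helper name `fetch_if_missing` and its `downloader` argument are hypothetical and not part of the original analysis. `mode = "wb"` is passed so the binary file is written unchanged on Windows.

```r
# Hypothetical helper: download only when the local copy is absent.
# `downloader` defaults to download.file and can be swapped out for testing.
fetch_if_missing <- function(url, destfile, downloader = download.file) {
    if (file.exists(destfile)) {
        return(invisible(FALSE))   # nothing downloaded
    }
    downloader(url = url, destfile = destfile, mode = "wb")
    invisible(TRUE)                # a download was attempted
}

# Usage with the URL from this report (not run here):
# fetch_if_missing(
#     "https://d396qusza40orc.cloudfront.net/repdata/data/StormData.csv.bz2",
#     "./StormData.csv.bz2")
```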

Import the data into R

StormData <- read.csv(bzfile("./StormData.csv.bz2"))

Clean up the data for analysis

## use the dplyr package
suppressMessages(library(dplyr))

## create a usable date field and clean up data types / classes
stormdf <- StormData %>% 
    mutate(RefNum = REFNUM, 
           State = as.character(STATE), 
           County = as.character(COUNTYNAME), 
           BeginDate = substr(BGN_DATE, 
                              1, 
                              as.integer(regexpr(" ", BGN_DATE)) - 1), 
           BeginDate = as.Date(strptime(BeginDate, "%m/%d/%Y")), 
           TimeZone = as.character(TIME_ZONE), 
           EventType = as.character(EVTYPE), 
           Fatalities = as.integer(FATALITIES), 
           Injuries = as.integer(INJURIES), 
           PropertyDamage = PROPDMG, 
           CropDamage = CROPDMG, 
           TotalDamage = PropertyDamage + CropDamage) %>% 
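    select(RefNum, State, County, BeginDate, TimeZone, EventType, 
           Fatalities, Injuries, PropertyDamage, CropDamage, TotalDamage) %>% 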
    arrange(RefNum)
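
The date handling in the step above can be tried in isolation. The sketch below assumes BGN_DATE values look like "4/18/1950 0:00:00" (month/day/year followed by a time). The sample strings are illustrative rather than read from the file.

```r
# Sample values in the layout assumed here: "m/d/Y H:M:S".
raw <- c("4/18/1950 0:00:00", "11/15/1951 0:00:00")

# Keep everything before the first space, as the mutate() call does ...
date_part <- substr(raw, 1, as.integer(regexpr(" ", raw)) - 1)

# ... then parse the remaining month/day/year text into Date values.
parsed <- as.Date(strptime(date_part, "%m/%d/%Y"))
print(parsed)
```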


Results

Population health effects - which event types are the most harmful?

Tornadoes are the leading cause of fatalities and injuries, with excessive heat also contributing moderately to fatalities.

## load necessary packages
suppressMessages(library(tidyr))
suppressMessages(library(ggplot2))
suppressMessages(library(scales))

## First, calculate total fatalities, injuries and damages by event type
events <- stormdf %>% 
    group_by(EventType) %>% 
    summarise(Fatalities = sum(Fatalities), 
              Injuries = sum(Injuries), 
              Damages = sum(TotalDamage))
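
As a quick cross-check of the aggregation logic, the same kind of per-event totals can be computed in base R on a small data frame. The rows below are made up for illustration and are not taken from the NOAA file. `aggregate` stands in for the dplyr `group_by` / `summarise` pair used above.

```r
# Made-up rows (not from the NOAA file) to illustrate per-event totals.
toy <- data.frame(
    EventType  = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
    Fatalities = c(3L, 1L, 2L, 0L, 4L),
    Injuries   = c(10L, 0L, 5L, 1L, 2L)
)

# Base R counterpart of group_by(EventType) %>% summarise(sum(...)).
toy_totals <- aggregate(cbind(Fatalities, Injuries) ~ EventType,
                        data = toy, FUN = sum)
print(toy_totals)
```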

## Next, tidy up the data to include an outcome type and keep the event
## types ranked in the top 10 for fatalities or for injuries
tPopHealth <- events %>% 
    gather(OutcomeType, NumOutcomes, Fatalities:Injuries) %>% 
    inner_join(data.frame(EventType = events$EventType, 
                          rankF = dense_rank(desc(events$Fatalities)), 
                          rankI = dense_rank(desc(events$Injuries))) %>% 
                   filter(rankF <= 10 | rankI <= 10) %>% 
                   group_by(EventType) %>% 
                   summarise(n = n()) %>% 
                   select(EventType), 
               "EventType") %>% 
    arrange(OutcomeType, desc(NumOutcomes))
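
Because the filter keeps an event type when it ranks in the top 10 on either measure, the result can contain more than 10 event types. The sketch below illustrates this with a top-2 cutoff on made-up totals. `dense_rank_desc` is a hypothetical base R stand-in for `dense_rank(desc(x))`, written so that the example has no package dependencies.

```r
# Base R stand-in for dplyr's dense_rank(desc(x)): rank 1 is the largest
# value, ties share a rank, and no ranks are skipped.
dense_rank_desc <- function(x) {
    match(x, sort(unique(x), decreasing = TRUE))
}

# Made-up totals (not from the NOAA file).
toy <- data.frame(
    EventType  = c("A", "B", "C", "D"),
    Fatalities = c(50, 40, 5, 1),
    Injuries   = c(10, 300, 200, 2),
    stringsAsFactors = FALSE
)

rankF <- dense_rank_desc(toy$Fatalities)
rankI <- dense_rank_desc(toy$Injuries)

# Keep an event type if it is in the top 2 on either measure.
keep <- toy$EventType[rankF <= 2 | rankI <= 2]
print(keep)
```

Here three event types survive a top-2 cutoff, since "C" ranks low on fatalities but second on injuries.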

## Then, order the EventType factor levels by number of fatalities
lOrder <- tPopHealth[tPopHealth$OutcomeType == "Fatalities", ]
lOrder <- lOrder[order(lOrder$NumOutcomes), ]
lOrder <- lOrder$EventType
tPopHealth$EventType <- factor(tPopHealth$EventType, 
                               levels = lOrder)
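
The reordering matters because ggplot2 places the levels of a factor along a discrete axis in level order, with the first level at the bottom of a y axis. A small sketch on made-up totals shows how sorting ascending puts the largest value at the top of the plot.

```r
# Made-up totals (not from the NOAA file).
toy <- data.frame(
    EventType  = c("FLOOD", "TORNADO", "HEAT"),
    Fatalities = c(20, 90, 50),
    stringsAsFactors = FALSE
)

# Levels sorted by ascending fatalities; the last level (largest total)
# would be drawn at the top of a discrete y axis.
level_order <- toy$EventType[order(toy$Fatalities)]
toy$EventType <- factor(toy$EventType, levels = level_order)

print(levels(toy$EventType))
```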

## Now, create plots that show the most harmful storm event types
ggplot(data = tPopHealth, 
       aes(x = NumOutcomes, 
           y = EventType)) + 
    geom_segment(aes(yend = EventType), 
                 xend = 0, 
                 colour="grey50") + 
    geom_point(size = 3, 
               aes(colour = OutcomeType)) + 
    scale_colour_brewer(palette = "Set1", 
                        limits = c("Fatalities", "Injuries"), 
                        guide = "none") + 
    theme(panel.grid.major.y = element_blank()) +
    facet_grid(~ OutcomeType, 
               scales = "free") + 
    xlab("Total Number of Outcomes") + 
    ylab("Event Type") + 
    scale_x_continuous(labels = comma) + 
    ggtitle("Figure 1-1: Number of Fatalities/Injuries by Event Type")


Economic consequences - which event types cause the most damage?

Measured by the summed property and crop damage figures, tornadoes cause the most damage of any severe weather event type. Flash floods, thunderstorm wind and hail also cause significant damage.

## Pull the top 10 events with the most total damages
tEcon <- events %>% 
    arrange(desc(Damages)) %>% 
    filter(row_number() <= 10)

## Create plot that shows the most impactful storm event types
ggplot(data = tEcon, 
       aes(x = Damages, 
           y = reorder(EventType, Damages))) + 
    geom_segment(aes(yend = EventType), 
                 xend = 0, 
                 colour="grey50") + 
    geom_point(size = 3) + 
    theme(panel.grid.major.y = element_blank()) +
    xlab("Total Amount of Damages") + 
    ylab("Event Type") + 
    scale_x_continuous(labels = dollar) + 
    ggtitle("Figure 1-2: Total Damages by Event Type")
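
The damage totals above sum PROPDMG and CROPDMG as recorded. The NOAA file also carries magnitude codes in the PROPDMGEXP and CROPDMGEXP columns, which the code in this report does not use. The sketch below shows one way such codes might be applied, assuming that K, M and B stand for thousands, millions and billions. The helper `magnitude` is hypothetical, the sample rows are made up, and treating every other code as a multiplier of 1 is a simplification.

```r
# Assumed mapping of magnitude codes to multipliers; any other code
# (blank, digits, symbols) is treated as 1 here, which is a simplification.
magnitude <- function(code) {
    lookup <- c(K = 1e3, M = 1e6, B = 1e9)
    mult <- lookup[toupper(as.character(code))]
    mult[is.na(mult)] <- 1
    unname(mult)
}

# Made-up rows (not from the NOAA file).
toy <- data.frame(PROPDMG    = c(25, 2.5, 1, 10),
                  PROPDMGEXP = c("K", "M", "B", ""),
                  stringsAsFactors = FALSE)

# Scale each recorded amount by its magnitude code.
toy$PropertyDamageUSD <- toy$PROPDMG * magnitude(toy$PROPDMGEXP)
print(toy)
```

Totals computed on scaled amounts could rank event types differently from the unscaled sums, so the two should not be mixed in one comparison.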