Synopsis

Based on National Weather Service data from 1950 to 2011, tornadoes were responsible for the largest cumulative number of fatalities and injuries of any event type: 37% of all fatalities and 65% of all injuries from weather events. In the same data, floods caused the greatest economic losses, accounting for 32% of all property and crop damage from weather events.

Data Processing

Health Data

To analyse the health impact, we read the data and aggregate fatalities and injuries by event type.

df <- read.csv("./repdata_data_StormData.csv.bz2",stringsAsFactors = FALSE)
ftfull <- aggregate(cbind(FATALITIES,INJURIES) ~ EVTYPE, data = df, sum)
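
As a minimal sketch of what this aggregation does, the example below applies the same `aggregate()` call to a small made-up data frame. The object names `toy` and `toy_sums` and all values are assumptions for illustration only and are not taken from the storm data.

```r
# Toy data standing in for the storm data set; all values are invented.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 0, 5, 2, 1),
  stringsAsFactors = FALSE
)

# cbind() on the left-hand side sums both columns within each EVTYPE group.
toy_sums <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, sum)
print(toy_sums)
```

Note that the formula interface of `aggregate()` drops rows with missing values in any listed column by default (`na.action = na.omit`), which is worth keeping in mind if the input may contain `NA`s.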

Only seven event types resulted in more than 400 cumulative fatalities.

nrow(ftfull[ftfull$FATALITIES>400,])
## [1] 7

We assign the remaining event types to an “ALL OTHER” category and aggregate fatalities and injuries again.

ftfull$type <- ifelse(ftfull$FATALITIES>400,ftfull$EVTYPE,"ALL OTHER")
ft <- aggregate(cbind(FATALITIES,INJURIES) ~ type, data = ftfull, sum)
ft$type <- as.factor(ft$type)
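
The lumping step can be illustrated on a small example. This is a sketch with invented totals; `totals`, `threshold`, and `lumped` are hypothetical names, and only the 400 cutoff comes from the analysis above.

```r
# Invented per-event totals, used only to illustrate the lumping step.
totals <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "FOG", "HAIL"),
  FATALITIES = c(500, 450, 20, 5),
  INJURIES   = c(900, 100, 30, 10),
  stringsAsFactors = FALSE
)

threshold <- 400  # same cutoff as in the analysis

# Event types at or below the cutoff are relabelled "ALL OTHER".
totals$type <- ifelse(totals$FATALITIES > threshold, totals$EVTYPE, "ALL OTHER")

# Re-aggregating collapses all "ALL OTHER" rows into a single row.
lumped <- aggregate(cbind(FATALITIES, INJURIES) ~ type, data = totals, sum)
print(lumped)

# Lumping should not change the grand totals.
stopifnot(sum(lumped$FATALITIES) == sum(totals$FATALITIES))
```

This relies on `EVTYPE` being a character column. If it were a factor, `ifelse()` would return the underlying integer codes rather than the labels, which is one reason to read the data with `stringsAsFactors = FALSE`.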

Next, for each event type, we compute the cumulative fatalities and injuries as a percentage of total fatalities and injuries from all weather events.

ft$percft <- round(ft$FATALITIES/sum(ft$FATALITIES)*100,digits = 2)
ft$perinj <- round(ft$INJURIES/sum(ft$INJURIES)*100,digits = 2)
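
A quick way to sanity-check this kind of percentage calculation is to confirm that the rounded shares still add up to roughly 100. The sketch below uses invented counts and the base R function `prop.table()`, which is equivalent to dividing by the sum; `fatalities` and `share` are hypothetical names.

```r
# Invented counts per category, for illustration only.
fatalities <- c(TORNADO = 500, HEAT = 450, `ALL OTHER` = 50)

# prop.table(x) is x / sum(x); multiply by 100 to get percentages.
share <- round(prop.table(fatalities) * 100, digits = 2)
print(share)

# Rounding can shift the total slightly, so allow a small tolerance.
stopifnot(abs(sum(share) - 100) < 0.05)
```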

Lastly, we reshape the data set into long format, with one row per event type and loss category, to prepare for graphing.

ft1 <- ft[,c(1,4)]
names(ft1)[2] <- "Percent"
ft1$loss <- "Fatalities"
ft2 <- ft[,c(1,5)]
names(ft2)[2] <- "Percent"
ft2$loss <- "Injuries"
ftgraph <- rbind(ft1,ft2)
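
The same reshaping can be written by selecting columns by name rather than by position, which keeps working if the column order changes. The following is a sketch on invented values; `wide` and `long` are hypothetical names, while `percft`, `perinj`, `Percent`, and `loss` mirror the columns used above.

```r
# Invented wide table with one percentage column per loss category.
wide <- data.frame(
  type   = c("TORNADO", "HEAT"),
  percft = c(60, 40),
  perinj = c(80, 20),
  stringsAsFactors = FALSE
)

# Stack the two percentage columns, tagging each row with its loss category.
long <- rbind(
  data.frame(type = wide$type, Percent = wide$percft, loss = "Fatalities",
             stringsAsFactors = FALSE),
  data.frame(type = wide$type, Percent = wide$perinj, loss = "Injuries",
             stringsAsFactors = FALSE)
)
print(long)
```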

Economic Data

To assess the economic impact, we convert property and crop losses to a common unit and add them up.

df$prop <- ifelse(df$PROPDMGEXP %in% c("B","b"), df$PROPDMG*1000,
                   ifelse(df$PROPDMGEXP %in% c("M","m"), df$PROPDMG, df$PROPDMG/1000))
df$crop <- ifelse(df$CROPDMGEXP %in% c("B","b"), df$CROPDMG*1000,
                   ifelse(df$CROPDMGEXP %in% c("M","m"), df$CROPDMG, df$CROPDMG/1000))
df$damage <- df$prop+df$crop

The losses above are expressed in millions of dollars. In a small number of records the magnitude variable (PROPDMGEXP or CROPDMGEXP) is unrecognizable. The check below computes that share for PROPDMGEXP.

nrow(df[!df$PROPDMGEXP %in% c("K","k","M","m","B","b",""),])/nrow(df)
## [1] 0.0003557587
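
The same kind of proportion can be computed more compactly by taking the mean of a logical vector, and the check can be repeated for CROPDMGEXP in the same way. The sketch below uses an invented vector of codes; `known`, `codes`, and `unknown_share` are hypothetical names.

```r
# Codes treated as recognized in the check above (blank included).
known <- c("K", "k", "M", "m", "B", "b", "")

# Invented magnitude codes, for illustration only.
codes <- c("K", "M", "", "?", "B", "k", "5", "m")

# The mean of a logical vector is the proportion of TRUE values.
unknown_share <- mean(!codes %in% known)
print(unknown_share)
```

Here two of the eight invented codes ("?" and "5") fall outside the recognized set, so the share is 0.25.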

If the magnitude variable is unrecognizable or blank, we assume the losses are in thousands.
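
The conversion rule can also be wrapped in a small helper so that the same logic is applied to both property and crop damage. This is only a sketch: `to_millions` is a hypothetical function that is not part of the original analysis, and the inputs are invented.

```r
# Hypothetical helper: convert a damage amount and its magnitude code
# to millions of dollars, using the same rule as the analysis above.
to_millions <- function(amount, exp_code) {
  code <- toupper(exp_code)                 # treat "b"/"B" and "m"/"M" alike
  multiplier <- ifelse(code == "B", 1000,   # billions -> millions
                ifelse(code == "M", 1,      # already in millions
                       1 / 1000))           # anything else assumed thousands
  amount * multiplier
}

# Invented amounts and codes, including one unrecognized code ("?").
converted <- to_millions(c(2.5, 10, 500, 25), c("B", "m", "K", "?"))
print(converted)
```

With such a helper, the two conversions could be written as `to_millions(df$PROPDMG, df$PROPDMGEXP)` and `to_millions(df$CROPDMG, df$CROPDMGEXP)`, which avoids duplicating the nested `ifelse()`.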

Next we aggregate damage by event type.

econ <- aggregate(damage ~ EVTYPE, data = df, sum)

Only nine event types resulted in cumulative damage of more than 10 billion dollars (a value above 10000 in the units of millions used here).

nrow(econ[econ$damage>10000,])
## [1] 9

We assign the remaining event types to an “ALL OTHER” category and aggregate damage again.

econ$type <- ifelse(econ$damage>10000,econ$EVTYPE,"ALL OTHER")
ec <- aggregate(damage ~ type, data = econ, sum)

Next, for each event type, we compute the cumulative damage as a percentage of total damage from all weather events.

ec$percdmg <- round(ec$damage/sum(ec$damage)*100,digits = 2)

Results

The graph below shows that tornadoes are the most harmful event type for population health. They account for roughly 3 times as many cumulative fatalities as the next most harmful event type (excessive heat) and roughly 13 times as many cumulative injuries.

library(ggplot2)
ggplot(ftgraph,aes(x=type,y=Percent))+
      geom_bar(aes(fill=loss),position=position_dodge(.9),stat = "identity")+
      geom_text(aes(label=Percent,group=loss), position=position_dodge(.9), vjust=-0.2)+
      xlab("Event Type")+
      ylab("Percent of Total")+
      ggtitle("Fatalities and Injuries by Event Type")+
      theme(text = element_text(size=12), axis.text.x = element_text(angle=90, hjust=1))

The next graph shows that floods are the most harmful event type for the economy. Their cumulative property and crop damage is more than 2 times that of the next most harmful event type (hurricane/typhoon).

ggplot(ec,aes(x=type,y=percdmg))+
      geom_bar(fill="blue",position = position_dodge(.9),stat = "identity")+
      geom_text(aes(label=percdmg),position = position_dodge(.9),vjust=-0.2)+
      xlab("Event Type")+
      ylab("Percent of Total")+
      ggtitle("Economic Losses by Event Type")+
      theme(text = element_text(size=12), axis.text.x = element_text(angle=90, hjust=1))