Impact of Severe Weather Events on Population Health and Economy in the U.S.

Synopsis

This analysis uses the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, covering 1950 to 2011, to analyze the impact of severe weather events on population health and the economy. Fatalities and injuries are used to assess the impact on population health; property damage and crop damage are used to measure the impact on the economy. The analysis indicates that tornadoes and excessive heat are most harmful with respect to population health, while floods, hurricanes/typhoons and tornadoes have the greatest economic consequences.

Loading Packages Used In This Analysis

library(ggplot2)
library(gridExtra)
library(plyr)

Data Processing

Downloading and reading the data

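# Download the compressed storm data once; later runs reuse the local copy.
if (!file.exists("repdata-data-StormData.csv.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                destfile = "repdata-data-StormData.csv.bz2", mode = "wb")
}
# read.csv reads through the bzip2 connection, so no manual decompression is needed.
data <- read.csv(bzfile("repdata-data-StormData.csv.bz2"))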

Preparing Data for Assessing Impact on Population Health

Fatalities and injuries are each summed by event type, and the 10 event types with the highest totals are kept for each measure.

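# Sum the column named by fld for each event type and return the 10 largest
# totals. Note that this definition masks base::sort in the global environment.
sort <- function(fld, data) {
  rslt <- aggregate(data[[fld]], by = list(data$EVTYPE), FUN = sum)
  names(rslt) <- c("EVTYPE", fld)
  rslt <- head(arrange(rslt, rslt[, 2], decreasing = TRUE), n = 10)
  # Fix the factor levels in ranked order so the bars are plotted in that order
  rslt <- within(rslt, EVTYPE <- factor(x = EVTYPE, levels = rslt$EVTYPE))
  return(rslt)
}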
fatalities <- sort("FATALITIES", data = data)
injuries <- sort("INJURIES", data = data)
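
The aggregate-then-rank step above can be illustrated on a small made-up data frame. This is only a sketch using base R: the `toy` values are invented for illustration and are not taken from the NOAA data, and it keeps the top 2 rather than the top 10.

```r
# Toy data, invented for illustration only (not NOAA records)
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "HAIL"),
  FATALITIES = c(3, 1, 2, 4, 0, 0),
  stringsAsFactors = FALSE
)

# Sum fatalities per event type
totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# Rank from largest to smallest and keep the top 2
top <- head(totals[order(totals$FATALITIES, decreasing = TRUE), ], n = 2)

# Fix the factor levels in ranked order so a bar chart keeps this ordering
top$EVTYPE <- factor(top$EVTYPE, levels = top$EVTYPE)
print(top)
```

Setting the factor levels explicitly matters for the later plots: without it, the bars would be drawn in alphabetical order of event type rather than in order of impact.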

Preparing Data for Assessing Impact on Economy

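The property damage and crop damage figures are stored as a magnitude plus an exponent code (H, K, M or B for hundreds, thousands, millions or billions), so both are first converted to plain dollar amounts on the same numerical scale. The two types of damage are then added together. Finally, the 10 types of weather events that cause the most total damage are obtained.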

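# Convert the exponent column fld (e.g. "PROPDMGEXP") to a numeric power of ten
# and store the resulting dollar amount in a new column named newFld.
convert <- function(data, fld, newFld) {
  len <- dim(data)[2]
  idx <- which(colnames(data) == fld)
  data[, idx] <- as.character(data[, idx])
  logic <- !is.na(data[, idx])
  # Letter codes map to powers of ten: B = 10^9, M = 10^6, K = 10^3, H = 10^2
  data[logic & toupper(data[, idx]) == "B", idx] <- "9"
  data[logic & toupper(data[, idx]) == "M", idx] <- "6"
  data[logic & toupper(data[, idx]) == "K", idx] <- "3"
  data[logic & toupper(data[, idx]) == "H", idx] <- "2"
  data[logic & data[, idx] == "", idx] <- "0"
  # Digit codes are used as exponents directly; any other symbol becomes NA
  # (hence the coercion warning) and is then treated as an exponent of 0.
  data[, idx] <- as.numeric(data[, idx])
  data[is.na(data[, idx]), idx] <- 0
  # Assumes the magnitude column (e.g. PROPDMG) sits immediately before fld
  data <- cbind(data, data[, idx - 1] * 10 ^ data[, idx])
  names(data)[len + 1] <- newFld
  return(data)
}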
data <- convert(data, "PROPDMGEXP", "propertyDamage")
## Warning in convert(data, "PROPDMGEXP", "propertyDamage"): NAs introduced by
## coercion
data <- convert(data, "CROPDMGEXP", "cropDamage")
## Warning in convert(data, "CROPDMGEXP", "cropDamage"): NAs introduced by
## coercion
data$totalDamage <- data$propertyDamage + data$cropDamage
totalDamage <- sort("totalDamage", data = data)
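
To make the exponent conversion concrete, here is a minimal sketch on a made-up data frame. The helper `exp_to_power` is hypothetical and is not part of the original analysis; it is intended to mirror the same rules (H, K, M, B map to 2, 3, 6, 9; digits are used as-is; anything else counts as 0), and the `toy` values are invented.

```r
# Hypothetical helper (an assumption for illustration): maps an exponent code
# to a power of ten using the same rules as the conversion step above.
exp_to_power <- function(code) {
  code <- toupper(as.character(code))
  lookup <- c(H = 2, K = 3, M = 6, B = 9)
  power <- unname(lookup[code])                 # NA for codes not in lookup
  digit <- suppressWarnings(as.numeric(code))   # digit codes such as "5"
  power[is.na(power)] <- digit[is.na(power)]
  power[is.na(power)] <- 0                      # blank or unknown symbols
  power
}

# Toy data, invented for illustration only
toy <- data.frame(
  PROPDMG = c(25, 2.5, 1.2, 5, 10),
  PROPDMGEXP = c("K", "M", "b", "", "?"),
  stringsAsFactors = FALSE
)
toy$propertyDamage <- toy$PROPDMG * 10 ^ exp_to_power(toy$PROPDMGEXP)
print(toy)
```

Here 25 with code "K" becomes 25,000 and 1.2 with code "b" becomes 1.2 billion, while rows with a blank or unrecognised code keep their raw magnitude because the exponent falls back to 0.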

Results

The plot for the top 10 events causing the most fatalities and injuries is shown below.

plot1a <- qplot(EVTYPE, data = fatalities, weight = FATALITIES, geom = "bar", binwidth = 1) + 
  scale_y_continuous("No. of Fatalities") + 
  theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
  xlab("Severe Weather Type") + 
  ggtitle("Fatalities by Severe Weather\n Events in the U.S.\n from 1950 - 2011")
plot1b <- qplot(EVTYPE, data = injuries, weight = INJURIES, geom = "bar", binwidth = 1) + 
  scale_y_continuous("No. of Injuries") + 
  theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
  xlab("Severe Weather Type") + 
  ggtitle("Injuries by Severe Weather\n Events in the U.S.\n from 1950 - 2011")
grid.arrange(plot1a, plot1b, ncol = 2)

Based on the above plot, tornadoes and excessive heat caused the most fatalities, and tornadoes caused the most injuries in the United States from 1950 to 2011.

The plot for the top 10 weather events causing the most combined property and crop damage is shown below.

qplot(EVTYPE, data = totalDamage, weight = totalDamage, geom = "bar", binwidth = 1) + 
  theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
  scale_y_continuous("Total Damage in US Dollars") + 
  xlab("Severe Weather Type") + ggtitle("Total Damage by\n Severe Weather Events in\n the U.S. from 1950 - 2011")

Based on the above plot, floods, hurricanes/typhoons, and tornadoes caused the most economic damage in the United States from 1950 to 2011.
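
The bar charts in this section are built with `qplot`, which newer ggplot2 releases mark as deprecated. As a hedged alternative, the sketch below draws the same kind of chart with `ggplot()` and `geom_col()`. It assumes ggplot2 2.2.0 or later, and the `toy` damage values are invented placeholders, not results from the analysis.

```r
library(ggplot2)

# Placeholder values, invented for illustration only (not analysis results)
toy <- data.frame(
  EVTYPE = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO"),
  totalDamage = c(3, 2, 1),
  stringsAsFactors = FALSE
)
# Keep the ranked order on the x axis instead of alphabetical order
toy$EVTYPE <- factor(toy$EVTYPE, levels = toy$EVTYPE)

# geom_col() draws bars whose heights are the values in the y aesthetic
p <- ggplot(toy, aes(x = EVTYPE, y = totalDamage)) +
  geom_col() +
  theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
  labs(x = "Severe Weather Type",
       y = "Total Damage in US Dollars",
       title = "Total Damage by Severe Weather Events (toy data)")
print(p)
```

Because `geom_col()` takes the bar height directly from the `y` aesthetic, the `weight` and `binwidth` arguments used with `qplot` are not needed.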

Conclusion

From the data, we can conclude that tornadoes and excessive heat are most harmful with respect to population health, while floods, hurricanes/typhoons and tornadoes have the greatest economic consequences.