Effects of Storms and Other Severe Weather Events on Public Health and the Economy

Synopsis

In this study, we use data from the U.S. National Oceanic and Atmospheric Administration (NOAA) storm database to answer the following two questions:
1. Across the United States, which types of events are most harmful with respect to population health?
2. Across the United States, which types of events have the greatest economic consequences?
Population health impact is measured as the sum of two variables, FATALITIES and INJURIES. Economic consequences are measured as the sum of property damage and crop damage. The results show that tornadoes are the most harmful to population health and that floods have the greatest economic consequences.

Data Processing

The storm data was downloaded to a local drive from the Reproducible Research course web site and unzipped as repdata-data-StormData.csv. We loaded the CSV file into R and kept only the variables used in this study.

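# load the storm data; stringsAsFactors = TRUE is needed in R >= 4.0 so that
# the levels() relabeling of PROPDMGEXP and CROPDMGEXP works on factors
storm <- read.csv("repdata-data-StormData.csv", header = TRUE, sep = ",",
    stringsAsFactors = TRUE)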
# only keep variables used for the analysis
storm <- storm[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", 
    "CROPDMG", "CROPDMGEXP")]

The two variables PROPDMGEXP and CROPDMGEXP give the units of PROPDMG and CROPDMG as symbols such as 'K' (thousands), 'M' (millions), and 'B' (billions). To allow numeric calculation, we first relabel them as numeric multipliers; any other symbol is treated as a multiplier of 1.

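# relabel the levels of PROPDMGEXP as multipliers
# (intended mapping: B -> 1e9, K -> 1e3, m/M -> 1e6, everything else -> 1;
# the assignment is positional, so it depends on the locale-dependent sort
# order of levels(storm$PROPDMGEXP))
levels(storm$PROPDMGEXP) <- c(rep("1", 13), "1000000000", "1", "1", "1000",
    "1000000", "1000000")
# relabel the levels of CROPDMGEXP the same way
# (intended mapping: B -> 1e9, k/K -> 1e3, m/M -> 1e6, everything else -> 1)
levels(storm$CROPDMGEXP) <- c(rep("1", 4), "1000000000", "1000", "1000", "1000000",
    "1000000")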

The economic consequence is evaluated by the sum of property damage and crop damage.

# evaluate economic damage
storm$PROPDMG <- storm$PROPDMG * as.numeric(as.character(storm$PROPDMGEXP))
storm$CROPDMG <- storm$CROPDMG * as.numeric(as.character(storm$CROPDMGEXP))
storm$ECO <- storm$PROPDMG + storm$CROPDMG
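
The unit handling above can also be sketched with a named lookup vector, which avoids depending on the order of the factor levels. This is a minimal illustration on made-up rows, not the code used for the results: the data frame `toy`, the vector `multipliers`, and the column `PROPDMG_USD` are hypothetical names, and the sketch assumes that only K, M, and B (in either case) carry a multiplier.

```r
# Hypothetical toy rows; the values are illustrative, not taken from the NOAA file.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10, 7),
  PROPDMGEXP = c("K", "M", "B", "", "k"),
  stringsAsFactors = FALSE
)

# Named lookup vector: unit symbol -> numeric multiplier.
multipliers <- c(K = 1e3, k = 1e3, M = 1e6, m = 1e6, B = 1e9)

# Look up each symbol by name; symbols not in the table come back as NA.
mult <- unname(multipliers[toy$PROPDMGEXP])

# Treat every unrecognized symbol (including the empty string) as a multiplier of 1.
mult[is.na(mult)] <- 1

# Damage in dollars: 25 with unit "K" becomes 25000, and so on.
toy$PROPDMG_USD <- toy$PROPDMG * mult
print(toy)
```

Because the lookup is by name rather than by position, the result does not change if the levels are sorted differently on another machine.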

The loss of population health is evaluated by the sum of fatalities and injuries.

# evaluate population health loss
storm$HEALTH <- storm$FATALITIES + storm$INJURIES

The EVTYPE field contains many spelling errors and inconsistent labels, so we cleaned it before grouping events.

a <- tolower(as.character(storm$EVTYPE))
# remove parentheses and any characters between them
a <- gsub("\\(.*\\)", "", a)
# remove everything from the first digit (and a preceding "g", as in "g45") onward
a <- gsub("g?[0-9].*", "", a)
# remove a trailing space followed by a single character
a <- gsub(" .$", "", a)
# remove leading and trailing whitespace
a <- gsub("^\\s+|\\s+$", "", a)
# remove a non-letter character at the end
a <- sub("[^a-z]$", "", a)
# expand abbreviations
a <- gsub("tstm", "thunderstorm", a)
a <- gsub("wnd", "wind", a)
# remove a trailing "s" (plural forms)
a <- sub("s$", "", a)
# regroup:
a[agrep("avalanche", a)] = "avalanche"  #7.2
a[agrep("blizzard", a)] = "blizzard"  #7.3
a <- sub("cst", "coastal", a)
a[agrep("coastal", a)] = "coastal flood"  #7.4
a <- sub("flooding", "flood", a)
a <- sub("floodin", "flood", a)
a[grepl("flash", a) & grepl("flood", a)] = "flash flood"  #7.14
a[!grepl("coastal flood", a) & !grepl("flash flood", a) & !grepl("lakeshore", 
    a) & grepl("flood", a)] = "flood"  #7.15, 7.27
a[agrep("extreme cold", a)] = "extreme cold/wind chill"  #7.13
a[agrep("extreme wind", a)] = "extreme cold/wind chill"  #7.13
a[!grepl("extreme cold/wind chill", a) & grepl("chill", a)] = "cold/wind chill"  #7.5
a[a == "cold/wind"] = "cold/wind chill"
a[agrep("tornado", a)] = "tornado"  #7.40
a[grep("surge", a)] = "storm surge/tide"  #7.37
a <- gsub("tides", "tide", a)
a[agrep("high tide", a)] = "astronomical high tide"  # 7.1
a[!grepl("freezing fog", a) & grepl("fog", a)] = "dense fog"  # 7.7, 7.16
a[grep("smoke", a, value = F)] = "dense smoke"  #7.8
a[agrep("microburst", a)] = "microburst"  #
a[agrep("mircoburst", a)] = "microburst"  #
a[grep("dry", a, value = F)] = "drought"  #7.9
a[grep("drought", a, value = F)] = "drought"
a[grep("devil", a, value = F)] = "dust devil"  #7.10
a[agrep("dust storm", a)] <- "dust storm"  #7.11
a[agrep("excessive heat", a)] = "excessive heat"  #7.12
a[agrep("record heat", a)] = "excessive heat"  #7.12
a[!grepl("excessive heat", a) & grepl("heat", a)] = "heat"  #7.20
a[grep("frost", a, value = F)] = "frost/freeze"  #7.17
a[grep("^freeze|[^/]freeze", a, value = F)] = "frost/freeze"  #7.17
a[agrep("funnel", a)] = "funnel cloud"  #7.18
a[grepl("hail", a) & !grepl("marine hail", a)] <- "hail"  # 7.19, 7.30
a[grepl("rain", a)] <- "heavy rain"  #7.21
a[grepl("snow", a)] <- "heavy snow"  # 7.22
a[grep("surf", a)] = "high surf"  #7.23
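# temporarily tag the two wind categories with placeholder codes so that the
# broader fuzzy "thunderstorm" match does not overwrite them
a[agrep("marine thunderstorm wind", a)] = "7.33"
a[agrep("thunderstorm wind", a)] = "7.39"
a[agrep("thunderstorm", a)] = "thunderstorm"
# restore the readable labels
a[grep("7.39", a)] <- "thunderstorm wind"
a[grep("7.33", a)] <- "marine thunderstorm wind"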
a[agrep("hurricane", a)] <- "hurricane/typhoon"
a[agrep("typhoon", a)] <- "hurricane/typhoon"  #7.25
a[agrep("ice storm", a)] <- "ice storm"  #7.26
a[agrep("lightning", a)] <- "lightning"  #7.29
a[agrep("rip current", a)] <- "rip current"  #7.34
a[agrep("sleet", a)] <- "sleet"  #7.36
a[agrep("funnel", a)] <- "tornado"  # 7.40
a[agrep("waterspout", a)] <- "waterspout"  #7.45
a[agrep("tropical storm", a)] <- "tropical storm"  # 7.42
a[agrep("volcan", a)] <- "volcanic ash"  #7.44
a[agrep("wildfire", a)] <- "wildfire"  #7.46
a[agrep("winter storm", a)] <- "winter storm"  #7.47
a[agrep("winter weather", a)] <- "winter weather"  #7.48
a[agrep("winter mix", a)] <- "winter mix"
a[a == "wintry mix"] <- "winter mix"
# update EVTYPE variable
storm$EVTYPE <- as.factor(a)
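
To make the effect of the cleaning steps easier to follow, the sketch below applies the normalization rules to a few sample labels and then shows how `agrep()` groups a misspelled label. The helper `clean_evtype` and the vectors `samples` and `labels` are hypothetical names introduced only for this illustration, and the sample strings are chosen by hand rather than drawn from the data.

```r
# Hypothetical helper that wraps the normalization steps used above.
clean_evtype <- function(x) {
  x <- tolower(as.character(x))
  x <- gsub("\\(.*\\)", "", x)        # drop parenthesized text
  x <- gsub("g?[0-9].*", "", x)       # drop everything from the first digit onward
  x <- gsub(" .$", "", x)             # drop a trailing space plus one character
  x <- gsub("^\\s+|\\s+$", "", x)     # trim leading and trailing whitespace
  x <- sub("[^a-z]$", "", x)          # drop a trailing non-letter
  x <- gsub("tstm", "thunderstorm", x)
  x <- gsub("wnd", "wind", x)
  x <- sub("s$", "", x)               # drop a trailing "s"
  x
}

# Hand-picked sample labels (illustrative only).
samples <- c("TSTM WIND (G45)", "THUNDERSTORM WINDS 60", "Flash Floods",
             "TSTM WND", "HAIL 1.75)")
cleaned <- clean_evtype(samples)
print(cleaned)
# "thunderstorm wind" "thunderstorm wind" "flash flood" "thunderstorm wind" "hail"

# Fuzzy grouping: agrep() matches labels within a small edit distance of the
# pattern, so the misspelling "avalance" is grouped with "avalanche".
labels <- c("avalanche", "avalance", "hail")
labels[agrep("avalanche", labels)] <- "avalanche"
print(labels)
```

Fuzzy matching is convenient for typos, but it can also catch labels that merely resemble the pattern, so the order of the regrouping statements matters.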

Results

Analysis of population health loss

We sum fatalities and injuries by event type to find the events most harmful to population health. The following chart shows the 10 most harmful event types. As shown, tornadoes are the most harmful event type.

poHealth <- tapply(storm$HEALTH, storm$EVTYPE, sum)
poHealth <- sort(poHealth, decreasing = TRUE)
library(lattice)
barchart(as.table(poHealth)[1:10], main = "Population Health", xlab = "Fatalities + Injuries",
    ylab = "Top 10 Events", col = "red")

Figure: bar chart of the top 10 event types by total fatalities plus injuries.
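
The aggregation behind the chart can be illustrated on a tiny made-up data set. This is only a sketch of the `tapply()` and `sort()` pattern: the data frame `toy` and its counts are hypothetical and are not results from the storm database.

```r
# Hypothetical toy data: one row per event, with its combined health count.
toy <- data.frame(
  EVTYPE = c("tornado", "flood", "tornado", "heat", "flood", "tornado"),
  HEALTH = c(10, 2, 5, 7, 1, 3)
)

# Sum the health counts within each event type.
totals <- tapply(toy$HEALTH, toy$EVTYPE, sum)

# Order the event types from most to least harmful.
totals <- sort(totals, decreasing = TRUE)

# Keep the top entries, as the report does with [1:10].
print(head(totals, 2))
```

In this toy case the totals are 18 for tornado, 7 for heat, and 3 for flood, so the first two entries are tornado and heat.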

Analysis of economic consequences

Property damage and crop damage are weighted equally when evaluating economic consequences. The following chart shows the 10 event types with the greatest economic impact. As shown, floods have the greatest economic consequences.

ecoCons <- tapply(storm$ECO, storm$EVTYPE, sum)
ecoCons <- sort(ecoCons, decreasing = TRUE)
barchart(as.table(ecoCons)[1:10], main = "Economic Consequences", xlab = "Damages ($)",
    ylab = "Top 10 Events", col = "red")

Figure: bar chart of the top 10 event types by total property plus crop damage in dollars.