Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern. For this project, data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database was analyzed to find the most harmful storm types with respect to population health and economic consequences across the US. Data recordings start from 1950, but the analysis here only takes into account recordings from 1996 onwards due to a lack of comprehensive records before this date. Furthermore, observations that did not correspond to one of the 48 official event types were excluded from the analysis.

Data Processing

## load packages
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(RColorBrewer)
## read data
data <- read.csv(bzfile("repdata_data_StormData.csv.bz2"))
## remove rows not corresponding to an official event type
eventNames <- toupper(trimws(readLines("names.txt")))
## Warning in readLines("names.txt"): incomplete final line found on 'names.txt'
data <- data[data$EVTYPE %in% eventNames,]
## remove data from before 1996
data$BGN_DATE <- mdy_hms(data$BGN_DATE)
data <- data[year(data$BGN_DATE) >= 1996,]
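
The two filters above can be tried on a small made-up data frame. The sketch below is illustrative only: the rows and the three-name list standing in for names.txt are hypothetical, and dates are parsed with base R's as.Date instead of lubridate so that it runs without extra packages. It also normalizes EVTYPE with toupper(trimws()), which the processing code above does not do; that step is an assumption about how differently cased or padded entries might be caught.

```r
## toy stand-in for the storm data (all values are made up)
toy <- data.frame(
    EVTYPE   = c("TORNADO", " hail ", "TSTM WIND", "FLOOD", "HAIL"),
    BGN_DATE = c("4/18/1950 0:00:00", "6/2/1997 0:00:00", "7/4/2001 0:00:00",
                 "1/15/1996 0:00:00", "12/31/1995 0:00:00"),
    stringsAsFactors = FALSE
)
## hypothetical subset of the official event names
officialNames <- c("TORNADO", "HAIL", "FLOOD")

## normalize case and whitespace, then keep official event types only
toy$EVTYPE <- toupper(trimws(toy$EVTYPE))
toy <- toy[toy$EVTYPE %in% officialNames, ]

## parse the date part (the trailing time text is ignored) and keep 1996 onwards
toy$BGN_DATE <- as.Date(toy$BGN_DATE, format = "%m/%d/%Y")
toy <- toy[as.integer(format(toy$BGN_DATE, "%Y")) >= 1996, ]
```

In this toy case only the 1997 HAIL row and the 1996 FLOOD row remain: TSTM WIND is not in the name list, and the other rows predate 1996.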

Results

1. Across the United States, which types of events are most harmful with respect to population health?

To find the most harmful storm types with respect to population health, fatalities and injuries from 1996-2011 were summed per storm type, and the storm types in the top quartile (totals at or above the 75th percentile) were plotted.

data$sumFatInj <- data$FATALITIES + data$INJURIES
values1 <- as.vector(unlist(lapply(split(data$sumFatInj, data$EVTYPE), sum)))
names1 <- names(lapply(split(data$sumFatInj, data$EVTYPE), sum))
df1 <- data.frame(event = names1, fatalInj = values1)
## plot 25% most harmful events
top25 <- quantile(df1[,2])[4]
df1 <- df1[df1$fatalInj >= top25,]
## RdBu provides at most 11 colors, so interpolate when more bars are needed
cols <- colorRampPalette(brewer.pal(n = 11, name = "RdBu"))(nrow(df1))
par(mar=c(12, 6, 4, 4))
barplot(df1$fatalInj, col=cols, density=80, ylim=c(0, max(df1[,2]+1000)), 
        main=c("Top 25% most harmful storm types in the USA \n with respect to population health"), 
        ylab="Total Fatalities + Injuries (1996-2011)", 
        names.arg = df1[,1], las=2, cex.axis = .75)
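
The aggregation and quartile cut used above can also be expressed more compactly with tapply. The following is a minimal sketch on made-up casualty counts, not the report's data; the object names toy, totals, cutoff and topEvents are hypothetical.

```r
## made-up casualty counts for a handful of event types
toy <- data.frame(
    EVTYPE     = c("TORNADO", "TORNADO", "HAIL", "FLOOD", "HEAT", "HEAT", "LIGHTNING"),
    FATALITIES = c(5, 2, 0, 3, 10, 4, 1),
    INJURIES   = c(50, 20, 1, 7, 30, 6, 2)
)
## total fatalities + injuries per event type
totals <- tapply(toy$FATALITIES + toy$INJURIES, toy$EVTYPE, sum)
## 75th percentile of the per-event totals
cutoff <- quantile(totals, probs = 0.75)
## event types at or above the cutoff, largest first
topEvents <- sort(totals[totals >= cutoff], decreasing = TRUE)
```

Here the per-event totals are 77, 50, 10, 3 and 1, the cutoff is 50, and topEvents keeps TORNADO and HEAT.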

According to the plot, the storm types causing the most harm to the general population in terms of fatalities + injuries are the following: EXCESSIVE HEAT, FLASH FLOOD, HAIL, HEAT, HEAVY SNOW, HIGH WIND, LIGHTNING, THUNDERSTORM WIND, TORNADO, WILDFIRE, and WINTER STORM.

2. Across the United States, which types of events have the greatest economic consequences?

To find the most harmful storm types with respect to economic consequences, crop and property damage from 1996-2011 were summed per storm type, and the storm types in the top quartile (totals at or above the 75th percentile) were plotted.

expons <- c("M", "B") ## only include events with damage in the millions or billions
data <- data[data$CROPDMGEXP %in% expons & data$PROPDMGEXP %in% expons,]
## convert values recorded in millions to billions (values in billions stay as-is)
data$PROPDMG <- ifelse(data$PROPDMGEXP == "M", data$PROPDMG*.001, data$PROPDMG)
data$CROPDMG <- ifelse(data$CROPDMGEXP == "M", data$CROPDMG*.001, data$CROPDMG)
data$sumPropCrp <- data$PROPDMG + data$CROPDMG
values2 <- as.vector(unlist(lapply(split(data$sumPropCrp, data$EVTYPE), sum)))
names2 <- names(lapply(split(data$sumPropCrp, data$EVTYPE), sum))
df2 <- data.frame(event = names2, propCrp = values2)
## plot 25% most harmful events
top25_2 <- quantile(df2[,2])[4]
df2 <- df2[df2$propCrp >= top25_2,]
cols2 <- brewer.pal(n = nrow(df2), name = "RdBu")
par(mar=c(8, 6, 4, 4))
barplot(df2$propCrp, col=cols2, density=80, ylim=c(0, max(df2[,2])*1.1), 
        main=c("Top 25% most harmful storm types in the USA \n with respect to economic consequence"), 
        ylab="Crop + property damage in billions of dollars (1996-2011)", 
        names.arg = df2[,1], las=2, cex.axis = .75)
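
The conversion of damage figures to a common unit can be illustrated with a named lookup vector. This is a sketch under stated assumptions: the records are made up, only the "M" and "B" codes used above are handled, and the names toy, toBillions, propB, cropB and damage are hypothetical.

```r
## made-up damage records; EXP codes follow the M/B convention used above
toy <- data.frame(
    EVTYPE     = c("FLOOD", "FLOOD", "HAIL", "DROUGHT"),
    PROPDMG    = c(500, 2, 250, 10),
    PROPDMGEXP = c("M", "B", "M", "M"),
    CROPDMG    = c(100, 1, 50, 900),
    CROPDMGEXP = c("M", "M", "M", "M"),
    stringsAsFactors = FALSE
)
## multiplier that turns a value with the given code into billions
toBillions <- c(M = 0.001, B = 1)
propB <- toy$PROPDMG * unname(toBillions[toy$PROPDMGEXP])
cropB <- toy$CROPDMG * unname(toBillions[toy$CROPDMGEXP])
## total damage in billions of dollars per event type
damage <- tapply(propB + cropB, toy$EVTYPE, sum)
```

Because the multiplier is looked up for each row, every value is scaled exactly once, which avoids rescaling a whole column repeatedly inside a loop.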

According to the plot, the storm types causing the most harm to the economy in terms of crop and property damage are the following: DROUGHT, FLASH FLOOD, FLOOD, HAIL, and HIGH WIND.