Synopsis

Although the citizens are aware of the damages that severe bad weather events produce due to the general public media, the total quantities of losts in human life and properties throughout the years it’s not well known. We analized the weather events that causes the worst damages. We compared the events identifying which provokes more lives lost against those that results in more properties lost. We also separated the outstanding events (identifying outliers) to emphasize on the less publicized but more common ones.

Data Processing

For this study we used the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. It records major weather events since 1950 including location, duration and estimates in property damages and fatalities and injuries.

The data comes in a cvs compressed file.

library(ggplot2)
Sys.setlocale("LC_TIME", "C")
stormdata <- read.csv(bzfile('repdata-data-StormData.csv.bz2'),
                      comment.char = "", nrows = 1500000,
                      stringsAsFactors = F)

Estimates of damages are coded in thousands, millions and billions dollars. We transformed them to dollars units.

todollars <- function(ammount, expfactor) {
    if (expfactor == "K" | expfactor == "k") { ammount * 1000 }
    else if (expfactor == "M" | expfactor == "m") { ammount * 1000000 }
    else if (expfactor == "B") { ammount * 1000000000 }
    else { ammount }
}
storms <- data.frame(as.Date(stormdata$BGN_DATE, '%m/%d/%Y'),
                     stormdata$STATE,
                     stormdata$EVTYPE,
                     stormdata$FATALITIES,
                     stormdata$INJURIES,
                     mapply(todollars, stormdata$PROPDMG, stormdata$PROPDMGEXP),
                     mapply(todollars, stormdata$CROPDMG, stormdata$CROPDMGEXP),
                     stringsAsFactors=F
)
names(storms) <- c("date","state","evtype",
                   "fatalities","injuries","propdmg","cropdmg")

As stated in http://www.ncdc.noaa.gov/stormevents/details.jsp the code for events is more detailed since January 1996. The last recorded event in this data file is November 2011. We filtered the records before January, 1st 1996 and also ignored records that are summary data.

storms96 <- storms[which(as.numeric(format(storms$date, "%Y%m")) >= 199601 &
                             !grepl("^summary",ignore.case=T,storms$evtype)),]

We did some small cleansing in the event type variable. We transformed all string to lower case and trimed white spaces. We normalized hurricanes and typhoons to “hurricane”. We applied similar processes to thunderstorms, floods and ice. We discarded further normalization because it could let to imputation errors. For instance, “heavy rain/high surf” could either fall in “heavy rain” or “high surf” categories. We also didn’t differentiate between property and crop (agriculture) damages and considered both of them together.

storms96$evtype <- tolower(storms96$evtype)
trim <- function (x) gsub("^\\s+|\\s+$", "", x)
storms96$evtype <- trim(storms96$evtype)
storms96$evtype <- gsub("  "," ", storms96$evtype)

storms96$evtype[storms96$evtype == "hurricane edouard"] <- "hurricane"
storms96$evtype[storms96$evtype == "hurricane/typhoon"] <- "hurricane"
storms96$evtype[storms96$evtype == "typhoon"] <- "hurricane"
storms96$evtype <- gsub("tstm.*","thunderstorm", storms96$evtype)
storms96$evtype[storms96$evtype == "thunderstorms"] <- "thunderstorm"
storms96$evtype[storms96$evtype == "thunderstorm wind"] <- "thunderstorm"
storms96$evtype[storms96$evtype == "thunderstorm wind (g40)"] <- "thunderstorm"
storms96$evtype <- gsub("ice.*","ice", storms96$evtype)
storms96$evtype <- gsub("icy.*","ice", storms96$evtype)
storms96$evtype <- gsub("flood.*","flood", storms96$evtype)
storms96$evtype <- gsub("flash flood.*","flash flood", storms96$evtype)

storms96$evtype <- as.factor(storms96$evtype)

storms96$damages <- storms96$propdmg + storms96$cropdmg

Results

Human Life Costs

We grouped the event types by the casualties they cause.

fatlpertype <- aggregate(fatalities ~ evtype,
                    data=storms96[storms96$fatalities > 0,],
                    FUN=sum)
injrpertype <- aggregate(injuries ~ evtype,
                         data=storms96[storms96$injuries > 0,],
                         FUN=sum)

Most the bad weather events (0.8%) didn’t cause any victims from Jan’96 to Nov’11.

round(sum(storms96$fatalities != 0 | storms96$fatalities != 0)
            / nrow(storms96) * 100, 1)
## [1] 0.8

But some of them were breathless and hard to forget like the Joplin tornado and the heatwaves in the Midwest.

storms96[storms96$fatalities >= 50, c(3,1,2,4,5)]
##                evtype       date state fatalities injuries
## 355128 excessive heat 1999-07-28    IL         99        0
## 371112 excessive heat 1999-07-04    PA         74      135
## 862634        tornado 2011-05-22    MO        158     1150

The 5 worst events presented in descending order according the fatalities per year (factor 5.916 because on 2011 there are only 11 months) they cause are:

worstfatl <- order(fatlpertype$fatalities, decreasing=T)
fatlpertype$fatalitiesperyear <- round(fatlpertype$fatalities / 5.916)
head(fatlpertype[worstfatl, c("evtype","fatalitiesperyear")], n=5)
##            evtype fatalitiesperyear
## 18 excessive heat               304
## 79        tornado               255
## 24    flash flood               150
## 54      lightning               110
## 25          flood                70

The same table ordered by injuries:

worstinjr <- order(injrpertype$injuries, decreasing=T)
injrpertype$injuriesperyear <- round(injrpertype$injuries / 5.916)
head(injrpertype[worstinjr, c("evtype","injuriesperyear")], n=5)
##            evtype injuriesperyear
## 74        tornado            3493
## 22          flood            1142
## 15 excessive heat            1080
## 72   thunderstorm             867
## 47      lightning             700

Damages

Regarding damages we calculated the total damages per year in US dollars for each bad weather event type.

dmgspertype <- aggregate(damages ~ evtype,
                         data=storms96[storms96$damages > 0,],
                         FUN=sum)
worstdmgs <- order(dmgspertype$damages, decreasing=T)
dmgspertype$damagesperyear <- round(dmgspertype$damages / (5.916 * 1000000))
head(dmgspertype[worstdmgs, c("evtype","damagesperyear")], n=5)
##          evtype damagesperyear
## 32        flood          25172
## 64    hurricane          14718
## 102 storm surge           7301
## 108     tornado           4209
## 48         hail           2886

Filtering outliers shows that storms and bad weather causes great lost in properties and agriculture throughout the year summing up high ammount of liabilities. We used R boxplot to identify outliers.

dmgspermonth <- aggregate(damages ~ evtype + format(date, "%m %Y"),
                          data=storms96, FUN=sum)
names(dmgspermonth)[2] <- "when"
worstevents <- head(dmgspertype[worstdmgs, c("evtype")], n=5)
dmgspermonthworst5 <- dmgspermonth[dmgspermonth$evtype %in% worstevents,]
dmgspermonthworst5$dwhen <- as.Date(paste("1 ", dmgspermonthworst5$when),
                                    format="%d %m %Y")
outl <- boxplot(dmgspermonthworst5$damages, plot=F)$out
ggplot(subset(dmgspermonthworst5, ! damages %in% outl),
       aes(x=dwhen, y=damages/1000000)) +
    geom_line(aes(group=1)) +
    facet_grid(evtype~.) +
    theme(strip.text.y = element_text(
        size = 8, colour = "red", angle = 90)) +
    labs(x="Date of Event", y="Damages (millions of US$)",
         title="Damages per type since January 1996 (outliers filtered).")

plot of chunk unnamed-chunk-13

Figure 1: Damages since January 1996 in millions of US$ without outliers. It’s only shown the 5 worst bad weather event types.

Conclusions

We conclude that Excessive Heat is the ‘hidden plague’. Its lack of catastrophic images doesn’t serve well from a newstand point of view. Regrettably it causes aproximately 300 deaths and 1000 injured people per year. Tornadoes, 250 fatalities and 3,500 injuries. Floods of different types led to more than 200 deaths.

However floods, hurricanes, storms and tornados induce less fatalities than heat waves but enourmous ammount of damages: near to 25, 15, 7 and 4 billions of dollars per year respectively.