Impact of Storms and Other Severe Weather Events on the Public Health and the Economy in the United States

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This data analysis explores the NOAA Storm Database and answers some basic questions about severe weather events. Based on the results described below, tornados have the most adverse impact on public health across the United States. Tornados also rank first in economic impact, based on the combined property and crop damage figures reported in the database.

Data Processing

This data analysis uses the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. Two public sources document the database:
- National Weather Service Storm Data Documentation.
- National Climatic Data Center Storm Events, FAQ.

The data can be downloaded here. First download the data, then decompress and read it as follows:

df_rawDataFromBz2file <- read.csv(bzfile("repdata-data-StormData.csv.bz2"))
#returns data frame with 902297 obs. of 37 variables

The data is then split by event type for analysis. The event type column (EVTYPE) is a factor with 985 levels, including TORNADO, WINTER STORM, HIGH SURF ADVISORY, etc. The structure of the dataset published by NOAA is shown below:

#split by event type, $ EVTYPE: factor
splitPerEventType <- split(df_rawDataFromBz2file, df_rawDataFromBz2file$EVTYPE, drop=TRUE)
str(df_rawDataFromBz2file)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
##  $ BGN_TIME  : Factor w/ 3608 levels "00:00:00 AM",..: 272 287 2705 1683 2584 3186 242 1683 3186 3186 ...
##  $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : Factor w/ 35 levels "","  N"," NW",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_LOCATI: Factor w/ 54429 levels ""," Christiansburg",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_DATE  : Factor w/ 6663 levels "","1/1/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_TIME  : Factor w/ 3647 levels ""," 0900CST",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_LOCATI: Factor w/ 34506 levels ""," CANTON"," TULIA",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ WFO       : Factor w/ 542 levels ""," CI","%SD",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ZONENAMES : Factor w/ 25112 levels "","                                                                                                                               "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : Factor w/ 436781 levels "","\t","\t\t",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...

Results

Types of events that are most harmful with respect to population health

First, as shown below, fatalities are highest for the following event types:
1. Tornados,
2. Excessive heat,
3. Flash flood,
4. Heat, and
5. Lightning.

#add fatalities for each event type
splitPerEventTypeSum <- sapply(splitPerEventType, function(x) { sum(x$FATALITIES)})
#sort the totals starting with highest counts
sortedEvents <- sort(splitPerEventTypeSum, decreasing = TRUE)
topEvents <- head(sortedEvents, 10)
#plot the results
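#widen the bottom margin so the rotated event labels fit
op <- par(mar = c(14,8,4,2) + 0.1)
barplot(topEvents,
     main="Event types with highest number of fatalities",
     ylab = "Fatality count",
     ylim=c(0,1000+max(topEvents)),
     las = 2)
#restore the previous graphical parameters
par(op)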

More specifically, the fatality count per event type is listed below:

topEvents
##        TORNADO EXCESSIVE HEAT    FLASH FLOOD           HEAT      LIGHTNING 
##           5633           1903            978            937            816 
##      TSTM WIND          FLOOD    RIP CURRENT      HIGH WIND      AVALANCHE 
##            504            470            368            248            224
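
The per-event totals above come from splitting the data frame and summing each piece. As a minimal sketch of the same aggregation, the toy data below (made-up values, not taken from the NOAA file) compares that approach with `tapply`, which sums a column within each factor level without building the intermediate list of data frames. The names `toy`, `bySplit`, `byTapply`, and `sortedToy` are illustrative only.

```r
# Toy data frame standing in for the storm data (values are made up).
toy <- data.frame(
  EVTYPE     = factor(c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT")),
  FATALITIES = c(3, 1, 2, 0, 2)
)

# Approach used in this report: split into per-event data frames, then sum each.
bySplit <- sapply(split(toy, toy$EVTYPE, drop = TRUE),
                  function(x) sum(x$FATALITIES))

# Alternative: tapply sums the column directly within each factor level.
byTapply <- tapply(toy$FATALITIES, toy$EVTYPE, sum)

# Sort the totals starting with the highest, as in the analysis.
sortedToy <- sort(byTapply, decreasing = TRUE)
print(sortedToy)
```

On the full dataset, `tapply` may be noticeably faster, because `split` has to copy all 37 columns into one data frame per event type.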

Second, as shown below, injuries are highest for the following event types:
1. Tornados,
2. Thunderstorm wind,
3. Flood,
4. Excessive heat, and
5. Lightning.

#add injuries for each event type
splitSumInjuries <- sapply(splitPerEventType, function(x) { sum(x$INJURIES)})
#sort the totals starting with highest counts
sortedInjuries <- sort(splitSumInjuries, decreasing = TRUE)
topInjuries <- head(sortedInjuries, 10)
#plot the results
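#widen the bottom margin so the rotated event labels fit
op <- par(mar = c(14,8,4,2) + 0.1)
barplot(topInjuries,
     main="Event types and the number of injuries",
     ylab = "Injury count",
     ylim=c(0,1000+max(topInjuries)),
     las = 2)
#restore the previous graphical parameters
par(op)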

More specifically, the injury count per event type is listed below:

topInjuries
##           TORNADO         TSTM WIND             FLOOD    EXCESSIVE HEAT 
##             91346              6957              6789              6525 
##         LIGHTNING              HEAT         ICE STORM       FLASH FLOOD 
##              5230              2100              1975              1777 
## THUNDERSTORM WIND              HAIL 
##              1488              1361
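
The injury table lists TSTM WIND and THUNDERSTORM WIND as separate event types, and the EVTYPE levels shown earlier include entries with leading spaces. The sketch below illustrates one possible way to merge such variants before summing. The helper `normalizeEvtype` and its replacement rules are hypothetical and cover only the variants named here; the toy values are made up.

```r
# Toy data with spelling variants of the same event type (values are made up).
toy <- data.frame(
  EVTYPE   = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS",
               "  tornado", "TORNADO"),
  INJURIES = c(10, 5, 2, 7, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: trim whitespace, upper-case, and merge known variants.
normalizeEvtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- sub("^TSTM", "THUNDERSTORM", x)                      # expand abbreviation
  x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)  # drop plural
  x
}

toy$EVTYPE_CLEAN <- normalizeEvtype(toy$EVTYPE)

# Sum injuries per cleaned event type.
cleanTotals <- tapply(toy$INJURIES, toy$EVTYPE_CLEAN, sum)
print(cleanTotals)
```

Applying a cleanup like this to the real data would change the totals and possibly the ranking, so the tables in this report reflect the uncleaned EVTYPE values only.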

Types of events with the greatest economic consequences

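To answer this question, we look at property damage (PROPDMG) and crop damage (CROPDMG), and use the sum of both per event type as the measure of economic impact. The totals add the raw PROPDMG and CROPDMG values as stored; the magnitude codes in PROPDMGEXP and CROPDMGEXP are not applied, so the figures are unscaled rather than dollar amounts.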

As shown below, the event types with the greatest economic consequences are:
1. Tornados,
2. Flash floods,
3. Thunderstorm wind,
4. Hail, and
5. Flood.

#add property and crop damage for each event type
splitSumEconomic <- sapply(splitPerEventType, function(x) { sum(x$PROPDMG) + sum(x$CROPDMG)})
#sort the totals starting with the highest
sortedEconomic <- sort(splitSumEconomic, decreasing = TRUE)
topEconomic <- head(sortedEconomic, 10)
#plot the results
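#widen the bottom margin so the rotated event labels fit
op <- par(mar = c(14,8,4,2) + 0.1)
barplot(topEconomic,
     main="Highest economic cost per event type",
     ylab = "PROPDMG + CROPDMG (unscaled)",
     ylim=c(0,max(topEconomic)*1.1),
     las = 2)
#restore the previous graphical parameters
par(op)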

More specifically, the top 10 event types with the greatest economic consequences are listed below:

topEconomic
##            TORNADO        FLASH FLOOD          TSTM WIND 
##          3312276.7          1599325.1          1445168.2 
##               HAIL              FLOOD  THUNDERSTORM WIND 
##          1268289.7          1067976.4           943635.6 
##          LIGHTNING THUNDERSTORM WINDS          HIGH WIND 
##           606932.4           464978.1           342014.8 
##       WINTER STORM 
##           134699.6
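
The totals above add PROPDMG and CROPDMG as stored. The dataset also carries the magnitude columns PROPDMGEXP and CROPDMGEXP, where the Storm Data documentation uses K for thousands, M for millions, and B for billions. The sketch below shows one way these codes could be applied before summing. The helper `expToMultiplier` is hypothetical, treating every other code (blank, "?", digits, and so on) as a multiplier of 1 is an assumption, and the toy values are made up.

```r
# Toy data (made-up values) with the same damage columns as the NOAA file.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "TORNADO", "HAIL"),
  PROPDMG    = c(2.5, 25, 250, 5),
  PROPDMGEXP = c("B", "K", "K", "M"),
  CROPDMG    = c(0, 0, 10, 1),
  CROPDMGEXP = c("", "", "K", "M"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumption: any code other than K, M, or B is treated as 1.
expToMultiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Scaled damage in dollars for each record.
toy$DAMAGE_USD <- toy$PROPDMG * expToMultiplier(toy$PROPDMGEXP) +
                  toy$CROPDMG * expToMultiplier(toy$CROPDMGEXP)

# Compare the unscaled and scaled totals per event type.
rawTotals    <- sort(tapply(toy$PROPDMG + toy$CROPDMG, toy$EVTYPE, sum),
                     decreasing = TRUE)
scaledTotals <- sort(tapply(toy$DAMAGE_USD, toy$EVTYPE, sum),
                     decreasing = TRUE)
print(rawTotals)
print(scaledTotals)
```

In this toy example the top-ranked event type differs between the unscaled and scaled totals, which suggests the ranking for the real data could also change once the magnitude codes are applied.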

Suggested future data analyses

Based on these results, suggested future analyses on this dataset would include:

  1. Comparing the trends for severe events in each 10-year period from 1950 to 2011,

  2. Looking at specific regions with similar weather patterns, e.g., states in the Northeast are likely to show different trends than states in the Southwest, and

  3. Comparing individual states within regions experiencing similar weather patterns and determining whether the impacts can be correlated with demographic data.
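
As a starting point for the first suggestion, the sketch below groups records into 10-year periods using the year in BGN_DATE, which the dataset stores in a month/day/year format such as "1/1/1966 0:00:00". The toy values are made up, and the names `toy`, `year`, `decade`, and `byDecade` are illustrative only.

```r
# Toy data (made-up values) using the BGN_DATE format seen in the dataset.
toy <- data.frame(
  BGN_DATE   = c("4/18/1950 0:00:00", "6/8/1951 0:00:00",
                 "7/4/1995 0:00:00", "1/1/2011 0:00:00"),
  FATALITIES = c(0, 1, 5, 2),
  stringsAsFactors = FALSE
)

# Extract the four-digit year from the month/day/year string.
year <- as.integer(sub("^[0-9]+/[0-9]+/([0-9]{4}).*$", "\\1",
                       as.character(toy$BGN_DATE)))

# Map each year to the start of its decade, e.g. 1951 -> 1950.
decade <- (year %/% 10) * 10

# Total fatalities per decade.
byDecade <- tapply(toy$FATALITIES, decade, sum)
print(byDecade)
```

The same grouping variable could be combined with EVTYPE to compare event types within each period.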