Using data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, we classify damaging weather events into 26 categories, such as tornadoes, heat and cold events, and fog. We then compute the average number of fatalities and injuries and the average economic damages for each type of event, and compare them. The types of events that are most likely to cause human injury or death are heat, tsunamis, avalanches, cold, surf and coastal events, hurricanes, fog, and dust. The types of events that cause the most economic damage on average are hurricanes, droughts, surf and coastal events, tropical storms, tsunamis, and ice events. There is little or no relation between the types of events that cause economic damages and the types of events that cause injuries or deaths.
The data for our analysis come from NOAA’s storm database. The database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage, going back to 1950. However, the data are very messy, as they appear to come directly from state reporters. Some variables are missing or have incorrect values, and variations in terminology and spelling are common. Before we can perform our analysis, we need to tidy the data into recognizable categories.
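#download data (bzip2-compressed CSV) to a temporary file
temp <- tempfile()
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",temp)
#shorthand for the column classes (note that F masks the built-in alias for FALSE)
F<- "factor"
N<- "numeric"
C<- "character"
#one class per column, in file order (37 columns)
colclass<- c(F,C,C,F,C,C,F,F,N,F,C,C,C,C,C,N,F,C,N,N,F,N,N,N,N,F,N,F,C,C,C,N,N,N,N,C,N)
#read.csv reads the compressed file directly
StormData<- read.csv(temp, colClasses=colclass)
#keep only the date part of the begin date
StormData$BGN_DATE<-as.Date(StormData$BGN_DATE,"%m/%d/%Y")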
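As a rough illustration of the messiness described above, the sketch below counts distinct event labels after normalising case and whitespace. The `evtype` vector is a hypothetical stand-in for `StormData$EVTYPE`, not the real data, and the labels in it are chosen only to show the kind of variation involved.

```r
# Toy stand-in for StormData$EVTYPE (hypothetical values, not the real data)
evtype <- c("TSTM WIND", "Tstm Wind", " THUNDERSTORM WINDS", "TORNADO",
            "TORNDAO", "tornado", "HAIL", "Hail ")

# Normalise case and surrounding whitespace before counting
evtype_clean <- toupper(trimws(evtype))

# Frequency of each distinct label, most common first
counts <- sort(table(evtype_clean), decreasing = TRUE)
print(counts)
```

Normalising case and whitespace merges some variants, but misspellings and synonyms remain as separate labels, which is why pattern matching is still needed.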
After downloading the data, we use the variable “EVTYPE” (event type) to create different types of weather event categories. In many cases, multiple types of damage occurred in a single event. For instance, a thunderstorm event may include both tornadoes and hail. Because of this, we created 26 variables to indicate whether a particular type of event was mentioned in the “EVTYPE” variable. In most cases, similar types of events were collapsed into a single variable. The definitions of the different categories of event types are presented below.
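A minimal sketch of the flagging idea, assuming a small hypothetical `evtype` vector in place of the real column. Each category gets its own YES/NO flag, so a single event can belong to several categories at once.

```r
# Hypothetical event labels for illustration only
evtype <- c("TORNADOES, TSTM WIND, HAIL", "HAIL", "DENSE FOG", "WIND CHILL")

# One YES/NO flag per category; an event may be flagged in more than one
flags <- data.frame(
  EVTYPE  = evtype,
  TORNADO = ifelse(grepl("tornado", evtype, ignore.case = TRUE), "YES", "NO"),
  HAIL    = ifelse(grepl("hail", evtype, ignore.case = TRUE), "YES", "NO"),
  FOG     = ifelse(grepl("[vf]og", evtype, ignore.case = TRUE), "YES", "NO"),
  stringsAsFactors = FALSE
)
print(flags)
```

The first label is flagged as both a tornado and a hail event, which is the behaviour the 26 indicator variables are meant to capture.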
It was also necessary to create a summary variable of economic damages. In the original dataset, the values for economic damages could be indicated in hundreds, thousands, millions, or billions of dollars, and were divided into two different types of damages: property damages and crop damages. We put all of the different damages on the same scale, and added property and crop damages together.
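The rescaling can be sketched on toy values as follows. The `multiplier` lookup and the `to_dollars` helper are hypothetical names introduced for illustration; the sketch assumes, as the analysis code does, that a zero amount counts as zero dollars and that an unrecognised scale code leaves the value missing.

```r
# Hypothetical lookup from scale code to dollar multiplier
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: convert an amount and its scale code to dollars
to_dollars <- function(amount, exp_code) {
  # Unrecognised or empty codes index to NA, so the scaled value is NA
  scaled <- amount * unname(multiplier[toupper(exp_code)])
  # A zero amount is zero dollars whatever the code says
  ifelse(amount == 0, 0, scaled)
}

prop  <- to_dollars(c(25, 2.5, 0, 10), c("K", "M", "", "?"))
crop  <- to_dollars(c(0, 1, 0, 5), c("", "k", "", "K"))
total <- prop + crop  # NA whenever either component is NA
print(total)
```

Leaving unrecognised codes as missing, rather than guessing a scale, keeps doubtful records out of the averages.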
Finally, based on the notes included in the dataset, it is clear that $0 in damages sometimes means that there were no damages (as in a sighting of the Northern Lights), and sometimes means that the damages were not recorded or not calculated. Because of this, all economic analyses exclude events for which no damages were reported. This may overstate the economic damages of certain types of events.
library(plyr, quietly = TRUE)
library(dplyr, quietly = TRUE)
#create summary variables for use
SmlData <- StormData %>%
  #types of events
  mutate(
    TORNADO = ifelse(grepl("tornado|torndao|dust dev|landspout", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    WIND = ifelse(grepl("wind|wnd", EVTYPE, ignore.case = TRUE) &
                    !grepl("wind( )?ch(ill)?", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    COLD = ifelse(grepl("cold|freeze|wind( )?ch(ill)?|low temp|cool|HYPOTHERMIA|frost|record low", EVTYPE, ignore.case = TRUE) &
                    !grepl("cold air funnel|cold air tornado", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    FOG = ifelse(grepl("[vf]og", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    WINTWEATH = ifelse(grepl("winter|snow|ice|blizzard|sleet|wintry|freezing (rain|drizzle|fog|spray)|icy|glaze|HEAVY MIX|MIXED PRECIP", EVTYPE, ignore.case = TRUE) &
                         !grepl("(lack of|unusually late) snow|snow (drought|melt)|ice (jam|floe)", EVTYPE, ignore.case = TRUE),
                       "YES", "NO"),
    FLOOD = ifelse(grepl("floo(o)?d|dam |fld|SMALL STREAM|high water|urban|RISING WATER", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    FIRE = ifelse(grepl("fire|smoke", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    DROUGHT = ifelse(grepl("dry|drought|lack of snow|unusually late|BELOW NORMAL PRECIPITATION", EVTYPE, ignore.case = TRUE) &
                       !grepl("dry micro|dry mirco", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    THUNDERSTORM = ifelse(grepl("thunderstorm|tstm|lightning|LIGHTING|LIGNTNING", EVTYPE, ignore.case = TRUE) &
                            !grepl("non[ -]tstm", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    HAIL = ifelse(grepl("hail", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    HEAT = ifelse(grepl("heat|warm|hot|high te|HYPERTHERMIA|record high", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    MICROB = ifelse(grepl("mic(r)?oburst|downburst", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    DUST = ifelse(grepl("dust", EVTYPE, ignore.case = TRUE) &
                    !grepl("dust dev", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    SNOW = ifelse(grepl("snow", EVTYPE, ignore.case = TRUE) &
                    !grepl("(lack of|unusually late) snow|snow( drought|melt| melt)", EVTYPE, ignore.case = TRUE),
                  "YES", "NO"),
    WATERSP = ifelse(grepl("wa(y)?ter( )?spout", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    ICE = ifelse(grepl("ice|freezing (rain|drizzle|fog|spray)|icy|glaze", EVTYPE, ignore.case = TRUE) &
                   !grepl("ice (melt|jam|floe)", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    VOLCANO = ifelse(grepl("volcan", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    HURRICANE = ifelse(grepl("hurricane|typhoon", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    TROPSTORM = ifelse(grepl("tropical|remnants", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    RAIN = ifelse(grepl("rain|(unseasonably|abnormally|EXTREMELY|EXCESSIVE) wet|drizzle|(RECORD|heavy|EXCESSIVE) PRECIP[ia]TATION|HEAVY SHOWER|wet (weather|year|month)", EVTYPE, ignore.case = TRUE) &
                    !grepl("freezing (rain|drizzle)", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    SURF = ifelse(grepl("surf|rip current|wave|seas|tide|beach|STORM SURGE|coastal (er|surge)|swell", EVTYPE, ignore.case = TRUE) &
                    !grepl("(cold|heat) wave|season", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    MUDSLIDE = ifelse(grepl("(mud|rock)( )?slide|landsl", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    FUNNEL = ifelse(grepl("funnel|gustnado", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    AVALANCHE = ifelse(grepl("avalanche|AVALANCE", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    TSUNAMI = ifelse(grepl("tsunami", EVTYPE, ignore.case = TRUE), "YES", "NO"),
    OTHER = ifelse(grepl("SEICHE|WALL CLOUD|other$|Coastal( )?St|ICE (floes|jam)", EVTYPE, ignore.case = TRUE), "YES", "NO")) %>%
  #events not included in any category
  mutate(NONE = ifelse(TORNADO == "NO" & WIND == "NO" & COLD == "NO" & FOG == "NO"
                       & WINTWEATH == "NO" & FLOOD == "NO" & FIRE == "NO"
                       & DROUGHT == "NO" & THUNDERSTORM == "NO" & HAIL == "NO"
                       & HEAT == "NO" & MICROB == "NO" & DUST == "NO" & SNOW == "NO"
                       & WATERSP == "NO" & ICE == "NO" & OTHER == "NO" & TROPSTORM == "NO"
                       & VOLCANO == "NO" & HURRICANE == "NO" & RAIN == "NO"
                       & SURF == "NO" & MUDSLIDE == "NO" & FUNNEL == "NO" & AVALANCHE == "NO"
                       & TSUNAMI == "NO",
                       "YES", "NO")) %>%
  #economic damages summary variables (unrecognised scale codes give NA)
  mutate(
    ECON_PR = case_when(
      PROPDMG == 0 ~ PROPDMG,
      PROPDMGEXP %in% c("h", "H") ~ 100 * PROPDMG,
      PROPDMGEXP %in% c("k", "K") ~ 1000 * PROPDMG,
      PROPDMGEXP %in% c("m", "M") ~ 1000000 * PROPDMG,
      PROPDMGEXP %in% c("b", "B") ~ 1000000000 * PROPDMG),
    ECON_CR = case_when(
      CROPDMG == 0 ~ CROPDMG,
      CROPDMGEXP %in% c("h", "H") ~ 100 * CROPDMG,
      CROPDMGEXP %in% c("k", "K") ~ 1000 * CROPDMG,
      CROPDMGEXP %in% c("m", "M") ~ 1000000 * CROPDMG,
      CROPDMGEXP %in% c("b", "B") ~ 1000000000 * CROPDMG),
    ECON_T = ifelse(!is.na(ECON_PR) & !is.na(ECON_CR), ECON_PR + ECON_CR, NA)) %>%
  #create smaller analysis dataset with limited variables
  select(REFNUM, TORNADO:NONE, FATALITIES, INJURIES, ECON_PR:ECON_T, EVTYPE) %>%
  #only include events with economic damages or human injuries
  subset((FATALITIES > 0 | INJURIES > 0 | ECON_T > 0))
We analyse 26 different types of events.
Two types frequently occur with more specific damages as well: thunderstorms and winter weather. Thunderstorms included any event that mentioned thunderstorms or lightning. Winter weather included any event that included snow, ice, blizzards, sleet, winter storms, freezing rain or fog, or mixed precipitation. Some events that included these terms were excluded if they indicated spring weather or other types of events, such as the term “snow melt.”
One type of event, wind, frequently occurred with other events, and was often the explanation for damages associated with thunderstorms, rain, or winter weather. It also occurred on its own. Events with the word “windchill” were not included, as the term generally refers to cold, rather than high winds.
Other events include:
-Tornadoes. This includes tornadoes, dust devils, and landspouts.
-Waterspouts. This includes any type of waterspout.
-Funnel clouds. This includes any type of funnel or gustnado.
-Microbursts and downbursts. This includes any event that includes microbursts or downbursts.
-Cold. This includes any mention of cold or cool weather, freezes, windchill, low temperatures, frost, hypothermia, or record lows.
-Heat. This includes any mentions of heat or warmth, high temperatures, hyperthermia, or record highs.
-Drought and dry weather. This includes events that include any type of drought or dry weather, late rains or snow, or below normal precipitation.
-Snow. This includes most events that mention snow, including storms, snow pack, blowing snow, and snow accumulation. Snowmelt floods and snow droughts are excluded.
-Ice. This includes ice storms, ice glazes, and freezing precipitation. It excludes ice melts, jams, and floes.
-Rain. This includes any event that mentions rain, excess wetness or precipitation, or drizzle. Freezing rain is excluded.
-Tropical storms. This includes any event that mentions tropical storms or remnants of a named storm.
-Hurricanes. This includes both hurricanes and typhoons.
-Hail. This includes any type of hail damage, including floods caused by hail.
-Fog. This includes any type of fog, including freezing fog.
-Flooding. This includes flooding (river, coastal, street, or unspecified), high or rising water, dam breakages, and mentions of streams or “urban.”
-Surf and coastal events. This includes surf, rip currents, waves, high seas, tides, storm surges, and beach erosion. It does not include tsunamis.
-Fire. This includes any event that mentions fire or smoke.
-Dust. This includes duststorms and poor air quality due to dust, but does not include dust devils (which are classified as a type of tornado).
-Volcanoes. This includes damage from volcanic ash or volcanic activity.
-Landslides and mudslides. This includes landslides, rock slides, and mud slides.
-Avalanches. This includes any type of avalanche.
-Tsunamis. This includes any tsunami.
-Other. This includes “other” events, coastal storms, ice floes and jams, and seiches.
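Several of the definitions above pair an inclusion rule with an exclusion rule (wind but not windchill, dust but not dust devils). One way to express that pattern is a small helper, sketched below. The name `flag_event` and the toy `evtype` vector are hypothetical and not part of the original analysis; the regular expressions are the ones used for the wind and dust categories.

```r
# Hypothetical helper: TRUE when `include` matches and `exclude` does not
flag_event <- function(evtype, include, exclude = NULL) {
  hit <- grepl(include, evtype, ignore.case = TRUE)
  if (!is.null(exclude)) {
    hit <- hit & !grepl(exclude, evtype, ignore.case = TRUE)
  }
  hit
}

# Toy labels for illustration only
evtype <- c("HIGH WIND", "EXTREME WINDCHILL", "WIND CHILL", "DUST DEVIL", "DUST STORM")

wind <- flag_event(evtype, "wind|wnd", "wind( )?ch(ill)?")
dust <- flag_event(evtype, "dust", "dust dev")
print(data.frame(evtype, wind, dust))
```

Checking a handful of known labels this way is a quick test that an exclusion pattern removes what it is meant to and nothing else.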
Once we had defined our events, we wanted to look at the average reported damages for each type of event. We calculated the average number of fatalities and injuries for each event, as well as the average economic impact, in thousands of dollars. We also wanted to examine separately events that have at least one injury or fatality, to see if the story changes when only looking at “serious” events. It is also important to see the frequency of events. If there were hardly any injuries from the average thunderstorm, but they were much more common than other types of events, then they might still be a concern.
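The difference between the overall average and the “serious events” average can be seen on a toy example. The data frame below is hypothetical; it only mimics the shape of the analysis dataset for a single event type.

```r
# Hypothetical five-event dataset with one category flag
toy <- data.frame(
  HEAT       = c("YES", "YES", "NO", "NO", "YES"),
  FATALITIES = c(2, 0, 0, 1, 0),
  INJURIES   = c(10, 0, 3, 0, 0),
  stringsAsFactors = FALSE
)

is_heat <- toy$HEAT == "YES"
# "Serious" events have at least one injury or fatality
serious <- is_heat & (toy$FATALITIES + toy$INJURIES > 0)

mean_all     <- mean(toy$FATALITIES[is_heat])  # average over all heat events
mean_serious <- mean(toy$FATALITIES[serious])  # average over serious heat events
n_all        <- sum(is_heat)
n_serious    <- sum(serious)
print(c(n_all = n_all, mean_all = mean_all,
        n_serious = n_serious, mean_serious = mean_serious))
```

Restricting to serious events can only raise or preserve the average fatalities, because the events removed contribute zeros, which is why the event counts are reported alongside the averages.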
#create empty result vectors, one entry per event type (columns 2 to 27)
TYPE <- names(SmlData)[2:27]
MEANF <- rep(NA_real_, 26)
MEANSrF <- rep(NA_real_, 26)
MEANI <- rep(NA_real_, 26)
MEANSrI <- rep(NA_real_, 26)
MEANE <- rep(NA_real_, 26)
MEANSrE <- rep(NA_real_, 26)
NUM_EVENTS <- rep(NA_integer_, 26)
NUM_EVENTSSr <- rep(NA_integer_, 26)
#fill in means and number of events
for (i in 2:27) {
  is_type <- SmlData[[i]] == "YES"
  #"serious" events have at least one injury or fatality
  serious <- is_type & (SmlData$INJURIES + SmlData$FATALITIES > 0)
  MEANF[i-1] <- mean(SmlData$FATALITIES[is_type])
  MEANSrF[i-1] <- mean(SmlData$FATALITIES[serious])
  MEANI[i-1] <- mean(SmlData$INJURIES[is_type])
  MEANSrI[i-1] <- mean(SmlData$INJURIES[serious])
  #economic damages are reported in thousands of dollars
  MEANE[i-1] <- round(mean(SmlData$ECON_T[is_type], na.rm = TRUE)/1000, 0)
  MEANSrE[i-1] <- round(mean(SmlData$ECON_T[serious], na.rm = TRUE)/1000, 0)
  NUM_EVENTS[i-1] <- sum(is_type)
  NUM_EVENTSSr[i-1] <- sum(serious)
}
#create data frame
MeanDam<-data.frame(TYPE,NUM_EVENTS,MEANF=round(MEANF,2),MEANI=round(MEANI,2),MEANE,
NUM_EVENTSSr,MEANSrF=round(MEANSrF,2),MEANSrI=round(MEANSrI,2),MEANSrE)
#print table
library(knitr)
kable(MeanDam[order(-NUM_EVENTS),c(1:4,6:8)],
col.names=c("Type","Number of Events","Average Fatalities","Average Injuries",
"Number of Serious Events","Average Fatalities (serious events)","Average Injuries (serious events)"),
caption="Health Outcomes by Event Type: 1950-2011",
row.names=FALSE)
| Type | Number of Events | Average Fatalities | Average Injuries | Number of Serious Events | Average Fatalities (serious events) | Average Injuries (serious events) |
|---|---|---|---|---|---|---|
| THUNDERSTORM | 132873 | 0.01 | 0.11 | 7340 | 0.21 | 2.01 |
| WIND | 129446 | 0.01 | 0.09 | 4974 | 0.24 | 2.30 |
| TORNADO | 40055 | 0.14 | 2.28 | 7950 | 0.71 | 11.50 |
| FLOOD | 33152 | 0.05 | 0.26 | 1457 | 1.07 | 5.96 |
| HAIL | 26649 | 0.00 | 0.06 | 321 | 0.14 | 4.57 |
| WINTWEATH | 5108 | 0.13 | 1.25 | 783 | 0.86 | 8.18 |
| SNOW | 1898 | 0.09 | 0.61 | 229 | 0.74 | 5.09 |
| FIRE | 1260 | 0.07 | 1.28 | 333 | 0.27 | 4.83 |
| RAIN | 1174 | 0.09 | 0.24 | 128 | 0.83 | 2.20 |
| SURF | 1129 | 0.70 | 0.76 | 829 | 0.95 | 1.03 |
| HEAT | 990 | 3.21 | 9.34 | 947 | 3.36 | 9.76 |
| ICE | 890 | 0.14 | 2.75 | 160 | 0.78 | 15.32 |
| COLD | 658 | 0.74 | 0.50 | 363 | 1.34 | 0.90 |
| TROPSTORM | 456 | 0.14 | 0.84 | 43 | 1.53 | 8.91 |
| DROUGHT | 278 | 0.13 | 0.07 | 4 | 8.75 | 4.75 |
| AVALANCHE | 270 | 0.83 | 0.63 | 241 | 0.93 | 0.71 |
| HURRICANE | 233 | 0.58 | 5.72 | 71 | 1.90 | 18.77 |
| MUDSLIDE | 213 | 0.21 | 0.26 | 24 | 1.83 | 2.29 |
| FOG | 189 | 0.43 | 5.70 | 127 | 0.64 | 8.48 |
| DUST | 105 | 0.21 | 4.19 | 45 | 0.49 | 9.78 |
| MICROB | 87 | 0.03 | 0.32 | 14 | 0.21 | 2.00 |
| WATERSP | 64 | 0.09 | 1.12 | 8 | 0.75 | 9.00 |
| OTHER | 62 | 0.06 | 0.10 | 6 | 0.67 | 1.00 |
| FUNNEL | 19 | 0.00 | 0.16 | 2 | 0.00 | 1.50 |
| TSUNAMI | 14 | 2.36 | 9.21 | 2 | 16.50 | 64.50 |
| VOLCANO | 2 | 0.00 | 0.00 | 0 | NaN | NaN |
Overall, heat events have the greatest average impact on human health, in terms of both fatalities and injuries, followed by tsunamis. However, when only serious events are counted (those causing at least one injury or fatality), tsunamis have the highest impact on human health, although they are very rare in the United States. Other events with high injury or fatality averages are avalanches, cold events, surf and coastal events, hurricanes, fog, and dust. However, average fatalities for these events are often less than 1, meaning most of the time there are no fatalities at all. There were no volcanic events that resulted in any injuries, and hail and wind events also had generally small effects on human health.
#print table
kable(MeanDam[order(-NUM_EVENTS),c(1:2,5:6,9)],
col.names=c("Type","Number of Events","Average Economic Damages","Number of Serious Events","Average Economic Damages (serious events)"),
caption="Economic Damages in thousands of dollars, by Event Type: 1950-2011",
row.names=FALSE)
| Type | Number of Events | Average Economic Damages | Number of Serious Events | Average Economic Damages (serious events) |
|---|---|---|---|---|
| THUNDERSTORM | 132873 | 111 | 7340 | 587 |
| WIND | 129446 | 153 | 4974 | 1667 |
| TORNADO | 40055 | 1472 | 7950 | 5495 |
| FLOOD | 33152 | 5429 | 1457 | 7694 |
| HAIL | 26649 | 777 | 321 | 16525 |
| WINTWEATH | 5108 | 3478 | 783 | 8903 |
| SNOW | 1898 | 613 | 229 | 990 |
| FIRE | 1260 | 7067 | 333 | 14582 |
| RAIN | 1174 | 3676 | 128 | 511 |
| SURF | 1129 | 42600 | 829 | 4919 |
| HEAT | 990 | 934 | 947 | 534 |
| ICE | 890 | 10124 | 160 | 4968 |
| COLD | 658 | 5627 | 363 | 424 |
| TROPSTORM | 456 | 18445 | 43 | 156242 |
| DROUGHT | 278 | 54025 | 4 | 588 |
| AVALANCHE | 270 | 32 | 241 | 24 |
| HURRICANE | 233 | 390011 | 71 | 593514 |
| MUDSLIDE | 213 | 1631 | 24 | 2899 |
| FOG | 189 | 132 | 127 | 130 |
| DUST | 105 | 88 | 45 | 41 |
| MICROB | 87 | 84 | 14 | 2 |
| WATERSP | 64 | 949 | 8 | 6264 |
| OTHER | 62 | 210 | 6 | 8 |
| FUNNEL | 19 | 16 | 2 | 2 |
| TSUNAMI | 14 | 10292 | 2 | 42010 |
| VOLCANO | 2 | 250 | 0 | NaN |
Hurricane events had, by far, the largest average economic impact, at $390 million. Droughts, surf and coastal events, tropical storms, tsunamis, and ice events also averaged over $10 million in damages. At the other end, dust events, microbursts and downbursts, avalanches, and funnel clouds averaged less than $100 thousand in damages.
Events with fatalities and injuries are not always the ones that cause the most economic damage. In almost half of the types of events, limiting the analysis to events with at least one injury or fatality lowers the average economic damages, sometimes greatly. In fact, scatterplots of the log of average fatalities or injuries against the log of average economic damages for each type of event show basically no relationship between the two.
#two side-by-side panels; the shared title and y-axis label are added with mtext()
par(mfrow=c(1,2),mar=c(4,2.1,2.1,2.1),oma=c(2,3,2,0))
#event types with a mean of zero have log(0) = -Inf and are not drawn
plot(log(MeanDam$MEANF),log(MeanDam$MEANE), ylab="",xlab="Log Mean Fatalities")
plot(log(MeanDam$MEANI),log(MeanDam$MEANE),ylab="",xlab="Log Mean Injuries")
mtext("Average Economic Damages and Fatalities/Injuries by Event Type", side=3, outer = TRUE)
mtext("Log Mean Economic Damages (thousands of dollars)", side=2, outer = TRUE)
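As a rough numerical companion to the plots, a rank correlation can be computed between average fatalities and average economic damages. The sketch below uses a hand-picked subset of ten event types, with values copied from the two tables above, so it is illustrative only and is not a substitute for the full comparison. Spearman's rank correlation is an assumption of this sketch rather than part of the original analysis; it is chosen because it needs no log transform and so is unaffected by averages of zero.

```r
# Subset of event types, values copied from the tables above
sub <- data.frame(
  TYPE  = c("HEAT", "TSUNAMI", "AVALANCHE", "COLD", "SURF",
            "HURRICANE", "FOG", "DUST", "TORNADO", "FLOOD"),
  MEANF = c(3.21, 2.36, 0.83, 0.74, 0.70, 0.58, 0.43, 0.21, 0.14, 0.05),
  MEANE = c(934, 10292, 32, 5627, 42600, 390011, 132, 88, 1472, 5429),
  stringsAsFactors = FALSE
)

# Rank-based correlation between average fatalities and average damages
rho <- cor(sub$MEANF, sub$MEANE, method = "spearman")
print(rho)
```

A value of `rho` near zero would be consistent with the lack of relationship seen in the scatterplots, while a value near 1 or -1 would contradict it.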
The types of events that were most likely to cause human injury or death were heat, tsunamis, avalanches, cold, surf and coastal events, hurricanes, fog, and dust. The types of events that caused the most economic damage on average were hurricanes, droughts, surf and coastal events, tropical storms, tsunamis, and ice events. There was little or no relation between the types of events that caused economic damages and the types of events that caused injuries or deaths.
One caution about this analysis is that the events that were reported were those that caused injuries or had calculated economic damages. For instance, most heat waves may be harmless, but when they turn deadly (or when the deaths and injuries are reported), those effects are high. The many generally harmless heat waves were not included in this analysis, so the deadliness of heat waves may be overstated.