This report will find that excessive heat was the biggest killer over the examined period, while tornadoes were the most injurious. Hurricanes and floods had the greatest economic impact with severe drought having the greatest economic impact on crop damage.
The Storm Events Database contains data and information for the United States, its territories, and possessions. Data are available from 1950 to the present, and the Database contains over 1.2 million records (reference).
Data across all event types has been available since the beginning of 1996 (reference), as this report is comparing all event types, only events of the period from 1996 until end of 2011 (where the supplied data finishes) were considered and consists of 902,297 event records.
The report addresses two questions:
Across the United States, which types of events are most harmful with respect to population health?
In terms of death, excessive heat was the cause of the highest mortality count while tornadoes by far caused the greatest amount of injury.
Across the United States, which types of events have the greatest economic consequences?
Drought was the cause of the highest economic damage to crops. Hurricanes had the greatest impact on property damage, however, if all flood types were considered a single event type, then these would exceed hurricane damage.
It was found that the singular Hurricane Katrina event contributed to over 25% of the total damage caused by all events over the 16 year period but after removing events linked to Hurricane Katrina, flooding and hurricanes were still the chief cause of economic damage – this is discussed in detail later in the report.
Set caching and echo to TRUE
# set defaults: cache chunks
knitr::opts_chunk$set(cache=TRUE, echo = TRUE)
The following libraries are required to run the code in this analysis and will be installed if not present locally:
readr, dplyr, stringr,
lubridate, ggplot2, kableExtra,
plotly, selectr, Hmisc
packages = c('readr', 'dplyr', 'stringr', 'lubridate',
'ggplot2', 'kableExtra', 'plotly', 'selectr', 'Hmisc')
package.check <- lapply(
packages,
FUN = function(x) {
if (!require(x, character.only = TRUE)) {
install.packages(x, dependencies = TRUE)
library(x, character.only = TRUE)
}
}
)
The data is available for download
in bzip2 compressed format. It is downloaded to the subdirectory
./data/raw which is first created if it doesn’t already
exist. The file is left compressed due to size.
if (!dir.exists('./data/raw')){
dir.create('./data/raw', recursive = T)
}
download.file(
"https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
"./data/raw/repdata_data_StormData.csv.bz2"
)
The data is read using read_csv() from the
readr package, selecting only those variables of
interest:
dfEvents <- read_csv(
"./data/raw/repdata_data_StormData.csv.bz2",
col_select = c("BGN_DATE", "EVTYPE",
"FATALITIES", "INJURIES",
"PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP",
"REFNUM", "REMARKS"),
col_types = cols(
BGN_DATE = col_character(),
EVTYPE = col_character(),
FATALITIES = col_number(),
INJURIES = col_number(),
PROPDMG = col_number(),
PROPDMGEXP = col_character(),
CROPDMG = col_number(),
CROPDMGEXP = col_character(),
REFNUM = col_number(),
REMARKS = col_character(),
)
)
This required several steps:
We are only interested in:
events that contribute towards one of the four key indicators (fatalities, injuries, property damage or crop damage)
events from 1996 as this is the earliest year when all event types were being recorded
Before filtering by date, we eliminate non-contributors first to minimise processing:
dfEvents <- dfEvents %>%
filter(FATALITIES>0 | INJURIES>0 | PROPDMG>0 | CROPDMG>0) %>%
mutate(BGN_DATE=mdy(word(BGN_DATE,1))) %>%
filter(BGN_DATE > '1996-01-01')
head(dfEvents, 10)
## # A tibble: 10 x 10
## BGN_DATE EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## <date> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <chr>
## 1 1996-01-06 WINTER ~ 0 0 380 K 38 K
## 2 1996-01-11 TORNADO 0 0 100 K 0 <NA>
## 3 1996-01-11 TSTM WI~ 0 0 3 K 0 <NA>
## 4 1996-01-11 TSTM WI~ 0 0 5 K 0 <NA>
## 5 1996-01-11 TSTM WI~ 0 0 2 K 0 <NA>
## 6 1996-01-18 HIGH WI~ 0 0 400 K 0 <NA>
## 7 1996-01-19 TSTM WI~ 0 0 12 K 0 <NA>
## 8 1996-01-24 TSTM WI~ 0 0 8 K 0 <NA>
## 9 1996-01-24 TSTM WI~ 0 0 12 K 0 <NA>
## 10 1996-01-26 FLASH F~ 0 0 75 K 0 <NA>
## # ... with 2 more variables: REFNUM <dbl>, REMARKS <chr>
dim(dfEvents)
## [1] 201313 10
The subsetted data now has 201313 rows.
According to the National Weather Service Storm Data Documentation, only the following values should be allowed in the recorded EVTYPE:
| Astronomical Low Tide | Funnel Cloud | Rip Current |
| Astronomical High Tide | Freezing Fog | Seiche |
| Avalanche | Hail | Sleet |
| Blizzard | Heat | Storm Surge/Tide |
| Coastal Flood | Heavy Rain | Strong Wind |
| Cold/Wind Chill | Heavy Snow | Thunderstorm Wind |
| Debris Flow | High Surf | Tornado |
| Dense Fog | High Wind | Tropical Depression |
| Dense Smoke | Hurricane (Typhoon) | Tropical Storm |
| Drought | Ice Storm | Tsunami |
| Dust Devil | Lake-Effect Snow | Volcanic Ash |
| Dust Storm | Lakeshore Flood | Waterspout |
| Excessive Heat | Lightning | Wildfire |
| Extreme Cold/Wind Chill | Marine Hail | Winter Storm |
| Flash Flood | Marine High Wind | Winter Weather |
| Flood | Marine Strong Wind | Marine Thunderstorm Wind |
| Frost/Freeze |
National Weather Service Storm Data Documentation, Section 2.1.1 Storm Data Event Table
Very few of the actual recorded events have followed this directive, there are a huge number of variations, mixes and spelling errors to deal with.
Without classifying the event records properly, any aggregate analysis would be somewhat meaningless.
As many of these entries needed to be examined on case-by-case
process, in the end it was more reliable to just map each recorded
EVTYPE to an actual allowed event type.
The first step for this is to create a mapping list:
# Assign recorded EVTYPES to allowed EVTYPES
event_types <- list(
`Astronomical Low Tide` = c('ASTRONOMICAL LOW TIDE'),
`Astronomical High Tide` = c('ASTRONOMICAL HIGH TIDE','HIGH TIDES'),
Avalanche = c('AVALANCE','AVALANCHE','HEAVY SNOW/BLIZZARD/AVALANCHE'),
Blizzard = c('BLIZZARD AND EXTREME WIND CHIL','BLIZZARD AND HEAVY SNOW','BLIZZARD SUMMARY','BLIZZARD WEATHER','BLIZZARD','BLIZZARD/FREEZING RAIN','BLIZZARD/HEAVY SNOW','BLIZZARD/HIGH WIND','BLIZZARD/WINTER STORM','GROUND BLIZZARD'),
`Coastal Flood` = c('BEACH EROSION/COASTAL FLOOD ','BEACH FLOOD','COASTAL FLOODING/EROSION','COASTAL FLOOD','COASTAL FLOODING','COASTAL FLOODING/EROSION','COASTAL STORM','COASTAL/TIDAL FLOOD','COASTALFLOOD','COASTALSTORM','CSTL FLOODING/EROSION','EROSION/CSTL FLOOD'),
`Cold/Wind Chill` = c('COLD','COLD/WIND CHILL','COLD TEMPERATURE','COLD TEMPERATURES','COLD WAVE','COLD WEATHER','COLD WIND CHILL TEMPERATURES','COLD/WIND CHILL','COLD/WINDS','UNSEASONABLE COLD','UNSEASONABLY COLD','UNSEASONABLY COOL & WET','UNSEASONABLY COOL','UNSEASONAL LOW TEMP','UNUSUALLY COLD'),
`Debris Flow` = c('LANDSLIDE','LANDSLIDE/URBAN FLOOD','LANDSLIDES','LANDSLUMP','MUD SLIDE','MUD SLIDES URBAN FLOODING','MUD SLIDES','MUD/ROCK SLIDE','MUDSLIDE','MUDSLIDE/LANDSLIDE','MUDSLIDES','ROCK SLIDE'),
`Dense Fog` = c('DENSE FOG','FOG AND COLD TEMPERATURES','FOG','PATCHY DENSE FOG'),
`Dense Smoke` = c('DENSE SMOKE','SMOKE'),
Drought = c('ABNORMALLY DRY','BELOW NORMAL PRECIPITATION','DRIEST MONTH','DROUGHT','DROUGHT/EXCESSIVE HEAT','DRY CONDITIONS ','DRY HOT WEATHER','DRY PATTERN','DRY SPELL','DRY WEATHER','DRY','DRYNESS','EXCESSIVELY DRY','HEAT DROUGHT','HEAT WAVE DROUGHT','HEAT/DROUGHT','LACK OF SNOW','RECORD DRY MONTH','RECORD DRYNESS','RECORD LOW RAINFALL','SNOW DROUGHT','UNSEASONABLY DRY','UNSEASONABLY WARM AND DRY','VERY DRY'),
`Dust Devil` = c('DUST DEVEL','DUST DEVIL WATERSPOUT','DUST DEVIL'),
`Dust Storm` = c('BLOWING DUST','DUST STORM','DUST STORM/HIGH WINDS','DUSTSTORM','HIGH WINDS DUST STORM','SAHARAN DUST'),
`Excessive Heat` = c('EXCESSIVE HEAT','EXCESSIVE HEAT/DROUGHT','EXTREME HEAT','HEAT WAVE','HEAT WAVES','HEAT','HEATBURST','HIGH TEMPERATURE RECORD','HOT AND DRY','HOT PATTERN','HOT SPELL','HOT WEATHER','HOT/DRY PATTERN','PROLONG WARMTH','RECORD HEAT WAVE','RECORD HEAT','RECORD HIGH TEMPERATURE','RECORD HIGH TEMPERATURES','RECORD HIGH','RECORD TEMPERATURE','RECORD TEMPERATURES','RECORD WARM TEMPS.','RECORD WARM','RECORD WARMTH','RECORD/EXCESSIVE HEAT','TEMPERATURE RECORD','UNSEASONABLY HOT'),
`Extreme Cold/Wind Chill` = c('BITTER WIND CHILL TEMPERATURES','BITTER WIND CHILL ','EXCESSIVE COLD','EXTENDED COLD','EXTREME COLD','EXTREME COLD/WIND CHILL','EXTREME WIND CHILL','EXTREME WIND CHILL/BLOWING SNO','EXTREME WIND CHILLS','EXTREME WINDCHILL TEMPERATURES','EXTREME WINDCHILL','EXTREME/RECORD COLD','HIGH WIND/WIND CHILL','HIGH WIND/WIND CHILL/BLIZZARD','HIGH WINDS AND WIND CHILL','HYPERTHERMIA/EXPOSURE','HYPOTHERMIA','HYPOTHERMIA/EXPOSURE','LOW TEMPERATURE RECORD','LOW TEMPERATURE','LOW WIND CHILL','PROLONG COLD','PROLONG COLD/SNOW','RECORD COLD','RECORD COLD AND HIGH WIND','RECORD COLD','RECORD COLD/FROST','RECORD COOL','SEVERE COLD','SNOW/ BITTER COLD','WIND CHILL','WIND CHILL/HIGH WIND'),
`Flash Flood` = c('BREAKUP FLOODING','DAM BREAK','DAM FAILURE','FLASH FLOOD - HEAVY RAIN','FLASH FLOOD FROM ICE JAMS','FLASH FLOOD LANDSLIDES','FLASH FLOOD WINDS','FLASH FLOOD','FLASH FLOOD/ FLOOD','FLASH FLOOD/ STREET','FLASH FLOOD/','FLASH FLOOD/FLOOD','FLASH FLOOD/HEAVY RAIN','FLASH FLOOD/LANDSLIDE','FLASH FLOODING','FLASH FLOODING/FLOOD','FLASH FLOODING/THUNDERSTORM WI','FLASH FLOODS','FLASH FLOOODING','FLOOD FLASH','FLOOD FLOOD/FLASH','FLOOD/FLASH FLOOD','FLOOD/FLASH FLOODING','FLOOD/FLASH','FLOOD/FLASH/FLOOD','FLOOD/FLASHFLOOD','LOCAL FLASH FLOOD','RAPIDLY RISING WATER'),
Flood = c('FLOOD & HEAVY RAIN','FLOOD WATCH/','FLOOD','FLOOD/RAIN/WIND','FLOOD/RAIN/WINDS','FLOOD/RIVER FLOOD','FLOOD/STRONG WIND','FLOODING','FLOODING/HEAVY RAIN','FLOODS','HEAVY RAIN AND FLOOD','HEAVY RAIN/MUDSLIDES/FLOOD','HEAVY RAIN/URBAN FLOOD','HEAVY RAIN; URBAN FLOOD WINDS;','HIGH WATER','HIGHWAY FLOODING','ICE JAM FLOOD (MINOR','ICE JAM FLOODING','LOCAL FLOOD','MAJOR FLOOD','MINOR FLOOD','MINOR FLOODING','RIVER AND STREAM FLOOD','RIVER FLOOD','RIVER FLOODING','RURAL FLOOD','SMALL STREAM AND URBAN FLOOD','SMALL STREAM AND URBAN FLOODIN','SMALL STREAM AND','SMALL STREAM FLOOD','SMALL STREAM FLOODING','SMALL STREAM URBAN FLOOD','SMALL STREAM','SMALL STREAM/URBAN FLOOD','SML STREAM FLD','SNOWMELT FLOODING','STREAM FLOODING','STREET FLOOD','STREET FLOODING','URBAN AND SMALL STREAM FLOOD','URBAN AND SMALL STREAM FLOODIN','URBAN AND SMALL STREAM','URBAN AND SMALL','URBAN FLOOD LANDSLIDE','URBAN FLOOD','URBAN FLOODING','URBAN FLOODS','URBAN SMALL STREAM FLOOD','URBAN SMALL','URBAN/SMALL FLOODING','URBAN/SMALL STREAM FLOOD','URBAN/SMALL STREAM FLOOD','URBAN/SMALL STREAM FLOODING','URBAN/SMALL STREAM','URBAN/SMALL STRM FLDG','URBAN/SMALL','URBAN/SML STREAM FLD','URBAN/SML STREAM FLDG','URBAN/STREET FLOODING'),
`Frost/Freeze` = c('AGRICULTURAL FREEZE','BLACK ICE','COLD AND FROST','DAMAGING FREEZE','EARLY FREEZE','EARLY FROST','FREEZE','FREEZING DRIZZLE AND FREEZING','FREEZING DRIZZLE','FREEZING RAIN AND SLEET','FREEZING RAIN AND SNOW','FREEZING RAIN SLEET AND LIGHT','FREEZING RAIN SLEET AND','FREEZING RAIN','FREEZING RAIN/SLEET','FREEZING RAIN/SNOW','FREEZING SPRAY','FROST','FROST/FREEZE','FROST\\FREEZE','GLAZE ICE','GLAZE','GLAZE/ICE STORM','HARD FREEZE','PATCHY ICE','SNOW FREEZING RAIN'),
`Funnel Cloud` = c('COLD AIR FUNNEL','COLD AIR FUNNELS','FUNNEL CLOUD','FUNNEL CLOUD.','FUNNEL CLOUD/HAIL','FUNNEL CLOUDS','FUNNEL','FUNNELS','WALL CLOUD/FUNNEL CLOUD','WHIRLWIND'),
`Freezing Fog` = c('FREEZING FOG','ICE FOG'),
Hail = c('DEEP HAIL','HAIL 0.75','HAIL 0.88','HAIL 075','HAIL 088','HAIL 1.00','HAIL 1.75','HAIL 1.75)','HAIL 100','HAIL 125','HAIL 150','HAIL 175','HAIL 200','HAIL 225','HAIL 275','HAIL 450','HAIL 75','HAIL 80','HAIL 88','HAIL ALOFT','HAIL DAMAGE','HAIL FLOODING','HAIL STORM','HAIL','HAIL(0.75)','HAIL/ICY ROADS','HAIL/WIND','HAIL/WINDS','HAILSTORM','HAILSTORMS','NON SEVERE HAIL','SMALL HAIL','THUNDERSTORM HAIL','THUNDERSTORM HAIL'),
Heat = c('ABNORMAL WARMTH','UNSEASONABLY WARM & WET','UNSEASONABLY WARM YEAR','UNSEASONABLY WARM','UNSEASONABLY WARM/WET','UNUSUAL WARMTH','UNUSUAL/RECORD WARMTH','UNUSUALLY WARM','VERY WARM','WARM DRY CONDITIONS','WARM WEATHER'),
`Heavy Rain` = c('ABNORMALLY WET','DOWNBURST','EARLY RAIN','EXCESSIVE PRECIPITATION','EXCESSIVE RAIN','EXCESSIVE RAINFALL','EXCESSIVE WETNESS','EXTREMELY WET','HEAVY PRECIPATATION','HEAVY PRECIPITATION','HEAVY RAIN AND WIND','HEAVY RAIN EFFECTS','HEAVY RAIN','HEAVY RAIN/FLOODING','HEAVY RAIN/HIGH SURF','HEAVY RAIN/LIGHTNING','HEAVY RAIN/SEVERE WEATHER','HEAVY RAIN/SMALL STREAM URBAN','HEAVY RAIN/SNOW','HEAVY RAIN/WIND','HEAVY RAINFALL','HEAVY RAINS','HEAVY RAINS/FLOODING','HEAVY SHOWER','HEAVY SHOWERS','HIGH WINDS HEAVY RAINS','HIGH WINDS/HEAVY RAIN','HVY RAIN','LOCALLY HEAVY RAIN','PROLONGED RAIN','RAIN (HEAVY)','RAIN AND WIND','RAIN DAMAGE','RAIN','RAIN/SNOW','RAINSTORM','RECORD PRECIPITATION','RECORD RAINFALL','RECORD/EXCESSIVE RAINFALL','TORRENTIAL RAIN','TORRENTIAL RAINFALL','TSTM HEAVY RAIN','UNSEASONAL RAIN','WET MICOBURST','WET MICROBURST','WET MONTH','WET WEATHER','WET YEAR','UNSEASONABLY WET'),
`Heavy Snow` = c('ACCUMULATED SNOWFALL','DRIFTING SNOW','EXCESSIVE SNOW','HEAVY SNOW FREEZING RAIN','HEAVY SNOW & ICE','HEAVY SNOW AND HIGH WINDS','HEAVY SNOW AND ICE STORM','HEAVY SNOW AND ICE','HEAVY SNOW AND STRONG WINDS','HEAVY SNOW AND','HEAVY SNOW ANDBLOWING SNOW','HEAVY SNOW SHOWER','HEAVY SNOW SQUALLS','HEAVY SNOW','HEAVY SNOW-SQUALLS','HEAVY SNOW/BLIZZARD','HEAVY SNOW/BLOWING SNOW','HEAVY SNOW/FREEZING RAIN','HEAVY SNOW/HIGH WIND','HEAVY SNOW/HIGH WINDS & FLOOD','HEAVY SNOW/HIGH WINDS','HEAVY SNOW/HIGH WINDS/FREEZING','HEAVY SNOW/HIGH','HEAVY SNOW/ICE STORM','HEAVY SNOW/ICE','HEAVY SNOW/SLEET','HEAVY SNOW/SQUALLS','HEAVY SNOW/WIND','HEAVY SNOW/WINTER STORM','HIGH WIND/ BLIZZARD','HIGH WIND/BLIZZARD','HIGH WIND/BLIZZARD/FREEZING RA','HIGH WIND/HEAVY SNOW','MOUNTAIN SNOWS','NEAR RECORD SNOW','RECORD SNOW','RECORD SNOW/COLD','RECORD SNOWFALL','RECORD WINTER SNOW','RECORD MAY SNOW','SNOW ACCUMULATION','SNOW ADVISORY','SNOW AND COLD','SNOW AND HEAVY SNOW','SNOW AND ICE STORM','SNOW AND ICE','SNOW AND SLEET','SNOW AND WIND','SNOW SQUALL','SNOW SQUALLS','SNOWFALL RECORD'),
`High Surf` = c('HAZARDOUS SURF','HEAVY SEAS','HEAVY SURF AND WIND','HEAVY SURF COASTAL FLOODING','HEAVY SURF','HEAVY SURF/HIGH SURF','HEAVY SWELLS','HIGH SWELLS','HIGH SEAS','HIGH SURF ADVISORIES','HIGH SURF ADVISORY','HIGH SURF','HIGH SWELLS','HIGH WAVES','ROUGH SEAS','ROUGH SURF'),
`High Wind` = c('DOWNBURST WINDS','DRY MICROBURST 50','DRY MICROBURST 53 ','DRY MICROBURST 58','DRY MICROBURST 61','DRY MICROBURST 84','DRY MICROBURST WINDS','DRY MICROBURST','DRY MIRCOBURST WINDS','GUSTY LAKE WIND','GUSTY WIND','GUSTY WIND/HAIL','GUSTY WIND/HVY RAIN','GUSTY WIND/RAIN','GUSTY WINDS','HIGH WINDS','HIGH WIND (G40)','HIGH WIND 48','HIGH WIND 63','HIGH WIND 70','HIGH WIND AND HEAVY SNOW','HIGH WIND DAMAGE','HIGH WIND','HIGH WIND/LOW WIND CHILL','HIGH WINDS 55','HIGH WINDS 57','HIGH WINDS 58','HIGH WINDS 63','HIGH WINDS 66','HIGH WINDS 67','HIGH WINDS 73','HIGH WINDS 76','HIGH WINDS 80','HIGH WINDS 82','HIGH WINDS','HIGH WINDS/','HIGH WINDS/COASTAL FLOOD','HIGH WINDS/COLD','HIGH WINDS/FLOODING','MICROBURST WINDS','MICROBURST','STORM FORCE WINDS','STRONG WIND GUST','STRONG WIND','STRONG WINDS','WIND ADVISORY','WIND AND WAVE','WIND DAMAGE','WIND GUSTS','WIND STORM','WIND','WIND/HAIL','WINDS','WND'),
`Hurricane (Typhoon)` = c('HURRICANE EDOUARD','HURRICANE EMILY','HURRICANE ERIN','HURRICANE FELIX','HURRICANE GORDON','HURRICANE OPAL','HURRICANE OPAL/HIGH WINDS ','HURRICANE','HURRICANE-GENERATED SWELLS','HURRICANE/TYPHOON','REMNANTS OF FLOYD','TYPHOON'),
`Ice Storm` = c('ICE STORM AND SNOW','ICE STORM','ICE STORM/FLASH FLOOD','ICE/STRONG WINDS','ICESTORM/BLIZZARD'),
`Lake-Effect Snow` = c('HEAVY LAKE SNOW','LAKE EFFECT SNOW','LAKE-EFFECT SNOW'),
`Lakeshore Flood` = c('LAKE FLOOD','LAKESHORE FLOOD'),
Lightning = c('LIGHTING','LIGHTNING WAUSEON','LIGHTNING AND HEAVY RAIN','LIGHTNING AND THUNDERSTORM WIN','LIGHTNING AND WINDS','LIGHTNING DAMAGE','LIGHTNING FIRE','LIGHTNING INJURY','LIGHTNING','LIGHTNING.','LIGHTNING/HEAVY RAIN','LIGNTNING'),
`Marine Hail` = c('MARINE HAIL'),
`Marine High Wind` = c('HIGH WIND AND SEAS','HIGH WIND AND HIGH TIDES','HIGH WIND/SEAS'),
`Marine Strong Wind` = c('MARINE HIGH WIND','MARINE STRONG WIND'),
`Marine Thunderstorm Wind` = c('MARINE THUNDERSTORM WIND','MARINE TSTM WIND'),
`Rip Current` = c('RIP CURRENT','RIP CURRENTS HEAVY SURF','RIP CURRENTS','RIP CURRENTS/HEAVY SURF'),
Seiche = c('ROGUE WAVE','SEICHE'),
Sleet = c('SLEET & FREEZING RAIN','SLEET STORM','SLEET','SLEET/FREEZING RAIN','SLEET/ICE STORM','SLEET/RAIN/SNOW','SLEET/SNOW'),
`Storm Surge/Tide` = c('BLOW-OUT TIDE','BLOW-OUT TIDES','COASTAL SURGE','STORM SURGE','STORM SURGE/TIDE','TIDAL FLOOD','TIDAL FLOODING'),
`Strong Wind` = c('GRADIENT WIND','GRADIENT WINDS','NON TSTM WIND','NON-SEVERE WIND DAMAGE','NON-TSTM WIND'),
`Thunderstorm Wind` = c('GUSTNADO','GUSTNADO AND','GUSTY THUNDERSTORM WIND','GUSTY THUNDERSTORM WINDS','LARGE WALL CLOUD','LIGHTNING THUNDERSTORM WINDS','LIGHTNING THUNDERSTORM WINDSS','ROTATING WALL CLOUD','SEVERE THUNDERSTORM WINDS','SEVERE THUNDERSTORM','SEVERE THUNDERSTORMS','SEVERE TURBULENCE','THUDERSTORM WINDS','THUNDEERSTORM WINDS','THUNDERESTORM WINDS','THUNDERSTORM WINDS','THUNDERSTORM DAMAGE TO','THUNDERSTORM DAMAGE','THUNDERSTORM W INDS','THUNDERSTORM DAMAGE TO','THUNDERSTORM DAMAGE','THUNDERSTORM W INDS','THUNDERSTORM WIND (G40)','THUNDERSTORM WIND 50','THUNDERSTORM WIND 52','THUNDERSTORM WIND 56','THUNDERSTORM WIND 59 MPH','THUNDERSTORM WIND 59 MPH.','THUNDERSTORM WIND 59','THUNDERSTORM WIND 60 MPH','THUNDERSTORM WIND 65 MPH','THUNDERSTORM WIND 65MPH','THUNDERSTORM WIND 69','THUNDERSTORM WIND 98 MPH','THUNDERSTORM WIND G50','THUNDERSTORM WIND G51','THUNDERSTORM WIND G52','THUNDERSTORM WIND G55','THUNDERSTORM WIND G60','THUNDERSTORM WIND G61','THUNDERSTORM WIND TREES','THUNDERSTORM WIND','THUNDERSTORM WIND.','THUNDERSTORM WIND/ TREE','THUNDERSTORM WIND/ TREES','THUNDERSTORM WIND/AWNING','THUNDERSTORM WIND/HAIL','THUNDERSTORM WIND/LIGHTNING','THUNDERSTORM WINDS LE CEN','THUNDERSTORM WINDS 13','THUNDERSTORM WINDS 2','THUNDERSTORM WINDS 50','THUNDERSTORM WINDS 52','THUNDERSTORM WINDS 53','THUNDERSTORM WINDS 60','THUNDERSTORM WINDS 61','THUNDERSTORM WINDS 62','THUNDERSTORM WINDS 63 MPH','THUNDERSTORM WINDS AND','THUNDERSTORM WINDS FUNNEL CLOU','THUNDERSTORM WINDS G','THUNDERSTORM WINDS G60','THUNDERSTORM WINDS HAIL','THUNDERSTORM WINDS HEAVY RAIN','THUNDERSTORM WINDS LIGHTNING','THUNDERSTORM WINDS SMALL STREA','THUNDERSTORM WINDS URBAN FLOOD','THUNDERSTORM WINDS','THUNDERSTORM WINDS.','THUNDERSTORM WINDS/ FLOOD','THUNDERSTORM WINDS/ HAIL','THUNDERSTORM WINDS/FLASH FLOOD','THUNDERSTORM WINDS/FLOODING','THUNDERSTORM WINDS/FUNNEL CLOU','THUNDERSTORM WINDS/HAIL','THUNDERSTORM WINDS/HEAVY RAIN','THUNDERSTORM WINDS53','THUNDERSTORM WINDSHAIL','THUNDERSTORM WINDSS','THUNDERSTORM WINS','THUNDERSTORM','THUNDERSTORMS WIND','THUNDERSTORMS WINDS','THUNDERSTORMS','THUNDERSTORMW 50','THUNDERSTORMW WINDS','THUNDERSTORMW','THUNDERSTORMWINDS','THUNDERSTROM WIND','THUNDERSTROM WINDS','THUNDERTORM WINDS','THUNDERTSORM WIND','THUNDESTORM WINDS','THUNERSTORM WINDS','TSTM WIND (G45)','TSTM WIND (41)','TSTM WIND (G35)','TSTM WIND (G40)','TSTM WIND (G45)','TSTM WIND 40','TSTM WIND 45','TSTM WIND 50','TSTM WIND 51','TSTM WIND 52','TSTM WIND 55','TSTM WIND 65)','TSTM WIND AND LIGHTNING','TSTM WIND DAMAGE','TSTM WIND G45','TSTM WIND G58','TSTM WIND','TSTM WIND/HAIL','TSTM WINDS','TSTM WND','TSTM','TSTMW','TUNDERSTORM WIND','WAKE LOW WIND','WALL CLOUD'),
Tornado = c('COLD AIR TORNADO','LANDSPOUT','TORNADO DEBRIS','TORNADO F0','TORNADO F1','TORNADO F2','TORNADO F3','TORNADO','TORNADO/WATERSPOUT','TORNADOES','"TORNADOES',' TSTM WIND',' HAIL"','TORNADOS','TORNDAO'),
`Tropical Depression` = c('TROPICAL DEPRESSION'),
`Tropical Storm` = c('TROPICAL STORM ALBERTO','TROPICAL STORM DEAN','TROPICAL STORM GORDON','TROPICAL STORM JERRY','TROPICAL STORM'),
Tsunami = c('TSUNAMI'),
`Volcanic Ash` = c('VOG','VOLCANIC ASH PLUME','VOLCANIC ASH','VOLCANIC ASHFALL','VOLCANIC ERUPTION'),
Waterspout = c('WATER SPOUT','WATERSPOUT FUNNEL CLOUD','WATERSPOUT TORNADO','WATERSPOUT','WATERSPOUT-','WATERSPOUT-TORNADO','WATERSPOUT/ TORNADO','WATERSPOUT/','WATERSPOUT/TORNADO','WATERSPOUTS','WAYTERSPOUT'),
Wildfire = c('BRUSH FIRE','BRUSH FIRES','FOREST FIRES','GRASS FIRES','RED FLAG FIRE WX','WILD FIRES','WILD/FOREST FIRE','WILD/FOREST FIRES','WILDFIRE','WILDFIRES'),
`Winter Storm` = c('HIGH WINDS/SNOW','SNOW- HIGH WIND- WIND CHILL','SNOW/ICE STORM','SNOWSTORM','THUNDERSNOW SHOWER','THUNDERSNOW','WINTER STORM HIGH WINDS','WINTER STORM','WINTER STORM/HIGH WIND','WINTER STORM/HIGH WINDS','WINTER STORMS'),
`Winter Weather` = c('BLOWING SNOW & EXTREME WIND CH','BLOWING SNOW','BLOWING SNOW- EXTREME WIND CHI','BLOWING SNOW/EXTREME WIND CHIL','COLD AND SNOW','COLD AND WET CONDITIONS','COOL AND WET','COOL SPELL ','EARLY SNOW','EARLY SNOWFALL','FALLING SNOW/ICE','FIRST FROST','FIRST SNOW','HEAVY SNOWPACK','HEAVY WET SNOW','ICE AND SNOW','ICE FLOES','ICE JAM','ICE ON ROAD','ICE PELLETS','ICE ROADS','ICE','ICE/SNOW','ICY ROADS','LATE FREEZE','LATE SEASON HAIL','LATE SEASON SNOW','LATE SEASON SNOWFALL','LATE SNOW','LIGHT FREEZING RAIN','LIGHT SNOW AND SLEET','LIGHT SNOW','LIGHT SNOW/FLURRIES','LIGHT SNOW/FREEZING PRECIP','LIGHT SNOWFALL','MODERATE SNOW','MODERATE SNOWFALL','SEASONAL SNOWFALL','SNOW SHOWERS','SNOW SLEET','SNOW','SNOW/ ICE','SNOW/BLOWING SNOW','SNOW/COLD','SNOW/FREEZING RAIN','SNOW/HEAVY SNOW','SNOW/HIGH WINDS','SNOW/ICE','SNOW/RAIN','SNOW/RAIN/SLEET','SNOW/SLEET','SNOW/SLEET/FREEZING RAIN','SNOW/SLEET/RAIN','SNOW\\COLD','UNUSUALLY LATE SNOW','WET SNOW','WINTER MIX','WINTER WEATHER MIX','WINTER WEATHER','WINTER WEATHER/MIX','WINTERY MIX','WINTRY MIX'),
Unknown = c('APACHE COUNTY','DROWNING','EXCESSIVE','HEAVY MIX','HIGH','MARINE ACCIDENT','MARINE MISHAP','"METRO STORM',' MAY 26"','MILD AND DRY PATTERN','MILD PATTERN','MILD/DRY PATTERN','MIXED PRECIPITATION','MIXED PRECIP','MONTHLY PRECIPITATION','MONTHLY RAINFALL','MONTHLY SNOWFALL','MONTHLY TEMPERATURE','NO SEVERE WEATHER','NONE','NORMAL PRECIPITATION','NORTHERN LIGHTS','OTHER','RECORD LOW','RED FLAG CRITERIA','SOUTHEAST','SUMMARY AUGUST 10','SUMMARY AUGUST 11','SUMMARY AUGUST 17','SUMMARY AUGUST 2-3','SUMMARY AUGUST 21','SUMMARY AUGUST 28','SUMMARY AUGUST 4','SUMMARY AUGUST 7','SUMMARY AUGUST 9','SUMMARY JAN 17','SUMMARY JULY 23-24','SUMMARY JUNE 18-19','SUMMARY JUNE 5-6','SUMMARY JUNE 6','SUMMARY OF APRIL 12','SUMMARY OF APRIL 13','SUMMARY OF APRIL 21','SUMMARY OF APRIL 27','SUMMARY OF APRIL 3RD','SUMMARY OF AUGUST 1','SUMMARY OF JULY 11','SUMMARY OF JULY 2','SUMMARY OF JULY 22','SUMMARY OF JULY 26','SUMMARY OF JULY 29','SUMMARY OF JULY 3','SUMMARY OF JUNE 10','SUMMARY OF JUNE 11','SUMMARY OF JUNE 12','SUMMARY OF JUNE 13','SUMMARY OF JUNE 15','SUMMARY OF JUNE 16','SUMMARY OF JUNE 18','SUMMARY OF JUNE 23','SUMMARY OF JUNE 24','SUMMARY OF JUNE 3','SUMMARY OF JUNE 30','SUMMARY OF JUNE 4','SUMMARY OF JUNE 6','SUMMARY OF MARCH 14','SUMMARY OF MARCH 23','SUMMARY OF MARCH 24','SUMMARY OF MARCH 24-25','SUMMARY OF MARCH 27','SUMMARY OF MARCH 29','SUMMARY OF MAY 10','SUMMARY OF MAY 13','SUMMARY OF MAY 14','SUMMARY OF MAY 22 AM','SUMMARY OF MAY 22 PM','SUMMARY OF MAY 22','SUMMARY OF MAY 26 AM','SUMMARY OF MAY 26 PM','SUMMARY OF MAY 31 AM','SUMMARY OF MAY 31 PM','SUMMARY OF MAY 9-10','SUMMARY SEPT. 25-26','SUMMARY SEPTEMBER 20','SUMMARY SEPTEMBER 23','SUMMARY SEPTEMBER 3','SUMMARY SEPTEMBER 4','SUMMARY: NOV. 16','SUMMARY: NOV. 6-7','SUMMARY: OCT. 20-21','SUMMARY: OCTOBER 31','SUMMARY: SEPT. 18','BEACH EROSIN','BEACH EROSION','COASTAL EROSION')
)
The mapping process involves finding which element of the list the
EVTYPE exists in, and using that element name as the
corrected value in a new Event column:
set_allowed_event_type <- function(observation) {
names(
which(
sapply(
event_types,
function(x) any(x == toupper(observation['EVTYPE']))
)
)
)[1]
}
dfEvents <- dfEvents %>%
mutate(Event = apply(dfEvents, 1, set_allowed_event_type)) %>%
relocate(Event, .after = EVTYPE)
dfEvents %>%
select(BGN_DATE, EVTYPE, Event) %>%
slice_head(n=10)
## # A tibble: 10 x 3
## BGN_DATE EVTYPE Event
## <date> <chr> <chr>
## 1 1996-01-06 WINTER STORM Winter Storm
## 2 1996-01-11 TORNADO Tornado
## 3 1996-01-11 TSTM WIND Thunderstorm Wind
## 4 1996-01-11 TSTM WIND Thunderstorm Wind
## 5 1996-01-11 TSTM WIND Thunderstorm Wind
## 6 1996-01-18 HIGH WIND High Wind
## 7 1996-01-19 TSTM WIND Thunderstorm Wind
## 8 1996-01-24 TSTM WIND Thunderstorm Wind
## 9 1996-01-24 TSTM WIND Thunderstorm Wind
## 10 1996-01-26 FLASH FLOOD Flash Flood
Note: in the case that an EVTYPE has been added more
than once to the event_types list, only the first result is
returned.
The official allowed event types can be somewhat arbitrary (what draws the line between a ‘Heat’ event and an ‘Excessive Heat’ event for example?), and this is really a weakness in the analysis. How the entries have been mapped has a direct impact on the final results.
Having said that, it is a simple task to revisit the mapping and rerun the analysis once that has been done if necessary.
In case of future analysis, we convert the list to a dataframe and push out to CSV:
max_length <- max(sapply(event_types, length))
event_types <- lapply(
event_types,function(x) {
ans <- rep(NA,length=max_length);
ans[1:length(x)]<- x;
return(ans)
}
)
event_types <- do.call(cbind,event_types)
if (!dir.exists('./data/processed')){
dir.create('./data/processed', recursive = T)
}
write.csv(event_types, './data/processed/map_evtypes.csv', row.names = F)
Since we are comparing costs across a lengthy period of time, we need to adjust values for inflation as the currency devalues with time. $1 Billion in 1996 was worth considerably more than the same value in 2011, at the end of the analysis period.
To get an approximation of inflation over the period, we can use the CPI values for start and end using the following formula:
(B - A)/A where A is the starting CPI and
B is the ending CPI
The average annual inflation rate is approximated by dividing this number by the number of years.
The US Bureau of Labor Statistics gives the CPI for 1996 as 66.18, and for 2012 as 96.87.
cpi1996 <- 66.18
cpi2012 <- 96.87
mean_inflation_rate <- (cpi2012-cpi1996)/cpi1996/(2012-1996)
mean_inflation_rate
## [1] 0.02898345
The average inflation rate over this period was 2.9%.
For each year in the period, we can calculate the equivalent 2012 value using the formula:
Value2012 = PV × (1 + i)^n where PV is the recorded
value and n is the years until 2012.
As an example, $1000 in 1996 is calculated as follows:
$1,000 × (1 + 0.029)^16 = $1,579.96
We create a row to indicate this n value for each entry and apply it later when we calculate the property and crop costs:
end_date <- ymd("2012-01-01")
dfEvents <- dfEvents %>%
mutate(inflation_period = as.numeric(interval(BGN_DATE, end_date) / years(1)))
The EXP columns have generated much discussion as once again, they do not follow allowed entry values. These should be one of B or b = Billion, M or m = Million, K or k = Thousand, H or h = Hundred.
Fortunately, since we are only approaching this after subsetting the data, the values remaining all fit within those allowed values:
unique(dfEvents$PROPDMGEXP)
## [1] "K" NA "M" "B"
unique(dfEvents$CROPDMGEXP)
## [1] "K" NA "M" "B"
For the sake of completeness, I have included and catered for all encountered values in the original dataset in the case that a newer set of data is imported in the future.
We use a function to apply to the dataset where we calculate either property or crop damage and apply the inflation adjustment at the same time:
calculate_cost <- function(observation, prefix, annual_inflation){
DMG=as.numeric(observation[prefix])
if (DMG <= 0 | is.na(DMG)) {
# skip if cost is zero
0
} else {
# read EXP value and apply relative factoring to cost
EXP <- paste0(prefix,'EXP')
exponent <- toupper(observation[EXP])
cost <- case_when(
exponent == 'H' ~ DMG * 1e2,
exponent == 'K' ~ DMG * 1e3,
exponent == 'M' ~ DMG * 1e6,
exponent == 'B' ~ DMG * 1e9,
is.numeric(exponent) ~ DMG * 10,
TRUE ~ DMG
)
# adjust for relevant years of inflation
round(cost * ((1 + annual_inflation) ** as.numeric(observation['inflation_period'])))
}
}
Now we use this function for the property and crop values:
# create calculated property damage cost column
dfEvents <- dfEvents %>%
mutate(
PROPERTY_DAMAGE = apply(
dfEvents,
1,
calculate_cost,
prefix='PROPDMG',
annual_inflation = mean_inflation_rate
)
)
# create calculated crop damage cost column
dfEvents <- dfEvents %>%
mutate(
CROP_DAMAGE = apply(
dfEvents,
1,
calculate_cost,
prefix='CROPDMG',
annual_inflation = mean_inflation_rate
)
)
dfEvents %>%
select(BGN_DATE, Event, PROPERTY_DAMAGE, CROP_DAMAGE) %>%
slice_head(n=10)
## # A tibble: 10 x 4
## BGN_DATE Event PROPERTY_DAMAGE CROP_DAMAGE
## <date> <chr> <dbl> <dbl>
## 1 1996-01-06 Winter Storm 599995 60000
## 2 1996-01-11 Tornado 157832 0
## 3 1996-01-11 Thunderstorm Wind 4735 0
## 4 1996-01-11 Thunderstorm Wind 7892 0
## 5 1996-01-11 Thunderstorm Wind 3157 0
## 6 1996-01-18 High Wind 630981 0
## 7 1996-01-19 Thunderstorm Wind 18928 0
## 8 1996-01-24 Thunderstorm Wind 12614 0
## 9 1996-01-24 Thunderstorm Wind 18921 0
## 10 1996-01-26 Flash Flood 118235 0
Looking at max values of each of the indicator columns, we can check for spurious values:
dfEvents %>% summarise(
max(FATALITIES),
max(INJURIES),
max(PROPERTY_DAMAGE)/1e9,
max(CROP_DAMAGE)/1e9
)
## # A tibble: 1 x 4
## `max(FATALITIES)` `max(INJURIES)` `max(PROPERTY_DAMAGE)/~ `max(CROP_DAMAGE)/1~
## <dbl> <dbl> <dbl> <dbl>
## 1 158 1150 137. 1.81
The value of $137 Billion property damage seems excessive.
Check the date of the event:
refnum <- dfEvents %>%
filter(PROPERTY_DAMAGE == max(PROPERTY_DAMAGE))
refnum <- as.numeric(refnum$REFNUM)
dfEvents$BGN_DATE[dfEvents$REFNUM == refnum]
## [1] "2006-01-01"
Checking the NOAA website for any events recorded with this value during this period shows no events even close.
We can look at the remarks for this event:
dfEvents$REMARKS[dfEvents$REFNUM == refnum]
## [1] "Major flooding continued into the early hours of January 1st, before the Napa River finally fell below flood stage and the water receeded. Flooding was severe in Downtown Napa from the Napa Creek and the City and Parks Department was hit with $6 million in damage alone. The City of Napa had 600 homes with moderate damage, 150 damaged businesses with costs of at least $70 million."
The discussion appears to be talking of values in the league of $100 Million rather than $100 Billion.
We can assume this was mis-keyed with a B instead of an
M.
Correcting this:
dfEvents$PROPDMGEXP[dfEvents$REFNUM == refnum] <- 'M'
dfEvents$PROPERTY_DAMAGE[dfEvents$REFNUM == refnum] <-
dfEvents$PROPERTY_DAMAGE[dfEvents$REFNUM == refnum] / 1000
dfEvents$PROPERTY_DAMAGE[dfEvents$REFNUM == refnum]
## [1] 136504884
The corrected value is now $137 Million.
# Recheck for spurious values:
dfEvents %>% summarise(
max(FATALITIES),
max(INJURIES),
max(PROPERTY_DAMAGE)/1e9,
max(CROP_DAMAGE)/1e9
)
## # A tibble: 1 x 4
## `max(FATALITIES)` `max(INJURIES)` `max(PROPERTY_DAMAGE)/~ `max(CROP_DAMAGE)/1~
## <dbl> <dbl> <dbl> <dbl>
## 1 158 1150 37.5 1.81
The maximum values now appear realistic.
For this analysis, we look at deaths and injuries attributed to each event type over the 16 year period:
human_damage <- dfEvents %>%
group_by(Event) %>%
summarise(
Deaths = sum(FATALITIES),
Injuries = sum(INJURIES),
Total = sum(Deaths + Injuries)
) %>%
arrange(desc(Total)) %>%
slice_head(n=10) %>%
mutate(Event = factor(Event, levels=unique(Event)))
human_damage %>%
kbl() %>%
kable_styling(bootstrap_options = c("condensed"))
| Event | Deaths | Injuries | Total |
|---|---|---|---|
| Tornado | 1511 | 20667 | 22178 |
| Excessive Heat | 2036 | 7683 | 9719 |
| Flood | 447 | 6838 | 7285 |
| Thunderstorm Wind | 379 | 5129 | 5508 |
| Lightning | 650 | 4140 | 4790 |
| Flash Flood | 887 | 1674 | 2561 |
| High Wind | 371 | 1503 | 1874 |
| Wildfire | 87 | 1458 | 1545 |
| Winter Storm | 191 | 1292 | 1483 |
| Hurricane (Typhoon) | 125 | 1328 | 1453 |
Ten most severe Storm Event Types ranked in order of greatest US human cost over 1996 - 2011 period
plot_ly(data=human_damage, x = ~Event, y = ~Deaths, type = 'bar', name = 'Deaths') %>%
add_trace(y = ~Injuries, name = 'Injuries') %>%
layout(title = list(text = paste0('Ten Highest Causes of Human Costs for Period 1996 to 2011 (inclusive)')),
xaxis = list(title = ""),
yaxis = list(title = 'Total Count'),
barmode = 'stack'
) %>%
config(displayModeBar = F) %>%
layout(xaxis=list(fixedrange=TRUE)) %>%
layout(yaxis=list(fixedrange=TRUE))
From this we can discern that the event type with the highest mortality rate is due to excessive heat, followed by tornadoes.
By far, the most injurious type are tornadoes.
For this analysis, we look at property and crop damage costs attributed to each event type over the 16 year period:
financial_damage <- dfEvents %>%
group_by(Event) %>%
summarise(
Property = round(sum(PROPERTY_DAMAGE) / 1e9, 2),
Crop = round(sum(CROP_DAMAGE) / 1e9, 2),
Total = sum(Property + Crop)
) %>%
arrange(desc(Total)) %>%
slice_head(n=10) %>%
mutate(Event = factor(Event, levels=unique(Event)))
financial_damage %>%
kbl() %>%
kable_styling(bootstrap_options = c("condensed"))
| Event | Property | Crop | Total |
|---|---|---|---|
| Hurricane (Typhoon) | 100.87 | 7.02 | 107.89 |
| Storm Surge/Tide | 56.88 | 0.00 | 56.88 |
| Flood | 36.16 | 6.07 | 42.23 |
| Tornado | 29.10 | 0.34 | 29.44 |
| Hail | 17.92 | 3.19 | 21.11 |
| Flash Flood | 18.98 | 1.64 | 20.62 |
| Drought | 1.37 | 17.82 | 19.19 |
| Thunderstorm Wind | 9.92 | 1.29 | 11.21 |
| Tropical Storm | 10.14 | 0.87 | 11.01 |
| Wildfire | 9.81 | 0.50 | 10.31 |
Ten most severe Storm Event Types ranked in order of greatest US
damage cost over 1996 - 2011 period.
Damage costs in USD Billions
adjusted to 2012 USD value.
plot_ly(data=financial_damage, x = ~Event, y = ~Property, type = 'bar', name = 'Property') %>%
add_trace(y = ~Crop, name = 'Crop') %>%
layout(title = list(text = paste0(
'Ten Highest Causes of Damage Costs for Period 1996 to 2011 (inclusive)',
'<br>',
'<sup>',
'(USD adjusted to 2012 US Dollar Value)',
'</sup>'
)
),
xaxis = list(title = ""),
yaxis = list(title = 'Total Damage Cost'),
barmode = 'stack'
) %>%
config(displayModeBar = F) %>%
layout(xaxis=list(fixedrange=TRUE)) %>%
layout(yaxis=list(fixedrange=TRUE))
The greatest damage to property comes from Hurricanes, however if we look at the three flood related categories combined (Storm Surge/Tide, Flood and Flash Flood), these add up to a greater cost.
The analysis is a product of the categorization of the allowed event types and perhaps these should in reality be grouped together. It should also be kept in mind that some of these flood events will be related to the hurricanes.
The greatest crop damage is due to drought.
Looking at the table of top ten events, we can see five of the first six events are all related to Hurricane Katrina:
# Consider property and crop damage as one
dfEvents <- dfEvents %>%
mutate(DAMAGE = PROPERTY_DAMAGE + CROP_DAMAGE)
#Top 10 financial events
top10_financial_events <- dfEvents %>%
select(BGN_DATE, EVTYPE, FATALITIES, INJURIES, DAMAGE) %>%
mutate(DAMAGE=round((DAMAGE/1e9),2)) %>%
arrange(desc(DAMAGE)) %>%
slice_head(n=10)
top10_financial_events %>%
kbl() %>%
kable_styling(bootstrap_options = c("condensed"))
| BGN_DATE | EVTYPE | FATALITIES | INJURIES | DAMAGE |
|---|---|---|---|---|
| 2005-08-29 | STORM SURGE | 0 | 0 | 37.52 |
| 2005-08-28 | HURRICANE/TYPHOON | 0 | 0 | 20.29 |
| 2005-08-29 | STORM SURGE | 0 | 0 | 13.50 |
| 2005-10-24 | HURRICANE/TYPHOON | 5 | 0 | 11.93 |
| 2005-08-29 | HURRICANE/TYPHOON | 15 | 104 | 8.86 |
| 2005-08-28 | HURRICANE/TYPHOON | 0 | 0 | 8.81 |
| 2004-08-13 | HURRICANE/TYPHOON | 7 | 780 | 7.05 |
| 2001-06-05 | TROPICAL STORM | 22 | 0 | 6.97 |
| 2004-09-04 | HURRICANE/TYPHOON | 0 | 0 | 6.07 |
| 1999-09-15 | HURRICANE | 0 | 0 | 4.97 |
Ten most severe storm events ranked in order of greatest US
damage cost over 1996 - 2011 period.
Damage costs in USD Billions
adjusted to 2012 USD value.
We can look at these by filtering on billion dollar events over the period 27th to 30th August 2005:
katrina <- dfEvents %>%
filter(BGN_DATE>='2005-08-27' & BGN_DATE<='2005-08-30' & DAMAGE>1e9) %>%
select(BGN_DATE, EVTYPE, DAMAGE, REFNUM, REMARKS)
katrina_sum <- sum(katrina$DAMAGE)/1e9
total_sum <- sum(dfEvents$DAMAGE)/1e9
# katrina effect
katrina_percent <- round(katrina_sum * 100 / total_sum)
$90 billion - total of Katrina Billion Dollar events
$354 billion
- total of all events over 1995-2011 period
-———–
25% -
contribution of Katrina events to total damage over the 1995-2011
period
Does the effect of this mega event skew the data for the rest of the period?
We can consider the data again with the Katrina events omitted:
top10_financial_events_no_katrina <- dfEvents %>%
filter(REFNUM %nin% katrina$REFNUM) %>%
group_by(Event) %>%
summarise(
Property = round(sum(PROPERTY_DAMAGE) / 1e9, 2),
Crop = round(sum(CROP_DAMAGE) / 1e9, 2),
Total = sum(Property + Crop)
) %>%
arrange(desc(Total)) %>%
slice_head(n=10) %>%
mutate(Event = factor(Event, levels=unique(Event)))
top10_financial_events_no_katrina %>%
kbl() %>%
kable_styling(bootstrap_options = c("condensed"))
| Event | Property | Crop | Total |
|---|---|---|---|
| Hurricane (Typhoon) | 63.51 | 5.21 | 68.72 |
| Flood | 36.16 | 6.07 | 42.23 |
| Tornado | 29.10 | 0.34 | 29.44 |
| Hail | 17.92 | 3.19 | 21.11 |
| Flash Flood | 18.98 | 1.64 | 20.62 |
| Drought | 1.37 | 17.82 | 19.19 |
| Thunderstorm Wind | 9.92 | 1.29 | 11.21 |
| Tropical Storm | 10.14 | 0.87 | 11.01 |
| Wildfire | 9.81 | 0.50 | 10.31 |
| High Wind | 6.63 | 0.89 | 7.52 |
Ten most severe storm events ranked in order of greatest US
damage cost over 1996 - 2011 period with Katrina events
omitted.
Damage costs in USD Billions adjusted to 2012 USD
value.
Cumulatively, hurricanes still are #1 cause, however flood and flash flood combined almost equal this value.
Conclusion, the Katrina event does not skew the data.
Are these floods linked to the hurricane events as happened with Katrina?
We can look at the top five contributors for each year:
top5_by_year_no_katrina <- dfEvents %>%
filter(REFNUM %nin% katrina$REFNUM) %>%
group_by(Year=year(BGN_DATE), Event) %>%
summarise(
Total = round(sum(DAMAGE) / 1e9, 2)
) %>%
arrange(desc(Total)) %>%
arrange(Year, desc(Total)) %>%
slice_max(order_by=Total, n=5) %>%
slice_head(n=5)
## `summarise()` has grouped output by 'Year'. You can override using the `.groups` argument.
p <- ggplot(top5_by_year_no_katrina, aes(x=Year, y=Total, color=Event))
p + geom_line(aes(group=Event)) +
theme(axis.text.x = element_text(angle=45, hjust = 1)) +
scale_x_continuous(breaks = unique(top5_by_year_no_katrina$Year)) +
scale_y_continuous(breaks = 0:ceiling(max(top5_by_year_no_katrina$Total))) +
xlab(NULL) +
ylab("Total Damage Cost") +
labs(
title = "Total Damage Costs for Top 5 Event Types in Each Year with Katrina data omitted",
subtitle = "(USD adjusted to 2012 US Dollar Value)",
color = "Event Type",
) +
theme_minimal()
The graph doesn’t show any strong relationship between hurricane peaks and flooding.
Other than the hurricane spikes in 2004/2005, no particular event dominates each year, rather it’s a cumulative effect over the years that dominates.
Conclusion:
Hurricanes and the various forms of flooding cause the greatest economic damage even when omitting the Hurricane Katrina mega event.