“every step deeper into the analysis takes a soul away… Teilhard de Chardin”

SYNOPSIS. (1) The U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database holds events from 1950 to November 2011. (2) Regarding personal harm, this paper treats loss of life as the most severe health-related consequence of any event in the database. (3) Old age and early childhood, though not “health conditions” in themselves, are determining factors in hot-weather events, the second leading cause of fatalities. (4) Tornadoes, whose fatalities might arguably be excluded from a direct “health condition” relationship because of their sudden, violent nature, are the leading cause of deaths in the database. (5) On the other hand, infrastructure and property damages/losses caused directly by weather events have been assessed by their pure economic impact. (6) The analysis seems to show either an increase in data recording from the 1990s on or an increase in heat-related events, as the bottom line of Fig. 3 shifts from tornadoes and surges to heat and hurricanes. (7) In general, despite some peaks, there seems to be an increase in heat, hurricane and flood events and a decrease in deaths from the deadliest event types towards the latest dates. (8) The system configuration for this April 2017 paper was: Ubuntu 16.04 LTS, RStudio Version 1.0.136 – © 2009-2016 RStudio, Inc. under R version 3.3.3 (2017-03-06) – “Another Canoe”.

1. DATA PROCESSING

fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(fileUrl, destfile="./repdata_data_StormData.csv.bz2")
stormdat <- read.csv("repdata_data_StormData.csv.bz2")
library(dplyr)
library(lubridate)
library(ggplot2)
str(stormdat)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : Factor w/ 16335 levels "10/10/1954 0:00:00",..: 6523 6523 4213 11116 1426 1426 1462 2873 3980 3980 ...
##  $ BGN_TIME  : Factor w/ 3608 levels "000","0000","00:00:00 AM",..: 212 257 2645 1563 2524 3126 122 1563 3126 3126 ...
##  $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "?","ABNORMALLY DRY",..: 830 830 830 830 830 830 830 830 830 830 ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : Factor w/ 35 levels "","E","Eas","EE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_LOCATI: Factor w/ 54429 levels "","?","(01R)AFB GNRY RNG AL",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_DATE  : Factor w/ 6663 levels "","10/10/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_TIME  : Factor w/ 3647 levels "","?","0000",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_LOCATI: Factor w/ 34506 levels "","(0E4)PAYSON ARPT",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ WFO       : Factor w/ 542 levels "","2","43","9V9",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ZONENAMES : Factor w/ 25112 levels "","                                                                                                                               "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : Factor w/ 436781 levels ""," ","  ","   ",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
stormdat <- rename(stormdat, event = EVTYPE, state = STATE, 
                        bgn_date = BGN_DATE, end_date = END_DATE,
                        cdmg = CROPDMG, pdmg = PROPDMG,
                        cropdmgexp = CROPDMGEXP, propdmgexp = PROPDMGEXP,
                        fatalities = FATALITIES)
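
The download at the top of this section runs every time the script is executed. A minimal sketch, assuming one prefers to fetch the file only when no local copy exists: the helper name `fetch_once` and its `downloader` argument are hypothetical and not part of the original analysis.

```r
# Hypothetical helper (not part of the original analysis): download a file
# only when no local copy exists yet. `downloader` defaults to base R's
# download.file and can be replaced by a stub when working offline.
fetch_once <- function(url, destfile, downloader = utils::download.file) {
  if (!file.exists(destfile)) {
    downloader(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Usage with the URL from this section (commented out so the sketch runs offline):
# fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# fetch_once(fileUrl, "repdata_data_StormData.csv.bz2")
# stormdat <- read.csv("repdata_data_StormData.csv.bz2")
```

The guard only checks that the file exists; it does not verify that an earlier download completed correctly.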

1.1 Personal Issues processing

  • 1.1.1 The original dataset (stormdat) is subset into the data frames stormdat.f and stormdat.f.desc, keeping the columns event, state, bgn_date, end_date and fatalities, filtering on the criterion fatalities > 0 and sorting in descending order of fatalities…
stormdat.f <- filter(stormdat, fatalities > 0)
stormdat.f.desc <- select(stormdat.f, event, state, bgn_date, 
                                      end_date, fatalities) %>% 
                   arrange(desc(fatalities))

… and a look into first and last rows of resulting stormdat.f.desc dataframe…

head(stormdat.f.desc)
##            event state          bgn_date          end_date fatalities
## 1           HEAT    IL 7/12/1995 0:00:00 7/16/1995 0:00:00        583
## 2        TORNADO    MO 5/22/2011 0:00:00 5/22/2011 0:00:00        158
## 3        TORNADO    MI  6/8/1953 0:00:00                          116
## 4        TORNADO    TX 5/11/1953 0:00:00                          114
## 5 EXCESSIVE HEAT    IL 7/28/1999 0:00:00 7/31/1999 0:00:00         99
## 6        TORNADO    MA  6/9/1953 0:00:00                           90
tail(stormdat.f.desc) 
##              event state           bgn_date           end_date fatalities
## 6969     HIGH WIND    CA 11/30/2011 0:00:00 11/30/2011 0:00:00          1
## 6970   RIP CURRENT    FL  11/9/2011 0:00:00  11/9/2011 0:00:00          1
## 6971 COASTAL FLOOD    AK  11/8/2011 0:00:00 11/10/2011 0:00:00          1
## 6972   STRONG WIND    FL 11/28/2011 0:00:00 11/28/2011 0:00:00          1
## 6973     HIGH WIND    WA 11/22/2011 0:00:00 11/22/2011 0:00:00          1
## 6974     AVALANCHE    UT 11/13/2011 0:00:00 11/13/2011 0:00:00          1

… put us on the track of the types of events (as indicated in the event variable) that are most harmful to population health across the United States, narrowing the scope to 6974 observations (as shown in the last row of the output above).

  • 1.1.2 Grouping and summarizing the filtered data frame results in dplyr’s tibble object summarizedby.deaths
groupedby.evt <- group_by(stormdat.f.desc, event)
summarizedby.deaths <- summarize(groupedby.evt, fatalities = sum(fatalities))
summarizedby.deaths
## # A tibble: 168 × 2
##               event fatalities
##              <fctr>      <dbl>
## 1          AVALANCE          1
## 2         AVALANCHE        224
## 3         BLACK ICE          1
## 4          BLIZZARD        101
## 5      blowing snow          1
## 6      BLOWING SNOW          1
## 7     COASTAL FLOOD          3
## 8  Coastal Flooding          2
## 9  COASTAL FLOODING          1
## 10     COASTALSTORM          1
## # ... with 158 more rows
  • 1.1.3 …which, once converted back into the data.frame object by.deaths and sorted in descending order into by.deaths.desc
by.deaths <- as.data.frame(summarizedby.deaths)
by.deaths.desc <- arrange(by.deaths, desc(fatalities))

… can be observed in its first and last six rows…

head(by.deaths.desc)
##            event fatalities
## 1        TORNADO       5633
## 2 EXCESSIVE HEAT       1903
## 3    FLASH FLOOD        978
## 4           HEAT        937
## 5      LIGHTNING        816
## 6      TSTM WIND        504
tail(by.deaths.desc)
##                              event fatalities
## 163 URBAN AND SMALL STREAM FLOODIN          1
## 164                      Whirlwind          1
## 165                          WINDS          1
## 166                     WIND STORM          1
## 167        WINTER STORM HIGH WINDS          1
## 168                     WINTRY MIX          1

… narrowing the observations to 168. Entering the command summarizedby.deaths %>% tbl_df %>% print(n=168) shows this tibble in its entirety; adding by hand the fatalities in the rows whose event column contains the words “TORNADO” and “HEAT”, respectively, should give the totals (tornadoes = 5661, heat events = 3138). Notes: 1) See the RESULTS section for an automated procedure for these totals. 2) We kept the different descriptions for the same type of event (e.g. HEAT vs. EXTREME HEAT; TORNADO vs. WATERSPOUT/TORNADO). We found that diversity useful for illustrating the plots captioned Fig. 1 and Fig. 2 further on.
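
The hand addition described above can be sketched in a few lines of base R. This is an illustration on toy data, not necessarily the procedure used in the RESULTS section; the data frame `toy` and the helper `total_for` are hypothetical names, and the fatality counts are made up. Note that `ignore.case = TRUE` also matches mixed-case labels, which an exact-case pattern would miss.

```r
# Toy stand-in for by.deaths.desc (values are illustrative, not NOAA data).
toy <- data.frame(
  event = c("TORNADO", "WATERSPOUT/TORNADO", "HEAT", "Extreme heat", "FLOOD"),
  fatalities = c(10, 2, 5, 3, 7),
  stringsAsFactors = FALSE
)

# Sum fatalities over every event label that contains `pattern`,
# ignoring letter case.
total_for <- function(df, pattern) {
  sum(df$fatalities[grepl(pattern, df$event, ignore.case = TRUE)])
}

total_for(toy, "tornado")  # 12
total_for(toy, "heat")     # 8
```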

  • 1.1.4 Preprocessing for total fatalities by heat events and tornadoes, and the respective plots Fig. 1 and Fig. 2.

Objects fatsby.heat and fatsby.tndo filter the data frame stormdat.f.desc by heat events and by tornado events, respectively. Objects fatsby.heat.arr and fatsby.tndo.arr first add an initdate column that converts bgn_date from factor to character, then parse initdate into lubridate’s date-time column initdate2. Objects fatsby.heat2 and fatsby.tndo2 drop the unneeded columns. Objects toplot.heat and toplot.tndo (used in the plots further on) group the data by date and type of event.

fatsby.heat <- stormdat.f.desc[grepl("HEAT|heat",stormdat.f.desc$event),]
fatsby.tndo <- stormdat.f.desc[grepl("TORNADO|tornado",stormdat.f.desc$event),]

fatsby.heat.arr <- mutate(fatsby.heat, initdate = as.character(bgn_date))
fatsby.heat.arr <- mutate(fatsby.heat.arr, initdate2 = mdy_hms(initdate))
fatsby.heat2 <- select(fatsby.heat.arr, event, initdate2, fatalities)
toplot.heat <- group_by(fatsby.heat2, initdate2, event)

fatsby.tndo.arr <- mutate(fatsby.tndo, initdate = as.character(bgn_date))
fatsby.tndo.arr <- mutate(fatsby.tndo.arr, initdate2 = mdy_hms(initdate))
fatsby.tndo2 <- select(fatsby.tndo.arr, event, initdate2, fatalities)
toplot.tndo <- group_by(fatsby.tndo2, initdate2, event)
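
The factor-to-date conversion above can be illustrated on two values taken from the head() output of section 1.1.1. This is a base-R sketch: `as.POSIXct` with an explicit format stands in for lubridate's `mdy_hms`, and the UTC time zone is an assumption chosen to match that function's default.

```r
# Toy factor mimicking the BGN_DATE format shown in str(stormdat).
bgn_date <- factor(c("7/12/1995 0:00:00", "5/22/2011 0:00:00"))

# Step 1: factor -> character (as.character on a factor returns its labels,
# not the underlying integer codes).
initdate <- as.character(bgn_date)

# Step 2: character -> date-time, with an explicit format and time zone.
initdate2 <- as.POSIXct(initdate, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

format(initdate2, "%Y-%m-%d")  # "1995-07-12" "2011-05-22"
```

Converting through character first matters because calling a numeric conversion directly on a factor would return its level codes rather than the dates.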

1.2 Property/Infrastructure Issues processing

  • 1.2.1 A first subsetting of the original dataset (stormdat) into the property and crop damages data frame stormdat.dinPC, in order to keep only the indicators for billions, millions and thousands (“kilos”)…
stormdat.dinPC <- filter(stormdat, pdmg > 0 & pdmg < 10000 |
                                   cdmg > 0 & cdmg < 10000)

# Keep rows whose property or crop indicator is B, M or K (either case).
stormdat.dinPC <- filter(stormdat.dinPC,
                         propdmgexp %in% c("B", "b", "M", "m", "K", "k") |
                         cropdmgexp %in% c("B", "b", "M", "m", "K", "k"))

… and formatting of the beginning dates of the events, as well as renaming the damage indicators (pplyr for the property indicator, cplyr for the crop indicator). Also, columns dlspr (dollars in property damages), dlscr (dollars in crop damages) and dlstot (the sum of both) were created and initialized to 0. Finally, the columns were arranged in a more readable order.

stormdat.dinPC <- mutate(stormdat.dinPC, initdate = as.character(bgn_date))
stormdat.dinPC <- mutate(stormdat.dinPC, bgndate = mdy_hms(initdate))

stormdat.dinPC <- rename(stormdat.dinPC, pplyr = propdmgexp, cplyr = cropdmgexp )

stormdat.dinPC <- mutate(stormdat.dinPC, dlspr = 0, dlscr = 0, dlstot = 0 )
stormdat.dinPC <- select(stormdat.dinPC, event, state, bgndate, 
                                         pdmg, pplyr, dlspr, cdmg, cplyr, dlscr, dlstot)
  • 1.2.2 Further processing, to separate the BILLIONS, was done as follows. A first filtering on indicators “B” or “b”, followed by multiplying columns pdmg and cdmg by 1 billion, fills the respective dlspr and dlscr columns of data frames stormdat.dinPC.Bprop and stormdat.dinPC.Bcrop. After that, those two sets were bound into data frame Bbind and put in descending order. Finally, the first and last three rows are displayed…
stormdat.dinPC.Bprop <- filter(stormdat.dinPC, (pplyr == "B" | pplyr == "b" ))
stormdat.dinPC.Bprop <- mutate(stormdat.dinPC.Bprop, dlspr = 1000000000 * pdmg)

stormdat.dinPC.Bcrop <- filter(stormdat.dinPC, (cplyr == "B" | cplyr == "b" ))
stormdat.dinPC.Bcrop <- mutate(stormdat.dinPC.Bcrop, dlscr = 1000000000 * cdmg)

Bbind <- rbind(stormdat.dinPC.Bprop, stormdat.dinPC.Bcrop)
Bbind <- mutate(Bbind, dlstot = dlspr + dlscr)
Bbind <- arrange(Bbind, desc(dlstot))
head(Bbind, 3)
##               event state    bgndate   pdmg pplyr     dlspr cdmg cplyr
## 1             FLOOD    CA 2006-01-01 115.00     B 1.150e+11 32.5     M
## 2       STORM SURGE    LA 2005-08-29  31.30     B 3.130e+10  0.0      
## 3 HURRICANE/TYPHOON    LA 2005-08-28  16.93     B 1.693e+10  0.0      
##   dlscr    dlstot
## 1     0 1.150e+11
## 2     0 3.130e+10
## 3     0 1.693e+10
tail(Bbind, 3)
##                        event state    bgndate pdmg pplyr dlspr cdmg cplyr
## 45                      HEAT    AL 1995-08-20  0.0       0e+00  0.4     B
## 46                    FREEZE    IA 1995-09-21  0.0       0e+00  0.2     B
## 47 HURRICANE OPAL/HIGH WINDS    AL 1995-10-04  0.1     B 1e+08 10.0     M
##    dlscr dlstot
## 45 4e+08  4e+08
## 46 2e+08  2e+08
## 47 0e+00  1e+08

… and a similar procedure was followed for MILLIONS, changing the names of the data frames, the indicators and the multiplier (1,000,000) accordingly, with rows whose total is 0 filtered out. The first and last three rows are displayed in the same way…

stormdat.dinPC.Mprop <- filter(stormdat.dinPC, (pplyr == "M" | pplyr == "m" ))
stormdat.dinPC.Mprop <- mutate(stormdat.dinPC.Mprop, dlspr = 1000000 * pdmg)

stormdat.dinPC.Mcrop <- filter(stormdat.dinPC, (cplyr == "M" | cplyr == "m" ))
stormdat.dinPC.Mcrop <- mutate(stormdat.dinPC.Mcrop, dlscr = 1000000 * cdmg)

Mbind <- rbind(stormdat.dinPC.Mprop, stormdat.dinPC.Mcrop)
Mbind <- mutate(Mbind, dlstot = dlspr + dlscr)
Mbind <- filter(Mbind, dlstot > 0)
Mbind <- arrange(Mbind, desc(dlstot))
head(Mbind, 3)
##       event state    bgndate   pdmg pplyr     dlspr cdmg cplyr dlscr
## 1 HIGH WIND    FL 2004-08-13 929.00     M 929000000  175     M     0
## 2      HAIL    AZ 2010-10-05 900.00     M 900000000    0     K     0
## 3 HURRICANE    NC 1996-09-04 792.15     M 792150000    0           0
##      dlstot
## 1 929000000
## 2 900000000
## 3 792150000
tail(Mbind, 3)
##                  event state    bgndate pdmg pplyr dlspr cdmg cplyr dlscr
## 13243 WILD/FOREST FIRE    SD 1994-08-09 3.00     K     0 0.01     M 10000
## 13244       HEAVY SNOW    SD 1995-03-26 0.01     M     0 0.01     M 10000
## 13245       HEAVY SNOW    SD 1995-04-26 0.01     M     0 0.01     M 10000
##       dlstot
## 13243  10000
## 13244  10000
## 13245  10000

… finally, the same procedure was followed for the thousands (“kilos”), up to displaying the first and last three rows.

stormdat.dinPC.Kprop <- filter(stormdat.dinPC, (pplyr == "K" | pplyr == "k" ))
stormdat.dinPC.Kprop <- mutate(stormdat.dinPC.Kprop, dlspr = 1000 * pdmg)

stormdat.dinPC.Kcrop <- filter(stormdat.dinPC, (cplyr == "K" | cplyr == "k" ))
stormdat.dinPC.Kcrop <- mutate(stormdat.dinPC.Kcrop, dlscr = 1000 * cdmg)

Kbind <- rbind(stormdat.dinPC.Kprop, stormdat.dinPC.Kcrop)
Kbind <- mutate(Kbind, dlstot = dlspr + dlscr)
Kbind <- arrange(Kbind, desc(dlstot))
head(Kbind, 3)
##               event state    bgndate pdmg pplyr dlspr cdmg cplyr dlscr
## 1 THUNDERSTORM WIND    NC 2009-07-26 5000     K 5e+06    0     K     0
## 2       FLASH FLOOD    IL 2010-05-13 5000     K 5e+06    0     K     0
## 3       FLASH FLOOD    IL 2010-05-13 5000     K 5e+06    0     K     0
##   dlstot
## 1  5e+06
## 2  5e+06
## 3  5e+06
tail(Kbind, 3)
##              event state    bgndate pdmg pplyr dlspr cdmg cplyr dlscr
## 327036 STRONG WIND    KY 2011-11-13  1.0     K     0    0     K     0
## 327037     DROUGHT    TX 2011-11-01  2.0     K     0    0     K     0
## 327038   HIGH WIND    CA 2011-11-30  7.5     K     0    0     K     0
##        dlstot
## 327036      0
## 327037      0
## 327038      0
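
The indicator-to-dollars conversion of section 1.2.2 can also be written as a single lookup instead of three separate passes. The following is an alternative sketch on toy data, not the procedure that produced the outputs above: the names `toy`, `mult` and `to_multiplier` are hypothetical. Because it keeps one row per event and converts both columns at once, its per-row totals are not identical to those of the bound data frames above, where a row with a “B” property indicator and an “M” crop indicator appears once per indicator.

```r
# Toy rows with the columns used above (values are illustrative only).
toy <- data.frame(
  pdmg  = c(115, 929, 5000, 0),
  pplyr = c("B", "M", "K", ""),
  cdmg  = c(32.5, 175, 0, 0.4),
  cplyr = c("M", "m", "K", "B"),
  stringsAsFactors = FALSE
)

# Named lookup from magnitude indicator to dollar multiplier.
mult <- c(K = 1e3, M = 1e6, B = 1e9)

# Translate an indicator vector into multipliers; anything that is not
# K, M or B (in either case) contributes a multiplier of 0.
to_multiplier <- function(ind) {
  m <- mult[toupper(as.character(ind))]
  m[is.na(m)] <- 0
  unname(m)
}

toy$dlspr  <- toy$pdmg * to_multiplier(toy$pplyr)
toy$dlscr  <- toy$cdmg * to_multiplier(toy$cplyr)
toy$dlstot <- toy$dlspr + toy$dlscr
toy
```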
  • 1.2.3 Binding of the three previous data frames into the BMKbind data set, and further analysis. In order to narrow our results, we keep only rows of the bound data frame whose total exceeds the average of the median and mean prices of new homes in 1983 ($75,300 and $89,800 respectively) plus $300,000, i.e. $382,550. The prices were obtained from the Median and Average Sales Prices of New Homes Sold in United States table on the US Government census page.
BMKbind <- rbind(Bbind, Mbind, Kbind)
medianUSAHomePrice1983 <- 75300
meanUSAHomePrice1983 <- 89800
avgPrice <- (medianUSAHomePrice1983 + meanUSAHomePrice1983) / 2
BMKbind <- filter(BMKbind, dlstot > avgPrice + 300000)
head(BMKbind)
##               event state    bgndate   pdmg pplyr     dlspr cdmg cplyr
## 1             FLOOD    CA 2006-01-01 115.00     B 1.150e+11 32.5     M
## 2       STORM SURGE    LA 2005-08-29  31.30     B 3.130e+10  0.0      
## 3 HURRICANE/TYPHOON    LA 2005-08-28  16.93     B 1.693e+10  0.0      
## 4       STORM SURGE    MS 2005-08-29  11.26     B 1.126e+10  0.0      
## 5 HURRICANE/TYPHOON    FL 2005-10-24  10.00     B 1.000e+10  0.0      
## 6 HURRICANE/TYPHOON    MS 2005-08-28   7.35     B 7.350e+09  0.0      
##   dlscr    dlstot
## 1     0 1.150e+11
## 2     0 3.130e+10
## 3     0 1.693e+10
## 4     0 1.126e+10
## 5     0 1.000e+10
## 6     0 7.350e+09
tail(BMKbind)
##              event state    bgndate pdmg pplyr  dlspr cdmg cplyr  dlscr
## 20522   HEAVY RAIN    OK 2007-06-01    0     K      0  387     K 387000
## 20523 WINTER STORM    VT 1997-03-05  385     K 385000    0            0
## 20524  FLASH FLOOD    KY 2001-08-04  385     K 385000    0     K      0
## 20525    HIGH WIND    NC 2004-03-07  385     K 385000    0            0
## 20526  FLASH FLOOD    PR 2010-05-31  385     K 385000    0     K      0
## 20527         HAIL    GA 2011-05-26  385     K 385000    0     K      0
##       dlstot
## 20522 387000
## 20523 385000
## 20524 385000
## 20525 385000
## 20526 385000
## 20527 385000
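
The cutoff applied by the filter in this step follows directly from the code: the average of the two 1983 prices plus the 300000 offset. A short worked sketch (the variable `threshold` and the toy vector are illustrative names and values) shows why the smallest totals in the tail above are 385000.

```r
medianUSAHomePrice1983 <- 75300
meanUSAHomePrice1983   <- 89800
avgPrice  <- (medianUSAHomePrice1983 + meanUSAHomePrice1983) / 2  # 82550
threshold <- avgPrice + 300000                                    # 382550

# Toy totals: only values strictly above the threshold are kept.
dlstot <- c(385000, 382550, 100000)
dlstot[dlstot > threshold]  # 385000
```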

In order to group and summarize the filtered set, a table of the events is extracted from the first 1000 rows of the data frame

table(droplevels.factor(head(BMKbind, 1000)$event))
## 
##                   BLIZZARD              COASTAL FLOOD 
##                         10                          1 
##           COASTAL FLOODING    COLD AND WET CONDITIONS 
##                          1                          1 
##            Damaging Freeze            DAMAGING FREEZE 
##                          1                          1 
##                    DROUGHT                Early Frost 
##                         81                          1 
##             EXCESSIVE HEAT          EXCESSIVE WETNESS 
##                          1                          1 
##               EXTREME COLD                FLASH FLOOD 
##                          7                        107 
##          FLASH FLOOD/FLOOD             FLASH FLOODING 
##                          5                          4 
##                      FLOOD          FLOOD/FLASH FLOOD 
##                        152                          4 
##                   FLOODING           FLOOD/RAIN/WINDS 
##                          2                          2 
##                     FREEZE                      FROST 
##                          5                          1 
##               FROST/FREEZE                       HAIL 
##                          9                         96 
##                  HAILSTORM                       HEAT 
##                          1                          1 
##                 HEAVY RAIN                HEAVY RAINS 
##                         10                          1 
##  HEAVY RAIN/SEVERE WEATHER                 HEAVY SNOW 
##                          1                         10 
##                  HIGH WIND                 HIGH WINDS 
##                         17                          2 
##            HIGH WINDS/COLD                  HURRICANE 
##                          2                         52 
##            HURRICANE EMILY             HURRICANE ERIN 
##                          1                          2 
##             HURRICANE OPAL  HURRICANE OPAL/HIGH WINDS 
##                          3                          1 
##          HURRICANE/TYPHOON                  ICE STORM 
##                         36                         30 
##                  LANDSLIDE                MAJOR FLOOD 
##                          2                          2 
##                RECORD COLD                RIVER FLOOD 
##                          1                          3 
##             River Flooding        SEVERE THUNDERSTORM 
##                          1                          1 
##                STORM SURGE           STORM SURGE/TIDE 
##                          6                          4 
##                STRONG WIND          THUNDERSTORM WIND 
##                          2                         10 
##         THUNDERSTORM WINDS                    TORNADO 
##                         18                        193 
## TORNADOES, TSTM WIND, HAIL             TROPICAL STORM 
##                          1                         25 
##                  TSTM WIND                    TSUNAMI 
##                         20                          1 
##                    TYPHOON         WATERSPOUT/TORNADO 
##                          2                          1 
##                   WILDFIRE                  WILDFIRES 
##                         24                          2 
##                 WILD FIRES           WILD/FOREST FIRE 
##                          1                          9 
##               WINTER STORM    WINTER STORM HIGH WINDS 
##                          7                          1

… putting us on the track of the types of events (as indicated by the event variable) that have had the greatest impact on properties/infrastructure and crops, and allowing us to determine the criteria for grouping and summarizing.
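
The frequency table above is printed in alphabetical order of the event labels. A small base-R sketch, on a toy factor rather than the real data, shows one way to list the same kind of counts from most to least frequent, which can make the dominant event types easier to spot.

```r
# Toy factor with an unused level, standing in for head(BMKbind, 1000)$event.
event <- factor(c("FLOOD", "TORNADO", "FLOOD", "HAIL", "FLOOD", "TORNADO"),
                levels = c("FLOOD", "HAIL", "TORNADO", "TSUNAMI"))

# Drop unused levels, count, and order from most to least frequent.
counts <- sort(table(droplevels(event)), decreasing = TRUE)
names(counts)       # "FLOOD" "TORNADO" "HAIL"
as.integer(counts)  # 3 2 1
```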

  • 1.2.4 After slimming down the consolidated BMKbind data frame (leaving columns event, state, bgndate and dlstot)…
BMKbind <- select(BMKbind, event, state, bgndate, dlstot)
  • 1.2.4 […continued. The procedure described below for events of type “flood” is repeated for events of types “hurricane”, “tornado”, etc.] First, the dmgsby.flood subset is populated by a grepl call that searches the event column for the character strings “FLOOD” or “flood”. After obtaining the subset, the event column is overwritten with the single string “FLOOD” and the column dyear is created and filled with the four-digit year taken from the date column bgndate. The next step creates toplot.flood as a copy of that subset grouped by year and event name. Finally, the first ten rows of the data frame (which will later be bound with the others for plot Fig. 3) are displayed, including its dimensions and attributes.
dmgsby.flood <- BMKbind[grepl("FLOOD|flood", BMKbind$event),]
dmgsby.flood <- mutate(dmgsby.flood, event = "FLOOD", dyear = year(bgndate))
toplot.flood <- group_by(dmgsby.flood, dyear, event)
toplot.flood
## Source: local data frame [6,154 x 5]
## Groups: dyear, event [19]
## 
##    event  state    bgndate   dlstot dyear
##    <chr> <fctr>     <dttm>    <dbl> <dbl>
## 1  FLOOD     CA 2006-01-01 1.15e+11  2006
## 2  FLOOD     IL 1993-08-31 5.00e+09  1993
## 3  FLOOD     IL 1993-08-31 5.00e+09  1993
## 4  FLOOD     ND 1997-04-18 3.00e+09  1997
## 5  FLOOD     TN 2011-05-01 2.00e+09  2011
## 6  FLOOD     TN 2010-05-01 1.50e+09  2010
## 7  FLOOD     AL 2003-05-07 1.00e+09  2003
## 8  FLOOD     MS 2011-05-01 1.00e+09  2011
## 9  FLOOD     IA 2008-06-01 7.50e+08  2008
## 10 FLOOD     NV 1997-01-01 6.40e+08  1997
## # ... with 6,144 more rows
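
Since the flood step above is repeated almost verbatim for each event family that follows, it could be wrapped in a small helper. This is a base-R sketch under stated assumptions, not the code used in this report: `subset_event` and the toy data frame are hypothetical, and the grouping step is left out. It also shows that an exact-case pattern such as "FLOOD|flood" does not match a mixed-case label such as "River Flooding".

```r
# Hypothetical helper: keep rows whose event label matches `pattern`,
# overwrite the label with a single `label`, and add the event year.
subset_event <- function(df, pattern, label) {
  out <- df[grepl(pattern, df$event), , drop = FALSE]
  out$event <- rep(label, nrow(out))
  out$dyear <- as.integer(format(out$bgndate, "%Y"))
  out
}

# Toy stand-in for BMKbind (values are illustrative only).
toy <- data.frame(
  event   = c("FLASH FLOOD", "River Flooding", "TORNADO", "flood"),
  bgndate = as.POSIXct(c("2006-01-01", "1993-08-31", "2011-05-22", "1997-04-18"),
                       tz = "UTC"),
  dlstot  = c(5e9, 1e6, 2.8e9, 3e9),
  stringsAsFactors = FALSE
)

# "River Flooding" is skipped: neither "FLOOD" nor "flood" occurs in it.
floods <- subset_event(toy, "FLOOD|flood", "FLOOD")
floods$dyear  # 2006 1997
```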
dmgsby.hurricane <- BMKbind[grepl("HURRICANE|TYPHOON", BMKbind$event),]
dmgsby.hurricane <- mutate(dmgsby.hurricane,
                           event = "HURRICANE|TYPHOON", dyear = year(bgndate))
toplot.hurricane <- group_by(dmgsby.hurricane, dyear, event)
toplot.hurricane
## Source: local data frame [249 x 5]
## Groups: dyear, event [16]
## 
##                event  state    bgndate    dlstot dyear
##                <chr> <fctr>     <dttm>     <dbl> <dbl>
## 1  HURRICANE|TYPHOON     LA 2005-08-28 1.693e+10  2005
## 2  HURRICANE|TYPHOON     FL 2005-10-24 1.000e+10  2005
## 3  HURRICANE|TYPHOON     MS 2005-08-28 7.350e+09  2005
## 4  HURRICANE|TYPHOON     MS 2005-08-29 5.880e+09  2005
## 5  HURRICANE|TYPHOON     FL 2004-08-13 5.420e+09  2004
## 6  HURRICANE|TYPHOON     FL 2004-09-04 4.830e+09  2004
## 7  HURRICANE|TYPHOON     FL 2004-09-13 4.000e+09  2004
## 8  HURRICANE|TYPHOON     LA 2005-09-23 4.000e+09  2005
## 9  HURRICANE|TYPHOON     NC 1999-09-15 3.000e+09  1999
## 10 HURRICANE|TYPHOON     AL 2004-09-13 2.500e+09  2004
## # ... with 239 more rows
dmgsby.tropical <- BMKbind[grepl("TROPICAL STORM", BMKbind$event),]
dmgsby.tropical <- mutate(dmgsby.tropical,
                          event = "TROPICAL STORM", dyear = year(bgndate))
toplot.tropical <- group_by(dmgsby.tropical , dyear, event)
toplot.tropical
## Source: local data frame [194 x 5]
## Groups: dyear, event [17]
## 
##             event  state    bgndate     dlstot dyear
##             <chr> <fctr>     <dttm>      <dbl> <dbl>
## 1  TROPICAL STORM     TX 2001-06-05 5150000000  2001
## 2  TROPICAL STORM     MD 2003-09-18  530470000  2003
## 3  TROPICAL STORM     TX 1998-09-07  287180000  1998
## 4  TROPICAL STORM     AZ 1997-09-25  200000000  1997
## 5  TROPICAL STORM     FL 2004-09-05  179400000  2004
## 6  TROPICAL STORM     FL 2004-09-25  134800000  2004
## 7  TROPICAL STORM     DC 2003-09-18  125000000  2003
## 8  TROPICAL STORM     LA 2002-09-25  108630000  2002
## 9  TROPICAL STORM     PR 2004-09-14  101500000  2004
## 10 TROPICAL STORM     FL 2005-08-27  100000000  2005
## # ... with 184 more rows
dmgsby.hail <- BMKbind[grepl("HAIL", BMKbind$event),]
dmgsby.hail <- mutate(dmgsby.hail,
                          event = "HAIL", dyear = year(bgndate))
toplot.hail <- group_by(dmgsby.hail , dyear, event)
toplot.hail
## Source: local data frame [2,302 x 5]
## Groups: dyear, event [19]
## 
##    event  state    bgndate  dlstot dyear
##    <chr> <fctr>     <dttm>   <dbl> <dbl>
## 1   HAIL     AZ 2010-10-05 1.8e+09  2010
## 2   HAIL     FL 1993-03-12 1.6e+09  1993
## 3   HAIL     AZ 2010-10-05 9.0e+08  2010
## 4   HAIL     KY 1998-04-16 5.1e+08  1998
## 5   HAIL     WI 2006-06-25 5.0e+08  2006
## 6   HAIL     MN 1998-05-15 4.5e+08  1998
## 7   HAIL     MO 2001-04-10 4.0e+08  2001
## 8   HAIL     CO 2009-07-20 3.5e+08  2009
## 9   HAIL     MO 2001-04-10 3.0e+08  2001
## 10  HAIL     NE 2001-04-10 3.0e+08  2001
## # ... with 2,292 more rows
dmgsby.winterstorm <- BMKbind[grepl("WINTER STORM", BMKbind$event),]
dmgsby.winterstorm <- mutate(dmgsby.winterstorm,
                          event = "WINTER STORM", dyear = year(bgndate))
toplot.winterstorm <- group_by(dmgsby.winterstorm , dyear, event)
toplot.winterstorm
## Source: local data frame [244 x 5]
## Groups: dyear, event [19]
## 
##           event  state    bgndate    dlstot dyear
##           <chr> <fctr>     <dttm>     <dbl> <dbl>
## 1  WINTER STORM     AL 1993-03-12 5.000e+09  1993
## 2  WINTER STORM     OH 2008-03-07 7.500e+08  2008
## 3  WINTER STORM     CA 1995-12-09 6.000e+07  1995
## 4  WINTER STORM     OH 2004-12-22 5.490e+07  2004
## 5  WINTER STORM     CA 1993-01-13 5.000e+07  1993
## 6  WINTER STORM     NC 1993-03-12 5.000e+07  1993
## 7  WINTER STORM     CO 2003-03-17 3.100e+07  2003
## 8  WINTER STORM     KS 2005-01-04 3.000e+07  2005
## 9  WINTER STORM     NC 1998-02-03 2.218e+07  1998
## 10 WINTER STORM     KS 2006-12-29 2.100e+07  2006
## # ... with 234 more rows
dmgsby.tornado <- BMKbind[grepl("TORNADO", BMKbind$event),]
dmgsby.tornado <- mutate(dmgsby.tornado,
                          event = "TORNADO", dyear = year(bgndate))
toplot.tornado <- group_by(dmgsby.tornado , dyear, event)
toplot.tornado
## Source: local data frame [5,857 x 5]
## Groups: dyear, event [62]
## 
##      event  state    bgndate  dlstot dyear
##      <chr> <fctr>     <dttm>   <dbl> <dbl>
## 1  TORNADO     MO 2011-05-22 2.8e+09  2011
## 2  TORNADO     FL 1993-03-12 1.6e+09  1993
## 3  TORNADO     AL 2011-04-27 1.5e+09  2011
## 4  TORNADO     AL 2011-04-27 1.0e+09  2011
## 5  TORNADO     AL 2011-04-27 7.0e+08  2011
## 6  TORNADO     MS 2011-04-04 5.0e+08  2011
## 7  TORNADO     OK 1999-05-03 4.5e+08  1999
## 8  TORNADO     OK 1999-05-03 4.5e+08  1999
## 9  TORNADO     AL 1989-11-15 2.5e+08  1989
## 10 TORNADO     AL 1989-11-15 2.5e+08  1989
## # ... with 5,847 more rows
dmgsby.fire <- BMKbind[grepl("FIRE|fire", BMKbind$event),]
dmgsby.fire <- mutate(dmgsby.fire,
                          event = "WILD FIRE", dyear = year(bgndate))
toplot.fire <- group_by(dmgsby.fire , dyear, event)
toplot.fire
## Source: local data frame [372 x 5]
## Groups: dyear, event [19]
## 
##        event  state    bgndate    dlstot dyear
##        <chr> <fctr>     <dttm>     <dbl> <dbl>
## 1  WILD FIRE     NM 2000-05-04 1.500e+09  2000
## 2  WILD FIRE     CA 2003-10-25 1.040e+09  2003
## 3  WILD FIRE     CA 2003-10-25 6.964e+08  2003
## 4  WILD FIRE     CA 1993-11-01 6.190e+08  1993
## 5  WILD FIRE     CA 2000-09-29 5.470e+08  2000
## 6  WILD FIRE     CA 2007-06-24 5.000e+08  2007
## 7  WILD FIRE     CA 2003-11-01 2.786e+08  2003
## 8  WILD FIRE     TX 2011-09-04 2.500e+08  2011
## 9  WILD FIRE     CO 2010-09-06 2.170e+08  2010
## 10 WILD FIRE     FL 1998-07-01 2.000e+08  1998
## # ... with 362 more rows
dmgsby.blizzard <- BMKbind[grepl("BLIZZARD|blizzard", BMKbind$event),]
dmgsby.blizzard <- mutate(dmgsby.blizzard,
                          event = "BLIZZARD", dyear = year(bgndate))
toplot.blizzard <- group_by(dmgsby.blizzard , dyear, event)
toplot.blizzard
## Source: local data frame [69 x 5]
## Groups: dyear, event [14]
## 
##       event  state    bgndate    dlstot dyear
##       <chr> <fctr>     <dttm>     <dbl> <dbl>
## 1  BLIZZARD     ND 1997-04-05 102000000  1997
## 2  BLIZZARD     CO 2003-03-17  62000000  2003
## 3  BLIZZARD     ND 1997-01-09  55080000  1997
## 4  BLIZZARD     GA 1993-03-13  50000000  1993
## 5  BLIZZARD     NY 1993-03-13  50000000  1993
## 6  BLIZZARD     SD 1997-04-05  50000000  1997
## 7  BLIZZARD     GA 1993-03-13  50000000  1993
## 8  BLIZZARD     GA 1993-03-13  50000000  1993
## 9  BLIZZARD     ND 1997-04-04  44720000  1997
## 10 BLIZZARD     UT 1997-01-11  40000000  1997
## # ... with 59 more rows
# 
dmgsby.heat <- BMKbind[grepl("HEAT|heat", BMKbind$event),]
dmgsby.heat <- mutate(dmgsby.heat,
                          event = "HEAT", dyear = year(bgndate))
toplot.heat2 <- group_by(dmgsby.heat , dyear, event)
toplot.heat2
## Source: local data frame [16 x 5]
## Groups: dyear, event [6]
## 
##    event  state    bgndate    dlstot dyear
##    <chr> <fctr>     <dttm>     <dbl> <dbl>
## 1   HEAT     AL 1995-08-20 400000000  1995
## 2   HEAT     CA 2006-07-16 492400000  2006
## 3   HEAT     OH 1994-06-13   5000000  1994
## 4   HEAT     MD 1995-07-14   4700000  1995
## 5   HEAT     IA 1995-07-12   3800000  1995
## 6   HEAT     NE 1999-07-19   3300000  1999
## 7   HEAT     NE 2005-07-22   3000000  2005
## 8   HEAT     IA 1995-07-10   2400000  1995
## 9   HEAT     MN 1995-07-10   2000000  1995
## 10  HEAT     NE 2009-06-22   1500000  2009
## 11  HEAT     IN 1995-07-13   1000000  1995
## 12  HEAT     VA 1995-07-14    600000  1995
## 13  HEAT     MO 1995-07-17    400000  1995
## 14  HEAT     OH 1995-07-12    600000  1995
## 15  HEAT     OH 1995-08-08    500000  1995
## 16  HEAT     MO 1995-08-01    400000  1995
dmgsby.surge <- BMKbind[grepl("SURGE|surge", BMKbind$event),]
dmgsby.surge <- mutate(dmgsby.surge,
                          event = "SURGE", dyear = year(bgndate))
toplot.surge <- group_by(dmgsby.surge , dyear, event)
toplot.surge
## Source: local data frame [75 x 5]
## Groups: dyear, event [13]
## 
##    event  state    bgndate    dlstot dyear
##    <chr> <fctr>     <dttm>     <dbl> <dbl>
## 1  SURGE     LA 2005-08-29 3.130e+10  2005
## 2  SURGE     MS 2005-08-29 1.126e+10  2005
## 3  SURGE     TX 2008-09-12 4.000e+09  2008
## 4  SURGE     TX 2008-09-12 5.000e+08  2008
## 5  SURGE     LA 2005-09-23 4.320e+08  2005
## 6  SURGE     LA 2008-09-12 7.500e+07  2008
## 7  SURGE     FL 1993-03-13 5.000e+07  1993
## 8  SURGE     FL 1993-03-13 5.000e+07  1993
## 9  SURGE     MD 2003-09-19 5.000e+07  2003
## 10 SURGE     NC 2011-08-26 4.000e+07  2011
## # ... with 65 more rows
dmgsby.rain <- BMKbind[grepl("RAIN|rain", BMKbind$event),]
dmgsby.rain <- mutate(dmgsby.rain,
                          event = "RAIN", dyear = year(bgndate))
toplot.rain <- group_by(dmgsby.rain , dyear, event)
toplot.rain
## Source: local data frame [182 x 5]
## Groups: dyear, event [19]
## 
##    event  state    bgndate   dlstot dyear
##    <chr> <fctr>     <dttm>    <dbl> <dbl>
## 1   RAIN     LA 1995-05-08 2.50e+09  1995
## 2   RAIN     CA 1998-05-01 2.00e+08  1998
## 3   RAIN     CA 2005-01-07 1.90e+08  2005
## 4   RAIN     CA 1998-05-01 7.36e+07  1998
## 5   RAIN     CA 2005-01-07 7.00e+07  2005
## 6   RAIN     AL 1994-07-03 5.00e+07  1994
## 7   RAIN     ID 1995-05-26 5.00e+07  1995
## 8   RAIN     ND 1994-07-03 5.00e+07  1994
## 9   RAIN     CA 1995-03-01 4.80e+07  1995
## 10  RAIN     CA 2005-02-18 4.00e+07  2005
## # ... with 172 more rows
dmgsby.drought <- BMKbind[grepl("DROUGHT|drought", BMKbind$event),]
dmgsby.drought <- mutate(dmgsby.drought,
                          event = "DROUGHT", dyear = year(bgndate))
toplot.drought <- group_by(dmgsby.drought , dyear, event)
toplot.drought
## Source: local data frame [203 x 5]
## Groups: dyear, event [17]
## 
##      event  state    bgndate     dlstot dyear
##      <chr> <fctr>     <dttm>      <dbl> <dbl>
## 1  DROUGHT     TX 2006-01-01 1000000000  2006
## 2  DROUGHT     IA 1995-08-01  500000000  1995
## 3  DROUGHT     IA 2003-08-01  645150000  2003
## 4  DROUGHT     IA 2001-08-01  578850000  2001
## 5  DROUGHT     TX 2000-11-01  515000000  2000
## 6  DROUGHT     OK 1998-07-06  500000000  1998
## 7  DROUGHT     PA 1999-07-01  500000000  1999
## 8  DROUGHT     NE 2002-12-01  480000000  2002
## 9  DROUGHT     TX 1998-12-01  450000000  1998
## 10 DROUGHT     TX 2001-12-01  420000000  2001
## # ... with 193 more rows
  • 1.2.4.1 The data set toplot.alltogether binds the rows of all the previously obtained sets, which are drawn in the Fig. 3 plot of the RESULTS section below.
toplot.alltogether <- rbind(toplot.flood, toplot.hurricane, toplot.surge,
                            toplot.tornado, toplot.tropical, toplot.fire,
                            toplot.blizzard, toplot.hail, toplot.heat2,
                            toplot.winterstorm, toplot.rain, toplot.drought)
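
The filter, relabel and bind steps above repeat the same pattern for every event family, so they could be folded into one helper. The following is only a minimal sketch in base R under stated assumptions: the helper name collapse_event, the toy data frame and the patterns vector are hypothetical and not part of the original analysis. Note also that ignore.case = TRUE matches mixed-case labels such as "Heavy Rain", which the upper/lower alternation patterns used above would not.

```r
# Hypothetical helper (not in the original analysis): keep the rows whose
# event label matches `pattern`, relabel them and add the year of the event.
collapse_event <- function(data, pattern, label) {
  hits <- data[grepl(pattern, data$event, ignore.case = TRUE), , drop = FALSE]
  hits$event <- rep(label, nrow(hits))
  hits$dyear <- as.numeric(format(hits$bgndate, "%Y"))
  hits
}

# Toy stand-in for BMKbind; column names follow the tables above.
toy <- data.frame(
  event   = c("BLIZZARD", "Heavy Rain", "EXCESSIVE HEAT", "blizzard/high wind"),
  state   = c("ND", "CA", "AL", "CO"),
  bgndate = as.POSIXct(c("1997-04-05", "1998-05-01", "1995-08-20", "2003-03-17"),
                       tz = "UTC"),
  dlstot  = c(1.02e8, 2e8, 4e8, 6.2e7),
  stringsAsFactors = FALSE
)

# One search pattern per family; the names become the new event labels.
patterns <- c(BLIZZARD = "blizzard", HEAT = "heat", RAIN = "rain")
pieces   <- Map(function(p, l) collapse_event(toy, p, l), patterns, names(patterns))
toy_all  <- do.call(rbind, pieces)
toy_all
```

With the real data, the same call over BMKbind would yield one relabelled table per family, which could then be grouped by dyear and event as in the code above.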

2. RESULTS

2.1 Personal Issues results

  • 2.1.1 Total deaths due to heat events:
sum(fatsby.heat$fatalities)
## [1] 3138
  • 2.1.2 Total deaths due to tornadoes:
sum(fatsby.tndo$fatalities)
## [1] 5661
  • 2.1.3 Time series plot showing fatalities by heat events.

Code for Fig. 1 takes the previously preprocessed object toplot.heat as the data argument for the ggplot object ggheat. Each record is drawn as a dot at its date (X, years) and number of fatalities (Y), with size 4 and 50% transparency, coloured by type of event.

ggheat <- ggplot(toplot.heat, aes(initdate2, fatalities))
ggheat + geom_point(aes(color = event), size = 4, alpha = 1/2) +
        scale_y_continuous(limits = c(0, 600), breaks = seq(0, 600, by = 50)) +
        labs(title = "                      Deaths by Heat Events") +
        labs(x = "Year", y = "Number of fatalities", 
 caption = "Fig. 1 NOAA Storm Database. Deaths by Heat Events")

Note. The highest point in the plot corresponds to the 583 deaths recorded in one day of the widely documented 1995 Chicago heat wave (referenced here from Wikipedia).

  • 2.1.4 Time series plot showing fatalities by tornadoes.

Code for Fig. 2 takes the previously preprocessed object toplot.tndo as the data argument for the ggplot object ggtndo. Each record is drawn as a dot at its date (X, years) and number of fatalities (Y), with size 4 and 50% transparency, coloured by type of event.

ggtndo <- ggplot(toplot.tndo, aes(initdate2, fatalities))
ggtndo + geom_point(aes(color = event), size = 4, alpha = 1/2) +
        scale_y_continuous(limits = c(0, 600), breaks = seq(0, 600, by = 50)) +
        labs(title = "                Deaths by Tornado Events") +
        labs(x = "Year", y = "Number of fatalities", 
 caption = "Fig. 2 NOAA Storm Database. Deaths by Tornado Events")
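
Since the Fig. 1 and Fig. 2 code differs only in the data set, the title and the caption, both plots could come from one function. The sketch below is an illustration, not the author's code: the helper name plot_deaths and the toy rows are hypothetical (only the 583 value comes from this paper), and theme(plot.title = element_text(hjust = 0.5)) is offered as one way to centre a title without padding it with spaces.

```r
library(ggplot2)

# Hypothetical helper: the column names initdate2, fatalities and event follow
# the plotting code above; everything else is an assumption for illustration.
plot_deaths <- function(data, title, caption) {
  ggplot(data, aes(initdate2, fatalities)) +
    geom_point(aes(color = event), size = 4, alpha = 1/2) +
    scale_y_continuous(limits = c(0, 600), breaks = seq(0, 600, by = 50)) +
    labs(title = title, x = "Year", y = "Number of fatalities",
         caption = caption) +
    theme(plot.title = element_text(hjust = 0.5))  # centred title
}

# Toy stand-in for toplot.heat; dates and the two smaller counts are invented.
toy_heat <- data.frame(
  initdate2  = as.Date(c("1995-07-12", "1999-07-28", "2006-07-16")),
  fatalities = c(583, 93, 46),
  event      = "HEAT"
)

p <- plot_deaths(toy_heat, "Deaths by Heat Events",
                 "Fig. 1 NOAA Storm Database. Deaths by Heat Events")
```

Printing p draws the plot. Calling the same function with toplot.tndo and the tornado title and caption would give the second figure.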

Marginal Note. The weight of a single life lost can be shown by raising the fatalities threshold in section 1.1.1 and observing the effect on the Fig. 1 plot. E.g., with the condition fatalities > 9, more than ten years of deaths by heat disappear from the plot, and both plots lose much of their density along the bottom line. Likewise, in year 2004 only one death by heat was recorded, as the point for 2004 disappears from the bottom line of the plot at fatalities > 1.
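
The thinning described in the marginal note can be sketched on toy data. This is a minimal illustration under stated assumptions: the helper years_left, the column name year and every row except the 583 value are hypothetical, and only the fatalities > threshold condition mirrors the one mentioned for section 1.1.1.

```r
# Toy stand-in for the heat fatalities table; all values except 583 are invented.
toy_heat <- data.frame(
  year       = c(1995, 1995, 1999, 2004, 2006, 2006),
  fatalities = c(583, 12, 9, 1, 46, 2)
)

# Hypothetical helper: years that still show a point when only the
# records with fatalities > threshold are kept.
years_left <- function(data, threshold) {
  sort(unique(data$year[data$fatalities > threshold]))
}

# Number of years left on the plot for thresholds 0, 1 and 9.
n_years <- sapply(c(0, 1, 9), function(th) length(years_left(toy_heat, th)))
n_years
```

In this toy table, 2004 vanishes as soon as the threshold reaches 1 because its only record reports a single death, which is the effect the note describes.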

2.2 Property/Infrastructure and Crops Issues results

  • 2.2.1 Time series plot showing damages in Property/Infrastructure and Crops.
ggdmgs <- ggplot(toplot.alltogether, aes(bgndate, dlstot/1000000))
ggdmgs + geom_point(aes(color = event), size = 4, alpha = 1/2) +
        labs(title = "   Damages in Property/Infrastructure and Crops") +
        labs(x = "Year", y = "Millions of US Dollars", 
 caption = "Fig. 3 NOAA Storm Database. Property/Infrastructure and Crops Issues")

Note. The events of the first days of January 2006, reflected in the 115 billion dollars of flood damages reported in this plot, are described in depth in the paper Storms and Flooding in California in December 2005 and January 2006, from the U.S. Department of the Interior and the U.S. Geological Survey.
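
The record behind a peak such as the one in the note can be looked up with which.max. The sketch below uses a toy stand-in for toplot.alltogether: the SURGE and DROUGHT rows copy values printed earlier, while the FLOOD date and the whole HURRICANE row are assumptions made only for illustration.

```r
# Toy stand-in for toplot.alltogether; see the lead-in for which rows are assumed.
toy_dmgs <- data.frame(
  event   = c("FLOOD", "SURGE", "HURRICANE", "DROUGHT"),
  state   = c("CA", "LA", "FL", "TX"),
  bgndate = as.Date(c("2006-01-01", "2005-08-29", "2005-10-24", "2006-01-01")),
  dlstot  = c(1.15e11, 3.13e10, 1.0e10, 1.0e9),
  stringsAsFactors = FALSE
)

# Row with the largest total damage, and its value in billions of US dollars.
peak <- toy_dmgs[which.max(toy_dmgs$dlstot), ]
peak_billions <- peak$dlstot / 1e9
peak
```

Running the same two lines on toplot.alltogether would show the event, state and date of the highest point in Fig. 3.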

3. FINAL REMARKS

Records of fatalities due to heat events begin around 1993, while records of fatalities by tornadoes start in the early 1950s. In general, recorded fatalities lie below the line of 50 deaths per event. Without further analysis, the damages in Property/Infrastructure and Crops shift towards heat and drought events, with peaks in 2005 and 2006, seemingly in correspondence with the recently aggravated El Niño and La Niña phenomena in the warmest coastal regions of the Pacific Ocean.
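
The first two remarks could be checked with a couple of base R calls. The following is a sketch on invented toy data, not a result of this paper: the data frame toy_fats, its year column and all of its values except 583 are hypothetical.

```r
# Toy stand-in for the combined fatalities tables; values are illustrative.
toy_fats <- data.frame(
  event      = c("TORNADO", "TORNADO", "TORNADO", "HEAT", "HEAT", "HEAT"),
  year       = c(1953, 1974, 2011, 1993, 1995, 2006),
  fatalities = c(90, 30, 44, 12, 583, 46)
)

# First year with a record, per event family.
first_year <- tapply(toy_fats$year, toy_fats$event, min)

# Share of records lying below the line of 50 deaths per event.
share_below_50 <- mean(toy_fats$fatalities < 50)

first_year
share_below_50
```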