Storm Events Analysis: Fatalities, Injuries and Economic Damages

Synopsis

In this report, we analyze the U.S. National Oceanic and Atmospheric Administration’s (NOAA) Storm Database to answer two questions:
1. Across the United States, which types of events are most harmful with respect to population health?
2. Across the United States, which types of events have the greatest economic consequences?

After loading and reading the data, we first looked at the whole data set for the period 1950-2011 and found inconsistencies between the periods 1950-1991 and 1992-2011, with large differences in some totals. This may indicate a problem of accuracy, especially in the economic damage variables.
We therefore focused on the period 1992-2011 and analyzed the impact of the different kinds of storm in terms of Fatalities, Injuries and Economic Damages.

Data processing

Loading packages

We obtained the data on storm events, monitored across the U.S. between the years 1950 and 2011, from the NOAA Database link.

# Loading packages
library(dplyr)
library(knitr)
library(tidyr)

Reading in the data

We first read in the downloaded file. The data set is a CSV file compressed with bzip2 (.bz2).

# Download the bz2-compressed CSV file and read it
setwd("C:/DataPortatil/Coursera/DataScience_JHopskins/5-ReproducibleResearch/AssignmentW4")
if(!file.exists("data")){
        dir.create("data")
}
fileURL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
if(!file.exists("./data/stormData.csv.bz2")){
    # mode = "wb" keeps the binary archive intact on Windows
    download.file(fileURL, destfile = "./data/stormData.csv.bz2", mode = "wb")
}
# read.csv() decompresses the bz2 file on the fly
stormData <- read.csv("./data/stormData.csv.bz2")
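
The reading step relies on read.csv() being able to open a bzip2-compressed file directly, because R detects the compression when it opens the path. A minimal, self-contained sketch of that behaviour, using a small made-up table (the object `toy` and the temporary file are illustrative assumptions, not part of the analysis):

```r
# Toy data frame standing in for the storm data (hypothetical values)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))

# Write it as a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Reading the path directly: the compression is detected when the file is opened
toy_back <- read.csv(tmp)
toy_back
```

The same call works whatever the file extension is, since the detection is based on the file content rather than on its name.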

Cleaning the data

Date variables

We converted the variable BGN_DATE (beginning date) to a date format and, from it, created BGN_YEAR, stored as an integer. This last variable will be very important in the analysis.

# Changing class of BGN_DATE to date class. creating BGN_YEAR from BGN_DATE
# and setting it as integer.
stormData$BGN_DATE <- as.Date(stormData$BGN_DATE, "%m/%d/%Y")
stormData <- transform(stormData, BGN_YEAR = format(BGN_DATE, "%Y"))
stormData$BGN_YEAR <- as.integer(stormData$BGN_YEAR)
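
As a small illustration of this conversion, the sketch below applies the same format string to two made-up values. It assumes that the raw BGN_DATE strings have a month/day/year layout followed by a time part; the vector `raw_dates` is hypothetical.

```r
# Example values in the month/day/year layout assumed for BGN_DATE
raw_dates <- c("4/18/1950 0:00:00", "11/28/2011 0:00:00")

# as.Date() reads only the part described by the format;
# the trailing time part is ignored
dates <- as.Date(raw_dates, format = "%m/%d/%Y")

# Extract the year as text and store it as an integer
years <- as.integer(format(dates, "%Y"))
years
```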

Unifying amounts in PROPDMG (PROPERTY DAMAGE) and CROPDMG (CROP DAMAGE)

The amounts in both variables (PROPDMG and CROPDMG) have to be scaled as indicated in the PROPDMGEXP and CROPDMGEXP variables, which state whether the amounts in PROPDMG and CROPDMG are in hundreds, thousands, millions or billions. For this, we created two new variables, CROPLOSS and PROPLOSS, holding the unified amounts:

# Unifying the amounts of CROPDMG (crop damage) using CROPDMGEXP:
# B = billions, K/k = thousands, M/m = millions.
# The result is stored in the new CROPLOSS variable.
stormData<- stormData %>%
    mutate(CROPLOSS =
               if_else(CROPDMGEXP == "B", CROPDMG*1000000000,
                       if_else(CROPDMGEXP == "k", CROPDMG*1000,
                               if_else(CROPDMGEXP == "K", CROPDMG*1000,
                                       if_else(CROPDMGEXP =="m", CROPDMG*1000000,
                                               if_else(CROPDMGEXP == "M",
                                                        CROPDMG*1000000, CROPDMG))))))

# Unifying the amounts of PROPDMG (property damage) using PROPDMGEXP:
# B = billions, K/k = thousands, M/m = millions, H/h = hundreds.
# The result is stored in the new PROPLOSS variable.
stormData<- stormData %>%
    mutate(PROPLOSS =
               if_else(PROPDMGEXP == "B", PROPDMG*1000000000,
                       if_else(PROPDMGEXP == "k", PROPDMG*1000,
                               if_else(PROPDMGEXP == "K", PROPDMG*1000,
                                       if_else(PROPDMGEXP =="m", PROPDMG*1000000,
                                               if_else(PROPDMGEXP == "M",
                                                                PROPDMG*1000000,
                                                       if_else(PROPDMGEXP == "h",
                                                                PROPDMG*100,
                                                               if_else(PROPDMGEXP=="H",
                                                                PROPDMG*100, PROPDMG))))))))
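
The same scaling can also be written as a lookup table, which keeps each multiplier in one place. The following is only a sketch under stated assumptions: `exp_mult` and `to_dollars()` are hypothetical names, the example amounts are made up, and (unlike the crop code above) the helper applies the H code to any amount it is given.

```r
# Multipliers for the exponent codes handled in this report
exp_mult <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: convert an amount and its exponent code to dollars
to_dollars <- function(amount, exp_code) {
  # toupper() makes "k"/"K", "m"/"M" and "h"/"H" equivalent
  mult <- unname(exp_mult[toupper(exp_code)])
  # Blank or unrecognised codes leave the amount unchanged
  mult[is.na(mult)] <- 1
  amount * mult
}

to_dollars(c(25, 2.5, 1, 10), c("K", "m", "B", ""))
```

With this layout, a wrong multiplier has to be corrected in a single place rather than in each nested branch.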

Cleaning the EVTYPE variable

This character variable indicates which kind of storm event is reported, using different descriptors (character strings).
The first problem is that there are more than 900 descriptors; the second is that there do not seem to be clear criteria limiting the descriptions. The result is a huge number of descriptors, which makes the analysis very difficult.
So we tried to reduce the number of descriptors by grouping them, at least for the main ones.
We identified the following main event descriptors, around which we grouped the others: HURRICANE, TORNADO, HEAT, WILDFIRE AND GRASS FIRE, FLOOD, LIGHTNING, HIGH WINDS, THUNDERSTORM AND TSTM WIND, ICE AND SNOW STORM AND WINTER STORM.
Note that the case of FLOOD was complicated. Our first criterion was to include in FLOOD only inland floods, generated by rivers, lakes, etc., not coastal ones. The second criterion was to exclude duplicated descriptors (in this case we have a pair…).
We used the mutate() and if_else() functions together with grepl().

# WORKING TO UNIFY AND CONCENTRATE THE MAIN DESCRIPTORS

# Unifying the terminology "HURRICANE" in ONE variable of EVTYPE
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("HURRICANE", stormData$EVTYPE,ignore.case = TRUE) == TRUE,
                       "HURRICANE", EVTYPE))
# Unifying the terminology "storm surge" in ONE variable of EVTYPE
stormData <- stormData %>%
  mutate(EVTYPE =
           if_else(grepl("SURGE", stormData$EVTYPE,ignore.case = TRUE) == TRUE,
                   "STORM SURGE", EVTYPE))
# Unifying the terminology "TORNADO" in ONE variable of EVTYPE
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("TORNADO", stormData$EVTYPE,ignore.case = TRUE) == TRUE,
                                "TORNADO", EVTYPE))
# Unifying the terminology "HEAT" in the variable EVTYPE
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("HEAT", stormData$EVTYPE,ignore.case = TRUE) == TRUE,
                                "HEAT", EVTYPE))

# Unifying the terminology "WILDFIRE AND GRASS FIRE" in the variable EVTYPE
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("FIRE", stormData$EVTYPE,ignore.case = TRUE) == TRUE,
                            "WILDFIRE AND GRASS FIRE", EVTYPE))

# Unifying the terminology "FLOOD" in ONE variable of EVTYPE
# EXCLUDING:
# "ICE STORM/FLASH FLOOD","ICE JAM","COASTAL FLOOD","COASTAL FLOODING","TIDAL FLOOD","BEACH FLOOD",
# "HIGH WINDS/COASTAL FLOOD","COASTAL/TIDAL FLOOD","BEACH EROSION/COASTAL FLOOD",
# "COASTALFLOOD","TIDAL FLOODING","COASTAL FLOODING/EROSION","COASTAL  FLOODING/EROSION"
# All these are different kind of flood: coastal flood, or -as in the case of the first-
# in contrast with ICE STORM, ICE JAM.
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("^(?=.*FLOOD)(?!.*ICE STORM)(?!.*ICE JAM)(?!.*BEACH)(?!.*COASTAL)(?!.*TIDAL)",
                                      stormData$EVTYPE,ignore.case = TRUE, perl = TRUE)==
                                        TRUE, "FLOOD", EVTYPE))
# Unifying the terminology "LIGHTNING" in ONE variable of EVTYPE
# EXCLUDING:
# THUNDERSTORMS, FIRES, TSTM
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("^(?=.*LIGHTNING)(?!.*THUNDERSTORM)(?!.*FIRE)(?!.*TSTM)",
                                  stormData$EVTYPE, ignore.case = TRUE, perl = TRUE)==
                                TRUE, "LIGHTINING", EVTYPE))

# Unifying the terminology "HIGH WINDS" in ONE variable of EVTYPE
# EXCLUDING:
# HURRICANE, SNOW, FLOOD
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("^(?=.*HIGH WIND)(?!.*HURRICANE)(?!.*SNOW)(?!.*FLOOD)",
                                  stormData$EVTYPE,ignore.case = TRUE, perl = TRUE)==
                                TRUE, "HIGH WINDS", EVTYPE))



# Unifying the terminology "THUNDERSTORM AND TSTM WIND" in ONE variable of EVTYPE
# EXCLUDING: TORNADOES
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("THUNDERSTORM", stormData$EVTYPE,ignore.case = TRUE,
                                      perl = TRUE) == TRUE, "THUNDERSTORM AND TSTM WIND",
                       if_else(grepl("^(?=.*TSTM)(?!.*TORNADO)",stormData$EVTYPE,
                                              ignore.case = TRUE, perl = TRUE)== TRUE,
                                        "THUNDERSTORM AND TSTM WIND",EVTYPE)))

# Unifying the terminology "ICE AND SNOW STORM AND WINTER STORM" in ONE variable of EVTYPE
stormData <- stormData %>%
    mutate(EVTYPE =
               if_else(grepl("WINTER STORM", stormData$EVTYPE,
                             ignore.case = TRUE, perl = TRUE) == TRUE,
                       "ICE AND SNOW STORM AND WINTER STORM",
                       if_else(grepl("SNOW", stormData$EVTYPE,
                                     ignore.case = TRUE, perl = TRUE) == TRUE,
                               "ICE AND SNOW STORM AND WINTER STORM",
                               if_else(grepl("ICE", stormData$EVTYPE,
                                             ignore.case = TRUE, perl = TRUE) == TRUE,
                                       "ICE AND SNOW STORM AND WINTER STORM",EVTYPE ))))
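
The exclusion rules above rely on Perl-style lookaheads: `(?=.*X)` requires X to appear somewhere in the string and `(?!.*Y)` requires Y to be absent. A minimal sketch on a made-up vector (`ev` is a hypothetical example, and the pattern is a shortened version of the FLOOD rule):

```r
# Hypothetical descriptors, in mixed case as in the raw data
ev <- c("FLASH FLOOD", "Coastal Flooding", "RIVER FLOOD",
        "ICE JAM FLOODING", "TORNADO")

# TRUE only for entries that contain FLOOD and none of the excluded words
is_inland_flood <- grepl("^(?=.*FLOOD)(?!.*COASTAL)(?!.*ICE JAM)",
                         ev, ignore.case = TRUE, perl = TRUE)

# Matching entries are relabelled; the others keep their original text
ifelse(is_inland_flood, "FLOOD", ev)
```

Only "FLASH FLOOD" and "RIVER FLOOD" are relabelled here; the coastal and ice jam entries are left untouched, as in the rule used for the full data set.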

Finally, we created a variable, TOTALLOSS, which is the sum of CROPLOSS and PROPLOSS, and which we will use throughout the study to analyze economic damages.

# Creating a new variable: TOTALLOSS = CROPLOSS+PROPLOSS, TO WORK ON DAMAGES
stormData <- stormData %>%
    mutate(TOTALLOSS = CROPLOSS+PROPLOSS) # We create a variable of total damages

Results

Calculating the totals of fatalities, injuries and damages for different storm events

Now, after cleaning the data, we can start exploring the main general figures in order to set the focus on the core of the analysis.

For now, we only want to get the totals of our focus variables (FATALITIES, INJURIES and TOTALLOSS, as economic damage) in relation to the type of storm event, as described in the EVTYPE variable.

# Total Damages, Fatalities and Injuries
globalLOSS <- stormData %>% # The global total LOSSES since 1950
    summarise(sum= sum(TOTALLOSS))
globalFAT <- stormData %>% # The global total FATALITIES since 1950
  summarise(sum= sum(FATALITIES))
globalINJ<- stormData %>% # The global total INJURIES since 1950
  summarise(sum= sum(INJURIES))

globalINJ2<- as.character(prettyNum(globalINJ,big.mark = ","))
globalFAT2<- as.character(prettyNum(globalFAT,big.mark = ","))
gloss_text <- paste(prettyNum(round(globalLOSS/1000000,2),big.mark = ","),
                    "Millions of $")

The 5 main descriptors by total damages (in millions of $) and by number of fatalities and injuries

To limit the number of figures in this document, we only report that, according to this data set, for the period 1950-2011 the total Economic Damages caused by storm events amounted to 319,071.1 Millions of $¹, the number of Fatalities recorded was 15,145² and the number of Injuries was 140,528³.

On the other hand, we can sum the values recorded for the different descriptors and see the descriptors with highest values.

# Comparing Fatalities, Injuries, and Damages 1950 - 2011
# Fatalities: list of the first 5 descriptors
fatal <- stormData%>% # Number of Fatalities
  group_by(EVTYPE) %>%
  summarise(Fatalities = sum(FATALITIES))%>%
  arrange(desc(Fatalities))
fatallist<-paste(unlist(list(fatal[1:5,"EVTYPE"])),collapse = ", ")

# Injuries: list of the first 5 descriptors
injured <- stormData%>% # Number of Injuries
  group_by(EVTYPE) %>%
  summarise(Injuries = sum(INJURIES))%>%
  arrange(desc(Injuries))
injuredlist<-paste(unlist(list(injured[1:5,"EVTYPE"])),collapse = ", ")

# Economic damages: list of the first 5 descriptors
damages <- stormData %>% # Material damages in proportion
  group_by(EVTYPE) %>%
  summarise(TotalLosses_M = round(sum(TOTALLOSS)/1000000,1)) %>%
  arrange(desc(TotalLosses_M))
damageslist<-paste(unlist(list(damages[1:5,"EVTYPE"])), collapse = ", ")
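
The ranking logic is the same for the three variables: sum by descriptor, sort in decreasing order, keep the first names. As a sketch, here is the same idea in base R on made-up numbers (`toy`, `totals` and `top2` are hypothetical names, and only two names are kept to keep the example short):

```r
# Made-up records: several rows per event type
toy <- data.frame(
  EVTYPE = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT", "FLOOD"),
  FATALITIES = c(10, 4, 5, 2, 7, 1)
)

# Sum the fatalities within each event type
totals <- tapply(toy$FATALITIES, toy$EVTYPE, sum)

# Sort from highest to lowest and keep the names of the first two
top2 <- names(sort(totals, decreasing = TRUE))[1:2]
paste(top2, collapse = ", ")
```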

We see that the descriptors with highest number of Fatalities are:
TORNADO, HEAT, FLOOD, LIGHTINING, THUNDERSTORM AND TSTM WIND⁴.
As to Injuries, the highest ranked descriptors are:
TORNADO, THUNDERSTORM AND TSTM WIND, HEAT, FLOOD, LIGHTINING⁵.
Finally, in the case of Economic Damages, the highest ranked are:
FLOOD, HURRICANE, STORM SURGE, TORNADO, ICE AND SNOW STORM AND WINTER STORM⁶.⁷
As we can see, there seems to be a certain continuity in the highest ranked descriptors, which change only slightly between economic damages, fatalities and injuries.

Analyzing the distribution of the descriptors during the whole period (1950-2011)

Now, let’s make a quick analysis, calculating the total amounts and numbers per year and visualizing them in a plot.

# Yearly total Damages, Fatalities and Injuries - 1950-2011
yearDamages <- stormData %>% # Grouping economic losses by year
  group_by(BGN_YEAR) %>%
  summarise(Millions = round(sum(TOTALLOSS)/1000000,2))
yearFATAL <- stormData %>% # Grouping FATALITIES by year
  group_by(BGN_YEAR) %>%
  summarise(Fatalities = sum(FATALITIES))
yearINJUR <- stormData %>% # Grouping INJURIES by year
  group_by(BGN_YEAR) %>%
  summarise(Injuries = sum(INJURIES))

# Plots
par(mfrow= c(3,1), mar= c(4,4,5.5,1))
plot(yearDamages, type = "l", lwd=2, col="red", xlab = "Year",
     ylab = "Millions of $",
     main= "Total Damages in Millions of $ per Year")
plot(yearFATAL, type = "l",lwd=2, col="blue", xlab = "Year", ylab = "Number of Fatalities",
     main= "Total Number of Fatalities per Year")
plot(yearINJUR, type = "l", lwd=2, col= "green", xlab= "Year", ylab= "Number of Injuries",
     main= "Total Number of Injuries per Year")
mtext("Storm Events per YEAR: DAMAGES, FATALITIES and INJURIES", line = -1.5,
      side = 3, outer = TRUE)

As we have seen in the plot above, there seem to be important differences in the data between the period 1992-2011 and the period before. These could be systematic differences in how the data were recorded, rather than real differences in the events.
To assess these differences, we filter the period 1992-2011 and sum some values:

#  FILTERING THE YEARS 1992-2011

stormData_last <- stormData %>% 
    filter(BGN_YEAR> 1991)

globLoss_last<- stormData_last %>%
  summarise(TotLoss_last = sum(TOTALLOSS))

On the other hand, let’s analyze what proportion of the totals was recorded in the last two decades. We see that there is a disproportion:

propLoss_last<- stormData_last %>%
  summarise(paste(round((TotLoss_last = sum(TOTALLOSS)/globalLOSS*100),2), "%",sep=""))
propFat_last<- stormData_last %>%
  summarise(paste(round((TotFAT_last = sum(FATALITIES)/globalFAT*100),2), "%",sep=""))
propInj_last<- stormData_last %>%
  summarise(paste(round((TotINJ_last = sum(INJURIES)/globalINJ*100),2), "%",sep=""))
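
Each of these percentages is simply the total for the years after 1991 divided by the total for the whole period. A minimal sketch of that arithmetic on made-up numbers (the data frame `toy` and its values are hypothetical):

```r
# Made-up yearly losses, two years before and two after the 1991 cut
toy <- data.frame(BGN_YEAR = c(1960, 1985, 1995, 2005),
                  TOTALLOSS = c(10, 40, 300, 650))

# Rows belonging to the recent period
recent <- toy$BGN_YEAR > 1991

# Share of the recent period in the overall total, as a percentage
share_recent <- sum(toy$TOTALLOSS[recent]) / sum(toy$TOTALLOSS) * 100
paste0(round(share_recent, 2), "%")
```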

In the case of Fatalities and Injuries, the events of the 1992-2011 period account for 72.1%⁸ and 50.18%⁹, respectively, of the totals for the whole 1950-2011 period. But in the case of economic Damages this percentage is 98.6%¹⁰. These values, along with the plots shown before, could indicate a change in the data, or a change in the accuracy of data recording between the first four decades and the last two. This would make comparisons difficult, especially for economic damages.

For this reason we have decided to analyze, from now on, only the 1992-2011 period.

Analyzing the 1992-2011 period

Comparing main descriptors in relation to Fatalities, Injuries and Economic Damages

The differences between descriptors in relation to our variables (FATALITIES, INJURIES and TOTALLOSS, which means economic damage) during the period 1992-2011 can be analyzed using a set of plots.
First, let’s see which are the main descriptors for each of our 3 variables:

# LISTS ABOUT FATALITIES, INJURIES AND DAMAGES 1992-2011
dam_last <- stormData_last %>% # Material damages 1992-2011
  group_by(EVTYPE) %>%
  summarise(Damage_M = round(sum(TOTALLOSS/1000000),2)) %>%
  slice_max(Damage_M, n=5)
damlist_last <- paste(unlist(list(dam_last[1:5,"EVTYPE"])),collapse = ", ")

fat_last <- stormData_last %>% # Fatalities 1992-2011
  group_by(EVTYPE) %>%
  summarise(Fatalities = sum(FATALITIES)) %>%
  slice_max(Fatalities, n=5)
fatlist_last <- paste(unlist(list(fat_last[1:5,"EVTYPE"])),collapse = ", ")

inj_last <- stormData_last %>% # Injuries 1992-2011
  group_by(EVTYPE) %>%
  summarise(Injuries = sum(INJURIES)) %>%
  slice_max(Injuries, n=10)
injlist_last <- paste(unlist(list(inj_last[1:5,"EVTYPE"])),collapse = ", ")

The highest ranked 5 descriptors for economic Damage, as amount of $ are:
FLOOD, HURRICANE, STORM SURGE, ICE AND SNOW STORM AND WINTER STORM, TORNADO¹¹.
The highest ranked 5 descriptors for Fatalities number are:
HEAT, TORNADO, FLOOD, LIGHTINING, ICE AND SNOW STORM AND WINTER STORM¹².
The highest ranked 5 descriptors for Injuries number are:
TORNADO, HEAT, FLOOD, THUNDERSTORM AND TSTM WIND, LIGHTINING¹³.

Plotting variables and descriptors

We can visualize the relation between the variables and the descriptors listed above using a set of three barplots, one for each variable (TOTALLOSS, FATALITIES and INJURIES):

# BARPLOTS ON DAMAGES FATALITIES AND INJURIES 1992-2011

# Filtering on DAMAGES 1992-2011
lossData_last <- stormData_last %>%
    filter(EVTYPE== "FLOOD" | EVTYPE== "STORM SURGE" |
             EVTYPE== "HURRICANE" | EVTYPE== "TORNADO" |
             EVTYPE== "THUNDERSTORM AND TSTM WIND")%>%
    select(EVTYPE, TOTALLOSS)%>%
    group_by(EVTYPE)%>%
    summarise(round(across(.cols = everything(),sum ,na.rm= TRUE)/1000000,2))
lossorder<- c("FLOOD","HURRICANE","STORM SURGE","THUNDERSTORM AND TSTM WIND",
              "TORNADO")
# Filtering on FATALITIES 1992-2011
fatalData_last <- stormData_last %>%
  filter(EVTYPE== "HEAT" | EVTYPE== "FLOOD" | EVTYPE== "TORNADO" |
           EVTYPE== "LIGHTINING" | EVTYPE== "ICE AND SNOW STORM AND WINTER STORM")%>%
  select(EVTYPE, FATALITIES)%>%
  group_by(EVTYPE)%>%
  summarise(across(.cols = everything(),sum ,na.rm= TRUE))
fatalorder<- c("FLOOD","HEAT","ICE AND SNOW STORM\n AND WINTER STORM","LIGHTNING", "TORNADO")

# Filtering on INJURIES 1992-2011
injurData_last <- stormData_last %>%
  filter(EVTYPE== "HEAT" | EVTYPE== "FLOOD" | EVTYPE== "TORNADO" |
           EVTYPE== "THUNDERSTORM AND TSTM WIND" |
           EVTYPE== "LIGHTINING")%>%
  select(EVTYPE, INJURIES)%>%
  group_by(EVTYPE)%>%
  summarise(across(.cols = everything(),sum ,na.rm= TRUE))
injurorder<- c("FLOOD","HEAT", "LIGHTINING", "THUNDERSTORM\n AND TSTM WIND","TORNADO")

# PANEL OF THREE BARPLOTS
par(mfrow= c(3,1), mar= c(2.5,4,5.5,0))
lossplot<-barplot(lossData_last$TOTALLOSS, names.arg = lossorder, cex.names = 0.7, ylim = c(0,160000),
        ylab = "Millions of $", xlab = "Type of event",
        main = "Total Damages of the 5 Main Storm Types - 1992 - 2011",
        col = c("lightblue", "lightcyan", "lavender", "lightgreen","cornsilk"))
ybarLossdat<- as.matrix(lossData_last$TOTALLOSS)
text(lossplot,ybarLossdat, labels = round(ybarLossdat, 0), pos= 3, cex = 1)

fatplot<-barplot(fatalData_last$FATALITIES, names.arg = fatalorder, cex.names = 0.7,
        ylab = "Number of Fatalities", xlab = "Type of event",
        main = "Total Fatalities of the 5 Main Storm Types - 1992 - 2011",
        col = c("lightblue","red", "mistyrose", "orange", "cornsilk"))
ybarFatdat<- as.matrix(fatalData_last$FATALITIES)
text(fatplot,ybarFatdat, labels = round(ybarFatdat, 0), pos= 1, cex = 1)

injplot<-barplot(injurData_last$INJURIES, names.arg = injurorder, cex.names = 0.7,
        ylab = "Number of Injuries", xlab = "Type of event",
        main = "Total Injuries of the 5 Main Storm Types - 1992 - 2011",
        col = c("lightblue","red", "orange", "lightgreen", "cornsilk"))
ybarInjdat<- as.matrix(injurData_last$INJURIES)
text(injplot,ybarInjdat, labels = round(ybarInjdat, 0), pos= 1, cex = 1)
mtext("Storm events: comparing DAMAGES, FATALITIES and INJURIES", line = -2,
      side = 3, outer = TRUE)

In the barplots above we can see how the different kinds of storm differ according to the kind of damage. For example, Heat has a large impact in terms of fatalities but a much smaller one in terms of economic/material damages. Flood can be considered the most harmful type of storm overall, because it ranks first in economic/material damages and third in both Fatalities and Injuries.

Comparing over time (1992-2011) the main descriptors in relation to Fatalities, Injuries and Economic Damages

To compare the main descriptors over time for each of our variables, we created three tables, one per variable, in which the descriptors are spread into columns:

# Pivot TOTALLOSS with yearly data of 4 main descriptors
mainDatalast <- stormData_last%>%
  filter(EVTYPE=="FLOOD" | EVTYPE== "HURRICANE" | EVTYPE=="STORM SURGE" | 
           EVTYPE=="TORNADO") %>%
  group_by(BGN_YEAR, EVTYPE)%>%
  summarise(sum= sum(TOTALLOSS))%>%
  arrange(BGN_YEAR)
mainDatalast<- mainDatalast %>%
  pivot_wider(names_from = EVTYPE,values_from=sum) %>%
  select(BGN_YEAR,"FLOOD", "HURRICANE","STORM SURGE", "TORNADO")%>%
  group_by(BGN_YEAR) %>%
  summarise(round(across(.cols = everything(),sum ,na.rm= TRUE)/1000000,2))

#Pivot FATALITIES with yearly data of 4 main descriptors
fatDatalast <- stormData_last%>%
  filter(EVTYPE=="HEAT" | EVTYPE== "FLOOD" | EVTYPE=="TORNADO" | 
           EVTYPE=="LIGHTINING") %>%
  group_by(BGN_YEAR, EVTYPE)%>%
  summarise(sum= sum(FATALITIES))%>%
  arrange(BGN_YEAR)
fatDatalast<- fatDatalast %>%
  pivot_wider(names_from = EVTYPE,values_from=sum) %>%
  select(BGN_YEAR,"HEAT","FLOOD","TORNADO","LIGHTINING")%>%
  group_by(BGN_YEAR) %>%
  summarise(round(across(.cols = everything(),sum ,na.rm= TRUE)))

#Pivot INJURIES with yearly data of 4 main descriptors
injDatalast <- stormData_last%>%
  filter(EVTYPE=="TORNADO" | EVTYPE== "FLOOD" | EVTYPE=="HEAT" |   
           EVTYPE=="THUNDERSTORM AND TSTM WIND") %>%
  group_by(BGN_YEAR, EVTYPE)%>%
  summarise(sum= sum(INJURIES))%>%
  arrange(BGN_YEAR)
injDatalast<- injDatalast %>%
  pivot_wider(names_from = EVTYPE,values_from=sum) %>%
  select(BGN_YEAR,"TORNADO", "FLOOD", "HEAT","THUNDERSTORM AND TSTM WIND")%>%
  group_by(BGN_YEAR) %>%
  summarise(round(across(.cols = everything(),sum ,na.rm= TRUE)))
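
The shape of these tables (one row per year, one column per descriptor) can be illustrated in base R with tapply(). This is only a sketch on made-up numbers, not the code used for the report; `toy` and `wide` are hypothetical names.

```r
# Made-up records in long format: one row per reported event
toy <- data.frame(
  BGN_YEAR = c(1992, 1992, 1993, 1993, 1993),
  EVTYPE = c("FLOOD", "TORNADO", "FLOOD", "FLOOD", "HEAT"),
  INJURIES = c(5, 20, 1, 2, 8)
)

# Sum injuries for every year/descriptor pair: years in rows, descriptors in columns
wide <- tapply(toy$INJURIES, list(toy$BGN_YEAR, toy$EVTYPE), sum)

# Pairs with no records come out as NA; treat them as zero
wide[is.na(wide)] <- 0
wide
```

Replacing the missing pairs by zero plays the same role as the na.rm = TRUE argument in the tables above.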

Using the three objects, we can now make a panel with 3 line plots, showing the tendency of each of the main descriptors in relation to DAMAGES, FATALITIES and INJURIES.

# Panel of three plots
par(mfrow= c(3,1), mar= c(4,4,5.5,1))
plot(mainDatalast$BGN_YEAR, mainDatalast$FLOOD, ylim = c(0,120000),
     type = "b", lwd=2, col= "blue",lty=3, xlab = "YEAR", ylab = "Millions of $",
     main = "Total Damages per year of the 4 Main Storm Types - 1992 - 2011")
lines(mainDatalast$BGN_YEAR, mainDatalast$HURRICANE,
      lwd=2, col= "red", type = "b", lty=3)
lines(mainDatalast$BGN_YEAR,mainDatalast$`STORM SURGE`,
      lwd=2, col= "green")
lines(mainDatalast$BGN_YEAR,mainDatalast$TORNADO,
      lwd=2, col= "black")
legend("topright", lwd = 2, col = c("blue","red","green","black"),
       legend = c("FLOOD", "HURRICANE","STORM SURGE", "TORNADO"),
       cex = 0.6, lty = c(3,3,1,1))

plot(fatDatalast$BGN_YEAR, fatDatalast$HEAT,
     type = "l", lwd=2, col= "red",lty=1, xlab = "YEAR", ylab = "Number of Fatalities",
     main = "Total Fatalities per year of the 4 Main Storm Types - 1992 - 2011")
lines(fatDatalast$BGN_YEAR, fatDatalast$FLOOD,
      lwd=2, col= "blue", type = "b", lty=3)
lines(fatDatalast$BGN_YEAR, fatDatalast$TORNADO,
      lwd=2, col= "black")
lines(fatDatalast$BGN_YEAR, fatDatalast$LIGHTINING,
      lwd=2, col= "orange")
legend("topright", lwd = 2, col = c("red","blue","black","orange"),
       legend = c("HEAT","FLOOD", "TORNADO", "LIGHTINING"),
       cex = 0.6, lty = c(1,3,1,1))

plot(injDatalast$BGN_YEAR, injDatalast$TORNADO, ylim = c(0,6500),
     type = "l", lwd=2, col= "black",lty=1, xlab = "YEAR", ylab = "Number of Injuries",
     main = "Total Injuries per year of the 4 Main Storm Types - 1992 - 2011")
lines(injDatalast$BGN_YEAR, injDatalast$FLOOD,
      lwd=2, col= "blue", type = "b", lty=3)
lines(injDatalast$BGN_YEAR, injDatalast$HEAT,
      lwd=2, col= "red")
lines(injDatalast$BGN_YEAR, injDatalast$`THUNDERSTORM AND TSTM WIND`,
      lwd=2, col= "violet")
legend("topleft", lwd = 2, col = c("black","blue","red","violet"),
       legend = c("TORNADO","FLOOD", "HEAT", "THUNDERSTORM AND TSTM WIND"),
       cex = 0.6, lty = c(1,3,1,1))
mtext("Storm events: comparing DAMAGES, FATALITIES and INJURIES - 1992-2011", line = -2,
      side = 3, outer = TRUE)

This last panel shows how irregular the impact of the different kinds of storm is, which obviously makes them difficult to forecast. For example, we see that ICE AND SNOW STORM had a strong economic/material impact in 1993 and 1994, but no impact afterwards. The same can be said of the impact of HEAT on the number of Fatalities and Injuries, where we see two big peaks in 1995 and 1999.
We can also see very clearly that different storm types show very high peaks for different variables. These peaks are so large that they should also appear in other data sources. For example, the FLOOD peak of 2006 is recorded as the 2006 Mid-Atlantic United States flood, and the HEAT peak in Fatalities of 1995 is recorded, for example, in The New England Journal of Medicine.
These alternative data sources confirm and enrich our data.

¹ Value of gloss_text.
² Value of globalFAT2.
³ Value of globalINJ2.
⁴ Value of fatallist.
⁵ Value of injuredlist.
⁶ Value of damageslist.
⁷ All these values and lists can be easily checked by running the code in the chunks.
⁸ Value of propFat_last.
⁹ Value of propInj_last.
¹⁰ Value of propLoss_last.
¹¹ Value of damlist_last.
¹² Value of fatlist_last.
¹³ Value of injlist_last.