1.Synopsis

In this report, we analyze the NOAA Storm Database about the damages that severe weather events caused across the United States to answer below questions,

  1. Which types of severe weather events are most harmful with respect to population health?
  2. Which types of severe weather events have the greatest economic consequences?

We will find out which severe weather events are attributed to higher fatalities/injuries or greater damages to the crops/properties, so that readers can conduct further analyses and take appropriate prevention measures for those events.

2.Data Processing

In this section, we load the raw data, subset and alter some columns to make the data ready for analyses.

First, we load the data.

url <- 'https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2'
download.file(url,'stormdata.csv')
data <- read.csv('stormdata.csv')
nrows <- dim(data)[1]
ncols <- dim(data)[2]

The data has 902297 observations and 37 features.

The first 3 rows of data:

head(data,3)
##   STATE__          BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE  EVTYPE
## 1       1 4/18/1950 0:00:00     0130       CST     97     MOBILE    AL TORNADO
## 2       1 4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL TORNADO
## 3       1 2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL TORNADO
##   BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END COUNTYENDN
## 1         0                                               0         NA
## 2         0                                               0         NA
## 3         0                                               0         NA
##   END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES INJURIES PROPDMG
## 1         0                      14.0   100 3   0          0       15    25.0
## 2         0                       2.0   150 2   0          0        0     2.5
## 3         0                       0.1   123 2   0          0        2    25.0
##   PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE
## 1          K       0                                         3040      8812
## 2          K       0                                         3042      8755
## 3          K       0                                         3340      8742
##   LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1       3051       8806              1
## 2          0          0              2
## 3          0          0              3

Subset the columns we need,

# Remove unnecessary columns
library(dplyr)
data <- select(data,c('STATE__',"STATE","BGN_DATE",'EVTYPE',"LENGTH","WIDTH","F","MAG","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP","REMARKS"))

Subset the observation we need. (we only need the observation which caused fatalities/injuries or damages to crops/properties. Thus remove the observations where all of those records equal to zero.)

# remove data with ALL FATALITIES/INJURIES/CROPDMG/PROPDMG == 0 

noDmgLogical <- (data$FATALITIES == 0)&
  (data$INJURIES == 0)&
  (data$PROPDMG == 0)&
  (data$CROPDMG == 0)
# Check if above logic subset all-zero-observations
colSums(data[noDmgLogical,c("FATALITIES","INJURIES","PROPDMG","CROPDMG")])
## FATALITIES   INJURIES    PROPDMG    CROPDMG 
##          0          0          0          0
# Subset data opposite of above logic
data <- subset(data, !(noDmgLogical))
nrows <- dim(data)[1]
ncols <- dim(data)[2]

It resulted in 254633 observations and 15 columns.

Next, in order to get the amount of damage to the crops and properties, we need to calculate actual values. The total amounts can be calculated by multiplying PROPDMG/CROPDMG columns by PROPDMGEXP/CROPDMGEXP columns respectively where multipliers are coded in alphabets(B for Billions,M for Millions and so on). Thus, after changing alphabets to multipliers in numbers, calculate values and make new columns.
(I referenced the report How To Handle Exponent Value of PROPDMGEXP and CROPDMGEXP for the clarification.)

## Difine exponent column handling function
expFunc <- function(x){
  if ((x=='B') | (x=='b')) {x <- 1e+9
  }else if ((x=='M') | (x=='m')){x <- 1e+6
  }else if((x=='K') | (x=='k')){x <- 1e+3
  }else if((x=='H') | (x=='h')){x <- 1e+2
  }else if(x %in% as.character(0:8)){x <- 1e+1
  }else{x <- 1}
}

# Change alphabets to multiplier for CROPDMGEXP
data$CROPDMGEXP <- sapply(data$CROPDMGEXP,expFunc)

# Make a new column for actual damage
data <- mutate(data, totCropDmg=CROPDMG * CROPDMGEXP)

# Repeat the same process for PROPDMGEXP
data$PROPDMGEXP <- sapply(data$PROPDMGEXP, expFunc)
data <- mutate(data, totPropDmg=PROPDMG*PROPDMGEXP)

Now, the most important task in this data preprocessing phase is to organize the EVTYPE(event types) column. The number of official event types are 48. But if we count all unique values in this column, they are 488 at this point. It is due to the mistakes in spelling or being coded with own words. In order to compare all 48 event types, we have to assign incorrect event types to the official ones.

Steps for Assigning incorrect EVTYPE to Official EVTYPE:
Change all values to upper cases.(It erase the difference between upper cases and lower cases.) >> Load the official event type categories, (It was copied from the documentation in PDF file.) and change them also in upper cases. >> List the event types which do not match with the official event types. >> check the pattern and use the grepl function to correct them together, if it was not sure which categories the values are pointing, read REMARKS and decide which events they possibly belong to.

# Change EVTYPE column to upper cases
data$EVTYPE <- toupper(data$EVTYPE)

# Load the official 48 EVTYPE
library(kableExtra)
url2 <- 'https://raw.githubusercontent.com/belanello/storm_data/main/events.txt'
events <- read.table(url2,sep = '\n')
events <- toupper(events$V1)
# display official event list
kbl(matrix(events, nrow=16,dimnames =NULL), 
    caption='Table.1: Official Event Types',type='html')  %>%    kable_styling(latex_options = 'hold_position',font_size=10)
Table.1: Official Event Types
ASTRONOMICAL LOW TIDE FUNNEL CLOUD MARINE THUNDERSTORM WIND
AVALANCHE FREEZING FOG RIP CURRENT
BLIZZARD HAIL SEICHE
COASTAL FLOOD HEAT SLEET
COLD/WIND CHILL HEAVY RAIN STORM SURGE/TIDE
DEBRIS FLOW HEAVY SNOW STRONG WIND
DENSE FOG HIGH SURF THUNDERSTORM WIND
DENSE SMOKE HIGH WIND TORNADO
DROUGHT HURRICANE (TYPHOON) TROPICAL DEPRESSION
DUST DEVIL ICE STORM TROPICAL STORM
DUST STORM LAKE-EFFECT SNOW TSUNAMI
EXCESSIVE HEAT LAKESHORE FLOOD VOLCANIC ASH
EXTREME COLD/WIND CHILL LIGHTNING WATERSPOUT
FLASH FLOOD MARINE HAIL WILDFIRE
FLOOD MARINE HIGH WIND WINTER STORM
FROST/FREEZE MARINE STRONG WIND WINTER WEATHER
# List the event types which do not match with the official event types
notInListEvents <- unique(data[!(data$EVTYPE %in% events),'EVTYPE'])
notInListEventsCounts <- length(notInListEvents)
head(notInListEvents)
## [1] "TSTM WIND"                 "ICE STORM/FLASH FLOOD"    
## [3] "HURRICANE OPAL/HIGH WINDS" "THUNDERSTORM WINDS"       
## [5] "HURRICANE ERIN"            "HURRICANE OPAL"

In total, 401 events are unlisted.

# Correct typos or change them to official expressions of events

data$EVTYPE[grepl("^\\s*TSTM WIND",data$EVTYPE)] <- "THUNDERSTORM WIND"
data$EVTYPE[grepl("^HURRICANE",data$EVTYPE)] <- "HURRICANE (TYPHOON)"  
data$EVTYPE[grepl("^THUNDERSTORM",data$EVTYPE)] <- "THUNDERSTORM WIND"
data$EVTYPE[grepl("FLASH FLOOD",data$EVTYPE)] <- "FLASH FLOOD"
data$EVTYPE[grepl("TORNDAO",data$EVTYPE)|
              grepl('TORNADO',data$EVTYPE)] <- "TORNADO"
data$EVTYPE[grepl("^HIGH WIND",data$EVTYPE)] <- "HIGH WIND" 
data$EVTYPE[grepl("^WIND",data$EVTYPE)|
              grepl("^HIGH$",data$EVTYPE)] <- "HIGH WIND"
data$EVTYPE[grepl("^HEAVY RAIN",data$EVTYPE)] <- "HEAVY RAIN"
data$EVTYPE[grepl("LIGHTNING",data$EVTYPE)] <- "LIGHTNING"
data$EVTYPE[grepl("COLD",data$EVTYPE)&
                     !grepl("EXTREME",data$EVTYPE)&
                     !grepl('SNOW',data$EVTYPE)] <- "COLD/WIND CHILL"
data$EVTYPE[grepl("^FLOOD",data$EVTYPE)&
         !grepl("FLASH",data$EVTYPE)] <- "FLOOD"
data$EVTYPE[grepl("EXTREME COLD",data$EVTYPE)|
              grepl("HYPOTHERMIA" ,data$EVTYPE)|
              grepl("LOW TEMPERATURE",data$EVTYPE)] <- "EXTREME COLD/WIND CHILL"
data$EVTYPE[grepl("FREEZE",data$EVTYPE)] <- "FROST/FREEZE"

# Flood has different categories so do Not include other flood categories
data$EVTYPE[grepl("FLOOD",data$EVTYPE)&
                   !grepl("FLASH",data$EVTYPE)&
                     !grepl("COSTAL",data$EVTYPE)&
                     !grepl("LAKESHORE",data$EVTYPE)&
                     !grepl("COASTAL",data$EVTYPE)&
                     !grepl("LAKE",data$EVTYPE)] <-"FLOOD"

data$EVTYPE[grepl("MARINE MISHAP",data$EVTYPE)] <- "MARINE HIGH WIND"
data$EVTYPE[(grepl("^HIGH",data$EVTYPE)& !grepl('WIND',data$EVTYPE)&!grepl("^HIGH$",data$EVTYPE))|
              grepl("HEAVY SEAS",data$EVTYPE)] <- "HIGH SURF"

data$EVTYPE[grepl("SEVERE TURBULENCE",data$EVTYPE)] <- "THUNDERSTORM WIND"
data$EVTYPE[grepl("SNOW",data$EVTYPE)&
                     !(grepl('LAKE',data$EVTYPE))] <- "HEAVY SNOW" 
data$EVTYPE[grepl("APACHE COUNTY",data$EVTYPE)] <- "THUNDERSTORM WIND"
data$EVTYPE[(grepl("TUNDERSTORM",data$EVTYPE)|
                  grepl("^TH.+M",data$EVTYPE)|
                  grepl("SEVERE TH.+M",data$EVTYPE))&
                !grepl("MARINE THUNDERSTORM WIND",data$EVTYPE)] <- "THUNDERSTORM WIND"
data$EVTYPE[grepl("TYPHOON",data$EVTYPE)] <- "HURRICANE (TYPHOON)"
data$EVTYPE[grepl("GUSTY",data$EVTYPE)] <- "STRONG WIND"
data$EVTYPE[grepl("LAKE.+SNOW",data$EVTYPE)] <- "LAKE-EFFECT SNOW"
data$EVTYPE[grepl("HAIL",data$EVTYPE)&
                     !grepl('MARINE',data$EVTYPE)] <- "HAIL" 
data$EVTYPE[grepl("SLIDE",data$EVTYPE)] <- "DEBRIS FLOW"
data$EVTYPE[grepl("TROPICAL",data$EVTYPE)&
                     !grepl('DEPRESSION',data$EVTYPE)] <- "TROPICAL STORM"
data$EVTYPE[grepl("RAIN",data$EVTYPE)] <- "HEAVY RAIN"
data$EVTYPE[grepl("FOG",data$EVTYPE)&
              !grepl('FREEZING',data$EVTYPE)] <- "DENSE FOG"
data$EVTYPE[grepl("COASTAL FLOODING",data$EVTYPE)|
                     grepl('COASTAL FLOODING/EROSION',data$EVTYPE)] <- "COASTAL FLOOD"
data$EVTYPE[grepl("HIGH.+WINDS$",data$EVTYPE)] <- "HIGH WIND"
data$EVTYPE[grepl("FIRE",data$EVTYPE)] <- "WILDFIRE"
data$EVTYPE[grepl("ICE",data$EVTYPE)&
                     !grepl("STORM",data$EVTYPE)] <- "FROST/FREEZE"
data$EVTYPE[grepl("FROST",data$EVTYPE)|
                      grepl("GLAZE",data$EVTYPE)|
                      grepl("ICY ROADS",data$EVTYPE)|
                      grepl("EARLY FROST",data$EVTYPE)] <- "FROST/FREEZE"
data$EVTYPE[grepl("DROUGHT",data$EVTYPE)] <- "DROUGHT"
data$EVTYPE[(grepl("HEAT",data$EVTYPE)&
                     grepl('EX',data$EVTYPE))|
              grepl("HYPERTHERMIA/EXPOSURE",data$EVTYPE)] <- "EXCESSIVE HEAT"
data$EVTYPE[grepl("CHILL",data$EVTYPE)&
                     grepl('EX',data$EVTYPE)] <- "EXTREME COLD/WIND CHILL"
data$EVTYPE[grepl("FLOOD",data$EVTYPE)&
                     grepl('FLASH',data$EVTYPE)] <- "FLASH FLOOD"
data$EVTYPE[grepl("SURF",data$EVTYPE)|
              grepl("ROGUE WAVE",data$EVTYPE)] <- "HIGH SURF"
data$EVTYPE[grepl("WINTER",data$EVTYPE)&
                     !grepl('STORM',data$EVTYPE)] <- "WINTER WEATHER"
data$EVTYPE[grepl("BLIZZARD",data$EVTYPE)] <- "BLIZZARD"
data$EVTYPE[(grepl("WIN[^DG].+",data$EVTYPE)|
                      grepl('COOL',data$EVTYPE)|
                      grepl('FREEZING\\s[^F]',data$EVTYPE))&
                     !grepl('STORM',data$EVTYPE)] <- "WINTER WEATHER"
data$EVTYPE[grepl("HEAVY\\s[PMS][^NUEW]'",data$EVTYPE)|
                     grepl("EXCESSIVE WETNESS",data$EVTYPE)|
                     grepl("PRECIP",data$EVTYPE)] <- "HEAVY RAIN"
data$EVTYPE[grepl("^[^MHST].+WIN[DS]$",data$EVTYPE)|
                     grepl('FORCE',data$EVTYPE)|
              grepl("DRY MIRCOBURST WINDS",data$EVTYPE)] <- "STRONG WIND"
data$EVTYPE[grepl("URBAN",data$EVTYPE)] <- "FLOOD"
data$EVTYPE[grepl("MICROBURST",data$EVTYPE)] <- "STRONG WIND"
data$EVTYPE[grepl("COASTAL.+[N]$",data$EVTYPE)] <- "COASTAL FLOOD"
data$EVTYPE[grepl("HEAT WAVE",data$EVTYPE)|
                     grepl("RECORD HEAT",data$EVTYPE)] <- "EXCESSIVE HEAT"
data$EVTYPE[grepl("WARM",data$EVTYPE)] <- "HEAT"
data$EVTYPE[grepl("WINTER STORMS",data$EVTYPE)] <- "WINTER STORM" 
data$EVTYPE[grepl("STRONG WINDS",data$EVTYPE)] <- "STRONG WIND"
data$EVTYPE[grepl("RIP CURRENTS",data$EVTYPE)] <- "RIP CURRENT"
data$EVTYPE[grepl("WATERSPOUT-",data$EVTYPE)] <- "WATERSPOUT"
data$EVTYPE[grepl("TSTMW",data$EVTYPE)] <- "THUNDERSTORM WIND"
data$EVTYPE[grepl("STORM SURGE",data$EVTYPE)|
              grepl("COASTAL SURGE",data$EVTYPE)] <- "STORM SURGE/TIDE"
data$EVTYPE[grepl("LIG.+G",data$EVTYPE)] <- "LIGHTNING"
data$EVTYPE[grepl("HEAVY SWELLS",data$EVTYPE)|
                     grepl("ROUGH SEAS",data$EVTYPE)] <- "HIGH SURF"
data$EVTYPE[grepl("NON-SEVERE WIND DAMAGE",data$EVTYPE)|
                     grepl("BLOWING DUST",data$EVTYPE)] <- "STRONG WIND" 
data$EVTYPE[grepl("MARINE TSTM WIND",data$EVTYPE)] <- "MARINE THUNDERSTORM WIND"
data$EVTYPE[grepl("LANDSPOUT",data$EVTYPE)] <- "TORNADO"
data$EVTYPE[grepl("DOWNBURST",data$EVTYPE)] <- "STRONG WIND"
data$EVTYPE[grepl("COASTAL STORM",data$EVTYPE)] <- "MARINE THUNDERSTORM WIND"
data$EVTYPE[grepl("ASTRONOMICAL HIGH TIDE",data$EVTYPE)] <- "ASTRONOMICAL LOW TIDE"
data$EVTYPE[grepl("BEACH EROSION",data$EVTYPE)] <- "DEBRIS FLOW"
data$EVTYPE[grepl("RAPIDLY RISING WATER",data$EVTYPE)] <- "FLOOD"
data$EVTYPE[grepl("DROWNING",data$EVTYPE)] <- "HEAVY RAIN"
data$EVTYPE[grepl("AVALANCE",data$EVTYPE)] <- "AVALANCHE"
data$EVTYPE[grepl("HEAVY MIX",data$EVTYPE)] <- "HEAVY SNOW"
data$EVTYPE[grepl("COASTALSTORM",data$EVTYPE)|
              grepl("HEAVY SHOWER",data$EVTYPE)] <- "HEAVY RAIN"
data$EVTYPE[grepl("SLEET/ICE STORM",data$EVTYPE)] <- "SLEET"
data$EVTYPE[grepl("LAKE FLOOD",data$EVTYPE)] <- "LAKESHORE FLOOD"
data$EVTYPE[grepl("LANDSLUMP",data$EVTYPE)] <- "DEBRIS FLOW"
data$EVTYPE[grepl("MARINE ACCIDENT",data$EVTYPE)] <- "HIGH SURF"
data$EVTYPE[grepl("DAM BREAK",data$EVTYPE)] <- "HEAVY RAIN"
data$EVTYPE[grepl("GUSTNADO",data$EVTYPE)] <- "THUNDERSTORM WIND"
data$EVTYPE[grepl("DUST DEVIL WATERSPOUT",data$EVTYPE)] <- "DUST DEVIL"

# remove '?' EVTYPE since no information for weather
data <- subset(data, EVTYPE!='?')

# for "OTHER" type, need to look up REMARKS column 
# get the row index and read all the REMARKS
rownames(data[data$EVTYPE=='OTHER',])
##  [1] "249470" "296122" "298603" "298604" "298605" "298606" "298607" "298608"
##  [9] "298609" "298610" "298611" "298612" "298613" "298614" "298615" "298616"
## [17] "298617" "298618" "298619" "298620" "298621" "298622" "298623" "298624"
## [25] "298625" "298626" "298627" "298628" "298629" "298630" "298631" "298632"
## [33] "399112" "435359"
# there were only 3 possible event types. Assign them using row index.
data["249470",'EVTYPE'] <- "AVALANCHE"
data[c("296122","399112","435359"),'EVTYPE'] <- "DUST DEVIL"
data[c("298603","298604","298605","298606","298607","298608","298609",
       "298610", "298611", "298612", "298613", "298614", "298615",
       "298616", "298617", "298618","298619", "298620", "298621",
       "298622" ,"298623", "298624", "298625", "298626", "298627",
       "298628", "298629", "298630", "298631", "298632"),'EVTYPE'] <- "HEAVY RAIN"

# Unlisted events down to 0....!!!!
unique(data$EVTYPE[!(data$EVTYPE %in% events)])
## character(0)

Storm Data has the records from 1950 to 2011. There are 48 official categories of events but from 1950 to 1992, only 1 - 3 events were recorded each year. Thus we subset the data from 1993.

# Change BGN_DATE column to Date class
library(lubridate)
date <- gsub('/','-',data$BGN_DATE); date <- gsub(' 0:00:00','',date)

# Change class of BGN_DATE and Make a new column for YEAR
data <- mutate(data, BGN_DATE=mdy(date),YEAR=year(BGN_DATE))

before <- dim(data)[1]
# filter the data only after 1993
data <- filter(data, YEAR >= '1993')
after <- dim(data)[1]

Now data has only records from 1993 to 2011. The number of records reduced to 227256 from 254632 .

And lastly, make new columns of the sum of the damages to make an analysis easier.

data <- mutate(data, 
               totFatInj=FATALITIES+INJURIES,
               totCropProp=totPropDmg+totCropDmg)

3.Results

3.1 The weather events that are most harmful with respect to population health

Table.2 is a list of top 5 weather events which caused the highest fatalities and injuries from 1993 to 2011. (see full list for appendix 1).

library(kableExtra)
healthDmgSum <- data %>% 
  group_by(EVTYPE) %>%
  summarize(Fatalities=sum(FATALITIES),Injuries=sum(INJURIES),Total=sum(totFatInj)) %>%
  arrange(desc(Total)) 

Rank <- 1:5
kbl(cbind(Rank,healthDmgSum[1:5,]), caption='Table.2: Top 5 Events Recorded Highest Fatalities/Injuries (Descending order of Total)',
             type='html')  %>% 
  kable_styling(latex_options = 'hold_position',full_width=F,position='left') %>%
  column_spec(2,bold=T) %>%
  column_spec(5,bold=T)
Table.2: Top 5 Events Recorded Highest Fatalities/Injuries (Descending order of Total)
Rank EVTYPE Fatalities Injuries Total
1 TORNADO 1649 23371 25020
2 EXCESSIVE HEAT 2196 7109 9305
3 FLOOD 513 6874 7387
4 THUNDERSTORM WIND 448 6182 6630
5 LIGHTNING 817 5232 6049

Figure.1. is the graph of total fatalities/injuries by year for each top 5 weather events.

library(ggplot2)
top5healthDmg <- healthDmgSum$EVTYPE[1:5]

PlotHealth <- data %>%
  select(c(YEAR,EVTYPE,totFatInj)) %>%
  filter(EVTYPE %in% top5healthDmg) %>%
  group_by(YEAR,EVTYPE) %>%
  summarize(YEARLY=sum(totFatInj))
  
g <- ggplot(PlotHealth,aes(x=YEAR,y=YEARLY,color=EVTYPE))
g <- g + geom_point() + geom_line()
g <- g + labs(title='Figure.1: Fatalities and Injuries by Year (1993-2011)',
          y='Total Fatalities/Injuries')
g

  • TORNADO has the highest figure in the records, especially in 2011, 77% of the total fatalities/injuries in this year were caused by TORNADO.
  • The EXCESSIVE HEAT has the highest fatalities among all the events.
  • The number of the fatalities/injuries by FLOOD were relatively low except 1998. The figure in this year was nearly 84% of all the fatalities/injuries caused by Flood in the record, which means this high figure is due to the record of this single year.

3.2 The weather events having the greatest economic consequences

Table.3 is a list of top 5 weather events which caused the greatest damages to crops and properties from 1993 to 2011. (see full list for appendix 2)

library(scales)
dollarFormat <- function(x) dollar(x)
economicDmgSum <- data %>%
  group_by(EVTYPE) %>%
  summarize('Crop Damage'=sum(totCropDmg),
            'Property Damage'=sum(totPropDmg),
            Total=sum(totCropProp)) %>%
  arrange(desc(Total)) %>%
  mutate_if(is.numeric,dollarFormat)

Rank <- 1:5
kbl(cbind(Rank,economicDmgSum[1:5,]),
    align=c('c','l','r','r','r'),
    caption='Table.3: Top 5 Events Recorded Highest Damages to Crops/Properties (Descending order of Total)',
             type='html')  %>% 
  kable_styling(latex_options = 'hold_position',full_width=F,position='left') %>%
  column_spec(2,bold=T) %>%
  column_spec(5,bold=T)
Table.3: Top 5 Events Recorded Highest Damages to Crops/Properties (Descending order of Total)
Rank EVTYPE Crop Damage Property Damage Total
1 FLOOD $10,855,961,050 $150,234,461,439 $161,090,422,489
2 HURRICANE (TYPHOON) $5,516,117,800 $85,356,410,010 $90,872,527,810
3 STORM SURGE/TIDE $855,000 $47,965,224,000 $47,966,079,000
4 TORNADO $417,462,960 $27,994,909,080 $28,412,372,040
5 HAIL $3,046,887,803 $15,974,542,434 $19,021,430,237

Figure.2 is the graph of total damages to crops/properties by year for each top 5 weather events.

top5economicDmg <- economicDmgSum$EVTYPE[1:5]

PlotEconomy <- data %>%
  select(c(YEAR,EVTYPE,totCropProp)) %>%
  filter(EVTYPE %in% top5economicDmg) %>%
  group_by(YEAR,EVTYPE) %>%
  summarize(YEARLY=sum(totCropProp)/1e+6)
  
g <- ggplot(PlotEconomy,aes(x=YEAR,y=YEARLY,color=EVTYPE))
g <- g + geom_point() + geom_line()
g <- g + labs(title='Figure.2: Crops and Properties Damages by Year (1993-2011)',
          y='Total Damages to Crops/Properties (in Million US dollars)')
g

As we can see in Figure.2, most of the economic damages to crops and properties have been caused by three events, FLOOD, HURRICANE (TYPHOON) and STORM SURGE/TIDE between 2004 and 2006.

References

How To Handle Exponent Value of PROPDMGEXP and CROPDMGEXP

Storm Data Documentation

Appendix

1.The Full list of total Fatalities/Injuries for each event.

kable(healthDmgSum,type='html') %>% 
  kable_styling(font_size=10,full_width=F,position='left')
EVTYPE Fatalities Injuries Total
TORNADO 1649 23371 25020
EXCESSIVE HEAT 2196 7109 9305
FLOOD 513 6874 7387
THUNDERSTORM WIND 448 6182 6630
LIGHTNING 817 5232 6049
HEAT 977 2119 3096
FLASH FLOOD 1035 1802 2837
ICE STORM 89 1975 2064
HIGH WIND 319 1574 1893
WILDFIRE 90 1608 1698
WINTER STORM 216 1338 1554
HURRICANE (TYPHOON) 135 1333 1468
HEAVY SNOW 166 1159 1325
DENSE FOG 80 1076 1156
RIP CURRENT 572 529 1101
HAIL 10 970 980
BLIZZARD 101 805 906
WINTER WEATHER 65 630 695
EXTREME COLD/WIND CHILL 319 260 579
STRONG WIND 120 350 470
DUST STORM 22 440 462
HIGH SURF 188 263 451
FROST/FREEZE 22 427 449
TROPICAL STORM 66 383 449
HEAVY RAIN 112 329 441
AVALANCHE 225 170 395
COLD/WIND CHILL 149 61 210
TSUNAMI 33 129 162
DEBRIS FLOW 44 55 99
STORM SURGE/TIDE 24 43 67
MARINE THUNDERSTORM WIND 22 36 58
DUST DEVIL 2 47 49
MARINE STRONG WIND 14 22 36
WATERSPOUT 3 29 32
DROUGHT 6 19 25
MARINE HIGH WIND 8 6 14
COASTAL FLOOD 6 7 13
FUNNEL CLOUD 0 3 3
SLEET 2 0 2
ASTRONOMICAL LOW TIDE 0 0 0
DENSE SMOKE 0 0 0
FREEZING FOG 0 0 0
LAKE-EFFECT SNOW 0 0 0
LAKESHORE FLOOD 0 0 0
MARINE HAIL 0 0 0
SEICHE 0 0 0
TROPICAL DEPRESSION 0 0 0
VOLCANIC ASH 0 0 0

2.The full list of total damages to the crops/properties for each event.

kable(economicDmgSum,type='html') %>% 
  kable_styling(font_size=10,full_width=F,position='left')
EVTYPE Crop Damage Property Damage Total
FLOOD $10,855,961,050 $150,234,461,439 $161,090,422,489
HURRICANE (TYPHOON) $5,516,117,800 $85,356,410,010 $90,872,527,810
STORM SURGE/TIDE $855,000 $47,965,224,000 $47,966,079,000
TORNADO $417,462,960 $27,994,909,080 $28,412,372,040
HAIL $3,046,887,803 $15,974,542,434 $19,021,430,237
FLASH FLOOD $1,532,197,150 $16,907,931,420 $18,440,128,570
DROUGHT $13,972,621,780 $1,046,306,000 $15,018,927,780
THUNDERSTORM WIND $1,271,666,258 $10,970,671,099 $12,242,337,357
ICE STORM $5,022,113,500 $3,944,928,310 $8,967,041,810
WILDFIRE $403,281,630 $8,496,628,500 $8,899,910,130
TROPICAL STORM $694,896,000 $7,714,390,550 $8,409,286,550
HIGH WIND $692,661,900 $6,074,134,763 $6,766,796,663
WINTER STORM $27,444,000 $6,688,997,260 $6,716,441,260
HEAVY RAIN $949,540,200 $3,247,288,690 $4,196,828,890
FROST/FREEZE $1,997,061,800 $37,117,700 $2,034,179,500
EXTREME COLD/WIND CHILL $1,330,023,000 $77,190,400 $1,407,213,400
HEAVY SNOW $134,663,100 $986,222,767 $1,120,885,867
LIGHTNING $12,092,090 $933,743,920 $945,836,010
BLIZZARD $112,060,000 $659,813,950 $771,873,950
EXCESSIVE HEAT $502,952,000 $18,328,750 $521,280,750
COASTAL FLOOD $56,000 $428,382,060 $428,438,060
HEAT $401,471,500 $1,797,000 $403,268,500
DEBRIS FLOW $20,017,000 $327,496,100 $347,513,100
STRONG WIND $70,178,500 $186,952,840 $257,131,340
COLD/WIND CHILL $96,742,500 $58,644,000 $155,386,500
TSUNAMI $20,000 $144,062,000 $144,082,000
HIGH SURF $0 $102,067,000 $102,067,000
WINTER WEATHER $20,000,000 $27,455,500 $47,455,500
LAKE-EFFECT SNOW $0 $40,682,000 $40,682,000
DENSE FOG $0 $22,829,500 $22,829,500
ASTRONOMICAL LOW TIDE $0 $9,745,000 $9,745,000
WATERSPOUT $0 $9,564,200 $9,564,200
DUST STORM $3,100,000 $5,549,000 $8,649,000
LAKESHORE FLOOD $0 $7,570,000 $7,570,000
MARINE THUNDERSTORM WIND $50,000 $5,907,400 $5,957,400
AVALANCHE $0 $3,771,800 $3,771,800
FREEZING FOG $0 $2,182,000 $2,182,000
TROPICAL DEPRESSION $0 $1,737,000 $1,737,000
MARINE HIGH WIND $0 $1,297,010 $1,297,010
SEICHE $0 $980,000 $980,000
DUST DEVIL $0 $724,630 $724,630
SLEET $0 $500,000 $500,000
VOLCANIC ASH $0 $500,000 $500,000
MARINE STRONG WIND $0 $418,330 $418,330
FUNNEL CLOUD $0 $194,600 $194,600
RIP CURRENT $0 $163,000 $163,000
DENSE SMOKE $0 $100,000 $100,000
MARINE HAIL $0 $4,000 $4,000

3.Code used in 3.1 for observation.

# SECTION 3-1 OBSERVATION OF THE FATALITIES AND INJURIES
# Calculate 2011 tornado percentage in total fatalities/injuries
data2011 <- data %>%
  filter(YEAR=='2011') %>%
  group_by(EVTYPE) %>%
  summarize(S=sum(totFatInj))

pctTornado <- round(data2011$S[data2011$EVTYPE=='TORNADO'] / sum(data2011$S),2) * 100
# Calculate flood percentage of 1998 
flood1998 <- filter(data, EVTYPE=='FLOOD' & YEAR=='1998') 
flood <- filter(data,EVTYPE=='FLOOD') 

pct1998 <- round(sum(flood1998$totFatInj) / sum(flood$totFatInj),2)*100