Impact of Weather Events on Population Health and Economic Losses

An Analysis of NOAA’s Storm Events Database

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. This report analyzes which weather events most affect both aspects across the United States. “Heat” related events caused the most fatalities, followed by “Tornado” and “Flood” related events. “Flash Flood” had the most occurrences among the top 5 fatality events. With respect to economic losses, “Flood” events caused the most property damage, followed by “Hurricane/Typhoon”. “Drought” events, on the other hand, caused the most crop damage, followed by “Flood”. However, the property damage attributed to Flood was markedly higher (about 70%) than that attributed to Hurricane/Typhoon. Further investigation would be necessary to see whether any recording errors have occurred.

NOAA’s Storm Events Database contains records from 1950 to 2011. Because only three event types were reported before 1993, records prior to 1993 were excluded from this analysis.

Data Source

Database

The dataset was obtained from the course web site as a comma-separated-value file compressed with bzip2.

file name: “repdata-data-StormData.csv.bz2”

Storm Data Event Table

The standard 48 event types were obtained from NWS Directive 10-1605, page 6, section 2.1.1.

This NOAA webpage provided a timeline of how event type records have progressed over the years.

Consumer Price Index

The data ranged from 1950 to 2011, so inflation had to be considered. In this analysis, all dollar amounts were adjusted with the U.S. Consumer Price Index (CPI).

CPI data were downloaded from the Bureau of Labor Statistics:

series ID CUUR0000SA0 (Not Seasonally Adjusted. Area: U.S. city average. Item: All items. Base Period: 1982-84=100).

The downloaded file was in MS Excel (.xlsx) format and was then saved as a comma-separated-value file for use in R.

Because obtaining the CPI data required pointing and clicking through the website by hand, the detailed process was documented in this GitHub repository

file name: “cpi_1950_2011.csv”, “Getting CPI data from BUREAU of Labor Statistics.pdf”

# Check and load required R packages
pkg<-c("knitr", "ggplot2", "grid", "dplyr", "tidyr")
pkgCheck<-pkg %in% rownames(installed.packages())
for(i in 1:length(pkg)) {
    if(pkgCheck[i]==FALSE) {
        install.packages(pkg[i])
    } 
    library(pkg[i],character.only = TRUE)
}

Data Processing

Loading Raw Data

con<-bzfile("repdata-data-StormData.csv.bz2", "r")
storm<- read.csv(con)
close(con)

#set table and figure counter
tnum<- 0L
fnum<- 0L

Variables

colnames(storm)

##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"

Of the 37 variables in the dataset, only the following were of interest for the intended analysis:

event type: EVTYPE
population health : FATALITIES / INJURIES
economic consequences: PROPDMG / CROPDMG for “number”, PROPDMGEXP / CROPDMGEXP for “magnitude”. The damage dollar amount would be “number” x “magnitude”
Others that may be useful for analysis: STATE / BGN_DATE

str(storm[, c(2,7,8,23,24:28)])

## 'data.frame':    902297 obs. of  9 variables:
##  $ BGN_DATE  : Factor w/ 16335 levels "1/1/1966 0:00:00",..: 6523 6523 4242 11116 2224 2224 2260 383 3980 3980 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...

# convert string to Date format
storm$BGN_DATE<- strptime(storm$BGN_DATE, "%m/%d/%Y %H:%M:%S")
storm$YEAR<- storm$BGN_DATE$year+1900
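
As a small illustration of the date conversion above, the sketch below parses two made-up strings in the same "%m/%d/%Y %H:%M:%S" layout as BGN_DATE and extracts the year in two ways. The sample strings and the names sample_dates, parsed, years_lt, and years_fmt are assumptions for demonstration only; the tz argument is added here just to make the sketch independent of the local time zone.

```r
# Hypothetical sample values in the same layout as BGN_DATE
sample_dates <- c("4/18/1950 0:00:00", "11/28/2011 0:00:00")

# strptime() returns POSIXlt; its $year component counts years since 1900
parsed <- strptime(sample_dates, "%m/%d/%Y %H:%M:%S", tz = "UTC")
years_lt <- parsed$year + 1900

# Equivalent extraction through format()
years_fmt <- as.integer(format(parsed, "%Y"))

print(years_lt)
print(years_fmt)
```

Both approaches should give the same years; the $year+1900 form used in this report avoids a character round trip.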

Cleaning and Processing Data

1. Property and Crop Damage

1.1 Check data quality

As stated in NWS Directive 10-1605, page 12, section 2.7 Damage: “…Alphabetical characters used to signify magnitude include “K” for thousands, “M” for millions, and “B” for billions…”

However, the dataset had many magnitudes that seemed invalid:

levels(storm$PROPDMGEXP) #property damage

##  [1] ""  "-" "?" "+" "0" "1" "2" "3" "4" "5" "6" "7" "8" "B" "h" "H" "K"
## [18] "m" "M"

levels(storm$CROPDMGEXP) # crop damage

## [1] ""  "?" "0" "2" "B" "k" "K" "m" "M"

1.2 Examine the invalid damage magnitudes

“H” and “h” were assumed to signify “hundred” and were therefore treated as valid magnitudes along with “K”, “M”, and “B”. The code below counts how many events were marked with invalid damage magnitudes. All letter magnitudes were converted to lower case.

# assume h="hundred". use lower cases
mag<-c("h", "k", "m", "b")

# (for inline text) find records with a damage number (PROPDMG>0 or CROPDMG>0) but an invalid magnitude
invalidP<- which(!tolower(storm$PROPDMGEXP) %in% mag & storm$PROPDMG>0)
invalidPROPDMG<- list(Count=length(invalidP),
                      Count.Perc=length(invalidP)/nrow(storm[storm$PROPDMG>0,]),
                      Number.Perc=sum(storm$PROPDMG[invalidP])/sum(storm$PROPDMG),
                      Count.byYear=with(storm[invalidP, ], table(tolower(PROPDMGEXP), YEAR)))

invalidC<- which(!tolower(storm$CROPDMGEXP) %in% mag & storm$CROPDMG>0)
invalidCROPDMG<- list(Count=length(invalidC),
                      Count.Perc=length(invalidC)/nrow(storm[storm$CROPDMG>0,]),
                      Number.Perc=sum(storm$CROPDMG[invalidC])/sum(storm$CROPDMG),
                      Count.byYear=with(storm[invalidC, ], table(tolower(CROPDMGEXP), YEAR)))

Number of events with invalid damage magnitude by year, Property and Crop:

invalidPROPDMG$Count.byYear

##    YEAR
##     1993 1994 1995
##        6    3   67
##   -    0    0    1
##   +    0    1    4
##   0    1   25  183
##   2    0    0    1
##   3    0    0    1
##   4    0    1    3
##   5    0    1   17
##   6    0    0    3
##   7    0    0    2

invalidCROPDMG$Count.byYear

##    YEAR
##     1994 1995
##        3    0
##   0    8    4

Among all events with a property damage number (i.e. PROPDMG > 0), ~0.1% (320 events) were marked with an invalid magnitude, and all of them were recorded between 1993 and 1995.

Among all events with a crop damage number (i.e. CROPDMG > 0), ~0.1% (15 events) were marked with an invalid magnitude, and all of them were recorded in 1994 and 1995.

Rather than guessing how to correct the invalid damage magnitudes, we decided to ignore these events since they had minimal impact on the intended analysis.

1.3 Calculate damage dollar amount

The property and crop damage dollar amounts were calculated in millions for each event:

if the number > 0 and the magnitude was valid, the dollar amount was calculated by multiplying the two
if the number = 0, the dollar amount was 0 as well
if the number > 0 and the magnitude was invalid, the dollar amount was “NA”

# set multipliers
mag10<-10^c(2, 3, 6, 9)

storm$PROPDMG.M<- storm$PROPDMG * mag10[match(tolower(storm$PROPDMGEXP), mag)]/10^6
storm$CROPDMG.M<- storm$CROPDMG * mag10[match(tolower(storm$CROPDMGEXP), mag)]/10^6

#if damage number=0, its dollar amount should be also 0, not NA
storm$PROPDMG.M[storm$PROPDMG==0]<- 0
storm$CROPDMG.M[storm$CROPDMG==0]<- 0
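
To make the three rules above concrete, here is a minimal sketch that applies the same match()-based lookup to a handful of made-up records. The data frame toy and its columns DMG, DMGEXP, and DMG.M are hypothetical stand-ins for the real PROPDMG/CROPDMG columns; only the lookup logic mirrors the code above.

```r
# Toy records (made up) covering valid, invalid, and zero-damage cases
toy <- data.frame(
  DMG    = c(25, 2.5, 0, 10, 3),
  DMGEXP = c("K", "m", "", "?", "B"),
  stringsAsFactors = FALSE
)

mag   <- c("h", "k", "m", "b")   # recognised magnitude letters
mag10 <- 10^c(2, 3, 6, 9)        # matching multipliers

# match() returns NA for unrecognised letters, so those amounts become NA
toy$DMG.M <- toy$DMG * mag10[match(tolower(toy$DMGEXP), mag)] / 10^6

# A zero damage number means zero dollars, whatever the magnitude says
toy$DMG.M[toy$DMG == 0] <- 0

print(toy)
```

In this toy case the "?" record stays NA, the blank-magnitude record with a zero number becomes 0, and the remaining amounts are expressed in millions of dollars.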

1.4 Adjustment for inflation

# read cpi file, use the "Annual" column
cpi<- read.csv("cpi_1950_2011.csv", skip=10, colClasses=c(NA,rep("NULL",12),NA,"NULL","NULL"))

# index for converting to present value
cpi<- transform(cpi, pv=Annual[Year==2011]/Annual)

# calculate adjusted dollar amount
storm<- transform(storm, adjPROPDMG.M=round(PROPDMG.M * cpi[match(YEAR, cpi$Year),"pv"],0), 
                  adjCROPDMG.M=round(CROPDMG.M * cpi[match(YEAR, cpi$Year),"pv"],0))
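
The adjustment above rescales each amount by the ratio of the 2011 CPI to the CPI of the event year. The sketch below works through that arithmetic on invented numbers: the index values in cpi_toy are round figures chosen for easy checking, not BLS data, and the events table is likewise hypothetical.

```r
# Hypothetical CPI table; the index values are round numbers, not BLS data
cpi_toy <- data.frame(Year = c(1993, 2000, 2011), Annual = c(100, 125, 200))

# Factor that converts each year's dollars into 2011 dollars
cpi_toy$pv <- cpi_toy$Annual[cpi_toy$Year == 2011] / cpi_toy$Annual

# Made-up damage records in nominal millions of dollars
events <- data.frame(YEAR = c(1993, 2011, 2000), DMG.M = c(10, 10, 4))

# Look up each event's factor by year and rescale
events$adjDMG.M <- events$DMG.M * cpi_toy$pv[match(events$YEAR, cpi_toy$Year)]

print(events)
```

With these toy indices, 10 million in 1993 dollars corresponds to 20 million in 2011 dollars, while a 2011 amount is unchanged.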

2. Event Type Grouping

2.1 Check data quality

# EVTYPE factor levels
nlevels(storm$EVTYPE)

## [1] 985

# number of event types with five or fewer records, and with only one record
list(five.or.less= sum(table(storm$EVTYPE)<6),
     only.one= sum(table(storm$EVTYPE)==1))

## $five.or.less
## [1] 760
## 
## $only.one
## [1] 489

Among the dataset’s 985 event types, most had five or fewer records (760), and almost half had only one record (489). To make the analysis tractable, they were grouped into the standard 48 event types defined in NWS Directive 10-1605. Sections 2.2 to 2.5 below show how the event matching was done.

2.2 Get standard event type table

In keeping with the principle of “Don’t do things by hand”, we imported the event table directly from the NWS Directive 10-1605 document.

url <- "http://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf"
dest <- tempfile(fileext = ".pdf")
download.file(url, dest, mode = "wb")

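# set path to pdftotext.exe and convert pdf to text
exe <- "C:\\Program Files (x86)\\Git\\bin\\pdftotext.exe"
# wait for the conversion to finish so the text file exists before it is renamed
system(paste("\"", exe, "\" \"", dest, "\"", sep = ""), wait = TRUE)

# get txt-file name (replace only the trailing ".pdf") and rename it
filetxt <- sub("\\.pdf$", ".txt", dest)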
fsave<- ".\\NWS Directive 10-1605.txt"
file.rename(filetxt, fsave)
evtb<- readLines(fsave)

t<-grep("Event Name", evtb)[1] # get the line where the event type table begins
evtb<-tolower(evtb[c((t+1):(t+24), (t+29):(t+52))])
evtb[25]<- "hurricane/typhoon"  # change from "hurricane (typhoon)"

# move the general event types "flood", "heat", "strong wind", "thunderstorm wind" to the front,
# so that more specific types matched later (e.g. "flash flood", "excessive heat") can override them
evtb<-evtb[c(15,20,38,39,1:14,16:19,21:37, 40:48)]
print(evtb)

##  [1] "flood"                    "heat"                    
##  [3] "strong wind"              "thunderstorm wind"       
##  [5] "astronomical low tide"    "avalanche"               
##  [7] "blizzard"                 "coastal flood"           
##  [9] "cold/wind chill"          "debris flow"             
## [11] "dense fog"                "dense smoke"             
## [13] "drought"                  "dust devil"              
## [15] "dust storm"               "excessive heat"          
## [17] "extreme cold/wind chill"  "flash flood"             
## [19] "frost/freeze"             "funnel cloud"            
## [21] "freezing fog"             "hail"                    
## [23] "heavy rain"               "heavy snow"              
## [25] "high surf"                "high wind"               
## [27] "hurricane/typhoon"        "ice storm"               
## [29] "lake-effect snow"         "lakeshore flood"         
## [31] "lightning"                "marine hail"             
## [33] "marine high wind"         "marine strong wind"      
## [35] "marine thunderstorm wind" "rip current"             
## [37] "seiche"                   "sleet"                   
## [39] "storm surge/tide"         "tornado"                 
## [41] "tropical depression"      "tropical storm"          
## [43] "tsunami"                  "volcanic ash"            
## [45] "waterspout"               "wildfire"                
## [47] "winter storm"             "winter weather"

2.3 Group (match) database events with the standard 48 event types

On inspecting the variable EVTYPE, some general discrepancies were noticed and corrected before the main event-matching task, as shown below:

# transform database event type factor levels to lower case
f<- tolower(levels(storm$EVTYPE))

# thunderstorm wind
# NOAA glossary(http://forecast.weather.gov/glossary.php?word=tstm), tstm=thunderstorm
f<- gsub("tstm","thunderstorm wind", f)
f<- gsub("thunderstorm","thunderstorm wind", f)
f<- gsub("burst","thunderstorm wind", f)

# from NWS Directive  10-1605 page 1 Summary of Revisions: "Landslide" was renamed to "Debris Flow"
f<- gsub("landslide","debris flow", f)

# hurricane/typhoon
f<- gsub("^typhoon","hurricane/typhoon", f)
f<- gsub("hurricane","hurricane/typhoon", f)

# others/typos
f<- gsub("wintery","winter", f)
f<- gsub("wintry","winter", f)
f<- gsub("avalance","avalanche", f)
f<- gsub("fld","flood", f)

The main event-matching task used a for-loop over the standard event types. Events that failed to match were grouped as “others”:

fClean<- rep("others", times=length(f))
for (i in 1:length(evtb)) {
    fClean[grep(evtb[i], f)]<- evtb[i]
}
rm(i)
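
The loop above relies on the order of the standard event types: each pass overwrites earlier assignments, so a general name placed early can be replaced by a more specific name matched later. The sketch below shows this on a few made-up labels. The vectors raw, std, and clean are hypothetical, and fixed = TRUE is used here for literal matching, which is an assumption of this sketch rather than what the loop above does.

```r
# Made-up raw labels, already lower-cased
raw <- c("flash flood", "river flood", "excessive heat", "heat wave", "fog")

# General names first, specific names later, so later matches overwrite
std <- c("flood", "heat", "flash flood", "excessive heat")

clean <- rep("others", length(raw))
for (s in std) {
  clean[grep(s, raw, fixed = TRUE)] <- s
}

print(data.frame(raw, clean))
```

If "flood" came after "flash flood" in std, the label "flash flood" would end up in the general "flood" group instead.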

Further grouping for events that could be clearly matched :

fClean[grepl("wild[ /]", f) & fClean=="others"]<- "wildfire"
fClean[grepl("extreme cold", f) & fClean=="others"]<- "extreme cold/wind chill"
fClean[grepl("surge", f) & fClean=="others"]<- "storm surge/tide"
fClean[grepl("surf", f) & fClean=="others"]<- "high surf"

#(inline codes) number of event types in the "others" group
unmatchedEV<- sum(fClean=="others")

2.4 Examine unmatched event types

There were 451 event types that were not successfully matched. They were generally ambiguous and were examined with the following code:

unmatched<- which(storm$EVTYPE %in% levels(storm$EVTYPE)[fClean=="others"])

# check unmatched event types that have five or fewer records / only one record throughout the 62 years of data
list(five.or.less= sum(table(droplevels(storm$EVTYPE[unmatched]))<6),
     only.one= sum(table(droplevels(storm$EVTYPE[unmatched]))==1))

## $five.or.less
## [1] 361
## 
## $only.one
## [1] 223

# (inline codes) examine the unmatched events
unmatchedCheck<- list(Count=length(unmatched),
                      Count.Perc=length(unmatched)/nrow(storm), 
                      Harm.Dmg.Perc=colSums(storm[unmatched, c("FATALITIES","INJURIES","adjPROPDMG.M","adjCROPDMG.M")], na.rm=T)/colSums(storm[,c("FATALITIES","INJURIES","adjPROPDMG.M","adjCROPDMG.M")], na.rm=T))

tnum=tnum+1
kable(data.frame(Variables=names(unmatchedCheck[[3]]), Perc=paste0(format(unmatchedCheck[[3]]*100, digit=1),"%")), col.names=c("Variables","Unmatched event %"), row.names=F)

Variables	Unmatched event %
FATALITIES	2.22%
INJURIES	1.15%
adjPROPDMG.M	0.04%
adjCROPDMG.M	2.68%

Table 1. Percentage of harm and damage variables from unmatched events.

The unmatched events represented 0.5% (4781 rows) of the dataset. Many of the unmatched event types had five or fewer records, and almost half had only one record. They accounted for only a small share of fatalities, injuries, property damage, and crop damage (Table 1). It was therefore decided that no further event matching was needed, and the unmatched events were grouped as “others”.

2.5 Create “Clean” event type variable

storm$EV.Clean<- storm$EVTYPE
levels(storm$EV.Clean)<- fClean

#check against 48 standard event type table
length(intersect(levels(storm$EV.Clean), evtb)) ; setdiff(levels(storm$EV.Clean), evtb)

## [1] 48

## [1] "others"

The “Clean” event type variable has 49 categories: the 48 standard event types plus “others”.
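
The relabelling in this section works because assigning a vector with repeated labels to levels() merges the corresponding factor levels. A minimal sketch on a made-up factor follows; ev, new_labels, and merged are hypothetical names, and the labels are listed in the factor's (alphabetical) level order.

```r
# Toy factor with two raw spellings of one event, one other event, one unknown
ev <- factor(c("TSTM WIND", "THUNDERSTORM WINDS", "HAIL", "TSTM WIND", "SUMMARY"))
print(levels(ev))

# Replacement labels, one per existing level, in level order:
# HAIL, SUMMARY, THUNDERSTORM WINDS, TSTM WIND
new_labels <- c("hail", "others", "thunderstorm wind", "thunderstorm wind")

merged <- ev
levels(merged) <- new_labels   # duplicated labels collapse into one level

print(table(merged))
```

No row-level recoding is needed: the observations keep their positions and only the level labels change.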

3. Event Type over Years

According to the NOAA website, only 3 event types (tornado, thunderstorm wind, and hail) were reported prior to 1993. This was examined with the code below:

main3EV<- c("tornado", "thunderstorm wind", "hail")
tnum=tnum+1

#(inline code) cross-tabulate main 3 vs. all other event types, before 1993 vs. 1993 onward
tb<-table(storm$EV.Clean %in% main3EV, storm$YEAR>1992)

Event Type	Before 1993	1993 Onward
Main 3*	187559	486993
All Others	0	227745

* Main 3 event types: tornado, thunderstorm wind, and hail

Table 2 Number of events reported before and after 1993 by event types
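
Table 2 comes from cross-tabulating two logical vectors. The sketch below reproduces the idea on a few invented records; the data frame toy, its columns, and the dimension names Main3 and From1993 are assumptions for illustration.

```r
# Made-up events with a cleaned event type and a year
toy <- data.frame(
  EV   = c("tornado", "hail", "flood", "tornado", "heat", "hail"),
  YEAR = c(1980, 1991, 1995, 2001, 2005, 1993),
  stringsAsFactors = FALSE
)

main3 <- c("tornado", "thunderstorm wind", "hail")

# Rows: is the event one of the main 3? Columns: was it reported from 1993 on?
tb <- table(Main3 = toy$EV %in% main3, From1993 = toy$YEAR > 1992)

print(tb)
```

A zero in the cell for "not main 3" and "before 1993" is the pattern that supports restricting the analysis to 1993 onward.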

fnum=fnum+1
ggplot(storm, aes(x=YEAR)) + 
    geom_histogram(aes(fill = EV.Clean %in% main3EV)) + 
    labs(title = "Number of Events Reported\nAll States, 1950-2011", x = "Year", y = "Number of Events Reported") + 
    scale_fill_discrete(name="Event Type", breaks=c("TRUE", "FALSE"), labels=c("Main 3*","All Others")) + 
    theme_classic() + 
    scale_x_continuous(limits=c(1950,2011), breaks=c(seq(1951,2011,6))) + 
    theme(title = element_text(size=14, face="bold"), axis.title=element_text(size=12, face="bold"), 
          panel.grid.major.x = element_blank(), panel.grid.minor.x = element_blank()) + 
    geom_vline(xintercept=1993, linetype="dotted", colour="red", size=2)

* Main 3 event types: tornado, thunderstorm wind, and hail

Figure 1 Number of events reported over time

Indeed there were only the main 3 event types reported before 1993. Therefore, data from 1950 to 1992 were excluded from this analysis.

4. Data Selection

Only data from 1993 to 2011 and the variables of interest were selected.

storm1993on<- subset(storm, YEAR>1992, select=c("STATE","YEAR","EV.Clean","FATALITIES","INJURIES","adjPROPDMG.M","adjCROPDMG.M"))

Data Processing Summary

Property and Crop Damage

Recognized only the valid magnitudes “H, K, M, B” (hundred, thousand, million, and billion, respectively). Ignored records with invalid magnitudes.
Calculated damage dollar amount in million and adjusted by CPI

Event Type

Regrouped event types, initially 985, into the standard 48 categories.
Grouped ambiguous events as “others”

Years

Used only data from 1993 to 2011 for this analysis due to the drastic change of event types reported.

Results

With Respect to Population Health

The Question : Across the United States, which types of events are most harmful with respect to population health?

harm.byEV <- storm1993on %>%
    select(EV.Clean, FATALITIES, INJURIES) %>%
    group_by(EV.Clean) %>%
    summarize(count=n(), fat=sum(FATALITIES), inj=sum(INJURIES)) %>% 
    mutate(fat.perc=prop.table(fat), inj.perc=prop.table(inj)) %>%
    arrange(desc(fat))

harmTop5<- cbind(harm.byEV[c(1:5),c(1,3,5)], arrange(harm.byEV, desc(inj))[c(1:5),c(1,4,6)])
countTop5<- arrange(harm.byEV, desc(count))[c(1:5),c(1:3)]

tnum<- tnum+1
kable(transform(harmTop5, fat.perc=paste0(format(fat.perc*100,digit=0),"%") ,inj.perc=paste0(format(inj.perc*100,digit=0),"%")), 
           col.names=c("Top 5 Fatality Events","Fatalities","%", "Top 5 Injury Events","Injuries","%"), 
           row.names=F)

Top 5 Fatality Events	Fatalities	%	Top 5 Injury Events	Injuries	%
excessive heat	1922	18%	tornado	23328	34%
tornado	1646	15%	flood	6874	10%
heat	1212	11%	excessive heat	6525	9%
flash flood	1035	10%	thunderstorm wind	6116	9%
lightning	817	8%	lightning	5232	8%

Table 3 Top 5 events that caused most fatalities and injuries

“Excessive Heat” was high on both the fatality and injury top 5 lists. Another heat-related event, “Heat”, also appeared on the top 5 fatality list. “Tornado” caused the most injuries. Three weather events were on both top 5 lists: Excessive Heat, Tornado, and Lightning.
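
The ranking behind Table 3 is a grouped sum followed by a share of the total. As a rough base-R equivalent of that dplyr pipeline, the sketch below ranks made-up fatality counts; the data frame toy and the names fat, share, and top2 are hypothetical.

```r
# Made-up records: event type and fatalities per record
toy <- data.frame(
  EV = c("heat", "tornado", "heat", "flood", "tornado", "hail"),
  FATALITIES = c(5, 3, 1, 2, 4, 0),
  stringsAsFactors = FALSE
)

# Total fatalities per event type, largest first
fat <- vapply(split(toy$FATALITIES, toy$EV), sum, numeric(1))
fat <- sort(fat, decreasing = TRUE)

# Share of all fatalities, as in the "%" columns of Table 3
share <- prop.table(fat)

top2 <- head(data.frame(event = names(fat),
                        fatalities = as.vector(fat),
                        share = round(as.vector(share), 2),
                        stringsAsFactors = FALSE), 2)
print(top2)
```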

harmTop5.F<- as.character(harmTop5[,1])
harm.byEVplot<- transform(harm.byEV, Event.Type=EV.Clean)[,c(2,3,7)]
levels(harm.byEVplot$Event.Type)[which(!levels(harm.byEVplot$Event.Type) %in% harmTop5.F)]<- "all other events"

fnum<- fnum+1
#the legend of this chart needs some work
ggplot(harm.byEVplot, aes(x=count, y=fat, group= Event.Type)) + 
    geom_point(aes(color= Event.Type, size= Event.Type, alpha=Event.Type)) + 
    scale_color_manual(values=c("black","red","deepskyblue","green4","purple","orange")) + 
    scale_alpha_manual(values=c(0.5,1,1,1,1,1)) + 
    scale_size_manual(values=c(5,8,8,8,8,8)) + 
    #scale_fill_manual("", name="Event Type", breaks=c(harmTop5.F, "all other events"), labels=c(harmTop5.F, "all other events"))  + 
    labs(title = "Weather Events Fatalities by Number of Occurrences\nAll States, 1993-2011", x = "Number of Events Reported", y = "Fatalities") + 
    #guides(fill=guide_legend(title="Event Type")) + 
    scale_fill_discrete(name="Event Type", breaks=c(harmTop5.F, "all other events"), labels=c(harmTop5.F, "all other events"))  + 
    theme_bw() + 
    theme(title = element_text(size=14, face="bold"), axis.title=element_text(size=12, face="bold"), 
          panel.grid.major.x = element_line(colour="gray90", size=0.3), panel.grid.major.y = element_line(colour="gray90", size=0.3), legend.position="right")

Figure 2 Weather Events Fatalities by Number of Occurrences

Among the top 5 fatality events, “Flash Flood” events occurred most often. It was also observed that “thunderstorm wind”, which ranked no. 8 in fatalities and no. 4 in injuries, had the most occurrences of all events.

With Respect to Economic Consequences

The Question : Across the United States, which types of events have the greatest economic consequences?

dmg.byEV <- storm1993on %>%
    select(EV.Clean, adjPROPDMG.M, adjCROPDMG.M) %>%
    group_by(EV.Clean) %>%
    summarize(count=n(), prop=sum(adjPROPDMG.M, na.rm=T)/10^3, crop=sum(adjCROPDMG.M, na.rm=T)/10^3) %>%  # damage dollars changed to "Billions"
    mutate(prop.perc=prop.table(prop), crop.perc=prop.table(crop),
           total=prop+crop, total.perc=prop.table(total)) %>%
    arrange(desc(prop))

dmgTop5<- cbind(dmg.byEV[c(1:5),c(1,3,5)], arrange(dmg.byEV, desc(crop))[c(1:5),c(1,4,6)], 
                arrange(dmg.byEV, desc(total))[c(1:5),c(1,7,8)])
dmgCountTop5<- arrange(dmg.byEV, desc(count))[c(1:5),c(1:4,7)]

tnum<- tnum+1
kable(transform(dmgTop5, prop.perc=paste0(format(prop.perc*100,digit=0),"%"), crop.perc=paste0(format(crop.perc*100,digit=0),"%"), total.perc=paste0(format(total.perc*100,digit=0),"%")), 
           col.names=c("Top 5 Property Damage","$B","%", "Top 5 Crop Damage","$B","%", "Top 5 Property+Crop Damage","$B","%"), row.names=F)

Top 5 Property Damage	$B	%	Top 5 Crop Damage	$B	%	Top 5 Property+Crop Damage	$B	%
flood	171.709	37%	drought	17.778	28%	flood	186.489	35%
hurricane/typhoon	102.190	22%	flood	14.780	23%	hurricane/typhoon	109.124	21%
storm surge/tide	54.801	12%	ice storm	7.619	12%	storm surge/tide	54.802	10%
tornado	32.420	7%	hurricane/typhoon	6.934	11%	tornado	32.902	6%
flash flood	19.937	4%	hail	3.581	6%	hail	22.420	4%

Table 4 Top 5 events that caused most property and crop damages (in $Billions)

dmgTop10EV<- as.character(arrange(dmg.byEV, desc(total))[c(1:10),]$EV.Clean)
dmgRegroup<- dmg.byEV %>%
    arrange(desc(total)) %>%
    select(c(1,3,4)) %>%
    gather(dmg.type, dmg.M, -EV.Clean)

levels(dmgRegroup$EV.Clean)[which(!levels(dmgRegroup$EV.Clean) %in% dmgTop10EV)]<- "all other events"

dmg.EV.order<- droplevels(rbind(arrange(dmg.byEV, desc(total))[c(1:10),c(1,7)], data.frame(EV.Clean="all other events", total=sum(dmgRegroup$dmg.M[dmgRegroup$EV.Clean=="all other events"]))))
dmg.EV.order<- as.character(arrange(dmg.EV.order, desc(total))$EV.Clean)

fnum=fnum+1
ggplot(dmgRegroup, aes(x=EV.Clean, y=dmg.M, fill=dmg.type)) + 
    geom_bar(stat = "identity", width=.7) + coord_flip() + 
    labs(title = "Weather Events Property and Crop Damage\nAll States, 1993-2011", x = "Event Type (damage:  less  -->  more)", y = "Damage ($Billions)") + 
    scale_fill_discrete(name="Damage Type", labels=c("Properties","Crops")) +
    scale_x_discrete(limits=rev(dmg.EV.order)) + 
    scale_y_continuous(limits=c(0,200), breaks=seq(0,200,25)) + 
    theme_bw() + 
    theme(title = element_text(size=14, face="bold"), axis.title=element_text(size=12, face="bold"), 
          panel.grid.major.x = element_line(colour="gray50", size=0.3, linetype="dotted"), panel.grid.major.y = element_blank(),
          legend.position=c(0.9,0.4))

Figure 3 Weather Events Property and Crop Damages

“Flood” has caused the most property damage, followed by “Hurricane/Typhoon”. “Drought” has caused the most crop damage, followed by “Flood”.

It was surprising that Flood caused significantly more property damage (about 70% more) than Hurricane/Typhoon, considering recent events such as Hurricane Sandy and Hurricane Katrina. Further investigation would be necessary to see whether any recording errors have occurred, for example an incorrect event type or an incorrect damage magnitude.
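
One possible first step for such an investigation is to list the largest single records within an event type and check whether any one of them dominates the total. The sketch below does this on entirely invented records; the data frame toy, the multiplier vector mult, and the 50% threshold are assumptions for illustration and are not results from the storm database.

```r
# Invented records; none of these values come from the storm database
toy <- data.frame(
  REFNUM     = 1:5,
  EV.Clean   = c("flood", "flood", "hurricane/typhoon", "flood", "tornado"),
  PROPDMG    = c(90, 2, 5, 750, 30),
  PROPDMGEXP = c("B", "M", "B", "K", "M"),
  stringsAsFactors = FALSE
)

# Convert number and magnitude letter into dollars
mult <- c(k = 1e3, m = 1e6, b = 1e9)
toy$dollars <- toy$PROPDMG * unname(mult[tolower(toy$PROPDMGEXP)])

# Rank flood records by damage and compute each record's share of the flood total
flood <- toy[toy$EV.Clean == "flood", ]
flood <- flood[order(-flood$dollars), ]
flood$share <- flood$dollars / sum(flood$dollars)

# Hypothetical rule: flag any single record above half of the event-type total
suspects <- flood$REFNUM[flood$share > 0.5]
print(flood)
print(suspects)
```

A flagged record is not necessarily wrong; it only marks where the REMARKS field and the magnitude letter would be worth reading by hand.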