Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events result in fatalities, injuries, and crop and property damage. The purpose of this analysis is to identify the adverse weather events that have (1) caused the greatest number of fatalities and injuries in the US population (population health) and (2) inflicted the greatest damage on the US economy (property and crop damage). We use the U.S. National Oceanic and Atmospheric Administration’s (NOAA) Storm Database as the input for this analysis. Our analysis shows that between the years 1950 and 2011 in the US, tornadoes caused the most fatalities and injuries, floods caused the greatest property and overall economic damage, and droughts caused the greatest crop damage.
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration's (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
To ensure that all strings returned by R are in English, we set the locale to US English.
## [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
To ensure that all of our code and results are shown in an analysis document we set appropriate global options for knitr.
### Set Global Options for knitr
opts_chunk$set(echo = TRUE, results='markup' )
We load all necessary packages for our analysis.
require(utils)
require(R.utils)
## Loading required package: R.utils
## Loading required package: R.oo
## Loading required package: R.methodsS3
## R.methodsS3 v1.6.1 (2014-01-04) successfully loaded. See ?R.methodsS3 for help.
## R.oo v1.18.0 (2014-02-22) successfully loaded. See ?R.oo for help.
##
## Attaching package: 'R.oo'
##
## The following objects are masked from 'package:methods':
##
## getClasses, getMethods
##
## The following objects are masked from 'package:base':
##
## attach, detach, gc, load, save
##
## R.utils v1.32.4 (2014-05-14) successfully loaded. See ?R.utils for help.
##
## Attaching package: 'R.utils'
##
## The following object is masked from 'package:utils':
##
## timestamp
##
## The following objects are masked from 'package:base':
##
## cat, commandArgs, getOption, inherits, isOpen, parse, warnings
require(data.table)
## Loading required package: data.table
require(ggplot2)
## Loading required package: ggplot2
require(grid)
## Loading required package: grid
require(xtable)
## Loading required package: xtable
Download and unzip the U.S. National Oceanic and Atmospheric Administration’s (NOAA) Storm Database.
# Download and unzip input file if it does not exist in current directory
filename <- "repdata_data_StormData.csv"
filename.zip <- paste0(filename, ".bz2")
url<-"https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
if (file.exists(filename) == FALSE) {
if (file.exists(filename.zip) == FALSE) {
download.file(url, destfile = filename.zip, method = "curl",
quiet = TRUE)
}
bunzip2(filename.zip)
}
First, we define the appropriate column classes and then load the input file into a data frame.
cclasses <- c("numeric", "character", "character", "character", "numeric", "character",
"character", "character", "numeric", "character", "character",
"character", "character", "numeric", "character", "numeric", "character",
"character", "numeric", "numeric", "character", "numeric", "numeric",
"numeric", "numeric", "character", "numeric", "character", "character",
"character", "character","numeric", "numeric","numeric", "numeric",
"character", "numeric")
stormData <- read.table( filename, header=TRUE, sep=",", colClasses=cclasses,
stringsAsFactors=FALSE, comment.char="")
The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years are to be considered more complete. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
The following output lists the attributes of the data set.
names(stormData)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
dim(stormData)
## [1] 902297 37
As we can see the data set consists of 902297 observations with 37 attributes.
For our analysis two questions are of interest:
1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
2. Across the United States, which types of events have the greatest economic consequences?
Since the ultimate goal of this analysis is to address the impact of general types of events on population health and the economy, we subset the raw data to the variables needed for the computation:
EVTYPE - The hazardous event type. The original data documentation lists 48 main event types and several sub-types.
To answer the first question we will be using the two attributes:
FATALITIES - Number of deaths associated with the event,
INJURIES - Number of injuries associated with the event.
To answer the second question we will be using the four attributes:
PROPDMG - Property damage in US dollars,
PROPDMGEXP - The unit of expression for property damage,
CROPDMG - Crop damage in US dollars,
CROPDMGEXP - The unit of expression for crop damage.
data <- stormData[, c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]
To make the data tidy, we convert all variable names to lower case and then check every column for missing values.
colnames(data) <- tolower(names(data))
sapply(data, function(missing) any(is.na(missing)))
## evtype fatalities injuries propdmg propdmgexp cropdmg
## FALSE FALSE FALSE FALSE FALSE FALSE
## cropdmgexp
## FALSE
As we can see, there are no missing values in our data. However, as discussed below, we still have to clean and transform the event type attribute as well as the health and economic attributes.
data$evtype <- as.factor(data$evtype)
eventTypes <- levels(data$evtype)
length(eventTypes)
## [1] 985
There are 985 levels in the evtype attribute. We do not print them, because the listing would be 985 lines long and hard to read. Inspecting the event types shows, however, that they can be grouped into far fewer categories: many events are recorded under several spellings (e.g. “Urban flood”, “Urban Flood”, “Urban Flooding”, “URBAN FLOODING”, “URBAN FLOOD”, …), some levels are summaries rather than events (e.g., “Summary August 11”, “Summary of April 27”, …), and some are meaningless characters (e.g., “?”). Therefore some cleaning is necessary.
## Remove observations with the `evtype` values "Summary" and "?".
data <- data[grep("SUMMARY", data$evtype, invert=TRUE),]
data <- data[grep("Summary", data$evtype, invert=TRUE),]
data <- data[grep("\\?", data$evtype, invert=TRUE),]
We then use grep() to collapse the hundreds of remaining evtype values into a small set of categories. The replacements are applied one after another, so the order of the rules matters.
data$evtype <- tolower(data$evtype)
data$evtype[grep("fog|vog", data$evtype)] <- "Dense Fog"
data$evtype[grep("dense smoke|smoke", data$evtype)] <- "Dense Smoke"
data$evtype[grep("heavy snow|snow", data$evtype)] <- "Heavy Snow"
data$evtype[grep("high surf|blow-out tide|surf|swells|high tides|high seas|high waves|heavy seas|rough seas|rogue wave", data$evtype)] <- "High Surf"
data$evtype[grep("drought|dryness|driest month|dry conditions|record dryness|excessively dry|dry weather|dry spell|mild and dry pattern|mild/dry pattern|dry pattern|record dry month|hot pattern|dry hot weather|below normal precipitation|dry|unseasonably dry", data$evtype)] <- "Drought"
data$evtype[grep("astronomical", data$evtype)] <- "Astronomical Low/High Tide"
data$evtype[grep("avalanche|avalance", data$evtype)] <- "Avalanche"
data$evtype[grep("blizzard", data$evtype)] <- "Blizzard"
data$evtype[grep("urban|fld|small stream and|small stream", data$evtype)] <- "Urban Flooding"
data$evtype[grep("coastal.|cstl", data$evtype)] <- "Coastal Flood"
data$evtype[grep("debris flow|lands[ .]?lides|land[ .]?slide|mud[ .]?slides|mud[ .]?slide|landslump|rock slide", data$evtype)] <- "Debris Flow"
data$evtype[grep("dust devil|dust storm|dust", data$evtype)] <- "Dust Storm"
data$evtype[grep("excessive heat|unusually warm|excessive precipitation|very warm|prolong warmth|hot weather|record warm|unusual/record warmth|unusual warmth|abnormal warmth|unseasonably hot|hot spell|record warm temps\\.|hot and dry|hot pattern|hot/dry pattern|warm dry conditions|record high temperature[s]?|record warmth|extreme heat|heat|unseasonably warm|warm weather", data$evtype)] <- "Excessive Heat"
data$evtype[grep("[ .]?flash flood[ing]?|flash floooding|unseasonably wet|flood/flash|flood|high water|rapidly rising water", data$evtype)] <- "Flash Flood"
data$evtype[grep("frost/freeze|freeze|frost", data$evtype)] <- "Frost/Freeze"
data$evtype[grep("funnel cloud|funnel", data$evtype)] <- "Funnel Cloud"
data$evtype[grep("hail", data$evtype)] <- "Hail"
data$evtype[grep("heavy rain|excessive wetness|abnormally wet|monthly precipitation|record precipitation|extremely wet|remnants of floyd|early rain|rain \\(heavy\\)|prolonged rain|metro storm|wet month|rain damage|wet year|torrential rain|wet weather|excessive rain|rain$|normal precipitation|wall cloud|hvy rain|heavy precip[ai]tation|record rainfall|rainstorm|unseasonal rain|heavy shower|torrential rainfall|excessive rainfall", data$evtype)] <- "Heavy Rain"
data$evtype[grep("high wind|high$", data$evtype)] <- "High Wind"
data$evtype[grep("hurricane|typhoon", data$evtype)] <- "Hurricane"
data$evtype[grep("ice storm", data$evtype)] <- "Ice Storm"
data$evtype[grep("lake-effect snow", data$evtype)] <- "Lake-Effect Snow"
data$evtype[grep("lakeshore flood", data$evtype)] <- "Lakeshore Flood"
data$evtype[grep("lightning|lighting|ligntning", data$evtype)] <- "Lightning"
data$evtype[grep("rip current[s]?", data$evtype)] <- "Rip Current"
data$evtype[grep("seiche", data$evtype)] <- "Seiche"
data$evtype[grep("sleet|mix[ed]?$|freezing drizzle|freezing rain|mixed percipitation|mixed precip|freezing spray", data$evtype)] <- "Sleet"
data$evtype[grep("storm surge", data$evtype)] <- "Storm Surge/Tide"
data$evtype[grep("strong wind|wnd$|gusty wind|wind|winds", data$evtype)] <- "Strong Wind"
data$evtype[grep("thunderstorm wind[s]?|thundeer.|^([ .])?tstm|thunderstorm[s]?|thunderstrom|thundertorm|thuderstorm|thunderestorm", data$evtype)] <- "Thunderstorm Wind"
data$evtype[grep("tornado|gustnado|torndao", data$evtype)] <- "Tornado"
data$evtype[grep("microburst|micoburst", data$evtype)] <- "Microburst"
data$evtype[grep("ice|glaze|icy", data$evtype)] <- "Ice Storm"
data$evtype[grep("erosion|erosin", data$evtype)] <- "Beach/Coastal Erosion"
data$evtype[grep("extreme cold|wind[ .]?chill|record low|cold|cool|hyperthermia/exposure|unseasonal low temp|low temperature", data$evtype)] <- "Cold/Wind Chill"
data$evtype[grep("downburst", data$evtype)] <- "Downburst"
data$evtype[grep("dam break|dam failure", data$evtype)] <- "Dam Break"
data$evtype[grep("landspout", data$evtype)] <- "Landspout"
data$evtype[grep("marine accident|marine mishap", data$evtype)] <- "Marine Accident"
data$evtype[grep("other|apache|\\?|other/unknown|summary|southeast|monthly temperature|no severe weather|red flag criteria|northern lights|severe turbulence|record temperatures|excessive$|mild pattern|temperature record|record temperature", data$evtype)] <- "Other/Unknown"
data$evtype[grep("drowning", data$evtype)] <- "Drowning"
data$evtype[grep("none", data$evtype)] <- "Other/Unknown"
data$evtype[grep("tropical depression", data$evtype)] <- "Tropical Depression"
data$evtype[grep("tropical storm", data$evtype)] <- "Tropical Storm"
data$evtype[grep("tsunami", data$evtype)] <- "Tsunami"
data$evtype[grep("tstm", data$evtype)] <- "Thunderstorm Wind"
data$evtype[grep("volcanic ash|volcanic eruption", data$evtype)] <- "Volcanic Ash"
data$evtype[grep("waterspout|water spout|wayterspout", data$evtype)] <- "Waterspout"
data$evtype[grep("wild[ .]?fire[s]|forest fire[s]|fire[s]?$|red flag fire wx", data$evtype)] <- "Wildfire"
data$evtype[grep("winter storm", data$evtype)] <- "Winter Storm"
data$evtype[grep("winter weather", data$evtype)] <- "Winter Weather"
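The following toy example is a minimal sketch of why the rule order matters in a cascade like the one above, where the broad pattern "heavy snow|snow" runs before "lake-effect snow". The helper `recode_in_order` and the three sample values are hypothetical and only illustrate the mechanism; they are not part of the original analysis.

```r
# Toy illustration (hypothetical helper and values, not the storm data):
# sequential grep() replacements are order-dependent.
recode_in_order <- function(x, rules) {
  x <- tolower(x)
  for (i in seq_along(rules)) {
    # Each rule overwrites every value that its pattern still matches.
    x[grep(rules[[i]], x)] <- names(rules)[i]
  }
  x
}

events <- c("lake-effect snow", "heavy snow", "tstm wind")

# Broad pattern first: the later, more specific rule can no longer match,
# because the value has already been replaced by a capitalised label.
broad_first <- c("Heavy Snow" = "snow",
                 "Lake-Effect Snow" = "lake-effect snow")
recode_in_order(events, broad_first)

# Specific pattern first: both categories survive.
specific_first <- c("Lake-Effect Snow" = "lake-effect snow",
                    "Heavy Snow" = "snow")
recode_in_order(events, specific_first)
```

Values that match no rule, such as "tstm wind" here, are left unchanged, which is one way a raw label can survive the cleaning step.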
In this way the hundreds of different values of the evtype attribute (985 levels) have been reduced to 47 values.
unique(data$evtype)
## [1] "Tornado" "Strong Wind"
## [3] "Hail" "Heavy Rain"
## [5] "Heavy Snow" "Flash Flood"
## [7] "Winter Storm" "High Wind"
## [9] "Cold/Wind Chill" "Hurricane"
## [11] "Lightning" "Dense Fog"
## [13] "Rip Current" "Thunderstorm Wind"
## [15] "Funnel Cloud" "Excessive Heat"
## [17] "Waterspout" "Blizzard"
## [19] "Frost/Freeze" "Coastal Flood"
## [21] "High Surf" "Ice Storm"
## [23] "Avalanche" "Marine Accident"
## [25] "Other/Unknown" "Dust Storm"
## [27] "Sleet" "Urban Flooding"
## [29] "Wildfire" "Debris Flow"
## [31] "Drought" "Microburst"
## [33] "Downburst" "Winter Weather"
## [35] "Storm Surge/Tide" "Tropical Storm"
## [37] "Dam Break" "Beach/Coastal Erosion"
## [39] "monthly rainfall" "Volcanic Ash"
## [41] "Seiche" "Tropical Depression"
## [43] "Landspout" "Dense Smoke"
## [45] "Astronomical Low/High Tide" "Drowning"
## [47] "Tsunami"
So, after data cleaning and evtype attribute filtering, our data set consists of 902220 observations with 7 attributes and 47 different values for the evtype attribute:
names(data)
## [1] "evtype" "fatalities" "injuries" "propdmg" "propdmgexp"
## [6] "cropdmg" "cropdmgexp"
dim(data)
## [1] 902220 7
length(unique(data$evtype))
## [1] 47
First we inspect economic related columns for missing data.
sum(is.na(data$propdmg))
## [1] 0
sum(is.na(data$cropdmg))
## [1] 0
As we can see we do not have any missing values within propdmg and cropdmg columns.
The attributes propdmg and cropdmg represent the amount of damage in USD. These numbers have to be scaled according to the units of expression given in propdmgexp and cropdmgexp. Therefore, to calculate the cost we have to compute propdmg * 10^propdmgexp and cropdmg * 10^cropdmgexp.
unique(data$propdmgexp)
## [1] "K" "M" "" "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-"
## [18] "1" "8"
unique(data$cropdmgexp)
## [1] "" "M" "K" "m" "B" "?" "0" "k" "2"
As we can see, propdmgexp and cropdmgexp contain both digits and letters (e.g., “K” or “k” = 1000, 5 = 10^5, etc.). There are also other values (i.e., “”, “?”, “+” and “-”). To convert these characters to numbers, we write a function that takes a character from propdmgexp or cropdmgexp and returns 10 raised to the corresponding exponent.
# Function `convert.exp` takes a character c (exponent code) and returns
# 10 raised to the corresponding power. Valid values of c are h or H (hundred),
# k or K (thousand), m or M (million), b or B (billion), and digits 0-9.
# For any other value of c the exponent defaults to 0, so the function returns 1.
convert.exp <- function(c) {
exp <- switch( EXPR = tolower(c),
"0" = 0, "1" = 1, "2" = 2, "3" = 3,
"4" = 4, "5" = 5, "6" = 6, "7" = 7, "8" = 8, "9" = 9,
"h" = 2, "k" = 3, "m" = 6, "b" = 9,
0 )
return(10 ^ exp)
}
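As a worked example of the scaling, the sketch below uses a named lookup vector instead of a per-element `switch`. The names `exp_lookup` and `scale_damage` and the sample inputs are assumptions made for illustration; the mapping itself mirrors the one in `convert.exp`, including an exponent of 0 for unrecognised codes.

```r
# Sketch: vectorised exponent lookup (hypothetical names, toy inputs).
exp_lookup <- c(setNames(0:9, as.character(0:9)),
                h = 2, k = 3, m = 6, b = 9)

scale_damage <- function(value, exp_code) {
  # Look up the exponent for each code; codes that are not in the table
  # ("", "?", "+", "-") come back as NA.
  exponent <- unname(exp_lookup[tolower(exp_code)])
  # Unknown codes get exponent 0, i.e. a multiplier of 10^0 = 1.
  exponent[is.na(exponent)] <- 0
  value * 10 ^ exponent
}

# 25 "K" is 25 thousand, 2.5 "M" is 2.5 million, and values with an
# empty or unknown code are left unscaled.
scale_damage(c(25, 2.5, 10, 3), c("K", "M", "", "?"))
```

A lookup vector handles a whole column in one call, whereas the `switch` version is applied element by element through `vapply`.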
Now we can calculate the cost for property damage and crop damage.
propdmgCost = (data$propdmg * vapply(data$propdmgexp, convert.exp, 1.0))
cropdmgCost = (data$cropdmg * vapply(data$cropdmgexp, convert.exp, 1.0))
summary(propdmgCost)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00e+00 0.00e+00 0.00e+00 4.75e+05 5.00e+02 1.15e+11
summary(cropdmgCost)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00e+00 0.00e+00 0.00e+00 5.44e+04 0.00e+00 5.00e+09
We create a new Econ.costData data frame with the columns eventType, propdmgCost and cropdmgCost, which we will use to analyze the impact of adverse events on the economy.
Econ.costData <- data.frame("eventType"=data$evtype, "propdmgCost" =propdmgCost, "cropdmgCost" = cropdmgCost)
summary(Econ.costData)
## eventType propdmgCost cropdmgCost
## Strong Wind:341768 Min. :0.00e+00 Min. :0.00e+00
## Hail :290399 1st Qu.:0.00e+00 1st Qu.:0.00e+00
## Flash Flood: 81463 Median :0.00e+00 Median :0.00e+00
## Tornado : 60707 Mean :4.75e+05 Mean :5.44e+04
## High Wind : 21927 3rd Qu.:5.00e+02 3rd Qu.:0.00e+00
## Heavy Snow : 17705 Max. :1.15e+11 Max. :5.00e+09
## (Other) : 88251
Since events without any recorded damage are not relevant, we keep only the observations in which property damage or crop damage is greater than zero.
Econ.costData <- subset(Econ.costData, (propdmgCost > 0 | cropdmgCost > 0))
summary(Econ.costData)
## eventType propdmgCost cropdmgCost
## Strong Wind:120750 Min. :0.00e+00 Min. :0e+00
## Tornado : 39392 1st Qu.:2.50e+03 1st Qu.:0e+00
## Flash Flood: 31568 Median :1.00e+04 Median :0e+00
## Hail : 26498 Mean :1.75e+06 Mean :2e+05
## Lightning : 10373 3rd Qu.:4.00e+04 3rd Qu.:0e+00
## High Wind : 5996 Max. :1.15e+11 Max. :5e+09
## (Other) : 10453
dim(Econ.costData)
## [1] 245030 3
So our final Econ.costData data frame consists of 245030 observations with 3 attributes.
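The minimum of both cost columns is still 0 after subsetting because the condition uses a logical OR: a row is kept when either cost is positive. The toy data frame below (hypothetical values) is a small sketch of that behaviour.

```r
# Toy data (hypothetical values) showing the effect of the OR condition.
toy <- data.frame(propdmgCost = c(0, 5000, 0, 1200),
                  cropdmgCost = c(0, 0, 300, 700))

# Keep rows with damage in at least one of the two columns.
kept <- subset(toy, propdmgCost > 0 | cropdmgCost > 0)

nrow(kept)              # only the all-zero first row is dropped
min(kept$propdmgCost)   # still 0: one kept row has crop damage only
min(kept$cropdmgCost)   # still 0: one kept row has property damage only
```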
First we inspect health related columns for missing data.
sum(is.na(data$fatalities))
## [1] 0
sum(is.na(data$injuries))
## [1] 0
As we can see, there are no missing values in the fatalities and injuries columns. We create a new Health.Data data frame with the columns eventType, fatalities and injuries, which we will use to analyze the impact of adverse events on health. Since events without any casualties are not relevant, we keep only the observations in which the number of fatalities or injuries is greater than zero.
Health.Data <- data.frame("eventType"=data$evtype, "fatalities" = data$fatalities, "injuries" = data$injuries)
Health.Data <- subset(Health.Data, fatalities > 0 | injuries >0)
summary(Health.Data)
## eventType fatalities injuries
## Tornado :7934 Min. : 0.0 Min. : 0.0
## Strong Wind :4487 1st Qu.: 0.0 1st Qu.: 1.0
## Lightning :3308 Median : 0.0 Median : 1.0
## Flash Flood :1406 Mean : 0.7 Mean : 6.4
## Excessive Heat: 943 3rd Qu.: 1.0 3rd Qu.: 3.0
## Rip Current : 637 Max. :583.0 Max. :1700.0
## (Other) :3214
dim(Health.Data)
## [1] 21929 3
So our final Health.Data data frame consists of 21929 observations with 3 attributes.
Now we are prepared for analysis.
In this section the question of interest is:
1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
We first create a summary data table with columns deaths and injuries.
healthSummary <- as.data.table(Health.Data)
setkey(healthSummary, eventType)
healthSummary <- healthSummary[ , list(deaths = sum(fatalities),
injuries= sum(injuries),
number = .N),
keyby = eventType ]
The table below shows the 5 most common causes of death.
# Get top 5 causes of death.
deaths <- healthSummary[head(order(-deaths), nrow(healthSummary)),
list(Event = eventType, Deaths = deaths)]
deaths[6,] <- list(Event = "Other", Deaths = sum(deaths[6:nrow(deaths),Deaths]))
deaths <- head(deaths,6)
# Print as a HTML table.
print(xtable(deaths, caption = "Table 1: Top 5 causes of death"), type="html", caption.placement="top")
| | Event | Deaths |
|---|---|---|
| 1 | Tornado | 5636.00 |
| 2 | Excessive Heat | 3143.00 |
| 3 | Flash Flood | 1522.00 |
| 4 | Strong Wind | 1119.00 |
| 5 | Lightning | 817.00 |
| 6 | Other | 2908.00 |
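The same "top five plus Other" roll-up is repeated for every table in this report. A minimal base R sketch of the pattern is shown below; `top_with_other` is a hypothetical helper and the totals are toy values, so it illustrates the idea rather than reproducing the data.table code used here.

```r
# Sketch of the "top n plus Other" roll-up in base R.
# top_with_other is a hypothetical helper; toy_totals are made-up values.
top_with_other <- function(totals, n = 5) {
  # Sort the named totals from largest to smallest (names are kept).
  totals <- sort(totals, decreasing = TRUE)
  if (length(totals) > n) {
    # Keep the n largest and sum everything else into one "Other" entry.
    totals <- c(totals[seq_len(n)], Other = sum(totals[-seq_len(n)]))
  }
  data.frame(Event = names(totals), Total = unname(totals),
             stringsAsFactors = FALSE)
}

toy_totals <- c(A = 10, B = 40, C = 5, D = 25, E = 1, F = 30, G = 2)
top_with_other(toy_totals, n = 3)
```

Because "Other" is computed from the remaining rows, the column total of the rolled-up table equals the total over all events, which is what makes the percentage shares reported later meaningful.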
The table below shows the 5 most common causes of injuries.
# Get top 5 causes of injuries.
injuries <- healthSummary[head(order(-injuries), nrow(healthSummary)),
list(Event = eventType, Injuries = injuries)]
injuries[6,] <- list(Event = "Other", Injuries = sum(injuries[6:nrow(injuries),Injuries]))
injuries <- head(injuries,6)
# Print as a HTML table.
print(xtable(injuries, caption = "Table 2: Top 5 causes of injuries"), type="html", caption.placement="top")
| | Event | Injuries |
|---|---|---|
| 1 | Tornado | 91407.00 |
| 2 | Strong Wind | 9878.00 |
| 3 | Excessive Heat | 9228.00 |
| 4 | Flash Flood | 8597.00 |
| 5 | Lightning | 5232.00 |
| 6 | Other | 16186.00 |
The public health harm by weather event types is shown below in Figure 1.
# Deaths subplot
pDeaths <- qplot(data=deaths, x= reorder(Event, -Deaths), y= Deaths,
xlab = "",
ylab="Number of fatalities",
main="Five most common causes of deaths")
pDeaths <- pDeaths + geom_bar(stat="identity") + aes(fill=Event)
pDeaths <- pDeaths + theme(axis.text.x = element_text(angle = 20, hjust = 1))
#Injuries subplot
pInjuries <- qplot(data=injuries,x= reorder(Event, -Injuries), y= Injuries,
xlab = "",
ylab="Number of injuries",
main="Five most common causes of injuries")
pInjuries <- pInjuries + geom_bar(stat="identity") + aes(fill=Event )
pInjuries <- pInjuries + theme(axis.text.x = element_text(angle = 20, hjust = 1))
#Print the two subplots
grid.newpage()
pushViewport(viewport(layout = grid.layout(2, 1)))
print(pDeaths, vp = viewport(layout.pos.row= 1, layout.pos.col = 1))
print(pInjuries, vp = viewport(layout.pos.row= 2, layout.pos.col = 1))
Figure 1 This figure shows the 5 weather event types most harmful to population health in the US during the documented time interval. Most fatalities (top panel) were caused by tornadoes and excessive heat and, to a lesser degree, by floods. The injuries (bottom panel) were caused overwhelmingly by tornadoes. Strong wind and lightning also appear among the five leading causes in both panels. Note that the Other bar represents the sum of all the remaining events and is added as the sixth bar. Also note that all numbers are cumulative for the observed period, 1950 to 2011.
The number of deaths caused by tornadoes is 5636 (37.2 % of total deaths), and the number of injuries caused by tornadoes is 91407 (65 % of total injuries).
# The number of deaths caused by tornado
deaths[1,Deaths]
## [1] 5636
# ... % of total number of deaths
round(deaths[1,Deaths]/sum(deaths[,Deaths])*100,1)
## [1] 37.2
# The number of injuries caused by tornado
injuries[1,Injuries]
## [1] 91407
# ... % of total number of injuries
round(injuries[1,Injuries]/sum(injuries[,Injuries])*100,1)
## [1] 65
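The two percentages can be cross-checked directly from the values printed in Table 1 and Table 2. The sketch below re-enters those printed totals by hand; the vector names and the `share` helper are hypothetical.

```r
# Recompute the tornado shares from the totals printed in Tables 1 and 2.
deaths_by_event <- c(Tornado = 5636, `Excessive Heat` = 3143,
                     `Flash Flood` = 1522, `Strong Wind` = 1119,
                     Lightning = 817, Other = 2908)
injuries_by_event <- c(Tornado = 91407, `Strong Wind` = 9878,
                       `Excessive Heat` = 9228, `Flash Flood` = 8597,
                       Lightning = 5232, Other = 16186)

# Percentage share of one event in the column total, rounded to one decimal.
share <- function(x, event) round(x[[event]] / sum(x) * 100, 1)

sum(deaths_by_event)                 # total deaths across all events
share(deaths_by_event, "Tornado")    # tornado share of deaths
sum(injuries_by_event)               # total injuries across all events
share(injuries_by_event, "Tornado")  # tornado share of injuries
```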
As can be seen in Table 1, Table 2 and Figure 1, tornadoes are the most harmful weather event in terms of both injuries and fatalities, and thus in total harm as well, between the years 1950 and 2011 in the US.
In this section the question of interest is:
2. Across the United States, which types of events have the greatest economic consequences?
We first create a summary data table with columns propdmgCost and cropdmgCost.
economicsSummary <- as.data.table(Econ.costData)
setkey(economicsSummary, eventType)
economicsSummary <- economicsSummary[ , list(propdmgCost = sum(propdmgCost),
cropdmgCost= sum(cropdmgCost),
number = .N),
keyby = eventType ]
The table below shows the 5 financially most severe weather events in relation to property damage.
# Get the 5 financially most severe weather events in relation to property damage.
propDamage <- economicsSummary[head(order(-propdmgCost), nrow(economicsSummary)),
list(Event = eventType, Damage = round(propdmgCost/(10^9),2))]
propDamage[6,] <- list(Event = "Other", Damage = sum(propDamage[6:nrow(propDamage),Damage]))
propDamage <- head(propDamage,6)
# Print as a HTML table.
print(xtable(propDamage, caption = "Table 3: The 5 financially most severe weather events in relation to property damage (cost in Billions USD)"), type="html", caption.placement="top")
| | Event | Damage |
|---|---|---|
| 1 | Flash Flood | 167.73 |
| 2 | Hurricane | 85.26 |
| 3 | Tornado | 57.00 |
| 4 | Storm Surge/Tide | 47.96 |
| 5 | Hail | 17.62 |
| 6 | Other | 52.64 |
The table below shows the 5 financially most severe weather events in relation to crop damage.
# Get the 5 financially most severe weather events in relation to crop damage.
cropDamage <- economicsSummary[head(order(-cropdmgCost), nrow(economicsSummary)),
list(Event = eventType, Damage = round(cropdmgCost/(10^9),2))]
cropDamage[6,] <- list(Event = "Other", Damage = round((sum(cropDamage[6:nrow(cropDamage),Damage])),2))
cropDamage <- head(cropDamage,6)
# Print as a HTML table.
print(xtable(cropDamage, caption = "Table 4: The 5 financially most severe weather events in relation to crop damage (cost in Billions USD)"), type="html", caption.placement="top")
| | Event | Damage |
|---|---|---|
| 1 | Drought | 13.97 |
| 2 | Flash Flood | 12.38 |
| 3 | Hurricane | 5.51 |
| 4 | Ice Storm | 5.02 |
| 5 | Hail | 3.11 |
| 6 | Other | 9.09 |
The property and crops damage by weather event types is shown below in Figure 2.
# Property damage subplot
pProperty <- qplot(data=propDamage, x= reorder(Event, -Damage), y= Damage,
xlab = "",
ylab="Cost [in Billion USD]",
main="The 5 financially most severe weather events in relation to property damage")
pProperty <- pProperty + geom_bar(stat="identity") + aes(fill=Event)
pProperty <- pProperty + theme(axis.text.x = element_text(angle = 20, hjust = 1))
#Crops damage subplot
pCrop <- qplot(data=cropDamage,x= reorder(Event, -Damage), y= Damage,
xlab = "",
ylab="Cost [in Billion USD]",
main="The 5 financially most severe weather events in relation to crop damage")
pCrop <- pCrop + geom_bar(stat="identity") + aes(fill=Event)
pCrop <- pCrop + theme(axis.text.x = element_text(angle = 20, hjust = 1))
#Print the two subplots
grid.newpage()
pushViewport(viewport(layout = grid.layout(2, 1)))
print(pProperty, vp = viewport(layout.pos.row= 1, layout.pos.col = 1))
print(pCrop, vp = viewport(layout.pos.row= 2, layout.pos.col = 1))
Figure 2 This figure shows the total property damage (top panel) and the total crop damage (bottom panel) of the 5 financially most severe weather events in the US. Floods have the greatest economic consequences regarding property (top panel), while droughts have the greatest economic consequences regarding crops (bottom panel) between the years 1950 and 2011. Note that the Other bar represents the sum of all the remaining events and is added as the sixth bar. Also note that all numbers are cumulative for the observed period.
The cost of property damage caused by floods is 167.73 billion USD (39.2 % of total property damage), and the cost of crop damage caused by droughts is 13.97 billion USD (28.5 % of total crop damage).
# The cost of property damage caused by floods (in Billions USD)
propDamage[1,Damage]
## [1] 167.7
# ... % of total property damage
round(propDamage[1,Damage]/sum(propDamage[,Damage])*100,1)
## [1] 39.2
# The cost of crops damage caused by droughts (in Billions USD)
cropDamage[1,Damage]
## [1] 13.97
# ... % of total crops damage
round(cropDamage[1,Damage]/sum(cropDamage[,Damage])*100,1)
## [1] 28.5
As can be seen in Table 3, Table 4 and Figure 2, floods have the greatest economic consequences regarding property, while droughts have the greatest economic consequences regarding crops between the years 1950 and 2011 in the US.
The table below shows the 5 events that have the greatest overall economic consequences.
# Get the 5 financially most severe weather events in relation to overall economic consequences.
econDamage <- economicsSummary[head(order(-cropdmgCost), nrow(economicsSummary)),
list(Event = eventType, Damage = round((propdmgCost+cropdmgCost)/(10^9),2))]
econDamage <- econDamage[order(-Damage),]
econDamage[6,] <- list(Event = "Other", Damage = round((sum(econDamage[6:nrow(econDamage),Damage])),2))
econDamage <- head(econDamage,6)
# Check if total economic damage is sum of property damage and crops damage
round(sum(econDamage[,Damage]),1) == round(sum(propDamage[,Damage]) + sum(cropDamage[,Damage]), 1)
## [1] TRUE
# Print as a HTML table.
print(xtable(econDamage, caption = "Table 5: The 5 most severe weather events that have the greatest overall economic consequences (cost in Billions USD)"), type="html", caption.placement="top")
| | Event | Damage |
|---|---|---|
| 1 | Flash Flood | 180.11 |
| 2 | Hurricane | 90.76 |
| 3 | Tornado | 57.42 |
| 4 | Storm Surge/Tide | 47.97 |
| 5 | Hail | 20.74 |
| 6 | Other | 80.34 |
The overall economic damage by weather event types is shown below in Figure 3.
# Overall economic damage plot
pEconomic <- qplot(data=econDamage, x= reorder(Event, -Damage), y= Damage,
xlab = "",
ylab="Cost [in Billion USD]",
main="The 5 financially most severe weather events
that have the greatest overall economic consequences")
pEconomic <- pEconomic + geom_bar(stat="identity") + aes(fill=Event)
pEconomic <- pEconomic + theme(axis.text.x = element_text(angle = 20, hjust = 1))
print(pEconomic)
Figure 3 This figure shows the total overall economic damage of the 5 financially most severe weather events in the US. Floods have the greatest economic consequences, followed by hurricanes and tornadoes. All other events together have smaller economic consequences than floods alone, and also smaller than hurricanes alone. Note that the Other bar represents the sum of all the remaining events and is added as the sixth bar. Also note that all numbers are cumulative for the observed period.
The cost of overall economic damage caused by floods is 180.11 billion USD (37.7 % of total overall economic damage). The cost of overall economic damage caused by hurricanes is 90.76 billion USD (19 % of total overall economic damage). The cost of overall economic damage caused by tornadoes is 57.42 billion USD (12 % of total overall economic damage).
# The cost of overall economic damage caused by floods (in Billions USD)
econDamage[1,Damage]
## [1] 180.1
# ... % of total overall economic damage
round(econDamage[1,Damage]/sum(econDamage[,Damage])*100,1)
## [1] 37.7
# The cost of overall economic damage caused by hurricane (in Billions USD)
econDamage[2,Damage]
## [1] 90.76
# ... % of total overall economic damage
round(econDamage[2,Damage]/sum(econDamage[,Damage])*100,1)
## [1] 19
# The cost of overall economic damage caused by tornado (in Billions USD)
econDamage[3,Damage]
## [1] 57.42
# ... % of total overall economic damage
round(econDamage[3,Damage]/sum(econDamage[,Damage])*100,1)
## [1] 12
As can be seen in Table 5 and Figure 3, floods have the greatest overall economic consequences between the years 1950 and 2011 in the US, followed by hurricanes and tornadoes.
Our analysis of the data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) Storm Database shows that:
Tornado is the most harmful weather event in terms of both injuries and fatalities, and thus in total harm as well, between the years 1950 and 2011 in the US.
Floods have the greatest overall economic consequences between the years 1950 and 2011 in the US, followed by hurricane and tornado.
Floods have the greatest economic consequences regarding properties, while extreme temperatures (droughts) have the greatest economic consequences regarding crops between the years 1950 and 2011 in the US.