Weather Events That Caused the Greatest Public Health and Economic Problems in the United States Between 1950 and 2011

Author: Philip Abraham
Date: September 3, 2016

Synopsis

This report addresses which types of weather events in the United States (U.S.) cause the greatest harm to population health and have the greatest economic consequences.

The analysis uses the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The NOAA database lists weather events from 1950 through November 2011.

The data show that, cumulatively across the U.S., tornadoes caused the greatest harm to human health, resulting in about 97,000 fatalities and injuries combined.

Floods resulted in the highest property and/or crop damage in the U.S., with a total incurred cost of about $178 billion.

Data Processing

The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. The dataset was obtained from the course web site: StormData.
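
As a minimal sketch of why no separate decompression step is needed: base R's read.csv() opens a file path through file(), which detects bzip2 compression on its own. The temporary file and the toy values below are illustrative assumptions, not part of the NOAA data.

```r
# Toy data frame standing in for the storm data (hypothetical values)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                  FATALITIES = c(1, 0),
                  stringsAsFactors = FALSE)

# Write it as a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() decompresses transparently when given the file path
toy_back <- read.csv(tmp, stringsAsFactors = FALSE)
toy_back
```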

Loading the Data (i.e. read.csv()).

# https (Secure) URL to the repdata%2Fdata%2FStormData.csv.bz2 file.
if (!require("readr")) install.packages("readr")
## Loading required package: readr
library(readr)
url_csv <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(url_csv, destfile= "repdata%2Fdata%2FStormData.csv.bz2")
dateDownloaded <- date()
dateDownloaded
## [1] "Fri Sep 02 20:07:00 2016"
# read csv file into memory
storm <- read.csv("repdata%2Fdata%2FStormData.csv.bz2",stringsAsFactors=FALSE)
dim(storm)
## [1] 902297     37
names(storm)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"

Processing and transforming the data into a format suitable for analysis.
As shown above, the downloaded NOAA storm database contains 902,297 observations of 37 variables. Because the objective is to identify the weather events that cause the most damage to the U.S. population in terms of health and financial loss, the storm dataset was subset to only the variables needed for that analysis.

To reduce computation time and memory usage, the dataset was further subset to rows with a nonzero fatality or injury count (for the health analysis) or a nonzero property or crop damage value (for the economic analysis).

Several weather event types in the dataset were variants of the same event (for example, TSTM WIND and THUNDERSTORM WINDS); these were consolidated under a single name to keep event naming consistent.
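
One way to picture this consolidation is a lookup from variant label to consolidated label. The sketch below uses a handful of made-up input labels and a hypothetical `canonical` vector; it illustrates the idea and is not the exact mapping used in this report.

```r
# Hypothetical raw event labels, mixed case and with variants
evtype <- c("Tstm Wind", "THUNDERSTORM WINDS", "Flash Flood",
            "TORNADO", "HEAT WAVE")

# Named vector: names are variant labels, values are consolidated labels
canonical <- c("TSTM WIND"          = "THUNDERSTORM WIND",
               "THUNDERSTORM WINDS" = "THUNDERSTORM WIND",
               "FLASH FLOOD"        = "FLOOD",
               "HEAT WAVE"          = "HEAT")

# Upper-case first so that case differences do not create duplicates
evtype <- toupper(evtype)

# Replace only the labels that appear in the lookup
hit <- evtype %in% names(canonical)
evtype[hit] <- unname(canonical[evtype[hit]])
evtype
```

Labels that are not in the lookup, such as TORNADO here, pass through unchanged.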

The PROPDMG and CROPDMG columns in the raw NOAA dataset contain the damage figures, and the PROPDMGEXP and CROPDMGEXP columns give the unit of each figure: K (thousand), M (million), or B (billion). In the processed dataset, the PROPDMG and CROPDMG values were converted to dollar amounts. For example, if a row had a value of 25 in the PROPDMG column and a “K” in the PROPDMGEXP column, then the cost assigned to PROPDMG for that row is 25 x 1000 = $25,000, replacing the 25.
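
The conversion amounts to multiplying each figure by the factor its unit letter stands for. A small sketch with made-up rows (the vectors `dmg`, `unit`, and `mult` are illustrative names, not objects from the analysis):

```r
# Hypothetical rows: damage figure plus its unit designation
dmg  <- c(25, 2.5, 1.2)
unit <- c("K", "M", "B")

# Multiplier for each unit letter
mult <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up the multiplier for every row and convert to dollars
dollars <- dmg * unname(mult[unit])
dollars
```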

Note that a few of the property and crop damage amounts in the raw dataset had units other than the standard K, M, or B. In those rows, the unit was replaced with a “?” character and the damage value was used as is. For example, if a row had a value of 25 in the PROPDMG column and a “?” in the PROPDMGEXP column, then the cost assigned to PROPDMG for that row is 25 x 1 = $25, the same value as before. This may introduce some error in the final cost tally, but the number of rows marked “?” is considerably smaller than the overall number of rows in the dataset.
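
To gauge how much this fallback could matter, one could measure the share of rows with a non-standard unit before applying it. The sketch below does so on a small invented unit column; the values are assumptions chosen for illustration and say nothing about the real proportions in the NOAA data.

```r
# Hypothetical unit column including non-standard entries
unit <- c("K", "m", "", "B", "+", "K", "5", "M")
dmg  <- c(10, 2, 25, 1, 3, 4, 7, 1)

unit <- toupper(unit)
standard <- unit %in% c("K", "M", "B")

# Share of rows that would be marked "?"
share_unknown <- mean(!standard)

# Standard units get their multiplier; unknown units keep the value as is
mult <- c(K = 1e3, M = 1e6, B = 1e9)
factor_by_row <- ifelse(standard, unname(mult[unit]), 1)
dollars <- dmg * factor_by_row
```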

In the final processed dataset, the fatality and injury counts were added together to form a single value per row representing the population health damage of that weather event. Similarly, the property and crop damage costs were summed to give a single value per row representing the economic loss of that weather event. For analysis purposes, new columns were added to the data frame to hold these population health and economic damage estimates.
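
A compact sketch of this combine-then-total step on invented records (the data frame `toy` and its values are hypothetical):

```r
# Hypothetical event records
toy <- data.frame(EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
                  FATALITIES = c(2, 0, 1, 3),
                  INJURIES   = c(10, 4, 0, 7),
                  stringsAsFactors = FALSE)

# New column: one health damage count per row
toy$HealthDamage_Qty <- toy$FATALITIES + toy$INJURIES

# Total per event type, largest first
totals <- aggregate(HealthDamage_Qty ~ EVTYPE, data = toy, FUN = sum)
totals <- totals[order(totals$HealthDamage_Qty, decreasing = TRUE), ]
totals
```

The same pattern applies to the economic side, with the converted property and crop damage columns in place of fatalities and injuries.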

Population Health Damage Computations.

## Population Health Damage

# subset dataframe for event type and population health variables
storm_health <- storm[,c("EVTYPE","FATALITIES", "INJURIES")]

# subset further to not include zero for fatalities and injuries
storm_health <- subset(storm_health, storm_health$FATALITIES>0 | 
                               storm_health$INJURIES>0)

# Clean up datasets from duplicates
#converts all event type column values to upper case to get rid of duplicates
storm_health$EVTYPE <-  toupper(storm_health$EVTYPE)

# clean up duplicate weather events: each variant label (name) is mapped to
# its consolidated label (value)
evtype_map <- c(
        "EXCESSIVE HEAT" = "HEAT", "HEAT WAVE" = "HEAT",
        "EXTREME HEAT" = "HEAT", "RECORD HEAT" = "HEAT",
        "TSTM WIND" = "THUNDERSTORM WIND",
        "THUNDERSTORM WINDS" = "THUNDERSTORM WIND",
        "FLASH FLOOD" = "FLOOD", "URBAN/SML STREAM FLD" = "FLOOD",
        "ICE STORM" = "WINTER STORM", "HEAVY SNOW" = "WINTER STORM",
        "BLIZZARD" = "WINTER STORM",
        "WILD/FOREST FIRE" = "WILDFIRE", "WILD FIRES" = "WILDFIRE",
        "WIND" = "HIGH WIND", "HIGH WINDS" = "HIGH WIND",
        "STRONG WIND" = "HIGH WIND",
        "HURRICANE" = "HURRICANE/TYPHOON",
        "DENSE FOG" = "FOG",
        "RIP CURRENT" = "RIP CURRENTS",
        "EXTREME COLD/WIND CHILL" = "EXTREME COLD",
        "COLD/WIND CHILL" = "EXTREME COLD", "COLD" = "EXTREME COLD",
        "HEAVY SURF/HIGH SURF" = "HIGH SURF", "HEAVY SURF" = "HIGH SURF",
        "TROPICAL STORM GORDON" = "TROPICAL STORM",
        "WINTER WEATHER/MIX" = "WINTER WEATHER",
        "WINTRY MIX" = "WINTER WEATHER",
        "WINTER WEATHER MIX" = "WINTER WEATHER",
        "EXCESSIVE RAINFALL" = "HEAVY RAIN", "HEAVY RAINS" = "HEAVY RAIN",
        "TORRENTIAL RAINFALL" = "HEAVY RAIN")
# replace only the rows whose label is one of the variants
to_fix <- storm_health$EVTYPE %in% names(evtype_map)
storm_health$EVTYPE[to_fix] <- unname(evtype_map[storm_health$EVTYPE[to_fix]])

# split the population health variables by event type combining fatalities and injuries
# in each row
storm_health_a<-aggregate(storm_health$FATALITIES+storm_health$INJURIES~
                                  storm_health$EVTYPE, storm_health,sum)

# change column names
colnames(storm_health_a) <- c("Event_Type", "HealthDamage_Qty")

# sort dataset in decreasing order for health damage numbers
rank <- order(storm_health_a$HealthDamage_Qty, decreasing = TRUE)
storm_health_a <- storm_health_a[rank,] 

Economic Consequences Computations.

## Economic Consequences

# subset dataframe for event type and economic consequences variables
storm_econ <- storm[,c("EVTYPE","PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]

# subset further to not include zero expense rows for property and crops
storm_econ <- subset(storm_econ, storm_econ$PROPDMG>0 | 
                             storm_econ$CROPDMG>0)

# cleaning up data

#converts all event type column values to upper case to get rid of duplicates
storm_econ$EVTYPE <-  toupper(storm_econ$EVTYPE) 

unique(storm_econ$PROPDMGEXP)
##  [1] "K" "M" "B" "m" ""  "+" "0" "5" "6" "4" "h" "2" "7" "3" "H" "-"
# convert all PROPDMGEXP column values to upper case to get rid of duplicates
storm_econ$PROPDMGEXP <-  toupper(storm_econ$PROPDMGEXP)

# replace ""  "+" "0" "5" "6" "4" "H" "2" "7" "3" "-" with "?"
nonstandard <- c("", "+", "0", "5", "6", "4", "H", "2", "7", "3", "-")
storm_econ$PROPDMGEXP[storm_econ$PROPDMGEXP %in% nonstandard] <- "?"
                
unique(storm_econ$PROPDMGEXP)
## [1] "K" "M" "B" "?"
unique(storm_econ$CROPDMGEXP)
## [1] ""  "M" "K" "m" "B" "?" "0" "k"
# convert all CROPDMGEXP column values to upper case to get rid of duplicates
storm_econ$CROPDMGEXP <-  toupper(storm_econ$CROPDMGEXP)

# replace "0" & "" with "?"
storm_econ$CROPDMGEXP[storm_econ$CROPDMGEXP %in% c("0", "")] <- "?"
# number of rows in the dataset
l <- nrow(storm_econ)

unique(storm_econ$CROPDMGEXP)
## [1] "?" "M" "K" "B"
# convert values in "PROPDMG" and "CROPDMG" to appropriate amounts based on the
# "PROPDMGEXP" & "CROPDMGEXP" entries.

storm_econ_rev <- storm_econ
# multiplier for each unit designation; "?" keeps the value as is
unit_mult <- c(K = 1000, M = 1000000, B = 1000000000, "?" = 1)
storm_econ_rev$PROPDMG <- storm_econ_rev$PROPDMG *
        unname(unit_mult[storm_econ_rev$PROPDMGEXP])
storm_econ_rev$CROPDMG <- storm_econ_rev$CROPDMG *
        unname(unit_mult[storm_econ_rev$CROPDMGEXP])

storm_econ_rev <- storm_econ_rev[, c(1,2,4)]
head(storm_econ_rev,3)
##    EVTYPE PROPDMG CROPDMG
## 1 TORNADO   25000       0
## 2 TORNADO    2500       0
## 3 TORNADO   25000       0
# clean up duplicate weather events: each variant label (name) is mapped to
# its consolidated label (value)
econ_evtype_map <- c(
        "FLASH FLOOD" = "FLOOD", "RIVER FLOOD" = "FLOOD",
        "HURRICANE" = "HURRICANE/TYPHOON",
        "HURRICANE OPAL" = "HURRICANE/TYPHOON",
        "HURRICANE ERIN" = "HURRICANE/TYPHOON",
        "TYPHOON" = "HURRICANE/TYPHOON",
        "TSTM WIND" = "THUNDERSTORM WIND",
        "THUNDERSTORM WINDS" = "THUNDERSTORM WIND",
        "ICE STORM" = "WINTER STORM", "BLIZZARD" = "WINTER STORM",
        "WILD/FOREST FIRE" = "WILDFIRE", "WILD FIRES" = "WILDFIRE",
        "STORM SURGE/TIDE" = "STORM SURGE",
        "HIGH WINDS" = "HIGH WIND",
        "SEVERE THUNDERSTORM" = "HEAVY RAIN/SEVERE WEATHER",
        "TORNADOES, TSTM WIND, HAIL" = "HEAVY RAIN/SEVERE WEATHER",
        "FREEZE" = "FROST/FREEZE",
        "HEAT" = "EXCESSIVE HEAT")
# replace only the rows whose label is one of the variants
to_fix <- storm_econ_rev$EVTYPE %in% names(econ_evtype_map)
storm_econ_rev$EVTYPE[to_fix] <-
        unname(econ_evtype_map[storm_econ_rev$EVTYPE[to_fix]])

# split the economic variables by event type
storm_econ_rev_a<-aggregate(storm_econ_rev$PROPDMG+storm_econ_rev$CROPDMG~storm_econ_rev$EVTYPE,storm_econ_rev,sum)

# change column names
colnames(storm_econ_rev_a) <- c("Event_Type", "DamageCost")

# sort dataset in decreasing order for damage costs
rank2 <- order(storm_econ_rev_a$DamageCost, decreasing = TRUE)
storm_econ_rev_a <- storm_econ_rev_a[rank2,] 

Results

Which types of weather events are the most harmful to the population health?

# total of all population health damage numbers
options(scipen=999) # removes scientific notation in values
health_tot <- sum(storm_health_a$HealthDamage_Qty)

Weather events in the U.S. caused a combined 155,673 fatalities and injuries from 1950 to 2011.

# % break-down of population health damage numbers
if (!require("scales")) install.packages("scales")
## Loading required package: scales
## 
## Attaching package: 'scales'
## The following objects are masked from 'package:readr':
## 
##     col_factor, col_numeric
library(scales)
storm_health_a$percent_of_total <- percent((storm_health_a$HealthDamage_Qty)/health_tot)

Shown below are the summary statistics for the population health damage counts.

summary(storm_health_a)
##   Event_Type        HealthDamage_Qty  percent_of_total  
##  Length:174         Min.   :    1.0   Length:174        
##  Class :character   1st Qu.:    1.0   Class :character  
##  Mode  :character   Median :    4.0   Mode  :character  
##                     Mean   :  894.7                     
##                     3rd Qu.:   26.5                     
##                     Max.   :96979.0
quantile(storm_health_a$HealthDamage_Qty,c(.1,.25,.5,.75,.9,.95,.98,.99,1))
##      10%      25%      50%      75%      90%      95%      98%      99% 
##     1.00     1.00     4.00    26.50   389.80  1400.50  8210.32 10714.46 
##     100% 
## 96979.00
storm_health_a <- subset(storm_health_a, storm_health_a$HealthDamage_Qty>=
                                 quantile(storm_health_a$HealthDamage_Qty,c(.9)))

For the period 1950 to 2011, the weather event that caused the greatest harm to human health in the United States was the tornado. Tornadoes caused a combined 96,979 fatalities and injuries, or 62.3% of all fatalities and injuries due to weather-related events in the United States.
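
As a quick arithmetic check, the tornado share can be recomputed from the two totals reported in this section. The variable names are illustrative, and sprintf() is used here only to keep the sketch free of package dependencies.

```r
tornado_total <- 96979    # tornado fatalities plus injuries, from the report
health_total  <- 155673   # all-event total, from the report

share <- tornado_total / health_total
share_label <- sprintf("%.1f%%", 100 * share)
share_label
## [1] "62.3%"
```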

if (!require("knitr")) install.packages("knitr")
## Loading required package: knitr
library(knitr)
kable(storm_health_a[1,])
     Event_Type  HealthDamage_Qty  percent_of_total
148  TORNADO                96979  62.3%

Selecting only the event types at or above the 90th percentile of health damage counts, the weather events causing the greatest harm to human health are:

kable(storm_health_a)
     Event_Type         HealthDamage_Qty  percent_of_total
148  TORNADO                       96979  62.3%
52   HEAT                          12319  7.9%
32   FLOOD                         10121  6.5%
138  THUNDERSTORM WIND             10054  6.5%
94   LIGHTNING                      6046  3.9%
171  WINTER STORM                   5645  3.6%
68   HIGH WIND                      2214  1.4%
168  WILDFIRE                       1696  1.1%
82   HURRICANE/TYPHOON              1446  0.9%
50   HAIL                           1376  0.9%
37   FOG                            1156  0.7%
117  RIP CURRENTS                   1101  0.7%
25   EXTREME COLD                    735  0.5%
174  WINTER WEATHER                  677  0.4%
22   DUST STORM                      462  0.3%
152  TROPICAL STORM                  449  0.3%
64   HIGH SURF                       398  0.3%
2    AVALANCHE                       394  0.3%
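
The percentile cut behind this table can be sketched on a small invented set of per-event totals. The data frame below is hypothetical; quantile() is called with its default interpolation method.

```r
# Hypothetical per-event totals
totals <- data.frame(Event_Type = paste0("EVENT_", 1:10),
                     HealthDamage_Qty = c(1, 1, 2, 4, 4, 9, 20, 50, 400, 9000),
                     stringsAsFactors = FALSE)

# Keep only event types at or above the 90th percentile
cutoff <- quantile(totals$HealthDamage_Qty, 0.9)
top <- totals[totals$HealthDamage_Qty >= cutoff, ]
top
```

Because the totals are heavily skewed, the cutoff falls between the two largest values and only the largest event type survives the filter in this toy case.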

2-D histogram showing the base-10 log of the population health damage count for each weather event.

# Install ggplot2 package for plotting
if (!require("ggplot2")) install.packages("ggplot2")
## Loading required package: ggplot2
library(ggplot2)

ggplot(storm_health_a, aes(reorder(Event_Type, log10(HealthDamage_Qty)),
                           log10(HealthDamage_Qty))) +
        stat_bin2d(bins = 11, colour = "white") +
        # "\n" breaks the title without embedding indentation in the string
        labs(x = "Weather Event",
             y = "Log10(Total Number of People Killed and/or Injured)",
             title = "Weather Events Causing the Highest\nHarm to Population Health for the Years 1950 to 2011") +
        theme(axis.text.x = element_text(angle = 90, size = 10, vjust = 0.5)) +
        theme(legend.position = "none") +
        theme(panel.background = element_rect(fill = 'black'))

Which types of weather events have the greatest economic consequences?

# total of all damage costs
damage_tot <- sum(storm_econ_rev_a$DamageCost)

The total cost of property and crop damage caused by weather events in the United States from 1950 to 2011 is $476,422,842,480.

# % break-down of costs
storm_econ_rev_a$percent_of_total <- percent((storm_econ_rev_a$DamageCost)/damage_tot)

Shown below are the summary statistics for the damage costs.

summary(storm_econ_rev_a)
##   Event_Type          DamageCost           percent_of_total  
##  Length:379         Min.   :           0   Length:379        
##  Class :character   1st Qu.:       12750   Class :character  
##  Mode  :character   Median :      200000   Mode  :character  
##                     Mean   :  1257052355                     
##                     3rd Qu.:     5000000                     
##                     Max.   :178030211924
quantile(storm_econ_rev_a$DamageCost,c(.1,.25,.5,.75,.9,.95,.98,.99,1))
##          10%          25%          50%          75%          90% 
##         3000        12750       200000      5000000     76219040 
##          95%          98%          99%         100% 
##    325052989   9704214708  25183840166 178030211924
storm_econ_rev_a <- subset(storm_econ_rev_a, storm_econ_rev_a$DamageCost >=
                                 quantile(storm_econ_rev_a$DamageCost, c(.95)))

For the period 1950 to 2011, the weather event that caused the greatest economic loss in the United States was the flood.

Floods caused a total of $178,030,211,924 in property and crop damage, or 37.4% of all property and crop damage in the United States due to weather-related events.

kable(storm_econ_rev_a[1,])
     Event_Type    DamageCost  percent_of_total
64   FLOOD       178030211924  37.4%

Selecting only the event types at or above the 95th percentile of damage costs, the weather events with the greatest economic consequences are:

kable(storm_econ_rev_a)
     Event_Type                   DamageCost  percent_of_total
64   FLOOD                      178030211924  37.4%
172  HURRICANE/TYPHOON           90710952810  19.0%
312  TORNADO                     57352114049  12.0%
262  STORM SURGE                 47965579000  10.1%
98   HAIL                        18758221521  3.9%
373  WINTER STORM                16453756561  3.5%
34   DROUGHT                     15018672000  3.2%
273  THUNDERSTORM WIND           10863543990  2.3%
365  WILDFIRE                     8793313130  1.8%
320  TROPICAL STORM               8382236550  1.8%
153  HIGH WIND                    6557661943  1.4%
123  HEAVY RAIN/SEVERE WEATHER    5308060000  1.1%
84   FROST/FREEZE                 1561596000  0.3%
119  HEAVY RAIN                   1427647890  0.3%
47   EXTREME COLD                 1380710400  0.3%
129  HEAVY SNOW                   1067242242  0.2%
196  LIGHTNING                     940751537  0.2%
43   EXCESSIVE HEAT                903414200  0.2%
187  LANDSLIDE                     344613000  0.1%

2-D histogram showing the Base-10 Log of the top damage costs and the associated weather events.

# See the damage costs by 2d histograms
ggplot(storm_econ_rev_a, aes(reorder(Event_Type, log10(DamageCost)),
                             log10(DamageCost))) +
        stat_bin2d(bins = 11, colour = "white") +
        # "\n" breaks the title without embedding indentation in the string
        labs(x = "Weather Event",
             y = "Log10(Damage Costs-Property and/or Crops)",
             title = "Weather Events Causing the Greatest\nEconomic Consequences for the Years 1950 to 2011") +
        theme(axis.text.x = element_text(angle = 90, size = 10, vjust = 0.5)) +
        theme(legend.position = "none") +
        theme(panel.background = element_rect(fill = 'black'))