Synopsis

Data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database were explored to answer what events have the most serious impacts on health and properties across the USA. The data were preprocessed, so that only data younger than 1995 were analysed. The results are reported for three consecutive half-decades to account for eventual so called black swan events. The most harmful events with respect to population health are tornados and excessive heat (former with increasing tendency during years, latter with decreasing one). The greatest economic consequences have hurricanes and floods, though hurricanes were absent from top five most harmful events in the last half-decade.

Data Processing

  1. Preparation of working environment
#clear workspace:
rm(list=ls())

#load necessary libraries:
library(dplyr)
library(lubridate)
library(xtable)
library(ggplot2)

#set working enviro to english settings:
Sys.setlocale("LC_ALL","English")
## [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
  1. Downloading of data
# download zip file if needed and unzip:
# wrote by Leonard Greski, I added rm() and write()
if(!file.exists("repdata-data-StormData.csv.bz2")){
        dlMethod <- "curl"
        if(substr(Sys.getenv("OS"),1,7) == "Windows") dlMethod <- "wininet"
        url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
        download.file(url,destfile='repdata-data-StormData.csv.bz2',method=dlMethod,mode="wb")
        #unzip(zipfile = "repdata-data-StormData.csv.bz2")
        rm(dlMethod, url)
        write(date(), "time_downloaded.txt")
        }
  1. Reading data in
#read in raw data:
a<-read.csv("repdata-data-StormData.csv.bz2")

#select ony target variables:
b<- select(a, STATE, BGN_DATE, EVTYPE, FATALITIES:CROPDMGEXP)
head(b)
##   STATE           BGN_DATE  EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP
## 1    AL  4/18/1950 0:00:00 TORNADO          0       15    25.0          K
## 2    AL  4/18/1950 0:00:00 TORNADO          0        0     2.5          K
## 3    AL  2/20/1951 0:00:00 TORNADO          0        2    25.0          K
## 4    AL   6/8/1951 0:00:00 TORNADO          0        2     2.5          K
## 5    AL 11/15/1951 0:00:00 TORNADO          0        2     2.5          K
## 6    AL 11/15/1951 0:00:00 TORNADO          0        6     2.5          K
##   CROPDMG CROPDMGEXP
## 1       0           
## 2       0           
## 3       0           
## 4       0           
## 5       0           
## 6       0
  1. Data processing part 1 - converting time
#converting dates to recognizable format:
b$begin_date<-mdy_hms(b$BGN_DATE)

#get rid of no longer needed variable:
b$BGN_DATE<-NULL

#get just year:
b$year<-year(b$begin_date)
  1. Data processing part 2 - general cleaning
#get rid of "Summary something"" rows:
b2<-b[!(grepl("^Summary|^SUMMARY", b$EVTYPE)),]

#get rid of unrecognized event type:
b2<-filter(b2, !b2$EVTYPE == "?")
  1. Data processing part 3 - year 1996

According to official web (http://www.ncdc.noaa.gov/stormevents/details.jsp) the data were collected from 1950, thought only for tornados. Later, in 1955, thunderstorm winds and hails were added, and finally in 1996 the events started to be more or less classified into 48 event types. I decide to discard all data before year 1996, so that overall impact of tornados, thunderstorm winds and hails would not be overestimated due to far longer time of data collecting.

#discard everything before 1996:
b3<-filter(b2, year >= 1996)

#also get rid of rows with no fatality, injury or damage - those are not target for analysis:
b4<-filter(b3, !(FATALITIES==0 & INJURIES==0 & PROPDMG==0 & CROPDMG==0))
  1. Data processing part 4 - event types

I manually extracted 48 classified eventy types from the National Weather Service Storm Data Documentation (https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf). I dediced to use just these 48 categories and discard everything else - in case of real data, I would consult authorities about cases not clearly belonging to any category stated above. For this assignment, I chose to discard them completely (as it’s not my job to clean that cursed poorly formated database).

#48 official event types:
events <- c("Astronomical_Low_Tide", "Avalanche", "Blizzard", "Coastal_Flood", 
            "Cold_Wind_Chill", "Debris_Flow", "Dense_Fog", "Dense_Smoke", 
            "Drought", "Dust_Devil", "Dust_Storm", "Excessive_Heat", 
            "Extreme_Cold_Wind_Chill", "Flash_Flood", "Flood", "Frost_Freeze", 
            "Funnel_Cloud", "Freezing_Fog", "Hail", "Heat", "Heavy_Rain", "Heavy_Snow", 
            "High_Surf", "High_Wind", "Hurricane_Typhoon", "Ice_Storm", "Lake-Effect_Snow", 
            "Lakeshore_Flood", "Lightning", "Marine_Hail", "Marine_High_Wind", 
            "Marine_Strong_Wind", "Marine_Thunderstorm_Wind", "Rip_Current", "Seiche", 
            "Sleet", "Storm_Surge_Tide", "Strong_Wind", "Thunderstorm_Wind", "Tornado", 
            "Tropical_Depression", "Tropical_Storm", "Tsunami", "Volcanic_Ash", "Waterspout", 
            "Wildfire", "Winter_Storm", "Winter_Weather") 

#change all event types of dataset to upper cases:
b4$EVTYPE<-toupper(b4$EVTYPE)

#filter out events with correct or near correct names:
b5<-b4[grepl("LOW(.*)TIDE", b4$EVTYPE),]
b5<-rbind(b5, b4[grepl("AVALANCHE", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("BLIZZARD", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("COAST(.*)FLOOD", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("COLD|WIND(.*)CHILL", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("DEBRIS", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("DENS(.*)FOG", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("SMOKE", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("DROUGH|DRY", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("DUST(.*)DEVIL", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("DUST(.*)STORM", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("HEAT|WARMTH", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("FLASH(.*)FLOOD", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("^FLOOD", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("FROST|FREEZ", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("FUNNEL", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("FREEZING(.*)FOG", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("HAIL", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("HEAT", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("HEAVY(.*)RAIN", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("HEAVY(.*)SNOW", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("HIGH(.*)SURF", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("HIGH(.*)WIND", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("HURRICANE|TYPHOON", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("ICE(.*)STORM", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("EFFECT(.*)SNOW", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("LAKESH(.*)FLOOD", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("LIGHTNING", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("^MARINE", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("RIP(.*)CURRENT", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("SEICHE", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("SLEET", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("STORM(.*)(SURGE|TIDE)", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("STRONG(.*)WIND", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("(THUNDERSTORM|TSTM)(.*)WIND", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("TORNADO", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("TROPIC(.*)DEPRESS", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("TROPIC(.*)STORM", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("TSUNAMI", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("VOLCAN(.*)ASH", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("WATERSPOUT", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("WILDFIRE", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("WINTER(.*)STORM", b4$EVTYPE),])
b5<-rbind(b5, b4[grepl("WINTER(.*)WEATHER", b4$EVTYPE),])

#filter out two cases that get through grepl statements and I dont want them:
b5<-filter(b5, !b5$EVTYPE=="MARINE ACCIDENT")
b5<-filter(b5, !b5$EVTYPE=="NON-TSTM WIND")

#rename event types, so that there is max of 48 types at the end:
b5$EVTYPE[grep("LOW(.*)TIDE", b5$EVTYPE)]<-"Low tide"
b5$EVTYPE[grep("AVALANCHE", b5$EVTYPE)]<-"Avalanche"
b5$EVTYPE[grep("BLIZZARD", b5$EVTYPE)]<-"Blizzard"
b5$EVTYPE[grep("COAST(.*)FLOOD", b5$EVTYPE)]<-"Coastal flood"
b5$EVTYPE[grep("COLD|WIND(.*)CHILL", b5$EVTYPE)]<-"Cold or Wind chill"
b5$EVTYPE[grep("DEBRIS", b5$EVTYPE)]<-"Debris flow"
b5$EVTYPE[grep("DENS(.*)FOG", b5$EVTYPE)]<-"Dense fog"
b5$EVTYPE[grep("SMOKE", b5$EVTYPE)]<-"Dense smoke"
b5$EVTYPE[grep("DROUGH|DRY", b5$EVTYPE)]<-"Drought"
b5$EVTYPE[grep("DUST(.*)DEVIL", b5$EVTYPE)]<-"Dust devil"
b5$EVTYPE[grep("DUST(.*)STORM", b5$EVTYPE)]<-"Dust storm"
b5$EVTYPE[grep("HEAT|WARMTH", b5$EVTYPE)]<-"Excessive heat"
b5$EVTYPE[grep("FLASH(.*)FLOOD", b5$EVTYPE)]<-"Flash flood"
b5$EVTYPE[grep("^FLOOD", b5$EVTYPE)]<-"Flood"
b5$EVTYPE[grep("FROST|FREEZ", b5$EVTYPE)]<-"Frost or Freeze"
b5$EVTYPE[grep("FUNNEL", b5$EVTYPE)]<-"Funnel cloud"
b5$EVTYPE[grep("FREEZING(.*)FOG", b5$EVTYPE)]<-"Freezing fog"
b5$EVTYPE[grep("HAIL", b5$EVTYPE)]<-"Hail"
b5$EVTYPE[grep("HEAT", b5$EVTYPE)]<-"Heat"
b5$EVTYPE[grep("HEAVY(.*)RAIN", b5$EVTYPE)]<-"Heavy rain"
b5$EVTYPE[grep("HEAVY(.*)SNOW", b5$EVTYPE)]<-"Heavy snow"
b5$EVTYPE[grep("SURF", b5$EVTYPE)]<-"High surf"
b5$EVTYPE[grep("HIGH(.*)WIND", b5$EVTYPE)]<-"High wind"
b5$EVTYPE[grep("HURRICANE|TYPHOON", b5$EVTYPE)]<-"Hurricane"
b5$EVTYPE[grep("ICE(.*)STORM", b5$EVTYPE)]<-"Ice storm"
b5$EVTYPE[grep("EFFECT(.*)SNOW", b5$EVTYPE)]<-"Lake-effect snow"
b5$EVTYPE[grep("LAKESH(.*)FLOOD", b5$EVTYPE)]<-"Lakeshore flood"
b5$EVTYPE[grep("LIGHTNING", b5$EVTYPE)]<-"Lightning"
b5$EVTYPE[grep("^MARINE(.*)HAIL", b5$EVTYPE)]<-"Marine hail"
b5$EVTYPE[grep("^MARINE(.*)HIGH", b5$EVTYPE)]<-"Marine high wind"
b5$EVTYPE[grep("^MARINE(.*)STRONG", b5$EVTYPE)]<-"Marine strong wild"
b5$EVTYPE[grep("^MARINE(.*)(THUNDER|TSTM)", b5$EVTYPE)]<-"Marine thunderstorm"
b5$EVTYPE[grep("RIP(.*)CURRENT", b5$EVTYPE)]<-"Rip current"
b5$EVTYPE[grep("SEICHE", b5$EVTYPE)]<-"Seiche"
b5$EVTYPE[grep("SLEET", b5$EVTYPE)]<-"Sleet"
b5$EVTYPE[grep("STORM(.*)(SURGE|TIDE)", b5$EVTYPE)]<-"Storm surge"
b5$EVTYPE[grep("STRONG(.*)WIND", b5$EVTYPE)]<-"Strong wind"
b5$EVTYPE[grep("^TSTM|^THUNDERSTORM| TSTM", b5$EVTYPE)]<-"Thunderstorm wind"
b5$EVTYPE[grep("TORNADO", b5$EVTYPE)]<-"Tornado"
b5$EVTYPE[grep("TROPIC(.*)DEPRESS", b5$EVTYPE)]<-"Tropic depression"
b5$EVTYPE[grep("TROPIC(.*)STORM", b5$EVTYPE)]<-"Tropic storm"
b5$EVTYPE[grep("TSUNAMI", b5$EVTYPE)]<-"Tsunami"
b5$EVTYPE[grep("VOLCAN(.*)ASH", b5$EVTYPE)]<-"Volcanic ash"
b5$EVTYPE[grep("WATERSPOUT", b5$EVTYPE)]<-"Waterspout"
b5$EVTYPE[grep("WILDFIRE", b5$EVTYPE)]<-"Wildfire"
b5$EVTYPE[grep("WINTER(.*)STORM", b5$EVTYPE)]<-"Winter storm"
b5$EVTYPE[grep("WINTER(.*)WEATHER", b5$EVTYPE)]<-"Winter storm"

#check the event types, if there are only those I want and if their names are recoded properly:
d<-as.character(unique(b5$EVTYPE))
d<-sort(d)
d
##  [1] "Avalanche"           "Blizzard"            "Coastal flood"      
##  [4] "Cold or Wind chill"  "Dense fog"           "Dense smoke"        
##  [7] "Drought"             "Dust devil"          "Dust storm"         
## [10] "Excessive heat"      "Flash flood"         "Flood"              
## [13] "Frost or Freeze"     "Funnel cloud"        "Hail"               
## [16] "Heavy rain"          "Heavy snow"          "High surf"          
## [19] "High wind"           "Hurricane"           "Ice storm"          
## [22] "Lake-effect snow"    "Lakeshore flood"     "Lightning"          
## [25] "Low tide"            "Marine strong wild"  "Marine thunderstorm"
## [28] "Rip current"         "Seiche"              "Storm surge"        
## [31] "Strong wind"         "Thunderstorm wind"   "Tornado"            
## [34] "Tropic depression"   "Tropic storm"        "Tsunami"            
## [37] "Volcanic ash"        "Waterspout"          "Wildfire"           
## [40] "Winter storm"
  1. Data processing part 5 - exponents of damage values

From pdf file reffered in previous part, the only valid marks for exponents are “K” for thousands, “M” for millions, and “B” for billions.

#lets see what mess is in this variable:
table(b5$PROPDMGEXP)
## 
##             -      ?      +      0      1      2      3      4      5 
##   8798      0      0      0      0      0      0      0      0      0 
##      6      7      8      B      h      H      K      m      M 
##      0      0      0     31      0      0 184756      0   7246
table(b5$CROPDMGEXP)
## 
##             ?      0      2      B      k      K      m      M 
## 102143      0      0      0      2      0  96923      0   1763
#so only B, K, and M are used

#create variables for filling in costs with exponents:
b6 <- mutate(b5, prop_cost=PROPDMG, crop_cost=CROPDMG)
#property damage:
b6$prop_cost<-ifelse(b6$PROPDMGEXP=="K", b6$prop_cost*1000, b6$prop_cost)
b6$prop_cost<-ifelse(b6$PROPDMGEXP=="M", b6$prop_cost*1000000, b6$prop_cost)
b6$prop_cost<-ifelse(b6$PROPDMGEXP=="B", b6$prop_cost*1000000000, b6$prop_cost)
#crop damage:
b6$crop_cost<-ifelse(b6$PROPDMGEXP=="K", b6$crop_cost*1000, b6$crop_cost)
b6$crop_cost<-ifelse(b6$PROPDMGEXP=="M", b6$crop_cost*1000000, b6$crop_cost)
b6$crop_cost<-ifelse(b6$PROPDMGEXP=="B", b6$crop_cost*1000000000, b6$crop_cost)
  1. Data processing part 6 - final dataset
#select only target variables
fin<-select(b6, EVTYPE:INJURIES, year:crop_cost)
head(fin)
##      EVTYPE FATALITIES INJURIES year prop_cost crop_cost
## 1  Low tide          0        0 2007    120000         0
## 2  Low tide          0        0 2008    200000         0
## 3 Avalanche          0        2 1996         0         0
## 4 Avalanche          1        1 1996         0         0
## 5 Avalanche          2        0 1996         0         0
## 6 Avalanche          0        2 1996         0         0

RESULTS

I decide to report not only effects on whole time scale (1996 to 2011; table data), but also for three consecutive half-decades (1996 to 2000, 2001 to 2005, and 2006 to 2011; table and plot). Rationale behind it is that 11 year may be too long interval in current changing, unstable and variable clima, so that shorter intervals, like 5 years, could show some interesting trends or at least changes in what events are the most harmful.

Across the United States, which types of events are most harmful with respect to population health?

I work with simple sum of number of fatalities and injuries as a representation of harm to population health.

#create aggregated dataset with event type and health impact:
fin_health<-group_by(fin, EVTYPE) %>%
        summarise(health_impact=sum(FATALITIES+INJURIES))
         
#arrange dataset so that the most hamrful events are at the beginning:
fin_health<-arrange(fin_health, desc(health_impact))

Tab.1. Number of fatalities and injuries together caused by events in years 1996 to 2011 across USA:

#print table
print(xtable(fin_health), type="html")
EVTYPE health_impact
1 Tornado 22178.00
2 Excessive heat 19438.00
3 Flood 7172.00
4 Thunderstorm wind 5407.00
5 Lightning 4792.00
6 Flash flood 2561.00
7 Winter storm 2027.00
8 Hurricane 1453.00
9 High wind 1322.00
10 Rip current 1045.00
11 Wildfire 986.00
12 Hail 930.00
13 Heavy snow 807.00
14 Cold or Wind chill 521.00
15 Blizzard 455.00
16 Strong wind 409.00
17 Ice storm 400.00
18 Tropic storm 395.00
19 Dust storm 387.00
20 Avalanche 379.00
21 High surf 330.00
22 Heavy rain 324.00
23 Tsunami 162.00
24 Dense fog 152.00
25 Marine thunderstorm 106.00
26 Marine strong wild 72.00
27 Storm surge 55.00
28 Dust devil 41.00
29 Drought 32.00
30 Frost or Freeze 22.00
31 Coastal flood 13.00
32 Waterspout 4.00
33 Funnel cloud 1.00
34 Dense smoke 0.00
35 Lake-effect snow 0.00
36 Lakeshore flood 0.00
37 Low tide 0.00
38 Seiche 0.00
39 Tropic depression 0.00
40 Volcanic ash 0.00

Now for the half-decades part. Categories I chose are, as their names say, half a decade, 5 year - from 1996 to 2000, from 2001 to 2005, and from 2006 to 2011 (this one is 6 years long, as the data represent 16 years long data collection).

#dataset with half-decades:
fin2<-fin
fin2$halfdecade<-cut(fin2$year, c(1995,2000,2006,2011), labels=c("1996-2000", "2001-2005", "2006-2011"))

#aggregate for decade and event type, arrange by health_impact, rank by health impact in halfdecades:
fin2_health<-group_by(fin2, halfdecade, EVTYPE) %>%
        summarise(health_impact=sum(FATALITIES+INJURIES)) %>%
        arrange(desc(health_impact)) %>%
        mutate(rank=1:length(halfdecade))

#create tables for printing:
tab_1 <- filter(fin2_health, halfdecade=="1996-2000")
tab_2 <- filter(fin2_health, halfdecade=="2001-2005")
tab_3 <- filter(fin2_health, halfdecade=="2006-2011")

tab_123<-merge(tab_1, tab_2, by="EVTYPE", all=T)
tab_123<-merge(tab_123, tab_3, by="EVTYPE", all=T)
tab_123<-select(tab_123, -contains("halfdecade"))

#rename collumns
names(tab_123)<-c("Event_type", "health_impact_96-00", "rank_96-00", "health_impact_01-05", 
                  "rank_01-05", "health_impact_06-11", "rank_06-11")

Tab.2. Number of fatalities and injuries together caused by events in three consecutive half-decades:

#print:
print(xtable(tab_123), type="html")
Event_type health_impact_96-00 rank_96-00 health_impact_01-05 rank_01-05 health_impact_06-11 rank_06-11
1 Avalanche 107.00 17 134.00 18 138.00 16
2 Blizzard 414.00 10 33.00 24 8.00 27
3 Coastal flood 7.00 25 2.00 29 4.00 28
4 Cold or Wind chill 245.00 14 118.00 21 158.00 15
5 Dense fog 149.00 17 3.00 29
6 Dense smoke 0.00 30
7 Drought 20.00 24 12.00 25 0.00 31
8 Dust devil 3.00 26 12.00 26 26.00 21
9 Dust storm 164.00 15 212.00 15 11.00 26
10 Excessive heat 8338.00 1 7334.00 1 3766.00 2
11 Flash flood 1208.00 6 744.00 6 609.00 5
12 Flood 6603.00 3 237.00 13 332.00 9
13 Frost or Freeze 22.00 23 0.00 31 0.00 32
14 Funnel cloud 1.00 28 0.00 32 0.00 33
15 Hail 377.00 11 371.00 11 182.00 11
16 Heavy rain 162.00 16 110.00 22 52.00 19
17 Heavy snow 528.00 9 235.00 14 44.00 20
18 High surf 29.00 22 124.00 19 177.00 12
19 High wind 651.00 8 504.00 9 167.00 13
20 Hurricane 94.00 18 1344.00 5 15.00 25
21 Ice storm 263.00 12 120.00 20 17.00 23
22 Lake-effect snow 0.00 29 0.00 33 0.00 34
23 Lakeshore flood 0.00 34 0.00 35
24 Lightning 1756.00 5 1954.00 3 1082.00 4
25 Low tide 0.00 36
26 Marine strong wild 4.00 27 68.00 18
27 Marine thunderstorm 36.00 23 70.00 17
28 Rip current 246.00 13 465.00 10 334.00 8
29 Seiche 0.00 30 0.00 35 0.00 37
30 Storm surge 36.00 21 3.00 28 16.00 24
31 Strong wind 65.00 20 160.00 16 184.00 10
32 Thunderstorm wind 2306.00 4 1580.00 4 1521.00 3
33 Tornado 6695.00 2 5012.00 2 10471.00 1
34 Tropic depression 0.00 36 0.00 38
35 Tropic storm 78.00 19 296.00 12 21.00 22
36 Tsunami 0.00 37 162.00 14
37 Volcanic ash 0.00 38
38 Waterspout 3.00 27 1.00 30 0.00 39
39 Wildfire 531.00 8 455.00 6
40 Winter storm 1086.00 7 532.00 7 409.00 7

Fig.1. Plot of top five most harmful events:

#dataset for plotting top five most harmful:
fin2_health2<-filter(fin2_health, rank<6)

#plot is:
ggplot(fin2_health2, aes(x=factor(EVTYPE), y=health_impact, fill=factor(EVTYPE))) + 
        geom_bar(stat="identity") + 
        facet_wrap(~halfdecade) +
        theme(axis.text.x  = element_text(angle=90, vjust=0.5, hjust=1)) +
        labs(title="Top five most harmful events in years 1996-2011 in USA
             splitted up to 3 half-decades",
             x="", y="Number of killed or injured people",
             fill="Event types")

As can be seen from the plot, the most harmful types of events are tornados followed by excessive heat. The decline in numbers of injuries and fatalities due to warmth could be possibly linked with widespread of air-conditionig. One can also notice the impact of floods in first half-decade, that disappears from top five in next two decades. Overall data in Tab.1. shows similar picture.

Across the United States, which types of events have the greatest economic consequences?

#aggreagate property and crop damage:
fin_dmg<-group_by(fin, EVTYPE) %>%
        summarise(damage=(sum(prop_cost+crop_cost)/1000000))

#arrange dataset so that the most costly events are at the beginning:
fin_dmg<-arrange(fin_dmg, desc(damage))

Tab.3. Number of fatalities and injuries together caused by events in years 1996 to 2011 across USA:

#print table
print(xtable(fin_dmg), type="html")
EVTYPE damage
1 Hurricane 1617419.53
2 Flood 226156.35
3 Tornado 49924.00
4 Flash flood 48669.60
5 Storm surge 47840.57
6 Hail 27672.74
7 Thunderstorm wind 20945.05
8 High wind 12425.03
9 Wildfire 11932.48
10 Tropic storm 9460.48
11 Drought 4491.34
12 Ice storm 3656.71
13 Winter storm 2238.11
14 Heavy snow 1601.65
15 Lightning 1044.61
16 Heavy rain 644.89
17 Blizzard 532.72
18 Cold or Wind chill 432.52
19 Coastal flood 390.24
20 Frost or Freeze 236.34
21 Strong wind 226.96
22 Tsunami 164.06
23 High surf 93.97
24 Lake-effect snow 40.18
25 Excessive heat 19.67
26 Marine thunderstorm 11.81
27 Lakeshore flood 7.54
28 Dense fog 7.32
29 Dust storm 6.33
30 Waterspout 5.73
31 Avalanche 3.71
32 Tropic depression 1.74
33 Seiche 0.98
34 Marine strong wild 0.84
35 Dust devil 0.66
36 Volcanic ash 0.50
37 Low tide 0.32
38 Rip current 0.16
39 Funnel cloud 0.13
40 Dense smoke 0.10

Now for the half-decades part. The categories are the same as for health impact.

#aggregate for decade and event type, arrange by health_impact, rank by health impact in halfdecades:
fin2_dmg<-group_by(fin2, halfdecade, EVTYPE) %>%
        summarise(damage=(sum(prop_cost+crop_cost)/1000000)) %>%
        arrange(desc(damage)) %>%
        mutate(rank=1:length(halfdecade))

#create tables for printing:
tab_4 <- filter(fin2_dmg, halfdecade=="1996-2000")
tab_5 <- filter(fin2_dmg, halfdecade=="2001-2005")
tab_6 <- filter(fin2_dmg, halfdecade=="2006-2011")

tab_456<-merge(tab_4, tab_5, by="EVTYPE", all=T)
tab_456<-merge(tab_456, tab_6, by="EVTYPE", all=T)
tab_456<-select(tab_456, -contains("halfdecade"))

#rename collumns
names(tab_456)<-c("Event_type", "damage_96-00", "rank_96-00", "damage_01-05", 
                  "rank_01-05", "damage_06-11", "rank_06-11")

Tab.4. Cost of damage in millions of dollars caused by events in three consecutive half-decades:

#print:
print(xtable(tab_456), type="html")
Event_type damage_96-00 rank_96-00 damage_01-05 rank_01-05 damage_06-11 rank_06-11
1 Avalanche 0.95 23 0.38 30 2.39 30
2 Blizzard 412.75 15 91.23 18 28.74 21
3 Coastal flood 156.33 17 69.37 19 164.55 14
4 Cold or Wind chill 420.98 14 1.92 26 9.63 24
5 Dense fog 4.56 25 2.76 29
6 Dense smoke 0.10 36
7 Drought 1130.34 9 3350.08 11 10.92 23
8 Dust devil 0.06 28 0.22 31 0.38 34
9 Dust storm 1.44 22 1.71 27 3.18 28
10 Excessive heat 6.64 20 7.34 24 5.69 26
11 Flash flood 13019.20 3 17583.12 4 18067.29 3
12 Flood 29675.69 2 157361.54 2 39119.13 1
13 Frost or Freeze 110.81 18 0.07 34 125.47 16
14 Funnel cloud 0.05 29 0.01 37 0.07 38
15 Hail 10195.74 5 8444.26 7 9032.73 4
16 Heavy rain 176.98 16 354.48 14 113.43 17
17 Heavy snow 1340.11 8 193.72 16 67.82 19
18 High surf 0.56 25 10.45 22 82.97 18
19 High wind 1107.01 10 6251.88 9 5066.13 6
20 Hurricane 811858.78 1 802912.65 1 2648.11 8
21 Ice storm 1673.30 7 1203.37 12 780.05 12
22 Lake-effect snow 0.06 27 23.41 20 16.71 22
23 Lakeshore flood 0.01 38 7.53 25
24 Lightning 485.50 13 255.52 15 303.60 13
25 Low tide 0.32 35
26 Marine strong wild 0.03 35 0.80 33
27 Marine thunderstorm 10.84 21 0.97 32
28 Rip current 0.00 30 0.16 32 0.00 39
29 Seiche 0.82 24 0.07 33 0.09 37
30 Storm surge 28.76 19 43170.32 3 4641.49 7
31 Strong wind 2.62 21 177.82 17 46.52 20
32 Thunderstorm wind 8103.28 6 4062.23 10 8779.55 5
33 Tornado 11400.57 4 13927.03 5 24596.41 2
34 Tropic depression 0.43 29 1.30 31
35 Tropic storm 508.11 12 7955.43 8 996.95 10
36 Tsunami 9.20 23 154.86 15
37 Volcanic ash 0.50 28
38 Waterspout 0.50 26 0.03 36 5.20 27
39 Wildfire 9640.04 6 2292.44 9
40 Winter storm 839.90 11 469.73 13 928.48 11

Fig.2. Plot of top five most harmful events:

#dataset for plotting top five most harmful:
fin2_dmg2<-filter(fin2_dmg, rank<6)

#plot is:
ggplot(fin2_dmg2, aes(x=factor(EVTYPE), y=damage, fill=factor(EVTYPE))) + 
        geom_bar(stat="identity") + 
        facet_wrap(~halfdecade) +
        theme(axis.text.x  = element_text(angle=90, vjust=0.25, hjust=1)) +
        labs(title="Top five events with the greatest economic consequences 
             in years 1996-2011 in USA splitted up to 3 half-decades",
             x=" ", y="Damage to properties and crops (millions of $)",
             fill="Event types")

The hurricanes were the most prominent cause of destruction in first two half-decades, though in third half-decade their impact is lower and below top five most costly events. Second in rank are floods causing high damage in every half-decade. In overall data those two events exceed others in two and one levels of magnitude of how costly their impact is.