About the project

The goal of the project is to clean a raw database and present data on the human and economic losses caused by natural disasters in the US over the entire recorded period, from January 1950 to November 2011.

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The data documentation may be found here: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf

The data itself may be found here: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2

The project is written in R, a free software environment and programming language for statistical computing. The code samples are reproduced in this text.

Preparing the R environment

Data Processing

The data are downloaded directly from the source URL:

library(data.table)  # fread(), data.table()
library(dplyr)       # select(), filter()

my_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# Method 1 (not used): ad <- readr::read_csv(my_url)
# A copy of this file will be kept in the GitHub project.
# Method 2 (used here): fread() needs around 1 minute to process the file.
ad <- data.table::fread(my_url)

# Preview the first rows; in a knitted report use knitr::kable(head(ad, 5)).
# head(ad, 5)
str(ad)
## Classes 'data.table' and 'data.frame':   902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
##  - attr(*, ".internal.selfref")=<externalptr>
str(ad$BGN_DATE)
##  chr [1:902297] "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" ...
# Direct assignment of the strptime() result was not practical on a dataset of this size:
# ad$date1 <- strptime(ad$BGN_DATE, format = "%m/%d/%Y %H:%M:%S")
ad$date1 <- ad$BGN_DATE
strptime("4/18/1950 0:00:00", format = "%m/%d/%Y %H:%M:%S") # Test whether the format string parses our dates
## [1] "1950-04-18 CET"
ad[['date1']] <- strptime(ad[['date1']], format='%m/%d/%Y %H:%M:%S')
ad$date1[1:10] # Output OK: the date strings were parsed
##  [1] "1950-04-18 CET" "1950-04-18 CET" "1951-02-20 CET" "1951-06-08 CET"
##  [5] "1951-11-15 CET" "1951-11-15 CET" "1951-11-16 CET" "1952-01-22 CET"
##  [9] "1952-02-13 CET" "1952-02-13 CET"
### Method 2 - very slow - not used here
# install.packages("anytime")
# library(anytime)
anytime::anytime("4/18/1950 0:00:00") # Good: the format is recognized
## [1] "1950-04-18 CET"
# ad$date1 <- anytime(ad$BGN_DATE) # Effective, but very slow on the full dataset
### Method 3 - not used in this example, but a possible alternative
# Clean the BGN_DATE variable by stripping the constant time component
# date1 <- ad$BGN_DATE
# date2 <- sub("0:00:00", "", date1, ignore.case = TRUE)
# head(date2)
# ad <- mutate(ad, BGN_DATE = date2)
# rm(date1, date2)

str(ad$date1)
##  POSIXlt[1:902297], format: "1950-04-18" "1950-04-18" "1951-02-20" "1951-06-08" "1951-11-15" ...
names(ad)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"     "date1"
ad$date1 = as.POSIXct(ad$date1) # data.table does not support POSIXlt columns (a documented limitation, not a bug in R); POSIXct is supported
ad$date1[1:10] # Test shows OK :)
##  [1] "1950-04-18 CET" "1950-04-18 CET" "1951-02-20 CET" "1951-06-08 CET"
##  [5] "1951-11-15 CET" "1951-11-15 CET" "1951-11-16 CET" "1952-01-22 CET"
##  [9] "1952-02-13 CET" "1952-02-13 CET"
ad <- data.table::data.table(ad) # Otherwise the date1 column (date-time format) is not handled correctly
# From ?data.table: "POSIXlt is not supported as a column type because it uses 40 bytes to store a single datetime. Unexpected errors may occur if you manage to create a column of type POSIXlt. Please see NEWS for 1.6.3, and IDateTime instead. IDateTime has methods to convert to and from POSIXlt."
# Keep only the columns needed for the analysis
df1 <- dplyr::select(ad, STATE__, date1, COUNTY, COUNTYNAME, STATE, EVTYPE, BGN_RANGE, LENGTH, WIDTH, F, MAG,
  FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, LATITUDE, LONGITUDE, LATITUDE_E, LONGITUDE_, REFNUM)
dim(ad)
## [1] 902297     38
dim(df1)
## [1] 902297     21
# The unnecessary columns were successfully removed :)
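
The date conversion earlier in this section parses to POSIXlt with strptime() and then converts to POSIXct. A minimal sketch of parsing straight to POSIXct is given here, assuming a small illustrative vector instead of the full dataset. The fixed time zone `tz = "UTC"` is an assumption made so the result does not depend on the local machine; the report itself used the local zone (CET).

```r
# Illustrative sample of BGN_DATE strings in the format shown by str(ad)
bgn_date <- c("4/18/1950 0:00:00", "2/20/1951 0:00:00", "6/8/1951 0:00:00")

# as.POSIXct() with an explicit format parses directly to POSIXct,
# the date-time class that data.table supports as a column type.
# tz = "UTC" is an assumption for reproducibility, not the report's setting.
date1 <- as.POSIXct(bgn_date, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

class(date1)
format(date1, "%Y-%m-%d")
```

On the full table the same call could be applied to `ad$BGN_DATE`, which would avoid the intermediate POSIXlt column altogether.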


# Data Processing
# number of unique event types
length(unique(df1$EVTYPE))
## [1] 985
# translate all letters to lowercase
unique_events_by_types <- tolower(df1$EVTYPE)
length(unique(unique_events_by_types)) #898
## [1] 898
# replace every punctuation or blank character with a space
unique_events_by_types <- gsub("[[:blank:][:punct:]]", " ", unique_events_by_types)
length(unique(unique_events_by_types)) # 874 - some labels differed only in punctuation or spacing, so they now collapse into one
## [1] 874
# update the data frame
df1$EVTYPE1 <- unique_events_by_types
unique_events_by_types[1:100]
##   [1] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tornado"  
##   [7] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tornado"  
##  [13] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tornado"  
##  [19] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tornado"  
##  [25] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tornado"  
##  [31] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tornado"  
##  [37] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tornado"  
##  [43] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tornado"  
##  [49] "tornado"   "tornado"   "tornado"   "tornado"   "tornado"   "tstm wind"
##  [55] "hail"      "hail"      "tstm wind" "hail"      "tstm wind" "tstm wind"
##  [61] "hail"      "hail"      "hail"      "tstm wind" "tstm wind" "tstm wind"
##  [67] "hail"      "tornado"   "tstm wind" "tornado"   "tstm wind" "tstm wind"
##  [73] "tstm wind" "hail"      "tstm wind" "tornado"   "tornado"   "tornado"  
##  [79] "tornado"   "tornado"   "tstm wind" "hail"      "tstm wind" "tstm wind"
##  [85] "tstm wind" "tstm wind" "tstm wind" "hail"      "hail"      "tornado"  
##  [91] "tstm wind" "hail"      "tstm wind" "tornado"   "tstm wind" "tstm wind"
##  [97] "tstm wind" "tstm wind" "tornado"   "hail"
# The number of distinct event types dropped from 985 to 874 after this cleaning step.
# As a next step, we could ask weather scientists whether similar labels (for example, the different kinds of wind) can be merged into a single event type.
summary(unique(unique_events_by_types))
##    Length     Class      Mode 
##       874 character character
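
The cleaning steps above replace each punctuation or blank character with a space, which can leave doubled or trailing spaces in the labels. The following is a minimal sketch of a slightly stricter normalisation that also collapses runs of whitespace and trims the ends. The function name `normalize_evtype` and the sample labels are hypothetical and chosen for illustration; they are not part of the original analysis.

```r
# Hypothetical helper: lower-case, strip punctuation, collapse whitespace, trim
normalize_evtype <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", " ", x)   # each punctuation character becomes a space
  x <- gsub("[[:space:]]+", " ", x)  # runs of whitespace collapse to one space
  trimws(x)                          # drop leading and trailing spaces
}

# Illustrative raw labels (not read from the dataset)
raw <- c("TSTM WIND", "Tstm Wind", "HIGH WIND/ BLIZZARD", "FLOOD WATCH/", " HAIL")

normalize_evtype(raw)
length(unique(normalize_evtype(raw)))
```

Applied to `df1$EVTYPE`, this would probably reduce the number of distinct labels a little further, although the exact count has not been checked here.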

Discussion about the data cleaning

The official documentation lists the following 48 event types:

  • ASTRONOMICAL LOW TIDE
  • AVALANCHE
  • BLIZZARD
  • COASTAL FLOOD
  • COLD/WIND CHILL
  • DEBRIS FLOW
  • DENSE FOG
  • DENSE SMOKE
  • DROUGHT
  • DUST DEVIL
  • DUST STORM
  • EXCESSIVE HEAT
  • EXTREME COLD/WIND CHILL
  • FLASH FLOOD
  • FLOOD
  • FROST/FREEZE
  • FUNNEL CLOUD
  • FREEZING FOG
  • HAIL
  • HEAT
  • HEAVY RAIN
  • HEAVY SNOW
  • HIGH SURF
  • HIGH WIND
  • HURRICANE (TYPHOON)
  • ICE STORM
  • LAKE-EFFECT SNOW
  • LAKESHORE FLOOD
  • LIGHTNING
  • MARINE HAIL
  • MARINE HIGH WIND
  • MARINE STRONG WIND
  • MARINE THUNDERSTORM WIND
  • RIP CURRENT
  • SEICHE
  • SLEET
  • STORM SURGE/TIDE
  • STRONG WIND
  • THUNDERSTORM WIND
  • TORNADO
  • TROPICAL DEPRESSION
  • TROPICAL STORM
  • TSUNAMI
  • VOLCANIC ASH
  • WATERSPOUT
  • WILDFIRE
  • WINTER STORM
  • WINTER WEATHER
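
The official list can also be used programmatically to see which raw labels already match it. The sketch below assumes an abbreviated `official` vector (only a few of the 48 names) and a made-up `evtype` sample; both are illustrative and not taken from the analysis code.

```r
# Abbreviated subset of the official event names (illustrative, not all 48)
official <- c("AVALANCHE", "BLIZZARD", "FLASH FLOOD", "FLOOD",
              "HAIL", "THUNDERSTORM WIND", "TORNADO")

# Made-up sample of raw labels in the style of the EVTYPE column
evtype <- c("TORNADO", "TSTM WIND", "HAIL", "Flash Flood", "AVALANCE", "FLOOD")

# Case-insensitive exact match against the official names
is_official <- toupper(evtype) %in% official

evtype[!is_official]  # labels that need manual or pattern-based mapping
mean(is_official)     # share of labels that already match
```

Labels that fail this check, such as abbreviations or misspellings, are the ones the pattern-based methods in the next section have to deal with.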

Across the United States, which types of events are most harmful with respect to population health?

Data Cleaning

This is how the data cleaning is done (alternative methods are also presented here):

#### Method 1: exact matching on the official event names
#ASTRONOMICAL LOW TIDE
my_tidy_data<-df1[which(EVTYPE=="ASTRONOMICAL LOW TIDE"), ]
#AVALANCHE
my_temporary_file<-df1[which(EVTYPE=="AVALANCHE"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#BLIZZARD
my_temporary_file<-df1[which(EVTYPE=="BLIZZARD"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#COASTAL FLOOD
my_temporary_file<-df1[which(EVTYPE=="COASTAL FLOOD"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#COLD/WIND CHILL
my_temporary_file<-df1[which(EVTYPE=="COLD/WIND CHILL"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#DEBRIS FLOW
my_temporary_file<-df1[which(EVTYPE=="DEBRIS FLOW"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#DENSE FOG
my_temporary_file<-df1[which(EVTYPE=="DENSE FOG"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#DENSE SMOKE
my_temporary_file<-df1[which(EVTYPE=="DENSE SMOKE"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#DROUGHT
my_temporary_file<-df1[which(EVTYPE=="DROUGHT"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#DUST DEVIL
my_temporary_file<-df1[which(EVTYPE=="DUST DEVIL"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#DUST STORM
my_temporary_file<-df1[which(EVTYPE=="DUST STORM"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#EXCESSIVE HEAT
my_temporary_file<-df1[which(EVTYPE=="EXCESSIVE HEAT"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#EXTREME COLD/WIND CHILL
my_temporary_file<-df1[which(EVTYPE=="EXTREME COLD/WIND CHILL"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#FLASH FLOOD
my_temporary_file<-df1[which(EVTYPE=="FLASH FLOOD"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#FLOOD
my_temporary_file<-df1[which(EVTYPE=="FLOOD"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#FROST/FREEZE
my_temporary_file<-df1[which(EVTYPE=="FROST/FREEZE"|EVTYPE=="Frost/Freeze"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#FUNNEL CLOUD
my_temporary_file<-df1[which(EVTYPE=="FUNNEL CLOUD"|EVTYPE=="Funnel Cloud"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#FREEZING FOG
my_temporary_file<-df1[which(EVTYPE=="FREEZING FOG"|EVTYPE=="Freezing Fog"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#HAIL
my_temporary_file<-df1[which(EVTYPE=="HAIL"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#HEAT
my_temporary_file<-df1[which(EVTYPE=="HEAT"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#HEAVY RAIN
my_temporary_file<-df1[which(EVTYPE=="HEAVY RAIN"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#HEAVY SNOW
my_temporary_file<-df1[which(EVTYPE=="HEAVY SNOW"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#HIGH SURF
my_temporary_file<-df1[which(EVTYPE=="HIGH SURF"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#HIGH WIND
my_temporary_file<-df1[which(EVTYPE=="HIGH WIND"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#HURRICANE (TYPHOON)
my_temporary_file<-df1[which(EVTYPE=="HURRICANE/TYPHOON"|EVTYPE=="TYPHOON"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#ICE STORM
my_temporary_file<-df1[which(EVTYPE=="ICE STORM"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#LAKE-EFFECT SNOW
my_temporary_file<-df1[which(EVTYPE=="LAKE-EFFECT SNOW"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#LAKESHORE FLOOD
my_temporary_file<-df1[which(EVTYPE=="LAKESHORE FLOOD"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#LIGHTNING
my_temporary_file<-df1[which(EVTYPE=="LIGHTNING"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#MARINE HAIL
my_temporary_file<-df1[which(EVTYPE=="MARINE HAIL"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#MARINE HIGH WIND
my_temporary_file<-df1[which(EVTYPE=="MARINE HIGH WIND"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#MARINE STRONG WIND
my_temporary_file<-df1[which(EVTYPE=="MARINE STRONG WIND"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#MARINE THUNDERSTORM WIND
my_temporary_file<-df1[which(EVTYPE=="MARINE THUNDERSTORM WIND"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#RIP CURRENT
my_temporary_file<-df1[which(EVTYPE=="RIP CURRENT"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#SEICHE
my_temporary_file<-df1[which(EVTYPE=="SEICHE"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#SLEET
my_temporary_file<-df1[which(EVTYPE=="SLEET"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#STORM SURGE/TIDE
my_temporary_file<-df1[which(EVTYPE=="STORM SURGE/TIDE"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#STRONG WIND
my_temporary_file<-df1[which(EVTYPE=="STRONG WIND"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#THUNDERSTORM WIND
my_temporary_file<-df1[which(EVTYPE=="THUNDERSTORM WIND"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#TORNADO
my_temporary_file<-df1[which(EVTYPE=="TORNADO"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#TROPICAL DEPRESSION
my_temporary_file<-df1[which(EVTYPE=="TROPICAL DEPRESSION"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#TROPICAL STORM
my_temporary_file<-df1[which(EVTYPE=="TROPICAL STORM"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#TSUNAMI
my_temporary_file<-df1[which(EVTYPE=="TSUNAMI"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#VOLCANIC ASH
my_temporary_file<-df1[which(EVTYPE=="VOLCANIC ASH"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#WATERSPOUT
my_temporary_file<-df1[which(EVTYPE=="WATERSPOUT"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#WILDFIRE
my_temporary_file<-df1[which(EVTYPE=="WILDFIRE"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#WINTER STORM
my_temporary_file<-df1[which(EVTYPE=="WINTER STORM"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)
#WINTER WEATHER
my_temporary_file<-df1[which(EVTYPE=="WINTER WEATHER"), ]
my_tidy_data<-rbind(my_tidy_data,my_temporary_file)

### Method 2 (preferred): keyword matching on df1$EVTYPE1, the new column with case and punctuation cleaned up
# Events are selected if their label contains a keyword or partial word taken from the official classification (for example, a label containing "astro" most likely refers to ASTRONOMICAL LOW TIDE).
df2 = dplyr::filter(df1, grepl('astro|aval|blizz|flood|wind|debris|fog|smoke|drought|dust|heat|cold|frost|cloud|hail|rain|snow|surf|typhoon|hurricane|storm|lightning|rip current|seiche|sleet|tornado|depression|tsunami|volcan|waterspout|fire|winter', EVTYPE1))
summary(unique(df2$EVTYPE1)) # 629 event labels contain at least one keyword from the official classification
##    Length     Class      Mode 
##       629 character character
unique(df2$EVTYPE1)[1:100] # Looks OK: the most important events were not dropped :)
##   [1] "tornado"                        "tstm wind"                     
##   [3] "hail"                           "freezing rain"                 
##   [5] "snow"                           "ice storm flash flood"         
##   [7] "snow ice"                       "winter storm"                  
##   [9] "hurricane opal high winds"      "thunderstorm winds"            
##  [11] "record cold"                    "hurricane erin"                
##  [13] "hurricane opal"                 "heavy rain"                    
##  [15] "lightning"                      "thunderstorm wind"             
##  [17] "dense fog"                      "rip current"                   
##  [19] "thunderstorm wins"              "flash flood"                   
##  [21] "flash flooding"                 "high winds"                    
##  [23] "funnel cloud"                   "tornado f0"                    
##  [25] "thunderstorm winds lightning"   "thunderstorm winds hail"       
##  [27] "heat"                           "wind"                          
##  [29] "heavy rains"                    "lightning and heavy rain"      
##  [31] "wall cloud"                     "flooding"                      
##  [33] "flood"                          "cold"                          
##  [35] "heavy rain lightning"           "flash flooding thunderstorm wi"
##  [37] "wall cloud funnel cloud"        "thunderstorm"                  
##  [39] "waterspout"                     "extreme cold"                  
##  [41] "hail 1 75 "                     "lightning heavy rain"          
##  [43] "high wind"                      "blizzard"                      
##  [45] "blizzard weather"               "wind chill"                    
##  [47] "breakup flooding"               "high wind blizzard"            
##  [49] "river flood"                    "heavy snow"                    
##  [51] "coastal flood"                  "high wind and high tides"      
##  [53] "high wind blizzard freezing ra" "high wind and heavy snow"      
##  [55] "record cold and high wind"      "high winds heavy rains"        
##  [57] "high wind  blizzard"            "ice storm"                     
##  [59] "blizzard high wind"             "high wind low wind chill"      
##  [61] "heavy snow high"                "high winds and wind chill"     
##  [63] "heavy snow high winds freezing" "avalanche"                     
##  [65] "wind chill high wind"           "high wind wind chill blizzard" 
##  [67] "high wind wind chill"           "high wind heavy snow"          
##  [69] "flood watch "                   "high wind seas"                
##  [71] "high winds heavy rain"          "record rainfall"               
##  [73] "record snowfall"                "heavy snow wind"               
##  [75] "extreme heat"                   "wind damage"                   
##  [77] "dust storm"                     "sleet"                         
##  [79] "hail storm"                     "funnel clouds"                 
##  [81] "flash floods"                   "dust devil"                    
##  [83] "excessive heat"                 "thunderstorm winds funnel clou"
##  [85] "winter storm high wind"         "winter storm high winds"       
##  [87] "gusty winds"                    "strong winds"                  
##  [89] "flooding heavy rain"            "snow and wind"                 
##  [91] "heavy surf coastal flooding"    "heavy surf"                    
##  [93] "urban flooding"                 "high surf"                     
##  [95] "blowing dust"                   "wild fires"                    
##  [97] "urban small flooding"           "high winds dust storm"         
##  [99] "local flood"                    "winter storms"
# Method 1 (but not Method 2) drops every event whose label differs from the official wording by even one character; with Method 2 nothing important was dropped.

### Method 3 - The Tidiest Data Possible:
#ASTRONOMICAL LOW TIDE
df3 = dplyr::filter(df1, grepl('astro', EVTYPE1))
df3$type = "ASTRONOMICAL LOW TIDE"
#AVALANCHE (the type label is spelled "AVALANCE" in this run)
my_temporary_file<-dplyr::filter(df1, grepl('avalan', EVTYPE1)); my_temporary_file$type = "AVALANCE"
df3<-rbind(df3,my_temporary_file)
#BLIZZARD
my_temporary_file<-dplyr::filter(df1, grepl('blizz', EVTYPE1)); my_temporary_file$type = "BLIZZARD"
df3<-rbind(df3,my_temporary_file)
#COASTAL FLOOD
my_temporary_file<-dplyr::filter(df1, grepl('coast', EVTYPE1)); my_temporary_file$type = "COASTAL FLOOD"
df3<-rbind(df3,my_temporary_file)
#COLD/WIND CHILL
my_temporary_file<-dplyr::filter(df1, grepl('chill', EVTYPE1)); my_temporary_file$type = "COLD/WIND CHILL"
df3<-rbind(df3,my_temporary_file)
#DEBRIS FLOW
my_temporary_file<-dplyr::filter(df1, grepl('debris', EVTYPE1)); my_temporary_file$type = "DEBRIS FLOW"
df3<-rbind(df3,my_temporary_file)
#DENSE FOG
my_temporary_file<-dplyr::filter(df1, grepl('dense fog', EVTYPE1)); my_temporary_file$type = "DENSE FOG"
df3<-rbind(df3,my_temporary_file)
#DENSE SMOKE
my_temporary_file<-dplyr::filter(df1, grepl('smoke', EVTYPE1)); my_temporary_file$type = "DENSE SMOKE"
df3<-rbind(df3,my_temporary_file)
#DROUGHT
my_temporary_file<-dplyr::filter(df1, grepl('drought', EVTYPE1)); my_temporary_file$type = "DROUGHT"
df3<-rbind(df3,my_temporary_file)
#DUST DEVIL
my_temporary_file<-dplyr::filter(df1, grepl('devil', EVTYPE1)); my_temporary_file$type = "DUST DEVIL"
df3<-rbind(df3,my_temporary_file)
#DUST STORM
my_temporary_file<-dplyr::filter(df1, grepl('dust', EVTYPE1)&!grepl('devil', EVTYPE1)); my_temporary_file$type = "DUST STORM"
df3<-rbind(df3,my_temporary_file)
#EXCESSIVE HEAT
my_temporary_file<-dplyr::filter(df1, grepl('excessive', EVTYPE1)); my_temporary_file$type = "EXCESSIVE HEAT"
df3<-rbind(df3,my_temporary_file)
#EXTREME COLD/WIND CHILL
my_temporary_file<-dplyr::filter(df1, grepl('extreme', EVTYPE1)); my_temporary_file$type = "EXTREME COLD/WIND CHILL"
df3<-rbind(df3,my_temporary_file)
#FLASH FLOOD
my_temporary_file<-dplyr::filter(df1, grepl('flash', EVTYPE1)); my_temporary_file$type = "FLASH FLOOD"
df3<-rbind(df3,my_temporary_file)
#FLOOD
my_temporary_file<-dplyr::filter(df1, grepl('flood', EVTYPE1)&!grepl('flash', EVTYPE1)&!grepl('coast', EVTYPE1)&!grepl('lake', EVTYPE1)); my_temporary_file$type = "FLOOD"
df3<-rbind(df3,my_temporary_file)
#FROST/FREEZE
my_temporary_file<-dplyr::filter(df1, grepl('frost', EVTYPE1)); my_temporary_file$type = "FROST/FREEZE"
df3<-rbind(df3,my_temporary_file)
#FUNNEL CLOUD
my_temporary_file<-dplyr::filter(df1, grepl('cloud', EVTYPE1)); my_temporary_file$type = "FUNNEL CLOUD"
df3<-rbind(df3,my_temporary_file)
#FREEZING FOG
my_temporary_file<-dplyr::filter(df1, grepl('freezing', EVTYPE1)); my_temporary_file$type = "FREEZING FOG"
df3<-rbind(df3,my_temporary_file)
#HAIL
my_temporary_file<-dplyr::filter(df1, grepl('hail', EVTYPE1)&!grepl('marine', EVTYPE1)); my_temporary_file$type = "HAIL"
df3<-rbind(df3,my_temporary_file)
#HEAT
my_temporary_file<-dplyr::filter(df1, grepl('heat', EVTYPE1)&!grepl('excessive', EVTYPE1)); my_temporary_file$type = "HEAT"
df3<-rbind(df3,my_temporary_file)
#HEAVY RAIN
my_temporary_file<-dplyr::filter(df1, grepl('rain', EVTYPE1)); my_temporary_file$type = "HEAVY RAIN"
df3<-rbind(df3,my_temporary_file)
#HEAVY SNOW
my_temporary_file<-dplyr::filter(df1, grepl('snow', EVTYPE1)&!grepl('lake', EVTYPE1)); my_temporary_file$type = "HEAVY SNOW"
df3<-rbind(df3,my_temporary_file)
#HIGH SURF
my_temporary_file<-dplyr::filter(df1, grepl('surf', EVTYPE1)); my_temporary_file$type = "HIGH SURF"
df3<-rbind(df3,my_temporary_file)
#HIGH WIND
my_temporary_file<-dplyr::filter(df1, grepl('high wind', EVTYPE1)); my_temporary_file$type = "HIGH WIND"
df3<-rbind(df3,my_temporary_file)
#HURRICANE (TYPHOON)
my_temporary_file<-dplyr::filter(df1, grepl('hurricane|typhoon', EVTYPE1)); my_temporary_file$type = "HURRICANE/TYPHOON|TYPHOON"
df3<-rbind(df3,my_temporary_file)
#ICE STORM
my_temporary_file<-dplyr::filter(df1, grepl('ice', EVTYPE1)); my_temporary_file$type = "ICE STORM"
df3<-rbind(df3,my_temporary_file)
#LAKE-EFFECT SNOW
my_temporary_file<-dplyr::filter(df1, grepl('lake', EVTYPE1)&grepl('snow', EVTYPE1)); my_temporary_file$type = "LAKE-EFFECT SNOW"
df3<-rbind(df3,my_temporary_file)
#LAKESHORE FLOOD
my_temporary_file<-dplyr::filter(df1, grepl('lake', EVTYPE1)&grepl('flood', EVTYPE1)); my_temporary_file$type = "LAKESHORE FLOOD"
df3<-rbind(df3,my_temporary_file)
#LIGHTNING
my_temporary_file<-dplyr::filter(df1, grepl('lightning', EVTYPE1)); my_temporary_file$type = "LIGHTNING"
df3<-rbind(df3,my_temporary_file)
#MARINE HAIL
my_temporary_file<-dplyr::filter(df1, grepl('marine', EVTYPE1)&grepl('hail', EVTYPE1)); my_temporary_file$type = "MARINE HAIL"
df3<-rbind(df3,my_temporary_file)
#MARINE HIGH WIND
my_temporary_file<-dplyr::filter(df1, grepl('marine', EVTYPE1)&grepl('high', EVTYPE1)&grepl('wind', EVTYPE1)); my_temporary_file$type = "MARINE HIGH WIND"
df3<-rbind(df3,my_temporary_file)
#MARINE STRONG WIND
my_temporary_file<-dplyr::filter(df1, grepl('marine', EVTYPE1)&grepl('strong', EVTYPE1)&grepl('wind', EVTYPE1)); my_temporary_file$type = "MARINE STRONG WIND"
df3<-rbind(df3,my_temporary_file)
#MARINE THUNDERSTORM WIND
my_temporary_file<-dplyr::filter(df1, grepl('marine', EVTYPE1)&grepl('thunderstorm', EVTYPE1)&grepl('wind', EVTYPE1)); my_temporary_file$type = "MARINE THUNDERSTORM WIND"
df3<-rbind(df3,my_temporary_file)
#RIP CURRENT
my_temporary_file<-dplyr::filter(df1, grepl('rip', EVTYPE1)&grepl('current', EVTYPE1)); my_temporary_file$type = "RIP CURRENT"
df3<-rbind(df3,my_temporary_file)
#SEICHE
my_temporary_file<-dplyr::filter(df1, grepl('seiche', EVTYPE1)); my_temporary_file$type = "SEICHE"
df3<-rbind(df3,my_temporary_file)
#SLEET
my_temporary_file<-dplyr::filter(df1, grepl('sleet', EVTYPE1)); my_temporary_file$type = "SLEET"
df3<-rbind(df3,my_temporary_file)
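#STORM SURGE/TIDE
# Note: 'storm suge' is a misspelling (presumably of 'storm surge'), and the cleaned official label reads "storm surge tide", so this pattern matches no rows and the category is absent from the results.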
my_temporary_file<-dplyr::filter(df1, grepl('storm suge|storm tide', EVTYPE1)); my_temporary_file$type = "STORM SURGE/TIDE"
df3<-rbind(df3,my_temporary_file)
#STRONG WIND
my_temporary_file<-dplyr::filter(df1, grepl('strong', EVTYPE1)&grepl('wind', EVTYPE1)); my_temporary_file$type = "STRONG WIND"
df3<-rbind(df3,my_temporary_file)
#THUNDERSTORM WIND
my_temporary_file<-dplyr::filter(df1, grepl('thunderstorm', EVTYPE1)&grepl('wind', EVTYPE1)); my_temporary_file$type = "THUNDERSTORM WIND"
df3<-rbind(df3,my_temporary_file)
#TORNADO
my_temporary_file<-dplyr::filter(df1, grepl('tornado', EVTYPE1)); my_temporary_file$type = "TORNADO"
df3<-rbind(df3,my_temporary_file)
#TROPICAL DEPRESSION
my_temporary_file<-dplyr::filter(df1, grepl('depression', EVTYPE1)); my_temporary_file$type = "TROPICAL DEPRESSION"
df3<-rbind(df3,my_temporary_file)
#TROPICAL STORM
my_temporary_file<-dplyr::filter(df1, grepl('tropical', EVTYPE1)&grepl('storm', EVTYPE1)); my_temporary_file$type = "TROPICAL STORM"
df3<-rbind(df3,my_temporary_file)
#TSUNAMI
my_temporary_file<-dplyr::filter(df1, grepl('tsunami', EVTYPE1)); my_temporary_file$type = "TSUNAMI"
df3<-rbind(df3,my_temporary_file)
#VOLCANIC ASH
my_temporary_file<-dplyr::filter(df1, grepl('volcan', EVTYPE1)); my_temporary_file$type = "VOLCANIC ASH"
df3<-rbind(df3,my_temporary_file)
#WATERSPOUT
my_temporary_file<-dplyr::filter(df1, grepl('waterspout', EVTYPE1)); my_temporary_file$type = "WATERSPOUT"
df3<-rbind(df3,my_temporary_file)
#WILDFIRE
my_temporary_file<-dplyr::filter(df1, grepl('fire', EVTYPE1)); my_temporary_file$type = "WILDFIRE"
df3<-rbind(df3,my_temporary_file)
#WINTER STORM
my_temporary_file<-dplyr::filter(df1, grepl('winter', EVTYPE1)&grepl('storm', EVTYPE1)); my_temporary_file$type = "WINTER STORM"
df3<-rbind(df3,my_temporary_file)
#WINTER WEATHER
my_temporary_file<-dplyr::filter(df1, grepl('weather', EVTYPE1)); my_temporary_file$type = "WINTER WEATHER"
df3<-rbind(df3,my_temporary_file)

dim(df3)
## [1] 676836     23
summary(unique(df3$type))
##    Length     Class      Mode 
##        47 character character
unique(df3$type)
##  [1] "ASTRONOMICAL LOW TIDE"     "AVALANCE"                 
##  [3] "BLIZZARD"                  "COASTAL FLOOD"            
##  [5] "COLD/WIND CHILL"           "DEBRIS FLOW"              
##  [7] "DENSE FOG"                 "DENSE SMOKE"              
##  [9] "DROUGHT"                   "DUST DEVIL"               
## [11] "DUST STORM"                "EXCESSIVE HEAT"           
## [13] "EXTREME COLD/WIND CHILL"   "FLASH FLOOD"              
## [15] "FLOOD"                     "FROST/FREEZE"             
## [17] "FUNNEL CLOUD"              "FREEZING FOG"             
## [19] "HAIL"                      "HEAT"                     
## [21] "HEAVY RAIN"                "HEAVY SNOW"               
## [23] "HIGH SURF"                 "HIGH WIND"                
## [25] "HURRICANE/TYPHOON|TYPHOON" "ICE STORM"                
## [27] "LAKE-EFFECT SNOW"          "LAKESHORE FLOOD"          
## [29] "LIGHTNING"                 "MARINE HAIL"              
## [31] "MARINE HIGH WIND"          "MARINE STRONG WIND"       
## [33] "MARINE THUNDERSTORM WIND"  "RIP CURRENT"              
## [35] "SEICHE"                    "SLEET"                    
## [37] "STRONG WIND"               "THUNDERSTORM WIND"        
## [39] "TORNADO"                   "TROPICAL DEPRESSION"      
## [41] "TROPICAL STORM"            "TSUNAMI"                  
## [43] "VOLCANIC ASH"              "WATERSPOUT"               
## [45] "WILDFIRE"                  "WINTER STORM"             
## [47] "WINTER WEATHER"
summary(unique(df3$EVTYPE1)) # 569 raw event labels were mapped onto 47 of the 48 official categories
##    Length     Class      Mode 
##       569 character character
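
Method 3 repeats the same filter-and-rbind step for every category, and a label that matches several patterns ends up in more than one category. The following is a minimal sketch of a table-driven alternative in which each label receives the first matching type only. The names `rules` and `classify_evtype`, the rule order, and the sample labels are all hypothetical; only a handful of the official categories are covered.

```r
# Hypothetical ordered rule table: more specific patterns come first
rules <- data.frame(
  pattern = c("marine.*hail", "hail", "flash", "lake.*flood",
              "coast", "flood", "tornado"),
  type    = c("MARINE HAIL", "HAIL", "FLASH FLOOD", "LAKESHORE FLOOD",
              "COASTAL FLOOD", "FLOOD", "TORNADO"),
  stringsAsFactors = FALSE
)

# Assign each cleaned label the type of the first rule that matches it;
# labels that match no rule stay NA.
classify_evtype <- function(evtype1, rules) {
  type <- rep(NA_character_, length(evtype1))
  for (i in seq_len(nrow(rules))) {
    hit <- is.na(type) & grepl(rules$pattern[i], evtype1)
    type[hit] <- rules$type[i]
  }
  type
}

# Illustrative cleaned labels in the style of EVTYPE1
labels <- c("tornado f0", "marine hail", "hail 1 75", "flash flooding",
            "coastal flood", "river flood", "lakeshore flood", "dense fog")

data.frame(EVTYPE1 = labels, type = classify_evtype(labels, rules))
```

With this design every row is counted once, and the order of the rule table makes the precedence between overlapping patterns explicit. Whether first-match-wins is the right policy for combined labels such as "thunderstorm winds hail" is a judgment call that the original analysis does not settle.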

Answering the question about the human losses due to the natural disasters

The following code produces the answer:

library(plyr)
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following objects are masked from 'package:plotly':
## 
##     arrange, mutate, rename, summarise
## The following object is masked from 'package:ggpubr':
## 
##     mutate
## The following object is masked from 'package:expss':
## 
##     split_labels
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## The following object is masked from 'package:purrr':
## 
##     compact
# Totals per cleaned label (Method 2)
fatalities_and_injuries <- plyr::ddply(df2, .(EVTYPE1), summarize,
                           fatalities = sum(FATALITIES),
                           injuries = sum(INJURIES))
# Totals per official category (Method 3); this much cleaner result replaces the one above:
fatalities_and_injuries <- plyr::ddply(df3, .(type), summarize,
                           fatalities = sum(FATALITIES),
                           injuries = sum(INJURIES))
saveRDS(fatalities_and_injuries, file = "C:\\Users\\Alex\\Documents\\COURSERA STUDIES\\Reproducible Research JHI\\FINAL_PROJECT\\fatalities_and_injuries.rds")
# Find the top 10 disaster types by fatalities and by injuries:
fatality_disaster_events <- head(fatalities_and_injuries[order(fatalities_and_injuries$fatalities, decreasing = TRUE), ], 10)
injury_disaster_events <- head(fatalities_and_injuries[order(fatalities_and_injuries$injuries, decreasing = TRUE), ], 10)
fatality_disaster_events[, c("type", "fatalities")]
##                       type fatalities
## 39                 TORNADO       5661
## 12          EXCESSIVE HEAT       1924
## 20                    HEAT       1216
## 14             FLASH FLOOD       1035
## 29               LIGHTNING        817
## 34             RIP CURRENT        577
## 15                   FLOOD        484
## 13 EXTREME COLD/WIND CHILL        400
## 24               HIGH WIND        299
## 5          COLD/WIND CHILL        237
injury_disaster_events[, c("type", "injuries")]
##                 type injuries
## 39           TORNADO    91407
## 15             FLOOD     6795
## 12    EXCESSIVE HEAT     6548
## 29         LIGHTNING     5232
## 20              HEAT     2699
## 38 THUNDERSTORM WIND     2439
## 26         ICE STORM     2166
## 14       FLASH FLOOD     1802
## 45          WILDFIRE     1608
## 24         HIGH WIND     1523
summary(ad$date1) # Check the first and last observation dates in the research database
##                  Min.               1st Qu.                Median 
## "1950-01-03 00:00:00" "1995-04-20 00:00:00" "2002-03-18 00:00:00" 
##                  Mean               3rd Qu.                  Max. 
## "1998-12-27 22:53:21" "2007-07-28 00:00:00" "2011-11-30 00:00:00"
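
The loading messages in this section appear because plyr was attached after dplyr, which masks several dplyr functions. As a hedged alternative, the same "sum per type, then take the top rows" pattern can be sketched in base R without plyr. The toy data frame and the helper name top_n_by are illustrative assumptions, not part of the original analysis.

```r
# Toy data standing in for df3 (values are made up for illustration only)
toy <- data.frame(
  type       = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 2, 5, 1, 3),
  stringsAsFactors = FALSE
)

# Sum both columns per event type (rows with NA would be dropped by the formula method)
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ type, data = toy, FUN = sum)

# Hypothetical helper: the n rows with the largest values in column `col`
top_n_by <- function(d, col, n = 10) {
  head(d[order(d[[col]], decreasing = TRUE), , drop = FALSE], n)
}

top_fatal <- top_n_by(totals, "FATALITIES", n = 2)
print(top_fatal)
```

On the real data, the equivalent call would aggregate df3 by type and pass n = 10; the numbers should match the plyr result as long as FATALITIES and INJURIES contain no missing values.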

Answering the question about the economic consequences of the natural disasters

The code below computes the answer:

df3 = data.table(df3)
summary(df3$CROPDMG)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   0.000   1.866   0.000 990.000
# From the official data description:
# Estimates can be obtained from emergency managers, U.S. Geological Survey, U.S. Army Corps
# of Engineers, power utility companies, and newspaper articles. If the values provided are rough
# estimates, then this should be stated as such in the narrative. Estimates should be rounded to
# three significant digits, followed by an alphabetical character signifying the magnitude of the
# number, i.e., 1.55B for $1,550,000,000. Alphabetical characters used to signify magnitude
# include “K” for thousands, “M” for millions, and “B” for billions. If additional precision is
# available, it may be provided in the narrative part of the entry. When damage is due to more
# than one element of the storm, indicate, when possible, the amount of damage caused by each
# element. If the dollar amount of damage is unknown, or not available, check the “no information
# available” box.
summary(unique(df3$PROPDMGEXP))
##    Length     Class      Mode 
##        19 character character
unique(df3$PROPDMGEXP) # List the distinct codes: digits are treated as powers of ten, letters are magnitudes such as M or m for millions of USD, and a few symbols carry no magnitude
##  [1] ""  "M" "K" "0" "5" "4" "7" "?" "1" "B" "+" "2" "m" "8" "H" "-" "3" "6" "h"
df3$prop_dmg = ifelse(df3$PROPDMGEXP == "K", df3$PROPDMG*1000,ifelse(df3$PROPDMGEXP == "M" | df3$PROPDMGEXP == "m", df3$PROPDMG*1000*1000, ifelse(df3$PROPDMGEXP == "B", df3$PROPDMG*1000*1000*1000, ifelse(df3$PROPDMGEXP == "H" | df3$PROPDMGEXP == "h",df3$PROPDMG*100, ifelse(df3$PROPDMGEXP == "+" | df3$PROPDMGEXP == "-" | df3$PROPDMGEXP == "?" | df3$PROPDMGEXP == "", df3$PROPDMG, (df3$PROPDMG)*10^as.numeric(df3$PROPDMGEXP))))))
## Warning in ifelse(df3$PROPDMGEXP == "+" | df3$PROPDMGEXP == "-" | df3$PROPDMGEXP
## == : NAs introduced by coercion
summary(df3$prop_dmg)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 0.000e+00 0.000e+00 5.592e+05 5.000e+02 1.150e+11
# Note: this line scales CROPDMG by PROPDMGEXP (the property-damage magnitude code), not by a crop-specific code, so the crop totals below inherit the property magnitudes.
df3$crop_dmg = ifelse(df3$PROPDMGEXP == "K", df3$CROPDMG*1000,ifelse(df3$PROPDMGEXP == "M" | df3$PROPDMGEXP == "m", df3$CROPDMG*1000*1000, ifelse(df3$PROPDMGEXP == "B", df3$CROPDMG*1000*1000*1000, ifelse(df3$PROPDMGEXP == "H" | df3$PROPDMGEXP == "h",df3$CROPDMG*100, ifelse(df3$PROPDMGEXP == "+" | df3$PROPDMGEXP == "-" | df3$PROPDMGEXP == "?" | df3$PROPDMGEXP == "", df3$CROPDMG, (df3$CROPDMG)*10^as.numeric(df3$PROPDMGEXP))))))
## Warning in ifelse(df3$PROPDMGEXP == "+" | df3$PROPDMGEXP == "-" | df3$PROPDMGEXP
## == : NAs introduced by coercion
summary(df3$crop_dmg)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.00e+00 0.00e+00 0.00e+00 2.63e+06 0.00e+00 5.00e+11
# Across the United States, which types of events have the greatest economic consequences?
economic_losses <- plyr::ddply(df3, .(type), summarize,
                   prop_dmg = sum(prop_dmg),
                   crop_dmg = sum(crop_dmg))

saveRDS(economic_losses, file = "C:\\Users\\Alex\\Documents\\COURSERA STUDIES\\Reproducible Research JHI\\FINAL_PROJECT\\economic_losses.rds")

# Find the top 10 disaster types by property damage and by crop damage:
economic_losses_property_damage_disaster_events <- head(economic_losses[order(economic_losses$prop_dmg, decreasing = TRUE), ], 10)
economic_losses_crop_damage_disaster_events <- head(economic_losses[order(economic_losses$crop_dmg, decreasing = TRUE), ], 10)
economic_losses_property_damage_disaster_events[, c("type", "prop_dmg")]
##                         type     prop_dmg
## 15                     FLOOD 150182217679
## 25 HURRICANE/TYPHOON|TYPHOON  85356410010
## 39                   TORNADO  58603317927
## 19                      HAIL  17622987537
## 14               FLASH FLOOD  17589812096
## 45                  WILDFIRE   8501628500
## 41            TROPICAL STORM   7714390550
## 46              WINTER STORM   6749997251
## 24                 HIGH WIND   6166300053
## 38         THUNDERSTORM WIND   5432306655
economic_losses_crop_damage_disaster_events[, c("type", "crop_dmg")]
##                         type     crop_dmg
## 25 HURRICANE/TYPHOON|TYPHOON 1.550732e+12
## 15                     FLOOD 9.426539e+10
## 14               FLASH FLOOD 4.058135e+10
## 39                   TORNADO 3.076989e+10
## 24                 HIGH WIND 1.923427e+10
## 19                      HAIL 1.782837e+10
## 38         THUNDERSTORM WIND 9.380175e+09
## 45                  WILDFIRE 7.770024e+09
## 9                    DROUGHT 3.443569e+09
## 41            TROPICAL STORM 1.959070e+09
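
The nested ifelse chains above are hard to read and easy to mistype. A minimal sketch of the same conversion as a lookup helper follows. The name exp_multiplier is hypothetical, and two choices go beyond the original code and are assumptions: lower-case letters are folded to upper case, and any unrecognised code falls back to a multiplier of 1 instead of NA. The toy example also assumes that the crop magnitude codes live in the dataset's CROPDMGEXP column; applying the helper to that column on the real data would give crop totals different from those printed above.

```r
# Hypothetical helper: turn a magnitude code into a numeric multiplier.
exp_multiplier <- function(code) {
  code <- toupper(as.character(code))
  letters_map <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

  # Default of 1 covers "", "+", "-", "?" and anything unrecognised (assumption)
  out <- rep(1, length(code))

  # Letter codes: hundreds, thousands, millions, billions
  is_letter <- code %in% names(letters_map)
  out[is_letter] <- unname(letters_map[code[is_letter]])

  # Single digits are treated as powers of ten, as in the code above
  is_digit <- grepl("^[0-9]$", code)
  out[is_digit] <- 10^as.numeric(code[is_digit])

  out
}

# Toy rows (made-up values) showing the helper applied to crop damage
toy <- data.frame(
  CROPDMG    = c(2.5, 10, 1, 3, 7),
  CROPDMGEXP = c("K", "m", "B", "", "2"),
  stringsAsFactors = FALSE
)
toy$crop_dmg <- toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)
print(toy)
```

Because the multiplier is computed once per column, the same helper can serve both the property and the crop columns, which removes the need to copy and edit a long expression.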

Results

The best way to present the results is to plot them:

### Fatalities plot: 
a <- ggplot(fatality_disaster_events, aes(x = reorder(type, desc(fatalities)), y = fatalities, fill = type)) + 
  geom_bar(stat = "identity") + xlab("Type Of Disaster") +ylab("Fatalities Caused") + theme_light()
a + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))  +
theme(legend.title = element_blank()) + theme(legend.position = "none")+
  geom_text(aes(label = fatalities), vjust = -0.4, color = "black", size = 3.5)  + 
  ggtitle("Top 10 Natural Disasters As Fatalities' Sources \n In The US (Jan 1950 - Dec 2011, Total Sum)") 

### Injuries plot:
b <- ggplot(injury_disaster_events, aes(x = reorder(type, desc(injuries)), y = injuries, fill = type)) + 
  geom_bar(stat = "identity") + xlab("Type Of Disaster") +ylab("Injuries Caused") + theme_light()
b + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))  +
theme(legend.title = element_blank()) + theme(legend.position = "none")+
  geom_text(aes(label = injuries), vjust = -0.4, color = "black", size = 3.5) + 
  ggtitle("Top 10 Natural Disasters As Injuries' Sources \n In The US (Jan 1950 - Dec 2011, Total Sum)") 

### Property Damage Plot: 
a <- ggplot(economic_losses_property_damage_disaster_events, 
aes(x = reorder(type, desc(prop_dmg)), y = prop_dmg, fill = type)) + 
  geom_bar(stat = "identity") + xlab("Type Of Disaster") +ylab("Total Property Damages Caused, USD") + theme_light()
a + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))  +
theme(legend.title = element_blank()) + theme(legend.position = "none")+
  geom_text(aes(label = round(prop_dmg,0)), vjust = -0.4, color = "black", size = 2.5)  + 
  ggtitle("Top 10 Natural Disasters As Property Damages' Sources \n In The US (Jan 1950 - Dec 2011, Total Sum In USD)") 

### Crop Damage Plot:
b <- ggplot(economic_losses_crop_damage_disaster_events, aes(x = reorder(type, desc(crop_dmg)), y = crop_dmg, fill = type)) + 
  geom_bar(stat = "identity") + xlab("Type Of Disaster") +ylab("Total Crop Damages Caused, USD") + theme_light()
b + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))  +
theme(legend.title = element_blank()) + theme(legend.position = "none")+
  geom_text(aes(label = round(crop_dmg,0)), vjust = -0.4, color = "black", size = 2.5) + 
  ggtitle("Top 10 Natural Disasters As Crop Damages' Sources \n In The US (Jan 1950 - Dec 2011, Total Sum In USD)") 

A plot says more than many words. The charts show the ten event types that caused the greatest human and economic losses. Tornadoes are by far the leading cause of both fatalities (5,661) and injuries (91,407). Heat is also a major cause of casualties: excessive heat and heat rank second and third for fatalities. Floods and hurricanes cause the largest economic losses, both for property and for crops. These results are consistent with the source data.
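
The bar labels on the two damage plots are raw dollar sums of up to twelve digits, which are hard to read at a small font size. A small sketch of a label formatter is given below. The name format_usd and the one-decimal rounding are illustrative assumptions, not part of the original report.

```r
# Hypothetical helper: shorten dollar amounts to one decimal with a K/M/B suffix
format_usd <- function(x) {
  ifelse(abs(x) >= 1e9, sprintf("%.1fB", x / 1e9),
  ifelse(abs(x) >= 1e6, sprintf("%.1fM", x / 1e6),
  ifelse(abs(x) >= 1e3, sprintf("%.1fK", x / 1e3),
         sprintf("%.0f", x))))
}

labels <- format_usd(c(150182217679, 8501628500, 2500000, 950))
print(labels)
```

In the plots, such a helper could replace round(prop_dmg,0) inside geom_text, for example aes(label = format_usd(prop_dmg)).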