Floods, storms, hurricanes, earthquakes, and tsunamis are natural disasters that result from natural processes of the Earth. Severe weather events can have serious health and economic consequences, including loss of life, injuries, and property damage. Communities and municipalities therefore need to anticipate and estimate the damage, prioritize resources according to the type of event, and establish action plans that help build disaster-resilient communities. The US National Oceanic and Atmospheric Administration (NOAA) collected information on storms from 1950 to 2011 in a database that tracks characteristics such as when and where they occur, along with records of the resulting damage and fatalities; this database is the primary source of information for this project. This report aims to answer which types of events have the most harmful consequences for population health and for property and crops.
The data for this assignment comes in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. It can be downloaded from the course web site: Storm Data [47Mb].
The events in the database start in 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.
First, the bzip2 file was downloaded from the course URL.
#download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2","storm.bz2")
Once the file has been downloaded correctly, we read the CSV file into a data frame.
data_storm<-read.csv(bzfile("storm.bz2"),sep=",")
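Because the download call above is commented out, a guarded download may be convenient when re-running the report. The following is a minimal sketch, not the author's method: `fetch_if_missing` is a hypothetical helper name, and the two calls at the end are left commented so that the block runs without network access.

```r
# Hypothetical helper (not part of the original report): download the archive
# only when it is not already present on disk.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the bzip2 archive intact on platforms that
    # distinguish text and binary transfers
    download.file(url, dest, mode = "wb")
  }
  invisible(dest)
}

storm_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"

# Uncomment to fetch and read the data:
# fetch_if_missing(storm_url, "storm.bz2")
# data_storm <- read.csv(bzfile("storm.bz2"))
```

With this guard, repeated runs reuse the local copy instead of downloading the archive again.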
To make the dataset smaller and more manageable, we keep only the variables relevant to the questions being asked:
storm_event<-data_storm[,c("EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]
Next, EVTYPE, PROPDMGEXP, and CROPDMGEXP are trimmed and converted to uppercase, and PROPDMG and CROPDMG are converted to numeric values.
library(stringr) # provides str_trim()
storm_event$EVTYPE<-str_trim(toupper(storm_event$EVTYPE))
storm_event$PROPDMGEXP<-str_trim(toupper(storm_event$PROPDMGEXP))
storm_event$CROPDMGEXP<-str_trim(toupper(storm_event$CROPDMGEXP))
storm_event$PROPDMG<-as.numeric(storm_event$PROPDMG)
storm_event$CROPDMG<-as.numeric(storm_event$CROPDMG)
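To illustrate why this normalization matters, here is a small sketch on an invented toy data frame (the values are made up and are not taken from the storm database). It uses base R's `trimws()` as a stand-in for `str_trim()`, so it runs without extra packages.

```r
# Toy stand-in for storm_event; all values are invented for illustration.
toy <- data.frame(
  EVTYPE     = c(" Tornado", "TORNADO ", "tstm wind"),
  PROPDMGEXP = c("k", "K", "m"),
  PROPDMG    = c("25", "2.5", "1"),
  stringsAsFactors = FALSE
)

# Same cleaning steps as above, with base trimws() instead of str_trim()
toy$EVTYPE     <- trimws(toupper(toy$EVTYPE))
toy$PROPDMGEXP <- trimws(toupper(toy$PROPDMGEXP))
toy$PROPDMG    <- as.numeric(toy$PROPDMG)

# " Tornado" and "TORNADO " now share a single label
evtype_counts <- table(toy$EVTYPE)

# Distinct exponent codes that remain after cleaning
exp_codes <- sort(unique(toy$PROPDMGEXP))
```

Without trimming and uppercasing, the two tornado rows would be aggregated as separate event types.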
And the new dataset looks like this:
summary(storm_event)
## EVTYPE FATALITIES INJURIES PROPDMG
## Length:356162 Min. : 0.000 Min. : 0.0000 Min. : 0.00
## Class :character 1st Qu.: 0.000 1st Qu.: 0.0000 1st Qu.: 0.00
## Mode :character Median : 0.000 Median : 0.0000 Median : 0.00
## Mean : 0.024 Mean : 0.2848 Mean : 13.05
## 3rd Qu.: 0.000 3rd Qu.: 0.0000 3rd Qu.: 0.00
## Max. :583.000 Max. :1700.0000 Max. :970.00
## PROPDMGEXP CROPDMG CROPDMGEXP
## Length:356162 Min. : 0.000 Length:356162
## Class :character 1st Qu.: 0.000 Class :character
## Mode :character Median : 0.000 Mode :character
## Mean : 1.198
## 3rd Qu.: 0.000
## Max. :978.000
First, extract the records with nonzero injury or fatality counts.
injuries_events<-subset(storm_event,storm_event$INJURIES!=0)
fatalities_events<-subset(storm_event,storm_event$FATALITIES!=0)
Then we need 2 summary datasets: injuries and fatalities
health_injuries<-aggregate(injuries_events$INJURIES,list(injuries_events$EVTYPE),sum)
health_fatalities<-aggregate(fatalities_events$FATALITIES,list(fatalities_events$EVTYPE),sum)
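As an aside, `aggregate()` also has a formula interface that keeps the original column names instead of the generic `Group.1` and `x`. The sketch below uses an invented toy data frame, not the storm data, and is an alternative form rather than what the report itself ran.

```r
# Toy data; values are invented for illustration only.
toy <- data.frame(
  EVTYPE   = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  INJURIES = c(3, 1, 2, 0),
  stringsAsFactors = FALSE
)

# Sum injuries per event type, keeping only rows with nonzero injuries
toy_injuries <- aggregate(INJURIES ~ EVTYPE,
                          data = subset(toy, INJURIES != 0),
                          FUN = sum)
toy_injuries
```

The result has columns named `EVTYPE` and `INJURIES`, which can make later ordering and plotting code easier to read.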
Now order the results and keep the top 10 events for each outcome.
health_injuries<-health_injuries[order(-health_injuries$x),][1:10,]
health_injuries
## Group.1 x
## 98 TORNADO 74570
## 20 FLOOD 6466
## 104 TSTM WIND 4988
## 65 LIGHTNING 2095
## 61 ICE STORM 1852
## 13 EXCESSIVE HEAT 1667
## 91 THUNDERSTORM WINDS 908
## 34 HEAT 878
## 33 HAIL 781
## 3 BLIZZARD 777
health_fatalities<-health_fatalities[order(-health_fatalities$x),][1:10,]
health_fatalities
## Group.1 x
## 112 TORNADO 4359
## 46 HEAT 706
## 19 EXCESSIVE HEAT 555
## 116 TSTM WIND 364
## 25 FLASH FLOOD 334
## 78 LIGHTNING 322
## 30 FLOOD 193
## 47 HEAT WAVE 172
## 22 EXTREME COLD 129
## 23 EXTREME HEAT 96
And now we plot the two results.
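The plotting code is not shown in this report. A minimal sketch with base graphics, assuming the top-10 values printed above are entered by hand as named vectors, might look like this. The layout, titles, and margins are illustrative choices, not the original figure.

```r
# Top-10 totals copied from the printed tables above
injuries <- c(TORNADO = 74570, FLOOD = 6466, "TSTM WIND" = 4988,
              LIGHTNING = 2095, "ICE STORM" = 1852, "EXCESSIVE HEAT" = 1667,
              "THUNDERSTORM WINDS" = 908, HEAT = 878, HAIL = 781,
              BLIZZARD = 777)
fatalities <- c(TORNADO = 4359, HEAT = 706, "EXCESSIVE HEAT" = 555,
                "TSTM WIND" = 364, "FLASH FLOOD" = 334, LIGHTNING = 322,
                FLOOD = 193, "HEAT WAVE" = 172, "EXTREME COLD" = 129,
                "EXTREME HEAT" = 96)

# Two panels side by side; a large bottom margin leaves room for rotated labels
op <- par(mfrow = c(1, 2), mar = c(9, 4, 3, 1))
barplot(injuries, las = 2, cex.names = 0.7,
        main = "Injuries by event type", ylab = "Total injuries")
barplot(fatalities, las = 2, cex.names = 0.7,
        main = "Fatalities by event type", ylab = "Total fatalities")
par(op)  # restore the previous graphics settings
```

In the report itself, the same calls could take `health_injuries$x` with `names.arg = health_injuries$Group.1` instead of hand-entered vectors.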
First, subset the data to records with nonzero damage values.
crop_event<-subset(storm_event,storm_event$CROPDMG!=0)
prop_event<-subset(storm_event,storm_event$PROPDMG!=0)
For this question we need to quantify the damage caused by each type of event recorded. Because the magnitude of each amount is stored in a separate column as a letter code (H=hundred, K=thousand, M=million, B=billion), computing the monetary amount requires an extra transformation.
First, create a function that applies the corresponding multiplier for each letter.
factor<-function(val,amnt){
if(val=="B") return(amnt*10^9)
else if(val=="M") return(amnt*10^6)
else if(val=="K") return(amnt*10^3)
else if(val=="H") return(amnt*100)
else return(amnt)
}
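The same mapping can be sketched as a vectorized lookup, which avoids calling the function once per row through `mapply`. This is an alternative form under the same H/K/M/B assumptions, not the code the report ran; `exp_multiplier` and `to_amount` are hypothetical names. A distinct name also avoids masking base R's `factor()` for the rest of the session.

```r
# Hypothetical lookup table for the letter codes described above
exp_multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical vectorized counterpart of the function above
to_amount <- function(exp_code, amount) {
  # Named-vector lookup; codes that are not in the table yield NA
  mult <- unname(exp_multiplier[as.character(exp_code)])
  # Same fallback as the final else branch: leave the amount unchanged
  mult[is.na(mult)] <- 1
  amount * mult
}

to_amount(c("K", "M", "", "B"), c(25, 2.5, 10, 1))
```

Any code outside H, K, M, and B, including an empty string, is treated as a multiplier of 1, which mirrors the behavior of the function above.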
Next, compute the monetary amount for each record using the function.
crop_event$CROPAMNT<-mapply(factor,crop_event$CROPDMGEXP,crop_event$CROPDMG)
prop_event$PROPAMNT<-mapply(factor,prop_event$PROPDMGEXP,prop_event$PROPDMG)
Finally we calculate the sum per type of event
econ_crop<-aggregate(crop_event$CROPAMNT,list(crop_event$EVTYPE),sum)
econ_prop<-aggregate(prop_event$PROPAMNT,list(prop_event$EVTYPE),sum)
Take the top 10 for both kinds of economic consequences, ordered by amount.
econ_crop<-econ_crop[order(-econ_crop$x),][1:10,]
econ_crop
## Group.1 x
## 66 RIVER FLOOD 5029459000
## 62 ICE STORM 5013448500
## 8 DROUGHT 3533141000
## 20 FLOOD 1378429050
## 55 HURRICANE 1252055000
## 14 EXTREME COLD 1140078000
## 33 HAIL 1117642273
## 16 FLASH FLOOD 501789100
## 26 FREEZE 403375000
## 42 HEAT 401285000
econ_prop<-econ_prop[order(-econ_prop$x),][1:10,]
econ_prop
## Group.1 x
## 276 TORNADO 35576611369
## 49 FLOOD 9710265457
## 332 WINTER STORM 5279478601
## 196 RIVER FLOOD 5118945500
## 146 HURRICANE 5057822000
## 36 FLASH FLOOD 3525223957
## 81 HAIL 3361966033
## 152 HURRICANE OPAL 3172846000
## 102 HEAVY RAIN/SEVERE WEATHER 2500000000
## 288 TSTM WIND 2093300205
And now we can plot the results
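The plotting code for this step is not shown either. One possible sketch draws horizontal bars with amounts rescaled to billions, using the property-damage values printed above entered by hand. The unit label assumes the amounts are in US dollars, which this report does not state explicitly.

```r
# Top-10 property damage totals copied from the printed table above
prop_damage <- c(TORNADO = 35576611369, FLOOD = 9710265457,
                 "WINTER STORM" = 5279478601, "RIVER FLOOD" = 5118945500,
                 HURRICANE = 5057822000, "FLASH FLOOD" = 3525223957,
                 HAIL = 3361966033, "HURRICANE OPAL" = 3172846000,
                 "HEAVY RAIN/SEVERE WEATHER" = 2500000000,
                 "TSTM WIND" = 2093300205)

# Rescale to billions so the axis labels stay readable
prop_billions <- prop_damage / 1e9

# Wide left margin for long event names; rev() puts the largest bar on top
op <- par(mar = c(4, 12, 3, 1))
barplot(rev(prop_billions), horiz = TRUE, las = 1, cex.names = 0.7,
        main = "Property damage by event type",
        xlab = "Damage (billions, assumed USD)")
par(op)  # restore the previous graphics settings
```

The crop-damage totals in `econ_crop` could be plotted the same way.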
To draw a conclusion for each of the questions (health and economic), we need a combined summary for both.
Calculate the combined total of injuries and fatalities per event type
health_general<-aggregate(storm_event$INJURIES+storm_event$FATALITIES,list(storm_event$EVTYPE),sum)
health_general<-health_general[order(-health_general$x),][1:10,]
health_general
## Group.1 x
## 686 TORNADO 78929
## 127 FLOOD 6659
## 706 TSTM WIND 5352
## 374 LIGHTNING 2417
## 95 EXCESSIVE HEAT 2222
## 348 ICE STORM 1907
## 211 HEAT 1584
## 111 FLASH FLOOD 1037
## 639 THUNDERSTORM WINDS 972
## 15 BLIZZARD 859
Calculate the combined amount of damage to property and crops
econ_subset<-subset(storm_event[,c("EVTYPE","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")],storm_event$CROPDMG!=0 | storm_event$PROPDMG!=0)
econ_subset$CROPAMNT<-mapply(factor,econ_subset$CROPDMGEXP,econ_subset$CROPDMG)
econ_subset$PROPAMNT<-mapply(factor,econ_subset$PROPDMGEXP,econ_subset$PROPDMG)
econ_subset$AMOUNT<-econ_subset$CROPAMNT+econ_subset$PROPAMNT
head(econ_subset)
## EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP CROPAMNT PROPAMNT AMOUNT
## 1 TORNADO 25.0 K 0 0 25000 25000
## 2 TORNADO 2.5 K 0 0 2500 2500
## 3 TORNADO 25.0 K 0 0 25000 25000
## 4 TORNADO 2.5 K 0 0 2500 2500
## 5 TORNADO 2.5 K 0 0 2500 2500
## 6 TORNADO 2.5 K 0 0 2500 2500
econ_general<-aggregate(econ_subset$AMOUNT,list(econ_subset$EVTYPE),sum)
econ_general<-econ_general[order(-econ_general$x),][1:10,]
econ_general
## Group.1 x
## 294 TORNADO 35748425129
## 55 FLOOD 11088694507
## 209 RIVER FLOOD 10148404500
## 158 HURRICANE 6309877000
## 173 ICE STORM 6014393540
## 355 WINTER STORM 5305769601
## 88 HAIL 4479608306
## 42 FLASH FLOOD 4027013057
## 26 DROUGHT 3732546000
## 164 HURRICANE OPAL 3191846000
In conclusion, tornadoes are the most harmful type of event, with the most severe consequences for both population health and the economy.