Synopsis

This is a cursory analysis of historical data on damage from environmental hazards. We explore data made available by NOAA to get a general idea of the level of threat posed by different kinds of events. The impact is evaluated in terms of recorded fatalities and in terms of an indicator of economic damage. Wind-related events, tornadoes in particular, are the most hazardous in terms of both loss of life and economic damage. Water-related events, specifically floods, are also very damaging to both life and property, while heat costs many lives but causes comparatively little economic damage.

Data processing

NOAA has made the data publicly available, and the dataset can be downloaded directly from the website. The download is stored in the working directory. The file is relatively large and contains much information that is not needed for an initial ranking of natural disasters by loss of life and economic damage.
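Because the file is large, it can be worth skipping the download when a local copy already exists. The following is a minimal sketch of that idea; `fetch_once` is a hypothetical helper that is not part of the original analysis, and the URL in the demonstration is a placeholder that is never contacted.

```r
# Hypothetical helper (not part of the original analysis): download `url` to
# `dest` only when `dest` is not already present, and return the path.
fetch_once <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed file intact on Windows
    download.file(url, dest, mode = "wb")
  }
  dest
}

# Demonstration with a file that already exists, so no network access is needed
existing <- tempfile(fileext = ".csv")
writeLines("EVTYPE,FATALITIES", existing)
path <- fetch_once("https://example.invalid/not-used.csv", existing)
identical(path, existing)
```

For the real dataset, the same helper could be called with the NOAA URL and a file name in the working directory.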

Therefore we select only the columns describing event type, number of fatalities, date of occurrence, damage to crops and damage to property.

The date is stored in the original file as a factor variable, and we convert it to POSIXlt in case we want to look at time-series data later.
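The conversion can be tried on a couple of made-up date strings first. This sketch assumes the dates follow the month/day/year layout with a time component that the format string in this report expects. The time zone is fixed to UTC here only to make the example reproducible.

```r
# Made-up sample in the layout assumed for BGN_DATE
raw_dates <- c("4/18/1950 0:00:00", "6/8/1951 0:00:00")

# strptime() returns a POSIXlt object
parsed <- strptime(raw_dates, "%m/%d/%Y %H:%M:%S", tz = "UTC")
class(parsed)
format(parsed, "%Y-%m-%d")

# as.POSIXct() gives a more compact representation of the same instants
compact <- as.POSIXct(parsed)
class(compact)
```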

At this exploratory stage of the analysis, we use the sum of crop and property damage as an indicator of the economic impact of an event.
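The plain sum treats every record as if it were on the same scale. The raw file also carries magnitude columns, PROPDMGEXP and CROPDMGEXP, which this report does not use. A later refinement might apply them roughly as sketched below. The data frame is made up, `multiplier` is a hypothetical helper, and the mapping (K for thousands, M for millions, B for billions, anything else treated as 1) is an assumption that should be checked against the NOAA documentation.

```r
# Made-up rows in the shape of the raw damage columns
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1),
  PROPDMGEXP = c("K", "M", "B"),
  CROPDMG    = c(0, 10, 0),
  CROPDMGEXP = c("", "K", ""),
  stringsAsFactors = FALSE
)

# Assumed mapping: K = 1e3, M = 1e6, B = 1e9; any other code is treated as 1
multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  m <- lookup[toupper(code)]
  m[is.na(m)] <- 1
  unname(m)
}

# Damage in dollars under the assumed mapping
toy$EconUSD <- toy$PROPDMG * multiplier(toy$PROPDMGEXP) +
  toy$CROPDMG * multiplier(toy$CROPDMGEXP)
toy$EconUSD
```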

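library(dplyr)  # provides select()

# The download is a bzip2-compressed CSV; read.csv() decompresses it transparently.
# mode = "wb" keeps the binary file intact on Windows.
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", "NOAA.csv", mode = "wb")
full_data<-read.csv("NOAA.csv")

# Keep only the columns needed for this analysis
data<-select(full_data,BGN_DATE,PROPDMG,CROPDMG, EVTYPE,FATALITIES)
# Parse the date strings into POSIXlt
data$BGN_DATE<-strptime(as.character(data$BGN_DATE), "%m/%d/%Y %H:%M:%S")
# Lower-case the event types so that labels differing only in case are merged
data$EVTYPE<-tolower(data$EVTYPE)
# Indicator of economic impact: property damage plus crop damage
data$Econ<-data$PROPDMG+data$CROPDMG
events<-unique(data$EVTYPE)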
str(data)
## 'data.frame':    902297 obs. of  6 variables:
##  $ BGN_DATE  : POSIXlt, format: "1950-04-18" "1950-04-18" ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ EVTYPE    : chr  "tornado" "tornado" "tornado" "tornado" ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ Econ      : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...

Results

Loss of life

We can now calculate the total fatalities for each event type described, since records began. The 20 deadliest are shown in increasing order of magnitude. The deadliest events are tornadoes.

LifeLoss<-with(data, tapply(FATALITIES,as.factor(EVTYPE),sum, na.rm = TRUE))
tail(sort(LifeLoss),20)
##                blizzard             strong wind               high surf 
##                     101                     103                     104 
## extreme cold/wind chill              heavy snow       thunderstorm wind 
##                     125                     127                     133 
##            extreme cold               heat wave            rip currents 
##                     162                     172                     204 
##            winter storm               avalanche               high wind 
##                     206                     224                     248 
##             rip current                   flood               tstm wind 
##                     368                     470                     504 
##               lightning                    heat             flash flood 
##                     816                     937                     978 
##          excessive heat                 tornado 
##                    1903                    5633
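Several labels in this table appear to describe the same phenomenon, for example "tstm wind" and "thunderstorm wind", or "rip current" and "rip currents". A minimal sketch of how such labels might be merged before summing is given below. It uses four totals copied from the table, and `normalise_evtype` with its two rules is a hypothetical illustration rather than a complete clean-up.

```r
# Fatality totals copied from the table above
LifeLossToy <- c("tstm wind" = 504, "thunderstorm wind" = 133,
                 "rip current" = 368, "rip currents" = 204)

# Hypothetical clean-up rules, for illustration only
normalise_evtype <- function(x) {
  x <- sub("^tstm", "thunderstorm", x)   # expand the abbreviation
  x <- sub("currents$", "current", x)    # fold the plural into the singular
  x
}

# Sum the totals again under the normalised labels
merged <- tapply(LifeLossToy, normalise_evtype(names(LifeLossToy)), sum)
merged
```

Under these two rules, thunderstorm wind would account for 637 fatalities and rip currents for 572, which moves both up the ranking.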

Because events are classified into so many categories, it might be better to ask which of the four elements, “Wind”, “Fire” (Heat), “Water” or “Earth”, is the most dangerous in terms of potential loss of life.

We see that wind-related disasters kill more people than either water or heat.

# Each element is matched with a single alternation pattern, so an event type
# containing several keywords of one element (e.g. both "snow" and "ice") is
# counted once. An event type can still fall under more than one element.
Water<-sum(LifeLoss[grep("snow|water|hail|rain|ice|flood|tsunami|avalanche|current|surf",
                  events, value = TRUE)])

Wind<-sum(LifeLoss[grep("wind|tornado|blizzard|hurricane", events, value = TRUE)])

Heat<-sum(LifeLoss[grep("heat|tropical|fire", events, value = TRUE)])

Earth<-sum(LifeLoss[grep("landslide|dust", events, value = TRUE)])

elem<-cbind(Water,Wind,Heat,Earth)

barplot(elem, main = "Total fatalities since records began", col = c("dark red"))
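When grouping by keyword, an event type that contains more than one keyword of the same element should contribute to that element only once. The sketch below uses made-up totals to contrast two ways of collecting the matches: concatenating separate `grep()` results, and using one alternation pattern. The names `toy_loss` and `toy_events` are illustrative only.

```r
# Made-up fatality totals; "snow and ice" contains two Water keywords
toy_loss <- c("heavy snow" = 10, "ice storm" = 5, "snow and ice" = 2, "tornado" = 50)
toy_events <- names(toy_loss)

# Concatenating separate grep() results lists "snow and ice" twice
separate <- c(grep("snow", toy_events), grep("ice", toy_events))
double_counted <- sum(toy_loss[toy_events[separate]])   # 10 + 2 + 5 + 2

# A single alternation pattern matches each event type at most once
combined <- grep("snow|ice", toy_events, value = TRUE)
counted_once <- sum(toy_loss[combined])                 # 10 + 5 + 2

c(double_counted = double_counted, counted_once = counted_once)
```

The two totals differ by exactly the value of the event type that matches both keywords.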

Economic costs

Similarly, we can ask which events or elements are most costly in economic terms.

EconLoss<-with(data, tapply(Econ,as.factor(EVTYPE),sum, na.rm = TRUE))
tail(sort(EconLoss),20)
##     flash flooding            drought   wild/forest fire 
##           33623.20           37997.67           43534.49 
##     tropical storm         high winds         heavy rain 
##           54322.80           57384.60           61964.94 
##        strong wind          ice storm           wildfire 
##           64628.71           67689.62           88823.54 
##         heavy snow       winter storm          high wind 
##          124417.71          134699.58          342014.77 
## thunderstorm winds          lightning  thunderstorm wind 
##          464978.11          606932.39          943635.62 
##              flood               hail          tstm wind 
##         1067976.36         1268289.66         1445198.21 
##        flash flood            tornado 
##         1599325.05         3312276.68

We can likewise depict the economic damage attributed to each of the elements.

# Each element is matched with a single alternation pattern, so an event type
# containing several keywords of one element (e.g. both "snow" and "ice") is
# counted once. An event type can still fall under more than one element.
Water<-sum(EconLoss[grep("snow|water|hail|rain|ice|flood|tsunami|avalanche|current|surf",
                  events, value = TRUE)])

Wind<-sum(EconLoss[grep("wind|tornado|blizzard|hurricane", events, value = TRUE)])

Heat<-sum(EconLoss[grep("heat|tropical|fire", events, value = TRUE)])

Earth<-sum(EconLoss[grep("landslide|dust", events, value = TRUE)])

elem<-cbind(Water,Wind,Heat,Earth)

barplot(elem, main = "Total economic damage since records began", col = c("green"))

Wind-related events are clearly the most economically costly, while heat causes little damage overall.

However, we can also consider crop and property damage separately and look at the 20 most damaging event types in each case.

CropLoss<-with(data, tapply(CROPDMG,as.factor(EVTYPE),sum, na.rm = TRUE))
PropLoss<-with(data, tapply(PROPDMG,as.factor(EVTYPE),sum, na.rm = TRUE))
tail(sort(CropLoss),20)
##          lightning   wild/forest fire     tstm wind/hail 
##            3580.61            4189.54            4356.65 
##           wildfire  hurricane/typhoon     flash flooding 
##            4364.20            4798.48            5126.05 
##          hurricane     tropical storm       extreme cold 
##            5339.31            5899.12            6141.14 
##       frost/freeze         heavy rain          high wind 
##            7134.14           11122.80           17283.21 
## thunderstorm winds            drought  thunderstorm wind 
##           18684.93           33898.62           66791.45 
##            tornado          tstm wind              flood 
##          100018.52          109202.60          168037.88 
##        flash flood               hail 
##          179200.46          579596.28
tail(sort(PropLoss),20)
## urban/sml stream fld       flash flooding     wild/forest fire 
##             26051.94             28497.15             39344.95 
##       tropical storm           heavy rain           high winds 
##             48423.68             50842.14             55625.00 
##          strong wind            ice storm             wildfire 
##             63011.81             66000.67             84459.34 
##           heavy snow         winter storm            high wind 
##            122251.99            132720.59            324731.56 
##   thunderstorm winds            lightning                 hail 
##            446293.18            603351.78            688693.38 
##    thunderstorm wind                flood            tstm wind 
##            876844.17            899938.48           1335995.61 
##          flash flood              tornado 
##           1420124.59           3212258.16

We can see that while tornadoes have caused the most damage to property, hail has caused the most damage to crops.