This is a cursory analysis of the historical data on damages from environmental hazards. We explore data made available by NOAA to get a general idea of the level of threat posed by different kinds of events. The impact is evaluated in terms of recorded fatalities and an indicator of economic damage. Wind-related events, tornadoes in particular, are the most hazardous in terms of both loss of life and economic damage. Water-related events, specifically floods, are also very damaging to both life and property, while heat harms health but causes comparatively little economic damage.
NOAA has made the data publicly available, and the dataset can be downloaded directly from the website. It will be stored in your working directory. The file is relatively large and contains information that is not needed for an initial understanding of how natural disasters rank in terms of loss of life and economic damage.
We therefore select only the columns describing event type, number of fatalities, date of occurrence, damage to crops and damage to property.
The date is stored in the original file as a factor variable, and we convert it to POSIXlt in case we want to look at time-series data later.
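To illustrate the date conversion on its own, here is a minimal sketch on two made-up date strings written in the same month/day/year layout as BGN_DATE. The object names `raw_dates` and `parsed` are placeholders, and the `tz = "UTC"` argument is an assumption added only to keep the toy example independent of the local time zone.

```r
# Two made-up dates in the "month/day/year hour:minute:second" layout.
raw_dates <- factor(c("4/18/1950 0:00:00", "6/8/1951 0:00:00"))

# Convert the factor labels to text, then parse them into POSIXlt.
parsed <- strptime(as.character(raw_dates), "%m/%d/%Y %H:%M:%S", tz = "UTC")

class(parsed)               # includes "POSIXlt"
format(parsed, "%Y-%m-%d")  # dates rendered as year-month-day
parsed$year + 1900          # POSIXlt stores the year as an offset from 1900
```

Having the year available as a component is what makes a later time-series breakdown convenient.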
At this exploratory stage of the analysis, we use the sum of crop and property damage as an indicator of the economic impact of an event.
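library(dplyr) # provides select()
# The download is a bzip2-compressed CSV; read.csv() decompresses it on the fly.
# mode = "wb" keeps the binary file intact on Windows.
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", "NOAA.csv", mode = "wb")
full_data<-read.csv("NOAA.csv")
# Keep only the columns needed for this analysis.
data<-select(full_data,BGN_DATE,PROPDMG,CROPDMG, EVTYPE,FATALITIES)
data$BGN_DATE<-strptime(as.character(data$BGN_DATE), "%m/%d/%Y %H:%M:%S")
# Lower-case the event types so that differently capitalised labels are pooled.
data$EVTYPE<-tolower(data$EVTYPE)
# Indicator of economic impact: property damage plus crop damage.
data$Econ<-data$PROPDMG+data$CROPDMG
events<-unique(data$EVTYPE)
str(data)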
## 'data.frame': 902297 obs. of 6 variables:
## $ BGN_DATE : POSIXlt, format: "1950-04-18" "1950-04-18" ...
## $ PROPDMG : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
## $ CROPDMG : num 0 0 0 0 0 0 0 0 0 0 ...
## $ EVTYPE : chr "tornado" "tornado" "tornado" "tornado" ...
## $ FATALITIES: num 0 0 0 0 0 0 0 0 1 0 ...
## $ Econ : num 25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
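The Econ indicator above adds the raw PROPDMG and CROPDMG figures. The NOAA file also carries PROPDMGEXP and CROPDMGEXP columns, which we understand to encode the magnitude of each figure (for example "K" for thousands, "M" for millions, "B" for billions). These columns are not used in this analysis. As a sketch of how they could be taken into account in a later refinement, the following toy example scales made-up values; the helper `to_dollars` and the exact code-to-multiplier mapping are assumptions, not part of the analysis above.

```r
# Toy rows using the NOAA column names; the values are made up.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2),
  PROPDMGEXP = c("K", "M", "B"),
  CROPDMG    = c(0, 10, 0),
  CROPDMGEXP = c("", "K", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: assumed mapping from exponent code to multiplier.
# Codes that are empty or not in the mapping are treated as a multiplier of 1.
to_dollars <- function(value, code) {
  mult <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)[toupper(code)]
  mult[is.na(mult)] <- 1
  value * unname(mult)
}

toy$EconUSD <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP) +
  to_dollars(toy$CROPDMG, toy$CROPDMGEXP)
toy$EconUSD
```

Under this assumed mapping a single event recorded in billions outweighs many events recorded in thousands, so the ranking of event types could differ from the one based on the raw sum.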
We can now calculate the total number of fatalities for each event type since records began. The 20 deadliest event types are shown in increasing order of magnitude. The deadliest events are tornadoes.
LifeLoss<-with(data, tapply(FATALITIES,as.factor(EVTYPE),sum, na.rm = TRUE))
tail(sort(LifeLoss),20)
##                blizzard             strong wind               high surf
##                     101                     103                     104
## extreme cold/wind chill              heavy snow       thunderstorm wind
##                     125                     127                     133
##            extreme cold               heat wave            rip currents
##                     162                     172                     204
##            winter storm               avalanche               high wind
##                     206                     224                     248
##             rip current                   flood               tstm wind
##                     368                     470                     504
##               lightning                    heat             flash flood
##                     816                     937                     978
##          excessive heat                 tornado
##                    1903                    5633
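To make the ranking step easier to follow, here is the same tapply, sort and tail pattern on a small made-up data frame. The object names `toy`, `totals` and `ranked` are placeholders and the numbers are not taken from the NOAA data.

```r
# Made-up records: one row per event, with an NA to show the effect of na.rm.
toy <- data.frame(
  EVTYPE     = c("tornado", "flood", "tornado", "heat", "flood", "hail"),
  FATALITIES = c(3, 1, 2, 4, NA, 0)
)

# Sum fatalities within each event type, ignoring missing values.
totals <- with(toy, tapply(FATALITIES, as.factor(EVTYPE), sum, na.rm = TRUE))

# sort() orders the totals from smallest to largest,
# so tail() returns the largest ones, with the maximum last.
ranked <- sort(totals)
tail(ranked, 2)
```

Because the totals are sorted in increasing order, the deadliest event type always appears at the end of the printed output.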
Because events are classified into so many categories, it might be better to ask which of the four elements, “Wind”, “Fire” (Heat), “Water” or “Earth”, is the most dangerous in terms of potential loss of life.
We see that wind-related disasters kill more people than either water or heat.
# One regular expression per element, so that an event type matching several
# keywords of the same element is counted only once within that element.
# An event type such as "tstm wind/hail" still counts towards both Wind and Water.
Water<-sum(LifeLoss[grepl("snow|water|hail|rain|ice|flood|tsunami|avalanche|current|surf", names(LifeLoss))])
Wind<-sum(LifeLoss[grepl("wind|tornado|blizzard|hurricane", names(LifeLoss))])
Heat<-sum(LifeLoss[grepl("heat|tropical|fire", names(LifeLoss))])
Earth<-sum(LifeLoss[grepl("landslide|dust", names(LifeLoss))])
elem<-cbind(Water,Wind,Heat,Earth)
barplot(elem, main = "Total fatalities since records began", col = c("dark red"))
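The keyword groups above can place one event type under more than one element; "tstm wind/hail", for instance, contains both a wind and a water keyword. As a sketch of an alternative, the following assigns each event type to exactly one element, with the first matching group taking priority. The helper `classify`, the priority order and the "Other" label are assumptions made for illustration, and the fatality counts are made up.

```r
# Keyword groups from the analysis, listed in an assumed order of priority.
patterns <- c(
  Wind  = "wind|tornado|blizzard|hurricane",
  Water = "snow|water|hail|rain|ice|flood|tsunami|avalanche|current|surf",
  Heat  = "heat|tropical|fire",
  Earth = "landslide|dust"
)

# Hypothetical helper: label each event type with a single element.
classify <- function(evtype) {
  element <- rep("Other", length(evtype))
  # Walk the groups in reverse so that earlier groups overwrite later ones.
  for (name in rev(names(patterns))) {
    element[grepl(patterns[[name]], evtype)] <- name
  }
  element
}

toy_events <- c("tstm wind/hail", "flash flood", "excessive heat", "dust storm", "fog")
toy_deaths <- c(5, 10, 20, 1, 2)

# Each event type now contributes to one element only.
by_element <- tapply(toy_deaths, classify(toy_events), sum)
by_element
```

With this approach the element totals add up to the overall total, at the cost of having to choose which element wins for mixed event types.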
Similarly, we can ask which events or elements are most costly in economic terms.
EconLoss<-with(data, tapply(Econ,as.factor(EVTYPE),sum, na.rm = TRUE))
tail(sort(EconLoss),20)
##     flash flooding            drought   wild/forest fire
##           33623.20           37997.67           43534.49
##     tropical storm         high winds         heavy rain
##           54322.80           57384.60           61964.94
##        strong wind          ice storm           wildfire
##           64628.71           67689.62           88823.54
##         heavy snow       winter storm          high wind
##          124417.71          134699.58          342014.77
## thunderstorm winds          lightning  thunderstorm wind
##          464978.11          606932.39          943635.62
##              flood               hail          tstm wind
##         1067976.36         1268289.66         1445198.21
##        flash flood            tornado
##         1599325.05         3312276.68
And we can similarly depict economic damage by each of the elements.
# One regular expression per element, so that an event type matching several
# keywords of the same element is counted only once within that element.
# An event type such as "tstm wind/hail" still counts towards both Wind and Water.
Water<-sum(EconLoss[grepl("snow|water|hail|rain|ice|flood|tsunami|avalanche|current|surf", names(EconLoss))])
Wind<-sum(EconLoss[grepl("wind|tornado|blizzard|hurricane", names(EconLoss))])
Heat<-sum(EconLoss[grepl("heat|tropical|fire", names(EconLoss))])
Earth<-sum(EconLoss[grepl("landslide|dust", names(EconLoss))])
elem<-cbind(Water,Wind,Heat,Earth)
barplot(elem, main = "Total economic damage since records began", col = c("green"))
It is clear that wind-related events are the most economically costly, while heat causes little damage overall.
However, we can also consider crop and property damage separately and look at the 20 most damaging event types in each category.
CropLoss<-with(data, tapply(CROPDMG,as.factor(EVTYPE),sum, na.rm = TRUE))
PropLoss<-with(data, tapply(PROPDMG,as.factor(EVTYPE),sum, na.rm = TRUE))
tail(sort(CropLoss),20)
##          lightning   wild/forest fire     tstm wind/hail
##            3580.61            4189.54            4356.65
##           wildfire  hurricane/typhoon     flash flooding
##            4364.20            4798.48            5126.05
##          hurricane     tropical storm       extreme cold
##            5339.31            5899.12            6141.14
##       frost/freeze         heavy rain          high wind
##            7134.14           11122.80           17283.21
## thunderstorm winds            drought  thunderstorm wind
##           18684.93           33898.62           66791.45
##            tornado          tstm wind              flood
##          100018.52          109202.60          168037.88
##        flash flood               hail
##          179200.46          579596.28
tail(sort(PropLoss),20)
## urban/sml stream fld       flash flooding     wild/forest fire
##             26051.94             28497.15             39344.95
##       tropical storm           heavy rain           high winds
##             48423.68             50842.14             55625.00
##          strong wind            ice storm             wildfire
##             63011.81             66000.67             84459.34
##           heavy snow         winter storm            high wind
##            122251.99            132720.59            324731.56
##   thunderstorm winds            lightning                 hail
##            446293.18            603351.78            688693.38
##    thunderstorm wind                flood            tstm wind
##            876844.17            899938.48           1335995.61
##          flash flood              tornado
##           1420124.59           3212258.16
We can see that while tornadoes have caused the most damage to property, hail seems to have caused the most damage to crops.