Data from the US National Oceanic and Atmospheric Administration’s storm database, which describes all major weather events from 1950 to 2011, were analysed to ascertain the types of events that have the greatest health-related and economic impacts. Using the total numbers of reported fatalities and injuries as a measure of negative consequences to human health, tornadoes were found to cause the largest number of both fatalities and injuries by a wide margin (5633 fatalities and 91346 injuries over the 61-year period, compared to the next most harmful type of event, excessive heat, which caused 1903 fatalities and 6525 injuries). Floods had the greatest economic consequences, causing a total of about 150,300 million dollars’ worth of damage to property and crops over the recording period, roughly twice the cost incurred by the next most expensive type of event (hurricanes/typhoons, which caused about 71,900 million dollars’ worth of damage). Tornadoes ranked third, at about 57,400 million dollars.
The data were downloaded from the internet as a .bz2 file, unzipped and read into R.
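# Download the compressed data once; mode = "wb" keeps the binary file intact on Windows
if (!file.exists("stormData.csv.bz2")) {
  download.file(url = "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                destfile = "stormData.csv.bz2", mode = "wb")
}
# bzfile() opens a connection that decompresses the file as read.csv() reads it
data <- read.csv(bzfile("stormData.csv.bz2"))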
The numbers of fatalities and injuries were then added by event type, both separately and together. (Using the mean instead of the sum was considered during exploratory analysis, but this was found to place too much importance on one-off events.) The resulting data frame was sorted to show the most harmful types of event.
library(dplyr, quietly = TRUE)
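# Sum fatalities and injuries by event type, separately and combined
health <- summarize(group_by(data, EVTYPE),
                    sum(FATALITIES), sum(INJURIES), sum(FATALITIES) + sum(INJURIES))
names(health) <- c("eventtype", "fatalities", "injuries", "total")
# Sort so that the most harmful event types come first
health <- health[order(health$total, decreasing = TRUE), ]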
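The effect of choosing the sum rather than the mean can be illustrated with a small example. The sketch below uses made-up numbers (not taken from the storm database) and base R's aggregate() in place of dplyr so that it runs on its own; the data frame toy and the event labels are assumptions for illustration only.

```r
# Toy data (hypothetical numbers, not from the storm database):
# one rare event type with a single catastrophic record, and one
# common event type with several moderately harmful records.
toy <- data.frame(
  EVTYPE     = c("RARE EVENT", rep("COMMON EVENT", 4)),
  FATALITIES = c(50, 10, 20, 15, 25)
)

# Total and mean fatalities per event type
totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)
means  <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = mean)

# Event type ranked first under each measure
top_by_sum  <- as.character(totals$EVTYPE[which.max(totals$FATALITIES)])
top_by_mean <- as.character(means$EVTYPE[which.max(means$FATALITIES)])

print(totals)
print(means)
```

Under the sum, the common event type ranks first because its many records add up; under the mean, the single one-off record dominates.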
A similar analysis was performed using the figures for property and crop damage to estimate the economic impacts of various types of event. These figures are each recorded as a number and a separate magnitude code (for example K for thousands, M for millions or B for billions), which had to be combined into a single number in millions of dollars before summation.
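# Convert property damage to millions of dollars according to the magnitude code.
# Codes not handled below (including "M"/"m" and blank) keep the recorded number,
# i.e. they are treated as already being in millions.
data <- mutate(data, PROPDMG2 = ifelse(PROPDMGEXP %in% c("B", "b"), PROPDMG * 1000, PROPDMG))    # billions -> millions
data <- mutate(data, PROPDMG2 = ifelse(PROPDMGEXP %in% c("K", "k"), PROPDMG / 1000, PROPDMG2))   # thousands -> millions
data <- mutate(data, PROPDMG2 = ifelse(PROPDMGEXP %in% c("H", "h"), PROPDMG / 10000, PROPDMG2))  # hundreds -> millions
data <- mutate(data, PROPDMG2 = ifelse(PROPDMGEXP %in% c("0", "1", "2", "3", "4", "5", "6", "7", "8"),
                                       PROPDMG / 100000, PROPDMG2))                              # tens -> millions
data <- mutate(data, PROPDMG2 = ifelse(PROPDMGEXP == "+", PROPDMG / 1000000, PROPDMG2))          # ones -> millions
# Same conversion for crop damage
data <- mutate(data, CROPDMG2 = ifelse(CROPDMGEXP %in% c("B", "b"), CROPDMG * 1000, CROPDMG))    # billions -> millions
data <- mutate(data, CROPDMG2 = ifelse(CROPDMGEXP %in% c("K", "k"), CROPDMG / 1000, CROPDMG2))   # thousands -> millions
data <- mutate(data, CROPDMG2 = ifelse(CROPDMGEXP %in% c("H", "h"), CROPDMG / 10000, CROPDMG2))  # hundreds -> millions
data <- mutate(data, CROPDMG2 = ifelse(CROPDMGEXP %in% c("0", "1", "2", "3", "4", "5", "6", "7", "8"),
                                       CROPDMG / 100000, CROPDMG2))                              # tens -> millions
data <- mutate(data, CROPDMG2 = ifelse(CROPDMGEXP == "+", CROPDMG / 1000000, CROPDMG2))          # ones -> millions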
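# Sum property and crop damage (in millions of dollars) by event type, separately and combined
econ <- summarize(group_by(data, EVTYPE),
                  sum(PROPDMG2), sum(CROPDMG2), sum(PROPDMG2) + sum(CROPDMG2))
names(econ) <- c("eventtype", "propertydamage", "cropdamage", "total")
# Sort so that the most costly event types come first
econ <- econ[order(econ$total, decreasing = TRUE), ]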
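The conversion of magnitude codes into millions of dollars could also be expressed as a single lookup, which may be easier to check than a chain of ifelse() calls. The following is a minimal sketch in base R, not the method used in this report: the names to_millions and convert_to_millions are hypothetical, and the multipliers simply mirror the conversions applied above, including the choice to leave unrecognised or blank codes unchanged.

```r
# Multipliers that convert a recorded figure to millions of dollars
# (hypothetical lookup mirroring the conversions used in the report).
to_millions <- c(B = 1e3, M = 1, K = 1e-3, H = 1e-4, "+" = 1e-6)

# Hypothetical helper: amount is the recorded number, code its magnitude code.
convert_to_millions <- function(amount, code) {
  code <- toupper(as.character(code))
  mult <- unname(to_millions[code])          # NA for codes not in the lookup
  mult[code %in% as.character(0:8)] <- 1e-5  # digit codes: tens of dollars
  mult[is.na(mult)] <- 1                     # any other code: number left unchanged
  amount * mult
}

# Example with made-up amounts
amount <- c(2.5, 25, 500, 3, 7)
code   <- c("B", "K", "m", "", "5")
converted <- convert_to_millions(amount, code)
print(converted)
```

With a helper of this kind, each damage column could be converted in one call, for example convert_to_millions(data$PROPDMG, data$PROPDMGEXP).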
The ten types of event with the greatest health-related and economic consequences are shown below. Damage figures are in millions of dollars.
## [1] "Most harmful to human health:"
##            eventtype fatalities injuries total
## 1            TORNADO       5633    91346 96979
## 2     EXCESSIVE HEAT       1903     6525  8428
## 3          TSTM WIND        504     6957  7461
## 4              FLOOD        470     6789  7259
## 5          LIGHTNING        816     5230  6046
## 6               HEAT        937     2100  3037
## 7        FLASH FLOOD        978     1777  2755
## 8          ICE STORM         89     1975  2064
## 9  THUNDERSTORM WIND        133     1488  1621
## 10      WINTER STORM        206     1321  1527
## [1] "Greatest economic impact:"
##            eventtype propertydamage cropdamage      total
## 1              FLOOD     144664.710  5661.9685 150326.678
## 2  HURRICANE/TYPHOON      69305.840  2607.8728  71913.713
## 3            TORNADO      56940.163   414.9547  57355.118
## 4        STORM SURGE      43323.536     0.0050  43323.541
## 5               HAIL      15789.270  3028.9547  18818.225
## 6        FLASH FLOOD      16347.815  1421.3171  17769.132
## 7            DROUGHT       1046.106 13972.5660  15018.672
## 8          HURRICANE      11868.319  2741.9100  14610.229
## 9        RIVER FLOOD       5118.945  5029.4590  10148.405
## 10         ICE STORM       3944.928  5022.1135   8967.042
From these tables, it would appear that tornadoes have by far the largest impact on human health, while floods cause the greatest economic damage, followed by hurricanes/typhoons and tornadoes. Flash floods, lightning and thunderstorm winds also feature prominently. However, as is clear from both tables, some broad types of event have been recorded under multiple categories (for example TSTM WIND and THUNDERSTORM WIND, or HURRICANE/TYPHOON and HURRICANE), reducing their apparent impact. Re-analysis taking this into account may be beneficial. Barplots of the top fifteen events in each category are shown below.
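# Top fifteen event types by combined fatalities and injuries
barplot(health$total[1:15],
        names.arg = tolower(health$eventtype[1:15]),
        cex.names = 0.4,
        ylab = "Fatalities and injuries",
        main = "Bar plot of total fatalities and injuries")
# Top fifteen event types by combined property and crop damage
barplot(econ$total[1:15],
        names.arg = tolower(econ$eventtype[1:15]),
        cex.names = 0.4,
        ylab = "Damage (millions of dollars)",
        main = "Bar plot of total property and crop damage")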
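The re-analysis suggested above could begin by mapping variant labels onto a single category before summing. The sketch below is one possible approach, not part of the original analysis: the helper consolidate_evtype and the two patterns it uses are assumptions, and a full re-analysis would need a more complete set of rules. The example totals are the TSTM WIND, THUNDERSTORM WIND and LIGHTNING rows of the health table.

```r
# Hypothetical helper: map variant event labels onto one category.
# Only two illustrative rules are included here.
consolidate_evtype <- function(evtype) {
  ev <- toupper(trimws(as.character(evtype)))
  ev[grepl("^(TSTM|THUNDERSTORM) WIND", ev)] <- "THUNDERSTORM WIND"
  ev[grepl("^HURRICANE", ev)] <- "HURRICANE/TYPHOON"
  ev
}

# Combined fatalities and injuries for three rows of the health table
demo <- data.frame(
  eventtype = c("TSTM WIND", "THUNDERSTORM WIND", "LIGHTNING"),
  total     = c(7461, 1621, 6046)
)

# Relabel, then re-sum and re-sort by the consolidated category
demo$eventtype <- consolidate_evtype(demo$eventtype)
merged <- aggregate(total ~ eventtype, data = demo, FUN = sum)
merged <- merged[order(merged$total, decreasing = TRUE), ]
print(merged)
```

Once the two thunderstorm wind labels are merged, their combined total exceeds that of excessive heat in the health table, which shows how much the split categories can affect the ranking.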