NOAA records weather events in the US and their impact on human health and the economy, and has made the data from 1950 to 2011 available for analysis. This report analyses that data with the objective of understanding the cost of these weather events to human health and the economy.
The data contains a record of approximately 900,000 events over six decades. The data is explored by year of event and event type to identify high-impact event types. Impact is measured through fatality and injury counts to assess the human health cost, and through property and crop damage to assess the economic cost.
The analysis shows that twelve event types are responsible for more than 80% of the fatalities. Three groups of event types (tornado, excessive heat/heat and flash flood/flood) together account for close to 10,000 of the roughly 15,000 deaths during the analysis period.
On the economic front, the top 9 event types account for over 90% of the damage to property and crops. Again, tornado and flash flood are shown to be the primary causes of damage. In addition, hail and wind also cause significant damage.
A further analysis of the geographical, localised aspect of certain weather events is recommended; it is not included in this project due to time constraints.
The data is available as a CSV file, which is loaded using the read.csv function.
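The variables in this dataset that are of interest in the current analysis are BGN_DATE (event start date), EVTYPE (event type), FATALITIES, INJURIES, PROPDMG (property damage) and CROPDMG (crop damage).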
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.1.3
##
## Attaching package: 'dplyr'
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(lattice)
## Warning: package 'lattice' was built under R version 3.1.3
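# Load the raw NOAA storm data (comma-separated, with a header row)
StormData <- read.csv("repdata_data_StormData.csv", sep=",", header=TRUE)
# Parse the event start date and extract the year as a number
StormData$BGN_DATE <- as.Date(as.character(StormData$BGN_DATE), "%m/%d/%Y")
StormData$BGN_YEAR <- as.numeric(format(StormData$BGN_DATE, "%Y"))
# Combined property and crop damage as recorded (PROPDMGEXP/CROPDMGEXP magnitude codes are not applied)
StormData$DAMAGE <- StormData[,"PROPDMG"] + StormData[,"CROPDMG"]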
print(paste("Total events ", nrow(StormData), sep=":"))
## [1] "Total events :902297"
print(paste("Fatalities ", sum(StormData$FATALITIES), sep=":"))
## [1] "Fatalities :15145"
print(paste("Injuries ", sum(StormData$INJURIES), sep=":"))
## [1] "Injuries :140528"
print(paste("Property Damage", sum(StormData$PROPDMG), sep=":"))
## [1] "Property Damage:10884500.01"
print(paste("Crop Damage ", sum(StormData$CROPDMG), sep=":"))
## [1] "Crop Damage :1377827.32"
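The DAMAGE column above adds PROPDMG and CROPDMG as recorded. In the NOAA file each of these has a companion magnitude code (PROPDMGEXP, CROPDMGEXP), where K, M and B conventionally stand for thousands, millions and billions. A minimal sketch of scaling the amounts before summing is given here. It assumes that only those three codes are interpreted and that anything else counts as a multiplier of 1; the helper name `exp_multiplier` and the toy rows are illustrative and not part of the original analysis.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumption: only K, M and B (any case) are interpreted; every other code,
# including blanks, is treated as a multiplier of 1.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy rows standing in for StormData
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  CROPDMG    = c(0, 5, 0, 3),
  CROPDMGEXP = c("", "k", "", "?"),
  stringsAsFactors = FALSE
)

# Scale each amount by its own code, then add property and crop damage
toy$DAMAGE_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP) +
                  toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)
print(toy$DAMAGE_USD)
```

Applied to the full data set, a scaled column of this kind would likely change the ranking of event types, since a single event recorded in billions outweighs many events recorded in thousands.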
The data is summarised by year and event type for exploratory analysis of human cost.
StormDataSummaryByYear <- aggregate(StormData[, c("FATALITIES","INJURIES")]
, list(StormData[,"BGN_YEAR"], StormData[,"EVTYPE"] )
, sum
, na.rm=TRUE
)
colnames(StormDataSummaryByYear) <- c("BGN_YEAR", "EVTYPE", "FATALITIES", "INJURIES")
StormDataSummaryHealthByType <- aggregate(StormDataSummaryByYear[, c("FATALITIES","INJURIES")]
, list(StormDataSummaryByYear[,"EVTYPE"])
, sum
, na.rm=TRUE
)
colnames(StormDataSummaryHealthByType) <- c("EVTYPE", "FATALITIES", "INJURIES")
StormDataSummaryHealthByType <- StormDataSummaryHealthByType[order(StormDataSummaryHealthByType$FATALITIES, decreasing=TRUE),]
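The per-type totals above are built in two steps (by year and type, then by type). As an alternative, the same per-type totals can be obtained in a single call with the formula interface of aggregate(). The sketch below runs on a small toy data frame that is illustrative only.

```r
# Toy rows standing in for StormData
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HEAT", "HEAT", "FLOOD"),
  BGN_YEAR   = c(1950, 1951, 1995, 1995, 2001),
  FATALITIES = c(3, 2, 10, 1, 4),
  INJURIES   = c(20, 5, 0, 7, 1)
)

# Sum both columns per event type in one step
by_type <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by fatalities, highest first
by_type <- by_type[order(by_type$FATALITIES, decreasing = TRUE), ]
print(by_type)
```

One difference to keep in mind: the formula interface drops rows with a missing value in any of the listed columns by default, which is not quite the same as passing na.rm=TRUE to sum.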
The top 16 event types by fatality are
print(StormDataSummaryHealthByType[1:16, c("EVTYPE", "FATALITIES","INJURIES") ])
## EVTYPE FATALITIES INJURIES
## 834 TORNADO 5633 91346
## 130 EXCESSIVE HEAT 1903 6525
## 153 FLASH FLOOD 978 1777
## 275 HEAT 937 2100
## 464 LIGHTNING 816 5230
## 856 TSTM WIND 504 6957
## 170 FLOOD 470 6789
## 585 RIP CURRENT 368 232
## 359 HIGH WIND 248 1137
## 19 AVALANCHE 224 170
## 972 WINTER STORM 206 1321
## 586 RIP CURRENTS 204 297
## 278 HEAT WAVE 172 309
## 140 EXTREME COLD 160 231
## 760 THUNDERSTORM WIND 133 1488
## 310 HEAVY SNOW 127 1021
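The table lists several labels that appear to describe the same phenomenon (TSTM WIND and THUNDERSTORM WIND, RIP CURRENT and RIP CURRENTS, HEAT, EXCESSIVE HEAT and HEAT WAVE). A minimal sketch of folding such variants together before ranking follows. The mapping rules and the helper name `normalise_evtype` are assumptions made for illustration, not an official NOAA classification, and the wind rule is deliberately coarse (it would also absorb combined labels that merely start with a wind term).

```r
# Hypothetical helper: collapse a few obvious label variants into one label.
normalise_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- sub("^TSTM WIND.*|^THUNDERSTORM WIND.*", "THUNDERSTORM WIND", x)
  x <- sub("^RIP CURRENTS?$", "RIP CURRENT", x)
  x <- sub("^(EXCESSIVE HEAT|HEAT WAVE)$", "HEAT", x)
  x
}

# Fatality counts copied from the table above for the affected labels
counts <- data.frame(
  EVTYPE = c("TORNADO", "EXCESSIVE HEAT", "HEAT", "TSTM WIND",
             "RIP CURRENT", "RIP CURRENTS", "HEAT WAVE", "THUNDERSTORM WIND"),
  FATALITIES = c(5633, 1903, 937, 504, 368, 204, 172, 133),
  stringsAsFactors = FALSE
)

# Re-total fatalities under the normalised labels
counts$GROUP <- normalise_evtype(counts$EVTYPE)
merged <- aggregate(FATALITIES ~ GROUP, data = counts, FUN = sum)
merged <- merged[order(merged$FATALITIES, decreasing = TRUE), ]
print(merged)
```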
This analysis is focussed on the top 9 event types.
EVTYPE <- as.data.frame(StormDataSummaryHealthByType[1:9,"EVTYPE"])
colnames(EVTYPE) <- c("EVTYPE")
EvtypeStormDataHealth <- merge(EVTYPE, StormData, by="EVTYPE")
EvtypeStormDataSummaryByYear <- aggregate(EvtypeStormDataHealth[, c("FATALITIES","INJURIES")]
, list(EvtypeStormDataHealth[,"EVTYPE"], EvtypeStormDataHealth[,"BGN_YEAR"])
, sum
, na.rm=TRUE
)
colnames(EvtypeStormDataSummaryByYear) <- c("EVTYPE", "BGN_YEAR", "FATALITIES", "INJURIES")
The top 9 event types account for 11857 out of the total 15145 fatalities.
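The share quoted here can be checked directly from the fatality counts in the table. A minimal sketch follows, using the sixteen printed counts and the overall total of 15145 reported earlier; the variable names are illustrative.

```r
# Fatality counts for the 16 event types printed above, in ranked order
fatalities <- c(5633, 1903, 978, 937, 816, 504, 470, 368,
                248, 224, 206, 204, 172, 160, 133, 127)
total_fatalities <- 15145  # overall total reported earlier

# Cumulative share of all fatalities covered by the top n event types
cum_share <- cumsum(fatalities) / total_fatalities

top9_count <- sum(fatalities[1:9])
top9_share <- cum_share[9]
print(top9_count)

# Percentage covered by the top 9, 12 and 16 event types
print(round(100 * cum_share[c(9, 12, 16)], 1))
```

On these figures the top 9 event types cover a little under four fifths of all recorded fatalities.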
PlotH <- xyplot(FATALITIES + INJURIES ~ BGN_YEAR | EVTYPE, data = EvtypeStormDataSummaryByYear, col=c("red","blue"), type="l" ,
ylab=list("Casualty count", cex=1.15), xlab= list("Year", cex=1.15), main=list("Casualty due to weather", cex=2),
layout=c(3,3),
key=simpleKey(c("Fatalities (in red)", "Injuries (in blue)"), columns=2, points=FALSE, col=c("red","blue")),
scales=list(
y=list(
log=TRUE,
limits=c(1,3000),
at=c(1,10,30,100,300,1000,2000),
labels=c(1,10,30,100,300,1000,2000)
) )
)
print(PlotH)
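Because the y-axis of this plot is logarithmic, year and event type combinations with a zero count have no finite position on it. One possible way to handle this, sketched here on toy data, is to turn zeros into NA before plotting so that they are treated as missing. This is an assumption about the desired display, not something the original plot does.

```r
# Toy yearly counts for a single event type
toy <- data.frame(BGN_YEAR = 2000:2004, FATALITIES = c(12, 0, 7, 0, 30))

# On a log scale the zero counts are not finite
print(log10(toy$FATALITIES))

# Replace zeros with NA so they are treated as missing when plotted
toy$FATALITIES_PLOT <- ifelse(toy$FATALITIES > 0, toy$FATALITIES, NA)
print(toy)
```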
The data is summarised by year and event type for exploratory analysis of economic cost.
StormDataDamageSummaryByYear <- aggregate(StormData[, c("DAMAGE")]
, list(StormData[,"BGN_YEAR"], StormData[,"EVTYPE"] )
, sum
, na.rm=TRUE
)
colnames(StormDataDamageSummaryByYear) <- c("BGN_YEAR", "EVTYPE", "DAMAGE")
StormDataDamageSummaryByType <- aggregate(StormDataDamageSummaryByYear[, c("DAMAGE")]
, list(StormDataDamageSummaryByYear[,"EVTYPE"])
, sum
, na.rm=TRUE
)
colnames(StormDataDamageSummaryByType) <- c("EVTYPE", "DAMAGE")
StormDataDamageSummaryByType <- StormDataDamageSummaryByType[order(StormDataDamageSummaryByType$DAMAGE, decreasing=TRUE),]
print(StormDataDamageSummaryByType[1:15,])
## EVTYPE DAMAGE
## 834 TORNADO 3312276.68
## 153 FLASH FLOOD 1599325.05
## 856 TSTM WIND 1445168.21
## 244 HAIL 1268289.66
## 170 FLOOD 1067976.36
## 760 THUNDERSTORM WIND 943635.62
## 464 LIGHTNING 606932.39
## 786 THUNDERSTORM WINDS 464978.11
## 359 HIGH WIND 342014.77
## 972 WINTER STORM 134699.58
## 310 HEAVY SNOW 124417.71
## 957 WILDFIRE 88823.54
## 427 ICE STORM 67689.62
## 676 STRONG WIND 64610.71
## 290 HEAVY RAIN 61964.94
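The top 9 event types account for 11050596.85 of the total 12262327.33, about 90%. These figures sum PROPDMG and CROPDMG as recorded and are not scaled to US$ by the PROPDMGEXP/CROPDMGEXP magnitude codes.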
EVTYPE <- as.data.frame(StormDataDamageSummaryByType[1:9,"EVTYPE"])
colnames(EVTYPE) <- c("EVTYPE")
EvtypeStormDataDamage <- merge(EVTYPE, StormData, by="EVTYPE")
EvtypeStormDataDamageSummaryByYear <- aggregate(EvtypeStormDataDamage[, c("DAMAGE")]
, list(EvtypeStormDataDamage[,"BGN_YEAR"], EvtypeStormDataDamage[,"EVTYPE"] )
, sum
, na.rm=TRUE
)
colnames(EvtypeStormDataDamageSummaryByYear) <- c("BGN_YEAR", "EVTYPE", "DAMAGE")
PlotE <- xyplot(
DAMAGE ~ BGN_YEAR | EVTYPE, data = EvtypeStormDataDamageSummaryByYear, col=c("red"), type="l" ,
ylab=list("Economic Impact in US$", cex=1.15), xlab=list("Year", cex=1.15), main=list("Economic Impact of weather",cex=2),
layout=c(3,3),
scales=list(
y=list(
log=TRUE,
limits=c(10000,700000),
at=c(10000,30000,100000,300000,600000),
labels=c("10K","30K","100K","300K","600K")
)
)
)
print(PlotE)
The analysis of weather event data by event type, year and number of fatalities and injuries shows that tornadoes account for over a third of all fatalities. Tornadoes have also been recorded consistently throughout the analysis period, possibly because their destructive power has long been understood. Although the 5633 deaths from tornadoes are high compared with other events, they amount to only about 6 percent of the 91346 injuries that tornadoes caused. The reasons for this comparatively low death rate need to be investigated; they could include better preparation and response when a tornado strikes, or significant differences in the tornadoes themselves. In any case, given the scale of the impact, even a 10 percent reduction in tornado casualties would make a large difference.
Excessive heat and heat together caused 2,840 deaths and 8,625 injuries. Data is only available from the 1990s, which suggests that this event has only recently been recognised and recorded. Deaths are also high relative to injuries, at roughly one death for every three injuries. A similar pattern emerges for flash flood.
Another event that requires attention is rip current, which has caused more deaths than injuries.
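The fatality-to-injury comparisons in this discussion can be reproduced from the counts in the health table. A minimal sketch follows, using figures printed earlier for five of the event types; the column name RATIO is illustrative.

```r
# Counts copied from the health table printed earlier
health <- data.frame(
  EVTYPE     = c("TORNADO", "EXCESSIVE HEAT", "HEAT", "FLASH FLOOD", "RIP CURRENT"),
  FATALITIES = c(5633, 1903, 937, 978, 368),
  INJURIES   = c(91346, 6525, 2100, 1777, 232),
  stringsAsFactors = FALSE
)

# Deaths per recorded injury for each event type
health$RATIO <- round(health$FATALITIES / health$INJURIES, 2)

# Highest ratio first
print(health[order(health$RATIO, decreasing = TRUE), ])
```

A ratio above 1 means more deaths than injuries were recorded for that event type, which on these figures is the case only for rip current.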
The analysis of weather events for economic impact shows an increasing impact from tornadoes. Of late, winds (with or without thunderstorms) have been causing as much damage as tornadoes. As expected, flood and flash flood damage is increasing at a significant pace and has now almost reached the same level as tornado damage.
Hail, while not significantly affecting human health, has damaged crops, but this damage has remained at approximately the same level over the last two decades.
Together these events are responsible for 90% of all recorded damage.