This document analyses historical storm data provided by NOAA to identify the severe weather events with the greatest impact on the population and the greatest economic damage.
Weather-related events occur in the United States every year, and some are known to cause widespread population impact and economic damage. This study analyses weather-related events dating back to 1950 to determine which specific event types have had the greatest economic and social impact. NOAA provides these events and their corresponding impacts in a database file covering 1950 to November 2011. The database contains various fields that capture information for each event. The most relevant fields for this study were determined to be the type of event (EVTYPE), the economic damage, including the monetary assessment of damage to property and crops (PROPDMG, CROPDMG), and the impact on the population (FATALITIES, INJURIES).
The NOAA data is supplied as a CSV file and was loaded into R using the following code:
data <- read.csv("C:/RLearning/RepResearch/data/repdata_data_StormData.csv")
Next, only the fields relevant to this study were extracted into a separate dataset:
x <- data[,c("BGN_DATE","STATE","COUNTY","END_DATE","EVTYPE","PROPDMG","CROPDMG","FATALITIES","INJURIES","REMARKS")]
The records are grouped by event type (EVTYPE), treating this column as the grouping factor, and the population impact columns (FATALITIES, INJURIES) are summed within each group. The result is a dataset containing, for each EVTYPE, the totals across all recorded years and the entire United States.
totalsum <- data.frame(cbind(tapply(x$FATALITIES,x$EVTYPE,sum),tapply(x$INJURIES,x$EVTYPE,sum)))
totalsum <- cbind(totalsum,rownames(totalsum))
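The per-event summation described above can also be written with `aggregate()`, which sums both columns in one call and keeps EVTYPE as an ordinary column. The following is a minimal sketch on a toy data frame; the `toy` values are invented for illustration and are not taken from the NOAA file.

```r
# Toy stand-in for the extracted storm data (values are invented, not NOAA figures)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 0, 5, 2, 1),
  stringsAsFactors = FALSE
)

# Sum both casualty columns for each event type in a single call
toy_totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Cross-check against the tapply() approach used in the text
by_tapply <- tapply(toy$FATALITIES, toy$EVTYPE, sum)

print(toy_totals)
```

Both approaches return the groups in alphabetical order of EVTYPE, so the two sets of totals can be compared position by position.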
This summary can now be sorted in decreasing order (FATALITIES first, INJURIES second). The assumption is that fatalities are a more severe outcome than injuries, so FATALITIES is the primary sort key. Two additional columns giving FATALITIES and INJURIES as a percentage of the total are added to indicate severity.
rownames(totalsum) <- c()
xx <-totalsum[order(totalsum[,1],totalsum[,2],decreasing=TRUE),]
xx$fpercent <- xx[,1]*100/sum(xx[,1])
xx$ipercent<- xx[,2]*100/sum(xx[,2])
popdamages <- xx[1:10,c(3,1,2,4,5)]
colnames(popdamages) = c("EVTYPE","FATALITIES","INJURIES","FATALITIES (% total)","INJURIES(% total)")
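The two-key sort and the percentage-of-total columns can be illustrated on a small example. This is a sketch under stated assumptions: the `toy_totals` values are invented, and the column names `fpercent` and `ipercent` simply mirror those used in the code above.

```r
# Toy per-event totals (invented values) with a tie on FATALITIES
toy_totals <- data.frame(
  EVTYPE     = c("FLOOD", "HEAT", "TORNADO", "LIGHTNING"),
  FATALITIES = c(4, 4, 10, 2),
  INJURIES   = c(6, 1, 30, 3),
  stringsAsFactors = FALSE
)

# Sort by FATALITIES first; ties are broken by INJURIES (both descending)
ranked <- toy_totals[order(toy_totals$FATALITIES, toy_totals$INJURIES,
                           decreasing = TRUE), ]
rownames(ranked) <- NULL

# Express each event type's share of the overall totals as a percentage
ranked$fpercent <- 100 * ranked$FATALITIES / sum(ranked$FATALITIES)
ranked$ipercent <- 100 * ranked$INJURIES / sum(ranked$INJURIES)

print(ranked)
```

FLOOD and HEAT tie on fatalities here, so the secondary key places FLOOD, which has more injuries, ahead of HEAT.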
To assess the economic consequences of weather events, the damage amounts recorded in PROPDMG and CROPDMG are summed for each event type and added together into a column called TOTALPROPDMG. The result is then sorted in descending order, from most damage to least.
propsum <- data.frame(cbind(tapply(x$PROPDMG,x$EVTYPE,sum),tapply(x$CROPDMG,x$EVTYPE,sum)))
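# Build the damage table from propsum (property and crop sums), not from the casualty table totalsum
propsum <- cbind(propsum,rownames(propsum))
propsum$TOTALPROPDMG <- propsum[,1] + propsum[,2]
propsumdesc <-propsum[order(propsum[,"TOTALPROPDMG"],decreasing=TRUE),]
propsumdesc$dmgpercent <- propsumdesc[,"TOTALPROPDMG"]*100/sum(propsumdesc[,"TOTALPROPDMG"])
# Keep the event name (column 3), total damage (column 4) and percentage (column 5)
propsumdesc <- propsumdesc[,c(3,4,5)]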
rownames(propsumdesc) <- c()
colnames(propsumdesc) <- c("EVTYPE","TOTALDMG","% of Total")
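PROPDMG and CROPDMG are treated above as directly summable amounts. The raw NOAA file is understood to also carry magnitude columns, PROPDMGEXP and CROPDMGEXP, holding codes such as K, M and B for thousands, millions and billions; these columns were not extracted in this study, so their use here is an assumption. The sketch below shows one way such codes could be applied before summing. The `toy` data, the `multiplier` lookup and the `to_dollars` helper are all hypothetical names introduced for illustration.

```r
# Toy records (invented values); the *EXP columns are assumed, not part of the extracted dataset
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(25, 1.5, 10, 500),
  PROPDMGEXP = c("K", "M", "M", "K"),
  CROPDMG    = c(0, 2, 0, 100),
  CROPDMGEXP = c("", "M", "", "K"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup from magnitude code to multiplier
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale an amount by its magnitude code
to_dollars <- function(amount, exp_code) {
  m <- multiplier[toupper(exp_code)]
  m[is.na(m)] <- 1  # blank or unrecognised codes leave the amount unscaled (an assumption)
  amount * unname(m)
}

toy$TOTAL <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP) +
  to_dollars(toy$CROPDMG, toy$CROPDMGEXP)

# Total dollar damage per event type, largest first
dmg_by_event <- tapply(toy$TOTAL, toy$EVTYPE, sum)
dmg_sorted <- dmg_by_event[order(dmg_by_event, decreasing = TRUE)]
print(dmg_sorted)
```

How blank or unexpected codes should be handled is a judgment call that depends on the NOAA documentation; the fallback to a multiplier of 1 is only a placeholder.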
Reviewing the population impact data in descending order, we can see that TORNADO and EXCESSIVE HEAT events together account for roughly half of all fatalities (49.8%) and about 70% of all injuries (69.6%).
head(popdamages,10)
##             EVTYPE FATALITIES INJURIES FATALITIES (% total) INJURIES(% total)
## 834        TORNADO       5633    91346            37.193793        65.0019925
## 130 EXCESSIVE HEAT       1903     6525            12.565203         4.6432028
## 153    FLASH FLOOD        978     1777             6.457577         1.2645167
## 275           HEAT        937     2100             6.186860         1.4943641
## 464      LIGHTNING        816     5230             5.387917         3.7216782
## 856      TSTM WIND        504     6957             3.327831         4.9506148
## 170          FLOOD        470     6789             3.103334         4.8310657
## 585    RIP CURRENT        368      232             2.429845         0.1650917
## 359      HIGH WIND        248     1137             1.637504         0.8090914
## 19       AVALANCHE        224      170             1.479036         0.1209723
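When the same ranking is applied to the damage table, TORNADO ranks first (62%), followed by EXCESSIVE HEAT (5.4%). Note that the TOTALDMG figures printed below equal FATALITIES + INJURIES for each event type (for TORNADO, 5633 + 91346 = 96979), so this output reflects combined casualties rather than PROPDMG + CROPDMG totals and should be regenerated from the property and crop sums before it is read as economic damage.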
head(propsumdesc,10)
##               EVTYPE TOTALDMG % of Total
## 1            TORNADO    96979 62.2966089
## 2     EXCESSIVE HEAT     8428  5.4139125
## 3          TSTM WIND     7461  4.7927386
## 4              FLOOD     7259  4.6629795
## 5          LIGHTNING     6046  3.8837820
## 6               HEAT     3037  1.9508842
## 7        FLASH FLOOD     2755  1.7697353
## 8          ICE STORM     2064  1.3258561
## 9  THUNDERSTORM WIND     1621  1.0412853
## 10      WINTER STORM     1527  0.9809023
The pie charts below show the five event types that contribute most to fatalities, injuries, and the damage assessment.
par(mfrow=c(1,3))
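# Each chart plots the percentage column that matches its title, with labels limited to the same five rows
pie(popdamages[1:5,4],labels=popdamages$EVTYPE[1:5],main="Fatalities")
pie(popdamages[1:5,5],labels=popdamages$EVTYPE[1:5],main="Injuries")
pie(propsumdesc[1:5,3],labels=propsumdesc$EVTYPE[1:5],main="Economic")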
From the tables above, TORNADO and EXCESSIVE HEAT rank first and second both for population impact and for the damage totals as computed, so preparing for these two event types would help mitigate these losses to a considerable extent.