Synopsis

This analysis of adverse weather events identifies which event types cause the most human injuries, human fatalities, and property damage. Events from 1950 to November 2011 are taken into consideration. Tornadoes top the list in all three categories.

Data Processing

Loading data into R

The data is downloaded from this link. It is in bzip2 format; it is unzipped and loaded into the workspace object “stormdata” using the following code block.

R.utils::bunzip2("stormdata.csv.bz2")
## Warning in file.remove(filename): cannot remove file 'stormdata.csv.bz2',
## reason 'Permission denied'
stormdata <- read.csv("stormdata.csv")
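The 'Permission denied' warning above appears to come from bunzip2() trying to delete the archive after extracting it. As a possible alternative, read.csv() can open a bzip2-compressed file directly, so the separate unzip step can be skipped. The sketch below demonstrates this on a small temporary file; the toy data frame and the temporary path are illustrative assumptions, not part of the original analysis.

```r
# Toy stand-in for the storm data (hypothetical values).
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(3, 0))

# Write a small bzip2-compressed CSV to a temporary path.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and reads the file directly;
# no bunzip2() call is needed.
toyBack <- read.csv(tmp, stringsAsFactors = FALSE)
print(toyBack)
```

Applied to this report, the same idea would amount to calling read.csv("stormdata.csv.bz2") on the downloaded archive.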

Preprocessing the data

We are interested in the total human fatalities, human injuries, and property damage inflicted by each event type. Hence the columns of interest in the data set are “EVTYPE”, “FATALITIES”, “INJURIES”, and “PROPDMG”. The aggregate function (part of base R's stats package) is used to total each variable per event type, and the result is stored in an object, sdFnInPd. The plyr package is loaded here for its arrange function, which is used in the next step.

library(plyr)
## Warning: package 'plyr' was built under R version 3.3.2
sdFnInPd <- aggregate(stormdata[,c("FATALITIES", "INJURIES", "PROPDMG")], by = list(stormdata$EVTYPE), sum, na.rm = TRUE)
names(sdFnInPd)[1] <- "EVTYPE"
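To make the aggregation step concrete, here is a minimal sketch on a toy data frame. The values are made up for illustration; only the column names match the storm data. Naming the element of the by list is an optional variation that avoids the separate renaming step.

```r
# Hypothetical miniature of the storm data.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(2, 0, 1, NA, 4),
  INJURIES   = c(10, 1, 5, 2, 0),
  PROPDMG    = c(25, 5, 50, 2.5, 100)
)

# Sum each measure within every event type; na.rm = TRUE is passed on to
# sum(), so the missing HAIL fatality count is ignored rather than
# turning the total into NA.
toyTotals <- aggregate(toy[, c("FATALITIES", "INJURIES", "PROPDMG")],
                       by = list(EVTYPE = toy$EVTYPE),
                       FUN = sum, na.rm = TRUE)
print(toyTotals)
```

The result has one row per event type and one column of totals per measure, which is the shape sdFnInPd takes for the full data set.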

Calculating the top 5 events

The five event types that cause the most human fatalities, human injuries, and property damage are identified and stored in three separate objects. The three lists are then merged, and a combined table shows the totals of all three variables for every event type that appears in at least one list. The table is printed after the code block below.

# Names of the five event types with the highest total for each measure
eventMax5Inj <- as.character(head(arrange(sdFnInPd, sdFnInPd$INJURIES, decreasing = TRUE), 5)[,1])
eventMax5Fat <- as.character(head(arrange(sdFnInPd, sdFnInPd$FATALITIES, decreasing = TRUE), 5)[,1])
eventMax5PD <- as.character(head(arrange(sdFnInPd, sdFnInPd$PROPDMG, decreasing = TRUE), 5)[,1])

# Union of the three top-5 lists, with duplicates removed
eventMax <- unique(c(eventMax5Inj, eventMax5Fat, eventMax5PD))

# Keep only those event types and sort by fatalities, highest first
eventMax5DF <- sdFnInPd[sdFnInPd$EVTYPE %in% eventMax,]
eventMax5DF <- arrange(eventMax5DF, eventMax5DF$FATALITIES, decreasing = TRUE)

print(eventMax5DF)
##               EVTYPE FATALITIES INJURIES    PROPDMG
## 1            TORNADO       4069    69331 1882363.57
## 2               HEAT        643      232      55.00
## 3          TSTM WIND        263     3327     325.00
## 4          LIGHTNING         61      377   33472.42
## 5              FLOOD         31       21   15706.60
## 6 THUNDERSTORM WINDS         26      251  152017.57
## 7        FLASH FLOOD         17       22   51224.20
## 8           BLIZZARD         14      404     655.00
## 9               HAIL          5      442   23750.96
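The table has nine rows rather than fifteen because the three top-5 lists overlap. The sketch below illustrates the selection-and-union logic on toy totals. It uses base R's order() in place of plyr's arrange() so that it runs without extra packages, and the helper topEvents is a hypothetical name introduced only for this illustration.

```r
# Hypothetical per-event totals.
toyTotals <- data.frame(
  EVTYPE     = c("A", "B", "C", "D"),
  FATALITIES = c(9, 1, 5, 0),
  INJURIES   = c(2, 8, 7, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of the n event types with the largest totals
# in the given column.
topEvents <- function(df, column, n) {
  ranked <- df[order(df[[column]], decreasing = TRUE), ]
  head(ranked$EVTYPE, n)
}

topFat <- topEvents(toyTotals, "FATALITIES", 2)  # "A" "C"
topInj <- topEvents(toyTotals, "INJURIES", 2)    # "B" "C"

# "C" is in both lists, so the union holds three names, not four.
toyTop <- unique(c(topInj, topFat))
print(toyTop)
```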

Results

Tornadoes inflict the most property damage and cause the most human injuries and fatalities. The plots that follow demonstrate this.

Human Injuries and Fatalities

par(las=1, mfrow = c(2,1), mar= c(4,12,1,1))
barplot(t(eventMax5DF[,2:3]), names.arg = eventMax5DF[,1], horiz = TRUE,
        xlab = "Victim-Count", 
        main = "Humans were most affected by Tornadoes.")
barplot(t(log(eventMax5DF[,2:3])), names.arg = eventMax5DF[,1], horiz = TRUE,
        legend.text=TRUE, args.legend=list(x=20, y=12, bty = "n"), 
        xlab = "Log of Victim-Count" )
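The log-scale panel works here because none of the nine event types in the table has a zero count. If a zero did appear, log() would return -Inf for it. One possible safeguard, offered as an assumption rather than as part of the original analysis, is log1p(), which computes log(1 + x):

```r
# Hypothetical counts, including a zero.
counts <- c(0, 5, 4069)

plainLog   <- log(counts)    # first element is -Inf
shiftedLog <- log1p(counts)  # log(1 + x); first element is 0

print(plainLog)
print(shiftedLog)
```

The shift changes small values noticeably, so the axis label would need to say "log(1 + count)" if this variant were used.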

Damage to property

eventMax5DF <- arrange(eventMax5DF, eventMax5DF$PROPDMG, decreasing = TRUE)
par(las=1, mfrow = c(2,1), mar= c(4,12,1,1))
barplot(t(eventMax5DF[,4]), names.arg = eventMax5DF[,1], horiz = TRUE,
        xlab = "Property Damage ($)", 
        main = "Properties were most affected by Tornadoes")
barplot(t(log(eventMax5DF[,4])), names.arg = eventMax5DF[,1], horiz = TRUE,
        xlab = "Log of Property Damage ($)" )