This analysis uses the National Oceanic and Atmospheric Administration (NOAA) storm database to identify the storm event type that has had the greatest impact on human health and the economy.
The NOAA data records the human impact (fatalities and injuries) and the economic impact (crop and property damage) of each event. The observations span many event types across the United States. In this analysis we total these impacts per event type to find the storm type that causes the most damage.
We read the raw CSV file as follows.
storm<-read.csv("StormData.csv",
header=TRUE,
stringsAsFactors=FALSE,
na.strings="?")
The data set contains 902297 observations in 37 columns. There are 985 distinct event types, whose impact on human life is reported in the FATALITIES and INJURIES columns. Economic damage is recorded in the PROPDMG and CROPDMG columns.
To describe the impact of events on human life, we sum the FATALITIES and INJURIES columns per event type across the US.
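The counts quoted above can be checked with `dim()` and `unique()`. The sketch below uses a small made-up data frame, `storm_toy`, as a stand-in for `storm` (the values are hypothetical, not NOAA data). It also shows, as an assumption about the data rather than something this analysis does, how differences in letter case or stray whitespace in EVTYPE could inflate the number of distinct event types.

```r
# Toy stand-in for the `storm` data frame (hypothetical values, not NOAA data)
storm_toy <- data.frame(
  EVTYPE     = c("TORNADO", "Tornado", " TORNADO", "FLOOD", "flood", "HAIL"),
  FATALITIES = c(1, 0, 2, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Number of observations (rows) and columns
dim(storm_toy)

# Distinct event types exactly as recorded
raw_types <- length(unique(storm_toy$EVTYPE))

# Distinct event types after trimming whitespace and upper-casing
clean_types <- length(unique(toupper(trimws(storm_toy$EVTYPE))))
```

On the real data, the same two calls applied to `storm` and `storm$EVTYPE` give the observation and event-type counts; whether normalising the labels is appropriate depends on how the event types are meant to be grouped.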
library(dplyr)
humanCasualties<-summarise(group_by(storm,EVTYPE),
totalCount=sum(FATALITIES+INJURIES))
To describe the impact of events on the economy, we sum the crop and property damage columns per event type across the US.
economicImpact<-summarise(group_by(storm,EVTYPE),
totalCount=sum(CROPDMG+PROPDMG))
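The sum above adds the PROPDMG and CROPDMG magnitudes as recorded. The NOAA file also carries the columns PROPDMGEXP and CROPDMGEXP, which give the unit of each magnitude ("K" for thousands, "M" for millions, "B" for billions). The following is a minimal sketch, on made-up rows, of how the magnitudes could be scaled before summing. The helper `exp_multiplier` and the choice to leave blank or unrecognised codes unscaled are assumptions made for illustration, not part of the analysis above.

```r
# Hypothetical helper: map a damage exponent code to a numeric multiplier.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(code)]
  # Assumption: blank, missing or unrecognised codes leave the value unscaled
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy rows (hypothetical values, not NOAA data)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "FLOOD"),
  PROPDMG    = c(25, 5, 1.5),
  PROPDMGEXP = c("K", "M", "B"),
  CROPDMG    = c(0, 10, 0),
  CROPDMGEXP = c("", "k", ""),
  stringsAsFactors = FALSE
)

# Scale each magnitude by its unit, then add property and crop damage
toy$damageUSD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP) +
  toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)

# Total scaled damage per event type (base R equivalent of the grouped sum)
by_event <- aggregate(damageUSD ~ EVTYPE, data = toy, FUN = sum)
```

Whether scaled totals rank the event types in the same order as the raw sums would need to be checked on the full data set.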
We draw bar plots of the top five event types by total impact on human life and on the economy.
The following plot shows the event types and the combined count of fatalities and injuries they caused. Because there are over 900 storm event types, we plot only the top 5.
x<-arrange(humanCasualties,
desc(totalCount))
top5Human<-head(x,5)
barplot(top5Human$totalCount,
names.arg=top5Human$EVTYPE,
main="Storm Event Types and Human Impact",
xlab="Storm Event Types",
ylab="Count of injuries and fatalities",
col="orangered1")
The following plot shows the event types and the total crop and property damage they caused. Again, because there are over 900 storm event types, we plot only the top 5.
y<-arrange(economicImpact,
desc(totalCount))
top5Econ<-head(y,5)
barplot(top5Econ$totalCount,
names.arg=top5Econ$EVTYPE,
main="Storm Event Types and Impact on Economy",
xlab="Storm Event Types",
ylab="Total crop and property damage",
col="red1")
From both plots, it is clear that tornadoes have had the highest impact on human life and the economy. To analyse this further, we look at the damage (human and economic) caused by tornadoes over the years.
t0<-filter(storm,
EVTYPE=="TORNADO")
t1<-mutate(t0,tStamp=
as.Date(BGN_DATE,format="%m/%d/%Y %H:%M:%S"))
tornadoCasCounts<-summarise(group_by(t1,tStamp),
totalCount=sum(FATALITIES+INJURIES))
tornadoEcoCounts<-summarise(group_by(t1,tStamp),
totalCount=sum(CROPDMG+PROPDMG))
par(mfrow=c(1,2),
oma = c(0,0,8,0))
plot(tornadoCasCounts$tStamp,
tornadoCasCounts$totalCount,
type="l",
col="green",
xlab="Years",
ylab="Human Casualty Counts")
plot(tornadoEcoCounts$tStamp,
tornadoEcoCounts$totalCount,
type="l",
col="blue",
xlab="Years",
ylab="Economic Casualty Counts")
title("Comparison of human and economic casualties by tornado",
outer=TRUE)
The time-series plots suggest that, over the years, the loss of human life has been reduced, but property damage remains high.
Thus, tornadoes have had the highest impact on both population health and the economy.
It is recommended that citizens be educated to avoid locations such as the following, owing to the high fatality rates associated with them. (Refer to Section 7.40.9 of the National Weather Service document for more details.)
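The series plotted above are grouped by event date, while the x-axes are labelled in years. A yearly roll-up may make the long-term trend easier to read. The sketch below shows one way to do this in base R on a few made-up rows (the values are hypothetical, not NOAA data); the column names `year` and `casualties` are introduced here for illustration only.

```r
# Toy tornado rows (hypothetical values, not NOAA data)
toy <- data.frame(
  BGN_DATE   = c("4/18/1950 0:00:00", "6/8/1950 0:00:00", "2/20/1951 0:00:00"),
  FATALITIES = c(0, 1, 2),
  INJURIES   = c(15, 0, 3),
  stringsAsFactors = FALSE
)

# Parse the date with the same format string used for tStamp above
toy$tStamp <- as.Date(toy$BGN_DATE, format = "%m/%d/%Y %H:%M:%S")

# Extract the calendar year and total the casualties per year
toy$year <- as.integer(format(toy$tStamp, "%Y"))
toy$casualties <- toy$FATALITIES + toy$INJURIES
yearly <- aggregate(casualties ~ year, data = toy, FUN = sum)

# One point per year instead of one per event date
plot(yearly$year, yearly$casualties, type = "l",
     xlab = "Year", ylab = "Human Casualty Counts")
```

The same steps applied to `t1` would give one total per year for the tornado data.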