This report summarizes the analysis and findings for storm data collected between 1950 and 2011. The data is collected by the National Weather Service and the National Climatic Data Center and covers severe weather events that can cause public health and economic problems for communities and municipalities.
This report consists of two major sections:
A summary of how the data is processed
Inferences and the results obtained
# plyr is attached before dplyr, as plyr's own startup message recommends,
# so that dplyr's mutate(), summarise(), etc. are not masked by plyr.
library(data.table)
library(plyr)
library(dplyr)
library(lubridate)
library(reshape2)
library(ggplot2)
The data is loaded from a flat CSV file and converted to a data frame (a dplyr tbl_df) so that it can be aggregated quickly with plyr's ddply function.
# Load the CSV only once per session; later runs reuse storm_data.
if (!(exists("storm_data"))) {
  print("Storm Data Loading...")
  if (!(exists("csv_data"))) {
    csv_data <- read.csv("StormData.csv", sep = ",", stringsAsFactors = FALSE)
  }
  storm_data <- tbl_df(csv_data)
  rm(csv_data)
}
## [1] "Storm Data Loading..."
This section summarizes the fatalities, injuries and losses caused by the weather events listed in the source data.
This section shows the top 5 event types that cause the most deaths and the most injuries, respectively.
Fatalities
Tornadoes caused 5,633 deaths, the most of any event type. Excessive heat is next with 1,903 deaths, roughly a third of the tornado total.
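# Sum FATALITIES per event type and return the totals, largest first.
# Uses the `data` argument rather than the global storm_data.
calcNetFatalitiesByEvent <- function(data) {
  fatalitiesByEvent <- ddply(
    data,
    .(EVTYPE),
    function(x) {
      sum(x$FATALITIES, na.rm = TRUE)
    }
  )
  return(fatalitiesByEvent[order(-fatalitiesByEvent$V1), ])
}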
netFatalities <- calcNetFatalitiesByEvent(storm_data)
tmpData1 <- netFatalities[1:5, ]
# Bar chart of the five deadliest event types; stat = "identity" plots V1 as given.
p <- ggplot(tmpData1, aes(x = EVTYPE, y = V1, fill = factor(EVTYPE))) +
  geom_bar(stat = "identity") +
  labs(
    x = "Event Type",
    y = "No. of Deaths",
    title = "Net Fatalities of Top 5 Events"
  )
print(p)
print(tmpData1)
## EVTYPE V1
## 834 TORNADO 5633
## 130 EXCESSIVE HEAT 1903
## 153 FLASH FLOOD 978
## 275 HEAT 937
## 464 LIGHTNING 816
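The per-event totals above follow a simple pattern: group by EVTYPE, sum one column, sort in descending order, and keep the first few rows. As a minimal sketch of that pattern in base R, assuming nothing beyond the EVTYPE and FATALITIES column names, the same steps can be tried on a small toy data frame. The helper name `topEventsBy` and the toy values are hypothetical and are not part of the original analysis.

```r
# Toy data standing in for storm_data; the values are made up for illustration.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT", "LIGHTNING"),
  FATALITIES = c(3, 2, 5, 1, NA, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: total one numeric column per event type, largest first.
topEventsBy <- function(data, column, n = 5) {
  # tapply() groups the column by EVTYPE and sums each group, ignoring NAs.
  totals <- tapply(data[[column]], data$EVTYPE, sum, na.rm = TRUE)
  totals <- sort(totals, decreasing = TRUE)
  result <- data.frame(
    EVTYPE = names(totals),
    total  = as.numeric(totals),
    stringsAsFactors = FALSE
  )
  head(result, n)
}

top <- topEventsBy(toy, "FATALITIES", n = 2)
print(top)
```

Passing the column name as an argument means one helper could serve fatalities, injuries, and damage alike, instead of one near-identical function per column.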
Injuries
Tornadoes caused 91,346 injuries, about 13 times as many as the next event type, TSTM WIND (6,957).
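# Sum INJURIES per event type and return the totals, largest first.
# Uses the `data` argument rather than the global storm_data.
calcNetInjuriesByEvent <- function(data) {
  injuriesByEvent <- ddply(
    data,
    .(EVTYPE),
    function(x) {
      sum(x$INJURIES, na.rm = TRUE)
    }
  )
  return(injuriesByEvent[order(-injuriesByEvent$V1), ])
}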
netInjuries <- calcNetInjuriesByEvent(storm_data)
tmpData2 <- netInjuries[1:5, ]
# Bar chart of the five event types with the most injuries.
q <- ggplot(tmpData2, aes(x = EVTYPE, y = V1, fill = factor(EVTYPE))) +
  geom_bar(stat = "identity") +
  labs(
    x = "Event Type",
    y = "No. of Injuries",
    title = "Net Injuries of Top 5 Events"
  )
print(q)
print(tmpData2)
## EVTYPE V1
## 834 TORNADO 91346
## 856 TSTM WIND 6957
## 170 FLOOD 6789
## 130 EXCESSIVE HEAT 6525
## 464 LIGHTNING 5230
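This section shows the top 10 event types that cause the largest economic losses.
Crop Damage
Hail and flash floods cause the most crop damage. Losses from hail (579,596) are more than 3 times those from flash floods (179,200), the next most damaging event. These totals are sums of the raw CROPDMG column, with no magnitude multiplier applied, so they should be read as unscaled values rather than dollar amounts.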
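# Sum CROPDMG per event type and return the totals, largest first.
# Uses the `data` argument rather than the global storm_data.
calcNetCropDamage <- function(data) {
  cropDamageByEvent <- ddply(
    data,
    .(EVTYPE),
    function(x) {
      sum(x$CROPDMG, na.rm = TRUE)
    }
  )
  return(cropDamageByEvent[order(-cropDamageByEvent$V1), ])
}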
cropDamage <- calcNetCropDamage(storm_data)
print(head(cropDamage, 10))
## EVTYPE V1
## 244 HAIL 579596.28
## 153 FLASH FLOOD 179200.46
## 170 FLOOD 168037.88
## 856 TSTM WIND 109202.60
## 834 TORNADO 100018.52
## 760 THUNDERSTORM WIND 66791.45
## 95 DROUGHT 33898.62
## 786 THUNDERSTORM WINDS 18684.93
## 359 HIGH WIND 17283.21
## 290 HEAVY RAIN 11122.80
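In the table above, TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS appear as separate rows, although they plausibly describe the same kind of event. If they were merged, their combined total (about 194,679) would rank second, ahead of FLASH FLOOD. The sketch below shows one possible way to normalize such labels before aggregating. The helper name `normalizeEventType`, its three rewrite rules, and the toy values are assumptions made for illustration; a real cleanup would need rules checked against the full list of EVTYPE values.

```r
# Toy rows mirroring the three thunderstorm labels seen in the output above;
# the damage values are made up for illustration.
toy <- data.frame(
  EVTYPE  = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS", "HAIL"),
  CROPDMG = c(10, 6, 2, 15),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map label variants onto one canonical name.
normalizeEventType <- function(evtype) {
  evtype <- toupper(trimws(evtype))                 # unify case and whitespace
  evtype <- sub("^TSTM", "THUNDERSTORM", evtype)    # expand the abbreviation
  sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", evtype)  # drop the plural
}

toy$EVTYPE <- normalizeEventType(toy$EVTYPE)

# Aggregate after normalization so the variants are summed together.
merged <- aggregate(CROPDMG ~ EVTYPE, data = toy, FUN = sum)
merged <- merged[order(-merged$CROPDMG), ]
print(merged)
```

In this toy case the three wind rows collapse into one and overtake HAIL, which illustrates how label variants can change a ranking.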
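Property Damage
Tornadoes caused the highest property damage, with a raw PROPDMG total of about 3.2 million. This corresponds to roughly 3.2 billion dollars only if every value is recorded in thousands of dollars, which the code does not verify.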
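# Sum PROPDMG per event type and return the totals, largest first.
# Uses the `data` argument rather than the global storm_data.
calcNetPropertyDamage <- function(data) {
  propertyDamageByEvent <- ddply(
    data,
    .(EVTYPE),
    function(x) {
      sum(x$PROPDMG, na.rm = TRUE)
    }
  )
  return(propertyDamageByEvent[order(-propertyDamageByEvent$V1), ])
}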
propertyDamage <- calcNetPropertyDamage(storm_data)
print(head(propertyDamage, 10))
## EVTYPE V1
## 834 TORNADO 3212258.2
## 153 FLASH FLOOD 1420124.6
## 856 TSTM WIND 1335965.6
## 170 FLOOD 899938.5
## 760 THUNDERSTORM WIND 876844.2
## 244 HAIL 688693.4
## 464 LIGHTNING 603351.8
## 786 THUNDERSTORM WINDS 446293.2
## 359 HIGH WIND 324731.6
## 972 WINTER STORM 132720.6
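The damage totals in this section sum the raw PROPDMG and CROPDMG columns. The sketch below assumes the dataset also provides a magnitude code for each value in a PROPDMGEXP column (for example K for thousands, M for millions, B for billions); that column does not appear in the code above, so both the column and the code-to-multiplier mapping are assumptions here. Under those assumptions, each value could be scaled to dollars before summing. The helper name `scaleDamage` and the toy values are hypothetical.

```r
# Toy rows; the values and magnitude codes are made up for illustration.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(25, 1.5, 2, 500),
  PROPDMGEXP = c("K", "B", "M", ""),
  stringsAsFactors = FALSE
)

# Assumed mapping from magnitude code to multiplier.
multipliers <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale values by their code; unknown or empty codes
# are treated as a multiplier of 1 in this sketch.
scaleDamage <- function(value, exponent) {
  m <- multipliers[toupper(exponent)]
  m[is.na(m)] <- 1
  value * unname(m)
}

toy$PROPDMG_USD <- scaleDamage(toy$PROPDMG, toy$PROPDMGEXP)

# Sum the scaled values per event type, largest first.
scaled <- aggregate(PROPDMG_USD ~ EVTYPE, data = toy, FUN = sum)
scaled <- scaled[order(-scaled$PROPDMG_USD), ]
print(scaled)
```

On the raw toy values HAIL has the largest sum (500), while after scaling FLOOD ranks first, so whether the multiplier is applied can change which events appear most costly.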