This report summarizes the analysis and findings of storm data collected between 1950 and 2011. The data is collected by the National Weather Service and the National Climatic Data Center and covers severe weather events that can cause public health and economic problems for communities and municipalities.

Synopsis

This report consists of two major sections: Data Processing and Results.

# plyr is loaded before dplyr, as plyr's own startup message recommends.
library(data.table)
library(plyr)
library(dplyr)
library(lubridate)
library(reshape2)
library(ggplot2)

Data Processing

Data Transformation

The flat CSV file is loaded into a data frame so that the ddply function from plyr can aggregate it quickly by event type.

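# Load the CSV only once per session; later runs reuse storm_data.
if (!exists("storm_data")) {
    print("Storm Data Loading...")
    if (!exists("csv_data")) {
        csv_data <- read.csv("StormData.csv", sep = ",", stringsAsFactors = FALSE)
    }
    # as_tibble() replaces tbl_df(), which is deprecated in current dplyr releases.
    storm_data <- as_tibble(csv_data)
    rm(csv_data)
}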
## [1] "Storm Data Loading..."
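The loading step above reads every column of the CSV, although the aggregations in this report only touch a handful of them. As a minimal sketch, assuming the data.table package is installed, the file could be read with `fread()` and restricted to those columns. The helper name `load_storm_columns` and the temporary demo file are hypothetical and exist only for illustration; the column names are the ones used by the code in this report.

```r
library(data.table)

# Hypothetical helper: read only the columns that this report aggregates.
load_storm_columns <- function(path,
                               columns = c("EVTYPE", "FATALITIES", "INJURIES",
                                           "PROPDMG", "CROPDMG")) {
    # `select` drops unused columns at read time;
    # data.table = FALSE returns a plain data frame.
    fread(path, select = columns, data.table = FALSE)
}

# Demonstration on a small temporary file, so the sketch runs without StormData.csv.
demo_path <- tempfile(fileext = ".csv")
write.csv(
    data.frame(
        EVTYPE = c("TORNADO", "HAIL"),
        FATALITIES = c(2, 0),
        INJURIES = c(5, 1),
        PROPDMG = c(25, 3),
        CROPDMG = c(0, 10),
        REMARKS = c("a", "b")
    ),
    demo_path,
    row.names = FALSE
)
demo <- load_storm_columns(demo_path)
print(names(demo))
```

In the report itself the call would be `load_storm_columns("StormData.csv")`. Whether this is noticeably faster depends on the file size and the machine.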

Results

This section summarizes the fatalities, injuries, and economic losses attributed to the severe weather events recorded in the source data.

Events Most Harmful to Population Health

This section shows the five event types that cause the most deaths and the five that cause the most injuries.

Fatalities

Tornadoes caused 5,633 deaths, more than any other event type. Excessive heat is second with 1,903 deaths, roughly a third of the tornado total.

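# Sum FATALITIES per event type and sort from highest to lowest.
calcNetFatalitiesByEvent <- function(data) {
    fatalitiesByEvent <- ddply(
        data,  # use the argument rather than the global storm_data
        .(EVTYPE),
        function(x) {
            sum(x$FATALITIES, na.rm = TRUE)
        }
    )
    return(fatalitiesByEvent[order(-fatalitiesByEvent$V1), ])
}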
netFatalities <- calcNetFatalitiesByEvent(storm_data)

tmpData1 <- netFatalities[1:5, ]

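# The stat argument of qplot() is deprecated in current ggplot2 releases,
# so the bar chart is built with ggplot() and geom_bar(stat = "identity").
p <- ggplot(tmpData1, aes(x = EVTYPE, y = V1, fill = factor(EVTYPE))) +
    geom_bar(stat = "identity") +
    labs(
        x = "Event Type",
        y = "Number of Deaths",
        title = "Net Fatalities of Top 5 Events"
    )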
print(p)

print(tmpData1)
##             EVTYPE   V1
## 834        TORNADO 5633
## 130 EXCESSIVE HEAT 1903
## 153    FLASH FLOOD  978
## 275           HEAT  937
## 464      LIGHTNING  816
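The per-event totals in this report all follow the same pattern: group by EVTYPE, sum one numeric column, and sort in descending order. A minimal sketch of that pattern as a single helper is shown below. It uses base R `tapply()` instead of `ddply()` so that it runs without extra packages; the name `sum_by_event` and the toy data frame are hypothetical, and the result column is called `V1` only to match the tables in this report.

```r
# Hypothetical helper: total one numeric column per event type, largest first.
sum_by_event <- function(data, column) {
    # tapply() groups the chosen column by EVTYPE and sums each group.
    totals <- tapply(data[[column]], data$EVTYPE, sum, na.rm = TRUE)
    result <- data.frame(
        EVTYPE = names(totals),
        V1 = as.numeric(totals),
        stringsAsFactors = FALSE
    )
    result <- result[order(-result$V1), ]
    rownames(result) <- NULL
    result
}

# Toy data for illustration only; these are not values from the storm dataset.
toy <- data.frame(
    EVTYPE = c("TORNADO", "HEAT", "TORNADO", "FLOOD"),
    FATALITIES = c(3, 2, 4, NA),
    INJURIES = c(10, 0, 5, 1)
)
top_fatalities <- sum_by_event(toy, "FATALITIES")
print(top_fatalities)
```

The same call with `"INJURIES"`, `"CROPDMG"`, or `"PROPDMG"` would cover the other aggregations, which avoids maintaining four nearly identical functions.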

Injuries

Tornadoes caused 91,346 injuries, more than thirteen times the 6,957 attributed to the next event type, TSTM WIND.

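# Sum INJURIES per event type and sort from highest to lowest.
calcNetInjuriesByEvent <- function(data) {
    injuriesByEvent <- ddply(
        data,  # use the argument rather than the global storm_data
        .(EVTYPE),
        function(x) {
            sum(x$INJURIES, na.rm = TRUE)
        }
    )
    return(injuriesByEvent[order(-injuriesByEvent$V1), ])
}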

netInjuries <- calcNetInjuriesByEvent(storm_data)

tmpData2 <- netInjuries[1:5, ]

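# The stat argument of qplot() is deprecated in current ggplot2 releases,
# so the bar chart is built with ggplot() and geom_bar(stat = "identity").
q <- ggplot(tmpData2, aes(x = EVTYPE, y = V1, fill = factor(EVTYPE))) +
    geom_bar(stat = "identity") +
    labs(
        x = "Event Type",
        y = "Number of Injuries",
        title = "Net Injuries of Top 5 Events"
    )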
print(q)

print(tmpData2)
##             EVTYPE    V1
## 834        TORNADO 91346
## 856      TSTM WIND  6957
## 170          FLOOD  6789
## 130 EXCESSIVE HEAT  6525
## 464      LIGHTNING  5230
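By default ggplot2 places the bars of a character or factor x-axis in alphabetical order, so the chart does not follow the ranking in the table. One possible way to draw the bars from largest to smallest is to convert EVTYPE into a factor ordered by the total with `reorder()`. The sketch below assumes ggplot2 is installed; the data frame simply restates the five injury totals printed above, and the names `top_injuries` and `ordered_plot` are hypothetical.

```r
library(ggplot2)

# The five injury totals printed in the table above.
top_injuries <- data.frame(
    EVTYPE = c("TORNADO", "TSTM WIND", "FLOOD", "EXCESSIVE HEAT", "LIGHTNING"),
    V1 = c(91346, 6957, 6789, 6525, 5230),
    stringsAsFactors = FALSE
)

# reorder() returns a factor whose levels are sorted by -V1,
# so the event type with the largest total becomes the first level.
top_injuries$EVTYPE <- reorder(top_injuries$EVTYPE, -top_injuries$V1)

ordered_plot <- ggplot(top_injuries, aes(x = EVTYPE, y = V1, fill = EVTYPE)) +
    geom_bar(stat = "identity") +
    labs(
        x = "Event Type",
        y = "Number of Injuries",
        title = "Net Injuries of Top 5 Events"
    )
print(ordered_plot)
```

The same `reorder()` call would apply to the fatalities chart.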

Events Causing Maximum Economic Loss

This section shows the ten event types that cause the largest economic losses, split into crop damage and property damage.

Crop Damage

Hail and flash floods produce the highest crop damage totals. The hail total (579,596) is more than three times that of the next event type, flash flood (179,200). Both figures are sums of the CROPDMG column as recorded.

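# Sum CROPDMG per event type and sort from highest to lowest.
calcNetCropDamage <- function(data) {
    cropDamageByEvent <- ddply(
        data,  # use the argument rather than the global storm_data
        .(EVTYPE),
        function(x) {
            sum(x$CROPDMG, na.rm = TRUE)
        }
    )
    return(cropDamageByEvent[order(-cropDamageByEvent$V1), ])
}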

cropDamage <- calcNetCropDamage(storm_data)
print(head(cropDamage, 10))
##                 EVTYPE        V1
## 244               HAIL 579596.28
## 153        FLASH FLOOD 179200.46
## 170              FLOOD 168037.88
## 856          TSTM WIND 109202.60
## 834            TORNADO 100018.52
## 760  THUNDERSTORM WIND  66791.45
## 95             DROUGHT  33898.62
## 786 THUNDERSTORM WINDS  18684.93
## 359          HIGH WIND  17283.21
## 290         HEAVY RAIN  11122.80
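The table above lists TSTM WIND, THUNDERSTORM WIND, and THUNDERSTORM WINDS as separate rows, although they appear to describe the same kind of event. Adding just those three rows gives about 194,679, which would rank above FLASH FLOOD. A minimal sketch of merging such spelling variants before aggregating is shown below. The helper name `normalize_event_type` is hypothetical, and the rules cover only the three variants visible in this report; a full cleanup of EVTYPE would need more rules.

```r
# Hypothetical helper: map a few spelling variants of EVTYPE onto one label.
normalize_event_type <- function(evtype) {
    # Remove surrounding whitespace and unify the letter case.
    cleaned <- toupper(trimws(evtype))
    # Assumption: "TSTM" abbreviates "THUNDERSTORM" in these labels.
    cleaned <- gsub("TSTM", "THUNDERSTORM", cleaned, fixed = TRUE)
    # Collapse the plural "THUNDERSTORM WINDS" into the singular form.
    cleaned <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", cleaned)
    cleaned
}

variants <- c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS", " tstm wind ")
normalized <- normalize_event_type(variants)
print(unique(normalized))
```

In the report this could be applied as `storm_data$EVTYPE <- normalize_event_type(storm_data$EVTYPE)` before any of the aggregation functions run.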

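Property Damage

Tornadoes have the highest property damage total, about 3.2 million in the units recorded in the PROPDMG column. This corresponds to roughly 3.2 billion dollars only if every entry is recorded in thousands of dollars, which the code in this report does not verify.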

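# Sum PROPDMG per event type and sort from highest to lowest.
calcNetPropertyDamage <- function(data) {
    propertyDamageByEvent <- ddply(
        data,  # use the argument rather than the global storm_data
        .(EVTYPE),
        function(x) {
            sum(x$PROPDMG, na.rm = TRUE)
        }
    )
    return(propertyDamageByEvent[order(-propertyDamageByEvent$V1), ])
}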

propertyDamage <- calcNetPropertyDamage(storm_data)
print(head(propertyDamage, 10))
##                 EVTYPE        V1
## 834            TORNADO 3212258.2
## 153        FLASH FLOOD 1420124.6
## 856          TSTM WIND 1335965.6
## 170              FLOOD  899938.5
## 760  THUNDERSTORM WIND  876844.2
## 244               HAIL  688693.4
## 464          LIGHTNING  603351.8
## 786 THUNDERSTORM WINDS  446293.2
## 359          HIGH WIND  324731.6
## 972       WINTER STORM  132720.6
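The totals above add up the PROPDMG and CROPDMG values directly. The code in this report does not show it, but the NOAA storm dataset is commonly distributed with companion columns PROPDMGEXP and CROPDMGEXP that hold a magnitude code for each amount. Treat that as an assumption to check against the actual file. If those columns are present, a sketch of converting the amounts to dollars before summing could look like the following. The helper name `exponent_multiplier`, the code-to-multiplier mapping, and the toy data are all illustrative.

```r
# Assumed mapping from magnitude codes to multipliers (H, K, M, B).
# Any other code, including a blank, is treated as 1, which is a
# simplifying assumption and not a documented rule.
exponent_multiplier <- function(code) {
    lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
    multiplier <- lookup[toupper(as.character(code))]
    multiplier[is.na(multiplier)] <- 1
    unname(multiplier)
}

# Toy data for illustration only; these are not values from the storm dataset.
toy_damage <- data.frame(
    EVTYPE = c("TORNADO", "FLOOD", "HAIL", "HAIL"),
    PROPDMG = c(25, 1.5, 500, 3),
    PROPDMGEXP = c("K", "B", "", "m"),
    stringsAsFactors = FALSE
)
toy_damage$PROPDMG_USD <- toy_damage$PROPDMG * exponent_multiplier(toy_damage$PROPDMGEXP)
print(toy_damage)
```

Summing a converted column such as `PROPDMG_USD` per event type may change the ranking, because a single entry recorded in billions can outweigh many entries recorded in thousands.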