The following is a brief review of Storm Data, an official publication of the National Oceanic and Atmospheric Administration (NOAA) that documents the occurrence of storms and other significant weather phenomena having sufficient intensity to cause loss of life, injuries, significant property damage, and/or disruption to commerce. The data cover weather event types for the years 1950 through 2011. This review highlights the weather event types with the highest average damage to human health and property per event. It can be used to guide research and planning for the most significant weather events.
The Storm Data file is extracted and loaded into the StormData data frame. The data are processed to obtain numeric values for the property damage, and the injuries and fatalities are combined into one value. The average property damage cost and the average number of combined injuries and fatalities are computed per type of weather event. The top 10 weather event types by average property damage cost and by average number of combined injuries and fatalities are selected and sorted into smaller data frames.
## set up the environment
setwd("~/Coursera/Reproducible2")
library("R.utils")
library("ggplot2")
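## extract the StormData dataset; keep the archive and overwrite any earlier
## extraction so that this step can be rerun
bunzip2("./repdata-data-StormData.csv.bz2", overwrite=TRUE, remove=FALSE)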
## load the Storm Data
StormData <- read.csv("./repdata-data-StormData.csv")
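As an aside, the explicit extraction step may not be strictly necessary: `read.csv()` opens a file path through `file()`, which detects bzip2 compression when reading. The sketch below illustrates this on a small temporary file; the file contents and the `toy` data frame are made up for the illustration and are not part of Storm Data.

```r
# Write a tiny CSV into a bzip2-compressed temporary file (toy data only)
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
writeLines(c("EVTYPE,FATALITIES,INJURIES",
             "TORNADO,0,15",
             "HAIL,0,0"), con)
close(con)

# read.csv() opens the path with file(), which recognises the compression
toy <- read.csv(tmp)
unlink(tmp)  # remove the temporary file

print(toy)
```

If the same holds for the full archive, `read.csv("./repdata-data-StormData.csv.bz2")` would load the data without leaving an extracted copy on disk.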
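## convert the property damage exponent codes (PROPDMGEXP) to numeric multipliers;
## only upper-case "K", "M" and "B" are recognised, any other code is treated as 1
StormData$PROPDMGCOST <- sapply(as.character(StormData$PROPDMGEXP), FUN=function(exp) {
  if (exp == "K")
    return(1000.0)
  else if (exp == "M")
    return(1000000.0)
  else if (exp == "B")
    return(1000000000.0)
  return(1.0)
})
## property damage cost in dollars = reported amount * multiplier
StormData$PROPDMGCOST <- with( StormData , PROPDMG * PROPDMGCOST )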
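The exponent conversion can also be written as a vectorized lookup, which avoids calling a function once per row. The following is a minimal sketch on toy data, not the code used for this report. The helper name `exp_multiplier` is hypothetical, and treating lower-case codes such as "k" as thousands is an assumption that differs from the analysis above, where lower-case codes fall through to a multiplier of 1.

```r
# Hypothetical helper (not part of the original analysis): map exponent
# codes to numeric multipliers through a named lookup vector.
exp_multiplier <- function(exp) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  # ASSUMPTION: lower-case codes mean the same as upper-case ones
  mult <- lookup[toupper(as.character(exp))]
  mult[is.na(mult)] <- 1  # empty or unrecognised codes -> multiplier of 1
  unname(mult)
}

# Toy records, made up for illustration
toy <- data.frame(PROPDMG    = c(25, 2.5, 1, 10, 3),
                  PROPDMGEXP = c("K", "M", "B", "", "k"))
toy$PROPDMGCOST <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

Whether lower-case or numeric exponent codes should be interpreted at all is a judgment call about the data, so any change of this kind should be checked against the Storm Data documentation before it is applied to the full data set.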
## combine the injuries and the fatality numbers
StormData$HUMANINV <- with( StormData , FATALITIES + INJURIES )
## Calculate the means for each weather type
StormAvg <- aggregate(StormData[c("PROPDMGCOST","HUMANINV")], by=list(EVTYPE = StormData$EVTYPE), mean , na.rm=TRUE )
## select the top 10 weather types in order of cost
t10cost <- head(StormAvg[order(-StormAvg$PROPDMGCOST),] , 10)
t10cost$EVTYPE <- reorder(t10cost$EVTYPE,t10cost$PROPDMGCOST)
## select the top 10 weather types in order of number of injuries and fatalities
t10human <- head(StormAvg[order(-StormAvg$HUMANINV),] , 10)
t10human$EVTYPE <- reorder(t10human$EVTYPE,t10human$HUMANINV)
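When reading a ranking of per-type averages, it can help to know how many records each average is based on, since a type with a single extreme record can outrank types with many records. The sketch below shows the same aggregate, order, and head pattern on toy data with a record count added. The data, the `N` column, and the object names are made up for illustration and do not come from Storm Data.

```r
# Toy records, made up for illustration
toy <- data.frame(EVTYPE   = c("A", "A", "A", "B", "C", "C"),
                  HUMANINV = c(0, 1, 2, 30, 4, 6))

# Mean of HUMANINV per event type, as in the analysis above
toyAvg <- aggregate(toy["HUMANINV"], by = list(EVTYPE = toy$EVTYPE),
                    FUN = mean, na.rm = TRUE)

# Number of records behind each mean (hypothetical extra column)
toyAvg$N <- as.vector(table(toy$EVTYPE)[as.character(toyAvg$EVTYPE)])

# Top 2 types by mean, highest first
top2 <- head(toyAvg[order(-toyAvg$HUMANINV), ], 2)
print(top2)
```

In this toy case type "B" ranks first on the strength of one record, which is the kind of situation a count column makes visible.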
plot10human <- ggplot(t10human, aes(x=EVTYPE, y=HUMANINV)) +
  geom_bar(stat='identity') +
  xlab("Weather Event Type") + ylab("Average Number of Combined Injuries and Fatalities") +
  coord_flip()
plot10cost <- ggplot(t10cost, aes(x=EVTYPE, y=PROPDMGCOST)) +
  geom_bar(stat='identity') +
  xlab("Weather Event Type") + ylab("Average Property Damage Cost ($)") +
  coord_flip()
The following figure shows the top 10 weather types that have the highest average number of combined injuries and fatalities.
print(plot10human)
The following figure shows the top 10 weather types that have the highest average property damage costs.
print(plot10cost)
The weather event types are not always named according to the recommendations, and the names have not been corrected in the data presented. Particular events, such as named tropical storms, are therefore not averaged with similar event types. Renaming the weather types would combine some similar types, and the averages of costs or of combined injuries and fatalities could change as a result. The analysis is meant to give insight into the results of weather events that have occurred; because of the quality of the data, further in-depth analysis of the data in its context is recommended.
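As a rough illustration of how renaming could change the averages, the sketch below normalizes only letter case and whitespace before aggregating. It uses toy labels, and the helper `clean_evtype` is hypothetical. It does not attempt to merge genuinely different labels, such as named tropical storms, which would need a hand-built mapping.

```r
# Hypothetical helper: normalise case and whitespace in event type labels
clean_evtype <- function(x) {
  x <- toupper(as.character(x))
  x <- gsub("[[:space:]]+", " ", x)  # collapse runs of whitespace
  gsub("^ | $", "", x)               # trim a leading or trailing space
}

# Toy records, made up for illustration
toy <- data.frame(EVTYPE   = c("Heavy Rain", " HEAVY RAIN", "heavy  rain", "HAIL"),
                  HUMANINV = c(1, 3, 2, 0))

toy$EVTYPE <- clean_evtype(toy$EVTYPE)

# The three spellings now fall into one group and share one average
toyAvg <- aggregate(toy["HUMANINV"], by = list(EVTYPE = toy$EVTYPE), FUN = mean)
print(toyAvg)
```

Without the cleaning step the three spellings would be averaged as three separate event types.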