Created by James Lim - June 2015

Title

Explore the NOAA Storm Database and answer some basic questions about severe weather events.

Synopsis

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Data Processing

The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. You can download the 47 MB file from the web site: Storm Data.

There is also some documentation of the database available. Here you will find how some of the variables are constructed/defined.

  1. National Weather Service Storm Data Documentation

  2. National Climatic Data Center Storm Events FAQ

The compressed file is downloaded to the local hard disk, and the user must decompress the bzip2 archive to repdata_data_StormData.csv before running the code below. Reading the CSV takes a few minutes.

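# Set the working directory to the folder that holds the decompressed file
setwd("e:\\module5")
# read.csv() already uses "," as its default separator
stormdata <- read.csv("repdata_data_StormData.csv")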
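
As an aside to the manual decompression step, base R can usually read a bzip2-compressed CSV directly, because read.csv() opens the path through file(), which detects the compression when reading. The sketch below demonstrates this on a tiny made-up file rather than the real download; the temporary file and the sample rows are hypothetical.

```r
# Hypothetical miniature stand-in for the storm data file
tmp <- tempfile(fileext = ".csv.bz2")
sample_rows <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD"),
  FATALITIES = c(3, 1),
  stringsAsFactors = FALSE
)

# Write the rows through a bzip2 connection
con <- bzfile(tmp, "w")
write.csv(sample_rows, con, row.names = FALSE)
close(con)

# Read the compressed file back without decompressing it first
roundtrip <- read.csv(tmp, stringsAsFactors = FALSE)
print(roundtrip)
```

If the same holds for the downloaded archive, passing its compressed file name to read.csv() would avoid the separate decompression step.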

Subset data for fatalities, injuries, property damage, and crop damage

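# Keep only the event type and the four impact columns
newDataName <- c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "CROPDMG")
dataDanger <- subset(stormdata, select = newDataName)
# Convert event names to upper case so that case variants aggregate together
dataDanger$EVTYPE <- toupper(dataDanger$EVTYPE)
# Drop the "Summary ..." rows, which are not weather events
dataDanger <- dataDanger[!grepl("SUMMARY", dataDanger$EVTYPE), ]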
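
To show what the upper-casing and the summary filter do, here is a minimal sketch on made-up rows. The data frame toy and its values are hypothetical and only mimic the shape of the real data.

```r
# Hypothetical rows with mixed-case event names and one summary row
toy <- data.frame(
  EVTYPE = c("Tornado", "TORNADO", "Summary of May 1", "hail"),
  FATALITIES = c(1, 2, 0, 0),
  stringsAsFactors = FALSE
)

# Upper-case first, then filter on the upper-cased names
toy$EVTYPE <- toupper(toy$EVTYPE)
toy <- toy[!grepl("SUMMARY", toy$EVTYPE), ]

# "Tornado" and "TORNADO" now share one label; the summary row is gone
remaining <- unique(toy$EVTYPE)
print(remaining)
```

Filtering on the already upper-cased column keeps the pattern and the data in the same case, so the match does not depend on how a row was originally typed.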

Aggregate Data

damageType=c("FATALITIES","INJURIES","PROPDMG","CROPDMG")
damages<-aggregate(dataDanger[damageType],list(dataDanger$EVTYPE),sum)
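
The aggregation can be illustrated on a small scale. This is a sketch on made-up values (the data frame toy is hypothetical); it also shows that aggregate() names the grouping column Group.1 when the grouping list is unnamed, which is why that name appears in the later steps.

```r
# Hypothetical per-event rows
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  FATALITIES = c(2, 1, 3, 0),
  INJURIES = c(10, 0, 5, 0),
  stringsAsFactors = FALSE
)

# Sum each impact column within each event type
totals <- aggregate(toy[c("FATALITIES", "INJURIES")], list(toy$EVTYPE), sum)

# One row per event type, with the event name in column Group.1
print(totals)
```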

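# Delete rows with zero in all columns
damages <- damages[rowSums(damages[, -1] > 0) != 0, ]
# Sum damage on property and crops
# (raw PROPDMG and CROPDMG values; the PROPDMGEXP/CROPDMGEXP magnitude codes are not applied)
damages$finantial <- damages$PROPDMG + damages$CROPDMG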
damages$PROPDMG<-NULL
damages$CROPDMG<-NULL
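
In the NOAA storm data, the PROPDMG and CROPDMG figures are paired with magnitude codes in the PROPDMGEXP and CROPDMGEXP columns (K for thousands, M for millions, B for billions), which the subset above does not carry over. The following is a minimal sketch, on made-up rows, of how such codes could be applied before summing. The helper exp_multiplier is hypothetical, and treating any other or missing code as a multiplier of 1 is an assumption, not something the documentation prescribes.

```r
# Hypothetical helper: translate a magnitude code into a numeric multiplier
exp_multiplier <- function(exp_code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(exp_code))]
  # Assumption: blank or unrecognised codes leave the value unchanged
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Made-up rows that mimic the damage columns
toy <- data.frame(
  PROPDMG = c(25, 2.5, 10, 1),
  PROPDMGEXP = c("K", "m", "", "B"),
  stringsAsFactors = FALSE
)

# Scale each recorded value by its multiplier to get an amount in dollars
toy$PROPDMG_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

Applying the same conversion to crop damage and then summing the two scaled columns would give totals in dollars, which could change the ranking of event types relative to the unscaled sums.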

Results

(A) Analysis of Fatalities by weather Events

  1. Top 10 weather events classified by number of Fatalities
myData<-data.frame(damages$Group.1,damages$FATALITIES)
names(myData)<-c("x","y")
#Order descending
myData <- myData[order(-myData$y),]
#deleting zero
myData<-myData[(myData[, -1] >0),]
#keep top 10 Events
myData<-head(myData,10)
# Bar chart of the top 10 events; las = 3 draws the axis labels vertically
barplot(myData$y, las=3, names.arg = myData$x, main = "Top 10 Highest Fatalities", ylab = "Number of Fatalities", col = "blue")

Analysis of Injuries by weather Events

  1. Top 10 weather events classified by number of Injuries - tornadoes caused the most injuries
myData<-data.frame(damages$Group.1,damages$INJURIES)
names(myData)<-c("x","y")
#Order descending
myData <- myData[order(-myData$y),]
#deleting zero
myData<-myData[(myData[, -1] >0),]
#keep top 10 Events
myData<-head(myData,10)
# Bar chart of the top 10 events; las = 3 draws the axis labels vertically
barplot(myData$y, las=3, names.arg = myData$x, main = "Top 10 Highest Injuries", ylab = "Number of Injuries", col = "green")

(B) Analysis of Economic damage caused by weather Events

Top 10 weather events classified by economic damage to crops and properties

myData<-data.frame(damages$Group.1,damages$finantial)
names(myData)<-c("x","y")
#Order descending
myData <- myData[order(-myData$y),]
#deleting zero
myData<-myData[(myData[, -1] >0),]
#keep top 10 Events
myData<-head(myData,10)
# Bar chart of the top 10 events; las = 3 draws the axis labels vertically
barplot(myData$y, las=3, names.arg = myData$x, main = "Top 10 Highest Economic Damages", ylab = "Economic damage ($)", col = "red")
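
The three result sections repeat the same steps: pick one column, drop zeros, sort in descending order, and keep the top 10. One way to factor this out is a small helper, sketched here on made-up totals. The function top_events and the toy_totals data frame are hypothetical and not part of the original analysis.

```r
# Hypothetical helper: top n event types for one column of the aggregated table
top_events <- function(totals, value_col, n = 10) {
  out <- data.frame(
    x = totals$Group.1,
    y = totals[[value_col]],
    stringsAsFactors = FALSE
  )
  out <- out[out$y > 0, ]        # drop event types with no recorded impact
  out <- out[order(-out$y), ]    # largest totals first
  head(out, n)
}

# Made-up aggregated totals in the same shape as the damages table
toy_totals <- data.frame(
  Group.1 = c("FLOOD", "HAIL", "TORNADO"),
  FATALITIES = c(1, 0, 5),
  stringsAsFactors = FALSE
)

top <- top_events(toy_totals, "FATALITIES", n = 2)
print(top)
# The result can be passed to barplot() as before, e.g.
# barplot(top$y, las = 3, names.arg = top$x, ylab = "Number of Fatalities")
```

With such a helper, each result section would reduce to one call plus its barplot() line, and a change to the filtering or ordering rule would only need to be made once.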