Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Assignment

The basic goal of this assignment is to explore the NOAA Storm Database and answer some basic questions about severe weather events.

  1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

  2. Across the United States, which types of events have the greatest economic consequences?

Data processing

Loading the data

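# Create a local data directory if it does not exist yet
if (!file.exists("./data")) { dir.create("./data") }
# Author's local working directory; adjust this path for your own machine
setwd("C:/Raghu/Rscipts/data")
# Windows-only setting that older R versions needed for HTTPS downloads
setInternet2(use = TRUE)
fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(fileUrl, destfile = "./repdata_data_StormData.csv.bz2", method = "auto")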

library(R.utils)  # provides bunzip2()
## Warning: package 'R.utils' was built under R version 3.2.2
## Warning: package 'R.oo' was built under R version 3.2.2
## Warning: package 'R.methodsS3' was built under R version 3.2.2
# Decompress the archive, then read the resulting CSV into a data frame
bunzip2("repdata_data_StormData.csv.bz2", overwrite = TRUE)
stormdata <- read.csv("repdata_data_StormData.csv")
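
As an aside, the explicit decompression step may be optional: base R's file connections detect bzip2 compression when reading, so read.csv() can usually be pointed at the .bz2 file directly. A minimal sketch on a small temporary file (the toy data and the temporary file are illustrative assumptions, not part of the storm data):

```r
# Toy data frame standing in for the storm data (values are made up)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))

# Write it to a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path through file(), which detects the compression
toy_read <- read.csv(tmp, stringsAsFactors = FALSE)
print(toy_read)
```

Applied to this report, that would mean passing the downloaded .bz2 path to read.csv() and skipping bunzip2(), at the cost of decompressing again on every read.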

Processing

How the data are processed for analysis.

Human impact is summarised in the damages data frame, which sums fatalities and non-fatal injuries by event type.

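Economic impact is summarised in the economic data frame. Property and crop damage amounts are first multiplied by the magnitude encoded in their exponent columns (PROPDMGEXP and CROPDMGEXP; for example K = 1,000, M = 1,000,000 and B = 1,000,000,000), then summed by event type.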
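
The exponent conversion can be illustrated on a few invented records. This is a sketch using a base R lookup vector rather than the Recode() call used in this report; the column PROPDMG_USD and the choice to treat blank or unknown codes as a multiplier of 0 are assumptions made for the example (the latter mirrors how the crop recode handles a blank code).

```r
# Toy records; the values are invented for illustration
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 10, 5),
  PROPDMGEXP = c("K", "M", "", "B"),
  stringsAsFactors = FALSE
)

# Named lookup from exponent code to multiplier (a subset of the codes)
multiplier <- c("H" = 1e2, "K" = 1e3, "M" = 1e6, "B" = 1e9)

# Upper-case the codes so "k"/"K" and "m"/"M" are treated alike
mult <- multiplier[toupper(toy$PROPDMGEXP)]
mult[is.na(mult)] <- 0  # assumption: codes not in the table (including "") count as 0

# Hypothetical output column holding the damage in dollars
toy$PROPDMG_USD <- toy$PROPDMG * unname(mult)
print(toy)
```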

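Two smaller data frames, dam and eco, keep only the top 10 event types for human and economic impact respectively, reshaped to long form for plotting. dam is ranked by fatalities, then injuries; eco is ranked by property damage, then crop damage.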
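
The aggregate, rank and reshape steps can be sketched on toy data. The values below are invented, the example keeps the top 3 rather than the top 10, and the long-form reshape is written in base R as a stand-in for melt():

```r
# Toy event records (invented values)
toy <- data.frame(
  EVTYPE     = c("Tornado", "Flood", "Tornado", "Heat", "Flood", "Hail"),
  FATALITIES = c(5, 2, 3, 4, 1, 0),
  INJURIES   = c(20, 1, 10, 0, 3, 2),
  stringsAsFactors = FALSE
)

# Sum both casualty columns per event type
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by fatalities, breaking ties by injuries, and keep the top 3
top <- head(totals[order(-totals$FATALITIES, -totals$INJURIES), ], 3)

# Reshape to long form: one row per event type and measure
top_long <- data.frame(
  EVTYPE   = rep(top$EVTYPE, times = 2),
  variable = rep(c("FATALITIES", "INJURIES"), each = nrow(top)),
  value    = c(top$FATALITIES, top$INJURIES)
)
print(top_long)
```

Because the ranking is driven by the first sort key, an event type with many injuries but few fatalities can fall outside the top group.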

library(Hmisc)
## Warning: package 'Hmisc' was built under R version 3.2.1
## Warning: package 'Formula' was built under R version 3.2.1
## Warning: package 'ggplot2' was built under R version 3.2.1
library(reshape)
## Warning: package 'reshape' was built under R version 3.2.2
library(ggplot2)
library(car)
## Warning: package 'car' was built under R version 3.2.2
stormdata$EVTYPE <- capitalize(tolower(stormdata$EVTYPE))

damages<-aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE , stormdata, sum)
dam<-melt(head(damages[order(-damages$FATALITIES,-damages$INJURIES),],10))

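# Convert damage amounts to dollars using the magnitude codes in the *EXP columns.
# Blank and symbol codes ('', '-', '?', '+') are given a multiplier of 0; '' is mapped
# explicitly in both recodes so that blank codes do not become NA.
stormdata$PROPDMG <- stormdata$PROPDMG * as.numeric(Recode(stormdata$PROPDMGEXP,
  "'0'=1;'1'=10;'2'=100;'3'=1000;'4'=10000;'5'=100000;'6'=1000000;'7'=10000000;'8'=100000000;'B'=1000000000;'h'=100;'H'=100;'K'=1000;'m'=1000000;'M'=1000000;''=0;'-'=0;'?'=0;'+'=0",
  as.factor.result = FALSE))
stormdata$CROPDMG <- stormdata$CROPDMG * as.numeric(Recode(stormdata$CROPDMGEXP,
  "'0'=1;'2'=100;'B'=1000000000;'k'=1000;'K'=1000;'m'=1000000;'M'=1000000;''=0;'?'=0",
  as.factor.result = FALSE))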

economic<-aggregate(cbind(PROPDMG, CROPDMG) ~ EVTYPE , stormdata, sum)
eco<-melt(head(economic[order(-economic$PROPDMG,-economic$CROPDMG),],10))

Results

Human casualties

  • Question: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

Using the ggplot2 library, we present a stacked horizontal bar chart of fatalities (Deaths) and non-fatal injuries (Injuries) by event type.

ggplot(dam, aes(x=EVTYPE,y=value,fill=variable)) + geom_bar(stat = "identity") + coord_flip() +
  ggtitle("Harmful events") + labs(x = "", y="number of people impacted") +
  scale_fill_manual (values=c("orange","black"), labels=c("Deaths","Injuries"))

Economic impact

  • Question: Across the United States, which types of events have the greatest economic consequences?

Using the ggplot2 library, we present a stacked horizontal bar chart of property and crop damage by event type.

ggplot(eco, aes(x=EVTYPE,y=value,fill=variable)) + geom_bar(stat = "identity") + coord_flip() +
  ggtitle("Economic consequences") + labs(x = "", y="cost of damages in dollars") +
  scale_fill_manual (values=c("orange","black"), labels=c("Property Damage","Crop Damage"))
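
Because EVTYPE is a character column, ggplot2 places the bars in alphabetical order rather than by size. One possible refinement, not used in the charts above, is to turn the event type into a factor ordered by its total with base R's reorder(). A sketch on invented long-form data with the same columns as dam and eco:

```r
# Toy long-form data in the same shape as dam/eco (values are invented)
toy <- data.frame(
  EVTYPE   = rep(c("Flood", "Heat", "Tornado"), times = 2),
  variable = rep(c("FATALITIES", "INJURIES"), each = 3),
  value    = c(3, 4, 8, 4, 0, 30),
  stringsAsFactors = FALSE
)

# reorder() returns a factor whose levels are sorted by the per-event total,
# so a plot that maps it to the x axis draws the bars in that order
toy$EVTYPE <- reorder(toy$EVTYPE, toy$value, FUN = sum)
print(levels(toy$EVTYPE))
```

In the plotting calls this would amount to using aes(x = reorder(EVTYPE, value, FUN = sum), ...); with coord_flip() the event type with the largest total would then appear at the top.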