1. Title: Reproducible Research Project 2

author: “Farzad Ravari”

date: “April 12, 2017”

output: html_document

Brief Summary

The basic goal of this assignment is to explore the NOAA Storm Database and answer some basic questions about severe weather events. You must use the database to answer the questions below and show the code for your entire analysis. Your analysis can consist of tables, figures, or other summaries. You may use any R package you want to support your analysis.

2. Synopsis

The National Oceanic and Atmospheric Administration (NOAA) maintains a public database of storm events. The data record the type of each storm event along with details such as location, date, estimated property damage, and the number of human victims. The analysis must address the following questions: a) Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health? b) Across the United States, which types of events have the greatest economic consequences?

3. Data Processing

Set Directory

getwd()

[1] "C:/Users/farza/OneDrive/Documents"

setwd("C:/Users/farza/Desktop/Data Science/Course 5/Project 2/data/data")

getwd()

[1] "C:/Users/farza/Desktop/Data Science/Course 5/Project 2/data/data"

Load necessary libraries

library(RCurl) # for loading external dataset 
library(plyr) # for count & aggregate method
library(reshape2) # for melt 
library(ggplot2) # for plots
library(grid) # for grids
library(gridExtra) # for advanced plots
library(scales) # for plot scaling

Load files and extract the files

filePath <- "C:/Users/farza/Desktop/Data Science/Course 5/Project 2/data/data/StormData.csv.bz2"
destPath <- "C:/Users/farza/Desktop/Data Science/Course 5/Project 2/data/data/StormData.csv"
if (!file.exists(destPath)) {
  # unzip() only handles .zip archives and has no `remove` argument;
  # bunzip2() from the R.utils package decompresses .bz2 files
  R.utils::bunzip2(filePath, destPath, overwrite = TRUE, remove = FALSE)
}
  

Reading Data from Destination path

storm <- read.csv("C:/Users/farza/Desktop/Data Science/Course 5/Project 2/data/data/StormData.csv")
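
As a side note to the two steps above, base R can read a bzip2-compressed CSV directly, so the explicit extraction is optional. The sketch below uses a small made-up file in a temporary location (the file and its two rows are assumptions for illustration, not the NOAA data) to show that read.csv() detects the compression on its own.

```r
# Toy stand-in for StormData.csv.bz2 (hypothetical file, written to a temp path)
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
write.csv(data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                     FATALITIES = c(3, 1)),
          con, row.names = FALSE)
close(con)

# read.csv() opens the file in read mode, which detects bzip2 compression,
# so no separate decompression step is needed
toy <- read.csv(tmp)
str(toy)
```

Applied to this report, the same idea would be read.csv() called on the .bz2 path, at the cost of decompressing on every read.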

Extract the variables needed to assess health and economic impact

event <- c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP",
           "CROPDMG", "CROPDMGEXP")
data <- storm[event]

Evaluate property damage

unique(data$PROPDMGEXP)
[1] K M   B m + 0 5 6 ? 4 2 3 h 7 H - 1 8
Levels:  - ? + 0 1 2 3 4 5 6 7 8 B h H K m M

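# K, M, m, B, blank, 0 and 5 also appear in the output above. They are mapped
# here the same way as in the crop section (5 follows the power-of-ten pattern
# of the other digits); without them most rows would get an NA multiplier.
data$PROPEXP[data$PROPDMGEXP == "K"] <- 1000
data$PROPEXP[data$PROPDMGEXP == "M"] <- 1e+06
data$PROPEXP[data$PROPDMGEXP == "m"] <- 1e+06
data$PROPEXP[data$PROPDMGEXP == "B"] <- 1e+09
data$PROPEXP[data$PROPDMGEXP == ""] <- 1
data$PROPEXP[data$PROPDMGEXP == "0"] <- 1
data$PROPEXP[data$PROPDMGEXP == "5"] <- 1e+05
data$PROPEXP[data$PROPDMGEXP == "6"] <- 1e+06
data$PROPEXP[data$PROPDMGEXP == "4"] <- 10000
data$PROPEXP[data$PROPDMGEXP == "2"] <- 100
data$PROPEXP[data$PROPDMGEXP == "3"] <- 1000
data$PROPEXP[data$PROPDMGEXP == "h"] <- 100
data$PROPEXP[data$PROPDMGEXP == "7"] <- 1e+07
data$PROPEXP[data$PROPDMGEXP == "H"] <- 100
data$PROPEXP[data$PROPDMGEXP == "1"] <- 10
data$PROPEXP[data$PROPDMGEXP == "8"] <- 1e+08
data$PROPEXP[data$PROPDMGEXP == "+"] <- 0
data$PROPEXP[data$PROPDMGEXP == "-"] <- 0
data$PROPEXP[data$PROPDMGEXP == "?"] <- 0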

Calculate Property damage value

data$PROPDMGVAL <- data$PROPDMG * data$PROPEXP
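
The code-by-code assignments above can also be written as a single lookup. The following is a minimal sketch on made-up rows, not the method used in this report: the name exp_map and the toy values are assumptions, and only a few of the exponent codes are included to keep it short.

```r
# Hypothetical lookup table from exponent code to multiplier
exp_map <- c("K" = 1e3, "k" = 1e3, "M" = 1e6, "m" = 1e6, "B" = 1e9,
             "H" = 1e2, "h" = 1e2, "0" = 1, "?" = 0)

# Made-up rows for illustration only
toy <- data.frame(PROPDMG    = c(25, 2.5, 1, 10, 5),
                  PROPDMGEXP = c("K", "M", "B", "?", ""),
                  stringsAsFactors = FALSE)

# as.character() guards against factor columns, which would index by position
toy$PROPEXP <- unname(exp_map[as.character(toy$PROPDMGEXP)])

# An empty string never matches a vector name, so the blank code is set separately
toy$PROPEXP[toy$PROPDMGEXP == ""] <- 1

toy$PROPDMGVAL <- toy$PROPDMG * toy$PROPEXP
toy
```

Any code missing from the lookup yields NA, which makes unmapped codes easy to spot with is.na() before the multiplication.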

Evaluate crop damage

unique(data$CROPDMGEXP)
[1]   M K m B ? 0 k 2
Levels:  ? 0 2 B k K m M

data$CROPEXP[data$CROPDMGEXP == "M"] <- 1e+06
data$CROPEXP[data$CROPDMGEXP == "K"] <- 1000
data$CROPEXP[data$CROPDMGEXP == "m"] <- 1e+06
data$CROPEXP[data$CROPDMGEXP == "B"] <- 1e+09
data$CROPEXP[data$CROPDMGEXP == "0"] <- 1
data$CROPEXP[data$CROPDMGEXP == "k"] <- 1000
data$CROPEXP[data$CROPDMGEXP == "2"] <- 100
data$CROPEXP[data$CROPDMGEXP == ""] <- 1
data$CROPEXP[data$CROPDMGEXP == "?"] <- 0

Calculate crop damage value

data$CROPDMGVAL <- data$CROPDMG * data$CROPEXP

Calculate totals by event type

fatal <- aggregate(FATALITIES ~ EVTYPE, data, FUN = sum)
injury <- aggregate(INJURIES ~ EVTYPE, data, FUN = sum)
propdmg <- aggregate(PROPDMGVAL ~ EVTYPE, data, FUN = sum)
cropdmg <- aggregate(CROPDMGVAL ~ EVTYPE, data, FUN = sum)
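
One detail of the formula interface used above is worth illustrating: by default aggregate() drops rows where the summed variable is NA, so any row whose damage value could not be computed is silently left out of the totals. The toy data frame below is an assumption for illustration only.

```r
# Made-up damage values; the second FLOOD row has no usable value
toy <- data.frame(EVTYPE     = c("FLOOD", "FLOOD", "TORNADO", "TORNADO"),
                  PROPDMGVAL = c(1000, NA, 500, 250))

# The formula method uses na.action = na.omit, so the NA row is excluded
totals <- aggregate(PROPDMGVAL ~ EVTYPE, toy, FUN = sum)
totals

# Counting the NA rows beforehand shows how much data the totals leave out
sum(is.na(toy$PROPDMGVAL))
```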
  

Plot the events with highest fatalities and highest injuries

fatal8 <- fatal[order(-fatal$FATALITIES), ][1:8, ]
injury8 <- injury[order(-injury$INJURIES), ][1:8, ]
par(mfrow = c(1, 2), mar = c(12, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.8)
barplot(fatal8$FATALITIES, las = 3, names.arg = fatal8$EVTYPE,
        main = "Highest Fatalities Events", ylab = "Fatalities No.", col = "red")
barplot(injury8$INJURIES, las = 3, names.arg = injury8$EVTYPE,
        main = "Highest Injuries Events", ylab = "Injuries No.", col = "purple")
Figure: bar plots of the eight event types with the highest fatalities and the highest injuries.

Plot the events with highest Property damage and highest crop damage

propdmg8 <- propdmg[order(-propdmg$PROPDMGVAL), ][1:8, ]
cropdmg8 <- cropdmg[order(-cropdmg$CROPDMGVAL), ][1:8, ]
par(mfrow = c(1, 2), mar = c(12, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.8)
barplot(propdmg8$PROPDMGVAL/(10^9), las = 3, names.arg = propdmg8$EVTYPE,
        main = "Highest Property Damages Events", ylab = "Damage Cost ($ billions)", col = "grey")
barplot(cropdmg8$CROPDMGVAL/(10^9), las = 3, names.arg = cropdmg8$EVTYPE,
        main = "Highest Crop Damages Events", ylab = "Damage Cost ($ billions)", col = "orange")
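Figure: bar plots of the eight event types with the highest property damage and the highest crop damage, in billions of dollars.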

4. Results

Fatalities and injuries:

Tornadoes cause the most fatalities and the most injuries. Excessive heat ranks second for fatalities, and thunderstorm wind ranks second for injuries.

Property and crop damage:

Property damage is caused mainly by floods, followed by hurricanes/typhoons. Crop damage is caused mainly by drought, followed by floods.