Exploratory Analysis of the NOAA Storm Data

Synopsis

The goal of this assignment is to explore data from the National Oceanic and Atmospheric Administration (NOAA) storm database in order to establish which severe weather events have the greatest impact on public health and economic activity in the US.

Data Processing

We begin by preparing the data: we clear the environment, set the working directory, download the compressed file, and read it into R (read.csv decompresses the .bz2 file on the fly).

rm(list = ls()) #clean up working environment

setwd("H:/Data Science/Reproducible Research/week4/Assignment/") #set working directory

# Load required libraries
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.5
# Initialize variable with the file download URL
fileURL <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"


# Create the data directory if it does not exist yet
if (!file.exists("Data")) {
      dir.create("Data")
}
# Download the data file only if it is not already present
if (!file.exists("Data/StormData.csv.bz2")) {
      download.file(fileURL, destfile = "Data/StormData.csv.bz2")
}

# Load the data into R
StormData <- read.csv("Data/StormData.csv.bz2", sep = ",", header = TRUE, quote = "\"")
## Warning in scan(file, what, nmax, sep, dec, quote, skip, nlines,
## na.strings, : EOF within quoted string
#Prepare data for analysis
subsetSD <- StormData[,c('EVTYPE','FATALITIES','INJURIES', 'PROPDMG', 'PROPDMGEXP', 'CROPDMG', 'CROPDMGEXP')]

# Create a numeric property damage variable and initialize it to 0
subsetSD$PROPDMGNUM = 0
# Apply the H, K, M, B exponent codes (hundreds, thousands, millions, billions) to Property Damage
subsetSD[subsetSD$PROPDMGEXP == "H", ]$PROPDMGNUM = subsetSD[subsetSD$PROPDMGEXP == "H", ]$PROPDMG * 10^2
subsetSD[subsetSD$PROPDMGEXP == "K", ]$PROPDMGNUM = subsetSD[subsetSD$PROPDMGEXP == "K", ]$PROPDMG * 10^3
subsetSD[subsetSD$PROPDMGEXP == "M", ]$PROPDMGNUM = subsetSD[subsetSD$PROPDMGEXP == "M", ]$PROPDMG * 10^6
subsetSD[subsetSD$PROPDMGEXP == "B", ]$PROPDMGNUM = subsetSD[subsetSD$PROPDMGEXP == "B", ]$PROPDMG * 10^9
# Apply the H, K, M, B exponent codes to Crop Damage, starting from 0
subsetSD$CROPDMGNUM = 0
subsetSD[subsetSD$CROPDMGEXP == "H", ]$CROPDMGNUM = subsetSD[subsetSD$CROPDMGEXP == "H", ]$CROPDMG * 10^2
subsetSD[subsetSD$CROPDMGEXP == "K", ]$CROPDMGNUM = subsetSD[subsetSD$CROPDMGEXP == "K", ]$CROPDMG * 10^3
subsetSD[subsetSD$CROPDMGEXP == "M", ]$CROPDMGNUM = subsetSD[subsetSD$CROPDMGEXP == "M", ]$CROPDMG * 10^6
subsetSD[subsetSD$CROPDMGEXP == "B", ]$CROPDMGNUM = subsetSD[subsetSD$CROPDMGEXP == "B", ]$CROPDMG * 10^9
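
The eight assignments above all apply the same rule, so the rule can also be written as a single lookup. The following is a minimal sketch on toy data, not the code used to produce this report. The helper name exp_multiplier and the toy data frame are hypothetical. It also assumes that lowercase codes such as "k" should count like their uppercase forms, which differs from the code above, where they are left at 0.

```r
# Hypothetical helper: map an exponent code (H, K, M, B) to its multiplier.
# Assumption: lowercase codes are treated like uppercase ones.
# Any other code (blank, digits, "?", "+", "-") maps to 0, matching the
# initialize-to-0 behaviour of the code above.
exp_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 0
  unname(mult)
}

# Toy rows standing in for the PROPDMG / PROPDMGEXP columns (illustrative values)
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10, 3),
  PROPDMGEXP = c("K", "M", "B", "", "k"),
  stringsAsFactors = FALSE
)

# One vectorised step replaces the four per-code assignments
toy$PROPDMGNUM <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

The same helper could be applied to CROPDMG and CROPDMGEXP. Whether unrecognised codes should really count as 0 is a modelling choice inherited from the code above.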

Results

For this analysis, we look at weather events causing death, injury and economic destruction.

Severe weather conditions causing death

To establish the major causes of death, we look at the number of fatalities attributed to each event type.
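
One way to rank event types by fatalities is to sum the FATALITIES column per EVTYPE and sort the totals. The sketch below uses a small made-up data frame in place of subsetSD. The values are illustrative and are not NOAA figures, and the names toy, fatal_by_event and top_fatal are hypothetical.

```r
# Toy rows standing in for subsetSD; the numbers are invented for illustration
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "HAIL"),
  FATALITIES = c(3, 1, 2, 4, 0, 0),
  stringsAsFactors = FALSE
)

# Sum fatalities per event type
fatal_by_event <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# Sort from the highest total to the lowest
fatal_by_event <- fatal_by_event[order(-fatal_by_event$FATALITIES), ]

# Keep the top n event types (n = 3 is an arbitrary choice for this sketch)
top_fatal <- head(fatal_by_event, 3)
print(top_fatal)
```

The same pattern applies to the INJURIES column by swapping the column name in the formula.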

Severe weather conditions causing injuries

Additionally, we look at the events that cause the most injuries.

Severe weather conditions causing economic damage

In order to understand the extent of economic damage, we sum the value of both crop and property damage for each event type.
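
A minimal sketch of this step, assuming the numeric columns PROPDMGNUM and CROPDMGNUM built during data processing: both columns are summed per event type and then added into a total. The toy values are invented for illustration, and the names damage_by_event and TOTALDMG are hypothetical.

```r
# Toy rows standing in for subsetSD; the dollar values are invented
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "FLOOD", "DROUGHT"),
  PROPDMGNUM = c(5e6, 2e6, 1e6, 0),
  CROPDMGNUM = c(1e6, 0, 5e5, 3e6),
  stringsAsFactors = FALSE
)

# Sum property and crop damage separately for each event type
damage_by_event <- aggregate(cbind(PROPDMGNUM, CROPDMGNUM) ~ EVTYPE,
                             data = toy, FUN = sum)

# Combine the two components into one total per event type
damage_by_event$TOTALDMG <- damage_by_event$PROPDMGNUM + damage_by_event$CROPDMGNUM

# Sort from the most to the least costly event type
damage_by_event <- damage_by_event[order(damage_by_event$TOTALDMG, decreasing = TRUE), ]
print(damage_by_event)
```

Keeping the property and crop columns alongside the total makes it possible to see which component drives the ranking for a given event type.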

As it turns out, floods cause the most combined crop and property damage.