This is an analysis of sample data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. In this brief analysis I present the weather events with the greatest impact from a health and an economic perspective. Two two-panel plots are used, one for each perspective. Both the source code and the results are shown in this document.
library(knitr)
library(dplyr)
library(plyr)
opts_chunk$set(echo=TRUE)
This block of code makes the analysis fully automated. The dataset is downloaded from the online source only if it does not already exist in the current working directory. The data is then loaded into an R data frame.
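# Download the raw data only if it is not already in the working directory
if (!file.exists('stormData.csv.bz2')) {
    download.file('https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2',
                  destfile = 'stormData.csv.bz2')
}
# Read the data only if the 'storm' data frame is not already in memory
if (!exists('storm')) {
    storm <- read.csv(bzfile("stormData.csv.bz2"))
}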
This block of code creates the data frames used to analyze the fatalities and injuries caused by the weather events. The first step is to sum both counts by event type. The result is then sorted twice, once by fatalities and once by injuries, and stored in two data frames. I have kept only the top 5 of each so we can focus on the most dangerous events.
casualties <- ddply(storm, .(EVTYPE), summarize,
fatalities = sum(FATALITIES),
injuries = sum(INJURIES))
# Find the events that caused the most deaths and injuries
fatal_events <- head(casualties[order(casualties$fatalities, decreasing = TRUE), ], 5)
injury_events <- head(casualties[order(casualties$injuries, decreasing = TRUE), ], 5)
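The sum-then-rank step can be tried on a small made-up data frame. The sketch below uses base R's `aggregate` rather than `ddply`, and the `toy` data frame and its values are invented for illustration; they are not taken from the storm database.

```r
# Toy data: invented values, not taken from the NOAA database
toy <- data.frame(
    EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD", "HAIL"),
    FATALITIES = c(3, 1, 2, 4, 0, 0),
    INJURIES   = c(10, 2, 5, 1, 3, 0)
)

# Sum fatalities and injuries for each event type
toy_casualties <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE,
                            data = toy, FUN = sum)

# Sort by total fatalities, largest first, and keep the top 2
toy_top <- head(toy_casualties[order(toy_casualties$FATALITIES,
                                     decreasing = TRUE), ], 2)
print(toy_top)
```

In this toy example the two tornado rows are combined into a single total before ranking, which is the same idea the analysis applies to the full dataset.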
# 1. Convert the damage exponent codes to powers of ten
exp_transform <- function(e) {
    e <- as.character(e)  # guard against factor input
    # h -> hundred, k -> thousand, m -> million, b -> billion
    if (e %in% c('h', 'H'))
        return(2)
    else if (e %in% c('k', 'K'))
        return(3)
    else if (e %in% c('m', 'M'))
        return(6)
    else if (e %in% c('b', 'B'))
        return(9)
    else if (e %in% c('', '-', '?', '+'))  # blank or symbol: treat as 10^0
        return(0)
    else if (grepl('^[0-9]+$', e))  # a digit is used as the exponent itself
        return(as.numeric(e))
    else {
        stop("Invalid exponent value.")
    }
}
prop_dmg_exp <- sapply(storm$PROPDMGEXP, FUN=exp_transform)
storm$prop_dmg <- storm$PROPDMG * (10 ^ prop_dmg_exp)
crop_dmg_exp <- sapply(storm$CROPDMGEXP, FUN=exp_transform)
storm$crop_dmg <- storm$CROPDMG * (10 ^ crop_dmg_exp)
# 2. Sum the property and crop damage by event type
econ_loss <- ddply(storm, .(EVTYPE), summarize,
                   prop_dmg = sum(prop_dmg),
                   crop_dmg = sum(crop_dmg))
# 3. Drop events that caused no damage, then keep the top 5 of each kind
econ_loss <- econ_loss[(econ_loss$prop_dmg > 0 | econ_loss$crop_dmg > 0), ]
prop_dmg_events <- head(econ_loss[order(econ_loss$prop_dmg, decreasing = TRUE), ], 5)
crop_dmg_events <- head(econ_loss[order(econ_loss$crop_dmg, decreasing = TRUE), ], 5)
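Calling `exp_transform` once per row through `sapply` can be slow on a table of this size. A possible alternative, sketched here with the same code-to-exponent mapping, is a vectorized helper. Note that `exp_lookup` is a hypothetical name that is not part of the original analysis, and that unlike `exp_transform` it returns `NA` for an unknown code instead of stopping.

```r
# Hypothetical vectorized counterpart of exp_transform (illustration only)
exp_lookup <- function(e) {
    e <- toupper(as.character(e))
    # Letter codes map to fixed powers of ten
    letters_map <- c(H = 2, K = 3, M = 6, B = 9)
    out <- unname(letters_map[e])            # NA where e is not a letter code
    # A digit string is used as the exponent itself
    is_digit <- grepl("^[0-9]+$", e)
    out[is_digit] <- as.numeric(e[is_digit])
    # Blank and symbol codes are treated as 10^0
    out[e %in% c("", "-", "?", "+")] <- 0
    out
}

# Made-up sample of exponent codes
codes <- c("K", "m", "", "5", "B", "?")
exp_lookup(codes)
```

With such a helper, an expression like `storm$PROPDMG * 10 ^ exp_lookup(storm$PROPDMGEXP)` would compute the damage in one vectorized step, assuming the column contains no codes outside the ones handled above.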
Show data in pie chart
#Generate the plot
par(mfrow = c(2, 1), mar = c(0, 0, 2, 0), oma = c(0, 0, 0, 0))
pie(fatal_events$fatalities, main="Top 5 Most Fatal Events",
    labels=fatal_events$EVTYPE, col=2:6)
pie(injury_events$injuries, main="Top 5 Events Resulting in Injury",
    labels=injury_events$EVTYPE, col=7:11)
box(lty = '1373', which="outer")
Figure 1 - Top 5 fatalities and injuries
The results show that tornadoes are by far the most dangerous natural disaster. Tornadoes caused at least 50% more fatalities and injuries than any other event type.
Show data as horizontal barplot
#most_econ_dmg <- rbind(prop_dmg_events, crop_dmg_events)
par(mfrow = c(2, 1), mar = c(4.5, 11, 2, 0.5), oma = c(0, 0, 2, 0))
barplot(log10(prop_dmg_events$prop_dmg), names.arg=prop_dmg_events$EVTYPE
,main="Property Damage"
,col="purple"
,las=1
,horiz=T)
barplot(crop_dmg_events$crop_dmg, names.arg=crop_dmg_events$EVTYPE
,main="Crop Damage"
,las=2
,col="green"
, horiz=T)
title(main="Events with the greatest economic consequences", outer=T)
box(lty = '1373', which="outer")
Figure 2 - Top 5 property and crop damages
Property damages are shown on a logarithmic (base 10) scale because of the large range of values. The property damages are perhaps the least skewed of this analysis. Flash floods have done the most damage, but there is no major difference in the losses incurred by the top 5 events. Unsurprisingly, drought has caused the most damage to crops.
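The effect of the logarithmic scale can be checked on a few numbers. The amounts in this small sketch are invented for illustration and are not results from the storm data.

```r
# Invented dollar amounts spanning several orders of magnitude
damage <- c(5e6, 2e8, 1.5e11)

# Ratio between the largest and the smallest raw value
max(damage) / min(damage)

# After log10 the values sit within a few units of each other,
# so all bars remain visible on one axis
round(log10(damage), 2)
```

A difference of one unit on a base 10 logarithmic axis corresponds to a tenfold difference in the underlying amount, so bars of similar length can still represent quite different dollar losses.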
If you see a tornado coming, RUN! Analysis has shown that tornadoes are the most dangerous natural disaster in the US.