Synopsis
The goal of this project is to use the NOAA (US National Oceanic and Atmospheric Administration) storm database to track the characteristics and trends of major storms and weather events. The overall objective is to assess which events have the greatest impact on public health and economic development. Total casualties and property damage for each weather event type are compared to reach a conclusion.
Introduction
Storms and severe weather can cause economic and public health catastrophes. Such events can result in casualties, property damage, and economic deterioration. Preventing these negative consequences is a key concern.
This report explores the NOAA storm database, which tracks the characteristics of major storms and weather-related events in the US. The database also records when and where these events occurred, the casualties they caused, and the economic damage they inflicted.
The analysis addresses the following questions:
Which types of weather events (EVTYPE) are most harmful to public health across the US, judged by the total casualties for each event type? Secondly, which events most negatively affect economic development, judged by crop and property damage across the US?
The analysis aggregates the data by storm type.
It was concluded that tornadoes are the most damaging to public health (using fatalities and injuries as the measure). Floods are the most damaging economically.
Data Processing
The data was first downloaded and unzipped, and the necessary packages were loaded to carry out the analysis.
library(ggplot2) # for ggplot
library(R.utils) # for bunzip2
library(scales) # for commas in the ggplot labels
library(knitr) # to knit the analysis
# read the file
initial <- read.csv("repdata%2Fdata%2FStormData.csv")
#isolate the necessary columns for analysis
initial <- initial[, c('EVTYPE', 'FATALITIES', 'INJURIES', 'PROPDMG', 'PROPDMGEXP', 'CROPDMG', 'CROPDMGEXP') ]
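The download and decompression step described at the start of this section is not shown in the code. A minimal sketch of one way to do it follows. The helper name `load_compressed_csv` and its `url` argument are hypothetical and not part of the original analysis, and the sketch relies on base R detecting bzip2 compression when reading, rather than on `bunzip2` from R.utils.

```r
# Hypothetical helper: download (only if needed) and read a bzip2-compressed CSV.
load_compressed_csv <- function(bz2_path, url = NULL) {
  # Download only when a URL is supplied and the file is not already present
  if (!file.exists(bz2_path) && !is.null(url)) {
    download.file(url, destfile = bz2_path, mode = "wb")
  }
  # read.csv() opens the path with file(), which detects bzip2 compression
  read.csv(bz2_path, stringsAsFactors = FALSE)
}

# Self-check on a small temporary compressed file (made-up rows, no network)
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, "w")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,3", "FLOOD,1"), con)
close(con)
demo <- load_compressed_csv(tmp)
```

Reading the compressed file directly avoids keeping a large uncompressed copy on disk. Decompressing first with `bunzip2` and then calling `read.csv` on the result, as the report does, gives the same data frame.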
# Per the documentation, convert units to correctly calculate damage to property
# H=Hundreds, K=Thousands, M=Millions, B=Billions
initial$pd <- 0
initial[initial$PROPDMGEXP == "H", ]$pd <- initial[initial$PROPDMGEXP == "H", ]$PROPDMG * 100
initial[initial$PROPDMGEXP == "K", ]$pd <- initial[initial$PROPDMGEXP == "K", ]$PROPDMG * 1000
initial[initial$PROPDMGEXP == "M", ]$pd <- initial[initial$PROPDMGEXP == "M", ]$PROPDMG * 1000000
initial[initial$PROPDMGEXP == "B", ]$pd <- initial[initial$PROPDMGEXP == "B", ]$PROPDMG * 1000000000
# same for crops.
initial$cd <- 0
initial[initial$CROPDMGEXP == "H", ]$cd <- initial[initial$CROPDMGEXP == "H", ]$CROPDMG * 100
initial[initial$CROPDMGEXP == "K", ]$cd <- initial[initial$CROPDMGEXP == "K", ]$CROPDMG * 1000
initial[initial$CROPDMGEXP == "M", ]$cd <- initial[initial$CROPDMGEXP == "M", ]$CROPDMG * 1000000
initial[initial$CROPDMGEXP == "B", ]$cd <- initial[initial$CROPDMGEXP == "B", ]$CROPDMG * 1000000000
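The unit conversion above repeats one line per exponent code. As an alternative sketch, the same mapping can be written as a named lookup vector. The toy data frame below is made up for illustration, and upper-casing the codes is an assumption that differs from the code above, which matches only upper-case H, K, M, and B.

```r
# Toy data standing in for the storm table (values are made up for illustration)
toy <- data.frame(
  PROPDMG    = c(2.5, 10, 1, 3, 7),
  PROPDMGEXP = c("K", "M", "B", "h", ""),
  stringsAsFactors = FALSE
)

# Multipliers for the four documented codes
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Look up each code by name after upper-casing (assumption: "h" means "H");
# empty or unknown codes do not match any name and come back as NA
mult <- unname(multiplier[toupper(toy$PROPDMGEXP)])

# Treat unmatched codes as zero damage, in line with the pd <- 0 default above
mult[is.na(mult)] <- 0

toy$pd <- toy$PROPDMG * mult
```

The lookup keeps the code-to-multiplier mapping in one place, so the same vector can be reused for `CROPDMGEXP`. Whether lower-case codes should count is a judgment call, and including them would change the totals reported here.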
Results
1. Across the US, which weather events (indicated by EVTYPE) are most harmful to public health?
#Aggregate, order top 10, and set levels based on weather events
fatalities <- aggregate(FATALITIES ~ EVTYPE, data = initial, sum)
fatalities <- fatalities[order(-fatalities$FATALITIES), ][1:10, ]
fatalities$EVTYPE <- factor(fatalities$EVTYPE, levels = fatalities$EVTYPE)
#Create bar plot
ggplot(fatalities, aes(x = EVTYPE, y = FATALITIES)) + geom_bar(stat = "identity", fill = "cyan", col= "black") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) + xlab("Event Type") + ylab("Fatalities") + scale_y_continuous(labels=comma)+ggtitle("Top 10 Weather Event Types by Fatalities: US 1950-2011")
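The aggregate, order, and factor steps above are repeated for injuries and for damage later in the report. As a sketch, they could be wrapped in one helper. The function name `top_events` and the toy data are hypothetical and only illustrate the pattern.

```r
# Hypothetical helper: total one numeric column by EVTYPE and keep the top n
top_events <- function(data, column, n = 10) {
  totals <- aggregate(data[[column]], by = list(EVTYPE = data$EVTYPE), FUN = sum)
  names(totals)[2] <- column
  # Sort from largest to smallest total
  totals <- totals[order(-totals[[column]]), ]
  # head() returns fewer rows when there are fewer than n event types
  totals <- head(totals, n)
  # Fix the factor levels in sorted order so a bar plot keeps this ordering
  totals$EVTYPE <- factor(totals$EVTYPE, levels = totals$EVTYPE)
  totals
}

# Toy data (made up) to show the result
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(5, 2, 3, 4, 1),
  stringsAsFactors = FALSE
)
top2 <- top_events(toy, "FATALITIES", n = 2)
```

With the report's data, `top_events(initial, "FATALITIES")` would be expected to produce the same table as the three lines above. Using `head()` instead of `[1:10, ]` avoids rows of `NA` when fewer than ten event types exist.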

Injuries for each weather event type are analyzed in the same way.
#Analyze the injuries for each weather event
injuries <- aggregate(INJURIES ~ EVTYPE, data = initial, sum)
injuries <- injuries[order(-injuries$INJURIES), ][1:10, ]
injuries$EVTYPE <- factor(injuries$EVTYPE, levels = injuries$EVTYPE)
ggplot(injuries, aes(x = EVTYPE, y = INJURIES)) + geom_bar(stat = "identity", fill = "orange", col= "black") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) + xlab("Event Type") + ylab("Injuries") +
scale_y_continuous(labels=comma) + ggtitle("Top 10 Weather Event Types by Injuries: US 1950-2011")

The data shows that, from 1950 to 2011, tornadoes caused the greatest number of both injuries and fatalities.
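Fatalities and injuries are plotted separately above. Since the synopsis refers to total casualties, a combined ranking may also be of interest. The sketch below uses a made-up toy data frame, and the `TOTAL` column name is an assumption introduced for illustration.

```r
# Toy data (made up) with the same column names as the storm table
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "EXCESSIVE HEAT", "FLOOD"),
  FATALITIES = c(3, 2, 4, 1),
  INJURIES   = c(40, 10, 5, 2),
  stringsAsFactors = FALSE
)

# Sum both columns per event type in a single aggregate() call
casualties <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Combined casualties, sorted from largest to smallest
casualties$TOTAL <- casualties$FATALITIES + casualties$INJURIES
casualties <- casualties[order(-casualties$TOTAL), ]
```

Adding fatalities and injuries weights a death and an injury equally, which is a simplification. The separate plots above avoid that assumption.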
2. Across the United States, which types of events have the greatest economic consequences?
#Aggregate total damage (property + crop) by weather event and keep the top 10
damage <- aggregate(pd + cd ~ EVTYPE, data = initial, sum)
names(damage) <- c("EVTYPE", "TDAMAGE")
damage <- damage[order(-damage$TDAMAGE), ][1:10, ]
damage$EVTYPE <- factor(damage$EVTYPE, levels = damage$EVTYPE)
#Plot total economic damage as a bar graph
ggplot(damage, aes(x = EVTYPE, y = TDAMAGE)) + geom_bar(stat = "identity", fill = "green", col= "black") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) + xlab("Event Type") + ylab("Damage (US$)") + scale_y_continuous(labels=comma) +
ggtitle("Top 10 Weather Event Types by Property & Crop Damage: US 1950-2011")

The bar graph shows that floods cause the most combined crop and property damage.
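Damage totals of this size are hard to read as full dollar amounts on an axis. One option, sketched below, is a small labelling function that expresses values in billions. The name `format_billions` is hypothetical and not part of the original analysis.

```r
# Hypothetical label formatter: dollars expressed in billions, one decimal place
format_billions <- function(x) {
  paste0("$", formatC(x / 1e9, format = "f", digits = 1), "B")
}

# Example values (made up)
labels <- format_billions(c(1.5e11, 7.2e10, 5e8))
```

Because the `labels` argument of `scale_y_continuous` accepts a function, `scale_y_continuous(labels = format_billions)` could be used in place of `labels = comma` in the plot above.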
Conclusion
The data shows that tornadoes cause the most fatalities, as many as roughly 8,000 deaths across the US. That is well over double the toll of excessive heat, which ranks second at about 2,000 deaths. Tornadoes also cause the most injuries, at about 100,000, roughly 10 times the figure for wind events, which rank second. Finally, floods cause about 150,000,000,000 dollars in crop and property damage. This is about double the damage caused by hurricanes, which rank second.
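The comparisons between first and second place can be checked numerically rather than read off the plots. The sketch below shows the calculation on made-up totals. The helper name `rank_ratio` is hypothetical.

```r
# Hypothetical helper: ratio of the largest total to the second largest
rank_ratio <- function(totals) {
  sorted <- sort(totals, decreasing = TRUE)
  unname(sorted[1] / sorted[2])
}

# Made-up totals for three event types
toy_totals <- c(A = 90, B = 30, C = 10)
ratio <- rank_ratio(toy_totals)
```

With the tables built in the Results section, which are already sorted in descending order, the equivalent check would be `fatalities$FATALITIES[1] / fatalities$FATALITIES[2]`, and likewise for `injuries` and `damage`.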