In this assignment we investigate the impact of severe weather events on population health and the economic consequences of these events. The exploratory analysis is based on the storm database collected by the U.S. National Oceanic and Atmospheric Administration (NOAA) from 1948 to 2013. We use four variables from the database: fatalities, injuries, property damage and crop damage, over a time frame of 65 years. The aim of the report is to show clearly which event types do the most harm, as a basis for preventing or minimizing the impact of severe weather.
Purpose of the project:
1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
2. Across the United States, which types of events have the greatest economic consequences?
Basic settings and libraries
library(ggplot2)
library(plyr)
library(grid)
library(gridExtra)
Data processing and creating R working environment
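# Read the raw storm data; keep text columns as character rather than factor
data <- read.csv("repdata-data-StormData.csv", stringsAsFactors = FALSE, sep = ",", header = TRUE)
# Convert the event start date from text to Date, dropping the time of day
data$BGN_DATE <- as.Date(strptime(data$BGN_DATE, "%m/%d/%Y %H:%M:%S"))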
Plot a histogram with the total data by year
hist(data$BGN_DATE, breaks = 30, main="Histogram data/year" ,xlab="Years")
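The histogram groups the dates into roughly 30 bins, so a bar does not correspond to exactly one year. As a minimal sketch, exact counts per calendar year can be tabulated as shown here; the dates are made up and stand in for `data$BGN_DATE`.

```r
# Made-up dates standing in for data$BGN_DATE
dates <- as.Date(c("1950-04-18", "1950-06-08", "1995-01-06",
                   "2011-11-28", "2011-11-30"))

# Extract the calendar year of each event and count events per year
events_per_year <- table(format(dates, "%Y"))
print(events_per_year)
```

The same two lines applied to the real `BGN_DATE` column would give one count per year, which could then be passed to `barplot()`.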
To prepare the damage figures, we replace the letter codes in the PROPDMGEXP and CROPDMGEXP columns, thousand (K), million (M) and billion (B), with their numeric values for each observation.
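# Letter codes and the numeric values that replace them
from <- c("k","K","M","m","B")
to <- c("1000","1000","1e+06","1e+06","1e+09")
# Apply gsub() once for each pattern/replacement pair, in order
gsub1 <- function(pattern, replacement, x, ...) {
  for (i in seq_along(pattern))
    x <- gsub(pattern[i], replacement[i], x, ...)
  x
}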
data$PROPDMGEXP <- gsub1(from, to, data$PROPDMGEXP)
data$CROPDMGEXP <- gsub1(from, to, data$CROPDMGEXP)
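The substitution rewrites the exponent columns, but it does not by itself scale the damage values: the totals in this report are computed from PROPDMG and CROPDMG alone. A minimal sketch of one way to turn a value and its exponent code into a dollar amount follows. The toy data, the helper `to_dollars` and the column `PROPDMG_USD` are hypothetical, and treating unknown or empty codes as "no scaling" is an assumption rather than something the source data defines.

```r
# Toy data frame with the same column names as the storm data (values are made up)
storm <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "b", ""),
  stringsAsFactors = FALSE
)

# Lookup table from exponent letter to numeric multiplier
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale a damage value by its exponent code
to_dollars <- function(value, exp_code) {
  m <- multiplier[toupper(exp_code)]  # NA for codes that are not in the table
  m[is.na(m)] <- 1                    # assumption: unknown or empty codes mean no scaling
  value * unname(m)
}

storm$PROPDMG_USD <- to_dollars(storm$PROPDMG, storm$PROPDMGEXP)
storm
```

Summing a column like `PROPDMG_USD` by event type would rank events by dollar damage rather than by the unscaled values.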
In this section we define a function that totals one of the four variables (fatalities, injuries, crop damage, property damage) by event type and returns the 10 event types from the EVTYPE column with the largest totals.
# Named top10_by_event so that it does not mask stats::filter()
top10_by_event <- function(column, data) {
  # Sum the chosen column within each event type, ignoring missing values
  d <- aggregate(data[[column]], by = list(data$EVTYPE), FUN = sum, na.rm = TRUE)
  names(d) <- c("EVTYPE", column)
  # Sort from the largest total to the smallest and renumber the rows
  d <- d[order(d[[column]], decreasing = TRUE), ]
  rownames(d) <- NULL
  # Fix the factor levels in sorted order so that plots keep the ranking
  d$EVTYPE <- factor(d$EVTYPE, levels = d$EVTYPE)
  d[1:10, ]
}
fatalities <- top10_by_event("FATALITIES", data = data)
injuries <- top10_by_event("INJURIES", data = data)
property <- top10_by_event("PROPDMG", data = data)
crop <- top10_by_event("CROPDMG", data = data)
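The aggregation step can be traced on a small made-up table. This is only an illustrative sketch using base R; the records below are invented and reuse the EVTYPE and FATALITIES column names.

```r
# Made-up records using the same column names as the storm data
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD", "HEAT"),
  FATALITIES = c(3, 1, 2, 0, 2, NA),
  stringsAsFactors = FALSE
)

# Total per event type; na.rm = TRUE is forwarded to sum()
totals <- aggregate(toy$FATALITIES, by = list(EVTYPE = toy$EVTYPE),
                    FUN = sum, na.rm = TRUE)
names(totals)[2] <- "FATALITIES"

# Rank from the largest total to the smallest and keep the top two
ranked <- totals[order(totals$FATALITIES, decreasing = TRUE), ]
top2 <- head(ranked, 2)
top2
```

Because `na.rm = TRUE` is passed through to `sum()`, the HEAT row with a missing value contributes a total of 0 instead of turning its group into `NA`.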
Results
For the impact of severe weather events on population health we obtain two sorted lists and two plots. They show how many people were affected by each type of weather event.
fatalities
## EVTYPE FATALITIES
## 1 TORNADO 5633
## 2 EXCESSIVE HEAT 1903
## 3 FLASH FLOOD 978
## 4 HEAT 937
## 5 LIGHTNING 816
## 6 TSTM WIND 504
## 7 FLOOD 470
## 8 RIP CURRENT 368
## 9 HIGH WIND 248
## 10 AVALANCHE 224
injuries
## EVTYPE INJURIES
## 1 TORNADO 91346
## 2 TSTM WIND 6957
## 3 FLOOD 6789
## 4 EXCESSIVE HEAT 6525
## 5 LIGHTNING 5230
## 6 HEAT 2100
## 7 ICE STORM 1975
## 8 FLASH FLOOD 1777
## 9 THUNDERSTORM WIND 1488
## 10 HAIL 1361
# Bar chart of the ten event types with the most fatalities
fatalities_plot <- ggplot(fatalities, aes(x = EVTYPE, y = FATALITIES)) +
  geom_bar(stat = "identity", fill = "orange", colour = "brown") +
  scale_x_discrete(name = "EVENTS") +
  scale_y_continuous(name = "FATALITIES") +
  ggtitle("FATALITIES") +
  theme(plot.title = element_text(lineheight = 1, face = "bold"),
        axis.title.x = element_text(face = "bold", colour = "#990000"),
        axis.title.y = element_text(face = "bold", colour = "#990000"),
        axis.text.x = element_text(angle = 45, hjust = 1, face = "bold", colour = "blue")) +
  # Constant text settings go outside aes(); show.legend replaces the deprecated show_guide
  geom_text(aes(label = FATALITIES), angle = 90, size = 4, hjust = -0.1, colour = "brown", show.legend = FALSE)
# Bar chart of the ten event types with the most injuries
injuries_plot <- ggplot(injuries, aes(x = EVTYPE, y = INJURIES)) +
  geom_bar(stat = "identity", fill = "orange", colour = "brown") +
  scale_x_discrete(name = "EVENTS") +
  scale_y_continuous(name = "INJURIES") +
  ggtitle("INJURIES") +
  theme(plot.title = element_text(lineheight = 1, face = "bold"),
        axis.title.x = element_text(face = "bold", colour = "#990000"),
        axis.title.y = element_text(face = "bold", colour = "#990000"),
        axis.text.x = element_text(angle = 45, hjust = 1, face = "bold", colour = "blue")) +
  geom_text(aes(label = INJURIES), angle = 90, size = 4, hjust = -0.1, colour = "brown", show.legend = FALSE)
# Show both charts side by side
grid.arrange(fatalities_plot, injuries_plot, ncol = 2)
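Summary: Based on the tables and plots above, tornadoes cause by far the most fatalities and injuries in the United States from 1948 to 2013. Excessive heat and flash floods follow for fatalities, and thunderstorm wind (TSTM WIND) and floods follow for injuries.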
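For the impact of severe weather events on total property damage and total crop damage we again obtain two sorted lists and two plots. They show the total damage by type of weather event. The totals are sums of the raw PROPDMG and CROPDMG values; the multipliers in PROPDMGEXP and CROPDMGEXP are not applied, so the figures are not dollar amounts.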
property
## EVTYPE PROPDMG
## 1 TORNADO 3212258.2
## 2 FLASH FLOOD 1420124.6
## 3 TSTM WIND 1335965.6
## 4 FLOOD 899938.5
## 5 THUNDERSTORM WIND 876844.2
## 6 HAIL 688693.4
## 7 LIGHTNING 603351.8
## 8 THUNDERSTORM WINDS 446293.2
## 9 HIGH WIND 324731.6
## 10 WINTER STORM 132720.6
crop
## EVTYPE CROPDMG
## 1 HAIL 579596.28
## 2 FLASH FLOOD 179200.46
## 3 FLOOD 168037.88
## 4 TSTM WIND 109202.60
## 5 TORNADO 100018.52
## 6 THUNDERSTORM WIND 66791.45
## 7 DROUGHT 33898.62
## 8 THUNDERSTORM WINDS 18684.93
## 9 HIGH WIND 17283.21
## 10 HEAVY RAIN 11122.80
# Bar charts of the summed damage values per event type (unscaled PROPDMG and CROPDMG)
cropPlot <- ggplot(crop, aes(x = EVTYPE, y = CROPDMG, fill = EVTYPE)) + geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 30, hjust = 1)) +
  scale_y_continuous("Crop damage (sum of CROPDMG)") + xlab("Event Type") + ggtitle("Crop Damage/Events 1948 - 2013")
propertyPlot <- ggplot(property, aes(x = EVTYPE, y = PROPDMG, fill = EVTYPE)) + geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 30, hjust = 1)) +
  scale_y_continuous("Property damage (sum of PROPDMG)") + xlab("Event Type") + ggtitle("Property damage/Events 1948 - 2013")
grid.arrange(propertyPlot, cropPlot, ncol = 1)
Summary: Based on the tables and plots above, tornadoes and floods account for the largest property damage totals, and hail accounts for the largest crop damage total, in the United States from 1948 to 2013.
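The tables list TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS as separate event types, so one phenomenon is split across several rows of the ranking. The sketch below shows one possible way to merge such labels before aggregating. The helper `normalize_evtype` and its two rewrite rules are assumptions made for illustration, not part of the original analysis, and the totals are made up.

```r
# Made-up totals that mimic the split labels seen in the tables
d <- data.frame(
  EVTYPE  = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS", "HAIL"),
  PROPDMG = c(10, 20, 5, 30),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map spelling variants onto one label
normalize_evtype <- function(x) {
  x <- toupper(trimws(x))
  x <- sub("^TSTM", "THUNDERSTORM", x)  # expand the abbreviation
  x <- sub("WINDS$", "WIND", x)         # fold the plural into the singular
  x
}

d$EVTYPE <- normalize_evtype(d$EVTYPE)

# Re-aggregate after the labels have been merged
merged <- aggregate(PROPDMG ~ EVTYPE, data = d, FUN = sum)
merged[order(merged$PROPDMG, decreasing = TRUE), ]
```

With real data the set of rewrite rules would need to be checked against the full list of EVTYPE values, since the database contains many more spelling variants than the three shown here.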