This report analyzes the NOAA Storm Database to answer two questions:
1. Which event types (EVTYPE) are most harmful to population health?
2. Which event types have the greatest economic consequences?
``` r
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
```
Download the data from the following link:
[Storm Data](https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2)
Alternatively, download the file in R using the same URL:
``` r
fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
fileZip <- "repdata-data-StormData.csv.bz2"
# Download only if the archive is not already present
if (!file.exists(fileZip)) {
  download.file(fileUrl, destfile = fileZip, mode = "wb")
}
```
Read the data from the compressed file:
``` r
data <- read.csv(bzfile(fileZip), header = TRUE, sep = ",")
```
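`read.csv()` can usually decompress a `.bz2` file on its own when given the path, so the explicit `bzfile()` wrapper is optional. The following is a minimal sketch on a tiny made-up table written to a temporary file. The `toy` values and the temporary path are illustrative assumptions, not part of the Storm Data set.

``` r
# Toy table standing in for the storm data (values are made up for illustration)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(2, 0))

# Write it as a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Read it back twice: once by path, once through an explicit bzfile() connection
via_path <- read.csv(tmp)
via_bzfile <- read.csv(bzfile(tmp))

# Compare the two results
identical(via_path, via_bzfile)
```

If the two results match on your system, either form can be used for the real archive.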
Combine the FATALITIES and INJURIES variables to find the largest number of casualties recorded for a single event:
``` r
harm <- data$FATALITIES + data$INJURIES
maxharm <- max(harm)
maxharm
```
```
## [1] 1742
```
Subset the data to find the event type of that record:
``` r
mostharm <- data$EVTYPE[which(data$INJURIES + data$FATALITIES == maxharm)]
mostharm
```
```
## [1] "TORNADO"
```
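The lookup above returns the event type of the single record with the highest combined count, which is not necessarily the type with the highest total across all of its records. A small sketch on made-up numbers (the `toy` data frame is hypothetical) shows how the two can differ:

``` r
# Hypothetical records: one severe tornado, three moderate heat events
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "HEAT", "HEAT"),
  FATALITIES = c(10, 6, 6, 6),
  INJURIES   = c(0, 0, 0, 0)
)
toy$harm <- toy$FATALITIES + toy$INJURIES

# Event type of the single worst record
worst_record <- toy$EVTYPE[which.max(toy$harm)]

# Event type with the largest total over all of its records
totals <- tapply(toy$harm, toy$EVTYPE, sum)
worst_type <- names(totals)[which.max(totals)]

worst_record  # "TORNADO"
worst_type    # "HEAT"
```

Which of the two measures is appropriate depends on whether the question is about the worst individual event or the cumulative impact of an event type.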
Property damage estimates recorded in billions of dollars (PROPDMGEXP equal to "B"):
``` r
prop <- data$PROPDMG[which(data$PROPDMGEXP == "B")]
prop
```
```
## [1] 5.00 0.10 2.10 1.60 1.00 5.00 2.50 1.20 3.00 1.70
## [11] 3.00 1.50 5.15 1.00 1.04 2.50 5.42 1.30 4.83 4.00
## [21] 1.00 1.50 10.00 16.93 31.30 4.00 7.35 11.26 5.88 2.09
## [31] 115.00 1.00 4.00 1.50 1.80 1.00 1.50 2.80 1.00 2.00
```
The maximum property damage, in billions of dollars:
``` r
a <- max(prop)
a
```
```
## [1] 115
```
Crop damage estimates recorded in billions of dollars (CROPDMGEXP equal to "B"):
``` r
crop <- data$CROPDMG[which(data$CROPDMGEXP == "B")]
crop
```
```
## [1] 0.40 5.00 0.50 0.20 5.00 1.51 1.00 0.00 0.00
```
The maximum crop damage, in billions of dollars:
``` r
b <- max(crop)
b
```
```
## [1] 5
```
The largest single property damage estimate is 115 billion dollars, far above the largest crop damage estimate of 5 billion dollars.
The event type with the greatest economic consequences, taken as the record with the largest property damage:
``` r
econ <- data$EVTYPE[which((data$PROPDMG == a) & (data$PROPDMGEXP == "B"))]
econ
```
```
## [1] "FLOOD"
```
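The comparison above only considers records whose exponent code is `"B"`. To rank event types by total damage, the `PROPDMG`/`PROPDMGEXP` and `CROPDMG`/`CROPDMGEXP` pairs would first need to be converted to dollars. The sketch below assumes a common reading of the codes (`H` = hundreds, `K` = thousands, `M` = millions, `B` = billions, case-insensitive) and treats any other code as a multiplier of 1. That mapping, the helper `exp_to_multiplier`, and the `toy` data are assumptions for illustration, not part of the original analysis.

``` r
# Hypothetical helper: map an exponent code to a numeric multiplier.
# Assumed mapping: H = 1e2, K = 1e3, M = 1e6, B = 1e9; anything else = 1.
exp_to_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  mult[is.na(mult)] <- 1
  mult
}

# Made-up records standing in for the storm data
toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "HAIL", "TORNADO"),
  PROPDMG    = c(2, 500, 25, 1.5),
  PROPDMGEXP = c("B", "M", "K", "m"),
  CROPDMG    = c(0, 10, 50, 0),
  CROPDMGEXP = c("", "M", "K", "")
)

# Total damage per record in dollars (property + crop)
toy$damage_usd <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP) +
  toy$CROPDMG * exp_to_multiplier(toy$CROPDMGEXP)

# Sum per event type and sort from largest to smallest
damage_by_type <- aggregate(damage_usd ~ EVTYPE, data = toy, FUN = sum)
damage_by_type <- damage_by_type[order(-damage_by_type$damage_usd), ]
damage_by_type
```

Applied to the full data set, the same steps would give a per-type total rather than a single largest record. How to treat the unrecognized exponent codes is a judgment call that should be checked against the database documentation.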
Total fatalities and injuries by event type (top 10):
``` r
library(ggplot2)
# Combined fatalities and injuries per record
data$total_human_cost <- data$FATALITIES + data$INJURIES
# Aggregate to one row per EVTYPE so that each bar is a per-type total
harm_plot <- aggregate(total_human_cost ~ EVTYPE, data = data, sum, na.rm = TRUE)
# Sort in descending order and keep the ten largest totals
harm_plot <- harm_plot[order(-harm_plot$total_human_cost), ]
harm_plot10 <- harm_plot[1:10, ]
ggplot(harm_plot10, aes(x = reorder(EVTYPE, total_human_cost), y = total_human_cost)) +
  geom_col() +
  coord_flip() +
  xlab("Event type") +
  ylab("Total fatalities + injuries") +
  ggtitle("Top 10 Event Types by Total Human Cost")
```
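Property damage caused by floods, for records reported in billions of dollars:
``` r
# Keep only flood records whose damage is expressed in billions (PROPDMGEXP == "B"),
# so that the y-axis label matches the values plotted
flood_b <- data$PROPDMG[data$EVTYPE == "FLOOD" & data$PROPDMGEXP == "B"]
boxplot(flood_b,
        main = "Property Damage Caused by Flood",
        xlab = "FLOOD", ylab = "Billions of dollars")
```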
Question 1: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
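``` r
mostharm <- data$EVTYPE[which(data$INJURIES + data$FATALITIES == maxharm)]
mostharm
```
```
## [1] "TORNADO"
```
The answer is TORNADO: the single most harmful recorded event, with 1,742 combined fatalities and injuries, was a tornado.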
Question 2: Across the United States, which types of events have the greatest economic consequences?
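``` r
econ <- data$EVTYPE[which((data$PROPDMG == a) & (data$PROPDMGEXP == "B"))]
econ
```
```
## [1] "FLOOD"
```
The answer is FLOOD: the largest single property damage estimate, 115 billion dollars, was recorded for a flood.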