Author: Andrei Damsa Date: 2017.03.12
Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
The goal of this assignment is to explore the NOAA Storm Database and answer two basic questions about severe weather events. The first question asks which types of events are most harmful with respect to population health. The second asks which types of events have the greatest economic consequences.
Data processing involves gathering, reading, transforming, and analysing the data.
The data can be accessed using the following URL:
https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
Information about the data set can be found at the following links:
Description of the database:
https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf
FAQ of the database:
https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf
Libraries used during the analysis:
library(readr)
library(ggplot2)
library(R.utils)
## Loading required package: R.oo
## Loading required package: R.methodsS3
## R.methodsS3 v1.7.1 (2016-02-15) successfully loaded. See ?R.methodsS3 for help.
## R.oo v1.21.0 (2016-10-30) successfully loaded. See ?R.oo for help.
##
## Attaching package: 'R.oo'
## The following objects are masked from 'package:methods':
##
## getClasses, getMethods
## The following objects are masked from 'package:base':
##
## attach, detach, gc, load, save
## R.utils v2.5.0 (2016-11-07) successfully loaded. See ?R.utils for help.
##
## Attaching package: 'R.utils'
## The following object is masked from 'package:utils':
##
## timestamp
## The following objects are masked from 'package:base':
##
## cat, commandArgs, getOption, inherits, isOpen, parse, warnings
Downloading and reading the data
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(url, destfile = "storms.csv.bz2")
bunzip2("storms.csv.bz2", "storms_2.csv")
storms <- read_csv("storms_2.csv")
## Parsed with column specification:
## cols(
## .default = col_character(),
## STATE__ = col_double(),
## COUNTY = col_double(),
## BGN_RANGE = col_double(),
## COUNTY_END = col_double(),
## END_RANGE = col_double(),
## LENGTH = col_double(),
## WIDTH = col_double(),
## F = col_integer(),
## MAG = col_double(),
## FATALITIES = col_double(),
## INJURIES = col_double(),
## PROPDMG = col_double(),
## CROPDMG = col_double(),
## LATITUDE = col_double(),
## LONGITUDE = col_double(),
## LATITUDE_E = col_double(),
## LONGITUDE_ = col_double(),
## REFNUM = col_double()
## )
## See spec(...) for full column specifications.
storms_sub <- storms[,c("EVTYPE","FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]
Transforming values to numeric form (PROPDMGEXP, CROPDMGEXP). As described in the documentation of the database, the values in PROPDMGEXP and CROPDMGEXP are coded with the letters K, M, and B, which stand for thousands, millions, and billions of dollars, respectively.
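The conversion described above can be sketched as follows. This is a minimal illustration, not the code used in the original analysis: the helper name `exp_to_multiplier`, the small `demo` data frame, and the new column `PROP_USD` are hypothetical, and treating any code other than K, M, or B (including missing values) as a multiplier of 1 is an assumption.

```r
# Hypothetical helper: map exponent codes to numeric multipliers.
# Only K, M, and B are mapped; every other code (and NA) falls back
# to 1, which is an assumption made for this sketch.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  # toupper() also accepts lower-case codes such as "k" or "m" (assumption)
  mult <- unname(lookup[toupper(as.character(code))])
  mult[is.na(mult)] <- 1
  mult
}

# Made-up rows that mimic the structure of storms_sub
demo <- data.frame(
  PROPDMG = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", NA),
  stringsAsFactors = FALSE
)

# Damage in dollars = reported amount * multiplier
demo$PROP_USD <- demo$PROPDMG * exp_to_multiplier(demo$PROPDMGEXP)
print(demo)
```

The same helper could be applied to CROPDMG and CROPDMGEXP. A lookup vector keeps the mapping in one place, so additional codes could be added later without changing the rest of the function.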
Rows and Columns:
dim(storms_sub)
## [1] 902297 7
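The plots in this report use data frames with the columns `Events` and `Freq` (for example `ev_sum_head`), but the step that builds them is not shown. One possible way to build such a table is sketched below. The helper name `top_events`, the `demo` data, and the choice to measure personal harm as the sum of fatalities and injuries are assumptions made for illustration, not the original code.

```r
# Hypothetical helper: total a numeric measure per event type and
# keep the n event types with the largest totals.
top_events <- function(events, values, n = 10) {
  totals <- aggregate(values, by = list(Events = events),
                      FUN = sum, na.rm = TRUE)
  names(totals)[2] <- "Freq"
  totals <- totals[order(totals$Freq, decreasing = TRUE), ]
  totals <- head(totals, n)
  # Fix the factor level order so a plot keeps the bars sorted by total
  ev <- as.character(totals$Events)
  totals$Events <- factor(ev, levels = ev)
  rownames(totals) <- NULL
  totals
}

# Made-up rows that mimic the structure of storms_sub
demo <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  FATALITIES = c(3, 1, 2, 0),
  INJURIES = c(10, 2, 5, 1),
  stringsAsFactors = FALSE
)

# Assumption: personal harm = fatalities + injuries
ev_sum_head <- top_events(demo$EVTYPE, demo$FATALITIES + demo$INJURIES, n = 2)
print(ev_sum_head)
```

On the full data, the same call with `storms_sub$EVTYPE` and the default `n = 10` would give a top 10 table, and the property and crop damage tables could be built the same way from the dollar amounts.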
Based on the analysis, we identified the top 10 most harmful weather event types in the US with respect to population health and economic damage. Regarding health, the most harmful events are related to tornadoes.
For economic consequences, we examined two groups of damage. The first group is property damage, where the most harmful weather events were related to tornadoes, wind, and hail. The second group is crop (agricultural) damage, where the most harmful events were excessive wetness, cold and wet conditions, and early frost.
Graphic visualization of the processed data.
# Bar chart of the top events by personal harm (fatalities and injuries)
g1 <- ggplot(ev_sum_head, aes(Events, Freq))
g1 + geom_bar(stat = "identity") + ylab("Fatalities and injuries") + xlab("Events") + ggtitle("Personal harm (fatalities and injuries) caused by weather events") + theme(axis.text.x = element_text(angle = 90))
# Bar chart of the top events by property damage
g2 <- ggplot(ev_dmg_head, aes(Events, Freq))
g2 + geom_bar(stat = "identity") + ylab("Damage in $") + xlab("Events") + ggtitle("Amount of damage (in $) caused by weather events (PROP)") + theme(axis.text.x = element_text(angle = 90))
# Bar chart of the top events by crop damage
g3 <- ggplot(ev_dmg_crop_head, aes(Events, Freq))
g3 + geom_bar(stat = "identity") + ylab("Damage in $") + xlab("Events") + ggtitle("Amount of damage (in $) caused by weather events (CROP)") + theme(axis.text.x = element_text(angle = 90))