Synopsis:
This project uses storm-tracking data from the 1950s onward to answer the question: which natural disasters are most damaging to human life and economic activity?
To do so, it treats fatalities and injuries as indicators of harm to human life, and property damage as an indicator of economic loss. Tabulating these by event type gives an estimate of which disasters have the highest impact, so that they can be safeguarded against accordingly.
echo settings for embedding code
Setting Directory
getwd()
## [1] "C:/Users/dhnsingh/Dropbox/Coursera/InProgress/R_5_reproducible_research/wk4_documenting"
setwd("C:/Users/dhnsingh/Dropbox/Coursera/InProgress/R_5_reproducible_research/wk4_documenting")
Reading in the storm data (bz2-compressed CSV):
storm_data <- read.csv("repdata_data_StormData.csv.bz2")
Loading up necessary packages:
# install.packages("knitr", repos='http://cran.us.r-project.org')
# install.packages("rsconnect", repos='http://cran.us.r-project.org')
# install.packages("rmarkdown", repos='http://cran.us.r-project.org')
# install.packages("markdown", repos='http://cran.us.r-project.org')
library(rmarkdown)
## Warning: package 'rmarkdown' was built under R version 3.6.2
library(markdown)
## Warning: package 'markdown' was built under R version 3.6.2
library(knitr)
## Warning: package 'knitr' was built under R version 3.6.2
library(ggplot2)
library(rsconnect)
## Warning: package 'rsconnect' was built under R version 3.6.2
##
## Attaching package: 'rsconnect'
## The following object is masked from 'package:markdown':
##
## rpubsUpload
Converting data type for categorical analysis:
storm_data$EVTYPE <- as.character(storm_data$EVTYPE)
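Because the analysis groups rows by the text in EVTYPE, labels that differ only in case or stray whitespace would be counted as separate event types. The sketch below shows one way to collapse such variants before aggregating. It uses a made-up toy data frame rather than the real file, and whether the real labels actually vary in this way is an assumption to check against the data.

```r
# Toy stand-in for storm_data; the values are invented for illustration.
toy <- data.frame(
  EVTYPE = c("TORNADO", "Tornado ", " tornado", "FLASH FLOOD"),
  FATALITIES = c(3, 1, 2, 4),
  stringsAsFactors = FALSE
)

# Upper-case the labels and strip leading/trailing whitespace so that
# spelling variants collapse into a single category.
toy$EVTYPE <- toupper(trimws(toy$EVTYPE))

# Sum fatalities per (now normalized) event type.
toy_agg <- aggregate(FATALITIES ~ EVTYPE, toy, FUN = sum)
print(toy_agg)
```

The same two calls could be applied to storm_data$EVTYPE right after the as.character() conversion.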
Aggregating data to produce summary plots:
storm_agg <- aggregate(FATALITIES ~ EVTYPE, storm_data, FUN = sum)
# keeping only event types with more than 100 total fatalities
storm_agg <- storm_agg[storm_agg$FATALITIES > 100, ]
# doing the same for property damage as an indicator of economic loss
storm_agg2 <- aggregate(PROPDMG ~ EVTYPE, storm_data, FUN = sum)
# keeping only event types whose summed PROPDMG exceeds 26000
storm_agg2 <- storm_agg2[storm_agg2$PROPDMG > 26000, ]
Relevant plots for human damage:
# building plot
highest_mortality <- ggplot(storm_agg, aes(x = EVTYPE, y = FATALITIES)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(x = "Event Type", y = "Fatalities", title = "Fatalities by Event Type") +
  theme(axis.text.x = element_text(angle = 90), legend.position = "none")
# printing plot
highest_mortality
The results show that the deadliest natural disasters are tornadoes and excessive heat, followed by flash floods and lightning. All of these carry an element of unpredictability, but they can be anticipated when ordinary inclement weather takes on more menacing features, and should therefore be safeguarded against.
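The synopsis names both fatalities and injuries as indicators of harm, while the plot above covers fatalities only. As a rough sketch of how the two could be tabulated together, the example below aggregates both columns in one call and ranks event types by their combined count. The toy data and the CASUALTIES column are illustrative assumptions, and the INJURIES column name is assumed to match the real file.

```r
# Toy stand-in for storm_data; the values are invented for illustration.
toy <- data.frame(
  EVTYPE = c("TORNADO", "TORNADO", "HEAT", "LIGHTNING"),
  FATALITIES = c(2, 1, 5, 1),
  INJURIES = c(10, 4, 0, 3),
  stringsAsFactors = FALSE
)

# cbind() on the left-hand side sums both columns per event type.
harm_agg <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, toy, FUN = sum)

# CASUALTIES is a hypothetical combined measure, not part of the dataset.
harm_agg$CASUALTIES <- harm_agg$FATALITIES + harm_agg$INJURIES

# Rank event types from most to least harmful.
harm_agg <- harm_agg[order(-harm_agg$CASUALTIES), ]
print(harm_agg)
```

Adding fatalities and injuries with equal weight is a simplification. Reporting the two columns side by side avoids having to choose a weighting.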
Relevant plots for property damage:
# building plot
highest_prop.loss <- ggplot(storm_agg2, aes(x = EVTYPE, y = PROPDMG)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(x = "Event Type", y = "Property Damage", title = "Property Damage by Event Type") +
  theme(axis.text.x = element_text(angle = 90), legend.position = "none")
# printing plot
highest_prop.loss
Once again, tornadoes and flash floods cause the greatest damage to property, followed by thunderstorm winds (abbreviated in the data as TSTM WIND).
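One caveat on the property-damage ranking: in the NOAA storm data, PROPDMG is usually read together with a companion column, PROPDMGEXP, that gives the magnitude of each figure (for example K for thousands, M for millions, B for billions). Summing PROPDMG on its own therefore mixes units. The sketch below shows one way to scale the values before aggregating. It assumes the file has a PROPDMGEXP column with those three codes and treats any other code as a multiplier of 1, which is a simplification. The toy data and the PROPDMG_USD column are invented for illustration.

```r
# Toy stand-in for storm_data; the values are invented for illustration.
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "FLOOD", "HAIL"),
  PROPDMG = c(25, 1.5, 500, 10),
  PROPDMGEXP = c("K", "B", "M", ""),
  stringsAsFactors = FALSE
)

# Assumed meaning of the magnitude codes.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each row's multiplier; unknown or blank codes come back as NA.
mult <- unname(multiplier[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1  # simplification: treat unrecognized codes as 1

# PROPDMG_USD is a hypothetical column holding the scaled dollar amount.
toy$PROPDMG_USD <- toy$PROPDMG * mult

usd_agg <- aggregate(PROPDMG_USD ~ EVTYPE, toy, FUN = sum)
print(usd_agg)
```

If the ranking changes after scaling, the threshold of 26000 used above would also need to be revisited.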
Relevant tables:
# producing table of overall stats
summary(storm_data$FATALITIES)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.0000 0.0000 0.0168 0.0000 583.0000
# tabulating damage stats
summary(storm_data$PROPDMG)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 0.00 0.00 12.06 0.50 5000.00
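Both summaries have a median of zero, meaning most recorded events cause no fatalities or property damage, so the overall quantiles say little about which event types dominate. A ranked table per event type can complement them. The sketch below is one way to build such a table. The toy totals, the SHARE column, and the name top_events are illustrative assumptions, not values from the real data.

```r
# Toy stand-in for an aggregated table such as storm_agg; values are invented.
toy_agg <- data.frame(
  EVTYPE = c("TORNADO", "HEAT", "FLASH FLOOD", "LIGHTNING"),
  FATALITIES = c(50, 30, 20, 10),
  stringsAsFactors = FALSE
)

# Sort by fatalities, highest first, and keep the top three event types.
top_events <- head(toy_agg[order(toy_agg$FATALITIES, decreasing = TRUE), ], 3)

# SHARE is a hypothetical column: percent of all fatalities in the toy table.
top_events$SHARE <- round(100 * top_events$FATALITIES / sum(toy_agg$FATALITIES), 1)
print(top_events)
```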