Synopsis

Using the ‘storm’ database from the National Climatic Data Center, covering the period from 1950 to November 2011, this analysis identifies:

  1. The events most harmful to population health
  2. The events with the greatest economic consequences

The URLs of the original data and of the data dictionary are also included in the code.

Data processing

# Setting the working directory and loading libraries
setwd("~\\GitHub\\reproducible-researchpeerOMZ37course-project-2")
library(dplyr)
library(ggplot2)
library(kableExtra)

Downloading the data file if it does not already exist

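if(!file.exists("data.bz2")){
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2","data.bz2",method="curl")
  # The bz2 file is not extracted: unzip() only handles zip archives,
  # and read.csv() reads bzip2-compressed files directly
}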

Downloading the data documentation and reading the data

if(!file.exists("data documentation.pdf")){
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf","data documentation.pdf",method="curl")
}
d<-read.csv("data.bz2")
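
The direct read of the compressed file can be checked on a small example. The sketch below writes a made-up two-row table (the values are hypothetical, not taken from the storm data) to a temporary bzip2 file and reads it back with read.csv, without any extraction step.

```r
# Toy data standing in for the storm file (hypothetical values).
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))

# Write the toy data through an explicitly opened bzip2 connection.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which detects the compression
# and decompresses while reading.
back <- read.csv(tmp)
print(back)
```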

Correcting the economic damage numbers

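# Lookup table: exponent code -> multiplier (power of ten).
# "B" (billion) maps to 10^9. Lowercase "k" is included alongside "K"
# so that it is not left unmatched (NA).
pwr<-data.frame(name=c("",  "-", "?", "+", "0", "1", "2", "3", "4", "5", "6", 
                       "7", "8", "B", "h", "H", "k", "K", "m","M"),
                power=10^c(0,0,0,0,0,1,2,3,4,5,6,7,8,9,2,2,3,3,6,6),
                stringsAsFactors=FALSE)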
d$Property.Damage<-pwr$power[match(d$PROPDMGEXP,pwr$name)]*d$PROPDMG
d$Crop.Damage<-pwr$power[match(d$CROPDMGEXP,pwr$name)]*d$CROPDMG
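
To illustrate how the match() lookup turns a magnitude and an exponent code into a dollar amount, here is a minimal sketch on made-up records. The names lookup, dmg and exp_code are hypothetical and only mirror the structure of pwr, PROPDMG and PROPDMGEXP.

```r
# Hypothetical mini lookup table: exponent code -> multiplier.
lookup <- data.frame(name = c("", "K", "M", "B"),
                     power = 10^c(0, 3, 6, 9),
                     stringsAsFactors = FALSE)

# Toy records (made-up values): a magnitude and its exponent code.
dmg <- c(2.5, 10, 1.2, 7)
exp_code <- c("K", "M", "B", "")

# match() returns the row of each code in the lookup table; indexing
# power by that row gives one multiplier per record.
multiplier <- lookup$power[match(exp_code, lookup$name)]
dollars <- dmg * multiplier
print(dollars)

# A code missing from the table produces NA, which would propagate into sums.
unmatched <- lookup$power[match("Z", lookup$name)]
print(unmatched)
```

Because an unmatched code yields NA, every code that occurs in the data needs a row in the lookup table, or the later sums need to be taken with na.rm = TRUE.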

Summarising the data

# Total economic damage per record (property plus crop);
# grouping by event type happens in each summary below
d$TotalEconomicDamage<-d$Crop.Damage+d$Property.Damage

## Top 10 in fatalities
s.fatal<-d %>% group_by(EVTYPE) %>% select(EVTYPE,FATALITIES) %>% 
  summarise_all(.funs = "sum") %>% top_n(10,wt=FATALITIES)

## Top 10 in injuries
s.injur<-d %>% group_by(EVTYPE) %>% select(EVTYPE,INJURIES) %>% 
  summarise_all(.funs = "sum") %>% top_n(10,wt=INJURIES)

## Top 10 in economic damage
s.TotalEconomicDamage<-d %>% group_by(EVTYPE) %>% select(EVTYPE,TotalEconomicDamage) %>% 
  summarise_all(.funs = "sum") %>% top_n(10,wt=TotalEconomicDamage)
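
The same group, sum and keep-the-top pattern can be sketched on toy data. This is an alternative formulation, not the code used above: it assumes dplyr 1.0.0 or later and uses summarise() with slice_max() in place of summarise_all() and top_n(). The data frame toy and its values are made up.

```r
library(dplyr)

# Made-up records: several rows per event type.
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(5, 2, 7, 4, 1)
)

# Sum fatalities per event type, then keep the 2 largest totals.
# slice_max() also returns the rows sorted from largest to smallest.
top2 <- toy %>%
  group_by(EVTYPE) %>%
  summarise(FATALITIES = sum(FATALITIES, na.rm = TRUE)) %>%
  slice_max(FATALITIES, n = 2)
print(top2)
```

Here na.rm = TRUE is a defensive choice so that a single missing value does not turn a whole event type's total into NA.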

Results

The following graphs show the 10 event types that caused the most fatalities, the most injuries, and the most economic damage.

The graphs have descriptive titles.

ggplot(s.fatal,aes(x=EVTYPE,y=FATALITIES))+
  geom_col()+
  ylim(0,10000)+
  theme(axis.text.x=element_text(angle = 30,hjust = 1))+
  labs(title="Top 10 events causing the highest number of fatalities",
       x="Event type",y="Number of fatalities")

ggplot(s.injur,aes(x=EVTYPE,y=INJURIES/1000000))+
  geom_col()+
  theme(axis.text.x=element_text(angle = 30,hjust = 1))+
  labs(title="Top 10 events causing the highest number of injuries",
       x="Event type",y="Number of injuries (in millions)")

ggplot(s.TotalEconomicDamage,aes(x=EVTYPE,y=TotalEconomicDamage/1000000000))+
  geom_col()+
  theme(axis.text.x=element_text(angle = 30,hjust = 1))+
  labs(title="Top 10 events with the highest economic consequences",
       x="Event type",y="Amount in $ billion")