A Russian data scientist has made a new discovery! The disaster most harmful to the population is… the tornado, and the disaster most destructive to the economy is… the flood!
This research was carried out by Sergey Chernov in January 2015. The project explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The data were loaded and cleaned. We then determined the ten types of disaster most harmful to the population and the ten most destructive to the economy. Tornadoes are the most harmful to the population, and floods are the most destructive to the economy. The government must increase the budget for tornado prevention.
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
setwd("D:/Program/RepProject2")
# read.csv() reads bzip2-compressed files directly, so no separate connection object is needed
storm<-read.csv("repdata-data-StormData.csv.bz2",sep=",")
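The loading step relies on R reading a bzip2-compressed CSV without unpacking it first. A minimal sketch of that behaviour follows, using a small made-up data frame (`toy`, `path`, and `toy_back` are hypothetical names, not part of the analysis) written to a temporary file.

```r
# Toy stand-in for the storm file (hypothetical data, for illustration only).
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD"),
  INJURIES = c(3, 0),
  stringsAsFactors = FALSE
)

# Write the toy data through a bzip2 connection to a temporary file.
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() opens the path itself and detects the compression on reading.
toy_back <- read.csv(path, stringsAsFactors = FALSE)
```

Passing the file name alone is usually enough; an explicit `bzfile()` connection also works, as in the call above.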
We excluded events without consequences (no injuries, no fatalities, and no property damage), since they do not affect the total damage or the total number of victims.
tidy_storm<-storm[which((storm$INJURIES!=0)|(storm$FATALITIES!=0)|(storm$PROPDMG!=0)),c("EVTYPE","INJURIES","FATALITIES","PROPDMG","PROPDMGEXP")]
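The same row and column selection can be written with dplyr verbs, which some readers may find easier to scan. The sketch below runs on a made-up four-row data frame (`toy` and `toy_tidy` are hypothetical names); it is an equivalent formulation under that assumption, not the code used for the results in this report.

```r
library(dplyr)

# Hypothetical miniature of the storm table.
toy <- data.frame(
  EVTYPE = c("TORNADO", "HAIL", "FLOOD", "HEAT"),
  INJURIES = c(2, 0, 0, 0),
  FATALITIES = c(0, 0, 0, 1),
  PROPDMG = c(0, 0, 25, 0),
  PROPDMGEXP = c("", "", "K", ""),
  stringsAsFactors = FALSE
)

# Keep rows with at least one non-zero consequence, then keep the needed columns.
toy_tidy <- toy %>%
  filter(INJURIES != 0 | FATALITIES != 0 | PROPDMG != 0) %>%
  select(EVTYPE, INJURIES, FATALITIES, PROPDMG, PROPDMGEXP)
```

In this toy table only the HAIL row is dropped, because all three of its consequence columns are zero.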
After that we prepared a variable for calculating the amount of property damage: DMGEXP holds the multiplier that corresponds to each PROPDMGEXP code. The code book (Storm Events) was used for the conversion.
tidy_storm<-mutate(tidy_storm,DMGEXP=0)
tidy_storm[tidy_storm$PROPDMGEXP=="K","DMGEXP"]<-1000
tidy_storm[tidy_storm$PROPDMGEXP=="M","DMGEXP"]<-1e+06
tidy_storm[tidy_storm$PROPDMGEXP=="","DMGEXP"]<-1
tidy_storm[tidy_storm$PROPDMGEXP=="B","DMGEXP"]<-1e+09
tidy_storm[tidy_storm$PROPDMGEXP=="m","DMGEXP"]<-1e+06
tidy_storm[tidy_storm$PROPDMGEXP=="0","DMGEXP"]<-1
tidy_storm[tidy_storm$PROPDMGEXP=="8","DMGEXP"]<-1e+08
tidy_storm[tidy_storm$PROPDMGEXP=="7","DMGEXP"]<-1e+07
tidy_storm[tidy_storm$PROPDMGEXP=="5","DMGEXP"]<-1e+05
tidy_storm[tidy_storm$PROPDMGEXP=="6","DMGEXP"]<-1e+06
tidy_storm[tidy_storm$PROPDMGEXP=="4","DMGEXP"]<-1e+04
tidy_storm[tidy_storm$PROPDMGEXP=="3","DMGEXP"]<-1e+03
tidy_storm[tidy_storm$PROPDMGEXP=="2","DMGEXP"]<-1e+02
tidy_storm[tidy_storm$PROPDMGEXP=="h","DMGEXP"]<-100
tidy_storm[tidy_storm$PROPDMGEXP=="H","DMGEXP"]<-100
tidy_storm[tidy_storm$PROPDMGEXP=="1","DMGEXP"]<-10
tidy_storm[tidy_storm$PROPDMGEXP=="+","DMGEXP"]<-0
tidy_storm[tidy_storm$PROPDMGEXP=="-","DMGEXP"]<-0
tidy_storm[tidy_storm$PROPDMGEXP=="?","DMGEXP"]<-0
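The assignments above amount to a lookup table from exponent code to multiplier. As a compact alternative, the same mapping could be kept in one named vector; the sketch below assumes the mapping is exactly the one listed above, and `multiplier` and `lookup_multiplier()` are hypothetical names introduced here.

```r
# Multipliers copied from the assignments above; digits map to powers of ten.
multiplier <- c(
  "K" = 1e3, "M" = 1e6, "m" = 1e6, "B" = 1e9, "H" = 1e2, "h" = 1e2,
  "0" = 1, "1" = 10, "2" = 1e2, "3" = 1e3, "4" = 1e4,
  "5" = 1e5, "6" = 1e6, "7" = 1e7, "8" = 1e8
)

lookup_multiplier <- function(code) {
  code <- as.character(code)
  out <- unname(multiplier[code])  # NA for codes missing from the table
  out[which(code == "")] <- 1      # an empty code means "no scaling"
  out[is.na(out)] <- 0             # "+", "-", "?" and anything unknown
  out
}

toy_mult <- lookup_multiplier(c("K", "M", "", "B", "5", "?"))
```

The empty string needs its own line because a named vector cannot be indexed by an empty name. With this helper the column could be filled in one step, for example `tidy_storm$DMGEXP <- lookup_multiplier(tidy_storm$PROPDMGEXP)`.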
tidy_storm<-group_by(tidy_storm,EVTYPE)
all_lethal<-summarize(tidy_storm, inj=sum(INJURIES+FATALITIES))
all_damage<-summarize(tidy_storm, damage=sum(PROPDMG*DMGEXP))
top_lethal<-arrange(all_lethal,desc(inj))[1:10,]
top_damage<-arrange(all_damage,desc(damage))[1:10,]
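To make the aggregation step concrete, here is a small worked sketch of the same group, sum, sort, and take-the-top pattern on made-up numbers. The data frame `toy` and the result names are hypothetical, and the sketch keeps the top two rows instead of the top ten.

```r
library(dplyr)

# Hypothetical events with the multiplier column already filled in.
toy <- data.frame(
  EVTYPE = c("TORNADO", "TORNADO", "FLOOD", "HEAT"),
  INJURIES = c(10, 5, 1, 0),
  FATALITIES = c(1, 0, 0, 3),
  PROPDMG = c(2, 1, 5, 0),
  DMGEXP = c(1e6, 1e3, 1e6, 1),
  stringsAsFactors = FALSE
)

# One row per event type: total victims and total property damage in dollars.
toy_summary <- toy %>%
  group_by(EVTYPE) %>%
  summarise(
    victims = sum(INJURIES + FATALITIES),
    damage = sum(PROPDMG * DMGEXP)
  ) %>%
  ungroup()

# Sort in descending order and keep the first rows of each ranking.
toy_top_victims <- toy_summary %>% arrange(desc(victims)) %>% head(2)
toy_top_damage <- toy_summary %>% arrange(desc(damage)) %>% head(2)
```

Here TORNADO leads on victims (16), while FLOOD leads on damage (5,000,000 against 2,001,000 for TORNADO), which shows how the two rankings can differ.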
Question: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
Answer: The disasters most harmful to the population are the following (inj is the sum of injuries and fatalities):
top_lethal
## Source: local data frame [10 x 2]
##
## EVTYPE inj
## 1 TORNADO 96979
## 2 EXCESSIVE HEAT 8428
## 3 TSTM WIND 7461
## 4 FLOOD 7259
## 5 LIGHTNING 6046
## 6 HEAT 3037
## 7 FLASH FLOOD 2755
## 8 ICE STORM 2064
## 9 THUNDERSTORM WIND 1621
## 10 WINTER STORM 1527
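The table lists TSTM WIND and THUNDERSTORM WIND as separate event types, although the labels appear to describe the same kind of event. On the printed figures the two together account for 7461 + 1621 = 9082 victims, which would place thunderstorm wind ahead of EXCESSIVE HEAT. A possible cleaning step is sketched below; `normalize_evtype()` is a hypothetical helper, and the single abbreviation rule is an assumption, not a complete treatment of the EVTYPE variants in the database.

```r
# Hypothetical helper: unify case, whitespace, and the TSTM abbreviation.
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- gsub("\\s+", " ", x)             # collapse repeated whitespace
  x <- sub("^TSTM", "THUNDERSTORM", x)  # expand the leading abbreviation
  x
}

toy_labels <- normalize_evtype(c("TSTM WIND", " thunderstorm  wind ", "HEAT"))
```

Applying such a helper to EVTYPE before grouping would merge the variants it covers; the ranking reported here was computed without it.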
qplot(data=top_lethal[1:5,],x=EVTYPE,y=inj,xlab="Type of disaster",ylab="The number of victims", main="Number of victims caused by disasters", geom="area",size=20,colour=EVTYPE)
Question: Across the United States, which types of events have the greatest economic consequences?
Answer: The most destructive disasters, measured by total property damage in dollars, are:
top_damage
## Source: local data frame [10 x 2]
##
## EVTYPE damage
## 1 FLOOD 144657709807
## 2 HURRICANE/TYPHOON 69305840000
## 3 TORNADO 56947380617
## 4 STORM SURGE 43323536000
## 5 FLASH FLOOD 16822673979
## 6 HAIL 15735267513
## 7 HURRICANE 11868319010
## 8 TROPICAL STORM 7703890550
## 9 WINTER STORM 6688497251
## 10 HIGH WIND 5270046260
qplot(data=top_damage[1:5,],x=EVTYPE,y=damage,xlab="Type of disaster",ylab="Economic consequences, $", main="Economic consequences ($) caused by disasters", geom="area",size=20,colour=EVTYPE)
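Because the event type is a categorical axis, a bar chart is a common way to show this kind of ranking. The sketch below is one possible alternative to the plot call above, not the figure from this report; it reuses the five largest damage totals printed in the table, and `top5_damage` and `p_damage` are hypothetical names. The same pattern would apply to the victims plot.

```r
library(ggplot2)

# The five largest damage totals, copied from the table above.
top5_damage <- data.frame(
  EVTYPE = c("FLOOD", "HURRICANE/TYPHOON", "TORNADO", "STORM SURGE", "FLASH FLOOD"),
  damage = c(144657709807, 69305840000, 56947380617, 43323536000, 16822673979),
  stringsAsFactors = FALSE
)

# Bars ordered from largest to smallest, with damage shown in billions of dollars.
p_damage <- ggplot(top5_damage, aes(x = reorder(EVTYPE, -damage), y = damage / 1e9)) +
  geom_col() +
  labs(
    x = "Type of disaster",
    y = "Property damage, billion $",
    title = "Economic consequences ($) caused by disasters"
  )
```

Calling `print(p_damage)` draws the chart. Dividing by 1e9 keeps the axis labels short.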