Patterns in climate events can be a valuable tool for preventing catastrophes. In recent years these events have been recorded more carefully, which gives us the opportunity to draw some conclusions that could support decision making for preventive responses to climate events. This document lists the worst and most costly types of events that occurred from 1950 to November 2011.
The session information below is included for reproducibility only.
R version 3.1.1 (2014-07-10) Platform: x86_64-w64-mingw32/x64 (64-bit)
locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] dplyr_0.4.1 plyr_1.8.1
loaded via a namespace (and not attached): [1] assertthat_0.1 DBI_0.3.1 digest_0.6.8 htmltools_0.2.6 lazyeval_0.1.10 [6] magrittr_1.5 parallel_3.1.1 R6_2.0.1 Rcpp_0.11.3 rmarkdown_0.8.1 [11] tools_3.1.1 yaml_2.1.13
Download StormData from the Coursera URL https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2 and read it directly from the compressed file in the working directory.
library(plyr)
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.1.2
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:plyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Sys.setlocale(category = "LC_ALL", "English")
## [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
setwd("D:/")
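# Save under the same name that read.csv() opens below; mode = "wb" keeps the binary archive intact on Windows
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", "repdata-data-StormData.csv.bz2", mode = "wb")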
dateDownloaded <- date()
stormdata <- read.csv("repdata-data-StormData.csv.bz2")
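As a small check of the read step, `read.csv` can open a bzip2-compressed file directly, so the archive does not have to be unpacked first. The sketch below assumes a made-up three-row table and a temporary file instead of the real NOAA download; the names `toy`, `tmp` and `back` are illustrative only.

```r
# Toy data standing in for the storm file (values are made up for illustration).
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(2, 0, 1),
  stringsAsFactors = FALSE
)

# Write the table as a bzip2-compressed CSV in a temporary location.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv detects the compression and decompresses while reading.
back <- read.csv(tmp, stringsAsFactors = FALSE)
print(back)
```

The same call pattern is what lets the full storm file be read without a separate decompression step, although the real file is large and takes noticeably longer to load.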
The variables in the data file are described at the following URLs: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf
We are interested in the worst events, with damage prioritized by: fatalities, injuries, and damage cost to property and crops.
Three subsets are built: one with only the rows where both FATALITIES and INJURIES are greater than 0, one with only the rows where both property and crop damage are recorded in millions of dollars (exponent "M"), and one that meets both conditions.
stormdatadf <- tbl_df(stormdata)
stormdata_2 <- filter(stormdatadf, FATALITIES >0, INJURIES >0, PROPDMGEXP == "M", CROPDMGEXP =="M")
stormdata_3 <- filter(stormdatadf, FATALITIES >0, INJURIES >0)
stormdata_4 <- filter(stormdatadf, PROPDMGEXP == "M", CROPDMGEXP =="M")
byEVtype <- group_by(stormdata_2, EVTYPE)
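To make the three subsets concrete, here is a minimal sketch on a made-up four-row table. It uses base R `subset` rather than `dplyr::filter` so that it runs without extra packages; the comma-separated conditions in `filter` are combined with a logical AND, which is written explicitly as `&` here. The toy values and the names `casualties`, `millions` and `both` are assumptions for illustration only.

```r
# Made-up rows that mimic the relevant columns of the storm data.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HAIL", "HIGH WIND"),
  FATALITIES = c(3, 0, 1, 2),
  INJURIES   = c(10, 0, 0, 5),
  PROPDMG    = c(200, 450, 5, 929),
  PROPDMGEXP = c("M", "M", "K", "M"),
  CROPDMG    = c(6, 500, 1, 175),
  CROPDMGEXP = c("M", "M", "K", "K"),
  stringsAsFactors = FALSE
)

# Rows with at least one fatality AND at least one injury (like stormdata_3).
casualties <- subset(toy, FATALITIES > 0 & INJURIES > 0)

# Rows where property AND crop damage are both given in millions (like stormdata_4).
millions <- subset(toy, PROPDMGEXP == "M" & CROPDMGEXP == "M")

# Rows that satisfy all four conditions at once (like stormdata_2).
both <- subset(toy, FATALITIES > 0 & INJURIES > 0 &
                 PROPDMGEXP == "M" & CROPDMGEXP == "M")

print(nrow(casualties))
print(nrow(millions))
print(both$EVTYPE)
```

Because every condition must hold at the same time, the combined subset is always the smallest of the three, which is why the tables built from `stormdata_2` contain only a few dozen rows.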
The analysis identifies the most severe weather events, based on the NOAA Storm Database.
The summary tables below show the events most harmful to population health and those with the greatest economic consequences across the United States.
byEVtype1 <- group_by(stormdata_3, EVTYPE)
byEVtype2 <- group_by(stormdata_4, EVTYPE)
summarize(byEVtype2, MaxPropDamage = max(PROPDMG), MaxCropDamage = max(CROPDMG)) %>% arrange(desc(MaxPropDamage), desc(MaxCropDamage))
## Source: local data frame [38 x 3]
##
## EVTYPE MaxPropDamage MaxCropDamage
## 1 HIGH WIND 929.00 175.00
## 2 HURRICANE/TYPHOON 621.00 423.00
## 3 HAIL 500.00 55.00
## 4 FLOOD 450.00 500.00
## 5 HURRICANE 410.62 413.60
## 6 HURRICANE ERIN 230.00 5.00
## 7 TORNADO 200.00 6.00
## 8 WILDFIRE 151.10 45.40
## 9 FLASH FLOOD 150.00 200.00
## 10 River Flooding 78.74 26.84
## .. ... ... ...
Top events by fatalities
select(stormdata_2, BGN_DATE, STATE, EVTYPE, FATALITIES) %>% arrange(desc(FATALITIES))
## Source: local data frame [41 x 4]
##
## BGN_DATE STATE EVTYPE FATALITIES
## 1 4/8/1998 0:00:00 AL TORNADO 32
## 2 2/5/2008 0:00:00 TN TORNADO 13
## 3 3/12/2006 0:00:00 TX WILDFIRE 12
## 4 2/13/2000 0:00:00 GA TORNADO 11
## 5 10/5/1995 0:00:00 GA THUNDERSTORM WINDS 8
## 6 2/14/2000 0:00:00 GA TORNADO 6
## 7 3/13/1993 0:00:00 GA BLIZZARD 5
## 8 4/24/2010 0:00:00 MS TORNADO 5
## 9 11/10/1998 0:00:00 WI HIGH WIND 4
## 10 5/4/2003 0:00:00 MO TORNADO 4
## .. ... ... ... ...
Top events by injuries
select(stormdata_2, BGN_DATE, STATE, EVTYPE, INJURIES) %>% arrange(desc(INJURIES))
## Source: local data frame [41 x 4]
##
## BGN_DATE STATE EVTYPE INJURIES
## 1 2/8/1994 0:00:00 OH ICE STORM 1568
## 2 3/13/1993 0:00:00 GA BLIZZARD 385
## 3 4/8/1998 0:00:00 AL TORNADO 258
## 4 2/13/2000 0:00:00 GA TORNADO 175
## 5 11/4/1998 0:00:00 FL TROPICAL STORM 65
## 6 4/24/2010 0:00:00 MS TORNADO 53
## 7 4/28/2011 0:00:00 VA TORNADO 50
## 8 4/28/2011 0:00:00 VA TORNADO 50
## 9 2/5/2008 0:00:00 TN TORNADO 44
## 10 4/24/2010 0:00:00 MS TORNADO 40
## .. ... ... ... ...
Top events by property damage
select(stormdata_2, BGN_DATE, STATE, EVTYPE, PROPDMG) %>% arrange(desc(PROPDMG))
## Source: local data frame [41 x 4]
##
## BGN_DATE STATE EVTYPE PROPDMG
## 1 8/13/2004 0:00:00 FL HIGH WIND 929.00
## 2 4/8/1998 0:00:00 AL TORNADO 200.00
## 3 7/12/1996 0:00:00 NC HURRICANE 140.25
## 4 4/24/2010 0:00:00 MS TORNADO 140.00
## 5 4/24/2010 0:00:00 MS TORNADO 90.00
## 6 10/5/1995 0:00:00 GA THUNDERSTORM WINDS 75.00
## 7 9/14/2004 0:00:00 PR TROPICAL STORM 68.00
## 8 12/9/1995 0:00:00 CA WINTER STORM HIGH WINDS 60.00
## 9 4/24/2010 0:00:00 MS TORNADO 60.00
## 10 10/26/2003 0:00:00 CA WILDFIRE 55.22
## .. ... ... ... ...
Top events by crop damage
select(stormdata_2, BGN_DATE, STATE, EVTYPE, CROPDMG) %>% arrange(desc(CROPDMG))
## Source: local data frame [41 x 4]
##
## BGN_DATE STATE EVTYPE CROPDMG
## 1 8/13/2004 0:00:00 FL HIGH WIND 175.0
## 2 7/12/1996 0:00:00 NC HURRICANE 127.0
## 3 9/14/2004 0:00:00 PR TROPICAL STORM 101.5
## 4 10/5/1995 0:00:00 GA THUNDERSTORM WINDS 50.0
## 5 3/13/1993 0:00:00 GA BLIZZARD 50.0
## 6 3/12/2006 0:00:00 TX WILDFIRE 45.4
## 7 1/5/2003 0:00:00 CA HIGH WIND 28.0
## 8 10/26/2003 0:00:00 CA WILDFIRE 22.0
## 9 11/4/1998 0:00:00 FL TROPICAL STORM 20.0
## 10 5/31/1998 0:00:00 MI TSTM WIND 10.0
## .. ... ... ... ...
The barplots show the five most damaging types of events for each measure.
plot1 <- summarize(byEVtype2, sumPropDamage = sum(PROPDMG), sumCropDamage = sum(CROPDMG)) %>% mutate(totaldmg= sumPropDamage + sumCropDamage) %>% arrange(desc(totaldmg))
plot2 <- summarize(byEVtype1, sumFatalities = sum(FATALITIES)) %>% arrange(desc(sumFatalities))
plot3 <- summarize(byEVtype1, sumInjuries = sum(INJURIES)) %>% arrange(desc(sumInjuries))
par(mfrow = c(1,3), las = 3)
barplot(plot1$totaldmg[1:5], names.arg = plot1$EVTYPE[1:5], main = "Total Millions USD Damage")
barplot(plot2$sumFatalities[1:5], names.arg = plot2$EVTYPE[1:5], main = "Total Fatalities top 5 Events")
barplot(plot3$sumInjuries[1:5], names.arg = plot3$EVTYPE[1:5], main = "Total Injuries top 5 Events")
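The totals behind the barplots follow a simple pattern: sum each measure per event type, sort in descending order, and keep the first few rows. The sketch below shows that pattern in base R on a made-up table, keeping the top 2 instead of the top 5 because the toy data has only three event types. The values and the names `totals` and `top` are assumptions for illustration.

```r
# Made-up damage records, two per event type, in millions of USD.
toy <- data.frame(
  EVTYPE  = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD", "HAIL"),
  PROPDMG = c(200, 450, 90, 500, 150, 5),
  CROPDMG = c(6, 500, 0, 55, 200, 1),
  stringsAsFactors = FALSE
)

# Sum property and crop damage within each event type.
totals <- aggregate(cbind(PROPDMG, CROPDMG) ~ EVTYPE, data = toy, FUN = sum)

# Combine both kinds of damage and sort from most to least damaging.
totals$totaldmg <- totals$PROPDMG + totals$CROPDMG
totals <- totals[order(-totals$totaldmg), ]

# Keep only the leading rows for plotting.
top <- head(totals, 2)
print(top)
```

Passing `top$totaldmg` and `top$EVTYPE` to `barplot` as `height` and `names.arg` would produce a chart in the same style as the ones in this report.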