Top-rated climate events by damage and cost

From 1950 to November 2011

Synopsis

Patterns in climate events can be a valuable tool for preventing catastrophes. In recent years these events have been recorded more carefully, which gives us the opportunity to draw some conclusions that could support decision making for preventive responses to climate events. This document lists the worst and most costly types of events that occurred from 1950 to November 2011.

Data Processing

Session info and local configuration

For reproducibility purposes only.

R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_0.4.1 plyr_1.8.1

loaded via a namespace (and not attached):
[1] assertthat_0.1 DBI_0.3.1 digest_0.6.8 htmltools_0.2.6 lazyeval_0.1.10
[6] magrittr_1.5 parallel_3.1.1 R6_2.0.1 Rcpp_0.11.3 rmarkdown_0.8.1
[11] tools_3.1.1 yaml_2.1.13

Getting Data:

Download the StormData file from the Coursera URL: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2 The data is then read directly from the raw compressed file in the working directory.

library(plyr)
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.1.2
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:plyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
Sys.setlocale(category = "LC_ALL", "English")
## [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
setwd("D:/")
# Save the archive under the same name that read.csv() opens below
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", "repdata-data-StormData.csv.bz2", mode = "wb")
dateDownloaded <- date()
stormdata <- read.csv("repdata-data-StormData.csv.bz2")
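
Because the archive is large, it can help to skip the download when the file is already in the working directory. The following is a minimal sketch of that idea, not part of the original analysis; `fetch_once` is a hypothetical helper name, and the two calls at the end are left commented out so the block does not touch the network when sourced.

```r
# Hypothetical helper: download the archive only when it is not already present.
fetch_once <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # Binary mode keeps the compressed file intact on Windows
    download.file(url, destfile, mode = "wb")
  }
  invisible(destfile)
}

storm_url  <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
storm_file <- "repdata-data-StormData.csv.bz2"

# fetch_once(storm_url, storm_file)
# stormdata <- read.csv(storm_file)  # read.csv reads .bz2 files without manual decompression
```

Keeping the download and the read on the same `storm_file` variable also avoids a mismatch between the saved file name and the name that is read.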

You can find the description of the data file variables at the following URLs: https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf and https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2FNCDC%20Storm%20Events-FAQ%20Page.pdf

Subsetting the data

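We are interested in the worst events, with damage prioritized in this order: fatalities, injuries, and damage cost to property and crops.

The main subset includes only rows where Fatalities and Injuries are both > 0 and where Property and Crop Damage are both recorded in millions of dollars. Two looser subsets, one applying only the casualty condition and one applying only the damage condition, are used for the summaries by event type.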

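# Wrap the raw data frame in a dplyr table for compact printing
stormdatadf <- tbl_df(stormdata)
# Events with casualties and with both damage figures recorded in millions
stormdata_2 <- filter(stormdatadf, FATALITIES > 0, INJURIES > 0, PROPDMGEXP == "M", CROPDMGEXP == "M")
# Events with at least one fatality and at least one injury
stormdata_3 <- filter(stormdatadf, FATALITIES > 0, INJURIES > 0)
# Events whose property and crop damage are both recorded in millions
stormdata_4 <- filter(stormdatadf, PROPDMGEXP == "M", CROPDMGEXP == "M")
byEVtype <- group_by(stormdata_2, EVTYPE)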
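
The filter above keeps only rows whose exponent code is "M", so PROPDMG and CROPDMG can be compared directly as millions of dollars. If rows with other magnitudes were to be included, the amounts would first need to be put on a common scale. The sketch below shows one way to do that in base R. It is an illustration only: the helper `to_dollars` is hypothetical, and the mapping of "K" and "B" to thousands and billions is an assumption that should be checked against the variable documentation linked earlier. Any other code is returned as NA rather than guessed.

```r
# Assumed multipliers for the exponent codes; codes not listed here yield NA.
exp_multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: convert an amount and its exponent code to US dollars.
to_dollars <- function(amount, exp_code) {
  # as.character() handles factor columns as produced by read.csv
  code <- toupper(as.character(exp_code))
  amount * unname(exp_multiplier[code])
}

# Example call with made-up values; the last code is unknown and gives NA
to_dollars(c(929, 2.5, 10), c("M", "K", "?"))
```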

Results

The analysis identifies the most severe weather events, based on the NOAA Storm Database.

Top rated list

The summary tables list the events most harmful to population health and those with the greatest economic consequences across the United States. Damage amounts are in millions of US dollars.

byEVtype1 <- group_by(stormdata_3, EVTYPE)
byEVtype2 <- group_by(stormdata_4, EVTYPE)
summarize(byEVtype2, MaxPropDamage = max(PROPDMG), MaxCropDamage = max(CROPDMG)) %>% arrange(desc(MaxPropDamage), desc(MaxCropDamage))
## Source: local data frame [38 x 3]
## 
##               EVTYPE MaxPropDamage MaxCropDamage
## 1          HIGH WIND        929.00        175.00
## 2  HURRICANE/TYPHOON        621.00        423.00
## 3               HAIL        500.00         55.00
## 4              FLOOD        450.00        500.00
## 5          HURRICANE        410.62        413.60
## 6     HURRICANE ERIN        230.00          5.00
## 7            TORNADO        200.00          6.00
## 8           WILDFIRE        151.10         45.40
## 9        FLASH FLOOD        150.00        200.00
## 10    River Flooding         78.74         26.84
## ..               ...           ...           ...

Top rated events with the most fatalities

select(stormdata_2, BGN_DATE, STATE, EVTYPE, FATALITIES) %>% arrange(desc(FATALITIES))
## Source: local data frame [41 x 4]
## 
##              BGN_DATE STATE             EVTYPE FATALITIES
## 1    4/8/1998 0:00:00    AL            TORNADO         32
## 2    2/5/2008 0:00:00    TN            TORNADO         13
## 3   3/12/2006 0:00:00    TX           WILDFIRE         12
## 4   2/13/2000 0:00:00    GA            TORNADO         11
## 5   10/5/1995 0:00:00    GA THUNDERSTORM WINDS          8
## 6   2/14/2000 0:00:00    GA            TORNADO          6
## 7   3/13/1993 0:00:00    GA           BLIZZARD          5
## 8   4/24/2010 0:00:00    MS            TORNADO          5
## 9  11/10/1998 0:00:00    WI          HIGH WIND          4
## 10   5/4/2003 0:00:00    MO            TORNADO          4
## ..                ...   ...                ...        ...

Top rated events with the most injuries

select(stormdata_2, BGN_DATE, STATE, EVTYPE, INJURIES) %>% arrange(desc(INJURIES))
## Source: local data frame [41 x 4]
## 
##             BGN_DATE STATE         EVTYPE INJURIES
## 1   2/8/1994 0:00:00    OH      ICE STORM     1568
## 2  3/13/1993 0:00:00    GA       BLIZZARD      385
## 3   4/8/1998 0:00:00    AL        TORNADO      258
## 4  2/13/2000 0:00:00    GA        TORNADO      175
## 5  11/4/1998 0:00:00    FL TROPICAL STORM       65
## 6  4/24/2010 0:00:00    MS        TORNADO       53
## 7  4/28/2011 0:00:00    VA        TORNADO       50
## 8  4/28/2011 0:00:00    VA        TORNADO       50
## 9   2/5/2008 0:00:00    TN        TORNADO       44
## 10 4/24/2010 0:00:00    MS        TORNADO       40
## ..               ...   ...            ...      ...

Top rated events with the most property damage

select(stormdata_2, BGN_DATE, STATE, EVTYPE, PROPDMG) %>% arrange(desc(PROPDMG))
## Source: local data frame [41 x 4]
## 
##              BGN_DATE STATE                  EVTYPE PROPDMG
## 1   8/13/2004 0:00:00    FL               HIGH WIND  929.00
## 2    4/8/1998 0:00:00    AL                 TORNADO  200.00
## 3   7/12/1996 0:00:00    NC               HURRICANE  140.25
## 4   4/24/2010 0:00:00    MS                 TORNADO  140.00
## 5   4/24/2010 0:00:00    MS                 TORNADO   90.00
## 6   10/5/1995 0:00:00    GA      THUNDERSTORM WINDS   75.00
## 7   9/14/2004 0:00:00    PR          TROPICAL STORM   68.00
## 8   12/9/1995 0:00:00    CA WINTER STORM HIGH WINDS   60.00
## 9   4/24/2010 0:00:00    MS                 TORNADO   60.00
## 10 10/26/2003 0:00:00    CA                WILDFIRE   55.22
## ..                ...   ...                     ...     ...

Top rated events with the most crop damage

select(stormdata_2, BGN_DATE, STATE, EVTYPE, CROPDMG) %>% arrange(desc(CROPDMG))
## Source: local data frame [41 x 4]
## 
##              BGN_DATE STATE             EVTYPE CROPDMG
## 1   8/13/2004 0:00:00    FL          HIGH WIND   175.0
## 2   7/12/1996 0:00:00    NC          HURRICANE   127.0
## 3   9/14/2004 0:00:00    PR     TROPICAL STORM   101.5
## 4   10/5/1995 0:00:00    GA THUNDERSTORM WINDS    50.0
## 5   3/13/1993 0:00:00    GA           BLIZZARD    50.0
## 6   3/12/2006 0:00:00    TX           WILDFIRE    45.4
## 7    1/5/2003 0:00:00    CA          HIGH WIND    28.0
## 8  10/26/2003 0:00:00    CA           WILDFIRE    22.0
## 9   11/4/1998 0:00:00    FL     TROPICAL STORM    20.0
## 10  5/31/1998 0:00:00    MI          TSTM WIND    10.0
## ..                ...   ...                ...     ...

Cumulative damage over time

The bar plots show the five most damaging types of events.

plot1 <- summarize(byEVtype2, sumPropDamage = sum(PROPDMG), sumCropDamage = sum(CROPDMG)) %>% mutate(totaldmg= sumPropDamage + sumCropDamage) %>% arrange(desc(totaldmg))
plot2 <- summarize(byEVtype1, sumFatalities = sum(FATALITIES)) %>% arrange(desc(sumFatalities))
plot3 <- summarize(byEVtype1, sumInjuries = sum(INJURIES)) %>% arrange(desc(sumInjuries))
par(mfrow = c(1,3), las = 3)
barplot(plot1$totaldmg[1:5], names.arg = plot1$EVTYPE[1:5], main = "Total Millions USD Damage")
barplot(plot2$sumFatalities[1:5], names.arg = plot2$EVTYPE[1:5], main = "Total Fatalities top 5 Events")
barplot(plot3$sumInjuries[1:5], names.arg = plot3$EVTYPE[1:5], main = "Total Injuries top 5 Events")
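
The aggregation behind the bar plots (sum per event type, sort in descending order, keep the first few rows) can be checked on a small table. The sketch below uses base R `aggregate` instead of dplyr so that it runs without extra packages. The `toy` data frame is invented purely for illustration and is not taken from the NOAA file.

```r
# Toy data, invented for illustration only (not from the storm database).
toy <- data.frame(
  EVTYPE  = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  PROPDMG = c(200, 450, 140, 500, 10),
  CROPDMG = c(6, 500, 0, 55, 5),
  stringsAsFactors = FALSE
)

# Sum both damage columns per event type, then rank by their combined total.
totals <- aggregate(cbind(PROPDMG, CROPDMG) ~ EVTYPE, data = toy, FUN = sum)
totals$totaldmg <- totals$PROPDMG + totals$CROPDMG
totals <- totals[order(-totals$totaldmg), ]

# Keep the two event types with the largest totals
top_events <- head(totals, 2)
top_events
```

Using `head()` rather than indexing with `[1:5]` avoids NA entries when fewer event types than requested are present.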