Synopsis

This report analyzes storm data across the U.S. It is based on the storm data set, which records observations of “event types” and the resulting bodily harm and/or property damage. The goal of the report is to identify the storm events associated with the highest numbers of fatalities and injuries and with the highest property and crop damage costs.

Data Processing

This report contains all code used and is completely reproducible starting from the download of the available bz2 file. Although the unzipped dataset was in csv format, some cleanup was needed.

The EVTYPE column, which holds the storm event type names, contained misspellings and spelling variants, so the same event appeared under several names. This was addressed by converting the names to lowercase, using grep to match on common words, and replacing each match with one common term (e.g. “tornado”). Some variants (for example “tstm wind”, “thunderstorm wind” and “thunderstorm winds”) still appear as separate rows in the tables below.

Each financial field (property damage and crop damage) had a corresponding field with an “EXP” suffix. This field contained, for example, “K”, “M” or “B” to denote thousands, millions or billions, respectively. There were also a few nonsensical values (such as “+” and “?”). Each code was replaced with a numeric multiplier, and the amount field was multiplied by it. The irrelevant fields were then discarded.
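The event-name cleanup described above can be sketched as follows. This is a minimal illustration on toy names, not the code used for the report: the vector `raw`, the `patterns` lookup and the patterns themselves are assumptions made for the example.

```r
# Toy event names; in the report these would come from the EVTYPE column.
raw <- c("TORNADO", "Tornado F1", "TORNDAO", "Flash Flood", "FLASH FLOODING", "Hail")

# Step 1: lowercase and trim whitespace so that case variants collapse.
raw <- trimws(tolower(raw))

# Step 2: replace every name that matches a pattern with one common term.
# These patterns are illustrative assumptions, not the report's actual list.
patterns <- c("torn" = "tornado", "flash flood" = "flash flood")

clean <- raw
for (p in names(patterns)) {
  # Match against the lowercased originals so earlier replacements
  # do not affect later matches.
  clean[grepl(p, raw, fixed = TRUE)] <- patterns[[p]]
}

print(table(clean))
```

Because each pattern is matched against the original names, a later pattern overrides an earlier one when both match, so more specific patterns would go last in the lookup.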

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
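# Adjust this path to a local working directory before running.
wd <- "/Users/Maria/Documents/Coursera/Data Science Specialization/Reproducible Research"
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
dfile <- "stormdata.bz2"
setwd(wd)
# Download only once; mode = "wb" keeps the compressed file intact on Windows.
if (!file.exists(dfile)) download.file(url, dfile, mode = "wb")
# Read the csv through a bz2 connection, then close the connection.
unzipped <- bzfile(dfile, "r")
sdata <- read.csv(unzipped)
close(unzipped)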
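The multiplier step described under Data Processing might look like the sketch below. It uses toy rows rather than the storm data. The helper name `exp_to_multiplier` is hypothetical, and treating blank or nonsensical codes as a multiplier of 1 is an assumption of this example, since the report does not say how those values were handled.

```r
# Toy rows; the column names follow the storm data (PROPDMG, PROPDMGEXP).
dmg <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10, 5),
  PROPDMGEXP = c("K", "M", "B", "", "?"),
  stringsAsFactors = FALSE
)

# Lookup from exponent code to numeric multiplier.
multipliers <- c(K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: codes not in the lookup ("", "+", "?") map to 1.
# That fallback is an assumption of this sketch.
exp_to_multiplier <- function(code) {
  m <- multipliers[toupper(code)]
  m[is.na(m)] <- 1
  unname(m)
}

# Convert the amount field to a full dollar value.
dmg$PROPDMG <- dmg$PROPDMG * exp_to_multiplier(dmg$PROPDMGEXP)
print(dmg$PROPDMG)
```

The same helper could be applied to CROPDMG and CROPDMGEXP, after which the “EXP” columns are no longer needed.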

Results

Fatalities and injuries are reported first. The top 20 event types are presented in the following tables and graphs. For completeness, I included the top storm events by fatalities, by injuries, and by the two combined. The graphs use a log (base 10) scale because the tornado fatality and injury counts are so high that they would otherwise squash the remaining results.
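To illustrate why the log scale helps, the sketch below applies log10 to the combined counts of the top five events, copied from the “Fatalities and injuries summed and sorted” table. The base-graphics `barplot` call is an assumption made for the example; the plotting code behind the report's figures is not shown.

```r
# Combined fatalities and injuries for the top five events, copied from the
# "Fatalities and injuries summed and sorted" table in this report.
fplusi_top <- c(tornado = 96979, "excessive heat" = 8428, "tstm wind" = 7461,
                flood = 7259, lightning = 6046)

# On a linear axis the tornado bar is more than ten times taller than any
# other bar; log10 compresses that range so the others stay readable.
log_counts <- log10(fplusi_top)
print(round(log_counts, 2))

# Bar chart of the transformed counts (base graphics, an assumption here).
barplot(log_counts, las = 2, ylab = "log10(fatalities + injuries)")
```

On this scale a difference of 1 between two bars corresponds to a tenfold difference in the underlying counts.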

Fatalities and Injuries

# Drop the "EXP" columns (the multipliers have already been applied),
# then split the data into casualty and damage data frames.
sdata <- select(sdata, -ends_with("EXP"))
prop <- select(sdata, -contains("IES"))   # drops FATALITIES and INJURIES
fatal <- select(sdata, -contains("ROP"))  # drops PROPDMG and CROPDMG

# Total fatalities and injuries per event type, sorted three ways.
fatal <- aggregate(. ~ EVTYPE, data = fatal, FUN = sum)
fatal <- arrange(fatal, desc(FATALITIES))
injuries <- arrange(fatal, desc(INJURIES))
fplusi <- mutate(fatal, FPLUSI = FATALITIES + INJURIES)
fplusi <- arrange(fplusi, desc(FPLUSI))

# Keep the top 20 of each ranking.
fatal <- head(fatal, n = 20)
injuries <- head(injuries, n = 20)
fplusi <- head(fplusi, n = 20)
# Top-20 combined ranking with the tornado row removed.
fplusi2 <- filter(fplusi, EVTYPE != "tornado")

# Same steps for property and crop damage.
prop <- aggregate(. ~ EVTYPE, data = prop, FUN = sum)
prop <- arrange(prop, desc(PROPDMG))
crop <- arrange(prop, desc(CROPDMG))
cplusp <- mutate(prop, CPLUSP = PROPDMG + CROPDMG)
cplusp <- arrange(cplusp, desc(CPLUSP))
prop <- head(prop, n = 20)
crop <- head(crop, n = 20)
cplusp <- head(cplusp, n = 20)

Sorted by fatalities

##                     EVTYPE FATALITIES INJURIES
## 1                  tornado       5633    91346
## 2           excessive heat       1903     6525
## 3              flash flood        978     1777
## 4                     heat        937     2100
## 5                lightning        816     5230
## 6                tstm wind        504     6957
## 7                    flood        470     6789
## 8              rip current        368      232
## 9                high wind        248     1137
## 10               avalanche        224      170
## 11            winter storm        206     1321
## 12            rip currents        204      297
## 13               heat wave        172      379
## 14            extreme cold        162      231
## 15       thunderstorm wind        133     1488
## 16              heavy snow        127     1021
## 17 extreme cold/wind chill        125       24
## 18               high surf        104      156
## 19             strong wind        103      280
## 20                blizzard        101      805

Sorted by injuries

##                EVTYPE FATALITIES INJURIES
## 1             tornado       5633    91346
## 2           tstm wind        504     6957
## 3               flood        470     6789
## 4      excessive heat       1903     6525
## 5           lightning        816     5230
## 6                heat        937     2100
## 7           ice storm         89     1975
## 8         flash flood        978     1777
## 9   thunderstorm wind        133     1488
## 10               hail         15     1361
## 11       winter storm        206     1321
## 12  hurricane/typhoon         64     1275
## 13          high wind        248     1137
## 14         heavy snow        127     1021
## 15           wildfire         75      911
## 16 thunderstorm winds         64      908
## 17           blizzard        101      805
## 18                fog         62      734
## 19   wild/forest fire         12      545
## 20         dust storm         22      440

Fatalities and injuries summed and sorted

##                EVTYPE FATALITIES INJURIES FPLUSI
## 1             tornado       5633    91346  96979
## 2      excessive heat       1903     6525   8428
## 3           tstm wind        504     6957   7461
## 4               flood        470     6789   7259
## 5           lightning        816     5230   6046
## 6                heat        937     2100   3037
## 7         flash flood        978     1777   2755
## 8           ice storm         89     1975   2064
## 9   thunderstorm wind        133     1488   1621
## 10       winter storm        206     1321   1527
## 11          high wind        248     1137   1385
## 12               hail         15     1361   1376
## 13  hurricane/typhoon         64     1275   1339
## 14         heavy snow        127     1021   1148
## 15           wildfire         75      911    986
## 16 thunderstorm winds         64      908    972
## 17           blizzard        101      805    906
## 18                fog         62      734    796
## 19        rip current        368      232    600
## 20   wild/forest fire         12      545    557
Fig. 1: Storm Event Effects on fatalities/injuries

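These results show that tornadoes have by far the largest effect on both fatalities and injuries: 5,633 fatalities and 91,346 injuries, compared with 1,903 fatalities for the next event by fatalities (excessive heat) and 6,957 injuries for the next event by injuries (tstm wind).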

Cost of Property and Crop Damage

Sorted by property damage cost

##                       EVTYPE      PROPDMG    CROPDMG
## 1                      flood 144657709807 5661968450
## 2          hurricane/typhoon  69305840000 2607872800
## 3                    tornado  56937161054  414953110
## 4                storm surge  43323536000       5000
## 5                flash flood  16140812294 1421317100
## 6                       hail  15732266932 3025954453
## 7                  hurricane  11868319010 2741910000
## 8             tropical storm   7703890550  678346000
## 9               winter storm   6688497250   26944000
## 10                 high wind   5270046295  638571300
## 11               river flood   5118945500 5029459000
## 12                  wildfire   4765114000  295472800
## 13          storm surge/tide   4641188000     850000
## 14                 tstm wind   4484958440  554007350
## 15                 ice storm   3944927810 5022113500
## 16         thunderstorm wind   3483121166  414843050
## 17            hurricane opal   3172846000   19000000
## 18          wild/forest fire   3001829500  106796830
## 19 heavy rain/severe weather   2500000000          0
## 20        thunderstorm winds   1735953834  190654708

Sorted by crop damage cost

##               EVTYPE      PROPDMG     CROPDMG
## 1            drought   1046106000 13972566000
## 2              flood 144657709807  5661968450
## 3        river flood   5118945500  5029459000
## 4          ice storm   3944927810  5022113500
## 5               hail  15732266932  3025954453
## 6          hurricane  11868319010  2741910000
## 7  hurricane/typhoon  69305840000  2607872800
## 8        flash flood  16140812294  1421317100
## 9       extreme cold     67737400  1312973000
## 10      frost/freeze     10480000  1094186000
## 11        heavy rain    694248090   733399800
## 12    tropical storm   7703890550   678346000
## 13         high wind   5270046295   638571300
## 14         tstm wind   4484958440   554007350
## 15    excessive heat      7753700   492402000
## 16            freeze       205000   456725000
## 17           tornado  56937161054   414953110
## 18 thunderstorm wind   3483121166   414843050
## 19              heat      1797000   401461500
## 20   damaging freeze      8000000   296230000

Sorted by sum of property and crop damage cost

##                       EVTYPE      PROPDMG     CROPDMG       CPLUSP
## 1                      flood 144657709807  5661968450 150319678257
## 2          hurricane/typhoon  69305840000  2607872800  71913712800
## 3                    tornado  56937161054   414953110  57352114164
## 4                storm surge  43323536000        5000  43323541000
## 5                       hail  15732266932  3025954453  18758221385
## 6                flash flood  16140812294  1421317100  17562129394
## 7                    drought   1046106000 13972566000  15018672000
## 8                  hurricane  11868319010  2741910000  14610229010
## 9                river flood   5118945500  5029459000  10148404500
## 10                 ice storm   3944927810  5022113500   8967041310
## 11            tropical storm   7703890550   678346000   8382236550
## 12              winter storm   6688497250    26944000   6715441250
## 13                 high wind   5270046295   638571300   5908617595
## 14                  wildfire   4765114000   295472800   5060586800
## 15                 tstm wind   4484958440   554007350   5038965790
## 16          storm surge/tide   4641188000      850000   4642038000
## 17         thunderstorm wind   3483121166   414843050   3897964216
## 18            hurricane opal   3172846000    19000000   3191846000
## 19          wild/forest fire   3001829500   106796830   3108626330
## 20 heavy rain/severe weather   2500000000           0   2500000000
Fig. 2: Storm Event Cost of Property and Crop Damage

These results show that drought causes the highest crop damage cost (about $13.97 billion), while flooding causes the highest property damage cost (about $144.7 billion) and the highest combined cost (about $150.3 billion).