Synopsis

This report analyses storm and natural disaster data from the United States. Specifically, it focuses on the public health effects and the economic damage caused by different event types. It finds that tornadoes are the most consequential in terms of public health, and that tornadoes of force 4 cause the most fatalities. In addition, this report analyses economic damage as well as per-event public health effects.

Data Processing

Download the data and import it.

library(R.utils, warn.conflicts=FALSE)
## Loading required package: R.oo
## Loading required package: R.methodsS3
## R.methodsS3 v1.6.1 (2014-01-04) successfully loaded. See ?R.methodsS3 for help.
## R.oo v1.18.0 (2014-02-22) successfully loaded. See ?R.oo for help.
## 
## Attaching package: 'R.oo'
## 
## The following objects are masked from 'package:methods':
## 
##     getClasses, getMethods
## 
## The following objects are masked from 'package:base':
## 
##     attach, detach, gc, load, save
## 
## R.utils v1.32.4 (2014-05-14) successfully loaded. See ?R.utils for help.
bunzip2('StormData.csv.bz2')
StormData <- read.csv('StormData.csv')

Recode the names to a clean R style, create a total damages variable, then save to an RData file and remove the csv file.

# Lower-case the names, strip trailing underscores, and turn the remaining underscores into dots
names(StormData) <- tolower( names(StormData) )
names(StormData) <- gsub( '_+$', '', names(StormData) ) # one or more trailing underscores
names(StormData) <- gsub( '_', '.', names(StormData) )

# Row-wise total: sum(propdmg, cropdmg) would return one grand total repeated on every row
StormData$total.dmg <- StormData$propdmg + StormData$cropdmg

save(StormData, file = 'StormData.RData')
file.remove('StormData.csv')
## [1] TRUE
attach(StormData)
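
As a minimal sketch of the recoding above, the same steps can be tried on a toy data frame. The values are made up, and the four column names are only assumed to resemble those of the raw file. Note that + adds the two damage columns row by row, whereas sum() over both columns returns a single grand total.

```r
# Toy data frame; names mimic the raw file's style, values are invented
toy <- data.frame(STATE__ = c(1, 1, 2),
                  BGN_DATE = c("a", "b", "c"),
                  PROPDMG = c(25, 0, 2.5),
                  CROPDMG = c(0, 5, 1),
                  stringsAsFactors = FALSE)

clean <- tolower(names(toy))
clean <- gsub("_+$", "", clean)   # drop any run of trailing underscores
clean <- gsub("_", ".", clean)    # remaining underscores become dots
names(toy) <- clean

# Element-wise addition gives one total per row
toy$total.dmg <- toy$propdmg + toy$cropdmg

# sum() collapses both columns into a single number
grand.total <- sum(toy$propdmg, toy$cropdmg)

print(names(toy))
print(toy$total.dmg)
print(grand.total)
```

Assigning the grand total to a column would recycle that one number onto every row, so a later sum by event type would scale with the number of events rather than with their damage.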

Summarise the fatalities, injuries and damage, once for totals and once for averages.

library(dplyr, warn.conflicts=FALSE)
# Totals and per-event means of the health and damage figures, by event type
total.stat <- summarise( group_by(StormData, evtype), total.fatalities=sum(fatalities), total.injuries=sum(injuries), total.damage = sum(total.dmg) )
mean.stat <- summarise( group_by(StormData, evtype), mean.fatalities=mean(fatalities), mean.injuries=mean(injuries) )
# The ten event types with the most fatalities (ties broken by injuries)
top.events <- total.stat[with(total.stat, order(-total.fatalities, -total.injuries)),]$evtype[1:10]
# The ten event types with the highest total damage
top.damages <- total.stat[with(total.stat, order(-total.damage)),]$evtype[1:10]
# Keep only the summary rows for those event types
top.stat <- subset(total.stat, evtype %in% top.events)
top.dam.stat <- subset(total.stat, evtype %in% top.damages)
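
Totals and means can rank event types very differently, because a type recorded only once or twice gets its mean from very few events. The following sketch illustrates this on made-up records with the hypothetical type names RARE and COMMON, using the same dplyr verbs as above. The events count column is an addition for illustration and is not part of the analysis above.

```r
library(dplyr, warn.conflicts = FALSE)

# Invented records: one rare but deadly type, one frequent and mostly harmless type
toy <- data.frame(
  evtype     = c("RARE", "COMMON", "COMMON", "COMMON", "COMMON"),
  fatalities = c(5, 0, 2, 0, 6),
  stringsAsFactors = FALSE
)

toy.stat <- summarise(group_by(toy, evtype),
                      events           = n(),
                      total.fatalities = sum(fatalities),
                      mean.fatalities  = mean(fatalities))

# Order the event types once by total and once by mean
by.total <- as.character(arrange(toy.stat, desc(total.fatalities))$evtype)
by.mean  <- as.character(arrange(toy.stat, desc(mean.fatalities))$evtype)

print(toy.stat)
print(by.total)
print(by.mean)
```

Here COMMON leads on total fatalities while RARE leads on mean fatalities, which is why the two kinds of table are best read side by side.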

Results

Plot the fatalities for the most fatal event types.

library(ggplot2)
plot <- ggplot(top.stat, aes(evtype, weight=total.fatalities ) )
plot + geom_bar() + xlab('Event type') + ylab('Number of fatalities')

[Figure: bar chart of the number of fatalities by event type, for the ten most fatal event types]

Since tornadoes cause the most fatalities, we analyse them further. Plot the fatalities by wind force (the f variable).

plot <- ggplot(StormData, aes(f, weight=fatalities) )
plot + geom_bar() + xlab('Wind force') + ylab('Number of fatalities')
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

[Figure: bar chart of the number of fatalities by wind force]
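
The bar heights can also be checked numerically. The following is a minimal base R sketch on made-up data, assuming (as in the plot) that f holds the force rating and is missing for events without one.

```r
# Invented records; NA marks an event without a force rating
toy <- data.frame(f          = c(0, 1, 4, 4, NA, 2),
                  fatalities = c(0, 1, 10, 7, 3, 2))

# Sum fatalities within each force level; rows with a missing f are left out
by.force <- tapply(toy$fatalities, toy$f, sum)

# Label of the force level with the most fatalities
most.fatal <- names(which.max(by.force))

print(by.force)
print(most.fatal)
```

Because tapply treats f as a set of discrete levels, no bin width is involved, unlike a histogram-style bar chart over a numeric variable.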

Plot the damage of the most damaging events.

plot <- ggplot(top.dam.stat, aes(evtype, weight=total.damage ) )
plot + geom_bar() + xlab('Event type') + ylab('Total damage')

[Figure: bar chart of total damage by event type, for the ten most damaging event types]

In addition, we print the summary tables for the event types with the most fatalities (ties broken by injuries) and for the most economically damaging event types.

The public health statistics.

top.stat[with(top.stat, order(-total.fatalities, -total.injuries)),]
## Source: local data frame [10 x 4]
## 
##             evtype total.fatalities total.injuries total.damage
## 830        TORNADO             5633          91346    7.437e+11
## 123 EXCESSIVE HEAT             1903           6525    2.058e+10
## 147    FLASH FLOOD              978           1777    6.656e+11
## 269           HEAT              937           2100    9.405e+09
## 452      LIGHTNING              816           5230    1.932e+11
## 854      TSTM WIND              504           6957    2.697e+12
## 164          FLOOD              470           6789    3.106e+11
## 581    RIP CURRENT              368            232    5.763e+09
## 354      HIGH WIND              248           1137    2.478e+11
## 11       AVALANCHE              224            170    4.733e+09

The economic damage statistics.

top.dam.stat[with(top.dam.stat, order(-total.damage)),]
## Source: local data frame [10 x 4]
## 
##                 evtype total.fatalities total.injuries total.damage
## 238               HAIL               15           1361    3.540e+12
## 854          TSTM WIND              504           6957    2.697e+12
## 759  THUNDERSTORM WIND              133           1488    1.012e+12
## 830            TORNADO             5633          91346    7.437e+11
## 147        FLASH FLOOD              978           1777    6.656e+11
## 164              FLOOD              470           6789    3.106e+11
## 783 THUNDERSTORM WINDS               64            908    2.556e+11
## 354          HIGH WIND              248           1137    2.478e+11
## 452          LIGHTNING              816           5230    1.932e+11
## 304         HEAVY SNOW              127           1021    1.926e+11

The mean public health effects, which give an indication of per-event consequences. Event types recorded only a few times can dominate this ranking.

mean.stat[with(mean.stat, order(-mean.fatalities, -mean.injuries)),]
## Source: local data frame [985 x 3]
## 
##                         evtype mean.fatalities mean.injuries
## 833 TORNADOES, TSTM WIND, HAIL          25.000         0.000
## 65               COLD AND SNOW          14.000         0.000
## 847      TROPICAL STORM GORDON           8.000        43.000
## 552      RECORD/EXCESSIVE HEAT           5.667         0.000
## 135               EXTREME HEAT           4.364         7.045
## 275          HEAT WAVE DROUGHT           4.000        15.000
## 385             HIGH WIND/SEAS           4.000         0.000
## 483              MARINE MISHAP           3.500         2.500
## 976              WINTER STORMS           3.333         5.667
## 360         HIGH WIND AND SEAS           3.000        20.000
## ..                         ...             ...           ...