This report analyses storm and natural disaster data from the United States. Specifically, it focusses on the public health effects and the economic damage caused by different event types. It finds that tornadoes are the most consequential in terms of public health, and that tornadoes of force 4 are the most fatal. In addition, this report analyses economic damage as well as per-event public health effects.
Download the data and import it.
library(R.utils, warn.conflicts=FALSE)
## Loading required package: R.oo
## Loading required package: R.methodsS3
## R.methodsS3 v1.6.1 (2014-01-04) successfully loaded. See ?R.methodsS3 for help.
## R.oo v1.18.0 (2014-02-22) successfully loaded. See ?R.oo for help.
##
## Attaching package: 'R.oo'
##
## The following objects are masked from 'package:methods':
##
## getClasses, getMethods
##
## The following objects are masked from 'package:base':
##
## attach, detach, gc, load, save
##
## R.utils v1.32.4 (2014-05-14) successfully loaded. See ?R.utils for help.
bunzip2('StormData.csv.bz2')
StormData <- read.csv('StormData.csv')
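As an aside, `read.csv` can usually read a bzip2-compressed file directly, because R's file connections detect the compression when reading. The sketch below checks this on a small toy table; `toy`, `tmp` and the values in it are made up for illustration and are not part of the storm data.

```r
# Toy stand-in for the storm data; names and values are illustrative only
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(3, 0))

# Write the toy table to a bzip2-compressed csv in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv opens the path through file(), which detects the compression
toy.read <- read.csv(tmp, stringsAsFactors = FALSE)
print(toy.read)
```

If this holds for the real file, the separate decompression step and the later removal of the csv file would not be needed.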
Recode the names to a clean R style (lower case, dots instead of underscores), create a total damages variable, then save to an RData file and remove the csv file.
names(StormData) <- tolower( names(StormData) )
names(StormData) <- gsub( '_+$', '', names(StormData) ) # strip all trailing underscores in one pass
names(StormData) <- gsub( '_', '.', names(StormData) )
StormData$total.dmg <- StormData$propdmg + StormData$cropdmg # row-wise total; sum() would collapse both columns into one scalar
save(StormData, file = 'StormData.RData')
file.remove('StormData.csv')
## [1] TRUE
attach(StormData)
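The renaming and the total damages variable can be tried out on a few made-up columns first. This is a minimal sketch: the `toy` data frame and its values are hypothetical, and the pattern `_+$` is one way to strip any number of trailing underscores in a single pass. It also shows the difference between `+`, which gives one total per row, and `sum()`, which collapses both columns into a single grand total.

```r
# Hypothetical toy table with raw-style column names
toy <- data.frame(STATE__ = c(1, 1, 2),
                  BGN_DATE = c("4/18/1950", "4/18/1950", "2/20/1951"),
                  PROPDMG = c(25, 2.5, 0),
                  CROPDMG = c(0, 1, 4))

# Lower case, drop trailing underscores, replace remaining underscores by dots
names(toy) <- tolower(names(toy))
names(toy) <- gsub("_+$", "", names(toy))
names(toy) <- gsub("_", ".", names(toy), fixed = TRUE)
print(names(toy))

# Element-wise addition: one damage total per row
toy$total.dmg <- toy$propdmg + toy$cropdmg

# sum() over both columns: a single number for the whole table
grand.total <- sum(toy$propdmg, toy$cropdmg)
print(toy$total.dmg)
print(grand.total)
```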
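Summarise the fatalities, injuries and damage per event type, once for totals and once for averages. Then select the ten event types with the most fatalities (ties broken by injuries) and the ten with the highest total damage.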
library(dplyr, warn.conflicts=FALSE)
total.stat <- summarise( group_by(StormData, evtype), total.fatalities=sum(fatalities), total.injuries=sum(injuries), total.damage = sum(total.dmg) )
mean.stat <- summarise( group_by(StormData, evtype), mean.fatalities=mean(fatalities), mean.injuries=mean(injuries) )
top.events <- total.stat[with(total.stat, order(-total.fatalities, -total.injuries)),]$evtype[1:10]
top.damages <- total.stat[with(total.stat, order(-total.damage)),]$evtype[1:10]
top.stat <- subset(total.stat, evtype %in% top.events)
top.dam.stat <- subset(total.stat, evtype %in% top.damages)
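The same grouping and top-n selection can be sketched with dplyr's pipe on toy data. The `toy` data frame, its values and the cut-off of two rows are assumptions for illustration only; `arrange(desc(...))` stands in for the `order(-...)` indexing used above.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy records, one row per event
toy <- data.frame(evtype = c("TORNADO", "TORNADO", "HAIL", "FLOOD", "FLOOD"),
                  fatalities = c(5, 1, 0, 2, 2),
                  injuries = c(40, 3, 1, 0, 6),
                  stringsAsFactors = FALSE)

# Totals, means and event counts per event type
toy.total <- toy %>%
  group_by(evtype) %>%
  summarise(total.fatalities = sum(fatalities),
            total.injuries = sum(injuries),
            mean.fatalities = mean(fatalities),
            events = n())

# Keep the two event types with the most fatalities, ties broken by injuries
toy.top <- toy.total %>%
  arrange(desc(total.fatalities), desc(total.injuries)) %>%
  head(2)
print(toy.top)
```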
Plot the fatalities for the most fatal event types.
library(ggplot2)
plot <- ggplot(top.stat, aes(evtype, weight=total.fatalities ) )
plot + geom_bar() + xlab('Event type') + ylab('Number of fatalities')
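By default the bars appear in alphabetical order of the event type. As a hedged sketch on made-up totals (`toy.total` and its values are hypothetical), `reorder()` can sort the bars by the number of fatalities, and `geom_bar(stat = "identity")` draws the precomputed totals directly instead of using the `weight` aesthetic.

```r
library(ggplot2)

# Hypothetical precomputed totals per event type
toy.total <- data.frame(evtype = c("FLOOD", "HAIL", "TORNADO"),
                        total.fatalities = c(4, 0, 6),
                        stringsAsFactors = FALSE)

# reorder() turns evtype into a factor whose levels are sorted by
# descending fatalities, which fixes the left-to-right bar order
toy.total$evtype <- reorder(toy.total$evtype, -toy.total$total.fatalities)

toy.plot <- ggplot(toy.total, aes(evtype, total.fatalities)) +
  geom_bar(stat = "identity") +
  xlab("Event type") + ylab("Number of fatalities")
print(toy.plot)
```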
Since tornadoes turn out to be the most fatal event type, we analyse them further. Plot the fatalities for each storm strength (wind force, column f).
plot <- ggplot(StormData, aes(factor(f), weight=fatalities) ) # wind force is a discrete scale, so treat it as a factor rather than binning it
plot + geom_bar() + xlab('Wind force') + ylab('Number of fatalities')
Plot the damage of the most damaging events.
plot <- ggplot(top.dam.stat, aes(evtype, weight=total.damage ) )
plot + geom_bar() + xlab('Event type') + ylab('Total damage')
In addition, we print the top rows of the tables that summarise the most fatal events (injuries included) and the most economically damaging events.
The public health statistics.
top.stat[with(top.stat, order(-total.fatalities, -total.injuries)),]
## Source: local data frame [10 x 4]
##
## evtype total.fatalities total.injuries total.damage
## 830 TORNADO 5633 91346 7.437e+11
## 123 EXCESSIVE HEAT 1903 6525 2.058e+10
## 147 FLASH FLOOD 978 1777 6.656e+11
## 269 HEAT 937 2100 9.405e+09
## 452 LIGHTNING 816 5230 1.932e+11
## 854 TSTM WIND 504 6957 2.697e+12
## 164 FLOOD 470 6789 3.106e+11
## 581 RIP CURRENT 368 232 5.763e+09
## 354 HIGH WIND 248 1137 2.478e+11
## 11 AVALANCHE 224 170 4.733e+09
The economic damage statistics.
top.dam.stat[with(top.dam.stat, order(-total.damage)),]
## Source: local data frame [10 x 4]
##
## evtype total.fatalities total.injuries total.damage
## 238 HAIL 15 1361 3.540e+12
## 854 TSTM WIND 504 6957 2.697e+12
## 759 THUNDERSTORM WIND 133 1488 1.012e+12
## 830 TORNADO 5633 91346 7.437e+11
## 147 FLASH FLOOD 978 1777 6.656e+11
## 164 FLOOD 470 6789 3.106e+11
## 783 THUNDERSTORM WINDS 64 908 2.556e+11
## 354 HIGH WIND 248 1137 2.478e+11
## 452 LIGHTNING 816 5230 1.932e+11
## 304 HEAVY SNOW 127 1021 1.926e+11
The mean public health effects, which give an indication of per-event consequences.
mean.stat[with(mean.stat, order(-mean.fatalities, -mean.injuries)),]
## Source: local data frame [985 x 3]
##
## evtype mean.fatalities mean.injuries
## 833 TORNADOES, TSTM WIND, HAIL 25.000 0.000
## 65 COLD AND SNOW 14.000 0.000
## 847 TROPICAL STORM GORDON 8.000 43.000
## 552 RECORD/EXCESSIVE HEAT 5.667 0.000
## 135 EXTREME HEAT 4.364 7.045
## 275 HEAT WAVE DROUGHT 4.000 15.000
## 385 HIGH WIND/SEAS 4.000 0.000
## 483 MARINE MISHAP 3.500 2.500
## 976 WINTER STORMS 3.333 5.667
## 360 HIGH WIND AND SEAS 3.000 20.000
## .. ... ... ...
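Per-event means can rest on very few records, in which case a single severe event dominates the average. As a hedged sketch on hypothetical toy data, adding an event count next to each mean and filtering on a minimum count shows how much weight each average can bear. The threshold `min.events` is an arbitrary assumption, not a value taken from this analysis.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy records; "RARE EVENT" occurs only once
toy <- data.frame(evtype = c("RARE EVENT", "TORNADO", "TORNADO", "TORNADO", "HAIL", "HAIL"),
                  fatalities = c(25, 5, 1, 0, 0, 0),
                  stringsAsFactors = FALSE)

# Assumed minimum number of records for a mean to be reported
min.events <- 2

toy.mean <- toy %>%
  group_by(evtype) %>%
  summarise(mean.fatalities = mean(fatalities), events = n()) %>%
  filter(events >= min.events) %>%
  arrange(desc(mean.fatalities))
print(toy.mean)
```

In this toy example the single-record event type is dropped, and the remaining means are each backed by at least two events.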