Analysis Of Damage Caused By Weather Events in the USA

Anton Votinov

19 August 2014

Synopsis

This report analyses the damage caused by storms and other severe weather events in the United States from 1950 to 2011. Damage is considered in two ways: harm to human health and economic damage. The first case counts both deaths and injuries; the second counts damage to property and crops.

Data processing

Load packages and the data.

library(data.table)
library(ggplot2)
data <- data.table(read.csv(bzfile("StormData.csv.bz2")))

Use the “data.table” package to find the 10 weather event types that caused the most deaths and injuries in the USA.


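# Sum injuries and fatalities for each event type.
dataHealth <- data[, lapply(.SD, sum, na.rm = TRUE), by = EVTYPE, .SDcols = c("INJURIES", "FATALITIES")]
# Combined casualty count per event type.
dataHealth <- dataHealth[, ALL := INJURIES + FATALITIES]
# Drop event types with no casualties, sort in descending order, keep the top 10.
dataHealth <- dataHealth[ALL > 0, ]
dataHealth <- dataHealth[order(ALL, decreasing = TRUE), ]
dataHealth <- dataHealth[1:10, ]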

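Use the “data.table” package to find the 10 weather event types that caused the most economic damage to property and crops in the USA. The totals sum the raw PROPDMG and CROPDMG values; the magnitude codes in the PROPDMGEXP and CROPDMGEXP columns are not applied.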

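# Sum property and crop damage for each event type.
dataEc <- data[, lapply(.SD, sum, na.rm = TRUE), by = EVTYPE, .SDcols = c("PROPDMG", "CROPDMG")]
# Combined damage figure per event type.
dataEc <- dataEc[, ALL := PROPDMG + CROPDMG]
# Drop event types with no recorded damage, sort in descending order, keep the top 10.
dataEc <- dataEc[ALL > 0, ]
dataEc <- dataEc[order(ALL, decreasing = TRUE), ]
dataEc <- dataEc[1:10, ]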
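
The aggregation above adds up PROPDMG and CROPDMG as they are stored. In the storm data these figures are accompanied by magnitude codes (PROPDMGEXP and CROPDMGEXP, with values such as K, M and B), so one possible refinement is to scale each record before summing. The sketch below illustrates the idea on made-up rows. The names toy, exp_multiplier, PROPUSD, CROPUSD and toyEc are hypothetical, and treating any unrecognised code as a multiplier of 1 is an assumption, not a rule taken from the data documentation.

```r
library(data.table)

# Toy rows standing in for the storm data; the values are invented for illustration.
toy <- data.table(
  EVTYPE     = c("FLOOD", "FLOOD", "HAIL"),
  PROPDMG    = c(2.5, 10, 1),
  PROPDMGEXP = c("M", "K", ""),
  CROPDMG    = c(0, 5, 3),
  CROPDMGEXP = c("", "K", "k")
)

# Hypothetical helper: translate a magnitude code into a numeric multiplier.
# Only H, K, M and B are recognised (case-insensitive); anything else,
# including an empty code, is treated as 1. That fallback is an assumption.
exp_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  mult[is.na(mult)] <- 1
  mult
}

# Scale each record, then aggregate by event type as in the analysis above.
toy[, PROPUSD := PROPDMG * exp_multiplier(PROPDMGEXP)]
toy[, CROPUSD := CROPDMG * exp_multiplier(CROPDMGEXP)]
toyEc <- toy[, .(ALL = sum(PROPUSD + CROPUSD, na.rm = TRUE)), by = EVTYPE][order(-ALL)]
toyEc
```

Applied to the full data set, a scaling step like this could change the ranking, because a single record coded in billions outweighs many records coded in thousands.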

Results

Use the “ggplot2” package to visualize the results of the analysis.

dataHealth
##                EVTYPE INJURIES FATALITIES   ALL
##  1:           TORNADO    91346       5633 96979
##  2:    EXCESSIVE HEAT     6525       1903  8428
##  3:         TSTM WIND     6957        504  7461
##  4:             FLOOD     6789        470  7259
##  5:         LIGHTNING     5230        816  6046
##  6:              HEAT     2100        937  3037
##  7:       FLASH FLOOD     1777        978  2755
##  8:         ICE STORM     1975         89  2064
##  9: THUNDERSTORM WIND     1488        133  1621
## 10:      WINTER STORM     1321        206  1527
ggplot(data = dataHealth, aes(x = EVTYPE)) +
    ggtitle("Number of deaths and injuries") +
    geom_bar(aes(x  = EVTYPE, y = ALL), stat = "identity",position="dodge") + 
    scale_x_discrete(limits=rev(dataHealth[,EVTYPE]), name="Event") +
    coord_flip() +
    scale_y_continuous(name="Count") 

[Figure: horizontal bar plot "Number of deaths and injuries", Event vs. Count]

This bar plot shows the combined number of deaths and injuries for the 10 most harmful weather event types. Tornadoes are by far the most dangerous: they account for 96,979 deaths and injuries, more than eleven times the 8,428 attributed to excessive heat, the second-ranked event type.
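
The dataHealth table lists TSTM WIND and THUNDERSTORM WIND as separate event types, because the grouping uses EVTYPE exactly as recorded. A minimal sketch of merging such label variants before ranking is shown below, using three rows copied from that table. The function normalize_evtype and its two substitution rules are hypothetical; a real cleanup would need a much longer, hand-checked list of rules.

```r
library(data.table)

# Three rows copied from the dataHealth output (EVTYPE and ALL columns only).
top <- data.table(
  EVTYPE = c("TSTM WIND", "THUNDERSTORM WIND", "LIGHTNING"),
  ALL    = c(7461, 1621, 6046)
)

# Hypothetical normalisation: upper-case, trim whitespace, expand the
# "TSTM" abbreviation and drop a trailing plural "S" on "WINDS".
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- sub("^TSTM", "THUNDERSTORM", x)
  sub("WINDS$", "WIND", x)
}

# Re-aggregate on the normalised label and sort in descending order.
merged <- top[, .(ALL = sum(ALL)), by = .(EVTYPE = normalize_evtype(EVTYPE))][order(-ALL)]
merged
```

With only these three rows, the two thunderstorm wind labels collapse into one group that ranks above lightning. The full data set contains more spelling variants, so totals computed there would differ.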

dataEc
##                 EVTYPE PROPDMG CROPDMG     ALL
##  1:            TORNADO 3212258  100019 3312277
##  2:        FLASH FLOOD 1420125  179200 1599325
##  3:          TSTM WIND 1335966  109203 1445168
##  4:               HAIL  688693  579596 1268290
##  5:              FLOOD  899938  168038 1067976
##  6:  THUNDERSTORM WIND  876844   66791  943636
##  7:          LIGHTNING  603352    3581  606932
##  8: THUNDERSTORM WINDS  446293   18685  464978
##  9:          HIGH WIND  324732   17283  342015
## 10:       WINTER STORM  132721    1979  134700
ggplot(data = dataEc, aes(x = EVTYPE)) +
    ggtitle("Economic damage") +
    geom_bar(aes(x  = EVTYPE, y = ALL), stat = "identity",position="dodge") + 
    scale_x_discrete(limits=rev(dataEc[,EVTYPE]), name="Event") +
    coord_flip() +
    scale_y_continuous(name="Damage") 

[Figure: horizontal bar plot "Economic damage", Event vs. Damage]

This bar plot shows the economic damage for the 10 most costly weather event types, measured by the summed PROPDMG and CROPDMG values. By this measure tornadoes again rank first, with a total of 3,312,277, roughly twice that of flash floods (1,599,325).