Tornadoes are most harmful to population health and floods have the greatest economic consequences

author: angelayuan
date: Friday, March 20, 2015

Synopsis

In this report we investigate which types of events are most harmful with respect to population health, and which types of events have the greatest economic consequences across the U.S. To this end, we obtained Storm Data from the National Climatic Data Center (NCDC), which regularly receives Storm Data from the National Weather Service (NWS). From these data, we found that, in total across the U.S., (1) tornadoes are most harmful with respect to population health, resulting in the most fatalities and injuries; and (2) floods have the greatest economic consequences, causing the heaviest combined property and crop damage.

Data Processing

We obtained Storm Data from the internet (https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2).

Reading in the Storm Data

We read the downloaded file into R and check the first few rows.

data <- read.csv(bzfile("repdata-data-StormData.csv.bz2"))
head(data)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL
##    EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO         0                                               0
## 2 TORNADO         0                                               0
## 3 TORNADO         0                                               0
## 4 TORNADO         0                                               0
## 5 TORNADO         0                                               0
## 6 TORNADO         0                                               0
##   COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1         NA         0                      14.0   100 3   0          0
## 2         NA         0                       2.0   150 2   0          0
## 3         NA         0                       0.1   123 2   0          0
## 4         NA         0                       0.0   100 2   0          0
## 5         NA         0                       0.0   150 2   0          0
## 6         NA         0                       1.5   177 2   0          0
##   INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1       15    25.0          K       0                                    
## 2        0     2.5          K       0                                    
## 3        2    25.0          K       0                                    
## 4        2     2.5          K       0                                    
## 5        2     2.5          K       0                                    
## 6        6     2.5          K       0                                    
##   LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1     3040      8812       3051       8806              1
## 2     3042      8755          0          0              2
## 3     3340      8742          0          0              3
## 4     3458      8626          0          0              4
## 5     3412      8642          0          0              5
## 6     3450      8748          0          0              6
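
The Results section works with per-event-type totals (`data2`, `data4`) whose construction is not shown in this report. The following is a minimal sketch of one way such a table could be built, assuming base R's `aggregate()` with `sum`; the `Group.1` column name seen in the later output is what `aggregate()` produces when `by` is an unnamed list. The toy data frame and the name `health_totals` are illustrative only and are not the author's objects.

```r
# Toy stand-in for the Storm Data columns used here (values are made up).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 5, 1, 0, 2),
  stringsAsFactors = FALSE
)

# Sum fatalities and injuries within each event type.
# An unnamed `by` list yields a grouping column called Group.1.
health_totals <- aggregate(
  toy[, c("FATALITIES", "INJURIES")],
  by  = list(toy$EVTYPE),
  FUN = sum
)

# Sort by fatalities (descending), breaking ties by injuries (descending).
health_totals <- health_totals[
  order(-health_totals$FATALITIES, -health_totals$INJURIES),
]
print(health_totals)
```

Applied to the full data set, the same call with `data` in place of `toy` would give one row per distinct `EVTYPE` label, so spelling variants of the same event remain separate rows unless they are merged beforehand.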

Results

Here we make plots to address two questions: across the U.S., (1) which types of events are most harmful with respect to population health, and (2) which types of events have the greatest economic consequences?

Top 10 events causing the most fatalities and top 10 events causing the most injuries

We first sort the per-event totals by fatalities in decreasing order, and separately by injuries in decreasing order.

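library(dplyr)  # assumed source of arrange(), desc() and mutate(); the original loading step is not shown
fat <- arrange(data2, desc(FATALITIES), desc(INJURIES))  # ties on fatalities broken by injuries
inj <- arrange(data2, desc(INJURIES), desc(FATALITIES))  # ties on injuries broken by fatalities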
head(fat)
##          Group.1 FATALITIES INJURIES
## 1        TORNADO       5633    91346
## 2 EXCESSIVE HEAT       1903     6525
## 3    FLASH FLOOD        978     1777
## 4           HEAT        937     2100
## 5      LIGHTNING        816     5230
## 6      TSTM WIND        504     6957
head(inj)
##          Group.1 FATALITIES INJURIES
## 1        TORNADO       5633    91346
## 2      TSTM WIND        504     6957
## 3          FLOOD        470     6789
## 4 EXCESSIVE HEAT       1903     6525
## 5      LIGHTNING        816     5230
## 6           HEAT        937     2100
sum(fat$FATALITIES[1:10])/sum(fat$FATALITIES)
## [1] 0.797689
sum(inj$INJURIES[1:10])/sum(inj$INJURIES)
## [1] 0.893402
sum(fat$FATALITIES[1])/sum(fat$FATALITIES)
## [1] 0.3719379
sum(inj$INJURIES[1])/sum(inj$INJURIES)
## [1] 0.6500199

We can see that the top 10 events account for 79.8% of total fatalities and 89.3% of total injuries. Moreover, the single most harmful event type, TORNADO, accounts for 37.2% of all fatalities and 65.0% of all injuries. These results underline the need to prepare for and mitigate harmful weather events, tornadoes above all.
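
The shares quoted above are ratios of a partial sum to the grand total. As a sketch, the same arithmetic can be wrapped in a small helper; `top_share` is a hypothetical name and the vector below is made-up data, not taken from the Storm Data.

```r
# Share of the total accounted for by the k largest values.
top_share <- function(x, k) {
  x <- sort(x, decreasing = TRUE)
  sum(head(x, k)) / sum(x)
}

counts <- c(50, 20, 15, 10, 5)  # made-up totals per event type
top_share(counts, 1)  # share of the single largest category
top_share(counts, 3)  # share of the three largest categories
```

With the report's objects, `top_share(fat$FATALITIES, 10)` is the same calculation as `sum(fat$FATALITIES[1:10])/sum(fat$FATALITIES)`.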

We plot the top 10 harmful events for fatalities and injuries separately.

par(mar = c(10, 10, 2, 2), las = 2)
barplot(height = fat$FATALITIES[1:10] / 1000, names.arg = fat$Group.1[1:10],
        col = heat.colors(10), main = "Top 10 Harmful Events Cause Most Fatalities",
        ylab = "Total number of fatalities (thousand persons)")

par(mar = c(10, 8, 2, 2), las = 2)
barplot(height = inj$INJURIES[1:10] / 1000, names.arg = inj$Group.1[1:10],
        col = heat.colors(10), main = "Top 10 Harmful Events Cause Most Injuries",
        ylab = "Total number of injuries (thousand persons)")

From the above results, we conclude that tornadoes are the most harmful event type with respect to population health.

Top 10 harmful events with the greatest economic consequences

We first sort the per-event totals by property damage in decreasing order, and separately by crop damage in decreasing order.

prop <- arrange(data4,desc(PROPDMG),desc(CROPDMG))
crop <- arrange(data4,desc(CROPDMG),desc(PROPDMG))
head(prop)
##             Group.1      PROPDMG    CROPDMG
## 1             FLOOD 144657709807 5661968450
## 2 HURRICANE/TYPHOON  69305840000 2607872800
## 3           TORNADO  56937160779  414953270
## 4       STORM SURGE  43323536000       5000
## 5       FLASH FLOOD  16140812067 1421317100
## 6              HAIL  15732267048 3025954473
head(crop)
##       Group.1      PROPDMG     CROPDMG
## 1     DROUGHT   1046106000 13972566000
## 2       FLOOD 144657709807  5661968450
## 3 RIVER FLOOD   5118945500  5029459000
## 4   ICE STORM   3944927860  5022113500
## 5        HAIL  15732267048  3025954473
## 6   HURRICANE  11868319010  2741910000
sum(prop$PROPDMG[1:10])/sum(prop$PROPDMG)
## [1] 0.8837154
sum(crop$CROPDMG[1:10])/sum(crop$CROPDMG)
## [1] 0.8526812
sum(prop$PROPDMG[1])/sum(prop$PROPDMG)
## [1] 0.3385242
sum(crop$CROPDMG[1])/sum(crop$CROPDMG)
## [1] 0.2845494

We can see that the top 10 events account for 88.4% of total property damage and 85.3% of total crop damage. Moreover, the most damaging event type accounts for 33.9% of total property damage (FLOOD) and 28.5% of total crop damage (DROUGHT).
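
The dollar amounts above presuppose that each `PROPDMG` and `CROPDMG` value has been scaled by its exponent code (`PROPDMGEXP`, `CROPDMGEXP`); the raw file stores, for example, 25.0 with "K" to mean 25,000 dollars. That conversion step is not shown in this report. A minimal sketch follows, assuming the usual reading K = 10^3, M = 10^6, B = 10^9 and treating every other code as a multiplier of 1, which is an assumption made here and not the author's documented rule. The helper name `exp_multiplier` and the toy rows are hypothetical.

```r
# Map an exponent code to a numeric multiplier.
# Assumption: K/M/B (case-insensitive) mean thousand/million/billion;
# anything else (blank, digits, symbols) is treated as 1.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy rows in the shape of the raw damage columns (values are made up).
toy <- data.frame(
  PROPDMG    = c(25.0, 2.5, 1.2, 7),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)
toy$PROPDMG_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

How the unrecognised codes are handled can shift the totals, so a different rule for them may not reproduce the figures reported here exactly.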

We then compute total damage as the sum of property and crop damage.

data5 <- mutate(data4, DMG = PROPDMG + CROPDMG)
data5 <- arrange(data5, desc(DMG))
head(data5)
##             Group.1      PROPDMG    CROPDMG          DMG
## 1             FLOOD 144657709807 5661968450 150319678257
## 2 HURRICANE/TYPHOON  69305840000 2607872800  71913712800
## 3           TORNADO  56937160779  414953270  57352114049
## 4       STORM SURGE  43323536000       5000  43323541000
## 5              HAIL  15732267048 3025954473  18758221521
## 6       FLASH FLOOD  16140812067 1421317100  17562129167

From the above results, we can see that FLOOD has the greatest economic consequences when property and crop damage are combined. We plot the top 10 events by total damage as follows.

par(mar = c(11, 10, 3, 3), las = 2)
par(mfrow = c(1, 1))
barplot(height = data5$DMG[1:10] / 1e9, names.arg = data5$Group.1[1:10],
        col = heat.colors(10), main = "Top 10 Harmful Events Cause the Greatest Economic Loss",
        ylab = "Total economic loss (billion dollars)")

From the above results, we conclude that floods have the greatest economic consequences.