Descriptive Analysis of Severe Weather Events in the USA

Severe weather events pose significant threats to the physical safety of people and to the economy of the United States. In this study we use the storm database provided by NOAA to examine the harm that storms and other severe weather events cause to people and property.

The results of this study could be used to design policies that mitigate the effects of these events. The database covers events from the early 1950s through November 2011; the more recent records are the most complete and reliable.

Synopsis

In this study, conducted for the United States, we want to answer two basic questions:

1. Which types of events are most harmful to the physical integrity of the population?
2. Which types of events cause the greatest economic losses?

The fields that allow us to answer these questions are:

FIELD       Description
EVTYPE      Type of severe weather event
FATALITIES  Number of direct fatalities per event
INJURIES    Number of direct injuries per event
PROPDMG     Estimated value of property damage to three significant figures
PROPDMGEXP  Exponent for PROPDMG, where K=thousands, M=millions, B=billions
CROPDMG     Estimated value of crop damage to three significant figures
CROPDMGEXP  Exponent for CROPDMG, where K=thousands, M=millions, B=billions

Data Processing

Data reading and basic processing

StormData <- read.csv("/data/mapologo/projects/RepData_PeerAssessment2/StormData.csv")
StormData$BGN_DATE <- as.Date(StormData$BGN_DATE, format="%m/%d/%Y")
StormData$EVTYPE <- as.factor(StormData$EVTYPE)
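
Once BGN_DATE has been parsed, the statement that recent years are better covered can be checked by counting records per year. The following is only a minimal sketch on a small made-up data frame (`toy` and its dates are illustrative, not taken from the database); on the real data the same two lines would be applied to `StormData`.

```r
# Toy stand-in for StormData; the dates are made up for illustration.
toy <- data.frame(BGN_DATE = c("4/18/1950", "6/1/1995", "7/4/2011", "11/28/2011"),
                  stringsAsFactors = FALSE)
toy$BGN_DATE <- as.Date(toy$BGN_DATE, format = "%m/%d/%Y")

# Number of records per calendar year
events_per_year <- table(format(toy$BGN_DATE, "%Y"))
print(events_per_year)
```

A steep rise in the yearly counts would suggest restricting the analysis to the better-documented period.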

The fields available in the database are:

names(StormData)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"
# Keep only the columns needed for the analysis
data <- StormData[c("EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]
summary(data$PROPDMGEXP)
##             -      ?      +      0      1      2      3      4      5 
## 465934      1      8      5    216     25     13      4      4     28 
##      6      7      8      B      h      H      K      m      M 
##      4      5      1     40      1      6 424665      7  11330
summary(data$CROPDMGEXP)
##             ?      0      2      B      k      K      m      M 
## 618413      7     19      1      9     21 281832      1   1994

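The “*EXP” fields were taken to be exponents of 10: the letters H, K, M and B (in either case) stand for 10^2, 10^3, 10^6 and 10^9, a digit d from 1 to 9 stands for 10^d, and any other value (blank, “?”, “+”, “-”, “0”) is treated as 10^0 = 1.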

# Convert a damage figure and its exponent code into an amount in US$
calc_dmg <- function(x, x_exp){
  aux <- numeric(length(x))
  for (i in seq_along(x)){
    if (x[i] == 0){
      aux[i] <- 0
    }else{
      # Unrecognized codes ("", "?", "+", "-", "0") fall through to the default, 0
      pow <- switch(as.character(x_exp[i]),
                    "h"=2, "H"=2, "k"=3, "K"=3, "m"=6, "M"=6, "b"=9, "B"=9,
                    "1"=1, "2"=2, "3"=3, "4"=4, "5"=5, "6"=6, "7"=7, "8"=8, "9"=9, 0)
      aux[i] <- x[i] * 10^pow
    }
  }
  return(aux)
}
data$PROPDMGAMOUNT <- calc_dmg(data$PROPDMG, data$PROPDMGEXP)
data$CROPDMGAMOUNT <- calc_dmg(data$CROPDMG, data$CROPDMGEXP)
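
The row-by-row loop in calc_dmg can be slow on several hundred thousand records. As a possible alternative, the same mapping can be written as a lookup in a named vector, which R applies to the whole column at once. This is a sketch only: `exp_codes` and `calc_dmg_vec` are hypothetical names, not part of the original analysis.

```r
# Lookup table from exponent code to power of ten (same mapping as calc_dmg)
exp_codes <- c(h = 2, H = 2, k = 3, K = 3, m = 6, M = 6, b = 9, B = 9,
               "1" = 1, "2" = 2, "3" = 3, "4" = 4, "5" = 5,
               "6" = 6, "7" = 7, "8" = 8, "9" = 9)

# Hypothetical vectorized variant of calc_dmg
calc_dmg_vec <- function(x, x_exp) {
  pow <- unname(exp_codes[as.character(x_exp)])  # NA for unrecognized codes
  pow[is.na(pow)] <- 0                           # "", "?", "+", "-", "0" -> 10^0
  x * 10^pow
}

amounts <- calc_dmg_vec(c(25, 2.5, 1.5, 0, 10), c("K", "M", "B", "K", ""))
print(amounts)
```

For example, a PROPDMG of 25 with exponent K becomes 25 * 10^3 = 25000, and a value with a blank exponent is left unchanged.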

On inspecting the EVTYPE field, we encounter several coding problems:

summary(data$EVTYPE)
##                     HAIL                TSTM WIND        THUNDERSTORM WIND 
##                   288661                   219940                    82563 
##                  TORNADO              FLASH FLOOD                    FLOOD 
##                    60652                    54277                    25326 
##       THUNDERSTORM WINDS                HIGH WIND                LIGHTNING 
##                    20843                    20212                    15754 
##               HEAVY SNOW               HEAVY RAIN             WINTER STORM 
##                    15708                    11723                    11433 
##           WINTER WEATHER             FUNNEL CLOUD         MARINE TSTM WIND 
##                     7026                     6839                     6175 
## MARINE THUNDERSTORM WIND               WATERSPOUT              STRONG WIND 
##                     5812                     3796                     3566 
##     URBAN/SML STREAM FLD                 WILDFIRE                 BLIZZARD 
##                     3392                     2761                     2719 
##                  DROUGHT                ICE STORM           EXCESSIVE HEAT 
##                     2488                     2006                     1678 
##               HIGH WINDS         WILD/FOREST FIRE             FROST/FREEZE 
##                     1533                     1457                     1342 
##                DENSE FOG       WINTER WEATHER/MIX           TSTM WIND/HAIL 
##                     1293                     1104                     1028 
##  EXTREME COLD/WIND CHILL                     HEAT                HIGH SURF 
##                     1002                      767                      725 
##           TROPICAL STORM           FLASH FLOODING             EXTREME COLD 
##                      690                      682                      655 
##            COASTAL FLOOD         LAKE-EFFECT SNOW        FLOOD/FLASH FLOOD 
##                      650                      636                      624 
##                LANDSLIDE                     SNOW          COLD/WIND CHILL 
##                      600                      587                      539 
##                      FOG              RIP CURRENT              MARINE HAIL 
##                      538                      470                      442 
##               DUST STORM                AVALANCHE                     WIND 
##                      427                      386                      340 
##             RIP CURRENTS              STORM SURGE            FREEZING RAIN 
##                      304                      261                      250 
##              URBAN FLOOD     HEAVY SURF/HIGH SURF        EXTREME WINDCHILL 
##                      249                      228                      204 
##             STRONG WINDS           DRY MICROBURST    ASTRONOMICAL LOW TIDE 
##                      196                      186                      174 
##                HURRICANE              RIVER FLOOD               LIGHT SNOW 
##                      174                      173                      154 
##         STORM SURGE/TIDE            RECORD WARMTH         COASTAL FLOODING 
##                      148                      146                      143 
##               DUST DEVIL         MARINE HIGH WIND        UNSEASONABLY WARM 
##                      141                      135                      126 
##                 FLOODING   ASTRONOMICAL HIGH TIDE        MODERATE SNOWFALL 
##                      120                      103                      101 
##           URBAN FLOODING               WINTRY MIX        HURRICANE/TYPHOON 
##                       98                       90                       88 
##            FUNNEL CLOUDS               HEAVY SURF              RECORD HEAT 
##                       87                       84                       81 
##                   FREEZE                HEAT WAVE                     COLD 
##                       74                       74                       72 
##              RECORD COLD                      ICE  THUNDERSTORM WINDS HAIL 
##                       64                       61                       61 
##      TROPICAL DEPRESSION                    SLEET         UNSEASONABLY DRY 
##                       60                       59                       56 
##                    FROST              GUSTY WINDS      THUNDERSTORM WINDSS 
##                       53                       53                       51 
##       MARINE STRONG WIND                    OTHER               SMALL HAIL 
##                       48                       48                       47 
##                   FUNNEL             FREEZING FOG             THUNDERSTORM 
##                       46                       45                       45 
##       Temperature record          TSTM WIND (G45)         Coastal Flooding 
##                       43                       39                       38 
##              WATERSPOUTS    MONTHLY PRECIPITATION                    WINDS 
##                       37                       36                       36 
##                  (Other) 
##                     2940
data$EVTYPE <- toupper(data$EVTYPE)
data$EVTYPE[data$EVTYPE == "AVALANCE"] <- "AVALANCHE"
data$EVTYPE[data$EVTYPE == "TSTM WIND"] <- "THUNDERSTORM WIND"
data$EVTYPE[data$EVTYPE == "FLASH FLOOD/FLOOD"] <- "FLASH FLOOD"
# and so on
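
Many of the remaining variants differ only in case, spacing, abbreviation or a plural ending, so part of the clean-up can be expressed as a few pattern rules instead of one assignment per label. The sketch below is illustrative: `clean_evtype` is a hypothetical helper, and which labels should be merged (for example RIP CURRENTS into RIP CURRENT) is an assumption that an analyst would need to confirm.

```r
# Hypothetical helper: normalize some spelling variants of event type labels
clean_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- gsub("\\s+", " ", x, perl = TRUE)                   # collapse repeated spaces
  x <- gsub("\\bTSTM\\b", "THUNDERSTORM", x, perl = TRUE)  # expand the abbreviation
  # Assumed merges: THUNDERSTORM WINDS / WINDSS -> WIND, RIP CURRENTS -> RIP CURRENT
  x <- gsub("^THUNDERSTORM WINDS+$", "THUNDERSTORM WIND", x, perl = TRUE)
  x <- gsub("^RIP CURRENTS$", "RIP CURRENT", x, perl = TRUE)
  x
}

cleaned <- clean_evtype(c("TSTM WIND", " Thunderstorm  Winds", "THUNDERSTORM WINDSS",
                          "RIP CURRENTS", "MARINE TSTM WIND"))
print(cleaned)
```

Applied to the real column this would read `data$EVTYPE <- clean_evtype(data$EVTYPE)`, ideally followed by a fresh look at the remaining distinct labels.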

Results

Most harmful events

In terms of fatalities

evt_fat <- aggregate(data$FATALITIES, by=list(data$EVTYPE), FUN=sum)
names(evt_fat) <- c("EventType", "Fatalities")
head(evt_fat[order(evt_fat$Fatalities, decreasing=TRUE),], n=15)
##             EventType Fatalities
## 752           TORNADO       5633
## 108    EXCESSIVE HEAT       1903
## 131       FLASH FLOOD        992
## 235              HEAT        937
## 406         LIGHTNING        816
## 682 THUNDERSTORM WIND        637
## 146             FLOOD        470
## 518       RIP CURRENT        368
## 313         HIGH WIND        248
## 10          AVALANCHE        225
## 885      WINTER STORM        206
## 519      RIP CURRENTS        204
## 239         HEAT WAVE        172
## 117      EXTREME COLD        162
## 266        HEAVY SNOW        127
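
The aggregate-then-sort pattern used here is repeated for injuries and for cost. One way to avoid the repetition is to wrap it in a small function. This is a sketch under assumptions: `top_events` is a hypothetical name and the toy data are made up.

```r
# Hypothetical helper: total `value` per event type, largest totals first
top_events <- function(value, evtype, n = 5) {
  totals <- aggregate(value, by = list(EventType = evtype), FUN = sum)
  names(totals)[2] <- "Total"
  head(totals[order(totals$Total, decreasing = TRUE), ], n = n)
}

# Toy data, made up for illustration
toy <- data.frame(EVTYPE = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
                  FATALITIES = c(3, 2, 4, 1, 0))
res <- top_events(toy$FATALITIES, toy$EVTYPE, n = 2)
print(res)
```

Passing `n` explicitly matters, because `head()` returns six rows when no count is given.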

Graphical

library(ggplot2)
# head() returns 6 rows by default, so request the top 5 explicitly
top_fat <- head(evt_fat[order(evt_fat$Fatalities, decreasing=TRUE),], n=5)
ggplot(top_fat, aes(x=reorder(EventType,Fatalities), y=Fatalities, fill=EventType)) +
  geom_bar(stat="identity") +
  labs(x="Event type", y="# Fatalities", title="5 Most harmful Events in terms of fatalities")

[Figure: bar chart of the top event types by fatalities]

In terms of injuries

evt_inj <- aggregate(data$INJURIES, by=list(data$EVTYPE), FUN=sum)
names(evt_inj) <- c("EventType", "Injuries")
head(evt_inj[order(evt_inj$Injuries, decreasing=TRUE),], n=15)
##              EventType Injuries
## 752            TORNADO    91346
## 682  THUNDERSTORM WIND     8445
## 146              FLOOD     6789
## 108     EXCESSIVE HEAT     6525
## 406          LIGHTNING     5230
## 235               HEAT     2100
## 381          ICE STORM     1975
## 131        FLASH FLOOD     1777
## 204               HAIL     1361
## 885       WINTER STORM     1321
## 365  HURRICANE/TYPHOON     1275
## 313          HIGH WIND     1137
## 266         HEAVY SNOW     1021
## 868           WILDFIRE      911
## 706 THUNDERSTORM WINDS      908

Graphical

# head() returns 6 rows by default, so request the top 5 explicitly
top_inj <- head(evt_inj[order(evt_inj$Injuries, decreasing=TRUE),], n=5)
ggplot(top_inj, aes(x=reorder(EventType,Injuries), y=Injuries, fill=EventType)) +
  geom_bar(stat="identity") +
  labs(x="Event type", y="# Injuries", title="5 Most harmful Events in terms of injuries")

[Figure: bar chart of the top event types by injuries]

Most costly events

The ranking below considers property damage only (PROPDMGAMOUNT); crop damage (CROPDMGAMOUNT) is not included.

evt_cost <- aggregate(data$PROPDMGAMOUNT, by=list(data$EVTYPE), FUN=sum)
names(evt_cost) <- c("EventType", "Amount")
evt_cost$Amount <- evt_cost$Amount / 100000  # express amounts in units of US$ 100,000
head(evt_cost[order(evt_cost$Amount, decreasing=TRUE),],n=15)
##             EventType  Amount
## 146             FLOOD 1446577
## 365 HURRICANE/TYPHOON  693058
## 752           TORNADO  569474
## 593       STORM SURGE  433235
## 131       FLASH FLOOD  170951
## 204              HAIL  157353
## 356         HURRICANE  118683
## 682 THUNDERSTORM WIND   79681
## 766    TROPICAL STORM   77039
## 885      WINTER STORM   66885
## 313         HIGH WIND   52700
## 523       RIVER FLOOD   51189
## 868          WILDFIRE   47651
## 594  STORM SURGE/TIDE   46412
## 381         ICE STORM   39449
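
The table above ranks events by property damage. If crop damage is also to count as economic loss, the two amounts can be added before aggregating. The sketch below uses a made-up data frame (`toy`) in place of `data`, and `TOTALDMG` is a hypothetical column name; it illustrates the calculation and does not reproduce any result from the database.

```r
# Toy stand-in for `data`; amounts are made up for illustration
toy <- data.frame(EVTYPE = c("FLOOD", "DROUGHT", "FLOOD", "HAIL"),
                  PROPDMGAMOUNT = c(5e6, 0, 2e6, 1e6),
                  CROPDMGAMOUNT = c(1e6, 9e6, 0, 3e6))

# Total economic loss per record, then per event type
toy$TOTALDMG <- toy$PROPDMGAMOUNT + toy$CROPDMGAMOUNT
evt_total <- aggregate(TOTALDMG ~ EVTYPE, data = toy, FUN = sum)
evt_total <- evt_total[order(evt_total$TOTALDMG, decreasing = TRUE), ]
print(evt_total)
```

In this toy example DROUGHT ranks first only because of crop damage, which shows how the choice of measure can change the ordering.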

Graphical

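# head() returns 6 rows by default, so request the top 5 explicitly
top_cost <- head(evt_cost[order(evt_cost$Amount, decreasing=TRUE),], n=5)
# Amount was divided by 100000 above, so the axis is in hundreds of thousands of US$
ggplot(top_cost, aes(x=reorder(EventType,Amount), y=Amount, fill=EventType)) +
  geom_bar(stat="identity") +
  labs(x="Event type", y="Amount in hundreds of thousands of US$", title="5 Most costly Events")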

[Figure: bar chart of the top event types by property damage]

Notes

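It is crucial to review and standardize the EVTYPE field, as it has major inconsistencies in coding and transcription. For example, RIP CURRENT and RIP CURRENTS, as well as THUNDERSTORM WIND and THUNDERSTORM WINDS, still appear as separate rows in the results above, so the totals for those events are split across several labels.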
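
One possible aid for this review is to list pairs of labels that differ by only a character or two, using the edit distance from `adist()` in the utils package. This is a sketch only: the threshold of 2 is an arbitrary assumption, and the pairs it returns are candidates for manual inspection, not automatic merges.

```r
# A few labels as they appear in the summary of EVTYPE
types <- c("THUNDERSTORM WIND", "THUNDERSTORM WINDS", "THUNDERSTORM WINDSS",
           "RIP CURRENT", "RIP CURRENTS", "TORNADO")

d <- adist(types)  # pairwise edit (Levenshtein) distances

# Keep each pair once (upper triangle) when the distance is 1 or 2
idx <- which(d > 0 & d <= 2 & upper.tri(d), arr.ind = TRUE)
candidates <- data.frame(a = types[idx[, "row"]],
                         b = types[idx[, "col"]],
                         distance = d[idx])
print(candidates)
```

On the full set of labels the distance matrix grows quadratically, so it may be sensible to run this on the distinct labels only (`unique(data$EVTYPE)`).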