Storm and Weather Events in the United States: Fatalities, Injuries, and Property Damage

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The data are available in the link:

The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

Synopsis

An analysis of the storm events database shows that tornadoes are the most harmful weather event for population health, in both deaths and injuries. Excessive heat ranks second for deaths, while thunderstorm wind, flood, and excessive heat follow tornadoes for injuries. Drought caused the greatest crop damage, followed by flood and river flood. Flood caused the greatest property damage, followed by hurricane/typhoon, tornado, and storm surge.

Data Processing

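# Load the data: download the compressed file once, then read it
rm(list = ls())
url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
datzip <- 'StormData.csv.bz2'
if (!file.exists(datzip)) {
  # mode = "wb" keeps the binary archive intact on Windows
  download.file(url, datzip, mode = "wb")
}
# read.csv() can read the bzip2-compressed file directly
DataStorm <- read.csv(datzip)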

Preprocessing of the EVTYPE variable: convert all letters to lowercase and replace every blank and punctuation character (for example /, ?, +) with a space.

EVENTYPE<- tolower(DataStorm$EVTYPE)
EVENTYPE<-gsub("[[:blank:][:punct:]+]", " ",EVENTYPE)
DataStorm$EVTYPE<-EVENTYPE

Before this step the EVTYPE variable had 985 distinct categories; after it, EVTYPE has 874.
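The effect of this normalization on the number of categories can be illustrated with a minimal sketch. The labels in `raw` are invented for illustration and are not read from the NOAA file; the transformation is the same `tolower()` plus `gsub()` pair used above.

```r
# Toy labels, invented for illustration (not taken from the NOAA file).
raw <- c("TSTM WIND", "tstm wind", "Hurricane/Typhoon",
         "HURRICANE/TYPHOON", "Frost/Freeze", "FLOOD")

# Same two steps as in the preprocessing: lowercase, then replace each
# blank or punctuation character with a single space.
clean <- gsub("[[:blank:][:punct:]+]", " ", tolower(raw))

# Count distinct labels before and after normalization.
n_before <- length(unique(raw))
n_after  <- length(unique(clean))
print(c(before = n_before, after = n_after))
```

Labels that differ only in case or in a separator collapse into one category, which is why the count of distinct values drops. Labels that differ in spelling are not merged by this step.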

Types of events most harmful to population health

library(dplyr)
DataStorm.by.Eventype<-group_by(DataStorm,EVTYPE)
Total.Fatalities<-summarize(DataStorm.by.Eventype, Fatalities=sum(FATALITIES))
Total.Fatalities.ordered<-head(arrange(Total.Fatalities,desc(Fatalities)),15)
Total.Fatalities.ordered
## # A tibble: 15 x 2
##    EVTYPE            Fatalities
##    <chr>                  <dbl>
##  1 tornado                 5633
##  2 excessive heat          1903
##  3 flash flood              978
##  4 heat                     937
##  5 lightning                816
##  6 tstm wind                504
##  7 flood                    470
##  8 rip current              368
##  9 high wind                248
## 10 avalanche                224
## 11 winter storm             206
## 12 rip currents             204
## 13 heat wave                172
## 14 extreme cold             162
## 15 thunderstorm wind        133

Tornadoes are the leading cause of weather-related deaths in the database.

Total.Injuries<-summarize(DataStorm.by.Eventype, Injuries=sum(INJURIES))
Total.Injuries.ordered<-head(arrange(Total.Injuries,desc(Injuries)),15)
Total.Injuries.ordered
## # A tibble: 15 x 2
##    EVTYPE            Injuries
##    <chr>                <dbl>
##  1 tornado              91346
##  2 tstm wind             6957
##  3 flood                 6789
##  4 excessive heat        6525
##  5 lightning             5230
##  6 heat                  2100
##  7 ice storm             1975
##  8 flash flood           1777
##  9 thunderstorm wind     1488
## 10 hail                  1361
## 11 winter storm          1321
## 12 hurricane typhoon     1275
## 13 high wind             1137
## 14 heavy snow            1021
## 15 wildfire               911

Tornadoes are also the leading cause of weather-related injuries in the database.

par(mfrow = c(1, 2), mar = c(10, 4, 2, 2), las = 3, cex = 0.7, cex.main = 1.4, cex.lab = 1.2)
barplot(Total.Fatalities.ordered$Fatalities, names.arg = Total.Fatalities.ordered$EVTYPE, col = 'blue',
        main = 'Top 15 Weather Events for Fatalities', ylab = 'Number of Fatalities')
barplot(Total.Injuries.ordered$Injuries, names.arg = Total.Injuries.ordered$EVTYPE, col = 'red',
        main = 'Top 15 Weather Events for Injuries', ylab = 'Number of Injuries')

The analysis shows that tornadoes are the most harmful weather event for population health, in both deaths and injuries. Excessive heat ranks second for deaths, while thunderstorm wind (recorded as "tstm wind"), flood, and excessive heat rank second to fourth for injuries.
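The tables above list some events under more than one label, for example "tstm wind" and "thunderstorm wind", or "rip current" and "rip currents". The following is a minimal sketch of how such rows might be merged before ranking, assuming these labels refer to the same event type. The `synonyms` lookup is hypothetical and is not part of the analysis above; the fatality counts are copied from the fatalities table.

```r
# Fatality totals for four labels, copied from the table above.
totals <- data.frame(
  EVTYPE     = c("tstm wind", "thunderstorm wind", "rip current", "rip currents"),
  Fatalities = c(504, 133, 368, 204),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: variant label -> canonical label (an assumption).
synonyms <- c("tstm wind" = "thunderstorm wind",
              "rip currents" = "rip current")

# Replace labels found in the lookup; leave all other labels untouched.
hit <- totals$EVTYPE %in% names(synonyms)
totals$EVTYPE[hit] <- unname(synonyms[totals$EVTYPE[hit]])

# Re-aggregate by the merged labels and sort in decreasing order.
merged <- aggregate(Fatalities ~ EVTYPE, data = totals, FUN = sum)
merged <- merged[order(-merged$Fatalities), ]
print(merged)
```

Using only the four rows shown, the merged totals are 637 fatalities for thunderstorm wind and 572 for rip current. Tornado would still rank first by a wide margin, so the main conclusion does not depend on this choice.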

Types of events with the greatest economic consequences

exp_transform <- function(e) {
  # h -> hundred, k -> thousand, m -> million, b -> billion
  if (e %in% c('h', 'H'))
    return(2)
  else if (e %in% c('k', 'K'))
    return(3)
  else if (e %in% c('m', 'M'))
    return(6)
  else if (e %in% c('b', 'B'))
    return(9)
  else if (e %in% c('', '-', '?', '+')) # blank or symbol: no scaling (10^0)
    return(0)
  else if (grepl('^[0-9]+$', e)) # a digit string is used directly as the exponent
    return(as.numeric(as.character(e)))
  else {
    stop("Invalid exponent value.")
  }
}
# Damage in dollars = reported value * 10^exponent
prop_dmg_exp <- sapply(DataStorm$PROPDMGEXP, FUN = exp_transform)
DataStorm$prop_dmg <- DataStorm$PROPDMG * (10 ^ prop_dmg_exp)
crop_dmg_exp <- sapply(DataStorm$CROPDMGEXP, FUN = exp_transform)
DataStorm$crop_dmg <- DataStorm$CROPDMG * (10 ^ crop_dmg_exp)
DataStorm.by.Eventype<-group_by(DataStorm,EVTYPE)
Total.crop.dmg<-summarize(DataStorm.by.Eventype, crop_damage=sum(crop_dmg))
Total.cropdmg.ordered<-head(arrange(Total.crop.dmg,desc(crop_damage)),15)
Total.cropdmg.ordered
## # A tibble: 15 x 2
##    EVTYPE            crop_damage
##    <chr>                   <dbl>
##  1 drought           13972566000
##  2 flood              5661968450
##  3 river flood        5029459000
##  4 ice storm          5022113500
##  5 hail               3025954473
##  6 hurricane          2741910000
##  7 hurricane typhoon  2607872800
##  8 flash flood        1421317100
##  9 extreme cold       1312973000
## 10 frost freeze       1094186000
## 11 heavy rain          733399800
## 12 tropical storm      678346000
## 13 high wind           638571300
## 14 tstm wind           554007350
## 15 excessive heat      492402000
Total.prop.dmg<-summarize(DataStorm.by.Eventype, prop_damage=sum(prop_dmg))
Total.propdmg.ordered<-head(arrange(Total.prop.dmg,desc(prop_damage)),15)
Total.propdmg.ordered
## # A tibble: 15 x 2
##    EVTYPE              prop_damage
##    <chr>                     <dbl>
##  1 flood             144657709807 
##  2 hurricane typhoon  69305840000 
##  3 tornado            56947380676.
##  4 storm surge        43323536000 
##  5 flash flood        16822673978.
##  6 hail               15735267513.
##  7 hurricane          11868319010 
##  8 tropical storm      7703890550 
##  9 winter storm        6688497251 
## 10 high wind           5270046295 
## 11 river flood         5118945500 
## 12 wildfire            4765114000 
## 13 storm surge tide    4641188000 
## 14 tstm wind           4484958495 
## 15 ice storm           3944927860
par(mfrow = c(1, 2), mar = c(10, 4, 2, 2), las = 3, cex = 0.7, cex.main = 1.4, cex.lab = 1.2)
barplot(Total.propdmg.ordered$prop_damage, names.arg = Total.propdmg.ordered$EVTYPE, col = 'blue',
        main = 'Weather cost to the US Economy', ylab = 'Property damage in dollars')

barplot(Total.cropdmg.ordered$crop_damage, names.arg = Total.cropdmg.ordered$EVTYPE, col = 'red',
        main = 'Weather cost to the US Economy', ylab = 'Crop damage in dollars')

Drought caused the greatest crop damage, followed by flood and river flood. Flood caused the greatest property damage, followed by hurricane/typhoon, tornado, and storm surge.
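The dollar amounts above depend on how the exponent codes in PROPDMGEXP and CROPDMGEXP are decoded. As a minimal sketch, the same rules as in `exp_transform` (h = 2, k = 3, m = 6, b = 9, digits used as-is, and blank, "-", "?", "+" treated as 0) can be written as a vectorized lookup, which avoids calling a function once per row. The name `decode_exp` and the sample values are hypothetical and chosen only for illustration.

```r
# Hypothetical vectorized counterpart of exp_transform.
decode_exp <- function(e) {
  e <- tolower(as.character(e))
  lookup <- c(h = 2, k = 3, m = 6, b = 9)
  out <- unname(lookup[e])                    # NA where e is not a letter code
  is_digit <- grepl("^[0-9]+$", e)
  out[is_digit] <- as.numeric(e[is_digit])    # digit strings are used as-is
  out[e %in% c("", "-", "?", "+")] <- 0       # blank or symbol: 10^0
  if (anyNA(out)) stop("Invalid exponent value.")
  out
}

# Invented sample values and exponent codes.
vals <- c(25, 2.5, 10, 1, 1.5, 3)
exps <- c("K", "m", "", "5", "B", "+")

# Damage in dollars = reported value * 10^exponent.
dmg <- vals * 10^decode_exp(exps)
print(dmg)
```

For example, a reported value of 25 with the code "K" is decoded as 25 * 10^3 = 25,000 dollars, and 1.5 with "B" as 1.5 billion dollars. Like `exp_transform`, the sketch stops with an error on any code it does not recognize.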