HEALTH CONSEQUENCES AND ECONOMIC IMPACT OF STORMS AND OTHER SEVERE WEATHER EVENTS IN THE US

by John Sprockel D.

Date: 6/08/2020

SUMMARY

To analyze how storms and other severe weather events cause public health and economic problems for communities and municipalities, we explore the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which records events from 1950 through November 2011.

This report presents the ten most frequent causes of fatalities, injuries and property damage costs, as well as the states most affected by these natural disasters. It then plots the behavior of each type of disaster over time.

The analysis and graphs were carried out in the R statistical program using the base packages and the dplyr and ggplot2 libraries.

The R environment in which this data analysis was made is:

sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_4.0.2  magrittr_1.5    tools_4.0.2     htmltools_0.5.0
##  [5] stringi_1.4.6   rmarkdown_2.3   knitr_1.29      stringr_1.4.0  
##  [9] xfun_0.15       digest_0.6.25   rlang_0.4.6     evaluate_0.14

Data Processing

Download of database

download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", "StormData.csv.bz2")
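
The download step fetches the archive again on every run. A minimal sketch of a guard that skips the transfer when the file is already on disk; `fetch_if_missing` is a hypothetical helper name, not part of the original analysis:

```r
# Hypothetical helper: download the archive only when it is not already on disk.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    download.file(url, dest, mode = "wb")  # binary mode keeps the .bz2 intact
  }
  invisible(dest)
}

# Usage (not run here, it needs network access):
# fetch_if_missing(
#   "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#   "StormData.csv.bz2"
# )
```

The helper returns the destination path invisibly, so it can be passed straight to bzfile().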

library(dplyr)

db<- read.csv(bzfile("StormData.csv.bz2"))
# tbl_df() is deprecated as of dplyr 1.0.0; as_tibble() is the suggested replacement
db<- as_tibble(db)

Exploration of data

A general summary of the database is:

summary(db)
##     STATE__       BGN_DATE           BGN_TIME          TIME_ZONE        
##  Min.   : 1.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.:19.0   Class :character   Class :character   Class :character  
##  Median :30.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :31.2                                                           
##  3rd Qu.:45.0                                                           
##  Max.   :95.0                                                           
##                                                                         
##      COUNTY       COUNTYNAME           STATE              EVTYPE         
##  Min.   :  0.0   Length:902297      Length:902297      Length:902297     
##  1st Qu.: 31.0   Class :character   Class :character   Class :character  
##  Median : 75.0   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :100.6                                                           
##  3rd Qu.:131.0                                                           
##  Max.   :873.0                                                           
##                                                                          
##    BGN_RANGE          BGN_AZI           BGN_LOCATI          END_DATE        
##  Min.   :   0.000   Length:902297      Length:902297      Length:902297     
##  1st Qu.:   0.000   Class :character   Class :character   Class :character  
##  Median :   0.000   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :   1.484                                                           
##  3rd Qu.:   1.000                                                           
##  Max.   :3749.000                                                           
##                                                                             
##    END_TIME           COUNTY_END COUNTYENDN       END_RANGE       
##  Length:902297      Min.   :0    Mode:logical   Min.   :  0.0000  
##  Class :character   1st Qu.:0    NA's:902297    1st Qu.:  0.0000  
##  Mode  :character   Median :0                   Median :  0.0000  
##                     Mean   :0                   Mean   :  0.9862  
##                     3rd Qu.:0                   3rd Qu.:  0.0000  
##                     Max.   :0                   Max.   :925.0000  
##                                                                   
##    END_AZI           END_LOCATI            LENGTH              WIDTH         
##  Length:902297      Length:902297      Min.   :   0.0000   Min.   :   0.000  
##  Class :character   Class :character   1st Qu.:   0.0000   1st Qu.:   0.000  
##  Mode  :character   Mode  :character   Median :   0.0000   Median :   0.000  
##                                        Mean   :   0.2301   Mean   :   7.503  
##                                        3rd Qu.:   0.0000   3rd Qu.:   0.000  
##                                        Max.   :2315.0000   Max.   :4400.000  
##                                                                              
##        F               MAG            FATALITIES          INJURIES        
##  Min.   :0.0      Min.   :    0.0   Min.   :  0.0000   Min.   :   0.0000  
##  1st Qu.:0.0      1st Qu.:    0.0   1st Qu.:  0.0000   1st Qu.:   0.0000  
##  Median :1.0      Median :   50.0   Median :  0.0000   Median :   0.0000  
##  Mean   :0.9      Mean   :   46.9   Mean   :  0.0168   Mean   :   0.1557  
##  3rd Qu.:1.0      3rd Qu.:   75.0   3rd Qu.:  0.0000   3rd Qu.:   0.0000  
##  Max.   :5.0      Max.   :22000.0   Max.   :583.0000   Max.   :1700.0000  
##  NA's   :843563                                                           
##     PROPDMG         PROPDMGEXP           CROPDMG         CROPDMGEXP       
##  Min.   :   0.00   Length:902297      Min.   :  0.000   Length:902297     
##  1st Qu.:   0.00   Class :character   1st Qu.:  0.000   Class :character  
##  Median :   0.00   Mode  :character   Median :  0.000   Mode  :character  
##  Mean   :  12.06                      Mean   :  1.527                     
##  3rd Qu.:   0.50                      3rd Qu.:  0.000                     
##  Max.   :5000.00                      Max.   :990.000                     
##                                                                           
##      WFO             STATEOFFIC         ZONENAMES            LATITUDE   
##  Length:902297      Length:902297      Length:902297      Min.   :   0  
##  Class :character   Class :character   Class :character   1st Qu.:2802  
##  Mode  :character   Mode  :character   Mode  :character   Median :3540  
##                                                           Mean   :2875  
##                                                           3rd Qu.:4019  
##                                                           Max.   :9706  
##                                                           NA's   :47    
##    LONGITUDE        LATITUDE_E     LONGITUDE_       REMARKS         
##  Min.   :-14451   Min.   :   0   Min.   :-14455   Length:902297     
##  1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0   Class :character  
##  Median :  8707   Median :   0   Median :     0   Mode  :character  
##  Mean   :  6940   Mean   :1452   Mean   :  3509                     
##  3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735                     
##  Max.   : 17124   Max.   :9706   Max.   :106220                     
##                   NA's   :40                                        
##      REFNUM      
##  Min.   :     1  
##  1st Qu.:225575  
##  Median :451149  
##  Mean   :451149  
##  3rd Qu.:676723  
##  Max.   :902297  
## 

We can see that EVTYPE, FATALITIES, INJURIES and PROPDMG do not have any missing values.
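
For character columns such as EVTYPE, summary() shows only length, class and mode, so a direct count of missing values is a useful cross-check. A minimal sketch on a toy stand-in for `db` (the values are illustrative, not taken from the database):

```r
# Toy stand-in for `db`; the real table has 902,297 rows.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(1, 0, 2),
  INJURIES   = c(5, 0, 1),
  PROPDMG    = c(25, 2.5, 10),
  F          = c(3, NA, NA)   # F-scale column, left missing for two rows
)

# Count missing values per column; a zero means the column is complete.
na_counts <- colSums(is.na(toy))
print(na_counts)
```

Applied to the real data, the same call would be `colSums(is.na(db))`.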

  • Column names are:
colnames(db)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"

The BGN_DATE variable is converted to the Date class so that events can be grouped by year. No other actions were necessary to obtain the tidy data.

db$BGN_DATE<- as.Date(db$BGN_DATE, "%m/%d/%Y %H:%M:%S")
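
To illustrate the conversion, here is a small sketch on two sample strings written in the month/day/year layout that the format string expects. The sample values are assumptions chosen for illustration:

```r
# Sample strings in the layout the format string expects (assumed values).
raw <- c("4/18/1950 0:00:00", "11/28/2011 0:00:00")

# The time part is matched by the format but discarded, since Date stores days only.
dates <- as.Date(raw, format = "%m/%d/%Y %H:%M:%S")

# The year is later extracted the same way as in the plotting code of this report.
years <- as.numeric(format(dates, "%Y"))

print(dates)
print(years)
```

Strings that do not match the format come back as NA, so checking `sum(is.na(db$BGN_DATE))` after the conversion is a cheap sanity test.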

RESULTS

MAIN CAUSES OF DEATHS, INJURIES AND COSTS

Since 1950, storms and other severe weather events have caused 15,145 deaths and 140,528 injuries, with approximately 10,884,500 in recorded property damage (the sum of the PROPDMG column, without applying the PROPDMGEXP magnitude codes).
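
The property damage totals in this report are computed from PROPDMG as recorded (the table code calls sum(PROPDMG)). In the NOAA storm data the companion column PROPDMGEXP carries a magnitude code, so a dollar figure would need the two combined. A minimal sketch on toy rows, assuming K, M and B stand for thousands, millions and billions and that any other code is left unscaled; `PROPDMG_USD` is a hypothetical column name:

```r
# Toy rows; values are illustrative, not taken from the database.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1.2, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Assumed mapping: K = thousands, M = millions, B = billions.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each code; codes outside the mapping give NA and are left unscaled.
mult <- unname(multiplier[toupper(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1

toy$PROPDMG_USD <- toy$PROPDMG * mult
print(toy)
```

How to treat the remaining codes in the real column is a separate decision that this sketch does not settle.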

As can be seen in the following table, tornadoes are the leading cause of death, injury, and economic cost since 1950.

top10<- cbind(
  db %>%
        group_by(EVTYPE) %>%
        summarize(totalF = sum(FATALITIES))%>%
        arrange(desc(totalF)), 
  db %>%
        group_by(EVTYPE) %>%
        summarize(totalI = sum(INJURIES))%>%
        arrange(desc(totalI)), 
  db %>%
        group_by(EVTYPE) %>%
        summarize(totalP = sum(PROPDMG))%>%
        arrange(desc(totalP)))
## `summarise()` ungrouping output (override with `.groups` argument)
## `summarise()` ungrouping output (override with `.groups` argument)
## `summarise()` ungrouping output (override with `.groups` argument)
head(top10, 10)
##            EVTYPE totalF            EVTYPE totalI             EVTYPE    totalP
## 1         TORNADO   5633           TORNADO  91346            TORNADO 3212258.2
## 2  EXCESSIVE HEAT   1903         TSTM WIND   6957        FLASH FLOOD 1420124.6
## 3     FLASH FLOOD    978             FLOOD   6789          TSTM WIND 1335965.6
## 4            HEAT    937    EXCESSIVE HEAT   6525              FLOOD  899938.5
## 5       LIGHTNING    816         LIGHTNING   5230  THUNDERSTORM WIND  876844.2
## 6       TSTM WIND    504              HEAT   2100               HAIL  688693.4
## 7           FLOOD    470         ICE STORM   1975          LIGHTNING  603351.8
## 8     RIP CURRENT    368       FLASH FLOOD   1777 THUNDERSTORM WINDS  446293.2
## 9       HIGH WIND    248 THUNDERSTORM WIND   1488          HIGH WIND  324731.6
## 10      AVALANCHE    224              HAIL   1361       WINTER STORM  132720.6
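
The table lists TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS as separate event types. If these labels are treated as one event (an assumption, not a step of the original analysis), their totals can be merged before ranking. A minimal sketch on toy values, with `EVTYPE_CLEAN` as a hypothetical column name:

```r
# Toy totals using three labels that appear separately in the table.
toy <- data.frame(
  EVTYPE  = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS", "HAIL"),
  PROPDMG = c(10, 20, 5, 7),
  stringsAsFactors = FALSE
)

# Assumed recoding: treat the three thunderstorm-wind spellings as one event type.
tstm_labels <- c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS")
toy$EVTYPE_CLEAN <- ifelse(toy$EVTYPE %in% tstm_labels, "THUNDERSTORM WIND", toy$EVTYPE)

# Sum the damage per cleaned label.
merged <- aggregate(PROPDMG ~ EVTYPE_CLEAN, data = toy, FUN = sum)
print(merged)
```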

Tornadoes were the cause of 37.2% (5,633 / 15,145) of deaths and 65.0% (91,346 / 140,528) of injuries, and accounted for 29.5% of the total costs reported in the database since 1950.
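
The shares quoted above follow directly from the totals in the text and the tornado row of the table. A short sketch of the arithmetic, using the property damage total rounded as it is reported in the text:

```r
# Totals and tornado figures quoted in the text and in the top-10 table.
total   <- c(fatalities = 15145, injuries = 140528, propdmg = 10884500)
tornado <- c(fatalities = 5633,  injuries = 91346,  propdmg = 3212258.2)

# Tornado share of each total, in percent, rounded to one decimal place.
share <- round(100 * tornado / total, 1)
print(share)
```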

  • By States

In this table you can see the top 10 states classified by number of fatalities, injuries or property damage cost.

top10states<- cbind(
  db %>%
                group_by(STATE) %>%
                summarize(totalF = sum(FATALITIES))%>%
                arrange(desc(totalF)), 
  db %>%
                group_by(STATE) %>%
                summarize(totalI = sum(INJURIES))%>%
                arrange(desc(totalI)), 
  db %>%
                group_by(STATE) %>%
                summarize(totalP = sum(PROPDMG))%>%
                arrange(desc(totalP)))
## `summarise()` ungrouping output (override with `.groups` argument)
## `summarise()` ungrouping output (override with `.groups` argument)
## `summarise()` ungrouping output (override with `.groups` argument)
head(top10states, 10)
##    STATE totalF STATE totalI STATE   totalP
## 1     IL   1421    TX  17667    TX 937138.0
## 2     TX   1366    MO   8998    IA 685487.3
## 3     PA    846    AL   8742    OH 559834.4
## 4     AL    784    OH   7112    GA 485873.7
## 5     MO    754    MS   6675    MS 481811.8
## 6     FL    746    FL   5918    KS 387183.8
## 7     MS    555    OK   5710    FL 374428.0
## 8     CA    550    IL   5563    NY 373109.4
## 9     AR    530    AR   5550    AL 363606.7
## 10    TN    521    TN   5202    AR 361121.6

In general, Texas is the most affected state. Only Illinois exceeds it in the number of fatalities.
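
Each pair of columns in the table is ranked independently, so a given row does not describe a single state across all three measures. The same per-measure ranking can be sketched in base R with a small helper; `top_by_total` is a hypothetical function name and the toy values are illustrative:

```r
# Hypothetical helper: total `value` within each `group` and keep the n largest.
top_by_total <- function(data, group, value, n = 10) {
  totals <- aggregate(data[[value]], by = list(data[[group]]), FUN = sum)
  names(totals) <- c(group, value)
  head(totals[order(-totals[[value]]), ], n)
}

toy <- data.frame(
  STATE      = c("TX", "IL", "TX", "AL", "IL"),
  FATALITIES = c(3, 5, 1, 2, 4),
  INJURIES   = c(40, 2, 10, 30, 1)
)

# Rank the states separately for each measure.
top_f <- top_by_total(toy, "STATE", "FATALITIES", n = 2)
top_i <- top_by_total(toy, "STATE", "INJURIES", n = 2)
print(top_f)
print(top_i)
```

In this toy data the two rankings start with different states, which mirrors how Illinois leads fatalities while Texas leads injuries in the real table.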

FATALITIES

The behavior over time of the top 10 causes of death from storms and other severe weather events since 1950 was:

# Top 10 causes of death, taken from the table above
mortCauses<- c("TORNADO", "EXCESSIVE HEAT", "FLASH FLOOD", "HEAT", "LIGHTNING",
               "TSTM WIND", "FLOOD", "RIP CURRENT", "HIGH WIND", "AVALANCHE")
mort<- db %>%
  filter(EVTYPE %in% mortCauses) %>%
  mutate(year = as.numeric(format(BGN_DATE,'%Y')))  # year of each event

mortTpo<-aggregate(FATALITIES ~ year+EVTYPE, mort, sum)

library(ggplot2)
g<- ggplot(mortTpo, aes(year, FATALITIES, col= EVTYPE))
g+ geom_line(size=2)+ theme_bw(base_family = "Times")+ 
  labs(title = "Fatalities from Storms and Other Severe Weather Events by Year")

We can note that for some natural disasters, information has only been available since the mid-1990s, whereas tornado records have been collected from the beginning. In some years, individual disasters caused great damage in a short space of time.
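
One way to check when records for each event type begin is to take the earliest year per type. A minimal sketch on toy rows; the years are illustrative and not taken from the database:

```r
# Toy yearly records; the years are illustrative, not taken from the database.
toy <- data.frame(
  EVTYPE = c("TORNADO", "TORNADO", "FLASH FLOOD", "FLASH FLOOD", "RIP CURRENT"),
  year   = c(1950, 1996, 1993, 2001, 1995)
)

# Earliest year in which each event type appears.
first_year <- aggregate(year ~ EVTYPE, data = toy, FUN = min)
print(first_year)
```

On the real data the same call could be run on `mort`, which already carries the `year` column.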

INJURIES

The behavior over time of the top 10 causes of injuries from storms and other severe weather events since 1950 was:

# Top 10 causes of injuries, taken from the table above
injCauses<- c("TORNADO", "TSTM WIND", "FLOOD", "EXCESSIVE HEAT", "LIGHTNING",
              "HEAT", "ICE STORM", "FLASH FLOOD", "THUNDERSTORM WIND", "HAIL")
inj<- db %>%
  filter(EVTYPE %in% injCauses) %>%
  mutate(year = as.numeric(format(BGN_DATE,'%Y')))  # year of each event

injTpo<-aggregate(INJURIES ~ year+EVTYPE, inj, sum)

library(ggplot2)
g<- ggplot(injTpo, aes(year, INJURIES, col= EVTYPE))
g+ geom_line(size=2)+ theme_bw(base_family = "Times")+ 
  labs(title = "Injuries from Storms and Other Severe Weather Events by Year")

PROPERTY DAMAGE ESTIMATES

The behavior over time of the top 10 causes of property damage from storms and other severe weather events since 1950 was:

# Top 10 causes of property damage, taken from the table above
pdmgCauses<- c("TORNADO", "FLASH FLOOD", "TSTM WIND", "FLOOD", "THUNDERSTORM WIND",
               "HAIL", "LIGHTNING", "THUNDERSTORM WINDS", "HIGH WIND", "WINTER STORM")
pdmg<- db %>%
  filter(EVTYPE %in% pdmgCauses) %>%
  mutate(year = as.numeric(format(BGN_DATE,'%Y')))  # year of each event

pdmgTpo<-aggregate(PROPDMG ~ year+EVTYPE, pdmg, sum)

library(ggplot2)
g<- ggplot(pdmgTpo, aes(year, PROPDMG, col= EVTYPE))
g+ geom_line(size=2)+ theme_bw(base_family = "Times")+ 
  labs(title = "Property Damage from Storms and Other Severe Weather Events by Year")

As expected, although heat is a major cause of injury and death, it does not appear among the top 10 causes of property damage.