Report: Effects of weather events on human health and economy

Mahmoud Shaaban

Summary

Here we report on the effects of storms and other severe weather events on population health and the economy across the USA. The data used here were collected by NOAA. A few transformations were applied to the original data set, mainly to group the data by event type and then to show the total effects by state and county.

Introduction

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Data processing

The data were first read from the provided original file, which has a .csv.bz2 extension. No further transformation was done at this step. The data set is stored in the dat object for further analysis, and the following code explores the dimensions and structure of dat.

# reading data
dat <- read.csv("repdata-data-StormData.csv.bz2")
dim(dat)
## [1] 902297     37
str(dat)
## 'data.frame':    902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : Factor w/ 16335 levels "10/10/1954 0:00:00",..: 6523 6523 4213 11116 1426 1426 1462 2873 3980 3980 ...
##  $ BGN_TIME  : Factor w/ 3608 levels "000","0000","00:00:00 AM",..: 212 257 2645 1563 2524 3126 122 1563 3126 3126 ...
##  $ TIME_ZONE : Factor w/ 22 levels "ADT","AKS","AST",..: 7 7 7 7 7 7 7 7 7 7 ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: Factor w/ 29601 levels "","5NM E OF MACKINAC BRIDGE TO PRESQUE ISLE LT MI",..: 13513 1873 4598 10592 4372 10094 1973 23873 24418 4598 ...
##  $ STATE     : Factor w/ 72 levels "AK","AL","AM",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ EVTYPE    : Factor w/ 985 levels "?","ABNORMALLY DRY",..: 830 830 830 830 830 830 830 830 830 830 ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : Factor w/ 35 levels "","E","Eas","EE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_LOCATI: Factor w/ 54429 levels "","?","(01R)AFB GNRY RNG AL",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_DATE  : Factor w/ 6663 levels "","10/10/1993 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_TIME  : Factor w/ 3647 levels "","?","0000",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : Factor w/ 24 levels "","E","ENE","ESE",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ END_LOCATI: Factor w/ 34506 levels "","(0E4)PAYSON ARPT",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ WFO       : Factor w/ 542 levels "","2","43","9V9",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ STATEOFFIC: Factor w/ 250 levels "","ALABAMA, Central",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ ZONENAMES : Factor w/ 25112 levels "","                                                                                                                               "| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : Factor w/ 436781 levels ""," ","  ","   ",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
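
The reading step relies on read.csv() decompressing a bzip2 file on the fly when it is given the file path. The following is a minimal sketch of that behaviour on a toy data frame, not the storm file itself; the objects toy, path and toy_back are hypothetical names, and stringsAsFactors = FALSE is an optional choice that keeps text columns as character rather than factor.

```r
# Toy stand-in for the storm file; only the column names mirror the real data
toy <- data.frame(EVTYPE = c("TORNADO", "HAIL"), INJURIES = c(15, 0))

# Write a bzip2-compressed CSV to a temporary file
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "wt")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and decompresses while reading
toy_back <- read.csv(path, stringsAsFactors = FALSE)
dim(toy_back)
```

Because the decompression is transparent, the same call works whether the file is compressed or plain CSV.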

Results

Effects of severe weather events on population health

To summarize the effect of the various weather events on population health, a few transformations were done using the dplyr package. First, we summed the injuries and fatalities by event type and reported the ten event types with the biggest effect on population health, ranked by total injuries with total fatalities shown alongside. Then we show the same totals for each state and county.

The following chunk of code sums the injuries and fatalities by event type and reports the 10 event types that had the biggest impact on population health across the USA.

# summary of injuries and fatalities with different types of events
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
event <- dat %>%
    group_by(EVTYPE) %>%
    summarize(injuries = sum(INJURIES), fatalities = sum(FATALITIES))

topinj <- arrange(event, desc(injuries))
head(topinj, 10) # top ten events by total injuries
## Source: local data frame [10 x 3]
## 
##               EVTYPE injuries fatalities
##               (fctr)    (dbl)      (dbl)
## 1            TORNADO    91346       5633
## 2          TSTM WIND     6957        504
## 3              FLOOD     6789        470
## 4     EXCESSIVE HEAT     6525       1903
## 5          LIGHTNING     5230        816
## 6               HEAT     2100        937
## 7          ICE STORM     1975         89
## 8        FLASH FLOOD     1777        978
## 9  THUNDERSTORM WIND     1488        133
## 10              HAIL     1361         15
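
The table above lists TSTM WIND and THUNDERSTORM WIND as separate rows, which suggests that some event types are split across several labels. The following is a minimal sketch of one way such labels could be merged before summing. It uses toy data and base R only; clean_evtype is a hypothetical helper, and the merging rules are assumptions that would need checking against the full list of EVTYPE levels.

```r
# Toy data; labels mimic variants that appear in the tables of this report
toy <- data.frame(
    EVTYPE   = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS",
                 "Funnel Cloud", "FUNNEL CLOUD", "TORNADO"),
    INJURIES = c(3, 2, 1, 0, 0, 10),
    stringsAsFactors = FALSE
)

# Hypothetical helper: upper-case, trim, and merge thunderstorm wind variants
clean_evtype <- function(x) {
    x <- toupper(trimws(as.character(x)))
    x <- sub("^TSTM", "THUNDERSTORM", x)
    x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)
    x
}

toy$EVCLEAN <- clean_evtype(toy$EVTYPE)

# Sum injuries per cleaned label and sort from largest to smallest
by_event <- aggregate(INJURIES ~ EVCLEAN, data = toy, FUN = sum)
by_event[order(-by_event$INJURIES), ]
```

In this toy example the six raw labels collapse to three, so the thunderstorm wind rows are counted together.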

The next chunk of code sums the injuries and fatalities by state and event type, then reports the total injuries and fatalities for each state in two bar graphs. The top graph shows total injuries for each state over the whole recorded period. The bottom graph shows the total number of fatalities for each state over the same period.

# summary of injuries and fatalities across the states caused by different events

states <- dat %>% 
    group_by(STATE, EVTYPE) %>% 
    summarize(fatalities = sum(FATALITIES),
              injuries = sum(INJURIES))

totals <- states %>% group_by(STATE) %>% summarize(injuries = sum(injuries), fatalities = sum(fatalities))
totals$total <- totals$injuries + totals$fatalities

par(mfrow=c(2,1), mar = c(2,2,5,2), cex.axis = 0.5)
# main must be an argument of barplot(), not of with(), for the title to be drawn
with(totals, 
     barplot(injuries,
             names.arg = STATE,
             col = "red",
             main = "Number of injuries per state"))

with(totals, 
     barplot(fatalities,
             names.arg = STATE,
             col = "blue",
             main = "Number of fatalities per state"))

The third chunk of code sums the injuries and fatalities by county and event type, stores the result in the object counties, then reports the first few rows of the object.

# summary of injuries and fatalities across the counties caused by different events

counties <- dat %>% 
    group_by(COUNTYNAME, EVTYPE) %>% 
    summarize(fatalities = sum(FATALITIES),
              injuries = sum(INJURIES))
head(counties)
## Source: local data frame [6 x 4]
## Groups: COUNTYNAME [1]
## 
##   COUNTYNAME           EVTYPE fatalities injuries
##       (fctr)           (fctr)      (dbl)    (dbl)
## 1               Coastal Flood          0        0
## 2            Coastal Flooding          0        0
## 3                 FLASH FLOOD          0        0
## 4                Funnel Cloud          0        0
## 5                FUNNEL CLOUD          0        0
## 6                        HAIL          0        0
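
The first rows above all belong to an empty county name, and the same county name can occur in more than one state (Alabama and Georgia both have a Washington County, for example). The following is a minimal sketch, on toy data with base R, of dropping unnamed rows and grouping by state and county together; the object names and values are hypothetical and chosen only for illustration.

```r
# Toy data: the same county name appears in two states, plus one unnamed row
toy <- data.frame(
    STATE      = c("AL", "AL", "GA", "GA"),
    COUNTYNAME = c("WASHINGTON", "WASHINGTON", "WASHINGTON", ""),
    INJURIES   = c(2, 3, 7, 1),
    stringsAsFactors = FALSE
)

# Drop rows without a county name, then total per state/county pair
named <- toy[toy$COUNTYNAME != "", ]
by_county <- aggregate(INJURIES ~ STATE + COUNTYNAME, data = named, FUN = sum)
by_county
```

Grouping on both columns keeps the two Washington counties apart instead of adding them into a single row.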

Effects of severe weather events on economy

Similar transformations using the dplyr package were done to summarize the effects of severe weather events on the economy across the USA. Data were first grouped by event type, and the reported damage to property and crops was summed. First, the events with the biggest effect are reported; a summary of the effect on each state and county follows.

The following chunk of code sums the property and crop damage by event type and reports the 10 event types with the largest total property damage across the USA.

# summary of property and crop damage with different types of events
library(dplyr)
eventdmg <- dat %>%
    group_by(EVTYPE) %>%
    summarize(property = sum(PROPDMG), crop = sum(CROPDMG))

topdmg <- arrange(eventdmg, desc(property))
head(topdmg, 10)
## Source: local data frame [10 x 3]
## 
##                EVTYPE  property      crop
##                (fctr)     (dbl)     (dbl)
## 1             TORNADO 3212258.2 100018.52
## 2         FLASH FLOOD 1420124.6 179200.46
## 3           TSTM WIND 1335965.6 109202.60
## 4               FLOOD  899938.5 168037.88
## 5   THUNDERSTORM WIND  876844.2  66791.45
## 6                HAIL  688693.4 579596.28
## 7           LIGHTNING  603351.8   3580.61
## 8  THUNDERSTORM WINDS  446293.2  18684.93
## 9           HIGH WIND  324731.6  17283.21
## 10       WINTER STORM  132720.6   1978.99
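
The totals above sum the raw PROPDMG and CROPDMG values, while the data set also carries the PROPDMGEXP and CROPDMGEXP columns shown in the str() output. The following is a minimal sketch of how those codes could be applied before summing. It assumes that H, K, M and B stand for hundred, thousand, million and billion, and it treats every other code (empty, digits, symbols) as a multiplier of 1, which is a simplification. exp_multiplier and PROPUSD are hypothetical names, and the data are toy values.

```r
# Toy data; the values are invented for illustration
toy <- data.frame(
    EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HAIL"),
    PROPDMG    = c(25, 2.5, 1.2, 500),
    PROPDMGEXP = c("K", "M", "B", ""),
    stringsAsFactors = FALSE
)

# Hypothetical helper; assumes H/K/M/B = hundred/thousand/million/billion
# and falls back to a multiplier of 1 for any other code
exp_multiplier <- function(code) {
    lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
    mult <- unname(lookup[toupper(as.character(code))])
    mult[is.na(mult)] <- 1
    mult
}

# Scale each damage value to dollars, then total per event type
toy$PROPUSD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
by_event_usd <- aggregate(PROPUSD ~ EVTYPE, data = toy, FUN = sum)
by_event_usd[order(-by_event_usd$PROPUSD), ]
```

With the codes applied, a single billion-dollar flood outweighs many thousand-dollar events, so the ranking of event types can differ from a ranking based on the raw values.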

The next chunk of code sums the damage by state and event type and reports the totals for each state in two bar graphs. The top graph shows total property damage for each state over the whole recorded period. The bottom graph shows total crop damage for each state over the same period.

# summary of property and crop damage across the states caused by different events
statesdmg <- dat %>% 
    group_by(STATE, EVTYPE) %>% 
    summarize(property = sum(PROPDMG),
              crop = sum(CROPDMG))

totalsdmg <- statesdmg %>%
    group_by(STATE) %>% 
    summarize(property = sum(property),
              crop = sum(crop))
head(totalsdmg)
## Source: local data frame [6 x 3]
## 
##    STATE  property     crop
##   (fctr)     (dbl)    (dbl)
## 1     AK  33995.51   205.00
## 2     AL 363606.66  9666.94
## 3     AM   5653.80    50.00
## 4     AN    294.00     0.00
## 5     AR 361121.58 25819.13
## 6     AS   2954.50  1564.00
totalsdmg$total <- totalsdmg$property + totalsdmg$crop

par(mfrow=c(2,1), mar = c(2,2,5,2), cex.axis = 0.5)
# main must be an argument of barplot(), not of with(), for the title to be drawn
with(totalsdmg, 
     barplot(property,
             names.arg = STATE,
             col = "red",
             main = "Amount of property damage per state"))

with(totalsdmg, 
     barplot(crop,
             names.arg = STATE,
             col = "blue",
             main = "Amount of crop damage per state"))

Lastly, a chunk of code sums the property and crop damage by county and event type, stores the result in the object countiesdmg, then reports the first few rows of the object sorted by property damage.

# summary of property and crop damage across the counties caused by different events
countiesdmg <- dat %>% 
    group_by(COUNTYNAME, EVTYPE) %>% 
    summarize(property = sum(PROPDMG),
              crop = sum(CROPDMG))
head(arrange(countiesdmg, desc(property)))
## Source: local data frame [6 x 4]
## Groups: COUNTYNAME [1]
## 
##   COUNTYNAME           EVTYPE property  crop
##       (fctr)           (fctr)    (dbl) (dbl)
## 1               Coastal Flood      270     0
## 2                 FLASH FLOOD      100     0
## 3                  HEAVY RAIN       45     0
## 4                     TORNADO       15     0
## 5            Coastal Flooding        5     0
## 6                Funnel Cloud        0     0