Impact of Severe Weather Events on Public Health and Economy in the United States

Synopsis

In this report, we analyze the impact of different weather events on public health and the economy, based on the storm database collected by the U.S. National Oceanic and Atmospheric Administration (NOAA) from 1950 to 2011. We use the estimates of fatalities, injuries, property damage, and crop damage to decide which types of events are most harmful to population health and the economy. From these data, we found that tornadoes and excessive heat are most harmful with respect to population health, while floods, droughts, and hurricanes/typhoons have the greatest economic consequences.

Basic settings

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)

Data Processing

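First, we download the compressed data file, unless it is already in the working directory. read.csv() can read the .bz2 archive directly, so no separate unzip step is needed.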

# Download the compressed data file only if it is not already present
if (!file.exists("repdata_data_StormData.csv.bz2")) {
  print("Downloading Data")
  download.file("http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "repdata_data_StormData.csv.bz2")
}
## [1] "Downloading Data"
# Read the data only if it is not already loaded in the environment
if (!exists("stormData")) {
  stormData <- read.csv("repdata_data_StormData.csv.bz2")
}
dim(stormData)
## [1] 902297     37

Then, we read the CSV file into stormData. If the data already exists in the working environment, we do not load it again.

# Total fatalities and injuries per event type
fatalitiesAndInjuries <- stormData %>%
  group_by(EVTYPE) %>%
  summarise(Fatalities = sum(FATALITIES), Injuries = sum(INJURIES))
## `summarise()` ungrouping output (override with `.groups` argument)
fatalitiesAndInjuries <- mutate(fatalitiesAndInjuries, total = Fatalities + Injuries) 
topTen <- fatalitiesAndInjuries[order(fatalitiesAndInjuries$total, decreasing = TRUE),][1:10,]
topTen
## # A tibble: 10 x 4
##    EVTYPE            Fatalities Injuries total
##    <chr>                  <dbl>    <dbl> <dbl>
##  1 TORNADO                 5633    91346 96979
##  2 EXCESSIVE HEAT          1903     6525  8428
##  3 TSTM WIND                504     6957  7461
##  4 FLOOD                    470     6789  7259
##  5 LIGHTNING                816     5230  6046
##  6 HEAT                     937     2100  3037
##  7 FLASH FLOOD              978     1777  2755
##  8 ICE STORM                 89     1975  2064
##  9 THUNDERSTORM WIND        133     1488  1621
## 10 WINTER STORM             206     1321  1527

There are 902297 rows and 37 columns in total. The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

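# Select the event type and damage columns by name rather than by position
propertyAndCropDmgData <- stormData[ , c("EVTYPE", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")]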
table(propertyAndCropDmgData$PROPDMGEXP)
## 
##             -      ?      +      0      1      2      3      4      5      6 
## 465934      1      8      5    216     25     13      4      4     28      4 
##      7      8      B      h      H      K      m      M 
##      5      1     40      1      6 424665      7  11330

Based on the above histogram, we see that the number of events tracked starts to increase significantly around 1995. We therefore use the subset of the data from 1990 to 2011 to make the most of the more complete records.
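The year-based subsetting described above can be sketched as follows. This is only an illustration on a toy data frame, not the code used for this report: the toy rows are made up, and the "month/day/year time" layout of BGN_DATE is an assumption about the raw file.

```r
# Toy stand-in for stormData; the BGN_DATE layout is an assumption.
toy <- data.frame(
  BGN_DATE = c("4/18/1950 0:00:00", "6/1/1989 0:00:00",
               "1/15/1990 0:00:00", "11/28/2011 0:00:00"),
  EVTYPE = c("TORNADO", "HAIL", "FLOOD", "TSTM WIND"),
  stringsAsFactors = FALSE
)

# Parse the date part (the trailing time is ignored) and extract the year.
toy$year <- as.integer(format(as.Date(toy$BGN_DATE, format = "%m/%d/%Y"), "%Y"))

# Events per year: the counts behind a histogram of events by year.
eventsPerYear <- table(toy$year)

# Keep only the events recorded from 1990 onward.
recent <- toy[toy$year >= 1990, ]
dim(recent)
```

On the full data set, the same pattern would be applied to stormData instead of toy.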

table(propertyAndCropDmgData$CROPDMGEXP)
## 
##             ?      0      2      B      k      K      m      M 
## 618413      7     19      1      9     21 281832      1   1994

Now, there are 681500 rows and 38 columns in total.

Impact on Public Health

In this section, we examine the number of fatalities and injuries caused by severe weather events. We would like to identify the ten most harmful types of weather events.
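As a minimal sketch of this ranking step, the example below sums fatalities and injuries per event type and ranks each measure separately. It uses base R on a made-up toy data frame; the helper topN is hypothetical and is not part of the analysis code in this report.

```r
# Toy stand-in for stormData with made-up counts.
toy <- data.frame(
  EVTYPE = c("TORNADO", "TORNADO", "HEAT", "FLOOD", "HEAT"),
  FATALITIES = c(3, 1, 5, 2, 1),
  INJURIES = c(40, 10, 4, 7, 0),
  stringsAsFactors = FALSE
)

# Sum both measures for each event type.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Hypothetical helper: the n rows with the largest value in `column`.
topN <- function(df, column, n = 10) {
  ranked <- df[order(df[[column]], decreasing = TRUE), ]
  head(ranked, n)
}

topFatal <- topN(totals, "FATALITIES", n = 2)
topInjury <- topN(totals, "INJURIES", n = 2)
topFatal
topInjury
```

Ranking the two measures separately makes it visible when an event type causes many injuries but comparatively few deaths, or the reverse.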

propertyAndCropDmgData <- mutate(propertyAndCropDmgData, PropDmgInDollars = PROPDMG, CropDmgInDollars = CROPDMG)

# Codes other than H, K, M, B (blank, digits, "?", "+", "-") are treated as 10^0.
# H must be kept here; otherwise it is zeroed out before the H step below can match it.
propertyAndCropDmgData$PROPDMGEXP[!grepl("H|K|M|B", propertyAndCropDmgData$PROPDMGEXP, ignore.case = TRUE)] <- 0

# Replace each letter code with its power of ten
propertyAndCropDmgData$PROPDMGEXP[grep("K", propertyAndCropDmgData$PROPDMGEXP, ignore.case = TRUE)] <- 3
propertyAndCropDmgData$PROPDMGEXP[grep("M", propertyAndCropDmgData$PROPDMGEXP, ignore.case = TRUE)] <- 6
propertyAndCropDmgData$PROPDMGEXP[grep("B", propertyAndCropDmgData$PROPDMGEXP, ignore.case = TRUE)] <- 9
propertyAndCropDmgData$PROPDMGEXP[grep("H", propertyAndCropDmgData$PROPDMGEXP, ignore.case = TRUE)] <- 2
propertyAndCropDmgData$PropDmgInDollars <- propertyAndCropDmgData$PROPDMG * 10^as.numeric(propertyAndCropDmgData$PROPDMGEXP)


propertyAndCropDmgData$CROPDMGEXP[!grepl("K|M|B", propertyAndCropDmgData$CROPDMGEXP, ignore.case = TRUE)] <- 0
propertyAndCropDmgData$CROPDMGEXP[grep("K", propertyAndCropDmgData$CROPDMGEXP, ignore.case = TRUE)] <- 3
propertyAndCropDmgData$CROPDMGEXP[grep("M", propertyAndCropDmgData$CROPDMGEXP, ignore.case = TRUE)] <- 6
propertyAndCropDmgData$CROPDMGEXP[grep("B", propertyAndCropDmgData$CROPDMGEXP, ignore.case = TRUE)] <- 9
propertyAndCropDmgData$CropDmgInDollars <- propertyAndCropDmgData$CROPDMG * 10^as.numeric(propertyAndCropDmgData$CROPDMGEXP)


dmgByEvent <- propertyAndCropDmgData %>% group_by(EVTYPE) %>% summarise(totalPropDmg = sum(PropDmgInDollars), totalCropDmg = sum(CropDmgInDollars))
## `summarise()` ungrouping output (override with `.groups` argument)
dmgByEvent <- mutate(dmgByEvent, totalDmgInDollars = totalPropDmg + totalCropDmg)

topPropDmg <- dmgByEvent[order(dmgByEvent$totalPropDmg, decreasing = TRUE),][1:10,]
topPropDmg
## # A tibble: 10 x 4
##    EVTYPE             totalPropDmg totalCropDmg totalDmgInDollars
##    <chr>                     <dbl>        <dbl>             <dbl>
##  1 FLOOD             144657709807    5661968450     150319678257 
##  2 HURRICANE/TYPHOON  69305840000    2607872800      71913712800 
##  3 TORNADO            56937160779.    414953270      57352114049.
##  4 STORM SURGE        43323536000          5000      43323541000 
##  5 FLASH FLOOD        16140812067.   1421317100      17562129167.
##  6 HAIL               15732267048.   3025954473      18758221521.
##  7 HURRICANE          11868319010    2741910000      14610229010 
##  8 TROPICAL STORM      7703890550     678346000       8382236550 
##  9 WINTER STORM        6688497251      26944000       6715441251 
## 10 HIGH WIND           5270046295     638571300       5908617595

Impact on Economy

We convert the property damage and crop damage data into comparable numerical form according to the units described in the code book (Storm Events). The PROPDMGEXP and CROPDMGEXP columns record a multiplier for each observation: Hundred (H), Thousand (K), Million (M), and Billion (B).
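The multiplier conversion can also be written as a single lookup, sketched below on made-up values. The helper expToMultiplier is hypothetical, and treating every code other than H, K, M, and B (blank, digits, "?", "+", "-") as a multiplier of 1 is an assumption rather than something stated in the code book.

```r
# Hypothetical helper: map an exponent code to its dollar multiplier.
# Unrecognised codes fall back to 1 (an assumption, see above).
expToMultiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(code)])  # case-insensitive match on the code
  mult[is.na(mult)] <- 1
  mult
}

# Toy stand-in for the damage columns, with made-up values.
toy <- data.frame(
  PROPDMG = c(25, 2.5, 1, 5, 10),
  PROPDMGEXP = c("K", "M", "B", "h", ""),
  stringsAsFactors = FALSE
)

toy$PropDmgInDollars <- toy$PROPDMG * expToMultiplier(toy$PROPDMGEXP)
toy
```

A lookup vector keeps all the codes in one place, so the result does not depend on the order in which individual replacements are applied.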

topCropDmg <- dmgByEvent[order(dmgByEvent$totalCropDmg, decreasing = TRUE),][1:10,]
topCropDmg
## # A tibble: 10 x 4
##    EVTYPE             totalPropDmg totalCropDmg totalDmgInDollars
##    <chr>                     <dbl>        <dbl>             <dbl>
##  1 DROUGHT             1046106000   13972566000      15018672000 
##  2 FLOOD             144657709807    5661968450     150319678257 
##  3 RIVER FLOOD         5118945500    5029459000      10148404500 
##  4 ICE STORM           3944927860    5022113500       8967041360 
##  5 HAIL               15732267048.   3025954473      18758221521.
##  6 HURRICANE          11868319010    2741910000      14610229010 
##  7 HURRICANE/TYPHOON  69305840000    2607872800      71913712800 
##  8 FLASH FLOOD        16140812067.   1421317100      17562129167.
##  9 EXTREME COLD          67737400    1292973000       1360710400 
## 10 FROST/FREEZE           9480000    1094086000       1103566000

Results

For the impact on public health, we obtained two lists of severe weather events, sorted by the number of people killed or injured; they are shown below.

# Horizontal bar chart of combined fatalities and injuries per event type
ggplot(topTen, aes(total, EVTYPE)) +
  geom_bar(stat = "identity", fill = "red") +
  ggtitle("Events Responsible for most Fatalities and Injuries") +
  ylab("Event") +
  xlab("Total Fatalities and Injuries")

The following pair of graphs shows the total fatalities and total injuries caused by these severe weather events.