An Analysis Report of Health and Economic Impact by Severe Weather Events - Based on NOAA Storm Database

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries and property damage, and preventing such outcomes to the extent possible is a key concern. The U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries and property damage. This report presents an exploratory analysis of the health and economic impact of severe weather events, based on the NOAA storm database, and answers the following two questions:

  1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

  2. Across the United States, which types of events have the greatest economic consequences?

Data Processing

Loading the data

# download file from URL (skipped if the archive is already present)
data_file <- "C:/Users/DELL 1/Documents/Module5PeerProject2/repdata-data-StormData.csv.bz2"
if (!file.exists(data_file)) {
    download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", 
        data_file, mode = "wb")
}
# unzip file (optional: read.csv can read the .bz2 archive directly)
csv_file <- sub("\\.bz2$", "", data_file)
if (!file.exists(csv_file)) {
    library(R.utils)
    bunzip2(data_file, remove = FALSE)
}
# load data into R
storm <- read.csv(data_file, header = TRUE)
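
The load step above passes the .bz2 path straight to read.csv, which works because R detects bzip2 compression when it opens a file for reading. A minimal sketch of that behaviour on a throwaway file (the `toy` data frame and the temporary path are invented for illustration and are not part of the storm data):

```r
# Hypothetical two-row data set, written to a bzip2-compressed CSV
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(1, 0))
tmp <- tempfile(fileext = ".csv.bz2")

con <- bzfile(tmp, open = "w")      # compressed write connection
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv decompresses on the fly; no explicit bunzip2 step is needed
back <- read.csv(tmp, header = TRUE)
back
```

Under this assumption the explicit unzip step only matters if the uncompressed CSV is wanted on disk for other tools.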

Extracting the first few lines of the data

head(storm)
##   STATE__           BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1       1  4/18/1950 0:00:00     0130       CST     97     MOBILE    AL
## 2       1  4/18/1950 0:00:00     0145       CST      3    BALDWIN    AL
## 3       1  2/20/1951 0:00:00     1600       CST     57    FAYETTE    AL
## 4       1   6/8/1951 0:00:00     0900       CST     89    MADISON    AL
## 5       1 11/15/1951 0:00:00     1500       CST     43    CULLMAN    AL
## 6       1 11/15/1951 0:00:00     2000       CST     77 LAUDERDALE    AL
##    EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO         0                                               0
## 2 TORNADO         0                                               0
## 3 TORNADO         0                                               0
## 4 TORNADO         0                                               0
## 5 TORNADO         0                                               0
## 6 TORNADO         0                                               0
##   COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1         NA         0                      14.0   100 3   0          0
## 2         NA         0                       2.0   150 2   0          0
## 3         NA         0                       0.1   123 2   0          0
## 4         NA         0                       0.0   100 2   0          0
## 5         NA         0                       0.0   150 2   0          0
## 6         NA         0                       1.5   177 2   0          0
##   INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1       15    25.0          K       0                                    
## 2        0     2.5          K       0                                    
## 3        2    25.0          K       0                                    
## 4        2     2.5          K       0                                    
## 5        2     2.5          K       0                                    
## 6        6     2.5          K       0                                    
##   LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1     3040      8812       3051       8806              1
## 2     3042      8755          0          0              2
## 3     3340      8742          0          0              3
## 4     3458      8626          0          0              4
## 5     3412      8642          0          0              5
## 6     3450      8748          0          0              6

Selecting data for processing

Seven variables are relevant to answering the two questions above:

  1. EVTYPE as a measure of event type (e.g. tornado, flood, etc.)
  2. FATALITIES as a measure of harm to human health
  3. INJURIES as a measure of harm to human health
  4. PROPDMG as a measure of property damage and hence economic damage in USD
  5. PROPDMGEXP as a measure of magnitude of property damage (e.g. thousands, millions USD, etc.)
  6. CROPDMG as a measure of crop damage and hence economic damage in USD
  7. CROPDMGEXP as a measure of magnitude of crop damage (e.g. thousands, millions USD, etc.)

Selecting specified columns

library(dplyr)
storm <- select(storm, EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
str(storm)
## 'data.frame':    902297 obs. of  7 variables:
##  $ EVTYPE    : Factor w/ 985 levels "   HIGH SURF ADVISORY",..: 834 834 834 834 834 834 834 834 834 834 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: Factor w/ 19 levels "","-","?","+",..: 17 17 17 17 17 17 17 17 17 17 ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: Factor w/ 9 levels "","?","0","2",..: 1 1 1 1 1 1 1 1 1 1 ...

Economic impact is measured by damage to property and crops. PROPDMGEXP and CROPDMGEXP hold the magnitude codes needed to scale the PROPDMG and CROPDMG figures.

#Extracting unique elements of PROPDMGEXP
unique(storm$PROPDMGEXP)
##  [1] K M   B m + 0 5 6 ? 4 2 3 h 7 H - 1 8
## Levels:  - ? + 0 1 2 3 4 5 6 7 8 B h H K m M
#Extracting unique elements of CROPDMGEXP
unique(storm$CROPDMGEXP)
## [1]   M K m B ? 0 k 2
## Levels:  ? 0 2 B k K m M
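
Before recoding, it can help to count how often each symbol occurs, to judge how much the ambiguous codes (such as "?" or a blank) could affect the totals. A small sketch on invented values (the `exp_codes` vector is hypothetical, not taken from the database):

```r
# Hypothetical sample of exponent codes, including a blank and a "?"
exp_codes <- c("K", "K", "M", "", "?", "k", "B", "K", "")

# Frequency of each code, most common first
counts <- sort(table(exp_codes), decreasing = TRUE)
counts
```

On the real data the same call would be applied to storm$PROPDMGEXP and storm$CROPDMGEXP.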

These exponent columns need to be converted to numeric powers of ten for further processing.

# convert property exponent codes to numeric powers of ten
storm$PROPDMGEXP <- as.character(storm$PROPDMGEXP)
storm$PROPDMGEXP <- gsub("\\-|\\+|\\?", "0", storm$PROPDMGEXP)  # unknown symbols: 10^0
storm$PROPDMGEXP <- gsub("B|b", "9", storm$PROPDMGEXP)          # billions
storm$PROPDMGEXP <- gsub("M|m", "6", storm$PROPDMGEXP)          # millions
storm$PROPDMGEXP <- gsub("K|k", "3", storm$PROPDMGEXP)          # thousands
storm$PROPDMGEXP <- gsub("H|h", "2", storm$PROPDMGEXP)          # hundreds
storm$PROPDMGEXP <- as.numeric(storm$PROPDMGEXP)
storm$PROPDMGEXP[is.na(storm$PROPDMGEXP)] <- 0                  # blank exponent: 10^0
# actual property damage in USD
storm$ActPropDam <- storm$PROPDMG * 10^storm$PROPDMGEXP
# total property damage per event type, top 10
propDam <- aggregate(ActPropDam ~ EVTYPE, data = storm, sum)
propDam_reorder <- propDam[order(-propDam$ActPropDam), ]
PropDamages <- propDam_reorder[1:10, ]
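
The recoding rules can be checked on a few hand-made rows before trusting them on 902,297 observations. The sketch below wraps the same gsub() steps in a helper; the function name `exp_to_power` and the `toy` data frame are hypothetical and only serve the illustration:

```r
# Hypothetical helper: map exponent codes to powers of ten,
# following the same rules as the gsub() steps in this report
exp_to_power <- function(x) {
    x <- as.character(x)
    x <- gsub("\\-|\\+|\\?", "0", x)   # unknown symbols count as 10^0
    x <- gsub("B|b", "9", x)           # billions
    x <- gsub("M|m", "6", x)           # millions
    x <- gsub("K|k", "3", x)           # thousands
    x <- gsub("H|h", "2", x)           # hundreds
    x <- suppressWarnings(as.numeric(x))
    x[is.na(x)] <- 0                   # blank exponent counts as 10^0
    x
}

# Invented rows: 25 K, 2.5 M, 1 B and 10 with a blank exponent
toy <- data.frame(DMG = c(25, 2.5, 1, 10),
                  EXP = c("K", "M", "B", ""),
                  stringsAsFactors = FALSE)
toy$power   <- exp_to_power(toy$EXP)
toy$dollars <- toy$DMG * 10^toy$power
toy
```

Here 25 with exponent K becomes 25,000 USD and 2.5 with exponent M becomes 2,500,000 USD. Note that a digit code such as "5" is also read as a power of ten under these rules, which is a modelling choice rather than something documented in this report.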

# convert crop exponent codes to numeric powers of ten (same rules as for property)
storm$CROPDMGEXP <- as.character(storm$CROPDMGEXP)
storm$CROPDMGEXP <- gsub("\\-|\\+|\\?", "0", storm$CROPDMGEXP)  # unknown symbols: 10^0
storm$CROPDMGEXP <- gsub("B|b", "9", storm$CROPDMGEXP)          # billions
storm$CROPDMGEXP <- gsub("M|m", "6", storm$CROPDMGEXP)          # millions
storm$CROPDMGEXP <- gsub("K|k", "3", storm$CROPDMGEXP)          # thousands
storm$CROPDMGEXP <- gsub("H|h", "2", storm$CROPDMGEXP)          # hundreds
storm$CROPDMGEXP <- as.numeric(storm$CROPDMGEXP)
storm$CROPDMGEXP[is.na(storm$CROPDMGEXP)] <- 0                  # blank exponent: 10^0
# actual crop damage in USD
storm$ActCropDam <- storm$CROPDMG * 10^storm$CROPDMGEXP
# total crop damage per event type, top 10
cropDam <- aggregate(ActCropDam ~ EVTYPE, data = storm, sum)
cropDam_reorder <- cropDam[order(-cropDam$ActCropDam), ]
CropDamages <- cropDam_reorder[1:10, ]

TotalDam <- aggregate(ActPropDam + ActCropDam~EVTYPE, data=storm, sum)
names(TotalDam)[2] <- "total"
TotalDamages <- arrange(TotalDam, desc(total)) %>% top_n(10)
## Selecting by total
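
The total-damage step relies on aggregate() accepting an expression on the left-hand side of the formula, so property and crop damage are added row by row and then summed per event type. A minimal sketch on invented numbers (the `toy` data frame is hypothetical) shows the pattern in base R only:

```r
# Invented damage figures for three event types
toy <- data.frame(EVTYPE     = c("FLOOD", "FLOOD", "HAIL", "DROUGHT"),
                  ActPropDam = c(100, 50, 30, 0),
                  ActCropDam = c(10, 0, 5, 80))

# Sum of (property + crop) damage per event type
total <- aggregate(ActPropDam + ActCropDam ~ EVTYPE, data = toy, sum)
names(total)[2] <- "total"

# Largest totals first
total <- total[order(-total$total), ]
total
```

In this toy case FLOOD totals 160, DROUGHT 80 and HAIL 35, which matches adding the two columns by hand.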

Results

Which weather events caused the most fatalities and injuries?

#Weather events causing the most fatalities
byFATALITIES <- group_by(storm, EVTYPE)
MostFatal<- summarise(byFATALITIES,
    total = sum(FATALITIES)
) %>% arrange(desc(total)) %>% top_n(10)
## Selecting by total
MostFatal
## Source: local data frame [10 x 2]
## 
##            EVTYPE total
## 1         TORNADO  5633
## 2  EXCESSIVE HEAT  1903
## 3     FLASH FLOOD   978
## 4            HEAT   937
## 5       LIGHTNING   816
## 6       TSTM WIND   504
## 7           FLOOD   470
## 8     RIP CURRENT   368
## 9       HIGH WIND   248
## 10      AVALANCHE   224
#Weather events causing the most injuries
byINJURIES <- group_by(storm, EVTYPE)
MostInjuries <- summarise(byINJURIES ,
    total = sum(INJURIES)
) %>% arrange(desc(total)) %>% top_n(10)
## Selecting by total
MostInjuries
## Source: local data frame [10 x 2]
## 
##               EVTYPE total
## 1            TORNADO 91346
## 2          TSTM WIND  6957
## 3              FLOOD  6789
## 4     EXCESSIVE HEAT  6525
## 5          LIGHTNING  5230
## 6               HEAT  2100
## 7          ICE STORM  1975
## 8        FLASH FLOOD  1777
## 9  THUNDERSTORM WIND  1488
## 10              HAIL  1361
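
As a cross-check on the two dplyr summaries, both totals can be computed in a single base R call with cbind() on the left-hand side of the formula. This is a sketch on invented rows, not the method used for the tables above; the `toy` data frame is hypothetical:

```r
# Invented casualty counts for three event types
toy <- data.frame(EVTYPE     = c("TORNADO", "TORNADO", "HEAT", "FLOOD", "HEAT"),
                  FATALITIES = c(3, 1, 2, 0, 4),
                  INJURIES   = c(10, 5, 0, 2, 1))

# Fatalities and injuries summed per event type in one pass
harm <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank by fatalities, highest first, and keep at most ten rows
harm <- harm[order(-harm$FATALITIES), ]
head(harm, 10)
```

Having both columns in one table makes it easy to see event types that rank differently for deaths and for injuries.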

Visualizing the results in graph format

par(mfrow = c(1, 2), mar = c(12, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.8)
barplot(MostFatal$total, 
        las = 3,
        names.arg = MostFatal$EVTYPE,
        col = "orange",
        ylab = "Total No. of Deaths",
        main = "Top 10 Weather Events Causing Fatalities")
barplot(MostInjuries$total, 
        las = 3,
        names.arg = MostInjuries$EVTYPE,
        col = "orange",
        ylab = "Total No. of Injuries",
        main = "Top 10 Weather Events Causing Injuries")

The results show that tornadoes are the most harmful weather event with respect to population health, causing the most fatalities and the most injuries across the United States.

Which types of events have the greatest economic consequences?

Top 10 event types by property damage:

PropDamages
##                EVTYPE   ActPropDam
## 170             FLOOD 144657709807
## 411 HURRICANE/TYPHOON  69305840000
## 834           TORNADO  56947380677
## 670       STORM SURGE  43323536000
## 153       FLASH FLOOD  16822673979
## 244              HAIL  15735267513
## 402         HURRICANE  11868319010
## 848    TROPICAL STORM   7703890550
## 972      WINTER STORM   6688497251
## 359         HIGH WIND   5270046295

Top 10 event types by crop damage:

CropDamages
##                EVTYPE  ActCropDam
## 95            DROUGHT 13972566000
## 170             FLOOD  5661968450
## 590       RIVER FLOOD  5029459000
## 427         ICE STORM  5022113500
## 244              HAIL  3025954473
## 402         HURRICANE  2741910000
## 411 HURRICANE/TYPHOON  2607872800
## 153       FLASH FLOOD  1421317100
## 140      EXTREME COLD  1292973000
## 212      FROST/FREEZE  1094086000

Top 10 event types by total damage:

TotalDamages
##               EVTYPE        total
## 1              FLOOD 150319678257
## 2  HURRICANE/TYPHOON  71913712800
## 3            TORNADO  57362333947
## 4        STORM SURGE  43323541000
## 5               HAIL  18761221986
## 6        FLASH FLOOD  18243991079
## 7            DROUGHT  15018672000
## 8          HURRICANE  14610229010
## 9        RIVER FLOOD  10148404500
## 10         ICE STORM   8967041360

Visualizing the results in graph format

par(mfrow=c(1,3))
barplot(PropDamages$ActPropDam, 
        names.arg = PropDamages$EVTYPE,
        las = 3,
        col = "blue",
        ylab = "Total Property Damage ($)",
        main = "Top 10 Events Causing \n Most Property Damages")
barplot(CropDamages$ActCropDam, 
        names.arg = CropDamages$EVTYPE,
        las = 3,
        col = "blue",
        ylab = "Total Crop Damage ($)",
        main = "Top 10 Events Causing \n Most Crop Damages")
barplot(TotalDamages$total, 
        names.arg = TotalDamages$EVTYPE,
        las = 3,
        col = "red",
        ylab = "Total Damages ($)",
        main = "Top 10 Events Causing \n Most Total Damages")
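
Because the totals run into the hundreds of billions, the y-axis is easier to read when the values are rescaled. A sketch using the top three totals from the table above, plotted in billions of USD; the output file is a temporary PDF chosen only for this illustration, and the variable names `dam` and `mids` are hypothetical:

```r
# Top three total-damage figures, copied from the TotalDamages table
dam <- c("FLOOD" = 150319678257,
         "HURRICANE/TYPHOON" = 71913712800,
         "TORNADO" = 57362333947)

tmp <- tempfile(fileext = ".pdf")
pdf(tmp)                               # write the plot to a temporary file
par(mar = c(12, 4, 3, 2))              # extra bottom margin for rotated labels
mids <- barplot(dam / 1e9,             # rescale USD to billions of USD
                las = 3,
                col = "red",
                ylab = "Total Damages (billion $)",
                main = "Top 3 Events Causing \n Most Total Damages")
dev.off()
```

Dividing by 1e9 keeps the bar heights in proportion while avoiding scientific notation on the axis.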

The results show that floods, hurricanes/typhoons and tornadoes caused the greatest property damage. Drought and floods caused the greatest crop damage. Overall, floods contributed the most to total economic damage.