Impact Of Natural Disasters On The Economy And The Public Health Of The United States

Storms and other severe weather events affect both public health and economic activity for communities and governments. The most devastating events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a major area of concern. This project explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database.

Synopsis

This report analyzes the NOAA storm database, which contains data on severe weather events. The database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. The records cover the period from 1950 through 2011. The purpose of this analysis is to answer the following two questions:

1. Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
2. Across the United States, which types of events have the greatest economic consequences?

The main conclusions of the study are as follows:

1. Tornadoes are the most harmful event type for population health, with 5,633 deaths and 91,346 injuries.
2. Floods have caused the greatest economic damage, approximately 150 billion USD in combined property and crop damage.

Across the United States, which types of events are most harmful with respect to population health?

Loading Packages and setting up the working directory

setwd("/home/rstudio/Reproducible Research/week2")
library(grid)
library(ggplot2)
library(plyr)
require(gridExtra)
## Loading required package: gridExtra

Data Processing

data <- read.csv("repdata_data_StormData1.csv")
str(data)
##  'data.frame':   902297 obs. of  37 variables:
##  $ STATE__   : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ BGN_DATE  : chr  "4/18/1950 0:00:00" "4/18/1950 0:00:00" "2/20/1951 0:00:00" "6/8/1951 0:00:00" ...
##  $ BGN_TIME  : chr  "0130" "0145" "1600" "0900" ...
##  $ TIME_ZONE : chr  "CST" "CST" "CST" "CST" ...
##  $ COUNTY    : num  97 3 57 89 43 77 9 123 125 57 ...
##  $ COUNTYNAME: chr  "MOBILE" "BALDWIN" "FAYETTE" "MADISON" ...
##  $ STATE     : chr  "AL" "AL" "AL" "AL" ...
##  $ EVTYPE    : chr  "TORNADO" "TORNADO" "TORNADO" "TORNADO" ...
##  $ BGN_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ BGN_AZI   : chr  "" "" "" "" ...
##  $ BGN_LOCATI: chr  "" "" "" "" ...
##  $ END_DATE  : chr  "" "" "" "" ...
##  $ END_TIME  : chr  "" "" "" "" ...
##  $ COUNTY_END: num  0 0 0 0 0 0 0 0 0 0 ...
##  $ COUNTYENDN: logi  NA NA NA NA NA NA ...
##  $ END_RANGE : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ END_AZI   : chr  "" "" "" "" ...
##  $ END_LOCATI: chr  "" "" "" "" ...
##  $ LENGTH    : num  14 2 0.1 0 0 1.5 1.5 0 3.3 2.3 ...
##  $ WIDTH     : num  100 150 123 100 150 177 33 33 100 100 ...
##  $ F         : int  3 2 2 2 2 2 2 1 3 3 ...
##  $ MAG       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ FATALITIES: num  0 0 0 0 0 0 0 0 1 0 ...
##  $ INJURIES  : num  15 0 2 2 2 6 1 0 14 0 ...
##  $ PROPDMG   : num  25 2.5 25 2.5 2.5 2.5 2.5 2.5 25 25 ...
##  $ PROPDMGEXP: chr  "K" "K" "K" "K" ...
##  $ CROPDMG   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ CROPDMGEXP: chr  "" "" "" "" ...
##  $ WFO       : chr  "" "" "" "" ...
##  $ STATEOFFIC: chr  "" "" "" "" ...
##  $ ZONENAMES : chr  "" "" "" "" ...
##  $ LATITUDE  : num  3040 3042 3340 3458 3412 ...
##  $ LONGITUDE : num  8812 8755 8742 8626 8642 ...
##  $ LATITUDE_E: num  3051 0 0 0 0 ...
##  $ LONGITUDE_: num  8806 0 0 0 0 ...
##  $ REMARKS   : chr  "" "" "" "" ...
##  $ REFNUM    : num  1 2 3 4 5 6 7 8 9 10 ...
summary(data)
##   STATE__       BGN_DATE           BGN_TIME          TIME_ZONE        
 ##  Min.   : 1.0   Length:902297      Length:902297      Length:902297     
 ##  1st Qu.:19.0   Class :character   Class :character   Class :character  
 ##  Median :30.0   Mode  :character   Mode  :character   Mode  :character  
 ##  Mean   :31.2                                                           
 ##  3rd Qu.:45.0                                                           
 ##  Max.   :95.0                                                           
 ##                                                                         
 ##    COUNTY       COUNTYNAME           STATE              EVTYPE         
 ##  Min.   :  0.0   Length:902297      Length:902297      Length:902297     
 ##  1st Qu.: 31.0   Class :character   Class :character   Class :character  
 ##  Median : 75.0   Mode  :character   Mode  :character   Mode  :character  
 ##  Mean   :100.6                                                           
 ##  3rd Qu.:131.0                                                           
 ##  Max.   :873.0                                                           
 ##                                                                          
 ##    BGN_RANGE          BGN_AZI           BGN_LOCATI          END_DATE        
 ##  Min.   :   0.000   Length:902297      Length:902297      Length:902297     
 ##  1st Qu.:   0.000   Class :character   Class :character   Class :character  
 ##  Median :   0.000   Mode  :character   Mode  :character   Mode  :character  
 ##  Mean   :   1.484                                                           
 ##  3rd Qu.:   1.000                                                           
 ##  Max.   :3749.000                                                           
 ##                                                                             
 ##    END_TIME           COUNTY_END COUNTYENDN       END_RANGE       
 ##  Length:902297      Min.   :0    Mode:logical   Min.   :  0.0000  
 ##  Class :character   1st Qu.:0    NA's:902297    1st Qu.:  0.0000  
 ##  Mode  :character   Median :0                   Median :  0.0000  
 ##                     Mean   :0                   Mean   :  0.9862  
 ##                     3rd Qu.:0                   3rd Qu.:  0.0000  
 ##                     Max.   :0                   Max.   :925.0000  
 ##
 ##    END_AZI           END_LOCATI            LENGTH              WIDTH         
 ##  Length:902297      Length:902297      Min.   :   0.0000   Min.   :   0.000  
 ##  Class :character   Class :character   1st Qu.:   0.0000   1st Qu.:   0.000  
 ##  Mode  :character   Mode  :character   Median :   0.0000   Median :   0.000  
 ##                                        Mean   :   0.2301   Mean   :   7.503  
 ##                                        3rd Qu.:   0.0000   3rd Qu.:   0.000  
 ##                                        Max.   :2315.0000   Max.   :4400.000  
 ##                                                                              
 ##        F               MAG            FATALITIES          INJURIES        
 ##  Min.   :0.0      Min.   :    0.0   Min.   :  0.0000   Min.   :   0.0000  
 ##  1st Qu.:0.0      1st Qu.:    0.0   1st Qu.:  0.0000   1st Qu.:   0.0000  
 ##  Median :1.0      Median :   50.0   Median :  0.0000   Median :   0.0000  
 ##  Mean   :0.9      Mean   :   46.9   Mean   :  0.0168   Mean   :   0.1557  
 ##  3rd Qu.:1.0      3rd Qu.:   75.0   3rd Qu.:  0.0000   3rd Qu.:   0.0000  
 ##  Max.   :5.0      Max.   :22000.0   Max.   :583.0000   Max.   :1700.0000  
 ##  NA's   :843563                                                           
 ##     PROPDMG         PROPDMGEXP           CROPDMG         CROPDMGEXP       
 ##  Min.   :   0.00   Length:902297      Min.   :  0.000   Length:902297     
 ##  1st Qu.:   0.00   Class :character   1st Qu.:  0.000   Class :character  
 ##  Median :   0.00   Mode  :character   Median :  0.000   Mode  :character  
 ##  Mean   :  12.06                      Mean   :  1.527                     
 ##  3rd Qu.:   0.50                      3rd Qu.:  0.000                     
 ##  Max.   :5000.00                      Max.   :990.000                     
 ##                                                                           
 ##      WFO             STATEOFFIC         ZONENAMES            LATITUDE   
 ##  Length:902297      Length:902297      Length:902297      Min.   :   0  
 ##  Class :character   Class :character   Class :character   1st Qu.:2802  
 ##  Mode  :character   Mode  :character   Mode  :character   Median :3540  
 ##                                                           Mean   :2875  
 ##                                                           3rd Qu.:4019  
 ##                                                           Max.   :9706  
 ##                                                           NA's   :47    
 ##    LONGITUDE        LATITUDE_E     LONGITUDE_       REMARKS         
 ##  Min.   :-14451   Min.   :   0   Min.   :-14455   Length:902297     
 ##  1st Qu.:  7247   1st Qu.:   0   1st Qu.:     0   Class :character  
 ##  Median :  8707   Median :   0   Median :     0   Mode  :character  
 ##  Mean   :  6940   Mean   :1452   Mean   :  3509                     
 ##  3rd Qu.:  9605   3rd Qu.:3549   3rd Qu.:  8735                     
 ##  Max.   : 17124   Max.   :9706   Max.   :106220                     
 ##                   NA's   :40                                        
 ##      REFNUM      
 ##  Min.   :     1  
 ##  1st Qu.:225575  
 ##  Median :451149  
 ##  Mean   :451149  
 ##  3rd Qu.:676723  
 ##  Max.   :902297  

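Since we do not need all the columns, we select only the relevant ones: the event type (column 8) and the fatality, injury, property damage, and crop damage columns (columns 23 through 28).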

reduceddata <-data[ , c(8, 23:28)]

head(reduceddata)
##   EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO          0       15    25.0          K       0           
## 2 TORNADO          0        0     2.5          K       0           
## 3 TORNADO          0        2    25.0          K       0           
## 4 TORNADO          0        2     2.5          K       0           
## 5 TORNADO          0        2     2.5          K       0           
## 6 TORNADO          0        6     2.5          K       0  
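
Selecting by position works here, but it silently picks the wrong columns if the file layout ever changes. As a minimal sketch, assuming the column names shown in the str() output above, the same subset can be taken by name. The toy data frame and the names keep and toy_reduced are illustrative and not part of the original analysis.

```r
# Toy stand-in for the storm data; only the column names mirror the real file.
toy <- data.frame(
  STATE      = c("AL", "AL"),
  EVTYPE     = c("TORNADO", "FLOOD"),
  FATALITIES = c(0, 1),
  INJURIES   = c(15, 0),
  PROPDMG    = c(25, 2.5),
  PROPDMGEXP = c("K", "M"),
  CROPDMG    = c(0, 1),
  CROPDMGEXP = c("", "K"),
  stringsAsFactors = FALSE
)

# Columns needed for the health and economic questions.
keep <- c("EVTYPE", "FATALITIES", "INJURIES",
          "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")

# Fail early if an expected column is missing from the input.
stopifnot(all(keep %in% names(toy)))

toy_reduced <- toy[, keep]
names(toy_reduced)
```

On the real data, replacing toy with data should give the same seven columns as data[ , c(8, 23:28)].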

Next, we aggregate fatalities and injuries by event type to assess the harm that different events caused with respect to population health. We keep the top 15 events, sorted in descending order of combined fatalities and injuries.

harmfulevent<-aggregate(cbind(FATALITIES,INJURIES) ~ EVTYPE, data = reduceddata, sum, na.rm=TRUE)
harmfulevent<-arrange(harmfulevent, desc(FATALITIES+INJURIES))
harmfulevent<-harmfulevent[1:15,]
harmfulevent
##               EVTYPE FATALITIES INJURIES
## 1            TORNADO       5633    91346
## 2     EXCESSIVE HEAT       1903     6525
## 3          TSTM WIND        504     6957
## 4              FLOOD        470     6789
## 5          LIGHTNING        816     5230
## 6               HEAT        937     2100
## 7        FLASH FLOOD        978     1777
## 8          ICE STORM         89     1975
## 9  THUNDERSTORM WIND        133     1488
## 10      WINTER STORM        206     1321
## 11         HIGH WIND        248     1137
## 12              HAIL         15     1361
## 13 HURRICANE/TYPHOON         64     1275
## 14        HEAVY SNOW        127     1021
## 15          WILDFIRE         75      911
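
To make the aggregation step easier to follow, here is a small sketch on made-up numbers. It uses only base R (order() instead of plyr's arrange()), and the toy data frame and the name totals are illustrative assumptions rather than part of the original analysis.

```r
# Made-up records: two tornado rows, one flood row, one heat row.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(1, 2, 3, 5),
  INJURIES   = c(10, 0, 20, 1)
)

# Sum both outcome columns within each event type.
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Rank event types by combined casualties, largest first.
totals <- totals[order(-(totals$FATALITIES + totals$INJURIES)), ]
totals
```

Here the two tornado rows collapse into a single row with 4 fatalities and 30 injuries, which ranks first.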

We plot these totals to compare fatalities and injuries across event types.

names_events <- harmfulevent$EVTYPE
barplot(t(harmfulevent[, -1]), names.arg = names_events, ylim = c(0, 95000),
        beside = TRUE, cex.names = 0.8, las = 2, col = c("yellow", "orange"),
        main = "Top Disaster Casualties")
legend("topright",c("Fatalities","Injuries"),fill=c("yellow","orange"),bty = "n")

We can see from the table and bar plot that tornadoes cause the most harm to population health, in terms of both injuries and fatalities.

Across the United States, which types of events have the greatest economic consequences?

Data Processing

table(reduceddata$PROPDMGEXP)
##             -      ?      +      0      1      2      3      4      5      6 
## 465934      1      8      5    216     25     13      4      4     28      4 
##      7      8      B      h      H      K      m      M 
##      5      1     40      1      6 424665      7  11330 
table(reduceddata$CROPDMGEXP)
## 
##             ?      0      2      B      k      K      m      M 
## 618413      7     19      1      9     21 281832      1   1994

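We need to convert the property and crop damage figures into dollar amounts using the exponent codes H = 10^2, K = 10^3, M = 10^6, and B = 10^9 (lower-case h, k, and m are treated like their upper-case forms). For this, we create two new variables: propvalue and cropvalue. Every other exponent code (blank, digits, or symbols such as “?”, “+”, and “-”) becomes NA when converted to a factor; we assign those rows the placeholder level “O” and use a multiplier of 1 for them.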

reduceddata$propFactor<-factor(reduceddata$PROPDMGEXP,levels=c("H","K","M","B","h","m","O"))
reduceddata$propFactor[is.na(reduceddata$propFactor)] <- "O"
table(reduceddata$propFactor)

## 
##      H      K      M      B      h      m      O 
##      6 424665  11330     40      1      7 466248
reduceddata$cropFactor<-factor(reduceddata$CROPDMGEXP,levels=c("K","M","B","k","m","O"))
reduceddata$cropFactor[is.na(reduceddata$cropFactor)] <- "O"
table(reduceddata$cropFactor)
## 
##      K      M      B      k      m      O 
## 281832   1994      9     21      1 618440
reduceddata<- mutate(reduceddata,propvalue= 0, cropvalue=0)
reduceddata$propvalue[reduceddata$propFactor=="K"]<-reduceddata$PROPDMG[reduceddata$propFactor=="K"]*1000
reduceddata$propvalue[reduceddata$propFactor=="H"|reduceddata$propFactor=="h"]<-reduceddata$PROPDMG[reduceddata$propFactor=="H"|reduceddata$propFactor=="h"]*100
reduceddata$propvalue[reduceddata$propFactor=="M"|reduceddata$propFactor=="m"]<-reduceddata$PROPDMG[reduceddata$propFactor=="M"|reduceddata$propFactor=="m"]*1e6
reduceddata$propvalue[reduceddata$propFactor=="B"]<-reduceddata$PROPDMG[reduceddata$propFactor=="B"]*1e9
reduceddata$propvalue[reduceddata$propFactor=="O"]<- reduceddata$PROPDMG[reduceddata$propFactor=="O"]*1
reduceddata$cropvalue[reduceddata$cropFactor=="K"|reduceddata$cropFactor=="k"]<-reduceddata$CROPDMG[reduceddata$cropFactor=="K"|reduceddata$cropFactor=="k"]*1000
reduceddata$cropvalue[reduceddata$cropFactor=="M"|reduceddata$cropFactor=="m"]<-reduceddata$CROPDMG[reduceddata$cropFactor=="M"|reduceddata$cropFactor=="m"]*1e6
reduceddata$cropvalue[reduceddata$cropFactor=="B"]<-reduceddata$CROPDMG[reduceddata$cropFactor=="B"]*1e9
reduceddata$cropvalue[reduceddata$cropFactor=="O"]<-reduceddata$CROPDMG[reduceddata$cropFactor=="O"]*1
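
The same conversion can be sketched more compactly with a named lookup vector. This is an illustrative alternative under the rules stated above, not the code used to produce the results in this report; multiplier and damage_value are hypothetical names introduced here.

```r
# Multipliers for the recognised exponent codes. Codes are upper-cased
# before the lookup so that "k" and "K" are treated alike.
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale a damage amount by its exponent code.
damage_value <- function(amount, exp_code) {
  m <- unname(multiplier[toupper(exp_code)])
  m[is.na(m)] <- 1  # unrecognised codes ("", "0", "?", ...) count as x1
  amount * m
}

damage_value(c(25, 2.5, 3, 7), c("K", "m", "B", ""))
```

For the exponent codes tabulated above, damage_value(reduceddata$PROPDMG, reduceddata$PROPDMGEXP) should match propvalue, though that has not been checked against the full dataset here.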

Next, we aggregate crop and property damage by event type to assess the economic harm that different events caused. We keep the top 20 events, sorted in descending order of total damage, and express the totals in billions of USD.

economic_dmg<-aggregate(propvalue + cropvalue~ EVTYPE, data = reduceddata, sum, na.rm=TRUE)
names(economic_dmg) = c("EVENT", "TOTAL_DAMAGE")
economic_dmg<-arrange(economic_dmg, desc(TOTAL_DAMAGE))
economic_dmg<-economic_dmg[1:20,]
economic_dmg$TOTAL_DAMAGE <- economic_dmg$TOTAL_DAMAGE/10^9
economic_dmg$EVENT <- factor(economic_dmg$EVENT, levels = economic_dmg$EVENT)
head(economic_dmg)
##               EVENT TOTAL_DAMAGE
## 1             FLOOD    150.31968
## 2 HURRICANE/TYPHOON     71.91371
## 3           TORNADO     57.35211
## 4       STORM SURGE     43.32354
## 5              HAIL     18.75822
## 6       FLASH FLOOD     17.56213

We plot the result to compare total damage across event types.

with(economic_dmg, barplot(TOTAL_DAMAGE, names.arg = EVENT, beside = TRUE,
                           cex.names = 0.8, las = 2, col = "light blue",
                           main = "Total Property & Crop Damage (Top 20)",
                           ylab = "Total Damage in USD (10^9)"))

We can observe from the table and bar plot that floods cause the greatest combined property and crop damage, approximately 150 billion USD.