Data Analysis for the Impact of Severe Weather Events


Synopsis

This report analyzes severe weather events using data provided by NOAA. It identifies the weather events that are most harmful to population health and those that have the greatest economic consequences. The main goal is to help prioritize resources across types of weather events based on these two criteria.


1. Data Processing

Loading necessary libraries

library(dplyr)
library(knitr)
library(readr)
library(ggplot2)

Reading Data File

a<-read.csv("StormData1.csv.bz2")
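The call above relies on read.csv() opening a bzip2-compressed file directly. As a minimal sketch of that behaviour, the round trip below writes a small compressed CSV to a temporary file and reads it back. The data frame `toy` and its values are invented for illustration, and `stringsAsFactors = FALSE` is a choice made for this sketch, not something the report's code sets.

```r
# Toy stand-in for the storm data; the column names mirror ones used in this report.
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD"),
  FATALITIES = c(2, 0),
  INJURIES = c(10, 1),
  stringsAsFactors = FALSE
)

# Write a bzip2-compressed CSV to a temporary file.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression when it opens the file,
# so no manual decompression step is needed.
toy_back <- read.csv(tmp, stringsAsFactors = FALSE)
str(toy_back)
```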

Adding a casualty count for each weather event (the sum of fatalities and injuries)

a<-mutate(a, cslty = FATALITIES + INJURIES)

Data processing and cleaning

##clean and pre-process the data

#1. Fill the exponent columns and damage columns with values appropriate for calculations (i.e. zeros in the exponent columns and NAs have to be replaced with values that are valid in mathematical operations).

a$PROPDMGEXP[is.na(a$PROPDMGEXP)]<-1
a$PROPDMGEXP[a$PROPDMGEXP==0]<-1
a$PROPDMG[is.na(a$PROPDMG)]<-0
a$CROPDMGEXP[is.na(a$CROPDMGEXP)]<-1
a$CROPDMGEXP[a$CROPDMGEXP==0]<-1
a$CROPDMG[is.na(a$CROPDMG)]<-0

#2. Replace the multiplier prefix letters with their corresponding numbers, so they can be used in mathematical operations (the exponent letters themselves cannot be used in arithmetic).
a$CROPDMGEXP<-as.character(a$CROPDMGEXP)
a$CROPDMGEXP[a$CROPDMGEXP=="K"]<-"1000"
a$CROPDMGEXP[a$CROPDMGEXP=="k"]<-"1000"
a$CROPDMGEXP[a$CROPDMGEXP=="M"]<-"1000000"
a$CROPDMGEXP[a$CROPDMGEXP=="m"]<-"1000000"
a$CROPDMGEXP[a$CROPDMGEXP=="B"]<-"1000000000"
a$CROPDMGEXP[a$CROPDMGEXP=="H"]<-"100"
a$CROPDMGEXP[a$CROPDMGEXP=="h"]<-"100"
a$CROPDMGEXP[a$CROPDMGEXP=="0"]<-"1"
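#Any code not mapped above (for example a blank or a symbol) becomes NA in this conversion, and NA then propagates into the damage totals
a$CROPDMGEXP<-as.numeric(a$CROPDMGEXP)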




#3. Replace the multiplier prefix letters in the property damage exponent column with their corresponding numbers, so they can be used in mathematical operations.
a$PROPDMGEXP<-as.character(a$PROPDMGEXP)
a$PROPDMGEXP[a$PROPDMGEXP=="K"]<-"1000"
a$PROPDMGEXP[a$PROPDMGEXP=="k"]<-"1000"
a$PROPDMGEXP[a$PROPDMGEXP=="M"]<-"1000000"
a$PROPDMGEXP[a$PROPDMGEXP=="m"]<-"1000000"
a$PROPDMGEXP[a$PROPDMGEXP=="B"]<-"1000000000"
a$PROPDMGEXP[a$PROPDMGEXP=="H"]<-"100"
a$PROPDMGEXP[a$PROPDMGEXP=="h"]<-"100"
a$PROPDMGEXP<-as.numeric(a$PROPDMGEXP)
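The letter-by-letter replacements above can also be written as a single lookup. The sketch below is one possible way to do that, not the method used in this report: `exp_to_multiplier` is a hypothetical helper, and treating every code outside H/K/M/B as a multiplier of 1 is an assumption that should be checked against the data documentation.

```r
# Hypothetical helper: map exponent codes to numeric multipliers.
exp_to_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  # toupper() folds "h", "k", "m" into their upper-case forms before the lookup.
  mult <- unname(lookup[toupper(as.character(code))])
  # Assumption: codes that are not in the lookup (blank, "0", symbols) mean "no scaling".
  mult[is.na(mult)] <- 1
  mult
}

multipliers <- exp_to_multiplier(c("K", "m", "B", "", "0", "h"))
multipliers
```

A lookup of this kind keeps the mapping in one place, so the same function can be applied to both PROPDMGEXP and CROPDMGEXP.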

Calculating Total Damage (i.e. property + crop)

#Calculate total property damage
a<-mutate(a, propT=PROPDMGEXP*PROPDMG)

#Calculate total crop damage (CROPDMGEXP was already converted to numeric above)
a<-mutate(a, cropT=CROPDMGEXP*CROPDMG)

#Calculate Total Damage
a<-mutate(a, totalDamage= cropT + propT)
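To make the arithmetic concrete, the sketch below repeats the damage calculation on a three-row toy data frame. All values are invented for illustration, and the NA multiplier stands in for an exponent code that was never mapped to a number. It also shows how a single NA affects a grouped sum, with and without `na.rm = TRUE`.

```r
library(dplyr)

# Toy data: exponent columns already hold numeric multipliers.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 1.5, 0),
  PROPDMGEXP = c(1e3, 1e6, NA),  # NA: an exponent code that was not mapped
  CROPDMG    = c(0, 2, 10),
  CROPDMGEXP = c(1, 1e3, 1e3),
  stringsAsFactors = FALSE
)

toy <- mutate(toy,
  propT = PROPDMG * PROPDMGEXP,
  cropT = CROPDMG * CROPDMGEXP,
  totalDamage = propT + cropT
)

# Row 1: 25 * 1000 + 0 * 1       = 25000
# Row 2: 1.5 * 1e6 + 2 * 1000    = 1502000
# Row 3: 0 * NA is NA, so the total is NA as well
toy$totalDamage

# A plain sum() is NA for any group that contains an NA;
# na.rm = TRUE drops the NA rows before summing.
by_event <- toy %>%
  group_by(EVTYPE) %>%
  summarize(strict = sum(totalDamage),
            ignoring_na = sum(totalDamage, na.rm = TRUE))
by_event
```

Note that `ignoring_na` is 0 for HAIL in this toy example because its only row is NA, which means the crop damage in that row is lost as well. Fixing the multiplier itself is the cleaner remedy.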

2. Results

A. Impact on Population Health

#The maximum casualty count for a single weather event (stored under a name that does not mask base::max)
max_cslty<-max(a$cslty)
max_cslty
## [1] 1742
#The weather event with the maximum number of casualties
ev<-a$EVTYPE[which.max(a$cslty)]
ev
## [1] TORNADO
## 977 Levels: ? ABNORMAL WARMTH ABNORMALLY DRY ... WND
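The same which.max() index can be used to pull the whole record rather than only the event type. The sketch below shows this on a toy data frame whose values are invented for illustration.

```r
# Toy data with the casualty column built the same way as in this report.
toy <- data.frame(
  EVTYPE = c("FLOOD", "TORNADO", "HEAT"),
  FATALITIES = c(1, 3, 5),
  INJURIES = c(3, 50, 0),
  stringsAsFactors = FALSE
)
toy$cslty <- toy$FATALITIES + toy$INJURIES

# which.max() returns the position of the first maximum,
# so indexing rows with it returns the full record for that event.
worst <- toy[which.max(toy$cslty), ]
worst
```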

The single weather event with the highest number of casualties (fatalities plus injuries) was a TORNADO, with 1742 casualties.


The top 10 casualty-causing severe weather events

g<-group_by(a, EVTYPE)%>%summarize(csum=sum(cslty)) 
a1<-g[order(-g$csum),]
head(a1$EVTYPE,10)
##  [1] TORNADO           EXCESSIVE HEAT    TSTM WIND        
##  [4] FLOOD             LIGHTNING         HEAT             
##  [7] FLASH FLOOD       ICE STORM         THUNDERSTORM WIND
## [10] WINTER STORM     
## 977 Levels: ? ABNORMAL WARMTH ABNORMALLY DRY ... WND
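The group, sum, and sort steps above can also be chained in one dplyr pipeline. The following is a minimal sketch on invented toy data that uses arrange(desc()) in place of order(); it is an alternative spelling, not the code that produced the output above.

```r
library(dplyr)

# Toy data: several records per event type.
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  cslty  = c(10, 4, 7, 6, 1),
  stringsAsFactors = FALSE
)

top_events <- toy %>%
  group_by(EVTYPE) %>%
  summarize(csum = sum(cslty)) %>%  # total casualties per event type
  arrange(desc(csum)) %>%           # largest total first
  head(2)                           # keep the top 2 (the report keeps 10)

top_events
```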

The top 10 mortality-causing severe weather events

g1<-group_by(a, EVTYPE)%>%summarize(fsum=sum(FATALITIES))
a2<-g1[order(-g1$fsum),]
head(a2$EVTYPE,10)
##  [1] TORNADO        EXCESSIVE HEAT FLASH FLOOD    HEAT          
##  [5] LIGHTNING      TSTM WIND      FLOOD          RIP CURRENT   
##  [9] HIGH WIND      AVALANCHE     
## 977 Levels: ? ABNORMAL WARMTH ABNORMALLY DRY ... WND

The top 10 injury-causing severe weather events

g2<-group_by(a, EVTYPE)%>%summarize(isum=sum(INJURIES))
a3<-g2[order(-g2$isum),]
head(a3$EVTYPE,10)
##  [1] TORNADO           TSTM WIND         FLOOD            
##  [4] EXCESSIVE HEAT    LIGHTNING         HEAT             
##  [7] ICE STORM         FLASH FLOOD       THUNDERSTORM WIND
## [10] HAIL             
## 977 Levels: ? ABNORMAL WARMTH ABNORMALLY DRY ... WND
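The three rankings above (casualties, fatalities, injuries) each come from a separate group_by() and summarize() call. As a sketch of an alternative, all three totals can be computed in one pass and then sorted by whichever column is of interest. The toy values below are invented for illustration.

```r
library(dplyr)

toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "HEAT"),
  FATALITIES = c(1, 5, 2, 4),
  INJURIES   = c(20, 3, 30, 1),
  stringsAsFactors = FALSE
)

# One grouped summary holding all three health measures.
health <- toy %>%
  group_by(EVTYPE) %>%
  summarize(fsum = sum(FATALITIES),
            isum = sum(INJURIES),
            csum = sum(FATALITIES + INJURIES))

# Rankings can differ by measure: sort by each column in turn.
by_fatalities <- arrange(health, desc(fsum))
by_injuries   <- arrange(health, desc(isum))
by_fatalities$EVTYPE
by_injuries$EVTYPE
```

In this toy example HEAT leads on fatalities while TORNADO leads on injuries, which is why the report lists the measures separately.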

Weather Events and Total Casualties Table:

b<-a1[1:10,]
colnames(b)<-c("Event", "Casualty")
kable(b, caption = "Top 10 severe weather events by total casualties")
Top 10 severe weather events by total casualties

Event                Casualty
TORNADO                 96979
EXCESSIVE HEAT           8428
TSTM WIND                7461
FLOOD                    7259
LIGHTNING                6046
HEAT                     3037
FLASH FLOOD              2755
ICE STORM                2064
THUNDERSTORM WIND        1621
WINTER STORM             1527

Conclusion

Tornadoes are the most harmful weather event type with respect to population health: they rank first in fatalities, in injuries, and in the two combined.


B. Economic Consequences

The top 10 weather events causing property damage

g5<-group_by(a, EVTYPE)%>%summarize(psum=sum(propT))
a5<-g5[order(-g5$psum),]
b5<-a5[1:10,]
b5
## # A tibble: 10 x 2
##    EVTYPE                    psum
##    <fct>                    <dbl>
##  1 FLOOD             144657709807
##  2 HURRICANE/TYPHOON  69305840000
##  3 STORM SURGE        43323536000
##  4 HURRICANE          11868319010
##  5 TROPICAL STORM      7703890550
##  6 WINTER STORM        6688497251
##  7 RIVER FLOOD         5118945500
##  8 WILDFIRE            4765114000
##  9 STORM SURGE/TIDE    4641188000
## 10 TSTM WIND           4493028495

The top 10 weather events causing crop damage

g6<-group_by(a, EVTYPE)%>%summarize(crsum=sum(cropT))
a6<-g6[order(-g6$crsum),]
b6<-a6[1:10,]
b6
## # A tibble: 10 x 2
##    EVTYPE                  crsum
##    <fct>                   <dbl>
##  1 DROUGHT           13972566000
##  2 FLOOD              5661968450
##  3 RIVER FLOOD        5029459000
##  4 ICE STORM          5022113500
##  5 HAIL               3025954473
##  6 HURRICANE          2741910000
##  7 HURRICANE/TYPHOON  2607872800
##  8 FLASH FLOOD        1421317100
##  9 EXTREME COLD       1292973000
## 10 FROST/FREEZE       1094086000

The top 10 weather events causing total damage

#Aggregate total damage by event type (totalDamage = cropT + propT was already computed in the data processing step)
g7<-group_by(a, EVTYPE)%>%summarize(tsum=sum(totalDamage))
a7<-g7[order(-g7$tsum),]
b7<-a7[1:10,]
colnames(b7)<-c("Event", "Damage")
kable(b7, caption = "Top 10 severe weather events in terms of Total Damage")
Top 10 severe weather events in terms of Total Damage

Event                      Damage
FLOOD                150319678257
HURRICANE/TYPHOON     71913712800
STORM SURGE           43323541000
DROUGHT               15018672000
HURRICANE             14610229010
RIVER FLOOD           10148404500
ICE STORM              8967041360
TROPICAL STORM         8382236550
WINTER STORM           6715441251
WILDFIRE               5060586800

gg<-ggplot(b7, aes(Event, Damage/1000000))
gg+geom_point(size=3 ,color="red")+labs(x="Event", y="Damage cost (million US dollars)")+theme(axis.text.x = element_text(color="black", size=8, angle=30))

Figure: total damage cost (crop damage plus property damage) for the top 10 weather events, in millions of US dollars.
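ggplot2 places a character or factor x variable in alphabetical order by default, so the points in the plot do not follow the ranking in the table. One possible adjustment, sketched below under the assumption that a ranked axis is wanted, is to wrap the x variable in reorder(). The data frame `top4` is a stand-in that copies four rows from the total-damage table above; `hjust = 1` is an added styling choice.

```r
library(ggplot2)

# Four rows copied from the total-damage table in this report.
top4 <- data.frame(
  Event  = c("FLOOD", "HURRICANE/TYPHOON", "STORM SURGE", "DROUGHT"),
  Damage = c(150319678257, 71913712800, 43323541000, 15018672000),
  stringsAsFactors = FALSE
)

# reorder(Event, -Damage) sorts the axis from largest to smallest damage.
p <- ggplot(top4, aes(x = reorder(Event, -Damage), y = Damage / 1e6)) +
  geom_point(size = 3, color = "red") +
  labs(x = "Event", y = "Damage cost (million US dollars)") +
  theme(axis.text.x = element_text(color = "black", size = 8,
                                   angle = 30, hjust = 1))
p
```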


Conclusion

Floods have the greatest economic consequences (property and crop damage combined).