Synopsis

The U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database began tracking a standard set of 48 storm event types in 1996. This report analyzes the storm data obtained from the NOAA repository; the code shown applies no year filter, so every record in the downloaded file is included. Floods cause the greatest economic damage (crop and property combined), followed by hurricanes/typhoons and tornadoes, while tornadoes take the greatest toll on population health in both injuries and fatalities. The report covers data retrieval and preparation, then presents results in two categories: (1) types of events that are most harmful to population health, and (2) types of events that have the greatest economic consequences.

Data Processing

Retrieval and Loading

The compressed data is conditionally downloaded from the source URL if not found locally and then loaded directly via read.csv.

#install.packages("dplyr")
library('dplyr')
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
filename <- 'StormData.csv.bz2'
if (!file.exists(filename)) {
  download.file('https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2', filename)
}
storm_data <- read.csv(filename)
df <- storm_data
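The loading step above keeps every record in the file. If the analysis were to be restricted to events from 1996 onward, a filter on the event start date could look like the sketch below. It runs on a toy data frame, and it assumes the raw file has a `BGN_DATE` column holding strings such as "4/18/1950 0:00:00"; neither the column nor the cutoff is used elsewhere in this report.

```r
# Toy stand-in for storm_data; BGN_DATE and its format are assumptions.
toy <- data.frame(
  BGN_DATE = c("4/18/1950 0:00:00", "1/6/1996 0:00:00", "11/28/2011 0:00:00"),
  EVTYPE   = c("TORNADO", "WINTER STORM", "HAIL"),
  stringsAsFactors = FALSE
)

# Parse only the date part; the trailing time text is ignored.
toy$BGN_DATE <- as.Date(toy$BGN_DATE, format = "%m/%d/%Y")

# Keep events that began on or after 1 January 1996 (assumed cutoff).
toy_1996 <- toy[toy$BGN_DATE >= as.Date("1996-01-01"), ]
print(toy_1996$EVTYPE)
```

Applying the same two steps to `df` before the aggregations would change every total reported below, so the tables in this report should not be read as 1996 to 2011 figures.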

Health Impact

To evaluate the health impact, the total fatalities and the total injuries for each event type (EVTYPE) are calculated. The code for this calculation is shown below.

df.fatalities <- df %>%
  select(EVTYPE, FATALITIES) %>%
  group_by(EVTYPE) %>%
  summarise(total.fatalities = sum(FATALITIES)) %>%
  arrange(desc(total.fatalities))
head(df.fatalities, 10)
## # A tibble: 10 x 2
##    EVTYPE         total.fatalities
##    <fct>                     <dbl>
##  1 TORNADO                    5633
##  2 EXCESSIVE HEAT             1903
##  3 FLASH FLOOD                 978
##  4 HEAT                        937
##  5 LIGHTNING                   816
##  6 TSTM WIND                   504
##  7 FLOOD                       470
##  8 RIP CURRENT                 368
##  9 HIGH WIND                   248
## 10 AVALANCHE                   224

The table above lists the ten event types with the most fatalities. TORNADO has the largest number, 5633 in total. Next, the total injuries per event type are computed in the same way; only the top 10 rows are shown.

df.injuries <- df %>%
  select(EVTYPE, INJURIES) %>%
  group_by(EVTYPE) %>%
  summarise(total.injuries = sum(INJURIES)) %>%
  arrange(desc(total.injuries))
head(df.injuries, 10)
## # A tibble: 10 x 2
##    EVTYPE            total.injuries
##    <fct>                      <dbl>
##  1 TORNADO                    91346
##  2 TSTM WIND                   6957
##  3 FLOOD                       6789
##  4 EXCESSIVE HEAT              6525
##  5 LIGHTNING                   5230
##  6 HEAT                        2100
##  7 ICE STORM                   1975
##  8 FLASH FLOOD                 1777
##  9 THUNDERSTORM WIND           1488
## 10 HAIL                        1361

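The injury totals above show that TORNADO causes by far the most injuries, 91346 in total, followed by TSTM WIND with 6957. Note that TSTM WIND and THUNDERSTORM WIND appear as separate event types in the table and are not merged in this analysis.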
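Both injury and fatality tables contain closely related labels (TSTM WIND and THUNDERSTORM WIND, HEAT and EXCESSIVE HEAT). This report groups on the raw EVTYPE values, but a cleanup step before grouping could be sketched as follows. The helper `normalize_evtype` and the single synonym rule are hypothetical, and the data is a toy example rather than the NOAA file.

```r
# Hypothetical helper: tidy EVTYPE labels before grouping.
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  # Assumed synonym rule: expand the "TSTM" abbreviation.
  gsub("\\bTSTM\\b", "THUNDERSTORM", x, perl = TRUE)
}

# Toy data, not taken from the NOAA file.
toy <- data.frame(
  EVTYPE   = c("TSTM WIND", "THUNDERSTORM WIND", " Tstm Wind", "TORNADO"),
  INJURIES = c(3, 2, 1, 10)
)
toy$EVTYPE <- normalize_evtype(toy$EVTYPE)

# Base R equivalent of group_by() + summarise(), sorted descending.
totals <- aggregate(INJURIES ~ EVTYPE, data = toy, FUN = sum)
totals <- totals[order(-totals$INJURIES), ]
print(totals)
```

Which labels should be merged is a judgment call; applying rules like this to `df` would change the rankings shown in the tables.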

Economic Impact of Storms

To analyze the impact of weather events on the economy, the available property damage and crop damage estimates were used. In the raw data, property damage is represented with two fields: a numeric amount PROPDMG and a magnitude symbol PROPDMGEXP that scales it (for example, K for thousands). Crop damage is represented the same way, with CROPDMG and CROPDMGEXP. The first step in the analysis is to convert both to dollars for each event. The code below computes this economic damage.
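As a small illustration of the amount-plus-symbol encoding described above, the sketch below decodes only the four letter symbols, using the same multipliers that the report's code assigns to them. The helper name `damage_in_dollars` is hypothetical, and the sketch deliberately ignores the digit and punctuation symbols that the full lookup handles.

```r
# Letter multipliers as used in the report's lookup: H = 10^2, K = 10^3,
# M = 10^6, B = 10^9.
exp_multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: dollars = amount * multiplier, case-insensitive.
damage_in_dollars <- function(amount, exp_symbol) {
  mult <- exp_multiplier[toupper(as.character(exp_symbol))]
  unname(amount * mult)  # NA when the symbol is not one of the four letters
}

# 25 with "K", 2.5 with "m", 1.2 with "B"
print(damage_in_dollars(c(25, 2.5, 1.2), c("K", "m", "B")))
```

So a record with PROPDMG 25 and PROPDMGEXP "K" represents 25,000 dollars of property damage.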

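df.damage <- df %>% select(EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)

# Lookup table from exponent symbol to dollar multiplier.
# The symbols are listed explicitly, in the order the multipliers assume,
# so the pairing does not depend on the locale-specific order of sort().
Symbol <- c("", "-", "?", "+", as.character(0:8), "B", "h", "H", "K", "m", "M")
Multiplier <- c(0, 0, 0, 1, rep(10, 9), 10^9, 10^2, 10^2, 10^3, 10^6, 10^6)
convert.Multiplier <- data.frame(Symbol, Multiplier)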

df.damage$Prop.Multiplier <- convert.Multiplier$Multiplier[match(df.damage$PROPDMGEXP, convert.Multiplier$Symbol)]
df.damage$Crop.Multiplier <- convert.Multiplier$Multiplier[match(df.damage$CROPDMGEXP, convert.Multiplier$Symbol)]

df.damage <- df.damage %>%
  mutate(PROPDMG = PROPDMG * Prop.Multiplier) %>%
  mutate(CROPDMG = CROPDMG * Crop.Multiplier) %>%
  mutate(TOTAL.DMG = PROPDMG + CROPDMG)

df.damage.total <- df.damage %>%
  group_by(EVTYPE) %>%
  summarize(TOTAL.DMG.EVTYPE = sum(TOTAL.DMG)) %>%
  arrange(desc(TOTAL.DMG.EVTYPE))

head(df.damage.total,10)
## # A tibble: 10 x 2
##    EVTYPE            TOTAL.DMG.EVTYPE
##    <fct>                        <dbl>
##  1 FLOOD                 150319678250
##  2 HURRICANE/TYPHOON      71913712800
##  3 TORNADO                57352117607
##  4 STORM SURGE            43323541000
##  5 FLASH FLOOD            17562132111
##  6 DROUGHT                15018672000
##  7 HURRICANE              14610229010
##  8 RIVER FLOOD            10148404500
##  9 ICE STORM               8967041810
## 10 TROPICAL STORM          8382236550

The two columns EVTYPE and TOTAL.DMG.EVTYPE in the table above give the total damage for the ten most damaging storm event types. For example, FLOOD causes the highest damage, about $150 billion; the exact value is shown in the table. HURRICANE/TYPHOON is second, at about $72 billion.
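One caveat about the match()-based lookup used above: the symbol table is built from PROPDMGEXP only, and match() returns NA for any value it cannot find. If CROPDMGEXP contains a symbol that never occurs in PROPDMGEXP, the affected rows get an NA multiplier, sum() then returns NA for the whole event type, and arrange() places NA totals last. Whether this happens in the NOAA file is not checked in this report; the sketch below only demonstrates the mechanism on toy values, with a lowercase "k" as a hypothetical unmatched symbol.

```r
# Toy lookup and toy crop columns; the lowercase "k" is hypothetical.
lookup   <- data.frame(Symbol = c("", "K", "M"), Multiplier = c(0, 1e3, 1e6))
crop_exp <- c("K", "k", "M")
crop_dmg <- c(5, 2, 1)

mult <- lookup$Multiplier[match(crop_exp, lookup$Symbol)]
total_with_na <- sum(crop_dmg * mult)       # NA: one row has no multiplier

# List the symbols that found no match, then treat them as zero damage.
unmatched <- unique(crop_exp[is.na(mult)])
mult[is.na(mult)] <- 0
total_clean <- sum(crop_dmg * mult)

print(unmatched)
print(total_clean)
```

Treating unmatched symbols as zero is only one possible choice; mapping them to a known multiplier is another, and either would alter the totals in the table above.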

Results

Health Impacts

The bar plot below shows how total fatalities vary across the ten deadliest event types. The code used to create the plot is given first.

library(ggplot2)
g <- ggplot(df.fatalities[1:10,], aes(x=reorder(EVTYPE, -total.fatalities), y=total.fatalities)) +
  geom_bar(stat="identity") +
  theme(axis.text.x = element_text(angle=90, vjust=0.5, hjust=1)) +
  ggtitle("Top 10 Events with Highest Total Fatalities") +
  labs(x="EVENT TYPE", y="Total Fatalities")
g

The plot above shows the top 10 events with the highest total fatalities. The plot below shows the highest total injuries by event type, such as TORNADO and EXCESSIVE HEAT.

g <- ggplot(df.injuries[1:10,], aes(x=reorder(EVTYPE, -total.injuries), y=total.injuries)) +
  geom_bar(stat="identity") +
  theme(axis.text.x = element_text(angle=90, vjust=0.5, hjust=1)) +
  ggtitle("Top 10 Events with Highest Total Injuries") +
  labs(x="EVENT TYPE", y="Total Injuries")
g

The plot shows that the most injuries are caused by TORNADO, followed by TSTM WIND, FLOOD, and so on. Only the top 10 event types are shown in this plot.
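To put the gap between tornadoes and the runner-up in numbers, the ratios can be checked directly from the totals printed in the Health Impact tables. The short calculation below copies those four values by hand; the variable names are only illustrative.

```r
# Totals copied from the fatalities and injuries tables in this report.
fatalities <- c(TORNADO = 5633, "EXCESSIVE HEAT" = 1903)
injuries   <- c(TORNADO = 91346, "TSTM WIND" = 6957)

# Ratio of the top event type to the second-ranked one.
fatality_ratio <- fatalities[["TORNADO"]] / fatalities[["EXCESSIVE HEAT"]]
injury_ratio   <- injuries[["TORNADO"]] / injuries[["TSTM WIND"]]

print(round(c(fatalities = fatality_ratio, injuries = injury_ratio), 1))
```

Tornado fatalities are roughly three times those of the second-ranked event type, and tornado injuries are roughly thirteen times the second-ranked total.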

Economic Impacts

Next, the variation of economic damage across storm event types is plotted. The top 10 events with the highest total economic damage, combining property and crop damage, are shown below.

g <- ggplot(df.damage.total[1:10,], aes(x=reorder(EVTYPE, -TOTAL.DMG.EVTYPE), y=TOTAL.DMG.EVTYPE)) +
  geom_bar(stat="identity") +
  theme(axis.text.x = element_text(angle=90, vjust=0.5, hjust=1)) +
  ggtitle("Top 10 Events with Highest Economic Impact") +
  labs(x="EVENT TYPE", y="Total Economic Impact ($USD)")

g

The plot above makes it clear that the highest economic damage is caused by FLOOD, followed by HURRICANE/TYPHOON and TORNADO. These three events have a far larger economic impact than other events in the top 10, such as ICE STORM and TROPICAL STORM.

Conclusion

In this project, the health and economic impacts of storms were studied using the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which began tracking a standard set of 48 storm event types in 1996. The analysis found that floods cause the greatest economic damage (crop and property combined), followed by hurricanes/typhoons and tornadoes, while tornadoes take the greatest toll on population health in both injuries and fatalities. The report covered data retrieval and preparation, and presented results in two categories: (1) types of events that are most harmful to population health, and (2) types of events that have the greatest economic consequences.

Tornado events top both health rankings. Their 5633 fatalities are roughly three times those of second-place Excessive Heat (1903), and their 91346 injuries are about thirteen times those of second-place TSTM WIND (6957). Excessive Heat is still worth noting: although it is far behind tornadoes, it ranks second in fatalities and fourth in injuries.