Impact analysis of weather events in the US

Synopsis

This study aims to answer the following questions about severe weather events in the US:

  • Which types of weather events in the US are most harmful with respect to population health?
  • Which types of weather events in the US have the greatest economic consequences?

To answer these two questions, the NOAA Storm database is processed using the R code described below.

Data Processing

First, we download the NOAA dataset from the website and read it into R.

download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",destfile = "storm_data.csv.bz2")
data <- read.csv("storm_data.csv.bz2")
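Because the download is repeated every time the analysis runs, it may be worth skipping it when the file is already on disk. The following is a minimal sketch of that idea; the helper name `fetch_if_missing` is hypothetical and not part of the original analysis.

```r
# Hypothetical helper (not in the original analysis): download a file only
# when it is not already present on disk.
fetch_if_missing <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps the compressed file intact on all platforms
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

storm_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"

# Usage (commented out so that sourcing this sketch does not hit the network):
# fetch_if_missing(storm_url, "storm_data.csv.bz2")
# data <- read.csv("storm_data.csv.bz2")
```

With this in place, re-running the report reuses the local copy of storm_data.csv.bz2 instead of fetching it again.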

Subsequently, we group the dataset by weather event type and calculate the total number of injuries and fatalities for each type. A similar calculation gives the total economic damage, that is, the sum of the damage to crops and to property.

library(dplyr)
## Warning: package 'dplyr' was built under R version 3.5.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
summary_damage <- data %>% group_by(EVTYPE) %>% summarize(Total_Injuries = sum(INJURIES),Total_Fatalities = sum(FATALITIES))

injuries <- summary_damage[order(summary_damage$Total_Injuries,decreasing = TRUE),]
fatalities <- summary_damage[order(summary_damage$Total_Fatalities,decreasing = TRUE),]

injuries_max <- injuries[1:5,]
fatalities_max <- fatalities[1:5,]
summary_damage$total <- summary_damage$Total_Injuries + summary_damage$Total_Fatalities
summary_eco_damage <- data %>% group_by(EVTYPE) %>% summarize(Crop_Dmg = sum(CROPDMG),Prop_Dmg = sum(PROPDMG))
summary_eco_damage$Total_Dmg <- summary_eco_damage$Crop_Dmg + summary_eco_damage$Prop_Dmg
eco_dmg <- summary_eco_damage[order(summary_eco_damage$Total_Dmg,decreasing = TRUE),]
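The sums above add the raw PROPDMG and CROPDMG values. In the NOAA data these figures are normally read together with the magnitude columns PROPDMGEXP and CROPDMGEXP, so one possible refinement is to scale each value before summing. The sketch below illustrates this on made-up rows using base R only; the helper `exp_to_multiplier` is hypothetical, and the mapping (K = thousands, M = millions, B = billions, anything else left unscaled) is an assumption that should be checked against the dataset documentation.

```r
# Toy rows for illustration only; these values are made up, not NOAA data.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 2.5, 1.2, 500),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper. Assumed mapping: K = 1e3, M = 1e6, B = 1e9;
# any other code (including blank or missing) is treated as a multiplier of 1.
exp_to_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Scale each damage figure, then total it per event type.
toy$PROPDMG_USD <- toy$PROPDMG * exp_to_multiplier(toy$PROPDMGEXP)
prop_by_event <- aggregate(PROPDMG_USD ~ EVTYPE, data = toy, FUN = sum)
prop_by_event <- prop_by_event[order(prop_by_event$PROPDMG_USD, decreasing = TRUE), ]
print(prop_by_event)
```

The same scaling would apply to CROPDMG with CROPDMGEXP before the two totals are added together.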

Results

In this section we look at the results. First, let's look at the five weather event types that caused the most injuries.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.5.3
head(injuries_max)
## # A tibble: 5 x 3
##   EVTYPE         Total_Injuries Total_Fatalities
##   <fct>                   <dbl>            <dbl>
## 1 TORNADO                 91346             5633
## 2 TSTM WIND                6957              504
## 3 FLOOD                    6789              470
## 4 EXCESSIVE HEAT           6525             1903
## 5 LIGHTNING                5230              816
ggplot(data=injuries_max,aes(x=factor(EVTYPE),y=Total_Injuries)) + geom_bar(stat="identity") + coord_flip() + ylab("Total Number of Injuries") + xlab("Event Type") + theme_classic()

Tornadoes top the list, followed by thunderstorm wind and floods. Now let's see which five event types caused the most fatalities.

head(fatalities_max)
## # A tibble: 5 x 3
##   EVTYPE         Total_Injuries Total_Fatalities
##   <fct>                   <dbl>            <dbl>
## 1 TORNADO                 91346             5633
## 2 EXCESSIVE HEAT           6525             1903
## 3 FLASH FLOOD              1777              978
## 4 HEAT                     2100              937
## 5 LIGHTNING                5230              816
ggplot(data=fatalities_max,aes(x=factor(EVTYPE),y=Total_Fatalities)) + geom_bar(stat="identity") + coord_flip() + ylab("Total Number of Fatalities") + theme_classic() + xlab("Event Type")

As before, tornadoes top the list, followed by excessive heat.

Now let’s look at the events with the greatest economic impact.

head(eco_dmg)
## # A tibble: 6 x 4
##   EVTYPE            Crop_Dmg Prop_Dmg Total_Dmg
##   <fct>                <dbl>    <dbl>     <dbl>
## 1 TORNADO            100019. 3212258.  3312277.
## 2 FLASH FLOOD        179200. 1420125.  1599325.
## 3 TSTM WIND          109203. 1335966.  1445168.
## 4 HAIL               579596.  688693.  1268290.
## 5 FLOOD              168038.  899938.  1067976.
## 6 THUNDERSTORM WIND   66791.  876844.   943636.
ggplot(data=eco_dmg[1:5,],aes(x=factor(EVTYPE),y=Total_Dmg)) + geom_bar(stat="identity") + coord_flip() + ylab("Total Economic Damage") + theme_classic() + xlab("Event Type")

From the graph above we can see that tornadoes cause the most economic damage, followed by flash floods and thunderstorm wind.
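The table above lists TSTM WIND and THUNDERSTORM WIND as separate rows, which suggests that some event types are split across spelling variants. A possible refinement is to normalize the labels before grouping. The sketch below shows the idea on made-up values with base R; the helper `normalize_evtype` and its single abbreviation mapping are assumptions for illustration, not part of the original analysis.

```r
# Hypothetical helper: bring event labels to one spelling before grouping.
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))        # unify case, drop outer spaces
  x <- gsub("[[:space:]]+", " ", x)            # collapse repeated whitespace
  x[x == "TSTM WIND"] <- "THUNDERSTORM WIND"   # assumed abbreviation mapping
  x
}

# Toy rows for illustration only; these values are made up, not NOAA data.
toy <- data.frame(
  EVTYPE    = c("TSTM WIND", "thunderstorm wind", " THUNDERSTORM  WIND", "HAIL"),
  Total_Dmg = c(10, 5, 1, 12),
  stringsAsFactors = FALSE
)

# Normalize, then total the damage per merged label.
toy$EVTYPE <- normalize_evtype(toy$EVTYPE)
merged <- aggregate(Total_Dmg ~ EVTYPE, data = toy, FUN = sum)
merged <- merged[order(merged$Total_Dmg, decreasing = TRUE), ]
print(merged)
```

In the toy data the three thunderstorm variants collapse into one row, so the merged category can outrank an event type that would otherwise appear above each variant individually.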