Synopsis

This report examines the economic and population health effects of severe weather events in the USA by exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. Different weather events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

Load Data

This R Markdown document contains the Peer-graded Assignment: Course Project 2.

Step 1: Download the data file and save it in a specific folder: https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2
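The download step can also be scripted so the report is reproducible on another machine. The following is a minimal sketch, not the method used in this report: the helper name `fetch_storm_data` and the `data` folder are hypothetical, and only the URL comes from the text above.

```r
# URL of the compressed storm data, as given in the text.
storm_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"

# Hypothetical helper: download the file only if it is not already on disk,
# then return the local path.
fetch_storm_data <- function(url, dest) {
  if (!file.exists(dest)) {
    # Create the target folder if needed (no warning if it already exists).
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    # mode = "wb" keeps the binary .bz2 file intact on Windows.
    download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# Usage (commented out because it needs network access):
# path <- fetch_storm_data(storm_url, file.path("data", "StormData.csv.bz2"))
# df <- readr::read_csv(path)  # read_csv() can read .bz2 files directly
```

Checking for an existing file first avoids re-downloading the archive every time the document is knitted.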

library(readr)
## Warning: package 'readr' was built under R version 3.3.3
df <- read_csv("C:/Users/gbennett/Dropbox/Data Scientists/5.Reproducible_Research/project/repdata_data_StormData.csv")
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   STATE__ = col_double(),
##   COUNTY = col_double(),
##   BGN_RANGE = col_double(),
##   COUNTY_END = col_double(),
##   END_RANGE = col_double(),
##   LENGTH = col_double(),
##   WIDTH = col_double(),
##   F = col_integer(),
##   MAG = col_double(),
##   FATALITIES = col_double(),
##   INJURIES = col_double(),
##   PROPDMG = col_double(),
##   CROPDMG = col_double(),
##   LATITUDE = col_double(),
##   LONGITUDE = col_double(),
##   LATITUDE_E = col_double(),
##   LONGITUDE_ = col_double(),
##   REFNUM = col_double()
## )
## See spec(...) for full column specifications.

Data Processing

To evaluate the health impact, the total fatalities and the total injuries for each event type (EVTYPE) were calculated. The output of this calculation is:

## Warning: package 'dplyr' was built under R version 3.3.3
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Warning: package 'bindrcpp' was built under R version 3.3.3
## # A tibble: 10 x 2
##            EVTYPE total.fatalities
##             <chr>            <dbl>
##  1        TORNADO             5633
##  2 EXCESSIVE HEAT             1903
##  3    FLASH FLOOD              978
##  4           HEAT              937
##  5      LIGHTNING              816
##  6      TSTM WIND              504
##  7          FLOOD              470
##  8    RIP CURRENT              368
##  9      HIGH WIND              248
## 10      AVALANCHE              224
## # A tibble: 10 x 2
##               EVTYPE total.injuries
##                <chr>          <dbl>
##  1           TORNADO          91346
##  2         TSTM WIND           6957
##  3             FLOOD           6789
##  4    EXCESSIVE HEAT           6525
##  5         LIGHTNING           5230
##  6              HEAT           2100
##  7         ICE STORM           1975
##  8       FLASH FLOOD           1777
##  9 THUNDERSTORM WIND           1488
## 10              HAIL           1361
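The code that produced these two tables is not echoed in the rendered output. A minimal sketch of how such per-event totals could be computed with dplyr is given here, assuming a data frame with EVTYPE, FATALITIES, and INJURIES columns. The `storms` data frame holds made-up toy values for illustration only; the result names `df.fatalities` and `df.injuries` and their column names are chosen to match what the plotting code in the Results section expects.

```r
library(dplyr)

# Toy stand-in for the storm data (made-up values, illustration only).
storms <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 2, 4, 0),
  INJURIES   = c(10, 2, 5, 0, 1),
  stringsAsFactors = FALSE
)

# Total fatalities per event type, sorted from largest to smallest.
df.fatalities <- storms %>%
  group_by(EVTYPE) %>%
  summarise(total.fatalities = sum(FATALITIES, na.rm = TRUE)) %>%
  arrange(desc(total.fatalities))

# Total injuries per event type, sorted from largest to smallest.
df.injuries <- storms %>%
  group_by(EVTYPE) %>%
  summarise(total.injuries = sum(INJURIES, na.rm = TRUE)) %>%
  arrange(desc(total.injuries))

# Show at most the first 10 rows of each summary.
head(df.fatalities, 10)
head(df.injuries, 10)
```

Applied to the full data set in place of `storms`, the same pipeline would yield one row per distinct EVTYPE string, so spelling variants such as TSTM WIND and THUNDERSTORM WIND are counted separately, as they are in the tables above.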

Economic impact: the CROPDMG and PROPDMG values were then multiplied by the multipliers indicated in the CROPDMGEXP and PROPDMGEXP columns, respectively.

## Warning in cbind(Multiplier, Symbol): number of rows of result is not a
## multiple of vector length (arg 2)
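The multiplier code itself is not shown, and the warning above suggests that the Multiplier and Symbol vectors passed to cbind() had incompatible lengths. One way to avoid that is a named lookup vector, sketched here. The symbol meanings (H = hundreds, K = thousands, M = millions, B = billions), the choice to treat any other symbol as a multiplier of 1, and the names `exp_multiplier` and `to_dollars` are all assumptions made for illustration, not taken from this report.

```r
# Assumed meaning of the exponent symbols (not confirmed by this report).
exp_multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: scale a damage value by its exponent symbol.
# Symbols are matched case-insensitively; missing or unrecognised
# symbols are treated as a multiplier of 1 (an assumption).
to_dollars <- function(value, symbol) {
  idx <- match(toupper(symbol), names(exp_multiplier))
  multiplier <- ifelse(is.na(idx), 1, exp_multiplier[idx])
  value * multiplier
}

# Example with made-up values.
to_dollars(c(2.5, 10, 3, 7), c("K", "m", "B", NA))
```

On the real data this could be applied as `to_dollars(df$PROPDMG, df$PROPDMGEXP)` and likewise for the crop columns. Because each symbol is paired with its multiplier by name, the lookup cannot fall out of alignment the way two separate vectors can.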

Results

Health Impact

The top 10 event types with the highest total fatalities and the highest total injuries are shown graphically.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.3.3
# Bar chart of the 10 event types with the most fatalities, tallest bar first.
HTF <- ggplot(df.fatalities[1:10, ],
              aes(x = reorder(EVTYPE, -total.fatalities), y = total.fatalities)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, colour = "blue")) +
  ggtitle("Top 10 Events with Highest Total Fatalities") +
  labs(x = "EVENT TYPE", y = "Total Fatalities")
# Bar chart of the 10 event types with the most injuries, tallest bar first.
HTI <- ggplot(df.injuries[1:10, ],
              aes(x = reorder(EVTYPE, -total.injuries), y = total.injuries)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, colour = "blue")) +
  ggtitle("Top 10 Events with Highest Total Injuries") +
  labs(x = "EVENT TYPE", y = "Total Injuries")
HTF

HTI