What Types of Weather Events are the Most Dangerous?

Reproducible Research: Peer Assessment 2

By: Nicholas Dell’Omo


Synopsis

Storms and other severe weather events can cause both economic and public health problems for cities and communities around the United States. Many severe weather events cause property damage, injuries, or even fatalities. Preventing such outcomes should be a major concern of local, state, and federal governments.

The purpose of this analysis is to explore data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The goal is to answer a simple question: what types of weather events are the most dangerous? To determine which events are in fact the most dangerous, this project explores two questions:

  1. Across the United States, which types of events are most harmful with respect to population health?
  2. Across the United States, which types of events have the greatest economic consequences?

Data Processing

These are the required libraries to run the code.

library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Setting working directory and reading in storm data.

setwd("~/Documents/Coursera/RR")
storm<-read.csv("repdata-data-StormData.csv",stringsAsFactors=F)

These are the relevant variables for this project:
EVTYPE - the event type name (e.g. tornado, flood)
FATALITIES - the number of fatalities that occurred (harm to population health)
INJURIES - the number of injuries that occurred (harm to population health)
PROPDMG - the property damage amount in USD, before scaling by PROPDMGEXP
PROPDMGEXP - the magnitude of the property damage amount (e.g. K for thousands, M for millions, B for billions)
CROPDMG - the crop damage amount in USD, before scaling by CROPDMGEXP
CROPDMGEXP - the magnitude of the crop damage amount (e.g. K for thousands, M for millions, B for billions)

Subsetting the relevant variables to shrink the size of my data set.

storm<-select(storm,EVTYPE,FATALITIES,INJURIES,PROPDMG,PROPDMGEXP,CROPDMG,CROPDMGEXP)

Adding together the number of fatalities and injuries in the data.

storm$health=storm$INJURIES+storm$FATALITIES

Cleaning up missing values and converting the magnitude codes into numbers makes it easy to compute dollar amounts later. I will replace all missing or unusable codes ("", "-", "?", "+") with 0 and change the letter magnitudes to their powers of ten (H = 2, K = 3, M = 6, B = 9). Lastly, I will add the property and crop damage together.

storm$PROPDMGEXP<-ifelse(storm$PROPDMGEXP %in% c("","-","?","+"),"0",storm$PROPDMGEXP)
storm$PROPDMGEXP<-ifelse(storm$PROPDMGEXP %in% c("H","h"), "2",storm$PROPDMGEXP)
storm$PROPDMGEXP<-ifelse(storm$PROPDMGEXP %in% c("K","k"), "3",storm$PROPDMGEXP)
storm$PROPDMGEXP<-ifelse(storm$PROPDMGEXP %in% c("M","m"), "6",storm$PROPDMGEXP)
storm$PROPDMGEXP<-ifelse(storm$PROPDMGEXP %in% c("B","b"), "9",storm$PROPDMGEXP)

storm$CROPDMGEXP<-ifelse(storm$CROPDMGEXP %in% c("","-","?","+"),"0",storm$CROPDMGEXP)
storm$CROPDMGEXP<-ifelse(storm$CROPDMGEXP %in% c("H","h"), "2",storm$CROPDMGEXP)
storm$CROPDMGEXP<-ifelse(storm$CROPDMGEXP %in% c("K","k"), "3",storm$CROPDMGEXP)
storm$CROPDMGEXP<-ifelse(storm$CROPDMGEXP %in% c("M","m"), "6",storm$CROPDMGEXP)
storm$CROPDMGEXP<-ifelse(storm$CROPDMGEXP %in% c("B","b"), "9",storm$CROPDMGEXP)

storm$PROPDMGEXP<-as.numeric(storm$PROPDMGEXP)
storm$CROPDMGEXP<-as.numeric(storm$CROPDMGEXP)

storm$prop<-storm$PROPDMG * 10^storm$PROPDMGEXP
storm$crop<-storm$CROPDMG * 10^storm$CROPDMGEXP
storm$moneys<-storm$prop + storm$crop
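
To make the recoding above easier to check by hand, here is a minimal sketch of the same idea written as a lookup, run on a small made-up data frame. The helper name exp_to_power and the toy values are assumptions for illustration only and are not part of the original analysis. Unlike the code above, this sketch also sends any unrecognized code to 0 instead of NA.

```r
# Toy data standing in for the storm table (values are made up for illustration).
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10, 3),
  PROPDMGEXP = c("K", "M", "B", "", "h"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a magnitude code to a power of ten.
exp_to_power <- function(code) {
  lookup <- c(H = 2, K = 3, M = 6, B = 9)
  code <- toupper(code)
  # Letter codes come from the lookup table; unknown names yield NA.
  power <- unname(lookup[code])
  # Single-digit codes are used as the exponent directly.
  is_digit <- grepl("^[0-9]$", code)
  power[is_digit] <- as.numeric(code[is_digit])
  # Everything else ("", "-", "?", "+", ...) is treated as 10^0 = 1.
  power[is.na(power)] <- 0
  power
}

toy$prop <- toy$PROPDMG * 10^exp_to_power(toy$PROPDMGEXP)
print(toy)
```

For example, a PROPDMG of 25 with the code "K" becomes 25 * 10^3 = 25,000 USD, and a PROPDMG of 10 with an empty code stays at 10 USD.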

Results

1. Across the United States, which types of events are most harmful with respect to population health?

Here is the code needed to answer the above question.

health<-storm %>%
      group_by(EVTYPE) %>%
      summarise(Total.Health=sum(health))
health<-arrange(health,-Total.Health)
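
As a sanity check on the grouping step, here is a minimal sketch of the same sum-and-sort logic in base R on made-up numbers. The toy data frame is an assumption for illustration, and aggregate() and order() stand in for the dplyr calls used above.

```r
# Toy data standing in for the storm table (values are made up for illustration).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 0, 1, 3, 1),
  INJURIES   = c(10, 4, 5, 0, 2),
  stringsAsFactors = FALSE
)
toy$health <- toy$INJURIES + toy$FATALITIES

# Sum the combined count within each event type
# (base R counterpart of group_by + summarise).
totals <- aggregate(health ~ EVTYPE, data = toy, FUN = sum)
names(totals)[2] <- "Total.Health"

# Sort from most to least harmful, as arrange(health, -Total.Health) does.
totals <- totals[order(-totals$Total.Health), ]
print(totals)
```

In this toy data the two tornado rows contribute 12 + 6 = 18, so TORNADO sorts first, followed by FLOOD (7) and HEAT (3).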

The top 10 most harmful events with respect to population health are listed below in a table and in a plot:

head(health,n=10)
## Source: local data frame [10 x 2]
## 
##               EVTYPE Total.Health
## 1            TORNADO        96979
## 2     EXCESSIVE HEAT         8428
## 3          TSTM WIND         7461
## 4              FLOOD         7259
## 5          LIGHTNING         6046
## 6               HEAT         3037
## 7        FLASH FLOOD         2755
## 8          ICE STORM         2064
## 9  THUNDERSTORM WIND         1621
## 10      WINTER STORM         1527
head.health<-head(health,n=10)
ggplot(data = head.health, aes(x = EVTYPE, y = Total.Health)) + 
      geom_bar(stat = "identity",color="navyblue",fill="darkgray") +
      theme_bw() +
      theme(axis.text.x = element_text(angle = 33, hjust = 1))+
      xlab("Weather Type") +
      ylab("Number of Health Issues") + 
      ggtitle("Top Weather Related Health Issues in the USA")

[Figure: bar chart of the top 10 event types by combined injuries and fatalities]
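
ggplot2 places a character x variable in alphabetical order, so the bars in the plot above do not follow the ranking in the table. A minimal sketch, assuming ranked bars are wanted: turn EVTYPE into a factor whose levels follow the totals. The data frame name head_health is hypothetical, and the values are the first five rows copied from the table above.

```r
# First five rows of the health table above, copied by hand.
head_health <- data.frame(
  EVTYPE       = c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND", "FLOOD", "LIGHTNING"),
  Total.Health = c(96979, 8428, 7461, 7259, 6046),
  stringsAsFactors = FALSE
)

# Set the factor levels in descending order of the total,
# so a discrete axis follows the ranking instead of the alphabet.
head_health$EVTYPE <- factor(
  head_health$EVTYPE,
  levels = head_health$EVTYPE[order(-head_health$Total.Health)]
)
print(levels(head_health$EVTYPE))
```

Passing a data frame prepared this way to ggplot with aes(x = EVTYPE, y = Total.Health) would draw the bars from largest to smallest.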

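Moral of the story: don’t live anywhere near tornadoes! With 96,979 combined injuries and fatalities, tornadoes do more than eleven times the harm of excessive heat, the next event type on the list.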

2. Across the United States, which types of events have the greatest economic consequences?

Here is the code needed to answer the above question.

money<-storm %>%
      group_by(EVTYPE) %>%
      summarise(Total.Money=sum(moneys))
money<-arrange(money,-Total.Money)

Listed below are the top 10 weather events that cause the greatest economic consequences, along with a pretty bar chart:

head(money,n=10)
## Source: local data frame [10 x 2]
## 
##               EVTYPE Total.Money
## 1              FLOOD   1.503e+11
## 2  HURRICANE/TYPHOON   7.191e+10
## 3            TORNADO   5.736e+10
## 4        STORM SURGE   4.332e+10
## 5               HAIL   1.876e+10
## 6        FLASH FLOOD   1.824e+10
## 7            DROUGHT   1.502e+10
## 8          HURRICANE   1.461e+10
## 9        RIVER FLOOD   1.015e+10
## 10         ICE STORM   8.967e+09
head.money<-head(money,n=10)
ggplot(data = head.money, aes(x = EVTYPE, y = Total.Money)) + 
      geom_bar(stat = "identity",color="navyblue",fill="darkgray") +
      theme_bw() +
      theme(axis.text.x = element_text(angle = 33, hjust = 1))+
      xlab("Weather Type") +
      ylab("Cost of Damage ($)") + 
      ggtitle("Top Weather Related Damages ($) in the USA")

[Figure: bar chart of the top 10 event types by combined property and crop damage in USD]
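
The totals in the table above are printed in scientific notation, which is hard to read at a glance. A small sketch that rescales the first five of them to billions of dollars; the vector name total_money is hypothetical, and the values are copied from the table as printed, so they carry its rounding.

```r
# First five totals from the table above, as printed there.
total_money <- c(
  "FLOOD"             = 1.503e11,
  "HURRICANE/TYPHOON" = 7.191e10,
  "TORNADO"           = 5.736e10,
  "STORM SURGE"       = 4.332e10,
  "HAIL"              = 1.876e10
)

# Rescale to billions of USD and round to one decimal place.
billions <- round(total_money / 1e9, 1)
print(billions)
```

Read this way, floods come to roughly 150 billion USD, about twice the figure for the HURRICANE/TYPHOON category.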

Well, now it looks like floods cost the most. Hmm, our next assignment should be determining the safest and cheapest place to live in the United States. Till then, good luck out there!