The impact of severe weather events on the economy and population health in the USA, based on the NOAA Storm Database

1. Synopsis

This analysis explores the National Oceanic and Atmospheric Administration (NOAA) storm database. The database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. This report analyses the data to identify which weather events have the greatest economic and public health impact.

Data and documentation for this report: Storm Data [47Mb] (data) and the National Weather Service Storm Data Documentation (documentation).

2. Data Processing

Load Library

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Load the data. The data file must be in the working directory:

NOAA_data <- read.csv("repdata-data-StormData.csv.bz2")

View the data variables

names(NOAA_data)
##  [1] "STATE__"    "BGN_DATE"   "BGN_TIME"   "TIME_ZONE"  "COUNTY"    
##  [6] "COUNTYNAME" "STATE"      "EVTYPE"     "BGN_RANGE"  "BGN_AZI"   
## [11] "BGN_LOCATI" "END_DATE"   "END_TIME"   "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE"  "END_AZI"    "END_LOCATI" "LENGTH"     "WIDTH"     
## [21] "F"          "MAG"        "FATALITIES" "INJURIES"   "PROPDMG"   
## [26] "PROPDMGEXP" "CROPDMG"    "CROPDMGEXP" "WFO"        "STATEOFFIC"
## [31] "ZONENAMES"  "LATITUDE"   "LONGITUDE"  "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS"    "REFNUM"
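Only seven of these 37 columns are used in the rest of this report. As a minimal sketch, assuming memory is a concern, the data frame could be reduced to those columns right after loading. The toy CSV and the names `tmp`, `keep` and `small` are illustrative and not part of the original analysis.

```r
# Toy stand-in for the storm file (illustrative values only)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(EVTYPE = "FLOOD", FATALITIES = 0, INJURIES = 1,
                     PROPDMG = 2.5, PROPDMGEXP = "K",
                     CROPDMG = 1, CROPDMGEXP = "K", REMARKS = "x"),
          tmp, row.names = FALSE)

# Columns used later in the analysis
keep <- c("EVTYPE", "FATALITIES", "INJURIES",
          "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP")

# Read the file, then keep only the columns of interest
small <- read.csv(tmp, stringsAsFactors = FALSE)[, keep]
names(small)
```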

Next, explore the property damage and crop damage variables. Each observation records a multiplier for its damage value in the columns PROPDMGEXP and CROPDMGEXP. These two columns contain abbreviated multipliers: H (hundred), K (thousand), M (million) and B (billion). The code below converts each abbreviation to its power of ten and sets any unrecognised value to 0.

## Cleaning PROPDMGEXP Column
NOAA_data$PROPDMGEXP <- gsub("[Hh]", "2", NOAA_data$PROPDMGEXP)
NOAA_data$PROPDMGEXP <- gsub("[Kk]", "3", NOAA_data$PROPDMGEXP)
NOAA_data$PROPDMGEXP <- gsub("[Mm]", "6", NOAA_data$PROPDMGEXP)
NOAA_data$PROPDMGEXP <- gsub("[Bb]", "9", NOAA_data$PROPDMGEXP)
NOAA_data$PROPDMGEXP <- gsub("\\+|\\-|\\?\\ ", "0",  NOAA_data$PROPDMGEXP)
NOAA_data$PROPDMGEXP <- as.numeric(NOAA_data$PROPDMGEXP)
## Warning: NAs introduced by coercion
NOAA_data$PROPDMGEXP[is.na(NOAA_data$PROPDMGEXP)] <- 0

## Cleaning CROPDMGEXP Column
NOAA_data$CROPDMGEXP <- gsub("[Hh]", "2", NOAA_data$CROPDMGEXP)
NOAA_data$CROPDMGEXP <- gsub("[Kk]", "3", NOAA_data$CROPDMGEXP)
NOAA_data$CROPDMGEXP <- gsub("[Mm]", "6", NOAA_data$CROPDMGEXP)
NOAA_data$CROPDMGEXP <- gsub("[Bb]", "9", NOAA_data$CROPDMGEXP)
NOAA_data$CROPDMGEXP <- gsub("\\+|\\-|\\?\\ ", "0", NOAA_data$CROPDMGEXP)
NOAA_data$CROPDMGEXP <- as.numeric(NOAA_data$CROPDMGEXP)
## Warning: NAs introduced by coercion
NOAA_data$CROPDMGEXP[is.na(NOAA_data$CROPDMGEXP)] <- 0
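The same recoding can also be written as a lookup table, which makes the mapping from letter to power of ten explicit. This is a minimal sketch on toy input. The helper `exp_to_power` is hypothetical and not part of the original analysis; it is meant to mirror the `gsub()` steps above (letters mapped to 2, 3, 6, 9; single digits kept; everything else set to 0).

```r
# Hypothetical helper: map an exponent code to a power of ten
exp_to_power <- function(code) {
  lookup <- c(H = 2, K = 3, M = 6, B = 9)
  code <- toupper(as.character(code))
  power <- unname(lookup[code])        # NA for codes not in the table
  digit <- grepl("^[0-9]$", code)      # single digits are kept as exponents
  power[digit] <- as.numeric(code[digit])
  power[is.na(power)] <- 0             # blanks and symbols default to 0
  power
}

exp_to_power(c("K", "m", "B", "", "?", "5"))
# 3 6 9 0 0 5
```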

Create two new variables (PROPDMGVAR and CROPDMGVAR) containing the total values of property and crop damage.

## Compute damage in dollars: reported value times 10 to the power of the exponent
NOAA_data <- mutate(NOAA_data,
                    PROPDMGVAR = PROPDMG * (10 ^ PROPDMGEXP),
                    CROPDMGVAR = CROPDMG * (10 ^ CROPDMGEXP))
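To make the arithmetic concrete, here is a small worked example on toy rows. The values are illustrative and not taken from the NOAA data. A reported value of 25 with exponent 3 (originally "K") becomes 25,000 dollars.

```r
# Toy rows (illustrative values only)
toy <- data.frame(PROPDMG = c(25, 2.5, 1), PROPDMGEXP = c(3, 6, 0))

# Same formula as above: value * 10^exponent
toy$PROPDMGVAR <- toy$PROPDMG * 10^toy$PROPDMGEXP
toy$PROPDMGVAR
# 25000 2500000 1
```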

Results

1. Across the United States, which types of events are most harmful with respect to population health?

Summarise the variables (fatalities and injuries) by type of weather event

POP <- summarise(group_by(NOAA_data, EVTYPE), TOTAL_FATALITIES = sum(FATALITIES), TOTAL_INJURIES = sum(INJURIES))

Create a variable of Total Injuries and Fatalities

POP <- mutate(POP, TOTAL_LOSS = TOTAL_FATALITIES + TOTAL_INJURIES)

Output showing (in descending order) total fatalities by weather event type

arrange(POP, desc(TOTAL_FATALITIES))[1:10, 1:2]
## Source: local data frame [10 x 2]
## 
##            EVTYPE TOTAL_FATALITIES
##            (fctr)            (dbl)
## 1         TORNADO             5633
## 2  EXCESSIVE HEAT             1903
## 3     FLASH FLOOD              978
## 4            HEAT              937
## 5       LIGHTNING              816
## 6       TSTM WIND              504
## 7           FLOOD              470
## 8     RIP CURRENT              368
## 9       HIGH WIND              248
## 10      AVALANCHE              224

Output showing (in descending order) total injuries by weather event type

arrange(POP, desc(TOTAL_INJURIES))[1:10, c(1, 3)]
## Source: local data frame [10 x 2]
## 
##               EVTYPE TOTAL_INJURIES
##               (fctr)          (dbl)
## 1            TORNADO          91346
## 2          TSTM WIND           6957
## 3              FLOOD           6789
## 4     EXCESSIVE HEAT           6525
## 5          LIGHTNING           5230
## 6               HEAT           2100
## 7          ICE STORM           1975
## 8        FLASH FLOOD           1777
## 9  THUNDERSTORM WIND           1488
## 10              HAIL           1361

Output showing (in descending order) combined total injuries and fatalities by weather event type

arrange(POP, desc(TOTAL_LOSS))[1:10, c(1, 4)]
## Source: local data frame [10 x 2]
## 
##               EVTYPE TOTAL_LOSS
##               (fctr)      (dbl)
## 1            TORNADO      96979
## 2     EXCESSIVE HEAT       8428
## 3          TSTM WIND       7461
## 4              FLOOD       7259
## 5          LIGHTNING       6046
## 6               HEAT       3037
## 7        FLASH FLOOD       2755
## 8          ICE STORM       2064
## 9  THUNDERSTORM WIND       1621
## 10      WINTER STORM       1527
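The tables above list TSTM WIND and THUNDERSTORM WIND as separate event types, so their totals are split across two rows. The sketch below shows, on toy data, how such labels could be merged before summarising. The helper `normalise_evtype` and its recoding rule are assumptions for illustration; this analysis does not apply them, and base R `tapply()` is used here instead of dplyr to keep the example self-contained.

```r
# Toy data (illustrative values only)
toy <- data.frame(
  EVTYPE   = c("TSTM WIND", "THUNDERSTORM WIND", "tornado", " TORNADO"),
  INJURIES = c(2, 3, 10, 1)
)

# Hypothetical normalisation: trim, upper-case, expand the TSTM abbreviation
normalise_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  sub("^TSTM", "THUNDERSTORM", x)
}

toy$EVTYPE <- normalise_evtype(toy$EVTYPE)

# Sum injuries per cleaned event type
totals <- tapply(toy$INJURIES, toy$EVTYPE, sum)
totals
```

Whether to merge labels this way is a judgement call: it changes the rankings only when the merged categories are large enough to move.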

A Plot showing the total injuries and fatalities

TOTAL_POP <- arrange(POP, desc(TOTAL_LOSS))[1:10, c(1, 4)]
par(mar=c(11,5,1,1))
barplot(height = TOTAL_POP$TOTAL_LOSS, names.arg = TOTAL_POP$EVTYPE, main = 'Fatalities and Injuries', las=2)

2. Across the United States, which types of events have the greatest economic consequences?

Summarise the variables (property and crop damage) by type of weather event

ECON <- summarise(group_by(NOAA_data, EVTYPE), TOTAL_PROPDMG = sum(PROPDMGVAR), TOTAL_CROPDMG = sum(CROPDMGVAR))

Create a variable of total property and crop damage

ECON <- mutate(ECON, TOTAL_ECON_LOSS = TOTAL_PROPDMG + TOTAL_CROPDMG)

Output showing (in descending order) total property damage by weather event type

arrange(ECON, desc(TOTAL_PROPDMG))[1:10, 1:2]
## Source: local data frame [10 x 2]
## 
##               EVTYPE TOTAL_PROPDMG
##               (fctr)         (dbl)
## 1              FLOOD  144657709807
## 2  HURRICANE/TYPHOON   69305840000
## 3            TORNADO   56947380676
## 4        STORM SURGE   43323536000
## 5        FLASH FLOOD   16822673978
## 6               HAIL   15735267513
## 7          HURRICANE   11868319010
## 8     TROPICAL STORM    7703890550
## 9       WINTER STORM    6688497251
## 10         HIGH WIND    5270046295

Output showing (in descending order) total crop damage by weather event type

arrange(ECON, desc(TOTAL_CROPDMG))[1:10, c(1, 3)]
## Source: local data frame [10 x 2]
## 
##               EVTYPE TOTAL_CROPDMG
##               (fctr)         (dbl)
## 1            DROUGHT   13972566000
## 2              FLOOD    5661968450
## 3        RIVER FLOOD    5029459000
## 4          ICE STORM    5022113500
## 5               HAIL    3025954473
## 6          HURRICANE    2741910000
## 7  HURRICANE/TYPHOON    2607872800
## 8        FLASH FLOOD    1421317100
## 9       EXTREME COLD    1292973000
## 10      FROST/FREEZE    1094086000

Output showing (in descending order) total property and crop damage by weather event type

arrange(ECON, desc(TOTAL_ECON_LOSS))[1:10, c(1, 4)]
## Source: local data frame [10 x 2]
## 
##               EVTYPE TOTAL_ECON_LOSS
##               (fctr)           (dbl)
## 1              FLOOD    150319678257
## 2  HURRICANE/TYPHOON     71913712800
## 3            TORNADO     57362333946
## 4        STORM SURGE     43323541000
## 5               HAIL     18761221986
## 6        FLASH FLOOD     18243991078
## 7            DROUGHT     15018672000
## 8          HURRICANE     14610229010
## 9        RIVER FLOOD     10148404500
## 10         ICE STORM      8967041360
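The totals above are in dollars, which makes the larger figures hard to read at a glance. As a minimal sketch, the top three values from the table can be expressed in billions of dollars; the vector `loss` simply copies those three rows by hand.

```r
# Top three TOTAL_ECON_LOSS values, copied from the table above
loss <- c(FLOOD = 150319678257,
          "HURRICANE/TYPHOON" = 71913712800,
          TORNADO = 57362333946)

# Convert dollars to billions of dollars, rounded to one decimal place
loss_bn <- round(loss / 1e9, 1)
loss_bn
# FLOOD 150.3, HURRICANE/TYPHOON 71.9, TORNADO 57.4
```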

A plot showing the total property and crop damage

TOTAL_ECON <- arrange(ECON, desc(TOTAL_ECON_LOSS))[1:10, c(1, 4)]
par(mar=c(11,5,1,1))
barplot(height = TOTAL_ECON$TOTAL_ECON_LOSS, names.arg = TOTAL_ECON$EVTYPE, main = 'Property and Crop Damage', las=2)

Summary of Results

This analysis identifies the weather events with the greatest public health and economic impact:

  1. The weather events most harmful to public health are tornadoes.
  2. The weather events with the most severe economic consequences are floods.