NOAA Storm Data Analysis

Synopsis

url <- 'https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2'

This report analyzes NOAA Storm Data, which covers 1950 to 2011, to determine the impact of different event types on population health and on the economy. Only records from 1995 onward are used, because earlier records are less complete.

Data Processing

zipfile <- 'repdata-data-StormData.csv.bz2'
csvfile <- 'repdata-data-StormData.csv'

Download Data

The code below downloads the data file and unzips it into a CSV file if the CSV file is not already present.

Alternatively, this can be done by hand:

1. Download the data from the URL above to create the local file repdata-data-StormData.csv.bz2.
2. Unzip repdata-data-StormData.csv.bz2 to create repdata-data-StormData.csv.

if (!file.exists(csvfile)) {
  if (!file.exists(zipfile)) {
    download.file(url, destfile=zipfile)
  }
  library(R.utils, quietly=TRUE)
  bunzip2(zipfile, csvfile)
}

if (!file.exists(csvfile)) {
  stop("Error in downloading/unzipping data file")
}
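
As an alternative to unzipping, base R's `read.csv` can usually read a bzip2-compressed file directly, because the underlying file connection detects the compression when opened for reading. The following is a minimal sketch on a small temporary file; the toy data frame and the temporary path are illustrative assumptions, not part of the report's data.

```r
# Toy data frame standing in for the storm data (illustrative only)
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))

# Write it through a bzip2 connection to a temporary file
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# Read the compressed file back without unzipping it first
toy.back <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)
```

If this holds for the storm data file as well, the R.utils step could be skipped, at the cost of decompressing on every read.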

Preview of data

Records are more complete from 1995 onward, so all records before 1995 are filtered out:

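library(lubridate)
noaa <- read.csv(csvfile, header=TRUE)
# Extract the year from the event start date (month/day/year plus a time part)
noaa$year <- year(mdy_hms(noaa$BGN_DATE))
# Keep only events from 1995 onward
noaa <- noaa[noaa$year >= 1995,]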
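
The year extraction and filter can also be sketched in base R without lubridate. The sample strings below are assumptions about the layout of BGN_DATE (month/day/year followed by a time), inferred from the use of `mdy_hms`; they are not rows copied from the data.

```r
# Sample start dates in the assumed month/day/year layout (illustrative)
bgn <- c("4/18/1950 0:00:00", "1/6/1996 0:00:00", "12/31/2011 0:00:00")

# as.Date parses only as far as the format requires; the trailing time is ignored
dates <- as.Date(bgn, format = "%m/%d/%Y")
years <- as.integer(format(dates, "%Y"))

# Keep only entries from 1995 onward, as in the filter on the full data
recent <- bgn[years >= 1995]
```

Parsing only the date part is enough here because the analysis uses the year alone.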

More information

More information is available in the National Weather Service Storm Data documentation and in the FAQ on storm events.

Population Health Impact

Population health impact includes injuries and fatalities. The following figure shows combined injuries and fatalities for the five event types with the highest totals:

noaa$popHealth <- noaa$FATALITIES + noaa$INJURIES

popHealth.event <- aggregate(popHealth ~ EVTYPE, noaa, sum)
popHealth.event <- popHealth.event[order(-popHealth.event$popHealth),]
popHealth.event.top <- popHealth.event[1:5,]
with(popHealth.event.top,
     barplot(popHealth, names.arg=EVTYPE, 
             xlab="Event Type",
             ylab="Injuries + Fatalities",
             main="Top 5 Population Health vs Event Types",
             ylim=c(0,30000)
             ))

popHealth.event.top
##             EVTYPE popHealth
## 666        TORNADO     23310
## 112 EXCESSIVE HEAT      8428
## 144          FLOOD      7192
## 358      LIGHTNING      5360
## 683      TSTM WIND      3871
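
The aggregate, sort, and top-N steps can be checked on a small hand-made data frame where the totals are easy to verify by eye. The toy values below are invented for illustration and do not come from the storm data.

```r
# Toy events (invented values, illustrative only)
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(2, 0, 1, 0, 3),
  INJURIES   = c(10, 4, 5, 1, 0),
  stringsAsFactors = FALSE
)

# Combined health impact per row, then summed per event type
toy$popHealth <- toy$FATALITIES + toy$INJURIES
by.event <- aggregate(popHealth ~ EVTYPE, toy, sum)

# Sort descending and keep the top two event types
by.event <- by.event[order(-by.event$popHealth), ]
top.two <- head(by.event, 2)
```

Using `head` rather than an explicit row range avoids rows of NA when there are fewer event types than requested.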

The top 5 events as a proportion of total population health impact:

top.five <- sum(popHealth.event.top$popHealth)
total <- sum(popHealth.event$popHealth)
top.five / total
## [1] 0.6626627
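
To see how the share changes with the number of event types included, a cumulative share over the sorted totals can be computed in one step. This is a sketch on invented totals; the names and values are illustrative only.

```r
# Invented per-event totals (illustrative only)
impact <- c(TORNADO = 18, FLOOD = 7, HAIL = 1, LIGHTNING = 4)

# Sort descending, then divide the running total by the grand total
sorted <- sort(impact, decreasing = TRUE)
cum.share <- cumsum(sorted) / sum(sorted)
```

The N-th element of `cum.share` is the share of the total covered by the top N event types.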

Economic Impact

Economic impact includes property damage and crop damage. The following figure shows combined damage for the five event types with the highest totals:

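# Convert damage amounts and their exponent codes to dollar amounts.
# Only "K" (thousands) and "M" (millions) are scaled; any other code,
# including lowercase letters, leaves the amount unchanged.
normalizeExp <- function(nums, exps) {
  f <- function(n, e) {
    if (n == 0 || is.na(e)) {
      n
    } else if (e == "M") {
      n * 1000 * 1000
    } else if (e == "K") {
      n * 1000
    } else {
      n
    }
  }
  # Apply f element-wise over the amount and exponent columns
  mapply(f, nums, exps)
}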

noaa$econImp <- normalizeExp(noaa$PROPDMG, noaa$PROPDMGEXP) +
  normalizeExp(noaa$CROPDMG, noaa$CROPDMGEXP)

econImp.event <- aggregate(econImp ~ EVTYPE, noaa, sum)
econImp.event <- econImp.event[order(-econImp.event$econImp),]
econImp.event.top <- econImp.event[1:5,]
with(econImp.event.top,
     barplot(econImp / (1000 * 1000 * 1000), names.arg=EVTYPE, 
             xlab="Event Type",
             ylab="Property Damage + Crop Damage in Billions",
             main="Top 5 Economic Impact vs Event Types",
             ylim=c(0, 30)))

econImp.event.top
##          EVTYPE     econImp
## 144       FLOOD 26944847580
## 666     TORNADO 19911815433
## 206        HAIL 15854599064
## 134 FLASH FLOOD 15709847661
## 84      DROUGHT 13468172002
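
The exponent conversion can also be written as a vectorized lookup, which avoids calling a function once per row. The sketch below is a hypothetical variant, not the function used for the figures above: `normalizeExpVec` is an illustrative name, and treating "B" as billions and lowercase letters as their uppercase equivalents are assumptions about the data that would change the reported totals if applied.

```r
# Hypothetical vectorized variant of the exponent conversion.
# Assumptions: "B" means billions, and codes are case-insensitive.
normalizeExpVec <- function(nums, exps) {
  mult <- c(K = 1e3, M = 1e6, B = 1e9)
  # Look up a multiplier per row; unknown or missing codes yield NA
  m <- mult[toupper(as.character(exps))]
  # Leave amounts with unknown or missing codes unscaled
  m[is.na(m)] <- 1
  unname(nums * m)
}

# Example: 25 thousand, 2.5 million, 1 billion, and an unscaled 10
scaled <- normalizeExpVec(c(25, 2.5, 1, 10), c("K", "m", "B", ""))
```

Before swapping this in, it would be worth tabulating the distinct exponent codes in the data to confirm which ones actually occur.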

The top 5 events as a proportion of total economic impact:

top.five <- sum(econImp.event.top$econImp)
total <- sum(econImp.event$econImp)
top.five / total
## [1] 0.6278942

Results

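Among all events recorded since 1995, tornadoes have the highest cumulative population health impact, with 23,310 combined injuries and fatalities. Floods have the highest economic impact, with about 26.9 billion in combined property and crop damage.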