Introduction

The basic goal of this assignment is to explore the NOAA Storm Database and answer some basic questions about severe weather events.

This analysis attempts to answer the following questions:

Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

Across the United States, which types of events have the greatest economic consequences?

Audience

The intended audience for this report is a government or municipal manager who is responsible for preparing for severe weather events and needs to prioritize resources across different types of events.

Recommendations

This report provides an analysis of the data but does not make specific recommendations.

Data Processing

The data is sourced from the NOAA Storm Database data set provided here.

The first step, as always, is to read the data:

# The file is bzip2-compressed, so it is read through a bzfile() connection
noaa <- read.csv(bzfile("repdata-data-StormData.csv.bz2"))
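As a quick check of how compressed input is handled, the following sketch writes a tiny bzip2-compressed CSV to a temporary file and reads it back. The toy rows are invented for illustration and are not taken from the NOAA file.

```r
# Build a tiny bzip2-compressed CSV in a temporary file (toy data, not NOAA's).
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,3", "HAIL,0"), con)
close(con)

# read.csv() decompresses the file through the bzfile() connection.
toy <- read.csv(bzfile(tmp), stringsAsFactors = FALSE)
str(toy)
```

The same call pattern applies to the full storm data file, which simply takes longer to load.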

Harmful to health

To determine which types of events are most harmful to health, we consider the fatalities and injuries caused by each event. The summary below aggregates fatalities per event type:

require(plyr)
## Loading required package: plyr
harmful_index<-ddply(noaa, c("EVTYPE"), summarise, N=length(FATALITIES),
                     TOTAL_FATALITIES=sum(FATALITIES),
                     AVG_FATALITIES=mean(FATALITIES), 
                     MEDIAN_FATALITIES=median(FATALITIES)
                     )

The result is then sorted by the total number of fatalities, in descending order:

harmful_index<-harmful_index[order(-harmful_index$TOTAL_FATALITIES),]
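The summary above aggregates fatalities only. A minimal sketch of how injuries could be included as well, using base R's aggregate() on a toy data frame; the values are invented, and CASUALTIES is a hypothetical combined measure that is not part of the original analysis.

```r
# Toy stand-in for the noaa data frame (values invented for illustration).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HEAT", "HEAT", "HAIL"),
  FATALITIES = c(2, 0, 1, 3, 0),
  INJURIES   = c(10, 4, 0, 2, 1),
  stringsAsFactors = FALSE
)

# Sum both columns per event type.
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# CASUALTIES is a hypothetical combined measure: fatalities plus injuries.
health$CASUALTIES <- health$FATALITIES + health$INJURIES

# Sort so the event type with the most casualties comes first.
health <- health[order(-health$CASUALTIES), ]
print(health)
```

Adding fatalities and injuries with equal weight is a simplification; a manager might reasonably weight fatalities more heavily.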

Cost of Events

To answer the second question, we need the total cost of each event. Each damage figure (PROPDMG, CROPDMG) is multiplied by the magnitude given in its exponent column (PROPDMGEXP, CROPDMGEXP), and the property and crop damage are then added together:

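# Scale a damage value by its exponent code (H = hundreds, K = thousands,
# M = millions, B = billions), in upper or lower case.
# Vectorised with ifelse() so it works on whole columns at once.
calc_exp <- function(x, exp) {
  exp <- toupper(as.character(exp))
  multiplier <- ifelse(exp == "B", 1e9,
                ifelse(exp == "M", 1e6,
                ifelse(exp == "K", 1e3,
                ifelse(exp == "H", 1e2, 1))))
  # Blank or unrecognised codes leave the value unscaled
  x * multiplier
}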

noaa$TOTCOST<-(calc_exp(noaa$PROPDMG, noaa$PROPDMGEXP) + calc_exp(noaa$CROPDMG, noaa$CROPDMGEXP))
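The scaling step can also be written as a lookup table, which makes the arithmetic easy to check on a few values. This is a minimal sketch, assuming H, K, M and B mean hundreds, thousands, millions and billions and that any other code leaves the value unscaled. The names exp_multiplier and scale_damage are illustrative and not part of the original analysis.

```r
# Multipliers for the exponent codes; any other code (blank, digits, symbols)
# is assumed here to mean "no scaling".
exp_multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

scale_damage <- function(x, exp) {
  # Look up each code by name; unknown codes give NA
  m <- exp_multiplier[toupper(as.character(exp))]
  m[is.na(m)] <- 1
  x * unname(m)
}

# 25 K = 25,000; 2.5 M = 2,500,000; 1 B = 1,000,000,000; 7 with no code = 7
scaled <- scale_damage(c(25, 2.5, 1, 7), c("K", "m", "B", ""))
print(scaled)
```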

TOTCOST is then summarised into a total and an average cost per event type:

require(plyr)
cost_index<-ddply(noaa, c("EVTYPE"), summarise, 
                  N=length(TOTCOST), 
                  TOT_COST=sum(TOTCOST),
                  AVG_COST=mean(TOTCOST))

The result is then sorted by total cost, in descending order:

cost_index<-cost_index[order(-cost_index$TOT_COST),]

Results

Most Harmful Events

The assumption is made that the average number of fatalities per event is the best indicator of which types of events are most harmful to health.

The following table lists the 10 event types with the most total fatalities, together with the average and median number of fatalities per event:

harmful_index[1:10,]
##             EVTYPE      N TOTAL_FATALITIES AVG_FATALITIES MEDIAN_FATALITIES
## 834        TORNADO  60652             5633       0.092874                 0
## 130 EXCESSIVE HEAT   1678             1903       1.134088                 0
## 153    FLASH FLOOD  54277              978       0.018019                 0
## 275           HEAT    767              937       1.221643                 0
## 464      LIGHTNING  15754              816       0.051796                 0
## 856      TSTM WIND 219940              504       0.002292                 0
## 170          FLOOD  25326              470       0.018558                 0
## 585    RIP CURRENT    470              368       0.782979                 1
## 359      HIGH WIND  20212              248       0.012270                 0
## 19       AVALANCHE    386              224       0.580311                 0

Or in graphical format:

TOP10<-harmful_index[1:10,]
barplot(TOP10$TOTAL_FATALITIES, 
        main="Total fatalities per event type", 
        xlab="Event type",
        names.arg=TOP10$EVTYPE)

[Figure: bar plot of total fatalities for the 10 event types listed above]
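The stated assumption concerns average fatalities per event, while harmful_index is sorted by total fatalities. A minimal sketch of ranking by the average instead, on a toy summary with invented values. The cut-off min_events is a hypothetical choice, added so that a single rare but deadly event does not dominate the ranking.

```r
# Toy summary in the shape of harmful_index (values invented for illustration).
toy_index <- data.frame(
  EVTYPE           = c("TORNADO", "HEAT", "RARE EVENT", "HAIL"),
  N                = c(1000, 50, 1, 500),
  TOTAL_FATALITIES = c(90, 60, 5, 0),
  stringsAsFactors = FALSE
)
toy_index$AVG_FATALITIES <- toy_index$TOTAL_FATALITIES / toy_index$N

# min_events is a hypothetical cut-off: ignore event types seen fewer times.
min_events <- 10
frequent <- toy_index[toy_index$N >= min_events, ]

# Rank the remaining event types by average fatalities per event.
by_average <- frequent[order(-frequent$AVG_FATALITIES), ]
print(by_average)
```

In the toy data, HEAT ranks above TORNADO by average even though TORNADO has more total fatalities, which mirrors the pattern visible in the table above.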

Economic consequences

The assumption is made that the total cost per event type is the best indicator of which types of events have the largest economic consequences.

cost_index[1:10,]
##                 EVTYPE      N  TOT_COST  AVG_COST
## 834            TORNADO  60652 3.312e+15 5.461e+10
## 153        FLASH FLOOD  54277 1.599e+15 2.947e+10
## 856          TSTM WIND 219940 1.445e+15 6.571e+09
## 244               HAIL 288661 1.268e+15 4.394e+09
## 170              FLOOD  25326 1.068e+15 4.217e+10
## 760  THUNDERSTORM WIND  82563 9.436e+14 1.143e+10
## 464          LIGHTNING  15754 6.069e+14 3.853e+10
## 786 THUNDERSTORM WINDS  20843 4.650e+14 2.231e+10
## 359          HIGH WIND  20212 3.420e+14 1.692e+10
## 972       WINTER STORM  11433 1.347e+14 1.178e+10

This can be graphically summarised as follows:

TOP10COST<-cost_index[1:10,]
barplot(TOP10COST$TOT_COST,
        main="Total cost per event type",
        xlab="Event type",
        names.arg=TOP10COST$EVTYPE)

[Figure: bar plot of total cost for the 10 event types listed above]
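The cost table lists TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS as separate event types, which splits what appears to be one kind of event across three rows. A minimal sketch of merging such labels before aggregating, on toy values. The clean-up rules and the EVTYPE_CLEAN column are hypothetical and would need to be extended for the full data set.

```r
# Toy cost rows using three labels that appear in the cost table.
toy <- data.frame(
  EVTYPE  = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS", "HAIL"),
  TOTCOST = c(100, 200, 50, 300),
  stringsAsFactors = FALSE
)

# Hypothetical clean-up: upper-case, trim, expand TSTM, drop the plural WINDS.
clean <- toupper(trimws(toy$EVTYPE))
clean <- sub("^TSTM", "THUNDERSTORM", clean)
clean <- sub("WINDS$", "WIND", clean)
toy$EVTYPE_CLEAN <- clean

# Aggregate on the cleaned label and sort by total cost, largest first.
merged <- aggregate(TOTCOST ~ EVTYPE_CLEAN, data = toy, FUN = sum)
merged <- merged[order(-merged$TOTCOST), ]
print(merged)
```

In the toy data the three wind labels combine into one row that outranks HAIL, showing how label variants can change a ranking.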