Synopsis

The purpose of this report is to determine which types of severe weather events are most harmful to population health and which are most costly economically. It uses data from the NOAA Storm Database covering 1950 through November 2011. Over this period, “Tornadoes, TSTM Wind, Hail” events had the highest mean number of fatalities per recorded event, while “Heat Wave” events had the highest mean number of injuries per recorded event. “Tornadoes, TSTM Wind, Hail” events also had the highest mean property damage per recorded event.

Data Processing

The data set was downloaded from the Reproducible Research course website on January 3, 2017 at 9:55 AM EST. The raw data file was a CSV file compressed using the bzip2 algorithm.
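The download step itself is not shown in this report. A minimal sketch of how it could be scripted follows; the helper name `fetch_if_missing` is hypothetical and the URL in the commented call is a placeholder, not the course's actual address. The helper skips the download when a local copy already exists and returns the file's modification time, which can serve as a record of when the data were obtained.

```r
# Hypothetical helper: download `url` to `destfile` only when the file is
# not already present, and return the time the local copy was last modified.
fetch_if_missing <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps the compressed file byte-for-byte intact on Windows
    download.file(url, destfile, mode = "wb")
  }
  file.info(destfile)$mtime
}

# Example call; the URL below is a placeholder (an assumption).
# fetch_if_missing("https://example.org/StormData.csv.bz2",
#                  "repdata%2Fdata%2FStormData.csv.bz2")
```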

Loading the data

storms <- read.csv("repdata%2Fdata%2FStormData.csv.bz2")
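No separate decompression step is needed because R's file connections detect bzip2 compression when reading. The sketch below illustrates this on a tiny, made-up file (the rows are invented and the name `sample_storms` is hypothetical): it writes two records through a bzip2 connection and reads them back with a plain `read.csv` call.

```r
# Write a tiny, made-up CSV through a bzip2 connection ...
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,2", "HAIL,0"), con)
close(con)

# ... and read it back without decompressing it first.
sample_storms <- read.csv(tmp)
str(sample_storms)
```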

Results

Analysis of which types of events are most harmful to population health

Both fatalities and injuries are reported in the NOAA database. We compute the mean number of fatalities and the mean number of injuries per recorded event for each event type.

fatalities <- aggregate(FATALITIES ~ EVTYPE, data = storms, mean)
injuries <- aggregate(INJURIES ~ EVTYPE, data = storms, mean)
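To make the formula interface of `aggregate` concrete, here is a small sketch on invented data (the `toy` data frame is hypothetical and not taken from the NOAA file). Each event type is collapsed to one row holding the mean of its records, so the two TORNADO rows with 2 and 0 fatalities yield a mean of 1.

```r
# Toy data, invented for illustration only (not taken from the NOAA file).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HAIL", "HAIL", "HEAT"),
  FATALITIES = c(2, 0, 0, 0, 3)
)

# One row per event type, holding the mean of FATALITIES within that type.
toy_means <- aggregate(FATALITIES ~ EVTYPE, data = toy, mean)
print(toy_means)
```

Note that a type with a single record, such as HEAT here, reports that one record's value as its mean.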

From the output of the following code, we see that most of the 985 event types had a mean of 0: 817 for fatalities and 827 for injuries.

dim(fatalities)
## [1] 985   2
sum(fatalities$FATALITIES == 0)
## [1] 817
dim(injuries)
## [1] 985   2
sum(injuries$INJURIES == 0)
## [1] 827

Since we are interested in the events that cause the most harm to human health, we will plot the 20 event types with the highest mean number of fatalities.

topfatal <- fatalities[order(-fatalities$FATALITIES)[1:20],]
par(mar = c(11,5,4,4))
barplot(topfatal$FATALITIES, names.arg = topfatal$EVTYPE, las = 2, cex.names = 0.65, ylab = "Average Number of Fatalities")

Similarly, we will plot the 20 event types with the highest mean number of injuries.

topinjure <- injuries[order(-injuries$INJURIES)[1:20],]
par(mar = c(11,5,4,4))
barplot(topinjure$INJURIES, names.arg = topinjure$EVTYPE, las = 2, cex.names = 0.65, ylab = "Average Number of Injuries")
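The same order-and-subset pattern is used for both plots, so it could be factored into a small function. The sketch below is one possible form; the name `top_n_rows` and the toy data are hypothetical. Unlike indexing with `[1:20]`, `head` does not pad the result with `NA` rows when fewer than `n` rows are available.

```r
# Hypothetical helper: return the n rows of `df` with the largest values
# in the column named `col`, sorted from largest to smallest.
top_n_rows <- function(df, col, n = 20) {
  ranked <- df[order(df[[col]], decreasing = TRUE), , drop = FALSE]
  head(ranked, n)
}

# Made-up values, for illustration only.
toy <- data.frame(EVTYPE = c("A", "B", "C"), INJURIES = c(1.5, 4, 0))
top_n_rows(toy, "INJURIES", n = 2)
```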

From these plots, we see that, on average, the most fatalities resulted from “Tornadoes, TSTM Wind, Hail” events while the most injuries resulted from “Heat Wave” events.

Analysis of which types of events are most economically costly

We will use the property damage data to determine which events are most economically costly. The damage is stored in two columns: PROPDMG holds a number, and PROPDMGEXP holds either a numeric exponent or a letter code for the magnitude (for example, B denotes billions of dollars). The following code combines the two into a single column.

# Translate the exponent codes into powers of ten. Letter codes are matched
# case-insensitively; digits are used as literal exponents. Any other code
# (blank, "+", "-", "?") is assumed to mean no scaling (10^0).
expcode <- toupper(as.character(storms$PROPDMGEXP))
lettermap <- c(H = 2, K = 3, M = 6, B = 9)
storms$newexp <- 0
isletter <- expcode %in% names(lettermap)
storms$newexp[isletter] <- lettermap[expcode[isletter]]
isdigit <- grepl("^[0-9]$", expcode)
storms$newexp[isdigit] <- as.numeric(expcode[isdigit])

storms$damage <- storms$PROPDMG * 10^storms$newexp
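As a check on the arithmetic, the conversion can be tried on a few made-up values. The sketch below wraps the mapping in a hypothetical function, `exp_to_power`; treating blank or symbol codes as 10^0 is an assumption, since the text above only describes letters and numeric exponents. With these inputs, 2.5 with code "B" becomes 2.5e9, 10 with "k" becomes 1e4, 3 with "5" becomes 3e5, and 7 with a blank code stays 7.

```r
# Hypothetical helper: turn PROPDMGEXP-style codes into powers of ten.
# Assumptions: letters are case-insensitive magnitude codes, single digits
# are literal exponents, and anything else (blank, "+", "-", "?") means 10^0.
exp_to_power <- function(code) {
  code <- toupper(as.character(code))
  letters_map <- c(H = 2, K = 3, M = 6, B = 9)
  power <- rep(0, length(code))
  is_letter <- code %in% names(letters_map)
  power[is_letter] <- letters_map[code[is_letter]]
  is_digit <- grepl("^[0-9]$", code)
  power[is_digit] <- as.numeric(code[is_digit])
  power
}

# Made-up values, for illustration only.
amount <- c(2.5, 10, 3, 7)
code   <- c("B", "k", "5", "")
amount * 10^exp_to_power(code)
```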

Next, we will determine the average property damage by event type.

avgdamage <- aggregate(damage ~ EVTYPE, data = storms, mean)

Since we are interested in events that cause the most damage, we will plot the top 20 average values for property damage.

topdamage <- avgdamage[order(-avgdamage$damage)[1:20],]
par(mar = c(11,11,4,4))
barplot(topdamage$damage, names.arg = topdamage$EVTYPE, las = 2, cex.axis = 0.65, cex.names = 0.65, cex.lab = 0.75, ylab = "Average Property Damage (dollars)")

From the graph, we see that, on average, the event type “Tornadoes, TSTM Wind, Hail” caused the most property damage.