1. Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

2. Data Processing

2.1 Data

The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. The file can be downloaded from the course web site.

There is also some documentation of the database available. Here you will find how some of the variables are constructed/defined.

2.2 Loading data

The following packages will be needed in the course of this analysis:

library(ggplot2)
library(grid)
library(gridExtra)
## Warning: package 'gridExtra' was built under R version 3.3.2

We load our data.

stormData <- read.csv(bzfile("repdata-data-StormData.csv.bz2"), header = TRUE)

This dataset contains a lot of information, most of which is not required for the present study. The code below extracts only the columns needed to analyse the health and economic impact of weather events.

stormDataSubset <- stormData[,c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")]

Now we will prepare the data for investigating the effects of storms and other severe weather events on population health and their economic consequences.

2.3 Processing

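First, we replace all missing values with 0. The numeric columns are tested with is.na(), because comparing a numeric column against "" never matches.

# Replace missing (NA) counts and damage amounts with 0
stormDataSubset$FATALITIES[is.na(stormDataSubset$FATALITIES)] <- 0
stormDataSubset$INJURIES[is.na(stormDataSubset$INJURIES)] <- 0
stormDataSubset$PROPDMG[is.na(stormDataSubset$PROPDMG)] <- 0
stormDataSubset$CROPDMG[is.na(stormDataSubset$CROPDMG)] <- 0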

stormDataSubset$PROPDMGEXP <- as.character(stormDataSubset$PROPDMGEXP)
stormDataSubset$CROPDMGEXP <- as.character(stormDataSubset$CROPDMGEXP)
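
Before replacing anything, it can help to count how many values are actually missing. The following is a minimal sketch on a made-up data frame; `toy` and its values are hypothetical and are not taken from the NOAA data.

```r
# Hypothetical toy data; not taken from the storm database.
toy <- data.frame(FATALITIES = c(0, NA, 2),
                  INJURIES   = c(1, 5, NA),
                  PROPDMG    = c(25, 0, 1.5))

# Count missing values per column.
naCounts <- colSums(is.na(toy))
print(naCounts)

# Replace every NA in the data frame with 0 in a single step.
toy[is.na(toy)] <- 0
```

The same `colSums(is.na(...))` call could be applied to the numeric columns of stormDataSubset to confirm whether any replacement is needed at all.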

The property and crop damage exponent columns contain a mix of letters, digits and symbols, all of which must be converted to numeric values.

unique(stormDataSubset$PROPDMGEXP)
##  [1] "K" "M" ""  "B" "m" "+" "0" "5" "6" "?" "4" "2" "3" "h" "7" "H" "-"
## [18] "1" "8"
unique(stormDataSubset$CROPDMGEXP)
## [1] ""  "M" "K" "m" "B" "?" "0" "k" "2"

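The function below maps each exponent code to a power of ten: h/H to 2, k/K to 3, m/M to 6, b/B to 9, and a digit to itself. An empty string maps to 0 (a multiplier of 1), the symbols "-", "?" and "+" map to 1 (a multiplier of 10), and any other value raises an error.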

getExp <- function(e) {
    if (e %in% c("h", "H"))
        return(2)
    else if (e %in% c("k", "K"))
        return(3)
    else if (e %in% c("m", "M"))
        return(6)
    else if (e %in% c("b", "B"))
        return(9)
    else if (e %in% c("0","1","2","3","4","5","6","7","8","9")) 
        return(as.numeric(e))
    else if (e %in% c("-", "?", "+"))
        return(1)
    else if (e %in% c(""))
        return(0)
    else {
        stop("Invalid value.")
    }
}
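
The same mapping could also be expressed as a named lookup vector, which works on a whole column at once instead of one value at a time. This is only a sketch of an alternative, not the approach used in this analysis; the names `expLookup` and `lookupExp` are hypothetical.

```r
# Hypothetical alternative to getExp(): a named lookup vector.
expLookup <- c(h = 2, H = 2, k = 3, K = 3, m = 6, M = 6, b = 9, B = 9,
               "-" = 1, "?" = 1, "+" = 1)
# Digits map to themselves.
expLookup[as.character(0:9)] <- 0:9

lookupExp <- function(e) {
    # Look up every code in one step; unknown codes come back as NA.
    out <- unname(expLookup[e])
    # An empty string cannot be used as a name, so handle it separately.
    out[e == ""] <- 0
    if (anyNA(out)) stop("Invalid value.")
    out
}

# Toy input covering each kind of code.
exps <- lookupExp(c("K", "m", "", "5", "+"))
print(exps)

# A recorded damage of 25 with exponent "K" is 25 * 10^3 dollars.
damage <- 25 * 10^lookupExp("K")
```

A lookup vector avoids calling a function once per row, which may matter on a dataset of this size, but it should give the same exponents as getExp for the codes listed above.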

Multiply each damage value by 10 raised to its exponent, and store the results in new columns.

propExp <- sapply(stormDataSubset$PROPDMGEXP, FUN=getExp)
stormDataSubset$propDamage <- stormDataSubset$PROPDMG * (10 ** propExp)

cropExp <- sapply(stormDataSubset$CROPDMGEXP, FUN=getExp)
stormDataSubset$cropDamage <- stormDataSubset$CROPDMG * (10 ** cropExp)

Summarise the data by event type, keep only the event types with non-zero property or crop damage, then sort by property damage, crop damage, total damage, fatalities and injuries.

stormDataRed <- stormDataSubset[,c("EVTYPE", "FATALITIES", "INJURIES", "propDamage", "cropDamage")]
stormDataRedSum <- aggregate(. ~ EVTYPE ,data = stormDataRed, FUN=sum)
stormDataRedSum <- stormDataRedSum[(stormDataRedSum$propDamage > 0 | stormDataRedSum$cropDamage > 0), ]
stormDataRedSum$totalDamage <- stormDataRedSum$propDamage + stormDataRedSum$cropDamage

propDmgSorted <- stormDataRedSum[order(stormDataRedSum$propDamage, decreasing = T), ]
cropDmgSorted <- stormDataRedSum[order(stormDataRedSum$cropDamage, decreasing = T), ]
totalDmgSorted <- stormDataRedSum[order(stormDataRedSum$totalDamage, decreasing = T), ]

fatalSorted <- stormDataRedSum[order(stormDataRedSum$FATALITIES, decreasing = T), ]
injurySorted <- stormDataRedSum[order(stormDataRedSum$INJURIES, decreasing = T), ]
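
To illustrate what the aggregate and order steps do, here is a small sketch on made-up data. The data frame `toy` and its values are hypothetical.

```r
# Hypothetical toy data; not taken from the storm database.
toy <- data.frame(EVTYPE     = c("FLOOD", "TORNADO", "FLOOD", "HAIL"),
                  FATALITIES = c(1, 3, 0, 0),
                  propDamage = c(1000, 500, 2500, 0))

# Sum every numeric column within each event type.
toySum <- aggregate(. ~ EVTYPE, data = toy, FUN = sum)

# Sort the per-event totals by property damage, largest first.
toySorted <- toySum[order(toySum$propDamage, decreasing = TRUE), ]
print(toySorted)
```

The two FLOOD rows collapse into a single row with a property damage of 3500, which then sorts ahead of TORNADO and HAIL.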

Now that the data has been properly processed, the results of the analysis can be presented.

3. Results

3.1. Effects on population health

Top 10 event types with the greatest effect on population health.

p1 <- ggplot(data=head(injurySorted,10), aes(x=reorder(EVTYPE, INJURIES), y=INJURIES)) +
   geom_bar(fill="olivedrab",stat="identity")  + coord_flip() + 
    ylab("Total number of injuries") + xlab("Event type") +
    ggtitle("Health impact of weather events in the US - Top 10
            ") +
    theme(legend.position="none")

p2 <- ggplot(data=head(fatalSorted,10), aes(x=reorder(EVTYPE, FATALITIES), y=FATALITIES)) +
    geom_bar(fill="red4",stat="identity") + coord_flip() +
    ylab("Total number of fatalities") + xlab("Event type") +
    ggtitle("                                                 ") +
    theme(legend.position="none")

grid.arrange(p1, p2, ncol=1, nrow = 2, heights  = c(0.55, 0.45))

3.2. Economic Consequences

Top 10 event types with the greatest economic impact.

p1 <- ggplot(data=head(totalDmgSorted,10), 
             aes(x=reorder(EVTYPE, totalDamage), y = totalDamage / (10^9), fill = totalDamage)) +
    geom_bar(fill="blue", stat="identity") + coord_flip() + 
    xlab("Event type") + ylab("Total damage in dollars ($ Billions)") +
    ggtitle("Economic impact of weather events in the US - Top 10") +
    theme(legend.position="none")

grid.arrange(p1, heights=0.55)

3.3 Summary

As the plots above show, tornadoes are the most harmful event type with respect to population health, and floods have the greatest economic consequences.

head(fatalSorted,1)
##      EVTYPE FATALITIES INJURIES  propDamage cropDamage totalDamage
## 834 TORNADO       5633    91346 56947381217  414953270 57362334487
head(injurySorted,1)
##      EVTYPE FATALITIES INJURIES  propDamage cropDamage totalDamage
## 834 TORNADO       5633    91346 56947381217  414953270 57362334487
head(totalDmgSorted,1)
##     EVTYPE FATALITIES INJURIES   propDamage cropDamage  totalDamage
## 170  FLOOD        470     6789 144657709807 5661968450 150319678257