Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This analysis involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The goal of this analysis is to use the NOAA Storm Database to answer two questions about severe weather events:

  1. Across the United States, which types of events are most harmful with respect to population health?
  2. Across the United States, which types of events have the greatest economic consequences?

From these data, we found that tornadoes and heat are the most dangerous event types to people, while flooding, hurricanes, and storm surges are the most costly event types to the economy.

Data Processing

The data for this analysis come from the National Weather Service. Some documentation of the database is also available.

Storm Data FAQ

Download and load data into R

# Download the Storm Data dataset
if(!file.exists("./data")){dir.create("./data")}
destination.file <- "data/stormdata.csv.bz2"
if (!file.exists(destination.file)){
    download.file("http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destination.file, method = "auto")
}
# Load the data in the stormData variable.
stormData <- read.csv(destination.file)
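The load step above passes the compressed file straight to read.csv, which works because R detects bzip2 compression when it opens a file for reading. The following is a minimal sketch of that behaviour on a tiny made-up table; the temporary file and its two rows are illustrative assumptions, not part of the Storm Data set.

```r
# Write a tiny, made-up table to a bzip2-compressed CSV file.
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
write.csv(
  data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(1, 0)),
  con,
  row.names = FALSE
)
close(con)

# read.csv opens the path with file(), which detects the compression
# automatically, so no explicit decompression step is needed.
small <- read.csv(tmp, stringsAsFactors = FALSE)
str(small)
```

Running str() or names() on the full stormData object in the same way is a quick check that the columns used later (EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP) were read as expected.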

Processing

In this section, we address the two questions introduced at the beginning of the report.

  • Across the United States, which types of events are most harmful with respect to population health?

We will focus on fields FATALITIES and INJURIES.

library(plyr)
# Summarize data
fataldata <- arrange(ddply(stormData, .(EVTYPE), summarise, TotalFatalities=sum(FATALITIES)), desc(TotalFatalities))
injurdata <- arrange(ddply(stormData, .(EVTYPE), summarise, TotalInjuries=sum(INJURIES)), desc(TotalInjuries))

# Keep the six event types with the highest totals (head() returns 6 rows by default)
fataldataSmall <- head(fataldata)
injurdataSmall <- head(injurdata)

# Plot data
par(mfcol=c(1,2))
barplot(fataldataSmall$TotalFatalities, names.arg = fataldataSmall$EVTYPE, main = "Fatalities by event type", cex.names = 0.6, las = 2)
barplot(injurdataSmall$TotalInjuries, names.arg = injurdataSmall$EVTYPE, main = "Injuries by event type", cex.names = 0.6, las = 2)
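The summarise, sort, and truncate steps used for the health question can be illustrated on a small example. This is only a sketch: it uses base R's aggregate() instead of the plyr calls above, and the toy data frame is invented for illustration, so the numbers say nothing about the real database.

```r
# Made-up records standing in for stormData (illustration only).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(3, 2, 1, 0, 4),
  INJURIES   = c(10, 0, 5, 2, 1),
  stringsAsFactors = FALSE
)

# Sum both health measures per event type in one pass.
health <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Sort by total fatalities, largest first, and keep the top two rows.
health <- health[order(-health$FATALITIES), ]
top <- head(health, 2)
print(top)
```

Summing both columns in a single call keeps fatalities and injuries side by side, which makes it easier to see when the two rankings disagree.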

  • Across the United States, which types of events have the greatest economic consequences?

We will focus on the fields PROPDMG, PROPDMGEXP, CROPDMG, and CROPDMGEXP. The *EXP fields hold a magnitude code that scales the corresponding damage figure.

# Convert damage figures to dollars using the exponent codes:
# H = hundreds, K = thousands, M = millions, B = billions.
# toupper() makes the match case-insensitive, so lowercase codes such as "k" or "m" are scaled too.
# Any other code (blank, digits, symbols) leaves the value unscaled.
propExp <- toupper(stormData$PROPDMGEXP)
stormData$PROPDMGValue <- ifelse(propExp == "H", stormData$PROPDMG*100, 
                            ifelse(propExp == "K", stormData$PROPDMG*1000, 
                                 ifelse(propExp == "M", stormData$PROPDMG*1e6, 
                                        ifelse(propExp == "B", stormData$PROPDMG*1e9,
                                               stormData$PROPDMG))))

cropExp <- toupper(stormData$CROPDMGEXP)
stormData$CROPDMGValue <- ifelse(cropExp == "H", stormData$CROPDMG*100, 
                            ifelse(cropExp == "K", stormData$CROPDMG*1000, 
                                 ifelse(cropExp == "M", stormData$CROPDMG*1e6, 
                                        ifelse(cropExp == "B", stormData$CROPDMG*1e9,
                                               stormData$CROPDMG))))
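The nested ifelse() calls can also be written as a lookup table, which may be easier to read and extend. The sketch below is one possible alternative, not the method used in this report: exp_multiplier is a hypothetical helper name, the toy data frame is invented, and unrecognised codes are assumed to mean "no scaling", matching the fallback in the code above.

```r
# Hypothetical helper: map an exponent code to a numeric multiplier.
# Codes are matched case-insensitively; anything not in the table
# (blank, digits, "?", "+", ...) is assumed to mean a multiplier of 1.
exp_multiplier <- function(exp_code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  code <- toupper(as.character(exp_code))
  mult <- unname(lookup[code])   # NA where the code is not in the table
  mult[is.na(mult)] <- 1
  mult
}

# Made-up records (illustration only).
toy <- data.frame(
  PROPDMG    = c(2.5, 10, 1, 3, 7),
  PROPDMGEXP = c("K", "m", "B", "", "?"),
  stringsAsFactors = FALSE
)
toy$PROPDMGValue <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

With this shape, supporting another code only requires adding an entry to lookup, and the same helper serves both PROPDMGEXP and CROPDMGEXP.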

# Summarize data
propdmgdata <- arrange(ddply(stormData, .(EVTYPE), summarise, TotalDamage=sum(PROPDMGValue)), desc(TotalDamage))
cropdmgdata <- arrange(ddply(stormData, .(EVTYPE), summarise, TotalDamage=sum(CROPDMGValue)), desc(TotalDamage))

# Keep the six event types with the highest totals (head() returns 6 rows by default)
propdmgdataSmall <- head(propdmgdata)
cropdmgdataSmall <- head(cropdmgdata)

# Plot data
par(mfcol=c(1,2))
barplot(propdmgdataSmall$TotalDamage/1e6, names.arg = propdmgdataSmall$EVTYPE,main = "The costs of property damage\nmillion $",cex.axis=0.8,cex.names=0.5,las=2)
barplot(cropdmgdataSmall$TotalDamage/1e6, names.arg = cropdmgdataSmall$EVTYPE,main = "The costs of damage to crops\nmillion $",cex.axis=0.8,cex.names=0.5,las=2)
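Property and crop damage are ranked separately above. Since the question asks about overall economic consequences, one could also rank event types by the sum of the two. The sketch below shows that idea on invented numbers; TotalDMGValue is a hypothetical column name, and the toy figures are not results from the database.

```r
# Made-up per-event damage values in dollars (illustration only).
toy <- data.frame(
  EVTYPE       = c("FLOOD", "DROUGHT", "FLOOD", "HURRICANE"),
  PROPDMGValue = c(5e9, 1e8, 2e9, 4e9),
  CROPDMGValue = c(1e8, 3e9, 0, 5e8),
  stringsAsFactors = FALSE
)

# Combined damage per record, then summed per event type.
toy$TotalDMGValue <- toy$PROPDMGValue + toy$CROPDMGValue
totals <- aggregate(TotalDMGValue ~ EVTYPE, data = toy, FUN = sum)

# Largest combined damage first.
totals <- totals[order(-totals$TotalDMGValue), ]
print(totals)
```

An event type can lead one of the separate rankings and still fall behind on the combined total, so the two views can give different answers.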

Results

Weather event causing the most fatalities - TORNADO

Weather event causing the most injuries - TORNADO

Weather event causing the most property damage - FLOODS

Weather event causing the most crop damage - DROUGHT