Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
The purpose of the analysis is to find out which types of weather events are most harmful to human health and which cause the greatest economic damage to property and crops. The analysis covers events recorded in the USA from 1950 to 2011.
wd <- "C:/Users/saavedra/Google Drive/FPI-GD/Cursos/Reproducible Research - COURSERA/assignment/Peer Assessment 2"
# Download file
if (!file.exists(paste(wd, "/repdata-data-StormData.csv.bz2", sep=""))) {
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", paste(wd, "/repdata-data-StormData.csv.bz2", sep=""))
}
# Unzip file
if (!file.exists(paste(wd, "/repdata-data-StormData.csv", sep=""))) {
library(R.utils)
bunzip2(paste(wd, "/repdata-data-StormData.csv.bz2", sep=""), remove = FALSE)
}
# Load data
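# Read from the same folder used for the download, not the current working directory
data <- read.csv(file.path(wd, "repdata-data-StormData.csv"))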
names(data)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
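The separate unzip step is optional in practice, because base R's `read.csv` opens bzip2-compressed files transparently. A minimal sketch on a small temporary file (the table and file name here are made up for illustration and are not part of the storm data):

```r
# Illustrative only: a tiny made-up table written to a temporary .bz2 file.
toy <- data.frame(EVTYPE = c("TORNADO", "FLOOD"), FATALITIES = c(3, 1))
tmp <- tempfile(fileext = ".csv.bz2")

con <- bzfile(tmp, open = "w")   # bzfile() writes a bzip2-compressed stream
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv() detects the compression itself, so no bunzip2 step is needed here.
toy_back <- read.csv(tmp, stringsAsFactors = FALSE)
print(toy_back)
```

Reading the compressed file directly avoids keeping a second, much larger uncompressed copy on disk, at the cost of decompressing on every load.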
# Select data of health and economic impact
mydata <- data[,c("EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP", "CROPDMG","CROPDMGEXP")]
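# Recode PROPDMGEXP units (0 for invalid exponent codes such as '', '-', '?', '+')
library(car)
mydata$PROPDMGEXP <- recode(mydata$PROPDMGEXP, "''=0; '-'=0; '?'=0; '+'=0; 0=1; 1=10; 2=100; 3=1000; 4=1e4; 5=1e5; 6=1e6; 7=1e7; 8=1e8; 'B'=1e9; 'h'=1e2; 'H'=1e2; 'K'=1e3; 'm'=1e6; 'M'=1e6")
# Go through as.character() so a factor result yields the multipliers, not the level codes
mydata$PROPDMGEXP <- as.numeric(as.character(mydata$PROPDMGEXP))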
mydata$PROP <- mydata$PROPDMG * mydata$PROPDMGEXP
# Recode CROPDMGEXP units (0 for invalid exponent codes)
mydata$CROPDMGEXP <- recode(mydata$CROPDMGEXP, "''=0; '?'=0; 0=1; 1=10; 2=100; 'B'=1e9; 'k'=1e3; 'K'=1e3; 'm'=1e6; 'M'=1e6")
# Go through as.character() so a factor result yields the multipliers, not the level codes
mydata$CROPDMGEXP <- as.numeric(as.character(mydata$CROPDMGEXP))
mydata$CROP <- mydata$CROPDMG * mydata$CROPDMGEXP
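The exponent columns encode a multiplier: a digit `n` stands for 10^n, and the letters H, K, M and B stand for hundreds, thousands, millions and billions. As a hedged alternative to the recode strings, the same mapping can be sketched in base R with a lookup vector. The function name `exp_to_multiplier` is hypothetical and not part of the original analysis, and treating every unrecognized code as 0 is an assumption carried over from the recoding above:

```r
# Hypothetical helper (not in the original analysis): exponent code -> multiplier.
exp_to_multiplier <- function(code) {
  lookup <- c("H" = 1e2, "K" = 1e3, "M" = 1e6, "B" = 1e9)
  code <- toupper(trimws(as.character(code)))   # works for factors and strings
  mult <- rep(0, length(code))                  # unknown or empty codes count as 0

  is_digit <- grepl("^[0-9]$", code)
  mult[is_digit] <- 10^as.numeric(code[is_digit])   # "0" -> 1, "1" -> 10, "2" -> 100

  is_letter <- code %in% names(lookup)
  mult[is_letter] <- unname(lookup[code[is_letter]])
  mult
}

# Made-up codes for illustration
multipliers <- exp_to_multiplier(c("K", "m", "B", "", "?", "0", "2"))
print(multipliers)
```

Because the function converts its input with `as.character()` first, it gives the same result whether the column was read as a factor or as character strings.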
# Aggregate the data by event
fatal <- aggregate(FATALITIES ~ EVTYPE, data = mydata, FUN = sum)
injury <- aggregate(INJURIES ~ EVTYPE, data = mydata, FUN = sum)
prop <- aggregate(PROP ~ EVTYPE, data = mydata, FUN = sum)
crop <- aggregate(CROP ~ EVTYPE, data = mydata, FUN = sum)
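To make the aggregation step concrete, here is a small worked example on made-up records (the numbers are illustrative only and do not come from the NOAA data). It sums fatalities per event type with `aggregate` and then sorts in decreasing order to pick the top entries, which is the same pattern used for the rankings in this report:

```r
# Made-up records for illustration only.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(5, 2, 7, 4, 1)
)

# One row per event type, with fatalities summed within each type
toy_fatal <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# Sort by decreasing total and keep the two largest
toy_top2 <- head(toy_fatal[order(-toy_fatal$FATALITIES), ], 2)
print(toy_top2)
```

Note that `aggregate` groups on the exact `EVTYPE` string, so spelling or capitalization variants of the same event are counted as separate types.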
Tornadoes proved to be the most dangerous events for human health: they account for the largest number of both fatalities and injuries (Fig. 1).
# get top10 event with highest fatalities
fatal10 <- fatal[order(-fatal$FATALITIES), ][1:10, ]
# get top10 event with highest injuries
injury10 <- injury[order(-injury$INJURIES), ][1:10, ]
# Plotting
par(mfrow = c(1, 2), mar = c(12, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.8)
barplot(fatal10$FATALITIES, las = 3, names.arg = fatal10$EVTYPE, main = "",
ylab = "number of fatalities", col = "grey")
barplot(injury10$INJURIES, las = 3, names.arg = injury10$EVTYPE, main = "",
ylab = "number of injuries", col = "grey")
Fig. 1: Top 10 Weather Events by Number of Fatalities (left) and Injuries (right)
Floods, hurricanes, and tornadoes caused the greatest property damage, while drought and floods caused the greatest crop damage (Fig. 2).
# get top 10 events with highest property damage
prop10 <- prop[order(-prop$PROP), ][1:10, ]
# get top 10 events with highest crop damage
crop10 <- crop[order(-crop$CROP), ][1:10, ]
# Plotting
par(mfrow = c(1, 2), mar = c(12, 4, 3, 2), mgp = c(3, 1, 0), cex = 0.8)
barplot(prop10$PROP/(10^9), las = 3, names.arg = prop10$EVTYPE,
main = "", ylab = "Cost of damages ($ billions)",
col = "grey")
barplot(crop10$CROP/(10^9), las = 3, names.arg = crop10$EVTYPE,
main = "", ylab = "Cost of damages ($ billions)",
col = "grey")
Fig. 2: Top 10 Events with Greatest Property (left) and Crop Damages (right)