We analyze the impact of storms and other severe weather events on public health and the economy, based on the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. The events in the database start in 1950 and end in November 2011. We use the recorded fatalities, injuries, property damage, and crop damage to decide which types of event are most harmful to population health (fatalities and injuries) and to the economy (property and crop damage). We found that Marine Thunderstorm Wind (TSTM WIND) was the most harmful event type with respect to population health, in terms of both fatalities and injuries, while Hurricane/Typhoon (HURRICANE/TYPHOON) had the greatest economic consequences in terms of damage amounts.
cache = TRUE
echo = TRUE
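# Download the compressed storm data only if it is not already present
if (!file.exists("stormData.csv.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                destfile = "stormData.csv.bz2")
}
# Read the bzip2-compressed CSV into a data frame
NOAA <- read.csv(bzfile("stormData.csv.bz2"), sep = ",", header = TRUE)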
## Warning in scan(file, what, nmax, sep, dec, quote, skip, nlines,
## na.strings, : EOF within quoted string
We only need the following columns for this analysis: ‘EVTYPE’, ‘FATALITIES’, ‘INJURIES’, ‘PROPDMG’, ‘PROPDMGEXP’, ‘CROPDMG’, ‘CROPDMGEXP’.
cache = TRUE
echo = TRUE
# Keep only the columns needed for the analysis
tidyNOAA <- NOAA[, c('EVTYPE', 'FATALITIES', 'INJURIES', 'PROPDMG', 'PROPDMGEXP', 'CROPDMG', 'CROPDMGEXP')]
# Coerce the count and damage columns to numeric
tidyNOAA$FATALITIES <- as.numeric(tidyNOAA$FATALITIES)
tidyNOAA$INJURIES <- as.numeric(tidyNOAA$INJURIES)
tidyNOAA$PROPDMG <- as.numeric(tidyNOAA$PROPDMG)
tidyNOAA$CROPDMG <- as.numeric(tidyNOAA$CROPDMG)
head(tidyNOAA)
##    EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO      19282    19199   19216          K   17967
## 2 TORNADO      19282    18909   19130          K   17967
## 3 TORNADO      19282    19229   19216          K   17967
## 4 TORNADO      19282    19229   19130          K   17967
## 5 TORNADO      19282    19229   19130          K   17967
## 6 TORNADO      19282    19368   19130          K   17967
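A side note on the coercion step: in R, calling as.numeric() directly on a factor returns the integer level codes rather than the recorded values. If any of these columns were read as factors (an assumption, though the large and nearly constant values in the output above would be consistent with it), converting through as.character() first is the usual workaround. A minimal sketch on a toy vector, not the NOAA data:

```r
# Toy example (not NOAA data): a numeric column that was read as a factor.
x <- factor(c("0", "15", "2", "15"))

# as.numeric() on a factor returns the internal level codes, not the labels.
codes <- as.numeric(x)

# Converting through character first recovers the recorded values.
values <- as.numeric(as.character(x))

print(codes)
print(values)
```

Running class(tidyNOAA$FATALITIES) before the coercion would show which of the two cases applies.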
The PROPDMG and CROPDMG columns contain damage values, and PROPDMGEXP and CROPDMGEXP contain their respective units (“H” = hundreds, “K” = thousands, “M” = millions, “B” = billions). We must therefore convert PROPDMG and CROPDMG into the corresponding dollar amounts to facilitate calculations.
Conversion of Property Damage (PROPDMG) using its units (PROPDMGEXP): apply the H, K, M, B multipliers and store the result in a newly created Property Damage Amount column (PROPDMGAMT)
cache = TRUE
echo = TRUE
tidyNOAA$PROPDMGAMT = 0
tidyNOAA[tidyNOAA$PROPDMGEXP == "H", ]$PROPDMGAMT = as.numeric(tidyNOAA[tidyNOAA$PROPDMGEXP == "H", ]$PROPDMG) * 10^2
tidyNOAA[tidyNOAA$PROPDMGEXP == "K", ]$PROPDMGAMT = as.numeric(tidyNOAA[tidyNOAA$PROPDMGEXP == "K", ]$PROPDMG) * 10^3
tidyNOAA[tidyNOAA$PROPDMGEXP == "M", ]$PROPDMGAMT = as.numeric(tidyNOAA[tidyNOAA$PROPDMGEXP == "M", ]$PROPDMG) * 10^6
tidyNOAA[tidyNOAA$PROPDMGEXP == "B", ]$PROPDMGAMT = as.numeric(tidyNOAA[tidyNOAA$PROPDMGEXP == "B", ]$PROPDMG) * 10^9
Conversion of Crop Damage (CROPDMG) using its units (CROPDMGEXP): apply the H, K, M, B multipliers and store the result in a newly created Crop Damage Amount column (CROPDMGAMT)
cache = TRUE
echo = TRUE
tidyNOAA$CROPDMGAMT = 0
tidyNOAA[tidyNOAA$CROPDMGEXP == "H", ]$CROPDMGAMT = as.numeric(tidyNOAA[tidyNOAA$CROPDMGEXP == "H", ]$CROPDMG) * 10^2
tidyNOAA[tidyNOAA$CROPDMGEXP == "K", ]$CROPDMGAMT = as.numeric(tidyNOAA[tidyNOAA$CROPDMGEXP == "K", ]$CROPDMG) * 10^3
tidyNOAA[tidyNOAA$CROPDMGEXP == "M", ]$CROPDMGAMT = as.numeric(tidyNOAA[tidyNOAA$CROPDMGEXP == "M", ]$CROPDMG) * 10^6
tidyNOAA[tidyNOAA$CROPDMGEXP == "B", ]$CROPDMGAMT = as.numeric(tidyNOAA[tidyNOAA$CROPDMGEXP == "B", ]$CROPDMG) * 10^9
head(tidyNOAA)
##    EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 TORNADO      19282    19199   19216          K   17967
## 2 TORNADO      19282    18909   19130          K   17967
## 3 TORNADO      19282    19229   19216          K   17967
## 4 TORNADO      19282    19229   19130          K   17967
## 5 TORNADO      19282    19229   19130          K   17967
## 6 TORNADO      19282    19368   19130          K   17967
##   PROPDMGAMT CROPDMGAMT
## 1   19216000          0
## 2   19130000          0
## 3   19216000          0
## 4   19130000          0
## 5   19130000          0
## 6   19130000          0
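The two conversion blocks above follow the same rule, so they could be condensed into a single lookup. The sketch below is one way to do that. The helper name unit_multiplier and the demo data frame are hypothetical, not part of the original analysis. Unit codes outside H, K, M, B get a multiplier of 0, mirroring the original behaviour of leaving such rows at 0.

```r
# Hypothetical helper: map a unit code to its multiplier.
# Codes that are not H, K, M or B (including "") map to 0, as in the
# original conversion, where those rows keep the initial amount of 0.
unit_multiplier <- function(exp_code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[as.character(exp_code)]
  mult[is.na(mult)] <- 0
  unname(mult)
}

# Toy data for illustration only (not NOAA records).
demo <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "", "B"),
  stringsAsFactors = FALSE
)
demo$PROPDMGAMT <- demo$PROPDMG * unit_multiplier(demo$PROPDMGEXP)
print(demo)
```

The same helper would apply unchanged to CROPDMG and CROPDMGEXP, assuming both columns use the same unit codes.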
Q1 Across the United States, which types of events (EVTYPE) are most harmful with respect to population health?
A1.1. Plot the number of fatalities (FATALITIES) for the ten most harmful event types (EVTYPE)
cache = TRUE
echo = TRUE
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.2.5
fatalities <- aggregate(FATALITIES ~ EVTYPE, data=tidyNOAA, sum)
fatalities <- fatalities[order(-fatalities$FATALITIES), ][1:10, ]
fatalities$EVTYPE <- factor(fatalities$EVTYPE, levels = fatalities$EVTYPE)
ggplot(fatalities, aes(x = EVTYPE, y = FATALITIES)) +
  geom_bar(stat = "identity", fill = "red") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Type of Events") + ylab("Fatalities") +
  ggtitle("Number of Fatalities Ranked by Top 10 Weather Events")
A1.2. Plot the number of injuries (INJURIES) for the ten most harmful event types (EVTYPE)
cache = TRUE
echo = TRUE
library(ggplot2)
injuries <- aggregate(INJURIES ~ EVTYPE, data=tidyNOAA, sum)
injuries <- injuries[order(-injuries$INJURIES), ][1:10, ]
injuries$EVTYPE <- factor(injuries$EVTYPE, levels = injuries$EVTYPE)
ggplot(injuries, aes(x = EVTYPE, y = INJURIES)) +
  geom_bar(stat = "identity", fill = "red") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Type of Events") + ylab("Injuries") +
  ggtitle("Number of Injuries Ranked by Top 10 Weather Events")
Q2 Across the United States, which types of events have the greatest economic consequences?
A2 Plot the total damage amount (PROPDMGAMT + CROPDMGAMT) for the ten event types (EVTYPE) that caused the most economic damage
cache = TRUE
echo = TRUE
library(ggplot2)
damages <- aggregate(PROPDMGAMT + CROPDMGAMT ~ EVTYPE, data=tidyNOAA, sum)
names(damages) = c("EVTYPE", "TOTALDAMAGE")
damages <- damages[order(-damages$TOTALDAMAGE), ][1:10, ]
damages$EVTYPE <- factor(damages$EVTYPE, levels = damages$EVTYPE)
ggplot(damages, aes(x = EVTYPE, y = TOTALDAMAGE)) +
  geom_bar(stat = "identity", fill = "red") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  xlab("Type of Events") + ylab("Damages (US$)") +
  ggtitle("Property & Crop Damages Ranked by Top 10 Weather Events")
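The three rankings above share one pattern: sum a measure by event type, sort in descending order, keep the top entries, and fix the factor level order so the bars stay ranked. A minimal sketch of that pattern as a reusable function follows. The name top_events and the toy data frame are hypothetical and only serve as illustration.

```r
# Hypothetical helper: total `value` by EVTYPE and return the n largest rows.
top_events <- function(data, value, n = 10) {
  totals <- aggregate(data[[value]], by = list(EVTYPE = data$EVTYPE), FUN = sum)
  names(totals)[2] <- value
  # Sort by the summed measure, largest first, and keep the top n.
  totals <- head(totals[order(-totals[[value]]), ], n)
  # Fix the level order so a bar chart keeps the ranking on the x axis.
  totals$EVTYPE <- factor(totals$EVTYPE, levels = totals$EVTYPE)
  totals
}

# Toy data for illustration only (not NOAA records).
toy <- data.frame(
  EVTYPE     = c("A", "B", "A", "C", "B", "A"),
  FATALITIES = c(1, 5, 2, 0, 1, 4),
  stringsAsFactors = FALSE
)
top2 <- top_events(toy, "FATALITIES", n = 2)
print(top2)
```

With the real data, a call such as top_events(tidyNOAA, "FATALITIES") would be expected to produce the same kind of table as the fatalities data frame built earlier.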
In conclusion, we found that Marine Thunderstorm Wind (TSTM WIND) was the most harmful event type with respect to population health, in terms of both fatalities and injuries, while Hurricane/Typhoon (HURRICANE/TYPHOON) had the greatest economic consequences in terms of damage amounts.