Get the data from NOAA. Analyse impact by deaths. Tornadoes come first. Analyse impact by economy. Floods come first. This report downloads data from NOAA Storm Database and performs a statistical analysis on the impact of physical events to population health and economy.
Examining the event types, we observe that most of the physical phenomena cause injuries to people, which sometimes are fatal. By far, Tornadoes are the most dangerous events, caused ~100.000 injuries on the last 60 years.
When analysing the event types by the impact on the economy, we observe that floods caused $15 billions damages on the last 60 years, mostly on properties.
stormdata <- read.csv("~/RepData_PA2/repdata_data_StormData.csv")
dim(stormdata)
## [1] 902297 37
head(stormdata)
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 1 1 4/18/1950 0:00:00 0130 CST 97 MOBILE AL
## 2 1 4/18/1950 0:00:00 0145 CST 3 BALDWIN AL
## 3 1 2/20/1951 0:00:00 1600 CST 57 FAYETTE AL
## 4 1 6/8/1951 0:00:00 0900 CST 89 MADISON AL
## 5 1 11/15/1951 0:00:00 1500 CST 43 CULLMAN AL
## 6 1 11/15/1951 0:00:00 2000 CST 77 LAUDERDALE AL
## EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 1 TORNADO 0 0
## 2 TORNADO 0 0
## 3 TORNADO 0 0
## 4 TORNADO 0 0
## 5 TORNADO 0 0
## 6 TORNADO 0 0
## COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F MAG FATALITIES
## 1 NA 0 14.0 100 3 0 0
## 2 NA 0 2.0 150 2 0 0
## 3 NA 0 0.1 123 2 0 0
## 4 NA 0 0.0 100 2 0 0
## 5 NA 0 0.0 150 2 0 0
## 6 NA 0 1.5 177 2 0 0
## INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP WFO STATEOFFIC ZONENAMES
## 1 15 25.0 K 0
## 2 0 2.5 K 0
## 3 2 25.0 K 0
## 4 2 2.5 K 0
## 5 2 2.5 K 0
## 6 6 2.5 K 0
## LATITUDE LONGITUDE LATITUDE_E LONGITUDE_ REMARKS REFNUM
## 1 3040 8812 3051 8806 1
## 2 3042 8755 0 0 2
## 3 3340 8742 0 0 3
## 4 3458 8626 0 0 4
## 5 3412 8642 0 0 5
## 6 3450 8748 0 0 6
To calculate the injuries to humans, damages dataframe is being used, to aggregate both fatal and non-fatal injuries.
The economic impact is assessed by calculating the exponential value of the property and corp damage in data frame economic.
Two small data frames dam and eco are used to calculate only the top 10 events in human and economic impact respectively.
library(Hmisc)
## Loading required package: grid
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
##
## Attaching package: 'Hmisc'
##
## The following objects are masked from 'package:base':
##
## format.pval, round.POSIXt, trunc.POSIXt, units
library(reshape)
library(ggplot2)
library(car)
stormdata$EVTYPE <- capitalize(tolower(stormdata$EVTYPE))
damages<-aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE , stormdata, sum)
dam<-melt(head(damages[order(-damages$FATALITIES,-damages$INJURIES),],10))
## Using EVTYPE as id variables
stormdata$PROPDMG <- stormdata$PROPDMG * as.numeric(Recode(stormdata$PROPDMGEXP,
"'0'=1;'1'=10;'2'=100;'3'=1000;'4'=10000;'5'=100000;'6'=1000000;'7'=10000000;'8'=100000000;'B'=1000000000;'h'=100;'H'=100;'K'=1000;'m'=1000000;'M'=1000000;'-'=0;'?'=0;'+'=0",
as.factor.result = FALSE))
stormdata$CROPDMG <- stormdata$CROPDMG * as.numeric(Recode(stormdata$CROPDMGEXP,
"'0'=1;'2'=100;'B'=1000000000;'k'=1000;'K'=1000;'m'=1000000;'M'=1000000;''=0;'?'=0",
as.factor.result = FALSE))
economic <- aggregate(cbind(PROPDMG, CROPDMG) ~ EVTYPE, stormdata, sum)
eco <- melt(head(economic[order(-economic$PROPDMG, -economic$CROPDMG), ], 10))
## Using EVTYPE as id variables
Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health? * Question 1: Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
By using the ggplot2 library we present a combined flipped barplot graph of the fatal (Deaths) and non-fatal Injuries, by event type.
ggplot(dam, aes(x = EVTYPE, y = value, fill = variable)) + geom_bar(stat = "identity") +
coord_flip() + ggtitle("Harmful events") + labs(x = "", y = "number of people impacted") +
scale_fill_manual(values = c("blue", "red"), labels = c("Deaths", "Injuries"))
Across the United States, which types of events have the greatest economic consequences? * Question 2: Across the United States, which types of events have the greatest economic consequences?
By using the ggplot2 library we present a combined flipped barplot graph of the property and corp damages, by event type.
ggplot(eco, aes(x = EVTYPE, y = value, fill = variable)) + geom_bar(stat = "identity") +
coord_flip() + ggtitle("Economic consequences") + labs(x = "", y = "cost of damages in dollars") +
scale_fill_manual(values = c("blue", "red"), labels = c("Property Damage",
"Crop Damage"))