In this report we investigate the impact of severe weather events in the USA on both public health and the economy. The analyzed period is 1950 to 2011. The goal is to identify the event types that had the highest impact on people and on the economy. To focus on the worst events, we selected the 10 worst event types for each of the following indicators: fatalities, injuries, and economic damage (property plus crop damage).
The data were obtained from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. The archive has been uncompressed, and only the relevant columns are loaded from the CSV file:
Archive <- "repdata-data-StormData.csv.bz2"
if (!file.exists(Archive)){
stop("data file ", Archive, " not available")
}
## library(R.utils)
## bunzip2(Archive)
## source("http://bioconductor.org/biocLite.R")
## biocLite("limma")
library(limma)
Harmful_Data <- read.columns("repdata-data-StormData.csv", c("EVTYPE", "FATALITIES", "INJURIES"),
sep=",")
Damage_Data <- read.columns("repdata-data-StormData.csv",
c("EVTYPE", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP"),
sep =",")
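As a hedged alternative to `read.columns`, the same column selection can be sketched in base R, which also reads the bzip2 archive directly without a separate uncompress step. The helper name `read_selected_columns` and the tiny demo file are assumptions made for illustration; they are not part of the original analysis.

```r
# Hypothetical helper: read only the wanted columns of a CSV file with base R.
read_selected_columns <- function(path, wanted) {
  # Read a single row just to learn the column names.
  header <- names(read.csv(path, nrows = 1, check.names = FALSE))
  # "NULL" tells read.csv to skip a column; NA lets it guess the type.
  classes <- rep("NULL", length(header))
  classes[header %in% wanted] <- NA
  read.csv(path, colClasses = classes, stringsAsFactors = FALSE)
}

# Tiny made-up file, bzip2-compressed, standing in for the storm archive.
demo_path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(demo_path, "w")
writeLines(c("STATE,EVTYPE,FATALITIES,INJURIES",
             "AL,TORNADO,0,15",
             "AL,HAIL,0,0"), con)
close(con)

demo <- read_selected_columns(demo_path, c("EVTYPE", "FATALITIES", "INJURIES"))
```

Skipping unused columns at read time keeps memory use low on a file of this size, at the cost of one extra pass over the header.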
In the following summary of the economic damage data, we can see that spurious PROPDMGEXP and CROPDMGEXP values such as “k”, “0”, and those grouped under (Other) are negligible in number, so we assume that they are equivalent to empty cells.
summary(Damage_Data)
##                EVTYPE            PROPDMG          PROPDMGEXP
##  HAIL             :288661   Min.   :   0.00          :465934
##  TSTM WIND        :219940   1st Qu.:   0.00   K      :424665
##  THUNDERSTORM WIND: 82563   Median :   0.00   M      : 11330
##  TORNADO          : 60652   Mean   :  12.06   0      :   216
##  FLASH FLOOD      : 54277   3rd Qu.:   0.50   B      :    40
##  FLOOD            : 25326   Max.   :5000.00   5      :    28
##  (Other)          :170878                     (Other):    84
##     CROPDMG          CROPDMGEXP
##  Min.   :  0.000          :618413
##  1st Qu.:  0.000   K      :281832
##  Median :  0.000   M      :  1994
##  Mean   :  1.527   k      :    21
##  3rd Qu.:  0.000   0      :    19
##  Max.   :990.000   B      :     9
##                    (Other):     9
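We then convert PROPDMG and CROPDMG to US dollars by applying the multiplier encoded in the corresponding exponent column (K for thousands, M for millions, B for billions).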
Damage_Data$PROPDMG[Damage_Data$PROPDMGEXP == "K"] <-
Damage_Data$PROPDMG[Damage_Data$PROPDMGEXP == "K"] * 1e3
Damage_Data$PROPDMG[Damage_Data$PROPDMGEXP == "M"] <-
Damage_Data$PROPDMG[Damage_Data$PROPDMGEXP == "M"] * 1e6
Damage_Data$PROPDMG[Damage_Data$PROPDMGEXP == "B"] <-
Damage_Data$PROPDMG[Damage_Data$PROPDMGEXP == "B"] * 1e9
Damage_Data$CROPDMG[Damage_Data$CROPDMGEXP == "K"] <-
Damage_Data$CROPDMG[Damage_Data$CROPDMGEXP == "K"] * 1e3
Damage_Data$CROPDMG[Damage_Data$CROPDMGEXP == "M"] <-
Damage_Data$CROPDMG[Damage_Data$CROPDMGEXP == "M"] * 1e6
Damage_Data$CROPDMG[Damage_Data$CROPDMGEXP == "B"] <-
Damage_Data$CROPDMG[Damage_Data$CROPDMGEXP == "B"] * 1e9
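The six assignments above can also be expressed with a single lookup table. The following is a minimal sketch under the same assumption stated earlier, namely that any code other than K, M or B is treated like an empty cell (multiplier 1). The function name `exp_to_multiplier` and the toy data frame are hypothetical.

```r
# Hypothetical helper: map exponent codes to numeric multipliers.
exp_to_multiplier <- function(exp_code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  # Character indexing returns NA for codes that are not in the table.
  mult <- lookup[as.character(exp_code)]
  # Unknown or empty codes leave the value unchanged.
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Made-up rows covering a known code, an empty cell and a spurious code.
toy_damage <- data.frame(
  PROPDMG    = c(2.5, 1, 3, 7),
  PROPDMGEXP = c("K", "M", "", "5")
)
toy_damage$PROPDMG_USD <- toy_damage$PROPDMG *
  exp_to_multiplier(toy_damage$PROPDMGEXP)
```

Writing the result to a new column, rather than overwriting PROPDMG, means the conversion can be rerun without multiplying the values twice.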
In this chapter we present the results from the loaded datasets. More precisely, we identify the most harmful types of events as well as the types of events that caused the greatest economic consequences.
We focus on the 10 worst event types for each of the following indicators: fatalities, injuries, and economic damage.
For all indicators, we consider the sum of the indicator values. In other words, the worst event type is the one with the highest cumulative impact over the observation period. Other aggregations, such as the average or the maximum per event, are not considered in this study.
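To illustrate why the choice of aggregation matters, here is a small sketch on made-up data (the event names "A" and "B" and their counts are purely hypothetical). A frequent but mild event type wins under the sum, while a rare but severe one wins under the mean.

```r
# Made-up data: "A" occurs four times with one fatality each,
# "B" occurs once with three fatalities.
toy <- data.frame(
  EVTYPE     = c("A", "A", "A", "A", "B"),
  FATALITIES = c(1, 1, 1, 1, 3)
)

by_sum  <- tapply(toy$FATALITIES, toy$EVTYPE, sum)   # cumulative impact
by_mean <- tapply(toy$FATALITIES, toy$EVTYPE, mean)  # impact per occurrence

worst_by_sum  <- names(which.max(by_sum))
worst_by_mean <- names(which.max(by_mean))
```

Here `worst_by_sum` is "A" and `worst_by_mean` is "B", so the rankings reported in this study should be read as cumulative rankings only.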
Although the question refers to the “most harmful” event types, we keep fatalities and injuries as independent indicators, and thus we provide the 10 worst event types for each indicator. As previously described, we sum the indicator values over all events of each type.
Q1_Data <- aggregate(Harmful_Data[,c("FATALITIES", "INJURIES")],
by=list(Harmful_Data[,"EVTYPE"]), FUN=sum)
names(Q1_Data)[1] <- c("EVTYPE")
iFatal <- order(Q1_Data$FATALITIES, decreasing = TRUE)
iInj <- order(Q1_Data$INJURIES, decreasing = TRUE)
Q1_Fatal <- Q1_Data[iFatal[1:10],c("EVTYPE", "FATALITIES")]
Q1_Inj <- Q1_Data[iInj[1:10],c("EVTYPE","INJURIES")]
The following plot shows the 10 worst event types as far as fatalities are concerned.
library(ggplot2)
g <- ggplot(Q1_Fatal, aes(EVTYPE, FATALITIES)) + geom_bar(stat="identity") +
coord_flip()
g <- g + labs(title = "Fatalities: Worst 10 events",
y = "Number of Fatalities",
x = "Event Type")
print(g)
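In the plot above the bars follow the order of the EVTYPE factor levels rather than the indicator values. One possible refinement, sketched here on made-up values, is to reorder the factor levels by the indicator before plotting, so that the longest bar ends up at the top after `coord_flip()`. The data frame `top` and its numbers are hypothetical, and the plotting step is skipped when ggplot2 is not installed.

```r
# Made-up top list; the event names and counts are for illustration only.
top <- data.frame(
  EVTYPE     = factor(c("A", "B", "C")),
  FATALITIES = c(10, 30, 20)
)

# reorder() sorts the factor levels by the given values (ascending).
top$EVTYPE <- reorder(top$EVTYPE, top$FATALITIES)
ordered_levels <- levels(top$EVTYPE)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  g_sorted <- ggplot2::ggplot(top, ggplot2::aes(EVTYPE, FATALITIES)) +
    ggplot2::geom_bar(stat = "identity") +
    ggplot2::coord_flip()
  # Call print(g_sorted) to render the plot.
}
```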
A similar plot is provided for the number of injuries.
library(ggplot2)
g <- ggplot(Q1_Inj, aes(EVTYPE, INJURIES)) + geom_bar(stat="identity") +
coord_flip()
g <- g + labs(title = "Injuries: Worst 10 events",
y = "Number of Injuries",
x = "Event Type")
print(g)
Economic consequences are calculated as the sum of property and crop damage. The total damage for each event type is the sum of the damage estimated for each individual event of that type.
Q2_Data <- aggregate(Damage_Data[, c("PROPDMG", "CROPDMG")],
by = list(Damage_Data$EVTYPE), FUN=sum)
names(Q2_Data)[1] <- "EVTYPE"
Q2_Data_Tot <- cbind(Q2_Data, Q2_Data$PROPDMG + Q2_Data$CROPDMG)
names(Q2_Data_Tot)[4] <- "TOTDMG"
iDamage <- order(Q2_Data_Tot$TOTDMG, decreasing = TRUE)
Q2_Damage <- Q2_Data_Tot[iDamage[1:10],c("EVTYPE", "TOTDMG")]
The following plot shows the 10 worst event types as far as economic damage is concerned.
library(ggplot2)
g <- ggplot(Q2_Damage, aes(EVTYPE, TOTDMG))+ geom_bar(stat="identity") +
coord_flip()
g <- g + labs(title = "Economic Damages: Worst 10 events",
y = "Damage Value ($)",
x = "Event Type")
print(g)