Tornadoes were found to be by far the most harmful event type for population health, followed by excessive heat, TSTM wind, flood, and lightning.
As expected, tornadoes are also the event with the greatest economic consequences, followed by flash flood, TSTM wind, hail, and flood.
The following packages were used:
library(ggplot2)
library(plyr)   # load plyr before dplyr so that dplyr's verbs are not masked
library(dplyr)
fileUrl <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
download.file(fileUrl, "data.csv.bz2", mode = "wb")  # binary mode keeps the archive intact on Windows
data <- read.csv("data.csv.bz2", sep = ",", header = TRUE)
The variables FATALITIES and INJURIES were added together and summed by event type (EVTYPE). This identifies which event accumulates the greatest number of deaths and injuries, so that events can be ranked by harmfulness.
data_2 <- aggregate(FATALITIES + INJURIES ~ EVTYPE,
                    FUN = sum,
                    data = data)
colnames(data_2)[2] <- "v1"  # v1 = total fatalities + injuries per event type
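As a minimal sketch of this grouping step, the toy data frame below (hypothetical values, not taken from the storm data) shows how the formula FATALITIES + INJURIES ~ EVTYPE adds the two columns and sums the result per event type.

```r
# Toy data with hypothetical values; only the column names mirror the storm data.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  FATALITIES = c(2, 1, 3, 0),
  INJURIES   = c(10, 4, 5, 0)
)

# The left-hand side is computed row by row, then summed within each EVTYPE.
toy_sum <- aggregate(FATALITIES + INJURIES ~ EVTYPE, data = toy, FUN = sum)
colnames(toy_sum)[2] <- "v1"
print(toy_sum)
```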
Inspecting the distribution of the totals shows that most event types report no deaths or injuries at all: every decile up to the 70th is 0.
prob <- (1:10) / 10
quantile(data_2$v1, probs = prob)
## 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 7.6 96979.0
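The share of event types with no reported casualties can also be read directly instead of through deciles. A short sketch, using a hypothetical vector that stands in for data_2$v1:

```r
# Hypothetical totals per event type, standing in for data_2$v1.
v1 <- c(0, 0, 0, 0, 0, 0, 0, 1, 8, 120)

# The mean of a logical vector is the proportion of TRUE values.
share_zero <- mean(v1 == 0)
print(share_zero)
```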
In addition, there is a very large gap between the deaths and injuries accumulated by the most harmful event and by the second most harmful one. There is also a clear jump between the fifth and the sixth most harmful events, which is why the top 5 are taken as the most harmful events.
data_2 <- data_2 %>% arrange(desc(v1))
top <- data_2[1:5,]
head(data_2, n=10)
## EVTYPE v1
## 1 TORNADO 96979
## 2 EXCESSIVE HEAT 8428
## 3 TSTM WIND 7461
## 4 FLOOD 7259
## 5 LIGHTNING 6046
## 6 HEAT 3037
## 7 FLASH FLOOD 2755
## 8 ICE STORM 2064
## 9 THUNDERSTORM WIND 1621
## 10 WINTER STORM 1527
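One way to make the jumps explicit is to compare each total with the next one down. The sketch below copies the ten totals from the table above; the ratio vector is an illustrative helper, not part of the original analysis.

```r
# Totals copied from the printed top-10 table.
v1 <- c(96979, 8428, 7461, 7259, 6046, 3037, 2755, 2064, 1621, 1527)

# Ratio of each total to the next one down; a large ratio marks a jump.
ratio <- v1[-length(v1)] / v1[-1]
names(ratio) <- paste(seq_along(ratio), seq_along(ratio) + 1, sep = "->")
print(round(ratio, 2))
```

The largest ratio is between ranks 1 and 2 (the tornado gap), and the next largest is between ranks 5 and 6, which is consistent with cutting the list at five events.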
Tornadoes rank first by a very wide margin over the rest of the top five, followed by excessive heat, TSTM wind, flood, and lightning.
ggplot(data = top, aes(x = reorder(EVTYPE, -v1), y = v1, fill = v1)) +
  geom_bar(stat = "identity", position = "dodge") +
  # Zoom the y-axis so the four smaller bars stay readable; the tornado bar is cut off
  coord_cartesian(ylim = c(4000, 15000)) +
  annotate("text", x = 1, y = 14000, label = "+15000", color = "white") +
  labs(x = "Event Type",
       y = "Fatalities + Injuries",
       title = "Harmful Events") +
  scale_fill_gradient("Harmfulness", low = "grey", high = "black") +
  theme(axis.text.x = element_text(size = 6.35))
Two temporary variables were created to hold the multiplier encoded in the separate exponent columns (PROPDMGEXP and CROPDMGEXP), by which PROPDMG and CROPDMG must be multiplied to obtain values in dollars. The property and crop costs were then added and stored in a new variable. Finally, only the variables needed were selected into a separate data set.
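# Map each exponent code to its multiplier ("h"/"H" = hundreds, "K"/"k" = thousands, ...)
tempPROPDMG <- mapvalues(data$PROPDMGEXP,
                         c("K", "M", "", "B", "m", "+", "0", "5", "6",
                           "?", "4", "2", "3", "h", "7", "H", "-", "1", "8"),
                         c(1e3, 1e6, 1, 1e9, 1e6, 1, 1, 1e5, 1e6, 1,
                           1e4, 1e2, 1e3, 1e2, 1e7, 1e2, 1, 10, 1e8))
tempCROPDMG <- mapvalues(data$CROPDMGEXP,
                         c("", "M", "K", "m", "B", "?", "0", "k", "2"),
                         c(1, 1e6, 1e3, 1e6, 1e9, 1, 1, 1e3, 1e2))
# as.character() guards against factor input, where as.numeric() alone
# would return the level codes instead of the mapped multipliers
data$PROPTOTALDMG <- as.numeric(as.character(tempPROPDMG)) * data$PROPDMG
data$CROPTOTALDMG <- as.numeric(as.character(tempCROPDMG)) * data$CROPDMG
data$TOTALDMG <- data$PROPTOTALDMG + data$CROPTOTALDMG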
data_3 <- data %>% select(EVTYPE, TOTALDMG)
data_4 <- aggregate(TOTALDMG ~ EVTYPE, FUN = "sum", data = data_3)
data_4 <- data_4 %>% arrange(desc(TOTALDMG))
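The multiplier step can also be sketched in base R with a named lookup vector, which does not depend on whether the exponent column is stored as character or as factor. The toy data frame and the abbreviated exp_lookup table below are hypothetical and cover only a few codes; this is an illustration, not the code used for the results in this report.

```r
# Hypothetical rows; only the column names mirror the storm data.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 10, 3),
  PROPDMGEXP = c("K", "M", "", "B"),
  stringsAsFactors = FALSE
)

# Abbreviated lookup table (assumption: only these codes occur in the toy data).
exp_lookup <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each code by name; codes absent from the table (including "") get multiplier 1.
mult <- unname(exp_lookup[as.character(toy$PROPDMGEXP)])
mult[is.na(mult)] <- 1

toy$PROPTOTALDMG <- toy$PROPDMG * mult
print(toy)
```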
As expected, tornadoes are the event with the greatest economic consequences, followed by flash flood, TSTM wind, hail, and flood.
ggplot(data = head(data_4, 5), aes(x = reorder(EVTYPE, -TOTALDMG),
                                   y = TOTALDMG, fill = TOTALDMG)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(x = "Event Type",
       y = "Property + Crop Damage (USD)",  # TOTALDMG includes both components
       title = "Economic consequences") +
  scale_fill_gradient("Economic Damage", low = "grey", high = "black") +
  theme(axis.text.x = element_text(size = 6.35))