In this document, I review which weather events in the USA are the most dangerous, measured by the damage they cause to population health and to the economy.
For that, I will use the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
On the one hand, the weather events most dangerous to population health in the USA, on average, are tornadoes (by fatalities) and heat waves (by injuries).
On the other hand, the most damaging weather events in terms of economic consequences are floods.
First, I set a working directory and downloaded the data from the provided URL. I loaded it using the read.csv function.
Second, I applied some transformations to the data to facilitate the analysis. These include replacing the non-alphanumeric characters in the EVTYPE variable and converting the PROPDMGEXP and CROPDMGEXP codes to the corresponding numeric multipliers according to the following website.
Third, I multiplied the CROPDMG and PROPDMG variables by the converted CROPDMGEXP and PROPDMGEXP variables to obtain the economic damage to crops and to property in USD. Finally, I added the two results to obtain the total economic damage.
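The exponent step can be illustrated on a toy data frame. This is a minimal sketch in base R, not the code used in the analysis: the `toy` data frame and the `multiplier` lookup vector are made up for illustration, and the lookup covers only a few of the codes that appear in the real data.

```r
# Hypothetical toy data: damage figures and their exponent codes
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1),
  PROPDMGEXP = c("K", "M", "B"),
  CROPDMG    = c(5, 0, 2),
  CROPDMGEXP = c("K", "K", "M"),
  stringsAsFactors = FALSE
)

# Named lookup vector: exponent code -> multiplier (a subset of the codes only)
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Indexing the lookup by name turns each code into its multiplier
toy$prop_usd  <- toy$PROPDMG * unname(multiplier[toy$PROPDMGEXP])
toy$crop_usd  <- toy$CROPDMG * unname(multiplier[toy$CROPDMGEXP])

# Total economic damage per record
toy$total_usd <- toy$prop_usd + toy$crop_usd

print(toy[, c("prop_usd", "crop_usd", "total_usd")])
```

For example, a record with PROPDMG = 25 and PROPDMGEXP = "K" stands for 25,000 USD of property damage.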
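library(dplyr)    # data manipulation
library(ggplot2)  # plotting
library(Hmisc)    # assumed source of capitalize(); R.utils also provides one
setwd("C:/Users/Gabriel/OneDrive/Coursera/Reproducible Research/Course Project 2")
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "data.csv.bz2")
data <- read.csv("data.csv.bz2")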
# Replace non-alphanumeric characters in EVTYPE and map exponent codes to multipliers
data <- data %>%
  mutate(EVTYPE = capitalize(gsub("[^A-Za-z0-9]", " ", EVTYPE))) %>%
  mutate(across(c(PROPDMGEXP, CROPDMGEXP),
                # as.character() keeps recode() from returning a factor
                ~ as.numeric(dplyr::recode(as.character(.x),
                    '0' = 1, '1' = 10, '2' = 100, '3' = 1000, '4' = 10000, '5' = 100000,
                    '6' = 1000000, '7' = 10000000, '8' = 100000000, 'B' = 1000000000,
                    'h' = 100, 'H' = 100, 'k' = 1000, 'K' = 1000, 'm' = 1000000, 'M' = 1000000,
                    ' ' = 0, '-' = 0, '?' = 0, '+' = 0))))
data$CROPDMG<-data$CROPDMG*data$CROPDMGEXP
data$PROPDMG<-data$PROPDMG*data$PROPDMGEXP
data$totalecodmg<-data$CROPDMG+data$PROPDMG
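One point worth keeping in mind about the addition above: if either component is NA (for instance, when an exponent code is not matched by the recoding), the total for that record is NA as well, and a later sum with na.rm = TRUE drops the record entirely. The following is a minimal sketch with made-up values, not part of the original analysis, comparing plain addition with a variant that treats a missing component as zero. Whether that treatment is appropriate for the storm data is a judgment call.

```r
# Hypothetical values: the second record has a missing crop figure
prop <- c(1000, 2000, 500)
crop <- c(100, NA, 50)

# Plain addition propagates the NA to the total
total_plain <- prop + crop

# rowSums() with na.rm = TRUE treats a missing component as zero
total_na_rm <- rowSums(cbind(prop, crop), na.rm = TRUE)

print(total_plain)
print(total_na_rm)
```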
To evaluate this, I will select the 10 most harmful weather events according to the average number of fatalities and injuries caused by each event.
# Average fatalities per event type, sorted from highest to lowest
fatalities <- data %>%
  group_by(EVTYPE) %>%
  summarise(FATALITIES = round(mean(FATALITIES, na.rm = TRUE), 2))
fatalities <- fatalities[order(-fatalities$FATALITIES), ]
ggplot(fatalities[1:10, ], aes(x = reorder(EVTYPE, -FATALITIES), y = FATALITIES, fill = EVTYPE)) +
  geom_bar(stat = "identity", width = 0.5) +
  theme_bw() +
  theme(legend.position = "none") +
  labs(title = "Figure 1: Top ten weather events causing fatalities (avg)", x = "Event Type", y = "Fatalities") +
  coord_flip()
The figure shows the top ten weather events by the average number of fatalities they cause, with tornadoes being the most dangerous: each one causes 25 deaths on average.
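Ranking by the average per event can give a different order than ranking by the total, because an event type with very few records can have a high mean. The following is a minimal sketch in base R on made-up data (the `toy` data frame and its event names are hypothetical) that shows the two rankings side by side together with the number of records behind each mean.

```r
# Hypothetical toy data: one rare but deadly event type and one common one
toy <- data.frame(
  EVTYPE     = c("Rare event", rep("Common event", 4)),
  FATALITIES = c(20, 5, 10, 0, 15),
  stringsAsFactors = FALSE
)

# Split fatalities by event type, then compute mean, total and record count
by_type <- split(toy$FATALITIES, toy$EVTYPE)
avg   <- sapply(by_type, mean)
total <- sapply(by_type, sum)
n     <- sapply(by_type, length)

print(data.frame(EVTYPE = names(avg), n = as.vector(n),
                 avg = as.vector(avg), total = as.vector(total)))

# The top event type differs depending on the measure used
print(names(which.max(avg)))
print(names(which.max(total)))
```

Reporting the record count next to the mean makes it easier to judge how much weight an average based on a handful of events deserves.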
# Average injuries per event type, sorted from highest to lowest
injuries <- data %>%
  group_by(EVTYPE) %>%
  summarise(INJURIES = round(mean(INJURIES, na.rm = TRUE), 2))
injuries <- injuries[order(-injuries$INJURIES), ]
ggplot(injuries[1:10, ], aes(x = reorder(EVTYPE, -INJURIES), y = INJURIES, fill = EVTYPE)) +
  geom_bar(stat = "identity", width = 0.5) +
  theme_bw() +
  theme(legend.position = "none") +
  labs(title = "Figure 2: Top ten weather events causing injuries (avg)", x = "Event Type", y = "Injuries") +
  coord_flip()
The figure shows the top ten weather events by the average number of injuries they cause, with heat waves being the most dangerous.
To evaluate this, I will select the 10 most harmful weather events according to the total economic losses they cause, considering the damage to both crops and property.
# Total economic damage per event type, sorted from highest to lowest
eco <- data %>%
  group_by(EVTYPE) %>%
  summarise(totalecodmg = round(sum(totalecodmg, na.rm = TRUE), 2))
eco <- eco[order(-eco$totalecodmg), ]
ggplot(eco[1:10, ], aes(x = reorder(EVTYPE, -totalecodmg), y = totalecodmg, fill = EVTYPE)) +
  geom_bar(stat = "identity", width = 0.5) +
  theme_bw() +
  theme(legend.position = "none") +
  labs(title = "Figure 3: Top ten weather events by total economic damage", x = "Event Type", y = "USD") +
  coord_flip()
The most harmful type of event in terms of total economic losses is floods.