Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
The first step is to download the database and read it into R.
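url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
destfile <- "C:/Users/Ricardo Carranza/Desktop/20200713 - Rcarranza/documentos/Data Science/Coursera/Reproducible Research/Course Project 2/sd.csv"
# mode = "wb" keeps the bzip2-compressed file intact on Windows
download.file(url, destfile, mode = "wb")
# Read from the path the file was saved to; read.csv() decompresses bzip2 files transparently
df <- read.csv(destfile)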
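Because the archive is large, it can help to skip the download when the file is already on disk. The following is a minimal sketch, not part of the original analysis: `fetch_once` and its `fetch` argument are hypothetical names introduced here, and the relative file name in the usage comment is an assumption.

```r
# Hypothetical helper: download a file only when it is not already on disk.
# `fetch` defaults to download.file and can be swapped out for testing.
fetch_once <- function(url, destfile, fetch = utils::download.file) {
  if (!file.exists(destfile)) {
    # mode = "wb" writes the archive as binary, which matters on Windows
    fetch(url, destfile, mode = "wb")
  }
  invisible(destfile)
}

# Usage (left commented out because it needs network access):
# url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# df <- read.csv(fetch_once(url, "StormData.csv.bz2"))
```

A relative file name such as the one in the usage comment keeps the script independent of any one machine's directory layout.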
Questions to answer:
Across the United States, which types of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?
Across the United States, which types of events have the greatest economic consequences?
Two variables were used to assess harm to population health: fatalities (FATALITIES) and injuries (INJURIES).
library(dplyr)
# Sum fatalities and injuries per event type, ordered by fatalities (ties broken by injuries)
df.fatalities <- df %>%
  select(EVTYPE, FATALITIES, INJURIES) %>%
  group_by(EVTYPE) %>%
  summarise(total.fatalities = sum(FATALITIES), total.injuries = sum(INJURIES)) %>%
  arrange(desc(total.fatalities), desc(total.injuries)) %>%
  mutate(Total = total.fatalities + total.injuries)
head(df.fatalities,10)
## # A tibble: 10 x 4
## EVTYPE total.fatalities total.injuries Total
## <chr> <dbl> <dbl> <dbl>
## 1 TORNADO 5633 91346 96979
## 2 EXCESSIVE HEAT 1903 6525 8428
## 3 FLASH FLOOD 978 1777 2755
## 4 HEAT 937 2100 3037
## 5 LIGHTNING 816 5230 6046
## 6 TSTM WIND 504 6957 7461
## 7 FLOOD 470 6789 7259
## 8 RIP CURRENT 368 232 600
## 9 HIGH WIND 248 1137 1385
## 10 AVALANCHE 224 170 394
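The table above is ordered by fatalities, so an event such as FLASH FLOOD (Total 2755) appears above LIGHTNING (Total 6046). The sketch below shows, on made-up numbers, how ranking by fatalities and ranking by the combined total can differ. The `toy` data frame and its values are hypothetical, and base R is used so the example runs without extra packages.

```r
# Hypothetical toy data, not taken from the NOAA database
toy <- data.frame(
  EVTYPE     = c("A", "B", "C", "A"),
  FATALITIES = c(10, 2, 5, 1),
  INJURIES   = c(0, 50, 5, 3),
  stringsAsFactors = FALSE
)

# Sum both columns per event type, then add the combined total
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
totals$Total <- totals$FATALITIES + totals$INJURIES

# Two possible rankings of the same data
by.fatalities <- totals$EVTYPE[order(-totals$FATALITIES)]
by.total      <- totals$EVTYPE[order(-totals$Total)]

by.fatalities  # "A" "C" "B"
by.total       # "B" "A" "C"
```

Which ordering is appropriate depends on whether fatalities or overall casualties matter more for the question being asked.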
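The data record economic impact in two columns: property damage (PROPDMG) and crop damage (CROPDMG). Each amount is paired with an exponent column (PROPDMGEXP and CROPDMGEXP, respectively) whose symbol gives the magnitude of the value.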
The total damage caused by each event type is calculated with the following code:
df.damage <- df %>% select(EVTYPE, PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP)
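# Map each exponent symbol to a multiplier explicitly, so the result does not depend on locale sort order
Symbol <- c("", "-", "?", "+", as.character(0:8), "B", "H", "K", "M")
Multiplier <- c(0, 0, 0, 1, rep(10, 9), 10^9, 10^2, 10^3, 10^6)
convert.Multiplier <- data.frame(Symbol, Multiplier)
# toupper() folds lower-case symbols (for example "h" and "m") onto their upper-case forms
df.damage$Prop.Multiplier <- convert.Multiplier$Multiplier[match(toupper(df.damage$PROPDMGEXP), convert.Multiplier$Symbol)]
# Crop damage uses its own exponent column, CROPDMGEXP
df.damage$Crop.Multiplier <- convert.Multiplier$Multiplier[match(toupper(df.damage$CROPDMGEXP), convert.Multiplier$Symbol)]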
df.damage<- df.damage %>% mutate (PROPDMG = PROPDMG*Prop.Multiplier) %>% mutate(CROPDMG = CROPDMG*Crop.Multiplier) %>% mutate(TOTAL.DMG = PROPDMG+CROPDMG)
df.damage.total <- df.damage %>% group_by(EVTYPE) %>% summarize(TOTAL.DMG.EVTYPE = sum(TOTAL.DMG)) %>% arrange(-TOTAL.DMG.EVTYPE)
head(df.damage.total,10)
## # A tibble: 10 x 2
## EVTYPE TOTAL.DMG.EVTYPE
## <chr> <dbl>
## 1 HURRICANE 814750235010
## 2 HURRICANE/TYPHOON 802074291330
## 3 FLOOD 231909682070
## 4 TORNADO 85207035607
## 5 FLASH FLOOD 54962957791
## 6 STORM SURGE 43328536000
## 7 HAIL 31046432377
## 8 HIGH WIND 12444111890
## 9 TSTM WIND 12169568890
## 10 WILDFIRE 11938922200
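The exponent conversion can be checked by hand on a few rows. The sketch below is illustrative only: the `toy` rows are made up, `multiplier` and `to_dollars` are hypothetical names, and only the letter symbols (H, K, M, B) are covered, with blank or unknown symbols assumed to contribute nothing.

```r
# Assumed letter-symbol multipliers: hundreds, thousands, millions, billions
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical rows, not taken from the NOAA database
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1),
  PROPDMGEXP = c("K", "m", "B"),
  CROPDMG    = c(10, 0, 5),
  CROPDMGEXP = c("K", "", "M"),
  stringsAsFactors = FALSE
)

# Convert a value and its exponent symbol to dollars
to_dollars <- function(value, exp) {
  m <- multiplier[toupper(exp)]
  m[is.na(m)] <- 0  # assumption: blank or unknown symbols count as zero
  unname(value * m)
}

# Each damage column is scaled by its own exponent column
toy$TOTAL.DMG <- to_dollars(toy$PROPDMG, toy$PROPDMGEXP) +
  to_dollars(toy$CROPDMG, toy$CROPDMGEXP)

toy$TOTAL.DMG  # 35000, 2500000, 1005000000
```

For instance, the first row is 25 × 1,000 in property damage plus 10 × 1,000 in crop damage, giving 35,000.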
The plot below shows the combined health impact (fatalities plus injuries) of the ten event types with the most fatalities.
library(ggplot2)
# Bar chart of combined fatalities and injuries, tallest bar first
g <- ggplot(df.fatalities[1:10,], aes(x = reorder(EVTYPE, -Total), y = Total)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = Total), vjust = -0.1, color = "black", size = 3.5) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  ggtitle("Events with Highest Health Impacts") +
  labs(x = "EVENT TYPE", y = "Total Health Impact")
g
As the graph shows, tornadoes cause by far the greatest health impact.
g <- ggplot(df.damage.total[1:10,], aes(x=reorder(EVTYPE, - TOTAL.DMG.EVTYPE), y=TOTAL.DMG.EVTYPE)) + geom_bar(stat = "identity")+ theme(axis.text.x = element_text(angle=90, vjust = 0.5, hjust = 1)) + ggtitle("Events with Highest Economic Impact")+ labs(x="Event Type", y="Total Economic Impact ($USD)")
g
In the totals above, the HURRICANE and HURRICANE/TYPHOON event types account for the greatest economic impact.