Synopsis:

Storms and weather events are responsible of personal and property damages around the United States. This project explores the United States National Oceanic and Atmospheric Administration (NOAA) storm database. This database has a historic registry of the weather conditions and its personal and property impacts.

This analysis tries to answer the following answers: 1.-Across the United States, which types of events are most harmful with respect to population health? 2.-Across the United States, which types of events have the greatest economic consequences?

Data Processing:

This analysis requires the source file “StormData.bz2”

inputdata<-read.csv("StormData.bz2")

Data before 1996 is incomplete. We will filter that out. Minutes and Seconds will be removed aswell as they are not important in this context. Reducing dataset to the minimum needed to improve speed.

inputdata$BGN_DATE <- as.Date(inputdata$BGN_DATE,format="%m/%d/%Y")
dataset <- subset(inputdata, BGN_DATE > as.Date("1995-12-31") | FATALITIES > 0 | INJURIES > 0 | PROPDMG > 0 | CROPDMG > 0 , select = c("STATE__", "BGN_DATE", "END_DATE" ,"EVTYPE", "FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP") )

Results:

Across the United States, which type of events (as indicated in the EVTYPE variable) are most harmful with respect to population health?

In order to measure health impact we will consider both Fatalities and Injuries We display the top 10 disasters with major harm to the population

dataset$HARMPOPULATION <- dataset$FATALITIES + dataset$INJURIES
HarmList<-aggregate(dataset$HARMPOPULATION ~ dataset$EVTYPE, dataset, sum)
HarmList<- HarmList[order(HarmList[,2],decreasing=TRUE),]
Top10Harm_List<-HarmList[1:10,]
Top10Harm_Type<-HarmList[1:10,1]
Top10Harm_Value<-HarmList[1:10,2]

Based on the data above, the Tornadoes have by far more impact than other events on injuries and fatalities as shown in next chart.

library(ggplot2)
d<-ggplot(Top10Harm_List, aes(x=Top10Harm_Type,y=Top10Harm_Value))+geom_bar(stat="identity")+
  geom_text(aes(label=Top10Harm_Value),color="blue",size=2)+
  labs(title="Top Harms to Population by Type") +
  labs(y="Fatalities and Injuries", x="Type of Event")+theme(axis.text.x=element_text(angle=45,hjust=1))
d

Across the United States, which types of events have the greatest economic consequences?

In order to measure economic impact we will consider both the damages to crop and property We display the top 10 disasters with major economic impact

dataset$MONETARYDAMAGE <- dataset$CROPDMG + dataset$PROPDMG
MDAMAGELIST<-aggregate(dataset$MONETARYDAMAGE ~ dataset$EVTYPE, dataset, sum)
MDAMAGELIST<- MDAMAGELIST[order(MDAMAGELIST[,2],decreasing=TRUE),]
MDAMAGE_List<-MDAMAGELIST[1:10,]
MDAMAGE_Type<-MDAMAGELIST[1:10,1]
MDAMAGE_Value<-MDAMAGELIST[1:10,2]

Based on the data above, Tornadoes are the more harmful considering monetary impact. As shown in next chart:

g<-ggplot(MDAMAGE_List, aes(x=MDAMAGE_Type,y=MDAMAGE_Value))+geom_bar(stat="identity")+
  geom_text(aes(label=MDAMAGE_Value),color="blue",size=2)+
  labs(title="Top Damages to Crop and Property by Type") +
  labs(y="Crop and Property Damages", x="Type of Event")+theme(axis.text.x=element_text(angle=45,hjust=1))
g