Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

The raw data file needs no transformation before it is loaded.

Data Processing

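#load the packages used for sorting, reshaping and plotting
library(dplyr)
library(tidyr)
library(ggplot2)
#download the data file as data.csv.bz2 (skipped if it is already present)
if(!file.exists("data.csv.bz2")){
    #mode="wb" keeps the compressed file intact on Windows
    download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2","data.csv.bz2",mode="wb")
}
#read data from the compressed file
data<-read.csv(bzfile("data.csv.bz2"))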
#extracting required data
injuriesData<-aggregate(INJURIES~EVTYPE,data=data,sum)
fatalitiesData<-aggregate(FATALITIES~EVTYPE,data = data,sum)
cropDMGData<-aggregate(CROPDMG~EVTYPE,data=data,sum)
propDMGData<-aggregate(PROPDMG~EVTYPE,data=data,sum)

#sorting extracted data in descending order
injuriesDataSorted<-arrange(injuriesData,desc(INJURIES))
fatalitiesDataSorted<-arrange(fatalitiesData,desc(FATALITIES))
cropDMGDataSorted<-arrange(cropDMGData,desc(CROPDMG))
propDMGDataSorted<-arrange(propDMGData,desc(PROPDMG))

#keeping only the 5 most harmful event types for each measure
harmfulInjuries<-head(injuriesDataSorted,5)
harmfulFatalities<-head(fatalitiesDataSorted,5)
harmfulCropDMG<-head(cropDMGDataSorted,5)
harmfulPropDMG<-head(propDMGDataSorted,5)
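The aggregate, sort and take-the-top steps above can also be written as a single dplyr pipeline. The following is only a sketch on a made-up toy data frame (the `toy` values and the `topInjuries` name are hypothetical and not part of the analysis); it keeps the top 2 rather than the top 5 to stay short.

```r
library(dplyr)

# Toy stand-in for the storm data (values are made up for illustration)
toy <- data.frame(
    EVTYPE   = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD", "HEAT"),
    INJURIES = c(10, 3, 5, 0, 2, 7)
)

# Sum injuries per event type, sort in descending order, keep the top 2
topInjuries <- toy %>%
    group_by(EVTYPE) %>%
    summarise(INJURIES = sum(INJURIES)) %>%
    arrange(desc(INJURIES)) %>%
    head(2)

print(topInjuries)
```

The same pattern would apply to FATALITIES, CROPDMG and PROPDMG by swapping the column name.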

#merging harmfulInjuries and harmfulFatalities data
harmfulHumanHealth<-merge(harmfulFatalities,harmfulInjuries,by="EVTYPE",all = TRUE)
harmfulHumanHealth<-gather(harmfulHumanHealth,AFFECT,FREQ,-EVTYPE)

#merging harmfulCropDMG and harmfulPropDMG data
harmfulEconomics<-merge(harmfulCropDMG,harmfulPropDMG,by="EVTYPE",all = TRUE)
harmfulEconomics<-gather(harmfulEconomics,AFFECT,FREQ,-EVTYPE)
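Because each top-5 list is computed separately, an event type can appear in one list but not the other. Merging with `all = TRUE` keeps such events and fills the missing measure with `NA`, which is why the plots in the Results section filter on `!is.na(...)`. A small sketch with made-up values (the toy data frames and their names are hypothetical) shows the effect:

```r
library(tidyr)

# Toy top lists (made-up values): only TORNADO appears in both
topFatalities <- data.frame(EVTYPE = c("TORNADO", "HEAT"),
                            FATALITIES = c(50, 30))
topInjuries   <- data.frame(EVTYPE = c("TORNADO", "FLOOD"),
                            INJURIES = c(900, 400))

# Full outer join: events missing from one list get NA in that column
wide <- merge(topFatalities, topInjuries, by = "EVTYPE", all = TRUE)

# Reshape to long form: one row per (event type, measure) pair
long <- gather(wide, AFFECT, FREQ, -EVTYPE)

# Rows with NA are the ones the plots drop
longComplete <- long[!is.na(long$FREQ), ]
print(longComplete)
```

In this toy case the merge yields three event types and the long table six rows, two of which are `NA` and are removed before plotting.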

Results

The plot below shows which types of events (as indicated in the EVTYPE variable) are most harmful to population health across the United States.

ggplot(data = harmfulHumanHealth[!is.na(harmfulHumanHealth$FREQ),], aes(x=FREQ,y=EVTYPE,fill=AFFECT))+
    geom_bar(position = 'dodge', stat='identity') +
    geom_text(aes(label=FREQ),position=position_dodge(width=0.9),hjust=0)+
    xlim(0,110000)+
    labs(x="Frequency",y="Event Types",title="Most harmful weather events for human health 1950-2011")

The plot below shows which types of events have the greatest economic consequences across the United States.

ggplot(data = harmfulEconomics[!is.na(harmfulEconomics$FREQ),], aes(x=FREQ,y=EVTYPE,fill=AFFECT))+
    geom_bar(position = 'dodge', stat='identity') +
    geom_text(aes(label=FREQ),position=position_dodge(width=0.9),hjust=0)+
    xlim(0,4000000)+
    labs(x="Frequency",y="Event Types",title="Weather events that have the greatest economic consequences 1950-2011")

  • The first plot makes it clear that ‘Tornado’ is the weather event most harmful to human health, in terms of both fatalities and injuries.

  • The second plot shows that ‘Tornado’ is also the weather event with the greatest economic consequences, as measured by the summed PROPDMG and CROPDMG values.
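The economic totals above are sums of the raw PROPDMG and CROPDMG values. The NOAA file also has PROPDMGEXP and CROPDMGEXP columns that encode a magnitude for each value. The sketch below shows one way these could be applied before summing. It assumes that 'K', 'M' and 'B' stand for thousands, millions and billions, and that any other code is treated as a multiplier of 1; that fallback, the helper name `expToMultiplier` and the toy values are all assumptions made for illustration, not part of the analysis above.

```r
# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumption: K = 1e3, M = 1e6, B = 1e9; every other code falls back to 1.
expToMultiplier <- function(code) {
    lookup <- c(K = 1e3, M = 1e6, B = 1e9)
    mult <- lookup[toupper(as.character(code))]
    mult[is.na(mult)] <- 1
    unname(mult)
}

# Toy stand-in for the storm data (values are made up for illustration)
toy <- data.frame(
    EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
    PROPDMG    = c(25, 2, 500, 10),
    PROPDMGEXP = c("K", "M", "K", "")
)

# Scale each damage value by its magnitude, then sum per event type
toy$PROPDMGUSD <- toy$PROPDMG * expToMultiplier(toy$PROPDMGEXP)
totals <- aggregate(PROPDMGUSD ~ EVTYPE, data = toy, sum)
totals <- totals[order(-totals$PROPDMGUSD), ]
print(totals)
```

In the toy data the raw sums rank TORNADO first, while the scaled sums rank FLOOD first, so the ordering can depend on whether the magnitude codes are applied. Whether this affects the ranking in the real data is not checked here.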