Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
Data
The data for this assignment come in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. You can download the file from the course web site:
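## Download the compressed file
fileURL <- "http://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
## mode = "wb" keeps the binary archive intact on Windows
download.file(fileURL, "repdata-data-StormData.csv.bz2", mode = "wb")
## Read the data; the bzfile connection decompresses on the fly
myData <- read.csv(bzfile("repdata-data-StormData.csv.bz2"), sep = ",", header = TRUE)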
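When the report is re-run, the download step repeats each time. A minimal sketch of one way to skip it when a local copy already exists is shown here; `fetch_if_missing` is a hypothetical helper name and not part of the original analysis, and the demonstration URL is a placeholder that is never contacted.

```r
# Hypothetical helper (an assumption, not from the original analysis):
# download a file only when no local copy exists yet.
fetch_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    # mode = "wb" keeps the compressed file intact on Windows
    download.file(url, dest, mode = "wb")
  }
  invisible(dest)
}

# Demonstration with a file that already exists, so nothing is downloaded
existing <- tempfile(fileext = ".csv.bz2")
file.create(existing)
result <- fetch_if_missing("http://example.invalid/unused", existing)
```

In the report itself, the call would take `fileURL` and the local file name used above as its two arguments.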
Q1: Across the United States, which types of events are most harmful with respect to population health? To answer this question, we first build two subsets, one totalling fatalities and one totalling injuries by event type, and then plot the event types whose totals exceed the mean.
library(ggplot2)
## Total fatalities per event type
Fatalities <- aggregate(FATALITIES ~ EVTYPE, myData, sum)
## Keep event types whose total is above the mean
Fatalities <- Fatalities[Fatalities$FATALITIES > mean(Fatalities$FATALITIES), ]
## Bar plot of total fatalities
fat <- ggplot(Fatalities, aes(x = EVTYPE, y = FATALITIES)) +
  geom_bar(stat = "identity", aes(fill = FATALITIES)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Total Fatalities by Storm Type", x = "", y = "Total Fatalities")
print(fat)
## Total injuries per event type
INJ <- aggregate(INJURIES ~ EVTYPE, myData, sum)
## Keep event types whose total is above the mean
INJ <- INJ[INJ$INJURIES > mean(INJ$INJURIES), ]
## Bar plot of total injuries
inj <- ggplot(INJ, aes(x = EVTYPE, y = INJURIES)) +
  geom_bar(stat = "identity", aes(fill = INJURIES)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Total Injuries by Storm Type", x = "", y = "Total Injuries")
print(inj)
##result
paste("Clearly, TORNADO has the most harmful impact on public health with", max(Fatalities$FATALITIES), "fatalities and", max(INJ$INJURIES), "injuries")
## [1] "Clearly, TORNADO has the most harmful impact on public health with 5633 fatalities and 91346 injuries"
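The summary sentence names the event type by hand and takes only the maximum count from the data. A small sketch of how the name could be read from the aggregated table as well, using `which.max`, is shown here on made-up records (the `toy` data frame is an assumption for illustration and is not NOAA data).

```r
# Made-up records for illustration only; not NOAA data
toy <- data.frame(
  EVTYPE = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(3, 1, 4, 2, 0),
  stringsAsFactors = FALSE
)

# Total fatalities per event type, as in the analysis
totals <- aggregate(FATALITIES ~ EVTYPE, toy, sum)

# Row with the largest total; this carries both the name and the count
top <- totals[which.max(totals$FATALITIES), ]

msg <- paste(top$EVTYPE, "has the most fatalities with", top$FATALITIES)
print(msg)
```

Reading the name from the table keeps the sentence consistent with the data if the input file or the filtering changes.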
Q2: Across the United States, which types of events have the greatest economic consequences? To answer this question, the columns PROPDMG and PROPDMGEXP are used. The amounts in PROPDMG are scaled by the unit code in PROPDMGEXP, where “K”, “M”, and “B” mean thousands, millions, and billions of dollars respectively; only records carrying one of these three codes are kept. One subset holds the actual damage per record, expressed in billions of dollars, and a second holds the total damage per event type. A plot is then used to determine which event types have the greatest impact on the economy.
## Keep records with positive damage and a K/M/B unit code
eco <- myData[myData$PROPDMG > 0 & myData$PROPDMGEXP %in% c("K", "M", "B"), c("EVTYPE", "PROPDMG", "PROPDMGEXP")]
## Convert damage to billions of dollars
multiplier <- data.frame(PROPDMGEXP = c("K", "M", "B"), multiplier = c(1e3, 1e6, 1e9))
eco <- merge(eco, multiplier, by = "PROPDMGEXP")
eco$DAMAGE <- eco$PROPDMG * eco$multiplier / 1e9
## Total damage per event type, keeping totals above the mean
ecodamage <- aggregate(DAMAGE ~ EVTYPE, eco, sum)
ecodamage <- ecodamage[ecodamage$DAMAGE > mean(ecodamage$DAMAGE), ]
## Bar plot of total damage
ggplot(ecodamage, aes(x = EVTYPE, y = DAMAGE)) +
  geom_bar(stat = "identity", aes(fill = DAMAGE)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Total Damage ($ billion) by Storm Type", x = "", y = "Total Damages")
paste("Clearly, FLOODS have the most harmful impact on economy with", max(ecodamage$DAMAGE), "billion dollars followed by HURRICANE and TORNADO respectively.")
## [1] "Clearly, FLOODS have the most harmful impact on economy with 144.6577098 billion dollars followed by HURRICANE and TORNADO respectively."
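The unit conversion used for Q2 can be checked by hand on a few rows. The sketch below uses made-up records (the `toy` data frame is an assumption, not NOAA data) and a named-vector lookup in place of `merge`, which is a swapped-in technique rather than the method used above. It assumes PROPDMGEXP is a character column; if it was read as a factor, wrap it in `as.character()` before indexing.

```r
# Made-up records for illustration only; not NOAA data
toy <- data.frame(
  EVTYPE = c("FLOOD", "TORNADO", "FLOOD", "HAIL"),
  PROPDMG = c(2.5, 500, 1, 10),
  PROPDMGEXP = c("B", "M", "M", "K"),
  stringsAsFactors = FALSE
)

# Dollar value of each unit code
unit_value <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each row's multiplier by name, then express damage in $ billion
toy$DAMAGE <- toy$PROPDMG * unname(unit_value[toy$PROPDMGEXP]) / 1e9

# Total damage per event type, largest first
totals <- aggregate(DAMAGE ~ EVTYPE, toy, sum)
totals <- totals[order(-totals$DAMAGE), ]
print(totals)
```

Here the two FLOOD rows contribute 2.5 and 0.001 billion dollars, so FLOOD sorts first with 2.501, ahead of TORNADO at 0.5.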