Title: This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
12/5/2019
Synopsis
This document processes storm data from the NOAA storm database and analyzes which weather event types cause the most harm to the population (fatalities and injuries) and the most economic damage (property and crop damage).
Data processing
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",destfile = "file1.csv.bz")
file2<-read.csv("file1.csv.bz")
file3<-file2[,c("EVTYPE","FATALITIES","INJURIES","PROPDMG","CROPDMG")]
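The download above runs every time the document is knitted. A minimal sketch of a guard that skips the download when the archive is already on disk; `fetch_once` is a hypothetical helper name introduced here for illustration and is not part of the original analysis.

```r
# Hypothetical helper: download the archive only if it is not already on disk.
# Returns the destination path (invisibly) so it can be passed on to read.csv().
fetch_once <- function(url, destfile) {
  if (!file.exists(destfile)) {
    # mode = "wb" keeps the compressed file intact on Windows.
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Usage with the same URL and file name as above (commented out to avoid
# triggering a download here):
# fetch_once("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
#            "file1.csv.bz")
```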
Data processing for damages to people
Calculate total fatalities by event type
pfat<-aggregate(file3$FATALITIES,by=list(file3$EVTYPE),FUN=sum)
Calculate total injuries by event type
pinj<-aggregate(file3$INJURIES,by=list(file3$EVTYPE),FUN=sum)
names(pfat)<-c("EVTYPE","Number")
names(pinj)<-c("EVTYPE","Number")
pmerge<-merge(pfat,pinj,by= "EVTYPE")
Results (part 1)
pmerge$tot<-rowSums(pmerge[,2:3])
pmerge$EVTYPE[which.max(pmerge$tot)]
## [1] TORNADO
## 985 Levels: HIGH SURF ADVISORY COASTAL FLOOD FLASH FLOOD ... WND
Sort the data frame by total harm (injuries + fatalities) in descending order
pmerge2<-pmerge[with(pmerge,order(-tot)),]
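The aggregate, merge, and sort steps above can also be written more compactly. The sketch below uses the formula interface of `aggregate()` to sum both columns in one call. The `toy` data frame and its values are made up for illustration and only stand in for the real storm table.

```r
# Toy data standing in for the storm table (values are invented).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "FLOOD"),
  FATALITIES = c(2, 0, 1, 3),
  INJURIES   = c(10, 1, 5, 0)
)

# One aggregate() call sums both columns per event type,
# which replaces the two separate aggregate() calls and the merge().
harm <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)

# Total harm per event type.
harm$tot <- harm$FATALITIES + harm$INJURIES

# Sort by descending total, so the first row is the most harmful event type.
harm <- harm[order(-harm$tot), ]
harm
```

Note that the formula interface drops rows with missing values by default, which the `by = list(...)` form used above does not.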
Plotting
library("ggplot2")
pdamage<-pmerge2[1:5,]
ggplot(pdamage,aes(x=EVTYPE, y= tot))+geom_point(size=4)+xlab("Type of weather conditions")+ylab("Total number of injuries and fatalities")+ggtitle("Top 5 fatalities and injuries")+ theme(axis.text.x = element_text(angle = 45, hjust = 1))
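With `x=EVTYPE` the categories are drawn in the factor's level order, not in order of damage. A minimal sketch of ordering the axis by value with `reorder()`, again on made-up toy data; the plot is only built if ggplot2 is installed.

```r
# Toy totals (invented values) standing in for the top rows of pmerge2.
toy <- data.frame(
  EVTYPE = c("FLOOD", "HAIL", "TORNADO"),
  tot    = c(3, 1, 18)
)

# reorder() sets the factor levels by ascending -tot,
# so the event type with the largest total comes first on the axis.
toy$EVTYPE <- reorder(toy$EVTYPE, -toy$tot)
levels(toy$EVTYPE)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = EVTYPE, y = tot)) +
    geom_point(size = 4) +
    xlab("Type of weather conditions") +
    ylab("Total number of injuries and fatalities")
  # print(p) draws the plot.
}
```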

File processing for 2nd part
## Get economic damage ##
### Property damage ###
pprop<-aggregate(file3$PROPDMG,by=list(file3$EVTYPE),FUN=sum)
### Crop damage ###
pcrop<-aggregate(file3$CROPDMG,by=list(file3$EVTYPE),FUN=sum)
names(pprop)<-c("EVTYPE","Number")
names(pcrop)<-c("EVTYPE","Number")
Emerge<-merge(pprop,pcrop,by= "EVTYPE")
Emerge$tot<-rowSums(Emerge[,2:3])
names(Emerge)<-c("EVTYPE","Property","Crop","tot")
Emerge$EVTYPE[which.max(Emerge$tot)]
## [1] TORNADO
## 985 Levels: HIGH SURF ADVISORY COASTAL FLOOD FLASH FLOOD ... WND
Emerge2<-Emerge[with(Emerge,order(-tot)),]
### Sort the data frame by property damage in descending order ###
Emerge3<-Emerge[with(Emerge,order(-Property)),]
### Sort the data frame by crop damage in descending order ###
Emerge4<-Emerge[with(Emerge,order(-Crop)),]
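The sums above use the raw `PROPDMG` and `CROPDMG` values. In the NOAA file these columns are accompanied by `PROPDMGEXP` and `CROPDMGEXP`, which hold a magnitude code for each row. A sketch of converting a value to dollars, assuming only the codes K, M, and B (thousand, million, billion) are interpreted and every other code is treated as a multiplier of 1. `exp_multiplier` is a hypothetical helper, the handling of the remaining codes is an assumption, and the toy values are invented.

```r
# Hypothetical helper: map an exponent code to a numeric multiplier.
# Assumption: only K/M/B are interpreted; any other code (including "") counts as 1.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]  # unknown codes yield NA
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy rows (invented) showing the conversion.
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1),
  PROPDMGEXP = c("K", "m", "")
)
toy$PROPDMG_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
toy
```

To use this on the real data, `PROPDMGEXP` and `CROPDMGEXP` would also have to be kept when subsetting the columns, and the converted values aggregated in place of the raw ones.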
Results
Edamage<-Emerge2[1:5,]
Edamage2<-Emerge3[1:5,]
Edamage3<-Emerge4[1:5,]
### Summary statistics for the top 5 property damage totals ###
summary(Edamage2$Property)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##  876844  899938 1335966 1549026 1420125 3212258
### Summary statistics for the top 5 crop damage totals ###
summary(Edamage3$Crop)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##  100019  109203  168038  227211  179201  579596
Plotting
library(ggplot2)
require(gridExtra)
## Loading required package: gridExtra
p1<-ggplot(Edamage2,aes(x=EVTYPE, y= Property))+geom_point(size=4,color="blue")+xlab("Type of weather conditions")+ylab("Property damages")+ggtitle("Top 5 property damages")+theme(text = element_text(size=7))
p2<-ggplot(Edamage3,aes(x=EVTYPE, y= Crop))+geom_point(size=4,color="red")+xlab("Type of weather conditions")+ylab("Crop damages")+ggtitle("Top 5 crop damages")+theme(text = element_text(size=7))
grid.arrange(p1,p2,ncol=2)

Summary
From the results and graphs, tornadoes caused the largest combined number of injuries and fatalities. In terms of economic impact, tornadoes caused the most property damage and hail caused the most crop damage.