Title: This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

12/5/2019

Synopsis

This document processes storm data from the NOAA storm database and analyzes which weather event types cause the most harm to the population (fatalities and injuries) and the most economic damage (property and crop damage).

Data processing

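# Download the compressed storm data; read.csv() can read the bzip2 archive directly
download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",destfile = "file1.csv.bz")
file2<-read.csv("file1.csv.bz")
# Keep only the event type and the four harm/damage columns used below
file3<-file2[,c("EVTYPE","FATALITIES","INJURIES","PROPDMG","CROPDMG")]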
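
As written, the download step runs again every time the report is rebuilt. One way to avoid that is sketched below; the helper name `download_if_missing` is hypothetical and not part of the original analysis.

```r
# Hypothetical helper (not in the original report): download only when the
# destination file is absent, and return the local path either way.
download_if_missing <- function(url, destfile) {
  if (!file.exists(destfile)) {
    download.file(url, destfile = destfile, mode = "wb")
  }
  destfile
}

storm_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# Usage (commented out so the sketch itself does not trigger a network call):
# file2 <- read.csv(download_if_missing(storm_url, "file1.csv.bz"))
```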

Data processing for damages to people

Calculate total fatalities by event type

pfat<-aggregate(file3$FATALITIES,by=list(file3$EVTYPE),FUN=sum)

Calculate total injuries by event type

pinj<-aggregate(file3$INJURIES,by=list(file3$EVTYPE),FUN=sum)
names(pfat)<-c("EVTYPE","Number")
names(pinj)<-c("EVTYPE","Number")
pmerge<-merge(pfat,pinj,by= "EVTYPE")
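
The two aggregations and the merge above can also be done in a single `aggregate()` call using the formula interface. The sketch below uses a small made-up data frame, `toy`, in place of `file3`; its values are illustrative only. Note that the formula interface drops rows containing `NA` by default, which the original calls do not.

```r
# Toy data standing in for file3 (values are made up for illustration).
toy <- data.frame(
  EVTYPE     = c("TORNADO", "HAIL", "TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(2, 0, 1, 3, 0),
  INJURIES   = c(10, 1, 5, 0, 2)
)

# One aggregate() call sums both columns per event type, so no merge is needed.
harm <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
harm$tot <- harm$FATALITIES + harm$INJURIES

# Event type with the largest combined count
worst <- as.character(harm$EVTYPE[which.max(harm$tot)])
print(harm)
print(worst)
```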

Results (part 1)

pmerge$tot<-rowSums(pmerge[,2:3])
pmerge$EVTYPE[which.max(pmerge$tot)]
## [1] TORNADO
## 985 Levels:    HIGH SURF ADVISORY  COASTAL FLOOD  FLASH FLOOD ... WND

Sort the data frame by total harm (injuries + fatalities) in descending order

pmerge2<-pmerge[with(pmerge,order(-tot)),]

Plotting

library("ggplot2")
pdamage<-pmerge2[1:5,]
ggplot(pdamage,aes(x=EVTYPE, y= tot))+geom_point(size=4)+xlab("Type of weather conditions")+ylab("Total number of injuries and fatalities")+ggtitle("Top 5 fatalities and injuries")+ theme(axis.text.x = element_text(angle = 45, hjust = 1))
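
ggplot2 places a discrete x axis in factor-level order, so the five event types above are drawn alphabetically instead of by size. A minimal sketch of ordering the axis by the plotted value follows; `toy_top` is a made-up stand-in for `pdamage`, and the plot is only built if ggplot2 is installed.

```r
# Toy top-5 table standing in for pdamage (values are made up for illustration).
toy_top <- data.frame(
  EVTYPE = c("TORNADO", "HEAT", "FLOOD", "WIND", "LIGHTNING"),
  tot    = c(500, 120, 90, 80, 60)
)

# reorder() sets the factor levels by descending tot, which is the order
# ggplot2 uses for a discrete x axis.
toy_top$EVTYPE <- reorder(factor(toy_top$EVTYPE), -toy_top$tot)
print(levels(toy_top$EVTYPE))

# Build the plot only when ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(toy_top, ggplot2::aes(x = EVTYPE, y = tot)) +
    ggplot2::geom_point(size = 4)
}
```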

Data processing for economic damage (part 2)

## Get economic damage ##
### Property damage ###
pprop<-aggregate(file3$PROPDMG,by=list(file3$EVTYPE),FUN=sum)
### Crop damage ###
pcrop<-aggregate(file3$CROPDMG,by=list(file3$EVTYPE),FUN=sum)
names(pprop)<-c("EVTYPE","Number")
names(pcrop)<-c("EVTYPE","Number")
Emerge<-merge(pprop,pcrop,by= "EVTYPE")
Emerge$tot<-rowSums(Emerge[,2:3])
names(Emerge)<-c("EVTYPE","Property","Crop","tot")
Emerge$EVTYPE[which.max(Emerge$tot)]
## [1] TORNADO
## 985 Levels:    HIGH SURF ADVISORY  COASTAL FLOOD  FLASH FLOOD ... WND
Emerge2<-Emerge[with(Emerge,order(-tot)),]
### Sort the dataframe based on the descending number of property damage ###
Emerge3<-Emerge[with(Emerge,order(-Property)),]
### Sort the dataframe based on the descending number of crop damage ###
Emerge4<-Emerge[with(Emerge,order(-Crop)),]

Results

Edamage<-Emerge2[1:5,]
Edamage2<-Emerge3[1:5,]
Edamage3<-Emerge4[1:5,]
### Take a look at the property damage stat ###
summary(Edamage2$Property)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  876844  899938 1335966 1549026 1420125 3212258
### Take a look at the crop damage stat ###
summary(Edamage3$Crop)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  100019  109203  168038  227211  179201  579596

Plotting

library(ggplot2)
require(gridExtra)
## Loading required package: gridExtra
p1<-ggplot(Edamage2,aes(x=EVTYPE, y= Property))+geom_point(size=4,color="blue")+xlab("Type of weather conditions")+ylab("Property damages")+ggtitle("Top 5 property damages")+theme(text = element_text(size=7))
p2<-ggplot(Edamage3,aes(x=EVTYPE, y= Crop))+geom_point(size=4,color="red")+xlab("Type of weather conditions")+ylab("Crop damages")+ggtitle("Top 5 crop damages")+theme(text = element_text(size=7))
grid.arrange(p1,p2,ncol=2)

Summary

The results and plots show that tornadoes caused the largest combined number of injuries and fatalities. For economic impact, tornadoes account for the largest summed property damage (PROPDMG) and hail for the largest summed crop damage (CROPDMG).