Synopsis
This analysis explores the NOAA Storm Database to answer two basic questions about severe weather events: which event types are most harmful to population health, and which have the greatest economic consequences. The code for the entire analysis is shown. The results are summarised in a table and a plot, and the dplyr R package is used to support the analysis.
Data Processing
data <- read.csv("~/Desktop/repdata-data-StormData.csv")
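The analysis only uses a handful of columns (EVTYPE, FATALITIES, INJURIES, PROPDMG, CROPDMG), so it can help to check that they are present and keep only those. The following is a minimal sketch under stated assumptions: it writes a small made-up table to a temporary file as a stand-in for the real CSV, and the REMARKS column is only a hypothetical example of a column the analysis does not need.

```r
# Toy stand-in for the storm file: a few made-up rows written to a temporary CSV.
# REMARKS is a hypothetical example of a column the analysis does not need.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(2, 0, 0),
  INJURIES   = c(10, 1, 0),
  PROPDMG    = c(25, 5, 1.5),
  CROPDMG    = c(0, 2, 0.5),
  REMARKS    = c("a", "b", "c")
)
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# Columns used later in the analysis.
needed <- c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "CROPDMG")

raw <- read.csv(path, stringsAsFactors = FALSE)

# Fail early if the file layout is not what the analysis expects.
missing_cols <- setdiff(needed, names(raw))
stopifnot(length(missing_cols) == 0)

# Keep only the columns the analysis needs.
storm <- raw[, needed]
str(storm)
```

With the real file, `path` would simply be replaced by the location of the downloaded CSV.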
Results
Questions
# Total injuries and fatalities per event type
data.injuries <- aggregate(data$INJURIES, by = list(data$EVTYPE), sum)
data.fatalities <- aggregate(data$FATALITIES, by = list(data$EVTYPE), sum)
# Keep the ten event types with the highest totals in each measure
data.injuries.high <- data.injuries[head(order(data.injuries$x, decreasing = TRUE), 10), ]
data.fatalities.high <- data.fatalities[head(order(data.fatalities$x, decreasing = TRUE), 10), ]
# Place the two independent top-ten rankings side by side
data.harm <- cbind(data.fatalities.high, data.injuries.high)
names(data.harm) <- c("Event Types", "Fatalities", "Event Types", "Injuries")
data.harm
##        Event Types Fatalities       Event Types Injuries
## 834        TORNADO       5633           TORNADO    91346
## 130 EXCESSIVE HEAT       1903         TSTM WIND     6957
## 153    FLASH FLOOD        978             FLOOD     6789
## 275           HEAT        937    EXCESSIVE HEAT     6525
## 464      LIGHTNING        816         LIGHTNING     5230
## 856      TSTM WIND        504              HEAT     2100
## 170          FLOOD        470         ICE STORM     1975
## 585    RIP CURRENT        368       FLASH FLOOD     1777
## 359      HIGH WIND        248 THUNDERSTORM WIND     1488
## 19       AVALANCHE        224              HAIL     1361
Conclusion: The table shows that TORNADO is the most harmful event type with respect to population health, with the most fatalities (5,633) and by far the most injuries (91,346).
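The injuries ranking lists TSTM WIND and THUNDERSTORM WIND as separate event types, which suggests that some EVTYPE labels are spelling variants of the same event. The sketch below shows one way such variants could be merged before aggregating. It is only an illustration: the rows and counts are made up, `normalise_evtype` is a hypothetical helper, and the single rule it applies (expanding a leading TSTM to THUNDERSTORM) is an assumption rather than an official mapping.

```r
# Toy rows; the counts are made up for illustration only.
toy <- data.frame(
  EVTYPE     = c("TSTM WIND", "THUNDERSTORM WIND", " tstm wind", "TORNADO"),
  FATALITIES = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: trim whitespace, upper-case, and expand a leading
# "TSTM" abbreviation to "THUNDERSTORM".
normalise_evtype <- function(x) {
  x <- toupper(trimws(x))
  sub("^TSTM", "THUNDERSTORM", x)
}

toy$EVTYPE <- normalise_evtype(toy$EVTYPE)

# Aggregate after normalising, then sort from highest to lowest total.
totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)
totals <- totals[order(totals$FATALITIES, decreasing = TRUE), ]
print(totals)
```

In this toy table the three wind rows collapse into a single THUNDERSTORM WIND row. Whether merging labels changes the ranking in the real data would need to be checked against the full dataset.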
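# Total property and crop damage values per event type
data.prop <- aggregate(data$PROPDMG, by = list(data$EVTYPE), sum)
data.crop <- aggregate(data$CROPDMG, by = list(data$EVTYPE), sum)
# Both aggregates are grouped by EVTYPE, so their rows line up and can be added
data.cost <- data.frame(data.prop[, 1], data.prop[, 2] + data.crop[, 2])
names(data.cost) <- c("Event Types", "Economics Cost")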
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Sort by combined cost, highest first, and keep the top ten event types
data.cost <- arrange(data.cost, desc(data.cost[, 2]))
data.cost.high <- head(data.cost, 10)
# Re-create the factor so that only the ten remaining levels are plotted
data.cost.high[, 1] <- factor(data.cost.high[, 1])
plot(data.cost.high[, 1], data.cost.high[, 2] / 100000, xlab = "Event Types", ylab = "Economic Cost (10^5 US Dollars)", main = "Event Types with Greatest Economic Consequences")
Conclusion: Based on the summed PROPDMG and CROPDMG values in the dataset and the plot, TORNADO has the greatest economic cost.
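The Storm Data file also carries PROPDMGEXP and CROPDMGEXP columns, which hold a magnitude code for each damage figure. The totals above add the PROPDMG and CROPDMG values as they are. The sketch below shows how the codes could be applied before summing. It rests on several assumptions: the rows are made up, `exp_multiplier` is a hypothetical helper, and the mapping (H = hundreds, K = thousands, M = millions, B = billions, anything else treated as 1) is a simplification that ignores other codes that may appear in the real file.

```r
# Toy rows; the damage figures are made up for illustration only.
toy <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "TORNADO", "HAIL"),
  PROPDMG    = c(2, 500, 3, 10),
  PROPDMGEXP = c("B", "K", "M", ""),
  CROPDMG    = c(1, 0, 0, 5),
  CROPDMGEXP = c("M", "", "", "K"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: translate a magnitude code into a numeric multiplier.
# Assumed mapping; blank or unrecognised codes fall back to 1.
exp_multiplier <- function(code) {
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  m <- lookup[toupper(code)]
  m[is.na(m)] <- 1
  unname(m)
}

# Scale each damage figure by its multiplier, then add property and crop damage.
toy$COST <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP) +
  toy$CROPDMG * exp_multiplier(toy$CROPDMGEXP)

# Total cost per event type, highest first.
cost <- aggregate(COST ~ EVTYPE, data = toy, FUN = sum)
cost <- cost[order(cost$COST, decreasing = TRUE), ]
print(cost)
```

In this toy table the unscaled sum ranks TORNADO first (503), while the scaled totals rank FLOOD first, because a single figure coded in billions outweighs many coded in thousands. This shows only that the ranking can be sensitive to the magnitude codes. It says nothing about the real data until the same step is run on the full file.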