This document explores the NOAA Storm Database to answer some basic questions about severe weather events: which event types cause the most fatalities, injuries and property damage.
This document reviews the top 10 event types by fatalities, injuries and property damage. In all three categories the tornado is the most harmful. Excessive heat also deserves attention, because both its fatalities and its injuries are high.
The data is downloaded from the internet and loaded into a data.frame for further processing.
We download and read the file:
url = "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
destfile = "C:/TFS/Additional/R/Reproducible Research/Project2/StormData.csv.bz2"
download.file(url, destfile)
StormData <- read.csv(bzfile(destfile))
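The download step above fetches the file again on every run and writes to an absolute path. A minimal sketch of one alternative follows. It assumes a hypothetical helper `fetch_once` and a relative `data/` directory, neither of which is part of the original analysis, and skips the download when the file is already on disk:

```r
# Hypothetical helper (not from the original analysis): download a file only
# when it is not already on disk, then return its path.
fetch_once <- function(url, destfile, downloader = download.file) {
  if (!file.exists(destfile)) {
    dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
    # mode = "wb" keeps the bz2 archive binary-safe on Windows
    downloader(url, destfile, mode = "wb")
  }
  destfile
}

# Usage (the relative path "data/StormData.csv.bz2" is an assumption):
# url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
# StormData <- read.csv(bzfile(fetch_once(url, "data/StormData.csv.bz2")))
```

The `downloader` argument exists only so the helper can be exercised without network access; in normal use the default `download.file` is what runs.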
We aggregate the data by event type and sum the fatalities, injuries and property damage. This will allow us to view the top 10 event types by each category (fatalities, injuries and property damage). This will help us to answer our research questions.
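The same four steps (aggregate, drop zero totals, sort descending, keep the first rows) are applied to each measure. As a sketch, they could be wrapped in one function. The name `top_events` and the toy data are illustrative assumptions, not part of the original analysis:

```r
# Hypothetical helper: total one numeric column per EVTYPE and return the
# n event types with the largest non-zero totals.
top_events <- function(data, column, n = 10) {
  # Build the formula column ~ EVTYPE, as in the aggregate() calls of this report
  totals <- aggregate(reformulate("EVTYPE", response = column), data = data, FUN = sum)
  totals <- totals[totals[[column]] != 0, ]     # drop event types with a zero total
  totals <- totals[order(-totals[[column]]), ]  # largest total first
  head(totals, n)
}

# Toy data for illustration only
toy <- data.frame(
  EVTYPE = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HAIL"),
  FATALITIES = c(5, 3, 2, 1, 0)
)
top2 <- top_events(toy, "FATALITIES", n = 2)
print(top2)
```

On the real data the calls would be `top_events(StormData, "FATALITIES")`, `top_events(StormData, "INJURIES")` and `top_events(StormData, "PROPDMG")`. Unlike `[1:10,]`, `head()` does not pad the result with `NA` rows when fewer than `n` event types remain.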
Aggregate data by event type and sum the fatalities and take the top 10:
StormData.Fatalities.aggregate = aggregate(FATALITIES ~ EVTYPE, data=StormData, sum)
StormData.Fatalities = StormData.Fatalities.aggregate[StormData.Fatalities.aggregate$FATALITIES != 0,]
StormData.Fatalities.Sorted = StormData.Fatalities[order(-StormData.Fatalities$FATALITIES),]
StormData.Fatalities.Top10 = StormData.Fatalities.Sorted[1:10,]
Aggregate data by event type and sum the injuries and take the top 10:
StormData.Injuries.aggregate = aggregate(INJURIES ~ EVTYPE, data=StormData, sum)
StormData.Injuries = StormData.Injuries.aggregate[StormData.Injuries.aggregate$INJURIES != 0,]
StormData.Injuries.Sorted = StormData.Injuries[order(-StormData.Injuries$INJURIES),]
StormData.Injuries.Top10 = StormData.Injuries.Sorted[1:10,]
Aggregate data by event type and sum the property damage and take the top 10:
StormData.Damage.aggregate = aggregate(PROPDMG ~ EVTYPE, data=StormData, sum)
StormData.Damage = StormData.Damage.aggregate[StormData.Damage.aggregate$PROPDMG != 0,]
StormData.Damage.Sorted = StormData.Damage[order(-StormData.Damage$PROPDMG),]
StormData.Damage.Top10 = StormData.Damage.Sorted[1:10,]
colnames(StormData.Damage.Top10) = c("Event type", "Property Damage")
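Note that the totals above sum the raw `PROPDMG` values. In the NOAA data this column is usually read together with `PROPDMGEXP`, a magnitude code. The sketch below is an assumption about how such a conversion could look, not the method used in this report: it assumes the codes K, M and B mean thousand, million and billion, and leaves any other code at a multiplier of 1. The helper name `exp_multiplier` and the toy data are hypothetical.

```r
# Hypothetical helper. Assumed mapping: K = 1e3, M = 1e6, B = 1e9;
# blank, missing or unrecognised codes are treated as multiplier 1.
exp_multiplier <- function(code) {
  lookup <- c(K = 1e3, M = 1e6, B = 1e9)
  mult <- lookup[toupper(as.character(code))]
  mult[is.na(mult)] <- 1
  unname(mult)
}

# Toy data for illustration only
toy <- data.frame(
  PROPDMG = c(25, 2.5, 10, 3),
  PROPDMGEXP = c("K", "M", "", "b")
)
toy$PROPDMG_USD <- toy$PROPDMG * exp_multiplier(toy$PROPDMGEXP)
print(toy)
```

Ranking event types on a converted column like this may differ from ranking on raw `PROPDMG`, so the two should not be compared directly.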
We answer this research question by creating plots: first for fatalities, then for injuries, and finally for the two combined.
By fatality:
library("ggplot2")
ggplot(StormData.Fatalities.Top10, aes(reorder(EVTYPE, -FATALITIES), FATALITIES)) + geom_bar(stat="identity") + ggtitle("Top 10 event types by fatalities") +
theme(axis.text.x = element_text(angle = -90, hjust=0), plot.title = element_text(lineheight=.8, face="bold")) + xlab("Type of event")
Tornado causes the most fatalities by far. Excessive heat also causes a relatively large number of fatalities.
By Injuries:
ggplot(StormData.Injuries.Top10, aes(reorder(EVTYPE, -INJURIES), INJURIES)) + geom_bar(stat="identity") + ggtitle("Top 10 event types by Injuries") +
theme(axis.text.x = element_text(angle = -90, hjust=0), plot.title = element_text(lineheight=.8, face="bold")) + xlab("Type of event")
Tornado causes the most injuries by far.
The combination:
StormData.Combo = merge(StormData.Injuries.Top10, StormData.Fatalities.Top10)
ggplot(StormData.Combo, aes(x = INJURIES, y = FATALITIES, col = EVTYPE)) + geom_point(size = 4) + ggtitle("Event types by both injuries and fatalities") +
theme(plot.title = element_text(lineheight = .8, face = "bold"))
When looking at both injuries and fatalities, it becomes apparent that excessive heat should be considered when taking preventive measures.
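Because `merge()` is called on the two top 10 tables without further arguments, it joins on the shared `EVTYPE` column and keeps only event types that appear in both lists. The toy sketch below (invented numbers, for illustration only) shows the difference from a full join:

```r
# Toy data for illustration only; the values are not from the Storm Database
inj <- data.frame(EVTYPE = c("TORNADO", "HEAT", "FLOOD"),
                  INJURIES = c(100, 50, 20))
fat <- data.frame(EVTYPE = c("TORNADO", "HEAT", "LIGHTNING"),
                  FATALITIES = c(10, 8, 3))

inner <- merge(inj, fat)              # default: only event types present in both
outer <- merge(inj, fat, all = TRUE)  # every event type, NA where a value is missing

print(inner)
print(outer)
```

The combined plot therefore shows at most 10 points, and fewer when the two top 10 lists do not fully overlap.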
To answer this question, we look at the data prepared earlier: the top 10 event types by property damage.
StormData.Damage.Top10
##             Event type Property Damage
## 834            TORNADO       3212258.2
## 153        FLASH FLOOD       1420124.6
## 856          TSTM WIND       1335965.6
## 170              FLOOD        899938.5
## 760  THUNDERSTORM WIND        876844.2
## 244               HAIL        688693.4
## 464          LIGHTNING        603351.8
## 786 THUNDERSTORM WINDS        446293.2
## 359          HIGH WIND        324731.6
## 972       WINTER STORM        132720.6
Again, tornado causes the most damage.
Overall, we can conclude that tornado causes the most damage. We also noticed that excessive heat causes both many fatalities and many injuries.
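The table lists TSTM WIND, THUNDERSTORM WIND and THUNDERSTORM WINDS as separate event types, although they look like spelling variants of the same event. A minimal sketch of merging them before aggregating is given below. The helper `normalize_evtype` and its two rewrite rules are assumptions chosen for these three labels only, not a complete cleaning of `EVTYPE`:

```r
# Hypothetical helper: collapse a few spelling variants of thunderstorm wind.
normalize_evtype <- function(x) {
  x <- toupper(trimws(as.character(x)))
  x <- gsub("\\s+", " ", x)                                 # collapse repeated spaces
  x <- sub("^TSTM", "THUNDERSTORM", x)                      # expand the abbreviation
  x <- sub("^THUNDERSTORM WINDS$", "THUNDERSTORM WIND", x)  # drop the plural
  x
}

# The three rows from the table above
winds <- data.frame(
  EVTYPE = c("TSTM WIND", "THUNDERSTORM WIND", "THUNDERSTORM WINDS"),
  PROPDMG = c(1335965.6, 876844.2, 446293.2)
)
winds$EVTYPE <- normalize_evtype(winds$EVTYPE)
combined <- aggregate(PROPDMG ~ EVTYPE, data = winds, sum)
print(combined)
```

Taken together, the three rows sum to roughly 2.66 million in the same raw units, which would place thunderstorm wind second in the table, still behind tornado.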