Natural disasters cause loss of human life and considerable monetary damage. Here we seek to better understand which types of disasters and weather anomalies cause the most loss of life and property. The data were analysed to identify the top event types by fatalities and by property damage (in US dollars). We found that tornadoes rank first for both deaths and damage, with other wind events also among the top causes of property damage.
Below is the code that acquires the data file and loads it into a data frame. The large data set is saved in an R object format so that future analysis can be done quickly.
getData <- function(url = "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2") {
  if (exists("StormData")) {
    # Reuse the data frame if it is already in the workspace.
    print("It's already loaded!")
  } else if (file.exists("Storm.RData")) {
    # Fast path: restore the cached R object.
    print("Loaded via the RData file.")
    load("Storm.RData")
  } else {
    if (!file.exists("StormData.bz2")) {
      print("We need to download this file. This will take a bit. Do 20 pushups.")
      # mode = "wb" keeps the compressed file intact on Windows.
      download.file(url = url, destfile = "StormData.bz2", mode = "wb")
    }
    print("Must import from CSV. This could take a bit. Do 10 pushups.")
    StormData <- read.csv(bzfile("StormData.bz2"))
    # Keep the cache on disk so later runs skip the slow CSV import.
    save(StormData, file = "Storm.RData")
  }
  StormData
}
StormData <- getData()
## [1] "Loaded via the RData file."
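The caching idea can be tried in isolation on a small data frame. The following is a minimal sketch rather than the report's own code: it swaps in `saveRDS()`/`readRDS()` for `save()`/`load()`, and the helper name `cachedRead` and the toy CSV are hypothetical.

```r
# Hypothetical helper: parse a CSV once, then reuse a cached .rds copy.
cachedRead <- function(csvPath, cachePath) {
  if (file.exists(cachePath)) {
    return(readRDS(cachePath))   # fast path: restore the parsed object
  }
  dat <- read.csv(csvPath)       # slow path: parse the text file
  saveRDS(dat, cachePath)        # store it for the next call
  dat
}

# Toy stand-in for the storm file (made-up values).
csvPath <- tempfile(fileext = ".csv")
cachePath <- tempfile(fileext = ".rds")
write.csv(data.frame(EVTYPE = c("TORNADO", "HAIL"), FATALITIES = c(3, 0)),
          csvPath, row.names = FALSE)

first <- cachedRead(csvPath, cachePath)    # parses the CSV and writes the cache
second <- cachedRead(csvPath, cachePath)   # served from the cache
```

Unlike `load()`, `readRDS()` returns the object instead of restoring it under its saved name, so the caller chooses what to call the result.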
If we examine the number of event types (EVTYPE), we find that there are 985. This makes visualizing the data by event type impractical.
library(ggplot2)
length(unique(StormData$EVTYPE))
## [1] 985
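Some of these labels are likely variants of the same event; the top 5 list for property damage, for instance, contains both `TSTM WIND` and `THUNDERSTORM WIND`. The sketch below shows how labels might be normalized before counting. The toy vector, the helper name `cleanEvtype`, and the single `TSTM` rule are illustrative assumptions, not a cleaning scheme used in this report.

```r
# Toy labels (made up) showing the kind of variation found in EVTYPE.
ev <- c("TSTM WIND", "Thunderstorm Wind", " THUNDERSTORM WIND", "TORNADO", "tornado")

# Hypothetical helper: unify case and spacing, expand one abbreviation.
cleanEvtype <- function(x) {
  x <- toupper(trimws(x))                              # unify case, drop outer spaces
  x <- gsub("\\s+", " ", x, perl = TRUE)               # collapse repeated whitespace
  gsub("\\bTSTM\\b", "THUNDERSTORM", x, perl = TRUE)   # expand the abbreviation
}

rawCount <- length(unique(ev))                  # distinct raw labels
cleanCount <- length(unique(cleanEvtype(ev)))   # distinct labels after cleaning
c(raw = rawCount, clean = cleanCount)
```

Applied to the real data, a step like this would merge some categories and could change which events make the top 5.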
We can filter our data set to include only the top 5 events based on two indicators, fatalities (FATALITIES) and property damage (PROPDMG).
FatalEvent <- tapply(StormData$FATALITIES, StormData$EVTYPE, sum)
OrFatalEvent <- FatalEvent[order(-FatalEvent)]
NamesFatalEvents <- names(OrFatalEvent[1:5])
PropEvent <- tapply(StormData$PROPDMG, StormData$EVTYPE, sum)
OrPropEvent <- PropEvent[order(-PropEvent)]
NamesPropEvents <- names(OrPropEvent[1:5])
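One caveat for the property ranking: in the NOAA storm data, `PROPDMG` holds only the numeric part of each estimate, while the companion column `PROPDMGEXP` holds a magnitude code such as `K`, `M`, or `B`. Summing `PROPDMG` alone therefore mixes thousands with millions. The sketch below shows one way the two columns might be combined into dollars. The toy rows are made up, and treating every other code as a multiplier of 1 is an assumption.

```r
# Toy rows (made-up values) in the same shape as the storm data.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "FLOOD", "HAIL"),
  PROPDMG    = c(25, 2.5, 1.2, 500),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Assumed mapping: K = 1e3, M = 1e6, B = 1e9; any other code counts as 1.
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)
expo <- multiplier[toupper(toy$PROPDMGEXP)]
expo[is.na(expo)] <- 1

toy$PropDollars <- toy$PROPDMG * unname(expo)

# Total dollars per event type, largest first.
DollarEvent <- sort(tapply(toy$PropDollars, toy$EVTYPE, sum), decreasing = TRUE)
names(DollarEvent)
```

In this toy example the single billion-dollar flood outranks both tornado rows, which shows how much the ranking can depend on the exponent column.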
The top 5 events by fatalities and by property damage are listed below.
NamesFatalEvents
## [1] "TORNADO" "EXCESSIVE HEAT" "FLASH FLOOD" "HEAT"
## [5] "LIGHTNING"
NamesPropEvents
## [1] "TORNADO" "FLASH FLOOD" "TSTM WIND"
## [4] "FLOOD" "THUNDERSTORM WIND"
FatalStormData <- StormData[StormData$EVTYPE %in% NamesFatalEvents,]
a <- ggplot(FatalStormData, aes(EVTYPE, FATALITIES)) +
geom_bar(stat = "identity")
a
Figure 1. Total fatalities for the top 5 event types. Since monitoring started in the US, tornadoes have caused the most deaths.
PropStormData <- StormData[StormData$EVTYPE %in% NamesPropEvents,]
b <- ggplot(PropStormData, aes(EVTYPE, PROPDMG)) +
geom_bar(stat = "identity")
b
Figure 2. Summed PROPDMG for the top 5 event types by property damage. Since monitoring started in the US, tornadoes have caused the most damage by this measure.
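Because `geom_bar(stat = "identity")` receives one row per storm report, each bar in the figures is drawn as a stack of many small segments. An alternative is to sum per event type first and plot the totals. The following is a sketch on toy data, assuming ggplot2 (2.2.0 or later, for `geom_col()`) is installed. The names `toy`, `totals`, and `p` are illustrative.

```r
library(ggplot2)

# Toy rows (made-up values) standing in for FatalStormData.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HEAT", "LIGHTNING", "HEAT"),
  FATALITIES = c(5, 2, 3, 1, 1)
)

# One row per event type with the summed fatalities.
totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# reorder() sorts the bars by total so the ranking is readable at a glance.
p <- ggplot(totals, aes(reorder(EVTYPE, -FATALITIES), FATALITIES)) +
  geom_col() +
  labs(x = "Event type", y = "Total fatalities")
p
```

Aggregating first also keeps the plot object small, since it holds one row per event type instead of one per report.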
Wind, she is not a friend of man. Will we ever be able to tame this wild beast? Based on these results, I suggest we convert her windy rage into clean green energy. This will lower the cost of warming Hot Pockets in microwave ovens and at the same time reduce human deaths.