Tornadoes cause enormous damage in the United States (US). From 1950 to November 2011, tornadoes caused over 96,000 injuries or deaths in the US. This is more than nine times the number of people affected by either the second or third most harmful weather event (flood and thunderstorm wind). Tornadoes also cause the most combined property and crop damage of any weather event, although other events such as thunderstorm wind and flooding carry financial costs close to that of tornadoes.
The data for this analysis were downloaded from the Coursera Reproducible Research class website using the URL and code below. The data originally come from the National Weather Service.
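# Working directory and file name (adjust wd for your own machine)
wd <- "C:\\Users\\chrisc\\Documents\\R"
dataFile <- "repdata_data_StormData.csv.bz2"
destinationFullPath <- file.path(wd, dataFile)
setwd(wd)
# Download the compressed file only if it is not already present
if (!file.exists(destinationFullPath)) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destinationFullPath, mode = "wb")
}
# read.csv reads the .bz2 file directly; skip the load if stormData already exists
if (!exists("stormData")) {
  stormData <- read.csv(dataFile, stringsAsFactors = FALSE)
}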
Because these data contain many duplicate event types, I consolidated the ones that were obvious duplicates and most likely to affect the analysis. Since weather is not my area of expertise and some potential event type changes would require judgment calls, I was conservative in this consolidation.
stormData$EVTYPE[stormData$EVTYPE == "TSTM WIND"] <- "THUNDERSTORM WIND"
stormData$EVTYPE[stormData$EVTYPE == "THUNDERSTORM WINDS"] <- "THUNDERSTORM WIND"
stormData$EVTYPE[stormData$EVTYPE == "THUNDERSTORM WINDSS"] <- "THUNDERSTORM WIND"
stormData$EVTYPE[stormData$EVTYPE == "FLASH FLOOD"] <- "FLOOD"
stormData$EVTYPE[stormData$EVTYPE == "FLASH FLOODS"] <- "FLOOD"
stormData$EVTYPE[stormData$EVTYPE == "FLASH FLOODING"] <- "FLOOD"
stormData$EVTYPE[stormData$EVTYPE == "FLOODING"] <- "FLOOD"
stormData$EVTYPE[stormData$EVTYPE == "FLOOD/FLASH FLOOD"] <- "FLOOD"
stormData$EVTYPE[stormData$EVTYPE == "FLASH FLOOD/FLOOD"] <- "FLOOD"
stormData$EVTYPE[stormData$EVTYPE == "FLASH FLOODING/FLOOD"] <- "FLOOD"
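The same kind of consolidation could also be driven by a lookup table, which keeps all of the label decisions in one place. The following is a minimal sketch on toy data, not the code used for this analysis; the names `evtypeMap` and `consolidateEvtype` are hypothetical, and the sample labels are illustrative only.

```r
# Toy stand-in for stormData$EVTYPE; the values are illustrative only
evtype <- c("TSTM WIND", "FLASH FLOOD", "TORNADO", "FLOODING", "HAIL")

# Hypothetical lookup: names are raw labels, values are consolidated labels
evtypeMap <- c(
  "TSTM WIND"          = "THUNDERSTORM WIND",
  "THUNDERSTORM WINDS" = "THUNDERSTORM WIND",
  "FLASH FLOOD"        = "FLOOD",
  "FLOODING"           = "FLOOD"
)

# Replace only the labels found in the lookup; leave all others unchanged
consolidateEvtype <- function(x, map) {
  x <- as.character(x)
  hit <- x %in% names(map)
  x[hit] <- unname(map[x[hit]])
  x
}

consolidated <- consolidateEvtype(evtype, evtypeMap)
print(consolidated)
```

Adding or removing a consolidation rule then only means editing the lookup vector, which makes the conservative choices easier to review.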
To determine which event types are most harmful to population health, I aggregated the injury and fatality data by event type and displayed the five largest totals in a bar graph. The bar graph illustrates that tornadoes have the largest impact.
sumFatalitiesByEvent <- aggregate(FATALITIES~EVTYPE, stormData, sum)
sumInjuriesByEvent <- aggregate(INJURIES~EVTYPE, stormData, sum)
sumHarmByEvent <- merge(sumFatalitiesByEvent, sumInjuriesByEvent)
sumHarmByEvent$Total <- sumHarmByEvent$FATALITIES + sumHarmByEvent$INJURIES
sortedEvents <- sumHarmByEvent[order(-sumHarmByEvent$Total),]
library(ggplot2)
p1 <-ggplot(data=sortedEvents[1:5,], aes(x= reorder(EVTYPE, -Total), y=Total)) +
geom_bar(stat="identity", color="blue", fill="steelblue") +
geom_text(aes(label=Total), vjust=-0.5, size=3.5) +
theme_minimal()
print(p1 + ggtitle("Weather Events with the Most Injuries and Fatalities from 1950-2011") + labs(y="People Impacted (Injury or Fatality)", x = "Weather Event"))
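The aggregate-then-merge steps above can also be written as a single `aggregate()` call by binding both columns on the left-hand side of the formula. This is a small sketch on made-up data that reuses the storm data's column names; the data frame `toy` and its values are assumptions for illustration only.

```r
# Toy data with the same column names as the storm data; values are made up
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(2, 1, 3, 0, 0),
  INJURIES   = c(10, 4, 20, 1, 2),
  stringsAsFactors = FALSE
)

# Sum both columns per event type in one aggregate() call
harm <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
harm$Total <- harm$FATALITIES + harm$INJURIES

# Sort from most to least harmful
harm <- harm[order(-harm$Total), ]
print(harm)
```

Because both sums come from the same call, the rows are aligned by construction and no `merge()` is needed.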
To determine which event types have the greatest economic consequences, I aggregated the property and crop damage data by event type and displayed the five largest totals in a bar graph. The bar graph illustrates that tornadoes have the largest economic impact, but other weather events also cause economic damage on a similar scale.
sumPropDamageByEvent <- aggregate(PROPDMG~EVTYPE, stormData, sum)
sumCropDamageByEvent <- aggregate(CROPDMG~EVTYPE, stormData, sum)
sumAllDamageByEvent <- merge(sumPropDamageByEvent, sumCropDamageByEvent)
# Add the columns of the merged data frame so the rows are guaranteed to line up
sumAllDamageByEvent$Total <- sumAllDamageByEvent$PROPDMG + sumAllDamageByEvent$CROPDMG
sortedEvents <- sumAllDamageByEvent[order(-sumAllDamageByEvent$Total),]
library(ggplot2)
# Plot the combined property and crop damage (Total), matching the sort order above
p2 <- ggplot(data=sortedEvents[1:5,], aes(x=reorder(EVTYPE, -Total), y=Total)) +
geom_bar(stat="identity", color="blue", fill="steelblue") +
ggtitle("Weather Events with the Most Economic Damage from 1950-2011") + labs(y="Economic Damage", x = "Weather Event")
print(p2)
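The storm data also include `PROPDMGEXP` and `CROPDMGEXP` columns that record a magnitude code for each damage figure. The sketch below shows one way such codes might be converted to dollar amounts before summing. It is an illustration on made-up rows, not part of the analysis above; the helper name `expToMultiplier` is hypothetical, and the mapping (K = thousand, M = million, B = billion, anything else treated as 1) is an assumption.

```r
# Toy rows using the storm data's column names; values are made up
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HAIL"),
  PROPDMG    = c(25, 1.5, 2, 10),
  PROPDMGEXP = c("K", "M", "m", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a magnitude code to a numeric multiplier.
# Assumption: K = 1e3, M = 1e6, B = 1e9; any other code is treated as 1.
expToMultiplier <- function(code) {
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
  m <- multipliers[toupper(as.character(code))]
  m[is.na(m)] <- 1
  unname(m)
}

# Scale each damage figure, then sum by event type
toy$PropDollars <- toy$PROPDMG * expToMultiplier(toy$PROPDMGEXP)
propByEvent <- aggregate(PropDollars ~ EVTYPE, data = toy, FUN = sum)
print(propByEvent)
```

How unrecognized codes should be handled is a judgment call; treating them as a multiplier of 1 is only the simplest choice.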