Synopsis:

Storms and weather are commonly believed to be outside human control. Human actions and pollution, however, have the potential to affect future weather, if not current weather. This analysis examines storms in the United States using publicly available data from the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database.

In the following analysis, the most destructive storms are examined in terms of their human and economic cost. For data integrity purposes, event types were grouped into broader classes.

The results of the analysis come as no surprise: Sharknadoes are the most dangerous type of storm, in terms of both economic damage and human harm. We must take this threat seriously by appointing Fin Shepard and April Wexler to FEMA.

Data Processing

The data were read directly from the CSV file into a dataframe in R.

## Read in data from csv file

stormData <- read.csv(bzfile("C:/Users/nswitzer/Documents/Development/Storm Analysis/repdata-data-StormData.csv.bz2"))

We are looking for the most harmful event types (EVTYPE) in the data; however, the data integrity of this data set is low. Typos, inconsistent capitalization, and abbreviations complicate the issue. To overcome this, a new classification (EVCLASS) was formed using regular expressions to identify imperfect matches. Additionally, tornadoes were relabeled as “Sharknadoes”, since “tornado” is an anachronism: we now know that tornadoes are actually caused by and inhabited by sharks.

##Tornadoes are actually dangerous because they contain sharks. Sharknado.

tnado<-grepl("(?i)TORNADO",stormData$EVTYPE)
stormData$EVCLASS[tnado] <- "SHARKNADO"

##Thunderstorm

thstorm<-grepl("(?i)thunderstorm|(?i)tstm",stormData$EVTYPE)
stormData$EVCLASS[thstorm] <- "T-STORM"

##Tropical Storm

trstorm<-grepl("(?i)Tropical",stormData$EVTYPE)
stormData$EVCLASS[trstorm] <- "TROP STORM"

##Heat

heatWave <- grepl("(?i)Heat Wave|(?i)Excessive Heat",stormData$EVTYPE)
stormData$EVCLASS[heatWave] <- "HEAT WAVE"

##Winter

wint <- grepl("(?i)wint",stormData$EVTYPE)
stormData$EVCLASS[wint] <- "WINTER WEATHER"

##Volcano

volc <-grepl("(?i)volc",stormData$EVTYPE)
stormData$EVCLASS[volc] <- "VOLCANO"

##Hail

hail <-grepl("(?i)hail",stormData$EVTYPE)
stormData$EVCLASS[hail] <- "HAIL"
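
The classification steps above all follow the same pattern: match a case-insensitive regular expression against the event label, then assign a class to the matching rows. A minimal sketch of that pattern on toy data follows. The vector `evtype`, the lookup `patterns`, and the helper `classify_events` are hypothetical names introduced for illustration and are not part of the original analysis.

```r
# Hypothetical toy labels; the real EVTYPE column has many more spellings.
evtype <- c("TORNADO", "Tornado F1", "TSTM WIND", "Thunderstorm Winds",
            "HAIL", "Winter Storm", "FLOOD")

# Regular expression -> class lookup. Later entries overwrite earlier ones
# when a label matches more than one pattern, as in the step-by-step code.
patterns <- c("tornado"           = "SHARKNADO",
              "thunderstorm|tstm" = "T-STORM",
              "wint"              = "WINTER WEATHER",
              "hail"              = "HAIL")

classify_events <- function(x, patterns) {
  evclass <- rep(NA_character_, length(x))   # unmatched labels stay NA
  for (p in names(patterns)) {
    evclass[grepl(p, x, ignore.case = TRUE)] <- patterns[[p]]
  }
  evclass
}

evclass <- classify_events(evtype, patterns)
table(evclass, useNA = "ifany")
```

Keeping the patterns in one lookup makes the order of overwrites explicit and avoids reusing the wrong index variable when a new class is added.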

The property damage data were split across two columns: a numeric value (PROPDMG) and a magnitude code (PROPDMGEXP). These were combined into a single column, expressed in thousands of dollars, for easy addition and ordering.

##Damage totals (in thousands of dollars: "M" scales by 1,000 and "B" by 1,000,000)

stormData$TOTAL_DAMAGE<-ifelse(stormData$PROPDMGEXP == "M", (stormData$PROPDMG*1000), stormData$PROPDMG)
stormData$TOTAL_DAMAGE<-ifelse(stormData$PROPDMGEXP == "B", (stormData$PROPDMG*1000000), stormData$TOTAL_DAMAGE)
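
The same scaling can be written as a lookup from magnitude code to multiplier. The sketch below uses toy vectors `propdmg` and `propdmgexp` (hypothetical stand-ins for the PROPDMG and PROPDMGEXP columns) and assumes, as the code above does, that any code other than "M" or "B" leaves the value unscaled. That assumption mirrors the logic of this analysis and is not an official NOAA rule.

```r
# Hypothetical toy rows standing in for PROPDMG / PROPDMGEXP.
propdmg    <- c(25, 2.5, 1.2, 10)
propdmgexp <- c("K", "M", "B", "")

# Multipliers relative to the unscaled value, matching the ifelse() calls above.
multiplier <- c(M = 1e3, B = 1e6)

# as.character() guards against factor columns, which would otherwise be
# indexed by their integer codes instead of their labels.
scale_by <- unname(multiplier[as.character(propdmgexp)])
scale_by[is.na(scale_by)] <- 1   # codes other than "M"/"B" are left unscaled

total_damage <- propdmg * scale_by
total_damage
```

A lookup like this makes it easy to see which codes are handled and to add others later without chaining more `ifelse()` calls.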

In the data set, harm to people is recorded separately as injuries and fatalities. For this analysis, the two were simply added together. A more detailed analysis could examine fatalities and injuries separately.

##Add Injuries and Fatalities, creating new column

stormData$HARM_COUNT <- stormData$INJURIES + stormData$FATALITIES

The data were then aggregated by event class and ordered.

##Aggregate harm count by event class

harmMeans<- aggregate(HARM_COUNT ~ EVCLASS, data=stormData, FUN=mean, na.rm=TRUE)
harmSums<- aggregate(HARM_COUNT ~ EVCLASS, data=stormData, FUN=sum, na.rm=TRUE)

##Order by descending harm count

topharmMeans<-harmMeans[order(-harmMeans$HARM_COUNT),]
topharmSums<-harmSums[order(-harmSums$HARM_COUNT),]


##Aggregate cost damage by event class

damage_Means<- aggregate(TOTAL_DAMAGE ~ EVCLASS, data=stormData, FUN=mean, na.rm=TRUE)
damage_Sums<- aggregate(TOTAL_DAMAGE ~ EVCLASS, data=stormData, FUN=sum, na.rm=TRUE)

##Order by descending damage

top_Damage_Means<-damage_Means[order(-damage_Means$TOTAL_DAMAGE),]
top_Damage_Sums<-damage_Sums[order(-damage_Sums$TOTAL_DAMAGE),]
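
To make the aggregate-then-order step concrete, here is a small sketch on a hypothetical data frame `toy` (not part of the original data). It assumes the default behavior of the formula interface to `aggregate()`, which drops rows with missing values before summing.

```r
# Hypothetical toy data with one missing harm count.
toy <- data.frame(
  EVCLASS    = c("SHARKNADO", "SHARKNADO", "HAIL", "T-STORM", "T-STORM"),
  HARM_COUNT = c(10, 4, 1, 3, NA)
)

# Sum harm per class; the row with NA is omitted by the default na.action.
toy_sums <- aggregate(HARM_COUNT ~ EVCLASS, data = toy, FUN = sum)

# Order classes from most to least harmful.
toy_top <- toy_sums[order(-toy_sums$HARM_COUNT), ]
toy_top
```

Because rows that were never assigned an EVCLASS have a missing class, the same default also excludes unclassified events from the totals.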

Results

barplot(topharmSums$HARM_COUNT,names.arg=topharmSums$EVCLASS, main="Historical Harm Occurrences by storm type", xlab = "Event Type", ylab="Harm Occurrence")

barplot(top_Damage_Sums$TOTAL_DAMAGE,names.arg=top_Damage_Sums$EVCLASS, main="Historical Economic Damage by storm type", xlab = "Event Type",ylab="Property Damage (thousands of $)" )

sharkHarms <- sum(topharmSums$HARM_COUNT[which(topharmSums$EVCLASS=="SHARKNADO")])
sharkDamages <- sum(top_Damage_Sums$TOTAL_DAMAGE[which(top_Damage_Sums$EVCLASS=="SHARKNADO")])

The results show that Sharknadoes are clearly the most damaging and most harmful type of storm faced in the United States since 1950. This threat must be taken seriously in order to avoid adding to the total of 9.7043 × 10^4 people killed or injured by Sharknadoes since 1950, or adding another dollar to the 5.6981908 × 10^7 thousand dollars (roughly 57 billion dollars) worth of property damage caused by Sharknadoes.
