Synopsis

The data for this analysis were provided by the National Oceanic and Atmospheric Administration (NOAA). The Data Processing section below gives the URL of the data file, along with the code that downloads it and loads it into a data frame.

For the purposes of this analysis we are concerned with event data that pertains to fatalities, injuries and the destruction of property, including crops.

Each of these measures is plotted by event type to show which weather events have the greatest negative impact.

The analysis shows that tornadoes are the leading cause of death and destruction among weather events in the United States. Flooding and wind also contribute significantly.

Data Processing

The main issue with this dataset is the irregularity of the names contained in the EVTYPE variable. According to the documentation at https://d396qusza40orc.cloudfront.net/repdata%2Fpeer2_doc%2Fpd01016005curr.pdf there are 48 designated types of weather event. They are listed in that document in table 2.1.1 (Storm Data Event Table).

Here we have used a series of regular expressions to map each event name in the dataset to what we feel is the most accurate permitted event type. The patterns are applied in order, so when an event name matches more than one pattern the last match wins. If you re-run this analysis yourself, examine these choices carefully, as you may disagree with some of them.
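Because later assignments overwrite earlier ones, it can help to check which patterns a given event name matches before relying on the mapping. The sketch below is a minimal illustration using a handful of the patterns from the mapping and two made-up names (`patterns`, `matching_labels`); it is not part of the original analysis. Note that a trailing `*.` means "zero or more of the preceding character, then any one character", so `.*ICE*.` matches any name containing `IC` followed by one more character.

```r
# A few pattern/label pairs copied from the mapping, in the order they are applied.
patterns <- c(
  ".*ICANE*."     = "Hurricane (Typhoon)",
  ".*ICE STORM*." = "Ice Storm",
  ".*WINTER S*."  = "Winter Storm",
  ".*WIND*."      = "Strong Wind",
  ".*ICE*."       = "Frost/Freeze"
)

# Hypothetical helper: labels of every pattern that matches one event name,
# in application order. The last element is the label that ends up assigned.
matching_labels <- function(evtype) {
  hits <- vapply(names(patterns), function(p) grepl(p, evtype), logical(1))
  unname(patterns[hits])
}

matching_labels("HURRICANE")     # "Hurricane (Typhoon)" "Frost/Freeze"
matching_labels("WINTER STORM")  # "Winter Storm" "Strong Wind"
matching_labels("ICE STORM")     # "Ice Storm" "Frost/Freeze"
```

Running a check like this over `unique(dtSubset$EVTYPE)` would be one way to spot event names whose final label differs from the one intended.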

From there we subset the data to the columns we need and use ggplot2 to plot it.

# Download the raw file once; skip the download if it is already present
if (!file.exists("repdata-data-StormData.csv.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                method = "curl", destfile = "repdata-data-StormData.csv.bz2")
}

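# bzfile() lets read.csv() read the compressed file directly
dat <- read.csv(bzfile("repdata-data-StormData.csv.bz2"))

# Normalise case so the patterns below only need upper-case forms
dat$EVTYPE <- toupper(dat$EVTYPE)

# Keep rows where columns 23:25 (FATALITIES, INJURIES, PROPDMG) are not all zero
dt <- dat[rowSums(abs(dat[, 23:25])) > 0, ]

# Keep only the columns used in the analysis
dt <- subset(dt, select = c("EVTYPE", "FATALITIES", "INJURIES", "PROPDMG", "PROPDMGEXP", "CROPDMG", "CROPDMGEXP"))

dtSubset <- dt

# Placeholder label; rows still labelled "type" after the mapping are dropped later
dtSubset$PermittedEvent <- "type"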

dtSubset$PermittedEvent[grep(".*LOW TIDE*.", dtSubset$EVTYPE)] <- "Astronomical Low Tide"
dtSubset$PermittedEvent[grep(".*AVALA*.", dtSubset$EVTYPE)] <- "Avalanche"
dtSubset$PermittedEvent[grep(".*BLIZ*.", dtSubset$EVTYPE)] <- "Blizzard"
dtSubset$PermittedEvent[grep(".*COASTAL FLOOD*.", dtSubset$EVTYPE)] <- "Coastal Flood"
dtSubset$PermittedEvent[grep(".*COLD*.", dtSubset$EVTYPE)] <- "Cold/Wind Chill"
dtSubset$PermittedEvent[grep(".*DEBRIS*.", dtSubset$EVTYPE)] <- "Debris Flow"
dtSubset$PermittedEvent[grep(".*FOG*.", dtSubset$EVTYPE)] <- "Dense Fog"
dtSubset$PermittedEvent[grep(".*SMOKE*.", dtSubset$EVTYPE)] <- "Dense Smoke"
dtSubset$PermittedEvent[grep(".*DROUGHT*.", dtSubset$EVTYPE)] <- "Excessive Heat"
dtSubset$PermittedEvent[grep(".*DUST D*.", dtSubset$EVTYPE)] <- "Dust Devil"
dtSubset$PermittedEvent[grep(".*DUST*.", dtSubset$EVTYPE)] <- "Dust Storm"
dtSubset$PermittedEvent[grep(".* HEAT*.", dtSubset$EVTYPE)] <- "Excessive Heat"
dtSubset$PermittedEvent[grep(".* COLD*.", dtSubset$EVTYPE)] <- "Extreme Cold/Wind Chill"
dtSubset$PermittedEvent[grep(".*FLASH FLOOD*.", dtSubset$EVTYPE)] <- "Flash Flood"
dtSubset$PermittedEvent[grep(".*FLOOD*.", dtSubset$EVTYPE)] <- "Flood"
dtSubset$PermittedEvent[grep(".*FREEZE*.", dtSubset$EVTYPE)] <- "Frost/Freeze"
dtSubset$PermittedEvent[grep(".*FUNNEL*.", dtSubset$EVTYPE)] <- "Funnel Cloud"
dtSubset$PermittedEvent[grep(".*FREEZING FOG*.", dtSubset$EVTYPE)] <- "Freezing Fog"
dtSubset$PermittedEvent[grep(".*HAIL*.", dtSubset$EVTYPE)] <- "Hail"
dtSubset$PermittedEvent[grep(".*HEAT*.", dtSubset$EVTYPE)] <- "Heat"
dtSubset$PermittedEvent[grep(".* RAIN*.", dtSubset$EVTYPE)] <- "Heavy Rain"
dtSubset$PermittedEvent[grep(".* SNOW*.", dtSubset$EVTYPE)] <- "Heavy Snow"
dtSubset$PermittedEvent[grep(".* SURF*.", dtSubset$EVTYPE)] <- "High Surf"
dtSubset$PermittedEvent[grep(".* WIND*.", dtSubset$EVTYPE)] <- "High Wind"
dtSubset$PermittedEvent[grep(".*ICANE*.", dtSubset$EVTYPE)] <- "Hurricane (Typhoon)"
dtSubset$PermittedEvent[grep(".*ICE STORM*.", dtSubset$EVTYPE)] <- "Ice Storm"
dtSubset$PermittedEvent[grep(".*LAKE-EFFECT SNOW*.", dtSubset$EVTYPE)] <- "Lake-Effect Snow"
dtSubset$PermittedEvent[grep(".*LAKESHORE*.", dtSubset$EVTYPE)] <- "Lakeshore Flood"
dtSubset$PermittedEvent[grep(".*LIGHTN*.", dtSubset$EVTYPE)] <- "Lightning"
dtSubset$PermittedEvent[grep(".*MARINE HAIL*.", dtSubset$EVTYPE)] <- "Marine Hail"
dtSubset$PermittedEvent[grep(".*MARINE HIGH WIND*.", dtSubset$EVTYPE)] <- "Marine High Wind"
dtSubset$PermittedEvent[grep(".*MARINE STRONG WIND*.", dtSubset$EVTYPE)] <- "Marine Strong Wind"
dtSubset$PermittedEvent[grep(".*MARINE THUN*.", dtSubset$EVTYPE)] <- "Marine Thunderstorm Wind"
dtSubset$PermittedEvent[grep(".*RIP C*.", dtSubset$EVTYPE)] <- "Rip Current"
dtSubset$PermittedEvent[grep(".*SEICHE*.", dtSubset$EVTYPE)] <- "Seiche"
dtSubset$PermittedEvent[grep(".*SLEET*.", dtSubset$EVTYPE)] <- "Sleet"
dtSubset$PermittedEvent[grep(".*SURGE*.", dtSubset$EVTYPE)] <- "Storm Surge/Tide"
dtSubset$PermittedEvent[grep(".*STRONG W*.", dtSubset$EVTYPE)] <- "Strong Wind"
dtSubset$PermittedEvent[grep(".*STORM W*.", dtSubset$EVTYPE)] <- "Thunderstorm Wind"
dtSubset$PermittedEvent[grep(".*NADO*.", dtSubset$EVTYPE)] <- "Tornado"
dtSubset$PermittedEvent[grep(".*DEPRESS*.", dtSubset$EVTYPE)] <- "Tropical Depression"
dtSubset$PermittedEvent[grep(".*ICAL S*.", dtSubset$EVTYPE)] <- "Tropical Storm"
dtSubset$PermittedEvent[grep(".*TSU*.", dtSubset$EVTYPE)] <- "Tsunami"
dtSubset$PermittedEvent[grep(".*VOLC*.", dtSubset$EVTYPE)] <- "Volcanic Ash"
dtSubset$PermittedEvent[grep(".*SPOUT*.", dtSubset$EVTYPE)] <- "Waterspout"
dtSubset$PermittedEvent[grep(".*FIRE*.", dtSubset$EVTYPE)] <- "Wildfire"
dtSubset$PermittedEvent[grep(".*WINTER S*.", dtSubset$EVTYPE)] <- "Winter Storm"
dtSubset$PermittedEvent[grep(".*WINTER W*.", dtSubset$EVTYPE)] <- "Winter Weather"
dtSubset$PermittedEvent[grep(".*FLD*.", dtSubset$EVTYPE)] <- "Flood"
dtSubset$PermittedEvent[grep(".*NDAO*.", dtSubset$EVTYPE)] <- "Tornado"
dtSubset$PermittedEvent[grep(".*WIND*.", dtSubset$EVTYPE)] <- "Strong Wind"
dtSubset$PermittedEvent[grep(".*RAIN*.", dtSubset$EVTYPE)] <- "Heavy Rain"
dtSubset$PermittedEvent[grep(".*SNOW*.", dtSubset$EVTYPE)] <- "Heavy Snow"
dtSubset$PermittedEvent[grep(".*DRY M*.", dtSubset$EVTYPE)] <- "Heat"
dtSubset$PermittedEvent[grep(".*ICY R*.", dtSubset$EVTYPE)] <- "Frost/Freeze"
dtSubset$PermittedEvent[grep(".*ICE*.", dtSubset$EVTYPE)] <- "Frost/Freeze"
dtSubset$PermittedEvent[grep(".*THUNDER*.", dtSubset$EVTYPE)] <- "Thunderstorm Wind"
dtSubset$PermittedEvent[grep(".*TNING*.", dtSubset$EVTYPE)] <- "Heavy Snow"
dtSubset$PermittedEvent[grep(".*LANDSLIDE*.", dtSubset$EVTYPE)] <- "Heavy Rain"
dtSubset$PermittedEvent[grep(".*MARINE*.", dtSubset$EVTYPE)] <- "Marine Thunderstorm Wind"
dtSubset$PermittedEvent[grep(".*RAPIDLY RISING WATER*.", dtSubset$EVTYPE)] <- "Flash Flood"
dtSubset$PermittedEvent[grep(".*HIGH WATER*.", dtSubset$EVTYPE)] <- "Flood"
dtSubset$PermittedEvent[grep(".*COASTAL STORM*.", dtSubset$EVTYPE)] <- "Storm Surge/Tide"
dtSubset$PermittedEvent[grep(".*COASTALSTORM*.", dtSubset$EVTYPE)] <- "Storm Surge/Tide"
dtSubset$PermittedEvent[grep(".*COASTAL EROSION*.", dtSubset$EVTYPE)] <- "Storm Surge/Tide"
dtSubset$PermittedEvent[grep(".*HIGH TIDES*.", dtSubset$EVTYPE)] <- "Storm Surge/Tide"
dtSubset$PermittedEvent[grep(".*HIGH WAVES*.", dtSubset$EVTYPE)] <- "Storm Surge/Tide"
dtSubset$PermittedEvent[grep(".*BEACH EROSION*.", dtSubset$EVTYPE)] <- "Storm Surge/Tide"
dtSubset$PermittedEvent[grep(".*DROWNING*.", dtSubset$EVTYPE)] <- "Flood"
dtSubset$PermittedEvent[grep(".*FROST*.", dtSubset$EVTYPE)] <- "Frost/Freeze"
dtSubset$PermittedEvent[grep(".*GLAZE*.", dtSubset$EVTYPE)] <- "Ice Storm"
dtSubset$PermittedEvent[grep(".*DAM BREAK*.", dtSubset$EVTYPE)] <- "Heavy Rain"
dtSubset$PermittedEvent[grep(".*MUD*.", dtSubset$EVTYPE)] <- "Heavy Rain"
dtSubset$PermittedEvent[grep(".*URBAN*.", dtSubset$EVTYPE)] <- "Flood"
dtSubset$PermittedEvent[grep(".*TYPHOON*.", dtSubset$EVTYPE)] <- "Hurricane (Typhoon)"
dtSubset$PermittedEvent[grep(".*WARM WEATHER*.", dtSubset$EVTYPE)] <- "Excessive Heat"
dtSubset$PermittedEvent[grep(".*UNSEASONABLY WARM AND DRY*.", dtSubset$EVTYPE)] <- "Excessive Heat"
dtSubset$PermittedEvent[grep(".*UNSEASONABLY WARM*.", dtSubset$EVTYPE)] <- "Excessive Heat"
dtSubset$PermittedEvent[grep(".*EXPOSURE*.", dtSubset$EVTYPE)] <- "Extreme Cold/Wind Chill"
dtSubset$PermittedEvent[grep(".*HYPOTHERMIA*.", dtSubset$EVTYPE)] <- "Extreme Cold/Wind Chill"
dtSubset$PermittedEvent[grep(".*LOW TEMPERATURE*.", dtSubset$EVTYPE)] <- "Extreme Cold/Wind Chill"
dtSubset$PermittedEvent[grep(".*SEAS*.", dtSubset$EVTYPE)] <- "Marine Thunderstorm Wind"
dtSubset$PermittedEvent[grep(".*ROGUE WAVE*.", dtSubset$EVTYPE)] <- "Storm Surge/Tide"
dtSubset$PermittedEvent[grep(".*ROCK SLIDE*.", dtSubset$EVTYPE)] <- "Heavy Rain"
dtSubset$PermittedEvent[grep(".*HIGH SWELLS*.", dtSubset$EVTYPE)] <- "Marine Thunderstorm Wind"
dtSubset$PermittedEvent[grep(".*LANDSLUMP*.", dtSubset$EVTYPE)] <- "Heavy Rain"
dtSubset$PermittedEvent[grep(".*MIXED PRE*.", dtSubset$EVTYPE)] <- "Ice Storm"

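# Drop rows that matched none of the patterns
TheDamage <- subset(dtSubset, PermittedEvent != "type")
row.names(TheDamage) <- NULL

# Replace the raw EVTYPE column (column 1) with the mapped event type
TheDamage <- TheDamage[, -1]
TheDamage$PermittedEvent <- toupper(TheDamage$PermittedEvent)
colnames(TheDamage)[7] <- "EVTYPE"

library(ggplot2)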

Let’s take a look at the four main indicators: fatalities, injuries, property damage and crop damage.

Results

Here we have charted the number of fatalities by event type. It’s clear that tornadoes are by far the most dangerous, with heat, floods, strong wind and heavy snow rounding out the top five.

# Stacked bars: the height of each bar is the sum of FATALITIES for that event type
q <- ggplot(TheDamage, aes(x = EVTYPE, y = FATALITIES)) + geom_bar(stat = "identity")
q + theme(axis.text.x = element_text(angle = 90, hjust = 1))
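The ranking read off a chart like this can also be checked numerically. The following is a minimal sketch on a small made-up data frame (`toy` and its values are invented for illustration and are not taken from the NOAA data); it sums a column by event type with `aggregate()` and sorts the totals in decreasing order.

```r
# Invented example rows; the real analysis would use TheDamage instead.
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(5, 2, 7, 4, 1)
)

# Sum fatalities within each event type.
totals <- aggregate(FATALITIES ~ EVTYPE, data = toy, FUN = sum)

# Order from highest to lowest total and show the top rows.
totals <- totals[order(totals$FATALITIES, decreasing = TRUE), ]
head(totals, 5)
```

The same two steps should work for INJURIES or PROPDMG by changing the left-hand side of the formula.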

Here we have charted the number of injuries by event type. Once again tornadoes rank first, this time by a considerable margin: roughly seven times the next-highest event type. Strong wind, heat, floods and heavy snow make up the remainder of the top five.

Although the order and totals differ, the same events make up the top five for both injuries and fatalities.

# Stacked bars: the height of each bar is the sum of INJURIES for that event type
q2 <- ggplot(TheDamage, aes(x = EVTYPE, y = INJURIES)) + geom_bar(stat = "identity")
q2 + theme(axis.text.x = element_text(angle = 90, hjust = 1))

Here we have charted the amount of property damage by event type. For this analysis we treat crops as a form of property and add crop damage to property damage. Tornadoes are again the most devastating of all weather events. Floods are a close second, with strong wind, thunderstorm wind and hail completing the top five.

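# Add crop damage to property damage (raw values; the PROPDMGEXP and CROPDMGEXP codes are not applied)
TheDamage$PROPDMG <- TheDamage$PROPDMG + TheDamage$CROPDMG

# Stacked bars: the height of each bar is the sum of the combined damage for that event type
q3 <- ggplot(TheDamage, aes(x = EVTYPE, y = PROPDMG)) + geom_bar(stat = "identity")
q3 + theme(axis.text.x = element_text(angle = 90, hjust = 1))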
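The subset also keeps the `PROPDMGEXP` and `CROPDMGEXP` columns, which the Storm Data documentation describes as magnitude codes (`K` for thousands, `M` for millions, `B` for billions). Should dollar amounts be wanted, a conversion along the following lines is one option. This is a sketch only: the helper name `to_dollars` is hypothetical, and treating every other code as a multiplier of 1 is an assumption that should be checked against the data.

```r
# Hypothetical helper: scale a damage value by its magnitude code.
to_dollars <- function(value, exp_code) {
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
  # Look up each code by name; codes not in the table come back as NA.
  m <- multipliers[toupper(as.character(exp_code))]
  m[is.na(m)] <- 1  # assumption: unknown or blank codes mean "no scaling"
  unname(value * m)
}

# Made-up values for illustration: 25 K, 2.5 M and 10 with a blank code.
to_dollars(c(25, 2.5, 10), c("K", "M", ""))
```

Applied to the data, this might look like `to_dollars(TheDamage$PROPDMG, TheDamage$PROPDMGEXP)`, provided it is run before the two damage columns are added together.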