The goal of this report is to explore the NOAA storm database to find out which weather events have the most negative health and economic consequences. Specifically, it looks at the top weather events associated with death, injury, and property damage across the United States, spanning from 1950 through November 2011.
I downloaded the raw data file (repdata-data-StormData.csv.bz2) from the Coursera Reproducible Data class website along with a PDF from NOAA documenting the dataset. The data and PDF are also available from the NOAA website: http://www.nws.noaa.gov/
I first had to uncompress the data file, which I did by viewing my working directory in RStudio and double-clicking the compressed file. Once the file was uncompressed, I read in the dataset using the following code. Reading in the file takes a moment or two because of its size:
weather <- read.csv("repdata-data-StormData.csv",header=TRUE,sep=",")
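Manual decompression is not strictly required, because R's file connections detect bzip2 compression when reading, so `read.csv()` can be pointed at the `.bz2` file itself. The sketch below shows this on a tiny stand-in file written to a temporary location; the file contents and the names `tmp` and `demo` are made up for illustration and are not part of the original analysis.

```r
# Hypothetical stand-in for the storm file: write a tiny CSV through a bzip2 connection
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,3", "HAIL,0"), con)
close(con)

# read.csv() opens the compressed file directly; no manual unzip step is needed
demo <- read.csv(tmp, header = TRUE, stringsAsFactors = TRUE)
dim(demo)
```

Applied to the real file, the same call would simply take "repdata-data-StormData.csv.bz2" as its first argument.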
The dataset consists of 902297 observations of 37 variables.
dim(weather)
## [1] 902297 37
One of the challenges of this data is the condition of the event type data (EVTYPE). The NOAA documentation lists 48 event types, but the dataset itself has many more: 985!
length(levels(weather$EVTYPE))
## [1] 985
These “extra” events are due to misspellings, differences in case (uppercase, mixed case, and so on), and inconsistent wording. The cleanup took quite a bit of time. I had to run an initial analysis on the data and inspect the EVTYPE values to see which ones were not in line with the NOAA event types. For example, I subsetted the weather data to those events with more than 0 fatalities:
fatalities <- subset(weather, select=c(EVTYPE,FATALITIES),subset=(FATALITIES > 0))
Then I calculated the total fatalities by event type. As you can see by just looking at the first few lines of totalFatalities, avalanche is misspelled (AVALANCE), blowing snow appears in both lowercase and uppercase, black ice is not a NOAA event type, and so on.
require(plyr)
## Loading required package: plyr
totalFatalities <- ddply(fatalities, "EVTYPE", summarize,
                         all_fatalities = sum(FATALITIES))
head(totalFatalities)
## EVTYPE all_fatalities
## 1 AVALANCE 1
## 2 AVALANCHE 224
## 3 BLACK ICE 1
## 4 BLIZZARD 101
## 5 blowing snow 1
## 6 BLOWING SNOW 1
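Case and spacing variants like the ones visible in this output can be collapsed mechanically before any manual recoding. The following is only a sketch on a made-up vector of labels (`raw` and `normalized` are illustrative names); it does not handle misspellings or wording differences, which still need case-by-case decisions.

```r
# Made-up sample of raw labels, mimicking the kinds of variants seen above
raw <- c("BLOWING SNOW", "blowing snow", "Blowing Snow", " AVALANCHE", "AVALANCHE")

# Upper-case, trim leading/trailing blanks, and squeeze repeated spaces
normalized <- toupper(raw)
normalized <- gsub("^\\s+|\\s+$", "", normalized)
normalized <- gsub("\\s+", " ", normalized)

length(unique(raw))         # distinct labels before normalizing
length(unique(normalized))  # distinct labels after normalizing
```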
I visually inspected these data as well as data subsetted for injuries, and then wrote the following code to clean up the EVTYPE factors. In some cases I had to make a judgement call as to how to recode a factor. It’s quite a bit of code, but it does the trick of aggregating the data into fewer factors, thus bringing it more into line with the 48 event types listed by NOAA.
## Change EVTYPE from factor to character to allow the cleanup
weather$EVTYPE <- as.character(weather$EVTYPE)
weather$EVTYPE[weather$EVTYPE=="TSTM WIND"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORM WINDS"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORM WINDSS"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORM"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="TSTM WIND (G45)"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORM WIND (G40)"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORM WIND G52"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="THUNDERTORM WINDS"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="TSTM WIND (G35)"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORM WINDS 13"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORMS WINDS"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="TSTM WIND (G40)"] <- "THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="TORNADOES, TSTM WIND, HAIL"] <- "TORNADO"
weather$EVTYPE[weather$EVTYPE=="WATERSPOUT/TORNADO"] <- "TORNADO"
weather$EVTYPE[weather$EVTYPE=="TORNADO F2"] <- "TORNADO"
weather$EVTYPE[weather$EVTYPE=="TORNADO F3"] <- "TORNADO"
weather$EVTYPE[weather$EVTYPE=="WATERSPOUT TORNADO"] <- "TORNADO"
weather$EVTYPE[weather$EVTYPE=="TROPICAL STORM GORDON"] <- "TROPICAL STORM"
weather$EVTYPE[weather$EVTYPE=="BLOWING SNOW"] <- "BLIZZARD"
weather$EVTYPE[weather$EVTYPE=="SNOW/HIGH WINDS"] <- "BLIZZARD"
weather$EVTYPE[weather$EVTYPE=="Blowing Snow"] <- "BLIZZARD"
weather$EVTYPE[weather$EVTYPE=="blowing snow"] <- "BLIZZARD"
weather$EVTYPE[weather$EVTYPE=="HIGH WINDS/SNOW"] <- "BLIZZARD"
weather$EVTYPE[weather$EVTYPE=="HEAVY SNOW/BLIZZARD/AVALANCHE"] <- "BLIZZARD"
weather$EVTYPE[weather$EVTYPE=="HURRICANE"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="HURRICANE ERIN"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="HURRICANE FELIX"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="HURRICANE OPAL"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="HURRICANE OPAL/HIGH WINDS"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="TYPHOON"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="WINTER STORM HIGH WINDS"] <- "WINTER STORM"
weather$EVTYPE[weather$EVTYPE=="WINTER STORMS"] <- "WINTER STORM"
weather$EVTYPE[weather$EVTYPE=="THUNDERSNOW"] <- "WINTER STORM"
weather$EVTYPE[weather$EVTYPE=="SNOW SQUALL"] <- "WINTER STORM"
weather$EVTYPE[weather$EVTYPE=="Snow Squalls"] <- "WINTER STORM"
weather$EVTYPE[weather$EVTYPE=="SNOW/ BITTER COLD"] <- "WINTER STORM"
weather$EVTYPE[weather$EVTYPE=="SNOW AND ICE"] <- "WINTER STORM"
weather$EVTYPE[weather$EVTYPE=="HEAVY SNOW/ICE"] <- "WINTER STORM"
weather$EVTYPE[weather$EVTYPE=="ICE ON ROAD"] <- "ICE STORM"
weather$EVTYPE[weather$EVTYPE=="ICY ROADS"] <- "ICE STORM"
weather$EVTYPE[weather$EVTYPE=="GLAZE/ICE STORM"] <- "ICE STORM"
weather$EVTYPE[weather$EVTYPE=="ICE STORM/FLASH FLOOD"] <- "ICE STORM"
weather$EVTYPE[weather$EVTYPE=="ICE ROADS"] <- "ICE STORM"
weather$EVTYPE[weather$EVTYPE=="MARINE TSTM WIND"] <- "MARINE THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="COASTAL STORM"] <- "MARINE THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="COASTALSTORM"] <- "MARINE THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="Coastal Storm"] <- "MARINE THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="COASTAL FLOODING/EROSION"] <- "MARINE THUNDERSTORM WIND"
weather$EVTYPE[weather$EVTYPE=="URBAN/SML STREAM FLD"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="URBAN FLOOD"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="RIVER FLOOD"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="FLOODING"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="URBAN FLOODING"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="RAPIDLY RISING WATER"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="MINOR FLOODING"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="River Flooding"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="HIGH WIND AND SEAS"] <- "MARINE HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="HIGH WINDS"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="WIND"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="WINDS"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="STRONG WIND"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="Strong Winds"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="Gusty winds"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="HIGH WIND 48"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="HIGH WIND/HEAVY SNOW"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="HIGH WINDS/COLD"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="NON TSTM WIND"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="NON-SEVERE WIND DAMAGE"] <- "HIGH WIND"
weather$EVTYPE[weather$EVTYPE=="WILD/FOREST FIRE"] <- "WILDFIRE"
weather$EVTYPE[weather$EVTYPE=="WILD FIRES"] <- "WILDFIRE"
weather$EVTYPE[weather$EVTYPE=="BRUSH FIRE"] <- "WILDFIRE"
weather$EVTYPE[weather$EVTYPE=="WINTER WEATHER/MIX"] <- "WINTER WEATHER"
weather$EVTYPE[weather$EVTYPE=="WINTRY MIX"] <- "WINTER WEATHER"
weather$EVTYPE[weather$EVTYPE=="MIXED PRECIP"] <- "WINTER WEATHER"
weather$EVTYPE[weather$EVTYPE=="Heavy snow shower"] <- "WINTER WEATHER"
weather$EVTYPE[weather$EVTYPE=="SNOW"] <- "WINTER WEATHER"
weather$EVTYPE[weather$EVTYPE=="Snow"] <- "WINTER WEATHER"
weather$EVTYPE[weather$EVTYPE=="AVALANCE"] <- "AVALANCHE"
weather$EVTYPE[weather$EVTYPE=="LIGHTNING."] <- "LIGHTNING"
weather$EVTYPE[weather$EVTYPE=="LIGHTNING AND THUNDERSTORM WIN"] <- "LIGHTNING"
weather$EVTYPE[weather$EVTYPE=="LIGHTNING INJURY"] <- "LIGHTNING"
weather$EVTYPE[weather$EVTYPE=="LANDSLIDES"] <- "LANDSLIDE"
weather$EVTYPE[weather$EVTYPE=="Mudslide"] <- "LANDSLIDE"
weather$EVTYPE[weather$EVTYPE=="Mudslides"] <- "LANDSLIDE"
weather$EVTYPE[weather$EVTYPE=="TSTM WIND/HAIL"] <- "HAIL"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORM WINDS HAIL"] <- "HAIL"
weather$EVTYPE[weather$EVTYPE=="THUNDERSTORM WINDS/HAIL"] <- "HAIL"
weather$EVTYPE[weather$EVTYPE=="SMALL HAIL"] <- "HAIL"
weather$EVTYPE[weather$EVTYPE=="FLASH FLOODING"] <- "FLASH FLOOD"
weather$EVTYPE[weather$EVTYPE=="FLASH FLOOD/FLOOD"] <- "FLASH FLOOD"
weather$EVTYPE[weather$EVTYPE=="FLASH FLOODING/FLOOD"] <- "FLASH FLOOD"
weather$EVTYPE[weather$EVTYPE=="FLASH FLOODS"] <- "FLASH FLOOD"
weather$EVTYPE[weather$EVTYPE=="FLOOD/FLASH FLOOD"] <- "FLASH FLOOD"
weather$EVTYPE[weather$EVTYPE=="FLOOD & HEAVY RAIN"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="FLOOD/RIVER FLOOD"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="RIVER FLOODING"] <- "FLOOD"
weather$EVTYPE[weather$EVTYPE=="EXTREME COLD"] <- "EXTREME COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="EXTREME WINDCHILL"] <- "EXTREME COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="Extended Cold"] <- "EXTREME COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="Extreme Cold"] <- "EXTREME COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="FOG"] <- "DENSE FOG"
weather$EVTYPE[weather$EVTYPE=="FOG AND COLD TEMPERATURES"] <- "DENSE FOG"
weather$EVTYPE[weather$EVTYPE=="RIP CURRENTS"] <- "RIP CURRENT"
weather$EVTYPE[weather$EVTYPE=="RIP CURRENTS/HEAVY SURF"] <- "RIP CURRENT"
weather$EVTYPE[weather$EVTYPE=="STORM SURGE"] <- "STORM SURGE/TIDE"
weather$EVTYPE[weather$EVTYPE=="TIDAL FLOODING"] <- "STORM SURGE/TIDE"
weather$EVTYPE[weather$EVTYPE=="ASTRONOMICAL HIGH TIDE"] <- "STORM SURGE/TIDE"
weather$EVTYPE[weather$EVTYPE=="FREEZING RAIN"] <- "SLEET"
weather$EVTYPE[weather$EVTYPE=="ICE"] <- "SLEET"
weather$EVTYPE[weather$EVTYPE=="FREEZING DRIZZLE"] <- "SLEET"
weather$EVTYPE[weather$EVTYPE=="FREEZING RAIN/SNOW"] <- "SLEET"
weather$EVTYPE[weather$EVTYPE=="Freezing Spray"] <- "SLEET"
weather$EVTYPE[weather$EVTYPE=="GLAZE"] <- "SLEET"
weather$EVTYPE[weather$EVTYPE=="FALLING SNOW/ICE"] <- "SLEET"
weather$EVTYPE[weather$EVTYPE=="RAIN/SNOW"] <- "SLEET"
weather$EVTYPE[weather$EVTYPE=="HEAVY SURF/HIGH SURF"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="ROUGH SEAS"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="ROUGH SURF"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="HAZARDOUS SURF"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="HEAVY SURF"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="HIGH SEAS"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="High Surf"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="ROGUE WAVE"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="STRONG WINDS"] <- "STRONG WIND"
weather$EVTYPE[weather$EVTYPE=="DRY MICROBURST"] <- "STRONG WIND"
weather$EVTYPE[weather$EVTYPE=="GUSTY WINDS"] <- "STRONG WIND"
weather$EVTYPE[weather$EVTYPE=="GUSTY WIND"] <- "STRONG WIND"
weather$EVTYPE[weather$EVTYPE=="LIGHT SNOW"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="MODERATE SNOWFALL"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="COLD"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="Cold"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="RECORD COLD"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="COLD AND SNOW"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="Cold Temperature"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="COLD WAVE"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="COLD WEATHER"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="COLD/WINDS"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="LOW TEMPERATURE"] <- "COLD/WIND CHILL"
weather$EVTYPE[weather$EVTYPE=="RECORD WARMTH"] <- "HEAT"
weather$EVTYPE[weather$EVTYPE=="UNSEASONABLY WARM"] <- "HEAT"
weather$EVTYPE[weather$EVTYPE=="RECORD HEAT"] <- "HEAT"
weather$EVTYPE[weather$EVTYPE=="HEAT WAVE"] <- "HEAT"
weather$EVTYPE[weather$EVTYPE=="EXTREME HEAT"] <- "HEAT"
weather$EVTYPE[weather$EVTYPE=="Heat Wave"] <- "HEAT"
weather$EVTYPE[weather$EVTYPE=="WARM WEATHER"] <- "HEAT"
weather$EVTYPE[weather$EVTYPE=="UNSEASONABLY DRY"] <- "DROUGHT"
weather$EVTYPE[weather$EVTYPE=="DROUGHT/EXCESSIVE HEAT"] <- "DROUGHT"
weather$EVTYPE[weather$EVTYPE=="HEAT WAVE DROUGHT"] <- "DROUGHT"
weather$EVTYPE[weather$EVTYPE=="HEAT WAVES"] <- "DROUGHT"
weather$EVTYPE[weather$EVTYPE=="COASTAL FLOODING"] <- "COASTAL FLOOD"
weather$EVTYPE[weather$EVTYPE=="Coastal Flooding"] <- "COASTAL FLOOD"
weather$EVTYPE[weather$EVTYPE=="HURRICANE/TYPHOON"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="Hurricane Edouard"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="HURRICANE EMILY"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="HURRICANE-GENERATED SWELLS"] <- "HURRICANE (TYPHOON)"
weather$EVTYPE[weather$EVTYPE=="FUNNEL CLOUDS"] <- "FUNNEL CLOUD"
weather$EVTYPE[weather$EVTYPE=="FUNNEL"] <- "FUNNEL CLOUD"
weather$EVTYPE[weather$EVTYPE=="FREEZE"] <- "FROST/FREEZE"
weather$EVTYPE[weather$EVTYPE=="FROST"] <- "FROST/FREEZE"
weather$EVTYPE[weather$EVTYPE=="HEAVY SEAS"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="Heavy Surf"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="Heavy surf and wind"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="HIGH SWELLS"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="HIGH WAVES"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="HIGH WIND/SEAS"] <- "HIGH SURF"
weather$EVTYPE[weather$EVTYPE=="EXCESSIVE RAINFALL"] <- "HEAVY RAIN"
weather$EVTYPE[weather$EVTYPE=="HEAVY RAINS"] <- "HEAVY RAIN"
weather$EVTYPE[weather$EVTYPE=="Torrential Rainfall"] <- "HEAVY RAIN"
weather$EVTYPE[weather$EVTYPE=="Temperature record"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="MONTHLY PRECIPITATION"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="DROWNING"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="HYPERTHERMIA/EXPOSURE"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="HYPOTHERMIA"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="Hypothermia/Exposure"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="HYPOTHERMIA/EXPOSURE"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="BLACK ICE"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="Marine Accident"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="MARINE MISHAP"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="HIGH"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="W"] <- "OTHER"
weather$EVTYPE[weather$EVTYPE=="Whirlwind"] <- "DUST STORM"
weather$EVTYPE[weather$EVTYPE=="Dust Devil"] <- "DUST DEVIL"
weather$EVTYPE[weather$EVTYPE=="EXCESSIVE SNOW"] <- "HEAVY SNOW"
weather$EVTYPE[weather$EVTYPE=="WATERSPOUTS"] <- "WATERSPOUT"
## Change EVTYPE back to a factor
weather$EVTYPE <- as.factor(weather$EVTYPE)
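The same kind of recoding can be expressed more compactly as a lookup table. The sketch below is not the code used for this report: it uses a hypothetical named vector, `recode_map`, holding only a handful of the mappings above, and a toy vector `evtype` standing in for `weather$EVTYPE`.

```r
# Hypothetical lookup: names are raw labels, values are the target event types
# (only a few of the mappings from the report, for illustration)
recode_map <- c(
  "TSTM WIND"          = "THUNDERSTORM WIND",
  "THUNDERSTORM WINDS" = "THUNDERSTORM WIND",
  "AVALANCE"           = "AVALANCHE",
  "RIP CURRENTS"       = "RIP CURRENT"
)

# Toy stand-in for weather$EVTYPE (as character)
evtype <- c("TSTM WIND", "AVALANCE", "TORNADO", "RIP CURRENTS", "THUNDERSTORM WINDS")

# Replace only the labels that appear in the lookup; leave the rest untouched
hit <- evtype %in% names(recode_map)
evtype[hit] <- unname(recode_map[evtype[hit]])
evtype
```

Because each label is looked up exactly once, the result does not depend on the order of the entries, whereas a sequence of assignments can re-map a label that an earlier line has already produced.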
The results are in three sections: Fatalities, Injuries, and Property Damage. I use the ggplot2 package for plotting.
require(ggplot2)
## Loading required package: ggplot2
To investigate weather-related fatalities, subset the data based on any weather event that has fatalities greater than 0.
fatalities <- subset(weather, select=c(EVTYPE,FATALITIES),subset=(FATALITIES > 0))
Next calculate the total fatalities by weather event:
totalFatalities <- ddply(fatalities, "EVTYPE", summarize,
                         all_fatalities = sum(FATALITIES))
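For readers without plyr, a per-event total of this kind can also be computed with base R's `aggregate()`. This is a minimal sketch on a toy data frame with made-up values (`fatalities_demo` and `total_demo` are illustrative names); note that the formula interface keeps the original column name, FATALITIES, rather than creating an `all_fatalities` column.

```r
# Toy stand-in for the fatalities subset (values are made up)
fatalities_demo <- data.frame(
  EVTYPE     = c("TORNADO", "HEAT", "TORNADO", "FLOOD", "HEAT"),
  FATALITIES = c(2, 1, 3, 4, 5)
)

# Sum FATALITIES within each EVTYPE
total_demo <- aggregate(FATALITIES ~ EVTYPE, data = fatalities_demo, FUN = sum)
total_demo
```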
Because I am interested in the most harmful weather events, I subsetted the data to include only those events that are associated with more than 100 fatalities:
topFatalities <- subset(totalFatalities, select=c(EVTYPE,all_fatalities),
                        subset=(all_fatalities > 100))
Finally, plot the data. As you can see, tornados are by far the most deadly weather event, followed by excessive heat, heat, flash flood, and lightning.
ggplot(topFatalities, aes(x=EVTYPE,y=all_fatalities)) +
  geom_bar(position="dodge", stat="identity") +
  xlab("Weather event") + ylab("Number of fatalities") +
  ggtitle("Fatalities Associated with Weather") +
  scale_y_continuous(breaks=seq(0, 5500, by=500)) +
  theme(axis.text.x = element_text(angle=45, hjust=1))
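ggplot2 orders a discrete x axis by factor level, which is alphabetical by default, so the ranking has to be read off the bar heights. One way to make it visible at a glance is to reorder the factor by the totals before plotting. The sketch below uses made-up numbers in a toy data frame (`toy` is an illustrative name), not results from the storm data.

```r
# Toy totals (made-up numbers, not results from the storm data)
toy <- data.frame(
  EVTYPE         = c("FLOOD", "HEAT", "TORNADO"),
  all_fatalities = c(10, 20, 30)
)

# reorder() sets the factor levels by the negated totals, so the largest comes first
toy$EVTYPE <- reorder(toy$EVTYPE, -toy$all_fatalities)
levels(toy$EVTYPE)

# Build the chart only if ggplot2 is installed; print(p) would draw it
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(toy, ggplot2::aes(x = EVTYPE, y = all_fatalities)) +
    ggplot2::geom_bar(stat = "identity")
}
```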
To investigate weather-related injuries, I followed a process similar to that used for fatalities. First, subset the data to get those event types associated with injuries.
injuries <- subset(weather, select=c(EVTYPE,INJURIES),subset=(INJURIES > 0))
Then calculate the total injuries by event type.
totalInjuries <- ddply(injuries, "EVTYPE", summarize,
                       all_injuries = sum(INJURIES))
I decided to limit what I plot to those events causing more than 1,000 injuries, because I am concerned only with finding the weather events with the most severe consequences.
topInjuries <- subset(totalInjuries, select=c(EVTYPE,all_injuries),
                      subset=(all_injuries > 1000))
The plot reveals that tornados cause the most injuries, followed by thunderstorms, flood, excessive heat, and lightning.
ggplot(topInjuries, aes(x=EVTYPE,y=all_injuries)) +
  geom_bar(position="dodge", stat="identity") +
  xlab("Weather event") + ylab("Number of injuries") +
  ggtitle("Injuries Associated with Weather") +
  theme(axis.text.x = element_text(angle=45, hjust=1))
The weather dataset has information on property damage as well as crop damage. I decided to focus solely on property damage for this portion of the analysis.
Subset the weather data to those weather events that have property damage greater than 0.
propertyDamage <- subset(weather, select=c(EVTYPE,PROPDMG),subset=(PROPDMG > 0))
Calculate the total property damage by weather event.
totalDamage <- ddply(propertyDamage, "EVTYPE", summarize,
                     all_damage = sum(PROPDMG))
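One caveat worth keeping in mind: the dataset stores each PROPDMG figure alongside a PROPDMGEXP column holding a magnitude code, which the NOAA documentation describes as K for thousands, M for millions, and B for billions. The total above sums the PROPDMG figures as recorded. A sketch of how dollar amounts could be derived is shown below on a toy data frame with made-up values; treating every code other than K, M, or B as a multiplier of 1 is an assumption made here for illustration, not something the documentation prescribes.

```r
# Toy rows (made-up values) with the two damage columns from the dataset
damage_demo <- data.frame(
  EVTYPE     = c("FLOOD", "TORNADO", "HAIL", "FLOOD"),
  PROPDMG    = c(2.5, 25, 500, 1),
  PROPDMGEXP = c("M", "K", "", "B"),
  stringsAsFactors = FALSE
)

# Documented magnitude codes; anything else falls back to 1 (an assumption)
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)
mult <- unname(multiplier[toupper(damage_demo$PROPDMGEXP)])
mult[is.na(mult)] <- 1

# Damage in dollars, then totals per event type
damage_demo$damage_usd <- damage_demo$PROPDMG * mult
aggregate(damage_usd ~ EVTYPE, data = damage_demo, FUN = sum)
```

Rankings based on scaled dollar amounts may differ from rankings based on the raw PROPDMG figures.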
Next, order the data so they are in decreasing order with respect to damage. Because I am interested only in the weather events that cause the most damage, I subset the data to include only the first 20 weather events.
x <- totalDamage[order(totalDamage$all_damage, decreasing = TRUE),]
topTwenty <- x[1:20,]
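Indexing with `1:20` works here because there are more than twenty event types. As a small robustness note, `head()` returns at most the rows that exist, whereas indexing past the end of a data frame pads the result with NA rows. The toy example below (made-up values, illustrative names) shows the difference on a three-row table.

```r
# Toy table with only three event types (made-up values)
small <- data.frame(EVTYPE = c("A", "B", "C"), all_damage = c(5, 20, 10))
sorted <- small[order(small$all_damage, decreasing = TRUE), ]

padded <- sorted[1:20, ]    # rows 4 to 20 do not exist and come back as NA
top    <- head(sorted, 20)  # stops at the last real row

nrow(padded)
nrow(top)
```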
Finally, plot the top twenty weather events associated with the most property damage. Once again, tornados are in the lead, causing more property damage than any other weather event. Thunderstorm wind, flash flood, flood, and hail are next.
ggplot(topTwenty, aes(x=EVTYPE,y=all_damage)) +
  geom_bar(position="dodge", stat="identity") +
  xlab("Weather event") + ylab("Property damage") +
  ggtitle("Property Damage Associated with Weather") +
  theme(axis.text.x = element_text(angle=45, hjust=1))
The analysis was run on a MacBook Pro (2.6 GHz Intel Core i7 with 16 GB 1600 MHz DDR3) running OS X v10.9.3, using RStudio.
sessionInfo()
## R version 3.0.3 (2014-03-06)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggplot2_1.0.0 plyr_1.8.1
##
## loaded via a namespace (and not attached):
## [1] colorspace_1.2-4 digest_0.6.4 evaluate_0.5.5 formatR_0.10
## [5] grid_3.0.3 gtable_0.1.2 htmltools_0.2.4 knitr_1.6
## [9] labeling_0.2 MASS_7.3-33 munsell_0.4.2 proto_0.3-10
## [13] Rcpp_0.11.2 reshape2_1.4 rmarkdown_0.2.46 scales_0.2.4
## [17] stringr_0.6.2 tools_3.0.3