This analysis explores the NOAA Storm Database and addresses two questions relating to the impact of severe weather events on the health of the population and the economy. The events recorded in the database start in the year 1950 and end in November 2011. In the earlier years of the database there are generally fewer events recorded, most likely due to a lack of good records.
The steps taken to analyse the source data are documented with R code, tables, figures and summaries below, leaving a clear chain of evidence for future reproducibility.
The first half of the analysis is spent on cleaning the data, where many event labels are recorded with spelling mistakes or undefined labels. The cleaning exercise also identifies the variables in scope and removes unnecessary data.
The second half of the analysis uses the clean data to review both the health (fatalities and injuries) and the economic (crop and property damage) impact on the US population. Ultimately the 12 top events in each category are identified, highlighting the top 3.
library(R.utils) #library section do not cache, switch off messages
library(dplyr)
library(ggplot2)
library(scales)
Source the bzip2-compressed file containing a comma-separated .csv file of the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2.
The code below reads the compressed file from the working directory and decompresses it to StormData.csv
bunzip2("repdata_data_StormData.csv.bz2", "StormData.csv", remove = FALSE, skip = TRUE)
## [1] "StormData.csv"
## attr(,"temporary")
## [1] FALSE
# 'remove = FALSE', to keep the original file
# 'skip = TRUE', to prevent overwriting the destination file if it exists
stormData <- read.csv("StormData.csv")
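As a side note to the decompression step above, base R's `read.csv` can usually read a bzip2-compressed file directly, because `file()` detects the compression when opening it. The sketch below is a minimal illustration on a made-up temporary file, not the storm data; the toy rows and the name `toy` are assumptions.

```r
# Write a tiny made-up CSV to a temporary bzip2-compressed file
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "wt")
writeLines(c("EVTYPE,FATALITIES", "TORNADO,3", "HAIL,0"), con)
close(con)

# read.csv opens the path with file(), which detects the bzip2 compression
toy <- read.csv(tmp, stringsAsFactors = FALSE)
unlink(tmp)

print(toy)
```

Reading the compressed file directly avoids keeping a second, uncompressed copy on disk, at the cost of decompressing it on every read.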
Reduce the scope of the data to a more manageable format:
Eliminate the initial years for which there are few events recorded. Events before 1 January 1996 are excluded from this analysis.
Remove variables / columns not necessary for the analysis. Only the event type and the variables related to the extent of the damage are included in this analysis.
Clean up the event type labels to have no leading or trailing spaces. Change all the event type labels to upper case for consistency.
stormData$BGN_DATE<-as.Date(as.character(stormData$BGN_DATE), "%m/%d/%Y") #change the date into a date format
sd <- subset(stormData, BGN_DATE > as.Date("1995-12-31") )
#Only keep the variables / columns needed in the analysis
KeepCols = c("BGN_DATE","EVTYPE","FATALITIES","INJURIES","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")
sd <- sd[ , (names(sd) %in% KeepCols)]
# Clean up the Event type descriptions
#Remove readings / lines where the measurements are all 0
sd <- subset(sd, !(FATALITIES == 0 & INJURIES == 0 & PROPDMG == 0 & CROPDMG == 0))
#Change all event descriptions to upper case
sd = as.data.frame(sapply(sd, toupper))
#Remove the trailing and leading spaces
sd <- as.data.frame(apply(sd,2,function (x) gsub("^\\s+|\\s+$", "", x)))
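The cleaning steps above can also be sketched so that only the event label column is converted, which leaves the numeric columns numeric. This is a minimal illustration on a made-up three-row data frame; the rows and the name `toy` are assumptions and are not taken from the storm data.

```r
# Made-up rows in the same layout as the columns kept above
toy <- data.frame(
  BGN_DATE   = c("4/18/1950 0:00:00", "1/6/1996 0:00:00", "5/22/2011 0:00:00"),
  EVTYPE     = c("TORNADO", "  tstm wind ", "Tornado"),
  FATALITIES = c(0, 1, 2),
  stringsAsFactors = FALSE
)

# Parse the date part; text after the matched format is ignored
toy$BGN_DATE <- as.Date(toy$BGN_DATE, format = "%m/%d/%Y")

# Keep events from 1 January 1996 onwards
toy <- subset(toy, BGN_DATE >= as.Date("1996-01-01"))

# Trim and upper-case the labels only; other columns are untouched
toy$EVTYPE <- toupper(trimws(toy$EVTYPE))

print(toy)
```

Restricting `toupper` and `trimws` to `EVTYPE` means the later `as.numeric(as.character(...))` conversions would not be needed in this variant.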
Check the events types labels in the data against the valid labels listed in section 2.1.1 of the Storm Data Documentation.
#valid event labels listed in the Storm Data Documentation
validEvents <- c("ASTRONOMICAL LOW TIDE","AVALANCHE","BLIZZARD","COASTAL FLOOD","COLD/WIND CHILL","DEBRIS FLOW","DENSE FOG","DENSE SMOKE","DROUGHT","DUST DEVIL","DUST STORM","EXCESSIVE HEAT","EXTREME COLD/WIND CHILL","FLASH FLOOD","FLOOD","FREEZING FOG","FROST/FREEZE","FUNNEL CLOUD","HAIL","HEAT","HEAVY RAIN","HEAVY SNOW","HIGH SURF","HIGH WIND","HURRICANE","ICE STORM","LAKE-EFFECT SNOW","LAKESHORE FLOOD","LIGHTNING","MARINE HAIL","MARINE HIGH WIND","MARINE STRONG WIND","MARINE THUNDERSTORM WIND","RIP CURRENT","SEICHE","SLEET","STORM SURGE/TIDE","STRONG WIND","THUNDERSTORM WIND","TORNADO","TROPICAL DEPRESSION","TROPICAL STORM","TSUNAMI","VOLCANIC ASH","WATERSPOUT","WILDFIRE","WINTER STORM","WINTER WEATHER")
numValidE <- length(validEvents)
#all the event labels listed in the data
currentEvents <-unique(sd$EVTYPE)
numCurrentE <- length(currentEvents) #number of unique labels (works for factor and character columns)
#the event labels that need to be fixed
tofix<- !(currentEvents %in% validEvents)
listtofix <- currentEvents[tofix]
numToFix <- sum(tofix) #number of unique labels that are invalid
numOther <- nrow(sd[(sd$EVTYPE=="OTHER"),]) #number of rows with the label 'Other'
listtofix
## [1] TSTM WIND FREEZING RAIN
## [3] EXTREME COLD TSTM WIND/HAIL
## [5] RIP CURRENTS OTHER
## [7] WILD/FOREST FIRE STORM SURGE
## [9] ICE JAM FLOOD (MINOR URBAN/SML STREAM FLD
## [11] FOG ROUGH SURF
## [13] HEAVY SURF MARINE ACCIDENT
## [15] FREEZE DRY MICROBURST
## [17] WINDS COASTAL STORM
## [19] EROSION/CSTL FLOOD RIVER FLOODING
## [21] DAMAGING FREEZE BEACH EROSION
## [23] HEAVY RAIN/HIGH SURF UNSEASONABLE COLD
## [25] EARLY FROST WINTRY MIX
## [27] COASTAL FLOODING TORRENTIAL RAINFALL
## [29] LANDSLUMP HURRICANE EDOUARD
## [31] TIDAL FLOODING STRONG WINDS
## [33] EXTREME WINDCHILL GLAZE
## [35] EXTENDED COLD WHIRLWIND
## [37] HEAVY SNOW SHOWER LIGHT SNOW
## [39] MIXED PRECIP COLD
## [41] FREEZING SPRAY DOWNBURST
## [43] MUDSLIDES MICROBURST
## [45] MUDSLIDE SNOW
## [47] SNOW SQUALLS WIND DAMAGE
## [49] LIGHT SNOWFALL FREEZING DRIZZLE
## [51] GUSTY WIND/RAIN GUSTY WIND/HVY RAIN
## [53] WIND COLD TEMPERATURE
## [55] HEAT WAVE COLD AND SNOW
## [57] RAIN/SNOW TSTM WIND (G45)
## [59] GUSTY WINDS GUSTY WIND
## [61] TSTM WIND 40 TSTM WIND 45
## [63] HARD FREEZE TSTM WIND (41)
## [65] RIVER FLOOD TSTM WIND (G40)
## [67] MUD SLIDE SNOW AND ICE
## [69] AGRICULTURAL FREEZE SNOW SQUALL
## [71] ICY ROADS THUNDERSTORM
## [73] HYPOTHERMIA/EXPOSURE LAKE EFFECT SNOW
## [75] MIXED PRECIPITATION BLACK ICE
## [77] COASTALSTORM DAM BREAK
## [79] BLOWING SNOW FROST
## [81] GRADIENT WIND UNSEASONABLY COLD
## [83] TSTM WIND AND LIGHTNING WET MICROBURST
## [85] HEAVY SURF AND WIND TYPHOON
## [87] LANDSLIDES HIGH SWELLS
## [89] HIGH WINDS SMALL HAIL
## [91] UNSEASONAL RAIN COASTAL FLOODING/EROSION
## [93] TSTM WIND (G45) HIGH WIND (G40)
## [95] TSTM WIND (G35) COASTAL EROSION
## [97] UNSEASONABLY WARM COASTAL FLOODING/EROSION
## [99] HYPERTHERMIA/EXPOSURE ROCK SLIDE
## [101] GUSTY WIND/HAIL HEAVY SEAS
## [103] LANDSPOUT RECORD HEAT
## [105] EXCESSIVE SNOW FLOOD/FLASH/FLOOD
## [107] WIND AND WAVE FLASH FLOOD/FLOOD
## [109] LIGHT FREEZING RAIN ICE ROADS
## [111] HIGH SEAS RAIN
## [113] ROUGH SEAS TSTM WIND G45
## [115] NON-SEVERE WIND DAMAGE WARM WEATHER
## [117] THUNDERSTORM WIND (G40) LANDSLIDE
## [119] HIGH WATER LATE SEASON SNOW
## [121] WINTER WEATHER MIX ROGUE WAVE
## [123] FALLING SNOW/ICE NON-TSTM WIND
## [125] NON TSTM WIND BRUSH FIRE
## [127] BLOWING DUST HIGH SURF ADVISORY
## [129] HAZARDOUS SURF COLD WEATHER
## [131] ICE ON ROAD DROWNING
## [133] MARINE TSTM WIND HURRICANE/TYPHOON
## [135] WINTER WEATHER/MIX ASTRONOMICAL HIGH TIDE
## [137] HEAVY SURF/HIGH SURF
## 183 Levels: AGRICULTURAL FREEZE ... WINTRY MIX
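The label check above can be illustrated on a small made-up vector. The sketch distinguishes the number of invalid labels from the number of rows carrying an invalid label, which are different counts; the names `valid`, `labels` and `invalid` are assumptions for this example.

```r
# Made-up valid labels and recorded labels
valid  <- c("FLOOD", "HAIL", "RIP CURRENT")
labels <- c("HAIL", "RIP CURRENTS", "FLOOD", "OTHER", "OTHER")

current <- unique(labels)           # distinct labels in the data
invalid <- setdiff(current, valid)  # distinct labels not in the valid list

numInvalidLabels <- length(invalid)           # counts labels
numInvalidRows   <- sum(!(labels %in% valid)) # counts rows

print(invalid)
print(c(labels = numInvalidLabels, rows = numInvalidRows))
```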
There are 48 valid event type labels. The data contains 183 unique event type labels, of which 137 are invalid, as listed above. Referring to the above list of invalid labels:
Remove the 34 lines where the event type is labelled “OTHER”
Re-label the rest of the invalid labels with valid labels.
# Remove the line where the event type is OTHER
sd<-sd[!(sd$EVTYPE=="OTHER"),]
##Fix written specifically for the rows where EVTYPE was assigned invalid values
for (i in 1:length(sd$EVTYPE)){
if (!(sd$EVTYPE[i] %in% validEvents)) {
if (isTRUE(grep("THUNDERSTORM", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "THUNDERSTORM WIND"
} else if (isTRUE(grep("TSTM", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "THUNDERSTORM WIND"
} else if (isTRUE(grep("EXTREME", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "EXTREME COLD/WIND CHILL"
} else if (isTRUE(grep("EXTENDED", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "EXTREME COLD/WIND CHILL"
} else if (isTRUE(grep("WIND", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "STRONG WIND"
} else if (isTRUE(grep("EFFECT", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "LAKE-EFFECT SNOW"
} else if (isTRUE(grep("SNOW", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HEAVY SNOW"
} else if (isTRUE(grep("ASTRONOMICAL HIGH TIDE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "STORM SURGE/TIDE"
} else if (isTRUE(grep("BLOWING DUST", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "DUST STORM"
} else if (isTRUE(grep("DAM BREAK", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "LAKESHORE FLOOD"
} else if (isTRUE(grep("DRY MICROBURST", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "DUST DEVIL"
} else if (isTRUE(grep("LANDSPOUT", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "TORNADO"
} else if (isTRUE(grep("RIP CURRENTS", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "RIP CURRENT"
} else if (isTRUE(grep("SMALL HAIL", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HAIL"
} else if (isTRUE(grep("STORM SURGE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "TSUNAMI"
} else if (isTRUE(grep("TYPHOON", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HURRICANE"
} else if (isTRUE(grep("UNSEASONABLY WARM", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HEAT"
} else if (isTRUE(grep("WINTER", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "WINTER WEATHER"
} else if (isTRUE(grep("WINTRY", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "WINTER WEATHER"
} else if (isTRUE(grep("HURRICANE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HURRICANE"
} else if (isTRUE(grep("DROWNING", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FLASH FLOOD"
} else if (isTRUE(grep("FLASH", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FLASH FLOOD"
} else if (isTRUE(grep("FIRE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "WILDFIRE"
} else if (isTRUE(grep("COASTAL", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "COASTAL FLOOD"
} else if (isTRUE(grep("EROSION", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "COASTAL FLOOD"
} else if (isTRUE(grep("TIDAL", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "COASTAL FLOOD"
} else if (isTRUE(grep("EXPOSURE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FROST/FREEZE"
} else if (isTRUE(grep("UNSEASONABLE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FROST/FREEZE"
} else if (isTRUE(grep("GLAZE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FROST/FREEZE"
} else if (isTRUE(grep("FROST", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FROST/FREEZE"
} else if (isTRUE(grep("FREEZE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FROST/FREEZE"
} else if (isTRUE(grep("COLD", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "COLD/WIND CHILL"
} else if (isTRUE(grep("FREEZING", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FREEZING FOG"
} else if (isTRUE(grep("FOG", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FREEZING FOG"
} else if (isTRUE(grep("SLIDE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "AVALANCHE"
} else if (isTRUE(grep("SLUMP", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "AVALANCHE"
} else if (isTRUE(grep("SURF", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HIGH SURF"
} else if (isTRUE(grep("RAIN", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HEAVY RAIN"
} else if (isTRUE(grep("PRECIP", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HEAVY RAIN"
} else if (isTRUE(grep("BURST", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HEAVY RAIN"
} else if (isTRUE(grep("MARINE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HIGH SURF"
} else if (isTRUE(grep("ROGUE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HIGH SURF"
} else if (isTRUE(grep("HIGH", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HIGH SURF"
} else if (isTRUE(grep("SEAS", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "HIGH SURF"
} else if (isTRUE(grep("ICE", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "ICE STORM"
} else if (isTRUE(grep("ICY", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "ICE STORM"
} else if (isTRUE(grep("STREAM", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FLOOD"
} else if (isTRUE(grep("RIVER", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "FLOOD"
} else if (isTRUE(grep("WARM", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "EXCESSIVE HEAT"
} else if (isTRUE(grep("HEAT", sd$EVTYPE[i])==1)){
sd$EVTYPE[i] <- "EXCESSIVE HEAT"
}
}
}
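The row-by-row loop above applies its patterns in a fixed order, and the first matching pattern wins. The same idea can be sketched as an ordered rule table applied to the whole column at once. This is a minimal illustration with only four of the rules; the function `relabel` and the `rules` table are hypothetical names introduced here, not part of the original analysis.

```r
# Ordered rules: earlier rows take precedence over later ones
rules <- data.frame(
  pattern = c("THUNDERSTORM", "TSTM", "WIND", "RIP CURRENTS"),
  label   = c("THUNDERSTORM WIND", "THUNDERSTORM WIND", "STRONG WIND", "RIP CURRENT"),
  stringsAsFactors = FALSE
)

relabel <- function(labels, valid, rules) {
  out  <- labels
  todo <- !(labels %in% valid)   # only invalid labels are candidates
  for (k in seq_len(nrow(rules))) {
    hit <- todo & grepl(rules$pattern[k], labels, fixed = TRUE)
    out[hit] <- rules$label[k]
    todo <- todo & !hit          # first match wins, as in an else-if chain
  }
  out
}

valid  <- c("FLOOD", "THUNDERSTORM WIND", "STRONG WIND", "RIP CURRENT")
labels <- c("TSTM WIND", "THUNDERSTORM", "HIGH WINDS", "FLOOD", "RIP CURRENTS")
fixed  <- relabel(labels, valid, rules)
print(fixed)
```

The loop here runs once per rule rather than once per row, and it assumes a character column; a factor column would need `as.character()` first.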
#review fatality data
sd$FATALITIES <- as.numeric(as.character(sd$FATALITIES))
fatalities <- aggregate(FATALITIES ~ EVTYPE, sd, sum)
colnames(fatalities)[2] <- "number"
fatalities$impact <- "fatality"
topFatalities<-fatalities[fatalities$number>=quantile(fatalities$number)[4],]
#determine top3
topFatalitiesT3<-topFatalities[topFatalities$number>=quantile(topFatalities$number)[4],]
topFatalities$top3 <- "No"
w <- with(topFatalities, EVTYPE %in% topFatalitiesT3$EVTYPE)
topFatalities$top3 <- replace(topFatalities$top3, w, "Yes")
#review injury data
sd$INJURIESDiv <- as.numeric(as.character(sd$INJURIES))/10
injuries <- aggregate(INJURIESDiv ~ EVTYPE, sd, sum)
colnames(injuries)[2] <- "number"
injuries$impact <- "injury"
topInjuries<-injuries[injuries$number>=quantile(injuries$number)[4],]
#determine top3
topInjuriesT3<-topInjuries[topInjuries$number>=quantile(topInjuries$number)[4],]
topInjuries$top3 <- "No"
w <- with(topInjuries, EVTYPE %in% topInjuriesT3$EVTYPE)
topInjuries$top3 <- replace(topInjuries$top3, w, "Yes")
#Add fatalities and Injuries to the same data frame
healthImpact <- rbind(topInjuries, topFatalities)
Impact_names <- c(`fatality` = "Number of fatalities",`injury` = "Number of injuries / 10")
#Plot the results with 2 ggplot bar graphs in a facet grid
mt <- ggplot(healthImpact, aes(x=EVTYPE, y=number)) +
geom_bar(stat="identity") +
facet_grid(. ~ impact, scales = "free", labeller = as_labeller(Impact_names)) +
geom_text(aes(label = EVTYPE, colour = factor(top3)),position = position_stack(vjust = 1.1), size = 2) +
scale_colour_discrete(l = 40) +
labs( x = "Events", y = "Numbers", title ="Population health impact per the top 25% extreme weather events", subtitle = "January 1996 to November 2011") +
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank(),plot.title = element_text(size = 10, face = "bold"))
mt
In order to plot both fatalities and injuries in the same facet grid, the total injuries per event are divided by 10.
Looking at the bar charts, the top 3 severe weather events resulting in the largest number of fatalities are:
Excessive heat,
Tornadoes and
Flash Floods, in that order.
The top 3 resulting in the largest number of injuries are:
Tornadoes (by a large margin),
Floods and
Excessive heat, in that order.
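The selection used above keeps the events at or above the 75th percentile of the per-event totals; `quantile(x)[4]` is that percentile. A minimal sketch on made-up totals follows, with an explicit ranking shown as an alternative way to pick three events. The event names and numbers are assumptions for illustration only.

```r
# Made-up per-event totals
totals <- data.frame(
  EVTYPE = c("A", "B", "C", "D", "E", "F", "G", "H"),
  number = c(1, 2, 3, 4, 5, 6, 7, 8),
  stringsAsFactors = FALSE
)

# 75th percentile; identical to quantile(totals$number)[4]
cutoff <- quantile(totals$number, 0.75)

# Events in the top 25%
top <- totals[totals$number >= cutoff, ]

# Alternative: take the three largest totals by explicit ranking
top3 <- head(totals[order(-totals$number), ], 3)

print(top)
print(top3)
```

The percentile approach returns however many events clear the cutoff, whereas the ranking approach always returns exactly three (ties aside).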
#Reduce size of data set
sdDmg <- subset(sd, !(PROPDMG == 0 & CROPDMG == 0))
KeepCols = c("BGN_DATE","EVTYPE","PROPDMG","PROPDMGEXP","CROPDMG","CROPDMGEXP")
sdDmg <- sdDmg[ , (names(sdDmg) %in% KeepCols)]
## Review property damage
#make factor values numeric
sdDmg$PROPDMG <- as.numeric(as.character(sdDmg$PROPDMG))
#copy to new variable column
sdDmg$RealPD <- sdDmg$PROPDMG
## replace values with multiplied true impact
w <- with(sdDmg, PROPDMGEXP == "B")
#each step updates RealPD so that earlier scaling is kept
sdDmg$RealPD <- replace(sdDmg$RealPD, w, sdDmg$PROPDMG[w] * 1000000000)
w <- with(sdDmg, PROPDMGEXP == "M")
sdDmg$RealPD <- replace(sdDmg$RealPD, w, sdDmg$PROPDMG[w] * 1000000)
w <- with(sdDmg, PROPDMGEXP == "K")
sdDmg$RealPD <- replace(sdDmg$RealPD, w, sdDmg$PROPDMG[w] * 1000)
#sum up property damage per event type
sdDmg$RealPDDiv <- sdDmg$RealPD/10 #divide by 10 to plot on the same scale as crop
pDmg <- aggregate(RealPDDiv ~ EVTYPE, sdDmg, sum)
colnames(pDmg)[2] <- "number"
pDmg$impact <- "property"
topRealPD<-pDmg[pDmg$number>=quantile(pDmg$number)[4],]
#determine top3
topRealPDT3<-topRealPD[topRealPD$number>=quantile(topRealPD$number)[4],]
topRealPD$top3 <- "No"
w <- with(topRealPD, EVTYPE %in% topRealPDT3$EVTYPE)
topRealPD$top3 <- replace(topRealPD$top3, w, "Yes")
## Review crop damage
#make factor values numeric
sdDmg$CROPDMG <- as.numeric(as.character(sdDmg$CROPDMG))
#copy to new variable column
sdDmg$RealCD <- sdDmg$CROPDMG
## replace values with multiplied true impact
w <- with(sdDmg, CROPDMGEXP == "B")
#each step updates RealCD so that earlier scaling is kept
sdDmg$RealCD <- replace(sdDmg$RealCD, w, sdDmg$CROPDMG[w] * 1000000000)
w <- with(sdDmg, CROPDMGEXP == "M")
sdDmg$RealCD <- replace(sdDmg$RealCD, w, sdDmg$CROPDMG[w] * 1000000)
w <- with(sdDmg, CROPDMGEXP == "K")
sdDmg$RealCD <- replace(sdDmg$RealCD, w, sdDmg$CROPDMG[w] * 1000)
#sum up crop damage per event type
cDmg <- aggregate(RealCD ~ EVTYPE, sdDmg, sum)
colnames(cDmg)[2] <- "number"
cDmg$impact <- "crop"
topRealCD<-cDmg[cDmg$number>=quantile(cDmg$number)[4],]
#determine top3
topRealCDT3<-topRealCD[topRealCD$number>=quantile(topRealCD$number)[4],]
topRealCD$top3 <- "No"
w <- with(topRealCD, EVTYPE %in% topRealCDT3$EVTYPE)
topRealCD$top3 <- replace(topRealCD$top3, w, "Yes")
#Add crop and property to the same data frame
economicImpact <- rbind(topRealCD, topRealPD)
#Plot the results with 2 ggplot bar graphs in a facet grid
eImpact_names <- c(`property` = "US Dollar impact on Property / 10",`crop` = "US Dollar impact on Crop")
mt <- ggplot(economicImpact, aes(x=EVTYPE, y=number)) +
geom_bar(stat="identity") +
facet_grid(. ~ impact, labeller = as_labeller(eImpact_names)) +
geom_text(aes(label = EVTYPE, colour = factor(top3)),position = position_stack(vjust = 1.1), size = 2) + scale_y_continuous(labels = dollar) +
scale_colour_discrete(l = 40) +
labs( x = "Events", y = "Numbers", title ="Economic impact per the top 25% extreme weather events", subtitle = "January 1996 to November 2011") +
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank(),plot.title = element_text(size = 10, face = "bold"))
mt
In order to plot the two bar graphs in the same facet grid, the dollar value of the property damage is divided by 10. The top 3 events causing the most crop damage are, in order:
Hail with the largest impact
Thunderstorm wind is second
Flash Floods are a close third
The top 3 events causing the most property damage are, in order:
Thunderstorm wind with the largest impact
Flash Floods are second
Tornadoes are a close third
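The damage figures in the database are stored as a value plus an exponent code (K, M or B). Converting them to dollars can be sketched as a single named lookup that applies all multipliers in one pass. The toy rows are made up, and treating any other or blank code as a multiplier of 1 is an assumption of this sketch.

```r
# Multipliers for the exponent codes handled in this analysis
multiplier <- c(K = 1e3, M = 1e6, B = 1e9)

# Made-up damage records
dmg <- data.frame(
  PROPDMG    = c(2.5, 10, 1, 7),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Look up the multiplier per row; unmatched codes give NA
m <- unname(multiplier[as.character(dmg$PROPDMGEXP)])
m[is.na(m)] <- 1   # assumption: unknown or blank code means no scaling

dmg$RealPD <- dmg$PROPDMG * m
print(dmg)
```

Because every row is scaled in one step, the order in which the codes are handled no longer matters.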
But to see the true economic impact, the crop and property damage need to be combined:
#Review total crop and property impact combined
### recalc property - Property Damage was divided by 10 before
#sum up property damage per event type
pDmg <- aggregate(RealPD ~ EVTYPE, sdDmg, sum)
colnames(pDmg)[2] <- "number"
pDmg$impact <- "property"
combEImpact <- rbind(cDmg, pDmg) #redefine with the prop damage true value
sumEImpact <- aggregate(number ~ EVTYPE, combEImpact, sum)
topSumEImpact<-sumEImpact[sumEImpact$number>=quantile(sumEImpact$number)[4],]
#determine top3
topEIT3<-topSumEImpact[topSumEImpact$number>=quantile(topSumEImpact$number)[4],]
topSumEImpact$top3 <- "No"
w <- with(topSumEImpact, EVTYPE %in% topEIT3$EVTYPE)
topSumEImpact$top3 <- replace(topSumEImpact$top3, w, "Yes")
mt <- ggplot(topSumEImpact, aes(x=EVTYPE, y=number)) +
geom_bar(stat="identity") +
geom_text(aes(label = EVTYPE, colour = factor(top3)), position = position_stack(vjust = 1.1), size = 2) + scale_y_continuous(labels = dollar) +
scale_colour_discrete(l = 40) +
labs( x = "Events", y = "Numbers", title ="Combined Economic impact per the top 25% extreme weather events", subtitle = "Crop and property total damage in US dollar") +
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank(),plot.title = element_text(size = 10, face = "bold"))
mt
The combined graph shows that the severe weather events causing the largest economic impact are closely aligned with the events identified for property damage. This is most likely because the largest recorded damages are in property, overshadowing the crop damages:
Thunderstorm wind with the largest impact
Flash Floods are second
Tornadoes are a close third
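The combination step used for the last chart, stacking the crop and property totals and summing them per event type, can be illustrated on made-up totals. The numbers below are assumptions for illustration and are not results from the storm data.

```r
# Made-up per-event totals for each damage category
crop <- data.frame(EVTYPE = c("HAIL", "FLOOD"), number = c(5, 2),
                   stringsAsFactors = FALSE)
prop <- data.frame(EVTYPE = c("FLOOD", "TORNADO"), number = c(10, 4),
                   stringsAsFactors = FALSE)

# Stack both categories, then sum per event type
combined <- aggregate(number ~ EVTYPE, rbind(crop, prop), sum)

# Sort from largest to smallest combined impact
combined <- combined[order(-combined$number), ]
print(combined)
```

An event that appears in only one category still shows up in the combined table, since `aggregate` sums whatever rows exist for each label.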