Synopsis

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. My major focus are on the top 10 to 30 events that have had ove the course of the year a major economic and health impact. The table displays the summary of the findings. Storms and other severe weather events have different economic impact in property damages and crops. This study is aimed to be transparent and reproducible for further research.

Data processing

The packages used for the analysis are as follows in order to analyze and display the data. Also the retrieving of the data. and its treatment of dates.

library(data.table)
library(lubridate)
library(ggplot2)
library(GGally)
library(broom)
library(lucr)
library(cowplot)
library(gridExtra)
library(knitr)

dt <- data.table(read.csv("repdata%2Fdata%2FStormData.csv.bz2", stringsAsFactors = FALSE))
dt[,BGN_DATE := mdy_hms(BGN_DATE)]

Standardize Damage expenses

The cookbook mentions that the Alphabetical characters used for the damage expenses signify magnitude, include “K” for thousands, “M” for millions, and “B” for billions.

#correcting the magnitude of expenses
dt[,PROPDMG:=ifelse(PROPDMGEXP=="K",PROPDMG*1000,ifelse(PROPDMGEXP=="M",PROPDMG*1000000,ifelse(PROPDMGEXP=="B",PROPDMG*1000000000,PROPDMG)))]
dt[,CROPDMG:=ifelse(CROPDMGEXP=="K",CROPDMG*1000,ifelse(CROPDMGEXP=="M",CROPDMG*1000000,ifelse(PROPDMGEXP=="B",CROPDMG*1000000000,CROPDMG)))]

The events variable

There are 985 events categorized by the National Weather Service. Many different kinds of storms. I have selected those that have been repeated more than 50 times. THe idea was to group similar events, but the task was too large and would not be accurate enough. I only show the names of the events ordered by most frequent to least.

events <- factor(tolower(dt$EVTYPE))
# 985 levels
tab <- table(events)
sort(tab, T)[1:50]
## events
##                     hail                tstm wind        thunderstorm wind 
##                   288661                   219942                    82564 
##                  tornado              flash flood                    flood 
##                    60652                    54277                    25327 
##       thunderstorm winds                high wind                lightning 
##                    20843                    20214                    15754 
##               heavy snow               heavy rain             winter storm 
##                    15708                    11742                    11433 
##           winter weather             funnel cloud         marine tstm wind 
##                     7045                     6844                     6175 
## marine thunderstorm wind               waterspout              strong wind 
##                     5812                     3796                     3569 
##     urban/sml stream fld                 wildfire                 blizzard 
##                     3392                     2761                     2719 
##                  drought                ice storm           excessive heat 
##                     2488                     2006                     1678 
##               high winds         wild/forest fire             frost/freeze 
##                     1533                     1457                     1343 
##                dense fog       winter weather/mix           tstm wind/hail 
##                     1293                     1104                     1028 
##  extreme cold/wind chill                     heat                high surf 
##                     1002                      767                      734 
##           tropical storm           flash flooding             extreme cold 
##                      690                      682                      657 
##            coastal flood         lake-effect snow        flood/flash flood 
##                      656                      636                      625 
##                     snow                landslide          cold/wind chill 
##                      617                      600                      539 
##                      fog              rip current              marine hail 
##                      538                      470                      442 
##               dust storm                avalanche                     wind 
##                      427                      386                      346 
##             rip currents              storm surge 
##                      304                      261

The table summary of the data below shows a partial result that will further be analyzed.

summarized <- dt[,.(Fatalities= sum(FATALITIES),Injuries= sum(INJURIES),Crop_Damage = to_currency(sum(CROPDMG), currency_symbol = "$"),Property_Damage = to_currency(sum(PROPDMG), currency_symbol ="$")), by=EVTYPE][order(-Fatalities),]
kable(summarized[Fatalities>1&Injuries>1,])
EVTYPE Fatalities Injuries Crop_Damage Property_Damage
TORNADO 5633 91346 $414,953,270.00 $56,925,660,790.20
EXCESSIVE HEAT 1903 6525 $492,402,000.00 $7,753,700.00
FLASH FLOOD 978 1777 $1,421,317,100.00 $16,140,812,067.10
HEAT 937 2100 $1,461,500.40 $1,797,000.00
LIGHTNING 816 5230 $12,092,090.00 $928,659,446.70
TSTM WIND 504 6957 $554,007,350.00 $4,484,928,495.00
FLOOD 470 6789 $5,661,968,450.00 $144,657,709,807.00
RIP CURRENT 368 232 $0.00 $1,000.00
HIGH WIND 248 1137 $638,571,300.00 $5,270,046,295.00
AVALANCHE 224 170 $0.00 $3,721,800.00
WINTER STORM 206 1321 $26,944,000.00 $6,688,497,251.00
RIP CURRENTS 204 297 $0.00 $162,000.00
HEAT WAVE 172 309 $5,550,000.00 $10,460,050.00
EXTREME COLD 160 231 $1,292,973,000.00 $67,737,400.00
THUNDERSTORM WIND 133 1488 $414,843,050.00 $3,483,121,284.00
HEAVY SNOW 127 1021 $134,653,100.00 $932,589,141.70
EXTREME COLD/WIND CHILL 125 24 $50,000.00 $8,648,000.00
STRONG WIND 103 280 $64,953,500.00 $175,241,450.00
BLIZZARD 101 805 $112,060,000.00 $659,213,950.00
HIGH SURF 101 152 $0.00 $89,575,000.00
HEAVY RAIN 98 251 $733,399,800.00 $694,248,090.00
EXTREME HEAT 96 155 $5,000,000.00 $115,000.00
COLD/WIND CHILL 95 12 $600,000.00 $1,990,000.00
ICE STORM 89 1975 $22,113,505.00 $3,944,927,860.00
WILDFIRE 75 911 $295,472,800.00 $4,765,114,000.00
THUNDERSTORM WINDS 64 908 $190,650,792.00 $1,733,459,025.70
HURRICANE/TYPHOON 64 1275 $2,607,872,800.00 $69,305,840,000.00
FOG 62 734 $0.00 $13,155,500.00
HURRICANE 61 46 $2,741,910,000.00 $11,868,319,010.00
TROPICAL STORM 58 340 $678,346,000.00 $7,703,890,550.00
HEAVY SURF/HIGH SURF 42 48 $0.00 $9,870,000.00
LANDSLIDE 38 52 $20,017,000.00 $324,596,000.00
HIGH WINDS 35 302 $40,720,600.00 $608,323,748.00
COLD 35 48 $0.00 $500,000.00
WINTER WEATHER 33 398 $15,000,000.00 $20,866,000.00
TSUNAMI 33 129 $20,000.00 $144,062,000.00
URBAN/SML STREAM FLD 28 79 $8,488,100.00 $58,309,650.00
WINTER WEATHER/MIX 28 72 $0.00 $6,372,000.00
WIND 23 86 $300,000.00 $8,684,500.00
DUST STORM 22 440 $3,100,000.00 $5,549,000.00
FLASH FLOODING 19 8 $15,106,060.00 $307,763,604.00
DENSE FOG 18 342 $0.00 $9,674,000.00
FLOOD/FLASH FLOOD 17 15 $95,034,000.00 $174,039,009.20
EXTREME WINDCHILL 17 5 $17,000,000.00 $755,000.00
HAIL 15 1361 $3,025,537,890.00 $15,727,367,052.60
MARINE STRONG WIND 14 22 $0.00 $418,330.00
STORM SURGE 13 38 $5,000.00 $43,323,536,000.00
WILD/FOREST FIRE 12 545 $106,796,830.00 $3,001,829,500.00
UNSEASONABLY WARM 11 17 $10,000.00 $0.00
STORM SURGE/TIDE 11 5 $850,000.00 $4,641,188,000.00
WINTER STORMS 10 17 $500,000.00 $500,000.00
MARINE THUNDERSTORM WIND 10 26 $50,000.00 $436,400.00
MARINE TSTM WIND 9 8 $0.00 $5,421,000.00
TROPICAL STORM GORDON 8 43 $500,000.00 $500,000.00
ROUGH SEAS 8 5 $0.00 $0.00
FREEZING RAIN 7 23 $0.00 $8,111,500.00
MARINE MISHAP 7 5 $0.00 $0.00
STRONG WINDS 7 21 $5,000,000.00 $2,314,790.00
HEAVY SURF 7 40 $0.00 $1,390,000.00
GLAZE 7 216 $0.00 $910,000.00
FLOODING 6 2 $8,855,500.00 $108,255,006.00
ICE 6 137 $0.00 $12,655,000.00
SNOW 5 29 $10,000.00 $14,762,550.00
HIGH SEAS 5 8 $0.00 $15,500.00
ICY ROADS 5 31 $0.00 $341,200.00
TSTM WIND/HAIL 5 95 $64,696,250.00 $44,335,500.00
GUSTY WINDS 4 8 $200,000.00 $343,000.00
RAIN/SNOW 4 2 $0.00 $0.00
HEAT WAVE DROUGHT 4 15 $50,000.00 $200,000.00
Mudslide 4 2 $0.00 $0.00
WATERSPOUT 3 29 $0.00 $9,353,700.00
COASTAL FLOOD 3 2 $0.00 $237,665,560.00
WILD FIRES 3 150 $0.00 $624,100,000.00
DRY MICROBURST 3 28 $15,000.00 $6,732,600.00
WATERSPOUT/TORNADO 3 42 $0.00 $51,110,000.00
HIGH WINDS/SNOW 3 6 $0.00 $150,000.00
HIGH WIND AND SEAS 3 20 $0.00 $50,000.00
High Surf 3 4 $0.00 $380,000.00
RIVER FLOOD 2 2 $5,029,459,000.00 $5,118,945,500.00
DUST DEVIL 2 42 $0.00 $700,330.00
FREEZING DRIZZLE 2 15 $0.00 $75,000.00
RECORD HEAT 2 50 $0.00 $0.00
SNOW SQUALL 2 35 $0.00 $680,000.00
EXCESSIVE RAINFALL 2 21 $0.00 $0.00
MIXED PRECIP 2 26 $0.00 $0.00

RESULTS

Across the United States, which types of events (as indicated in the 𝙴𝚅𝚃𝚈𝙿𝙴 variable) are most harmful with respect to population health?

The figure displays all the events which are too long to see in an Q4 paper.

health <- dt[,.(injuries= sum(INJURIES),fatalities= sum(FATALITIES)),by=EVTYPE][injuries>1 & fatalities>1]
health.m <- melt(health)
## Warning in melt.data.table(health): To be consistent with reshape2's melt,
## id.vars and measure.vars are internally guessed when both are 'NULL'. All
## non-numeric/integer/logical type columns are conisdered id.vars, which in
## this case are columns [EVTYPE]. Consider providing at least one of 'id' or
## 'measure' vars in future.
g <- ggplot(health.m, aes(EVTYPE, value/1000))  + xlab("Event types") + ylab("Casualties by the thousands ")
g <- g + geom_bar(aes(fill=variable),stat="identity") +ggtitle("Casualties by all events")
g <- g + theme(axis.text.y=element_text(size=4))+ coord_flip()
g
Fig 1:All Events

Fig 1:All Events

From the graph above I further studied the top 3 events, tornado, excessive heat and flashfoods in terms of fatalities. in terms of injuries it is different.

##plotting the most important throughout event the years

tornado <- dt[EVTYPE=="TORNADO",.(injuries= sum(INJURIES),fatalities=sum(FATALITIES)),
              by=BGN_DATE]
tornado.m <- melt(tornado,id.vars=1)
g1 <- ggplot(tornado.m, aes(as.numeric(year(BGN_DATE)), value))  + xlab("Event types") + ylab("Casualties") + ggtitle("Yearly Tornado Casualties")
g1 <- g1 + geom_bar(aes(fill=variable),stat="identity")
g1 <- g1 + theme(axis.text.x=element_text(angle=90, hjust =1))

##Excessive Heat
heat <- dt[EVTYPE=="EXCESSIVE HEAT",.(injuries= sum(INJURIES),fatalities=sum(FATALITIES)),
              by=BGN_DATE]
heat.m <- melt(heat,id.vars=1)
g2 <- ggplot(heat.m, aes(as.numeric(year(BGN_DATE)), value))  + xlab("Event types") + ylab("Casualties") + ggtitle("Yearly Excessive Heat Casualties")
g2 <- g2 + geom_bar(aes(fill=variable),stat="identity")
g2 <- g2 + theme(axis.text.x=element_text(angle=90, hjust =1))
##Flashfloods
floods <- dt[EVTYPE=="FLASH FLOOD",.(injuries= sum(INJURIES),fatalities=sum(FATALITIES)),
              by=BGN_DATE]
floods.m <- melt(floods,id.vars=1)
g3 <- ggplot(floods.m, aes(as.numeric(year(BGN_DATE)), value))  + xlab("Event types") + ylab("Casualties") + ggtitle("Yearly Flash Floods Casualties")
g3 <- g3 + geom_bar(aes(fill=variable),stat="identity")
g3 <- g3 + theme(axis.text.x=element_text(angle=90, hjust =1))

grid.arrange(g1,g2,g3,ncol=2, nrow=2, layout_matrix = rbind(c(1,1), c(2,3)))
Fig 2:combination of casualties per event through time

Fig 2:combination of casualties per event through time

Across the United States, which types of events have the greatest economic consequences?

  • Top ten Crop Damage:
summarized <- dt[,.(Fatalities= sum(FATALITIES),Injuries= sum(INJURIES),Crop_Damage = sum(CROPDMG),Property_Damage = sum(PROPDMG)), by=EVTYPE][order(-Fatalities),]

croppy <- summarized
proppy <- summarized
croppy <- croppy[order(-Crop_Damage),][1:10,][,.(EVTYPE,Fatalities,Injuries,Crop_Damage = to_currency(Crop_Damage, currency_symbol = "$"),Property_Damage = to_currency(Property_Damage, currency_symbol = "$"))]

proppy <- proppy[order(-Property_Damage),][1:10,][,.(EVTYPE,Fatalities,Injuries,Crop_Damage = to_currency(Crop_Damage, currency_symbol = "$"),Property_Damage = to_currency(Property_Damage, currency_symbol = "$"))]

kable(croppy)
EVTYPE Fatalities Injuries Crop_Damage Property_Damage
DROUGHT 0 4 $12,472,566,001.50 $ 1,046,106,000.00
FLOOD 470 6789 $ 5,661,968,450.00 $144,657,709,807.00
RIVER FLOOD 2 2 $ 5,029,459,000.00 $ 5,118,945,500.00
HAIL 15 1361 $ 3,025,537,890.00 $ 15,727,367,052.60
HURRICANE 61 46 $ 2,741,910,000.00 $ 11,868,319,010.00
HURRICANE/TYPHOON 64 1275 $ 2,607,872,800.00 $ 69,305,840,000.00
FLASH FLOOD 978 1777 $ 1,421,317,100.00 $ 16,140,812,067.10
EXTREME COLD 160 231 $ 1,292,973,000.00 $ , 67,737,400.00
FROST/FREEZE 0 0 $ 1,094,086,000.00 $ , 9,480,000.00
HEAVY RAIN 98 251 $ ,733,399,800.00 $ ,694,248,090.00
* Top ten Property D amage:
#reflecting the real dollars to the damages

kable(proppy)
EVTYPE Fatalities Injuries Crop_Damage Property_Damage
FLOOD 470 6789 $5,661,968,450.00 $144,657,709,807.00
HURRICANE/TYPHOON 64 1275 $2,607,872,800.00 $ 69,305,840,000.00
TORNADO 5633 91346 $ ,414,953,270.00 $ 56,925,660,790.20
STORM SURGE 13 38 $ , , 5,000.00 $ 43,323,536,000.00
FLASH FLOOD 978 1777 $1,421,317,100.00 $ 16,140,812,067.10
HAIL 15 1361 $3,025,537,890.00 $ 15,727,367,052.60
HURRICANE 61 46 $2,741,910,000.00 $ 11,868,319,010.00
TROPICAL STORM 58 340 $ ,678,346,000.00 $ 7,703,890,550.00
WINTER STORM 206 1321 $ , 26,944,000.00 $ 6,688,497,251.00
HIGH WIND 248 1137 $ ,638,571,300.00 $ 5,270,046,295.00
summarized <- dt[,.(Fatalities= sum(FATALITIES),Injuries= sum(INJURIES),Crop_Damage = sum(CROPDMG),Property_Damage = sum(PROPDMG)), by=EVTYPE][order(-Fatalities),]
summarized.m <- melt(summarized[,c(1,4,5)],id.vars=1)
summarized.m <- summarized.m[value>100000]
g4 <- ggplot(summarized.m, aes(EVTYPE,value/1000000000))  + xlab("Event types") + ylab("Damage Expenses in Billions of Dollars")
g4 <- g4+ geom_bar(aes(fill=variable),stat="identity")+ ggtitle("All Economic Damages by Event")
g4 <- g4 + theme(axis.text.y=element_text(size = 4)) + coord_flip()
#top 30 total economic damages
summarized.m.1 <- summarized.m[,sum(value),by= EVTYPE][order(-V1)][1:30,]
g5 <- ggplot(summarized.m.1, aes(EVTYPE,V1/1000000000)) + xlab("Event types") + ylab("Damage Expenses in Billions of Dollars")
g5 <- g5+ geom_bar(stat="identity") + ggtitle("Top 30 Economic Damages by Event")
g5 <- g5 + theme(axis.text.y=element_text(size = 4)) + coord_flip()

grid.arrange(g4,g5,nrow=2)
Fig 3:Damages Expenses by event

Fig 3:Damages Expenses by event