This paper explores the US National Oceanic and Atmospheric Administration (NOAA) Storm Database (from 1950 to November 2011) and addresses the following questions:
(A) Which types of weather events are the most harmful with respect to population health; and
(B) Which types of weather events have the greatest economic consequences.
Based on our exploration of the NOAA storm database, we first define what counts as most harmful to population health and what indicates the greatest economic consequences.
According to our analysis, Extreme Heat conditions caused the highest number of deaths in a single occurrence, while Tornado caused the highest number of injuries in a single occurrence. Tornado was also the most frequent of the event types that are most harmful to population health.
Hurricane/Typhoon was the most frequent of the weather events that caused great damage to property and crops, and therefore heavy economic consequences. It is followed by Flood and Extreme Heat conditions. In terms of severity, the greatest damage to property was caused by Flood, while the greatest damage to crops was caused by (River) Flood and Winter Storm.
For data processing, we used functions such as select and filter to create relevant subsets of the database. We also used functions such as match and grepl for data cleaning, namely renaming and classifying events. To display the results, we used tables and figures (barplots) that show events in order of occurrence. We also used functions such as max to highlight the events that caused the greatest harm to population health or the greatest property/crop damage.
After downloading the file from the URL and saving it to the local drive (see code below), we process the data to address questions (A) and (B).
library(R.utils) #provides bunzip2
library(dplyr) #provides select and filter
if(!file.exists("./data")){dir.create("./data")}
download.file(url="https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "./data/StormData.csv.bz2",method = "curl")
bunzip2("./data/StormData.csv.bz2","./data/StormData.csv",remove=FALSE,skip=TRUE)
## [1] "./data/StormData.csv"
## attr(,"temporary")
## [1] FALSE
Stormdata<-read.csv("./data/StormData.csv",sep=",",header=TRUE)
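As an aside, base R can usually read a bzip2-compressed CSV directly, because read.csv opens the path through file(), which detects the compression when reading. The explicit bunzip2 step may therefore be optional. A minimal sketch, in which the toy data frame and the temporary path are hypothetical stand-ins for the storm data:

```r
# Toy stand-in for the storm data (hypothetical values)
toy <- data.frame(EVTYPE = c("TORNADO", "HEAT"), FATALITIES = c(42, 583))

# Write it as a bzip2-compressed CSV in a temporary location
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
write.csv(toy, con, row.names = FALSE)
close(con)

# read.csv opens the path via file(), which detects the bzip2 compression
toy_back <- read.csv(tmp, stringsAsFactors = FALSE)
```

Keeping the decompressed copy, as done above, is still reasonable when the file is read repeatedly.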
(A) Weather Events that are Most Harmful to Population Health
First, we define what is considered most harmful to population health. Since death is the ultimate harm to health, we consider weather events that brought about fatalities, i.e. FATALITIES > 0. For injuries, we consider weather events that brought about more than 5 injuries. This amounts to the top 1% of the casualties. The code below creates a data frame “Harmful” with event types that caused fatalities or more than 5 injuries.
Harmful<-select(Stormdata,EVTYPE,INJURIES,FATALITIES)
Harmful<-filter(Harmful,FATALITIES>0 | INJURIES>5)
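One way to check how selective an injury cutoff is would be to compute the share of records above it and compare it with a high percentile. The sketch below uses made-up injury counts in place of Stormdata$INJURIES, so its numbers say nothing about the real data:

```r
# Hypothetical injury counts standing in for Stormdata$INJURIES
injuries <- c(rep(0, 95), 1, 2, 3, 6, 50)

# Proportion of records with more than 5 injuries
share_above <- mean(injuries > 5)

# 99th percentile of the injury counts, for comparison with the cutoff
p99 <- quantile(injuries, probs = 0.99, names = FALSE)
```

Applied to the full data set, the same two lines would show whether "more than 5 injuries" sits close to the top 1% of records.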
Second, we set out to find the types of weather events that brought about deaths or more than 5 injuries. Data cleaning work is done to:
address the event type names (i.e. variable EVTYPE), which have misspellings and inconsistent formatting (e.g. mixed capitalisation, singular/plural forms and abbreviations); and
categorise the events (e.g. classifying “Heat”, “Heat Wave” and “Excessive Heat” together under “Extreme Heat”).
t<-table(Harmful$EVTYPE)
w=as.data.frame(t)
event<-filter(w,Freq>0)
event$Evt<-as.character(event$Var1)
event$Evt<-gsub("DENSE FOG","FOG",event$Evt)
event$Evt<-gsub("FOG AND COLD TEMPERATURES","FOG",event$Evt)
event$Evt<-gsub("DUST DEVIL","DUST STORM",event$Evt)
event$Evt<-gsub("LIGHTNING.","LIGHTNING",event$Evt,fixed=TRUE) #fixed=TRUE so that "." matches a literal full stop only
event$Evt<-gsub("WILD FIRES","WILD/FOREST FIRE",event$Evt)
event$Evt<-gsub("WILDFIRE","WILD/FOREST FIRE",event$Evt)
event$Evt<-gsub("Marine Accident","MARINE MISHAP",event$Evt)
event$Evt<-gsub("EXCESSIVE RAINFALL","HEAVY RAIN",event$Evt)
toMatchC<-c("Cold","COLD","CHILL","WINTER WEATHER","WINTRY","LOW TEMPERATURE")
event$Evt[grepl(paste(toMatchC,collapse = "|"),event$Evt)]<-"EXTREME COLD" #relabels every matching event type in place
toMatchF<-c("Flood","FLOOD","FLD")
event$Evt[grepl(paste(toMatchF,collapse = "|"),event$Evt)]<-"FLOOD" #relabels every matching event type in place
toMatchH<-c("Heat","HEAT")
event$Evt[grepl(paste(toMatchH,collapse = "|"),event$Evt)]<-"EXTREME HEAT" #relabels every matching event type in place
event$Evt[grepl("TORNADO",event$Evt)]<-"TORNADO" #relabels every matching event type in place
event$Evt[grepl("HURRICANE",event$Evt)]<-"HURRICANE" #relabels every matching event type in place
toMatchHy<-c("Hyperthermia","HYPERTHERMIA","Hypothermia","HYPOTHERMIA")
event$Evt[grepl(paste(toMatchHy,collapse = "|"),event$Evt)]<-"HYPOTHERMIA" #hyperthermia and hypothermia entries share this one label
toMatchTH<-c("THUNDERSTORM","TSTM","MICROBURST","THUNDERTORM")
event$Evt[grepl(paste(toMatchTH,collapse = "|"),event$Evt)]<-"THUNDERSTORM" #relabels every matching event type in place
toMatchR<-c("ROAD","BLACK ICE","MIXED PRECIP")
event$Evt[grepl(paste(toMatchR,collapse = "|"),event$Evt)]<-"ICY ROADS" #relabels every matching event type in place
toMatchWS<-c("SNOW","snow","Snow","Squall","SQUALL","FROST","FREEZE","FREEZING","ICE STORM","GLAZE","WINTER STORM","SLEET","HAIL","BLIZZARD","ICE","Freezing","AVALANCE","AVALANCHE")
event$Evt[grepl(paste(toMatchWS,collapse = "|"),event$Evt)]<-"WINTER STORM" #relabels every matching event type in place
event$Evt[99]<-"LIGHT SNOW" #Row 99 is LIGHT SNOW, which should not be classed as WINTER STORM (hard-coded index, valid only for this subset)
toMatchSl<-c("SLIDE","slide")
event$Evt[grepl(paste(toMatchSl,collapse = "|"),event$Evt)]<-"LAND/MUD SLIDES" #relabels every matching event type in place
toMatchSS<-c("COASTAL","STORM SURGE","RISING WATER","SURF","Surf","WATER","WAVE","surf","SEAS","SWELLS","RIP CURRENT")
event$Evt[grepl(paste(toMatchSS,collapse = "|"),event$Evt)]<-"STORM SURGE" #relabels every matching event type in place
toMatchWW<-c("WIND","wind","Wind","TROPICAL STORM")
event$Evt[grepl(paste(toMatchWW,collapse = "|"),event$Evt)]<-"WIND STORM" #relabels every matching event type in place
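The classification steps above all follow one pattern: find the event names that match a set of keywords and overwrite them with a category label. As a sketch only, that pattern could be wrapped in a small helper. The function name relabel, its ignore.case default and the toy event names below are assumptions for illustration and are not part of the original analysis:

```r
# Hypothetical helper: relabel every event name matching any of the patterns
relabel <- function(evt, patterns, label, ignore.case = TRUE) {
  hit <- grepl(paste(patterns, collapse = "|"), evt, ignore.case = ignore.case)
  evt[hit] <- label
  evt
}

# Toy event names (made up for illustration)
evt <- c("Heat Wave", "EXCESSIVE HEAT", "FLASH FLOOD", "TORNADO F2")
evt <- relabel(evt, "HEAT", "EXTREME HEAT")
evt <- relabel(evt, c("FLOOD", "FLD"), "FLOOD")
evt <- relabel(evt, "TORNADO", "TORNADO")
```

As in the code above, the order of the calls matters: a name relabelled by an earlier call can no longer match the keywords of a later one unless the new label itself contains them.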
(B) Weather Events that have the Greatest Economic Consequences
First, we define what is considered to be of great economic consequence. Events that damage property and crops affect livelihoods, exports and trade, and can therefore have heavy economic consequences for the country. Upon exploring the data, we identify severe property and crop damage by the billions exponent, i.e. PROPDMGEXP == “B” or CROPDMGEXP == “B”. From the two tables below, we note that there are 40 and 9 cases where the damage to property and crops, respectively, was in the billions.
EconConseq<-select(Stormdata,EVTYPE,PROPDMG,PROPDMGEXP,CROPDMG,CROPDMGEXP)
table(EconConseq$PROPDMGEXP)
##
##             -      ?      +      0      1      2      3      4      5
## 465934      1      8      5    216     25     13      4      4     28
##      6      7      8      B      h      H      K      m      M
##      4      5      1     40      1      6 424665      7  11330
table(EconConseq$CROPDMGEXP)
##
##             ?      0      2      B      k      K      m      M
## 618413      7     19      1      9     21 281832      1   1994
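To compare damage across records with different exponent codes, the amount and its code can be combined into a single dollar figure. The sketch below assumes the common reading of the letter codes (H = hundreds, K = thousands, M = millions, B = billions) and leaves every other code as NA. The multiplier table and the to_dollars function are illustrative names and are not used elsewhere in this analysis:

```r
# Assumed multipliers for the exponent codes; other codes are left as NA
multiplier <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical helper: combine an amount with its exponent code
to_dollars <- function(amount, exp_code) {
  amount * unname(multiplier[toupper(as.character(exp_code))])
}

# Toy records with mixed exponent codes, including one unrecognised code
dmg     <- c(115, 32.5, 500, 5)
dmg_exp <- c("B", "M", "k", "?")
dollars <- to_dollars(dmg, dmg_exp)
```

With such a column, events could be ranked by total damage instead of by the count of billion-scale records, which is a different question from the one answered in this report.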
Second, we set out to find the types of weather events that brought about property or crop damage in the billions (the code below filters the data frame “EconConseq” accordingly). As before, data cleaning work was done to:
address the event type names (i.e. variable EVTYPE); and
categorise the events.
EconConseq<-filter(EconConseq,PROPDMGEXP=="B" | CROPDMGEXP=="B")
s<-table(EconConseq$EVTYPE)
z=as.data.frame(s)
eventE<-filter(z,Freq>0)
eventE$Evt<-as.character(eventE$Var1)
eventE$Evt[grepl("FLOOD",eventE$Evt)]<-"FLOOD" #relabels every matching event type in place
eventE$Evt[grepl("HURRICANE",eventE$Evt)]<-"HURRICANE/TYPHOON" #relabels every matching event type in place
eventE$Evt[grepl("TORNADO",eventE$Evt)]<-"TORNADO" #relabels every matching event type in place
toMatchEWS<-c("HAIL","ICE","FREEZE")
eventE$Evt[grepl(paste(toMatchEWS,collapse = "|"),eventE$Evt)]<-"WINTER STORM" #relabels every matching event type in place
toMatchEHE<-c("DROUGHT","HEAT")
eventE$Evt[grepl(paste(toMatchEHE,collapse = "|"),eventE$Evt)]<-"EXTREME HEAT" #relabels every matching event type in place
eventE$Evt[grepl("STORM SURGE",eventE$Evt)]<-"STORM SURGE" #relabels every matching event type in place
eventE$Evt[grepl("WILDFIRE",eventE$Evt)]<-"WILD/FOREST FIRE" #relabels every matching event type in place
toMatchEWT<-c("WIND","TROPICAL STORM")
eventE$Evt[grepl(paste(toMatchEWT,collapse = "|"),eventE$Evt)]<-"WIND STORM" #relabels every matching event type in place
eventE$Evt<-gsub("HEAVY RAIN/SEVERE WEATHER","HEAVY RAIN",eventE$Evt)
(A) Weather Events that are Most Harmful to Population Health
Weather event types that are the most harmful to population health, i.e. those causing death or more than 5 injuries, are shown in the table below, from the highest occurrence to the lowest:
frequency<-aggregate(x=event$Freq,by=list(EventType=event$Evt),FUN=sum)
names(frequency)<-c("EventType","Freq")
HightoLow<-frequency[order(frequency$Freq,decreasing = TRUE),]
HightoLow
## EventType Freq
## 17 TORNADO 2969
## 5 FLOOD 1081
## 4 EXTREME HEAT 872
## 13 LIGHTNING 872
## 16 THUNDERSTORM 863
## 21 WINTER STORM 691
## 15 STORM SURGE 671
## 20 WIND STORM 416
## 3 EXTREME COLD 396
## 19 WILD/FOREST FIRE 92
## 6 FOG 83
## 7 HEAVY RAIN 72
## 8 HURRICANE 52
## 2 DUST STORM 30
## 11 LAND/MUD SLIDES 16
## 10 ICY ROADS 12
## 9 HYPOTHERMIA 8
## 14 MARINE MISHAP 3
## 18 TSUNAMI 2
## 1 DROWNING 1
## 12 LIGHT SNOW 1
Top 10 weather event types (in order of occurrence frequency) that are harmful to human population are shown in the barplot below:
Top10<-HightoLow[10:1,]
barplot(Top10$Freq,horiz=TRUE,col=rainbow(10),legend.text = Top10$EventType,main="Top 10 Weather Events: Harmful to Population Health", xlab = "Frequency")
As the barplot shows, Tornado was the most frequent event type resulting in death and/or more than 5 injuries, bringing much harm to the population. It is followed by Flood and by Extreme Heat conditions, the latter tied with Lightning at 872 occurrences.
In terms of severity, Extreme Heat conditions caused the highest number of deaths in a single occurrence (583), while Tornado caused the highest number of injuries in a single occurrence (1700) (see code below).
FHarmful<-filter(Harmful,FATALITIES>0)
max(FHarmful$FATALITIES)
## [1] 583
MaxFHarmful<-filter(Harmful,FATALITIES==max(FATALITIES)) #compares numerically instead of against the string "583"
MaxFHarmful
## EVTYPE INJURIES FATALITIES
## 1 HEAT 0 583
IHarmful<-filter(Harmful,INJURIES>5)
max(IHarmful$INJURIES)
## [1] 1700
MaxIHarmful<-filter(Harmful,INJURIES==max(INJURIES)) #compares numerically instead of against the string "1700"
MaxIHarmful
## EVTYPE INJURIES FATALITIES
## 1 TORNADO 1700 42
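A base R alternative for pulling out the worst single record is which.max, which avoids typing the maximum value by hand. The sketch below runs on a toy data frame that mimics the columns of “Harmful”. The data frame name harm and the FLOOD row are made up for illustration:

```r
# Toy subset mimicking the columns of Harmful (the FLOOD row is invented)
harm <- data.frame(
  EVTYPE     = c("HEAT", "TORNADO", "FLOOD"),
  INJURIES   = c(0, 1700, 6),
  FATALITIES = c(583, 42, 1),
  stringsAsFactors = FALSE
)

# which.max returns the index of the first maximum, so no value is hard-coded
worst_fatal  <- harm[which.max(harm$FATALITIES), ]
worst_injury <- harm[which.max(harm$INJURIES), ]
```

Note that which.max returns only the first row when several records tie for the maximum, so a comparison against max() is the safer choice when ties should all be reported.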
(B) Weather Events that have the Greatest Economic Consequences
Weather event types that have the greatest economic consequences, i.e. those causing property or crop damage in the billions, are shown in the table below, from the highest occurrence to the lowest:
frequencyE<-aggregate(x=eventE$Freq,by=list(EventType=eventE$Evt),FUN=sum)
names(frequencyE)<-c("EventType","Freq")
HightoLowE<-frequencyE[order(frequencyE$Freq,decreasing = TRUE),]
HightoLowE
## EventType Freq
## 4 HURRICANE/TYPHOON 18
## 2 FLOOD 7
## 1 EXTREME HEAT 5
## 7 TORNADO 4
## 10 WINTER STORM 4
## 6 STORM SURGE 3
## 8 WILD/FOREST FIRE 2
## 9 WIND STORM 2
## 3 HEAVY RAIN 1
## 5 SEVERE THUNDERSTORM 1
Top 10 weather event types (in order of occurrence frequency) that have great economic consequences are shown in the barplot below:
Top10E<-HightoLowE[10:1,]
barplot(Top10E$Freq,horiz=TRUE,col=rainbow(10),legend.text = Top10E$EventType,main="Top 10 Weather Events: Great Economic Consequences", xlab = "Frequency")
As the barplot shows, Hurricane/Typhoon was the most frequent event type causing great damage to property and crops, and therefore heavy economic consequences. It is followed by Flood and Extreme Heat conditions.
In terms of severity, Flood caused the greatest property damage, while River Flood and Ice Storm (classified above as Winter Storm) caused the greatest crop damage (see code below).
EconConseq<-filter(EconConseq,PROPDMGEXP=="B" | CROPDMGEXP=="B")
PEconConseq<-filter(EconConseq,PROPDMGEXP=="B")
max(PEconConseq$PROPDMG)
## [1] 115
MaxPropConseq<-filter(PEconConseq,PROPDMG==max(PROPDMG)) #compares numerically within the billion-scale property records
MaxPropConseq
## EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 FLOOD 115 B 32.5 M
CEconConseq<-filter(EconConseq,CROPDMGEXP=="B")
max(CEconConseq$CROPDMG)
## [1] 5
MaxCropConseq<-filter(CEconConseq,CROPDMG==max(CROPDMG)) #compares numerically within the billion-scale crop records
MaxCropConseq
## EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 RIVER FLOOD 5 B 5 B
## 2 ICE STORM 500 K 5 B
While we cannot control the occurrence of the weather events that harm population health and cause great economic consequences, policies, technologies and infrastructure can be put in place to minimise the damage caused by certain weather events. For instance, we can provide hydration and cooling facilities during extreme heat conditions, and build better dam and drainage systems to deal with floods.