Title

This paper explores the US National Oceanic and Atmospheric Administration (NOAA) Storm Database (from 1950 to November 2011) and addresses the following:

  1. Types of weather events that are the most harmful with respect to population health; and

  2. Types of weather events that have the greatest economic consequences.

Synopsis

From our exploration of the NOAA storm database, we define what counts as most harmful to population health and what indicates the greatest economic consequences.

In our analysis, an Extreme Heat event caused the highest number of deaths in a single occurrence, while a Tornado caused the highest number of injuries. Tornado was also the most frequent of the event types that are most harmful to population health.

Hurricane/Typhoon was the most frequent of the weather events that caused great damage to property and crops, and hence heavy economic consequences, followed by Flood and Extreme Heat. In terms of severity, the greatest damage to property was caused by Flood, while the greatest damage to crops was caused by (River) Flood and Winter Storm.

On data processing tools, we used functions like select and filter to create relevant subsets of the database. We also used functions like match and grepl to perform some data cleaning work, such as event re-naming and event classification. For display of results, we used tables and figures (namely barplots) to show events in order of occurrence level. We also used functions like max to highlight events that caused the greatest harm to population health or greatest property/crop damage.

Data Processing

After downloading the file from the URL and saving it to the local drive (see code below), we process the data to address questions (A) and (B).

library(R.utils) # provides bunzip2
library(dplyr)   # provides select and filter
if(!file.exists("./data")){dir.create("./data")}
download.file(url="https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2", destfile = "./data/StormData.csv.bz2",method = "curl")
bunzip2("./data/StormData.csv.bz2","./data/StormData.csv",remove=FALSE,skip=TRUE)
## [1] "./data/StormData.csv"
## attr(,"temporary")
## [1] FALSE
Stormdata<-read.csv("./data/StormData.csv",sep=",",header=TRUE)
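
As a side note, read.csv can read a bzip2-compressed file directly, because R's file connections detect the compression when reading, so the explicit bunzip2 step is optional. A minimal sketch on a small temporary file (the toy rows are invented for illustration and are not NOAA records):

```r
# Write a tiny bzip2-compressed CSV to a temporary file (toy data)
tmp <- tempfile(fileext = ".csv.bz2")
con <- bzfile(tmp, open = "w")
writeLines(c("EVTYPE,FATALITIES,INJURIES", "TORNADO,3,20", "HEAT,5,0"), con)
close(con)

# read.csv opens the compressed file transparently
toy <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)
print(toy)
```

Reading the compressed file directly saves disk space, at the cost of decompressing it again on every read.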

(A) Weather Events that are Most Harmful to Population Health

First, we define what is considered most harmful to population health. Since death is the ultimate harm to health, we consider weather events that caused fatalities, i.e. fatalities > 0. For injuries, we consider weather events that caused more than 5 injuries. This amounts to the top 1% of the casualties. The code below creates a data frame “Harmful” with event types that caused fatalities or more than 5 injuries.

Harmful<-select(Stormdata,EVTYPE,INJURIES,FATALITIES)
Harmful<-filter(Harmful,FATALITIES>0 | INJURIES>5)
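
To see how selective such thresholds are, one can compute the share of records that pass the filter. The sketch below uses base R on a small invented data frame (the name toy and all of its values are hypothetical, not taken from the database):

```r
# Toy records in the shape of the EVTYPE/INJURIES/FATALITIES subset
toy <- data.frame(
  EVTYPE     = c("HAIL", "TORNADO", "HEAT", "FLOOD", "HAIL", "TSTM WIND", "LIGHTNING", "HAIL"),
  INJURIES   = c(0, 12, 0, 2, 0, 0, 6, 0),
  FATALITIES = c(0, 1, 3, 0, 0, 0, 0, 0)
)

# Same rule as above: any fatality, or more than 5 injuries
keep <- toy$FATALITIES > 0 | toy$INJURIES > 5
harmful_toy <- toy[keep, ]

# Fraction of all records that meet the rule
share <- mean(keep)
print(share)
```

On the full data, the same mean(...) expression would give the proportion of records that the definition retains.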

Second, we identify the types of weather events that brought about death or more than 5 injuries. Data cleaning work is done to:

  1. address the event type names (i.e. variable EVTYPE), which have misspellings and inconsistent formatting (e.g. capital letters, singular/plural forms and short forms); and

  2. categorise the events (e.g. classifying “Heat”,“Heat Wave” and “Excessive Heat” together under “Extreme Heat”).
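
One way to reduce the number of spelling variants before pattern matching is to normalise case and whitespace first. A minimal base-R sketch, using a few made-up labels in the style of EVTYPE (the vector raw is illustrative only):

```r
raw <- c(" Heat Wave", "HEAT", "excessive heat ", "Flash  Flood")

clean <- toupper(trimws(raw))       # upper-case, strip leading/trailing spaces
clean <- gsub("\\s+", " ", clean)   # collapse repeated internal spaces

# A single upper-case pattern now covers all case variants
is_heat <- grepl("HEAT", clean, fixed = TRUE)
print(data.frame(raw, clean, is_heat))
```

With names normalised this way, pattern lists would not need to repeat each keyword in several capitalisations.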

t<-table(Harmful$EVTYPE)
w=as.data.frame(t)
event<-filter(w,Freq>0)
event$Evt<-as.character(event$Var1)

event$Evt<-gsub("DENSE FOG","FOG",event$Evt)
event$Evt<-gsub("FOG AND COLD TEMPERATURES","FOG",event$Evt)
event$Evt<-gsub("DUST DEVIL","DUST STORM",event$Evt)
event$Evt<-gsub("LIGHTNING.","LIGHTNING",event$Evt,fixed=TRUE) #fixed=TRUE so that "." is a literal full stop
event$Evt<-gsub("WILD FIRES","WILD/FOREST FIRE",event$Evt)
event$Evt<-gsub("WILDFIRE","WILD/FOREST FIRE",event$Evt)
event$Evt<-gsub("Marine Accident","MARINE MISHAP",event$Evt)
event$Evt<-gsub("EXCESSIVE RAINFALL","HEAVY RAIN",event$Evt)
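
Note that gsub treats its pattern as a regular expression by default, so a "." matches any character. When a literal full stop is meant, fixed = TRUE (or escaping the dot) avoids accidental matches. A short illustration on invented labels:

```r
x <- c("LIGHTNING.", "LIGHTNING INJURY")

# Regex mode: "." also matches the space, so the second label is altered
as_regex <- gsub("LIGHTNING.", "LIGHTNING", x)

# Fixed mode: only the literal text "LIGHTNING." is replaced
as_fixed <- gsub("LIGHTNING.", "LIGHTNING", x, fixed = TRUE)

print(as_regex)
print(as_fixed)
```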

toMatchC<-c("Cold","COLD","CHILL","WINTER WEATHER","WINTRY","LOW TEMPERATURE")
matchesC<-filter(event,grepl(paste(toMatchC,collapse = "|"),Evt))
matchesC$Evt<-"EXTREME COLD"
auxind<-match(matchesC$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesC)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni
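
The match/rbind/sort steps above relabel the matching rows and keep the original row order. The same effect can be sketched with logical indexing, which avoids rebuilding the data frame. Here recode_events is a hypothetical helper name, not part of the original analysis, and the sample labels are invented:

```r
# Hypothetical helper: relabel every entry of evt that matches any of the patterns
recode_events <- function(evt, patterns, label) {
  hit <- grepl(paste(patterns, collapse = "|"), evt)
  evt[hit] <- label
  evt
}

evt <- c("COLD", "Extreme Cold", "WIND CHILL", "FLASH FLOOD", "HEAT")
evt <- recode_events(evt, c("Cold", "COLD", "CHILL"), "EXTREME COLD")
print(evt)
```

Because matching is done on the already relabelled column, the order of such calls matters: a label assigned by an earlier rule is no longer visible to later patterns that target the original name.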


toMatchF<-c("Flood","FLOOD","FLD")
matchesF<-filter(event,grepl(paste(toMatchF,collapse = "|"),Evt))
matchesF$Evt<-"FLOOD"
auxind<-match(matchesF$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesF)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

toMatchH<-c("Heat","HEAT")
matchesH<-filter(event,grepl(paste(toMatchH,collapse = "|"),Evt))
matchesH$Evt<-"EXTREME HEAT"
auxind<-match(matchesH$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesH)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

matchesT<-filter(event,grepl("TORNADO",Evt))
matchesT$Evt<-"TORNADO"
auxind<-match(matchesT$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesT)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

matchesHU<-filter(event,grepl("HURRICANE",Evt))
matchesHU$Evt<-"HURRICANE"
auxind<-match(matchesHU$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesHU)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni


toMatchHy<-c("Hyperthermia","HYPERTHERMIA","Hypothermia","HYPOTHERMIA")
matchesHy<-filter(event,grepl(paste(toMatchHy,collapse = "|"),Evt))
matchesHy$Evt<-"HYPOTHERMIA"
auxind<-match(matchesHy$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesHy)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

toMatchTH<-c("THUNDERSTORM","TSTM","MICROBURST","THUNDERTORM")
matchesTH<-filter(event,grepl(paste(toMatchTH,collapse = "|"),Evt))
matchesTH$Evt<-"THUNDERSTORM"
auxind<-match(matchesTH$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesTH)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

toMatchR<-c("ROAD","BLACK ICE","MIXED PRECIP")
matchesR<-filter(event,grepl(paste(toMatchR,collapse = "|"),Evt))
matchesR$Evt<-"ICY ROADS"
auxind<-match(matchesR$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesR)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

toMatchWS<-c("SNOW","snow","Snow","Squall","SQUALL","FROST","FREEZE","FREEZING","ICE STORM","GLAZE","WINTER STORM","SLEET","HAIL","BLIZZARD","ICE","Freezing","AVALANCE","AVALANCHE")
matchesWS<-filter(event,grepl(paste(toMatchWS,collapse = "|"),Evt))
matchesWS$Evt<-"WINTER STORM"
auxind<-match(matchesWS$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesWS)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

event$Evt[99]<-"LIGHT SNOW" #Row 99 is LIGHT SNOW, which should not be classed as WINTER STORM (hard-coded index; depends on the row order above)

toMatchSl<-c("SLIDE","slide")
matchesSl<-filter(event,grepl(paste(toMatchSl,collapse = "|"),Evt))
matchesSl$Evt<-"LAND/MUD SLIDES"
auxind<-match(matchesSl$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesSl)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

toMatchSS=c("COASTAL","STORM SURGE","RISING WATER","SURF","Surf","WATER","WAVE","surf","SEAS","SWELLS","RIP CURRENT")
matchesSS<-filter(event,grepl(paste(toMatchSS,collapse = "|"),Evt))
matchesSS$Evt<-"STORM SURGE"
auxind<-match(matchesSS$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesSS)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

toMatchWW=c("WIND","wind","Wind","TROPICAL STORM")
matchesWW<-filter(event,grepl(paste(toMatchWW,collapse = "|"),Evt))
matchesWW$Evt<-"WIND STORM"
auxind<-match(matchesWW$Var1,event$Var1) #stores the indices of the repeated rows in event
dfuni<-rbind(event[,1:3],matchesWW)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
event[,1:3]<-dfuni

(B) Weather Events that have the Greatest Economic Consequences

First, we define what is considered to be of great economic consequence. Events that damage property and crops affect livelihoods, exports and trade, and can therefore have heavy economic consequences for the country. Upon exploring the data, we identify severe property and crop damage by an exponent of billions, i.e. PROPDMGEXP == “B” or CROPDMGEXP == “B”. From the two tables below, we note that there are 40 and 9 cases where the damage to property and crops, respectively, was in the billions.

EconConseq<-select(Stormdata,EVTYPE,PROPDMG,PROPDMGEXP,CROPDMG,CROPDMGEXP)
table(EconConseq$PROPDMGEXP)
## 
##             -      ?      +      0      1      2      3      4      5 
## 465934      1      8      5    216     25     13      4      4     28 
##      6      7      8      B      h      H      K      m      M 
##      4      5      1     40      1      6 424665      7  11330
table(EconConseq$CROPDMGEXP)
## 
##             ?      0      2      B      k      K      m      M 
## 618413      7     19      1      9     21 281832      1   1994
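
The damage columns store a magnitude (PROPDMG, CROPDMG) and a separate exponent code. To compare amounts across codes, the two can be combined into a single figure. The sketch below assumes K, M and B stand for thousands, millions and billions, treats h/H as hundreds (an assumption), and leaves every other code as NA rather than guessing. The helper to_dollars is a hypothetical name and the sample values are invented:

```r
# Hypothetical helper: combine a magnitude with its exponent code
to_dollars <- function(value, exp_code) {
  mult <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)   # assumed meaning of the codes
  # Codes not listed in mult (e.g. "?", "+", "") yield NA
  value * unname(mult[toupper(as.character(exp_code))])
}

dmg <- to_dollars(c(115, 32.5, 500, 2, 7), c("B", "M", "K", "h", "?"))
print(dmg)
```

A combined column like this would allow damage to be ranked or summed across all records, rather than only among those already flagged with “B”.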

Second, we identify the types of weather events that caused property or crop damage in the billions (the code below filters the data frame “EconConseq” accordingly). As before, data cleaning work was done to:

  1. address the event type names (i.e. variable EVTYPE); and

  2. categorise the events.

EconConseq<-filter(EconConseq,PROPDMGEXP=="B" | CROPDMGEXP=="B")
s<-table(EconConseq$EVTYPE)
z=as.data.frame(s)
eventE<-filter(z,Freq>0)
eventE$Evt<-as.character(eventE$Var1)

matchesEF<-filter(eventE,grepl("FLOOD",Evt))
matchesEF$Evt<-"FLOOD"
auxind<-match(matchesEF$Var1,eventE$Var1) #stores the indices of the repeated rows in eventE
dfuni<-rbind(eventE[,1:3],matchesEF)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
eventE[,1:3]<-dfuni

matchesEH<-filter(eventE,grepl("HURRICANE",Evt))
matchesEH$Evt<-"HURRICANE/TYPHOON"
auxind<-match(matchesEH$Var1,eventE$Var1) #stores the indices of the repeated rows in eventE
dfuni<-rbind(eventE[,1:3],matchesEH)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
eventE[,1:3]<-dfuni

matchesET<-filter(eventE,grepl("TORNADO",Evt))
matchesET$Evt<-"TORNADO"
auxind<-match(matchesET$Var1,eventE$Var1) #stores the indices of the repeated rows in eventE
dfuni<-rbind(eventE[,1:3],matchesET)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
eventE[,1:3]<-dfuni

toMatchEWS<-c("HAIL","ICE","FREEZE")
matchesEWS<-filter(eventE,grepl(paste(toMatchEWS,collapse = "|"),Evt))
matchesEWS$Evt<-"WINTER STORM"
auxind<-match(matchesEWS$Var1,eventE$Var1) #stores the indices of the repeated rows in eventE
dfuni<-rbind(eventE[,1:3],matchesEWS)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
eventE[,1:3]<-dfuni

toMatchEHE<-c("DROUGHT","HEAT")
matchesEHE<-filter(eventE,grepl(paste(toMatchEHE,collapse = "|"),Evt))
matchesEHE$Evt<-"EXTREME HEAT"
auxind<-match(matchesEHE$Var1,eventE$Var1) #stores the indices of the repeated rows in eventE
dfuni<-rbind(eventE[,1:3],matchesEHE)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
eventE[,1:3]<-dfuni

matchesESS<-filter(eventE,grepl("STORM SURGE",Evt))
matchesESS$Evt<-"STORM SURGE"
auxind<-match(matchesESS$Var1,eventE$Var1) #stores the indices of the repeated rows in eventE
dfuni<-rbind(eventE[,1:3],matchesESS)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
eventE[,1:3]<-dfuni

matchesEFF<-filter(eventE,grepl("WILDFIRE",Evt))
matchesEFF$Evt<-"WILD/FOREST FIRE"
auxind<-match(matchesEFF$Var1,eventE$Var1) #stores the indices of the repeated rows in eventE
dfuni<-rbind(eventE[,1:3],matchesEFF)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
eventE[,1:3]<-dfuni

toMatchEWT<-c("WIND","TROPICAL STORM")
matchesEWT<-filter(eventE,grepl(paste(toMatchEWT,collapse = "|"),Evt))
matchesEWT$Evt<-"WIND STORM"
auxind<-match(matchesEWT$Var1,eventE$Var1) #stores the indices of the repeated rows in eventE
dfuni<-rbind(eventE[,1:3],matchesEWT)[-auxind,] #merges both data frames and erases the repeated rows
dfuni<-dfuni[order(dfuni$Var1),]#sorts the new data frame
eventE[,1:3]<-dfuni

eventE$Evt<-gsub("HEAVY RAIN/SEVERE WEATHER","HEAVY RAIN",eventE$Evt)

Results

(A) Weather Events that are Most Harmful to Population Health

Weather event types that are the most harmful to the population health, i.e. causing death or more than 5 injuries, are as shown in the table (from the highest occurrence to the lowest) below:

frequency<-aggregate(x=event$Freq,by=list(EventType=event$Evt),FUN=sum)
names(frequency)<-c("EventType","Freq")
HightoLow<-frequency[order(frequency$Freq,decreasing = TRUE),]
HightoLow
##           EventType Freq
## 17          TORNADO 2969
## 5             FLOOD 1081
## 4      EXTREME HEAT  872
## 13        LIGHTNING  872
## 16     THUNDERSTORM  863
## 21     WINTER STORM  691
## 15      STORM SURGE  671
## 20       WIND STORM  416
## 3      EXTREME COLD  396
## 19 WILD/FOREST FIRE   92
## 6               FOG   83
## 7        HEAVY RAIN   72
## 8         HURRICANE   52
## 2        DUST STORM   30
## 11  LAND/MUD SLIDES   16
## 10        ICY ROADS   12
## 9       HYPOTHERMIA    8
## 14    MARINE MISHAP    3
## 18          TSUNAMI    2
## 1          DROWNING    1
## 12       LIGHT SNOW    1

Top 10 weather event types (in order of occurrence frequency) that are harmful to human population are shown in the barplot below:

Top10<-HightoLow[10:1,]
barplot(Top10$Freq,horiz=TRUE,col=rainbow(10),legend.text = Top10$EventType,main="Top 10 Weather Events: Harmful to Population Health", xlab = "Frequency")

As the barplot shows, Tornado was the most frequent event type causing death and/or more than 5 injuries, followed by Flood, then Extreme Heat and Lightning (tied at 872 occurrences).
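
As an alternative to a colour legend, the bars can be labelled directly on the axis, which can make a horizontal barplot easier to read. A sketch using three of the counts from the table above (the margin values passed to par are an arbitrary choice):

```r
# Three counts copied from the frequency table, as a named vector
freq <- c(TORNADO = 2969, FLOOD = 1081, "EXTREME HEAT" = 872)
freq <- sort(freq)                 # ascending, so the largest bar is drawn at the top

op <- par(mar = c(5, 8, 4, 2))     # wider left margin to fit the labels
mids <- barplot(freq, horiz = TRUE, las = 1,
                main = "Weather Events: Harmful to Population Health",
                xlab = "Frequency")
par(op)                            # restore the previous graphical settings
```

barplot takes the bar labels from the names of the vector, and las = 1 keeps them horizontal.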

In terms of severity, a Heat event (classified under Extreme Heat) caused the highest number of deaths in a single occurrence (583), while a Tornado caused the highest number of injuries in a single occurrence (1,700) (see code below).

FHarmful<-filter(Harmful,FATALITIES>0)
max(FHarmful$FATALITIES)
## [1] 583
MaxFHarmful<-filter(Harmful,FATALITIES==583)
MaxFHarmful
##   EVTYPE INJURIES FATALITIES
## 1   HEAT        0        583
IHarmful<-filter(Harmful,INJURIES>5)
max(IHarmful$INJURIES)
## [1] 1700
MaxIHarmful<-filter(Harmful,INJURIES==1700)
MaxIHarmful
##    EVTYPE INJURIES FATALITIES
## 1 TORNADO     1700         42
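
The same lookups can be written without hard-coding the maximum, and per-type totals give a complementary view to single-event maxima. A base-R sketch on a small data frame (toy is hypothetical: the 583 and 1700/42 rows echo the output above, the other rows are invented):

```r
toy <- data.frame(
  EVTYPE     = c("HEAT", "TORNADO", "TORNADO", "FLOOD"),
  INJURIES   = c(0, 1700, 150, 3),
  FATALITIES = c(583, 42, 10, 2)
)

# Row with the single largest value (which.max returns the first one if tied)
worst_fatal  <- toy[which.max(toy$FATALITIES), ]
worst_injury <- toy[which.max(toy$INJURIES), ]

# Totals per event type, for comparison with the single-event maxima
totals <- aggregate(cbind(FATALITIES, INJURIES) ~ EVTYPE, data = toy, FUN = sum)
print(totals)
```

Where ties matter, comparing against max(...) directly keeps every tied row, whereas which.max keeps only the first.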

(B) Weather Events that have the Greatest Economic Consequences

Weather event types that have the greatest economic consequences, i.e. causing damage to property or crops by the billions, are as shown in the table (from the highest occurrence to the lowest) below:

frequencyE<-aggregate(x=eventE$Freq,by=list(EventType=eventE$Evt),FUN=sum)
names(frequencyE)<-c("EventType","Freq")
HightoLowE<-frequencyE[order(frequencyE$Freq,decreasing = TRUE),]
HightoLowE
##              EventType Freq
## 4    HURRICANE/TYPHOON   18
## 2                FLOOD    7
## 1         EXTREME HEAT    5
## 7              TORNADO    4
## 10        WINTER STORM    4
## 6          STORM SURGE    3
## 8     WILD/FOREST FIRE    2
## 9           WIND STORM    2
## 3           HEAVY RAIN    1
## 5  SEVERE THUNDERSTORM    1

Top 10 weather event types (in order of occurrence frequency) that have great economic consequences are shown in the barplot below:

Top10E<-HightoLowE[10:1,]
barplot(Top10E$Freq,horiz=TRUE,col=rainbow(10),legend.text = Top10E$EventType,main="Top 10 Weather Events: Great Economic Consequences", xlab = "Frequency")

As the barplot shows, Hurricane/Typhoon was the most frequent event type causing damage in the billions to property or crops, and hence heavy economic consequences, followed by Flood and Extreme Heat.

In terms of severity, Flood caused the greatest property damage, at 115 billion (see code below). The greatest crop damage, at 5 billion, was shared by a River Flood and an Ice Storm (classified under Winter Storm).

EconConseq<-filter(EconConseq,PROPDMGEXP=="B" | CROPDMGEXP=="B")
PEconConseq<-filter(EconConseq,PROPDMGEXP=="B")
max(PEconConseq$PROPDMG)
## [1] 115
MaxPropConseq<-filter(EconConseq,PROPDMG==115 & PROPDMGEXP=="B")
MaxPropConseq
##   EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1  FLOOD     115          B    32.5          M
CEconConseq<-filter(EconConseq,CROPDMGEXP=="B")
max(CEconConseq$CROPDMG)
## [1] 5
MaxCropConseq<-filter(EconConseq,CROPDMG==5 & CROPDMGEXP=="B")
MaxCropConseq
##        EVTYPE PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 1 RIVER FLOOD       5          B       5          B
## 2   ICE STORM     500          K       5          B

Conclusion

While we cannot control the occurrence of weather events that harm population health and carry great economic consequences, policies, technologies and infrastructure can be put in place to minimise the damage caused by certain events. For instance, we can provide hydration and cooling facilities during extreme heat, and build better dams and drainage systems to deal with floods.