Data Source: Storm Data [47Mb]
Other Resource:
* National Weather Service Storm Data Documentation
* National Climatic Data Center Storm Events FAQ
* Data Collection Info
* How To Handle Exponent Value of PROPDMGEXP and CROPDMGEXP
* More Detail on Cleaning EVTYPE
Exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage. Though the documentation is clear with the Event Types, dataset is not, so cleaning that is the 1st priority. Then we will analyses Fatalities, Injuries, Property damage and Corp damage. Analysis will be done by yearly basis and also in total. The analysis will reveal that “TORNADO” and “FLOOD” are the most dangerous. So let’s begin with importing the libraries.
# Loading libraries
library(dplyr)
library(lubridate)
library(stringr)
library(knitr)
library(kableExtra)
library(ggplot2)
library(reshape2)
library(ggpubr)
library(RColorBrewer)
Let’s read the data from the Data Source: Storm Data [47Mb]:
df <- read.csv("repdata_data_StormData.csv.bz2",stringsAsFactors = FALSE)
dim(df)
## [1] 902297 37
Thats a lot of rowssss……
Column names:
df <- tbl_df(df)
names(df)
## [1] "STATE__" "BGN_DATE" "BGN_TIME" "TIME_ZONE" "COUNTY"
## [6] "COUNTYNAME" "STATE" "EVTYPE" "BGN_RANGE" "BGN_AZI"
## [11] "BGN_LOCATI" "END_DATE" "END_TIME" "COUNTY_END" "COUNTYENDN"
## [16] "END_RANGE" "END_AZI" "END_LOCATI" "LENGTH" "WIDTH"
## [21] "F" "MAG" "FATALITIES" "INJURIES" "PROPDMG"
## [26] "PROPDMGEXP" "CROPDMG" "CROPDMGEXP" "WFO" "STATEOFFIC"
## [31] "ZONENAMES" "LATITUDE" "LONGITUDE" "LATITUDE_E" "LONGITUDE_"
## [36] "REMARKS" "REFNUM"
The given assignment focuses on the types of event(‘EVTYPE’), so let’s check how many events are there in the data:
length(unique(df$EVTYPE))
## [1] 985
This is interesting to see that there are about 1000 types of event even though the documentation says there are only 48 types of event present, we have to fix that in the data preprocessing stage.
Now let’s create a checklist of info’s that will help us to subset the data as analyzing the 902297 rows will take too long.
* Need to convert the 985 event type present in the dataset to the event mentioned in the documentation and drop the ones we cannot convert. As there are hedge no of rows that won’t effect the analysis that much.
* From the Data Collection Info link we can see that from only 1996 all 48 event types are recorded. So we will only consider data from 1996.
* According to the assignment we have to analyze effect of events on population health and economic consequences. So we will only need these columns of the analysis: “EVTYPE”, “FATALITIES”, “INJURIES”, “PROPDMG”, “PROPDMGEXP”, “CROPDMG” and “CROPDMGEXP”.
* We will keep the column: “REFNUM” for future reference to the main data as it is primary key for the dataset.
* The column: “BGN_DATE” is kept to view the events with respect to time.
* As we are analyzing the effect over the country we can also drop state and county code columns.
Lets process the data and sub set it:
#dates are in the format mdy_hms format
df_new <- df %>% select(REFNUM, EVTYPE, BGN_DATE, FATALITIES, INJURIES,
PROPDMG, PROPDMGEXP, CROPDMG, CROPDMGEXP) %>%
mutate(BGN_DATE = mdy_hms(BGN_DATE)) %>% filter(year(BGN_DATE) >= 1996)
We can also drop the rows that have 0 values in FATALITIES, INJURIES,PROPDMG, and CROPDMG as it indicates that there were no damage done:
logical <- (df_new$FATALITIES == 0 & df_new$INJURIES == 0 &
df_new$PROPDMG == 0 & df_new$CROPDMG == 0)
df_new <- df_new %>% filter(!logical)
dim(df_new)
## [1] 201318 9
So we have reduced the row and column no.
Now lets check for NA values in the given dataset:
apply(df_new, 2, function(x) sum(is.na(x)))
## REFNUM EVTYPE BGN_DATE FATALITIES INJURIES PROPDMG
## 0 0 0 0 0 0
## PROPDMGEXP CROPDMG CROPDMGEXP
## 0 0 0
So there are no NA values in the data.
But there was blank values in PROPDMGEXP and CROPDMGEXP lets check if they are still present.
all(df_new[df_new$PROPDMGEXP %in% c("","?","-"),]$PROPDMG == 0)
## [1] TRUE
all(df_new[df_new$CROPDMGEXP %in% c("","?","-"),]$CROPDMG == 0)
## [1] TRUE
So all the blank values, “?” and “-” values in the EXP columns have 0 values in their corresponding DMG columns. We can use this info to subset the data further while doing the 2nd part of the assignment.
Lets work with EXP and DMG columns-
For these columns below are true:
* EXP(0, 1, 2, 3, 4, 5, 6, 7, 8) = 10
* EXP(H,h) = 100
* EXP(K,k) = 1000
* EXP(M,m) = 1000000 * EXP(B,b) = 1000000000
* EXP(+) = 1
* EXP(blank(),?,-) = Have no values in DMG
Lets convert damages to absolute damages-
key <- c(as.character(0:8),"H","h","K","k","M","m","B","b","+","","?","-")
values <- c(rep(10,9),rep(100,2),rep(1000,2),rep(1000000,2),rep(1000000000,2),1,rep(0,3))
df_new <- df_new %>%
mutate(PROPDMG_ABS = as.vector(sapply(PROPDMGEXP,
function(x) values[key == x]))*PROPDMG,
CROPDMG_ABS = as.vector(sapply(CROPDMGEXP,
function(x) values[key == x]))*CROPDMG,
TOTAL_DMG = PROPDMG_ABS+CROPDMG_ABS)
We have extracted the absolute property damage(PROPDMG_ABS) and crop damage(CROPDMG_ABS). TOTAL_DMG is the total economic damage as per the dataset.
Lets now work with the event types:
#no of unique event type in the sub data
length(unique(df_new$EVTYPE))
## [1] 222
So there are much less event types in the sub data.
So lets record the event types mentioned in the dataset:
events <- "Astronomical Low Tide Z
Avalanche Z
Blizzard Z
Coastal Flood Z
Cold/Wind Chill Z
Debris Flow C
Dense Fog Z
Dense Smoke Z
Drought Z
Dust Devil C
Dust Storm Z
Excessive Heat Z
Extreme Cold/Wind Chill Z
Flash Flood C
Flood C
Frost/Freeze Z
Funnel Cloud C
Freezing Fog Z
Hail C
Heat Z
Heavy Rain C
Heavy Snow Z
High Surf Z
High Wind Z
Hurricane (Typhoon) Z
Ice Storm Z
Lake-Effect Snow Z
Lakeshore Flood Z
Lightning C
Marine Hail M
Marine High Wind M
Marine Strong Wind M
Marine Thunderstorm Wind M
Rip Current Z
Seiche Z
Sleet Z
Storm Surge/Tide Z
Strong Wind Z
Thunderstorm Wind C
Tornado C
Tropical Depression Z
Tropical Storm Z
Tsunami Z
Volcanic Ash Z
Waterspout M
Wildfire Z
Winter Storm Z
Winter Weather Z"
events <- strsplit(events,"\n")
events <- as.vector(sapply(events, function(x) {substr(x,start=1,stop = nchar(x)-2)}))
events
## [1] "Astronomical Low Tide" "Avalanche"
## [3] "Blizzard" "Coastal Flood"
## [5] "Cold/Wind Chill" "Debris Flow"
## [7] "Dense Fog" "Dense Smoke"
## [9] "Drought" "Dust Devil"
## [11] "Dust Storm" "Excessive Heat"
## [13] "Extreme Cold/Wind Chill" "Flash Flood"
## [15] "Flood" "Frost/Freeze"
## [17] "Funnel Cloud" "Freezing Fog"
## [19] "Hail" "Heat"
## [21] "Heavy Rain" "Heavy Snow"
## [23] "High Surf" "High Wind"
## [25] "Hurricane (Typhoon)" "Ice Storm"
## [27] "Lake-Effect Snow" "Lakeshore Flood"
## [29] "Lightning" "Marine Hail"
## [31] "Marine High Wind" "Marine Strong Wind"
## [33] "Marine Thunderstorm Wind" "Rip Current"
## [35] "Seiche" "Sleet"
## [37] "Storm Surge/Tide" "Strong Wind"
## [39] "Thunderstorm Wind" "Tornado"
## [41] "Tropical Depression" "Tropical Storm"
## [43] "Tsunami" "Volcanic Ash"
## [45] "Waterspout" "Wildfire"
## [47] "Winter Storm" "Winter Weather"
From the Disscusion forum some one pointed out that there are few events that consists of to events from documentation seperated by “/”. Lets check that.
event_unique <- unique(df_new$EVTYPE)
event_unique[grep("/",event_unique)]
## [1] "TSTM WIND/HAIL" "WILD/FOREST FIRE"
## [3] "URBAN/SML STREAM FLD" "Erosion/Cstl Flood"
## [5] "Heavy Rain/High Surf" "Gusty wind/rain"
## [7] "GUSTY WIND/HVY RAIN" "RAIN/SNOW"
## [9] "Frost/Freeze" "Hypothermia/Exposure"
## [11] "HYPOTHERMIA/EXPOSURE" "COASTAL FLOODING/EROSION"
## [13] "COASTAL FLOODING/EROSION" "HYPERTHERMIA/EXPOSURE"
## [15] "GUSTY WIND/HAIL" "FLOOD/FLASH/FLOOD"
## [17] "FLASH FLOOD/FLOOD" "FALLING SNOW/ICE"
## [19] "EXTREME COLD/WIND CHILL" "HURRICANE/TYPHOON"
## [21] "WINTER WEATHER/MIX" "FROST/FREEZE"
## [23] "HEAVY SURF/HIGH SURF" "STORM SURGE/TIDE"
## [25] "COLD/WIND CHILL"
Now lets look at all the event types:
unique(df_new$EVTYPE)
## [1] "WINTER STORM" "TORNADO"
## [3] "TSTM WIND" "HIGH WIND"
## [5] "FLASH FLOOD" "FREEZING RAIN"
## [7] "EXTREME COLD" "LIGHTNING"
## [9] "HAIL" "FLOOD"
## [11] "TSTM WIND/HAIL" "EXCESSIVE HEAT"
## [13] "RIP CURRENTS" "Other"
## [15] "HEAVY SNOW" "WILD/FOREST FIRE"
## [17] "ICE STORM" "BLIZZARD"
## [19] "STORM SURGE" "Ice jam flood (minor"
## [21] "DUST STORM" "STRONG WIND"
## [23] "DUST DEVIL" "Tstm Wind"
## [25] "URBAN/SML STREAM FLD" "FOG"
## [27] "ROUGH SURF" "Heavy Surf"
## [29] "Dust Devil" "HEAVY RAIN"
## [31] "Marine Accident" "AVALANCHE"
## [33] "Freeze" "DRY MICROBURST"
## [35] "Strong Wind" "WINDS"
## [37] "COASTAL STORM" "Erosion/Cstl Flood"
## [39] "River Flooding" "WATERSPOUT"
## [41] "DAMAGING FREEZE" "Damaging Freeze"
## [43] "HURRICANE" "TROPICAL STORM"
## [45] "Beach Erosion" "High Surf"
## [47] "Heavy Rain/High Surf" "Unseasonable Cold"
## [49] "Early Frost" "Wintry Mix"
## [51] "Extreme Cold" "DROUGHT"
## [53] "Coastal Flooding" "Torrential Rainfall"
## [55] "Landslump" "Hurricane Edouard"
## [57] "Coastal Storm" "TIDAL FLOODING"
## [59] "Tidal Flooding" "Strong Winds"
## [61] "EXTREME WINDCHILL" "Glaze"
## [63] "Extended Cold" "Whirlwind"
## [65] "Heavy snow shower" "Light snow"
## [67] "COASTAL FLOOD" "Light Snow"
## [69] "MIXED PRECIP" "COLD"
## [71] "Freezing Spray" "DOWNBURST"
## [73] "Mudslides" "Microburst"
## [75] "Mudslide" "Cold"
## [77] "SNOW" "Coastal Flood"
## [79] "Snow Squalls" "Wind Damage"
## [81] "Light Snowfall" "Freezing Drizzle"
## [83] "Gusty wind/rain" "GUSTY WIND/HVY RAIN"
## [85] "Wind" "Cold Temperature"
## [87] "Heat Wave" "Snow"
## [89] "COLD AND SNOW" "HEAVY SURF"
## [91] "RAIN/SNOW" "WIND"
## [93] "FREEZE" "TSTM WIND (G45)"
## [95] "Gusty Winds" "GUSTY WIND"
## [97] "TSTM WIND 40" "TSTM WIND 45"
## [99] "HARD FREEZE" "TSTM WIND (41)"
## [101] "HEAT" "RIVER FLOOD"
## [103] "TSTM WIND (G40)" "RIP CURRENT"
## [105] "HIGH SURF" "MUD SLIDE"
## [107] "Frost/Freeze" "SNOW AND ICE"
## [109] "COASTAL FLOODING" "AGRICULTURAL FREEZE"
## [111] "WINTER WEATHER" "STRONG WINDS"
## [113] "SNOW SQUALL" "ICY ROADS"
## [115] "OTHER" "THUNDERSTORM"
## [117] "Hypothermia/Exposure" "HYPOTHERMIA/EXPOSURE"
## [119] "Lake Effect Snow" "Freezing Rain"
## [121] "Mixed Precipitation" "BLACK ICE"
## [123] "COASTALSTORM" "LIGHT SNOW"
## [125] "DAM BREAK" "Gusty winds"
## [127] "blowing snow" "FREEZING DRIZZLE"
## [129] "FROST" "GRADIENT WIND"
## [131] "UNSEASONABLY COLD" "GUSTY WINDS"
## [133] "TSTM WIND AND LIGHTNING" "gradient wind"
## [135] "Gradient wind" "Freezing drizzle"
## [137] "WET MICROBURST" "Heavy surf and wind"
## [139] "FUNNEL CLOUD" "TYPHOON"
## [141] "LANDSLIDES" "HIGH SWELLS"
## [143] "HIGH WINDS" "SMALL HAIL"
## [145] "UNSEASONAL RAIN" "COASTAL FLOODING/EROSION"
## [147] " TSTM WIND (G45)" "TSTM WIND (G45)"
## [149] "HIGH WIND (G40)" "TSTM WIND (G35)"
## [151] "GLAZE" "COASTAL EROSION"
## [153] "UNSEASONABLY WARM" "SEICHE"
## [155] "COASTAL FLOODING/EROSION" "HYPERTHERMIA/EXPOSURE"
## [157] "WINTRY MIX" "RIVER FLOODING"
## [159] "ROCK SLIDE" "GUSTY WIND/HAIL"
## [161] "HEAVY SEAS" " TSTM WIND"
## [163] "LANDSPOUT" "RECORD HEAT"
## [165] "EXCESSIVE SNOW" "LAKE EFFECT SNOW"
## [167] "FLOOD/FLASH/FLOOD" "MIXED PRECIPITATION"
## [169] "WIND AND WAVE" "FLASH FLOOD/FLOOD"
## [171] "LIGHT FREEZING RAIN" "ICE ROADS"
## [173] "HIGH SEAS" "RAIN"
## [175] "ROUGH SEAS" "TSTM WIND G45"
## [177] "NON-SEVERE WIND DAMAGE" "WARM WEATHER"
## [179] "THUNDERSTORM WIND (G40)" "LANDSLIDE"
## [181] "HIGH WATER" " FLASH FLOOD"
## [183] "LATE SEASON SNOW" "WINTER WEATHER MIX"
## [185] "ROGUE WAVE" "FALLING SNOW/ICE"
## [187] "NON-TSTM WIND" "NON TSTM WIND"
## [189] "MUDSLIDE" "BRUSH FIRE"
## [191] "BLOWING DUST" "VOLCANIC ASH"
## [193] " HIGH SURF ADVISORY" "HAZARDOUS SURF"
## [195] "WILDFIRE" "COLD WEATHER"
## [197] "WHIRLWIND" "ICE ON ROAD"
## [199] "SNOW SQUALLS" "DROWNING"
## [201] "EXTREME COLD/WIND CHILL" "MARINE TSTM WIND"
## [203] "HURRICANE/TYPHOON" "DENSE FOG"
## [205] "WINTER WEATHER/MIX" "FROST/FREEZE"
## [207] "ASTRONOMICAL HIGH TIDE" "HEAVY SURF/HIGH SURF"
## [209] "TROPICAL DEPRESSION" "LAKE-EFFECT SNOW"
## [211] "MARINE HIGH WIND" "THUNDERSTORM WIND"
## [213] "TSUNAMI" "STORM SURGE/TIDE"
## [215] "COLD/WIND CHILL" "LAKESHORE FLOOD"
## [217] "MARINE THUNDERSTORM WIND" "MARINE STRONG WIND"
## [219] "ASTRONOMICAL LOW TIDE" "DENSE SMOKE"
## [221] "MARINE HAIL" "FREEZING FOG"
Lets create a list for cleaning the events-
* We can see that the events are not consistent with uppercase or lowercase. So we have to convert them all to anyone.
* We can also see that there are blank spaces in front of some event type. We will use str_trim to fix them.
* “/” are used in some places we will work with them with the below process:
+ If one is found in documentation use that. GUSTY WIND/HAIL->Hail
+ If both are found in the documentation then use the 1st one. EXCESSIVE HEAT/DROUGHT ->EXCESSIVE HEAT
+ If both are not found in the doc then use the which one is close. WILD/FOREST FIRE -> Wildfire
* Some event type doesn’t specify properly, like TSTM WIND this will be Thunderstorm Wind. Similarly LANDSLIDES will be Debris Flow.
* Keeping the event type “Other” as this one cannot be converted to anything.
* Also there 3 types of FOG mentioned in the data where the documentation specifies only two. We need to check that also.
Lets create a backup:
backup_df_new <- df_new
Lets bring all to uppercase and also trim the events:
df_new$EVTYPE = toupper(df_new$EVTYPE)
df_new$EVTYPE = str_trim(df_new$EVTYPE)
length(unique(df_new$EVTYPE))
## [1] 183
That helped a lot, now lets clean
table(df_new[grep("COASTAL",df_new$EVTYPE),]$EVTYPE)
##
## COASTAL FLOODING/EROSION COASTAL EROSION
## 1 1
## COASTAL FLOOD COASTAL FLOODING
## 153 35
## COASTAL FLOODING/EROSION COASTAL STORM
## 3 4
## COASTALSTORM
## 1
df_new[grep("COASTAL",df_new$EVTYPE),]$EVTYPE <- toupper("Coastal Flood")
table(df_new[grep("COLD|HYPOTHERMIA|WINDCHILL",df_new$EVTYPE),]$EVTYPE)
##
## COLD COLD AND SNOW COLD TEMPERATURE
## 20 1 2
## COLD WEATHER COLD/WIND CHILL EXTENDED COLD
## 1 90 1
## EXTREME COLD EXTREME COLD/WIND CHILL EXTREME WINDCHILL
## 166 111 19
## HYPOTHERMIA/EXPOSURE UNSEASONABLE COLD UNSEASONABLY COLD
## 6 1 3
df_new[grep("^COLD|HYPOTHERMIA",
df_new$EVTYPE),]$EVTYPE <- toupper("Cold/Wind Chill")
df_new[grep(" COLD|WINDCHILL",
df_new$EVTYPE),]$EVTYPE <- toupper("Extreme Cold/Wind Chill")
table(df_new[grep("SLIDE|SLUMP",df_new$EVTYPE),]$EVTYPE)
##
## LANDSLIDE LANDSLIDES LANDSLUMP MUD SLIDE MUDSLIDE MUDSLIDES
## 190 1 1 2 5 1
## ROCK SLIDE
## 1
df_new[grep("SLIDE|SLUMP",
df_new$EVTYPE),]$EVTYPE <- toupper("Debris Flow")
table(df_new[grep("FOG",df_new$EVTYPE),]$EVTYPE)
##
## DENSE FOG FOG FREEZING FOG
## 58 101 7
DENSE FOG and FREEZING FOG are documented event types but we can see that FOG is much more common.
temp_df <- df_new[grep("FOG",df_new$EVTYPE),]
temp_df <- temp_df %>% group_by(EVTYPE) %>%
summarise(count=n(),
mn_FATALITIES=mean(FATALITIES),
mn_INJURIES=mean(INJURIES),
mn_PROPDMG = mean(PROPDMG_ABS),
mn_CROPDMG = mean(CROPDMG_ABS),
mn_TOTALDMG = mean(TOTAL_DMG))
kable(temp_df) %>%
kable_styling(bootstrap_options = c("striped", "hover","condensed","responsive"))
| EVTYPE | count | mn_FATALITIES | mn_INJURIES | mn_PROPDMG | mn_CROPDMG | mn_TOTALDMG |
|---|---|---|---|---|---|---|
| DENSE FOG | 58 | 0.1551724 | 2.465517 | 126189.7 | 0 | 126189.7 |
| FOG | 101 | 0.5940594 | 7.049505 | 130153.5 | 0 | 130153.5 |
| FREEZING FOG | 7 | 0.0000000 | 0.000000 | 311714.3 | 0 | 311714.3 |
Types of FOG causes mostly injuries and property damage. As there is no apparent difference between the three types on FOG we will be keeping all 3 FOG types.
table(df_new[grep("DUST",df_new$EVTYPE),]$EVTYPE)
##
## BLOWING DUST DUST DEVIL DUST STORM
## 1 84 96
df_new[grep("BLOWING DUST",
df_new$EVTYPE),]$EVTYPE <- toupper("Dust Storm")
table(df_new[grep("HEAT|HYPERTHERMIA|WARM",df_new$EVTYPE),]$EVTYPE)
##
## EXCESSIVE HEAT HEAT HEAT WAVE
## 685 164 1
## HYPERTHERMIA/EXPOSURE RECORD HEAT UNSEASONABLY WARM
## 1 1 5
## WARM WEATHER
## 1
df_new[grep("HYPERTHERMIA|WARM",
df_new$EVTYPE),]$EVTYPE <- toupper("Heat")
df_new[grep("HEAT WAVE|RECORD HEAT",
df_new$EVTYPE),]$EVTYPE <- toupper("Excessive Heat")
table(df_new[grep("FLASH|DAM BREAK|HIGH WATER",df_new$EVTYPE),]$EVTYPE)
##
## DAM BREAK FLASH FLOOD FLASH FLOOD/FLOOD FLOOD/FLASH/FLOOD
## 2 19012 1 1
## HIGH WATER
## 2
df_new[grep("FLASH|DAM BREAK|HIGH WATER",
df_new$EVTYPE),]$EVTYPE <- toupper("Flash Flood")
table(df_new[grep("FLOOD",df_new$EVTYPE),]$EVTYPE)
##
## COASTAL FLOOD EROSION/CSTL FLOOD FLASH FLOOD
## 198 2 19018
## FLOOD ICE JAM FLOOD (MINOR LAKESHORE FLOOD
## 9513 1 5
## RIVER FLOOD RIVER FLOODING TIDAL FLOODING
## 80 6 4
df_new[grep("RIVER",
df_new$EVTYPE),]$EVTYPE <- toupper("Flood")
df_new[grep("CSTL|TIDAL FLOODING",
df_new$EVTYPE),]$EVTYPE <- toupper("Coastal Flood")
df_new[grep("ICE JAM|FLOOD/FLASH/FLOOD|FLASH FLOOD/FLOOD",
df_new$EVTYPE),]$EVTYPE <- toupper("FLASH FLOOD")
table(df_new[grep("FROST|FREEZE",df_new$EVTYPE),]$EVTYPE)
##
## AGRICULTURAL FREEZE DAMAGING FREEZE EARLY FROST
## 3 3 1
## FREEZE FROST FROST/FREEZE
## 14 1 117
## HARD FREEZE
## 1
df_new[grep("FROST|FREEZE",
df_new$EVTYPE),]$EVTYPE <- toupper("Frost/Freeze")
table(df_new[grep("HAIL",df_new$EVTYPE),]$EVTYPE)
##
## GUSTY WIND/HAIL HAIL MARINE HAIL SMALL HAIL
## 1 22679 2 11
## TSTM WIND/HAIL
## 441
df_new[grep("SMALL HAIL",
df_new$EVTYPE),]$EVTYPE <- toupper("Hail")
table(df_new[grep("GUSTY",df_new$EVTYPE),]$EVTYPE)
##
## GUSTY WIND GUSTY WIND/HAIL GUSTY WIND/HVY RAIN
## 13 1 1
## GUSTY WIND/RAIN GUSTY WINDS
## 1 30
df_new[grep("GUSTY WIND/HAIL",
df_new$EVTYPE),]$EVTYPE <- toupper("Hail")
df_new[grep("GUSTY WIND/HVY RAIN|GUSTY WIND/RAIN",
df_new$EVTYPE),]$EVTYPE <- toupper("Heavy Rain")
df_new[grep("GUSTY WIND",
df_new$EVTYPE),]$EVTYPE <- toupper("Strong Wind")
table(df_new[grep("RAIN|DRIZZLE",df_new$EVTYPE),]$EVTYPE)
##
## FREEZING DRIZZLE FREEZING RAIN HEAVY RAIN
## 5 9 1049
## HEAVY RAIN/HIGH SURF LIGHT FREEZING RAIN RAIN
## 1 22 3
## RAIN/SNOW TORRENTIAL RAINFALL UNSEASONAL RAIN
## 2 1 2
df_new[grep("FREEZING RAIN|RAIN/SNOW|FREEZING DRIZZLE",
df_new$EVTYPE),]$EVTYPE <- toupper("Winter Weather")
df_new[grep("RAIN",
df_new$EVTYPE),]$EVTYPE <- toupper("Heavy Rain")
table(df_new[grep("SNOW",df_new$EVTYPE),]$EVTYPE)
##
## BLOWING SNOW EXCESSIVE SNOW FALLING SNOW/ICE HEAVY SNOW
## 1 25 2 1029
## HEAVY SNOW SHOWER LAKE-EFFECT SNOW LAKE EFFECT SNOW LATE SEASON SNOW
## 1 194 4 1
## LIGHT SNOW LIGHT SNOWFALL SNOW SNOW AND ICE
## 141 1 16 1
## SNOW SQUALL SNOW SQUALLS
## 3 5
df_new[grep("EXCESSIVE SNOW|LATE SEASON SNOW|HEAVY SNOW SHOWER",
df_new$EVTYPE),]$EVTYPE <- toupper("Heavy Snow")
df_new[grep("LAKE-EFFECT|LAKE EFFECT",
df_new$EVTYPE),]$EVTYPE <- toupper("Lake-Effect Snow")
df_new[grep("BLOWING SNOW|LIGHT SNOW|SNOW SQUALL|FALLING SNOW/ICE|SNOW AND ICE",
df_new$EVTYPE),]$EVTYPE <- toupper("Winter Weather")
df_new$EVTYPE[which(df_new$EVTYPE=="SNOW")] <- toupper("Heavy Snow")
table(df_new[grep("SURF",df_new$EVTYPE),]$EVTYPE)
##
## HAZARDOUS SURF HEAVY SURF HEAVY SURF AND WIND
## 1 29 1
## HEAVY SURF/HIGH SURF HIGH SURF HIGH SURF ADVISORY
## 50 128 1
## ROUGH SURF
## 2
df_new[grep("SURF",
df_new$EVTYPE),]$EVTYPE <- toupper("High Surf")
table(df_new[grep("HIGH WIND",df_new$EVTYPE),]$EVTYPE)
##
## HIGH WIND HIGH WIND (G40) HIGH WINDS MARINE HIGH WIND
## 5402 2 1 19
df_new[grep("^HIGH WIND",
df_new$EVTYPE),]$EVTYPE <- toupper("High Wind")
table(df_new[grep("HURRICANE|TYPHOON",df_new$EVTYPE),]$EVTYPE)
##
## HURRICANE HURRICANE EDOUARD HURRICANE/TYPHOON TYPHOON
## 126 1 72 9
df_new[grep("HURRICANE|TYPHOON",
df_new$EVTYPE),]$EVTYPE <- toupper("Hurricane (Typhoon)")
table(df_new[grep("RIP|GRADIENT",df_new$EVTYPE),]$EVTYPE)
##
## GRADIENT WIND RIP CURRENT RIP CURRENTS
## 6 364 239
df_new[grep("RIP|GRADIENT",
df_new$EVTYPE),]$EVTYPE <- toupper("Rip Current")
table(df_new[grep("SURGE|TIDE",df_new$EVTYPE),]$EVTYPE)
##
## ASTRONOMICAL HIGH TIDE ASTRONOMICAL LOW TIDE STORM SURGE
## 8 2 169
## STORM SURGE/TIDE
## 47
df_new[grep("STORM SURGE",
df_new$EVTYPE),]$EVTYPE <- toupper("Storm Surge/Tide")
df_new[grep("ASTRONOMICAL HIGH",
df_new$EVTYPE),]$EVTYPE <- toupper("Coastal Flood")
table(df_new[grep("STRONG WIND",df_new$EVTYPE),]$EVTYPE)
##
## MARINE STRONG WIND STRONG WIND STRONG WINDS
## 46 3412 45
df_new[grep("^STRONG WIND",
df_new$EVTYPE),]$EVTYPE <- toupper("Strong Wind")
table(df_new[grep("THUNDERSTORM|TSTM|BURST",df_new$EVTYPE),]$EVTYPE)
##
## DOWNBURST DRY MICROBURST MARINE THUNDERSTORM WIND
## 1 75 33
## MARINE TSTM WIND MICROBURST NON-TSTM WIND
## 109 1 1
## NON TSTM WIND THUNDERSTORM THUNDERSTORM WIND
## 1 2 43097
## THUNDERSTORM WIND (G40) TSTM WIND TSTM WIND (G45)
## 1 61778 1
## TSTM WIND (41) TSTM WIND (G35) TSTM WIND (G40)
## 1 1 9
## TSTM WIND (G45) TSTM WIND 40 TSTM WIND 45
## 37 1 1
## TSTM WIND AND LIGHTNING TSTM WIND G45 TSTM WIND/HAIL
## 1 1 441
## WET MICROBURST
## 3
df_new[grep("^THUNDERSTORM|^TSTM|BURST",
df_new$EVTYPE),]$EVTYPE <- toupper("Thunderstorm Wind")
df_new[grep("MARINE TSTM",
df_new$EVTYPE),]$EVTYPE <- toupper("Marine Thunderstorm Wind")
df_new[agrep("NON-TSTM WIND",
df_new$EVTYPE),]$EVTYPE <- toupper("Other")
table(df_new[grep("TORNADO|LANDSPOUT",df_new$EVTYPE),]$EVTYPE)
##
## LANDSPOUT TORNADO
## 2 12366
df_new[grep("TORNADO|LANDSPOUT",
df_new$EVTYPE),]$EVTYPE <- toupper("Tornado")
table(df_new[grep("FIRE",df_new$EVTYPE),]$EVTYPE)
##
## BRUSH FIRE WILD/FOREST FIRE WILDFIRE
## 1 381 847
df_new[grep("FIRE",
df_new$EVTYPE),]$EVTYPE <- toupper("Wildfire")
table(df_new[grep("WINTER WEATHER|ICE|ICY|GLAZE|MIXED PRECIP|WINTRY|FREEZING SPRAY",df_new$EVTYPE),]$EVTYPE)
##
## BLACK ICE FREEZING SPRAY GLAZE
## 1 1 16
## ICE ON ROAD ICE ROADS ICE STORM
## 1 1 631
## ICY ROADS MIXED PRECIP MIXED PRECIPITATION
## 18 6 18
## WINTER WEATHER WINTER WEATHER MIX WINTER WEATHER/MIX
## 597 2 139
## WINTRY MIX
## 4
df_new[grep("WINTER WEATHER|ICE$|ICE ROADS|ICE ON ROAD|ICY|GLAZE|MIXED PRECIP|WINTRY|FREEZING SPRAY",
df_new$EVTYPE),]$EVTYPE <- toupper("Winter Weather")
table(df_new[grep("WIND",df_new$EVTYPE),]$EVTYPE)
##
## COLD/WIND CHILL EXTREME COLD/WIND CHILL HIGH WIND
## 120 301 5405
## MARINE HIGH WIND MARINE STRONG WIND MARINE THUNDERSTORM WIND
## 19 46 142
## NON-SEVERE WIND DAMAGE STRONG WIND THUNDERSTORM WIND
## 1 3457 105452
## WHIRLWIND WIND WIND AND WAVE
## 3 67 1
## WIND DAMAGE WINDS
## 1 1
df_new[grep("^WIND|WHIRLWIND|WIND DAMAGE",
df_new$EVTYPE),]$EVTYPE <- toupper("Other")
df_new[grep("URBAN",df_new$EVTYPE),]$EVTYPE="HEAVY RAIN"
df_new[grep("DROWNING|MARINE ACCIDENT|BEACH EROSION",
df_new$EVTYPE),]$EVTYPE="OTHER"
df_new[grep("SWELLS|ROGUE|SEAS",df_new$EVTYPE),]$EVTYPE="HIGH SURF"
Claening of the event type is done. Lest chcek them.
unique(df_new$EVTYPE)
## [1] "WINTER STORM" "TORNADO"
## [3] "THUNDERSTORM WIND" "HIGH WIND"
## [5] "FLASH FLOOD" "WINTER WEATHER"
## [7] "EXTREME COLD/WIND CHILL" "LIGHTNING"
## [9] "HAIL" "FLOOD"
## [11] "EXCESSIVE HEAT" "RIP CURRENT"
## [13] "OTHER" "HEAVY SNOW"
## [15] "WILDFIRE" "ICE STORM"
## [17] "BLIZZARD" "STORM SURGE/TIDE"
## [19] "DUST STORM" "STRONG WIND"
## [21] "DUST DEVIL" "HEAVY RAIN"
## [23] "FOG" "HIGH SURF"
## [25] "AVALANCHE" "FROST/FREEZE"
## [27] "COASTAL FLOOD" "WATERSPOUT"
## [29] "HURRICANE (TYPHOON)" "TROPICAL STORM"
## [31] "DROUGHT" "DEBRIS FLOW"
## [33] "COLD/WIND CHILL" "HEAT"
## [35] "LAKE-EFFECT SNOW" "FUNNEL CLOUD"
## [37] "SEICHE" "VOLCANIC ASH"
## [39] "MARINE THUNDERSTORM WIND" "DENSE FOG"
## [41] "TROPICAL DEPRESSION" "MARINE HIGH WIND"
## [43] "TSUNAMI" "LAKESHORE FLOOD"
## [45] "MARINE STRONG WIND" "ASTRONOMICAL LOW TIDE"
## [47] "DENSE SMOKE" "MARINE HAIL"
## [49] "FREEZING FOG"
So cleaning of event type is complete. SLEET is not present in our sub dataset and we have included FOG and OTHER events, that resulted in 48-1+2 = 49 event type.
With this data processing stage is complete.
Lets divide the sub datset in two dataset for the two assignment.
sub_df1 <- df_new %>%
select(REFNUM,EVTYPE,BGN_DATE,FATALITIES,INJURIES) %>%
filter(!(FATALITIES == 0 & INJURIES == 0))
sub_df2 <- df_new %>%
select(everything(),-FATALITIES,-INJURIES) %>%
filter(!(PROPDMG_ABS == 0 & CROPDMG_ABS == 0))
To analyze this part we will use sub_df1.
Analyzing and comparing the total FATALITIES or INJURIES by the events will not be as informative, as the events are random and total human damage is different every year. So I will convert FATALITIES and INJURIES of say event “a” to ->
(FATALITIES or INJURIES of event “a” for a certain year)/(total FATALITIES or INJURIES across all the events for that year )
sub_df1_year <- sub_df1 %>% mutate(years = year(BGN_DATE)) %>%
group_by(years) %>% summarise(T_FATALITIES = sum(FATALITIES),
T_INJURIES = sum(INJURIES))
sub_df1_year_event <- sub_df1 %>% mutate(years = year(BGN_DATE)) %>%
group_by(years,EVTYPE) %>%
summarise(T_FATALITIES = sum(FATALITIES),
T_INJURIES = sum(INJURIES))
for(idx in 1:nrow(sub_df1_year_event)){
temp_df2 <- sub_df1_year[sub_df1_year$years == sub_df1_year_event[idx,]$years,]
sub_df1_year_event[idx,
"T_FATALITIES"] = sub_df1_year_event[idx,
"T_FATALITIES"] / temp_df2$T_FATALITIES
sub_df1_year_event[idx,
"T_INJURIES"] = sub_df1_year_event[idx,
"T_INJURIES"] / temp_df2$T_INJURIES
}
sub_df1_year_event$T_FATALITIES = sub_df1_year_event$T_FATALITIES * 100
sub_df1_year_event$T_INJURIES = sub_df1_year_event$T_INJURIES * 100
Lets create a subdataset for the top events that caused most human damage throughout the years.
plot_df1 <- sub_df1_year_event %>% group_by(years) %>% top_n(n=1,wt=T_FATALITIES)
plot_df2 <- sub_df1_year_event %>% group_by(years) %>% top_n(n=1,wt=T_INJURIES)
plot_df1 <- sub_df1_year_event %>%
filter(EVTYPE %in% unique(plot_df1$EVTYPE)) %>% group_by(years)
plot_df2 <- sub_df1_year_event %>%
filter(EVTYPE %in% unique(plot_df2$EVTYPE)) %>% group_by(years)
#for line plots 1, 2
Lets create a subdataset for total human damage by the event type and take only the top 10.
sub_df1_event <- sub_df1 %>% group_by(EVTYPE) %>%
summarise(T_FATALITIES = sum(FATALITIES),
T_INJURIES = sum(INJURIES))
plot_df3 <- sub_df1_event %>%
top_n(n=10,wt=T_FATALITIES)
plot_df4 <- sub_df1_event %>%
top_n(n=10,wt=T_INJURIES)
events1 <- unique(c(plot_df3$EVTYPE, plot_df4$EVTYPE))
len <- length(events1)
plot_df5 <- tbl_df(data = list("EVTYPE"=1:len,"Total Fatalities"=1:len,"Total Injuries"=1:len))
plot_df5$EVTYPE <- events1
for(event in plot_df5$EVTYPE){
t_f <- sub_df1_event[sub_df1_event$EVTYPE == event,]$T_FATALITIES
t_i <- sub_df1_event[sub_df1_event$EVTYPE == event,]$T_INJURIES
plot_df5[plot_df5$EVTYPE == event,"Total Fatalities"] <- t_f
plot_df5[plot_df5$EVTYPE == event,"Total Injuries"] <- t_i
}
plot_df5 <- melt(data = plot_df5, id.vars="EVTYPE")
#for bar plots 5
Lets create a table to show some states for top 5 killers through the years.
#use sub_df1_year_event, plot_df1 and plot_df2
plot_df8 <- sub_df1_year_event %>% group_by(years) %>% top_n(n=5,wt=T_FATALITIES)
plot_df9 <- sub_df1_year_event %>% group_by(years) %>% top_n(n=5,wt=T_INJURIES)
events2 <- unique(c(plot_df8$EVTYPE, plot_df9$EVTYPE))
df <- 16
plot_df6 <- sub_df1_year_event %>% filter(EVTYPE %in% events2) %>%
group_by(EVTYPE) %>%
summarise(mean_fatalities = mean(T_FATALITIES),
sd_fatalities = sd(T_FATALITIES),
conf_interval_f = paste(round((mean_fatalities +
c(-1,1)*qt(0.975,df-1)*sd_fatalities/sqrt(df)),2),
collapse = " - "),
mean_injuries = mean(T_INJURIES),
sd_injuries = sd(T_INJURIES),
conf_interval_i = paste(round((mean_injuries +
c(-1,1)*qt(0.975,df-1)*sd_injuries/sqrt(df)),2),
collapse = " - "))
#plot table with 6
Lets plot :) …………
plot_df1 <- plot_df1 %>% rename(Event_Type = EVTYPE)
p1 <- ggplot(plot_df1,aes(x=years,y=T_FATALITIES,color=Event_Type))+
geom_point(size=4,alpha=0.5)+
geom_line()+
labs(x="",y="% Fatalities per year")+
scale_color_brewer(palette = "Paired")
plot_df2 <- plot_df2 %>% rename(Event_Type = EVTYPE)
p2 <- ggplot(plot_df2,aes(x=years,y=T_INJURIES,color=Event_Type))+
geom_point(size=4,alpha=0.5)+
geom_line()+
labs(x="",y="% Injuries per year")+
scale_color_brewer(palette = "Paired")
plot_df5 <- plot_df5 %>% rename(Type_of_damage = variable)
p3 <- ggplot(plot_df5,aes(x=EVTYPE,y=value,fill=Type_of_damage))+
geom_bar(stat = "identity")+
theme(axis.text.x = element_text(angle = 20))+
labs(y="Total human effected",x="Event type")+
theme(legend.position = c(0.08, 0.9),
legend.background = element_rect(fill = "transparent"))+
scale_fill_brewer(palette = "Set2")
plot_df6 <- plot_df6 %>% rename(Event_Type = EVTYPE)
plot_df6 <- plot_df6 %>% arrange(mean_fatalities)
p4 <- ggtexttable(plot_df6, rows = NULL,
theme = ttheme("mOrange"))
text.p <- paste("This figure shows the which types of events are most harmful",
"with respect to population health. ",
"\n1: Top 3 killer by the years, ",
"\n2: Top 3 events causing most injuries by the years,",
"\n3: Top 10 events causing most death and injuries in total since 1996, ",
"\n4: Statistics of the Top 5 events causing most death and injuries", sep=" ")
text.p <- ggparagraph(text = text.p, face = "bold", size = 15, color = "black")
figure <- ggarrange(p1, p2, p3, p4,text.p,
labels = c("1", "2", "3", "4",""),
ncol = 1, nrow = 5)
figure
So the analysis reviled some interesting points:
* “EXCESSIVE HEAT” was cause of most fatalities till 2007. Then it dropped.
* “TORNADO” consistently kills 10-20% of total deaths. For the year 2011 it caused about 60% of the total deaths. * “FLASH FLOOD” is consistent of % deaths(5-19%).
* “TORNADO” consistently injures more people than other types of storm and a hedge spike 2011.
* “FLOOD” caused big % of injures in the year 1998.
* “EXCESSIVE HEAT”-(1999,2006) & “HURRICANE (TYPHOON)”-(2004) have few spikes in % injures.
* In total “TORNADO”, “EXCESSIVE HEAT”, “FLOOD”, “THUNDERSTORM WIND” and “LIGHTNING” are the clear winners when it comes to historical cumulative damage to humans.
* The stat chart figure(1.4) shows the mean casualty and 95% confidence interval of the top 15 event types.
To analyze this part we will use sub_df2.
Analyzing and comparing the total PROPDMG_ABS or CROPDMG_ABS by the events will not be as informative, as the events are random and total economic damage is different every year. So I will convert PROPDMG_ABS and CROPDMG_ABS of say event “a” to ->
(PROPDMG_ABS or CROPDMG_ABS of event “a” for a certain year)/(total PROPDMG_ABS or CROPDMG_ABS across all the events for that year )
sub_df2_year <- sub_df2 %>% mutate(years = year(BGN_DATE)) %>%
group_by(years) %>% summarise(T_PROPDMG_ABS = sum(PROPDMG_ABS),
T_CROPDMG_ABS = sum(CROPDMG_ABS))
sub_df2_year_event <- sub_df2 %>% mutate(years = year(BGN_DATE)) %>%
group_by(years,EVTYPE) %>%
summarise(T_PROPDMG_ABS = sum(PROPDMG_ABS),
T_CROPDMG_ABS = sum(CROPDMG_ABS))
for(idx in 1:nrow(sub_df2_year_event)){
temp_df3 <- sub_df2_year[sub_df2_year$years == sub_df2_year_event[idx,]$years,]
sub_df2_year_event[idx,
"T_PROPDMG_ABS"] = sub_df2_year_event[idx,
"T_PROPDMG_ABS"] / temp_df3$T_PROPDMG_ABS
sub_df2_year_event[idx,
"T_CROPDMG_ABS"] = sub_df2_year_event[idx,
"T_CROPDMG_ABS"] / temp_df3$T_CROPDMG_ABS
}
sub_df2_year_event$T_PROPDMG_ABS = sub_df2_year_event$T_PROPDMG_ABS * 100
sub_df2_year_event$T_CROPDMG_ABS = sub_df2_year_event$T_CROPDMG_ABS * 100
Property damage and crop damges are not consistent with the events. Lets create a subdataset for the top event that caused most economic damage throughout the years.
plot_df11 <- sub_df2_year_event %>% group_by(years) %>% top_n(n=1,wt=T_PROPDMG_ABS)
plot_df12 <- sub_df2_year_event %>% group_by(years) %>% top_n(n=1,wt=T_CROPDMG_ABS)
plot_df11 <- sub_df2_year_event %>%
filter(EVTYPE %in% unique(plot_df11$EVTYPE)) %>% group_by(years)
plot_df12 <- sub_df2_year_event %>%
filter(EVTYPE %in% unique(plot_df12$EVTYPE)) %>% group_by(years)
#for line plots 11, 12
Lets create a subdataset for total economic damage by the event type and take only the top 10.
sub_df2_event <- sub_df2 %>% group_by(EVTYPE) %>%
summarise(T_PROPDMG_ABS = sum(PROPDMG_ABS),
T_CROPDMG_ABS = sum(CROPDMG_ABS))
plot_df13 <- sub_df2_event %>%
top_n(n=10,wt=T_PROPDMG_ABS)
plot_df14 <- sub_df2_event %>%
top_n(n=10,wt=T_CROPDMG_ABS)
events3 <- unique(c(plot_df13$EVTYPE, plot_df14$EVTYPE))
len <- length(events3)
plot_df15 <- tbl_df(data = list("EVTYPE"=1:len,"Total PropertyDmg"=1:len,"Total CropDmg"=1:len))
plot_df15$EVTYPE <- events3
for(event in plot_df15$EVTYPE){
t_f <- sub_df2_event[sub_df2_event$EVTYPE == event,]$T_PROPDMG_ABS
t_i <- sub_df2_event[sub_df2_event$EVTYPE == event,]$T_CROPDMG_ABS
plot_df15[plot_df15$EVTYPE == event,"Total PropertyDmg"] <- t_f
plot_df15[plot_df15$EVTYPE == event,"Total CropDmg"] <- t_i
}
plot_df15 <- melt(data = plot_df15, id.vars="EVTYPE")
#for bar plots 5
Lets create a table to show some states for top 5 killers through the years.
#use sub_df1_year_event, plot_df1 and plot_df2
plot_df18 <- sub_df2_year_event %>% group_by(years) %>% top_n(n=5,wt=T_PROPDMG_ABS)
plot_df19 <- sub_df2_year_event %>% group_by(years) %>% top_n(n=5,wt=T_CROPDMG_ABS)
events4 <- unique(c(plot_df18$EVTYPE, plot_df19$EVTYPE))
df <- 16
plot_df16 <- sub_df2_year_event %>% filter(EVTYPE %in% events4) %>%
group_by(EVTYPE) %>%
summarise(mean_PropDmg = mean(T_PROPDMG_ABS),
sd_PropDmg = sd(T_PROPDMG_ABS),
conf_interval_p = paste(round((mean_PropDmg +
c(-1,1)*qt(0.975,df-1)*sd_PropDmg/sqrt(df)),2),
collapse = " - "),
mean_CorpDmg = mean(T_CROPDMG_ABS),
sd_CorpDmg = sd(T_CROPDMG_ABS),
conf_interval_c = paste(round((mean_CorpDmg +
c(-1,1)*qt(0.975,df-1)*sd_CorpDmg/sqrt(df)),2),
collapse = " - "))
#plot table with 6
Lets plot :) …………
plot_df11 <- plot_df11 %>% rename(Event_Type = EVTYPE)
p11 <- ggplot(plot_df11,aes(x=years,y=T_PROPDMG_ABS,color=Event_Type))+
geom_point(size=4,alpha=0.5)+
geom_line()+
labs(x="",y="% Property damage per year")+
scale_color_brewer(palette = "Paired")
plot_df12 <- plot_df12 %>% rename(Event_Type = EVTYPE)
p12 <- ggplot(plot_df12,aes(x=years,y=T_CROPDMG_ABS,color=Event_Type))+
geom_point(size=4,alpha=0.5)+
geom_line()+
labs(x="",y="% Crop damage per year")+
scale_color_brewer(palette = "Paired")
plot_df15 <- plot_df15 %>% rename(Type_of_damage = variable)
p13 <- ggplot(plot_df15,aes(x=EVTYPE,y=value,fill=Type_of_damage))+
geom_bar(stat = "identity")+
theme(axis.text.x = element_text(angle = 20))+
labs(y="Total economic damage",x="Event type")+
theme(legend.position = c(0.08, 0.9),
legend.background = element_rect(fill = "transparent"))+
scale_fill_brewer(palette = "Set2")
plot_df16 <- plot_df16 %>% rename(Event_Type = EVTYPE)
plot_df16 <- plot_df16 %>% arrange(mean_PropDmg)
p14 <- ggtexttable(plot_df16, rows = NULL,
theme = ttheme("mOrange"))
text.p <- paste("This figure shows the which types of events are most harmful",
"with respect to economic damage. ",
"\n1: Top 3 events causing most property damage by the years,",
"\n2: Top 3 events causing most corp damage by the years,",
"\n3: Top 10 events causing most economic damage in total since 1996, ",
"\n4: Statistics of the Top 5 events causing most economic damage", sep=" ")
text.p <- ggparagraph(text = text.p, face = "bold", size = 15, color = "black")
figure1 <- ggarrange(p11, p12, p13, p14,text.p,
labels = c("1", "2", "3", "4",""),
ncol = 1, nrow = 5)
figure1
So the analysis reviled some interesting points:
* No event type consistently causes property damage and corp damage.
* “FLOOD” shows two hedge spikes for property damage.(1997,2005)
* “HURRICANE (TYPHOON)” shows two hedge spikes for property damage.(1999,2004)
* “TROPICAL STORM”-(2001), “STORM SURGE/TIDE”-(2005,2008) and “HAIL”-(2010) showed few prominent spikes.
* “DROUGHT”-(2000-2003,2006), “HAIL”-(2009) and “HURRICANE (TYPHOON)”-(1999,2005) event types are main causes of corp damage.
* “FLOOD” is cause a problem for corps since 2008.
* In Total “FLOOD”, “HURRICANE (TYPHOON)”, “STORM SURGE/TIDE” and “TORNADO” are the clear winners when it comes to historical cumulative economic damage.
* The stat chart figure(2.4) shows the mean casualty and 95% confidence interval of the top 15 event types.