In this project we explore the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.
Our analysis found that floods cause the most property damage, while droughts cause the most crop damage. Floods also cause the most overall damage. Per event, however, hurricanes/typhoons cause the most property damage on average, while debris flows cause the most crop damage on average (an average that rests on a single recorded debris flow). Hurricanes/typhoons also cause the most overall damage per event. Floods are very common and thus account for most of the total damage over the period we studied.
The analysis showed that the event types most harmful to population health are tornadoes (the most injuries) and excessive heat (the most fatalities). On average, excessive heat causes more fatalities and more injuries per event than tornadoes, but tornadoes are more common. Floods also rank high in both fatalities and injuries and are very common events.
We download the data from https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2 (if not already downloaded):
# Load required libraries
library(data.table)
library(stringdist)
library(plyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:plyr':
##
## arrange, count, desc, failwith, id, mutate, rename, summarise,
## summarize
## The following objects are masked from 'package:data.table':
##
## between, first, last
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(ggpubr)
##
## Attaching package: 'ggpubr'
## The following object is masked from 'package:plyr':
##
## mutate
# Download storm data
if (!file.exists("repdata_data_StormData.csv.bz2")) {
  download.file("https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2",
                destfile = "repdata_data_StormData.csv.bz2")
}
# Clear environment
#rm(list=ls())
We read the database with fread() for better performance. Note that the chunk is cached in knitr to save time on subsequent compilations:
# Read the compressed storm database file
system.time({
  stormDB <- fread(file = "repdata_data_StormData.csv.bz2", data.table = FALSE)
})
## user system elapsed
## 23.641 0.852 24.568
Data frame dimensions and count of NAs per column:
dim(stormDB)
## [1] 902297 37
colSums(is.na(stormDB))
## STATE__ BGN_DATE BGN_TIME TIME_ZONE COUNTY COUNTYNAME STATE
## 0 0 0 0 0 0 0
## EVTYPE BGN_RANGE BGN_AZI BGN_LOCATI END_DATE END_TIME COUNTY_END
## 0 0 0 0 0 0 0
## COUNTYENDN END_RANGE END_AZI END_LOCATI LENGTH WIDTH F
## 902297 0 0 0 0 0 843563
## MAG FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG CROPDMGEXP
## 0 0 0 0 0 0 0
## WFO STATEOFFIC ZONENAMES LATITUDE LONGITUDE LATITUDE_E LONGITUDE_
## 0 0 0 47 0 40 0
## REMARKS REFNUM
## 0 0
First we expand the abbreviation TSTM to THUNDERSTORM.
We then delete the summary lines.
We convert BGN_DATE from character to date.
According to the NOAA Storm Events Database documentation, from 1996 onward all official event types are recorded as defined in NWS Directive 10-1605. We therefore restrict our analysis to data from 1996 onward, since those records are representative of most event types. We also keep only the columns needed for further analysis.
# Unabbreviate Thunderstorms
stormDB$EVTYPE <- gsub("TSTM", "THUNDERSTORM", stormDB$EVTYPE)
# Omit summary lines
stormDB <- stormDB[!grepl("SUMMARY*", toupper(stormDB$EVTYPE)), ]
# Convert BGN_DATE to a date field
stormDB$BGN_DATE <- as.Date(stormDB$BGN_DATE, "%m/%d/%Y")
# According to NOAA Storm Events Database
# https://www.ncdc.noaa.gov/stormevents/details.jsp?type=eventtype
# From 1996 to present, all official event types are recorded as defined in
# NWS Directive 10-1605 https://www.ncdc.noaa.gov/stormevents/pd01016005curr.pdf
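# So we keep only the data from 1996 onward for our further analysis.
# Columns kept: EVTYPE (8) and FATALITIES, INJURIES, PROPDMG, PROPDMGEXP,
# CROPDMG, CROPDMGEXP (23:28)
stormData <- stormDB[stormDB$BGN_DATE >= as.Date("1996-01-01"), c(8, 23:28)]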
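To see what these preprocessing steps do, the following sketch applies them to a small made-up data frame. The `toy` rows are invented for illustration and are not taken from the NOAA file; the sketch assumes that dates arrive as character strings with a trailing time part, such as "6/1/1996 0:00:00".

```r
# Toy rows invented for illustration; the date strings mimic the assumed raw format
toy <- data.frame(
  EVTYPE   = c("TSTM WIND", "Summary of March 14", "TORNADO", "HAIL"),
  BGN_DATE = c("7/4/1997 0:00:00", "3/14/1996 0:00:00",
               "4/18/1995 0:00:00", "1/1/1996 0:00:00"),
  stringsAsFactors = FALSE
)

# Expand the abbreviation
toy$EVTYPE <- gsub("TSTM", "THUNDERSTORM", toy$EVTYPE)
# Drop summary lines (case-insensitive thanks to toupper)
toy <- toy[!grepl("SUMMARY", toupper(toy$EVTYPE)), ]
# Parse the date; characters after the matched format are ignored
toy$BGN_DATE <- as.Date(toy$BGN_DATE, "%m/%d/%Y")
# Keep events from 1996 onward
kept <- toy[toy$BGN_DATE >= as.Date("1996-01-01"), ]

print(kept$EVTYPE)
```

Only the renamed thunderstorm row and the hail row survive: the summary line is dropped by the pattern filter and the 1995 tornado by the date filter.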
We then clean the values of the EVTYPE field to match the official event types:
# Sleet
stormData$EVTYPE[grepl("FREEZING RAIN", stormData$EVTYPE, ignore.case = TRUE)] <- "SLEET"
# Wet means rain
stormData$EVTYPE[grepl("HEAVY RAIN|RAIN|WET", stormData$EVTYPE, ignore.case = TRUE)] <- "HEAVY RAIN"
# Dry means drought
stormData$EVTYPE[grepl("DRY|DRIEST", stormData$EVTYPE, ignore.case = TRUE)] <- "DROUGHT"
# Snow that is not Lake-effect
stormData$EVTYPE[grepl("SNOW", stormData$EVTYPE, ignore.case = TRUE) & !grepl("LAK", stormData$EVTYPE, ignore.case = TRUE)] <- "HEAVY SNOW"
# Extreme or excessive cold
stormData$EVTYPE[grepl("EXTREM|EXCES", stormData$EVTYPE, ignore.case = TRUE) & grepl("COLD|COOL|CHILL", stormData$EVTYPE, ignore.case = TRUE)] <- "EXTREME COLD/WIND CHILL"
stormData$EVTYPE[grepl("HYPOTHERM", stormData$EVTYPE, ignore.case = TRUE)] <- "EXTREME COLD/WIND CHILL"
# Extreme or excessive heat, record temperatures, hyperthermia
stormData$EVTYPE[grepl("EXTREM|EXCES", stormData$EVTYPE, ignore.case = TRUE) & grepl("WARM|HOT|HEAT", stormData$EVTYPE, ignore.case = TRUE)] <- "EXCESSIVE HEAT"
stormData$EVTYPE[grepl("RECOR", stormData$EVTYPE, ignore.case = TRUE) & grepl("TEMP|HIGH", stormData$EVTYPE, ignore.case = TRUE)] <- "EXCESSIVE HEAT"
stormData$EVTYPE[grepl("HYPERTHERM", stormData$EVTYPE, ignore.case = TRUE) ] <- "EXCESSIVE HEAT"
# Not extreme or excessive cold
stormData$EVTYPE[!grepl("EXTREM|EXCES", stormData$EVTYPE, ignore.case = TRUE) & grepl("COLD|COOL|CHILL", stormData$EVTYPE, ignore.case = TRUE)] <- "COLD/WIND CHILL"
stormData$EVTYPE[grepl("LOW", stormData$EVTYPE, ignore.case = TRUE) & grepl("TEMP", stormData$EVTYPE, ignore.case = TRUE)] <- "COLD/WIND CHILL"
# Not extreme or excessive heat
stormData$EVTYPE[!grepl("EXTREM|EXCES", stormData$EVTYPE, ignore.case = TRUE) & grepl("WARM|HOT|HEAT", stormData$EVTYPE, ignore.case = TRUE)] <- "HEAT"
# Erosion is coastal flooding
stormData$EVTYPE[grepl("EROS|OUT TIDE", stormData$EVTYPE, ignore.case = TRUE) ] <- "COASTAL FLOOD"
# Urban flooding
stormData$EVTYPE[grepl("FLD|FLOOD", stormData$EVTYPE, ignore.case = TRUE) & grepl("URBAN|STREET", stormData$EVTYPE, ignore.case = TRUE)] <- "FLOOD"
# Every other type of flooding
stormData$EVTYPE[grepl("FLD|FLOOD", stormData$EVTYPE, ignore.case = TRUE) & !grepl("COAST|DEBR|FLASH|LAK", stormData$EVTYPE, ignore.case = TRUE)] <- "FLOOD"
# Red flag fire is wildfire
stormData$EVTYPE[grepl("FLAG FIR|FIR", stormData$EVTYPE, ignore.case = TRUE) ] <- "WILDFIRE"
# Anything with surf maps to high surf
stormData$EVTYPE[grepl("SURF", stormData$EVTYPE, ignore.case = TRUE) ] <- "HIGH SURF"
# Dust storms
stormData$EVTYPE[grepl("DUST", stormData$EVTYPE, ignore.case = TRUE) & !grepl("DEV", stormData$EVTYPE, ignore.case = TRUE)] <- "DUST STORM"
stormData$EVTYPE[grepl("HURRICANE EDOU", stormData$EVTYPE, ignore.case = TRUE)] <- "MARINE HURRICANE/TYPHOON"
# Non marine hail is simply hail
stormData$EVTYPE[grepl("HAIL", stormData$EVTYPE, ignore.case = TRUE) & !grepl("MAR", stormData$EVTYPE, ignore.case = TRUE)] <- "HAIL"
# Non marine thunderstorms are simply thunderstorms
stormData$EVTYPE[grepl("THUND", stormData$EVTYPE, ignore.case = TRUE) & !grepl("MAR", stormData$EVTYPE, ignore.case = TRUE)] <- "THUNDERSTORM WIND"
# Freezing
stormData$EVTYPE[grepl("FREEZ", stormData$EVTYPE, ignore.case = TRUE) & !grepl("FOG|FROST|MAR", stormData$EVTYPE, ignore.case = TRUE)] <- "FROST/FREEZE"
# Mixed precipitation
stormData$EVTYPE[grepl("PRECIPIT", stormData$EVTYPE, ignore.case = TRUE) ] <- "MIXED PRECIPITATION"
#######
# From https://www.ncdc.noaa.gov/stormevents/pd01016005curr.pdf
# APPENDIX A–Event Types
# 55 Types total
######
officialEventTypes <- c("Astronomical Low Tide", "Avalanche", "Blizzard", "Coastal Flood", "Cold/Wind Chill", "Debris Flow", "Dense Fog",
"Dense Smoke", "Drought", "Dust Devil", "Dust Storm", "Excessive Heat", "Extreme Cold/Wind Chill", "Flash Flood", "Flood",
"Freezing Fog", "Frost/Freeze", "Funnel Cloud", "Hail", "Heat", "Heavy Rain", "Heavy Snow", "High Surf",
"High Wind", "Hurricane/Typhoon", "Ice Storm", "Lakeshore Flood", "Lake-Effect Snow", "Lightning", "Marine Dense Fog", "Marine Hail",
"Marine Heavy Freezing Spray", "Marine High Wind", "Marine Hurricane/Typhoon", "Marine Lightning",
"Marine Strong Wind", "Marine Thunderstorm Wind", "Marine Tropical Depression", "Marine Tropical Storm",
"Rip Current", "Seiche", "Sleet", "Sneaker Wave", "Storm Surge/Tide", "Strong Wind", "Thunderstorm Wind", "Tornado",
"Tropical Depression", "Tropical Storm", "Tsunami", "Volcanic Ash", "Waterspout", "Wildfire", "Winter Storm", "Winter Weather",
# The last two are added by me and are not official event types
"MIXED PRECIPITATION", "UNOFFICIAL EVENT TYPE"
)
# Match EVTYPE with 55 official event types plus one for mixed precipitation. If no match then set to 57 (UNOFFICIAL EVENT TYPE)
matchidx <- amatch(toupper(stormData$EVTYPE), toupper(officialEventTypes), maxDist = 7, nomatch = 57)
# Replace EVTYPE with matched official event types
stormData$EVTYPE <- toupper(officialEventTypes[matchidx])
#sum(!(stormData$EVTYPE %in% toupper(officialEventTypes)))
#sum(stormData$EVTYPE == "UNOFFICIAL EVENT TYPE")
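The fuzzy-matching step can be illustrated on a few made-up strings. This is a minimal sketch, assuming the stringdist package is installed; the labels and the tiny lookup table are invented for the example. amatch() returns the index of the closest table entry whose distance does not exceed maxDist, and the nomatch value otherwise. With a generous threshold such as maxDist = 7, even an unrelated short label is assigned to some official type, so it may be worth spot-checking the resulting matches.

```r
library(stringdist)

# Invented lookup table; the last entry plays the role of the catch-all type
types  <- c("TORNADO", "HEAVY RAIN", "HAIL", "UNOFFICIAL EVENT TYPE")
labels <- c("TORNADOS", "HEAVY RAINS", "OTHER")

# Strict threshold: only near-identical spellings are matched,
# everything else falls through to index 4 (the catch-all)
strict <- amatch(labels, types[1:3], maxDist = 2, nomatch = 4)

# Loose threshold: any label of up to 7 characters is within
# distance 7 of "HAIL", so "OTHER" is matched to a real type
loose <- amatch(labels, types[1:3], maxDist = 7, nomatch = 4)

print(types[strict])
print(types[loose])
```

Under the strict threshold "OTHER" ends up in the catch-all type, while under the loose threshold it is silently mapped to one of the three real types.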
Then we calculate the multipliers for property and crop damage from the exponent fields (PROPDMGEXP, CROPDMGEXP). We apply the multipliers and calculate the damages. Finally we drop the exponent fields:
#
# Set the appropriate multipliers for property and crop damages
# Citation: https://rstudio-pubs-static.s3.amazonaws.com/58957_37b6723ee52b455990e149edde45e5b6.html
#
PROPmultiplier <- plyr::mapvalues(stormData$PROPDMGEXP,
c("", "-", "?", "+", "0", "1", "2", "3", "4", "5", "6", "7", "8", "h", "H", "k", "K", "m", "M", "b", "B"),
c( 0, 0, 0, 1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e2, 1e2, 1e3, 1e3, 1e6, 1e6, 1e9, 1e9),
warn_missing = FALSE)
CROPmultiplier <- plyr::mapvalues(stormData$CROPDMGEXP,
c("", "-", "?", "+", "0", "1", "2", "3", "4", "5", "6", "7", "8", "h", "H", "k", "K", "m", "M", "b", "B"),
c( 0, 0, 0, 1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e1, 1e2, 1e2, 1e3, 1e3, 1e6, 1e6, 1e9, 1e9),
warn_missing = FALSE)
# Calculate property and crops damages
stormData$PROPDMG <- stormData$PROPDMG * as.numeric(PROPmultiplier)
stormData$CROPDMG <- stormData$CROPDMG * as.numeric(CROPmultiplier)
# Delete exponents from data
stormData$PROPDMGEXP <- NULL
stormData$CROPDMGEXP <- NULL
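As a worked example of the exponent conversion, the sketch below applies the same code-to-multiplier table to a handful of invented damage records. It uses base R's match() rather than plyr::mapvalues(), purely to keep the example free of package dependencies; the amounts and codes are made up.

```r
# Same mapping as in the chunk above: exponent code -> multiplier
codes <- c("", "-", "?", "+", "0", "1", "2", "3", "4", "5", "6", "7", "8",
           "h", "H", "k", "K", "m", "M", "b", "B")
mults <- c(0, 0, 0, 1, rep(1e1, 9), 1e2, 1e2, 1e3, 1e3, 1e6, 1e6, 1e9, 1e9)

# Invented records: reported amount and its exponent code
amount <- c(2.5, 1, 3, 10, 4)
expo   <- c("K", "M", "B", "", "5")

# Look up each code's multiplier; a code missing from the table would give NA
damage <- amount * mults[match(expo, codes)]
print(damage)
```

Note that under this table a record with an empty exponent contributes zero damage, whatever amount was reported.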
Calculate summary statistics for every type of event:
eventStats <-
  stormData %>%
  group_by(EVTYPE) %>%
  # Total and per-event mean of every non-grouping column
  summarise(across(everything(), list(TOTAL = sum, AVG = mean)))
## `summarise()` ungrouping output (override with `.groups` argument)
countEventsByType <-
stormData %>%
count(EVTYPE)
eventStats <-
eventStats %>%
mutate(COUNT = countEventsByType$n) %>%
mutate(DAMAGES = PROPDMG_TOTAL + CROPDMG_TOTAL)
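The event count can also be computed inside the same summarise() call with n(), which avoids relying on two separately built tables having the same row order. A minimal sketch on invented data, assuming dplyr 1.0.0 or later (for across()); the `toy` data frame and `toyStats` are names made up for this example:

```r
library(dplyr)

# Invented events for illustration
toy <- data.frame(
  EVTYPE     = c("FLOOD", "FLOOD", "TORNADO"),
  FATALITIES = c(0, 2, 1),
  PROPDMG    = c(1000, 3000, 500),
  stringsAsFactors = FALSE
)

toyStats <- toy %>%
  group_by(EVTYPE) %>%
  summarise(
    # Total and per-event mean of every non-grouping column
    across(everything(), list(TOTAL = sum, AVG = mean)),
    # Number of events in the group
    COUNT = n()
  )

print(toyStats)
```

The resulting columns follow the same `<column>_TOTAL` and `<column>_AVG` naming as the table in this report, with COUNT appended last.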
Summary statistics for every type of event:
print(eventStats, width = Inf, n = Inf)
## # A tibble: 53 x 11
## EVTYPE FATALITIES_TOTAL FATALITIES_AVG INJURIES_TOTAL
## <chr> <dbl> <dbl> <dbl>
## 1 ASTRONOMICAL LOW TIDE 0 0 0
## 2 AVALANCHE 224 0.567 180
## 3 BLIZZARD 70 0.0266 385
## 4 COASTAL FLOOD 10 0.0130 9
## 5 COLD/WIND CHILL 118 0.166 24
## 6 DEBRIS FLOW 0 0 0
## 7 DENSE FOG 9 0.00753 143
## 8 DENSE SMOKE 0 0 0
## 9 DROUGHT 12 0.00443 34
## 10 DUST DEVIL 2 0.0146 39
## 11 DUST STORM 11 0.0251 376
## 12 EXCESSIVE HEAT 1798 1.04 6391
## 13 EXTREME COLD/WIND CHILL 264 0.143 108
## 14 FLASH FLOOD 887 0.0174 1674
## 15 FLOOD 525 0.0183 7855
## 16 FREEZING FOG 0 0 0
## 17 FROST/FREEZE 3 0.00208 13
## 18 FUNNEL CLOUD 0 0 1
## 19 HAIL 12 0.0000575 818
## 20 HEAT 239 0.214 1315
## 21 HEAVY RAIN 98 0.00842 236
## 22 HEAVY SNOW 131 0.00883 753
## 23 HIGH SURF 149 0.140 247
## 24 HIGH WIND 243 0.0121 1095
## 25 HURRICANE/TYPHOON 64 0.727 1275
## 26 ICE STORM 87 0.0454 341
## 27 LAKE-EFFECT SNOW 0 0 0
## 28 LAKESHORE FLOOD 0 0 0
## 29 LIGHTNING 651 0.0493 4141
## 30 MARINE DENSE FOG 0 0 0
## 31 MARINE HAIL 1 0.00226 2
## 32 MARINE HIGH WIND 1 0.00741 1
## 33 MARINE HURRICANE/TYPHOON 0 0 2
## 34 MARINE STRONG WIND 14 0.292 22
## 35 MARINE THUNDERSTORM WIND 19 0.00159 34
## 36 MIXED PRECIPITATION 2 0.0233 26
## 37 RIP CURRENT 542 0.738 503
## 38 SEICHE 65 0.312 48
## 39 SLEET 4 0.0139 1
## 40 SNEAKER WAVE 0 0 2
## 41 STORM SURGE/TIDE 13 0.0324 42
## 42 STRONG WIND 110 0.0293 299
## 43 THUNDERSTORM WIND 374 0.00178 5035
## 44 TORNADO 1511 0.0653 20667
## 45 TROPICAL DEPRESSION 0 0 0
## 46 TROPICAL STORM 57 0.0836 338
## 47 TSUNAMI 33 1.65 129
## 48 UNOFFICIAL EVENT TYPE 0 0 7
## 49 VOLCANIC ASH 0 0 0
## 50 WATERSPOUT 2 0.000589 2
## 51 WILDFIRE 124 0.0260 1510
## 52 WINTER STORM 192 0.0168 1369
## 53 WINTER WEATHER 61 0.00753 483
## INJURIES_AVG PROPDMG_TOTAL PROPDMG_AVG CROPDMG_TOTAL CROPDMG_AVG COUNT
## <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 0 9745000 35181. 0 0 277
## 2 0.456 3711800 9397. 0 0 395
## 3 0.146 525658950 199643. 7060000 2681. 2633
## 4 0.0117 407355560 527663. 0 0 772
## 5 0.0337 2644000 3713. 30742500 43178. 712
## 6 0 0 0 42000000 42000000 1
## 7 0.120 7319000 6125. 0 0 1195
## 8 0 100000 10000 0 0 10
## 9 0.0126 1047855600 386948. 13367581000 4936330. 2708
## 10 0.285 663630 4844. 0 0 137
## 11 0.856 5594100 12743. 3100000 7062. 439
## 12 3.71 7723700 4480. 492402000 285616. 1724
## 13 0.0583 29163400 15755. 1326023000 716382. 1851
## 14 0.0328 15222268910 298447. 1334901700 26172. 51005
## 15 0.273 144745424200 5033048. 5014286500 174355. 28759
## 16 0 2182000 47435. 0 0 46
## 17 0.00900 18785000 13009. 1326761000 918810. 1444
## 18 0.000165 134100 22.1 0 0 6069
## 19 0.00392 14640138920 70113. 2561518700 12267. 208807
## 20 1.18 2577500 2308. 1220900 1093. 1117
## 21 0.0203 598703440 51462. 739919800 63600. 11634
## 22 0.0508 641809540 43266. 71122100 4795. 14834
## 23 0.233 95394500 89910. 0 0 1061
## 24 0.0547 5250093360 262308. 633771300 31665. 20015
## 25 14.5 69305840000 787566364. 2607872800 29634918. 88
## 26 0.178 3642742010 1902215. 15660000 8178. 1915
## 27 0 40182000 61253. 0 0 656
## 28 0 7540000 327826. 0 0 23
## 29 0.314 743077080 56277. 6898440 522. 13204
## 30 0 0 0 0 0 3
## 31 0.00451 54000 122. 0 0 443
## 32 0.00741 1297010 9607. 0 0 135
## 33 1 0 0 0 0 2
## 34 0.458 418330 8715. 0 0 48
## 35 0.00284 5857400 489. 50000 4.17 11987
## 36 0.302 790000 9186. 0 0 86
## 37 0.685 163000 222. 0 0 734
## 38 0.231 11815024010 56803000. 2741410000 13179856. 208
## 39 0.00348 1082000 3770. 0 0 287
## 40 1 1000000 500000 0 0 2
## 41 0.105 47834724000 119288589. 855000 2132. 401
## 42 0.0796 176994240 47098. 64953500 17284. 3758
## 43 0.0240 7869230380 37440. 952246350 4531. 210181
## 44 0.893 24616945710 1063137. 283425010 12240. 23155
## 45 0 1737000 28950 0 0 60
## 46 0.496 7642475550 11205976. 677711000 993711. 682
## 47 6.45 144062000 7203100 20000 1000 20
## 48 0.233 5000 167. 0 0 30
## 49 0 500000 18519. 0 0 27
## 50 0.000589 5737200 1690. 0 0 3394
## 51 0.316 8085037500 1692847. 422272130 88415. 4776
## 52 0.120 1532755750 134382. 11944000 1047. 11406
## 53 0.0596 27298000 3371. 15000000 1852. 8098
## DAMAGES
## <dbl>
## 1 9745000
## 2 3711800
## 3 532718950
## 4 407355560
## 5 33386500
## 6 42000000
## 7 7319000
## 8 100000
## 9 14415436600
## 10 663630
## 11 8694100
## 12 500125700
## 13 1355186400
## 14 16557170610
## 15 149759710700
## 16 2182000
## 17 1345546000
## 18 134100
## 19 17201657620
## 20 3798400
## 21 1338623240
## 22 712931640
## 23 95394500
## 24 5883864660
## 25 71913712800
## 26 3658402010
## 27 40182000
## 28 7540000
## 29 749975520
## 30 0
## 31 54000
## 32 1297010
## 33 0
## 34 418330
## 35 5907400
## 36 790000
## 37 163000
## 38 14556434010
## 39 1082000
## 40 1000000
## 41 47835579000
## 42 241947740
## 43 8821476730
## 44 24900370720
## 45 1737000
## 46 8320186550
## 47 144082000
## 48 5000
## 49 500000
## 50 5737200
## 51 8507309630
## 52 1544699750
## 53 42298000
g1 <- eventStats %>% arrange(desc(PROPDMG_TOTAL)) %>% top_n(10, wt = PROPDMG_TOTAL) %>%
mutate(EVTYPE = reorder(EVTYPE, PROPDMG_TOTAL)) %>%
ggplot( aes(y=PROPDMG_TOTAL/1e9, x=EVTYPE)) +
geom_bar(stat="identity", fill="#FFDB6D", alpha=.9, width=.8) +
xlab("TYPES OF EVENTS") +
ylab("PROPERTY DAMAGE (in B$)") +
coord_flip() +
theme_minimal() +
theme(axis.text.x = element_text(color = "grey20", size = 10, hjust = .5, vjust = .5, face = "plain"),
axis.title=element_text(size=8))
g2 <- eventStats %>% arrange(desc(CROPDMG_TOTAL)) %>% top_n(10, wt = CROPDMG_TOTAL) %>%
mutate(EVTYPE = reorder(EVTYPE, CROPDMG_TOTAL)) %>%
ggplot( aes(y=CROPDMG_TOTAL/1e9, x=EVTYPE)) +
geom_bar(stat="identity", fill="#00AFBB", alpha=.9, width=.8) +
xlab("") +
ylab("CROP DAMAGE (in B$)") +
coord_flip() +
theme_minimal() +
theme(axis.text.x = element_text(color = "grey20", size = 10, hjust = .5, vjust = .5, face = "plain"),
axis.title=element_text(size=8))
g3 <- eventStats %>% arrange(desc(DAMAGES)) %>% top_n(10, wt = DAMAGES) %>%
mutate(EVTYPE = reorder(EVTYPE, DAMAGES)) %>%
ggplot( aes(y=DAMAGES/1e9, x=EVTYPE)) +
geom_bar(stat="identity", fill="#f68060", alpha=.8, width=.8) +
xlab("TYPES OF EVENTS") +
ylab("TOTAL DAMAGE (in Billion US $)") +
coord_flip() +
theme_minimal() +
theme(axis.text.x = element_text(color = "grey20", size = 10, hjust = .5, vjust = .5, face = "plain"))
ggarrange(
ggarrange(g1, g2, ncol = 2), # First row
g3, # Second row
nrow = 2
)
The figure above shows that floods cause the most property damage, while droughts cause the most crop damage. Floods also cause the most overall damage.
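The plotting chunks in this report all follow one pattern: pick the ten largest values of a statistic, order the event types by it, and draw a horizontal bar chart. As a sketch of how that pattern could be factored out, the hypothetical helper below wraps it in a function. plot_top_events() and its arguments are our own illustration, not part of the original analysis. It assumes dplyr 1.0.0 or later and uses slice_max() where the chunks in this report use arrange() with top_n().

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: horizontal bar chart of the n largest values of one column
plot_top_events <- function(stats, column, y_label, fill, divisor = 1, n = 10) {
  stats %>%
    # Keep the n rows with the largest values of the chosen column
    slice_max(.data[[column]], n = n, with_ties = FALSE) %>%
    # Order the bars by that column
    mutate(EVTYPE = reorder(EVTYPE, .data[[column]])) %>%
    ggplot(aes(x = EVTYPE, y = .data[[column]] / divisor)) +
    geom_bar(stat = "identity", fill = fill, alpha = .9, width = .8) +
    xlab("TYPES OF EVENTS") +
    ylab(y_label) +
    coord_flip() +
    theme_minimal()
}

# Invented totals, only to show the call
toyStats <- data.frame(
  EVTYPE        = c("FLOOD", "HAIL", "TORNADO"),
  PROPDMG_TOTAL = c(3e9, 1e9, 2e9),
  stringsAsFactors = FALSE
)
p <- plot_top_events(toyStats, "PROPDMG_TOTAL", "PROPERTY DAMAGE (in B$)",
                     fill = "#FFDB6D", divisor = 1e9, n = 2)
```

With such a helper, each panel would reduce to a single call that differs only in the column name, axis label, fill colour and scaling divisor.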
g11 <- eventStats %>% arrange(desc(PROPDMG_AVG)) %>% top_n(10, wt = PROPDMG_AVG) %>%
mutate(EVTYPE = reorder(EVTYPE, PROPDMG_AVG)) %>%
ggplot( aes(y=PROPDMG_AVG/1e6, x=EVTYPE)) +
geom_bar(stat="identity", fill="#FFDB6D", alpha=.9, width=.8) +
xlab("TYPES OF EVENTS") +
ylab("PROPERTY DAMAGE PER EVENT (in M$)") +
coord_flip() +
theme_minimal() +
theme(axis.text.x = element_text(color = "grey20", size = 10, hjust = .5, vjust = .5, face = "plain"),
axis.title=element_text(size=8))
g12 <- eventStats %>% arrange(desc(CROPDMG_AVG)) %>% top_n(10, wt = CROPDMG_AVG) %>%
mutate(EVTYPE = reorder(EVTYPE, CROPDMG_AVG)) %>%
ggplot( aes(y=CROPDMG_AVG/1e6, x=EVTYPE)) +
geom_bar(stat="identity", fill="#00AFBB", alpha=.9, width=.8) +
xlab("") +
ylab("CROP DAMAGE PER EVENT (in M$)") +
coord_flip() +
theme_minimal() +
theme(axis.text.x = element_text(color = "grey20", size = 10, hjust = .5, vjust = .5, face = "plain"),
axis.title=element_text(size=8))
g13 <- eventStats %>% arrange(desc(DAMAGES/COUNT)) %>% top_n(10, wt = DAMAGES/COUNT) %>%
mutate(EVTYPE = reorder(EVTYPE, DAMAGES/COUNT)) %>%
ggplot( aes(y=DAMAGES/(1e6*COUNT), x=EVTYPE)) +
geom_bar(stat="identity", fill="#f68060", alpha=.8, width=.8) +
xlab("TYPES OF EVENTS") +
ylab("TOTAL DAMAGE PER EVENT (in Million US $)") +
coord_flip() +
theme_minimal() +
theme(axis.text.x = element_text(color = "grey20", size = 10, hjust = .5, vjust = .5, face = "plain"))
ggarrange(
ggarrange(g11, g12, ncol = 2), # First row
g13, # Second row
nrow = 2
)
The figure above shows that hurricanes/typhoons cause the most property damage per event on average, while debris flows cause the most crop damage per event on average. Hurricanes/typhoons also cause the most overall damage per event.
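The per-event averages are simply each type's total divided by its event count, so they can be checked by hand against the summary table. The sketch below recomputes three of them; the totals and counts are copied from the table printed earlier in this report, and the variable names are ours.

```r
# Totals and counts copied from the summary table
hurricane_prop_total <- 69305840000   # HURRICANE/TYPHOON, PROPDMG_TOTAL
hurricane_count      <- 88
flood_prop_total     <- 144745424200  # FLOOD, PROPDMG_TOTAL
flood_count          <- 28759
debris_crop_total    <- 42000000      # DEBRIS FLOW, CROPDMG_TOTAL
debris_count         <- 1

# Average damage per event = total / number of events
hurricane_prop_avg <- hurricane_prop_total / hurricane_count
flood_prop_avg     <- flood_prop_total / flood_count
debris_crop_avg    <- debris_crop_total / debris_count

print(round(c(hurricane_prop_avg, flood_prop_avg, debris_crop_avg)))
```

Floods lead in total property damage because there are far more of them, while each hurricane/typhoon is far costlier on average. The debris-flow average rests on a single recorded event, so it should be read with caution.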
g4 <- eventStats %>% arrange(desc(FATALITIES_TOTAL)) %>% top_n(10, wt = FATALITIES_TOTAL) %>%
mutate(EVTYPE = reorder(EVTYPE, FATALITIES_TOTAL)) %>%
ggplot( aes(y=FATALITIES_TOTAL, x=EVTYPE)) +
geom_bar(stat="identity", fill="#f68060", alpha=.9, width=.8) +
xlab("TYPES OF EVENTS") +
ylab("FATALITIES") +
coord_flip() +
theme_minimal() +
theme(axis.text.x = element_text(color = "grey20", size = 10, hjust = .5, vjust = .5, face = "plain"))
g5 <- eventStats %>% arrange(desc(INJURIES_TOTAL)) %>% top_n(10, wt = INJURIES_TOTAL) %>%
mutate(EVTYPE = reorder(EVTYPE, INJURIES_TOTAL)) %>%
ggplot( aes(y=INJURIES_TOTAL, x=EVTYPE)) +
geom_bar(stat="identity", fill="#FFDB6D", alpha=.9, width=.8) +
xlab("TYPES OF EVENTS") +
ylab("INJURIES") +
coord_flip() +
theme_minimal() +
theme(axis.text.x = element_text(color = "grey20", size = 10, hjust = .5, vjust = .5, face = "plain"))
ggarrange(g4, g5, nrow = 2)
The figure above shows that tornadoes, excessive heat and floods are the most harmful types of events for population health.
We conclude our analysis with some summary statistics.
eventStats %>% arrange(desc(COUNT)) %>% top_n(10, wt = COUNT) %>% select(EVTYPE, COUNT)
## # A tibble: 10 x 2
## EVTYPE COUNT
## <chr> <int>
## 1 THUNDERSTORM WIND 210181
## 2 HAIL 208807
## 3 FLASH FLOOD 51005
## 4 FLOOD 28759
## 5 TORNADO 23155
## 6 HIGH WIND 20015
## 7 HEAVY SNOW 14834
## 8 LIGHTNING 13204
## 9 MARINE THUNDERSTORM WIND 11987
## 10 HEAVY RAIN 11634
eventStats %>% arrange(desc(FATALITIES_AVG)) %>% top_n(10, wt = FATALITIES_AVG) %>% select(EVTYPE, FATALITIES_AVG)
## # A tibble: 10 x 2
## EVTYPE FATALITIES_AVG
## <chr> <dbl>
## 1 TSUNAMI 1.65
## 2 EXCESSIVE HEAT 1.04
## 3 RIP CURRENT 0.738
## 4 HURRICANE/TYPHOON 0.727
## 5 AVALANCHE 0.567
## 6 SEICHE 0.312
## 7 MARINE STRONG WIND 0.292
## 8 HEAT 0.214
## 9 COLD/WIND CHILL 0.166
## 10 EXTREME COLD/WIND CHILL 0.143
eventStats %>% arrange(desc(INJURIES_AVG)) %>% top_n(10, wt = INJURIES_AVG) %>% select(EVTYPE, INJURIES_AVG)
## # A tibble: 10 x 2
## EVTYPE INJURIES_AVG
## <chr> <dbl>
## 1 HURRICANE/TYPHOON 14.5
## 2 TSUNAMI 6.45
## 3 EXCESSIVE HEAT 3.71
## 4 HEAT 1.18
## 5 MARINE HURRICANE/TYPHOON 1
## 6 SNEAKER WAVE 1
## 7 TORNADO 0.893
## 8 DUST STORM 0.856
## 9 RIP CURRENT 0.685
## 10 TROPICAL STORM 0.496