This report examines the impact of various storm event types on human health and on property. In particular:
Across the United States, which types of events are most harmful with respect to population health?
Across the United States, which types of events have the greatest economic consequences?
This section details the data import and transformation mechanisms used to analyse the data.
To identify the storm event types most harmful to population health, the numbers of fatalities and injuries are summed. To assess which events cause the most economic damage, the property and crop damage values are summed.
The original data stores property and crop damage as a base value plus a multiplier code. The adjacent PROPDMGEXP and CROPDMGEXP columns (later renamed to propdmgexp and cropdmgexp) hold the multiplier codes, which the code below interprets as follows: h or H = 100; k or K = 1,000; m or M = 1,000,000; B = 1,000,000,000.
Any other symbol or number in the exp columns is treated as a factor of 1.
library(dplyr)
library(ggplot2)
storm <- read.csv("./Datasets/StormData.csv.bz2")
storm <- rename(storm, eventtype = EVTYPE, fatalities = FATALITIES, injuries = INJURIES, propdamage = PROPDMG,
propdmgexp = PROPDMGEXP, cropdamage = CROPDMG, cropdmgexp = CROPDMGEXP)
storm <- select(storm, c(eventtype, fatalities, injuries, propdamage, propdmgexp, cropdamage, cropdmgexp))
For each exp column: change the class type to character, replace the h, k, K, m, M, and B values with the appropriate decimal factors, then change the class type to numeric.
storm$propdmgexp <- as.character(storm$propdmgexp)
storm$propdmgexp[storm$propdmgexp == "K"] <- 1000
storm$propdmgexp[storm$propdmgexp == "m"] <- 1000000
storm$propdmgexp[storm$propdmgexp == "M"] <- 1000000
storm$propdmgexp[storm$propdmgexp == "h"] <- 100
storm$propdmgexp[storm$propdmgexp == "H"] <- 100
storm$propdmgexp[storm$propdmgexp == "B"] <- 1000000000
storm$propdmgexp[storm$propdmgexp == "-"] <- 1
storm$propdmgexp[storm$propdmgexp == "?"] <- 1
storm$propdmgexp[storm$propdmgexp == "+"] <- 1
storm$propdmgexp[storm$propdmgexp == "0"] <- 1
storm$propdmgexp[storm$propdmgexp == "2"] <- 1
storm$propdmgexp[storm$propdmgexp == "3"] <- 1
storm$propdmgexp[storm$propdmgexp == "4"] <- 1
storm$propdmgexp[storm$propdmgexp == "5"] <- 1
storm$propdmgexp[storm$propdmgexp == "6"] <- 1
storm$propdmgexp[storm$propdmgexp == "7"] <- 1
storm$propdmgexp[storm$propdmgexp == "8"] <- 1
storm$propdmgexp <- as.numeric(storm$propdmgexp)
storm$cropdmgexp <- as.character(storm$cropdmgexp)
storm$cropdmgexp[storm$cropdmgexp == "K"] <- 1000
storm$cropdmgexp[storm$cropdmgexp == "k"] <- 1000
storm$cropdmgexp[storm$cropdmgexp == "B"] <- 1000000000
storm$cropdmgexp[storm$cropdmgexp == "m"] <- 1000000
storm$cropdmgexp[storm$cropdmgexp == "M"] <- 1000000
storm$cropdmgexp[storm$cropdmgexp == "?"] <- 1
storm$cropdmgexp[storm$cropdmgexp == "0"] <- 1
storm$cropdmgexp[storm$cropdmgexp == "2"] <- 1
storm$cropdmgexp <- as.numeric(storm$cropdmgexp)
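The replacements above can also be sketched with a single named lookup vector. This is a minimal illustration, not the method used for the results in this report; the helper name `exp_to_factor` and the sample `codes` vector are hypothetical. One point worth keeping in mind: `as.numeric("")` returns `NA`, so if an exp column contains blank codes, they need explicit handling to fall under the "treated as 1" rule. The sketch assumes blanks should be treated as 1 like any other unrecognised code.

```r
# Hypothetical helper: map damage exponent codes to numeric multipliers.
exp_to_factor <- function(code) {
  # Known codes; toupper() makes the match case-insensitive
  lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)
  mult <- unname(lookup[toupper(as.character(code))])
  # Codes not in the lookup (blank, "?", "+", "-", digits) come back as NA;
  # by assumption they are treated as a factor of 1
  mult[is.na(mult)] <- 1
  mult
}

# Made-up sample codes, only to exercise the helper
codes <- c("K", "m", "B", "h", "", "?", "5")
exp_to_factor(codes)
```

Applied to the data, this would look like `storm$propdmgexp <- exp_to_factor(storm$propdmgexp)`, and likewise for `cropdmgexp`.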
Calculate property and crop damage
storm$property <- storm$propdamage * storm$propdmgexp
storm$crops <- storm$cropdamage * storm$cropdmgexp
Calculate sum of property and crop damage
storm$totals <- storm$crops + storm$property
Aggregate by event type
damage <- aggregate(storm$totals, by = list(storm$eventtype), FUN = sum)
damage <- rename(damage, EventType = Group.1, Value = x)
damage <- arrange(damage, desc(Value))
Keep the top 5 event types by amount of damage
damage <- damage[1:5,]
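The aggregate, sort, and subset steps above could equally be written as one dplyr pipeline. The sketch below uses a toy data frame with made-up values (`toy` and `top_damage` are hypothetical names), so it shows only the shape of the computation. It also sets `na.rm = TRUE`, which is an assumption about how missing totals should be handled: `sum()` without it returns `NA` for any group that contains a missing value.

```r
library(dplyr)

# Toy data with made-up values, only to illustrate the pipeline
toy <- data.frame(
  eventtype = c("FLOOD", "TORNADO", "FLOOD", "HAIL", "TORNADO"),
  totals    = c(2e6, 5e5, NA, 1e4, 2.5e6),
  stringsAsFactors = FALSE
)

top_damage <- toy %>%
  group_by(eventtype) %>%
  # na.rm = TRUE skips rows whose damage total is missing (assumption)
  summarise(Value = sum(totals, na.rm = TRUE)) %>%
  arrange(desc(Value)) %>%
  head(5)   # keep at most the five largest groups

top_damage
```

The same pattern would apply to the harmed counts by replacing `totals` with the injuries plus fatalities column.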
Calculate sum of injuries and fatalities
storm$harmed <- storm$injuries + storm$fatalities
Aggregate by event type
harmed <- aggregate(storm$harmed, by = list(storm$eventtype), FUN = sum)
harmed <- rename(harmed, EventType = Group.1, Harmed = x)
harmed <- arrange(harmed, desc(Harmed))
Keep the top 5 event types by number of people harmed
harmed <- harmed[1:5,]
Across the United States, which types of events are most harmful with respect to population health?
The top 5 events impacting population health are:
## EventType Harmed
## 1 TORNADO 96979
## 2 EXCESSIVE HEAT 8428
## 3 TSTM WIND 7461
## 4 FLOOD 7259
## 5 LIGHTNING 6046
ggplot(harmed, aes(reorder(EventType, -Harmed), Harmed)) + geom_bar(stat = "identity", fill = "red") + theme(axis.text.x = element_text(angle = 45, hjust = 1)) + labs(x = "Event Type", y = "Population Health Impact")
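As a possible variation on the plot call above, `geom_col()` is shorthand for `geom_bar(stat = "identity")`, and `coord_flip()` draws horizontal bars so that long event names need no rotated labels. The sketch rebuilds the top-5 table by hand from the values listed above so that it runs on its own; `harmed_top` and `p` are hypothetical names.

```r
library(ggplot2)

# Values copied from the top-5 table above
harmed_top <- data.frame(
  EventType = c("TORNADO", "EXCESSIVE HEAT", "TSTM WIND", "FLOOD", "LIGHTNING"),
  Harmed    = c(96979, 8428, 7461, 7259, 6046)
)

p <- ggplot(harmed_top, aes(x = reorder(EventType, Harmed), y = Harmed)) +
  geom_col(fill = "red") +  # same bars as geom_bar(stat = "identity")
  coord_flip() +            # horizontal bars keep event names readable
  labs(x = "Event Type", y = "Population Health Impact")

p
```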
Across the United States, which types of events have the greatest economic consequences?
The top 5 events by greatest economic impact are:
## EventType Value
## 1 TORNADOES, TSTM WIND, HAIL 1602500000
## 2 HIGH WINDS/COLD 117500000
## 3 HURRICANE OPAL/HIGH WINDS 110000000
## 4 WINTER STORM HIGH WINDS 65000000
## 5 Heavy Rain/High Surf 15000000
ggplot(damage, aes(reorder(EventType, -Value), Value)) + geom_bar(stat = "identity", fill = "blue") + theme(axis.text.x = element_text(angle = 45, hjust = 1)) + labs(x = "Event Type", y = "Amount of damage (US$)")
The event type with the greatest economic consequences is TORNADOES, TSTM WIND, HAIL, with combined property and crop damage of US$ 1.6 billion.
The event type causing the most harm to population health is TORNADO, with a combined total of 96,979 injuries and fatalities.
Report End