Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.
This project explored the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, which tracks characteristics of major storms and weather events in the United States, including estimates of fatalities, injuries, and property damage, for the years 1950 to 2011. The exploration sought to answer two questions about severe weather events across the United States: which types of events are most harmful with respect to population health, and which have the greatest economic consequences?
The CSV file containing the raw data was downloaded from the internet and loaded into R, where:
- only the columns EVTYPE, FATALITIES, INJURIES, PROPDMG, PROPDMGEXP, CROPDMG, and CROPDMGEXP, deemed necessary to the assessment, were retained and the other columns discarded;
- the EVTYPE variable, which records the type of weather event, was converted to a factor variable to facilitate analysis and graphing; and
- PROPDMGEXP and CROPDMGEXP, the magnitude codes for PROPDMG and CROPDMG, were converted to numeric multipliers by substituting “K”, “M”, and “B” with 1,000, 1,000,000, and 1,000,000,000 respectively, since according to the documentation these were the approved categories; all other levels of these variables were set to zero, so damage amounts recorded with any other code contribute nothing to the totals.
library(dplyr)
library(ggplot2)
library(forcats)
# download data, if necessary
data_url <- "https://d396qusza40orc.cloudfront.net/repdata%2Fdata%2FStormData.csv.bz2"
file_name <- "storm_data_csv.bz2"
if (!file.exists(file_name)) {
  download.file(data_url, destfile = file_name, method = "curl")
}
# load data from disk
raw_data <- read.csv(file_name, header = TRUE)
# process data for analysis: keep the needed columns, make EVTYPE a factor,
# and turn the exponent codes into numeric multipliers
cleaned_data <- raw_data %>%
  select(EVTYPE, FATALITIES, INJURIES,
         PROPDMG, PROPDMGEXP, CROPDMG,
         CROPDMGEXP) %>%
  mutate(EVTYPE = as.factor(EVTYPE),
         PROPDMGEXP = case_when(PROPDMGEXP == "K" ~ 1e3,
                                PROPDMGEXP == "M" ~ 1e6,
                                PROPDMGEXP == "B" ~ 1e9,
                                TRUE ~ 0
         ),
         CROPDMGEXP = case_when(CROPDMGEXP == "K" ~ 1e3,
                                CROPDMGEXP == "M" ~ 1e6,
                                CROPDMGEXP == "B" ~ 1e9,
                                TRUE ~ 0
         )
  )
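The multiplier step can be illustrated on a few rows. The sketch below is base R on a made-up data frame (the `toy` values and the `multipliers` lookup vector are hypothetical names and figures, not taken from the NOAA data); it reproduces the same rule as the pipeline, where documented codes map to a multiplier and anything else maps to zero.

```r
# Hypothetical toy rows; values are illustrative, not taken from the NOAA data
toy <- data.frame(
  PROPDMG    = c(25, 2.5, 1, 10),
  PROPDMGEXP = c("K", "M", "B", ""),
  stringsAsFactors = FALSE
)

# Multipliers for the three documented exponent codes
multipliers <- c(K = 1e3, M = 1e6, B = 1e9)

# Look up each code by name; codes with no match come back as NA,
# which is then replaced by 0 (the same rule as TRUE ~ 0 in case_when)
toy$multiplier <- unname(multipliers[toy$PROPDMGEXP])
toy$multiplier[is.na(toy$multiplier)] <- 0

# Damage in dollars is the recorded amount times its multiplier
toy$damage_usd <- toy$PROPDMG * toy$multiplier
print(toy)
```

Note that the last row, which has a blank code, ends up with zero damage even though PROPDMG is 10. The same happens to any row in the real data whose code is not exactly "K", "M", or "B".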
# plot total casualties for the top 10 weather events
cleaned_data %>%
  group_by(EVTYPE) %>%
  summarise(casualties = sum(FATALITIES + INJURIES, na.rm = TRUE)) %>%
  ungroup() %>%
  top_n(10, casualties) %>%  # rank explicitly by total casualties
  ggplot(aes(x = casualties, y = fct_reorder(EVTYPE, casualties))) +
  geom_col() +
  labs(y = "Severe Weather Event",
       x = "Total Number of Direct or Indirect Casualties",
       title = "Top 10 Severe Weather Events with the Most Casualties in the United States from 1950 - 2011"
  ) +
  theme_classic() +
  theme(axis.title.x = element_text(size = 11, face = "bold", color = "black"),
        axis.title.y = element_text(size = 11, face = "bold", color = "black"),
        plot.title = element_text(size = 12, face = "bold", color = "black")
  )
Tornadoes were the severe weather events that resulted in the most casualties in the United States.
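The aggregation behind the chart (sum fatalities and injuries per event type, then keep the largest totals) can be sketched in base R on a small made-up data frame. The rows and counts below are hypothetical and serve only to show the grouping and ranking steps; they are not NOAA figures.

```r
# Hypothetical toy data; counts are illustrative, not NOAA figures
toy <- data.frame(
  EVTYPE     = c("TORNADO", "FLOOD", "TORNADO", "HEAT", "FLOOD"),
  FATALITIES = c(2, 1, 0, 3, 0),
  INJURIES   = c(10, 0, 5, 1, 2)
)

# Casualties per row, then summed within each event type
toy$casualties <- toy$FATALITIES + toy$INJURIES
totals <- aggregate(casualties ~ EVTYPE, data = toy, FUN = sum)

# Sort by descending total and keep the two largest event types
ranked <- totals[order(-totals$casualties), ]
top2 <- head(ranked, 2)
print(top2)
```

Printing the ranked table alongside the bar chart is a quick way to confirm that the plotted ordering matches the underlying totals.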
# plot economic damage for the top 10 weather events
cleaned_data %>%
  group_by(EVTYPE) %>%
  # property and crop damage, each scaled by its numeric multiplier
  summarise(econ_damage = sum((PROPDMG * PROPDMGEXP) + (CROPDMG * CROPDMGEXP),
                              na.rm = TRUE)) %>%
  ungroup() %>%
  top_n(10, econ_damage) %>%  # rank explicitly by total damage
  ggplot(aes(x = econ_damage, y = fct_reorder(EVTYPE, econ_damage))) +
  geom_col() +
  labs(y = "Severe Weather Event",
       x = "Total Economic Damage (US dollars)",
       title = "Top 10 Severe Weather Events with the Most Economic Damage from 1950 - 2011"
  ) +
  theme_classic() +
  theme(axis.title.x = element_text(size = 11, face = "bold", color = "black"),
        axis.title.y = element_text(size = 11, face = "bold", color = "black"),
        plot.title = element_text(size = 12, face = "bold", color = "black")
  )
Floods were the severe weather events that resulted in the most economic damage in the United States.