This report was prepared for the Peer-graded Assignment: Course Project 2.
This analysis explores the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database, focusing on the impact of severe weather events on population health and the economy. We identify the most harmful event types in terms of fatalities and injuries, and those with the greatest economic consequences.
setwd("C:/Users/baret/Desktop/R Programming Storm Data")
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.3.3
## Warning: package 'ggplot2' was built under R version 4.3.3
## Warning: package 'tibble' was built under R version 4.3.3
## Warning: package 'tidyr' was built under R version 4.3.3
## Warning: package 'readr' was built under R version 4.3.3
## Warning: package 'purrr' was built under R version 4.3.3
## Warning: package 'dplyr' was built under R version 4.3.3
## Warning: package 'stringr' was built under R version 4.3.3
## Warning: package 'forcats' was built under R version 4.3.3
## Warning: package 'lubridate' was built under R version 4.3.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(lubridate)
storm_data <- read_csv("repdata-data-StormData.csv")
## Rows: 902297 Columns: 37
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (18): BGN_DATE, BGN_TIME, TIME_ZONE, COUNTYNAME, STATE, EVTYPE, BGN_AZI,...
## dbl (18): STATE__, COUNTY, BGN_RANGE, COUNTY_END, END_RANGE, LENGTH, WIDTH, ...
## lgl (1): COUNTYENDN
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
storm_data <- storm_data %>%
mutate(
BGN_DATE = as.Date(BGN_DATE, format="%m/%d/%Y %H:%M:%S"),
END_DATE = as.Date(END_DATE, format="%m/%d/%Y %H:%M:%S")
)
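The date conversion above only works if the format string matches the raw text exactly. A minimal sketch on made-up values (the sample strings are assumptions modelled on the month/day/year layout implied by the format string, not rows read from the file) shows what to expect, including how blank entries behave:

```r
# Hypothetical sample values in the same layout as the format string above.
raw_dates <- c("4/18/1950 0:00:00", "11/15/1951 0:00:00", "", NA)

# as.Date() parses with the given format and discards the time-of-day part.
parsed <- as.Date(raw_dates, format = "%m/%d/%Y %H:%M:%S")

# Empty or missing strings become NA rather than raising an error.
print(parsed)
```

Counting `sum(is.na(parsed))` after the conversion is a quick way to check whether unexpected formats slipped through.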
storm_data <- storm_data %>%
  mutate(
    across(c(STATE, COUNTYNAME, EVTYPE, BGN_AZI, BGN_LOCATI, END_AZI, END_LOCATI, PROPDMGEXP, CROPDMGEXP), as.factor),
    across(c(FATALITIES, INJURIES, PROPDMG, CROPDMG), as.numeric)
  )
storm_data <- storm_data %>%
drop_na(FATALITIES, INJURIES, PROPDMG, CROPDMG)
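# Map damage exponent codes to numeric multipliers (vectorized).
convert_exp <- function(exp) {
  # Convert to character first: as.numeric() on a factor returns level codes
  exp <- toupper(as.character(exp))
  # Digit codes ("0" to "8") are used directly as multipliers
  digit <- suppressWarnings(as.numeric(exp))
  case_when(
    exp == "H" ~ 100,
    exp == "K" ~ 1000,
    exp == "M" ~ 1e6,
    exp == "B" ~ 1e9,
    # Blank codes are read as NA by read_csv(); treat them like the symbols
    is.na(exp) | exp %in% c("", "-", "?", "+") ~ 1,
    TRUE ~ digit
  )
}
storm_data <- storm_data %>%
  mutate(
    PROPDMG = PROPDMG * convert_exp(PROPDMGEXP),
    CROPDMG = CROPDMG * convert_exp(CROPDMGEXP)
  ) %>%
  select(-PROPDMGEXP, -CROPDMGEXP)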
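The exponent conversion above can also be pictured as a lookup table. The following is a minimal sketch in base R, not the method used in this report: the name `exp_lookup` and the sample values are hypothetical, and the sketch covers only the letter codes and blanks (digit codes are not handled here).

```r
# Multipliers for the letter codes, mirroring the letter mapping in this report.
exp_lookup <- c(H = 1e2, K = 1e3, M = 1e6, B = 1e9)

# Hypothetical damage figures and exponent codes (not rows from the data set).
damage <- c(25, 2.5, 1.2, 10, 3)
code <- c("K", "M", "b", "", NA)

# Upper-case the codes and look them up by name; blank or missing codes
# find no match and come back as NA, so they are defaulted to 1.
multiplier <- unname(exp_lookup[toupper(code)])
multiplier[is.na(multiplier)] <- 1

# Damage in dollars: value times multiplier.
damage_usd <- damage * multiplier
print(damage_usd)
```

A named vector keeps the mapping in one place, which makes it easier to audit than a chain of nested conditions.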
health_impact <- storm_data %>%
group_by(EVTYPE) %>%
summarize(
total_fatalities = sum(FATALITIES),
total_injuries = sum(INJURIES)
) %>%
arrange(desc(total_fatalities), desc(total_injuries))
top_10_fatalities <- health_impact %>% slice_max(total_fatalities, n = 10)
# The events most harmful with respect to population health are: Tornadoes, Excessive Heat, Flash Floods.
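The group-and-rank step behind this result can be illustrated on a small table. This is a sketch under stated assumptions: the `toy` data frame and its counts are invented for illustration and do not come from the storm database.

```r
library(dplyr)

# Hypothetical toy records; the counts are invented for illustration only.
toy <- tibble(
  EVTYPE = c("TORNADO", "TORNADO", "FLOOD", "HEAT", "HEAT", "HAIL"),
  FATALITIES = c(3, 2, 1, 4, 0, 0),
  INJURIES = c(10, 5, 2, 1, 0, 1)
)

# Sum casualties per event type, then keep the two deadliest types.
toy_top <- toy %>%
  group_by(EVTYPE) %>%
  summarize(
    total_fatalities = sum(FATALITIES),
    total_injuries = sum(INJURIES),
    .groups = "drop"
  ) %>%
  slice_max(total_fatalities, n = 2, with_ties = FALSE)

print(toy_top)
```

Ranking by fatalities first means an event type with many injuries but few deaths can fall outside the top rows, so it may be worth inspecting both columns.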
economic_impact <- storm_data %>%
group_by(EVTYPE) %>%
summarize(
    total_property_damage = sum(PROPDMG, na.rm = TRUE),
    total_crop_damage = sum(CROPDMG, na.rm = TRUE)
) %>%
arrange(desc(total_property_damage + total_crop_damage))
top_10_economic_damage <- economic_impact %>% slice_max(total_property_damage + total_crop_damage, n = 10)
# The events with the greatest economic consequences are: Floods, Hurricanes, and Tornadoes.
This analysis highlights the weather events with the greatest impact on population health and the economy in the United States. Tornadoes, excessive heat, and flash floods are among the deadliest events, while floods, hurricanes, and tornadoes are the costliest in terms of property and crop damage. These findings can inform policy and resource allocation to mitigate the effects of severe weather events.