Stat220 Portfolio Project 3

Clara Fields 2025-02-26

Portfolio Project 3

Billion Dollar Disasters

The function read_disaster_data reads in the data for each year, parses each column as a specific type, and mutates the Name column so that it keeps only the text before the first opening parenthesis.

library(tidyverse)

# Collect the path of every CSV file in the data/ directory
paths <- list.files("data/", pattern = "[.]csv$", full.names = TRUE)

# Read one file: skip its first two lines and parse each column with an explicit type
read_disaster_data <- function(path) {
  read_csv(path, skip = 2, col_types = list(
    "Name" = col_character(),
    "Disaster" = col_factor(),
    "Begin Date" = col_date(format = "%Y %m %d"),
    "End Date" = col_date(format = "%Y %m %d"),
    "CPI-Adjusted Cost" = col_double(),
    "Unadjusted Cost" = col_double(),
    "Deaths" = col_double()
  )) %>%
    # Keep only the part of the name before the first "("
    mutate(Name = str_split_i(Name, "[(]", i = 1))
}
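To illustrate the last step of the function, here is a minimal sketch of what str_split_i() does to a single name. The name below is invented for illustration and is not taken from the data, and the str_trim() call is an optional extra that the function above does not perform.

```r
library(stringr)

# Hypothetical disaster name, invented for this example
example_name <- "Example Storm (April 1980)"

# Split on the literal "(" and keep the first piece
short_name <- str_split_i(example_name, "[(]", i = 1)
short_name

# The first piece keeps the space that preceded "(";
# str_trim() removes it if a clean label is wanted
trimmed_name <- str_trim(short_name)
trimmed_name
```

The split leaves a trailing space on the name, which is harmless for plotting but may matter if names are later matched against other text.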

To make graphing easier, I cleaned up the variable names with janitor::clean_names() and added a year column.

disaster_data <- paths %>%
  map(read_disaster_data) %>%
  bind_rows() %>%
  janitor::clean_names() %>%
  mutate(year = str_sub(begin_date, 1, 4))
  
write_csv(disaster_data, "disaster_data.csv")
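# After clean_names() the column is named year (lowercase) and holds a four-digit year,
# not a full date, so it is read back as an integer
disaster_data_1 <- read_csv("disaster_data.csv", col_types = cols(year = col_integer()))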
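As a small sketch of the renaming step, the toy table below reuses three of the original column headers with invented values and shows the names that janitor::clean_names() produces. The last line shows one possible way to get the year as an integer directly from a date; it is an alternative for illustration, not the approach used above.

```r
library(tibble)
library(janitor)

# Toy one-row table; the headers match the raw files, the values are invented
raw_example <- tibble(
  `Begin Date` = as.Date("1980-04-10"),
  `CPI-Adjusted Cost` = 1234.5,
  `Deaths` = 7
)

# clean_names() lowercases the names and replaces spaces and hyphens with underscores
cleaned_example <- clean_names(raw_example)
names(cleaned_example)

# Alternative year extraction: format the date as a four-digit year, then convert to integer
example_year <- as.integer(format(cleaned_example$begin_date, "%Y"))
example_year
```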

Graph 1: Bar plot of disaster type in each year

disaster_data_1 %>%
  ggplot(mapping = aes(x = year, fill = disaster)) +
  geom_bar() +
  scale_fill_viridis_d() +
  # year is numeric after the CSV round trip, so a continuous x scale applies
  scale_x_continuous(breaks = seq(1980, 2024, by = 2)) +
  labs(
    title = "United States Billion-Dollar Events 1980-2024 (CPI-Adjusted)",
    x = "Year",
    y = "Number of Events"
  ) +
  theme_minimal()
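The bar heights in this graph are row counts per year and disaster type, since geom_bar() counts rows by default. A rough sketch of the same tally with dplyr::count(), using a toy table whose years and disaster types are invented for illustration:

```r
library(dplyr)
library(tibble)

# Toy events table; the rows are invented, not taken from the data
toy_events <- tibble(
  year = c(1980, 1980, 1980, 1981),
  disaster = c("Flood", "Flood", "Drought", "Flood")
)

# One row per year and disaster type, with n holding the number of events
event_counts <- toy_events %>%
  count(year, disaster)
event_counts
```

Each row of event_counts corresponds to one colored segment of a stacked bar.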

Graph 2: Time series plot of combined cost for each year

The variable total_cost represents the total CPI-adjusted cost of disasters in each year, divided by 1,000 so that it is expressed in billions of dollars.

disaster_data_2 <- disaster_data_1 %>%
  group_by(year) %>%
  mutate(total_cost = sum(cpi_adjusted_cost) / 1000)

disaster_data_2 %>%
  ungroup() %>%
  # total_cost repeats on every row of a year; keep one row per year for plotting
  distinct(year, total_cost) %>%
  ggplot(mapping = aes(x = year, y = total_cost)) +
  geom_point(color = "red4") +
  geom_line(color = "red4") +
  scale_x_continuous(breaks = seq(1980, 2024, by = 2)) +
  labs(
    title = "Combined Cost of Disasters per Year",
    x = "Year",
    y = "Combined Cost in Billions"
  ) +
  theme_minimal()
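If only the yearly totals are needed, summarise() is one possible alternative to mutate(), because it returns a single row per year. The sketch below uses a toy table with invented costs to show the calculation; it is an illustration, not the code used for the graph.

```r
library(dplyr)
library(tibble)

# Toy cost table; the values are invented for illustration
toy_costs <- tibble(
  year = c(1980, 1980, 1981),
  cpi_adjusted_cost = c(1000, 2500, 4000)
)

# Sum the costs within each year and divide by 1000, as in the code above
yearly_totals <- toy_costs %>%
  group_by(year) %>%
  summarise(total_cost = sum(cpi_adjusted_cost) / 1000, .groups = "drop")
yearly_totals
```

Here the two 1980 rows collapse into one total of 3.5 and the single 1981 row gives 4, so the result has exactly one point per year for a line plot.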