Executive summary

How has the number of disaster declarations changed over time? The number of disasters declared has increased significantly since 1953. Peaks in declarations are seen around 2005 (Hurricane Katrina).

What types of disasters occur most frequently in the U.S.? Storms, floods, and hurricanes are the most commonly declared disasters.

Which states experience the most disaster declarations? Texas, Missouri, Kentucky, Virginia, and Oklahoma have the highest number of disaster declarations.

Are there seasonal patterns in disaster occurrences? September has the highest number of disasters, aligning with hurricane season. Winter months (December–February) show more occurrences of snow and ice-related disasters.

The findings from this analysis underscore the importance of proactive disaster management, infrastructure resilience, and climate adaptation policies to minimize the impact of future disasters.

Data background

The dataset, published by the Federal Emergency Management Agency (FEMA) and obtained from Kaggle, catalogs all federally declared disasters in the United States from 1953 to 2017. Each record includes details such as the disaster number, state affected, declaration type, dates, and programs activated. This comprehensive dataset provides insights into the frequency, types, and geographic distribution of disasters over time, aiding in analysis and policy development.

Data cleaning

library(tidyverse)
library(treemapify)
library(lubridate)

disaster_data <- read_csv("c:/CS134/FinalProject/Federal Emergencies and Disasters.csv")

# Remove rows with any NA values, then remove duplicate rows.
# The two steps are chained so that distinct() does not overwrite the drop_na() result.
disaster_data_cleaned <- disaster_data %>%
  drop_na() %>%
  distinct()

# Convert "Declaration Date" to Date Format & Extract Year
disaster_data <- disaster_data %>%
  mutate(Declaration_Date = as.Date(`Declaration Date`, format="%m/%d/%Y"),
         Year = year(Declaration_Date))
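
Before dropping rows, it can help to see how many values are missing in each column and whether any date strings fail to parse. The following is a minimal sketch on a hypothetical three-row sample (the values are invented for illustration; only the column names mirror the code above), not the checks actually run for this report.

```r
library(tibble)
library(dplyr)
library(lubridate)

# Hypothetical sample; the second date is deliberately invalid
sample_data <- tibble(
  `Declaration Date` = c("08/29/2005", "13/45/2005", NA),
  `Disaster Type`    = c("Hurricane", "Flood", "Storm"),
  State              = c("LA", "TX", NA)
)

# Count missing values per column to see what drop_na() would remove
na_counts <- sample_data %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Parse dates; strings that do not match the format become NA
parsed <- sample_data %>%
  mutate(Declaration_Date = as.Date(`Declaration Date`, format = "%m/%d/%Y"),
         Year = year(Declaration_Date))

# Rows whose date string was present but failed to parse
bad_dates <- parsed %>%
  filter(!is.na(`Declaration Date`), is.na(Declaration_Date))
```

Inspecting na_counts and bad_dates first makes it clearer how many rows a blanket drop_na() would discard and why.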

Individual figures

Figure 1

I used a stacked bar chart to show the yearly trend of disaster declarations categorized by type, making it easy to compare changes over time. This visualization was chosen because it effectively highlights the increasing frequency of disasters and notable spikes, such as Hurricane Katrina in 2005.

disasters_by_type <- disaster_data %>%
  group_by(Year, `Disaster Type`) %>%
  summarise(Count = n()) %>%
  ungroup()

ggplot(disasters_by_type, aes(x = Year, y = Count, fill = `Disaster Type`)) +
  geom_bar(stat = "identity") +
  scale_x_continuous(breaks = seq(1950, max(disasters_by_type$Year), by = 10)) +
  scale_fill_viridis_d(option = "plasma") +
  labs(title = "Annual Disaster Declarations (1953-2017)",
       subtitle = "Stacked Bar Chart of Declared Disasters by Type",
       x = "Year",
       y = "Number of Disasters Declared",
       fill = "Disaster Type") +
  theme_minimal() +
  theme(plot.title = element_text(size = 16, face = "bold"),
        axis.text.x = element_text(angle = 45, hjust = 1)) +
  annotate("text", x = 2004, y = max(disasters_by_type$Count) - 50,
           label = "Hurricane Katrina (2005)", color = "red", size = 5)

Figure 2

A treemap was selected to represent the top 10 disaster types, emphasizing their relative frequency. This type of visualization clearly shows which disasters dominate federal declarations, with storms and floods leading the list and hurricanes also ranking high.

disaster_counts <- disaster_data %>%
  count(`Disaster Type`, sort = TRUE) %>%
  slice_max(n, n = 10)  # Select top 10 disaster types

top_disaster <- disaster_counts %>% slice_max(n, n = 1)

ggplot(disaster_counts, aes(area = n, fill = `Disaster Type`, label = `Disaster Type`)) +
  geom_treemap() +
  geom_treemap_text(color = "white", place = "centre", grow = TRUE) +
  geom_treemap(data = top_disaster, fill = NA, color = "red", size = 2) +
  scale_fill_viridis_d(option = "plasma") +
  labs(title = "Top 10 Disaster Types") +
  theme_minimal() +
  theme(legend.position = "none") +
  annotate("text", x = 0.5, y = -0.1, 
             label = paste(top_disaster$`Disaster Type`, 
                           "(", top_disaster$n, "cases)"), 
             color = "red", size = 6, fontface = "bold")
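
The treemap conveys relative frequency through tile area; the same information can be expressed numerically as each type's share of all declarations. Here is a minimal sketch on hypothetical records (the six rows are invented for illustration, and the Share column is an assumption, not part of the original analysis).

```r
library(tibble)
library(dplyr)

# Hypothetical records: one row per declaration
records <- tibble(
  `Disaster Type` = c("Storm", "Storm", "Storm", "Flood", "Flood", "Hurricane")
)

# Count declarations per type and convert the counts to shares of the total
type_shares <- records %>%
  count(`Disaster Type`, sort = TRUE) %>%
  mutate(Share = n / sum(n))
```

A table like type_shares could accompany the treemap when exact proportions matter more than a visual impression.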

Figure 3

A horizontal bar chart was used to display disaster declarations by state, highlighting regional differences. This format makes it easy to compare which states experience the most disasters, with Texas and Missouri standing out. This may be due to their geographical exposure to hurricanes, tornadoes, and severe storms.

state_disasters <- disaster_data %>%
  count(State, sort = TRUE) %>%
  slice_max(n, n = 20)  # Select top 20 states

most_disaster_state <- state_disasters %>% slice_max(n, n = 1)

ggplot(state_disasters, aes(x = n, y = reorder(State, n), fill = n)) +
  geom_col() +
  geom_col(data = most_disaster_state, fill = "red", color = "black", size = 1) +
  scale_fill_viridis_c(option = "plasma", direction = -1) +  
  labs(title = "Top 20 States with Most Disasters Declared",
       x = "Number of Disasters",
       y = "State") +
  theme_minimal() +
  theme(legend.position = "right",  
        plot.title = element_text(size = 16, face = "bold"),
        axis.text.y = element_text(size = 10)) +  
  guides(fill = guide_colorbar(title = "Disaster Count")) +
  annotate("text", x = most_disaster_state$n + 50, y = most_disaster_state$State, 
           label = paste("Most Disasters:", most_disaster_state$State, 
                         "(", most_disaster_state$n, "cases)"), 
           color = "red", size = 5, fontface = "bold")

Figure 4

The heatmap reveals seasonal and long-term trends in disaster occurrences, with some months consistently experiencing higher disaster declarations. This pattern suggests the influence of seasonal weather conditions, such as hurricanes in late summer and winter storms in colder months. Identifying these trends can improve disaster response planning and resource allocation.

disaster_data <- disaster_data %>%
  mutate(Declaration_Date = as.Date(`Declaration Date`, format="%m/%d/%Y"),
         Year = year(Declaration_Date),
         Month = month(Declaration_Date, label = TRUE, abbr = FALSE))

disasters_by_month_year <- disaster_data %>%
  count(Year, Month) %>%
  spread(key = Year, value = n, fill = 0)

disasters_melted <- disasters_by_month_year %>%
  gather(key = "Year", value = "Count", -Month) %>%
  mutate(Year = as.integer(Year))

ggplot(disasters_melted, aes(x = Year, y = Month, fill = Count)) +
  geom_tile() +
  scale_fill_viridis_c(option = "inferno", direction = -1) +
  labs(title = "Monthly Disaster Trends Over the Years",
       x = "Year",
       y = "Month",
       fill = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
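
The spread() and gather() round trip above exists to give every year and month combination a row, with 0 where no disasters were declared. As an alternative sketch, tidyr's complete() can do this in one step. The counts below are hypothetical and only illustrate the technique; this is not the code used to produce the figure.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical counts with one missing combination (2005, "February")
counts <- tibble(
  Year  = c(2004L, 2004L, 2005L),
  Month = factor(c("January", "February", "January"),
                 levels = c("January", "February")),
  n     = c(3L, 1L, 5L)
)

# complete() adds each absent Year x Month combination and fills its count with 0
counts_full <- counts %>%
  complete(Year, Month, fill = list(n = 0L)) %>%
  rename(Count = n)
```

Because Year never becomes a column name in this version, it keeps its integer type and needs no conversion afterwards.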

Figure 5

A bar chart was used to compare disaster occurrences by month. September has the highest number of disaster declarations, likely due to peak hurricane activity. The clear seasonal variation highlights the need for proactive disaster preparedness, especially in high-risk months. Strengthening emergency response measures during these periods can help mitigate disaster impacts.

seasonal_disasters <- disaster_data %>%
  mutate(Month = month(Declaration_Date, label = TRUE, abbr = TRUE)) %>%
  count(Month, sort = FALSE)

highlight_september <- seasonal_disasters %>% filter(Month == "Sep")

ggplot(seasonal_disasters, aes(x = Month, y = n, fill = n)) +
  geom_col() +
  scale_fill_viridis_c(option = "magma", direction = -1) +
  geom_text(data = highlight_september, aes(label = n), vjust = -0.5, color = "red", size = 5, fontface = "bold") +
  labs(title = "Seasonal Distribution of Disasters",
       x = "Month",
       y = "Number of Disasters") +
  theme_minimal() +
  theme(plot.title = element_text(size = 16, face = "bold"))
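
The highlighted month above is fixed to "Sep" in the filter. A small sketch of how the peak month could instead be derived from the counts is shown below, using hypothetical monthly totals (the numbers are invented for illustration and peak_month is an assumed name).

```r
library(tibble)
library(dplyr)

# Hypothetical monthly totals for three months
seasonal <- tibble(
  Month = factor(c("Aug", "Sep", "Oct"), levels = month.abb),
  n     = c(40L, 55L, 30L)
)

# Select the row with the largest count instead of naming the month by hand
peak_month <- seasonal %>%
  slice_max(n, n = 1)
```

Passing peak_month as the data argument of geom_text() would keep the label on the tallest bar even if the underlying data changed.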