1. Research Question & Why It Matters

Research Question

This study addresses the following central question:

“Has the spatial diversity of FEMA disaster declarations increased from 1953 to 2017?”

To answer this, we examine three sub-questions:

Has the number of U.S. states affected by disasters increased over time?
Has the annual diversity of (disaster type × state) combinations increased?
Has the geographic centroid of disaster occurrences shifted in a particular direction?

Why We Should Care

Previous studies primarily focus on how many disasters occur, while the spatial expansion and diversification of disasters remain understudied.
This research is important for several reasons:

Helps evaluate whether national disaster response strategies should shift from region-specific to nationwide approaches
Provides indirect insights into the spatial consequences of climate change
Offers evidence for revising federal budget allocation and resource distribution strategies

Spatial diversity cannot be captured by simple frequency counts.
This study fills a meaningful gap in existing literature by addressing long-term spatial patterns in disaster declarations.

2. Data Source

Dataset

Source: FEMA Disaster Declarations Summary
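Period: 1953–2017 (the last declaration in the file is dated 2017-02-14, so 2017 is a partial year)
Unit of analysis: Individual declaration records (Emergency and Major Disaster declarations)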
Provider: Federal Emergency Management Agency (FEMA)

Why We Use This Data

A long-term administrative dataset that enables comparison of disaster patterns across the entire U.S. since 1953
Appropriate for analyzing spatial and temporal patterns

Data Processing

Only variables necessary for analysis were selected
Original structure kept; the only transformations are parsing Declaration.Date as a date and deriving a year column from it
State centroid coordinates taken from R’s built-in state.center (approximate locations)

3. Data Description

Main Variables of Interest

State: Location of disaster occurrence; essential for geographic spread analysis
Disaster Type: Hurricane, storm, flood, wildfire, etc.; required for diversity measurement
Declaration Date: Enables long-term trend analysis
State centroid (lon, lat): Looked up from State; used to calculate the shifting geographic centroid of disasters

Call Data & Functions

# Load packages
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(ggplot2)
library(lubridate)

## 
## Attaching package: 'lubridate'

## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union

library(maps)

# Read CSV File
disaster <- read.csv("C:/Users/user/Desktop/data/database.csv",
                     stringsAsFactors = FALSE)

# Check variable names and types
str(disaster)

## 'data.frame':    46185 obs. of  14 variables:
##  $ Declaration.Number              : chr  "DR-1" "DR-2" "DR-3" "DR-4" ...
##  $ Declaration.Type                : chr  "Disaster" "Disaster" "Disaster" "Disaster" ...
##  $ Declaration.Date                : chr  "05/02/1953" "05/15/1953" "05/29/1953" "06/02/1953" ...
##  $ State                           : chr  "GA" "TX" "LA" "MI" ...
##  $ County                          : chr  "" "" "" "" ...
##  $ Disaster.Type                   : chr  "Tornado" "Tornado" "Flood" "Tornado" ...
##  $ Disaster.Title                  : chr  "Tornado" "Tornado and Heavy Rainfall" "Flood" "Tornado" ...
##  $ Start.Date                      : chr  "05/02/1953" "05/15/1953" "05/29/1953" "06/02/1953" ...
##  $ End.Date                        : chr  "05/02/1953" "05/15/1953" "05/29/1953" "06/02/1953" ...
##  $ Close.Date                      : chr  "06/01/1954" "01/01/1958" "02/01/1960" "02/01/1956" ...
##  $ Individual.Assistance.Program   : chr  "Yes" "Yes" "Yes" "Yes" ...
##  $ Individuals...Households.Program: chr  "No" "No" "No" "No" ...
##  $ Public.Assistance.Program       : chr  "Yes" "Yes" "Yes" "Yes" ...
##  $ Hazard.Mitigation.Program       : chr  "Yes" "Yes" "Yes" "Yes" ...

# Parse the declaration date (month/day/year) and extract the year
disaster$Declaration.Date <- as.Date(disaster$Declaration.Date, format = "%m/%d/%Y")
disaster$year <- year(disaster$Declaration.Date)

# Create a subset with only the essential variables for analysis (for readability)
disaster_clean <- disaster %>%
  select(year,
         State,
         Disaster.Type,
         Declaration.Date) %>%
  filter(!is.na(year),
         !is.na(State),
         State != "")

# Descriptive statistics of key variables
summary(disaster_clean)

##       year         State           Disaster.Type      Declaration.Date    
##  Min.   :1953   Length:46185       Length:46185       Min.   :1953-05-02  
##  1st Qu.:1993   Class :character   Class :character   1st Qu.:1993-03-17  
##  Median :2003   Mode  :character   Mode  :character   Median :2003-03-20  
##  Mean   :1998                                         Mean   :1998-10-26  
##  3rd Qu.:2008                                         3rd Qu.:2008-07-09  
##  Max.   :2017                                         Max.   :2017-02-14

State and Disaster Type Variables

State: Two-letter state code (e.g., TX, CA, NY)

Disaster.Type: Disaster categories such as Tornado, Flood, Hurricane

year: Year of the disaster declaration (1953–2017)

These three variables each play a key role:

State → “How many regions are affected?” (spatial extent)

Disaster.Type × State → “Which disasters occur in which regions?” (pattern diversity)

year → “How do these patterns change over time?” (long-term trends)

4. Results

We examine the research questions step-by-step using three primary plots.

4.1 Plot 1: Changes in the Spatial Extent of Disasters

4.1.1 Code for Plot 1

Plot 1 counts the states with at least one disaster declaration in each year and shows the counts as a bar chart, with an overlaid smooth line for the general trend.

X-axis: Year
Y-axis: Number of unique states affected
Bars: Actual yearly state counts
Dark red LOESS curve: Long-term trend

# Number of States Affected Per Year
year_state_count <- disaster_clean %>%
  group_by(year) %>%
  summarise(
    n_states = n_distinct(State),
    .groups = "drop") %>%
  filter(!is.na(year))

head(year_state_count, 100)

## # A tibble: 65 × 2
##     year n_states
##    <dbl>    <int>
##  1  1953       11
##  2  1954       17
##  3  1955       17
##  4  1956       14
##  5  1957       16
##  6  1958        7
##  7  1959        6
##  8  1960       10
##  9  1961       12
## 10  1962       20
## # ℹ 55 more rows

# Plot 1: Bar + Smooth line
ggplot(year_state_count, aes(x = year, y = n_states)) +
  geom_col(fill = "steelblue", alpha = 0.7) +
  geom_smooth(method = "loess", se = FALSE, color = "darkred", linewidth = 1) +
  labs(
    title = "Number of States Affected by Disasters per Year",
    x = "Year",
    y = "Number of States with Disaster Declarations") +
  theme_minimal(base_size = 12)

## `geom_smooth()` using formula = 'y ~ x'

4.1.2 Plot 1 Result

Key Observations

Continuous increase from the late 1950s to early 2000s
Peak in early 2000s
Slight decline afterward

Core Interpretation

Disaster impact has expanded spatially over the long term
Despite yearly fluctuations, the pattern is clear: 1950s < 1980s < 2000s
Some years show more than 40 states affected → Indicates U.S. disasters increasingly impact many regions simultaneously
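
The decade ordering claimed for Plot 1 can be checked numerically by averaging n_states within each decade. The following is a minimal sketch in base R, not part of the original analysis. It uses only the ten rows of year_state_count printed earlier, so it covers 1953–1962 and says nothing about later decades. The names first_rows and decade_means are introduced here for illustration.

```r
# First ten rows of year_state_count, copied from the printed output (1953-1962)
first_rows <- data.frame(
  year     = 1953:1962,
  n_states = c(11, 17, 17, 14, 16, 7, 6, 10, 12, 20)
)

# Label each year with its decade (1953 -> 1950, 1962 -> 1960)
first_rows$decade <- floor(first_rows$year / 10) * 10

# Mean number of affected states per decade
decade_means <- aggregate(n_states ~ decade, data = first_rows, FUN = mean)
print(decade_means)
```

Running the same two steps on the full year_state_count table would give one mean per decade, which could then be compared directly against the ordering read off the plot.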

4.2 Plot 2: Complexity of Hazard Structure

Plot 2 counts the unique disaster-type × state combinations in each year. This count serves as a simple measure of the complexity and diversity of the national disaster “portfolio”.

4.2.1 Calculating the Measure for Plot 2

We use the simple measure: unique combinations per year.

If in one year the combinations were (Flood, TX), (Flood, CA), (Hurricane, FL) → then the count = 3
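
The worked example above can be reproduced on toy data. This is a small sketch in base R under stated assumptions: the data frame toy and its four records are invented for illustration, and a repeated (Flood, TX) record is included to show that duplicates do not raise the count.

```r
# Toy records for one hypothetical year; (Flood, TX) appears twice on purpose
toy <- data.frame(
  year          = c(2000, 2000, 2000, 2000),
  Disaster.Type = c("Flood", "Flood", "Hurricane", "Flood"),
  State         = c("TX", "CA", "FL", "TX"),
  stringsAsFactors = FALSE
)

# Build the same type_state key the report uses
toy$combo <- paste(toy$Disaster.Type, toy$State, sep = "_")

# Count distinct keys within each year
n_unique_combos <- tapply(toy$combo, toy$year, function(x) length(unique(x)))
print(n_unique_combos)
```

Because the measure counts distinct keys, a state with many county-level records for the same disaster type still contributes only one combination per year.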

# Unique Type × State Combinations Over Time
combo_diversity <- disaster_clean %>%
  mutate(
    combo = paste(Disaster.Type, State, sep = "_")) %>%
  group_by(year) %>%
  summarise(
    n_unique_combos = n_distinct(combo),
    .groups = "drop") %>%
  filter(!is.na(year))

head(combo_diversity)

## # A tibble: 6 × 2
##    year n_unique_combos
##   <dbl>           <int>
## 1  1953              12
## 2  1954              17
## 3  1955              17
## 4  1956              15
## 5  1957              16
## 6  1958               7

4.2.2 Code for Plot 2

ggplot(combo_diversity, aes(x = year, y = n_unique_combos)) +
  geom_line(color = "darkorange", linewidth = 1) +
  geom_point(color = "darkorange", size = 1) +
  labs(
    title = "Yearly Diversity of (Disaster Type × State) Combinations", 
    x = "Year",
    y = "Number of Unique Combinations") +
  theme_minimal(base_size = 12)

4.2.3 Plot 2 Result

Key Interpretation

Large increase from the 1950s (10–20 combos) to the 2000s (60–90 combos)
Disaster types spread across more regions
Trend shows long-term growth despite fluctuations
Recent decreases may be temporary or reflect external factors

Conclusion

Not only has the range expanded, but the U.S. hazard environment has become far more complex and multi-layered.

4.3 Plot 3: Spatial Movement of Disasters

Summary of how the spatial centers of each disaster type have moved over time:

Annual Centroids by Disaster Type (Top 5): 1. Fire, 2. Flood, 3. Snow, 4. Storm, 5. Hurricane

4.3.1 Calculating Yearly Centroids

Use the state center coordinates (state.center) for the 50 U.S. states to obtain approximate latitude and longitude for each state.
Map each FEMA disaster declaration to the corresponding state’s centroid coordinates. Records whose State code is not one of the 50 abbreviations in state.abb get no coordinates from the join and are dropped by the filter.
For each disaster type and year, compute the average latitude (lat) and longitude (lon) of all matching declaration records to obtain the yearly centroid.
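
The three steps above can be illustrated on a tiny example. This is a minimal sketch in base R, assuming three invented records (two in Texas, one in Florida) in a single hypothetical year. The names toy, toy_geo and centroid are illustrative only.

```r
# Lookup table of approximate state centers from R's built-in datasets
state_coords <- data.frame(
  State = state.abb,
  lon   = state.center$x,
  lat   = state.center$y
)

# Hypothetical declaration records: two in Texas, one in Florida
toy <- data.frame(
  year  = c(2005, 2005, 2005),
  State = c("TX", "TX", "FL")
)

# Attach coordinates, then average them within the year
toy_geo  <- merge(toy, state_coords, by = "State")
centroid <- aggregate(cbind(lon, lat) ~ year, data = toy_geo, FUN = mean)
print(centroid)
```

Each record counts once in the mean, so the centroid here sits two thirds of the way toward Texas. In the real data, a state with many records in a year pulls the centroid in the same way.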

# Lookup table: state abbreviation and approximate center (longitude/latitude)
state_coords <- data.frame(
  State = state.abb,
  lon   = state.center$x,
  lat   = state.center$y)
# 3. Match FEMA data with state centroid coordinates (latitude/longitude)
dis_geo <- disaster %>%
  left_join(state_coords, by = "State") %>%
  filter(!is.na(lon), !is.na(lat), !is.na(Disaster.Type), !is.na(year))

# 4. Extract the top 5 most frequent disaster types
top5_types <- dis_geo %>%
  count(Disaster.Type, sort = TRUE) %>%
  slice_head(n = 5) %>%
  pull(Disaster.Type)

# 5. Compute yearly centroids (mean latitude/longitude) for top 5 disaster types
centroids_top5 <- dis_geo %>%
  filter(Disaster.Type %in% top5_types) %>%
  group_by(Disaster.Type, year) %>%
  summarise(
    centroid_lon = mean(lon, na.rm = TRUE),
    centroid_lat = mean(lat, na.rm = TRUE),
    .groups = "drop") %>%
  arrange(Disaster.Type, year)

4.3.2 Code for Plot 3

# 1. U.S. map data
us_map <- map_data("state")

# 2. Define a function to plot centroid movement for each disaster type
plot_centroid_type <- function(type_label) {
  
  # Filter for the selected disaster type
  df_type <- centroids_top5 %>%
    filter(Disaster.Type == type_label)
  
  ggplot() +
    # Background U.S. map
    geom_polygon(
      data = us_map,
      aes(x = long, y = lat, group = group),
      fill = "white",
      color = "gray70",
      inherit.aes = FALSE) +
    geom_path(
      data = df_type,
      aes(x = centroid_lon, y = centroid_lat, group = 1),
      color = "gray",
      linewidth = 0.5) +
    # Centroid points with color mapped to year
    geom_point(
      data = df_type,
      aes(x = centroid_lon, y = centroid_lat, color = year),
      size = 2) +
    scale_color_gradient(
      low = "blue",
      high = "orange",
      name = "Year") +
    labs(
      title = paste0("Top 5 Disaster: ", type_label),
      subtitle = "FEMA Disaster Declarations, 1953–2017",
      x = "Longitude",
      y = "Latitude") +
    coord_fixed(1.3) +
    theme_minimal(base_size = 11) +
    theme(panel.grid = element_line(linewidth = 0.2, color = "gray85"))
}

# 3. Generate plots for the top 5 disaster types
plot_centroid_type("Fire")

plot_centroid_type("Flood")

plot_centroid_type("Snow")

plot_centroid_type("Storm")

plot_centroid_type("Hurricane")

4.3.3 Plot 3 Result

Fire

Early (1960s): Center in the Northeast
Mid-period (1980s): Moved toward central U.S.
Recent (2000s): Concentrated in the West/Southwest → Clear long-term shift East → Central → West

Flood

Highly concentrated in the Midwest
Early and recent centroids cluster in similar regions
No strong directional long-term movement → Flood risk appears anchored in specific inland regions

Snow

Spread across a wide East–Central range
Early (blue) and recent (orange) points intermixed
No consistent directional shift → High yearly variability

Storm

Clustered in the central inland region
No strong long-term directional pattern
Some recent eastward/southward dispersion → Essentially a stable hazard zone with minor variation

Hurricane

Early (1960s): Southernmost parts of the U.S. coastline
Recent (2000s): Shifted northward and northeastward → Clear shift Southeast → East/Northeast coastline
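
The directional readings above come from visual inspection of the maps. One rough way to put a number on them would be to regress each centroid coordinate on year and read the slope as degrees of drift per year. The sketch below is an assumption about how such a check could look, not a step in the original analysis. The helper drift() and the toy_centroids values are made up for illustration.

```r
# Hypothetical yearly centroids for one disaster type (made-up values)
toy_centroids <- data.frame(
  year         = 2000:2004,
  centroid_lon = c(-90, -91, -92, -93, -94),
  centroid_lat = c(35, 35.5, 36, 36.5, 37)
)

# Linear drift in degrees per year for longitude and latitude
drift <- function(df) {
  lon_fit <- lm(centroid_lon ~ year, data = df)
  lat_fit <- lm(centroid_lat ~ year, data = df)
  c(lon_per_year = unname(coef(lon_fit)["year"]),
    lat_per_year = unname(coef(lat_fit)["year"]))
}

d <- drift(toy_centroids)
print(d)
```

With the report's objects, a call such as drift(subset(centroids_top5, Disaster.Type == "Fire")) would apply the same idea to one type. A negative longitude slope means westward movement and a positive latitude slope means northward movement. A straight-line fit is a coarse summary and would miss a path that moves in stages.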

5. Discussion & Overall Conclusion

5.1 Integrated Interpretation

Combining all analyses, long-term structural changes in U.S. disaster patterns are as follows:

Expansion of Spatial Extent (Plot 1) → More states are being affected over time

Increase in Structural Complexity (Plot 2) → More diverse types of disasters occurring across more regions

Geographic Shifts in Disaster Centers (Plot 3) → Some types (Fire, Hurricane) show clear directional movement → Others (Flood, Storm) remain stable → Snow is highly variable

→ Together, these indicators show a structural reconfiguration of national disaster patterns compared to the past.

5.2 Limitations & Further Work

Limitations

Dataset includes only declared FEMA disasters → Non-declared events excluded
state.center provides approximate, not precise, geographic locations; per the R documentation, Alaska and Hawaii are placed just off the West Coast rather than at their true positions

Further Work

Weighted centroid analysis incorporating cost or damage severity
Clustering analysis based on disaster occurrence patterns
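
The weighted centroid idea listed under Further Work could be sketched as follows. This is only an illustration in base R: the cost column is hypothetical and does not exist in the FEMA file used here, and the two records are invented.

```r
# Lookup table of approximate state centers from R's built-in datasets
state_coords <- data.frame(
  State = state.abb,
  lon   = state.center$x,
  lat   = state.center$y
)

# Hypothetical records; `cost` is an invented severity weight
toy <- data.frame(
  State = c("TX", "FL"),
  cost  = c(1, 3)
)

toy_geo <- merge(toy, state_coords, by = "State")

# Centroid in which each record counts in proportion to its cost
weighted_centroid <- c(
  lon = weighted.mean(toy_geo$lon, w = toy_geo$cost),
  lat = weighted.mean(toy_geo$lat, w = toy_geo$cost)
)
print(weighted_centroid)
```

Compared with the unweighted mean used in Plot 3, this version lets a few severe events move the centroid more than many minor ones, which is the behavior the Further Work item is after.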

Team Final Report

Spatio-Temporal Changes in U.S. Federal Disasters (1953–2017)

박미현, 김현서

2025-12-04