Stable-CCA Crash Typology Discovery Using clustrd

1 Purpose

This notebook implements a Stability-Regularized Cluster Correspondence Analysis (Stable-CCA) framework for crash typology discovery. Each row of the data represents one crash record, and each crash is described using categorical roadway, traffic control, environmental, human factor, and severity attributes.

The framework uses clustrd::clusmca() to estimate Cluster Correspondence Analysis and adds a bootstrap-based stability layer. The purpose is to move beyond a one-time exploratory CCA and develop a reproducibility-aware crash typology method.

2 Research Questions

2.1 RQ1: Stable crash typology discovery

Can Cluster Correspondence Analysis identify stable and reproducible crash typologies from categorical crash records?

2.2 RQ2: Engineering interpretability

Do stable crash typologies reveal distinct roadway, traffic control, human factor, and severity mechanisms that can support engineering countermeasure selection?

2.3 RQ3: Countermeasure readiness

Which stable crash typologies are most actionable from a Safe System and roadway safety engineering perspective?

3 Methodological Contribution

This study introduces Stable-CCA, a bootstrap-regularized extension of Cluster Correspondence Analysis for crash data. The method:

  1. Treats each crash record as a categorical profile.
  2. Applies Cluster Correspondence Analysis using clustrd::clusmca().
  3. Repeats CCA clustering over bootstrap samples.
  4. Aligns bootstrap cluster labels to the reference solution.
  5. Computes crash-level, cluster-level, and model-level stability.
  6. Selects the number of clusters using a composite stability score.
  7. Profiles stable crash typologies and computes a countermeasure readiness score.

4 Packages

# Install packages if needed
# install.packages(c("tidyverse", "clustrd", "mclust", "clue", "janitor", "ggplot2", "scales", "rmdformats"))

library(tidyverse)
library(clustrd)
library(mclust)
library(clue)
library(janitor)
library(ggplot2)
library(scales)

5 Data

The following sample dataset is based on the crash-record structure provided by the user. In the full study, replace this section with a file import, such as read.csv() or readr::read_csv().

crash_df <- tribble(
  ~Wthr, ~Lght, ~Rd_l, ~Trf_, ~Int_, ~FHE_, ~Oth_, ~Rd_c, ~PSL, ~Sesn, ~Cnt_, ~Ethn, ~Gndr, ~Hlmt, ~Dr__, ~Pp_G, ~Svrt,
  "Clear",  "Daylight", "Straight, level", "Signal light", "Int related", "Omv vehicle going straight", "Other", "City street", "30-40 mph", "Winter", "Other", "Hispanic", "Male", "Not worn", "Other", "250,000 pop. And over", "BC",
  "Clear",  "Dark, not lighted", "Straight, level", "Stop sign", "Int related", "Omv vehicle turning left", "Other", "City street", "45-60 mph", "Winter", "Pedestrian ftyrow to vehicle", "Hispanic", "Male", "Not worn", "Unlicensed", "100,000 - 249,999 pop.", "BC",
  "Clear",  "Daylight", "Straight, level", "Marked lanes", "Non Int", "Omv vehicle going straight", "Other", "Other", "30-40 mph", "Winter", "Other", "Asian", "Male", "Not worn", "Other", "Other", "BC",
  "Cloudy", "Daylight", "Straight, level", "Stop sign", "Int", "Omv vehicle turning left", "Other", "City street", "30-40 mph", "Winter", "Other", "White", "Male", "Not worn", "Other", "Other", "BC",
  "Clear",  "Daylight", "Straight, level", "Signal light", "Int related", "Omv vehicle going straight", "Inattentive driving", "City street", "30-40 mph", "Winter", "Other", "White", "Female", "Not worn", "Other", "Other", "BC",
  "Clear",  "Daylight", "Straight, level", "Stop sign", "Int", "Omv vehicle going straight", "Other", "City street", "30-40 mph", "Winter", "Other", "Hispanic", "Female", "Not worn", "Other", "Other", "BC",
  "Cloudy", "Daylight", "Straight, level", "Stop sign", "Int", "Omv vehicle going straight", "Inattentive driving", "City street", "30-40 mph", "Winter", "Driver inattention", "White", "Male", "Not worn", "Other", "50,000 - 99,999 pop", "O",
  "Clear",  "Daylight", "Straight, level", "Other", "Driveway accs", "Omv vehicle turning left", "veh leaving driveway", "City street", "30-40 mph", "Winter", "Other", "White", "Male", "Not worn", "Other", "Other", "BC",
  "Clear",  "Daylight", "Straight, level", "Signal light", "Non Int", "Omv vehicle going straight", "Other", "Other", "45-60 mph", "Winter", "Other", "Hispanic", "Male", "Not worn", "Other", "250,000 pop. And over", "BC",
  "Cloudy", "Daylight", "Straight, level", "Marked lanes", "Int related", "Omv vehicle going straight", "Other", "City street", "45-60 mph", "Winter", "Not applicable", "White", "Male", "Not worn", "Other", "Other", "BC",
  "Clear",  "Daylight", "Straight, level", "None", "Non Int", "Omv vehicle going straight", "Other", "City street", "30-40 mph", "Winter", "Other", "White", "Female", "Not worn", "Unlicensed", "Other", "BC"
)

crash_df <- crash_df %>%
  mutate(Crash_ID = paste0("C", row_number())) %>%
  relocate(Crash_ID)

vars_for_cca <- c(
  "Wthr", "Lght", "Rd_l", "Trf_", "Int_", "FHE_", "Oth_",
  "Rd_c", "PSL", "Sesn", "Cnt_", "Ethn", "Gndr", "Hlmt",
  "Dr__", "Pp_G", "Svrt"
)

crash_cat <- crash_df %>%
  mutate(across(all_of(vars_for_cca), ~ as.factor(.x)))

knitr::kable(crash_cat, caption = "Sample crash data. Each row represents one crash record.")
Sample crash data. Each row represents one crash record.
Crash_ID Wthr Lght Rd_l Trf_ Int_ FHE_ Oth_ Rd_c PSL Sesn Cnt_ Ethn Gndr Hlmt Dr__ Pp_G Svrt
C1 Clear Daylight Straight, level Signal light Int related Omv vehicle going straight Other City street 30-40 mph Winter Other Hispanic Male Not worn Other 250,000 pop. And over BC
C2 Clear Dark, not lighted Straight, level Stop sign Int related Omv vehicle turning left Other City street 45-60 mph Winter Pedestrian ftyrow to vehicle Hispanic Male Not worn Unlicensed 100,000 - 249,999 pop. BC
C3 Clear Daylight Straight, level Marked lanes Non Int Omv vehicle going straight Other Other 30-40 mph Winter Other Asian Male Not worn Other Other BC
C4 Cloudy Daylight Straight, level Stop sign Int Omv vehicle turning left Other City street 30-40 mph Winter Other White Male Not worn Other Other BC
C5 Clear Daylight Straight, level Signal light Int related Omv vehicle going straight Inattentive driving City street 30-40 mph Winter Other White Female Not worn Other Other BC
C6 Clear Daylight Straight, level Stop sign Int Omv vehicle going straight Other City street 30-40 mph Winter Other Hispanic Female Not worn Other Other BC
C7 Cloudy Daylight Straight, level Stop sign Int Omv vehicle going straight Inattentive driving City street 30-40 mph Winter Driver inattention White Male Not worn Other 50,000 - 99,999 pop O
C8 Clear Daylight Straight, level Other Driveway accs Omv vehicle turning left veh leaving driveway City street 30-40 mph Winter Other White Male Not worn Other Other BC
C9 Clear Daylight Straight, level Signal light Non Int Omv vehicle going straight Other Other 45-60 mph Winter Other Hispanic Male Not worn Other 250,000 pop. And over BC
C10 Cloudy Daylight Straight, level Marked lanes Int related Omv vehicle going straight Other City street 45-60 mph Winter Not applicable White Male Not worn Other Other BC
C11 Clear Daylight Straight, level None Non Int Omv vehicle going straight Other City street 30-40 mph Winter Other White Female Not worn Unlicensed Other BC

6 Descriptive Summary

summary_table <- crash_cat %>%
  select(all_of(vars_for_cca)) %>%
  summarise(across(everything(), ~ n_distinct(.x))) %>%
  pivot_longer(everything(), names_to = "Variable", values_to = "Number_of_Categories")

knitr::kable(summary_table, caption = "Number of categories by crash variable.")
Number of categories by crash variable.
Variable Number_of_Categories
Wthr 2
Lght 2
Rd_l 1
Trf_ 5
Int_ 4
FHE_ 2
Oth_ 3
Rd_c 2
PSL 2
Sesn 1
Cnt_ 4
Ethn 3
Gndr 2
Hlmt 1
Dr__ 2
Pp_G 4
Svrt 2

7 Core Stable-CCA Functions

7.1 Function 1: Run clustrd CCA

run_clustrd_cca <- function(data,
                            vars,
                            nclus = 2,
                            ndim = 2,
                            method = "clusCA",
                            nstart = 100,
                            seed = 123) {
  
  x <- data %>%
    select(all_of(vars)) %>%
    mutate(across(everything(), as.factor))
  
  fit <- clustrd::clusmca(
    data = x,
    nclus = nclus,
    ndim = ndim,
    method = method,
    nstart = nstart,
    seed = seed
  )
  
  return(fit)
}

7.2 Function 2: Align bootstrap cluster labels

Cluster labels can switch across bootstrap runs. This function aligns the bootstrap solution to the reference solution by maximizing label overlap.

align_clusters <- function(base_labels, boot_labels, common_index) {
  
  base_common <- base_labels[common_index]
  
  tab <- table(base_common, boot_labels)
  cost <- max(tab) - tab
  
  assignment <- clue::solve_LSAP(cost)
  
  map <- setNames(seq_along(assignment), assignment)
  aligned <- map[as.character(boot_labels)]
  
  return(as.integer(aligned))
}

7.3 Function 3: Stability-Regularized CCA

stable_clustrd_cca <- function(data,
                               vars,
                               id_col = "Crash_ID",
                               nclus = 2,
                               ndim = 2,
                               method = "clusCA",
                               n_boot = 50,
                               sample_frac = 0.80,
                               nstart = 100,
                               seed = 123) {
  
  set.seed(seed)
  
  n <- nrow(data)
  ids <- data[[id_col]]
  
  base_fit <- run_clustrd_cca(
    data = data,
    vars = vars,
    nclus = nclus,
    ndim = ndim,
    method = method,
    nstart = nstart,
    seed = seed
  )
  
  base_labels <- base_fit$cluster
  
  assignment_mat <- matrix(NA, nrow = n, ncol = n_boot)
  rownames(assignment_mat) <- ids
  
  ari_values <- numeric(n_boot)
  
  for (b in seq_len(n_boot)) {
    
    boot_index <- sample(
      seq_len(n),
      size = floor(sample_frac * n),
      replace = TRUE
    )
    
    unique_boot_index <- unique(boot_index)
    
    if (length(unique_boot_index) <= nclus) next
    
    boot_data <- data[unique_boot_index, ]
    
    boot_fit <- tryCatch(
      run_clustrd_cca(
        data = boot_data,
        vars = vars,
        nclus = nclus,
        ndim = ndim,
        method = method,
        nstart = nstart,
        seed = seed + b
      ),
      error = function(e) NULL
    )
    
    if (is.null(boot_fit)) next
    
    boot_labels_raw <- boot_fit$cluster
    
    aligned_labels <- align_clusters(
      base_labels = base_labels,
      boot_labels = boot_labels_raw,
      common_index = unique_boot_index
    )
    
    assignment_mat[unique_boot_index, b] <- aligned_labels
    
    ari_values[b] <- mclust::adjustedRandIndex(
      base_labels[unique_boot_index],
      aligned_labels
    )
  }
  
  crash_stability <- purrr::map_dbl(seq_len(n), function(i) {
    
    observed <- assignment_mat[i, ]
    observed <- observed[!is.na(observed)]
    
    if (length(observed) == 0) return(NA_real_)
    
    mean(observed == base_labels[i])
  })
  
  cluster_stability <- tibble(
    Crash_ID = ids,
    Cluster = base_labels,
    Crash_Stability = crash_stability
  ) %>%
    group_by(Cluster) %>%
    summarise(
      N_Crashes = n(),
      Mean_Stability = mean(Crash_Stability, na.rm = TRUE),
      Median_Stability = median(Crash_Stability, na.rm = TRUE),
      Min_Stability = min(Crash_Stability, na.rm = TRUE),
      .groups = "drop"
    )
  
  out <- list(
    base_fit = base_fit,
    base_cluster = base_labels,
    assignment_matrix = assignment_mat,
    crash_stability = crash_stability,
    cluster_stability = cluster_stability,
    mean_ari = mean(ari_values, na.rm = TRUE),
    overall_stability = mean(crash_stability, na.rm = TRUE),
    ari_values = ari_values
  )
  
  return(out)
}

8 Run Stable-CCA

For the small sample, the analysis uses two clusters and two CCA dimensions. For a full crash dataset, use more bootstrap replications and test multiple cluster solutions.

stable_fit <- stable_clustrd_cca(
  data = crash_cat,
  vars = vars_for_cca,
  id_col = "Crash_ID",
  nclus = 2,
  ndim = 2,
  method = "clusCA",
  n_boot = 50,
  sample_frac = 0.80,
  nstart = 100,
  seed = 123
)
##   |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%
crash_results <- crash_cat %>%
  mutate(
    Stable_CCA_Cluster = stable_fit$base_cluster,
    Crash_Stability = stable_fit$crash_stability,
    Stability_Class = case_when(
      Crash_Stability >= 0.80 ~ "Highly stable",
      Crash_Stability >= 0.60 ~ "Moderately stable",
      Crash_Stability >= 0.40 ~ "Weakly stable",
      TRUE ~ "Unstable"
    )
  )

knitr::kable(crash_results, caption = "Crash-level Stable-CCA results.")
Crash-level Stable-CCA results.
Crash_ID Wthr Lght Rd_l Trf_ Int_ FHE_ Oth_ Rd_c PSL Sesn Cnt_ Ethn Gndr Hlmt Dr__ Pp_G Svrt Stable_CCA_Cluster Crash_Stability Stability_Class
C1 Clear Daylight Straight, level Signal light Int related Omv vehicle going straight Other City street 30-40 mph Winter Other Hispanic Male Not worn Other 250,000 pop. And over BC 1 1.0 Highly stable
C2 Clear Dark, not lighted Straight, level Stop sign Int related Omv vehicle turning left Other City street 45-60 mph Winter Pedestrian ftyrow to vehicle Hispanic Male Not worn Unlicensed 100,000 - 249,999 pop. BC 2 1.0 Highly stable
C3 Clear Daylight Straight, level Marked lanes Non Int Omv vehicle going straight Other Other 30-40 mph Winter Other Asian Male Not worn Other Other BC 1 1.0 Highly stable
C4 Cloudy Daylight Straight, level Stop sign Int Omv vehicle turning left Other City street 30-40 mph Winter Other White Male Not worn Other Other BC 1 1.0 Highly stable
C5 Clear Daylight Straight, level Signal light Int related Omv vehicle going straight Inattentive driving City street 30-40 mph Winter Other White Female Not worn Other Other BC 1 1.0 Highly stable
C6 Clear Daylight Straight, level Stop sign Int Omv vehicle going straight Other City street 30-40 mph Winter Other Hispanic Female Not worn Other Other BC 1 1.0 Highly stable
C7 Cloudy Daylight Straight, level Stop sign Int Omv vehicle going straight Inattentive driving City street 30-40 mph Winter Driver inattention White Male Not worn Other 50,000 - 99,999 pop O 1 1.0 Highly stable
C8 Clear Daylight Straight, level Other Driveway accs Omv vehicle turning left veh leaving driveway City street 30-40 mph Winter Other White Male Not worn Other Other BC 2 0.5 Weakly stable
C9 Clear Daylight Straight, level Signal light Non Int Omv vehicle going straight Other Other 45-60 mph Winter Other Hispanic Male Not worn Other 250,000 pop. And over BC 1 1.0 Highly stable
C10 Cloudy Daylight Straight, level Marked lanes Int related Omv vehicle going straight Other City street 45-60 mph Winter Not applicable White Male Not worn Other Other BC 1 1.0 Highly stable
C11 Clear Daylight Straight, level None Non Int Omv vehicle going straight Other City street 30-40 mph Winter Other White Female Not worn Unlicensed Other BC 1 1.0 Highly stable

9 Stability Results

knitr::kable(stable_fit$cluster_stability, caption = "Cluster-level stability results.")
Cluster-level stability results.
Cluster N_Crashes Mean_Stability Median_Stability Min_Stability
1 9 1.00 1.00 1.0
2 2 0.75 0.75 0.5
cat("Overall stability:", round(stable_fit$overall_stability, 3), "\n")
## Overall stability: 0.955
cat("Mean adjusted Rand index:", round(stable_fit$mean_ari, 3), "\n")
## Mean adjusted Rand index: 0.068
tibble(ARI = stable_fit$ari_values) %>%
  filter(!is.na(ARI)) %>%
  ggplot(aes(x = ARI)) +
  geom_histogram(bins = 10, color = "white") +
  labs(
    title = "Bootstrap Stability Distribution",
    subtitle = "Adjusted Rand Index between reference and bootstrap-aligned CCA clusters",
    x = "Adjusted Rand Index",
    y = "Bootstrap Count"
  ) +
  theme_minimal()

10 Cluster Selection

For the small sample, only k = 2:3 is evaluated. For a full dataset, use a wider range such as k = 2:8.

evaluate_clustrd_k <- function(data,
                               vars,
                               k_values = 2:3,
                               ndim = 2,
                               method = "clusCA",
                               n_boot = 30,
                               sample_frac = 0.80,
                               nstart = 100,
                               seed = 123) {
  
  purrr::map_dfr(k_values, function(k) {
    
    message("Evaluating k = ", k)
    
    fit_k <- stable_clustrd_cca(
      data = data,
      vars = vars,
      id_col = "Crash_ID",
      nclus = k,
      ndim = ndim,
      method = method,
      n_boot = n_boot,
      sample_frac = sample_frac,
      nstart = nstart,
      seed = seed + k
    )
    
    min_cluster_stability <- min(
      fit_k$cluster_stability$Mean_Stability,
      na.rm = TRUE
    )
    
    tibble(
      K = k,
      Overall_Stability = fit_k$overall_stability,
      Mean_ARI = fit_k$mean_ari,
      Min_Cluster_Stability = min_cluster_stability,
      Stable_CCA_Score =
        0.50 * Mean_ARI +
        0.30 * Overall_Stability +
        0.20 * Min_Cluster_Stability
    )
  }) %>%
    arrange(desc(Stable_CCA_Score))
}
k_results <- evaluate_clustrd_k(
  data = crash_cat,
  vars = vars_for_cca,
  k_values = 2:3,
  ndim = 2,
  method = "clusCA",
  n_boot = 30,
  sample_frac = 0.80,
  nstart = 100,
  seed = 123
)
##   |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%
##   |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%  |                                                                              |============================                                          |  40%  |                                                                              |===================================                                   |  50%  |                                                                              |==========================================                            |  60%  |                                                                              |=================================================                     |  70%  |                                                                              |========================================================              |  80%  |                                                                              |===============================================================       |  90%  |                                                                              |======================================================================| 100%  |                                                                              |                                                                      |   0%  |                                                                              |=======                                                               |  10%  |                                                                              |==============                                                        |  20%  |                                                                              |=====================                                                 |  30%
knitr::kable(k_results, digits = 3, caption = "Candidate cluster solutions ranked by Stable-CCA score.")
Candidate cluster solutions ranked by Stable-CCA score.
K Overall_Stability Mean_ARI Min_Cluster_Stability Stable_CCA_Score
3 0.863 0.389 0.703 0.594
2 1.000 0.100 1.000 0.550
k_results %>%
  pivot_longer(
    cols = c(Overall_Stability, Mean_ARI, Min_Cluster_Stability, Stable_CCA_Score),
    names_to = "Metric",
    values_to = "Value"
  ) %>%
  ggplot(aes(x = K, y = Value, group = Metric, linetype = Metric, shape = Metric)) +
  geom_line() +
  geom_point(size = 3) +
  scale_x_continuous(breaks = k_results$K) +
  labs(
    title = "Stable-CCA Cluster Selection Metrics",
    x = "Number of Clusters",
    y = "Metric Value"
  ) +
  theme_minimal()

11 Cluster Profiling

describe_clusters <- function(data, cluster_col, vars) {
  
  purrr::map_dfr(vars, function(v) {
    
    data %>%
      group_by(
        Cluster = .data[[cluster_col]],
        Category = .data[[v]]
      ) %>%
      summarise(
        n = n(),
        .groups = "drop_last"
      ) %>%
      mutate(
        pct_within_cluster = 100 * n / sum(n)
      ) %>%
      ungroup() %>%
      mutate(
        Variable = v
      ) %>%
      select(Cluster, Variable, Category, n, pct_within_cluster)
  })
}
cluster_profile <- describe_clusters(
  data = crash_results,
  cluster_col = "Stable_CCA_Cluster",
  vars = vars_for_cca
)

top_cluster_profile <- cluster_profile %>%
  group_by(Cluster, Variable) %>%
  slice_max(pct_within_cluster, n = 2, with_ties = FALSE) %>%
  arrange(Cluster, Variable, desc(pct_within_cluster))

knitr::kable(top_cluster_profile, digits = 1, caption = "Top categorical features within each Stable-CCA cluster.")
Top categorical features within each Stable-CCA cluster.
Cluster Variable Category n pct_within_cluster
1 Cnt_ Other 7 77.8
1 Cnt_ Driver inattention 1 11.1
1 Dr__ Other 8 88.9
1 Dr__ Unlicensed 1 11.1
1 Ethn White 5 55.6
1 Ethn Hispanic 3 33.3
1 FHE_ Omv vehicle going straight 8 88.9
1 FHE_ Omv vehicle turning left 1 11.1
1 Gndr Male 6 66.7
1 Gndr Female 3 33.3
1 Hlmt Not worn 9 100.0
1 Int_ Int 3 33.3
1 Int_ Int related 3 33.3
1 Lght Daylight 9 100.0
1 Oth_ Other 7 77.8
1 Oth_ Inattentive driving 2 22.2
1 PSL 30-40 mph 7 77.8
1 PSL 45-60 mph 2 22.2
1 Pp_G Other 6 66.7
1 Pp_G 250,000 pop. And over 2 22.2
1 Rd_c City street 7 77.8
1 Rd_c Other 2 22.2
1 Rd_l Straight, level 9 100.0
1 Sesn Winter 9 100.0
1 Svrt BC 8 88.9
1 Svrt O 1 11.1
1 Trf_ Signal light 3 33.3
1 Trf_ Stop sign 3 33.3
1 Wthr Clear 6 66.7
1 Wthr Cloudy 3 33.3
2 Cnt_ Other 1 50.0
2 Cnt_ Pedestrian ftyrow to vehicle 1 50.0
2 Dr__ Other 1 50.0
2 Dr__ Unlicensed 1 50.0
2 Ethn Hispanic 1 50.0
2 Ethn White 1 50.0
2 FHE_ Omv vehicle turning left 2 100.0
2 Gndr Male 2 100.0
2 Hlmt Not worn 2 100.0
2 Int_ Driveway accs 1 50.0
2 Int_ Int related 1 50.0
2 Lght Dark, not lighted 1 50.0
2 Lght Daylight 1 50.0
2 Oth_ Other 1 50.0
2 Oth_ veh leaving driveway 1 50.0
2 PSL 30-40 mph 1 50.0
2 PSL 45-60 mph 1 50.0
2 Pp_G 100,000 - 249,999 pop. 1 50.0
2 Pp_G Other 1 50.0
2 Rd_c City street 2 100.0
2 Rd_l Straight, level 2 100.0
2 Sesn Winter 2 100.0
2 Svrt BC 2 100.0
2 Trf_ Other 1 50.0
2 Trf_ Stop sign 1 50.0
2 Wthr Clear 2 100.0
top_cluster_profile %>%
  mutate(Label = paste0(Variable, ": ", Category)) %>%
  group_by(Cluster) %>%
  slice_max(pct_within_cluster, n = 10, with_ties = FALSE) %>%
  ungroup() %>%
  ggplot(aes(x = reorder(Label, pct_within_cluster), y = pct_within_cluster)) +
  geom_col() +
  coord_flip() +
  facet_wrap(~ Cluster, scales = "free_y") +
  labs(
    title = "Dominant Crash Characteristics by Stable-CCA Cluster",
    x = "Crash Characteristic",
    y = "Percent Within Cluster"
  ) +
  theme_minimal()

12 Countermeasure Readiness Score

This section computes a countermeasure readiness score using actionable variables. The score combines cluster stability and the dominance of actionable engineering features.

actionable_vars <- c(
  "Lght", "Rd_l", "Trf_", "Int_", "FHE_", "Rd_c", "PSL", "Oth_"
)

countermeasure_relevance <- cluster_profile %>%
  filter(Variable %in% actionable_vars) %>%
  group_by(Cluster) %>%
  summarise(
    Actionable_Dominance = mean(pct_within_cluster[pct_within_cluster >= 50], na.rm = TRUE),
    N_Actionable_Dominant_Items = sum(pct_within_cluster >= 50, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(
    Actionable_Dominance = ifelse(is.nan(Actionable_Dominance), 0, Actionable_Dominance)
  ) %>%
  left_join(stable_fit$cluster_stability, by = "Cluster") %>%
  mutate(
    Countermeasure_Readiness_Score =
      0.60 * Mean_Stability +
      0.40 * scales::rescale(Actionable_Dominance, to = c(0, 1))
  ) %>%
  arrange(desc(Countermeasure_Readiness_Score))

knitr::kable(countermeasure_relevance, digits = 3, caption = "Countermeasure readiness by Stable-CCA cluster.")
Countermeasure readiness by Stable-CCA cluster.
Cluster Actionable_Dominance N_Actionable_Dominant_Items N_Crashes Mean_Stability Median_Stability Min_Stability Countermeasure_Readiness_Score
1 87.037 6 9 1.00 1.00 1.0 1.00
2 61.538 13 2 0.75 0.75 0.5 0.45
countermeasure_relevance %>%
  mutate(Cluster = as.factor(Cluster)) %>%
  ggplot(aes(x = Cluster, y = Countermeasure_Readiness_Score)) +
  geom_col() +
  labs(
    title = "Countermeasure Readiness Score by Stable-CCA Cluster",
    x = "Stable-CCA Cluster",
    y = "Countermeasure Readiness Score"
  ) +
  theme_minimal()

13 CCA Visualization

coords <- as.data.frame(stable_fit$base_fit$obscoord)
names(coords)[1:2] <- c("Dim1", "Dim2")

coords_plot <- coords %>%
  mutate(
    Crash_ID = crash_results$Crash_ID,
    Cluster = as.factor(crash_results$Stable_CCA_Cluster),
    Stability = crash_results$Crash_Stability,
    Severity = crash_results$Svrt
  )

ggplot(coords_plot, aes(x = Dim1, y = Dim2, color = Cluster)) +
  geom_point(aes(size = Stability, shape = Severity), alpha = 0.80) +
  geom_text(aes(label = Crash_ID), vjust = -0.7, size = 3) +
  labs(
    title = "Stability-Regularized Cluster Correspondence Analysis",
    subtitle = "Implemented with clustrd::clusmca; each point is one crash record",
    x = "CCA Dimension 1",
    y = "CCA Dimension 2",
    color = "Stable CCA Cluster",
    size = "Crash Stability",
    shape = "Severity"
  ) +
  theme_minimal()

14 Interpretation Template

Use the following structure when interpreting the clusters in a manuscript.

14.1 Cluster 1 interpretation

Cluster 1 is characterized by the dominant categories shown in the cluster profile table. The stability value indicates how reproducible the typology is under bootstrap resampling. If the cluster has high stability and high countermeasure readiness, it can be interpreted as an actionable crash typology for engineering treatment selection.

14.2 Cluster 2 interpretation

Cluster 2 should be interpreted by identifying its dominant roadway, traffic control, environmental, and user-related variables. The countermeasure readiness score indicates whether the cluster is strongly associated with modifiable engineering features such as lighting, traffic control, speed limit, roadway class, or intersection context.

15 Suggested Manuscript Method Text

Each crash record was represented as a multivariate categorical profile consisting of roadway, traffic control, environmental, user-related, and severity attributes. Cluster Correspondence Analysis was implemented using the clustrd package through the clusmca() function. To address the sensitivity of one-time exploratory clustering, this study introduced a bootstrap-based stability regularization procedure. For each candidate number of clusters, CCA was repeatedly estimated on bootstrap samples, and the resulting cluster labels were aligned to the reference solution using a maximum-overlap assignment procedure. Crash-level stability was defined as the proportion of bootstrap replications in which a crash retained its original cluster assignment. Cluster-level stability was computed as the average stability of crashes within each typology, and model-level stability was summarized using the adjusted Rand index. The final cluster solution was selected using a composite score that combined overall stability, adjusted Rand index, and minimum cluster stability.

16 Scaling to Full Crash Data

For a full crash dataset, replace the sample data with the following structure:

crash_df <- readr::read_csv("your_crash_file.csv")

crash_df <- crash_df %>%
  mutate(Crash_ID = as.character(Crash_ID))

vars_for_cca <- c(
  "Wthr", "Lght", "Rd_l", "Trf_", "Int_", "FHE_", "Oth_",
  "Rd_c", "PSL", "Sesn", "Cnt_", "Ethn", "Gndr", "Hlmt",
  "Dr__", "Pp_G", "Svrt"
)

crash_cat <- crash_df %>%
  mutate(across(all_of(vars_for_cca), ~ as.factor(replace_na(as.character(.x), "Unknown"))))

stable_fit <- stable_clustrd_cca(
  data = crash_cat,
  vars = vars_for_cca,
  id_col = "Crash_ID",
  nclus = 4,
  ndim = 3,
  method = "clusCA",
  n_boot = 200,
  sample_frac = 0.80,
  nstart = 200,
  seed = 123
)

k_results <- evaluate_clustrd_k(
  data = crash_cat,
  vars = vars_for_cca,
  k_values = 2:8,
  ndim = 3,
  method = "clusCA",
  n_boot = 100,
  sample_frac = 0.80,
  nstart = 200,
  seed = 123
)

17 Export Results

write_csv(crash_results, "stable_cca_crash_level_results.csv")
write_csv(stable_fit$cluster_stability, "stable_cca_cluster_stability.csv")
write_csv(cluster_profile, "stable_cca_cluster_profile.csv")
write_csv(countermeasure_relevance, "stable_cca_countermeasure_readiness.csv")
write_csv(k_results, "stable_cca_cluster_selection.csv")

18 Conclusion

This R Markdown file provides a complete workflow for Stability-Regularized Cluster Correspondence Analysis using clustrd. The approach supports crash typology discovery while addressing a key limitation of traditional CCA: the lack of reproducibility testing. By adding crash-level and cluster-level stability metrics, the framework helps identify which crash typologies are robust enough to support safety diagnosis and countermeasure selection.