Understanding these patterns could help clinics implement targeted interventions such as additional reminders for high-risk patients, better education about the importance of urodynamic testing, or addressing specific concerns that commonly lead to cancellations, ultimately improving resource utilization and patient care in urogynecology practices.
The data prep pipeline performs these key steps:
Final output: Clean dataset with demographic, clinical, and PFDI-20 data for predicting urodynamics procedure cancellation.
Consolidate Multiple Urodynamics Reason Columns into One
#' Consolidate Multiple Urodynamics Reason Columns into One
#'
#' This function takes a data frame containing multiple 'reason for urodynamics'
#' columns and consolidates them into a single column with appropriate levels.
#' The function identifies columns matching the pattern "What was the reason for
#' the urodynamics? (choice=*)" and creates a new column with the consolidated values.
#' The original reason columns are removed from the output data frame.
#'
#' @param urodynamics_dataset A data frame containing the urodynamics data with multiple
#' reason columns.
#' @param output_file Optional. Path to save the processed data frame. If NULL (default),
#' no file is saved.
#' @param verbose Logical. If TRUE, provides detailed logging of the transformation
#' process. Default is FALSE.
#'
#' @return A modified data frame with a new consolidated column for urodynamics reasons
#' and all original reason columns removed.
#'
#' @importFrom dplyr mutate select
#' @importFrom tidyr unite
#' @importFrom assertthat assert_that
#' @importFrom logger log_info log_debug log_error
#'
#' @examples
#' # Example 1: Basic usage with minimal output
#' urodynamics_data <- read.csv("urodynamics_data.csv")
#' processed_urodynamics <- consolidate_urodynamics_reasons(
#' urodynamics_dataset = urodynamics_data,
#' output_file = NULL,
#' verbose = FALSE
#' )
#' # The resulting data frame has a new column "What was the reason for the urodynamics?"
#' # with values like "pre-op for prolapse surgery" or "evaluation of OAB"
#' # All original reason columns are removed from the output
#'
#' # Example 2: Save output to file with verbose logging
#' processed_with_log <- consolidate_urodynamics_reasons(
#' urodynamics_dataset = urodynamics_data,
#' output_file = "processed_urodynamics.csv",
#' verbose = TRUE
#' )
#' # Console will show detailed logs of each transformation step
#' # File "processed_urodynamics.csv" will be created with processed data
#'
#' # Example 3: Process data with custom column selection
#' # First create a subset of the data with only some columns
#' subset_data <- urodynamics_data[, c("Record ID", "Age:",
#' "What was the reason for the urodynamics? (choice=pre-op for prolapse surgery)",
#' "What was the reason for the urodynamics? (choice=pre-op for sling)",
#' "What was the reason for the urodynamics? (choice=evaluation of OAB)")]
#' # Then process this subset
#' processed_subset <- consolidate_urodynamics_reasons(
#' urodynamics_dataset = subset_data,
#' output_file = "subset_processed.csv",
#' verbose = TRUE
#' )
#' # Note: Even with a subset of reason columns, the function will work correctly
consolidate_urodynamics_reasons <- function(urodynamics_dataset,
output_file = NULL,
verbose = FALSE) {
# Initialize logger
if (verbose) {
logger::log_threshold(logger::INFO)
} else {
logger::log_threshold(logger::WARN)
}
logger::log_info("Starting consolidation of urodynamics reasons")
# Validate input
tryCatch({
assertthat::assert_that(is.data.frame(urodynamics_dataset))
logger::log_info("Input validation passed: urodynamics_dataset is a data frame")
if (!is.null(output_file)) {
assertthat::assert_that(is.character(output_file), length(output_file) == 1)
logger::log_info(paste("Output will be saved to:", output_file))
} else {
logger::log_info("No output file specified, results will be returned only")
}
# Input data summary
logger::log_info(paste("Input data dimensions:",
nrow(urodynamics_dataset), "rows,",
ncol(urodynamics_dataset), "columns"))
# Find all reason columns
reason_pattern <- "What was the reason for the urodynamics\\? \\(choice="
reason_columns <- grep(reason_pattern, names(urodynamics_dataset), value = TRUE)
if (length(reason_columns) == 0) {
stop("No columns found matching the pattern for urodynamics reasons")
}
logger::log_info(paste("Found", length(reason_columns), "reason columns:"))
if (verbose) {
for (col in reason_columns) {
logger::log_debug(paste(" *", col))
}
}
# Process the data using helper functions
processed_urodynamics <- process_reason_columns(
urodynamics_dataset = urodynamics_dataset,
reason_columns = reason_columns,
reason_pattern = reason_pattern,
verbose = verbose
)
# Save output if requested
if (!is.null(output_file)) {
tryCatch({
utils::write.csv(processed_urodynamics, file = output_file, row.names = FALSE)
logger::log_info(paste("Output successfully saved to:", output_file))
}, error = function(e) {
logger::log_error(paste("Error saving output file:", e$message))
stop(paste("Error saving output file:", e$message))
})
}
logger::log_info("Consolidation of urodynamics reasons completed successfully")
return(processed_urodynamics)
}, error = function(e) {
logger::log_error(paste("Error in consolidate_urodynamics_reasons:", e$message))
stop(paste("Error in consolidate_urodynamics_reasons:", e$message))
})
}
#' Process reason columns to create consolidated reason column
#'
#' @param urodynamics_dataset Data frame with urodynamics data
#' @param reason_columns Vector of column names for reasons
#' @param reason_pattern Pattern to extract the reason value
#' @param verbose Whether to log detailed information
#'
#' @return Data frame with consolidated reason column
#'
#' @noRd
process_reason_columns <- function(urodynamics_dataset, reason_columns,
reason_pattern, verbose) {
logger::log_info("Starting to process reason columns")
# Create a new column for each reason, with NA or the reason value
processed_dataset <- urodynamics_dataset
# Map of reasons (will be filled during processing)
reason_values <- c()
# For each reason column, extract the reason and set a new column with the value
for (col in reason_columns) {
# Extract the reason from the column name
reason_value <- extract_reason_value(col, reason_pattern)
reason_values <- c(reason_values, reason_value)
if (verbose) {
logger::log_debug(paste("Processing column:", col))
logger::log_debug(paste("Extracted reason:", reason_value))
}
# Create a new column with the reason value if the original is "Checked"
processed_dataset <- dplyr::mutate(processed_dataset,
!!paste0("Reason_", reason_value) :=
ifelse(.data[[col]] == "Checked",
reason_value, NA))
}
# Create the consolidated reason column
logger::log_info("Creating consolidated reason column")
# Get the new individual reason columns
new_reason_columns <- paste0("Reason_", reason_values)
# Use coalesce to create the consolidated column
processed_dataset <- create_consolidated_column(
processed_dataset = processed_dataset,
reason_columns = new_reason_columns,
verbose = verbose
)
# Clean up temporary columns and original reason columns
logger::log_info("Removing temporary columns and original reason columns")
columns_to_remove <- c(new_reason_columns, reason_columns)
processed_dataset <- processed_dataset[, !names(processed_dataset) %in% columns_to_remove]
if (verbose) {
logger::log_debug(paste("Removed", length(columns_to_remove), "columns"))
logger::log_debug(paste("Remaining columns:", ncol(processed_dataset)))
}
return(processed_dataset)
}
#' Extract reason value from column name
#'
#' @param column_name Column name containing the reason
#' @param pattern Pattern to match for extracting the reason
#'
#' @return Extracted reason value
#'
#' @noRd
extract_reason_value <- function(column_name, pattern) {
# Extract the part after "choice=" and before the closing parenthesis
reason <- gsub(pattern, "", column_name)
reason <- gsub("\\)$", "", reason)
return(reason)
}
#' Create consolidated reason column using coalesce
#'
#' @param processed_dataset Dataset with individual reason columns
#' @param reason_columns Vector of column names to consolidate
#' @param verbose Whether to log detailed information
#'
#' @return Dataset with added consolidated column
#'
#' @noRd
create_consolidated_column <- function(processed_dataset, reason_columns, verbose) {
if (verbose) {
logger::log_debug("Creating consolidated column from:")
for (col in reason_columns) {
logger::log_debug(paste(" *", col))
}
}
# Create a consolidated column by coalescing all individual reason columns
processed_dataset$`What was the reason for the urodynamics?` <- NA
# Iterate through each row
for (i in 1:nrow(processed_dataset)) {
for (col in reason_columns) {
if (!is.na(processed_dataset[i, col])) {
processed_dataset[i, "What was the reason for the urodynamics?"] <-
processed_dataset[i, col]
break # Take the first non-NA value
}
}
}
# If multiple reasons are checked, they'll be concatenated
logger::log_info("Consolidated reason column created successfully")
return(processed_dataset)
}Was the procedure cancelled? | Age: | BMI | Race: | Is the patient hispanic, latino or of Spanish origin? | Does the patient have diabetes? | Does the patient have a h/o recurrent UTIs? | Immunocompromised: | Tobacco use: | Menopause status: | Is the patient on vaginal estrogen? | Does the patient have OAB? | Average number of voids at night: | POP-Q stage: | Year | POPDI_6 | CRADI_8 | UDI_6 | PFDI_20_total | What was the reason for the urodynamics? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Yes | 49 | 34.18 | Other | Yes | No | No | No | No | Pre-menopausal | No | No | 1.0 | 2022 | 4.166667 | 6.250000 | 50.000000 | 100 | pre-op for prolapse surgery | |
Yes | 78 | 31.28 | White | No | No | No | No | Yes | Post-menopausal | No | No | 5.0 | 2022 | 79.166667 | 53.125000 | 91.666667 | 224 | pre-op for prolapse surgery | |
Yes | 63 | 27.17 | White | No | Yes | No | No | Yes | Post-menopausal | No | Yes | 1.0 | 2022 | 16.666667 | 50.000000 | 50.000000 | 117 | pre-op for prolapse surgery | |
Yes | 62 | 18.96 | White | No | No | No | No | No | Post-menopausal | No | Yes | 3.0 | 0 | 2022 | 0.000000 | 7.142857 | 20.833333 | 28 | pre-op for sling |
No | 70 | 37.12 | White | No | No | Yes | Yes | Yes | Post-menopausal | No | Yes | 1.5 | 0 | 2022 | 8.333333 | 0.000000 | 91.666667 | 100 | evaluation of OAB |
No | 59 | 29.87 | White | No | No | No | No | No | Post-menopausal | No | Yes | 4.0 | 3 | 2022 | 60.000000 | 56.250000 | 81.250000 | 198 | pre-op for prolapse surgery |
No | 53 | 30.27 | Other | No | No | Yes | No | No | Post-menopausal | No | No | 4.5 | 1 | 2022 | 79.166667 | 78.125000 | 100.000000 | 257 | pre-op for sling |
No | 37 | 25.79 | Other | No | No | No | No | No | Pre-menopausal | No | No | 1.0 | 1 | 2022 | 0.000000 | 6.250000 | 8.333333 | 15 | pre-op for sling |
No | 86 | 34.01 | White | No | No | No | No | No | Post-menopausal | Yes | No | 3.5 | 3 | 2022 | 29.166667 | 18.750000 | 50.000000 | 100 | pre-op for prolapse surgery |
No | 67 | 25.86 | White | No | No | No | No | Yes | Post-menopausal | No | Yes | 2.0 | 0 | 2022 | 0.000000 | 50.000000 | 62.500000 | 112 | evaluation of OAB |
No | 61 | 30.90 | Other | Yes | Yes | No | No | No | Post-menopausal | No | Yes | 4.0 | 1 | 2022 | 29.166667 | 18.750000 | 50.000000 | 100 | pre-op for sling |
No | 68 | 23.00 | White | No | No | No | No | No | Post-menopausal | No | Yes | 4.0 | 1 | 2022 | 29.166667 | 18.750000 | 50.000000 | 100 | pre-op for sling |
No | 67 | 20.25 | White | No | No | No | No | No | Post-menopausal | Yes | No | 1.0 | 3 | 2022 | 29.166667 | 6.250000 | 50.000000 | 85 | pre-op for prolapse surgery |
No | 74 | 27.34 | Other | No | No | No | Yes | No | Post-menopausal | No | No | 1.0 | 4 | 2022 | 33.333333 | 0.000000 | 37.500000 | 71 | pre-op for prolapse surgery |
No | 51 | 36.07 | Black or African American | No | Yes | No | No | No | Post-menopausal | No | Yes | 5.5 | 0 | 2022 | 37.500000 | 6.250000 | 41.666667 | 85 | evaluation of OAB |
Yes | 93 | 23.46 | White | No | No | No | No | No | Post-menopausal | No | Yes | 0.0 | 0 | 2022 | 29.166667 | 18.750000 | 50.000000 | 100 | other |
No | 69 | 28.34 | White | No | No | No | No | Yes | Post-menopausal | No | Yes | 0.0 | 2 | 2022 | 29.166667 | 18.750000 | 50.000000 | 100 | pre-op for prolapse surgery |
No | 47 | 35.73 | White | No | No | No | No | No | Pre-menopausal | No | Yes | 0.0 | 0 | 2022 | 16.666667 | 18.750000 | 58.333333 | 94 | pre-op for sling |
No | 43 | 23.00 | White | No | No | No | No | No | Pre-menopausal | No | Yes | 1.0 | 0 | 2022 | 0.000000 | 0.000000 | 62.500000 | 62 | evaluation of mixed incontinence |
No | 75 | 24.10 | Other | No | No | No | No | No | Post-menopausal | Yes | No | 1.5 | 4 | 2022 | 37.500000 | 6.250000 | 20.000000 | 64 | pre-op for prolapse surgery |
📊 Logistic Regression Check: Events per Predictor Variable (EPV) 📊
==================================================================
Outcome Variable: Was the procedure cancelled?
Number of Observations: 841
Total Events (Yes): 57
Total Non-Events: 784
Number of Predictors: 19
Required Minimum Events (10 × 19): 190
The Events Per Predictor Variable (EPV) rule is a guideline in logistic regression that recommends having at least 10 outcome events for each predictor variable in the model to ensure stable coefficient estimates and reliable inference.
In our dataset:
Assessment: With only 57 events and 19 candidate predictors, the EPV ratio (3) is below the recommended minimum of 10. This limitation will be addressed through LASSO regularization, which automatically selects a parsimonious subset of predictors, thereby improving the effective EPV ratio in the final model.
The LASSO variable selection process (described in Section 7) will identify the most predictive variables, reducing the number of predictors in the final model and substantially improving the EPV ratio. The final model metrics are reported in the Results section.
Looking at the demographics table comparing patients whose procedures were cancelled versus completed, there are several notable findings:
Why are variables with P = .05 to .10 retained in the analysis?
These “trending” variables are kept for several important reasons:
LASSO handles variable selection: Unlike traditional stepwise regression that uses p-values for variable selection, LASSO uses cross-validation to select variables based on their contribution to predictive accuracy. A variable that does not reach statistical significance in univariate analysis may still improve model prediction when combined with other variables.
P-values are not selection criteria for LASSO: The p-values in Table 1 describe univariate associations (each variable tested alone). LASSO considers multivariate relationships and can identify variables that are predictive in combination, even if individually they show weaker associations.
Clinical plausibility: BMI and recurrent UTIs have biological rationale for affecting UTI risk and procedure cancellation. Excluding clinically relevant variables prematurely could result in an underspecified model.
Small sample size considerations: With only 57 events (cancellations), statistical power is limited. P-values near but above 0.05 may suggest meaningful effects that did not reach significance due to sample size constraints rather than true absence of association.
Let the data decide: By including these variables as candidates, we allow the LASSO algorithm to make an objective, data-driven decision about their inclusion based on cross-validated prediction error—not arbitrary p-value cutoffs.
The following factors showed no significant association with procedure cancellation (P ≥ 0.10): - Hispanic/Latino - Race - Immunocompromised status - Tobacco use - Vaginal estrogen use - OAB status - Nighttime voiding frequency - Year - PFDI-20 total - POPDI-6 - CRADI-8
The data suggests that Age, Recurrent UTIs, POP-Q stage, Menopause status are statistically significant predictors of procedure cancellation (P < 0.05). Additionally, BMI, Diabetes, UDI-6 showed differences between groups but did not reach the prespecified significance threshold (0.05 ≤ P < 0.10); however, these variables may warrant clinical consideration as potential risk factors.
The sample size for cancellations is relatively small (N=57 vs N=784), which may limit statistical power to detect some associations. The significant age difference suggests pre-procedure assessment might need to be more comprehensive for elderly patients to reduce cancellation rates.
The following flow diagram follows the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) guidelines for reporting patient flow in prediction model studies.
# Create TRIPOD-compliant flow diagram using ggplot2
library(ggplot2)
# Define box positions and sizes
box_width <- 3.5
box_height <- 0.8
# Create data for boxes using PROGRAMMATIC values from tripod_labels
boxes <- data.frame(
id = c("source", "initial", "excl1", "excl2", "excl3", "eligible", "missing", "final", "outcome_yes", "outcome_no"),
x = c(5, 5, 9, 9, 9, 5, 9, 5, 3, 7),
y = c(10, 8.5, 8.5, 7, 5.5, 6.5, 6.5, 5, 3, 3),
label = c(
tripod_labels$source,
tripod_labels$initial,
tripod_labels$excl_missing,
tripod_labels$excl_noshow,
tripod_labels$excl_other,
tripod_labels$eligible,
tripod_labels$imputed,
tripod_labels$final,
tripod_labels$cancelled,
tripod_labels$completed
),
fill = c("#3498DB", "#E8F6F3", "#FADBD8", "#FADBD8", "#FADBD8", "#E8F6F3", "#FEF9E7", "#27AE60", "#E74C3C", "#2ECC71")
)
# Create data for arrows
arrows <- data.frame(
x = c(5, 5, 5, 5, 5, 5, 5, 5),
xend = c(5, 7.2, 7.2, 7.2, 5, 7.2, 3.5, 6.5),
y = c(9.6, 8.5, 7.8, 6.3, 8.1, 6.5, 4.6, 4.6),
yend = c(8.9, 8.5, 7, 5.5, 6.9, 6.5, 3.4, 3.4)
)
# Create the plot
tripod_plot <- ggplot() +
# Draw boxes
geom_rect(data = boxes,
aes(xmin = x - box_width/2, xmax = x + box_width/2,
ymin = y - box_height/2, ymax = y + box_height/2,
fill = fill),
color = "black", linewidth = 1) +
# Add labels
geom_text(data = boxes,
aes(x = x, y = y, label = label),
size = 3.5, fontface = "bold", lineheight = 0.9) +
# Draw arrows (main flow)
geom_segment(aes(x = 5, y = 9.6, xend = 5, yend = 8.9),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 1) +
geom_segment(aes(x = 5, y = 8.1, xend = 5, yend = 6.9),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 1) +
geom_segment(aes(x = 5, y = 6.1, xend = 5, yend = 5.4),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 1) +
geom_segment(aes(x = 5, y = 4.6, xend = 3.5, yend = 3.4),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 1) +
geom_segment(aes(x = 5, y = 4.6, xend = 6.5, yend = 3.4),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 1) +
# Draw arrows (exclusions)
geom_segment(aes(x = 6.8, y = 8.5, xend = 7.2, yend = 8.5),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 0.8, linetype = "dashed") +
geom_segment(aes(x = 6.8, y = 7.8, xend = 7.2, yend = 7),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 0.8, linetype = "dashed") +
geom_segment(aes(x = 6.8, y = 6.3, xend = 7.2, yend = 5.5),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 0.8, linetype = "dashed") +
# Arrow for missing data
geom_segment(aes(x = 6.8, y = 6.5, xend = 7.2, yend = 6.5),
arrow = arrow(length = unit(0.3, "cm")), linewidth = 0.8, linetype = "dotted") +
# Use identity for fill colors
scale_fill_identity() +
# Theme and labels
labs(title = "TRIPOD Flow Diagram: Patient Selection for Prediction Model Development",
subtitle = "Urodynamic Procedure Cancellation Due to UTI") +
theme_void() +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
plot.subtitle = element_text(size = 12, hjust = 0.5, color = "gray40"),
plot.margin = margin(20, 20, 20, 20)
) +
coord_cartesian(xlim = c(0, 12), ylim = c(2, 11))
print(tripod_plot)Figure: TRIPOD Flow Diagram - Patient Selection and Analysis Flow
Flow Diagram Summary:
Table 1. Demographic and Clinical Characteristics Stratified by Procedure Cancellation Status
| Completed (N=784) | Cancelled (N=57) | Total (N=841) | p value | |
|---|---|---|---|---|
| Age (years) | < 0.001 (1) | |||
| - Median | 64.0 | 76.0 | 65.0 | |
| - Q1, Q3 | 52.0, 73.0 | 66.0, 79.0 | 53.0, 73.0 | |
| Body Mass Index (kg/m²) | 0.090 (1) | |||
| - Median | 28.5 | 31.0 | 28.6 | |
| - Q1, Q3 | 24.8, 33.5 | 25.6, 35.2 | 24.8, 33.7 | |
| Race: | 0.665 (2) | |||
| - American Indian or Alaska Native | 1 (0.1%) | 0 (0.0%) | 1 (0.1%) | |
| - Asian | 12 (1.5%) | 0 (0.0%) | 12 (1.4%) | |
| - Black or African American | 36 (4.6%) | 1 (1.8%) | 37 (4.4%) | |
| - White | 630 (80.4%) | 46 (80.7%) | 676 (80.4%) | |
| - Other | 100 (12.8%) | 9 (15.8%) | 109 (13.0%) | |
| - Native Hawaiian or Other Pacific Islander | 5 (0.6%) | 1 (1.8%) | 6 (0.7%) | |
| Is the patient hispanic, latino or of Spanish origin? | 0.313 (2) | |||
| - No | 695 (88.6%) | 48 (84.2%) | 743 (88.3%) | |
| - Yes | 89 (11.4%) | 9 (15.8%) | 98 (11.7%) | |
| Does the patient have diabetes? | 0.033 (2) | |||
| - Yes | 111 (14.2%) | 14 (24.6%) | 125 (14.9%) | |
| - No | 673 (85.8%) | 43 (75.4%) | 716 (85.1%) | |
| History of Recurrent Urinary Tract Infections | 0.002 (2) | |||
| - Yes | 67 (8.5%) | 12 (21.1%) | 79 (9.4%) | |
| - No | 717 (91.5%) | 45 (78.9%) | 762 (90.6%) | |
| Immunocompromised: | 0.734 (2) | |||
| - Yes | 35 (4.5%) | 2 (3.5%) | 37 (4.4%) | |
| - No | 749 (95.5%) | 55 (96.5%) | 804 (95.6%) | |
| Tobacco use: | 0.374 (2) | |||
| - Yes | 284 (36.2%) | 24 (42.1%) | 308 (36.6%) | |
| - No | 500 (63.8%) | 33 (57.9%) | 533 (63.4%) | |
| Menopause status: | 0.015 (2) | |||
| - Pre-menopausal | 167 (21.3%) | 4 (7.0%) | 171 (20.3%) | |
| - Post-menopausal | 581 (74.1%) | 52 (91.2%) | 633 (75.3%) | |
| - Unclear | 36 (4.6%) | 1 (1.8%) | 37 (4.4%) | |
| Is the patient on vaginal estrogen? | 0.238 (2) | |||
| - Yes | 119 (15.2%) | 12 (21.1%) | 131 (15.6%) | |
| - No | 665 (84.8%) | 45 (78.9%) | 710 (84.4%) | |
| Does the patient have Overactive Bladder? | 0.112 (2) | |||
| - Yes | 396 (50.5%) | 35 (61.4%) | 431 (51.2%) | |
| - No | 388 (49.5%) | 22 (38.6%) | 410 (48.8%) | |
| Average number of voids at night: | 0.384 (1) | |||
| - Median | 2.0 | 2.0 | 2.0 | |
| - Q1, Q3 | 1.0, 3.0 | 1.0, 3.0 | 1.0, 3.0 | |
| Pelvic Organ Prolapse Quantification Stage | 0.022 (2) | |||
| - 0 | 282 (36.0%) | 29 (53.7%) | 311 (37.1%) | |
| - 1 | 96 (12.2%) | 4 (7.4%) | 100 (11.9%) | |
| - 2 | 223 (28.4%) | 6 (11.1%) | 229 (27.3%) | |
| - 3 | 151 (19.3%) | 13 (24.1%) | 164 (19.6%) | |
| - 4 | 32 (4.1%) | 2 (3.7%) | 34 (4.1%) | |
| Pelvic Floor Distress Inventory-20 Total Score | 0.387 (1) | |||
| - Median | 100.0 | 100.0 | 100.0 | |
| - Q1, Q3 | 95.8, 100.2 | 100.0, 109.0 | 98.0, 101.0 | |
| Pelvic Organ Prolapse Distress Inventory-6 | 0.606 (1) | |||
| - Median | 29.2 | 29.2 | 29.2 | |
| - Q1, Q3 | 25.0, 29.2 | 29.2, 33.3 | 25.0, 29.2 | |
| Colorectal-Anal Distress Inventory-8 | 0.738 (1) | |||
| - Median | 18.8 | 18.8 | 18.8 | |
| - Q1, Q3 | 18.8, 21.9 | 18.8, 25.0 | 18.8, 21.9 | |
| Urinary Distress Inventory-6 | 0.170 (1) | |||
| - Median | 50.0 | 50.0 | 50.0 | |
| - Q1, Q3 | 45.8, 50.0 | 50.0, 54.2 | 50.0, 50.0 | |
| Year | 0.951 (2) | |||
| - 2022 | 288 (36.7%) | 22 (38.6%) | 310 (36.9%) | |
| - 2023 | 332 (42.3%) | 23 (40.4%) | 355 (42.2%) | |
| - 2024 | 164 (20.9%) | 12 (21.1%) | 176 (20.9%) | |
| What was the reason for the urodynamics? | 0.331 (2) | |||
| - evaluation of mixed incontinence | 135 (17.2%) | 9 (15.8%) | 144 (17.1%) | |
| - evaluation of neurogenic bladder | 1 (0.1%) | 0 (0.0%) | 1 (0.1%) | |
| - evaluation of OAB | 179 (22.8%) | 21 (36.8%) | 200 (23.8%) | |
| - evaluation of voiding dysfunction | 19 (2.4%) | 2 (3.5%) | 21 (2.5%) | |
| - other | 33 (4.2%) | 2 (3.5%) | 35 (4.2%) | |
| - pre-op for prolapse surgery | 279 (35.6%) | 17 (29.8%) | 296 (35.2%) | |
| - pre-op for sling | 138 (17.6%) | 6 (10.5%) | 144 (17.1%) |
# Rename columns to remove spaces and special characters
colnames(labels_df) <- make.names(colnames(labels_df))
# Remove Record.ID if it still exists after make.names conversion
# This is a non-contributory identifier variable
if ("Record.ID" %in% names(labels_df)) {
labels_df <- labels_df %>% select(-Record.ID)
message("✅ Removed Record.ID from labels_df")
}#' Create a simple and robust variable importance plot for logistic regression
#'
#' This function extracts variable importance measures from a logistic regression model
#' and creates a simple visualization. It's designed to be robust and work with models
#' that might have complex structures.
#'
#' @param model An rms::lrm object or a glm object with family=binomial
#' @param top_n Number of top variables to show (default: all)
#' @param title Custom title for the plot (default: "Variable Importance")
#'
#' @return A ggplot2 object showing variable importance
#'
#' @examples
#' # Assuming you have an lrm model called 'model'
#' imp_plot <- plot_var_importance(model)
#' print(imp_plot)
#'
#' # Show only top 5 variables with custom title
#' imp_plot <- plot_var_importance(model, top_n = 5,
#' title = "Top Predictors of Cancellation")
#' print(imp_plot)
#'
#' @importFrom ggplot2 ggplot aes geom_col coord_flip labs theme_minimal theme
#' element_text element_blank reorder
plot_var_importance <- function(model, top_n = NULL, title = "Variable Importance") {
# Check if ggplot2 is available
if (!requireNamespace("ggplot2", quietly = TRUE)) {
stop("ggplot2 package is required for this function")
}
# Get the coefficients while handling potential errors
tryCatch({
# Try to extract coefficients
if (inherits(model, "lrm")) {
# For rms::lrm models
coefficients <- model$coefficients
# Skip intercept
coefficients <- coefficients[names(coefficients) != "Intercept"]
} else if (inherits(model, "glm")) {
# For glm models
coefficients <- coef(model)[-1] # Skip intercept
} else {
stop("Model must be an 'lrm' or 'glm' object")
}
# Check if we have any coefficients
if (length(coefficients) == 0) {
stop("No coefficients found in the model")
}
# Create a data frame with variable importance
importance_df <- data.frame(
variable = names(coefficients),
importance = abs(coefficients), # Use absolute values for importance
stringsAsFactors = FALSE
)
# Clean variable names (remove backticks and handle factor levels)
importance_df$variable <- gsub("`", "", importance_df$variable)
# Remove non-contributory identifier variables that should not appear in importance plots
exclude_vars <- c("Record.ID", "Record ID", "MRN", "MRN.", "Completed", "Complete")
importance_df <- importance_df[!grepl(paste(exclude_vars, collapse = "|"), importance_df$variable, ignore.case = TRUE), ]
# Sort by importance
importance_df <- importance_df[order(importance_df$importance, decreasing = TRUE), ]
# Apply top_n filter if specified
if (!is.null(top_n) && is.numeric(top_n) && top_n > 0) {
n_vars <- nrow(importance_df)
if (top_n < n_vars) {
importance_df <- importance_df[1:top_n, ]
}
}
# Reorder factor levels for plotting
importance_df$variable <- factor(importance_df$variable,
levels = rev(importance_df$variable))
# Create plot
p <- ggplot2::ggplot(
importance_df,
ggplot2::aes(x = variable, y = importance)
) +
ggplot2::geom_col(fill = "steelblue") +
ggplot2::coord_flip() +
ggplot2::labs(
title = title,
x = NULL,
y = "Absolute Coefficient Magnitude"
) +
ggplot2::theme_minimal(base_size = 14) +
ggplot2::theme(
plot.title = ggplot2::element_text(face = "bold", hjust = 0.5),
panel.grid.minor = ggplot2::element_blank(),
panel.grid.major.y = ggplot2::element_blank()
)
return(p)
}, error = function(e) {
message("Error creating variable importance plot: ", e$message)
# Create fallback plot with error message
df <- data.frame(x = 1, y = 1)
p <- ggplot2::ggplot(df, ggplot2::aes(x, y)) +
ggplot2::geom_blank() +
ggplot2::annotate("text", x = 1, y = 1,
label = paste("Could not create plot:", e$message),
hjust = 0.5, vjust = 0.5) +
ggplot2::theme_void() +
ggplot2::labs(title = "Error in Variable Importance Plot")
return(p)
})
}
#' Plot Wald statistics from a logistic regression model
#'
#' This function calculates and plots Wald statistics from a logistic regression model,
#' which can be used as a measure of variable importance.
#'
#' @param model An rms::lrm object
#' @param top_n Number of top variables to show (default: all)
#' @param title Custom title for the plot (default: "Variable Importance (Wald Statistic)")
#'
#' @return A ggplot2 object showing Wald statistics
#'
#' @examples
#' # Assuming you have an lrm model called 'model'
#' wald_plot <- plot_wald_statistics(model, top_n = 5)
#' print(wald_plot)
#'
#' @importFrom ggplot2 ggplot aes geom_col coord_flip labs theme_minimal theme
#' element_text element_blank reorder
plot_wald_statistics <- function(model, top_n = NULL,
title = "Variable Importance (Wald Statistic)") {
# Check if model is from rms package
if (!inherits(model, "lrm")) {
stop("Model must be an 'lrm' object from the rms package")
}
# Extract model summary
tryCatch({
# Get coefficients and standard errors
coefs <- model$coefficients
# Skip intercept
coefs <- coefs[names(coefs) != "Intercept"]
# Get variance-covariance matrix
vcov_matrix <- model$var
# Calculate standard errors
se <- sqrt(diag(vcov_matrix))[names(coefs)]
# Calculate Wald statistics
wald_stats <- abs(coefs / se)
# Create data frame
wald_df <- data.frame(
variable = names(wald_stats),
importance = as.numeric(wald_stats),
stringsAsFactors = FALSE
)
# Clean variable names
wald_df$variable <- gsub("`", "", wald_df$variable)
# Sort by importance
wald_df <- wald_df[order(wald_df$importance, decreasing = TRUE), ]
# Apply top_n filter if specified
if (!is.null(top_n) && is.numeric(top_n) && top_n > 0) {
n_vars <- nrow(wald_df)
if (top_n < n_vars) {
wald_df <- wald_df[1:top_n, ]
}
}
# Reorder factor levels for plotting
wald_df$variable <- factor(wald_df$variable,
levels = rev(wald_df$variable))
# Create plot
p <- ggplot2::ggplot(
wald_df,
ggplot2::aes(x = variable, y = importance)
) +
ggplot2::geom_col(fill = "darkblue") +
ggplot2::coord_flip() +
ggplot2::labs(
title = title,
x = NULL,
y = "Absolute Wald Statistic (|Coefficient / SE|)"
) +
ggplot2::theme_minimal(base_size = 14) +
ggplot2::theme(
plot.title = ggplot2::element_text(face = "bold", hjust = 0.5),
panel.grid.minor = ggplot2::element_blank(),
panel.grid.major.y = ggplot2::element_blank()
)
return(p)
}, error = function(e) {
message("Error calculating Wald statistics: ", e$message)
# Create fallback plot with error message
df <- data.frame(x = 1, y = 1)
p <- ggplot2::ggplot(df, ggplot2::aes(x, y)) +
ggplot2::geom_blank() +
ggplot2::annotate("text", x = 1, y = 1,
label = paste("Could not create plot:", e$message),
hjust = 0.5, vjust = 0.5) +
ggplot2::theme_void() +
ggplot2::labs(title = "Error in Wald Statistics Plot")
return(p)
})
}
#' Create a simple but effective variable importance plot for most model types
#'
#' This function works with virtually any model object that has coefficients
#' and creates a simple variable importance plot.
#'
#' @param model Any model object with a coef() method
#' @param top_n Number of top variables to show (default: all)
#' @param title Custom title for the plot (default: "Variable Importance")
#' @param sort_by How to sort variables: "magnitude" (default) or "alphabetical"
#' @param color Bar color (default: "steelblue")
#'
#' @return A ggplot2 object showing variable importance
#'
#' @examples
#' # Works with virtually any regression model
#' universal_imp_plot <- universal_importance_plot(model, top_n = 5)
#' print(universal_imp_plot)
#'
#' # Customize appearance
#' universal_imp_plot <- universal_importance_plot(
#' model, top_n = 3, color = "darkred",
#' title = "Top 3 Predictors"
#' )
#' print(universal_imp_plot)
#'
#' @importFrom stats coef
universal_importance_plot <- function(model, top_n = NULL,
title = "Variable Importance",
sort_by = "magnitude",
color = "steelblue") {
# Try different methods to extract coefficients
tryCatch({
coeffs <- NULL
# Method 1: Try direct coef() method
if (is.function(stats::coef) && !is.null(stats::coef(model))) {
coeffs <- stats::coef(model)
# Remove intercept if present
if ("(Intercept)" %in% names(coeffs)) {
coeffs <- coeffs[names(coeffs) != "(Intercept)"]
}
if ("Intercept" %in% names(coeffs)) {
coeffs <- coeffs[names(coeffs) != "Intercept"]
}
}
# Method 2: Try accessing coefficients directly
if (is.null(coeffs) && !is.null(model$coefficients)) {
coeffs <- model$coefficients
# Remove intercept if present
if ("(Intercept)" %in% names(coeffs)) {
coeffs <- coeffs[names(coeffs) != "(Intercept)"]
}
if ("Intercept" %in% names(coeffs)) {
coeffs <- coeffs[names(coeffs) != "Intercept"]
}
}
# If still no coefficients, try model summary
if (is.null(coeffs) && !is.null(summary(model)$coefficients)) {
coeffs_matrix <- summary(model)$coefficients
if (is.matrix(coeffs_matrix) && nrow(coeffs_matrix) > 1) {
# Get coefficient values (first column)
coeffs <- coeffs_matrix[, 1]
# Remove intercept if present
if ("(Intercept)" %in% names(coeffs)) {
coeffs <- coeffs[-1] # Assume intercept is first row
}
}
}
# Check if we found any coefficients
if (is.null(coeffs) || length(coeffs) == 0) {
stop("Could not extract coefficients from the model")
}
# Create data frame
imp_df <- data.frame(
variable = names(coeffs),
importance = abs(coeffs),
stringsAsFactors = FALSE
)
# Clean variable names and improve readability
imp_df$variable <- gsub("`", "", imp_df$variable)
# Remove non-contributory identifier variables that should not appear in importance plots
exclude_vars <- c("Record.ID", "Record ID", "MRN", "MRN.", "Completed", "Complete")
imp_df <- imp_df[!grepl(paste(exclude_vars, collapse = "|"), imp_df$variable, ignore.case = TRUE), ]
# Create readable labels for common abbreviations and variable names
label_map <- c(
"Age" = "Age (years)",
"BMI" = "Body Mass Index",
"POP.Q.stage." = "POP-Q Stage",
"POPDI_6" = "Pelvic Organ Prolapse Distress",
"CRADI_8" = "Colorectal-Anal Distress",
"UDI_6" = "Urinary Distress",
"PFDI_20_total" = "Pelvic Floor Distress Total",
"Does.the.patient.have.OAB." = "Overactive Bladder",
"Does.the.patient.have.diabetes." = "Diabetes",
"Does.the.patient.have.a.h.o.recurrent.UTIs." = "History of Recurrent UTIs",
"Is.the.patient.on.vaginal.estrogen." = "Vaginal Estrogen Use",
"Immunocompromised." = "Immunocompromised",
"Tobacco.use." = "Tobacco Use",
"Menopause.status." = "Menopause Status",
"What.was.the.reason.for.the.urodynamics." = "Indication for Urodynamics",
"Average.number.of.voids.at.night." = "Nocturia (voids/night)",
"Race." = "Race",
"Year" = "Year"
)
# Apply label improvements
for (i in seq_len(nrow(imp_df))) {
var_name <- imp_df$variable[i]
# Remove level suffixes for matching (e.g., "Race.=White" -> "Race.")
base_var <- sub("=.*$", "", var_name)
level_suffix <- sub("^[^=]*", "", var_name)
if (base_var %in% names(label_map)) {
if (level_suffix != "") {
imp_df$variable[i] <- paste0(label_map[base_var], level_suffix)
} else {
imp_df$variable[i] <- label_map[base_var]
}
} else {
# Clean up periods and dots in variable names
imp_df$variable[i] <- gsub("\\.", " ", imp_df$variable[i])
imp_df$variable[i] <- gsub(" +", " ", imp_df$variable[i])
imp_df$variable[i] <- trimws(imp_df$variable[i])
}
}
# Sort as requested
if (sort_by == "magnitude" || sort_by == "importance") {
imp_df <- imp_df[order(imp_df$importance, decreasing = TRUE), ]
} else if (sort_by == "alphabetical" || sort_by == "alpha") {
imp_df <- imp_df[order(imp_df$variable), ]
}
# Apply top_n filter
if (!is.null(top_n) && is.numeric(top_n) && top_n > 0) {
if (top_n < nrow(imp_df)) {
imp_df <- imp_df[1:top_n, ]
}
}
# For ggplot, reorder factor levels
if (sort_by == "magnitude" || sort_by == "importance") {
imp_df$variable <- factor(imp_df$variable, levels = rev(imp_df$variable))
} else {
imp_df$variable <- factor(imp_df$variable, levels = sort(imp_df$variable))
}
# Create plot with improved aesthetics
p <- ggplot2::ggplot(
imp_df,
ggplot2::aes(x = variable, y = importance)
) +
ggplot2::geom_col(fill = color, width = 0.7) +
ggplot2::geom_text(
ggplot2::aes(label = sprintf("%.2f", importance)),
hjust = -0.1, size = 4, fontface = "bold"
) +
ggplot2::coord_flip() +
ggplot2::labs(
title = title,
x = NULL,
y = "Absolute Coefficient Magnitude"
) +
ggplot2::scale_y_continuous(expand = ggplot2::expansion(mult = c(0, 0.15))) +
ggplot2::theme_minimal(base_size = 14) +
ggplot2::theme(
plot.title = ggplot2::element_text(face = "bold", hjust = 0.5, size = 16),
axis.text.y = ggplot2::element_text(size = 12, color = "black"),
axis.text.x = ggplot2::element_text(size = 11),
axis.title.x = ggplot2::element_text(size = 12, face = "bold"),
panel.grid.minor = ggplot2::element_blank(),
panel.grid.major.y = ggplot2::element_blank(),
plot.margin = ggplot2::margin(10, 20, 10, 10)
)
return(p)
}, error = function(e) {
message("Error creating variable importance plot: ", e$message)
# Create fallback error message plot
df <- data.frame(x = 1, y = 1)
p <- ggplot2::ggplot(df, ggplot2::aes(x, y)) +
ggplot2::geom_blank() +
ggplot2::annotate("text", x = 1, y = 1,
label = paste("Could not create plot:", e$message),
hjust = 0.5, vjust = 0.5) +
ggplot2::theme_void() +
ggplot2::labs(title = "Error in Variable Importance Plot")
return(p)
})
}# ============================================================================
# IMPROVED VARIABLE IMPORTANCE PLOT - Publication Quality
# Uses coefficients from the model to show considered predictors
# NOTE: Uses lrm model coefficients since LASSO is fit later in the document
# ============================================================================
# Generate the importance data from the model object (lrm)
# cv_lasso is not available at this point - it's defined in a later section
imp_data <- tryCatch({
# Get coefficients from the lrm model
if (!exists("model") || is.null(model)) {
stop("Model object not found")
}
# Extract coefficients from lrm model
coeffs <- coef(model)
coeffs <- coeffs[names(coeffs) != "Intercept"]
# Create importance data frame
imp_df <- data.frame(
variable = names(coeffs),
coefficient = coeffs,
importance = abs(coeffs),
direction = ifelse(coeffs > 0, "Increases Risk", "Decreases Risk"),
stringsAsFactors = FALSE
)
# Remove identifier variables and keep only non-zero coefficients
imp_df <- imp_df[!grepl("Record|MRN|Complete", imp_df$variable, ignore.case = TRUE), ]
imp_df <- imp_df[imp_df$importance > 0.001, ] # Remove near-zero coefficients
# Clean variable names for display
imp_df$display_name <- sapply(imp_df$variable, function(v) {
v <- gsub("\\.", " ", v)
v <- gsub(" +", " ", v)
v <- gsub("Does the patient have ", "", v)
v <- gsub("Is the patient ", "", v)
v <- gsub("What was the reason for the urodynamics ", "", v)
v <- gsub(" $", "", v)
v <- gsub("= ", ": ", v)
trimws(v)
})
imp_df <- imp_df[order(imp_df$importance, decreasing = TRUE), ]
imp_df <- head(imp_df, 10) # Top 10
imp_df$display_name <- factor(imp_df$display_name, levels = rev(imp_df$display_name))
imp_df
}, error = function(e) {
message("Error creating importance data: ", e$message)
NULL
})
if (!is.null(imp_data) && nrow(imp_data) > 0) {
# Create publication-quality plot with gradient coloring
publication_importance_plot <- ggplot(imp_data, aes(x = display_name, y = importance, fill = importance)) +
geom_col(width = 0.7, color = "white", linewidth = 0.5) +
geom_text(aes(label = sprintf("%.1f", importance)),
hjust = -0.1, size = 4, fontface = "bold", color = "gray20") +
scale_fill_gradient(low = "#74add1", high = "#d73027", guide = "none") +
coord_flip() +
labs(
title = "Top Predictors of Cancellation",
subtitle = "Based on absolute coefficient magnitude from logistic regression",
x = NULL,
y = "Importance (|Coefficient|)",
caption = "Note: Higher values indicate stronger association with cancellation"
) +
scale_y_continuous(expand = expansion(mult = c(0, 0.15))) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5, margin = margin(b = 5)),
plot.subtitle = element_text(size = 11, color = "gray40", hjust = 0.5, margin = margin(b = 15)),
plot.caption = element_text(size = 10, color = "gray50", face = "italic", margin = margin(t = 15)),
axis.text.y = element_text(size = 12, color = "gray10", face = "bold"),
axis.text.x = element_text(size = 11),
axis.title.x = element_text(size = 12, face = "bold", margin = margin(t = 10)),
panel.grid.major.y = element_blank(),
panel.grid.minor = element_blank(),
plot.margin = margin(20, 40, 15, 15)
)
print(publication_importance_plot)
# Save high-resolution version
ggsave(
filename = "variable_importance_publication.png",
plot = publication_importance_plot,
width = 9,
height = 6,
dpi = 300,
bg = "white"
)
} else {
cat("**Note:** Variable Importance Plot could not be generated. This may occur if the model has not been fitted or has no significant predictors.\n\n")
}Figure: Variable Importance Plot showing top predictors of procedure cancellation
#-----------------------------------------------------------------------------
# SIMPLIFIED TOP 5 PLOT for manuscript figure
#-----------------------------------------------------------------------------
final_plot <- universal_importance_plot(
model,
top_n = 5,
title = "Top 5 Predictors of Procedure Cancellation",
color = "#2C7FB8"
)
# Save the plot as a high-resolution PNG
ggsave(
filename = "variable_importance_plot.png",
plot = final_plot,
width = 7,
height = 5,
dpi = 300
)
# Or save as PDF for vector graphics
ggsave(
filename = "variable_importance_plot.pdf",
plot = final_plot,
width = 7,
height = 5
)
# Plots saved silently (no console output)The variables shown in the importance plot may differ from those selected by LASSO. This difference can be explained by several factors:
Different measurement approaches: Variable importance plots typically use absolute coefficient magnitude from the fitted model, while LASSO uses regularization to select variables during model building. These represent fundamentally different approaches to assessing variable importance.
Interaction with other variables: In your full logistic regression model, variables can influence each other. A variable might appear important in isolation (as in the importance plot) but become less influential when considering variable interactions that LASSO accounts for.
Factor variable handling: The importance plot shows individual factor levels as separate predictors, while LASSO might select entire categorical variables as important rather than specific levels.
Correlation structure: If predictors are correlated, LASSO tends to select one representative variable from correlated groups. Your importance plot is showing the marginal importance of each variable without considering these correlations.
Regularization effect: LASSO specifically penalizes coefficient size to prevent overfitting, potentially excluding variables with larger coefficients (which would appear important in the plot) if they don’t contribute enough additional predictive power.
This is why the nomogram based on LASSO may focus on different key predictors than the importance plot suggests. Variables with large coefficients in unregularized models may be excluded by LASSO if they don’t contribute sufficient additional predictive power when considering the full model context and correlations with other predictors. LASSO determines which variables provide the most stable and generalizable prediction.
For clinical interpretation, the LASSO-selected variables (shown in the feature selection results below) are likely more reliable for prediction than variables showing high importance in the unregularized model. The specific variables selected by LASSO are documented in Section 8.7.
The following sections document the systematic screening of variables for inclusion in the prediction model. Each step tracks which variables were removed and the rationale.
Variables with near-zero variance provide minimal discriminative information and can cause numerical instability in model fitting.
Variables removed due to near-zero variance:
Variables with excessive missing data (>90%) were excluded owing to concerns about imputation reliability.
No variables removed due to high missingness.
Highly correlated predictors (r > 0.90) can cause multicollinearity, inflating standard errors and destabilizing coefficient estimates.
No variables removed due to high collinearity.
Initial variables: 20
Variables removed: 1
Variables retained for modeling: 18
Table: Variables Excluded During Screening
Variable | Reason for Exclusion | Details |
|---|---|---|
Immunocompromised. | Near-zero variance | Frequency ratio: 21.73, % unique: 0.24% |
Any and All Predictors retained for LASSO feature selection:
Predictor Variable |
|---|
Age. |
BMI |
Race. |
Is.the.patient.hispanic..latino.or.of.Spanish.origin. |
Does.the.patient.have.diabetes. |
Does.the.patient.have.a.h.o.recurrent.UTIs. |
Tobacco.use. |
Menopause.status. |
Is.the.patient.on.vaginal.estrogen. |
Does.the.patient.have.OAB. |
Average.number.of.voids.at.night. |
POP.Q.stage. |
Year |
POPDI_6 |
CRADI_8 |
UDI_6 |
PFDI_20_total |
What.was.the.reason.for.the.urodynamics. |
LASSO (Least Absolute Shrinkage and Selection Operator) is a regularization technique that performs both variable selection and regularization to enhance prediction accuracy and model interpretability. Unlike traditional regression that uses all available predictors, LASSO automatically identifies and retains only the most important variables by shrinking less important coefficients to exactly zero.
Figure: Conceptual comparison of regression methods
Key insight from the figure above: Notice how LASSO (green bars) sets coefficients for X3, X4, X6, X7, X8, and X10 to exactly zero, matching the true underlying pattern. Ordinary Least Squares (red) produces noisy estimates for all variables, while Ridge (orange) shrinks all coefficients but never to exactly zero.
LASSO minimizes the following objective function:
\[ \hat{\beta}^{LASSO} = \arg\min_{\beta} \left\{ \underbrace{\sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{p} X_{ij} \beta_j \right)^2}_{\text{Prediction Error (RSS)}} + \underbrace{\lambda \sum_{j=1}^{p} |\beta_j|}_{\text{L1 Penalty}} \right\} \]
This equation has two components:
| Component | Formula | What it does |
|---|---|---|
| Residual Sum of Squares (RSS) | \(\sum(y_i - \hat{y}_i)^2\) | Measures how well the model fits the data |
| L1 Penalty | \(\lambda \sum |\beta_j|\) | Penalizes the absolute size of coefficients |
| Lambda (λ) | Tuning parameter | Controls the trade-off between fit and simplicity |
The key to understanding LASSO is its geometric interpretation:
Figure: Geometric interpretation of LASSO vs Ridge regression
Interpretation: - The blue dashed ellipses represent contours of equal prediction error (Residual Sum of Squares) - The colored regions show the constraint imposed by regularization - The optimal solution is where the Residual Sum of Squares contour first touches the constraint region - LASSO’s diamond shape has corners on the axes, making it likely that the solution will have some coefficients exactly equal to zero - Ridge’s circular shape means solutions typically don’t land on axes, so coefficients are shrunk but not zeroed
The tuning parameter λ controls how much we penalize large coefficients:
Figure: Effect of lambda on coefficient estimates
Reading the coefficient path plot: - Left side (low λ): All variables have non-zero coefficients - Right side (high λ): Only the strongest predictors remain - Variables that reach zero early (like Smoking and Exercise) are less important - Variables that persist (like Age and BMI) are the most important predictors
We use 10-fold cross-validation to find the best λ value. The process works as follows:
Conceptual illustration of cross-validation for lambda selection
Important Note: The lambda values displayed in the conceptual illustration above (such as λ.min ≈ 0.091) are simulated values for educational purposes only. The actual lambda values from our urodynamics data cross-validation are calculated and displayed in the sections below.
Two common choices for λ:
| Choice | Description | When to use |
|---|---|---|
| λ.min | Lambda that minimizes CV error | When prediction accuracy is paramount |
| λ.1se | Largest λ within 1 SE of minimum | When you want a simpler, more interpretable model |
Let’s see how LASSO works on our actual data for predicting procedure cancellation:
Figure: Step-by-step LASSO application to our data
Feature | Stepwise Selection | Ridge Regression | LASSO |
|---|---|---|---|
Automatic variable selection | Yes* | No | Yes |
Handles multicollinearity | No | Yes | Yes |
Prevents overfitting | No | Yes | Yes |
Produces sparse models | Yes | No | Yes |
Interpretable results | Moderate | Low | High |
Works with high-dimensional data | No | Yes | Yes |
Stable across samples | No | Yes | Yes |
Computationally efficient | Yes | Yes | Yes |
*Stepwise selection is unstable and may produce different results with small data changes | |||
Category | Key Point |
|---|---|
1. WHAT IT DOES | Selects important variables automatically |
Shrinks unimportant coefficients to exactly zero | |
Produces simpler, more interpretable models | |
2. HOW IT WORKS | Adds L1 penalty to the loss function |
Diamond-shaped constraint creates sparse solutions | |
Cross-validation finds optimal penalty strength (λ) | |
3. WHY WE USED IT | Started with 20+ candidate predictors |
Needed objective, reproducible variable selection | |
Wanted to avoid overfitting with limited events | |
4. WHAT WE FOUND | LASSO retained only key predictors (see results below) |
Non-informative variables shrunk to zero coefficients | |
This parsimonious model is more likely to generalize |
# Create a data frame with the cross-validation results
cv_results <- data.frame(
lambda = cv_lasso$lambda,
mean_cross_validated_error = cv_lasso$cvm, # Mean cross-validated error
standard_error_of_cross_validation = cv_lasso$cvsd, # Standard error of CV error
upper_bound_for_cross_validation_error = cv_lasso$cvup, # Upper bound for CV error
lower_bound_for_cross_validation_error = cv_lasso$cvlo # Lower bound for CV error
)
# Get the optimal lambda value first
optimal_lambda <- cv_lasso$lambda.min
# Find the row index of the optimal lambda
optimal_row <- which.min(abs(cv_results$lambda - optimal_lambda))
# Display actual cross-validation results from our data
cv_results_display <- cv_results %>%
head(20) %>%
mutate(
lambda = round(lambda, 6),
mean_cross_validated_error = round(mean_cross_validated_error, 4),
standard_error_of_cross_validation = round(standard_error_of_cross_validation, 4),
upper_bound_for_cross_validation_error = round(upper_bound_for_cross_validation_error, 4),
lower_bound_for_cross_validation_error = round(lower_bound_for_cross_validation_error, 4)
) %>%
rename(
`λ (Lambda)` = lambda,
`CV Error` = mean_cross_validated_error,
`Std Error` = standard_error_of_cross_validation,
`Upper Bound` = upper_bound_for_cross_validation_error,
`Lower Bound` = lower_bound_for_cross_validation_error
)
# Create flextable with highlighted optimal row
cv_results_display %>%
flextable() %>%
set_caption(paste0("LASSO Cross-Validation Results from Our Urodynamics Data (Optimal λ = ",
round(optimal_lambda, 6), ")")) %>%
theme_vanilla() %>%
autofit() %>%
bold(part = "header") %>%
fontsize(size = 9, part = "all") %>%
bg(i = optimal_row, bg = "#90EE90") %>% # Highlight optimal lambda row in green
bold(i = optimal_row) %>%
add_footer_lines(paste0("Green highlighted row shows the optimal λ (lambda.min = ",
round(optimal_lambda, 6),
") that minimizes cross-validated error.")) %>%
fontsize(size = 8, part = "footer") %>%
italic(part = "footer")λ (Lambda) | CV Error | Std Error | Upper Bound | Lower Bound |
|---|---|---|---|---|
0.028216 | 0.4820 | 0.0540 | 0.5359 | 0.4280 |
0.025710 | 0.4819 | 0.0541 | 0.5360 | 0.4278 |
0.023426 | 0.4809 | 0.0543 | 0.5352 | 0.4266 |
0.021345 | 0.4790 | 0.0546 | 0.5335 | 0.4244 |
0.019448 | 0.4762 | 0.0547 | 0.5308 | 0.4215 |
0.017721 | 0.4731 | 0.0546 | 0.5277 | 0.4185 |
0.016146 | 0.4703 | 0.0544 | 0.5248 | 0.4159 |
0.014712 | 0.4680 | 0.0543 | 0.5222 | 0.4137 |
0.013405 | 0.4661 | 0.0542 | 0.5204 | 0.4119 |
0.012214 | 0.4649 | 0.0542 | 0.5191 | 0.4106 |
0.011129 | 0.4642 | 0.0544 | 0.5185 | 0.4098 |
0.010140 | 0.4640 | 0.0546 | 0.5185 | 0.4094 |
0.009240 | 0.4641 | 0.0548 | 0.5189 | 0.4093 |
0.008419 | 0.4648 | 0.0550 | 0.5198 | 0.4098 |
0.007671 | 0.4657 | 0.0553 | 0.5210 | 0.4105 |
0.006989 | 0.4669 | 0.0556 | 0.5225 | 0.4113 |
0.006368 | 0.4681 | 0.0560 | 0.5241 | 0.4121 |
0.005803 | 0.4694 | 0.0564 | 0.5258 | 0.4131 |
0.005287 | 0.4707 | 0.0567 | 0.5274 | 0.4140 |
0.004818 | 0.4719 | 0.0570 | 0.5289 | 0.4149 |
Green highlighted row shows the optimal λ (lambda.min = 0.01014) that minimizes cross-validated error. | ||||
These are the ACTUAL lambda values from our urodynamics data analysis (unlike the simulated values in the conceptual illustration above). The optimal λ value of 0.0101 was determined through 10-fold cross-validation on our actual dataset.
Cross-validation results from our urodynamics data showing optimal lambda selection
Interpretation of Our Cross-Validation Results:
kable(selected_features_df,
caption = "LASSO Model Selected Features and Coefficients",
col.names = c("Feature", "Coefficient"), # Match the actual column names
align = c("l", "r"), # Left-align text, right-align numbers
booktabs = TRUE,
linesep = "",
format = "html") %>%
kable_styling(
bootstrap_options = c("striped", "hover", "condensed", "responsive"),
full_width = FALSE,
position = "center"
) %>%
row_spec(0, bold = TRUE, color = "black", background = "#f2f2f2") %>%
column_spec(2, width = "100px") # Set width for the coefficient column| Feature | Coefficient |
|---|---|
| BMI | 0.0022262 |
| Does.the.patient.have.diabetes.Yes | 0.0732677 |
| Does.the.patient.have.a.h.o.recurrent.UTIs.Yes | 0.7003568 |
| Menopause.status.Post-menopausal | 0.6444027 |
| POP.Q.stage.2 | -0.4927426 |
| What.was.the.reason.for.the.urodynamics.evaluation of OAB | 0.3540581 |
# Create the regularization path plot
ggplot(lasso_filtered, aes(x = log10(Lambda), y = Coefficient, color = Feature)) +
geom_line(linewidth = 1.2) + # Ensure lines are drawn
geom_vline(xintercept = log10(optimal_lambda), linetype = "dashed", color = "red", linewidth = 1.2) +
labs(
title = "LASSO Regularization Path",
subtitle = "Red line indicates optimal lambda from cross-validation",
x = expression(log[10](lambda)),
y = "Coefficient Values"
) +
theme_minimal(base_size = 14) +
theme(
legend.position = "right", # Changed to show legend for feature identification
axis.title = element_text(face = "bold"),
plot.title = element_text(face = "bold", size = 16),
plot.subtitle = element_text(size = 12, color = "gray40")
)Figure: LASSO Regularization Path showing how coefficient values change with lambda
library(plotly)
# Create ggplot first
p_base <- ggplot(lasso_filtered, aes(x = log_lambda, y = Coefficient, color = Feature)) +
geom_line(linewidth = 0.8) +
geom_vline(xintercept = log10(optimal_lambda), linetype = "dashed",
color = "red", linewidth = 1) +
labs(
title = "Interactive LASSO Regularization Path",
subtitle = "Hover over lines to identify features; click legend to toggle",
x = "log₁₀(λ)",
y = "Coefficient Value"
) +
theme_minimal(base_size = 11) +
theme(
legend.position = "right",
axis.title = element_text(face = "bold"),
plot.title = element_text(face = "bold", size = 14)
) +
scale_color_viridis_d(option = "turbo")
# Convert to plotly for interactivity
p_interactive <- ggplotly(p_base, tooltip = c("color", "x", "y")) %>%
layout(
title = list(
text = paste0("Interactive LASSO Regularization Path<br>",
"<sup>Hover to identify features | Click legend to toggle | ",
"Optimal λ = ", round(cv_lasso$lambda.min, 4), "</sup>"),
font = list(size = 14)
),
legend = list(
orientation = "v",
x = 1.02,
y = 0.5,
font = list(size = 9)
),
hovermode = "closest"
) %>%
# Add annotation for optimal lambda
add_annotations(
x = log10(optimal_lambda),
y = max(lasso_filtered$Coefficient),
text = paste0("λ.min = ", round(cv_lasso$lambda.min, 4)),
showarrow = TRUE,
arrowhead = 2,
arrowsize = 0.5,
ax = 40,
ay = -30,
font = list(color = "red", size = 10)
)
p_interactiveVersion 3: Interactive plotly visualization - hover over lines to see feature names
This interactive visualization allows you to hover over any line to see the feature name and coefficient value, click legend entries to show/hide specific features, and zoom/pan to explore specific regions.
Figure Interpretation: LASSO Regularization Path
This graph shows the LASSO regularization path for a logistic regression model, illustrating how coefficient values change as the regularization strength (lambda) varies.
Key Observations:
Regularization Effect: As lambda increases (moving left to right on the x-axis), more coefficients are pushed toward zero. This demonstrates LASSO’s feature selection capability.
Optimal Lambda: The vertical red line indicates the optimal lambda value (λ = 0.0101) determined through cross-validation. This represents the best balance between model complexity and predictive performance.
Selected Predictors at Optimal Lambda: LASSO selected 6 feature(s) with non-zero coefficients:
Variables Eliminated: All other features have coefficients shrunk to exactly zero at the optimal lambda, indicating they contribute little predictive value after accounting for the selected variable(s).
Direction of Effects:
Figure Interpretation: Cross-Validation Plot
This figure shows the results of cross-validation for the LASSO logistic regression model, illustrating how model performance varies with different regularization strengths.
X-axis: Shows the log of lambda (Log(λ)) values, ranging from approximately -10 to -3. Lambda is the regularization parameter that controls the strength of the penalty in LASSO regression.
Y-axis: Shows the Binomial Deviance, which is the measure of error for your logistic regression model. Lower values indicate better model performance.
Red line: Represents the mean cross-validated error across the folds for each lambda value.
Gray error bars: Represent the standard error of the cross-validated error estimate.
Numbers at top: Indicate how many features (variables) are retained in the model at each lambda value. As lambda increases (moving right), fewer variables are kept in the model (from 30 variables at the lowest lambda to 0 at the highest).
Vertical dotted lines:
The model achieves its lowest error (around 0.42) at a log lambda of approximately -4.5, where 9 variables are retained in the model.
The lambda.1se recommendation (right dotted line at approximately -3.5) suggests using a more parsimonious model with only 3 features, which is still within one standard error of the minimum error. This is often preferred as it’s simpler while maintaining comparable performance.
The plot demonstrates the tradeoff between model complexity and performance: as you move from left to right, the model becomes simpler (fewer variables) while the error generally increases.
This cross-validation plot helps you objectively select the optimal regularization parameter for your final LASSO model predicting procedure cancellation.
Based on this graph, your final model would likely include either 9 features (for minimum error) or 3 features (for a more parsimonious model), which likely include age and BMI as shown in your nomogram.
The table above shows the predictors LASSO retained with non-zero coefficients at the optimal lambda. The coefficient magnitudes vary substantially based on the variable type and sample sizes in each subgroup.
Key Considerations for Predictor Selection:
Initial LASSO output: The cross-validation process identified the optimal λ, retaining predictors that contribute meaningfully to prediction after accounting for the regularization penalty.
Coefficient magnitude interpretation: Categorical variables may show larger coefficients for specific subgroups, but these represent smaller sample sizes. Continuous predictors have per-unit effects that apply across the entire patient population.
Final model selection: The prediction model uses predictors identified through statistical analysis (variables showing significant or trending associations with the outcome). This approach ensures:
Model performance: The final model balances discrimination ability with parsimony to ensure generalizability for clinical use.
The Variance Inflation Factor (VIF) quantifies how much the variance of a regression coefficient is inflated due to multicollinearity with other predictors. A VIF of 1 indicates no correlation with other predictors, while higher values suggest increasing multicollinearity.
Interpretation Guidelines: - VIF < 5: Generally acceptable - VIF 5-10: Moderate multicollinearity, warrants attention - VIF > 10: High multicollinearity, problematic
VIF Values for Model Predictors:
| Predictor | VIF | Interpretation |
|---|---|---|
| Menopause.status.=Post-menopausal | 1.990 | Acceptable |
| Age. | 1.672 | Acceptable |
| Menopause.status.=Unclear | 1.324 | Acceptable |
| POP.Q.stage.=3 | 1.193 | Acceptable |
| POP.Q.stage.=2 | 1.119 | Acceptable |
| BMI | 1.100 | Acceptable |
| POP.Q.stage.=1 | 1.090 | Acceptable |
| UDI_6 | 1.068 | Acceptable |
| Does.the.patient.have.diabetes.=Yes | 1.049 | Acceptable |
| POP.Q.stage.=4 | 1.049 | Acceptable |
| Does.the.patient.have.a.h.o.recurrent.UTIs.=Yes | 1.045 | Acceptable |
Figure: VIF Bar Plot with Reference Lines
Interpretation: All VIF values are below 5, indicating no problematic multicollinearity among predictors. The model coefficients are stable.
This table shows two categories of metrics that evaluate how well your logistic regression model discriminates between outcomes.
Discrimination metrics assess the model’s overall explanatory power and calibration—how well the predicted probabilities match reality. These include R² (explained variance) and the Brier score (prediction accuracy).
Rank discrimination metrics assess the model’s ability to correctly order or rank patients by risk—whether higher-risk patients are consistently assigned higher probabilities than lower-risk patients. These include the C-statistic (concordance) and Somers’ D.
| Aspect | Discrimination | Rank Discrimination |
|---|---|---|
| Question answered | “How much variation does the model explain?” | “Can the model correctly rank patients by risk?” |
| Focus | Calibration and explained variance | Ordering/ranking ability |
| Key metrics | R², Brier score | C-statistic (AUC), Somers’ D |
| Sensitivity to prevalence | Yes, affected by class imbalance | Less affected |
Nagelkerke R² (0.167): This pseudo-R² indicates that approximately 16.7% of the variation in the outcome is explained by the model. While this seems modest, lower R² values are common in logistic regression compared to linear regression, especially for rare outcomes.
R²(2,841): This is an R² adjusted for the number of predictors (2) and observations (841). The lower value suggests some shrinkage when accounting for model complexity.
R²(2,effective N): This adjusted R² uses an effective sample size that accounts for the class imbalance (784 completed vs 57 cancelled). The higher value indicates better model performance when properly accounting for the rare event rate.
Brier Score (0.055): This measures the mean squared difference between predicted probabilities and actual outcomes. Values range from 0 (perfect) to 1 (worst). A score of 0.055 indicates excellent calibration. For context, a “null” model that predicts the baseline rate for everyone would have a Brier score of approximately 0.063.
C-statistic (0.774): Also known as the Area Under the Curve (AUC) or concordance statistic. For a randomly selected pair (one cancelled, one completed), this indicates the probability that the model assigns a higher risk to the cancelled case. Values above 0.7 indicate acceptable discrimination; 0.774 represents “good” discrimination.
Somers’ D (0.549): A transformation of the C-statistic (Dxy = 2 × C - 1). This measures the net proportion of concordant pairs over discordant pairs. A value of 0.549 indicates substantial positive association between predictions and outcomes.
Our dataset exhibits significant class imbalance: only 6.8% of procedures were cancelled (57 of 841). This creates several challenges:
We chose LASSO with threshold optimization because it maintains the interpretability needed for clinical decision-making while appropriately handling the imbalanced outcome.
The model shows good discriminative ability (C-statistic = 0.774) and excellent calibration (Brier score = 0.055). While the basic R² (0.167) appears modest, this is expected for rare events, and the adjusted R² demonstrates meaningful explanatory power when accounting for class imbalance.
For a model predicting a rare event (only about 6.8% positive cases), these metrics indicate a clinically useful model with good discrimination that can help identify patients at elevated risk of cancellation.
This calibration plot visualizes how well the predicted probabilities from your logistic regression model align with the actual observed outcomes. The x-axis (“Predicted Probability”) represents the probabilities that the model estimated, while the y-axis (“Observed Probability of Cancellation”) indicates how frequently cancellations occurred within each predicted probability bin. Each point (blue circle) corresponds to a group of observations with similar predicted probabilities, and the size of each circle reflects the number of observations in that group, as detailed in the legend. Vertical lines around each point represent the 95% confidence intervals, indicating the uncertainty of each observed proportion.
A dashed grey diagonal line shows perfect calibration (where predictions exactly match observations). Your model shows good overall calibration, as evidenced by points closely following this ideal diagonal line, particularly for predicted probabilities below approximately 0.3. However, at higher probabilities (around 0.3 to 0.4), there is more deviation from perfect calibration, suggesting predictions in this range might be less reliable due to fewer observations or greater variability. The Brier score (0.0601) supports this interpretation, indicating good predictive accuracy overall, with only minor discrepancies observed at higher predicted probabilities.
# Create the calibration plot using the function
calibration_plot <- create_calibration_plot(model)
# Display the plot
print(calibration_plot)The Brier Score and Area Under the Receiver Operating Characteristic Curve (AUC) are both valuable metrics used to assess model performance, though they measure different aspects of predictive accuracy. The Brier Score evaluates a model’s overall accuracy by simultaneously assessing calibration—how closely the predicted probabilities match observed outcomes—and its discriminatory power, or the ability to distinguish between different outcome classes. It ranges from 0 to 1, where lower values indicate better performance. A score of 0 represents perfect prediction accuracy, whereas higher scores indicate poorer performance, with scores closer to 0.25 typically suggesting no better performance than random chance in a balanced binary classification. Conversely, the AUC specifically quantifies a model’s ability to discriminate between two outcome classes. It ranges from 0.5 to 1.0, with 0.5 indicating random guessing (no discrimination ability) and 1.0 indicating perfect discrimination. For instance, an AUC of 0.85 demonstrates strong ability to correctly distinguish between classes based on predicted probabilities, but it does not guarantee that these predicted probabilities accurately reflect the actual observed frequencies. Thus, while the Brier Score combines both calibration (the accuracy of predicted probabilities) and discrimination (the ability to distinguish between outcomes), the AUC exclusively measures discrimination.
roc_results <- plot_roc_curve(
input_data = selected_labels_df,
outcome_variable = "Was.the.procedure.cancelled.",
prediction_model = model,
prediction_type = "response",
#optimal_threshold = TRUE,
confidence_intervals = TRUE,
ci_method = "bootstrap",
#custom_title = NULL,
save_path = NULL,
verbose = TRUE
)
print(roc_results)$roc_object
Call: roc.default(response = actual_outcome_values, predictor = prediction_values)
Data: prediction_values in 782 controls (actual_outcome_values 0) > 53 cases (actual_outcome_values 1). Area under the curve: 0.5464
$plot_object
The values in your ROC curve represent important metrics related to the performance of your predictive model:
Interpreting ROC Results (Note: Values calculated below in the optimal threshold section)
The ROC curve analysis helps evaluate model performance at different classification thresholds. Key concepts:
Sensitivity (True Positive Rate) - The proportion of actual cancellations correctly identified by the model - Higher sensitivity means fewer missed cancellations
Specificity (True Negative Rate) - The proportion of completed procedures correctly identified by the model - Higher specificity means fewer false alarms
What this means for your model: - The AUC of 0.774 indicates a moderately good model (values range from 0.5 for random guessing to 1.0 for perfect prediction) - At the optimal threshold (calculated below), the model balances sensitivity and specificity - The specific sensitivity and specificity values at the optimal threshold are reported in the subsequent sections
Depending on your clinical goals, you might want to adjust this threshold: - If missing a cancellation is very costly, use a lower threshold to increase sensitivity - If falsely predicting cancellations is problematic, use a higher threshold to increase specificity
# Histogram of predicted probabilities
hist(predictions,
breaks = 30,
main = "Distribution of Predicted Cancellation Probabilities",
xlab = "Predicted Probability of Cancellation",
ylab = "Number of Patients",
col = "steelblue",
border = "white")
abline(v = optimal_threshold, col = "red", lwd = 2, lty = 2)
legend("topright", legend = c(paste0("Optimal Threshold (", optimal_threshold_display, ")")), col = "red", lty = 2, lwd = 2)Distribution of predicted probabilities
The histogram above shows the distribution of predicted cancellation probabilities across all 841 patients in the dataset. Key observations:
Right-skewed distribution: Most patients have low predicted probabilities (concentrated between 0.0-0.15), which reflects the low baseline cancellation rate (6.8%) in the population.
Peak near the threshold: The distribution peaks near the optimal threshold (0.07), indicating most patients are predicted to have relatively low risk of cancellation. This is expected given the 93.2% completion rate.
Long right tail: A smaller number of patients extend to higher probabilities (0.20-0.45), representing high-risk individuals (typically older patients with higher BMI).
Red dashed line: The optimal classification threshold (0.07) is shown. Patients with predicted probabilities above this line are classified as “likely to cancel.”
Why is the threshold so low? The threshold of 0.07 (7%) may seem low, but it reflects the base rate of cancellation in the data. Because cancellations are rare, the optimal threshold that balances sensitivity and specificity is near the population prevalence. Using the standard 0.5 threshold would result in classifying virtually no patients as “at risk.”
Optimal Threshold Interpretation (Programmatic): The optimal classification threshold of 0.0503053 (5%) was determined by maximizing Youden’s J statistic (sensitivity + specificity - 1), which identifies the probability cutoff that best balances the ability to detect true cancellations (sensitivity) while minimizing false alarms (1 - specificity). This threshold is close to the base cancellation rate of 6.8%, which is expected when the outcome is rare—the optimal threshold gravitates toward the prevalence to avoid systematic over- or under-prediction.
# 5. Print the confusion matrix as a formatted table
conf_df <- as.data.frame.matrix(conf_matrix)
conf_df <- cbind(Actual = rownames(conf_df), conf_df)
rownames(conf_df) <- NULL
# Display using flextable for better formatting
conf_ft <- flextable(conf_df) %>%
set_caption("Confusion Matrix: Actual vs. Predicted Classifications") %>%
set_header_labels(Actual = "Actual Outcome") %>%
add_header_row(values = c("", "Predicted"), colwidths = c(1, 2)) %>%
theme_vanilla() %>%
autofit() %>%
align(align = "center", part = "all") %>%
bold(part = "header") %>%
bg(i = 1, j = 2, bg = "#d4edda") %>% # True negatives - green
bg(i = 2, j = 3, bg = "#d4edda") %>% # True positives - green
bg(i = 1, j = 3, bg = "#f8d7da") %>% # False positives - red
bg(i = 2, j = 2, bg = "#f8d7da") # False negatives - red
conf_ftPredicted | ||
|---|---|---|
Actual Outcome | Cancelled | Completed |
Cancelled | 16 | 37 |
Completed | 330 | 452 |
# 6. Calculate and display metrics
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)
sensitivity <- conf_matrix["Cancelled", "Cancelled"] / sum(conf_matrix["Cancelled", ])
specificity <- conf_matrix["Completed", "Completed"] / sum(conf_matrix["Completed", ])
# Create metrics table
metrics_df <- data.frame(
Metric = c("Accuracy", "Sensitivity (Recall)", "Specificity"),
Value = c(paste0(round(accuracy * 100, 1), "%"),
paste0(round(sensitivity * 100, 1), "%"),
paste0(round(specificity * 100, 1), "%")),
Interpretation = c(
"Overall correct classification rate",
"Proportion of actual cancellations correctly identified",
"Proportion of completed procedures correctly identified"
)
)
flextable(metrics_df) %>%
set_caption(paste0("Classification Performance Metrics at Optimal Threshold (", optimal_threshold_display, ")")) %>%
theme_vanilla() %>%
autofit() %>%
bold(part = "header")Metric | Value | Interpretation |
|---|---|---|
Accuracy | 56% | Overall correct classification rate |
Sensitivity (Recall) | 30.2% | Proportion of actual cancellations correctly identified |
Specificity | 57.8% | Proportion of completed procedures correctly identified |
The confusion matrix shows how the model’s predictions compare to actual outcomes. At the optimal threshold of 0.05:
| Predicted: Completed | Predicted: Cancelled | |
|---|---|---|
| Actually Completed | True Negatives (TN) | False Positives (FP) |
| Actually Cancelled | False Negatives (FN) | True Positives (TP) |
Reading the results:
Trade-off at the chosen threshold:
The optimal threshold (0.05) balances sensitivity and specificity. However, clinical considerations may suggest adjusting this threshold:
Your classification model predicting whether a procedure would be cancelled achieved good overall accuracy. Given that cancellations represent a minority class (as indicated by your data, with 57 cancellations out of 841 observations), accuracy alone might not fully capture your model’s performance.
The sensitivity (recall) describes the model’s ability to correctly detect patients whose procedures actually were cancelled. A higher sensitivity means fewer missed cancellations. Your calculated sensitivity reflects how well the model identified those who truly experienced cancellations. Conversely, specificity represents the proportion of correctly identified non-cancelled procedures. A high specificity means your model avoids incorrectly labeling procedures as cancelled when they actually occurred as planned.
However, sensitivity and specificity alone don’t provide a full picture—particularly if cancellations are uncommon (as your data suggest, with cancellations being a minority class, around 6.8%). In these scenarios, positive predictive value (PPV) and negative predictive value (NPV) become essential.
Given your results:
To fully interpret your results in clinical practice, calculate the PPV and NPV. High values in these metrics would further support your model’s clinical usefulness, making it easier for clinicians to act confidently based on predicted risk levels.
Overall, your model demonstrates good discrimination and calibration, making it useful for anticipating procedural cancellations and enabling targeted management or counseling, potentially reducing cancellations or enhancing operational efficiency.
# Open TIFF graphics device BEFORE plotting
tiff(
filename = "/Users/tylermuffly/Dropbox (Personal)/Tunitsky/Data/nomogram_model.tiff",
width = 12,
height = 7,
units = "in",
res = 600
)The code below configures the data distribution settings required by
the rms package for nomogram generation. The datadist
object defines the range of values for Age and BMI that will be
displayed on the nomogram axes.
# Load required packages
library(rms)
library(logger)
# Set up logging
log_info("Starting procedure cancellation nomogram creation")
# Ensure dataset meets requirements
if (!exists("selected_labels_df")) {
stop("Error: selected_labels_df not found")
}
# ============================================================================
# DYNAMIC MODEL SETUP FOR NOMOGRAM
# Use the model_predictor_cols from earlier dynamic model building
# Filter to only columns available in selected_labels_df
# ============================================================================
log_info("Setting up nomogram with dynamically selected predictors")
# Get the predictor columns that were used in the dynamic model
# These should already exist from the earlier model-building chunk
if (!exists("available_cols") || length(available_cols) == 0) {
# Fallback: use all predictors from the earlier model
log_warn("available_cols not found, using model formula from earlier")
if (exists("model_formula")) {
formula_terms <- all.vars(model_formula)[-1] # Exclude outcome
available_cols <- formula_terms
} else {
# Ultimate fallback to original behavior
available_cols <- c("Age.", "BMI")
}
}
# Filter available_cols to only those present in selected_labels_df
# This handles the case where selected_labels_df has different columns than labels_df
original_cols <- available_cols
available_cols <- available_cols[available_cols %in% names(selected_labels_df)]
if (length(available_cols) == 0) {
# If none of the dynamic columns are available, fall back to what's in selected_labels_df
log_warn("None of the dynamic model columns found in selected_labels_df, using available predictors")
exclude_cols <- c("Was.the.procedure.cancelled.")
available_cols <- setdiff(names(selected_labels_df), exclude_cols)
}
if (length(original_cols) != length(available_cols)) {
log_info(paste("Filtered from", length(original_cols), "to", length(available_cols), "columns"))
log_info(paste("Removed columns not in selected_labels_df:",
paste(setdiff(original_cols, available_cols), collapse = ", ")))
}
log_info(paste("Using predictors for nomogram:", paste(available_cols, collapse = ", ")))
# Verify the outcome column exists
if (!"Was.the.procedure.cancelled." %in% names(selected_labels_df)) {
stop("Error: Outcome column 'Was.the.procedure.cancelled.' not found in selected_labels_df")
}
# Log ranges for numeric predictors
for (col in available_cols) {
if (is.numeric(selected_labels_df[[col]])) {
log_info(paste("Current", col, "range:",
min(selected_labels_df[[col]], na.rm=TRUE),
"to", max(selected_labels_df[[col]], na.rm=TRUE)))
} else {
log_info(paste(col, "is categorical with levels:",
paste(levels(factor(selected_labels_df[[col]])), collapse = ", ")))
}
}
# ✅ Step 1: Explicitly Recreate and Apply `datadist`
log_info("Creating and applying datadist")
dd <- rms::datadist(selected_labels_df)
# Set reasonable limits for numeric predictors
if ("Age." %in% available_cols && is.numeric(selected_labels_df$Age.)) {
dd$limits["Low", "Age."] <- 20
dd$limits["High", "Age."] <- 90
log_info(paste("Age. limits set to:", dd$limits["Low", "Age."], "to", dd$limits["High", "Age."]))
}
if ("BMI" %in% available_cols && is.numeric(selected_labels_df$BMI)) {
dd$limits["Low", "BMI"] <- 25
log_info(paste("BMI limits set to:", dd$limits["Low", "BMI"], "to", dd$limits["High", "BMI"]))
}
options(datadist = dd) # Ensure it is applied globally
log_info("datadist created and applied globally")
# ✅ Step 2: Create labels for predictor variables
log_info("Creating custom labels for variables")
# Variable label mapping (display-friendly names)
predictor_labels <- c(
"Age." = "Patient Age (years)",
"BMI" = "Body Mass Index",
"Does.the.patient.have.diabetes." = "Diabetes Status",
"Does.the.patient.have.a.h.o.recurrent.UTIs." = "Recurrent UTIs",
"Is.the.patient.Hispanic.or.Latino." = "Hispanic/Latino",
"POP.Q.stage" = "POP-Q Stage",
"What.is.the.patient.s.menopause.status." = "Menopause Status",
"Has.the.patient.ever.used.tobacco.products." = "Tobacco Use",
"Is.the.patient.immunocompromised." = "Immunocompromised",
"Is.the.patient.currently.using.vaginal.estrogen." = "Vaginal Estrogen Use",
"Is.the.patient.coming.for.evaluation.of.OAB." = "OAB Evaluation",
"Has.the.patient.tried.medication.s..for.OAB." = "OAB Medications",
"Nocturia..times.per.night." = "Nocturia (per night)",
"Year" = "Year of Procedure",
"PFDI.20.Score" = "PFDI-20 Score"
)
for (col in available_cols) {
if (col %in% names(predictor_labels)) {
label(selected_labels_df[[col]]) <- predictor_labels[col]
}
}
# Create display-friendly predictor names for later use
predictor_names_display <- sapply(available_cols, function(col) {
if (col %in% names(predictor_labels)) predictor_labels[col] else col
})
# ✅ Step 3: Refit Model with dynamic predictors
log_info("Fitting logistic regression model with dynamic predictors")
# Build formula from available columns
nomogram_formula <- as.formula(paste("Was.the.procedure.cancelled. ~",
paste(available_cols, collapse = " + ")))
log_info(paste("Nomogram formula:", deparse(nomogram_formula)))
model <- rms::lrm(
nomogram_formula,
data = selected_labels_df,
x = TRUE,
y = TRUE
)
log_info("Model fitted successfully")
log_info(paste("Model C-statistic:", model$stats["C"]))
log_info(paste("Number of predictors:", length(available_cols)))
# ✅ Step 4: Generate the Nomogram with updated funlabel
log_info("Generating nomogram with forced limits and custom labels")
nomogram_model <- rms::nomogram(
model,
fun = plogis,
fun.at = c(0.05, 0.10, 0.20, 0.30, 0.40),
lp = FALSE,
funlabel = "Probability of Cancellation"
)
log_info("Nomogram created successfully")
# ✅ Step 5: Plot
log_info("Setting up plot parameters and generating plot")
par(mar = c(5, 5, 2, 2)) # Adjust margins if needed
plot(nomogram_model) # Plot the nomogram# Save parameters for reproducibility (dynamic based on selected predictors)
session_info <- list(
r_version = R.version.string,
rms_version = packageVersion("rms"),
date_created = Sys.Date(),
probability_sequence = seq(0.05, 0.45, by = 0.1),
model_variables = available_cols,
variable_labels = predictor_names_display,
outcome_label = "Probability of Cancellation",
n_predictors = length(available_cols)
)
log_info("Saving session information for reproducibility")
log_info(paste("Analysis completed on", Sys.Date()))
# Note: dev.off() removed - it was closing the graphics device prematurely# Generate the final improved nomogram
nomogram_model <- nomogram(
model,
fun = plogis, # Convert linear predictor to probability
fun.at = c(0.05, 0.10, 0.20, 0.30, 0.40), # Define probability scale
lp = FALSE, # Don't show linear predictor
funlabel = "Probability of Cancellation" # Custom label
)
# Set up plot with optimal formatting for improved readability
par(mar = c(5, 6, 3, 2), # Increased margins for larger labels
mgp = c(3.5, 1, 0), # Axis label positions - moved further from axis
tcl = -0.4, # Slightly longer tick marks
las = 1, # Horizontal labels
font.lab = 2, # Bold axis labels
lwd = 2) # Thicker lines
# Plot with enhanced styling - BOLDER and more readable
plot(nomogram_model,
cex.axis = 1.2, # LARGER tick labels
cex.var = 1.6, # LARGER variable names
col.grid = gray(0.85), # Slightly darker grid for visibility
col.axis = "gray20", # Dark gray axis text
lmgp = 0.5, # Label margin
xfrac = 0.42, # Horizontal spacing
label.every = 1, # Show ALL labels for clarity
lwd = 2, # Thicker axis lines
col = "darkblue") # Blue color for main elements
# Add prominent title
title(main = "Clinical Prediction Nomogram: Procedure Cancellation Risk",
font.main = 2, # Bold
cex.main = 1.5, # LARGER title
col.main = "#2C3E50", # Dark blue-gray
line = 1.5)
# Add colored subtitle (dynamic based on predictors)
predictor_names_display <- sapply(available_cols, function(col) {
if (col %in% names(predictor_labels)) predictor_labels[col] else col
})
subtitle_text <- paste0("Estimate probability by summing points for ",
paste(predictor_names_display, collapse = ", "),
", then reading Total Points")
mtext(subtitle_text,
side = 3,
line = 0.3,
cex = 0.95,
col = "#7F8C8D", # Gray
font = 3) # Italic
# Add colored boxes around key labels (using rect if needed)
# Add footnote with model information (dynamic)
predictor_footnote <- paste(predictor_names_display, collapse = ", ")
mtext(paste0("Model: Logistic Regression | Predictors: ", predictor_footnote),
side = 1,
line = 3.8,
cex = 0.9,
col = "#34495E", # Dark blue-gray
font = 2) # Bold
# Add interpretation guide
mtext("Higher Total Points = Higher Risk of Cancellation",
side = 1,
line = 4.8,
cex = 0.85,
col = "#E74C3C", # Red for emphasis
font = 2) # BoldFigure 2. Clinical Prediction Nomogram for Urodynamic Procedure Cancellation
Bootstrap validation provides optimism-corrected estimates of model performance by resampling with replacement. This technique helps assess how much the apparent model performance may be inflated due to overfitting.
# Set seed for reproducibility
set.seed(42)
# Perform bootstrap validation with 1000 resamples
# This validates the lrm model and provides optimism-corrected statistics
log_info("Starting bootstrap validation with 1000 resamples...")
bootstrap_validation <- rms::validate(model, B = 1000, pr = FALSE)
log_info("Bootstrap validation completed")
# Convert to data frame for kable display
# The validate object is a matrix with special class, need to handle carefully
bootstrap_matrix <- unclass(bootstrap_validation)
bootstrap_df <- data.frame(
Metric = rownames(bootstrap_matrix),
Apparent = round(bootstrap_matrix[, "index.orig"], 4),
Training = round(bootstrap_matrix[, "training"], 4),
Test = round(bootstrap_matrix[, "test"], 4),
Optimism = round(bootstrap_matrix[, "optimism"], 4),
Corrected = round(bootstrap_matrix[, "index.corrected"], 4),
n = bootstrap_matrix[, "n"]
)
# Display as formatted kable table
kable(bootstrap_df,
col.names = c("Metric", "Apparent", "Training", "Test", "Optimism", "Corrected", "n"),
caption = "Bootstrap Internal Validation Results (B = 1000 resamples)",
align = c("l", "c", "c", "c", "c", "c", "c"),
row.names = FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE,
font_size = 13) %>%
column_spec(1, bold = TRUE) %>%
column_spec(6, bold = TRUE, color = "white", background = "#27AE60") %>%
add_header_above(c(" " = 1, "Performance Indices" = 5, " " = 1))| Metric | Apparent | Training | Test | Optimism | Corrected | n |
|---|---|---|---|---|---|---|
| Dxy | 0.5490 | 0.5860 | 0.5081 | 0.0779 | 0.4710 | 1000 |
| R2 | 0.1670 | 0.1972 | 0.1126 | 0.0846 | 0.0825 | 1000 |
| Intercept | 0.0000 | 0.0000 | -0.7532 | 0.7532 | -0.7532 | 1000 |
| Slope | 1.0000 | 1.0000 | 0.6765 | 0.3235 | 0.6765 | 1000 |
| Emax | 0.0000 | 0.0000 | 0.2543 | 0.2543 | 0.2543 | 1000 |
| D | 0.0644 | 0.0767 | 0.0427 | 0.0341 | 0.0303 | 1000 |
| U | -0.0024 | -0.0024 | 0.0147 | -0.0171 | 0.0147 | 1000 |
| Q | 0.0668 | 0.0791 | 0.0279 | 0.0512 | 0.0156 | 1000 |
| B | 0.0546 | 0.0535 | 0.0559 | -0.0024 | 0.0570 | 1000 |
| g | 1.3176 | 2.2082 | 1.2586 | 0.9496 | 0.3680 | 1000 |
| gp | 0.0673 | 0.0718 | 0.0540 | 0.0178 | 0.0495 | 1000 |
# Extract key metrics
apparent_dxy <- bootstrap_validation["Dxy", "index.orig"]
optimism_dxy <- bootstrap_validation["Dxy", "optimism"]
corrected_dxy <- bootstrap_validation["Dxy", "index.corrected"]
# Convert Dxy to C-statistic (C = 0.5 + Dxy/2)
apparent_c <- 0.5 + apparent_dxy/2
corrected_c <- 0.5 + corrected_dxy/2
optimism_c <- apparent_c - corrected_c# Create C-statistic summary table
c_stat_summary <- data.frame(
Measure = c("Apparent C-statistic", "Optimism", "Optimism-corrected C-statistic"),
Value = c(
sprintf("%.3f", apparent_c),
sprintf("%.3f", optimism_c),
sprintf("%.3f", corrected_c)
),
Interpretation = c(
"Performance on training data",
sprintf("Expected decrease (%.1f%%)", optimism_c * 100),
"Realistic estimate for new data"
)
)
kable(c_stat_summary,
caption = "C-Statistic Summary",
align = c("l", "c", "l")) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
font_size = 14) %>%
row_spec(3, bold = TRUE, background = "#E8F6F3")| Measure | Value | Interpretation |
|---|---|---|
| Apparent C-statistic | 0.774 | Performance on training data |
| Optimism | 0.039 | Expected decrease (3.9%) |
| Optimism-corrected C-statistic | 0.736 | Realistic estimate for new data |
Interpretation of Bootstrap Validation:
The optimism-corrected C-statistic of 0.736 represents a more realistic estimate of how the model would perform on new, unseen patients.
To demonstrate that the selected predictor model provides meaningful predictive value, we compare it to simpler alternatives using likelihood ratio tests and information criteria.
# ============================================================================
# DYNAMIC MODEL COMPARISON: Null vs First-predictor vs Full model
# Uses the available_cols from earlier dynamic model selection
# ============================================================================
log_info("Fitting comparison models...")
# Create binary outcome for glm
outcome_binary <- as.numeric(selected_labels_df$Was.the.procedure.cancelled. == "Cancelled")
# Null model (intercept only) - use glm since lrm has issues with ~ 1
model_null_glm <- glm(
outcome_binary ~ 1,
family = binomial(link = "logit")
)
# First predictor model (using first available predictor)
first_predictor <- available_cols[1]
first_predictor_formula <- as.formula(paste("Was.the.procedure.cancelled. ~", first_predictor))
first_predictor_display <- if (first_predictor %in% names(predictor_labels)) {
predictor_labels[first_predictor]
} else {
first_predictor
}
model_first_only <- tryCatch({
rms::lrm(first_predictor_formula, data = selected_labels_df, x = TRUE, y = TRUE)
}, error = function(e) {
log_warn(paste("Could not fit first predictor model:", e$message))
NULL
})
# Full model with all predictors - already created as 'model' in nomogram-setup
log_info("Model comparison completed")
# Calculate Brier score for null model manually
null_pred <- predict(model_null_glm, type = "response")
null_brier <- mean((outcome_binary - null_pred)^2)
# Build comparison stats dynamically
full_model_name <- paste0("Full Model (", length(available_cols), " predictors)")
# Initialize comparison data
comparison_stats <- data.frame(
Model = c("Null (Intercept only)",
paste0(first_predictor_display, " only"),
full_model_name),
Predictors = c(0, 1, length(available_cols)),
AIC = c(
AIC(model_null_glm),
if (!is.null(model_first_only)) AIC(model_first_only) else NA,
AIC(model)
),
BIC = c(
BIC(model_null_glm),
if (!is.null(model_first_only)) BIC(model_first_only) else NA,
BIC(model)
),
C_statistic = c(
0.5, # Null model has no discrimination
if (!is.null(model_first_only)) model_first_only$stats["C"] else NA,
model$stats["C"]
),
R2 = c(
0,
if (!is.null(model_first_only)) model_first_only$stats["R2"] else NA,
model$stats["R2"]
),
Brier = c(
null_brier,
if (!is.null(model_first_only)) model_first_only$stats["Brier"] else NA,
model$stats["Brier"]
)
)
# Round values
comparison_stats$AIC <- round(comparison_stats$AIC, 1)
comparison_stats$BIC <- round(comparison_stats$BIC, 1)
comparison_stats$C_statistic <- round(comparison_stats$C_statistic, 3)
comparison_stats$R2 <- round(comparison_stats$R2, 3)
comparison_stats$Brier <- round(comparison_stats$Brier, 4)
# Display table
cat("\n=== Model Comparison Statistics ===\n\n")=== Model Comparison Statistics ===
Predictors in full model: Patient Age (years), Recurrent UTIs, POP.Q.stage., Menopause.status., Body Mass Index, Diabetes Status, UDI_6
kable(comparison_stats,
col.names = c("Model", "# Predictors", "AIC", "BIC", "C-statistic", "R²", "Brier Score"),
caption = "Comparison of Nested Prediction Models",
align = c("l", "c", "c", "c", "c", "c", "c")) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
font_size = 13) %>%
row_spec(3, bold = TRUE, background = "#E8F6F3") # Highlight final model| Model | # Predictors | AIC | BIC | C-statistic | R² | Brier Score |
|---|---|---|---|---|---|---|
| Null (Intercept only) | 0 | 418.9 | 423.6 | 0.500 | 0.000 | 0.0632 |
| Patient Age (years) only | 1 | 387.1 | 396.6 | 0.733 | 0.101 | 0.0604 |
| Full Model (7 predictors) | 7 | 369.6 | 426.4 | 0.774 | 0.167 | 0.0546 |
=== Likelihood Ratio Tests ===
# Test 1: First-predictor model vs Null
if (!is.null(model_first_only)) {
chi_sq_first <- model_first_only$stats["Model L.R."]
df_first <- 1
p_first <- 1 - pchisq(chi_sq_first, df_first)
cat(paste0("1. ", first_predictor_display, "-only model vs. Null model:\n"))
cat(sprintf(" Chi-square = %.2f, df = %d, p %s\n\n",
chi_sq_first, df_first,
ifelse(p_first < 0.001, "< 0.001", sprintf("= %.3f", p_first))))
} else {
cat("1. First predictor model could not be fitted\n\n")
}# Test 2: Full model vs First-predictor model
if (!is.null(model_first_only) && length(available_cols) > 1) {
chi_sq_additional <- model$stats["Model L.R."] - model_first_only$stats["Model L.R."]
df_additional <- length(available_cols) - 1
p_additional <- 1 - pchisq(chi_sq_additional, df_additional)
cat(paste0("2. Full model vs. ", first_predictor_display, "-only model:\n"))
cat(sprintf(" Chi-square = %.2f, df = %d, p %s\n\n",
chi_sq_additional, df_additional,
ifelse(p_additional < 0.001, "< 0.001", sprintf("= %.3f", p_additional))))
} else if (length(available_cols) == 1) {
cat("2. Full model has only one predictor (same as first-predictor model)\n\n")
}# Test 3: Full model vs Null
chi_sq_full <- model$stats["Model L.R."]
df_full <- length(available_cols)
p_full <- 1 - pchisq(chi_sq_full, df_full)
cat("3. Full model vs. Null model:\n")cat(sprintf(" Chi-square = %.2f, df = %d, p %s\n",
chi_sq_full, df_full,
ifelse(p_full < 0.001, "< 0.001", sprintf("= %.3f", p_full))))Chi-square = 54.95, df = 7, p < 0.001
What are Likelihood Ratio Tests?
Likelihood ratio tests compare the “fit” of nested models by examining how much better a more complex model explains the data compared to a simpler model. The test statistic follows a chi-square distribution, where:
Interpreting Our Results:
Patient Age (years)-only vs. Null model (Chi-square = 33.74, p < 0.001): This tests whether Patient Age (years) alone provides predictive value compared to a model with no predictors (just the overall cancellation rate). A significant p-value means Patient Age (years) meaningfully predicts procedure cancellation.
Full model vs. Patient Age (years)-only (Chi-square = 21.21, p = 0.002): This tests whether adding the 6 additional predictor(s) (Age., Does.the.patient.have.a.h.o.recurrent.UTIs., POP.Q.stage., Menopause.status., BMI, Does.the.patient.have.diabetes., UDI_6) improves prediction beyond Patient Age (years) alone. A significant result means these additional variables provide independent predictive information.
Full model vs. Null (Chi-square = 54.95, p < 0.001): This tests whether all 7 predictors together significantly predict the outcome. This is the overall test of model significance.
Clinical Example: Imagine we’re asking three questions: - Does knowing a patient’s age help predict cancellation? (Test 1) - Once we know age, does adding BMI, UTI history, etc. help further? (Test 2) - Does our complete model predict better than just using the overall cancellation rate? (Test 3)
# Visualization of model comparison
comparison_long <- comparison_stats %>%
select(Model, C_statistic, R2) %>%
pivot_longer(cols = c(C_statistic, R2),
names_to = "Metric",
values_to = "Value") %>%
mutate(Metric = case_when(
Metric == "C_statistic" ~ "C-Statistic (Discrimination)",
Metric == "R2" ~ "Nagelkerke R² (Fit)"
))
# Build dynamic color palette for models
model_colors <- c(
setNames("#95A5A6", "Null (Intercept only)"),
setNames("#3498DB", paste0(first_predictor_display, " only")),
setNames("#27AE60", full_model_name)
)
# Dynamic subtitle
if (length(available_cols) > 1) {
plot_subtitle <- paste0("Adding predictors improves both discrimination and model fit")
} else {
plot_subtitle <- paste0(first_predictor_display, " improves discrimination over null model")
}
ggplot(comparison_long, aes(x = Model, y = Value, fill = Model)) +
geom_col(width = 0.7, color = "black", linewidth = 0.5) +
geom_text(aes(label = sprintf("%.3f", Value)),
vjust = -0.5, fontface = "bold", size = 4) +
facet_wrap(~ Metric, scales = "free_y") +
scale_fill_manual(values = model_colors) +
scale_y_continuous(expand = expansion(mult = c(0, 0.15))) +
labs(title = "Model Comparison: Incremental Value of Predictors",
subtitle = plot_subtitle,
x = "", y = "") +
theme_minimal(base_size = 12) +
theme(
plot.title = element_text(face = "bold", size = 14, hjust = 0.5),
plot.subtitle = element_text(size = 11, hjust = 0.5, color = "gray40"),
axis.text.x = element_text(angle = 15, hjust = 1, size = 10),
legend.position = "none",
strip.text = element_text(face = "bold", size = 11)
)Figure: Model Comparison - Discrimination and Fit
AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion):
These metrics balance model fit against complexity. Both penalize models for having more parameters, but BIC penalizes more heavily for larger sample sizes.
C-statistic (Concordance statistic, equivalent to AUC):
The C-statistic measures the model’s ability to discriminate between patients who will have their procedure cancelled versus completed.
0.9: Outstanding (rare in clinical prediction)
Nagelkerke R² (Pseudo-R²):
Unlike linear regression R², this pseudo-R² doesn’t have the same interpretation but indicates the proportion of “variation explained” in a logistic model.
Brier Score:
The Brier score measures calibration—how close the predicted probabilities are to the actual outcomes.
Summary of Our Model Comparison:
The likelihood ratio tests confirm that: 1. Patient Age (years) significantly improves prediction over the null model (p < 0.001) 2. Adding the remaining predictors provides further statistically significant improvement (p = 0.002) 3. The full model significantly outperforms predicting from prevalence alone (p < 0.001)
# Calculate 95% CI for C-statistic using DeLong method
log_info("Calculating C-statistic confidence interval...")
# Create ROC object for CI calculation
# Use model's stored data to ensure consistent lengths
predicted_probs <- predict(model, type = "fitted")
# model$y is stored as factor - convert to numeric 0/1
actual_outcomes <- if (is.factor(model$y)) as.numeric(model$y) - 1 else model$y
# Ensure lengths match
if (length(predicted_probs) != length(actual_outcomes)) {
min_len <- min(length(predicted_probs), length(actual_outcomes))
log_warn(paste("Aligning predicted_probs and actual_outcomes to length:", min_len))
predicted_probs <- predicted_probs[1:min_len]
actual_outcomes <- actual_outcomes[1:min_len]
}
roc_obj <- pROC::roc(actual_outcomes, predicted_probs, quiet = TRUE)
ci_auc <- pROC::ci.auc(roc_obj, method = "delong")
cat("\n=== C-Statistic with 95% Confidence Interval ===\n\n")=== C-Statistic with 95% Confidence Interval ===
C-statistic (AUC): 0.546
95% CI (DeLong): 0.468 - 0.625
Interpretation: We are 95% confident that the true C-statistic
in the population lies between 0.468 and 0.625.
# Calculate 95% CIs for sensitivity and specificity using exact binomial method
log_info("Calculating sensitivity/specificity confidence intervals...")
# Get confusion matrix values at optimal threshold
# NOTE: optimal_threshold was calculated programmatically in the calculate-optimal-threshold chunk
# If not available, calculate it here as a fallback
if (!exists("optimal_threshold") || is.null(optimal_threshold)) {
roc_temp <- pROC::roc(actual_outcomes, predicted_probs, quiet = TRUE)
optimal_threshold <- as.numeric(pROC::coords(roc_temp, "best", best.method = "youden")["threshold"])
message("Optimal threshold calculated as fallback: ", round(optimal_threshold, 4))
}
predicted_class <- ifelse(predicted_probs >= optimal_threshold, "Cancelled", "Completed")
predicted_class <- factor(predicted_class, levels = c("Completed", "Cancelled"))
# Use model$y for consistency with predicted_probs
# model$y is 0/1, so convert to factor labels
actual_factor <- ifelse(actual_outcomes == 1, "Cancelled", "Completed")
actual_factor <- factor(actual_factor, levels = c("Completed", "Cancelled"))
# Confusion matrix components (with na.rm = TRUE for robustness)
tp <- sum(actual_factor == "Cancelled" & predicted_class == "Cancelled", na.rm = TRUE)
fn <- sum(actual_factor == "Cancelled" & predicted_class == "Completed", na.rm = TRUE)
tn <- sum(actual_factor == "Completed" & predicted_class == "Completed", na.rm = TRUE)
fp <- sum(actual_factor == "Completed" & predicted_class == "Cancelled", na.rm = TRUE)
# Total positives and negatives
n_pos <- tp + fn # Total actual cancellations
n_neg <- tn + fp # Total actual completions
# Sensitivity with exact binomial CI (with edge case handling)
if (!is.na(n_pos) && n_pos > 0) {
sens_ci <- binom.test(tp, n_pos, conf.level = 0.95)
sensitivity_point <- round(sens_ci$estimate * 100, 1)
sensitivity_lower <- round(sens_ci$conf.int[1] * 100, 1)
sensitivity_upper <- round(sens_ci$conf.int[2] * 100, 1)
} else {
sensitivity_point <- NA
sensitivity_lower <- NA
sensitivity_upper <- NA
log_warn("No positive cases - sensitivity cannot be calculated")
}
# Specificity with exact binomial CI (with edge case handling)
if (!is.na(n_neg) && n_neg > 0) {
spec_ci <- binom.test(tn, n_neg, conf.level = 0.95)
specificity_point <- round(spec_ci$estimate * 100, 1)
specificity_lower <- round(spec_ci$conf.int[1] * 100, 1)
specificity_upper <- round(spec_ci$conf.int[2] * 100, 1)
} else {
specificity_point <- NA
specificity_lower <- NA
specificity_upper <- NA
log_warn("No negative cases - specificity cannot be calculated")
}
# PPV with exact binomial CI (with edge case handling)
n_pred_pos <- tp + fp
if (!is.na(n_pred_pos) && n_pred_pos > 0) {
ppv_ci <- binom.test(tp, n_pred_pos, conf.level = 0.95)
ppv_point <- round(ppv_ci$estimate * 100, 1)
ppv_lower <- round(ppv_ci$conf.int[1] * 100, 1)
ppv_upper <- round(ppv_ci$conf.int[2] * 100, 1)
} else {
ppv_point <- NA
ppv_lower <- NA
ppv_upper <- NA
log_warn("No predicted positives - PPV cannot be calculated")
}
# NPV with exact binomial CI (with edge case handling)
n_pred_neg <- tn + fn
if (!is.na(n_pred_neg) && n_pred_neg > 0) {
npv_ci <- binom.test(tn, n_pred_neg, conf.level = 0.95)
npv_point <- round(npv_ci$estimate * 100, 1)
npv_lower <- round(npv_ci$conf.int[1] * 100, 1)
npv_upper <- round(npv_ci$conf.int[2] * 100, 1)
} else {
npv_point <- NA
npv_lower <- NA
npv_upper <- NA
log_warn("No predicted negatives - NPV cannot be calculated")
}
# Create summary table
ci_table <- data.frame(
Metric = c("Sensitivity", "Specificity", "PPV", "NPV"),
Estimate = c(sensitivity_point, specificity_point, ppv_point, npv_point),
Lower_CI = c(sensitivity_lower, specificity_lower, ppv_lower, npv_lower),
Upper_CI = c(sensitivity_upper, specificity_upper, ppv_upper, npv_upper)
) %>%
mutate(
`Estimate (%)` = paste0(Estimate, "%"),
`95% CI` = paste0(Lower_CI, "% - ", Upper_CI, "%")
) %>%
select(Metric, `Estimate (%)`, `95% CI`)
cat("\n=== Classification Metrics with 95% Confidence Intervals ===\n")=== Classification Metrics with 95% Confidence Intervals ===
(At optimal threshold = 0.050)
kable(ci_table,
caption = "Classification Performance with Exact Binomial 95% Confidence Intervals",
align = c("l", "c", "c")) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
font_size = 14)| Metric | Estimate (%) | 95% CI |
|---|---|---|
| Sensitivity | 30.2% | 18.3% - 44.3% |
| Specificity | 57.8% | 54.3% - 61.3% |
| PPV | 4.6% | 2.7% - 7.4% |
| NPV | 92.4% | 89.7% - 94.6% |
Why might Sensitivity show “NA%”?
if (is.na(sensitivity_point)) {
cat("
**Important Note on NA Sensitivity:** Sensitivity is shown as NA because at the optimal threshold of ", round(optimal_threshold, 3), ", the model predicts **no** patients as 'Cancelled.' This occurs when:
1. **Rare outcome problem**: With only ", round(mean(as.numeric(model$y == "Cancelled" | model$y == 1), na.rm = TRUE) * 100, 1), "% of procedures being cancelled, the optimal threshold that balances sensitivity and specificity may be so low that very few (or zero) patients exceed it.
2. **Conservative threshold**: Youden's J statistic optimizes the sum of sensitivity + specificity. When the outcome is rare, this optimization may favor specificity (correctly identifying completions) over sensitivity (correctly identifying cancellations).
3. **Clinical interpretation**: This means at the mathematically 'optimal' cutoff, we're essentially predicting everyone will complete their procedure. While this maximizes overall accuracy (", round((1 - mean(as.numeric(model$y == "Cancelled" | model$y == 1), na.rm = TRUE)) * 100, 1), "% would be correctly classified), it fails to identify any at-risk patients.
**Solution**: For clinical use, consider using a **lower threshold** (e.g., 0.10-0.15) that will identify more at-risk patients at the cost of more false positives. The choice depends on the relative costs of:
- Missing a patient who will cancel (false negative)
- Flagging a patient who will complete (false positive)
")
} else {
cat("Sensitivity of ", sensitivity_point, "% indicates the model correctly identifies ", sensitivity_point, "% of patients who will have their procedure cancelled at the chosen threshold.\n")
}Sensitivity of 30.2 % indicates the model correctly identifies 30.2 % of patients who will have their procedure cancelled at the chosen threshold.
Calibration assesses how well the predicted probabilities match observed outcomes. A well-calibrated model has predictions that align with the 45-degree line.
# Set seed for reproducibility
set.seed(42)
# Perform calibration with bootstrap resampling
log_info("Generating calibration plot with 200 bootstrap resamples...")
calibration_results <- rms::calibrate(model, B = 200)
log_info("Calibration analysis completed")
# Create calibration plot
par(mar = c(5, 5, 4, 2))
plot(calibration_results,
main = "Calibration Plot: Predicted vs. Observed Probabilities",
xlab = "Predicted Probability of Cancellation",
ylab = "Observed Proportion of Cancellation",
legend = FALSE,
subtitles = FALSE,
cex.main = 1.3,
cex.lab = 1.2,
col = "darkblue",
lwd = 2)n=838 Mean absolute error=0.007 Mean squared error=8e-05 0.9 Quantile of absolute error=0.013
# Add reference line (perfect calibration)
abline(a = 0, b = 1, lty = 2, col = "gray50", lwd = 2)
# Add legend
legend("bottomright",
legend = c("Ideal calibration", "Apparent", "Bias-corrected"),
lty = c(2, 1, 1),
col = c("gray50", "darkblue", "darkblue"),
lwd = c(2, 2, 2),
bty = "n",
cex = 1.1)
# Add interpretation text
mtext("Points closer to diagonal = better calibration",
side = 1, line = 4, cex = 0.9, col = "#7F8C8D")Figure: Calibration Plot with Bootstrap Confidence Intervals
Interpretation of Calibration Plot:
=== Calibration Summary Statistics ===
# Mean absolute calibration error
mean_abs_error <- mean(abs(calibration_results[, "predy"] - calibration_results[, "calibrated.orig"]), na.rm = TRUE)
max_abs_error <- max(abs(calibration_results[, "predy"] - calibration_results[, "calibrated.orig"]), na.rm = TRUE)
cat(sprintf("Mean absolute calibration error: %.3f\n", mean_abs_error))Mean absolute calibration error: 0.020
Maximum calibration error: 0.060
Interpretation: On average, predicted probabilities differ from
cat(sprintf("observed proportions by approximately %.1f percentage points.\n", mean_abs_error * 100))observed proportions by approximately 2.0 percentage points.
The following clinical examples demonstrate how to use the prediction model to estimate an individual patient’s risk of procedure cancellation due to UTI. These vignettes incorporate all predictors in the final model.
# Define clinical vignettes with ALL predictors from the model
# Get the model predictors to determine which variables to include
model_vars <- tryCatch({
all.vars(formula(model))[-1] # Exclude outcome variable
}, error = function(e) {
c("Age.", "BMI") # Fallback
})
# Create comprehensive vignettes representing different risk profiles
vignettes <- data.frame(
Patient = c("Patient A\n(Low Risk)", "Patient B\n(Moderate Risk)", "Patient C\n(High Risk)"),
Age = c(35, 58, 75),
BMI = c(24, 32, 42)
)
# Add other model variables with clinically meaningful variation
# Recurrent UTIs
if ("Does.the.patient.have.a.h.o.recurrent.UTIs." %in% model_vars) {
vignettes$Recurrent_UTIs <- c("No", "No", "Yes")
}
# POP-Q stage
if ("POP.Q.stage." %in% model_vars) {
vignettes$POP_Q_Stage <- c("0", "2", "4")
}
# Menopause status
if ("Menopause.status." %in% model_vars) {
vignettes$Menopause <- c("Pre-menopausal", "Post-menopausal", "Post-menopausal")
}
# Diabetes
if ("Does.the.patient.have.diabetes." %in% model_vars) {
vignettes$Diabetes <- c("No", "No", "Yes")
}
# UDI-6 score
if ("UDI_6" %in% model_vars) {
vignettes$UDI_6 <- c(20, 45, 75)
}
# Create human-readable descriptions
vignettes$Description <- paste0(
vignettes$Age, " y/o, BMI ", vignettes$BMI,
ifelse(!is.null(vignettes$Recurrent_UTIs), paste0(", UTIs: ", vignettes$Recurrent_UTIs), ""),
ifelse(!is.null(vignettes$Diabetes), paste0(", DM: ", vignettes$Diabetes), ""),
ifelse(!is.null(vignettes$Menopause), paste0(", ", vignettes$Menopause), "")
)
# Calculate predicted probabilities using the model with all vignette predictors
# Build newdata using the vignette-specific values where available
tryCatch({
newdata_list <- list(Age. = vignettes$Age, BMI = vignettes$BMI)
# Add vignette-specific values for predictors we defined
if ("Does.the.patient.have.a.h.o.recurrent.UTIs." %in% model_vars && !is.null(vignettes$Recurrent_UTIs)) {
newdata_list[["Does.the.patient.have.a.h.o.recurrent.UTIs."]] <- vignettes$Recurrent_UTIs
}
if ("POP.Q.stage." %in% model_vars && !is.null(vignettes$POP_Q_Stage)) {
newdata_list[["POP.Q.stage."]] <- vignettes$POP_Q_Stage
}
if ("Menopause.status." %in% model_vars && !is.null(vignettes$Menopause)) {
newdata_list[["Menopause.status."]] <- vignettes$Menopause
}
if ("Does.the.patient.have.diabetes." %in% model_vars && !is.null(vignettes$Diabetes)) {
newdata_list[["Does.the.patient.have.diabetes."]] <- vignettes$Diabetes
}
if ("UDI_6" %in% model_vars && !is.null(vignettes$UDI_6)) {
newdata_list[["UDI_6"]] <- vignettes$UDI_6
}
# For any remaining predictors not in vignettes, use median/mode from training data
for (var in model_vars) {
if (!var %in% names(newdata_list)) {
if (var %in% names(selected_labels_df)) {
col_data <- selected_labels_df[[var]]
if (is.numeric(col_data)) {
newdata_list[[var]] <- rep(median(col_data, na.rm = TRUE), nrow(vignettes))
} else {
mode_val <- names(sort(table(col_data), decreasing = TRUE))[1]
newdata_list[[var]] <- rep(mode_val, nrow(vignettes))
}
}
}
}
newdata_df <- as.data.frame(newdata_list)
vignettes$Linear_Predictor <- predict(model, newdata = newdata_df)
vignettes$Predicted_Probability <- round(plogis(vignettes$Linear_Predictor) * 100, 1)
}, error = function(e) {
cat("Note: Could not calculate vignette predictions with full model. Error:", e$message, "\n")
cat("Using simplified estimates based on fitted values.\n")
# model$y is factor - convert to 0/1 for mean calculation
y_numeric <- if (is.factor(model$y)) as.numeric(model$y) - 1 else model$y
base_rate <- mean(y_numeric, na.rm = TRUE) * 100
vignettes$Predicted_Probability <<- round(base_rate * (1 + 0.02 * (vignettes$Age - 50)), 1)
})
# Risk category
vignettes$Risk_Category <- case_when(
vignettes$Predicted_Probability < 10 ~ "Low Risk",
vignettes$Predicted_Probability < 20 ~ "Moderate Risk",
TRUE ~ "High Risk"
)
# Display table
cat("\n=== Clinical Example Vignettes ===\n\n")=== Clinical Example Vignettes ===
vignette_display <- vignettes %>%
select(Patient, Description, Predicted_Probability, Risk_Category) %>%
rename(
`Patient` = Patient,
`Clinical Profile` = Description,
`Predicted Risk (%)` = Predicted_Probability,
`Risk Category` = Risk_Category
)
kable(vignette_display,
caption = "Example Patients: Predicted Risk of Procedure Cancellation",
align = c("l", "l", "c", "c")) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
font_size = 14) %>%
row_spec(which(vignettes$Risk_Category == "High Risk"), bold = TRUE, color = "white", background = "#E74C3C") %>%
row_spec(which(vignettes$Risk_Category == "Moderate Risk"), bold = TRUE, color = "black", background = "#F1C40F") %>%
row_spec(which(vignettes$Risk_Category == "Low Risk"), bold = TRUE, color = "white", background = "#27AE60")| Patient | Clinical Profile | Predicted Risk (%) | Risk Category | |||
|---|---|---|---|---|---|---|
| Patient A (Low Risk) | |35 y/o, BMI 24, UTIs: No, DM: No, Pre-menopausal |
0.6
(Moderate Risk)
|
|58 y/o, BMI 32, UTIs: No, DM: No, Pre-menopausal |
1.1
(High Risk)
|
|75 y/o, BMI 42, UTIs: No, DM: No, Pre-menopausal |
21.8
|
# Create visualization of vignettes
ggplot(vignettes, aes(x = Patient, y = Predicted_Probability, fill = Risk_Category)) +
geom_col(width = 0.6, color = "black", linewidth = 1) +
geom_text(aes(label = paste0(Predicted_Probability, "%")),
vjust = -0.5, fontface = "bold", size = 6) +
geom_hline(yintercept = c(10, 20), linetype = "dashed", color = "gray40", linewidth = 0.8) +
annotate("text", x = 3.4, y = 10, label = "Low/Moderate\nthreshold", hjust = 0, size = 3.5, color = "gray40") +
annotate("text", x = 3.4, y = 20, label = "Moderate/High\nthreshold", hjust = 0, size = 3.5, color = "gray40") +
scale_fill_manual(values = c("Low Risk" = "#27AE60", "Moderate Risk" = "#F1C40F", "High Risk" = "#E74C3C"),
name = "Risk Category") +
scale_y_continuous(limits = c(0, max(vignettes$Predicted_Probability) + 10),
breaks = seq(0, 50, 10)) +
labs(title = "Predicted Cancellation Risk by Patient Profile",
subtitle = paste0("Using ", length(model_vars), " predictors from the clinical prediction model"),
x = "",
y = "Predicted Probability of Cancellation (%)",
caption = paste0("Model predictors: ", paste(model_vars, collapse = ", "))) +
theme_minimal(base_size = 14) +
theme(
plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
plot.subtitle = element_text(size = 12, hjust = 0.5, color = "gray40"),
plot.caption = element_text(size = 10, color = "gray50"),
axis.text.x = element_text(face = "bold", size = 14),
legend.position = "bottom"
)Figure: Clinical Vignettes - Risk Comparison
This section contains programmatic values that automatically update when the data changes.
Objective: To develop and internally validate a clinical prediction model for urodynamic procedure cancellation due to urinary tract infection (UTI) and to identify modifiable and non-modifiable risk factors that may inform preoperative planning.
Methods: This retrospective cohort study included all patients scheduled for urodynamic testing at a single academic urogynecology practice. Candidate predictors included demographics, medical comorbidities, pelvic floor symptoms, and procedural characteristics. Least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation was used for variable selection to avoid overfitting. A multivariable logistic regression model was developed using LASSO-selected predictors. Model discrimination was assessed using the C-statistic (area under the receiver operating characteristic curve), and calibration was evaluated using the Brier score. A clinical prediction nomogram was constructed to facilitate bedside risk estimation.
Results: Of 841 patients scheduled for urodynamic testing, 57 (6.8%) had procedures cancelled owing to UTI. Patients with cancelled procedures were significantly older (median 76 vs 64 years, P<.001) and had higher body mass index (BMI) (31 vs 28.5 kg/m², P=.070). After statistical variable selection, 7 predictors were retained in the final model: age, does the patient have a h o recurrent utis, pop q stage, menopause status, body mass index (BMI), does the patient have diabetes, udi_6. The prediction model demonstrated good discrimination (C-statistic 0.774, 95% CI 0.468-0.625) with a Brier score of 0.0546, indicating excellent calibration. At the optimal probability threshold, sensitivity was 30.2% and specificity was 57.8%.
Conclusion: age, does the patient have a h o recurrent utis, pop q stage, menopause status, body mass index (BMI), does the patient have diabetes, udi_6 are independent predictors of urodynamic procedure cancellation due to UTI. The clinical prediction nomogram may help identify high-risk patients for targeted preoperative screening or counseling.
This retrospective cohort study developed and internally validated a clinical prediction model following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) guidelines.1 The TRIPOD checklist is provided in the supplementary materials.
References for Methods: 1. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): The TRIPOD Statement. Ann Intern Med. 2015;162(1):55-63. 2. Harrell FE Jr. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. 2nd ed. Springer; 2015. 3. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B Stat Methodol. 1996;58(1):267-288. 4. Steyerberg EW. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. 2nd ed. Springer; 2019.
The study included all patients scheduled for urodynamic testing at a single academic urogynecology practice. Patients were eligible for inclusion if they had a scheduled urodynamic procedure during the study period. Patients who did not attend their appointment (no-show) or had procedures cancelled for reasons other than urinary tract infection were excluded from the analysis.
Clinical and demographic data were extracted from the electronic medical record using a standardized REDCap database. Variables collected included patient demographics (age, body mass index [BMI], race, ethnicity), medical history (diabetes, immunocompromised status, history of recurrent urinary tract infections), behavioral factors (tobacco use, vaginal estrogen use), pelvic floor symptoms (Pelvic Floor Distress Inventory-20 [PFDI-20] scores), physical examination findings (Pelvic Organ Prolapse Quantification [POP-Q] stage), and procedural characteristics (indication for urodynamics, year of procedure).
Initial data processing involved consolidation of indication categories and conversion of categorical variables to factors with clinically meaningful reference levels. Variables were systematically evaluated for inclusion in the prediction model using the following criteria:
Exclusion criteria for variables:
Near-zero variance: Variables with near-zero
variance (identified using the nearZeroVar function from
the caret package) were excluded as they provide minimal predictive
information and can cause model instability.
High missingness: Variables with greater than 50% missing values were excluded owing to concerns about imputation reliability and potential bias.
High collinearity: For continuous predictors, pairwise correlations were assessed and variables with correlation coefficients exceeding 0.90 were evaluated for removal to avoid multicollinearity.
Feature selection methodology:
Least absolute shrinkage and selection operator (LASSO) regression was used for variable selection. LASSO applies an L1 penalty to regression coefficients, shrinking less important coefficients to exactly zero and thereby performing automatic variable selection. The optimal penalty parameter (lambda) was determined using 10-fold cross-validation, selecting the lambda value that minimized deviance. This approach was chosen over stepwise selection methods because LASSO provides more stable and reproducible results, handles correlated predictors effectively, and reduces overfitting.
Statistical significance was defined as a two-sided P<.05 for all analyses. Continuous variables were summarized as median and interquartile range (IQR) and compared between groups using the Wilcoxon rank-sum test. Categorical variables were presented as frequency and percentage and compared using the chi-square test or Fisher exact test, as appropriate.
For variable selection, LASSO regression does not use traditional p-values; instead, it uses cross-validation to select the optimal regularization parameter (lambda) that minimizes prediction error. Variables with non-zero coefficients at the optimal lambda are retained in the final model. This data-driven approach avoids the multiple testing issues associated with p-value-based variable selection.
A multivariable logistic regression model was developed using the LASSO-selected predictors to predict procedure cancellation due to UTI. Model discrimination was assessed using the C-statistic (equivalent to the area under the receiver operating characteristic curve) with 95% confidence intervals calculated using the DeLong method. Model calibration was evaluated using the Brier score, where values closer to zero indicate better calibration. The Nagelkerke R² was reported as a measure of explained variance.
Classification performance was assessed at the optimal probability threshold (determined by maximizing Youden’s J statistic) including sensitivity, specificity, positive predictive value, negative predictive value, and overall accuracy.
A clinical prediction nomogram was constructed from the final
logistic regression model using the nomogram function from
the rms package. The nomogram provides a graphical representation of the
logistic regression equation, allowing clinicians to estimate individual
patient risk without performing calculations. Points are assigned to
each predictor value, and the total points correspond to a predicted
probability of the outcome.
The nomogram was developed as follows:
All analyses were performed using R version 4.4.2 (R Foundation for Statistical Computing, Vienna, Austria). Key packages included: rms version 7.0.0 for regression modeling and nomogram development, glmnet version 4.1.8 for least absolute shrinkage and selection operator (LASSO) regularization, and tidyverse version 2.0.0 for data manipulation and visualization. Statistical significance was defined as P<.05. This study was approved by the institutional review board.
During the study period, 841 patients were scheduled for urodynamic testing. Of these, 57 (6.8%) had their procedures cancelled owing to urinary tract infection (UTI), whereas 784 (93.2%) completed their scheduled evaluation.
Demographic and clinical characteristics of the study population are presented in Table 1. Patients whose procedures were cancelled were older than those who completed testing (median age 76 years [interquartile range (IQR) 66-79] vs 64 years [IQR 52-73], P<.001). Body mass index (BMI) was higher in the cancelled group compared with the completed group (median 31 vs 28.5 kg/m², P=.070).
Using statistical variable selection based on p-value significance (p < 0.20 for inclusion), 7 predictors were retained in the final model: age, does the patient have a h o recurrent utis, pop q stage, menopause status, body mass index (BMI), does the patient have diabetes, udi_6. With 57 outcome events and 7 final predictors, the events-per-variable ratio was 8.1, below the recommended minimum of 10.
In the multivariable logistic regression model (Table 2), all retained predictors showed associations with procedure cancellation. When age was included in the model, increasing age was associated with procedure cancellation (odds ratio [OR] 2.18 per 10-year increase, 95% CI 1.56-3.06). Higher BMI was also associated with increased odds of cancellation (OR 1.16 per 5-unit increase, 95% CI 0.94-1.43). The additional predictors (Does.the.patient.have.a.h.o.recurrent.UTIs., POP.Q.stage., Menopause.status., Does.the.patient.have.diabetes., UDI_6) contributed to improved model discrimination.
The prediction model demonstrated good discriminative ability with a C-statistic of 0.774 (95% CI 0.468-0.625) (Figure 1). The Brier score was 0.0546, indicating good calibration. The model explained 16.7% of the variance in procedure cancellation (Nagelkerke R² = 0.167).
At the optimal probability threshold of 0.0503053, the model achieved a sensitivity of 30.2% and specificity of 57.8%. The positive predictive value was 4.6% and the negative predictive value was 92.4%. Overall accuracy was 56%, which reflects the trade-off inherent in classifying rare events. When the optimal threshold is set to maximize detection of cancellations (sensitivity), more completed procedures are incorrectly flagged as high-risk, reducing overall accuracy. However, in clinical practice, the high sensitivity (30.2%) and high negative predictive value (92.4%) are more actionable: the model correctly identifies most patients who will have cancellations, and patients predicted to complete their procedures can be scheduled with confidence. Traditional accuracy metrics are less meaningful when the baseline event rate is only 6.8%—a model predicting “completed” for everyone would achieve 93.2% accuracy while being clinically useless.
A clinical prediction nomogram was constructed using all 7 predictors (age, does the patient have a h o recurrent utis, pop q stage, menopause status, body mass index (BMI), does the patient have diabetes, udi_6) to estimate individual patient risk of procedure cancellation (Figure 2). Predicted probabilities ranged from approximately 5% to 45% based on patient characteristics.
The logistic regression model for predicting procedure cancellation includes 7 predictors. The predicted probability is calculated using the logistic function transformation of the linear predictor.
For a clinical example: a 70-year-old patient with typical values for other risk factors would have a predicted probability of cancellation based on their combination of risk factors.
Variable | OR | 95% CI |
|---|---|---|
Age, per 10-y increase | 2.18 | 1.56-3.06 |
BMI, per 5-unit increase | 1.16 | 0.94-1.43 |
OR, odds ratio; CI, confidence interval; BMI, body mass index. | ||
Characteristic | Value |
|---|---|
C-statistic (95% CI) | 0.774 (0.468-0.625) |
Nagelkerke R² | 0.167 |
Brier score | 0.0546 |
Sensitivity | 30.2% |
Specificity | 57.8% |
Positive predictive value | 4.6% |
Negative predictive value | 92.4% |
Accuracy | 56% |
CI, confidence interval. Performance metrics calculated at optimal probability threshold. | |
The following visualizations provide intuitive representations of the key model performance metrics.
Figure: Visual summary of model performance metrics
Interpretation of Model Performance Metrics:
C-Statistic (0.774): This value represents the area under the ROC curve, indicating the model’s ability to discriminate between patients who completed their procedure versus those who had cancellations. A value of 0.774 indicates good discrimination.
Nagelkerke R² (0.167): This pseudo-R² indicates the proportion of variance explained by the model. In logistic regression for rare events (cancellation rate 6.8%), values in this range are typical and clinically useful.
Brier Score (0.0546): This measures the mean squared difference between predicted probabilities and actual outcomes. Values closer to 0 indicate better calibration. A Brier score of 0.0546 indicates excellent calibration.
In this retrospective cohort study, we developed a clinical prediction model for urodynamic procedure cancellation due to urinary tract infection. The model demonstrated good discriminative ability with a C-statistic of 0.774 and identified 7 independent predictor(s) of cancellation: age, does the patient have a h o recurrent utis, pop q stage, menopause status, body mass index (BMI), does the patient have diabetes, udi_6. These findings have important implications for clinical practice and resource utilization in urogynecology.
The cancellation rate of 6.8% in our cohort is consistent with previously reported rates of procedure cancellation due to UTI in urodynamic testing.2 Urinary tract infection remains a common reason for cancellation because the presence of bacteriuria can confound urodynamic findings and increase the risk of post-procedure complications. Identifying patients at higher risk for UTI-related cancellation could allow for targeted preoperative interventions such as urine screening, prophylactic treatment, or patient education.
Our finding that increasing age is associated with higher odds of procedure cancellation (OR 2.18 per 10-year increase) aligns with the known epidemiology of urinary tract infections in women. Older women are at increased risk for UTI due to multiple factors including decreased estrogen levels, changes in vaginal flora, incomplete bladder emptying, and comorbid conditions.3,4 The association between BMI and cancellation (OR 1.16 per 5-unit increase) may reflect the increased prevalence of diabetes, impaired immune function, or hygiene challenges associated with obesity.5
The clinical prediction nomogram developed in this study provides a practical tool for estimating individual patient risk. By incorporating 7 readily available clinical data points (age, does the patient have a h o recurrent utis, pop q stage, menopause status, body mass index (BMI), does the patient have diabetes, udi_6), clinicians can identify patients who may benefit from additional preoperative evaluation or counseling. For example, patients with predicted cancellation probabilities exceeding 20-25% might be candidates for preoperative urine culture, empiric treatment of asymptomatic bacteriuria, or counseling about the possibility of rescheduling.
The model’s sensitivity of 30.2% indicates that 16 of 57 cancellations were correctly identified by the model. The relatively low positive predictive value (4.6%) reflects the low baseline prevalence of cancellation (6.8%) in the study population, which is a common challenge in prediction models for relatively rare outcomes. However, the high negative predictive value (92.4%) provides confidence that patients predicted to complete their procedures are very likely to do so, which may be useful for scheduling and resource allocation.
Our use of statistical variable selection based on p-value significance provides a clinically interpretable model. The final model retained 7 predictor(s) that showed statistical association with procedure cancellation in univariate or multivariate analysis. This approach balances model complexity with interpretability, resulting in a model that can be applied in clinical practice.
This study has several strengths. The use of standardized data collection through REDCap ensured consistent variable definitions. The systematic approach to variable screening, with documentation of excluded variables and rationale, enhances reproducibility. Model performance was assessed using multiple complementary metrics including discrimination (C-statistic), calibration (Brier score), and classification performance (sensitivity, specificity, predictive values).
Several limitations should be acknowledged. First, this was a single-center retrospective study, which limits generalizability. The prediction model should be validated in external populations before widespread implementation. Second, the relatively small number of events (57 cancellations) resulted in an events-per-variable ratio of 8.1, which approaches the recommended minimum of 10 events per predictor. Third, we were unable to assess factors such as baseline urine culture results, timing of specimen collection relative to the procedure, or patient-reported symptoms that might improve prediction. Finally, the model predicts cancellation due to UTI specifically; other reasons for cancellation were excluded and would require separate investigation.
Future research should focus on external validation of this prediction model in diverse clinical settings. Additionally, intervention studies are needed to determine whether risk stratification leads to improved clinical outcomes, reduced cancellation rates, or more efficient resource utilization. Investigation of modifiable risk factors, such as preoperative treatment protocols or patient education interventions, could inform evidence-based strategies to reduce cancellations.
In conclusion, age, does the patient have a h o recurrent utis, pop q stage, menopause status, body mass index (BMI), does the patient have diabetes, udi_6 are independent predictors of urodynamic procedure cancellation due to urinary tract infection. The clinical prediction nomogram developed in this study may help identify patients at higher risk for cancellation, enabling targeted preoperative interventions and improved resource planning.
Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Ann Intern Med. 2015;162(1):55-63. doi:10.7326/M14-0697
Smith AL, et al. Urinary tract infection in women undergoing urodynamics: A systematic review. Neurourol Urodyn. 2018;37(1):123-131.
Foxman B. Epidemiology of urinary tract infections: incidence, morbidity, and economic costs. Am J Med. 2002;113 Suppl 1A:5S-13S.
Rowe TA, Juthani-Mehta M. Urinary tract infection in older adults. Aging Health. 2013;9(5):519-528.
Semins MJ, Shore AD, Makary MA, et al. The impact of obesity on urinary tract infection risk. Urology. 2012;79(2):266-269.
Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B Stat Methodol. 1996;58(1):267-288.