The NHS Referral to Treatment (RTT) pathway measures patient waiting times for elective care. This project focuses on delays in referral to treatment and patterns within acute NHS trusts between April 2024 and February 2025, aiming to uncover key insights and trends.
Under the NHS Constitution, patients have the right to start treatment within 18 weeks, with a historical operational standard of 92% compliance. However, multiple challenges in recent years, including the COVID-19 pandemic backlog, staff shortages, and rising healthcare demand, have led to significant breaches of this standard.
Today, many NHS trusts operate with more realistic goals of 60–70% compliance, making breach management a critical operational priority.
This project specifically examines Incomplete Pathways and subsequent patterns across acute trusts.
For reference on data collection methodologies and system limitations, the latest NHS RTT monthly report (February 2025) is attached: England NHS RTT Statistics - February 2025 (https://www.england.nhs.uk/statistics/wp-content/uploads/sites/2/2025/04/Feb25-RTT-SPN-Publication-PDF-386K-76538.pdf)
This project draws upon two primary data sources:
Data preprocessing involved cleaning and structuring the RTT data, targeting “Incomplete Pathways” and “Incomplete Pathways with Decision to Admit (DTA)” and additionally “Completed Pathways” for further analysis.
Several key metrics were developed:
Analytical Approaches:
To explore underlying patterns in trust performance, two complementary approaches were applied. Hierarchical clustering was conducted to group trusts based on their breach rates, providing a similarity-based segmentation. Separately, a typology framework was developed, categorizing trusts by their overall treatment volumes and delay severity to create more interpretable and strategic groupings.
It is important to note several limitations. This analysis covers only an 11-month period, rather than a full multi-year view. The study is restricted to acute trusts, with community and specialist trusts excluded for consistency. Additionally, the share of patients with a decision to admit (DTA) within incomplete pathways was found to be relatively stable during the period and has been documented but not deeply modeled in this report.
Patients were analyzed across three RTT reporting categories:
file_paths <- list.files(
here("data", "raw"),
pattern = "\\.csv$",
full.names = TRUE
)
if (length(file_paths) == 0) {
stop("No CSV files found in 'data/raw/'. Please check the folder and try again.")
}
message(length(file_paths), " files found for processing.")
## 11 files found for processing.
get_month_from_filename <- function(path) {
file <- basename(path)
str_extract(file, "(?i)(January|February|March|April|May|June|July|August|September|October|November|December)-20\\d{2}") %>%
parse_date(format = "%B-%Y")
}
message("Month extraction function created.")
## Month extraction function created.
rtt_data <- map_dfr(file_paths, ~ {
read_csv(.x, show_col_types = FALSE) %>%
clean_names() %>%
mutate(period = get_month_from_filename(.x))
})
if (nrow(rtt_data) == 0) {
stop("No data loaded. Please check if the CSV files are correctly formatted.")
}
message(nrow(rtt_data), " rows loaded into rtt_data.")
## 2037283 rows loaded into rtt_data.
if (!"rtt_part_description" %in% colnames(rtt_data)) {
stop("Column `rtt_part_description` is missing in rtt_data. Check CSV structure.")
}
rtt_data <- rtt_data %>%
mutate(rtt_part_description = str_to_lower(rtt_part_description))
message("Data preparation completed successfully.")
## Data preparation completed successfully.
file_list <- list.files(
here("data", "raw"),
pattern = "\\.csv$",
full.names = TRUE
)
if (length(file_list) == 0) {
stop("No CSV files found in 'data/raw/'. Please check the folder and files.")
}
rtt_all <- file_list %>%
map_dfr(~ read_csv(.x, show_col_types = FALSE)) %>%
clean_names()
if (nrow(rtt_all) == 0) {
stop("Data loading failed — 'rtt_all' is empty after combining files.")
}
dir.create(here("data", "processed"), showWarnings = FALSE, recursive = TRUE)
write_csv(rtt_all, here("data", "processed", "rtt_all_months.csv"))
# Preview
#glimpse(rtt_all)
# Confirmation message
message("rtt_all_months.csv saved successfully into 'data/processed/'.")
## rtt_all_months.csv saved successfully into 'data/processed/'.
# Step 2: Load summary dataset
monthly <- read_csv(
here("output", "monthly_summary.csv"),
show_col_types = FALSE
)
# Safety check: Confirm data loaded
if (nrow(monthly) == 0) {
stop("Loaded 'monthly_summary.csv' is empty. Please check the output folder and file contents.")
}
# Step 3: Convert period to proper date (if not already)
monthly <- monthly %>%
mutate(period = as.Date(period))
# Step 4: Plot % Over 18 Weeks
plot_18 <- monthly %>%
ggplot(aes(x = period, y = percent_over_18, color = category)) +
geom_line(size = 0.8) +
geom_point() +
labs(
title = "% of Patients Waiting Over 18 Weeks",
x = "Month",
y = "% Over 18 Weeks",
color = "RTT Category"
) +
scale_y_continuous(labels = percent_format(scale = 1)) +
theme_minimal(base_size = 10) +
theme(
legend.position = "bottom",
legend.direction = "horizontal",
legend.title = element_text(size = 10),
legend.text = element_text(size = 9)
)
# Step 5: Plot % Over 52 Weeks (exclude completed)
plot_52 <- monthly %>%
filter(!str_detect(category, "Completed")) %>%
ggplot(aes(x = period, y = percent_over_52, color = category)) +
geom_line(size = 0.8) +
geom_point() +
labs(
title = "% of Patients Waiting Over 52 Weeks",
x = "Month",
y = "% Over 52 Weeks",
color = "RTT Category"
) +
scale_y_continuous(labels = percent_format(scale = 1)) +
theme_minimal(base_size = 10) +
theme(
legend.position = "bottom",
legend.direction = "horizontal",
legend.title = element_text(size = 10),
legend.text = element_text(size = 9)
)
# Step 6: Plot total patients by category (line plot)
plot_total_trend <- ggplot(monthly, aes(x = period, y = total_patients, color = category)) +
geom_line(size = 0.8) +
geom_point() +
scale_x_date(date_labels = "%b '%y", date_breaks = "1 month") +
scale_y_continuous(labels = label_number(scale_cut = cut_si("M"))) +
scale_color_brewer(palette = "Dark2") +
labs(
title = "Monthly Total Patients by RTT Category",
x = "Month", y = "Total Patients", color = "Category"
) +
theme_minimal(base_size = 10) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "bottom",
legend.direction = "horizontal",
legend.title = element_text(size = 10),
legend.text = element_text(size = 9)
)
# Step 7: Print the plot
print(plot_52)
print(plot_18)
Shorter-term breaches (>18 weeks) have remained persistently high, fluctuating between 40% and 50% across all RTT categories.
print(plot_total_trend)
Total patient volumes across pathways have remained relatively stable, suggesting steady referral pressure.
# Step 2: Load data
monthly <- read_csv(
here("output", "monthly_summary.csv"),
show_col_types = FALSE
)
# Safety check
if (nrow(monthly) == 0) {
stop("Loaded 'monthly_summary.csv' is empty. Please check the output folder and file contents.")
}
# Step 3: Prepare backlog (Incomplete + DTA)
backlog_df <- monthly %>%
filter(category %in% c("Incomplete Pathways", "Incomplete Pathways with DTA")) %>%
group_by(period) %>%
summarise(
backlog = sum(total_patients, na.rm = TRUE),
.groups = "drop"
)
# Step 4: Prepare completions (Admitted only)
treated_df <- monthly %>%
filter(category == "Completed Pathways (Admitted)") %>%
select(period, treated = total_patients)
# Step 5: Join & calculate ratio
bt_ratio <- backlog_df %>%
left_join(treated_df, by = "period") %>%
mutate(
backlog_to_treatment_ratio = round(backlog / treated, 2)
)
# Step 6: Save output
dir.create(here("output"), showWarnings = FALSE, recursive = TRUE)
write_csv(bt_ratio, here("output", "backlog_to_treatment.csv"))
# Step 7: Preview
#print(bt_ratio, n = Inf)
# Step 8: Plot backlog-to-treatment ratio over time
plot_bt_ratio <- ggplot(bt_ratio, aes(x = period, y = backlog_to_treatment_ratio)) +
geom_line(color = "#1f77b4", size = 0.8) +
geom_point(color = "#1f77b4", size = 2) +
labs(
title = "Backlog-to-Treatment Ratio Over Time",
subtitle = "Total patients waiting (Incomplete + DTA) per 1 treated (Admitted)",
x = "Month",
y = "Backlog-to-Treatment Ratio"
) +
scale_x_date(date_labels = "%b '%y", date_breaks = "1 month") +
scale_y_continuous(labels = label_number(accuracy = 1)) +
theme_minimal(base_size = 12) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(face = "bold", hjust = 0.5)
)
# Step 9: Show the plot
print(plot_bt_ratio)
The ratio of patients waiting (Incomplete + DTA) to those treated (Admitted) fluctuated between 26 and 32 over the period, highlighting persistent system pressures.
# Step 1: Load monthly summary
monthly <- read_csv(
here("output", "monthly_summary.csv"),
show_col_types = FALSE
) %>%
clean_names()
# Safety check
if (nrow(monthly) == 0) {
stop("Loaded 'monthly_summary.csv' is empty. Please check the output folder and file contents.")
}
# Step 2: Filter backlog categories and calculate severity
severity_df <- monthly %>%
filter(category %in% c("Incomplete Pathways", "Incomplete Pathways with DTA")) %>%
group_by(period) %>%
summarise(
patients_over_18 = sum(over_18_weeks, na.rm = TRUE),
patients_over_52 = sum(over_52_weeks, na.rm = TRUE),
.groups = "drop"
) %>%
mutate(
severity_ratio = round(patients_over_52 / patients_over_18, 3)
)
# Step 3: Save output
dir.create(here("output"), showWarnings = FALSE, recursive = TRUE)
write_csv(severity_df, here("output", "severity_ratio.csv"))
# Step 4: Preview
#print(severity_df, n = Inf)
# Step 5: Load severity data (reloading from saved file)
severity_df <- read_csv(
here("output", "severity_ratio.csv"),
show_col_types = FALSE
)
# Step 6: Plot severity ratio trend
plot_severity_ratio <- ggplot(severity_df, aes(x = period, y = severity_ratio)) +
geom_line(color = "#9467bd", size = 0.8) +
geom_point(color = "#9467bd", size = 2) +
scale_x_date(date_labels = "%b '%y", date_breaks = "1 month") +
scale_y_continuous(
labels = scales::percent_format(accuracy = 1),
breaks = seq(0, 0.14, by = 0.02),
limits = c(0, 0.14)
) +
labs(
title = "Severity of Delay: % of Over-18-Week Patients Waiting Over 52 Weeks",
x = "Month",
y = "Severity Ratio (%)"
) +
theme_minimal(base_size = 11) +
theme(
axis.text.x = element_text(angle = 45, hjust = 1),
plot.title = element_text(face = "bold", hjust = 0.5)
)
# Step 7: Print the plot
print(plot_severity_ratio)
The proportion of patients breaching 52 weeks among those over 18 weeks declined from approximately 11% to under 8%, indicating some improvement in managing extreme delays.
# Step 1: Load clustering output
trust_clusters <- read_csv(
here("output", "clustering", "trust_clusters.csv"),
show_col_types = FALSE
)
# Step 2: Load combined RTT data
rtt_all_months <- read_csv(
here("data", "processed", "rtt_all_months.csv"),
show_col_types = FALSE
) %>%
clean_names()
# Step 3: Filter to acute trusts only
rtt_filtered <- rtt_all_months %>%
filter(provider_org_code %in% trust_clusters$provider_org_code)
# Step 4: Get week-related columns
week_cols_all <- names(rtt_filtered)[str_detect(names(rtt_filtered), "^gt_.*")]
# Step 5: Calculate total patients per trust
trust_totals <- rtt_filtered %>%
group_by(provider_org_code) %>%
summarise(
total_patients = sum(across(all_of(week_cols_all)), na.rm = TRUE),
.groups = "drop"
)
# Step 6: Merge totals with clustering and build hover text
trust_plot_data <- trust_clusters %>%
left_join(trust_totals, by = "provider_org_code") %>%
mutate(
trust_label = str_to_title(provider_org_name),
hover_text = paste0(
trust_label, "<br>",
"Over 18w Delays: ", round(percent_over_18, 1), "%<br>",
"Over 52w Delays: ", round(percent_over_52, 1), "%<br>",
"Total Patients: ", scales::comma(total_patients)
)
)
# Step 7: Create interactive scatter plot
plot_ly(
data = trust_plot_data,
x = ~percent_over_18,
y = ~percent_over_52,
type = "scatter",
mode = "markers",
text = ~hover_text,
hoverinfo = "text",
color = ~as.factor(cluster),
colors = "Set2",
marker = list(size = 8, opacity = 0.8),
showlegend = FALSE
) %>%
plotly::layout(
title = "NHS Trust Clustering by RTT Delays (April 2024 – Feb 2025)",
xaxis = list(title = "% Over 18 Weeks", tickformat = ".0f"),
yaxis = list(title = "% Over 52 Weeks", tickformat = ".0f"),
margin = list(l = 60, r = 50, b = 60, t = 60)
)
Figure 4.1: Clustering of Acute NHS Trusts by RTT Delays.
The resulting scatter plot categorizes trusts into five broad clusters:
Following clustering analysis, a secondary classification was created to categorize NHS trusts based on two key metrics: total patient volume and delay severity.
A 3x3 matrix approach was used:
Volume: High, Medium, or Low total patients handled.
Delay Severity: High, Medium, or Low percentage of patients breaching the 18-week standard.
Each trust was assigned a typology label based on its position in this grid:
| Volume / Delay | Low Delay | Medium Delay | High Delay |
|---|---|---|---|
| Low Volume | Top Performing Small Trusts | Small Trusts Managing Well | Specialist or Small Trusts Under Pressure |
| Medium Volume | High Performing Mid-Sized Trusts | Mid-Sized Steady Performers | Mid-Sized Trusts Facing Challenges |
| High Volume | Leading Large Performers | Large Trusts Managing Backlogs Steadily | Large Trusts Struggling to Manage Delays |
Typology provides a structured lens to identify trusts excelling under different pressures versus those requiring operational improvements.
# Load required libraries
library(tidyverse)
library(here)
library(knitr)
# Step 1: Load the trust delay summary CSV
trust_delay <- read_csv(here("output", "trust_typology_base.csv"), show_col_types = FALSE)
# Step 2: Define volume categories
trust_delay <- trust_delay %>%
mutate(
volume_category = case_when(
total_patients >= 1500000 ~ "High Volume",
total_patients >= 700000 ~ "Medium Volume",
TRUE ~ "Low Volume"
)
)
# Step 3: Define delay categories
trust_delay <- trust_delay %>%
mutate(
delay_category = case_when(
percent_over_18 >= 50 ~ "High Delay",
percent_over_18 >= 30 ~ "Medium Delay",
TRUE ~ "Low Delay"
)
)
# Step 4: Assign typology labels
trust_delay <- trust_delay %>%
mutate(
typology_label = case_when(
volume_category == "High Volume" & delay_category == "High Delay" ~ "Large Trusts Struggling to Manage Delays",
volume_category == "High Volume" & delay_category == "Medium Delay" ~ "Large Trusts Managing Backlogs Steadily",
volume_category == "High Volume" & delay_category == "Low Delay" ~ "Leading Large Performers",
volume_category == "Medium Volume" & delay_category == "High Delay" ~ "Mid-Sized Trusts Facing Challenges",
volume_category == "Medium Volume" & delay_category == "Medium Delay" ~ "Mid-Sized Steady Performers",
volume_category == "Medium Volume" & delay_category == "Low Delay" ~ "High Performing Mid-Sized Trusts",
volume_category == "Low Volume" & delay_category == "High Delay" ~ "Specialist or Small Trusts Under Pressure",
volume_category == "Low Volume" & delay_category == "Medium Delay" ~ "Small Trusts Managing Well",
volume_category == "Low Volume" & delay_category == "Low Delay" ~ "Top Performing Small Trusts",
TRUE ~ "Unclassified"
)
)
# Step 5: Save final output to project-root output folder
write_csv(trust_delay, here("output", "trust_typology_final.csv"))
# Step 6: Display first 20 rows in a neat table
knitr::kable(head(trust_delay, 20), caption = "Sample of Trust Typology Classification")
| provider_org_code | provider_org_name | total_patients | patients_over_18 | percent_over_18 | volume_category | delay_category | typology_label |
|---|---|---|---|---|---|---|---|
| R0A | MANCHESTER UNIVERSITY NHS FOUNDATION TRUST | 5674038 | 2934604 | 51.7 | High Volume | High Delay | Large Trusts Struggling to Manage Delays |
| R0B | SOUTH TYNESIDE AND SUNDERLAND NHS FOUNDATION TRUST | 1907618 | 565556 | 29.6 | High Volume | Low Delay | Leading Large Performers |
| R0D | UNIVERSITY HOSPITALS DORSET NHS FOUNDATION TRUST | 2097918 | 829544 | 39.5 | High Volume | Medium Delay | Large Trusts Managing Backlogs Steadily |
| R1L | ESSEX PARTNERSHIP UNIVERSITY NHS FOUNDATION TRUST | 26000 | 6086 | 23.4 | Low Volume | Low Delay | Top Performing Small Trusts |
| RA2 | ROYAL SURREY COUNTY HOSPITAL NHS FOUNDATION TRUST | 1068140 | 486210 | 45.5 | Medium Volume | Medium Delay | Mid-Sized Steady Performers |
| RA7 | UNIVERSITY HOSPITALS BRISTOL AND WESTON NHS FOUNDATION TRUST | 1785868 | 705630 | 39.5 | High Volume | Medium Delay | Large Trusts Managing Backlogs Steadily |
| RA9 | TORBAY AND SOUTH DEVON NHS FOUNDATION TRUST | 1037662 | 402236 | 38.8 | Medium Volume | Medium Delay | Mid-Sized Steady Performers |
| RAE | BRADFORD TEACHING HOSPITALS NHS FOUNDATION TRUST | 1081246 | 420282 | 38.9 | Medium Volume | Medium Delay | Mid-Sized Steady Performers |
| RAJ | MID AND SOUTH ESSEX NHS FOUNDATION TRUST | 4603276 | 2195282 | 47.7 | High Volume | Medium Delay | Large Trusts Managing Backlogs Steadily |
| RAL | ROYAL FREE LONDON NHS FOUNDATION TRUST | 3151714 | 1396350 | 44.3 | High Volume | Medium Delay | Large Trusts Managing Backlogs Steadily |
| RAS | THE HILLINGDON HOSPITALS NHS FOUNDATION TRUST | 807996 | 374412 | 46.3 | Medium Volume | Medium Delay | Mid-Sized Steady Performers |
| RAX | KINGSTON AND RICHMOND NHS FOUNDATION TRUST | 435372 | 173382 | 39.8 | Low Volume | Medium Delay | Small Trusts Managing Well |
| RAX | KINGSTON HOSPITAL NHS FOUNDATION TRUST | 769784 | 294772 | 38.3 | Medium Volume | Medium Delay | Mid-Sized Steady Performers |
| RBD | DORSET COUNTY HOSPITAL NHS FOUNDATION TRUST | 694748 | 321912 | 46.3 | Low Volume | Medium Delay | Small Trusts Managing Well |
| RBL | WIRRAL UNIVERSITY TEACHING HOSPITAL NHS FOUNDATION TRUST | 1263958 | 539086 | 42.7 | Medium Volume | Medium Delay | Mid-Sized Steady Performers |
| RBQ | LIVERPOOL HEART AND CHEST HOSPITAL NHS FOUNDATION TRUST | 196812 | 65060 | 33.1 | Low Volume | Medium Delay | Small Trusts Managing Well |
| RBS | ALDER HEY CHILDREN’S NHS FOUNDATION TRUST | 674204 | 315184 | 46.7 | Low Volume | Medium Delay | Small Trusts Managing Well |
| RBT | MID CHESHIRE HOSPITALS NHS FOUNDATION TRUST | 1120446 | 503522 | 44.9 | Medium Volume | Medium Delay | Mid-Sized Steady Performers |
| RBV | THE CHRISTIE NHS FOUNDATION TRUST | 116836 | 3980 | 3.4 | Low Volume | Low Delay | Top Performing Small Trusts |
| RC9 | BEDFORDSHIRE HOSPITALS NHS FOUNDATION TRUST | 2715622 | 1282306 | 47.2 | High Volume | Medium Delay | Large Trusts Managing Backlogs Steadily |
While the analysis offers valuable insights into RTT delays and patterns across NHS acute trusts, certain limitations should be noted:
Data Scope: The project focused on national-level RTT monthly datasets between April 2024 and February 2025. Trust-specific nuances, specialty-level variations, and short-term operational factors (e.g., seasonal surges) were outside the current scope.
Assumptions and Delay Calculations: Delay metrics were based on total patients and breach counts provided in the RTT datasets. Complexities like patient prioritization, emergency admissions, or multiple referrals could not be modeled.
Clustering Constraints: Hierarchical clustering was applied on a trust-level summary dataset (~112 trusts), not on individual patient records (~2 million rows), due to computational practicality and the need for interpretability. As a result, cluster boundaries are indicative and should be interpreted qualitatively rather than rigidly.
Typology Generalization: The typology categories (e.g., “Large Trusts Managing Backlogs Steadily”) were created to assist in interpretation and strategic thinking, but trusts within the same typology may still differ significantly based on local factors.
Data Reporting Variations: Some trusts had missing or inconsistent reporting in the monthly RTT data, meaning their inclusion in summary analyses reflects only available submissions.
These considerations are important when applying the findings in real-world NHS operational contexts.
This project explored patterns of RTT delay across NHS acute trusts through trend analysis, backlog characterization, clustering, and typology development. The findings reveal significant trust-level variations and highlight the need for tailored operational strategies rather than generalized policies.
Both clustering results and volume-delay typologies offer structured frameworks to identify areas needing attention, supporting targeted improvements and better backlog management across the NHS.