Exploring the Impact of Travel Distance on Cancer Patient Experience: A Geospatial Analysis of the Under 16 CPES, 2024.

Executive Summary

Aim

This report explores how far families in the UK travel for children’s cancer treatment and whether longer distances impact their satisfaction with care. We looked at 2024 survey data from 3,434 young patients (under 16) treated at specialist centres in England and Wales. Parents (717 responses) and children (366 responses) rated their overall experiences. Travel distances were calculated based on real driving routes from home postcodes to treatment centres.

Key Findings

Overall: Distance Does not Affect Most Families’ Satisfaction

Families traveled an average of 50 km (about 48 minutes by car), with some going up to 563 km.
Overall, there was no clear link between distance and satisfaction scores: the trend was very weak, negative, and not statistically significant for both parents (Spearman's rho = -0.008, p = 0.837) and children (rho = -0.032, p = 0.538).
Most families (89%) traveled under 100 km, and even those with long journeys reported similar satisfaction levels to those living nearby. This suggests the UK’s children’s cancer services generally handle distance well for the majority.

Travel Distances: Who Travels Farthest?

By Distance Bands: About 26% traveled 50-100 km, 25% traveled 25-50 km, and 11% went over 100 km. Shorter trips (under 25 km) made up 38%.
By Treatment Centre: Centres like Leicester Royal Infirmary and Leeds General Infirmary had shorter average trips (20-40 km). Others, like Birmingham Children’s Hospital and University College Hospital, served wider areas with some families traveling 400+ km.
By Region/Country: English families averaged 48 km; Welsh families traveled much farther (161 km on average) but still reported high satisfaction. Rural areas like the South West (Peninsula) had the longest trips (up to 217 km average) but no drop in satisfaction.
By Cancer Type: Retinoblastoma patients traveled the farthest (116 km average), followed by hepatic tumours (59 km) and malignant bone tumours (53 km). Leukaemias and lymphomas had the shortest trips (44 km).
Visual Summary: Families in central England and London travel shorter distances, while those in rural Wales, South West England, and northern areas face longer journeys.

Correlation Between Distance and Satisfaction: Where It Matters

While distance did not affect most patients, some groups showed patterns:

Negative Impacts (Worse Experience with Longer Distance):
Bone tumour families: Moderate negative link; longer trips were tied to lower satisfaction, likely due to frequent visits for surgery and rehabilitation.
Southampton General Hospital: Similar negative trend, possibly from inadequate support for its rural patients.
Positive Impacts (Better Experience with Longer Distance):
Retinoblastoma families: Positive link; families were happier traveling far to specialist centres, which suggests centralized care can work well with good support.
Humber, Coast and Vale region: Unexpected positive trend; families traveling farther to distant specialists reported better experiences.
Welsh families: High satisfaction despite trips roughly three times longer than those of English families, possibly reflecting strong cross-border support.

No Clear Links in Other Groups:

By region: South East had a slight negative trend (driven by Southampton), but South West managed long distances well.
By income level (deprivation): No differences; distance affected rich and poor families similarly.
By age: Parents of younger kids (0-7) slightly preferred distant specialists; teens (12-15) showed a minor negative trend, perhaps due to school disruptions.
By Cancer Alliance: Peninsula (South West) excelled with long distances but high satisfaction; no widespread issues.

What This Means

Most families are satisfied regardless of distance, which suggests the UK’s system works for the majority. However, hidden challenges exist for specific groups, such as bone tumour patients or those at certain centres. Success stories (e.g., Welsh and retinoblastoma families) suggest that with better support, such as travel help, accommodation, or local partnerships, distance barriers can be overcome.

options(repos = c(CRAN = "https://cran.rstudio.com/"))

# Install these packages only if they are not already available,
# so the report does not reinstall them on every run
for (pkg in c("dplyr", "gridExtra", "leaflet", "readxl")) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
}

library(dplyr)
library(tidyr)
library(haven)
library(labelled)
library(DT)
library(stringr)
library(ggplot2)
library(plotly)
library(kableExtra)
library(data.table)
library(PostcodesioR)
library(leaflet)
library(gridExtra)
library(RColorBrewer)
library(corrplot)
library(viridis)
library(sf)
library(htmlwidgets) #for offline leaflet maps
library(htmltools) #for offline leaflet maps
library(scales)
library(knitr)
library(ggpubr)
library(forcats)
library(httr)
library(osrm)
library(tidyverse)
library(readxl)
library(boot)
library(grid)

#Color palette 
cb_palette <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", 
                "#0072B2", "#D55E00", "#CC79A7", "#999999")

data <- read_excel('data.xlsx')

Exploratory Data Analysis

Table 1: Summary Statistics of Distance and Travel Time
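
The code behind Table 1 is not shown in this report. As a minimal sketch of how such a summary could be produced, the block below computes the mean, median, and maximum of a distance column and a travel-time column on toy data. The column name `travel_time_min` and the helper `summarise_var` are hypothetical, and the values are illustrative, not survey data.

```r
# Toy data standing in for the survey extract (illustrative values only).
# `travel_time_min` is a hypothetical column name for driving time in minutes.
toy <- data.frame(
  distance_km     = c(5, 20, 35, 60, 130, NA),
  travel_time_min = c(10, 25, 40, 55, 95, NA)
)

# Hypothetical helper: summary statistics for one numeric column, ignoring NAs
summarise_var <- function(x) {
  c(
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE)
  )
}

# One column of statistics per variable; rows are mean / median / max
summary_stats <- round(sapply(toy, summarise_var), 1)
summary_stats
```

The result is a small matrix that could be passed to kable() in the same way as the other tables in this report.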

Fig 1: Distribution of Travel Distance

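p_hist <- ggplot(data, aes(x = distance_km)) +
  geom_histogram(bins = 30, fill = cb_palette[1], alpha = 0.7, color = "white",
                 aes(text = paste("Range: ", round(after_stat(xmin), 1), "-", 
                                  round(after_stat(xmax), 1), "km<br>",
                                  "Count: ", after_stat(count), " patients"))) +
  # Reference lines named in the plot annotation: red = mean, blue = median
  geom_vline(xintercept = mean(data$distance_km, na.rm = TRUE),
             color = "red", linetype = "dashed") +
  geom_vline(xintercept = median(data$distance_km, na.rm = TRUE),
             color = "blue", linetype = "dashed") +
  labs(
    title = "Distribution of Travel Distance",
    x = "Distance from Home to Treatment Centre (km)",
    y = "Number of Patients"
  )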

# Convert to interactive plotly
ggplotly(p_hist, tooltip = "text") %>%
  layout(
    hoverlabel = list(bgcolor = "white", font = list(size = 12)),
    annotations = list(
      list(x = 0.5, y = -0.15,
           text = "Hover over bars for details | Red line = mean, Blue line = median",
           showarrow = FALSE, xref = "paper", yref = "paper",
           font = list(size = 10, color = "grey50"))
    )
  )

Fig 2: Distribution of Distance Traveled by Distance Bands

data <- data %>%
  mutate(
    distance_band = factor(
      distance_band,
      levels = c("<10 km", "10-25 km", "25-50 km", "50-100 km", ">100 km"),
      ordered = TRUE
    )
  )
dt1 <- ggplot(data, aes(x = distance_band, fill = distance_band)) +
  geom_bar() +
  geom_text(stat = "count", aes(label = paste0(after_stat(count), "\n(", 
            round(after_stat(count)/sum(after_stat(count))*100, 1), "%)")), 
            vjust = -0.2, size = 3) +
  scale_fill_viridis_d(option = "plasma") +
  labs(
    title = "Distribution by Distance Bands",
    x = "Distance Band",
    y = "Number of Patients"
  ) +
  scale_y_continuous(
    expand = expansion(mult = c(0, 0.18)),  # 18% headroom above the bars for the count labels
    breaks = seq(0, 1000, 200)
  ) +
  theme(legend.position = "none")
plot(dt1)

Fig 2 shows the distribution of patients across distance bands:

The majority of patients (88.8%) travel less than 100 km, including 25.8% who travel 50-100 km.
A smaller share of the sample (11.2%) travels more than 100 km. Travel distances are therefore generally moderate, but roughly one in nine patients still faces a significant travel burden.
The 10-25 km and 25-50 km bands together account for 48.1% of patients, indicating a moderate travel burden for many families.
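
The `distance_band` column arrives precomputed in the data, so its exact derivation is not shown. A minimal sketch of how bands with the same labels could be built from `distance_km` with cut() follows. The boundary handling (left-closed intervals, so exactly 100 km falls in ">100 km") is an assumption, and the distances are toy values.

```r
# Toy distances in km (illustrative values, not survey data)
distance_km <- c(4.2, 10, 24.9, 50, 99.9, 100, 250)

# Band edges and labels matching the levels used in Fig 2.
# Assumption: intervals are left-closed, e.g. [10, 25), [25, 50), ...
breaks <- c(0, 10, 25, 50, 100, Inf)
labels <- c("<10 km", "10-25 km", "25-50 km", "50-100 km", ">100 km")

distance_band <- cut(
  distance_km,
  breaks = breaks,
  labels = labels,
  right = FALSE,          # left-closed, right-open intervals
  ordered_result = TRUE   # ordered factor, as in the plotting code
)

# Share of patients in each band, in percent
band_pct <- round(prop.table(table(distance_band)) * 100, 1)
band_pct
```

Whether the boundary values belong to the lower or upper band should be checked against the definition used when the survey extract was prepared.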

Distance by Treatment Centres

Table 3: Travel Distance by Principal Treatment Centres

# Summary by PTC
site_summary <- data %>%
  group_by(ptc_name) %>%
  summarise(
    `N Patients` = n(),
    `Mean Distance (km)` = round(mean(distance_km, na.rm = TRUE), 1),
    `Median Distance (km)` = round(median(distance_km, na.rm = TRUE), 1),
    `Max Distance (km)` = round(max(distance_km, na.rm = TRUE), 1),
    `% Traveling >50km` = round(sum(distance_km > 50, na.rm = TRUE) / n() * 100, 1)
  ) %>%
  arrange(desc(`Mean Distance (km)`))
site_summary %>%
  kable(caption = "Travel Distance by Principal Treatment Centre") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0, bold = TRUE, background = "#f0f0f0")

Travel Distance by Principal Treatment Centre
ptc_name	N Patients	Mean Distance (km)	Median Distance (km)	Max Distance (km)	% Traveling >50km
University Hospitals Bristol and Weston NHS Foundation Trust	126	74.1	50.7	307.5	50.8
Cambridge University Hospitals NHS Foundation Trust	236	71.3	68.6	150.7	75.8
Birmingham Women’s and Children’s NHS Foundation Trust	401	63.7	36.7	398.8	40.1
University Hospital Southampton NHS Foundation Trust	235	58.0	50.3	268.0	50.6
The Newcastle upon Tyne Hospitals NHS Foundation Trust	189	53.3	38.6	562.8	45.0
Oxford University Hospitals NHS Foundation Trust	162	48.3	49.4	184.5	49.4
Alder Hey Children’s NHS Foundation Trust	220	47.2	27.3	243.7	32.3
The Royal Marsden NHS Foundation Trust & St George’s University Hospitals NHS Foundation Trust	337	47.1	32.8	400.9	38.3
Great Ormond Street Hospital for Children NHS Foundation Trust & University College London Hospitals NHS Foundation Trust	711	43.5	23.4	434.6	26.6
Manchester University NHS Foundation Trust	264	39.5	25.7	335.7	25.8
Nottingham University Hospitals NHS Trust & University Hospitals of Leicester NHS Trust	174	37.6	28.5	135.7	25.9
Sheffield Children’s NHS Foundation Trust	131	36.7	24.2	147.4	24.4
Leeds Teaching Hospitals NHS Trust	272	35.1	26.4	281.4	18.0

Fig 4: Interactive Map showing Patient Locations and Treatment Centres

site_locations <- data %>%
  group_by(site_name, site_latitude, site_longitude, site_postcode) %>%
  summarise(
    n_patients = n(),
    cancer_alliance_name = first(cancer_alliance_name),
    .groups = "drop"
  )

pal <- leaflet::colorNumeric(palette = "YlOrRd", domain = data$distance_km)

map <- leaflet(data) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  
  addCircleMarkers(
    lng = ~patient_longitude,
    lat = ~patient_latitude,
    radius = 3,
    color = ~pal(distance_km),
    fillOpacity = 0.6,
    stroke = FALSE,
    
    # Detailed popup on CLICK
    popup = ~paste0(
      "<div style='min-width: 200px;'>",
      "<b style='font-size: 13px; color: #d55e00;'>Patient Details</b><br>",
      "<b>Distance:</b> ", round(distance_km, 1), " km<br>",
      "<b>Treatment Centre:</b> ", site_name, "<br>",
      "<b>Diagnostic Group:</b> ", diagnostic_group, "<br>",
      "<b>Age:</b> ", patient_age, " years",
      "</div>"
    ),
    
    # Quick info on HOVER
    label = ~paste0(round(distance_km, 1), " km"),
    
    # Style hover label
    labelOptions = labelOptions(
      style = list(
        "font-weight" = "bold",
        "font-size" = "13px",
        "padding" = "4px 10px",
        "background-color" = "rgba(255, 255, 255, 0.95)",
        "border" = "1px solid #999",
        "border-radius" = "3px",
        "box-shadow" = "2px 2px 4px rgba(0,0,0,0.3)"
      ),
      direction = "top",
      offset = c(0, -5)
    )
  ) %>%
  
  addMarkers(
    data = site_locations,
    lng = ~site_longitude,
    lat = ~site_latitude,
    popup = ~paste0(
      "<b style='font-size: 14px; color: #2c3e50;'>", site_name, "</b><br>",
      "<b>Location:</b> ", site_postcode, "<br>",
      "<b>Number of Patients:</b> ", n_patients, "<br>",
      "<b>Cancer Alliance:</b> ", cancer_alliance_name
    ),
    label = ~paste0(site_name, " (", n_patients, " patients)")
  ) %>%
  
  addLegend(
    position = "bottomright",
    pal = pal,
    values = ~distance_km,
    title = "Distance (km)",
    opacity = 1
  ) %>%
  
  # Add title control
  addControl(
    html = "<div style='background: rgba(255,255,255,0.8); padding: 8px; border-radius: 4px;'>
            <b>Hover over dots to see distance | Click for full details</b></div>",
    position = "topright"
  )

map

Figure 4 visualises how far patients must travel to treatment centres, using a colour gradient for travel distance. Each point represents a patient’s home location; darker points indicate longer travel distances and lighter points indicate shorter ones. Hover over a point to see the distance travelled, and click (when the pointer changes to a hand) to reveal the distance, the cancer type, and the patient’s treatment centre.

Central England and London areas appear to have relatively shorter travel distances, as indicated by the lighter shades. This suggests that patients in these areas tend to travel short distances for treatment.
Rural areas and regions outside the major urban centers, like parts of Wales and South West of England, show darker areas, meaning patients in these regions must travel longer distances to access treatment.
The blue markers represent treatment centers (hospitals), with a clear clustering around major urban areas like London, Birmingham, Manchester, and Leeds. This clustering suggests that these cities house centralised hospitals, which are specialised centers serving larger patient populations.
Treatment centers in central locations (such as London and the Midlands) serve as hubs, with patients traveling from a wider geographic range, potentially requiring them to cover longer distances.

The longer distances shown in areas like rural Wales and northern England are particularly significant for patients in remote or under-served regions, who may face substantial travel burdens. This map reinforces the rationale for this project: distance could be a key factor in patient experience, as these patients may have more difficulty accessing care than those living in urban areas.

Distance by NHS England Regions

# Load NHS England Regions shapefile
nhs_regions <- st_read("NHSER_JUL_2022_EN_BUC.shp")

## Reading layer `NHSER_JUL_2022_EN_BUC' from data source 
##   `C:\Users\imonikhe.ayeni\NHS\CDAO Drive - Surveys\U16CPES\Respondent level data\2024\R scripts\imonikhe\Geospatial Analysis\NHSER_JUL_2022_EN_BUC.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 7 features and 7 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 87323.11 ymin: 7054.1 xmax: 655644.8 ymax: 657548.1
## Projected CRS: OSGB36 / British National Grid

# Rename NHSER22NM to region_name
nhs_regions <- nhs_regions %>%
  rename(region_name = NHSER22NM)

# Reproject NHS regions from British National Grid to WGS84 (EPSG:4326) for mapping
nhs_regions <- st_transform(nhs_regions, crs = 4326)

Table 4: Regional Travel Burden by Treatment Centre

# Calculate statistics by site region
site_by_region <- data %>%
  group_by(site_region_name) %>%
  summarise(
    n_patients = n(),
    mean_distance = round(mean(distance_km, na.rm = TRUE), 1),
    median_distance = round(median(distance_km, na.rm = TRUE), 1),
    pct_over_50km = round(sum(distance_km > 50, na.rm = TRUE) / n() * 100, 1),
    n_hospitals = n_distinct(site_name),   # number of distinct treatment sites in the region
    .groups = "drop"
  )

kable(site_by_region)

site_region_name	n_patients	mean_distance	median_distance	pct_over_50km	n_hospitals
East of England	236	71.3	68.6	75.8	1
London	1048	44.7	25.3	30.3	7
Midlands	575	55.8	34.6	35.8	4
North East and Yorkshire	592	41.3	27.5	28.0	3
North West	484	43.0	26.6	28.7	2
South East	397	54.0	50.2	50.1	3
South West	126	74.1	50.7	50.8	1

Table 4 groups patients by the region where they receive treatment:

Distances reflect how far patients had to travel to reach sites in that region, regardless of where they live.
Sites located in London serve both the local population and patients from across the UK, so the mean travel distance to London sites (44.7 km) is much higher than the mean for London residents (shown in Table 5).
Regions with more hospitals (London, Midlands, North East and Yorkshire) tend to show shorter travel distances and lower percentages of patients travelling >50 km, although the Midlands mean (55.8 km) is comparable to that of the South East (54.0 km).
Regions with a single hospital (East of England, South West) show the longest mean distances (71.3 km and 74.1 km) and the highest percentages of patients travelling >50 km (75.8% and 50.8%).

Figure 8: Regional Travel Burden by Treatment Centre

# Join site region stats to spatial data

site_regions_with_data <- nhs_regions %>%
  left_join(site_by_region, by = c("region_name" = "site_region_name"))

# Ensure same CRS
site_regions_with_data <- st_transform(site_regions_with_data, crs = 4326)

# Create the choropleth map
ggplot(site_regions_with_data) +
  geom_sf(aes(fill = mean_distance), color = "white", size = 0.8) +
  scale_fill_viridis_c(
    option = "plasma",  
    name = "Mean\nDistance\n(km)",
    na.value = "grey90"
  ) +
  # Add labels with semi-transparent background
  geom_sf_label(
    aes(label = paste0(region_name, "\n", 
                       n_patients, " patients\n",
                       n_hospitals, " hospital", ifelse(n_hospitals > 1, "s", ""))),
    size = 3,
    fontface = "bold",
    fill = alpha("white", 0.85),
    color = "black",
    label.padding = unit(0.25, "lines"),
    label.r = unit(0.15, "lines"),
    fun.geometry = sf::st_centroid
  ) +
  labs(
    title = "Regional Travel Distance by Treatment Site",
    subtitle = "Under 16 Cancer Patient Experience Survey 2024",
    caption = "Source: U16 CPES 2024 | Treatment locations"
  ) +
  theme_void() +
  theme(
    plot.title = element_text(face = "bold", hjust = 0.5, size = 14),
    plot.subtitle = element_text(hjust = 0.5, size = 10),
    plot.caption = element_text(hjust = 1, size = 8, color = "grey50"),
    legend.position = "right",
    plot.margin = margin(10, 10, 10, 10)
  )

ggsave("Mean_Distance_by_Treatment_Site_Region.png", width = 10, height = 10, dpi = 300)

Figure 10: Interactive Map showing Regional Travel Pattern

# Calculate regional statistics for popups
region_stats <- data %>%
  group_by(patient_region_name) %>%
  summarise(
    n_patients = n(),
    mean_distance = round(mean(distance_km, na.rm = TRUE), 1),
    mean_experience = round(mean(X59, na.rm = TRUE), 2),
    .groups = "drop"
  )

# Join to spatial data
regions_for_map <- nhs_regions %>%
  left_join(region_stats, by = c("region_name" = "patient_region_name"))

# Get unique treatment centres
treatment_centres <- data %>%
  group_by(site_name, site_longitude, site_latitude, site_region_name) %>%
  summarise(
    n_patients = n(),
    .groups = "drop"
  )

# Create color palettes
pal_regions <- colorNumeric(
  palette = "YlOrRd",
  domain = regions_for_map$mean_distance,
  na.color = "grey90"
)

pal_patients <- colorNumeric(
  palette = "YlOrRd",
  domain = data$distance_km
)

# Create interactive map
leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  
  # Add NHS Region boundaries
  addPolygons(
    data = regions_for_map,
    fillColor = ~pal_regions(mean_distance),
    fillOpacity = 0.4,
    color = "#444444",
    weight = 2,
    opacity = 0.8,
    popup = ~paste0(
      "<b style='font-size: 14px;'>", region_name, "</b><br>",
      "<b>Patients:</b> ", n_patients, "<br>",
      "<b>Mean Distance:</b> ", mean_distance, " km<br>",
      "<b>Mean Experience:</b> ", mean_experience
    ),
    label = ~region_name,
    highlightOptions = highlightOptions(
      weight = 3,
      color = "#666",
      fillOpacity = 0.6,
      bringToFront = FALSE
    ),
    group = "NHS Regions"
  ) %>%
  
  # Add patient points
  addCircleMarkers(
    data = data,
    lng = ~patient_longitude,
    lat = ~patient_latitude,
    radius = 3,
    color = ~pal_patients(distance_km),
    fillOpacity = 0.6,
    stroke = FALSE,
    popup = ~paste0(
      "<b>Patient Region:</b> ", patient_region_name, "<br>",
      "<b>Treatment Centre:</b> ", site_name, "<br>",
      "<b>Site Region:</b> ", site_region_name, "<br>",
      "<b>Cross-boundary:</b> ", cross_boundary, "<br>",
      "<b>Distance:</b> ", round(distance_km, 1), " km"
    ),
    label = ~paste0(round(distance_km, 1), " km"),
    group = "Patients"
  ) %>%
  
  # Add treatment centres
  addMarkers(
    data = treatment_centres,
    lng = ~site_longitude,
    lat = ~site_latitude,
    popup = ~paste0(
      "<b>", site_name, "</b><br>",
      "<b>Region:</b> ", site_region_name
    ),
    label = ~site_name,
    group = "Treatment Centres"
  ) %>%
  
  # Add legends
  addLegend(
    position = "bottomright",
    pal = pal_regions,
    values = regions_for_map$mean_distance,
    title = "Regional Mean<br>Distance (km)",
    opacity = 0.8,
    group = "NHS Regions"
  ) %>%
  
  # Add layer control
  addLayersControl(
    overlayGroups = c("NHS Regions", "Patients", "Treatment Centres"),
    options = layersControlOptions(collapsed = FALSE)
  ) %>%
  
  # Add title
  addControl(
    html = "<div style='background: white; padding: 10px; border-radius: 5px; font-weight: bold;'>
          Patient Travel Patterns<br>
            <span style='font-size: 11px; font-weight: normal;'>
            Toggle layers | Hover regions for stats | Click patients for details
            </span></div>",
    position = "topleft"
  )

Table 6: Regional Cross Boundary Travel Patterns

# Analyze cross-boundary travel patterns
cross_boundary_flows <- data %>%
  filter(cross_boundary == "Yes") %>%
  group_by(patient_region_name, site_region_name) %>%
  summarise(
    n_patients = n(),
    mean_distance = round(mean(distance_km, na.rm = TRUE), 1),
    .groups = "drop"
  ) %>%
  arrange(desc(n_patients))

# View top flows
cross_boundary_flows %>%
  head(10) %>%
  kable(caption = "Top 10 Cross-Boundary Patient Flows") %>%
  kable_styling(bootstrap_options = c("striped", "hover"))

Top 10 Cross-Boundary Patient Flows
patient_region_name	site_region_name	n_patients	mean_distance
South East	London	290	67.0
East of England	London	172	52.5
South West	South East	66	65.0
Midlands	North East and Yorkshire	36	54.7
East of England	South East	20	66.0
North East and Yorkshire	Midlands	20	254.6
North West	Midlands	18	152.5
Midlands	North West	15	100.9
Midlands	East of England	14	87.3
South West	London	14	247.3

# Create flow map (simplified version)

Table 6 shows the ten most frequent cross-boundary patient flows:

The most common flow was from the South East to London (n=290 patients, mean distance=67.0 km), representing patients from the South East region accessing specialist pediatric oncology centers in London.
The second largest flow was from the East of England to London (n=172, mean=52.5 km), further demonstrating London’s role as a regional referral hub for surrounding areas.
Notable long-distance flows included patients from the North East and Yorkshire traveling to the Midlands (n=20, mean=254.6 km) and South West patients accessing London services (n=14, mean=247.3 km).
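
The `cross_boundary` flag is supplied with the data, and the report does not show how it was derived. One plausible derivation, sketched below on toy data, marks a patient as cross-boundary when the region of residence differs from the region of the treatment site, then summarises the flows. This rule is an assumption, not the confirmed definition.

```r
# Toy records (illustrative values only)
toy <- data.frame(
  patient_region_name = c("South East", "London", "East of England", "South East"),
  site_region_name    = c("London", "London", "London", "South East"),
  distance_km         = c(70, 12, 55, 30)
)

# Assumed rule: cross-boundary when home region and site region differ
toy$cross_boundary <- ifelse(
  toy$patient_region_name != toy$site_region_name, "Yes", "No"
)

# Mean distance for each origin-destination pair among cross-boundary patients
flows <- aggregate(
  distance_km ~ patient_region_name + site_region_name,
  data = toy[toy$cross_boundary == "Yes", ],
  FUN = mean
)
flows
```

Patients living outside England, such as Welsh families, have no NHS England region of residence, so a production version would need an explicit rule for them.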

Distance by Cancer Alliance

Fig 10: Boxplot showing Distance Travelled by Cancer Alliance

p1 <- ggplot(data, aes(x = cancer_alliance_name, y = distance_km, 
                       fill = cancer_alliance_name)) +
  geom_boxplot() +
  
  labs(title = "Distance by Cancer Alliance",
       x = "", y = "Distance (km)") +
  theme(legend.position = "none", axis.text.x = element_text(angle = 20, hjust = 1))
p1

Distance by Diagnostic Group

Table 7: Distance Travelled by Diagnostic Group

# Summary by Diagnostic Group
Diagnostic_summary <- data %>%
  group_by(diagnostic_group) %>%
  summarise(
    `N Patients` = n(),
    `Mean Distance (km)` = round(mean(distance_km, na.rm = TRUE), 1),
    `Median Distance (km)` = round(median(distance_km, na.rm = TRUE), 1),
    `Max Distance (km)` = round(max(distance_km, na.rm = TRUE), 1),
    `% Traveling >50km` = round(sum(distance_km > 50, na.rm = TRUE) / n() * 100, 1)
  ) %>%
  arrange(desc(`Mean Distance (km)`))
Diagnostic_summary %>%
  kable(caption = "Travel Distance by Diagnostic Group") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  row_spec(0, bold = TRUE, background = "#f0f0f0")

Travel Distance by Diagnostic Group
diagnostic_group	N Patients	Mean Distance (km)	Median Distance (km)	Max Distance (km)	% Traveling >50km
Retinoblastoma	120	115.6	98.8	376.7	69.2
Hepatic tumours	34	59.4	36.4	315.5	44.1
Malignant bone tumours	158	53.2	30.8	398.8	34.2
CNS and miscellaneous intracranial and intraspinal neoplasms	704	51.1	34.7	421.9	38.6
All other	530	50.7	36.7	434.6	40.4
Renal tumours	132	49.5	31.9	268.0	33.3
Leukaemias, myeloproliferative diseases, and myelodysplastic diseases	1431	43.9	31.1	562.8	32.6
Lymphomas and reticuloendothelial neoplasms	349	43.7	31.4	261.2	35.2

Table 7 shows the travel distances for patients diagnosed with different types of cancer, grouped by diagnostic category. The table includes several key metrics: the number of patients, mean distance, median distance, maximum distance, and the percentage of patients traveling over 50 km.

Key Insights:

Retinoblastoma (a type of eye cancer) has the highest mean travel distance (115.6 km), which indicates that patients with this condition must travel to a small number of specialised centres for treatment.
Similarly, the maximum distances traveled by patients with Retinoblastoma (376.7 km) and Hepatic tumours (315.5 km) suggest that some of these patients travel long distances to access specialised care.
Lymphomas and Leukaemias show the shortest mean travel distances (43.7 km and 43.9 km respectively), with medians of about 31 km, suggesting that many of these patients may have access to treatment in local or regional centres.
CNS and miscellaneous intracranial and intraspinal neoplasms have a mean distance of 51.1 km, which is still relatively moderate, indicating regional accessibility for these patients.

Distance and Patient Experience

Figure 20: Distance Travelled by Child Experience Score

# Boxplot showing distance distribution by X60 score, excluding NA values
dce <- ggplot(data %>% filter(!is.na(X60)), aes(x = factor(X60), y = distance_km, fill = factor(X60))) +
  geom_boxplot(alpha = 0.7, outlier.colour = "red", outlier.alpha = 0.5) +
  # Mean distance per score, drawn as the red diamond described in the subtitle
  stat_summary(fun = mean, geom = "point", shape = 23, size = 3, fill = "red") +
  scale_fill_viridis_d(option = "plasma") +
  labs(
    title = "Distance Travelled by Child Experience Score (X60)",
    subtitle = "Red diamond indicates mean distance; red dots show outliers",
    x = "Child Experience Score (X60)",
    y = "Distance Travelled (km)",
    caption = "Distribution of travel distances for each experience rating"
  ) +
  theme(legend.position = "none")
interactive_dce <- ggplotly(dce)
interactive_dce

Figure 20 shows the distribution of travel distances (in kilometers) for each Child Experience Score (X60). The child experience score (X60) measures how children perceive their treatment experience.

Key Insights:

The “Very well” group shows the widest range of travel distances, including some extreme outliers. It is also by far the largest group, so a wider spread is expected; the plot gives no indication that children who travel farther report worse experiences.
While many patients travel moderate distances to their treatment centres, there are clearly visible outliers in the “Quite well” and “Very well” groups who travel much longer distances, possibly because the specialised care they need is not available locally.

Distance Distribution by Parent Experience (X59 score)

Figure 21: Distance Travelled by Parent Experience Score

# Boxplot showing distance distribution by X59 score, excluding NA values
dce <- ggplot(data %>% filter(!is.na(X59)), aes(x = factor(X59), y = distance_km, fill = factor(X59))) +
  geom_boxplot(alpha = 0.7, outlier.colour = "red", outlier.alpha = 0.5) +
  # Mean distance per score, drawn as the red diamond described in the subtitle
  stat_summary(fun = mean, geom = "point", shape = 23, size = 3, fill = "red") +
  scale_fill_viridis_d(option = "plasma") +
  labs(
    title = "Distance Travelled by Parent Experience Score (X59)",
    subtitle = "Red diamond indicates mean distance; red dots show outliers",
    x = "Parent Experience Score (X59)",
    y = "Distance Travelled (km)",
    caption = "Distribution of travel distances for each experience rating"
  ) +
  theme(legend.position = "none")

interactive_dce <- ggplotly(dce)
interactive_dce

Figure 21 shows the distribution of travel distances (in kilometers) according to Parent Experience Scores (X59). The parent experience score (X59) measures how well parents perceive the care their children receive, and this plot could be used to explore how these perceptions relate to the distance they travel for treatment.

Key Insights:

Parents giving scores of 9-10 (highest satisfaction) travel similar or even longer distances than those giving lower scores. This suggests that distance alone does not determine experience quality.
Extreme outliers occur at high experience scores: parents scoring 9-10 include those traveling 300-400+ km (red outlier dots).

These families travel the furthest distances yet still report excellent experiences.
This indicates that specialised care quality may outweigh travel burden for some families.

So far, there is no obvious downward trend in which longer distances equal worse experiences.

Statistical Analysis

# Extract the numeric part of the parent experience score (X59)
data <- data %>%
  mutate(
    X59 = as.numeric(str_extract(X59, "\\d+(?:\\.\\d+)?"))
  )

# Ensure that X60 is properly ordered from worst to best experience
data$X60 <- factor(
  data$X60,
  levels = c("Not at all well", "Not very well", "OK", "Quite well", "Very well"),
  ordered = TRUE
)

# Convert to numeric (1 = lowest satisfaction, 5 = highest)
data$X60_num <- as.numeric(data$X60)

# Check that conversion worked correctly
table(data$X60, data$X60_num)

##                  
##                     1   2   3   4   5
##   Not at all well   1   0   0   0   0
##   Not very well     0   1   0   0   0
##   OK                0   0   8   0   0
##   Quite well        0   0   0  70   0
##   Very well         0   0   0   0 288

# Create parent_cohort dataset that stores NA-free X59
parent_cohort <- data %>%
  filter(!is.na(X59))

# Create child_cohort dataset that stores NA-free X60
child_cohort <- data %>%
  filter(!is.na(X60))

Spearman and Kendall Correlation Analysis

Non‑parametric correlation analyses were conducted using Spearman’s rho and Kendall’s tau. These methods were selected because they do not assume normality, are robust to outliers, and are appropriate for ordinal or non‑linear data. Because travel distance and experience scores are not normally distributed and contain extreme values, rank‑based correlations provide a more reliable assessment of monotonic relationships than Pearson’s correlation.
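
As a small illustration of this reasoning, the sketch below compares the three coefficients on toy values (not survey data) where the relationship is perfectly monotonic but one distance is extreme. Because the rank-based measures depend only on ordering, the outlier does not affect them, while Pearson's coefficient is pulled below 1.

```r
# Toy values: score rises with distance, and one distance is an extreme outlier
distance <- c(5, 10, 20, 40, 80, 500)
score    <- c(1, 2, 3, 4, 5, 6)

pearson  <- cor(distance, score, method = "pearson")
spearman <- cor(distance, score, method = "spearman")
kendall  <- cor(distance, score, method = "kendall")

# The rank-based coefficients equal 1 for any strictly increasing relationship;
# Pearson is lower because the relationship is not linear
round(c(pearson = pearson, spearman = spearman, kendall = kendall), 3)
```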

Table 10: Spearman and Kendall Correlation Analysis

# bootstrap CI for correlation
bootstrap_ci <- function(data, x, y, method = "spearman", R = 2000) {
  fn <- function(d, indices) {
    dd <- d[indices, ]
    return(cor(dd[[x]], dd[[y]], method = method, use = "complete.obs"))
  }
  set.seed(123)
  boot_res <- boot(data, fn, R = R)
  boot_ci <- boot.ci(boot_res, type = "perc")
  c(lower = boot_ci$percent[4], upper = boot_ci$percent[5])
}

# cor.test() drops incomplete pairs itself and has no `use` argument.
# exact = FALSE requests the asymptotic p-value, which R falls back to
# anyway when ranks are tied, and avoids the tie warning.

# Spearman correlations
spearman_child <- cor.test(child_cohort$distance_km, child_cohort$X60_num, method = "spearman", exact = FALSE)
spearman_parent <- cor.test(parent_cohort$distance_km, parent_cohort$X59, method = "spearman", exact = FALSE)

# Kendall correlations
kendall_child <- cor.test(child_cohort$distance_km, child_cohort$X60_num, method = "kendall", exact = FALSE)
kendall_parent <- cor.test(parent_cohort$distance_km, parent_cohort$X59, method = "kendall", exact = FALSE)

# Bootstrap CIs
ci_spearman_child <- bootstrap_ci(child_cohort, "distance_km", "X60_num", method = "spearman")
ci_spearman_parent <- bootstrap_ci(parent_cohort, "distance_km", "X59", method = "spearman")
ci_kendall_child <- bootstrap_ci(child_cohort, "distance_km", "X60_num", method = "kendall")
ci_kendall_parent <- bootstrap_ci(parent_cohort, "distance_km", "X59", method = "kendall")

# Combine results
correlation_summary <- data.frame(
  Experience_Type = c("Child (X60)", "Parent (X59)", "Child (X60)", "Parent (X59)"),
  Method = c("Spearman", "Spearman", "Kendall", "Kendall"),
  Correlation = c(spearman_child$estimate, spearman_parent$estimate,
                  kendall_child$estimate, kendall_parent$estimate),
  P_Value = c(spearman_child$p.value, spearman_parent$p.value,
              kendall_child$p.value, kendall_parent$p.value),
  CI = c(
    paste0("(", round(ci_spearman_child["lower"], 3), ", ", round(ci_spearman_child["upper"], 3), ")"),
    paste0("(", round(ci_spearman_parent["lower"], 3), ", ", round(ci_spearman_parent["upper"], 3), ")"),
    paste0("(", round(ci_kendall_child["lower"], 3), ", ", round(ci_kendall_child["upper"], 3), ")"),
    paste0("(", round(ci_kendall_parent["lower"], 3), ", ", round(ci_kendall_parent["upper"], 3), ")")
  )
)

# Round correlation and p-values
correlation_summary <- correlation_summary %>%
  mutate(across(c(Correlation, P_Value), ~ round(.x, 3)))

correlation_summary

##   Experience_Type   Method Correlation P_Value              CI
## 1     Child (X60) Spearman      -0.032   0.538 (-0.134, 0.073)
## 2    Parent (X59) Spearman      -0.008   0.837 (-0.079, 0.065)
## 3     Child (X60)  Kendall      -0.025   0.546 (-0.108, 0.059)
## 4    Parent (X59)  Kendall      -0.006   0.840  (-0.06, 0.049)

Table 10 and Figure 22 summarise the results of non‑parametric correlation analyses (Spearman’s rho and Kendall’s tau) conducted to examine the relationship between travel distance and both child (X60) and parent (X59) experience scores. Confidence intervals were estimated using bootstrap resampling.
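For readers unfamiliar with the resampling step, the idea behind a percentile bootstrap interval can be sketched in base R without the boot package. This is an illustrative sketch only: `percentile_ci` is a hypothetical helper, the data are synthetic, and because `quantile()` and `boot.ci()` interpolate slightly differently, the limits will not match the report's values exactly.

```r
# Hypothetical helper (not the report's code): percentile bootstrap CI in base R
percentile_ci <- function(x, y, method = "spearman", R = 2000, level = 0.95, seed = 123) {
  keep <- complete.cases(x, y)               # drop incomplete pairs once, up front
  x <- x[keep]
  y <- y[keep]
  n <- length(x)
  set.seed(seed)
  stats <- replicate(R, {
    idx <- sample.int(n, n, replace = TRUE)  # resample pairs with replacement
    suppressWarnings(cor(x[idx], y[idx], method = method))
  })
  alpha <- 1 - level
  # Percentile interval: the alpha/2 and 1 - alpha/2 quantiles of the resampled statistics
  quantile(stats, probs = c(alpha / 2, 1 - alpha / 2), na.rm = TRUE, names = FALSE)
}

# Synthetic example data, unrelated to the survey
set.seed(1)
d <- runif(200, 1, 300)
s <- sample(0:10, 200, replace = TRUE)
ci <- percentile_ci(d, s, R = 500)
ci
```

Each resample draws whole (distance, score) pairs, so the dependence between the two variables is preserved within every bootstrap replicate.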

Results Interpretation

a) Child Experience (X60)

Spearman’s rho = -0.032, 95% CI = (-0.134, 0.073), p = 0.538
Kendall’s tau = -0.025, 95% CI = (-0.108, 0.059), p = 0.546

Interpretation:

Both Spearman’s and Kendall’s coefficients are very close to zero, indicating an almost nonexistent relationship between travel distance and child experience scores.
The negative signs suggest a weak tendency for longer travel distances to be associated with slightly lower experience scores, but the effect size is negligible.
The confidence intervals straddle zero, reinforcing that the true correlation could be negative, positive, or null.
The p‑values are well above 0.05, confirming that the observed associations are not statistically significant.
In practical terms, children who travel longer distances do not consistently report better or worse experiences.

b) Parent Experience (X59)

Spearman’s rho = -0.008, 95% CI = (-0.079, 0.065), p = 0.837
Kendall’s tau = -0.006, 95% CI = (-0.06, 0.049), p = 0.840

Interpretation:

For parents, the correlations are virtually zero, indicating no detectable monotonic association between travel distance and parent‑reported experience scores.
The confidence intervals are narrow and centered around zero, further supporting the absence of any meaningful relationship.
The very high p‑values (>0.8) mean that the null hypothesis of no correlation cannot be rejected.
This suggests that parents’ reported experiences are not measurably associated with the distance traveled.

Correlation by Sub-Groups

Distance and Parent-Experience (X59) Correlation by Diagnostic Group

Table 13: Correlation by Diagnostic Group

#Helper function: bootstrap CI for Spearman correlation
bootstrap_ci <- function(df, x, y, R = 1000) {
  fn <- function(d, indices) {
    dd <- d[indices, ]
    cor(dd[[x]], dd[[y]], method = "spearman", use = "complete.obs")
  }
  set.seed(123)
  boot_res <- boot(df, fn, R = R)
  boot_ci <- boot.ci(boot_res, type = "perc")
  c(lower = boot_ci$percent[4], upper = boot_ci$percent[5])
}

# Get subgroup list
groups <- unique(parent_cohort$diagnostic_group)

# Loop over groups to compute correlations + CI
cor_by_diag <- lapply(groups, function(g) {
  df <- parent_cohort %>% filter(diagnostic_group == g)
  
  # Spearman correlation test
  test <- cor.test(df$distance_km, df$X59, method = "spearman")
  
  # Bootstrap CI
  ci <- bootstrap_ci(df, "distance_km", "X59", R = 1000)
  
  tibble(
    diagnostic_group = g,
    N = nrow(df),
    Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
    
    Correlation = round(test$estimate, 3),
    P_Value = round(test$p.value, 4),
    CI_lower = round(ci[1], 3),
    CI_upper = round(ci[2], 3),
    CI = paste0("(", round(ci[1], 3), ", ", round(ci[2], 3), ")"),
    Significant = ifelse(test$p.value < 0.05, "✓", "")
  )
}) %>% bind_rows()



# Display table
cor_by_diag %>%
  arrange(desc(Correlation)) %>%
  kable(caption = "Correlation Between Distance and Parent Experience by Diagnostic Group") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Correlation Between Distance and Parent Experience by Diagnostic Group
diagnostic_group	N	Mean_Distance	Correlation	P_Value	CI_lower	CI_upper	CI	Significant
Hepatic tumours	8	72.6	0.507	0.1996	-0.259	0.906	(-0.259, 0.906)
Renal tumours	36	55.3	0.296	0.0792	-0.020	0.604	(-0.02, 0.604)
CNS and miscellaneous intracranial and intraspinal neoplasms	135	52.1	0.071	0.4157	-0.088	0.236	(-0.088, 0.236)
Retinoblastoma	26	97.3	0.042	0.8400	-0.365	0.413	(-0.365, 0.413)
Lymphomas and reticuloendothelial neoplasms	93	38.1	-0.008	0.9397	-0.235	0.199	(-0.235, 0.199)
Leukaemias, myeloproliferative diseases, and myelodysplastic diseases	274	44.2	-0.026	0.6716	-0.149	0.090	(-0.149, 0.09)
All other	108	57.0	-0.059	0.5433	-0.252	0.149	(-0.252, 0.149)
Malignant bone tumours	42	56.1	-0.362	0.0186	-0.603	-0.062	(-0.603, -0.062)	✓

Table 13 examines whether the relationship between travel distance and parent experience varies across different cancer types. Each diagnostic group was analyzed separately using Spearman’s correlation to identify cancer-specific patterns that might be masked in overall population analysis.

Key Findings:

Positive Correlations (farther distance = better experience):
Hepatic tumours: Strongest positive trend (r = 0.507), but not statistically significant (p = 0.200), and the sample is very small (N = 8)
Renal tumours: Moderate positive correlation (r = 0.296) that falls short of significance at the 0.05 level (p = 0.079, N = 36)

No Meaningful Relationship:
CNS tumours (N=135), Retinoblastoma (N=26), Lymphomas (N=93), Leukaemias (N=274), and Other cancers (N=108) all showed correlations near zero, indicating no detectable association between distance and parent experience

Negative Correlation (farther distance = worse experience):
Malignant bone tumours: The only statistically significant finding by diagnostic group, a negative correlation (r = -0.362, p = 0.019, N = 42). Families traveling farther tend to report worse experiences

Table 14 shows that the relationship between travel distance and parent experience varies across cancer types.

Key Insights:

Malignant bone tumours showed the strongest negative correlation (r = -0.362, 95% CI: -0.603 to -0.062), with an uncorrected p-value of 0.0186, indicating a statistically significant relationship at the conventional p = 0.05 level. However, after Bonferroni correction (p = 0.1488), this relationship no longer meets the stringent adjusted significance threshold.
Whilst the Bonferroni-corrected p-value (0.1488) does not reach statistical significance, the malignant bone tumours finding remains noteworthy for several reasons:

The correlation coefficient (r = -0.362) represents a moderate negative relationship, which may be clinically relevant even though it does not survive correction
The uncorrected 95% CI (-0.603 to -0.062) excludes zero, which is consistent with a negative correlation, although the interval is not adjusted for multiple testing
This is the only diagnostic group showing a substantial negative correlation.
The Bonferroni correction is deliberately conservative, prioritising reduction of false positives at the expense of potentially missing true effects. With small subgroup samples (n=42), this correction may be overly stringent.

All other diagnostic groups showed correlations that were weak, not statistically significant, or both, and none had a Bonferroni-corrected p-value below 0.05.
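The Bonferroni arithmetic quoted above can be reproduced from the uncorrected p-values printed in Table 13. The sketch below assumes the eight diagnostic groups are the full family of tests and uses the p-values as rounded in the table, so results may differ in the last decimal place from values computed on unrounded p-values. For malignant bone tumours the calculation is 0.0186 × 8 = 0.1488. Holm's method is included only as a less conservative point of comparison and is not used in the report.

```r
# Uncorrected p-values as printed in Table 13 (rounded to 4 decimal places)
p_raw <- c(
  "Hepatic tumours"        = 0.1996,
  "Renal tumours"          = 0.0792,
  "CNS"                    = 0.4157,
  "Retinoblastoma"         = 0.8400,
  "Lymphomas"              = 0.9397,
  "Leukaemias"             = 0.6716,
  "All other"              = 0.5433,
  "Malignant bone tumours" = 0.0186
)

# Bonferroni: multiply each p-value by the number of tests (8) and cap at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Holm: step-down alternative, shown for comparison only (not the report's method)
p_holm <- p.adjust(p_raw, method = "holm")

round(cbind(p_raw, p_bonf, p_holm), 4)
```

Holm-adjusted p-values are never larger than the Bonferroni ones, which is why it is often suggested when Bonferroni feels too strict for small subgroups.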

Distance and Parent-Experience (X59) Correlation by Cancer Alliance

Table 17: Correlation by Cancer Alliance

# Bootstrap CI function with fallback
bootstrap_ci <- function(df, x, y, R = 1000) {
  if (nrow(na.omit(df[, c(x, y)])) < 3) return(c(NA, NA)) 
  fn <- function(d, indices) {
    dd <- d[indices, ]
    cor(dd[[x]], dd[[y]], method = "spearman", use = "complete.obs")
  }
  set.seed(123)
  boot_res <- boot(df, fn, R = R)
  boot_ci <- boot.ci(boot_res, type = "perc")
  c(lower = boot_ci$percent[4], upper = boot_ci$percent[5])
}

# Loop over cancer alliances
alliances <- unique(parent_cohort$cancer_alliance_name)

cor_by_alliance <- lapply(alliances, function(a) {
  df <- parent_cohort %>% filter(cancer_alliance_name == a)
  complete_df <- df %>% filter(!is.na(distance_km), !is.na(X59))
  
  if (nrow(complete_df) < 3) {
    return(tibble(
      cancer_alliance_name = a,
      N = nrow(df),
      Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
      
      Correlation = NA,
      P_Value = NA,
      CI_lower = NA,
      CI_upper = NA,
      
      Significant = ""
    ))
  }
  
  test <- cor.test(complete_df$distance_km, complete_df$X59, method = "spearman")
  ci <- bootstrap_ci(complete_df, "distance_km", "X59")
  
  tibble(
    cancer_alliance_name = a,
    N = nrow(df),
    Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
   
    Correlation = round(test$estimate, 3),
    P_Value = round(test$p.value, 4),
    CI_lower = round(ci[1], 3),
    CI_upper = round(ci[2], 3),
    
    Significant = ifelse(test$p.value < 0.05, "✓", "")
  )
}) %>% bind_rows()

# Table
cor_by_alliance %>%
  arrange(desc(Correlation)) %>%
  kable(caption = "Correlation Between Distance and Parent Experience by Cancer Alliance (Uncorrected)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Correlation Between Distance and Parent Experience by Cancer Alliance (Uncorrected)
cancer_alliance_name	N	Mean_Distance	Correlation	P_Value	CI_lower	CI_upper	Significant
Humber and North Yorkshire	24	68.0	0.415	0.0437	0.013	0.744	✓
Peninsula	8	206.6	0.358	0.3845	-0.480	0.813
South East London	19	25.7	0.279	0.2465	-0.268	0.703
North Central London	11	11.6	0.226	0.5034	-0.533	0.785
North East London	33	24.9	0.199	0.2678	-0.125	0.515
Greater Manchester	29	20.5	0.189	0.3250	-0.215	0.535
East of England	83	67.2	0.116	0.2948	-0.111	0.349
Northern	54	55.0	0.051	0.7116	-0.219	0.324
West Yorkshire and Harrogate	35	35.5	0.042	0.8087	-0.335	0.416
East Midlands	67	42.6	0.019	0.8777	-0.220	0.276
Lancashire and South Cumbria	17	79.7	-0.027	0.9166	-0.475	0.410
Somerset, Wiltshire, Avon and Gloucestershire	30	66.4	-0.045	0.8149	-0.375	0.292
South Yorkshire and Bassetlaw	17	22.7	-0.054	0.8374	-0.549	0.492
West London	35	19.4	-0.071	0.6861	-0.439	0.311
West Midlands	64	39.5	-0.087	0.4949	-0.349	0.173
Surrey and Sussex	49	57.2	-0.110	0.4532	-0.378	0.196
Cheshire and Merseyside	34	22.4	-0.159	0.3696	-0.470	0.165
Kent and Medway	34	67.9	-0.162	0.3594	-0.472	0.182
Wessex	34	48.1	-0.172	0.3319	-0.501	0.242
Thames Valley	31	46.6	-0.252	0.1718	-0.573	0.111
NA	0	NaN	NA	NA	NA	NA

Figure 26: Correlation by Cancer Alliance (Forest Plot)

Table 17 and Figure 26 show how the distance-experience relationship varies geographically. Key insights:

Of the 20 Cancer Alliances with parent responses, only Humber and North Yorkshire showed a statistically significant correlation between distance and parent experience (r = +0.415, p = 0.044, n = 24), and this result is uncorrected for multiple testing. The positive correlation indicates that families traveling further within this region tend to report better experiences, which may reflect the benefit of accessing specialist centers.
Some regions show relatively low satisfaction without long journeys: West London (8.40 satisfaction, 16.3km mean distance) and Surrey/Sussex (8.65 satisfaction, 58.9km).
Thames Valley showed the strongest negative correlation (r = -0.252, p = 0.172, n = 31) together with below-average satisfaction (8.68), though the correlation did not reach statistical significance.
Peninsula families travel furthest (216.9km mean) yet maintain good satisfaction (9.00), suggesting that rural populations can adapt to geographic necessity when supported by high-quality care and specialist centers.

Distance and Parent-Experience (X59) Correlation by Site-Name

Table 22: Correlation by Site Name

# Loop over sites
sites <- unique(parent_cohort$site_name)
cor_by_site <- lapply(sites, function(s) {
  df <- parent_cohort %>% filter(site_name == s)
  complete_df <- df %>% filter(!is.na(distance_km), !is.na(X59))
  
  if (nrow(complete_df) < 3) {
    return(tibble(
      site_name = s,
      N = nrow(df),
      Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
      Parent_Exp_Mean = round(mean(df$X59, na.rm = TRUE), 2),
      Correlation = NA,
      P_Value = NA,
      CI_lower = NA,
      CI_upper = NA,
    
      Significant = ""
    ))
  }
  
  test <- cor.test(complete_df$distance_km, complete_df$X59, method = "spearman")
  ci <- bootstrap_ci(complete_df, "distance_km", "X59")
  
  tibble(
    site_name = s,
    N = nrow(df),
    Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
    Parent_Exp_Mean = round(mean(df$X59, na.rm = TRUE), 2),
    Correlation = round(test$estimate, 3),
    P_Value = round(test$p.value, 4),
    CI_lower = round(ci[1], 3),
    CI_upper = round(ci[2], 3),
   
    Significant = ifelse(test$p.value < 0.05, "✓", "")
  )
}) %>% bind_rows()

# Table
cor_by_site %>%
  arrange(desc(Correlation)) %>%
  kable(caption = "Correlation Between Distance and Parent Experience by Site (Uncorrected)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Correlation Between Distance and Parent Experience by Site (Uncorrected)
site_name	N	Mean_Distance	Parent_Exp_Mean	Correlation	P_Value	CI_lower	CI_upper	Significant
The Royal Marsden Hospital (Surrey)	70	49.9	9.03	0.216	0.0720	-0.046	0.456
The Royal Victoria Infirmary	53	44.5	9.66	0.180	0.1975	-0.077	0.414
Great Ormond Street Hospital Central London Site	75	41.3	9.32	0.146	0.2106	-0.061	0.352
Addenbrooke’s Hospital	52	71.7	9.19	0.139	0.3273	-0.126	0.382
Nottingham University Hospitals NHS Trust - Queen’s Medical Centre Campus	31	35.6	8.74	0.097	0.6035	-0.288	0.438
Sheffield Children’s Hospital	29	36.8	9.48	0.083	0.6696	-0.299	0.436
Bristol Royal Hospital for Children	29	70.4	8.69	0.036	0.8532	-0.370	0.408
Birmingham Children’s Hospital	73	64.7	9.05	0.009	0.9410	-0.218	0.234
John Radcliffe Hospital	44	49.8	8.93	-0.027	0.8596	-0.323	0.274
Royal Manchester Children’s Hospital	50	44.0	9.20	-0.029	0.8391	-0.288	0.238
Leicester Royal Infirmary	16	20.6	9.31	-0.036	0.8943	-0.558	0.511
Alder Hey Children’s NHS Foundation Trust	40	40.6	9.15	-0.077	0.6374	-0.353	0.230
University College Hospital	23	61.6	8.87	-0.093	0.6716	-0.590	0.407
Leeds General Infirmary	52	38.7	9.13	-0.094	0.5089	-0.383	0.188
UCH Macmillan Cancer Centre	22	44.2	9.41	-0.117	0.6038	-0.606	0.370
University College Hospital Grafton Way Building	13	79.1	8.38	-0.150	0.6255	-0.759	0.479
Southampton General Hospital	48	60.2	8.60	-0.319	0.0269	-0.537	-0.012	✓
St George’s Hospital (Tooting)	2	7.7	9.00	NA	NA	NA	NA

Distance and Parent-Experience (X59) Correlation by Principal Treatment Centre

Table 24: Correlation by Principal Treatment Centre

# Loop over PTCs
ptcs <- unique(parent_cohort$ptc_name)

cor_by_ptc <- lapply(ptcs, function(p) {
  df <- parent_cohort %>% filter(ptc_name == p)
  complete_df <- df %>% filter(!is.na(distance_km), !is.na(X59))
  
  if (nrow(complete_df) < 3) {
    return(tibble(
      ptc_name = p,
      N = nrow(df),
      Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
    
      Correlation = NA,
      P_Value = NA,
      CI_lower = NA,
      CI_upper = NA,
    
      Significant = ""
    ))
  }
  
  test <- cor.test(complete_df$distance_km, complete_df$X59, method = "spearman")
  ci <- bootstrap_ci(complete_df, "distance_km", "X59")
  
  tibble(
    ptc_name = p,
    N = nrow(df),
    Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
   
    Correlation = round(test$estimate, 3),
    P_Value = round(test$p.value, 4),
    CI_lower = round(ci[1], 3),
    CI_upper = round(ci[2], 3),
   
    Significant = ifelse(test$p.value < 0.05, "✓", "")
  )
}) %>% bind_rows()

# Table
cor_by_ptc %>%
  arrange(desc(Correlation)) %>%
  kable(caption = "Correlation Between Distance and Parent Experience by PTC (Uncorrected)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Correlation Between Distance and Parent Experience by PTC (Uncorrected)
ptc_name	N	Mean_Distance	Correlation	P_Value	CI_lower	CI_upper	Significant
The Royal Marsden NHS Foundation Trust & St George’s University Hospitals NHS Foundation Trust	72	48.7	0.212	0.0740	-0.016	0.450
The Newcastle upon Tyne Hospitals NHS Foundation Trust	53	44.5	0.180	0.1975	-0.077	0.414
Cambridge University Hospitals NHS Foundation Trust	52	71.7	0.139	0.3273	-0.126	0.382
Sheffield Children’s NHS Foundation Trust	29	36.8	0.083	0.6696	-0.299	0.436
University Hospitals Bristol and Weston NHS Foundation Trust	29	70.4	0.036	0.8532	-0.370	0.408
Nottingham University Hospitals NHS Trust & University Hospitals of Leicester NHS Trust	47	30.5	0.024	0.8704	-0.266	0.329
Birmingham Women’s and Children’s NHS Foundation Trust	73	64.7	0.009	0.9410	-0.218	0.234
Oxford University Hospitals NHS Foundation Trust	44	49.8	-0.027	0.8596	-0.323	0.274
Manchester University NHS Foundation Trust	50	44.0	-0.029	0.8391	-0.288	0.238
Great Ormond Street Hospital for Children NHS Foundation Trust & University College London Hospitals NHS Foundation Trust	133	49.0	-0.042	0.6325	-0.211	0.134
Alder Hey Children’s NHS Foundation Trust	40	40.6	-0.077	0.6374	-0.353	0.230
Leeds Teaching Hospitals NHS Trust	52	38.7	-0.094	0.5089	-0.383	0.188
University Hospital Southampton NHS Foundation Trust	48	60.2	-0.319	0.0269	-0.537	-0.012	✓

Table 24 and Figure 30 show that:

University Hospital Southampton NHS Foundation Trust stands out with a moderate negative correlation (ρ = –0.319, p = 0.027), indicating that longer travel distances may be linked to lower experience scores at this centre.

All other PTCs have non-significant p-values, meaning the observed correlations could be due to chance.
Confidence intervals are wide for many centres, reflecting uncertainty due to small sample sizes.
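The link between sample size and interval width can be made concrete with a rough calculation. The sketch below uses the Fisher z approximation, which is designed for Pearson's correlation and is only a rough guide for Spearman's rho. It is not the bootstrap method used in this report, and `approx_ci` is a hypothetical helper introduced for illustration.

```r
# Rough guide only: Fisher z approximation, not the report's bootstrap intervals
approx_ci <- function(r, n, level = 0.95) {
  z    <- atanh(r)                      # Fisher transformation of the correlation
  se   <- 1 / sqrt(n - 3)               # approximate standard error on the z scale
  crit <- qnorm(1 - (1 - level) / 2)    # normal critical value for the chosen level
  tanh(c(lower = z - crit * se, upper = z + crit * se))
}

# Sample sizes spanning the smallest and largest PTC groups in Table 24
sizes  <- c(29, 52, 133)
widths <- sapply(sizes, function(n) unname(diff(approx_ci(r = 0, n = n))))

data.frame(n = sizes, approx_ci_width = round(widths, 2))
```

The approximate width shrinks roughly with the square root of the sample size, so centres with around 30 respondents cannot rule out moderate correlations in either direction.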

Table 25 shows that:

After applying Bonferroni correction (to account for multiple testing across sites), all previously non‑significant results remain non‑significant (adjusted p = 1.00).
Southampton General Hospital’s result, while still showing a negative correlation, is no longer significant after correction (adjusted p = 0.347).

Distance and Parent-Experience (X59) Correlation by Age Group

Table 26: Correlation by Age Group

# Loop over survey types
survey_types <- unique(parent_cohort$survey_type)

cor_by_survey <- lapply(survey_types, function(s) {
  df <- parent_cohort %>% filter(survey_type == s)
  complete_df <- df %>% filter(!is.na(distance_km), !is.na(X59))
  
  if (nrow(complete_df) < 3) {
    return(tibble(
      survey_type = s,
      N = nrow(df),
      Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
    
      Correlation = NA,
      P_Value = NA,
      CI_lower = NA,
      CI_upper = NA,
   
      Significant = ""
    ))
  }
  
  test <- cor.test(complete_df$distance_km, complete_df$X59, method = "spearman")
  ci <- bootstrap_ci(complete_df, "distance_km", "X59")
  
  tibble(
    survey_type = s,
    N = nrow(df),
    Mean_Distance = round(mean(df$distance_km, na.rm = TRUE), 1),
    
    Correlation = round(test$estimate, 3),
    P_Value = round(test$p.value, 4),
    CI_lower = round(ci[1], 3),
    CI_upper = round(ci[2], 3),
   
    Significant = ifelse(test$p.value < 0.05, "✓", "")
  )
}) %>% bind_rows()

# Table
cor_by_survey %>%
  arrange(desc(Correlation)) %>%
  kable(caption = "Correlation Between Distance and Parent Experience by Age Group") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)

Correlation Between Distance and Parent Experience by Age Group
survey_type	N	Mean_Distance	Correlation	P_Value	CI_lower	CI_upper
0-7	339	51.4	0.070	0.1956	-0.034	0.169
12-15	247	48.9	-0.072	0.2595	-0.193	0.056
8-11	136	50.0	-0.080	0.3572	-0.243	0.089

Age-stratified analysis reveals no statistically significant relationship between distance and parent experience in any age group (all p > 0.05). Key insights:

Youngest children (0-7 years, n=339) show a weak positive correlation (r = +0.070, p = 0.196). One possible reading is that parents of younger children place extra value on specialised expertise at distant specialist centers, but the correlation is too weak to support a firm conclusion.
Parents of adolescents (12-15 years, n=247) show a weak negative correlation (r = -0.072, p = 0.260). This pattern may reflect challenges balancing educational commitments, social development needs, and treatment logistics during this critical age, although the correlation is not statistically significant.

Conclusion

In conclusion, geographic equity in paediatric cancer care is achievable. The challenge lies not in eliminating distance burden, but in ensuring that families facing unavoidable travel receive the support they need to access excellent care without compromising their experience. The results for Peninsula, Welsh, and retinoblastoma families suggest this is possible. The task now is extending that success to bone tumour families and others currently struggling with distance-related challenges.

Exploring the Impact of Travel Distance on Cancer Patient Experience: A Geospatial Analysis of the Under 16 CPES, 2024.

Imonikhe Ayeni

2026-01-13
