1. Introduction

1.1 Aim and Economic Motivation

Banks are closing branches. This is not a random process — it is driven by cost. A physical branch in Warsaw’s city centre requires expensive real estate, a full staff, and security. An ATM in a residential block costs a fraction of that. As banks optimise their networks, we should see a clear spatial pattern emerge: branches concentrate where the money is (the CBD), and ATMs fill in everywhere else.

This study uses spatial point pattern analysis to test whether that pattern exists in Warsaw — and to quantify it formally. We analyse 1,103 ATM and 343 branch locations extracted from OpenStreetMap using methods from the spatstat package (Baddeley, Rubak & Turner, 2015).

1.2 Hypotheses

We test four hypotheses derived from urban economics:

H1 — Clustering: Neither ATMs nor branches are randomly placed. Both depart significantly from Complete Spatial Randomness (CSR).

H2 — Centralisation: Branch intensity falls faster with distance from the CBD than ATM intensity. Branches are pulled to the city centre; ATMs are not.

H3 — Segregation: ATMs and branches occupy distinct spatial zones. They do not co-locate — a random reassignment of labels would not reproduce the observed pattern.

H4 — Interaction: Spatial clustering is not fully explained by the first-order intensity gradient. Pairwise structure exists that requires an interaction (Gibbs-type) model.

1.3 Literature Background

The economic logic behind this study comes from urban location theory: activities that gain most from central access outbid others for CBD locations (Alonso, 1964). Bank branches, serving corporate clients and high-value transactions, fit this profile. ATMs, serving daily retail cash needs, follow a coverage logic instead.

The geography of financial exclusion (Leyshon & Thrift, 1995) documents that branch closures follow spatial patterns — lower-income and lower-density areas lose branches first. Warsaw’s post-2000 banking consolidation mirrors this trend, with branches concentrating in Śródmieście while outer districts rely increasingly on ATMs.

On the methodological side, Ripley (1976) established the K-function framework for second-order point pattern analysis; Diggle (1985) developed the bandwidth selection approach we use for kernel density estimation; and Clark & Evans (1954) provided the nearest-neighbour test for spatial randomness. All three are applied here.

2. Data and Study Area

2.1 Packages

# Packages used throughout the analysis
required_pkgs <- c("sf", "spatstat", "ggplot2", "tidyverse",
                   "viridis", "leaflet", "DT", "plotly")

for (pkg in required_pkgs) {
  # Install the package only if it is missing, then attach it once
  if (!requireNamespace(pkg, quietly = TRUE))
    install.packages(pkg, repos = "http://cran.us.r-project.org")
  library(pkg, character.only = TRUE)
}

2.2 Data Source and Variables

Data were extracted from OpenStreetMap via the Overpass API. The dataset covers all tagged ATMs and bank branches within Warsaw’s administrative boundary. Each record has five variables:

DT::datatable(
  data.frame(
    Variable = c("id", "lat", "lon", "name", "type"),
    Type     = c("Integer", "Numeric", "Numeric", "Character", "Factor (2 levels)"),
    Description = c(
      "Unique OSM node ID",
      "Latitude — decimal degrees, WGS84 (EPSG:4326)",
      "Longitude — decimal degrees, WGS84 (EPSG:4326)",
      "Operator name (e.g. 'PKO BP', 'Euronet'); may be Unknown",
      "'atm' or 'bank' — the infrastructure category"
    )
  ),
  options = list(dom = "t"), rownames = FALSE,
  caption = "Table 1. Variable descriptions."
)

atms_all  <- read.csv("Datasets/topic5_atms_warsaw.csv")
banks_all <- read.csv("Datasets/topic5_banks_warsaw.csv")
all_data  <- bind_rows(atms_all, banks_all)

set.seed(42)
atms_s       <- slice_sample(atms_all,  n = 350)
banks_s      <- slice_sample(banks_all, n = 150)
banking_data <- bind_rows(atms_s, banks_s)

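The full dataset has 1,103 ATMs and 343 bank branches. For computationally heavy functions we use a reproducible sample of 350 ATMs and 150 branches (500 points in total). Note that this 70:30 split does not preserve the full-data ratio, which is roughly 76:24, so branches are over-represented in the sample.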
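If the sample were meant to mirror the full data, a proportional allocation could be computed as in the sketch below. The counts are the ones reported above; the names `n_full`, `n_total` and `alloc` are illustrative and do not appear in the original analysis.

```r
# Proportional allocation of a 500-point sample (illustrative sketch)
n_full  <- c(atm = 1103, bank = 343)   # full-data counts reported in the text
n_total <- 500                         # target sample size

share <- n_full / sum(n_full)          # share of each type in the full data
alloc <- round(n_total * share)        # sample size per type, rounded

print(round(share, 3))
print(alloc)
```

Under this allocation the sampling calls would use `n = alloc[["atm"]]` and `n = alloc[["bank"]]` in place of the fixed 350 and 150.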

summary_tbl <- all_data |>
  group_by(Type = type) |>
  summarise(N = n(), `Unique operators` = n_distinct(name),
            `Mean lat` = round(mean(lat), 4), `Mean lon` = round(mean(lon), 4),
            `SD lat` = round(sd(lat), 4), `SD lon` = round(sd(lon), 4)) |>
  mutate(Type = ifelse(Type == "atm", "ATM", "Bank Branch"))

DT::datatable(summary_tbl, options = list(dom = "t"), rownames = FALSE,
              caption = "Table 2. Descriptive statistics by infrastructure type.")

2.3 Study Area: Warsaw Administrative Boundary

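The observation window is the administrative boundary of Warsaw (powiat Warszawa), sourced from the official Polish cadastral shapefile (GUGiK). All coordinates are projected to EPSG:3857 Web Mercator (units: metres) and rescaled to kilometres. Web Mercator is not an equal-area projection: at Warsaw’s latitude (about 52.2° N) it stretches lengths by a factor of about 1.63 and areas by about 2.67, so every distance, area and intensity reported below is in projected units, not ground units.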

poviat        <- st_read("Datasets/A02_Granice_powiatow.shp", quiet = TRUE)
warsaw_border <- poviat[poviat$JPT_NAZWA_ == "powiat Warszawa", ]
warsaw_border <- st_transform(warsaw_border, crs = 3857)

banking_sf     <- st_as_sf(banking_data, coords = c("lon", "lat"), crs = 4326) |>
  st_transform(crs = 3857)
banking_warsaw <- st_filter(banking_sf, warsaw_border)

W <- as.owin(warsaw_border)

pattern <- ppp(
  x = st_coordinates(banking_warsaw)[, 1],
  y = st_coordinates(banking_warsaw)[, 2],
  window = W,
  marks  = as.factor(banking_warsaw$type)
)
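# Break exact coordinate ties with a tiny random shift
# (radius 0.03 m, because coordinates are still in metres at this point)
pattern <- rjitter(pattern, 0.03)
# Convert the unit of length from metres to kilometres
pattern <- rescale.ppp(pattern, 1000, "km")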

split_pat <- split(pattern)

cat(sprintf("Window area:       %.0f km²\n",   area.owin(pattern$window)))

## Window area:       1376 km²

cat(sprintf("ATM intensity:     %.4f pts/km²\n", intensity(split_pat$atm)))

## ATM intensity:     0.2543 pts/km²

cat(sprintf("Bank intensity:    %.4f pts/km²\n", intensity(split_pat$bank)))

## Bank intensity:    0.1090 pts/km²

summary(pattern)

## Marked planar point pattern:  500 points
## Average intensity 0.3633 points per square km
## 
## Coordinates are given to 12 decimal places
## 
## Multitype:
##      frequency proportion intensity
## atm        350        0.7    0.2543
## bank       150        0.3    0.1090
## 
## Window: polygonal boundary
## single connected closed polygon with 4259 vertices
## enclosing rectangle: [2321.2, 2367.9] x [6818, 6867] km
##                      (46.69 x 49.13 km)
## Window area = 1376.32 square km
## Unit of length: 1 km
## Fraction of frame area: 0.6

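Warsaw’s true area is roughly 517 km², so the 1,376 km² window reported above reflects the area inflation of Web Mercator at this latitude. In projected units the sample intensities are 0.25 ATMs and 0.11 branches per km²; rescaled to the true area they are about 0.68 and 0.29 per km². These figures describe the 500-point sample, not the full dataset, and the 2.3:1 ATM-to-branch ratio is fixed by the sampling design (350:150). In the full data the ratio is about 3.2:1 (1,103:343), so ATMs are the dominant physical touchpoint either way.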
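The gap between the 1,376 km² window and the roughly 517 km² ground area can be checked with the Web Mercator scale factor. The sketch below assumes the spherical approximation, where the linear scale factor at latitude φ is 1 / cos(φ), and uses the PKiN latitude from Section 6.1 as a representative latitude for the city; the variable names are illustrative.

```r
# Web Mercator distortion at Warsaw's latitude (spherical approximation)
lat_deg <- 52.2319                      # PKiN latitude, taken as representative
k <- 1 / cos(lat_deg * pi / 180)        # linear scale factor
area_factor <- k^2                      # area scale factor

projected_area <- 1376.32               # km^2, from summary(pattern)
ground_area <- projected_area / area_factor

cat(sprintf("linear factor %.3f, area factor %.3f, ground area %.0f km^2\n",
            k, area_factor, ground_area))
```

The same linear factor applies to every distance quoted later (interaction radii, clustering ranges), so a projected 1 km corresponds to roughly 0.61 km on the ground.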

2.4 Interactive Map

All 1446 locations are shown below. Click any point for details.

pal <- colorFactor(c("#00c896", "#e63c5c"), domain = c("atm", "bank"))

leaflet(all_data) |>
  addProviderTiles(providers$CartoDB.DarkMatter) |>
  addCircleMarkers(
    ~lon, ~lat, color = ~pal(type), radius = 4,
    stroke = FALSE, fillOpacity = 0.75,
    popup = ~paste0("<b>", ifelse(type=="atm","ATM","Bank Branch"), "</b><br>",
                    ifelse(is.na(name)|name=="Unknown","Operator unknown", name))
  ) |>
  addLegend("bottomright", pal = pal, values = ~type, title = "Type",
            labFormat = labelFormat(
              transform = function(x) ifelse(x=="atm","ATM","Bank Branch")))

count_df <- banking_data |> count(type) |>
  mutate(label = ifelse(type=="atm","ATM","Bank Branch"),
         pct   = paste0(round(100*n/sum(n),1),"%"))

ggplotly(
  ggplot(count_df, aes(label, n, fill=type,
                        text=paste0(label,": ",n," (",pct,")"))) +
    geom_col(width=0.45) +
    scale_fill_manual(values=c("atm"="#00c896","bank"="#e63c5c")) +
    theme_minimal(base_size=13) + theme(legend.position="none") +
    labs(title="Sample composition", x="", y="Count"),
  tooltip="text")

3. First-Order Analysis: Where Are They?

First-order properties describe the average density of events across the city — no assumptions about dependence between points.

3.1 Spatial Distribution Map

plot_data <- banking_warsaw |>
  mutate(x=st_coordinates(geometry)[,1], y=st_coordinates(geometry)[,2],
         Type=ifelse(type=="atm","ATM","Bank Branch"))

ggplot() +
  geom_sf(data=warsaw_border, fill="#1e2d3d", color="#34495e", linewidth=0.6) +
  geom_point(data=plot_data, aes(x,y,color=Type), size=1.2, alpha=0.75) +
  scale_color_manual(values=c("ATM"="#00c896","Bank Branch"="#e63c5c")) +
  theme_void(base_size=13) +
  theme(plot.background=element_rect(fill="#0d1b2a",color=NA),
        panel.background=element_rect(fill="#0d1b2a",color=NA),
        legend.background=element_rect(fill="#0d1b2a",color=NA),
        legend.text=element_text(color="white"),
        legend.title=element_text(color="white",face="bold"),
        plot.title=element_text(color="white",face="bold",size=14),
        plot.subtitle=element_text(color="#aaaaaa",size=11)) +
  labs(title="Banking Infrastructure in Warsaw",
       subtitle="500-point sample from OpenStreetMap", color="Type")

Already visible without any statistics: branches (pink) pile up in the inner city; ATMs (green) are everywhere, including Ursynów, Białołęka, and Bielany.

3.2 Kernel Density Estimation

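KDE estimates the intensity λ(u) at each location using a Gaussian kernel. The bandwidth σ is chosen by bw.diggle, which uses cross-validation to minimise the mean squared error criterion of Diggle (1985), so no manual tuning is needed. This selector can favour small bandwidths, in which case the estimated surface looks spiky.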

d_atm  <- density(split_pat$atm,  sigma = bw.diggle)
d_bank <- density(split_pat$bank, sigma = bw.diggle)

par(mfrow=c(1,2), mar=c(1,1,3,1))
plot(d_atm,  main="KDE — ATMs",          col=inferno(256), ribside="bottom")
plot(d_bank, main="KDE — Bank Branches", col=inferno(256), ribside="bottom")

par(mfrow=c(1,1))

ATMs have multiple density peaks spread across the city. Branches have one dominant peak, tightly centred on Śródmieście. This is visual first-order evidence consistent with H2; the formal test follows in Section 6.

3.3 Relative Risk

Relative risk (Kelsall & Diggle, 1995) gives the probability P(ATM | location) at each point: how likely you are to find an ATM, and not a branch, at any given spot.

# For a two-type pattern relrisk() by default returns the probability of the
# second mark level ("bank"), so the case is set to "atm" explicitly
rr <- relrisk(pattern, relative=FALSE, case="atm")
par(mar=c(1,1,3,1))
plot(rr, main="Relative Risk — P(ATM | location)",
     col=viridis(256), ribside="bottom")
contour(rr, add=TRUE, col="white", nlevels=6)

On the viridis scale, dark purple-blue (P ≈ 0) marks branch-dominated areas and yellow (P ≈ 1) marks ATM-dominated areas. The pattern expected under H2 and H3 is low values around the CBD and high values in the outer residential districts, with a systematic gradient from centre to periphery.

4. Testing for Complete Spatial Randomness

We formally test whether each type departs from a Homogeneous Poisson Process (CSR) before running second-order analysis.

4.1 Clark-Evans Test

R < 1 means events are closer to their neighbours than expected under CSR — clustering. R > 1 means regularity. We test the one-sided clustering alternative.

cat("── Clark-Evans: ATMs ──\n");   print(clarkevans.test(split_pat$atm,  alternative="clustered"))

## ── Clark-Evans: ATMs ──

## 
##  Clark-Evans test
##  CDF correction
##  Z-test
## 
## data:  split_pat$atm
## R = 0.51, p-value <0.0000000000000002
## alternative hypothesis: clustered (R < 1)

cat("\n── Clark-Evans: Banks ──\n"); print(clarkevans.test(split_pat$bank, alternative="clustered"))

## 
## ── Clark-Evans: Banks ──

## 
##  Clark-Evans test
##  CDF correction
##  Z-test
## 
## data:  split_pat$bank
## R = 0.45, p-value <0.0000000000000002
## alternative hypothesis: clustered (R < 1)
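To make the index concrete, the sketch below computes a naive Clark-Evans R in base R: the mean nearest-neighbour distance divided by its CSR expectation 1 / (2√λ). It applies no edge correction, unlike the CDF-corrected test above, and the function name and toy patterns are illustrative, not part of the Warsaw analysis.

```r
# Naive Clark-Evans index without edge correction (illustration only)
clark_evans_naive <- function(x, y, area) {
  d <- as.matrix(dist(cbind(x, y)))      # all pairwise distances
  diag(d) <- Inf                         # ignore each point's distance to itself
  nn <- apply(d, 1, min)                 # nearest-neighbour distance per point
  lambda <- length(x) / area             # estimated intensity
  mean(nn) / (1 / (2 * sqrt(lambda)))    # observed mean / expected mean under CSR
}

# Toy pattern 1: 49 points on a regular 7 x 7 grid in the unit square
g <- (seq_len(7) - 0.5) / 7
grid_pts <- expand.grid(x = g, y = g)
R_regular <- clark_evans_naive(grid_pts$x, grid_pts$y, area = 1)

# Toy pattern 2: 49 points packed into a small patch of the unit square
set.seed(1)
R_clustered <- clark_evans_naive(runif(49, 0.45, 0.55),
                                 runif(49, 0.45, 0.55), area = 1)

print(c(regular = R_regular, clustered = R_clustered))
```

The grid gives R = 2 (every neighbour is exactly one grid step away), while the packed pattern gives a value well below 1, which is the direction observed for both ATMs and branches.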

4.2 Hopkins-Skellam Test

An independent CSR test that compares nearest-neighbour distances between events with distances from random points to their nearest event. A < 1 indicates clustering and A > 1 regularity; significant values confirm non-randomness from a different angle.

cat("── Hopkins-Skellam: ATMs ──\n");   print(hopskel.test(split_pat$atm))

## ── Hopkins-Skellam: ATMs ──

## 
##  Hopkins-Skellam test of CSR
##  using F distribution
## 
## data:  split_pat$atm
## A = 0.13, p-value <0.0000000000000002
## alternative hypothesis: two-sided

cat("\n── Hopkins-Skellam: Banks ──\n"); print(hopskel.test(split_pat$bank))

## 
## ── Hopkins-Skellam: Banks ──

## 
##  Hopkins-Skellam test of CSR
##  using F distribution
## 
## data:  split_pat$bank
## A = 0.093, p-value <0.0000000000000002
## alternative hypothesis: two-sided

Both tests reject CSR for both types (p < 0.001 in every case; Clark-Evans R = 0.51 and 0.45, Hopkins-Skellam A = 0.13 and 0.093). Neither ATMs nor branches are randomly placed, which confirms H1 and justifies the second-order analysis below.

4.3 Monte Carlo Segregation Test

H₀: the label (ATM / bank) is assigned independently of location. Rejection means the two types are spatially segregated — not just mixed at random across the city.

set.seed(42)
seg <- segregation.test.ppp(pattern, nsim=39)

## Computing observed value... Done.
## Computing 39 simulated values...
## Done.

seg

## 
##  Monte Carlo test of spatial segregation of types
## 
## data:  pattern
## T = 4.3, p-value = 0.05

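p = 0.05, which sits on the 5% threshold and not below it. With 39 simulations the p-value can only take multiples of 0.025, so this is borderline evidence against random labelling. It is consistent with H3 (ATMs and branches serving different parts of the city), but a rerun with more simulations would be needed to confirm it.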
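The resolution of a Monte Carlo p-value depends on the number of simulations. The sketch below assumes the usual rank-based definition, p = rank / (nsim + 1), where rank 1 means the observed statistic is more extreme than every simulated one; the variable names are illustrative.

```r
# Attainable Monte Carlo p-values for a given number of simulations
nsim <- 39
ranks <- seq_len(nsim + 1)         # possible ranks of the observed statistic
p_values <- ranks / (nsim + 1)     # attainable p-values

print(head(p_values, 4))           # the four smallest attainable values

# Simulations needed for the smallest attainable p-value to reach 0.001
nsim_needed <- round(1 / 0.001) - 1
print(nsim_needed)
```

A reported value of 0.05 therefore corresponds to rank 2: exactly one of the 39 relabelled patterns produced a statistic at least as large as the observed T.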

5. Second-Order Analysis: Spatial Dependence

Second-order analysis asks: does the presence of one event change the probability of finding another nearby? We test this at multiple distance scales.

5.1 G-Function with Monte Carlo Envelope

G(r) is the cumulative distribution of nearest-neighbour distances. Above the CSR envelope = clustering; below = regularity. The Monte Carlo band (nsim = 19) gives a pointwise 90% significance region.

set.seed(42)
E_G_atm  <- envelope(split_pat$atm,  Gest, nsim=19, verbose=FALSE)
E_G_bank <- envelope(split_pat$bank, Gest, nsim=19, verbose=FALSE)

par(mfrow=c(1,2))
plot(E_G_atm,  main="G-function + Envelope — ATMs",   legend=FALSE, shadecol="lightblue", lwd=2)
plot(E_G_bank, main="G-function + Envelope — Banks",  legend=FALSE, shadecol="mistyrose",  lwd=2)

par(mfrow=c(1,1))

Both observed G-functions sit above the upper envelope at all distances — confirmed clustering in both types. The bank curve rises faster at very short distances (< 0.5 km), reflecting the tight CBD concentration.

5.2 Inhomogeneous L-Function with Monte Carlo Envelope

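Linhom corrects for Warsaw’s varying baseline density before testing for dependence, which helps separate genuine clustering from apparent clustering caused by the CBD density gradient. An observed curve above the envelope indicates spatial structure beyond what the estimated intensity explains. One caveat: when envelope() is given a point pattern, it simulates from CSR by default, so the band below is not simulated from a fitted inhomogeneous Poisson model and the comparison should be read as indicative.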

set.seed(42)
E_L_atm  <- envelope(split_pat$atm,  Linhom, nsim=19, fix.n=TRUE, verbose=FALSE)
E_L_bank <- envelope(split_pat$bank, Linhom, nsim=19, fix.n=TRUE, verbose=FALSE)

par(mfrow=c(1,2))
plot(E_L_atm,  main="L_inhom + Envelope — ATMs",  legend=FALSE, shadecol="lightblue", lwd=2)
plot(E_L_bank, main="L_inhom + Envelope — Banks", legend=FALSE, shadecol="mistyrose",  lwd=2)

par(mfrow=c(1,1))

The observed L-function lies above the upper band for both types. This suggests that clustering is not just a by-product of the intensity gradient and that genuine spatial dependence exists, which motivates the Gibbs interaction model in Section 6.4.
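As a reminder of what the L-function measures, the sketch below computes a naive homogeneous estimate, L(r) = √(K(r) / π), in base R. It is a simplification: it assumes constant intensity and applies no edge correction, whereas Linhom weights each pair by the inverse of the estimated intensities at both points. The function name and toy patterns are illustrative.

```r
# Naive Ripley K and L (homogeneous, no edge correction): illustration only
L_naive <- function(x, y, area, r) {
  n <- length(x)
  d <- as.vector(dist(cbind(x, y)))    # each unordered pair appears once
  # K(r) = area / (n (n - 1)) * number of ordered pairs within distance r
  K <- sapply(r, function(ri) area * 2 * sum(d <= ri) / (n * (n - 1)))
  sqrt(K / pi)
}

r <- c(0.02, 0.05)

# Clustered toy pattern: 50 points packed into a small patch of the unit square
set.seed(1)
L_clustered <- L_naive(runif(50, 0.45, 0.55), runif(50, 0.45, 0.55), area = 1, r = r)

# Regular toy pattern: 7 x 7 grid with spacing 1/7, so no pair is within 0.05
g <- (seq_len(7) - 0.5) / 7
grid_pts <- expand.grid(x = g, y = g)
L_regular <- L_naive(grid_pts$x, grid_pts$y, area = 1, r = r)

print(rbind(clustered = L_clustered - r, regular = L_regular - r))
```

Positive values of L(r) − r indicate more close pairs than a Poisson process would produce, and negative values indicate fewer, which is the same reading used for the envelope plots above.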

5.3 Pair Correlation Function

g(r) shows clustering scale-by-scale. g(r) > 1 = more point pairs at distance r than expected; g(r) = 1 = independence (the red dashed line).

pcf_atm  <- pcfinhom(split_pat$atm)
pcf_bank <- pcfinhom(split_pat$bank)

par(mfrow=c(1,2))
plot(pcf_atm,  main="PCF (inhom) — ATMs",  ylim=c(0,12), legend=FALSE)
abline(h=1, col="red", lty=2)
plot(pcf_bank, main="PCF (inhom) — Banks", ylim=c(0,12), legend=FALSE)
abline(h=1, col="red", lty=2)

par(mfrow=c(1,1))

Both curves are far above 1 at short distances, then decay to 1 by roughly 2–3 km. Clustering is a local phenomenon — block-scale for ATMs, district-scale for branches. Beyond 3 km, locations are essentially independent.

5.4 Cross-Type L-Function

Lcross.inhom measures the relationship between ATMs and branches. Positive deviation = co-location; negative = repulsion.

L_cross <- Lcross.inhom(pattern, i="atm", j="bank")
par(mar=c(4,4,3,1))
plot(L_cross, .-r~r, main="Cross-type L-function — ATM vs Bank",
     legend=FALSE, lwd=2, ylab="L_cross(r) − r")
abline(h=0, col="red", lty=2, lwd=1.5)

The curve is below zero across most scales: around a typical ATM there are fewer branches than expected under independence, and vice versa. This is the cross-type signature of segregation and is consistent with H3, although without a simulation envelope it is descriptive evidence and not a formal test.

6. Point Process Models

We now estimate models to explain why each type locates where it does, and to formally test H2 and H4.

6.1 Spatial Covariate: Distance to CBD

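We use distance from the Palace of Culture and Science (PKiN) as a proxy for CBD accessibility, since PKiN sits at the commercial and transport centre of Warsaw. The covariate is a pixel image in km, obtained by kernel-smoothing the distances observed at the data points (Smooth.ppp). It therefore approximates the true distance surface and is least reliable where points are sparse.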

center_sf <- st_as_sf(data.frame(lon=21.0068, lat=52.2319),
                      coords=c("lon","lat"), crs=4326) |> st_transform(3857)

center_ppp <- ppp(st_coordinates(center_sf)[,1], st_coordinates(center_sf)[,2], window=W)
center_ppp <- rescale.ppp(center_ppp, 1000, "km")

dist_raw <- crossdist(pattern, center_ppp)
pat_dist <- ppp(pattern$x, pattern$y, window=pattern$window, marks=dist_raw[,1])
dist_map <- Smooth.ppp(pat_dist)

par(mar=c(1,1,3,1))
plot(dist_map, main="Distance to CBD — Palace of Culture (km)",
     col=magma(256), ribside="bottom")
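An exact distance surface can also be built directly, without smoothing values observed at the data points. The sketch below is one possible alternative, not the method used above: it applies spatstat's `distmap()` to a single-point pattern in a hypothetical 10 × 10 km toy window, where `win` and `centre` are illustrative stand-ins for the Warsaw window and PKiN.

```r
library(spatstat)

# Hypothetical toy window and centre point (not the Warsaw data)
win    <- owin(c(0, 10), c(0, 10))
centre <- ppp(x = 5, y = 5, window = win)

# distmap() returns a pixel image whose value at each pixel is the
# distance to the nearest point of the pattern (here, the single centre)
d_exact <- distmap(centre)

print(range(as.matrix(d_exact)))
```

In the analysis above, the analogous call would be `distmap(center_ppp)`, which yields a covariate that depends only on location and not on where ATMs and branches happen to be observed.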

6.2 Model 1: Homogeneous Poisson (Baseline / CSR)

This model assumes constant intensity everywhere — the null of CSR. It serves as the reference against which all other models are compared.

m0_atm  <- ppm(split_pat$atm  ~ 1)
m0_bank <- ppm(split_pat$bank ~ 1)

cat("Homogeneous Poisson — ATMs:\n");   print(coef(m0_atm))

## Homogeneous Poisson — ATMs:

## log(lambda) 
##      -1.369

cat("\nHomogeneous Poisson — Banks:\n"); print(coef(m0_bank))

## 
## Homogeneous Poisson — Banks:

## log(lambda) 
##      -2.217

The intercept β₀ = log(λ̂) is the estimated constant log-intensity. No spatial variation is modelled — we already know from the CSR tests this model is wrong. It is included here as a formal baseline.
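As a quick consistency check, exponentiating the reported intercepts should recover the intensities from Section 2.3, that is, the number of sampled points divided by the window area. The sketch below uses only numbers printed above; the variable names are illustrative.

```r
# exp(beta0) should equal n / area for a homogeneous Poisson model
area_km2 <- 1376.32                      # projected window area from summary(pattern)
n     <- c(atm = 350, bank = 150)        # sample sizes
beta0 <- c(atm = -1.369, bank = -2.217)  # reported log-intensities

check <- rbind(exp_beta0 = exp(beta0), n_over_area = n / area_km2)
print(round(check, 4))
```

The two rows agree up to the rounding of the printed coefficients, which confirms that M0 simply restates the average intensity of each type.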

6.3 Model 2: Inhomogeneous Poisson (CBD Distance Covariate)

This model lets intensity vary with CBD distance: log λ(u) = β₀ + β₁ × dist(u). A negative β₁ means intensity drops as you move away from PKiN.

m1_atm  <- ppm(split_pat$atm  ~ dist_map)
m1_bank <- ppm(split_pat$bank ~ dist_map)

cat("Inhomogeneous Poisson — ATMs:\n");   print(coef(summary(m1_atm)))

## Inhomogeneous Poisson — ATMs:

##             Estimate    S.E. CI95.lo CI95.hi Ztest    Zval
## (Intercept)   0.7661 0.10389  0.5625  0.9698   ***   7.375
## dist_map     -0.1911 0.01065 -0.2119 -0.1702   *** -17.940

cat("\nInhomogeneous Poisson — Banks:\n"); print(coef(summary(m1_bank)))

## 
## Inhomogeneous Poisson — Banks:

##              Estimate    S.E. CI95.lo CI95.hi Ztest      Zval
## (Intercept) -0.003367 0.15843 -0.3139  0.3072        -0.02125
## dist_map    -0.199606 0.01653 -0.2320 -0.1672   *** -12.07245

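Both β₁ estimates are negative and highly significant: intensity declines with distance for both types. H2 predicts a more negative coefficient for branches, but the estimates are almost identical (-0.200 for branches against -0.191 for ATMs) and their 95% confidence intervals overlap almost entirely. On their own, the separate models therefore give no support for H2; the joint model in Section 6.6 tests the difference formally.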
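Because the model is log-linear, each slope can be read as a percentage change in intensity per projected kilometre, 100 × (exp(β₁) − 1), or as a halving distance, log(2) / |β₁|. The sketch below applies both transformations to the reported estimates; the variable names are illustrative.

```r
# Interpreting the distance slopes of the inhomogeneous Poisson models
beta1 <- c(atm = -0.1911, bank = -0.199606)   # reported estimates

pct_per_km <- 100 * (exp(beta1) - 1)          # % change in intensity per km
half_dist  <- log(2) / abs(beta1)             # km over which intensity halves

print(round(pct_per_km, 1))
print(round(half_dist, 2))
```

Both types lose roughly 17 to 18 percent of their intensity per projected kilometre from PKiN, which makes the similarity of the two gradients easy to see.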

6.4 Model 3: Gibbs Process — Strauss Interaction Model

The Strauss model adds a pairwise interaction parameter γ at radius r. When γ < 1, nearby pairs are penalised (inhibition); γ = 1 collapses to Poisson. The model defines a valid point process only for γ ≤ 1, so it can represent inhibition but not attraction. The interaction radius is set to the mean nearest-neighbour distance for each type.

r_atm  <- round(mean(nndist(split_pat$atm)),  2)
r_bank <- round(mean(nndist(split_pat$bank)), 2)
cat(sprintf("Interaction radius — ATMs:  %.2f km\n", r_atm))

## Interaction radius — ATMs:  0.57 km

cat(sprintf("Interaction radius — Banks: %.2f km\n", r_bank))

## Interaction radius — Banks: 0.76 km

set.seed(42)
m2_atm  <- ppm(split_pat$atm  ~ dist_map, Strauss(r_atm))
m2_bank <- ppm(split_pat$bank ~ dist_map, Strauss(r_bank))

cat("\nGibbs (Strauss) — ATMs:\n");   print(coef(summary(m2_atm)))

## 
## Gibbs (Strauss) — ATMs:

##             Estimate     S.E. CI95.lo  CI95.hi Ztest    Zval
## (Intercept)  -0.3733 0.104156 -0.5774 -0.16912   ***  -3.584
## dist_map     -0.1064 0.007634 -0.1214 -0.09148   *** -13.943
## Interaction   0.3765 0.011873  0.3532  0.39979   ***  31.711

cat("\nGibbs (Strauss) — Banks:\n"); print(coef(summary(m2_bank)))

## 
## Gibbs (Strauss) — Banks:

##             Estimate    S.E. CI95.lo  CI95.hi Ztest   Zval
## (Intercept)  -1.2859 0.19076 -1.6598 -0.91203   *** -6.741
## dist_map     -0.1020 0.01102 -0.1236 -0.08044   *** -9.261
## Interaction   0.5363 0.06271  0.4134  0.65925   ***  8.552

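The Interaction coefficient is log γ. If log γ ≈ 0, pairwise interaction is absent and the pattern behaves like an inhomogeneous Poisson process; a significantly negative value would indicate short-range inhibition. Here both estimates are significantly positive (0.38 for ATMs, 0.54 for branches), implying γ > 1, that is, attraction between nearby points. This agrees with the clustering found in Section 5 and supports the existence of pairwise structure (H4), but a Strauss process with γ > 1 is not a valid model, so these fits should be read as diagnostics only. A model that allows attraction, such as a Geyer saturation process or a cluster process, would be a more suitable alternative.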
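Converting the reported log γ estimates and their confidence limits back to the γ scale shows directly which side of 1 they fall on. The sketch below uses only the numbers printed above; the data frame and its column names are illustrative.

```r
# Back-transform the reported Strauss interaction estimates from log(gamma) to gamma
log_gamma <- data.frame(
  type = c("ATM", "Bank"),
  est  = c(0.3765, 0.5363),     # reported Interaction estimates
  lo   = c(0.3532, 0.4134),     # reported CI95.lo
  hi   = c(0.39979, 0.65925)    # reported CI95.hi
)

gamma_tbl <- transform(log_gamma, est = exp(est), lo = exp(lo), hi = exp(hi))
print(gamma_tbl, digits = 3)
```

Both intervals lie entirely above 1, so the fitted interaction is attractive for both types, outside the range in which the Strauss model is valid.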

6.5 Model Comparison — AIC Table

aic_df <- data.frame(
  Type  = rep(c("ATM","Bank Branch"), each=3),
  Model = rep(c("M0 — Homogeneous Poisson",
                "M1 — Inhomogeneous Poisson (CBD dist.)",
                "M2 — Gibbs / Strauss (CBD dist.)"), 2),
  AIC   = round(c(AIC(m0_atm), AIC(m1_atm), AIC(m2_atm),
                   AIC(m0_bank),AIC(m1_bank),AIC(m2_bank)), 1)
)
aic_df$dAIC <- round(aic_df$AIC - rep(c(AIC(m0_atm), AIC(m0_bank)), each=3), 1)
names(aic_df)[4] <- "ΔAIC vs M0"

DT::datatable(aic_df, options=list(dom="t", pageLength=6), rownames=FALSE,
              caption="Table 3. AIC comparison across three model classes.")

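Negative ΔAIC = improvement over the CSR baseline. M1 measures how much the CBD distance gradient improves on constant intensity; M2 shows whether pairwise interaction adds further explanatory power. Two cautions apply to M2: for Gibbs models the AIC is based on the pseudolikelihood and not the full likelihood, and the fitted Strauss models have γ > 1, so their AIC values should be compared with M0 and M1 only informally.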

6.6 Combined Marked Model and Formal ANOVA Test

We fit one joint model for the combined ATM + Bank pattern, testing whether the CBD distance effect is significantly different between the two types.

m_null <- ppm(pattern ~ marks)
m_full <- ppm(pattern ~ marks * dist_map)

cat("Combined model — key coefficients:\n")

## Combined model — key coefficients:

print(coef(summary(m_full)))

##                     Estimate    S.E. CI95.lo  CI95.hi Ztest     Zval
## (Intercept)         0.766027 0.10390  0.5624  0.96967   ***   7.3726
## marksbank          -0.796451 0.18882 -1.1665 -0.42638   ***  -4.2181
## dist_map           -0.191057 0.01065 -0.2119 -0.17018   *** -17.9370
## marksbank:dist_map -0.006135 0.01958 -0.0445  0.03223        -0.3134

cat("\nANOVA — null vs interaction model:\n")

## 
## ANOVA — null vs interaction model:

print(anova(m_null, m_full, test="Chi"))

## Analysis of Deviance Table
## 
## Model 1: ~marks   Poisson
## Model 2: ~marks * dist_map    Poisson
##   Npar Df Deviance            Pr(>Chi)    
## 1    2                                    
## 2    4  2      544 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

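The marksbank:dist_map coefficient captures any extra CBD pull for branches over ATMs. Here it is -0.006 (SE 0.020, 95% CI -0.045 to 0.032, Z = -0.31), which is not significantly different from zero. This sample therefore gives no evidence that branch intensity decays faster with distance from PKiN than ATM intensity, and H2 is not supported by this model. The large deviance in the ANOVA (544 on 2 df) compares ~marks against ~marks * dist_map, so it reflects mainly the distance effect shared by both types and not the interaction. Isolating the interaction would require comparing ~marks + dist_map against ~marks * dist_map.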
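The significance of the interaction term can be checked by hand from the printed table with a Wald test, assuming the estimate is approximately normal. The sketch below uses the reported estimate and standard error; the variable names are illustrative.

```r
# Wald test for the marksbank:dist_map interaction, from the reported output
est <- -0.006135     # reported estimate
se  <-  0.01958      # reported standard error

z <- est / se                     # Wald z statistic
p <- 2 * pnorm(-abs(z))           # two-sided p-value

cat(sprintf("z = %.3f, two-sided p = %.3f\n", z, p))
```

The resulting p-value is far above any conventional threshold, in line with the empty Ztest column and the confidence interval that spans zero.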

6.7 Predicted Intensity Surfaces

pred <- predict(m_full, se=FALSE)

par(mfrow=c(1,2), mar=c(1,1,3,1))
plot(pred$atm,  main="Predicted Intensity — ATMs",          col=inferno(256))
plot(pred$bank, main="Predicted Intensity — Bank Branches", col=inferno(256))

par(mfrow=c(1,1))

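Because the fitted distance slopes are almost identical (-0.191 for ATMs and -0.197 for branches), the two predicted surfaces have nearly the same shape: both peak at the CBD and fade outward, with the branch surface lower everywhere. The sharper single-peak pattern for branches seen in the raw KDE is therefore not reproduced by this log-linear model, which suggests that distance to PKiN alone does not capture the difference between the two types.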

6.8 Model Diagnostics

Lurking variable plots check whether residuals are systematically related to the covariate. A well-fitted model should show residuals oscillating around zero without trend. Pearson residual maps reveal whether spatial structure remains unexplained.

par(mfrow=c(1,2))
lurking(m1_atm,  covariate=dist_map, type="Pearson")
title(main="Lurking — ATMs (M1)")
lurking(m1_bank, covariate=dist_map, type="Pearson")
title(main="Lurking — Banks (M1)")

par(mfrow=c(1,1))

res_atm  <- residuals(m1_atm,  type="Pearson")
res_bank <- residuals(m1_bank, type="Pearson")

par(mfrow=c(1,2), mar=c(1,1,3,1))
plot(res_atm,  main="Pearson Residuals — ATMs",  cols="transparent")
plot(res_bank, main="Pearson Residuals — Banks", cols="transparent")

par(mfrow=c(1,1))

Systematic over- or under-prediction at specific distances would suggest a missing covariate or non-linear effect. Spatial structure in the residual map would confirm that the Gibbs interaction model (M2) is warranted.

7. Conclusions

DT::datatable(
  data.frame(
    Method = c("Clark-Evans","Hopkins-Skellam","Segregation test",
               "G-function + envelope","L_inhom + envelope","PCF (inhom)",
               "Cross-type L-function","PPM — ANOVA"),
    Finding = c(
      "R < 1, p < 0.001: clustering relative to CSR",
      "A < 1, p < 0.001: clustering confirmed independently",
      "p = 0.05 with 39 simulations: borderline evidence of segregation",
      "G above envelope: clustering relative to CSR at all scales",
      "L above envelope: clustering beyond intensity gradient (CSR-simulated band)",
      "g(r) >> 1 at r < 2 km: local clustering",
      "L_cross < 0: fewer cross-type neighbours than expected (no envelope)",
      "marksbank:dist_map not significant (Z = -0.31): no extra CBD pull detected"
    ),
    `Hypothesis` = c("H1","H1","H3","H1+H4","H4","H1","H3","H2")
  ),
  options=list(dom="t", pageLength=10), rownames=FALSE,
  caption="Table 4. Summary of all analyses."
)

7.1 What the results mean

Most methods point the same way, with one important exception: the parametric test of H2.

Branches follow money, descriptively. The KDE and relative-risk surfaces suggest that bank branch intensity peaks in the CBD and falls as you move outward. The inhomogeneous Poisson models confirm a strong negative distance gradient for both types, but in the 500-point sample the gradient for branches (-0.200) is statistically indistinguishable from that for ATMs (-0.191), and the interaction term in the combined model is not significant. The descriptive evidence is consistent with banks paying premium rent to be near their corporate clients, but the formal test does not confirm a stronger CBD pull for branches.

ATMs follow people. In the KDE, ATM intensity is more evenly distributed, with multiple density peaks across residential districts. This fits a coverage logic: ATMs are placed to maximise convenience for everyday cash users, not to cluster near high-value clients.

They tend not to share space. The segregation test (p = 0.05 with 39 simulations, borderline at the 5% level) and the negative cross-type L-function both suggest that ATMs and branches occupy partly distinct zones. A rerun with more simulations, and a simulation envelope for the cross-type function, would be needed to make this conclusive.

The clustering is real. The inhomogeneous L-function, computed after correcting for the density gradient, still shows clustering above the simulation envelope. Subject to the caveat that the envelope was simulated under CSR, this suggests that clustering is not just an artefact of the CBD and that there is genuine spatial dependence between nearby locations of the same type. The positive Strauss interaction estimates (γ > 1) point the same way: the pairwise structure is attraction, not inhibition.

7.2 Why it matters

As Warsaw’s banks continue closing branches, ATMs become the only physical banking option for outer districts. If ATM coverage does not keep pace with branch closures, large parts of the city — particularly lower-income residential areas — risk losing convenient access to cash. Policymakers and regulators should track the evolving spatial footprint of banking infrastructure, not just the total number of outlets.

References

Alonso, W. (1964). Location and Land Use. Harvard University Press.

Baddeley, A., Rubak, E., & Turner, R. (2015). Spatial Point Patterns: Methodology and Applications with R. CRC Press.

Clark, P.J. & Evans, F.C. (1954). Distance to nearest neighbour as a measure of spatial relationships in populations. Ecology, 35(4), 445–453.

Diggle, P.J. (1985). A kernel method for smoothing point process data. Applied Statistics, 34(2), 138–147.

Kelsall, J.E. & Diggle, P.J. (1995). Non-parametric estimation of spatial variation in relative risk. Statistics in Medicine, 14(21–22), 2335–2342.

Klagge, B. & Martin, R. (2005). Decentralized versus centralized financial systems: is there a case for local capital markets? Journal of Economic Geography, 5(4), 387–421.

Leyshon, A. & Thrift, N. (1995). Geographies of financial exclusion: financial abandonment in Britain and the United States. Transactions of the Institute of British Geographers, 20(3), 312–341.

Ripley, B.D. (1976). The second-order analysis of stationary point processes. Journal of Applied Probability, 13(2), 255–266.

The Spatial Economics of Banking Infrastructure in Warsaw

ATMs vs. Physical Bank Branches — A Point Pattern Analysis

Khamidov Mirzakalonboy & Mukhammakodir Abdusalomov

2026-05-21